Pandas is the most popular Python library for analyzing tabular data and, when it comes to loading data, the function read_csv is very convenient. Upon opening its documentation, one may be startled by the long list of arguments and options, but these allow the user to load tabular data in many different formats.

For that reason, reading a simple CSV file in Pandas can become a bit complex. In order to become better acquainted with the parameters in read_csv I’ll share a few examples using it. …
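
To give a flavor of how those options interact, here is a minimal sketch. It is not one of the examples from the post itself: the sample data and column names are made up for illustration, and the parameters shown (sep, comment, decimal, index_col, parse_dates) are only a small selection of what read_csv accepts.

```python
import io

import pandas as pd

# Hypothetical semicolon-separated data with a leading comment line and
# decimal commas; this sample is an assumption, not data from the post.
raw = """# exported from some system
id;name;price;date
1;apple;1,50;2021-01-02
2;pear;NA;2021-01-03
"""

df = pd.read_csv(
    io.StringIO(raw),      # any file path or file-like object works here
    sep=";",               # fields are separated by semicolons, not commas
    comment="#",           # ignore everything after a '#' on a line
    decimal=",",           # treat ',' as the decimal separator
    index_col="id",        # use the 'id' column as the row index
    parse_dates=["date"],  # convert the 'date' column to datetimes
)

print(df)
print(df.dtypes)
```

With these settings the price column should come out numeric, with the NA entry read as a missing value, and the date column should be parsed as datetimes rather than left as strings.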


For quite a while now, I have been using JEP (Java Embedded Python) to call Python code, written by a team of data scientists, from some Java microservices. In general, this solution has been very stable and fast.

However, a while back, after some specific changes in the Python code, we started seeing segfaults when code that ran smoothly in plain Python was executed through JEP. Fortunately, we figured out a solution, and I would like to share my experience debugging this issue. I hope some of it resonates with others!

First…


As a programmer, the command line terminal is one of the most important tools I use. By default, the appearance of this application is very plain, but there are ways to customize its looks and functionality. So, I’d like to share some of my favorite tools that can improve the experience while using it. In particular, I’m a big fan of zinit and powerlevel10k. You can also see my shell configuration in my dotfiles.

I switched to zsh (pronounced z-shell) a while back when it became the default in macOS Catalina (replacing bash). It offers some additional features and can…


I would like to use this opportunity to write about some useful patterns for organizing your R code. These may be well known to advanced R users, but are left out of most tutorials. So, sit back, relax, and enjoy some useful simple tips for R!

1. Loading R packages

The best-known way of loading packages in R is through the library() command. One disadvantage of this option is that an error is thrown if the package is not found in the R environment. …


In this post, I will mention a few R functions that are very useful to manipulate other functions or objects. If you are not acquainted with them, I hope that they open a whole world of possibilities in the R code you develop, as they did for me.

1. Ellipsis

An ellipsis (...) may be placed in the definition of a function to stand in for any number of values that are passed as arguments to that function and are not captured by its other argument variables.

The expression ...

João Matias

Data scientist and mathematician
