Seaborn Heatmaps: Customizing Correlation Matrix Visualizations

Mathewkatz
3 min read · Feb 16, 2021


For data scientists, checking correlations is a crucial part of the exploratory data analysis (EDA) process. This analysis is one of the methods used to decide which features affect the target variable the most and, in turn, get used in predicting that target variable. Because a visualization is generally much easier to understand than tabular data, heatmaps are typically used to visualize correlation matrices. A simple way to plot a heatmap in Python is with the Seaborn library.

The Seaborn heatmap function has 18 arguments that can be used to customize a correlation matrix, improving how fast insights can be derived.

Let’s do it!

The first thing we need to do is import the Seaborn library and load the data.

The data was collected by creating a website where participants were shown two fun-sized candies and asked to click on the one they would prefer to receive. In total, more than 269 thousand votes were collected from 8,371 different IP addresses. (For binary variables, 1 means yes, 0 means no.) Thought it’d be fun to look at!
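The original loading snippet is not shown here, so what follows is only a minimal sketch. It assumes the survey results sit in a CSV file (the path in the comment is hypothetical) and otherwise builds a small stand-in DataFrame with made-up values and illustrative column names, so the snippet can run on its own.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# With the real survey file you would load it instead, for example:
# df = pd.read_csv("path/to/candy_data.csv")  # hypothetical path

# Stand-in data: made-up values and illustrative column names, NOT the real votes.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "chocolate": rng.integers(0, 2, size=50),  # binary: 1 = yes, 0 = no
    "fruity": rng.integers(0, 2, size=50),     # binary: 1 = yes, 0 = no
    "sugarpercent": rng.random(50),            # value between 0 and 1
    "winpercent": rng.random(50) * 100,        # value between 0 and 100
})

# Pairwise Pearson correlations between the columns: this is what gets plotted.
corr = df.corr()
print(corr.round(2))
```

The heatmap function only needs the correlation matrix, so any DataFrame with numeric columns works the same way.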

The full function with all the arguments is here:
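The exact argument list can differ between Seaborn versions, so rather than hard-coding it, this small sketch asks the installed library for the signature using Python's standard inspect module.

```python
import inspect

import seaborn as sns

# Read the signature from whichever Seaborn version is installed.
signature = inspect.signature(sns.heatmap)
print(signature)

# List the parameter names on their own for a quick overview.
params = list(signature.parameters)
print(len(params), "parameters:", params)
```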

Calling the function on the correlation matrix with no other arguments gives the ‘regular’ heatmap.
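A plain call might look like the sketch below. The DataFrame is stand-in data with made-up values and illustrative column names, used only so the example runs by itself.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Stand-in data (made-up values, illustrative column names).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "chocolate": rng.integers(0, 2, size=50),
    "fruity": rng.integers(0, 2, size=50),
    "sugarpercent": rng.random(50),
    "winpercent": rng.random(50) * 100,
})
corr = df.corr()

# Default heatmap: only the correlation matrix is passed.
fig, ax = plt.subplots()
sns.heatmap(corr, ax=ax)
# In a script, call plt.show() to display the figure.
```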

For an even easier interpretation, pass annot=True as well, which displays the correlation coefficient inside each cell.
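As a sketch, again on stand-in data with made-up values, adding annot=True writes one number into every cell:

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Stand-in data (made-up values, illustrative column names).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "chocolate": rng.integers(0, 2, size=50),
    "fruity": rng.integers(0, 2, size=50),
    "sugarpercent": rng.random(50),
    "winpercent": rng.random(50) * 100,
})
corr = df.corr()

# annot=True prints each coefficient in its cell.
fig, ax = plt.subplots()
sns.heatmap(corr, annot=True, ax=ax)

# One text label is created per cell of the matrix.
print(len(ax.texts), "annotations for", corr.size, "cells")
```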

Correlation coefficients can run to five or more decimal digits. A way to improve readability is to pass the fmt argument, for example fmt='.3g' or fmt='.1g'. By default the function uses fmt='.2g', which displays two significant digits, and fmt only has an effect when annot=True. Let's change it from the default to fmt='.1g'.
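The fmt string is an ordinary Python format specification, so its effect can be checked with the built-in format function before plotting. The sketch below does that and then passes fmt='.1g' to the heatmap; the data is a stand-in with made-up values.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# How the 'g' format rounds to a number of significant digits.
value = 0.123456
one_digit = format(value, ".1g")      # '0.1'
two_digits = format(value, ".2g")     # '0.12'
three_digits = format(value, ".3g")   # '0.123'
print(one_digit, two_digits, three_digits)

# Stand-in data (made-up values, illustrative column names).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "chocolate": rng.integers(0, 2, size=50),
    "fruity": rng.integers(0, 2, size=50),
    "sugarpercent": rng.random(50),
    "winpercent": rng.random(50) * 100,
})
corr = df.corr()

# fmt only applies to the annotations, so annot=True is required.
fig, ax = plt.subplots()
sns.heatmap(corr, annot=True, fmt=".1g", ax=ax)
```

With '.1g' the diagonal shows up as 1 and every other cell is rounded to a single significant digit, which is compact but coarse; '.2g' or '.3g' keeps more detail.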

The next three arguments have to do with rescaling the color bar. Sometimes we want the color bar to start at zero or at a negative number, to end at a particular number of our choice, or to have a distinct center. All this can be customized by specifying these three arguments: vmin, which is the minimum value of the bar; vmax, which is the maximum value of the bar; and center. By default, none of the three is specified, so Seaborn infers the limits from the data. Let’s say we want our color bar to run from -1 to 1 and be centered at 0.
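Here is a sketch of that rescaling on stand-in data with made-up values. The last lines read the color limits back from the plotted mesh to confirm the bar spans -1 to 1.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Stand-in data (made-up values, illustrative column names).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "chocolate": rng.integers(0, 2, size=50),
    "fruity": rng.integers(0, 2, size=50),
    "sugarpercent": rng.random(50),
    "winpercent": rng.random(50) * 100,
})
corr = df.corr()

# Fix the color bar to the full correlation range and center it at 0.
fig, ax = plt.subplots()
sns.heatmap(corr, annot=True, fmt=".1g", vmin=-1, vmax=1, center=0, ax=ax)

# The first collection on the axes is the colored mesh of the heatmap.
color_limits = tuple(ax.collections[0].get_clim())
print(color_limits)
```

Fixing the range to -1 and 1 also makes heatmaps of different datasets directly comparable, because the same color always means the same coefficient.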

The colors changed! This has to do with changing the center from None to 0 (or any other number): when center is set and no cmap is given, Seaborn switches to a diverging color palette. But this does not mean we can’t change the colors back or to any other available palette. Change the colors by specifying the argument cmap.

We’re going to use the argument cmap='winter'.
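As a sketch on stand-in data with made-up values, the colormap name is passed as a string. 'winter' is one of Matplotlib's built-in colormaps.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Stand-in data (made-up values, illustrative column names).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "chocolate": rng.integers(0, 2, size=50),
    "fruity": rng.integers(0, 2, size=50),
    "sugarpercent": rng.random(50),
    "winpercent": rng.random(50) * 100,
})
corr = df.corr()

# Same scaling as before, now with an explicit colormap.
fig, ax = plt.subplots()
sns.heatmap(
    corr,
    annot=True,
    fmt=".1g",
    vmin=-1,
    vmax=1,
    center=0,
    cmap="winter",
    ax=ax,
)
```

Note that 'winter' is a sequential colormap, so positive and negative correlations no longer get visually opposite colors; a diverging colormap such as 'coolwarm' may be a better fit when the sign matters.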

For other color variations check out this website:

There are times where the heatmap may look better with borders between the cells, in a chosen thickness and color. This is where the arguments linewidths and linecolor apply. Let's set linewidths to 3 and linecolor to black.
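Putting everything together, a sketch of the final call on stand-in data with made-up values could look like this:

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Stand-in data (made-up values, illustrative column names).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "chocolate": rng.integers(0, 2, size=50),
    "fruity": rng.integers(0, 2, size=50),
    "sugarpercent": rng.random(50),
    "winpercent": rng.random(50) * 100,
})
corr = df.corr()

# All customizations so far, plus black cell borders 3 points wide.
fig, ax = plt.subplots()
sns.heatmap(
    corr,
    annot=True,
    fmt=".1g",
    vmin=-1,
    vmax=1,
    center=0,
    cmap="winter",
    linewidths=3,
    linecolor="black",
    ax=ax,
)

# Read the border width back from the plotted mesh.
border_width = float(np.atleast_1d(ax.collections[0].get_linewidth())[0])
print(border_width)
```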
