
The pairs() function requires a minimum input of x, which is described as “the coordinates of points given as numeric columns of a matrix or data frame”. In other words, a data frame, a tibble, time-series data, etc. can all be passed to the pairs() function, and this should satisfy most of our needs if we only want to take a glimpse at the correlation between variables.

Note that, although doable, pairs() is not good at handling logical or factor values, or other categorical or discrete values.
#SCATTERPLOT MATRIX SERIES#
A scatterplot matrix is a collection of scatterplots organized into a matrix, where each scatterplot shows the relationship between a pair of variables. It is very useful for getting a rough idea of the linear correlation between variables. When creating a model, collinearity is not desired, and by inspecting the scatterplot matrix at the beginning we get an idea of which variables to include in the model. There are various ways to plot a scatterplot matrix; here we introduce 6 different methods of creating one, compare their differences, and discuss their pros and cons. The example dataset used is called Seatbelts, which is time-series data.
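The "rough idea of linear correlation" that a scatterplot matrix gives visually can also be checked numerically. The following is a minimal sketch in plain Python rather than R; `pearson`, `pairwise_correlations`, and the `toy` columns are hypothetical names and made-up values for illustration only, not taken from the Seatbelts data.

```python
import itertools
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def pairwise_correlations(columns):
    """Map each unordered pair of column names to its Pearson correlation."""
    return {
        (a, b): pearson(columns[a], columns[b])
        for a, b in itertools.combinations(columns, 2)
    }


# Made-up toy columns (hypothetical values, for illustration only).
toy = {
    "u": [1.0, 2.0, 3.0, 4.0],
    "v": [2.0, 4.0, 6.0, 8.0],
    "w": [4.0, 3.0, 2.0, 1.0],
}

for pair, r in pairwise_correlations(toy).items():
    print(pair, round(r, 3))
```

Pairs whose correlation is close to 1 or -1 are the ones that would show up as near-straight lines in the scatterplot matrix, and they are candidates for collinearity in a model.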
#SCATTERPLOT MATRIX CODE#
Plotting Scatterplot matrices in R
Weijia Bao, Sixing Hao

Generally speaking, matplotlib doesn't usually contain plotting functions that operate on more than one axes object (subplot, in this case). The expectation is that you'd write a simple function to string things together however you'd like. I'm not quite sure what your data looks like, but it's quite simple to just build a function to do this from scratch. If you're always going to be working with structured or rec arrays, then you can simplify this a touch. (There's always a name associated with each data series, so you can omit having to specify names.)

    data = 10 * np.random.random((numvars, numdata))
    fig = scatterplot_matrix(data, ,
            linestyle='none', marker='o', color='black', mfc='none')
    fig.suptitle('Simple Scatterplot Matrix')

    def scatterplot_matrix(data, names, **kwargs):
        """Plots a scatterplot matrix of subplots. Each row of "data" is plotted
        against other rows, resulting in a nrows by nrows grid of subplots with the
        diagonal subplots labeled with "names". ...
        passed on to matplotlib's "plot" command. Returns the matplotlib figure
        ..."""
        fig, axes = plt.subplots(nrows=numvars, ncols=numvars, figsize=(8,8))
        fig.subplots_adjust(hspace=0.05, wspace=0.05)
        ...
        # Set up ticks only on one side for the "edge" subplots.
        ...
        for i, j in zip(*np.triu_indices_from(axes, k=1)):
            ...
            axes.plot(data, data, **kwargs)
        ...
        axes.annotate(label, (0.5, 0.5), xycoords='axes fraction',
        ...
        for i, j in zip(range(numvars), itertools.cycle((-1, 0))):
        ...

Thanks for sharing your code! You figured out all the hard stuff for us. As I was working with it, I noticed a few little things that didn't look quite right.

1. The axis ticks weren't lining up like I would expect (i.e., in your example above, you should be able to draw a vertical and horizontal line through any point across all plots and the lines should cross through the corresponding point in the other plots, but as it sits now this doesn't occur).
2. If you have an odd number of variables you are plotting with, the bottom right corner axes doesn't pull the correct xticks or yticks. It just leaves it as the default 0..1 ticks.
3. Not a fix, but I made it optional to explicitly input names, so that it puts a default xi for variable i in the diagonal positions.

Below you'll find an updated version of your code that addresses these two points, otherwise preserving the beauty of your code.

    import itertools
    ...
    def scatterplot_matrix(data, names=, **kwargs):
        """... Each row of "data" is plotted ..."""
        ...
        fig.subplots_adjust(hspace=0.0, wspace=0.0)
        ...
        # FIX #1: this needed to be changed from
        ...
        # FIX #2: if numvars is odd, the bottom right corner plot doesn't have the
        # correct axes limits, so we pull them from other axes
        ...

Oh, and I re-arranged the main() part of the code so that it serves as a proper example but does not get called when the file is imported into another piece of code. I have used it many times!
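The odd-numvars problem (FIX #2) can be traced without drawing anything. The sketch below is plain Python and is not part of either answer; `tick_cells` and `diagonal_hits` are hypothetical helper names. It assumes that the loop over `zip(range(numvars), itertools.cycle((-1, 0)))` switches on the x ticks of cell (j, i) and the y ticks of cell (i, j), which is an assumption about the parts of the loop body that are not shown here.

```python
import itertools


def tick_cells(numvars):
    """Return (x_cells, y_cells): the (row, col) grid cells whose x or y ticks
    would be switched on by the alternating-edge loop (assumed behaviour)."""
    x_cells, y_cells = [], []
    for i, j in zip(range(numvars), itertools.cycle((-1, 0))):
        # An index of -1 means the last row or column, as with array indexing.
        x_cells.append((j % numvars, i))
        y_cells.append((i, j % numvars))
    return x_cells, y_cells


def diagonal_hits(numvars):
    """Diagonal cells (which hold only a label, no data) that receive ticks."""
    x_cells, y_cells = tick_cells(numvars)
    return sorted({cell for cell in x_cells + y_cells if cell[0] == cell[1]})


for n in (3, 4, 5):
    print(n, diagonal_hits(n))
```

With an even number of variables no diagonal cell is selected. With an odd number, the last step of the loop lands on the bottom right diagonal cell, which contains no plotted data and therefore keeps matplotlib's default axis limits unless they are copied from a neighbouring cell.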

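As one possible way of building such a function from scratch, here is a minimal sketch that assumes NumPy and Matplotlib are installed. It is not the code from either answer: instead of switching ticks on cell by cell, it swaps in Matplotlib's shared-axes option (`sharex="col"`, `sharey="row"`), so ticks line up across the grid and the diagonal cells inherit sensible limits. The default label format `x0, x1, ...` is an assumption.

```python
import itertools

import matplotlib.pyplot as plt
import numpy as np


def scatterplot_matrix(data, names=None, **kwargs):
    """Plot every row of `data` against every other row in an n-by-n grid.

    Extra keyword arguments are passed on to matplotlib's `plot`.
    Returns the matplotlib figure.
    """
    data = np.asarray(data)
    numvars = data.shape[0]
    if names is None:
        # Assumed default label format: "x<i>" for variable i.
        names = [f"x{i}" for i in range(numvars)]

    # Sharing x per column and y per row keeps limits and ticks aligned,
    # including in the diagonal cells that contain no data.
    fig, axes = plt.subplots(nrows=numvars, ncols=numvars, figsize=(8, 8),
                             sharex="col", sharey="row", squeeze=False)
    fig.subplots_adjust(hspace=0.05, wspace=0.05)

    # Off-diagonal cell (i, j): variable j on the x axis, variable i on the y axis.
    for i, j in itertools.permutations(range(numvars), 2):
        axes[i, j].plot(data[j], data[i], **kwargs)

    # Label each diagonal cell with its variable name.
    for i, label in enumerate(names):
        axes[i, i].annotate(label, (0.5, 0.5), xycoords="axes fraction",
                            ha="center", va="center")
    return fig
```

For example, `scatterplot_matrix(10 * np.random.random((4, 10)), linestyle='none', marker='o', color='black', mfc='none')` followed by `plt.show()` should display a 4 by 4 grid. Because the axes are shared, only the left column and bottom row carry tick labels, which is a simpler layout than the alternating edge ticks discussed above.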