I have plotted biplots in MATLAB and have written my own in Fortran in the past. Last month, while playing with PCA, I needed to plot biplots in Python. Unlike MATLAB, Python has no straightforward built-in biplot, so I wrote a simple Python function that plots one given the scores and coefficients from a principal component analysis.
Here’s the function.
import matplotlib.pyplot as plt

def biplot(score, coeff, pcax, pcay, labels=None):
    # pcax and pcay are 1-based principal component numbers
    pca1 = pcax - 1
    pca2 = pcay - 1
    xs = score[:, pca1]
    ys = score[:, pca2]
    # One arrow per variable: coeff has one row per variable
    n = coeff.shape[0]
    # Rescale the scores by their range so the points fit the arrow axes
    scalex = 1.0 / (xs.max() - xs.min())
    scaley = 1.0 / (ys.max() - ys.min())
    plt.scatter(xs * scalex, ys * scaley)
    for i in range(n):
        plt.arrow(0, 0, coeff[i, pca1], coeff[i, pca2], color='r', alpha=0.5)
        # Fall back to generic names when no labels are given
        label = "Var" + str(i + 1) if labels is None else labels[i]
        plt.text(coeff[i, pca1] * 1.15, coeff[i, pca2] * 1.15, label,
                 color='g', ha='center', va='center')
    plt.xlim(-1, 1)
    plt.ylim(-1, 1)
    plt.xlabel("PC{}".format(pcax))
    plt.ylabel("PC{}".format(pcay))
    plt.grid()
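Plot it as follows. scikit-learn stores pca.components_ with one row per component, while the function indexes coeff with one row per variable, so the array is transposed:
biplot(score, pca.components_.T, 1, 2, labels=categories)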
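The function expects score with one row per sample and coeff with one row per variable. If scikit-learn isn't handy, here is a rough sketch of how those two arrays could be computed with plain NumPy via an SVD of the centred data. The helper name pca_scores_and_loadings and the random data are made up for illustration; this is not necessarily how the arrays in the call above were produced.

```python
import numpy as np

def pca_scores_and_loadings(X):
    """Hypothetical helper: return (score, coeff) for X with one row per sample."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)  # centre each variable
    # Thin SVD: Xc = U @ diag(s) @ Vt; the rows of Vt are the principal axes
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    score = U * s            # sample coordinates on each principal component
    coeff = Vt.T             # one row per variable, one column per component
    return score, coeff

# Made-up data: 50 samples, 4 variables
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
score, coeff = pca_scores_and_loadings(X)
```

With these arrays, biplot(score, coeff, 1, 2) should draw the first two components, with the variables labelled Var1 to Var4.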
What is a biplot?
The biplot is one of the most useful and versatile methods of multivariate data visualisation. The biplot extends the idea of a simple scatter plot of two variables to the case of many variables, with the objective of visualising the maximum possible information in the data.
From Wikipedia:
A biplot allows information on both samples and variables of a data matrix to be displayed graphically. Samples are displayed as points while variables are displayed either as vectors, linear axes or nonlinear trajectories.
If you would like to dig deeper, here’s a link to a comprehensive introduction to biplots [PDF].