Joint plot with Matplotlib

Today I am releasing on GitHub a simple module to create joint plots with Matplotlib. Joint plots are available in the excellent seaborn library, but unfortunately seaborn is not available on every system. Recently I needed this functionality, so I wrote this simple module with Matplotlib.

The functionality is similar to seaborn’s but with limited features. This has helped me in my work, and I am releasing it in the hope that others might find it useful.

Sample usage:

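import numpy as np
import pandas as pd
from Jointplot import jointPlot  # assumed import form; check the repository for the exact module name

tips = pd.read_csv(r'tests/tips_p.csv')
data = np.c_[tips['total_bill'].values, tips['tip'].values]
jointPlot(data, kde=True)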
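For readers curious how a joint plot can be put together with Matplotlib alone, here is a minimal sketch. It is not the code from the module above: the function name joint_plot_sketch, the grid layout and the random sample data are all assumptions made for illustration, and it draws marginal histograms rather than a KDE.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.gridspec import GridSpec


def joint_plot_sketch(data, bins=20):
    """Hypothetical helper: scatter plot with a histogram on each margin."""
    x, y = data[:, 0], data[:, 1]
    fig = plt.figure(figsize=(6, 6))
    gs = GridSpec(4, 4, hspace=0.05, wspace=0.05)

    # Large central axes for the joint scatter, thin axes for the marginals
    ax_joint = fig.add_subplot(gs[1:, :3])
    ax_top = fig.add_subplot(gs[0, :3], sharex=ax_joint)
    ax_right = fig.add_subplot(gs[1:, 3], sharey=ax_joint)

    ax_joint.scatter(x, y, s=10)
    ax_top.hist(x, bins=bins)
    ax_right.hist(y, bins=bins, orientation="horizontal")

    # Hide the tick labels the marginal axes share with the joint axes
    ax_top.tick_params(labelbottom=False)
    ax_right.tick_params(labelleft=False)
    return fig, (ax_joint, ax_top, ax_right)


# Made-up data standing in for the total_bill and tip columns
rng = np.random.RandomState(0)
data = np.c_[rng.rand(200) * 50, rng.rand(200) * 10]
fig, axes = joint_plot_sketch(data)
fig.savefig("joint_plot_sketch.png")
```

Sharing the x and y axes between the scatter and the marginals keeps the histograms aligned with the points when the limits change.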

Find the code at this github repository.


Quarterly Results Analysis with Python

Every quarter when results come out, I spend enough hours looking at the financial results of the stocks I am tracking to elicit nagging comments from my better half. So I developed this simple Python script, which does the analysis and generates a PDF that I email to myself to look at during my office commute.

The code as always is available at my github page here.

Some sample plots

I intend to update the scripts with other analyses as and when I get some time.

Simple trick to read mixed format data with numpy’s genfromtxt

Suppose your data is like the following


DD,MM,YY,AMT,WHERE
30,5,2015,2,TRAVEL
30,5,2015,50,TRAVEL
30,5,2015,5,PHONE
31,5,2015,6.62,TESCO
31,5,2015,5,POUNDSHOP
31,5,2015,4.51,SAINSBURY
31,5,2015,1,PHONE

Let’s load it with numpy using genfromtxt


expdata=np.genfromtxt(fname, delimiter=',',skip_header=1,usecols=[0,1,2,3,4])

print(expdata[0])
print(expdata[1])

This prints the following


[  3.00000000e+01   5.00000000e+00   2.01500000e+03   2.00000000e+00             NaN]
[   30.     5.  2015.    50.    NaN]

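The numeric columns are read, but not quite as we wanted: every value becomes a float, and the text column WHERE comes back as NaN because genfromtxt defaults to a float dtype.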

You can tell genfromtxt the datatypes using the dtype keyword, like

dtype=([('f0', '<i4'), ('f1', '<i4'), ('f2', '<i4'), ('f3', '<f8'), ('f4', '|S14')])


expdata=np.genfromtxt(fname, delimiter=',',skip_header=1,usecols=[0,1,2,3,4],dtype=([('f0', '<i4'), ('f1', '<i4'), ('f2', '<i4'), ('f3', '<f8'), ('f4', '|S14')]))

(30, 5, 2015, 2.0, 'TRAVEL')
(30, 5, 2015, 50.0, 'TRAVEL')

That looks like too much work, but there’s a simple, smart way to do this: use dtype=None.

 


expdata=np.genfromtxt(fname, delimiter=',',skip_header=1,usecols=[0,1,2,3,4],dtype=None)

(30, 5, 2015, 2.0, 'TRAVEL')
(30, 5, 2015, 50.0, 'TRAVEL')

Using dtype=None is a good trick if you don’t know what your columns should be and you need some help in getting the format. This might be slower, but it works, and once you have the data you can replace dtype=None with the appropriate explicit dtype.
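To try the trick without a file on disk, the sketch below feeds the sample rows above through an in-memory buffer. The variable names and the use of io.StringIO are my own choices for illustration; the explicit encoding argument assumes NumPy 1.14 or later and makes the text column come back as str rather than bytes.

```python
import io
import numpy as np

# The sample rows from above, held in memory instead of a file
text = """DD,MM,YY,AMT,WHERE
30,5,2015,2,TRAVEL
30,5,2015,50,TRAVEL
30,5,2015,5,PHONE
31,5,2015,6.62,TESCO
31,5,2015,5,POUNDSHOP
31,5,2015,4.51,SAINSBURY
31,5,2015,1,PHONE
"""

# dtype=None asks genfromtxt to work out a type for each column
expdata = np.genfromtxt(io.StringIO(text), delimiter=",", skip_header=1,
                        dtype=None, encoding="utf-8")

print(expdata.dtype)     # the structured dtype that was inferred
print(expdata[0])        # first data row
print(expdata["f3"])     # the AMT column, selected by its default field name

# Once the inferred dtype is known, it can be passed back explicitly
again = np.genfromtxt(io.StringIO(text), delimiter=",", skip_header=1,
                      dtype=expdata.dtype, encoding="utf-8")
```

Printing expdata.dtype is the quickest way to get the explicit dtype list to paste back into the call.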

Quick Gantt Chart with Matplotlib

The problem: You need a quick Gantt chart for a quick proposal report and you don’t have any project planner software installed.

Solution:

While there are many different ways to do this, if you only have access to Python, here’s a simple Gantt chart plotter with just matplotlib.

It is not for rigorous use, but it is a good substitute for making a quick Gantt plot for a quick report.
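As a rough idea of what such a plotter can look like, here is a minimal sketch built on horizontal bars. It is not the code from the full post: the task names and dates are invented, and the approach (one barh call per task, with the start date as the left edge) is just one reasonable way to do it.

```python
import datetime as dt
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Invented tasks: (label, start date, end date)
tasks = [
    ("Requirements", dt.date(2015, 6, 1), dt.date(2015, 6, 10)),
    ("Development", dt.date(2015, 6, 8), dt.date(2015, 6, 26)),
    ("Testing", dt.date(2015, 6, 22), dt.date(2015, 7, 3)),
]

fig, ax = plt.subplots(figsize=(8, 3))
for row, (label, start, end) in enumerate(tasks):
    left = mdates.date2num(start)
    width = mdates.date2num(end) - left  # duration in days
    ax.barh(row, width, left=left, height=0.5)

# One labelled row per task, first task at the top
ax.set_yticks(range(len(tasks)))
ax.set_yticklabels([label for label, _, _ in tasks])
ax.invert_yaxis()

# Treat the x axis as dates and format the ticks
ax.xaxis_date()
ax.xaxis.set_major_formatter(mdates.DateFormatter("%d %b"))
fig.autofmt_xdate()
fig.tight_layout()
fig.savefig("gantt_sketch.png")
```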


Biplot in Python revisited

Mark recently sent an email complaining that the previous biplot code was not working. Though I was not able to replicate his errors, from the error message I figured the error was due to the PCA numbers supplied.

That reminded me of the simplification I had intended to make to the previous version, so that it works the way it does in Matlab.

There is no dependency on a PCA data structure: send two variables, the scores and the coefficients, and plot the biplot.
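A minimal sketch of that interface, assuming a function that takes only scores and coefficients, might look like the following. The function name biplot, its arguments and the scaling of the scores are my assumptions, not the code from the full post; the PCA here is done with a plain SVD on random data purely to have something to plot.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def biplot(scores, coeff, labels=None, ax=None):
    """Hypothetical helper: plot the first two score columns and one arrow per variable."""
    if ax is None:
        ax = plt.gca()
    xs, ys = scores[:, 0], scores[:, 1]
    # Scale the scores into [-1, 1] so they share axes with the coefficients
    ax.scatter(xs / np.abs(xs).max(), ys / np.abs(ys).max(), s=10, alpha=0.6)
    for k in range(coeff.shape[0]):
        ax.arrow(0, 0, coeff[k, 0], coeff[k, 1], color="r", head_width=0.02)
        name = labels[k] if labels is not None else "Var%d" % (k + 1)
        ax.text(coeff[k, 0] * 1.1, coeff[k, 1] * 1.1, name, color="r")
    ax.set_xlabel("PC1")
    ax.set_ylabel("PC2")
    return ax


# Made-up data: 100 observations of 4 variables
rng = np.random.RandomState(0)
X = rng.randn(100, 4)
Xc = X - X.mean(axis=0)           # centre each column

# PCA via SVD: columns of coeff are the principal directions
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
coeff = Vt.T
scores = Xc @ coeff               # observations projected onto the components

fig, ax = plt.subplots()
biplot(scores, coeff, ax=ax)
fig.savefig("biplot_sketch.png")
```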


Matlab to Python – some code examples

Two years back I was converting a Matlab script to Python; here are some of the errors that I encountered during the conversion. I found them documented in the converted script and am posting them here for a wider audience.



Matlab to Python

1. () to []

nd=topo(j,i)
X(j,i)=coords(nd,1)
Y(j,i)=coords(nd,2)

SyntaxError: can't assign to function call

nd=topo[j,i]
X[j,i]=coords[nd,1]
Y[j,i]=coords[nd,2]

2. 1 to 0

X(j,i)=coords(nd,1)
Y(j,i)=coords(nd,2)

IndexError: index (2) out of range (0<=index<2) in dimension 1

X[j,i]=coords[nd,0]
Y[j,i]=coords[nd,1]

3. zeros to zeros

B=zeros(2,edof)

In Python

B=np.zeros((2,edof))

4. array to array

D=G*[1 0; 0 1];

in python

D=G*np.array([[1.0,0.0],[0.0,1.0]])

5. % to #

% is comments in matlab

Python

# is comments in python

6. For to for

for i=1:nnel
node(i)=nodes(iel,i);
end

python

for i in range(nnel):
    node[i]=nodes[iel,i]

7. Matlab find to python where

L1 = find(coordinates(:,2)==min(coordinates(:,2)))

in python

l1 = np.where(coords[:,1]==np.min(coords[:,1]))[0]  # np.where returns a tuple; [0] is the index array
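Putting the conversions together, here is a small runnable sketch in Python. The arrays are invented stand-ins (a single four-node element on a unit square) and G is an arbitrary scalar; only the variable names are borrowed from the snippets above.

```python
import numpy as np

# Invented mesh: four corner nodes of a unit square, one element
coords = np.array([[0.0, 0.0],
                   [1.0, 0.0],
                   [1.0, 1.0],
                   [0.0, 1.0]])
nodes = np.array([[0, 1, 2, 3]])   # connectivity, already 0-based
iel = 0
nnel = nodes.shape[1]

# 3. zeros takes the shape as a tuple
B = np.zeros((2, nnel))

# 6. for loop over 0-based indices, 1. square brackets for indexing
node = np.zeros(nnel, dtype=int)
for i in range(nnel):
    node[i] = nodes[iel, i]

# 4. matrix literal becomes np.array (G is a scalar here)
G = 2.0
D = G * np.array([[1.0, 0.0], [0.0, 1.0]])

# 7. find becomes np.where; 2. column 2 in Matlab is column 1 in Python
l1 = np.where(coords[:, 1] == np.min(coords[:, 1]))[0]
print(l1)
```

Note that the indices in l1 are 0-based, whereas Matlab's find returns 1-based indices.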

Pdf With Matplotlib

I thought everyone knew about this, but I was surprised that this little feature of matplotlib is not as widely known as I assumed.

We all know we can save a plot from matplotlib to PDF, but there is another little feature hiding in the backends that lets us write out a multi-page PDF.

Here’s some simple code to write out a multi-page PDF using matplotlib.


import numpy as np
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages

# Open the multi-page PDF
pdf=PdfPages('matplotlibplot.pdf')

# Page 1: line plot
fig=plt.figure()
plt.plot(np.random.rand(10))
pdf.savefig(fig)

# Page 2: histogram
fig=plt.figure()
plt.hist(np.random.randn(1000))
pdf.savefig(fig)

# Close the file to finish writing the PDF
pdf.close()

So easy: instantiate PdfPages, which opens the PDF file, save the figures, and close the file. This works from version 0.99 of matplotlib.
With a little use of plt.text and plt.annotate, this can be used to produce quick PDF reports. In fact, I have a pandas and matplotlib workflow to pump out a full-blown management report on continuous improvement initiatives.
Here’s a simple full example of creating a simple PDF report using the above feature.
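As a rough sketch of the report idea (not the full example from the post), the snippet below writes a text-only title page followed by an annotated plot. The file name, the page size and the random data are assumptions for illustration, and it uses PdfPages as a context manager, which needs a reasonably recent matplotlib.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages

rng = np.random.RandomState(0)
values = rng.rand(12)  # made-up monthly figures

with PdfPages("report_sketch.pdf") as pdf:
    # Page 1: a title page made only of text on an empty A4-sized figure
    fig = plt.figure(figsize=(8.27, 11.69))
    fig.text(0.5, 0.6, "Monthly Report (sample data)", ha="center", fontsize=24)
    fig.text(0.5, 0.5, "Generated with matplotlib PdfPages", ha="center", fontsize=12)
    pdf.savefig(fig)
    plt.close(fig)

    # Page 2: a plot with an annotation pointing at the largest value
    fig, ax = plt.subplots()
    ax.plot(values, marker="o")
    peak = int(np.argmax(values))
    ax.annotate("peak", xy=(peak, values[peak]),
                xytext=(peak, values[peak] + 0.1),
                arrowprops=dict(arrowstyle="->"))
    ax.set_title("Sample series")
    pdf.savefig(fig)
    plt.close(fig)

    page_count = pdf.get_pagecount()  # number of pages written so far

print(page_count)
```

Closing each figure after saving it keeps memory use flat when a report runs to many pages.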
