# Gotchas in Python For Matlab Users

“What an odd place to put these comments!” This was my first thought when I saw this. With the goal of going from Apprentice to Guru in mind, I was browsing through some Python code and came upon this Python file by Roland Memisevic.

There is a lot of useful information here for anyone moving to Python from MATLAB.

```
No switch statement in python (before 3.10, which added match/case)!

Annoying things in python. The difference between matrix and array behavior. Try
a = np.asmatrix(randn(100, 1))
b = a.T*a
b.shape

a = np.asarray(randn(100, 1))
b = a.T*a
b.shape

# explanation ... * changes behavior between the two. For a matrix it is matrix multiplication, so b is 1 x 1. For an array it is elementwise with automatic tiling, so b is 100 x 100.

We can be more explicit. dot() gives matrix multiplication for arrays.

a = np.asmatrix(randn(100, 1))
b = dot(a.T, a)
b.shape

a = np.asarray(randn(100, 1))
b = dot(a.T, a)
b.shape

What happens with a.T*a in the array case? We can force array behavior with multiply.

a = np.asmatrix(randn(100, 1))
b = multiply(a.T, a)
b.shape

a = np.asarray(randn(100, 1))
b = multiply(a.T, a)
b.shape

multiply corresponds to .* in MATLAB, but it has the added useful/confusing behavior that it automatically tiles its arguments to matching shapes before multiplying.

Consider this MATLAB construct.

a = exp(randn(10, 400))
suma = sum(a, 1)
b = a./repmat(suma, 10, 1)
size(b)
Note the repmat in there. Instead in python this can be done with:

a = np.exp(randn(10, 400))
b = a/a.sum(0)
b.shape

Here, sum is summing over the first dimension (indexes start from 0 in python) and the division is automatically doing the repmat (tiling) for us! Neat eh? This also works with matrices,

a = np.asmatrix(np.exp(randn(10, 400)))
b = a/a.sum(0)
b.shape

Of course we should use things in design matrix format, so we have

a = np.asmatrix(np.exp(randn(400, 10)))
b = a/a.sum(1)
b.shape

And finally, let's just check that works with arrays ...

a = np.asarray(np.exp(randn(400, 10)))
b = a/a.sum(1)
b.shape

It doesn't work ... the problem is that for an array the result of the sum is a one dimensional array of shape (400,), and the automatic repmat cannot line that up against (400, 10)!!

These behaviours are nasty because your code will work/fail simply dependent on whether someone has fed you an array or a matrix.

The automatic repmat tiling can also be a pain ... as it happens automatically, it can hide dimension errors.

Other Gotchas
=============

a = [1 2 3 4; 5 6 7 8];
reshape(a, 4, 2)

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
print(a.reshape(4, 2))  # reshape returns the result; a itself keeps its shape

The results differ because NumPy follows C. Row-major order is used in C and NumPy; column-major order is used in Fortran and MATLAB.

Fix is to use

a.reshape((4, 2), order='F')

Same issues apply to "flatten".

a = [1 2; 3 4]
a(:)'

np.array([[1, 2], [3, 4]]).flatten()

np.array([[1, 2], [3, 4]]).flatten('F').T  # 'F' gives MATLAB's column-major order; .T does nothing to a 1-D result

Again the issue of arrays changing dimension rears its head here.

a = ones(1, 10)
b = ones(10, 10)
c = [b(:)' a]

a = np.ones((1, 10))
b = np.ones((10, 10))
c = r_[b.flatten('F').T, a]  # fails: the flattened b is 1-D but a is 2-D; using a.flatten() makes them match

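zeros and randn take their shape arguments differently

# Python
np.zeros(1, 10)            # error: the shape must be a single tuple
np.zeros((1, 10))          # 1 x 10 array
np.zeros(10)               # one dimensional, length 10

np.random.randn((1, 10))   # error: dimensions must be separate arguments
np.random.randn(1, 10)     # 1 x 10 array
np.random.randn(10)        # one dimensional, length 10

% MATLAB
randn(10)                  % 10 x 10 matrix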

Indexing
--------

Similar to matlab, but ranges in python stop before the highest number:

a = [1 2 3 4];
a(1:3)

a = np.array([1, 2, 3, 4])
print(a[0:3])

Also beware that the step parameter comes at the end in numpy.

a = 1:10:200

a = r_[1:200:10]

The end value in MATLAB is replaced with -1. Any negative number is considered to be indexing from the end, i.e. -2 is end-1, -3 is end-2 etc. A slice still stops before its end number, though, so a[0:-1] leaves out the last element ... use a[0:] to go to the end.

To reverse the indexing of an array

a(end:-1:1)

becomes

a[::-1]

Beware the difference between

a = [1 2; 3 4]
a(1) = 0.0
a

and

a = np.array([[1, 2], [3, 4]])
a[0] = 0.0
print(a)  # the whole first row is now zero, not just the first element

This can catch you out with an array created as

a = np.random.randn(1, 40)
print(a[0])  # the whole row of 40 values, not the first element

It is particularly confusing as for one dimensional arrays it works fine ... but the problems start depending on whether you begin with a = zeros(18) or a = zeros((1, 18))

np.asarray(randn(100, 1)).sum(0)

np.asarray(randn(100, 1)).sum()

Editing
=======

After editing a module you need to reload it (importlib.reload(module) in Python 3).

Plotting
========

plot(plotvals, y, 'k-', 'linewidth', 2) becomes
pp.plot(plotvals.T, y.T, 'k-', linewidth=2)

Matplotlib seems to accept only arrays not matrices!

cov
===

By default np.cov assumes things are the wrong way around: variables in rows, observations in columns.

cov(randn(100, 2))

np.cov(np.random.randn(100, 2))

use np.cov(np.random.randn(100, 2), rowvar=0)

Rank in MATLAB and Python
=========================
In MATLAB rank estimates the rank of a matrix through svd

rank([0, 1, 2, 3; 0, 2, 4, 6; 3, 8, 2, 3; 4, 2, 1, 5])

equivalent to

A = [0, 1, 2, 3; 0, 2, 4, 6; 3, 8, 2, 3; 4, 2, 1, 5]
s = svd(A)
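tol = max(size(A))*eps(max(s))
r = sum(s > tol)

in python np.rank gave the number of dimensions (newer NumPy releases removed it; np.ndim does the same job)

np.ndim([[0, 1, 2, 3],[0, 2, 4, 6],[3, 8, 2, 3],[ 4, 2, 1, 5]])

for the matrix rank use

np.linalg.matrix_rank([[0, 1, 2, 3],[0, 2, 4, 6],[3, 8, 2, 3],[ 4, 2, 1, 5]])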

LAMBDA
======

lambda is a keyword in python, so it cannot be used as a variable name

Tile and Repmat
===============
a = [1; 2; 3; 4]
size(a)
repmat(a, [2, 3])
size(a)

a = np.array([[1], [2], [3], [4]])
a.shape
np.tile(a, (2, 3))

Mgrid
=====

Returns arguments in a different order from meshgrid.
[X, Y] = meshgrid(0:3, 0:4)
Y, X = mgrid[0:5, 0:4]

Bizarre behaviour (bug?)
========================

a = [1; 2; 3; 4]
size(a)
b = repmat(a, [1, 1, 2])
size(b)

a = np.array([[1], [2], [3], [4]])
a.shape
b = np.tile(a, (1, 1, 2))
b.shape

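This is not a bug: np.tile adds the missing axis at the front, so a is treated as 1 x 4 x 1 and b has shape (1, 4, 2). To match MATLAB, add the trailing axis first:

b = np.tile(a[:, :, np.newaxis], (1, 1, 2))

which gives shape (4, 1, 2).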

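Sum
===

sum(np.random.randn(100, 3), 1).shape     # builtin sum: adds up the rows starting from 1, shape (3,)
np.sum(np.random.randn(100, 3), 1).shape  # numpy sum over axis 1, shape (100,)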

```
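
The notes on `*`, `dot()` and `multiply` can be checked with plain arrays alone. The sketch below is not from the original file: it assumes a recent NumPy and uses the `@` operator (Python 3.5+), which the notes predate, as the explicit matrix product. The seed and variable names are only illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((100, 1))   # plain 2-D array: 100 rows, 1 column

elementwise = a.T * a               # broadcasts (1, 100) against (100, 1)
inner = a.T @ a                     # matrix product of (1, 100) and (100, 1)
inner_dot = np.dot(a.T, a)          # same product, spelled with dot()
outer = np.multiply(a.T, a)         # explicit elementwise form of a.T * a

print(elementwise.shape, inner.shape, outer.shape)
```

Sticking to arrays and writing `@` whenever a matrix product is meant avoids the situation where the same expression changes meaning depending on the type of the input.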
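
For the row-normalisation example that fails with arrays, one common way around it is to keep the summed axis so that the result stays two dimensional. This is a minimal sketch assuming a recent NumPy; `keepdims` is a real argument of `sum`, while the variable names are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
a = np.exp(rng.standard_normal((400, 10)))   # design-matrix layout: 400 rows

# keepdims=True leaves the summed axis in place with length 1,
# so the result has shape (400, 1) and tiles across the 10 columns.
row_sums = a.sum(axis=1, keepdims=True)
b = a / row_sums

# Equivalent: add the missing axis back by hand.
also_b = a / a.sum(axis=1)[:, np.newaxis]

print(row_sums.shape, b.shape)
```

Each row of `b` then sums to one, which is what the MATLAB `repmat` version computes.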
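
The row-major versus column-major difference in `reshape` and `flatten` is easiest to see on the small matrix from the notes. The sketch below assumes a recent NumPy, where the ordering is passed as the `order` argument.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

c_order = a.reshape((4, 2))             # NumPy default: read and fill row by row
f_order = a.reshape((4, 2), order='F')  # MATLAB-style: read and fill column by column
flat_f = a.flatten(order='F')           # counterpart of MATLAB's a(:)'

print(c_order)
print(f_order)
print(flat_f)
```

Only the `order='F'` results reproduce what MATLAB's `reshape(a, 4, 2)` and `a(:)'` return for the same matrix.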
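
The `cov` and rank notes can be checked in a few lines. This sketch assumes a recent NumPy, where `np.rank` is no longer available, so it uses `np.ndim` for the number of dimensions and `np.linalg.matrix_rank` for the SVD-based rank. The random data and names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((100, 2))      # 100 observations of 2 variables

cov_default = np.cov(x)                # treats each of the 100 rows as a variable
cov_matlab = np.cov(x, rowvar=False)   # treats columns as variables, like MATLAB's cov

A = np.array([[0, 1, 2, 3],
              [0, 2, 4, 6],
              [3, 8, 2, 3],
              [4, 2, 1, 5]])
n_dims = np.ndim(A)                    # number of array dimensions
rank = np.linalg.matrix_rank(A)        # rank estimated from the singular values

print(cov_default.shape, cov_matlab.shape, n_dims, rank)
```

The second row of `A` is twice the first, so the matrix rank comes out as 3 even though the array has 2 dimensions.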
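
The `tile` and `mgrid` notes both come down to where NumPy places axes. The sketch below, assuming a recent NumPy, shows the shape `np.tile` produces when it has to add an axis, one way to get MATLAB's `repmat(a, [1, 1, 2])` shape, and how `np.meshgrid` lines up with `mgrid`.

```python
import numpy as np

a = np.array([[1], [2], [3], [4]])                     # shape (4, 1)

front = np.tile(a, (1, 1, 2))                          # new axis is prepended
trailing = np.tile(a[:, :, np.newaxis], (1, 1, 2))     # new axis added at the end first

# np.meshgrid follows MATLAB's argument order; mgrid returns the row grid first.
X, Y = np.meshgrid(np.arange(4), np.arange(5))
Y2, X2 = np.mgrid[0:5, 0:4]

print(front.shape, trailing.shape, X.shape)
```

With the trailing axis added by hand, the result has the 4 x 1 x 2 shape that MATLAB gives, and `np.meshgrid` can be used directly when the MATLAB argument order is wanted.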