“What an odd place to put these comments!” That was my first thought when I saw this. With the goal of Apprentice to Guru in mind, I was browsing through some Python code and came upon this Python file by Roland Memisevic.

It has a lot of useful information for anyone moving to Python from MATLAB.

No switch statement in Python!

Annoying things in Python.

The difference between matrix and array behavior. Try

    import numpy as np

    a = np.asmatrix(np.random.randn(100, 1))
    b = a.T*a
    b.shape    # (1, 1)

    a = np.asarray(np.random.randn(100, 1))
    b = a.T*a
    b.shape    # (100, 100)

Explanation: * changes behavior between the two. For a matrix it is matrix multiplication. For an array it isn't.

We can be more explicit. np.dot() gives matrix multiplication for arrays.

    a = np.asmatrix(np.random.randn(100, 1))
    b = np.dot(a.T, a)
    b.shape    # (1, 1)

    a = np.asarray(np.random.randn(100, 1))
    b = np.dot(a.T, a)
    b.shape    # (1, 1)

What happens with a.T*a in the array case? It is an element-wise multiply, and we can force that array behavior with np.multiply.

    a = np.asmatrix(np.random.randn(100, 1))
    b = np.multiply(a.T, a)
    b.shape    # (100, 100)

    a = np.asarray(np.random.randn(100, 1))
    b = np.multiply(a.T, a)
    b.shape    # (100, 100)

This corresponds to .* in MATLAB, but it has the added useful/confusing behavior that it automatically tiles its arguments to form the multiplication. Consider this MATLAB construct.

    a = exp(randn(10, 400))
    suma = sum(a, 1)
    b = a./repmat(suma, 10, 1)
    size(b)

Note the repmat in there. In Python this can instead be done with:

    a = np.exp(np.random.randn(10, 400))
    b = a/a.sum(0)
    b.shape

Here, sum is summing over the first dimension (Python indexes start from 0) and automatically doing the repmat (tiling) for us! Neat, eh? This also works with matrices,

    a = np.asmatrix(np.exp(np.random.randn(10, 400)))
    b = a/a.sum(0)
    b.shape

Of course we should keep things in design matrix format, so we have

    a = np.asmatrix(np.exp(np.random.randn(400, 10)))
    b = a/a.sum(1)
    b.shape

And finally, let's just check that works with arrays ...

    a = np.asarray(np.exp(np.random.randn(400, 10)))
    b = a/a.sum(1)
    b.shape

It doesn't work ... the problem is that the result of sum on an array is a one-dimensional array, here of shape (400,), and you can't do the automatic repmat!!

These behaviours are nasty because your code will work or fail simply depending on whether someone has fed you an array or a matrix. The automatic repmat tiling can also be a pain, as it happens silently and can hide dimension errors.
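One way to make the row normalisation above behave the same for arrays is to keep the summed axis in place. This is a minimal sketch, assuming a NumPy version whose `sum` accepts `keepdims`; the helper name `normalise_rows` is hypothetical and not from the original file.

```python
import numpy as np

def normalise_rows(a):
    """Divide each row of a 2-D array by its row sum (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    # keepdims=True leaves the summed axis in place with length 1,
    # so the (n, 1) result broadcasts against the (n, d) input.
    return a / a.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
a = np.exp(rng.standard_normal((400, 10)))
b = normalise_rows(a)
print(b.shape)
```

Because the input is converted with `np.asarray`, the sketch always returns a plain array, whether it was fed an array or a matrix.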
Other Gotchas
=============

    a = [1 2 3 4; 5 6 7 8];
    reshape(a, 4, 2)

    a = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
    print(a.reshape(4, 2))

The two give different results because Python follows C. Row-major order is used in C and Python; column-major order is used in Fortran and MATLAB. The fix is to use

    a.reshape(4, 2, order='F')

The same issue applies to "flatten".

    a = [1 2; 3 4]
    a(:)'

    np.array([[1, 2], [3, 4]]).flatten()

Instead you have to use

    np.array([[1, 2], [3, 4]]).flatten('F').T

Again the issue of arrays changing dimension rears its head here.

    a = ones(1, 10)
    b = ones(10, 10)
    c = [b(:)' a]

    a = np.ones((1, 10))
    b = np.ones((10, 10))
    c = np.r_[b.flatten('F').T, a]    # fails: b.flatten('F') is 1-D, a is 2-D

Zeros and randn have different behavior.

    # Python
    np.zeros(1, 10)              # error: the shape must be a tuple
    np.zeros((1, 10))            # 1 x 10
    np.zeros(10)                 # one-dimensional, length 10
    np.random.randn((1, 10))     # error: the dimensions must be separate arguments
    np.random.randn(1, 10)       # 1 x 10
    np.random.randn(10)          # one-dimensional, length 10

    % MATLAB
    randn(10)                    % 10 x 10

Indexing
--------

Similar to MATLAB, but ranges in Python start from 0 and stop before the highest number:

    a = [1 2 3 4];
    a(1:3)

    a = np.array([1, 2, 3, 4])
    print(a[0:3])

Also beware that the step parameter comes at the end in numpy.

    a = 1:10:200

    a = np.r_[1:200:10]

The end value in MATLAB is replaced with -1. Any negative number is taken as indexing from the end, i.e. -2 is end-1, -3 is end-2, etc. A range still stops before its end number, though, so a[0:-1] leaves out the last element; use a[0:] to go to the end.

To reverse the indexing of an array, a(end:-1:1) becomes a[::-1].

Beware the difference between

    a = [1 2; 3 4]
    a(1) = 0.0
    a

and

    a = np.array([[1, 2], [3, 4]])
    a[0] = 0.0
    print(a)

In MATLAB a single element is set; in numpy the whole first row is set. This can catch you out if the array was created with

    a = np.random.randn(1, 40)
    print(a[0])

It is particularly confusing because for one-dimensional arrays it works fine ... the problems start depending on whether you begin by saying

    a = np.zeros(18)

vs

    a = np.zeros((1, 18))

Similarly, compare

    np.asarray(np.random.randn(100, 1)).sum(0)    # array of shape (1,)
    np.asarray(np.random.randn(100, 1)).sum()     # scalar

Editing
=======

After editing modules you need to reload the module.

Plotting
========

    plot(plotvals, y, 'k-', 'linewidth', 2)

becomes

    pp.plot(plotvals.T, y.T, 'k-', linewidth=2)

Matplotlib seems to accept only arrays, not matrices!
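The row-major versus column-major point above can be checked directly. This is a small sketch using only the `order` argument of NumPy's `reshape` and `flatten`; the variable names are just for illustration.

```python
import numpy as np

a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8]])

# Default (row-major, C order): elements are read along the rows.
c_order = a.reshape(4, 2)

# Column-major (Fortran order): matches MATLAB's reshape(a, 4, 2).
f_order = a.reshape(4, 2, order='F')

# Column-major flatten: matches MATLAB's a(:)' for a = [1 2; 3 4].
flat_f = np.array([[1, 2], [3, 4]]).flatten(order='F')

print(c_order)
print(f_order)
print(flat_f)
```

Passing `order` by keyword avoids any ambiguity about which positional arguments are part of the shape.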
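cov
===

The cov command assumes things are the wrong way around: by default np.cov treats each row as a variable, whereas MATLAB treats each column as one.

    cov(randn(100, 2))

    np.cov(np.random.randn(100, 2))

use

    np.cov(np.random.randn(100, 2), rowvar=False)

Rank in MATLAB and Python
=========================

In MATLAB, rank estimates the rank of a matrix through the SVD.

    rank([0, 1, 2, 3; 0, 2, 4, 6; 3, 8, 2, 3; 4, 2, 1, 5])

is equivalent to

    A = [0, 1, 2, 3; 0, 2, 4, 6; 3, 8, 2, 3; 4, 2, 1, 5]
    s = svd(A)
    tol = max(size(A))*eps(max(s))
    r = sum(s > tol)

In Python, np.rank gives the number of dimensions instead.

    np.rank([[0, 1, 2, 3], [0, 2, 4, 6], [3, 8, 2, 3], [4, 2, 1, 5]])

(np.rank has since been deprecated and removed from NumPy; np.ndim returns the same number, and np.linalg.matrix_rank computes an SVD-based rank.)

LAMBDA
======

lambda is a keyword in Python, so it cannot be used as a variable name.

Tile and Repmat
===============

    a = [1; 2; 3; 4]
    size(a)
    repmat(a, [2, 3])
    size(a)

    a = np.array([[1], [2], [3], [4]])
    a.shape
    np.tile(a, (2, 3))

Mgrid
=====

Returns arguments in a different order from meshgrid. The first slice runs down the rows and, as with all Python ranges, the stop value is excluded.

    [X, Y] = meshgrid(0:3, 0:4)

    Y, X = np.mgrid[0:5, 0:4]

Bizarre behaviour (bug?)
========================

    a = [1; 2; 3; 4]
    size(a)
    b = repmat(a, [1, 1, 2])
    size(b)                      % 4 1 2

    a = np.array([[1], [2], [3], [4]])
    a.shape
    b = np.tile(a, (1, 1, 2))
    b.shape                      # (1, 4, 2)

The shapes differ because np.tile prepends new axes when there are more repetitions than array dimensions, whereas repmat appends them. A fix I found was to do

    b = np.tile(a, (a.shape[0], a.shape[1], 2))

for this case. Note that the new axis is still prepended, so this gives shape (4, 4, 2) here rather than MATLAB's 4 x 1 x 2.

Sum
===

The built-in sum and np.sum treat their second argument differently: for the built-in it is a start value, for np.sum it is the axis.

    sum(np.random.randn(100, 3), 1).shape       # (3,)
    np.sum(np.random.randn(100, 3), 1).shape    # (100,)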
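Two of the points above, the rank and the three-dimensional tiling, can be sketched with current NumPy calls. This assumes `np.linalg.matrix_rank` as the SVD-based counterpart to MATLAB's rank, and adds the trailing axis by hand before tiling; neither choice comes from the original file.

```python
import numpy as np

A = np.array([[0, 1, 2, 3],
              [0, 2, 4, 6],
              [3, 8, 2, 3],
              [4, 2, 1, 5]])

# Number of array dimensions (what the old np.rank reported).
n_dims = np.ndim(A)

# SVD-based matrix rank, the closer analogue of MATLAB's rank(A).
svd_rank = np.linalg.matrix_rank(A)

# Add the third axis explicitly so that tile repeats along it,
# mirroring repmat(a, [1, 1, 2]) on a 4 x 1 column.
a = np.array([[1], [2], [3], [4]])
b = np.tile(a[:, :, np.newaxis], (1, 1, 2))

print(n_dims, svd_rank, b.shape)
```

The second row of `A` is twice the first, so the matrix is rank deficient even though it is a two-dimensional 4 x 4 array.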
