Tracking the History of a Single File in Your Git Repo

Today’s tip of the day is on git.

gitk is a batteries-included, simple GUI that can show you the state of your git repo at any point in time. It is very convenient and, for browsing history, much more intuitive than the git log and git reflog CLI commands.

I use it every day. gitk ships with git, which is a good thing: if you have git, you usually have this simple utility too.

Typing gitk in your bash or command line will open up this utility.

If you want to see all your branches you can use

gitk --all

This shows all the branches and structures.

But what if you want to see only the history of a single file? What if you want to track and piece through the history of just one file?

Well, there is one lesser-known command-line option to gitk which comes in handy: the “--” separator.

Here’s how to use it.

gitk -- single file

See the demo below
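For a quick text-only view of the same per-file history without opening the GUI, something like the following sketch may help. It assumes git is on the PATH and shells out to git log with --follow (so history is tracked across renames). The helper names file_history_cmd and file_history are made up for this illustration.

```python
import subprocess


def file_history_cmd(path):
    """Build a git log command limited to one file (hypothetical helper)."""
    # "--" separates options from paths, the same separator gitk uses.
    return ["git", "log", "--follow", "--oneline", "--", path]


def file_history(path, repo_dir="."):
    """Return one line per commit that touched `path`; assumes git is installed."""
    result = subprocess.run(
        file_history_cmd(path),
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


# Only print the command here; call file_history() inside a real repo.
print(" ".join(file_history_cmd("README.md")))
```

Calling file_history("README.md") inside a repository should give one short line per commit, newest first.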

Let me know in the comments below if you know any other commands that I don’t.


Fix Bloated Git Repo With these Commands

If you use git, eventually your repo will gather dust and bloat. I use git to manage my desktop, and over time this particular repo becomes bloated. These are the two commands that I come back to every quarter or so to keep things tidy.

git fsck

Verifies the connectivity and validity of the objects in the database.

git-fsck tests SHA-1 and general object sanity, and it does full tracking of the resulting reachability and everything else. It prints out any corruption it finds (missing or bad objects).

git gc

Cleanup unnecessary files and optimize the local repository

Runs a number of housekeeping tasks within the current repository, such as compressing file revisions (to reduce disk space and increase performance), removing unreachable objects which may have been created from prior invocations of git add, and packing refs, pruning reflog, metadata or stale working trees.
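To see whether the quarterly tidy-up actually shrinks anything, a small wrapper along these lines might be useful. It is only a sketch: it assumes git is on the PATH, uses git count-objects -v to read the repository statistics before and after, and the function names are invented for this example.

```python
import subprocess


def parse_count_objects(text):
    """Turn `git count-objects -v` output ("key: value" lines) into a dict."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            stats[key.strip()] = value.strip()
    return stats


def repo_stats(repo_dir="."):
    """Read object counts and sizes for the repository; assumes git is installed."""
    out = subprocess.run(
        ["git", "count-objects", "-v"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    ).stdout
    return parse_count_objects(out)


def tidy(repo_dir="."):
    """Run git fsck, then git gc, and return the stats before and after."""
    before = repo_stats(repo_dir)
    subprocess.run(["git", "fsck"], cwd=repo_dir, check=True)
    subprocess.run(["git", "gc"], cwd=repo_dir, check=True)
    return before, repo_stats(repo_dir)
```

Comparing the "size" and "size-pack" entries of the two dictionaries returned by tidy() gives a rough idea of how much space was reclaimed.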

Am I doing something wrong? Is there a better way? Please let me know in the comments.


Use H5REPACK to Reclaim the Space

h5repack is a command-line tool that applies HDF5 filters to an input file file1, saving the output in a new file, file2.

Removing entire nodes (groups or datasets) from an hdf5 file should be no problem.
However, if you want to reclaim the space you have to run the h5repack tool.

h5repack -h

From the hdf5 docs:

Deleting a Dataset from a File and Reclaiming Space

HDF5 does not at this time provide an easy mechanism to remove a dataset from a file or to reclaim the storage space occupied by a deleted object.

Removing a dataset and reclaiming the space it used can be done with the H5Ldelete function and the h5repack utility program. With the H5Ldelete function, links to a dataset can be removed from the file structure. After all the links have been removed, the dataset becomes inaccessible to any application and is effectively removed from the file. The way to recover the space occupied by an unlinked dataset is to write all of the objects of the file into a new file. Any unlinked object is inaccessible to the application and will not be included in the new file. Writing objects to a new file can be done with a custom program or with the h5repack utility program.
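As a rough illustration of the repacking step, the sketch below calls h5repack with an input and an output file and reports the sizes. It assumes the HDF5 command-line tools are installed and on the PATH; repack_cmd and repack are hypothetical helper names, not part of any HDF5 API.

```python
import os
import shutil
import subprocess


def repack_cmd(src, dst):
    """h5repack takes the input file followed by the output file."""
    return ["h5repack", src, dst]


def repack(src, dst):
    """Write a compacted copy of `src` to `dst`; returns (old_size, new_size)."""
    if shutil.which("h5repack") is None:
        raise RuntimeError("h5repack not found on PATH")
    subprocess.run(repack_cmd(src, dst), check=True)
    return os.path.getsize(src), os.path.getsize(dst)


# Only print the command here; call repack() on a real HDF5 file.
print(" ".join(repack_cmd("file1.h5", "file2.h5")))
```

If datasets were unlinked from file1.h5 beforehand, the second size returned by repack() should be the smaller one.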


To Get Conda in cmd System-Wide

You have installed Miniconda3 on your Windows (10/7) laptop. You can run python through the Anaconda Prompt, but python is not recognised in the Windows Command Prompt.

From Windows Command Prompt if you type in ‘conda info’ you get this because it doesn’t even recognise conda:

‘conda’ is not recognized as an internal ….

How to solve this?

Having conda available on the command line is useful. Opening an Anaconda Prompt just to access conda utilities is a hassle, so I always make conda available in cmd system-wide.

Here are the two steps to follow.

To get conda in cmd system-wide

Step 1

If Anaconda is installed for the current user only, add %USERPROFILE%\Anaconda3\condabin (I mean condabin, not Scripts) into the environment variable PATH (the user one). If Anaconda is installed for all users on your machine, add C:\ProgramData\Anaconda3\condabin into PATH.

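How do I set environment variables on Windows?

set path=%path%;%USERPROFILE%\Anaconda3\condabin

Note that set only changes PATH for the current cmd session; to make the change permanent, edit PATH under System Properties > Environment Variables.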

Step 2
Open a new PowerShell or CMD window and run the following command once to initialize conda.

conda init


These steps make sure the conda command is exposed to both cmd.exe and PowerShell.
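To double-check the result, a few lines of Python can confirm whether the condabin folder really ended up on PATH. This is just a sketch: on_path is a made-up helper, and the comparison is case-insensitive because Windows paths are.

```python
import ntpath
import shutil


def on_path(directory, path_value, sep=";"):
    """True if `directory` is one of the entries in a Windows-style PATH string."""
    def norm(p):
        # Normalise slashes, trailing separators and case before comparing.
        return ntpath.normcase(ntpath.normpath(p.strip().strip('"')))

    wanted = norm(directory)
    return any(norm(entry) == wanted for entry in path_value.split(sep) if entry.strip())


# shutil.which returns the full path of the executable, or None if it is not found.
print("conda found at:", shutil.which("conda"))
```

On the Windows machine itself, on_path(os.path.expandvars(r"%USERPROFILE%\Anaconda3\condabin"), os.environ["PATH"]) (after import os) should return True once Step 1 has taken effect.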

Hope this helps someone.


Using Git Patch for Temporary Changes

Here’s a case that comes up often when you are working on code that is being developed at multiple locations and by multiple developers.

To do a full integration test on my local system, I need to change some of the paths in the test files. I made these changes once and then saved them as a patch.

To save the diff as a patch

git diff -p > local_test.patch

Applying the patch when doing the local testing

git apply local_test.patch

After the test, I reset the working tree back to its original state. Note that this discards all uncommitted changes, not just the patch.

git reset --hard
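If other uncommitted work needs to survive the test run, one possible variation is to reverse just the patch with git apply -R instead of resetting everything. The sketch below wraps that in a context manager; temporary_patch is a hypothetical helper, and the run parameter exists only so the git calls can be swapped out when trying it without a repository.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def temporary_patch(patch_file, run=subprocess.run):
    """Apply a patch for the duration of a `with` block, then reverse it."""
    run(["git", "apply", patch_file], check=True)
    try:
        yield
    finally:
        # -R reverses only this patch, leaving other uncommitted work alone.
        run(["git", "apply", "-R", patch_file], check=True)


# Example use (the test command itself is an assumption):
#     with temporary_patch("local_test.patch"):
#         subprocess.run(["python", "-m", "unittest"], check=False)
```

The reverse step will fail if the test run itself edits the patched lines, in which case git reset --hard remains the fallback.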

This has helped me a lot. I have multiple such small patches in my local repo just to do these quick transitory changes to the code base for a specific location.

Do you know of any use case similar to this? Please let me know.

Taking a Break with Schtasks

Taking a break from the computer has become essential in this WFH environment. In the office, it was natural and happened often but at home, this is not the case.

I have tried many things.

First, I set up a timer displayed on my desktop to remind me to take a break. This failed because resetting the countdown timer every time added enough friction to render the whole exercise fruitless.

Then I tried to use the Outlook calendar to remind me to take a break. This failed because I slowly started ignoring the calendar reminders.

I needed something more visual. So this is my current setup, all done using standard available tools.

No software or installation required.

schtasks /create /tn take_a_break /sc HOURLY /mo 2 /st 08:30 /et 18:00 /tr "https://large-type.com/#Take%20a%20Break"

Every 2 hours it launches the default browser with the big fonts telling me to take a break. Much more intuitive and useful than anything I have tried earlier.
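The %20 sequences in that URL are just percent-encoded spaces. If you want a different message, a sketch like the following can build the URL and the full command for you. The helper names are invented here, and the schedule values simply mirror the command above.

```python
from urllib.parse import quote


def large_type_url(message):
    """Percent-encode the message (spaces become %20) and append it to the URL."""
    return "https://large-type.com/#" + quote(message)


def break_task_command(message, task_name="take_a_break", every_hours=2):
    """Assemble the schtasks command line as a string; nothing is executed."""
    return (
        f"schtasks /create /tn {task_name} /sc HOURLY /mo {every_hours} "
        f'/st 08:30 /et 18:00 /tr "{large_type_url(message)}"'
    )


print(break_task_command("Take a Break"))
```

One caveat: if the printed command is pasted into a .bat file rather than typed at the prompt, each % in the URL probably needs to be doubled (%%20), since batch files treat %2 as a script argument.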

Another beauty of the system is that if this doesn’t get me started, I can configure this to lock my screen at that moment. A slight inconvenience but something that I will try if this big pop up doesn’t work.

How do you deal with this?

Add an app to run automatically at startup in Windows 10

Last summer I changed my office laptop after a long time. I had been using Windows 7 on the old one, and the new system came with Windows 10.

Windows 10 is nice, but there are a few quirks that are intimidating for any new user moving to it. One such simple thing is running an app automatically at startup.

In previous versions of Windows, it was simple to navigate to the Startup folder and place a shortcut to your app there so that it starts on startup.

But in Windows 10, this has changed. Now, to get to the Startup folder, you need to type shell:startup in the Run dialog.

Here’s a gif of the same in action.

Steps

  • Press the Windows logo key + R.
  • Type shell:startup, then select OK. This opens the Startup folder.
  • Copy and paste the shortcut to the app from the file location to the Startup folder.
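The same idea can be scripted. For the current user, shell:startup points at %APPDATA%\Microsoft\Windows\Start Menu\Programs\Startup, so a sketch like the one below could drop a small launcher there. It writes a .bat file rather than a real shortcut (a deliberate simplification), the helper names are made up, and the demo runs against a temporary directory instead of the real profile.

```python
import os
import tempfile


def startup_dir(appdata=None):
    """Per-user Startup folder, the location shell:startup opens."""
    appdata = appdata or os.environ.get("APPDATA", "")
    return os.path.join(appdata, "Microsoft", "Windows", "Start Menu", "Programs", "Startup")


def add_startup_launcher(name, command, appdata=None):
    """Write a tiny .bat launcher into the Startup folder (hypothetical helper)."""
    folder = startup_dir(appdata)
    os.makedirs(folder, exist_ok=True)
    target = os.path.join(folder, name + ".bat")
    with open(target, "w") as fh:
        # `start ""` launches the command without keeping the console window open.
        fh.write('@echo off\nstart "" ' + command + "\n")
    return target


# Dry run against a throwaway directory instead of the real %APPDATA%.
demo_root = tempfile.mkdtemp()
launcher = add_startup_launcher("notes", "notepad.exe", appdata=demo_root)
print(launcher)
```

Calling add_startup_launcher without the appdata argument on a Windows machine would target the real Startup folder.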

Have you faced this or something similar in your move to Windows 10?

For Magic in CMD

Here is a Windows batch command that has come in handy for me on a number of occasions. It has been a real time saver; most recently I used it to run a machine learning model on a number of bank statements stored in various Excel workbooks in a folder.

Another time, I had to convert and process a large number of FEM models stored across a drive.

Here’s the command

for /f %e in ('dir *.pm /b/s') do echo %e

echo %e is a placeholder for the command you want to run on each file.

dir *.pm /b/s lists the full paths of all *.pm files in the current directory and its subfolders.

Some specific examples:

for /f %e in ('dir *.xls /b/s') do python make_inference.py %e
for /f %e in ('dir d:\monte_carlo_inputs\*.pm /b/s') do call monte_carlo.bat %e -e exec.script
for /f %e in ('dir d:\*.fem /b/s') do auto_nlc -fem %e -j input.json
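Two caveats on the cmd version: by default for /f splits each line on spaces, so paths containing spaces need "delims=" (for /f "delims=" %e in ...), and inside a .bat file the variable is written %%e. If Python is at hand anyway, roughly the same loop can be sketched with pathlib; find_files and for_each are names invented for this example, and the demo uses a throwaway folder.

```python
import tempfile
from pathlib import Path


def find_files(root, pattern):
    """Recursively list files under `root`, much like `dir <pattern> /b/s`."""
    return sorted(p for p in Path(root).rglob(pattern) if p.is_file())


def for_each(root, pattern, action):
    """Call `action(path)` for every match and collect the results."""
    return [action(path) for path in find_files(root, pattern)]


# Demo on a temporary directory tree, including a folder name with a space.
demo = Path(tempfile.mkdtemp())
(demo / "sub dir").mkdir()
(demo / "a.pm").write_text("x")
(demo / "sub dir" / "b.pm").write_text("y")
(demo / "notes.txt").write_text("z")

names = for_each(demo, "*.pm", lambda p: p.name)
print(names)
```

In real use, the lambda would be replaced with a call such as subprocess.run(["python", "make_inference.py", str(p)]).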

Hope this helps. I know it helps me a lot every time I have to do something repeatedly.

A Stupid Mistake and How to Recover from git reset --hard

Pre-pandemic, I had a desktop and a laptop assigned to me. I always preferred the desktop, as the laptop seemed too slow in comparison.

But in mid-March 2020, everything changed: the desktop was left alone in the office and the neglected laptop became the primary workhorse.

While migrating my work from desktop to laptop, I dumped a few things on my desktop and, as I was in the habit of keeping a version-controlled desktop, I ran these two commands.

git init
git add *

As soon as the command completed, I realized that I did not want some of the files under git control, so I instinctively did this.

git reset --hard

This deleted all the files on the desktop. At that point I realized my stupid mistake: a quarter’s worth of work was gone. I checked the recycle bin and there was nothing.

After a frantic, panic-driven hour, I finally found the solution and got back my accidentally deleted files.

Solution:

git fsck --lost-found

Now go to the .git/lost-found folder and you should see the files, but they will be named by their hashes. You need to save them individually.

So the next day was spent going through the lost-found folder and restoring the files, which essentially meant renaming the folders and files and getting them out of lost-found.

Glad to have found this; otherwise, the first few weeks of the pandemic were destined to be spent recreating the work done in January and February.
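Because the recovered files carry only hash names, peeking at the first few bytes of each one makes the renaming less painful. Here is a minimal sketch of that idea. It assumes the usual layout in which git fsck --lost-found writes dangling blobs under .git/lost-found/other; describe_recovered is a made-up helper and the demo uses a fake folder with an invented hash name.

```python
import tempfile
from pathlib import Path


def describe_recovered(lost_found_dir, preview_bytes=40):
    """Return (hash_name, size, preview) for each file written by fsck."""
    rows = []
    for path in sorted(Path(lost_found_dir).glob("*/*")):
        if path.is_file():
            head = path.read_bytes()[:preview_bytes]
            rows.append((path.name, path.stat().st_size, head))
    return rows


# Demo against a fake lost-found layout (the hash name is invented).
fake = Path(tempfile.mkdtemp()) / "lost-found"
(fake / "other").mkdir(parents=True)
(fake / "other" / "3b18e512").write_bytes(b"import numpy as np\n")

rows = describe_recovered(fake)
for name, size, head in rows:
    print(name, size, head)
```

Pointing describe_recovered at the real .git/lost-found folder gives a quick table to decide which hash was which file.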

Simple trick to read mixed format data with numpy’s genfromtxt

Suppose your data is like the following


DD,MM,YY,AMT,WHERE
30,5,2015,2,TRAVEL
30,5,2015,50,TRAVEL
30,5,2015,5,PHONE
31,5,2015,6.62,TESCO
31,5,2015,5,POUNDSHOP
31,5,2015,4.51,SAINSBURY
31,5,2015,1,PHONE

Let’s load it with numpy using genfromtxt


import numpy as np

expdata = np.genfromtxt(fname, delimiter=',', skip_header=1, usecols=[0, 1, 2, 3, 4])

print(expdata[0])
print(expdata[1])

This prints the following


[  3.00000000e+01   5.00000000e+00   2.01500000e+03   2.00000000e+00             NaN]
[   30.     5.  2015.    50.    NaN]

The numeric data is read, but not quite the way we wanted: the text column comes back as NaN.

You can give genfromtxt the data types explicitly using the dtype keyword, like

dtype=[('f0', '<i4'), ('f1', '<i4'), ('f2', '<i4'), ('f3', '<f8'), ('f4', '|S14')]


expdata = np.genfromtxt(fname, delimiter=',', skip_header=1, usecols=[0, 1, 2, 3, 4], dtype=[('f0', '<i4'), ('f1', '<i4'), ('f2', '<i4'), ('f3', '<f8'), ('f4', '|S14')])

(30, 5, 2015, 2.0, 'TRAVEL')
(30, 5, 2015, 50.0, 'TRAVEL')

That looks like too much work, but there’s a simple, smart way to do this: use dtype=None.


expdata=np.genfromtxt(fname, delimiter=',',skip_header=1,usecols=[0,1,2,3,4],dtype=None)

(30, 5, 2015, 2.0, 'TRAVEL')
(30, 5, 2015, 50.0, 'TRAVEL')

Using dtype=None is a good trick if you don’t know what your column types should be and you need some help in getting the format. This might be slower, but it does work, and once you have the data you can replace dtype=None with the appropriate explicit dtype.
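For anyone on Python 3 and a recent numpy, here is a self-contained sketch of the same trick. It reads the sample data from an in-memory string rather than a file, and passes encoding="utf-8" so that the text column comes back as ordinary strings instead of bytes; both choices are assumptions made for this example.

```python
import io

import numpy as np

csv_text = """DD,MM,YY,AMT,WHERE
30,5,2015,2,TRAVEL
30,5,2015,50,TRAVEL
30,5,2015,5,PHONE
31,5,2015,6.62,TESCO
31,5,2015,5,POUNDSHOP
31,5,2015,4.51,SAINSBURY
31,5,2015,1,PHONE
"""

# dtype=None lets genfromtxt pick a type per column; fields are named f0..f4.
expdata = np.genfromtxt(
    io.StringIO(csv_text),
    delimiter=",",
    skip_header=1,
    dtype=None,
    encoding="utf-8",
)

print(expdata.dtype)

# Sum the amount column (f3) per category (f4).
by_place = {}
for row in expdata:
    place = str(row["f4"])
    by_place[place] = by_place.get(place, 0.0) + float(row["f3"])
print(by_place)
```

Printing expdata.dtype is the handy part: it shows the inferred types, which can then be copied into an explicit dtype argument.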

Exploring Matplotlib Styles

Last week I got some free time and used it to upgrade my Python installation on my Mac, a long-awaited task.

Looking at the upgrade log, I was most excited to finally see the new version of matplotlib.

So I launched it and went straight to the new style package.

Matplotlib is great at graphs, but the default style before 1.4.3 left much to be desired.

The style package adds support for easy-to-switch plotting “styles” with the same parameters as a matplotlibrc file.

import matplotlib.pyplot as plt

What are the different styles available in matplotlib?

print(plt.style.available)

[u'dark_background', u'bmh', u'grayscale', u'ggplot', u'fivethirtyeight']

Here’s how to use this.

But first let’s generate some data

import numpy as np
data = np.sin(np.linspace(0, 2*np.pi))

The default plot

plt.plot(data, 'r-o')

[Figure: default matplotlib 1.4.3 style]

Let’s use ggplot

plt.style.use('ggplot') 
plt.plot(data, 'r-o')

[Figure: ggplot style]

Dark Background like excel 2007

plt.style.use('dark_background')
plt.plot(data, 'r-o')

[Figure: dark_background style]

BMH style

plt.style.use('bmh')
plt.plot(data, 'r-o')

[Figure: bmh style]

Grayscale style

plt.style.use('grayscale')
plt.plot(data, 'r-o')

[Figure: grayscale style]

fivethirtyeight Style

plt.style.use('fivethirtyeight')
plt.plot(data, 'r-o')

[Figure: fivethirtyeight style]

We can even add our own custom .mplstyle files to ~/.matplotlib/stylelib, or call plt.style.use with a URL pointing to a file with matplotlibrc settings. See the matplotlib documentation on style sheets to define your own style.
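One thing worth knowing is that plt.style.use changes the settings for the rest of the session, so each call above affects the plots that follow it. To try several styles side by side without them leaking into each other, plt.style.context can be used as a context manager. The sketch below renders the same data in three styles to in-memory PNGs; the render helper and the Agg backend choice are assumptions made so it runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # off-screen backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np

data = np.sin(np.linspace(0, 2 * np.pi))


def render(style_name):
    """Plot the data with one style applied only inside the with-block."""
    with plt.style.context(style_name):
        fig, ax = plt.subplots()
        ax.plot(data, "r-o")
        ax.set_title(style_name)
        buf = io.BytesIO()
        fig.savefig(buf, format="png")
        plt.close(fig)
    return buf.getvalue()


png_bytes = {name: render(name) for name in ["ggplot", "dark_background", "bmh"]}
print({name: len(blob) for name, blob in png_bytes.items()})
```

Each entry in png_bytes can be written to a file to compare the looks, and the default style is untouched once the loop finishes.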

A Simple but useful Python tip.

Want to convert a numpy array containing coords from

x1,y1,z1
x2,y2,z2
.
.
.
xn,yn,zn

to

x1
y1
z1
x2
y2
z2
….
xn
yn
zn

Example


print(coords)
[[ 0.    0.    0.  ]
 [ 0.25  0.    0.  ]
 [ 0.5   0.    0.  ]
 [ 0.75  0.    0.  ]
 [ 1.    0.    0.  ]
 [ 0.    0.25  0.  ]
 [ 0.25  0.25  0.  ]
 [ 0.5   0.25  0.  ]
 [ 0.75  0.25  0.  ]
 [ 1.    0.25  0.  ]
 [ 0.    0.5   0.  ]
 [ 0.25  0.5   0.  ]
 [ 0.5   0.5   0.  ]
 [ 0.75  0.5   0.  ]
 [ 1.    0.5   0.  ]
 [ 0.    0.75  0.  ]
 [ 0.25  0.75  0.  ]
 [ 0.5   0.75  0.  ]
 [ 0.75  0.75  0.  ]
 [ 1.    0.75  0.  ]
 [ 0.    1.    0.  ]
 [ 0.25  1.    0.  ]
 [ 0.5   1.    0.  ]
 [ 0.75  1.    0.  ]
 [ 1.    1.    0.  ]]


change = coords.reshape((-1,1))

print(change)
[[ 0.  ]
 [ 0.  ]
 [ 0.  ]
 [ 0.25]
 [ 0.  ]
 [ 0.  ]
 [ 0.5 ]
 [ 0.  ]
 [ 0.  ]
 [ 0.75]
 [ 0.  ]
 [ 0.  ]
 [ 1.  ]
 [ 0.  ]
 [ 0.  ]
 [ 0.  ]
 [ 0.25]
 [ 0.  ]
 [ 0.25]
 [ 0.25]
 [ 0.  ]
 [ 0.5 ]
 [ 0.25]
 [ 0.  ]
 [ 0.75]
 [ 0.25]
 [ 0.  ]
 [ 1.  ]
 [ 0.25]
 [ 0.  ]
 [ 0.  ]
 [ 0.5 ]
 [ 0.  ]
 [ 0.25]
 [ 0.5 ]
 [ 0.  ]
 [ 0.5 ]
 [ 0.5 ]
 [ 0.  ]
 [ 0.75]
 [ 0.5 ]
 [ 0.  ]
 [ 1.  ]
 [ 0.5 ]
 [ 0.  ]
 [ 0.  ]
 [ 0.75]
 [ 0.  ]
 [ 0.25]
 [ 0.75]
 [ 0.  ]
 [ 0.5 ]
 [ 0.75]
 [ 0.  ]
 [ 0.75]
 [ 0.75]
 [ 0.  ]
 [ 1.  ]
 [ 0.75]
 [ 0.  ]
 [ 0.  ]
 [ 1.  ]
 [ 0.  ]
 [ 0.25]
 [ 1.  ]
 [ 0.  ]
 [ 0.5 ]
 [ 1.  ]
 [ 0.  ]
 [ 0.75]
 [ 1.  ]
 [ 0.  ]
 [ 1.  ]
 [ 1.  ]
 [ 0.  ]]
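The -1 in the reshape call above tells numpy to work out that dimension from the array size. A smaller sketch, using just the first three points, shows the column form, the plain 1-D form from ravel, and the way back to one row per point.

```python
import numpy as np

coords = np.array([[0.0, 0.0, 0.0],
                   [0.25, 0.0, 0.0],
                   [0.5, 0.0, 0.0]])

column = coords.reshape(-1, 1)    # shape (9, 1): x1, y1, z1, x2, y2, z2, ...
flat = coords.ravel()             # shape (9,): same order as a 1-D array
restored = column.reshape(-1, 3)  # back to one (x, y, z) row per point

print(column.shape, flat.shape, restored.shape)
```

Since the values are read row by row, the x, y, z of each point stay together in the flattened result.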