Some tricks with pip install

TL;DR: Don't use sudo python setup.py install; use pip install -e ./ --user instead.

I regularly go through the following process:

  1. Python packages are confusing
  2. I should read the docs
  3. pip is cool
  4. I have no idea what I am doing

But this time I actually learned something useful.

Background 1: easy_install and pip

If you have spent any reasonable amount of time with Python, you have probably learned about easy_install and pip. You have probably also learned that easy_install is not good and should be replaced with pip in most cases.

Background 2: Working with Package Source

So pip install works fabulously with packages on PyPI, but that does not help if you work with the actual package source. If you are developing a Python package, there are three common ways to develop and test against the source itself.

  1. Work in the same directory as the package source
  2. Modify sys.path to add your working directory
  3. Install the package using sudo python ./setup.py install
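For reference, the second option usually amounts to a couple of lines at the top of a script. This is only a sketch: my_package_checkout is a hypothetical directory name standing in for wherever your working copy lives.

```python
import os
import sys

# Hypothetical location of the package working copy (an assumption).
SOURCE_DIR = os.path.abspath(os.path.join(os.getcwd(), "my_package_checkout"))

# Put the working copy at the front of the import search path so that
# "import <package>" picks it up before any installed copy.
if SOURCE_DIR not in sys.path:
    sys.path.insert(0, SOURCE_DIR)
```

Every script that needs the package has to repeat this, and it silently breaks as soon as the directory moves.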

One and two are just fragile and crazy, and if you have ever tried them you know somewhere in the back of your head that this is bad. Three is theoretically the right way to install and test Python packages, but it has a number of downsides.

  1. Requires sudo
  2. Leaves a build directory owned by root
  3. Needs to be re-run every time the source changes
  4. Can't easily uninstall packages installed this way

Useful Thing 1: Using pip for local source installation

sudo pip install ./ can be used instead of setup.py. This solves the second and fourth problems on our list: no build directory is left behind after the installation, and sudo pip uninstall <Package> will uninstall the package in one command.

Useful Thing 2: Installing in non-System paths

Adding the --user flag to pip install puts the installed package under ~/Library instead of /Library (these are the macOS locations). Because ~/Library is owned by the user and not root, there is no need for sudo.
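If you are not sure where the per-user location is on your machine, the standard library's site module can tell you. A small sketch; the paths it prints depend on your platform and Python build.

```python
import site
import sys

# Base directory for per-user installs and the site-packages directory inside it.
user_base = site.getuserbase()
user_site = site.getusersitepackages()

print("user base:         ", user_base)
print("user site-packages:", user_site)
# ENABLE_USER_SITE is False or None when the user site is disabled,
# which is commonly the case inside a virtual environment.
print("user site enabled: ", site.ENABLE_USER_SITE)
print("already on sys.path:", user_site in sys.path)
```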

Adding the -e flag to pip installs a symlink to the source instead of a byte-code compiled copy. It is not really a symlink, but it is a close enough analogy to understand what is happening.
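To make the "not really a symlink" part a little more concrete: one common mechanism behind this kind of install is a path entry (for example a .pth file in site-packages) that points Python at your source directory. The sketch below imitates that idea with throwaway directories. The names demo_editable_pkg and demo.pth are made up for the illustration, and this is not a claim about exactly what pip writes on your system.

```python
import importlib
import site
import sys
import tempfile
from pathlib import Path

# Avoid stale .pyc files confusing the reload in this short demo.
old_flag = sys.dont_write_bytecode
sys.dont_write_bytecode = True
try:
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "src"          # stands in for your working copy
        fake_site = Path(tmp) / "site"   # stands in for site-packages
        src.mkdir()
        fake_site.mkdir()

        module_file = src / "demo_editable_pkg.py"
        module_file.write_text('GREETING = "hello"\n')

        # A .pth file lists extra directories to add to sys.path.
        (fake_site / "demo.pth").write_text(str(src) + "\n")
        site.addsitedir(str(fake_site))

        import demo_editable_pkg
        first = demo_editable_pkg.GREETING

        # Edit the "source" and reload: no reinstall step is needed.
        module_file.write_text('GREETING = "hello, edited"\n')
        importlib.invalidate_caches()
        demo_editable_pkg = importlib.reload(demo_editable_pkg)
        second = demo_editable_pkg.GREETING
finally:
    sys.dont_write_bytecode = old_flag

print(first, "->", second)
```

Nothing was copied anywhere; the import simply resolves to the source directory, which is why edits show up immediately.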

Summary

Developing and testing packages locally can be greatly improved by using pip install -e ./ --user. This keeps the "installed" package up to date with the source, and it can be easily uninstalled with pip uninstall <Package>.
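If you want to drive the same install from a Python script, one option is to call pip as a module of the running interpreter, which guarantees that pip and python agree on where things go. A minimal sketch; the helper name editable_user_install_command is made up for this example, and the --user flag is typically rejected inside a virtual environment.

```python
import subprocess
import sys


def editable_user_install_command(source_dir="."):
    """Build the pip command for an editable, per-user install."""
    return [sys.executable, "-m", "pip", "install", "-e", source_dir, "--user"]


def install_editable(source_dir="."):
    """Run the install; raises CalledProcessError if pip fails."""
    subprocess.run(editable_user_install_command(source_dir), check=True)


print(" ".join(editable_user_install_command("./")))
```

Only the command is printed here; call install_editable("./") from the package directory to actually run it.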

Regex in Python

I ran across this neat little regex trick the other day. My goal was to match three kinds of lines with one regex and extract all of the data from each line. The lines are formatted as follows:

4 STRETCH   4  1                     2.0586112       1.0893702
5 BEND      4  1  2                  1.9052943     109.1653223
6 TORSION   4  1  2  3               3.1415927     180.0000000

The first column is the coordinate number (positive integer). The second column is the coordinate type (string). The third through sixth columns are atom numbers (positive integers). The seventh column is the coordinate value in either bohrs or radians (floating point). The eighth column is the coordinate value in either angstroms or degrees (floating point).

"(\d+) +([A-Z]+) +(\d+) +(\d+) +(\d+)? +(\d+)? +(-?\d+\.\d+) +(-?\d+\.\d+)"
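As a quick sanity check, here is the pattern run against the three sample lines from above. The names LONG_REGEX and SAMPLE_LINES are just labels chosen for this sketch.

```python
import re

LONG_REGEX = re.compile(
    r"(\d+) +([A-Z]+) +(\d+) +(\d+) +(\d+)? +(\d+)? +(-?\d+\.\d+) +(-?\d+\.\d+)"
)

SAMPLE_LINES = [
    "4 STRETCH   4  1                     2.0586112       1.0893702",
    "5 BEND      4  1  2                  1.9052943     109.1653223",
    "6 TORSION   4  1  2  3               3.1415927     180.0000000",
]

# Collect the captured groups for each line (None if a line does not match).
all_groups = []
for sample in SAMPLE_LINES:
    match = LONG_REGEX.search(sample)
    all_groups.append(match.groups() if match else None)

for groups in all_groups:
    print(groups)
```

Atom columns that are missing from a line come back as None, which is what makes it possible to tell the line types apart.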

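The regex is built from three kinds of pieces:

  1. \d+ matches any non-negative integer (one or more digits)
  2. [A-Z]+ matches any string of capital letters
  3. -?\d+\.\d+ matches a decimal number with an optional leading minus sign

import re

LONG_REGEX = r"(\d+) +([A-Z]+) +(\d+) +(\d+) +(\d+)? +(\d+)? +(-?\d+\.\d+) +(-?\d+\.\d+)"
line = "5 BEND      4  1  2                  1.9052943     109.1653223"

regex_groups = re.search(LONG_REGEX, line.strip())
if regex_groups:
	regex_groups = list(regex_groups.groups())
	if regex_groups[5]:
		kind = "torsion"  # all four atom numbers are present
	elif regex_groups[4]:
		kind = "bend"  # three atom numbers; the fourth is None
	else:
		kind = "stretch"  # only the two required atom numbers
else:
	kind = "other"  # the line did not match the regex at all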

The fifth and sixth columns are made optional in the regex with the ? character. The result is the ability to recognize and parse three similar but distinct types of lines with a single pattern. I was also able to set up syntax highlighting for the blog.
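Going one step further, the captured strings can be converted into typed values. This is a sketch rather than the code I actually used; parse_coordinate_line and the dictionary keys are names invented for the example.

```python
import re

LONG_REGEX = re.compile(
    r"(\d+) +([A-Z]+) +(\d+) +(\d+) +(\d+)? +(\d+)? +(-?\d+\.\d+) +(-?\d+\.\d+)"
)


def parse_coordinate_line(line):
    """Return a dict describing one coordinate line, or None if it does not match."""
    match = LONG_REGEX.search(line.strip())
    if match is None:
        return None
    number, label, *rest = match.groups()
    # The first four remaining groups are atom numbers; missing ones are None.
    atoms = [int(atom) for atom in rest[:4] if atom is not None]
    return {
        "number": int(number),
        "label": label,
        "atoms": atoms,
        "value_bohr_or_rad": float(rest[4]),
        "value_angstrom_or_deg": float(rest[5]),
    }


bend = parse_coordinate_line(
    "5 BEND      4  1  2                  1.9052943     109.1653223"
)
print(bend)
```

The number of entries in atoms (2, 3, or 4) distinguishes stretch, bend, and torsion lines without looking at the label.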

Best before solution

I listen to music on Spotify. It turns out Spotify has coding puzzles on their website. While I was busy installing Windows on a friend's computer, I decided to try one of the puzzles. The puzzle is called Best Before. You can find the puzzle description here. You can find my solution in Python here.