Packaging a Pandas / Numpy project for Debian 9 and 10 using dh-virtualenv

Camilo Matajira Avatar

In this post we are going to create a Debian package for a project that imports pandas and numpy. We will do it for Debian 9 and Debian 10.

In a previous post, I wrote a tutorial on how to use dh-virtualenv to create a Debian package for a Python project. In case you are interested, you can read it here.


I wrote a specific post for Pandas and Numpy because the usual packaging fails at the dpkg-shlibdeps step due to a missing Fortran library (libgfortran) that is not easily accessible (at least I don’t know yet how to get it). The build stops with this error:

dpkg-shlibdeps: error: cannot find library libgfortran-ed201abd.so.3.0.0 needed by debian/test/opt/venvs/test/lib/python3.7/site-packages/numpy/.libs/libopenblasp-r0-2ecf47d5.3.7.dev.so (ELF format: 'elf64-x86-64' abi: '0201003e00000000'; RPATH: '')
dpkg-shlibdeps: error: cannot continue due to the error above
Note: libraries are not searched in other binary packages that do not have any shlibs or symbols file.
To help dpkg-shlibdeps find private libraries, you might need to use -l.
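
The path in the error points into a `.libs` directory inside the numpy package, which appears to be where the prebuilt wheel keeps its own bundled shared objects. As a rough diagnostic, the sketch below lists every shared object found under any `.libs` directory of a site-packages tree, so you can see which packages are likely to trip dpkg-shlibdeps. The function name `find_bundled_libs` is made up for this illustration, and the `.libs` convention is an assumption based on the path in the error above.

```python
import os


def find_bundled_libs(site_packages):
    """Return the sorted paths of shared objects stored under any '.libs' directory.

    Hypothetical helper: it assumes bundled libraries live in directories
    named '.libs', as in the numpy path shown in the error message.
    """
    found = []
    for root, _dirs, files in os.walk(site_packages):
        # Only look inside directories literally named '.libs'.
        if os.path.basename(root) != ".libs":
            continue
        for name in files:
            # Match versioned names too, e.g. libfoo.so.3.0.0
            if ".so" in name:
                found.append(os.path.join(root, name))
    return sorted(found)
```

For example, after a failed build you could call `find_bundled_libs("debian/test/opt/venvs/test/lib/python3.7/site-packages")`, using the staging path from the error message, and look at which top-level packages show up.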

To tackle this problem you have two options:

1. Activate ‘override_dh_shlibdeps’

This is the best option, since building my project this way took around 1 minute (instead of 27 minutes with the second option).

For this approach, you need to modify the debian/rules file. First, uncomment override_dh_shlibdeps in the ‘.PHONY’ line.

Then uncomment the ‘override_dh_shlibdeps’ block and add a -X exclusion for each problematic library, in this case Numpy and Pandas.

.PHONY: clean build-arch override_dh_virtualenv override_dh_strip override_dh_shlibdeps

(...)

override_dh_shlibdeps:
	dh_shlibdeps -X/x86/ -X/numpy/ -X/pandas/
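
As far as I understand the debhelper documentation, each -X item excludes every file whose path contains that item anywhere, so `-X/numpy/` skips everything under a `numpy` directory. The sketch below only illustrates that substring rule in Python; it is not how debhelper is implemented, and the names `is_excluded` and `files_to_scan` are made up for this example.

```python
# Items taken from the dh_shlibdeps line in debian/rules.
EXCLUDES = ["/x86/", "/numpy/", "/pandas/"]


def is_excluded(path, exclude_items=EXCLUDES):
    """Return True if any exclude item appears anywhere in the path."""
    return any(item in path for item in exclude_items)


def files_to_scan(paths, exclude_items=EXCLUDES):
    """Keep only the paths that would still be checked for shared library dependencies."""
    return [p for p in paths if not is_excluded(p, exclude_items)]
```

Note that the surrounding slashes matter: `/numpy/` matches a directory named `numpy`, but not a file that merely has "numpy" in its name.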

2. Add extra pip argument ‘--no-binary=:all:’

This option is second best, given that the packaging process takes around 27 minutes: pip compiles every dependency from source instead of installing prebuilt wheels. I list this option because there might be some use case in which override_dh_shlibdeps is not enough.

To implement this option you need to modify the debian/rules file and add the following: --extra-pip-arg='--no-binary=:all:'

DH_VENV_ARGS=--setuptools --builtin-venv --python=$(SNAKE) $(EXTRA_REQUIREMENTS) \
                                                 --extra-pip-arg=--progress-bar=pretty --extra-pip-arg='--no-binary=:all:'

Missing ‘six’ module at runtime

With these modifications the package will build. However, at runtime Python will complain about a missing ‘six’ module.
I have tried several options, but I don’t yet have a clean solution for this.
My best approach is to tell Debian to install python3-six as a dependency of the package and to do some hacking to add the ‘six’ module to the Python path.

To make Debian require python3-six, add it to the Pre-Depends line in the debian/control file:

Pre-Depends: dpkg (>= 1.16.1), python3, python3-venv, python3-six, ${misc:Pre-Depends}

To make the hack work, first you need to know where your system will store the ‘six’ module. For this, install the module and then check where the library is stored:

   apt-get update && apt-get install python3-six
   dpkg -L python3-six

In my case it is stored in /usr/lib/python3/dist-packages. Hence I will add the following to the __init__.py of my main module (note the leading slash: a relative path would only resolve when the working directory is /):

   import sys
   sys.path.append('/usr/lib/python3/dist-packages')
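
A slightly more defensive variant of the same hack is sketched below: it only appends the directory when it actually exists and is not already on the path. The function name `add_system_dist_packages` is made up for this example, and the default path is the one dpkg -L reported on my system, so treat it as an assumption for yours.

```python
import os
import sys

# Assumption: the location reported by `dpkg -L python3-six` on my system.
SYSTEM_DIST_PACKAGES = "/usr/lib/python3/dist-packages"


def add_system_dist_packages(path=SYSTEM_DIST_PACKAGES):
    """Append path to sys.path if it exists and is not already listed.

    Returns True when the path was added, False otherwise.
    """
    if not os.path.isdir(path) or path in sys.path:
        return False
    # Appending (rather than inserting at the front) keeps the packages
    # inside the virtual environment ahead of the system ones.
    sys.path.append(path)
    return True
```

Calling `add_system_dist_packages()` at the top of __init__.py, before anything imports six, should have the same effect as the two-line version.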

I know this is not the cleanest approach, but for some reason the Python interpreter inside the virtual environment does not use the ‘dist-packages’ libraries. I have tried using the --use-system-packages option, but it does not work for me.
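
One way to investigate this is to check how the virtual environment was configured. Environments created with Python's built-in venv module (which is what --builtin-venv asks for) contain a pyvenv.cfg file with an `include-system-site-packages` key. The sketch below reads that key; the function name is made up for this example, and it assumes the file has the usual `key = value` layout.

```python
import os


def includes_system_site_packages(venv_dir):
    """Report the include-system-site-packages setting of a venv.

    Returns True or False according to pyvenv.cfg, or None when the file
    or the key is missing (for example, if the environment was not
    created with the built-in venv module).
    """
    cfg_path = os.path.join(venv_dir, "pyvenv.cfg")
    if not os.path.isfile(cfg_path):
        return None
    with open(cfg_path, encoding="utf-8") as fh:
        for line in fh:
            key, sep, value = line.partition("=")
            if sep and key.strip() == "include-system-site-packages":
                return value.strip().lower() == "true"
    return None
```

On an installed package you could call it with the environment directory, for instance `includes_system_site_packages("/opt/venvs/test")` (the path here is inferred from the build error at the top of this post). A result of False would be consistent with the interpreter ignoring ‘dist-packages’.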

Hope this helps you build your project.
