Area Man Learns to Make Wheels

I have been toying with Python wheel packages, of which I knew very little a few months back, except that they exist, that they’re essentially zip files, and that they somehow make shipping C extensions easier.

I wanted to write down the few things I have learned since, so that I would appear a little less confused and maybe a little more informed.

Background is as follows: one of the libraries Tahoe-LAFS depends on is zfec, which implements an “erasure code”, or “forward error correction code”. Parts of zfec are implemented in C, presumably for the sake of speed, and that poses a bit of an inconvenience.

In the olden days (up to zfec version 1.5.3, that is), folks who installed zfec needed a C compiler to build those bits of C. People who use free operating systems or macOS usually have a C compiler installed, though not universally, so this has been a minor hassle for them. Windows people are inconvenienced a little more, because their compiler installer downloads are somewhere deep within the layers of Microsoft’s vast and expansive empire of websites.

Yet another inconvenience: if your Python project depends on libraries that ship C extensions in source form, your continuous integration runs take a little longer, because of the additional steps required to build and install those libraries.

It would be nice if we could make installing zfec and thereby Tahoe-LAFS a little bit easier for more people. One big target is Windows, where people need to install Visual C++ for Python 2.7, otherwise known as vcpython27.

I did some work on packaging and publishing zfec. Ramakrishnan helped a lot with GitHub pull request reviews, and with setting me up with appropriate permissions to zfec’s GitHub project and PyPI channels. Thank you Ram!

Python wheels

Python wheels solve the problem of shipping Python packages that contain C extensions by enabling maintainers to ship their software with pre-built extension bits for various platforms. Installing such Python packages is less of a hassle.

On the other hand, packaging is more of a hassle: you will have to prepare wheel packages for all the combinations of Python versions, operating systems, and instruction set architectures that you intend to support. Normally you will have to consider a matrix that consists of:

- Python implementations and versions (CPython 2.7, 3.6, 3.7, and so on; PyPy)
- operating systems (Linux, macOS, Windows)
- instruction set architectures (x86, amd64, and so on)

In addition, you will have to also publish a source distribution package, as a catch-all for any other exotic possibilities.
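The matrix multiplies out quickly. As a rough illustration – the version and platform lists below are hypothetical, not zfec’s actual targets – enumerating the combinations in Python:

```python
from itertools import product

# Hypothetical support matrix; the real axes depend on what the project targets.
pythons = ["cp27", "cp36", "cp37", "cp38", "cp39", "pp27", "pp36"]
platforms = ["manylinux_i686", "manylinux_x86_64", "macosx_x86_64",
             "win32", "win_amd64"]

# Every (Python, platform) pair needs its own wheel.
wheels = ["{}-{}".format(py, plat) for py, plat in product(pythons, platforms)]
print(len(wheels))  # 35 wheels for this made-up matrix
```

Thirty-five packages for a modest-looking matrix, and that is before counting less common architectures.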

(Before wheels, Python had eggs. Wheels are currently the standard: see Wheel vs Egg.)

Building wheels

Normally, on a given platform, you could build a zfec binary wheel package for that platform like:

$ git clone https://github.com/tahoe-lafs/zfec.git
$ cd zfec; virtualenv venv; source venv/bin/activate
$ pip install setuptools wheel
$ python setup.py bdist_wheel
...

On my x86 Debian stable machine, this results in a package like dist/zfec-1.5.5-cp27-cp27mu-linux_i686.whl. On my amd64 Debian testing machine, that results in dist/zfec-1.5.5-cp38-cp38-linux_x86_64.whl. You get the idea.
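Those filenames are not arbitrary: each dash-separated field is a tag defined by PEP 427, in the order name, version, Python tag, ABI tag, platform tag. A minimal sketch that pulls the Debian example apart – real filenames can also carry an optional build tag, which this sketch ignores:

```python
# Parse a {name}-{version}-{python tag}-{abi tag}-{platform tag}.whl filename
# per PEP 427 (rough version: assumes no optional build tag and no dashes
# within fields).

def parse_wheel_name(filename):
    stem = filename[: -len(".whl")]
    name, version, python, abi, platform = stem.split("-")
    return {"name": name, "version": version,
            "python": python, "abi": abi, "platform": platform}

tags = parse_wheel_name("zfec-1.5.5-cp27-cp27mu-linux_i686.whl")
print(tags["python"], tags["abi"], tags["platform"])
```

Here `cp27` means CPython 2.7, `cp27mu` pins the ABI (a wide-unicode build, in this case), and `linux_i686` is the platform.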

In theory, I could upload those files to PyPI like below and declare partial victory:

$ pip install twine
$ twine upload dist/*

In practice, there’s a better way.

If you try to build wheel packages for the various combinations of Python versions and operating systems and instruction set architectures “by hand” in the above manner, that could become quite a tedious chore, and a very unreliable one at that. The process should be automated as much as possible.

A nice pre-packaged solution that does just this exists, namely cibuildwheel.

Using cibuildwheel

cibuildwheel is a nifty piece of software that helps with building Python wheel packages on a variety of CI providers. For packaging zfec, I chose GitHub Actions as the CI provider, and adopted the example configuration given in the cibuildwheel documentation pretty much as-is.

When a release is tagged, this CI configuration is kicked into action, and in the end, a zip file that contains a whole bunch of binary wheel packages and a source package is produced as a CI artifact. This way, I was able to publish a nice long list of wheel packages for the zfec 1.5.5 release: there are wheel packages for CPython 2.7 through 3.9! PyPy2! PyPy3! Windows, both 32 bit and 64 bit! macOS! etc!

Note that there’s still a manual step of uploading packages to PyPI. As the cibuildwheel docs correctly suggest, manual steps are for chumps.

Igor Freire has kindly made pull request 33 against zfec, which should automate this step. I haven’t had a chance to test and merge this PR yet, but I am eager to do that. Thank you Igor!

Fewer wheels with abi3

This is something for the future: although the number of wheel packages produced in the above step is rather big, it doesn’t always have to be that way.

From Python 3.2 onward, there’s the concept of a Stable Application Binary Interface, aka “abi3”, aka Py_LIMITED_API, which means that the need for Python version specific wheels could go away. With fairly new versions of setuptools and wheel, we should be able to do something like this:

$ python3 setup.py bdist_wheel --py-limited-api=cp36

That should produce a cp36-abi3 wheel for the current OS and architecture, still usable across Python 3.6 and onwards on the same OS and architecture. Installing abi3 wheels will require a fairly new version of pip.
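On the C side, the extension has to be compiled with Py_LIMITED_API defined to the minimum Python version it supports; the macro value encodes that version in hex, so Python 3.6 becomes 0x03060000. A sketch of that encoding convention, with a hypothetical setup.py fragment in the comments (the extension name and options are illustrative, not zfec’s actual build configuration):

```python
# Py_LIMITED_API encodes the minimum supported Python version in hex:
# (3, 6) -> "0x03060000". A small helper to compute that value:

def limited_api_macro(major, minor):
    return "0x{:02X}{:02X}0000".format(major, minor)

# A hypothetical setup.py fragment might then define the macro on the
# Extension, e.g.:
#
#   Extension("zfec._fec", sources=[...],
#             define_macros=[("Py_LIMITED_API", limited_api_macro(3, 6))])
#
# and the wheel would be built with:
#
#   python setup.py bdist_wheel --py-limited-api=cp36

print(limited_api_macro(3, 6))
```

The resulting wheel is then tagged cp36-abi3 rather than being pinned to a single CPython version.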

Obviously, building abi3 wheels should be automated too. At the time of writing, cibuildwheel is still discussing how to support abi3. I have filed a reminder issue against zfec to ship abi3 wheels once that can be done with minimal hassle.

I learned about abi3 when Jean-Paul noted that bcrypt has been able to ship fewer wheel packages this way. Thank you Jean-Paul!

Maybe zfec can adopt what bcrypt has done, but this isn’t really an urgent thing.

Mistakes were made

Releasing zfec was not as smooth as I have made it sound.

The first release I made was zfec 1.5.4, which did not work on Windows. I published zfec 1.5.4 on TestPyPI, and then made a draft pull request against Tahoe-LAFS. That seemed to work fine, modulo some usual CI noise, so I published zfec 1.5.4 on PyPI. But the zfec 1.5.4 packages I uploaded to PyPI broke Windows CI for Tahoe-LAFS, with this rather unhelpful error message:

[...]
  File "d:\a\tahoe-lafs\tahoe-lafs\.tox\py27-coverage\lib\site-packages\allmydata\codec.py", line 20, in <module>
    import zfec
  File "d:\a\tahoe-lafs\tahoe-lafs\.tox\py27-coverage\lib\site-packages\zfec\__init__.py", line 13, in <module>
    from ._fec import Encoder, Decoder, Error
exceptions.ImportError: DLL load failed: %1 is not a valid Win32 application.
[...]

Uh-oh. I don’t know what that means!
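In hindsight, one common cause of a “DLL load failed: %1 is not a valid Win32 application” error is a 32-bit/64-bit mismatch between the interpreter and the extension DLL – though I never confirmed that this was the cause here. Checking the interpreter’s side of that equation is easy enough:

```python
import struct

# The size of a pointer ("P") reveals whether the running Python
# is a 32-bit or 64-bit build.
bits = struct.calcsize("P") * 8
print(bits)  # 32 or 64
```

If that number doesn’t match the bitness of the wheel you installed, the import will fail in exactly this fashion.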

Fortunately, PyPI allows project owners to delete individual files from a release, or to yank entire releases. At the time I was a maintainer of the zfec project on PyPI – maintainers have upload rights, but they are not allowed to delete files or yank a release.

I asked Ram to make me a zfec project owner, and he obliged. I went ahead and removed the zfec wheel packages that broke Tahoe-LAFS CI, and Tahoe-LAFS CI began to spin again. But the original problem I set out to solve remained: to install zfec on Windows, you still needed a compiler.

I have not been able to figure out why the zfec 1.5.4 binary wheels failed on Windows. I haven’t spent a bunch of time figuring that out either: instead, I updated zfec’s cibuildwheel from version 1.6.0 to 1.6.4, and changed CIBW_TEST_COMMAND so that there’s a bit of sanity checking of the wheel packages produced by CI.

Those two things seem to have made a difference: when I tagged zfec 1.5.5 and did some local testing of the CI-produced binary wheel packages on macOS, Linux, and Windows, things seemed to work.

Testing on Windows

To test zfec locally, I could use my old laptops that run various versions of Debian, and an old MacBook Pro that runs macOS. I do not have a Windows machine.

Microsoft offers some Windows 10 virtual machine images for download. I tried using their VirtualBox image, but things were far too slow on my old computers to be usable.

This problem was solved when cypher on Tahoe’s IRC channel pointed out that there’s a Windows 10 Vagrant box, which turned out to be nifty: it offers a command line rather than the full desktop, and it includes the Chocolatey package manager and OpenSSH, among other things. So you can do:

$ vagrant init gusztavvargadr/windows-10
$ vagrant up
$ vagrant ssh

And that would drop you at a Windows command prompt. Nifty! I did not know that this was even possible. Thank you cypher!

So I tested the zfec 1.5.5 packages a little more on the setups available to me, and felt confident that things would not break this time, or that maybe they would break in a different manner.

I published zfec 1.5.5 at PyPI, and then made pull request 862 against Tahoe-LAFS that removed installation of vcpython27 from our Windows CI steps. CI looked fine, the PR was approved, and I merged it.

And then things broke again, in a different manner.

C’est la vie

While I was working on zfec packaging, Tahoe-LAFS had added the netifaces library and removed its old homegrown methods of querying network interfaces, in PR 872.

It turned out that netifaces uses a C extension, but a wheel package targeting 64-bit Windows and Python 2.7 had not been published yet. My branch for PR 862 was a bit old and did not have the changes that added netifaces, so I failed to detect that we were not quite ready to remove the vcpython27 installation step from CI.

I have since made PR 909 against Tahoe-LAFS, which puts the vcpython27 installation step back into CI.

I have also made PR 68 against netifaces, which builds a cp27-cp27m-win_amd64.whl package during its CI steps. Until that PR is reviewed and merged, and new netifaces packages are published, Windows users of Tahoe-LAFS will still need to install a compiler.

As they say, c’est la vie.

Further reading

I opened and closed a bunch of browser tabs while working on wheel packaging. Someday I might actually read them!

Just kidding. I am not going to read them all.

(Posted on December 2, 2020.)