Using Pipfile in Binder
I recently attended a workshop, organised by the excellent team of the Turing Way project, on a tool called BinderHub. BinderHub, along with public hosting platform MyBinder, allows you to publish computational notebooks online as “binders” such that they’re not static but fully interactive. It’s able to do this by using a tool called repo2docker to capture the full computational environment and dependencies required to run the notebook.
Aside: What is the Turing Way? The Turing Way is, in its own words, “a lightly opinionated guide to reproducible data science.” The team is building an open textbook and running a number of workshops for scientists and research software engineers, and you should check out the project on GitHub. You could even contribute!
The Binder process goes roughly like this:
1. Do some work in a Jupyter Notebook or similar
2. Put it into a public git repository
3. Add some extra metadata describing the packages and versions your code relies on
4. Go to mybinder.org and tell it where to find your repository
5. Open the URL it generates for you
6. Profit
Other than step 5, which can take some time to build the binder, this is a remarkably quick process. It supports a number of different languages too, including built-in support for R, Python and Julia and the ability to configure pretty much any other language that will run on Linux.
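The launch URL from the last two steps can also be assembled by hand. The sketch below assumes the repository lives on GitHub and that MyBinder's `/v2/gh/<owner>/<repo>/<ref>` URL pattern applies; the function name `binder_url` and the example owner and repository are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: build a MyBinder launch URL for a GitHub repository.
# Assumes the https://mybinder.org/v2/gh/<owner>/<repo>/<ref> pattern.
binder_url() {
    owner="$1"
    repo="$2"
    ref="${3:-HEAD}"   # branch, tag or commit; defaults to HEAD
    printf 'https://mybinder.org/v2/gh/%s/%s/%s\n' "$owner" "$repo" "$ref"
}

# Placeholder owner and repository names
binder_url example-user example-repo
```

Opening the printed URL in a browser triggers the same build that the form on mybinder.org would.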
However, the Python support currently requires you to have either a requirements.txt or Conda-style environment.yml file to specify dependencies, and I commonly use a Pipfile for this instead. Pipfile allows you to specify a loose range of compatible versions for maximal convenience, but then locks in specific versions for maximal reproducibility. You can upgrade packages any time you want, but you’re fully in control of when that happens, and the locked versions are checked into version control so that everyone working on a project gets consistency.
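To make the loose-versus-locked distinction concrete, here is a sketch of what a small Pipfile might look like, written out from a shell script. The package names and version ranges are placeholders chosen for illustration, not taken from a real project.

```shell
#!/bin/sh
# Write an illustrative Pipfile into a scratch directory.
# Package names and version ranges below are placeholders.
pipfile_dir="$(mktemp -d)"

cat > "$pipfile_dir/Pipfile" <<'EOF'
[[source]]
name = "pypi"
url = "https://pypi.org/simple"
verify_ssl = true

[packages]
# Loose ranges: any compatible release is acceptable
numpy = ">=1.16,<2"
matplotlib = "*"
EOF

# With pipenv installed, running `pipenv lock` in that directory would
# resolve these ranges and record exact versions in Pipfile.lock.
cat "$pipfile_dir/Pipfile"
```

Both Pipfile and the generated Pipfile.lock are then committed to the repository. Running `pipenv update` is what re-resolves the ranges and moves the locked versions forward.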
Since Pipfile is emerging as something of a standard, I thought I’d see if I could use that in a binder, and it turns out to be remarkably simple. The reference implementation of Pipfile is a tool called pipenv by the prolific Kenneth Reitz. All you need to use this in your binder is two files of one line each.
requirements.txt tells repo2docker to build a Python-based binder, and contains a single line to install the pipenv package:
pipenv
Then postBuild is used by repo2docker to install all other dependencies using pipenv:
pipenv install --system
The --system flag tells pipenv to install packages globally (its default behaviour is to create a Python virtualenv).
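Put together, the two files described above can be created with a short shell script. This is only a sketch: the scratch directory stands in for your repository root, and the `#!/bin/bash` line in postBuild is an addition made here (the original file was just the one command) so that it is unambiguous which shell runs it.

```shell
#!/bin/sh
# Sketch: create the two one-line files in a stand-in repository directory.
repo_dir="$(mktemp -d)"   # placeholder for the root of your git repository

# requirements.txt: signals a Python build and installs pipenv itself
printf 'pipenv\n' > "$repo_dir/requirements.txt"

# postBuild: runs after the environment is built and installs everything
# listed in the Pipfile into the system Python rather than a virtualenv.
# The shebang line is an assumption added for clarity.
cat > "$repo_dir/postBuild" <<'EOF'
#!/bin/bash
pipenv install --system
EOF
chmod +x "$repo_dir/postBuild"

ls "$repo_dir"
```

If you have Docker and repo2docker installed locally, pointing `repo2docker` at the repository directory should let you try the build before pushing.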
With these two files, the binder builds and runs as expected. You can see a complete example that I put together during the workshop here on GitLab.