Introduction to Jupyter notebooks

· · Read in about 4 min · Comments · Source

One of my reasons for moving this blog over to Nikola after only a year on Hugo was to be able to include Jupyter notebooks as blog posts, so here's an example of that. (Also I'm off sick from work today, which generally means I'm bored and need something low-energy and low-risk to occupy my mind.) I'm going to demonstrate a few features without going into too much explanation. If you have a working Jupyter setup, you'll be able to use the "Source" link at the top of this post to download the full working notebook and play with it yourself.

Jupyter is a platform that lets you write documents that are part program, part prose, in a notebook style. You can interweave text (like this paragraph), code and the results of running that code all in a single document.

This notebook uses the Python 3 programming language, but Jupyter has "kernels" available to support loads of other languages, including data science favourites R and Julia. Python is a very powerful language, but we often need to include (import) modules written by other developers to gain extra functionality. numpy is a general-purpose numerical computing module and matplotlib lets us create plots. We'll use both a lot, so we'll also give numpy and matplotlib.pyplot the shorter aliases np and plt respectively.

In [1]:
import numpy as np
import matplotlib.pyplot as plt

You'll notice a couple of things about the above code snippet. First, it's flagged with In [1]: which tells us that it has been run, and that it was the first snippet in the document to be run. These numbers are important, because it's possible to run code snippets one at a time in arbitrary order, so they may not have been run in the order they appear in the document.

Second, it has produced no output. In this case, this is a good thing, as importing a module is a silent operation and would only produce output in the form of an error message if it failed (for example, if the module wasn't installed correctly or its name was misspelled). Here's what happens if I deliberately misspell a module name:

In [2]:
import numpie
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-2-fb56a7ca31d1> in <module>()
----> 1 import numpie

ModuleNotFoundError: No module named 'numpie'
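If you'd rather check whether a module is available without triggering a traceback, the standard library can look it up for you. This is just a minimal sketch: the helper name is_installed is my own invention for illustration, not something Jupyter or numpy provides.

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python can locate a top-level module with this name.

    Hypothetical helper for illustration; it only looks the module up,
    it does not import it.
    """
    return importlib.util.find_spec(module_name) is not None


for name in ["numpy", "numpie"]:
    status = "available" if is_installed(name) else "not found"
    print(f"{name}: {status}")
```

What this prints depends on what's installed on your machine, so I haven't shown any output here.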

Now I can ask these modules to do some work for me! For now, what I'm going to do is plot an exponential growth curve. If you're not familiar with the maths, this is a curve that can be used to represent the growth of a population of bacteria (or people!) in the presence of abundant food, space and other resources.
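To make the population idea a bit more concrete, here's a rough sketch of exponential growth written as $N(t) = N_0 e^{rt}$. The starting population and growth rate below are made-up numbers chosen purely for illustration, and the function name is my own.

```python
import math

# Assumed, illustrative values: not taken from any real population.
STARTING_POPULATION = 100.0  # N_0
GROWTH_RATE = 0.5            # r, per unit of time


def population(t, n0=STARTING_POPULATION, rate=GROWTH_RATE):
    """Population at time t under unrestricted exponential growth."""
    return n0 * math.exp(rate * t)


# Under this model the population doubles every ln(2) / r units of time.
doubling_time = math.log(2) / GROWTH_RATE

print(population(0))
print(doubling_time)
print(population(doubling_time))
```

With $N_0 = 1$ and $r = 1$ this reduces to plain $e^t$, which is the curve plotted in the rest of this post.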

First I need to define the range over which I'm going to plot this function: 0 to 10 in steps of 0.1. I'll call it $x$ because this will go along the x-axis of my graph.

In [3]:
x = np.arange(0, 10, step=0.1)

There's still no output: I need to specifically ask Python to show me the contents of x:

In [4]:
print(x)
[ 0.   0.1  0.2  0.3  0.4  0.5  0.6  0.7  0.8  0.9  1.   1.1  1.2  1.3  1.4
  1.5  1.6  1.7  1.8  1.9  2.   2.1  2.2  2.3  2.4  2.5  2.6  2.7  2.8  2.9
  3.   3.1  3.2  3.3  3.4  3.5  3.6  3.7  3.8  3.9  4.   4.1  4.2  4.3  4.4
  4.5  4.6  4.7  4.8  4.9  5.   5.1  5.2  5.3  5.4  5.5  5.6  5.7  5.8  5.9
  6.   6.1  6.2  6.3  6.4  6.5  6.6  6.7  6.8  6.9  7.   7.1  7.2  7.3  7.4
  7.5  7.6  7.7  7.8  7.9  8.   8.1  8.2  8.3  8.4  8.5  8.6  8.7  8.8  8.9
  9.   9.1  9.2  9.3  9.4  9.5  9.6  9.7  9.8  9.9]
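Printing every value is fine for 100 numbers, but it gets noisy quickly with bigger arrays. A more compact way to check what you've got might look something like this sketch, which rebuilds the same x so that it runs on its own:

```python
import numpy as np

x = np.arange(0, 10, step=0.1)

print(x.shape)         # number of elements along each dimension
print(x[0], x[-1])     # first and last values
print(x[:3], x[-3:])   # a few values from each end
```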

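I now have an array (numpy's version of a list) of the numbers I asked for. Notice that this doesn't include 10: like Python's built-in range, np.arange stops just before the upper bound, giving what mathematicians would call the half-open interval $[0, 10)$. It's worth being aware of this; if you do want the endpoint, np.linspace(0, 10, 101) includes it. Next I want to calculate $e^x$, or as we'll translate it for Python: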

In [5]:
y = np.exp(x)
print(y[:5])
[ 1.          1.10517092  1.22140276  1.34985881  1.4918247 ]

As this is a very long list, I asked Python to only show me the first 5 items in it: y[:5] is a slice meaning "everything up to, but not including, index 5". Finally, I ask matplotlib to make me a graph of these two lists of numbers, label the x and y axes and give it a title.

In [6]:
plt.figure(figsize=(10,6))
plt.plot(x, y)
plt.xlabel('$x$')
plt.ylabel('$e^x$')
plt.title('Example of an exponential curve')
plt.show()
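In a notebook, plt.show() draws the graph right below the code. If you're running the same code as a plain script and would rather save the picture, something like the following sketch should work. It assumes the non-interactive "Agg" backend is acceptable, uses matplotlib's object-oriented interface (fig and ax) rather than the plt calls above, and writes to an in-memory buffer purely for illustration; you could pass a filename to savefig instead.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(0, 10, step=0.1)
y = np.exp(x)

# Same plot as above, built through explicit figure and axes objects.
fig, ax = plt.subplots(figsize=(10, 6))
ax.plot(x, y)
ax.set_xlabel('$x$')
ax.set_ylabel('$e^x$')
ax.set_title('Example of an exponential curve')

# Save the figure as PNG data into an in-memory buffer.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)  # release the figure's memory once it has been saved

png_bytes = buffer.getvalue()
print(len(png_bytes), "bytes of PNG data")
```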

So that's just a little taster of what Jupyter notebooks can do. Hopefully I'll come up with some more interesting posts to combine text and code soon!
