1 Configuring your computer to use Python for scientific computing

1.1 Why Python?

There are plenty of programming languages that are widely used in data science and in scientific computing more generally. Some of these, in addition to Python, are Matlab/Octave, Mathematica, R, Julia, Java, JavaScript, Rust, and C++.

I have chosen to use Python. Although I believe language wars are counterproductive and welcome anyone to port the code we use to the language of their choice, I nonetheless feel I should explain this choice.

Python is a flexible programming language that is widely used in many applications. This is in contrast to more domain-specific languages like R and Julia. It is easily extendable, which is in many ways responsible for its breadth of use. We find that there is a decent Python-based tool for many applications we can dream up, certainly in data science. The Python-based tool is often not the very best for the particular task at hand, but it is almost always pretty good. Thus, knowing Python is like having a Swiss Army knife; you can wield it to effectively accomplish myriad tasks. Finally, we also find that most students pick it up quickly.

Perhaps most important, specifically for neuroscience applications, is that Python is widely used in machine learning and AI. The development of packages like TensorFlow, PyTorch, JAX, Keras, and scikit-learn has led to very widespread adoption of Python.

1.2 Jupyter notebooks

The materials of this course are constructed from Jupyter notebooks. To quote Jupyter’s documentation,

Jupyter Notebook and its flexible interface extends the notebook beyond code to visualization, multimedia, collaboration, and more. In addition to running your code, it stores code and output, together with markdown notes, in an editable document called a notebook.

This allows for executable documents that have code, but also richly formatted text and graphics, enabling the reader to interact with the material as they read it.

Specifically, notebooks are comprised of cells, where each cell contains either executable Python code or text.
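To make the idea of cells concrete, here is a minimal sketch of what a notebook looks like on disk. It assumes the version-4 .ipynb layout, in which a notebook is a JSON object with a list of cells; the cell contents and the helper summarize_cells are made up for illustration.

```python
import json

# A minimal notebook in the version-4 .ipynb layout: a JSON object whose
# "cells" list holds one text (Markdown) cell and one code cell.
notebook = {
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": "# A text cell\nFormatted with *Markdown*.",
        },
        {
            "cell_type": "code",
            "execution_count": None,  # None means the cell has not been run
            "metadata": {},
            "outputs": [],  # filled in when the cell is executed
            "source": "print(1 + 1)",
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}


def summarize_cells(nb):
    """Return (cell type, first line of source) for each cell (hypothetical helper)."""
    return [(c["cell_type"], c["source"].splitlines()[0]) for c in nb["cells"]]


# The notebook round-trips through JSON text, which is what an .ipynb file stores.
text = json.dumps(notebook, indent=1)
print(summarize_cells(json.loads(text)))
```

In practice you never write this JSON by hand; JupyterLab and VSCode read and write it for you.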

While you read the materials, you can read the HTML-rendered versions of the notebooks. To execute (and even edit!) code in the notebooks, you will need to run them. There are many options available to run Jupyter notebooks. Here are a few we have found useful.

JupyterLab: This is a browser-based interface to Jupyter notebooks and more (including a terminal application, text editor, file manager, etc.). As of March 2025, Chrome, Firefox, Safari, and Edge are supported. I encourage you to run your code on your own machine. I give instructions below on how to do the necessary installations and launch JupyterLab.
VSCode: This is an excellent source code editor that supports Jupyter notebooks. Be sure to read the documentation on how to use Jupyter notebooks in VSCode.
Google Colab: Google offers this service to run notebooks in the cloud on their machines. There are a few caveats, though. First, not all packages and updates are available in Colab. Furthermore, not all interactivity that will work natively in Jupyter notebooks works with Colab. If a notebook sits idle for too long, you will be disconnected from Colab. Finally, there is a limit to resources that are available for free, and as of March 2025, that limit is unpublished and can vary. All of the notebooks in the HTML rendering of this book have an “Open in Colab” button at the upper right that allows you to launch the notebook in Colab. This is a quick-and-easy way to execute the book’s contents.

1.3 Marimo

Marimo offers a very nice notebook interface that is a departure from Jupyter notebooks in its structure. The biggest departure is that Marimo notebooks are specifically for Python, as opposed to being language agnostic like Jupyter. As a result, Marimo notebooks can offer many features not seen in Jupyter notebooks (without add-ons). The two most compelling, at least to me, are

Marimo notebooks are simple .py files which allow for easier version control and simple execution as scripts.
Marimo notebooks are reactive, meaning that the ordering of the cells is irrelevant and the notebook runs all cells that need to be rerun as a result of a change of value of a variable in any given cell.
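The reactive behavior described above can be illustrated with a toy model in plain Python. To be clear, this is not how Marimo is implemented and it does not use Marimo's API; the cells dictionary and the set_value function are invented here purely to show the idea that changing one variable triggers a rerun of exactly the cells downstream of it, regardless of the order in which the cells are listed.

```python
# Toy model of reactive execution. This is NOT how Marimo is implemented;
# the cell layout and the names below are invented for illustration.
# Each cell defines one variable and lists the variables it reads.
cells = {
    # Deliberately listed "out of order": d depends on c.
    "d": {"reads": {"c"}, "run": lambda ns: 2 * ns["c"]},
    "c": {"reads": {"a", "b"}, "run": lambda ns: ns["a"] + ns["b"]},
}


def set_value(ns, name, value):
    """Set a variable, rerun every cell downstream of it, return the run order."""
    ns[name] = value

    # Step 1: collect all cells that depend on `name`, directly or indirectly.
    stale, changed = set(), {name}
    while True:
        newly_stale = {
            target
            for target, cell in cells.items()
            if target not in stale and cell["reads"] & changed
        }
        if not newly_stale:
            break
        stale |= newly_stale
        changed |= newly_stale

    # Step 2: run a stale cell only once none of its inputs is still stale.
    order = []
    while stale:
        ready = sorted(t for t in stale if not (cells[t]["reads"] & stale))
        if not ready:
            raise ValueError("cells depend on each other in a cycle")
        for target in ready:
            ns[target] = cells[target]["run"](ns)
        order.extend(ready)
        stale -= set(ready)
    return order


namespace = {"b": 2}
print(set_value(namespace, "a", 1), namespace["c"], namespace["d"])
print(set_value(namespace, "b", 10), namespace["c"], namespace["d"])
```

Even though cell d appears before cell c in the dictionary, c always runs first because d reads its result. That is the sense in which cell ordering does not matter in a reactive notebook.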

In the course, we will use Jupyter notebooks, but you are welcome to play with Marimo notebooks. Marimo will be installed once you complete the installation instructions in this notebook.

1.4 Installing Python tools

Prior to embarking on your journey into data analysis, you need to have a functioning Python distribution installed on your computer. We present two methods for installation and package management, with Pixi being our preferred tool.

1.4.1 Option 1: Pixi (preferred)

Pixi is a package management tool that allows installation of packages. Importantly, it does so in a project-based way. That is, for each project, you use Pixi to create and manage the packages needed for that project. Our “project” here is our data analysis/statistical inference course.

Step 1: Install Pixi. To install Pixi, you need access to the command line. For macOS users, hit Command-space, type in “terminal” and open the Terminal app. In Windows, open PowerShell by opening the Start Menu, typing “PowerShell” in the search bar, and selecting “Windows PowerShell.” I assume you know how to get access to the command line if you are using Linux.

On the command line, do the following.

macOS or Linux

curl -fsSL https://pixi.sh/install.sh | sh

Windows

powershell -ExecutionPolicy ByPass -c "irm -useb https://pixi.sh/install.ps1 | iex"

Step 2: Create a directory for your work in the course. You might want to name the directory huji_stats/, which is what I have named it. You can do this either with the command line or with your graphical file management program (e.g., Finder for macOS).

Step 3: On the command line, navigate to the directory you created. For example, if the directory is huji_stats/ in your home directory and you are in your home directory, you can do

cd huji_stats

on the command line.

Step 4: Download the requisite Pixi files: pixi.toml, pixi.lock. These files need to be stored in the directory you created in step 2. You may download them by right-clicking those links, or by doing the following on the command line.

macOS or Linux

curl -fsSLO https://raw.githubusercontent.com/huji-stats/huji-stats.github.io/refs/heads/main/pixi.toml
curl -fsSLO https://raw.githubusercontent.com/huji-stats/huji-stats.github.io/refs/heads/main/pixi.lock

Windows

irm -useb https://raw.githubusercontent.com/huji-stats/huji-stats.github.io/refs/heads/main/pixi.toml -OutFile pixi.toml

irm -useb https://raw.githubusercontent.com/huji-stats/huji-stats.github.io/refs/heads/main/pixi.lock -OutFile pixi.lock

Step 5: To be able to use all of the packages, you need to invoke a Pixi shell. To do so, execute the following on the command line.

pixi shell

You are now good to go! After you are done working, to exit the Pixi shell, hit Control-D (or type exit and hit enter).

Every time you open a new terminal (or PowerShell) window to work on this class, you will need to cd into the directory you created in step 2 and execute pixi shell.
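If pixi shell complains, a quick sanity check of the steps above can help. The following is a minimal sketch, assuming you already have some Python interpreter available (for example, a system python3) and that you run it from your course directory. The function names are my own and not part of Pixi.

```python
import shutil
from pathlib import Path

# The two files from step 4 that must sit in the course directory.
REQUIRED_FILES = ("pixi.toml", "pixi.lock")


def missing_pixi_files(directory="."):
    """Return the names of required Pixi files that are absent from `directory`."""
    directory = Path(directory)
    return [name for name in REQUIRED_FILES if not (directory / name).is_file()]


def pixi_on_path():
    """Return True if an executable named `pixi` can be found on the PATH."""
    return shutil.which("pixi") is not None


print("pixi found on PATH:", pixi_on_path())
print("missing files:", missing_pixi_files() or "none")
```

If pixi is not found, step 1 likely did not complete (or you need to open a new terminal window). If files are missing, you are probably in the wrong directory or step 4 did not save them.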

1.4.2 Option 2: Conda (Pixi is preferred)

Conda is a system-level package manager. It differs from Pixi in that it operates system-wide (i.e., not just for a specific project) and uses environments to define a set of packages for a given mode of computing, as opposed to specific projects like Pixi. We will use Miniconda to get access to the Conda package manager.

To download and install Miniconda, do the following.

1.4.2.1 Windows

Go to the Miniconda page and go to the “Quick command line install” section.
Click on the “Windows PowerShell” tab.
Copy all of the contents in the gray box (starting with curl).
Go to the Start menu and search for “PowerShell.” Click to open a PowerShell window. Alternatively, you can hit Windows + R and type PowerShell in the text box.
Paste the copied text into the PowerShell window and hit enter.

1.4.2.2 macOS

Go to the Miniconda page and go to the “Quick command line install” section.
Click on the “macOS” tab.
Copy all of the contents in the gray box (starting with mkdir).
Open a Terminal window. You can do this by hitting Command-space bar, typing Terminal, and hitting enter. Alternatively, the Terminal application is located in the /System/Applications/Utilities/ folder, which you can navigate to using Finder.
Paste the copied text into the Terminal window and hit enter.

1.4.2.3 Linux

Go to the Miniconda page and go to the “Quick command line install” section.
Click on the “Linux” tab.
Copy all of the contents in the gray box (starting with mkdir).
Open a terminal window. I assume you know how to do this if you are using Linux.
Paste the copied text into the terminal window and hit enter.

1.4.3 Setting up a conda environment

I have created a conda environment for use in this course. You can install this environment by executing the following on the command line.

conda env create -f https://raw.githubusercontent.com/huji-stats/huji-stats.github.io/refs/heads/main/huji_stats.yml

This will build the environment for you (it may take several minutes). To then activate the environment, enter

conda activate huji_stats

on the command line. You will need to activate the environment every time you open a new terminal (or PowerShell) window.
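Whichever option you chose, it is easy to forget to activate the environment in a new window. Here is a small sketch for checking from within Python which environment, if any, is active. It relies on the CONDA_DEFAULT_ENV environment variable that conda sets upon activation, and it assumes that a Pixi shell sets PIXI_PROJECT_NAME; treat the latter in particular as an assumption to verify against your Pixi version. The function describe_environment is my own invention.

```python
import os
import sys


def describe_environment(environ=None):
    """Guess which package-manager environment the interpreter is running in."""
    environ = os.environ if environ is None else environ
    # Check Pixi first: a Pixi shell may also set conda-style variables.
    if environ.get("PIXI_PROJECT_NAME"):  # assumed to be set by `pixi shell`
        return "pixi: " + environ["PIXI_PROJECT_NAME"]
    if environ.get("CONDA_DEFAULT_ENV"):  # set by `conda activate`
        return "conda: " + environ["CONDA_DEFAULT_ENV"]
    return "none detected"


print(describe_environment())
print("interpreter:", sys.executable)
```

The path of the interpreter is a useful second check; it should point inside your course directory (Pixi) or inside your Miniconda installation (conda).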

1.5 Launching JupyterLab

Once you have invoked a Pixi shell or activated your conda environment, you can launch JupyterLab via your operating system’s terminal program (Terminal on macOS and PowerShell on Windows). To do so, enter the following on the command line.

jupyter lab

You will have an instance of JupyterLab running in your default browser. If you want to specify the browser, you can, for example, type

jupyter lab --browser=firefox

on the command line.

Alternatively, if you are using VSCode, you can use its menu system to open .ipynb files.
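If you find yourself launching JupyterLab often, you could wrap the commands above in a short Python script. This is only a sketch; the function names are hypothetical, and the script assumes that the jupyter executable is on your PATH because you have already invoked a Pixi shell or activated your conda environment.

```python
import shutil
import subprocess
import sys


def jupyter_lab_command(browser=None):
    """Build the argument list for launching JupyterLab."""
    command = ["jupyter", "lab"]
    if browser is not None:
        command.append(f"--browser={browser}")
    return command


def launch(browser=None):
    """Launch JupyterLab, or exit with a hint if jupyter cannot be found."""
    if shutil.which("jupyter") is None:
        sys.exit("jupyter not found; is your Pixi shell or conda environment active?")
    subprocess.run(jupyter_lab_command(browser), check=True)


# Show the command that launch("firefox") would run, without running it.
print(" ".join(jupyter_lab_command("firefox")))
```

Calling launch() with no arguments is equivalent to typing jupyter lab on the command line.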

1.6 Checking your distribution

Let’s now run a quick test to make sure things are working properly. We will make a quick plot that requires some of the scientific libraries we will use.

Launch a Jupyter notebook in JupyterLab. In the first cell (the box next to the [ ]: prompt), paste the code below. To run the code, press Shift+Enter while the cursor is active inside the cell. You should see a plot that looks like the one below. If you do, you have a functioning Python environment for scientific computing!

import numpy as np
import bokeh.models
import bokeh.plotting
import bokeh.io

bokeh.io.output_notebook()

# Generate plotting values
t = np.linspace(0, 2*np.pi, 200)
x = 16 * np.sin(t)**3
y = 13 * np.cos(t) - 5 * np.cos(2*t) - 2 * np.cos(3*t) - np.cos(4*t)

p = bokeh.plotting.figure(height=250, width=275)
p.line(x, y, color='red', line_width=3)
text = bokeh.models.Label(x=0, y=0, text='HUJI-Stats', text_align='center')
p.add_layout(text)

bokeh.io.show(p)

[Plot: a red heart-shaped curve with the label “HUJI-Stats” at its center.]

Computing environment

%load_ext watermark
%watermark -v -p numpy,bokeh,jupyterlab

Python implementation: CPython
Python version       : 3.13.5
IPython version      : 9.4.0

numpy     : 2.2.6
bokeh     : 3.7.3
jupyterlab: 4.4.5
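If the watermark extension is not available in your environment, a rough equivalent can be put together from the standard library. This is a sketch, not a replacement for watermark; installed_versions is a helper name I made up, and it reports None for any package that is not installed rather than raising an error.

```python
import sys
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


print("Python:", sys.version.split()[0])
for name, version in installed_versions(["numpy", "bokeh", "jupyterlab"]).items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

Your version numbers need not match the ones listed above exactly, but a missing package means the environment was not set up or is not active.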