Update: Setting up Python, numpy, and PyTorch natively on Apple M1

Hendrik Erz

Abstract: A few months ago I received my first MacBook with an M1 chip, which means it uses the ARM architecture instead of the Intel architecture that most computers have used so far. Back then I told you that installing PyTorch was a pain, because the ecosystem hadn't fully adapted to the fact that MacBooks could suddenly have ARM instruction sets. However, a lot has changed in the past six months. As I'm getting more and more emails from people asking whether my old instructions still hold true, I've decided it's time for an update.

Published on Friday, July 9th, 2021 by Hendrik | 9 min reading time

A few months ago I wrote an article outlining how you could set up a PyTorch data science environment using one of the new MacBooks. Back then, it involved a little bit of fiddling around and also downloading (or compiling yourself) a version of PyTorch. Since then, luckily, many things have changed.

As you can see here and here, we’re slowly getting there. It turns out that only a few compilers support new platform/architecture combinations easily. But slowly, everything is falling into place. This doesn’t mean that you can simply run pip install torch on M1 Macs, but almost. In general, I’ve learned a lot over the past six months, especially with regard to how to organise your Python environments and all those pesky package dependencies. So, in this blogpost I’ll outline how you can now install PyTorch (and other libraries like SciPy, numpy, pandas, etc.) on your M1 Mac as of mid-2021. I expect there will be little change, and by the time this article is outdated, there’ll be plenty of “native” documentation (i.e. on the conda homepage directly) on how to do that, if you even need specific instructions to begin with.

Please note that I did not completely test this myself, since Homebrew is a global install and as such I couldn’t just try it out with a new user account. And no, since I’m preparing for my dissertation proposal examination, I don’t have the time to delete everything and set it up anew. But what I describe here is fairly stable and should work. In general, remember: If one of the commands tells you in a message to do something additional, it is very likely that you should do that. That being said, if you stumble upon something that deserves an additional paragraph for other users, please drop me a mail so I can amend this article!

Preconsiderations: Pip, Conda, and the Environmental Disaster of Python

In my original article, I mentioned something about “a fork of a fork of a fork” and many people asked me “… but, that doesn’t sound like I should do it, does it?” I apologise for making you feel unsure back then; this was not my intention. It is true, though, that Miniforge, which I recommend you use, is in fact the fork of a fork of a fork. But that shouldn’t make you feel uncomfortable, since it’s pretty stable and simple to use!

Every programming language has its quirks, and while with C++ it’s the lack of a standard package management system, for Python it’s kind of the opposite. In general you have three kinds of package management for Python:

Pip: Pip is the most common package manager, and you’ll often see installation instructions that go like python -m pip install <some-package>.¹ You will sometimes need this even if you use Conda, but in general, pip works best for so-called noarch-packages, i.e. packages that are written purely in Python and are thus completely independent of any particular operating system or CPU architecture.
Venv: venv, or “virtual environment”, is the Python-native way to create distinct environments. The main difference to plain pip is that here you’ll have a sealed environment that is independent of other environments. Why is that necessary? Well, for starters, many packages in the Python world depend on other packages. Nothing new so far. But whereas ecosystems like JavaScript’s, with its NPM manager, are pretty clever in maintaining several different versions of the very same package to satisfy the needs of the packages you use, Python isn’t, and it will fail if you have conflicting dependencies among your packages. Virtual environments are the Pythonian way to solve that problem by simply creating one environment for each configuration of packages you need. That being said, one crucial problem nowadays is that a virtual environment cannot install a different Python version for you (yes, that’s a thing); it reuses the interpreter it was created with. And that’s what conda is good at.
Conda: Probably the most advanced iteration of managing your environments comes in the form of an abbreviated snake. Conda in principle works like virtual environments, but with one crucial difference: Instead of only managing different configurations of packages and their dependencies, it additionally allows you to specify a Python version. This means: For each conda environment you create you can specify a certain Python version if need be. So if you need to make use of a package that hasn’t been updated in a long time and still relies on Python 2.7, you can direct conda to download that Python version alongside and you’ll be fine.

Just by writing this enumeration, I realised I should probably at some point write a dedicated article just concerning this … well, yeah: environmental disaster that is the Python ecosystem. But for this article, just remember: In general use conda, unless it tells you it cannot find a package, in which case you’re probably good to go with pip.²
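If you are ever unsure which of these three mechanisms is currently in charge of your Python, you can ask Python itself. The following is only a minimal sketch: the function name describe_environment is made up for this illustration, and it assumes that conda exports the CONDA_DEFAULT_ENV variable on activation and that a venv makes sys.prefix differ from sys.base_prefix.

```python
import os
import sys


def describe_environment(prefix=None, base_prefix=None, environ=None):
    """Guess which kind of environment the running Python belongs to.

    Hypothetical helper for illustration; the arguments exist only so the
    logic can be tried out without actually switching environments.
    """
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    environ = os.environ if environ is None else environ

    # Inside a venv, sys.prefix points at the environment folder, while
    # sys.base_prefix still points at the interpreter it was created from.
    if prefix != base_prefix:
        return "venv"

    # Assumption: `conda activate` sets CONDA_DEFAULT_ENV to the env name.
    conda_env = environ.get("CONDA_DEFAULT_ENV")
    if conda_env:
        return f"conda:{conda_env}"

    return "system"


if __name__ == "__main__":
    print(describe_environment())
```

Run it once in a fresh terminal and once after activating an environment, and the answer should change accordingly.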

Setting Up Homebrew

One thing that hasn’t changed is that you should set up Homebrew first. The easiest way is to simply copy the one-liner from the homepage and execute it in your terminal. By the way, I recommend you use iTerm, since it’s much nicer than the built-in terminal. But the built-in terminal will do just fine.

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

If you don’t like to just run remote Shell scripts on your computer, you can have a look at what will be downloaded here.

It will create the directory /opt/homebrew and will install everything in there. And that’s it! Now on to the next step.
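The /opt/homebrew location is itself a hint that you got the native ARM install, since Homebrew on Intel Macs defaults to /usr/local instead. Here is a small sketch to cross-check this from Python. The helper expected_homebrew_prefix is a name I made up, it only covers macOS, and it relies on platform.machine() reporting arm64 for a natively running Python on an M1.

```python
import os
import platform


def expected_homebrew_prefix(machine=None):
    """Return the default Homebrew prefix on macOS for a CPU architecture.

    Hypothetical helper; `machine` defaults to what the running Python
    reports, which is "x86_64" if Python itself runs under Rosetta.
    """
    machine = platform.machine() if machine is None else machine
    if machine == "arm64":
        return "/opt/homebrew"  # Apple Silicon default
    return "/usr/local"  # Intel Mac default


if __name__ == "__main__":
    prefix = expected_homebrew_prefix()
    print(f"Python reports architecture: {platform.machine()}")
    print(f"Expected Homebrew prefix: {prefix} (exists: {os.path.isdir(prefix)})")
```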

Setting Up Miniforge

Now, remember I said in my original article that Miniforge is a “fork of a fork of a fork”? That’s still technically true, but it’s not as bad as it sounds. Here’s what Miniforge is:

First, there was Anaconda, which is a big (very big!) package that will not just install the conda package manager, but already include many libraries, GUI programs and other stuff you possibly don’t want. If you just want it up and running, you can certainly simply install Anaconda – but did I mention it’s obscenely big?
Then there is Miniconda, which is only the package manager-part of Anaconda. It comes with all the capabilities of installing packages, but without the bloat.
Finally, there is Miniforge, which still is technically a version of Miniconda, but with two decisive improvements: First, it is a community-managed Open Source fork of Miniconda, and second, it has better support for various architectures. And this is what we want: support for the ARM architecture!

So if you’re installing Miniforge, you’re technically installing a fork of a fork of a fork, but it’s a fork that has the benefits you need: It comes without five gigabytes of additional pleasures and with improved support for ARM64-architecture Macs!

To set up Miniforge, you can just follow the instructions on the Homebrew homepage:

brew install --cask miniforge

Afterwards, the homepage tells you to run an additional command:

conda init "$(basename "${SHELL}")"

What this will do is initialise conda for your shell. The environment variable SHELL evaluates to the path of your shell, in my case the path to ZSH, which should be the default on your Mac as well. The basename command takes just the name of the shell itself (so in my case zsh). So it will run conda init zsh and, if you’re using Bash instead, it should run conda init bash.

What this final command will do is add something to your shell configuration. This shell configuration lives in your home folder and should have the name .zshrc (or .bashrc). It’s a small file that will be executed whenever you start your terminal. After you run the command, Conda will have added something to your configuration that makes sure you are able to switch to different environments by typing conda activate <environment-name>. Whenever you start your terminal, you will start in the base environment, which is kind of the default. That being said, it’s a good idea to restart your terminal afterwards so that the new configuration is loaded in any case.
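If the nested quoting in the conda init line looks cryptic, here is the same logic spelled out in Python. This is just a sketch to show what the shell expands to: the function name conda_init_command is hypothetical, the /bin/zsh fallback is my assumption, and it only builds the command rather than running it.

```python
import os


def conda_init_command(shell_path=None):
    """Build the command that `conda init "$(basename "${SHELL}")"` expands to.

    Hypothetical helper for illustration; it does not execute anything.
    """
    if shell_path is None:
        # Assumption: fall back to ZSH if SHELL is not set at all.
        shell_path = os.environ.get("SHELL", "/bin/zsh")

    # os.path.basename does what the basename command does here:
    # it strips the directory part and keeps only the shell's name.
    shell_name = os.path.basename(shell_path)
    return ["conda", "init", shell_name]


if __name__ == "__main__":
    print(" ".join(conda_init_command()))
```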

Setting Up Your Environments

Now that you’re done setting up the preconditions, you can begin installing packages. PyTorch and SciPy are only two of the packages available. Before you install anything, you first need to set up a new environment. Since, at the time of writing, PyTorch 1.8 works with Python 3.8 (and not Python 3.9!), you should create an environment with Conda that contains a Python 3.8 binary (replace myenv with the name you want the environment to have):

conda create --name myenv python=3.8

Afterwards, activate that environment³:

conda activate myenv

Now you can install all the packages you need:

conda install <package-name>

Remember to always switch to the correct environment before installing packages, since otherwise conda will just throw them into whatever environment you’re in – normally the base-environment!
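To double-check that you ended up in the right environment and that your packages actually landed there, you can let Python report on itself. Again, this is only a sketch: check_environment is a name I made up, and the default package list is just an example. It uses importlib.util.find_spec, so nothing heavy like PyTorch gets imported just for the check.

```python
import importlib.util
import sys


def check_environment(packages=("torch", "numpy", "scipy")):
    """Report the running Python version and which packages it can find.

    Hypothetical helper for illustration; the package names are examples.
    """
    version = f"{sys.version_info.major}.{sys.version_info.minor}"
    found = {}
    for name in packages:
        # find_spec returns None if the top-level package is not installed
        # in the environment this interpreter belongs to.
        found[name] = importlib.util.find_spec(name) is not None

    return {
        "python": version,
        # The path shows which environment's binary is running.
        "executable": sys.executable,
        "packages": found,
    }


if __name__ == "__main__":
    report = check_environment()
    print(f"Python {report['python']} at {report['executable']}")
    for name, available in report["packages"].items():
        print(f"  {name}: {'found' if available else 'missing'}")
```

If the executable path does not contain the name of your environment, you are most likely still in the base environment.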

Conclusion

Setting up PyTorch, Numpy, SciPy, or any other Python package that requires native code is still not the most straightforward process, that is true. However, at this point, if you have an M1 Mac, your experience doesn’t really differ from that of people running Intel Macs. Yes, instead of simply using pip you’re confined to conda, but as I have figured out in the past several months, conda is in fact much more advanced than pip, so even if you’re on an Intel Mac, I can still recommend you roll with conda instead of pip.

pip will come in handy in all those instances where a package is not available in the conda repositories. So if conda tells you it cannot find a package, chances are high that it’s noarch (i.e. not bound to any specific architecture) and you can simply run python -m pip install <package-name>.
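One way to see why the python -m form is the safer habit: it ties pip to one specific Python binary, so the package ends up in the environment that binary belongs to. The sketch below only builds such a command for the interpreter that is currently running. The name pip_install_command is hypothetical and some-package is a placeholder.

```python
import sys


def pip_install_command(package):
    """Build a pip call that is bound to the currently running Python.

    Hypothetical helper for illustration; it returns the command as a
    list and does not install anything.
    """
    # sys.executable is the full path of this interpreter, i.e. the one
    # from the active environment, not whatever `pip` is first on PATH.
    return [sys.executable, "-m", "pip", "install", package]


if __name__ == "__main__":
    print(" ".join(pip_install_command("some-package")))
```

If you wanted to actually run it from a script, you could pass the list to subprocess.run, but on the command line python -m pip install <package-name> does the same job.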

¹ Note that sometimes people will tell you just to write pip install, but I've heard many times that it's not recommended. The recommended syntax is to prepend it with python -m, so I suggest you do the same – even inside conda-installs!
² Note: Some people will tell you to use virtual environments, and maybe sometimes it's better …? But I figured that you can achieve basically the same using Conda. But I cannot really dive into that problem here.
³ In case you're interested: Whenever you run conda activate what Conda will do is basically replace where the commands python and pip point to. So try it out for yourself: When you start your terminal, type which python and it should point to the base-environment. Then conda activate one of your other environments, and afterwards which python should point to the Python binary in that environment.