Nix With Zephyr RTOS
Creating a Nix Flake for a Zephyr Development Environment
In my free time, I have done some research on Nix. Nix is a package management and system configuration tool. It has its own language (also called Nix) and a Linux flavor called NixOS that uses Nix files for configuration and the Nix package repository for package management. The core principle behind Nix and its tools is reproducibility. For more info, you can check out the Nix website.
With the focus on reproducibility, Nix can be used to create consistent development environments through the nix develop command. This command uses a Nix Flake (a sort of package definition) to create a shell with all the dependencies a package or project might need for development.
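To make that concrete, here is a minimal sketch of a flake that nix develop can consume. It is not the flake from my gists: the description, the pinned nixpkgs branch, the hard-coded system, and the two example packages are all assumptions chosen to keep the example short.

```nix
{
  description = "Minimal development shell (illustrative sketch)";

  # Assumption: any nixpkgs branch or revision could be pinned here.
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";

  outputs = { self, nixpkgs }:
    let
      # Assumption: a single hard-coded system keeps the sketch short.
      system = "x86_64-linux";
      pkgs = nixpkgs.legacyPackages.${system};
    in {
      # Recent Nix versions look up devShells.<system>.default for
      # `nix develop`; older versions used devShell.<system> instead.
      devShells.${system}.default = pkgs.mkShell {
        # Only what is listed here is added to the shell.
        buildInputs = [ pkgs.cmake pkgs.git ];
      };
    };
}
```

Running nix develop in the directory containing this file should drop you into a shell where cmake and git come from nixpkgs rather than from the host system.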
One task I've repeated several times at my job is creating a setup guide for developing Zephyr applications. This usually involved creating a README.md with the steps required to install the toolchain and other build dependencies. I thought that using Nix might be a good way to reduce the number of manual steps required for this scenario.
If you want the TLDR, here are the gists I saved with my work:
Initial Observations
After doing some research, this seemed like a reasonable use of Nix. I read up on how Nix Flakes are supposed to work and what features they do and do not provide. One tricky part of Nix development shells is that only the packages/components that are specified are available to the shell. This means the environment is isolated from the rest of your system. With this in mind, I decided the tasks for me to figure out were:
- Write a custom derivation for the zephyr-sdk
- Incorporate installing the required Python packages
- Incorporate the additional build dependencies (CMake, git, etc.)
Note: I would really recommend reading this series of blog posts for more information regarding the basics of Flakes.
Writing the Zephyr SDK Derivation
There were several tricky things I had to figure out for a derivation to install the Zephyr SDK:
- Handle Linux vs macOS and 32-bit vs 64-bit support
- Use the minimal installer to download the arm-zephyr-eabi components
- Specify what packages are build inputs for the installer
Supporting multiple OSes required constructing the installer URL from the provided system. Another tricky part was that the URLs name the platform in the opposite order from Nix (i.e. linux-x86_64 vs x86_64-linux). This can be solved with a simple dictionary (an attribute set in Nix) to convert between the names.
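A sketch of that lookup might look like the following. The function arguments, the release URL pattern, and the archive extension are assumptions for illustration and should be checked against the actual SDK release page; only the two naming conventions come from the text above.

```nix
# Hypothetical helper: builds the minimal-installer URL for a Nix system.
{ system, version }:
let
  # Nix system string -> platform name used in the SDK file names.
  platforms = {
    "x86_64-linux" = "linux-x86_64";
    "aarch64-linux" = "linux-aarch64";
    "x86_64-darwin" = "macos-x86_64";
    "aarch64-darwin" = "macos-aarch64";
  };

  # Fail loudly at evaluation time for systems without an installer.
  platform = platforms.${system} or (throw "Unsupported system: ${system}");
in
# Assumption: URL layout and .tar.gz extension; verify for your SDK version.
"https://github.com/zephyrproject-rtos/sdk-ng/releases/download/v${version}/zephyr-sdk-${version}_${platform}_minimal.tar.gz"
```

Keeping the mapping in one attribute set means adding another platform is a one-line change, and an unsupported system produces a clear error instead of a broken download URL.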
Using the minimal installer proved to be a giant pain in the butt. By design, Nix doesn't want derivations to access the network during the build, because this can lead to non-deterministic behavior, which I definitely understand. Working around this requires two things. First, the derivation must declare that it needs to be built outside the chroot sandbox. By setting __noChroot = true, the minimal installer can do its thing without issue. Second, your Nix install's configuration must relax the sandbox requirements. A simple addition of sandbox = relaxed to the appropriate nix.conf works. There are additional methods described in my comment as well.
Finally, I needed to declare the dependencies that setup.sh uses as part of the minimal install. I thought these were simply:
- cmake
- wget
I was wrong. It took quite a while for me to learn a basic principle of Nix: nothing is provided unless you ask for it. This meant I also had to include:
- which
- cacert
It was particularly infuriating to figure out setup.sh was failing to download components because it lacked the standard set of CA certs for setting up TLS. From my memory, the sequence of issues started with setup.sh failing -> realizing wget was returning an error -> determining it might be TLS related -> experimenting with curl and realizing it was certificate related -> finding cacert in the Nix package list. This was not fun to debug at all.
Once I had all the dependencies I was good to go! Except that, for some crazy reason, Nix's cmake package assumes that if you list it in nativeBuildInputs, you must want CMake to run during the configure phase. This leads to my second principle of Nix: sometimes too much is assumed. To remedy this, I needed to include this line: dontUseCmakeConfigure = true;.
I think that was all of the issues I hit writing this file...
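Pulling the pieces of this section together, a derivation along these lines is roughly what the issues above add up to. This is a hedged sketch rather than the sdk.nix from my gists: the src and version arguments, the install phase, and the setup.sh flag are assumptions, while __noChroot, the four build inputs, and dontUseCmakeConfigure come straight from the text.

```nix
# sdk.nix (sketch), intended to be called as:
#   pkgs.callPackage ./sdk.nix { inherit src version; }
# where src is the fetched minimal SDK archive (hypothetical arguments).
{ stdenv, cmake, wget, which, cacert, src, version }:

stdenv.mkDerivation {
  pname = "zephyr-sdk";
  inherit src version;

  # Build outside the sandbox so setup.sh can reach the network.
  # This only takes effect when nix.conf contains `sandbox = relaxed`.
  __noChroot = true;

  # Nothing is provided unless asked for: list every tool setup.sh calls,
  # including cacert so TLS downloads have CA certificates available.
  nativeBuildInputs = [ cmake wget which cacert ];

  # Stop the cmake setup hook from taking over the configure phase.
  dontUseCmakeConfigure = true;

  # There is nothing to compile; the installer only downloads and unpacks.
  dontBuild = true;

  installPhase = ''
    runHook preInstall
    mkdir -p $out
    cp -r . $out/
    # Assumption: -t selects the toolchain to download; check setup.sh's
    # usage text for the SDK version you are installing.
    bash $out/setup.sh -t arm-zephyr-eabi
    runHook postInstall
  '';
}
```

Because the build escapes the sandbox, Nix can no longer guarantee that two builds of this derivation produce the same result, which is the trade-off the sandbox = relaxed setting makes explicit.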
Getting Zephyr's Python Dependencies
The next issue I had was collecting Zephyr's Python dependencies into the buildInputs. Zephyr has a large list of these, so finding each one in the package repository was not feasible. It would also be a bad move because I would be repeating a list that already exists in Zephyr's requirements files, and the versions in the Nix package list are not equivalent. Thankfully I discovered mach-nix. This is a great project which can take a list of Python requirements and create a single output derivation to provide to buildInputs. The only bump I hit with mach-nix was that it does not handle cascading requirement files, which Zephyr uses. To work around this, I read each file into a string and concatenated them into a single output to pass to mach-nix.
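The concatenation step can be sketched as below. The file names and the zephyrScripts argument are assumptions (check the scripts directory of your Zephyr checkout), and the mkPython call reflects my understanding of the mach-nix flake interface rather than a verified copy of my flake.

```nix
# Hypothetical helper: builds one Python environment from several
# requirements files that the top-level file would normally pull in via -r.
{ mach-nix, system, zephyrScripts ? ./scripts }:
let
  # Assumption: the leaf requirements files; adjust to match your checkout.
  requirementFiles = [
    "requirements-base.txt"
    "requirements-build-test.txt"
    "requirements-run-test.txt"
    "requirements-extras.txt"
  ];

  # Read one file from the scripts directory into a string.
  readRequirement = name: builtins.readFile (zephyrScripts + "/${name}");

  # Join all files into the single requirements string mach-nix expects.
  requirements =
    builtins.concatStringsSep "\n" (map readRequirement requirementFiles);
in
mach-nix.lib.${system}.mkPython { inherit requirements; }
```

Listing the leaf files explicitly sidesteps the -r include lines entirely, at the cost of having to update the list if Zephyr adds or renames a requirements file.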
Finishing the Flake
The remaining step was to combine everything. This simply meant adding everything into the buildInputs attribute. The main tripping point after this was that I had to specify setuptools and pip separately from Python (despite installing PythonFull??). I also added a few environment variables to the development shell so that Zephyr's build system would find the toolchain easily.
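As a rough sketch of what the combined shell could look like: the zephyrSdk and pythonEnv arguments are hypothetical names for the SDK derivation and the mach-nix environment, and the two environment variables are my assumption about which ones are useful, based on the variables Zephyr's build system reads to locate an SDK.

```nix
# Hypothetical shell definition; zephyrSdk and pythonEnv are assumed to be
# the SDK derivation and the Python environment built in earlier steps.
{ pkgs, zephyrSdk, pythonEnv }:

pkgs.mkShell {
  buildInputs = [
    zephyrSdk
    pythonEnv
    pkgs.cmake
    pkgs.git
    # pip and setuptools have to be requested explicitly.
    pkgs.python3Packages.pip
    pkgs.python3Packages.setuptools
  ];

  # Extra attributes on mkShell are exported as environment variables.
  ZEPHYR_TOOLCHAIN_VARIANT = "zephyr";
  ZEPHYR_SDK_INSTALL_DIR = "${zephyrSdk}";
}
```

Pointing ZEPHYR_SDK_INSTALL_DIR at the store path means the shell never depends on an SDK installed elsewhere on the machine.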
Using It!
To use this:
- Place flake.nix and sdk.nix in your zephyr repo.
- Create a directory to save the profile in; I used scratch/zephyr-dev.
- Run nix develop --profile ~/scratch/zephyr-dev from zephyr/
That's it! Using the --profile flag creates a new gcroot which Nix uses to keep track of things that should be saved when garbage collecting with nix-collect-garbage.
Improvements
There are definitely a few improvements that I could make in both of these scripts:
- This only works with a T1 west topology; other topologies will need significant changes
- There seem to be strange warnings from the Python cryptography package that I did not investigate
- Since I use Cortex-M4 based devices, I hardcoded the toolchain to only install arm-zephyr-eabi. The scripts could be altered relatively easily to provide options for other architectures.
- Modify flake.nix to use flake-utils, which seems to make using flakes easier?
Closing Thoughts
In the end I did enjoy this experiment. I learned a lot about Nix, I learned more about how the Zephyr toolchain is installed, and I got to use a functional programming language to do it. I'm still not sold on the functional paradigm. It's such a difficult thing to switch to, and in the end I don't think the Nix code is any more readable than other tools.
Despite this, Nix is still not a tool that I feel comfortable bringing into regular use. The number of undocumented things I had to search the depths of the internet for was maddening. I should not have to scour GitHub hoping to find someone else's snippet to learn how to do something! It should be in the docs! The docs should also be docs! There is nice documentation for Nix the language, Nix the package management system, and Nix the CLI. There's nice documentation for the nixpkgs collection. There's the Nix Pills series, but these are already outdated and in my opinion not a complete resource. But none of these cover how to write a flake, or what attributes are present in a flake attribute set! I found getting started to be incredibly frustrating due to the lack of material to help me along.
The real dealbreaker to me is the package list. It's yet another place for developers to distribute to, and it requires learning a new language/configuration to do it. I'm not saying it's impossible, but this is a large hurdle. All of these necessary packages need to be updated on Nix's registry for any of this to be worthwhile. Users have to learn new ways of configuring these when they install them. Unfortunately it seems to be adding new complexity rather than removing it.
Ugh, I need a paragraph to rant about the cmake issue I hit. I could not believe that there's just an assumption that cmake should run during the configure stage if it is used as a nativeBuildInput! How is this consistent with the rest of the Nix package experience? It's not! I didn't ask for it, all I said was make cmake available to use to build my package.
In the end this pains me as I see the utility of this system! It's super cool that I can have an isolated environment with all of the dependencies that something like Zephyr requires. Nix was able to do this for typical system packages like cmake as well as the additional Python dependencies. However, until Nix improves documentation, unifies the standard way of creating packages/derivations (Is it flakes? Is it the older ways? Is it something else?), gives users obvious ways to configure packages (make the cmake behavior obvious), and ensures that packages in the registry get used and updated, it's not something I'll be using in my toolkit seriously.