April 3, 2021

A linker script for keeping configuration files synced

This post is mainly to test Hugo’s code formatting capabilities. I have two machines – a desktop PC and a laptop – that I like to keep in sync. I do this by keeping my various configuration files in a git repo called dotfiles, which is backed up to a web server. Then I just make symbolic links from all the places my programs look for their configs to the config files in the repo. I push to the server when I make changes, and fetch from it when I need to propagate them to the other machine.

The system works pretty well if you already have the links set up. I could probably automate the pushing and pulling from the remote, but I actually quite like having control over when something might change. Having the dotfiles in a git repo is great, because it means I can just try things out, and if something breaks I can roll back the changes pretty easily (so long as I remember to check out a development branch of the repo first).

Together with emacs use-package and its :ensure functionality, I can have my emacs setup the same across my machines, which is nice. Maybe I’ll write more about my various dotfiles in future posts.

The main issue I had is that if I added a new config file to the repo, I would then have to manually add the symlink on the other machine. It’s not a big inconvenience, but especially when the config file isn’t just a dotfile that sits in my home directory, it’s annoying to have to figure out where I’m putting it. So I decided to see what I could do to automate the process.

The trick is that I replicate the directory structure in $HOME inside the dotfiles repo. For example, my config file for ssh lives in ~/.ssh/config but that’s really just a symlink to ~/.dotfiles/.ssh/config. Then, I use fd (aka fdfind in the Debian repos) to find all the files in dotfiles (excluding some things) and then use a python script to make the necessary dirs and set up the symlinks.
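To make the mirroring concrete, here is a minimal sketch of the mapping from a file in the repo to the place its symlink should live. The helper name link_location and the example paths are made up for illustration; they are not part of the script below.

```python
from pathlib import PurePosixPath


def link_location(repo_file, dotfiles, home):
    """Return where the symlink for repo_file should live under home.

    The path relative to the repo root is reused unchanged under home.
    """
    relative = PurePosixPath(repo_file).relative_to(dotfiles)
    return PurePosixPath(home) / relative


# Hypothetical paths, purely for illustration.
example = link_location(
    "/home/me/.dotfiles/.ssh/config", "/home/me/.dotfiles", "/home/me"
)
print(example)  # /home/me/.ssh/config
```

Because the repo mirrors $HOME, no lookup table is needed: the relative path is the whole mapping.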

Anyway, here’s the script:

#!/usr/bin/env python3

from pathlib import Path
from subprocess import run

Both subprocess and pathlib ship with the standard library (pathlib since Python 3.4), so there’s nothing extra to install. Both really just wrap some os module stuff, so I could probably make this work on older Pythons by learning how os works. But I don’t want to, because these convenient and simple interfaces exist and they make my life easier.

Do I need to specify python3 explicitly here? As it turns out, yes: pathlib is only in the standard library from Python 3.4, the f-strings further down need 3.6, and the capture_output argument to run needs 3.7. So if I wanted a python2 version I’d probably have to faff about with os, and nobody wants to see that.

DOTFILES = Path.home().joinpath(".dotfiles/")

shell_cmd = r"fdfind --hidden --type f --exclude '\.git' --exclude '\#*' ."

excludes = [".gitignore", "halp", "linker.py", "linker.sh"]

DOTFILES is the path to the git repo where I store the config files. Because I’m using pathlib, I can actually make this somewhat environment neutral, because Path.home() will work regardless of what system I’m on. This advantage is totally undermined by the fact that I’m relying on fdfind, and on a unix-y sort of system for where to place config files (as hidden files in your home directory). But still, it felt like good practice to do this rather than just use Path("~/.dotfiles"), which would also need a call to expanduser() before the ~ meant anything.

There are two sorts of excluding going on here. Inside the shell_cmd string I’m excluding, for example, the .git directory and emacs temp files. Then there’s the excludes list, which is just a list of files that live in the .dotfiles directory that I don’t want to link anywhere. These include, for example, the linker script itself and the halp file, which is just a list of useful commands and tips that I use only intermittently. Having this file saves me having to look them up once every few months, but I don’t need a symlink for it: I have an alias set up for less .dotfiles/halp so I can access my notes anywhere. I’ve also recently discovered cht.sh, which performs something of the same function.
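If the dependence on fdfind ever becomes a nuisance, the search could in principle be done in Python alone. The following is only a sketch of that alternative, not what the script does: find_dotfiles is a made-up name, and it is not a drop-in equivalent of the fdfind call (for one thing, as far as I know fd also honours .gitignore by default, which this does not).

```python
import os
import tempfile
from pathlib import Path


def find_dotfiles(root):
    """Return sorted paths, relative to root, of files to consider linking.

    Skips everything under .git and any file whose name starts with '#'.
    """
    root = Path(root)
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Pruning in place stops os.walk from descending into .git.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            if not name.startswith("#"):
                found.append((Path(dirpath) / name).relative_to(root))
    return sorted(found)


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / ".git").mkdir()
    (root / ".git" / "HEAD").write_text("ref")
    (root / ".ssh").mkdir()
    (root / ".ssh" / "config").write_text("Host example")
    (root / ".bashrc").write_text("# bashrc")
    (root / "#scratch#").write_text("temp")
    demo = find_dotfiles(root)

print([p.as_posix() for p in demo])  # ['.bashrc', '.ssh/config']
```

The returned paths are relative to the repo root, which is the same shape as the output the script gets from running fdfind inside the repo.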

prcs = run(shell_cmd, shell=True, capture_output=True, cwd=DOTFILES)
print("Found the following dotfiles:\n", prcs.stdout.decode("utf-8"))

path_list = [Path(x) for x in prcs.stdout.decode("utf-8").split("\n")][:-1]

path_dirs = [x.parent for x in path_list]

Now we actually call the shell command, and process its output into path_list, which is a list of paths to files in .dotfiles, and path_dirs, which is the directories those files live in. The [:-1] at the end of the definition of path_list just chops off an empty line which gets interpreted as Path(".") – i.e. the path to the current directory – which causes problems later. I could deal with that more robustly by skipping empty lines when building the list. (Just putting Path(".") in excludes wouldn’t do it, because the check compares against pp.name, which is the empty string in that case.) Blogging as code review. Huh.
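The parsing step can be made a little more defensive than slicing off the last element. This is a sketch under the assumption that the finder prints one relative path per line; parse_paths is a hypothetical helper, not something from the script.

```python
from pathlib import Path


def parse_paths(stdout_bytes):
    """Turn the raw bytes printed by the finder into a list of Paths."""
    lines = stdout_bytes.decode("utf-8").splitlines()
    # Dropping empty strings means Path("") never sneaks in as Path(".").
    return [Path(line) for line in lines if line]


sample = b".bashrc\n.ssh/config\n"
print(parse_paths(sample))
```

Unlike [:-1], this does not throw away a real path if the output happens not to end in a newline, and it returns an empty list when nothing was found.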

It’s sometimes non-obvious what the CWD is going to be when you run a python script, so it’s safer to specify what directory to run a command in, hence cwd=DOTFILES. I guess I could have folded this into the shell_cmd string, but this way is perhaps a little more transparent, and it guarantees that you search through the same directory you use to create links to.
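A quick way to convince yourself of what cwd= does is to ask a child process where it is running. The sketch below uses the Python interpreter itself as the child command (an assumption made so that it runs without fdfind installed) and a temporary directory standing in for the repo.

```python
import os
import sys
import tempfile
from subprocess import run

with tempfile.TemporaryDirectory() as tmp:
    # Ask a child process to report its own working directory.
    child = run(
        [sys.executable, "-c", "import os; print(os.getcwd())"],
        capture_output=True,
        cwd=tmp,
    )
    reported = child.stdout.decode("utf-8").strip()
    # realpath irons out symlinked temp locations (e.g. on macOS).
    same_dir = os.path.realpath(reported) == os.path.realpath(tmp)

print(same_dir)
```

The child sees the directory passed as cwd regardless of where the parent script was launched from.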

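def main():
    # Make sure every directory that will hold a link exists under $HOME.
    for dir in path_dirs:
        if not Path.home().joinpath(dir).is_dir():
            print(f"Creating directory {dir}")
            Path.home().joinpath(dir).mkdir(parents=True, exist_ok=True)

    # Link each file unless it is excluded or a symlink is already there.
    for pp in path_list:
        home_path = Path.home().joinpath(pp)
        df_path = DOTFILES.joinpath(pp)
        if (not home_path.is_symlink()) and pp.name not in excludes:
            print(f"Linking file {df_path} to symlink {home_path}")
            try:
                home_path.symlink_to(df_path)
            except FileExistsError as err:
                print(f"File {pp} already exists at {home_path}")

if __name__ == "__main__":
    main()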

Why is only some of the code inside main()? Because I wanted to test the script without actually making any changes to the file system, so with this setup in emacs, I can run the buffer in the builtin python interpreter with C-c C-c, and then inspect path_dirs etc to see if they are what they ought to be without having to tidy up a bunch of broken symlinks if it doesn’t work. (This is something that came to me after I had already had to tidy up a bunch of broken symlinks once or twice…)
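Another way to get the same safety is to separate deciding what to link from actually linking. The sketch below is an assumption about how that could look, not the script’s own structure: plan_links is a made-up name, and it only reads the file system (via is_symlink) without changing anything.

```python
import tempfile
from pathlib import Path


def plan_links(paths, dotfiles, home, excludes):
    """Return (link, target) pairs that would be created, changing nothing."""
    planned = []
    for rel in paths:
        link = Path(home) / rel
        target = Path(dotfiles) / rel
        # Same test as the script: skip excluded names and existing symlinks.
        if rel.name not in excludes and not link.is_symlink():
            planned.append((link, target))
    return planned


# Demonstration against a throwaway "home" so nothing real is inspected.
with tempfile.TemporaryDirectory() as tmp:
    home = Path(tmp) / "home"
    dotfiles = home / ".dotfiles"
    plan = plan_links([Path(".bashrc"), Path("halp")], dotfiles, home, ["halp"])
    plan_names = [
        (link.relative_to(home).as_posix(), target.relative_to(home).as_posix())
        for link, target in plan
    ]

print(plan_names)  # [('.bashrc', '.dotfiles/.bashrc')]
```

Printing the plan first and only then looping over it to create links gives a dry run for free.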

Inside main, all we do is make the directories we need to, and then link the files we need to. This is a pretty low impact script, in that it only makes links that don’t already exist (so you can override them locally if you want) and it fails gracefully with a warning if there’s a file rather than a symlink in a location it’s trying to place a link.

I have only just really figured out which way round the arguments go when using ln and I still found the way symlink_to works to be counterintuitive. It’s a method of the path to the link, which takes as its argument the path to the file.
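Here is a tiny demonstration of the direction, run in a temporary directory so it cannot clobber anything. The file names are made up, and it assumes a system where unprivileged users can create symlinks (so Linux or macOS rather than a default Windows setup).

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "real_config"   # the file that actually exists
    link = Path(tmp) / "link_to_config"  # where the symlink will appear
    target.write_text("setting = 1\n")

    # Same direction as: ln -s real_config link_to_config
    link.symlink_to(target)

    link_is_symlink = link.is_symlink()
    link_contents = link.read_text()
    resolved_matches = link.resolve() == target.resolve()

print(link_is_symlink, resolved_matches)  # True True
```

One way to remember it: read link.symlink_to(target) as "link is a symlink to target".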

UPDATE: I tidied up the script and posted it as a gist.

© Seamus Bradley 2021
