Since we released the alpha version of Pijul 1.0 two months ago, a lot of things have happened. In this post, I want to share some of them, and give a roadmap for the next few weeks or months.
I’m happy to announce that we are now very close to a beta version of Pijul. A number of things needed to be fixed, and they have indeed been fixed. In particular:
We had a number of issues with SSH keys and unintuitive error messages related to network errors. For example, a temporary connection drop used to be fatal for HTTP connections, whereas Pijul can easily recover from that now. These issues looked really bad, but were actually fairly easy to fix.
The core algorithms were “almost there” when I announced the initial alpha, but, as I explain below, weren’t totally ready. In particular, the patch format had to change once. I haven’t maintained a “Changelog” file since the beginning, but that is also because most entries in that log would have been of the form “correctness of apply” or “correctness of unrecord”.
Sanakirja (our database backend) now performs more checks to detect disk errors, either accidental (the user overwrites the file) or physical (disk failures). This obviously comes with a small performance hit, but there is probably still room for optimisation there.
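To give an idea of what such a check can look like, here is a minimal sketch of checksummed pages. It is not Sanakirja's actual code: the page layout (payload followed by a 4-byte checksum), the choice of CRC-32 and the function names seal_page and open_page are all assumptions made for this example.

```rust
/// Bitwise CRC-32 (IEEE polynomial, reflected form 0xEDB88320).
/// Slow but dependency-free; real code would use a table or hardware support.
fn crc32(data: &[u8]) -> u32 {
    let mut crc: u32 = 0xFFFF_FFFF;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            // All ones if the low bit is set, zero otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Hypothetical page layout: the payload, then its checksum (little-endian).
fn seal_page(payload: &[u8]) -> Vec<u8> {
    let mut page = payload.to_vec();
    page.extend_from_slice(&crc32(payload).to_le_bytes());
    page
}

/// Returns the payload if the stored checksum matches, and `None` if the
/// page is too short or has been corrupted.
fn open_page(page: &[u8]) -> Option<&[u8]> {
    if page.len() < 4 {
        return None;
    }
    let (payload, stored) = page.split_at(page.len() - 4);
    let stored = u32::from_le_bytes([stored[0], stored[1], stored[2], stored[3]]);
    if crc32(payload) == stored {
        Some(payload)
    } else {
        None
    }
}

fn main() {
    let mut page = seal_page(b"pristine data");
    assert_eq!(open_page(&page), Some(&b"pristine data"[..]));

    // Simulate a single flipped bit on disk: the check now fails.
    page[3] ^= 0x01;
    assert!(open_page(&page).is_none());
}
```

The performance hit mentioned above comes from recomputing the checksum whenever a page is read or written, which is why the cost depends heavily on how the checksum is implemented.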
On the “software engineering” side of things, I’ve recently simplified the design of the library. We used to have two giant traits to describe all the operations that could be done on a repository: for example, record and output need to look at the pristine and at the current working copy, but with different mutability, while apply only needs one channel of the pristine. There are many functions like that in Pijul, requiring overlapping subsets of the trait, which makes the boundaries somewhat unclear.
One of my recent patches simplifies that design by splitting these traits into smaller pieces, in order to make it easier to write alternative backends. The consequence is that many functions now require complicated trait bounds, such as ChannelTxnT + TreeTxnT + ChannelIter. Also, there is still a large number of macros, but this is also because Sanakirja still has an unsafe interface (in part due to the lack of generic associated types in Rust) and needs wrappers to be used safely.
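The trade-off of splitting traits can be sketched as follows. Only the trait names ChannelTxnT and TreeTxnT come from libpijul; every method, type and function below is invented for this example and is much simpler than the real thing.

```rust
use std::collections::{BTreeMap, BTreeSet};

// Hypothetical, heavily simplified traits: the names are from the post,
// the methods are assumptions made for this sketch.
trait ChannelTxnT {
    fn applied_changes(&self, channel: &str) -> Vec<u64>;
}

trait TreeTxnT {
    fn is_tracked(&self, path: &str) -> bool;
}

// Needs only the channel part of a backend.
fn count_applied<T: ChannelTxnT>(txn: &T, channel: &str) -> usize {
    txn.applied_changes(channel).len()
}

// Needs both parts, hence the longer bound.
fn summary<T: ChannelTxnT + TreeTxnT>(txn: &T, channel: &str, path: &str) -> (usize, bool) {
    (count_applied(txn, channel), txn.is_tracked(path))
}

// A toy in-memory backend implementing both small traits.
#[derive(Default)]
struct MemoryTxn {
    channels: BTreeMap<String, Vec<u64>>,
    tracked: BTreeSet<String>,
}

impl ChannelTxnT for MemoryTxn {
    fn applied_changes(&self, channel: &str) -> Vec<u64> {
        self.channels.get(channel).cloned().unwrap_or_default()
    }
}

impl TreeTxnT for MemoryTxn {
    fn is_tracked(&self, path: &str) -> bool {
        self.tracked.contains(path)
    }
}

// An alternative backend that knows nothing about the working-copy tree.
struct ChannelOnly(Vec<u64>);

impl ChannelTxnT for ChannelOnly {
    fn applied_changes(&self, _channel: &str) -> Vec<u64> {
        self.0.clone()
    }
}

fn main() {
    let mut txn = MemoryTxn::default();
    txn.channels.insert("main".to_string(), vec![1, 2, 3]);
    txn.tracked.insert("src/lib.rs".to_string());

    assert_eq!(summary(&txn, "main", "src/lib.rs"), (3, true));
    // The partial backend still works wherever only channels are needed.
    assert_eq!(count_applied(&ChannelOnly(vec![7]), "main"), 1);
}
```

With one giant trait, ChannelOnly would have to implement (or stub out) every operation. With small traits, the price is paid in the function signatures instead.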
We have a basic CI system on the Nest, but it is currently only enabled for a few projects, including Pijul itself. It is based on Nix, which makes it really fast for building the same project over and over. The main reason for the restriction is that it only started to work reliably in the last few days; another reason is our limited computational resources. We plan to generalise it soon to more backends than Nix, and to open it to all public projects on the Nest. The ability to efficiently go back to arbitrary versions (work in progress) could make this quite useful.
Among the improvements to the Nest, the “Explore” page is a first step towards making this more social. I hope to be able to expand the social features of the Nest very soon.
One crucial thing for the future of Pijul is its integration into existing workflows and tools. The top priority in that direction is to get text editors to support Pijul. I did start a draft for a VSCode plugin in November, but I’m happy to see that GarettWithOneR is moving much faster, and making great progress in that direction.
We need to be more generic in our diff algorithm: at the moment, the diff is line-based, but all the algorithms in Pijul can already handle binary or word-based diff algorithms. This is probably a rather easy project, but might require some global changes in order to deal with conflicts. Once this is done, a Unity plugin could become possible and useful.
Another project I’ve started is a way to handle gigantic repositories, especially going back to tags arbitrarily far in the past. At the moment, going back in time means unapplying all the changes since the time one wants to go back to, which isn’t really acceptable for very large repositories. Once this is implemented, I’ll run a series of benchmarks on large projects and files, and report the results here.
We’re quite close to finally moving to the beta phase. The algorithms are starting to be well tested, and have solid mathematical proofs. The initial quirks are almost all gone. Before that, I want to solve (or at least close) all the currently open discussions in our main repository.
I want to finish the rollback command, which makes the “inverse” patch of another patch. This is currently implemented in libpijul, but still has one bug where inverting a conflict resolution doesn’t really work. Related to this, a more long-term goal is to handle code block movements: at the moment, Pijul’s behaviour is similar to other distributed version control systems, but there could be ways to do better.
The time since the last history reset is the most crucial metric for this project: a history reset is needed when we need to change the on-disk representation for one reason or another. It happened a few times in the past, and did happen again after the current alpha was published. This is, however, very unlikely to ever happen again.
What happened was, after only a few days of using Pijul for itself, I started noticing an issue with pijul unrecord, where patches were somehow “lossy”, in the sense that they didn’t contain enough information to unapply them (I now have a clear proof that the current patch format doesn’t lose anything).
Here is the specific issue: Pijul represents blocks of bytes in a graph, where edges are labelled with their status (deleted, alive, etc.). The recent improvements in the algorithm introduced the possibility to split vertices, which has made it necessary to add new statuses to detect when that happened.
Then, patches can add new vertices, or map the statuses of existing edges. I initially thought that the new statuses could be computed at apply time, but I was wrong, because I don’t know how to compute them when unrecording. Indeed, in order for the map of edge statuses to be invertible, it must be one-to-one, which wasn’t the case.
This has led me to reset the repository after just one week, changing the patch formats in the process. This was two months ago, and after that happened, I’ve started to work on a proof that the algorithms are correct, which I hope to publish soon.
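The one-to-one requirement can be illustrated with a toy model. This is not Pijul's patch format: the Status variants and the invert function are hypothetical, and a real patch operates on edges of a graph rather than on a bare map of statuses.

```rust
use std::collections::HashMap;

// Hypothetical edge statuses, including "split" variants that record
// that a vertex was split.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum Status {
    Alive,
    AliveSplit,
    Deleted,
    DeletedSplit,
}

/// Tries to invert a map of statuses. Returns `None` when two different
/// statuses are sent to the same target, since the original status can
/// then no longer be recovered.
fn invert(map: &HashMap<Status, Status>) -> Option<HashMap<Status, Status>> {
    let mut inverse = HashMap::new();
    for (&from, &to) in map {
        if inverse.insert(to, from).is_some() {
            return None; // not one-to-one
        }
    }
    Some(inverse)
}

fn main() {
    // Lossy: both kinds of alive edges collapse to the same status,
    // so "unrecording" cannot tell which one to restore.
    let lossy = HashMap::from([
        (Status::Alive, Status::Deleted),
        (Status::AliveSplit, Status::Deleted),
    ]);
    assert!(invert(&lossy).is_none());

    // One-to-one: the extra status keeps the information needed to undo.
    let exact = HashMap::from([
        (Status::Alive, Status::Deleted),
        (Status::AliveSplit, Status::DeletedSplit),
    ]);
    let inverse = invert(&exact).expect("map is one-to-one");
    assert_eq!(inverse[&Status::DeletedSplit], Status::AliveSplit);
}
```

In this toy model, the fix is to keep the distinction in the target statuses, which loosely mirrors why the patch format itself had to carry more information.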
When I started implementing the new algorithms a few months ago, the community was rather small. However, the fragile, clunky, alpha version grew significantly beyond my expectations. In particular, a number of people have made great contributions to the code, ranging from fixing a minor compilation error, to new features, design discussions, etc.
It is very hard to make an exhaustive list of all the people who have made this project what it is today. Florent Becker provided the initial impulse, as well as many insights, code contributions, and friendly support for years. Also, the current state of things wouldn’t have been possible without the enthusiasm of lthms, tae and Florian Gilcher, among others.
I want to thank Pierre-Yves David, Laurent Bulteau and Florian Horn, with whom I’ve started a collaboration on research topics related to version control. Pierre-Yves is one of the main contributors to Mercurial, and the founder of Octobus, which looks like a really cool company if you’re interested in Rust, version control systems, or (and especially) both.
I would also like to thank all the new contributors of the last two months, listed here. In particular, cole-h and loewenheim have contributed to many discussions and proposed many improvements, ranging from compilation errors to colours in the change visualisation, to the ergonomics of a number of commands (pijul record --amend or pijul unrecord are just examples). And danieleades taught me about modern error management in Rust (I had not looked into that topic since the days of error-chain).
Pijul is based on a number of layers, and there have also been great contributions on them: the manual has seen many contributions. Jason-ni patiently tested asynchronous issues in Thrussh.
I also want to give special thanks to tankf33der, who has patiently discovered a truly impressive number of bugs. Some of these bugs were easy to fix (such as making HTTPS more secure on this site and nest.pijul.com), others required deep redesigns (such as introducing CRC checks in Sanakirja to detect disk errors). Many of them seem to have been inspired by a “what if?” testing methodology rather than actual usage, which led to small, reproducible test cases. The title of this post (“how to survive”) was inspired by a message from him after one of my patches once again broke his repository (sorry about that!).
Finally, I wanted to thank Paul Hammant for giving me really useful insights about his professional experience as a trunk-based development consultant, and the possible future of Pijul as a trunk-based development tool. If you’re interested in development methodology, you might enjoy reading about his VCS Nirvana, as well as other posts on his blog.
A number of adventurous people have used the Advent of Code puzzles to learn Pijul, which I find really cool.
I believe Emily is the only one who completed all 25 puzzles.
CT075 came close with 24 puzzles solved.
The others I know of are (in alphabetical order) henil, idmyn, jraregris, krixano.
We’ve had an IRC channel on Freenode for a long time, but neither Florent nor I have been very active on it. I’ve never been really good at IRC: I find simultaneous conversations hard to follow and history impossible to search. Many things require bots I don’t have the time to write, including mentions, direct messages, etc.
After reading how Mozilla replaced their IRC server, taking opinions of the community, I decided to try out Zulip.
The address is https://pijul.zulipchat.com/, and invitations aren’t required. Feel free to come, say hello, and tell us what you think!