Since our release of Pijul 0.2, I’ve been working mostly on the biggest feature of Pijul 0.3: branches. In this post, I try to explain the current plan.
First, branches in Pijul are not like branches in Git. Actually, Git branches are like Pijul’s patches, so in order to produce the simplest possible system, I was once tempted to find simpler ways to map Git workflows to patches. Sometimes, however, we do need some form of branches. There are two use cases I can think of:
I want to isolate “private” patches I’m working on currently, and prevent Pijul from pushing them to a remote repository.
I want to push patches to a remote “unreviewed” branch, so that they get “merged” to the remote repository only after reviewing. This will be the Pijul way of doing pull requests, only much much lighter than 3-way merge/pull requests.
Branches are of course mostly an engineer’s thing, but they’re an absolutely essential feature to have, and a clean definition would make Pijul really easy to learn.
I’ve been thinking of two different definitions/semantics:
The first definition has the merit of simplicity, both in implementation and in learning curve, whereas the other would for instance allow an experimental branch to follow a stable branch, even if patches do not pass through the experimental branch first.
These ideas definitely call for a purely functional implementation: with the same implementation as in Pijul 0.2, we could get branches almost for free. Two branches would share everything but their differences. Of course, branches that forked long ago and are “mostly merged” would not share much, but this is probably unavoidable, or at least would require whole new ideas.
I’ve contacted the authors of LMDB (the fantastic backend library we’re currentyl using), in order to know whether it would be possible to add some limited form of referential transparency to their B+ trees. They kindly replied that they would probably not do it, which I definitely understand: why would someone want to add such an overhead in space and maintenance for a feature probably used by a single user?
So, I’ve decide to start a new LMDB. It will certainly not be as fast as the original one, but it will have this missing feature. This would give the following time and space complexities:
creating a branch: time O(1), space O(1).
pushing a patch to a branch: same time/space complexity as in the one-branch case, i.e. O(p log h) when there is no missing context, plus a factor of (size of the missing context) else.
merging branch A (seen as a set of patches) to branch B (seen as a set of patches): this would amount to applying set of patches A-B to B, so the usual complexity of patch application.
After a few days of work, I have a full implementation of transactions on arbitrarily large files. My implementation is thread-safe, process-safe, and crash-resistant. It allows any number of concurrent readers and one writer at a time. Moreover, the writer does not exclude readers (readers still read the version before the writer started).
The only operation excluding readers is when a writer commits its transaction, which is done by a call to msync on a single page. I’m now starting to fill these pages with a useful data structure, like B+ trees.
I’ve tried to pay close attention to compatibility with 32-bit machines, even for large files. The only non-portable things right now are (1) calls to mmap and msync, which need to be translated to mmap64, and to windows when this part of the library becomes stable and (2) machines with page size other than 4kB are not yet supported. The only such architecture I’m aware of is SPARC (which also happens to be big-endian, but I already handle this).
Since I don’t want the nodes of my B-trees to be more than an x86-page, this probably means that nodes of more than 4kB (i.e. very large keys and/or values) will involve some amount of copying on SPARC.
Some translation of the base layer (pages allocator) to windows, 32 bits (i.e. using mmap64 instead of mmap when needed), and SPARC.
Cross-platform fallocate (I’m using ftruncate right now), and more calls to fallocate when the file needs to grow. This is probably the easiest part (only one part causes the file to grow).
Organizing the inside of B-tree nodes. One pleasant surprise here is, in Pijul, the most performance critical database is the graph base, but that one uses keys and values of fixed size. A less pleasant surprise is that there are also bases with keys and values of non-fixed size. This probably means that I need to implement both possibilities, since fixed-size keys can be implemented with binary trees inside the pages, whereas insertion and deletions on other nodes would need a sophisticated allocator inside a single page (or probably just a memmove for now, at least before profiling).