These deep infrastructure changes were certainly triggered by unfortunate events, but probably happened at the right time, just before finalising the entire format. While the servers were down, a number of things happened:
Thanks to @tankf33der’s relentless testing, we’ve caught three bugs in Sanakirja that could cause data corruption on very large instances. If you are using Sanakirja, you should make sure that you’re using version 1.2.7 (or later) of crate sanakirja-core.
@rohan’s changes were finally applied, and now allow different file encodings to work with our diff algorithms.
Speaking of diff, we now have an efficient implementation of binary diffs, based on a rolling checksum (like rsync), with my diffs crate then run on the result.
We’ve also fixed a number of crashes in Pijul, mostly linked to careless error handling.
The performance of a number of commands (including pijul record and pijul unrecord) has improved a lot, thanks to benchmarking and various algorithmic tricks.
Commands to output a repository (including pijul reset and pijul channel switch) are now able to output files in parallel. pijul record also has support for running in parallel, but this is turned off at the moment, until we find a solution to make the resulting patches fully deterministic.
Tags finally got their proper implementation: they can be used with the pijul tag subcommand, and provide a way to navigate through very large histories, which until now was only possible via very costly chains of unrecords. The implementation is based on a trick in the design of Sanakirja 1.0, allowing one to read compressed databases very efficiently. I’ll probably blog about that bit of our technology at some point.
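For readers unfamiliar with the rolling checksum idea mentioned above for binary diffs, here is a minimal sketch in the style of rsync's weak checksum. It is not Pijul's actual code: the names Rolling and find_block are hypothetical, and the 16-bit modulus is an assumption borrowed from rsync's design. The point is only that sliding the window by one byte costs constant time, so a known block can be searched for at every offset of a large file cheaply.

```rust
/// Rolling checksum over a fixed-size window (rsync-style weak checksum).
/// Hypothetical illustration, not Pijul's implementation.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Rolling {
    a: u32,   // sum of the bytes in the window, mod 2^16
    b: u32,   // position-weighted sum of the bytes, mod 2^16
    len: u32, // window length
}

const MASK: u32 = 0xffff;

impl Rolling {
    /// Compute the checksum of a window from scratch.
    fn new(window: &[u8]) -> Self {
        let len = window.len() as u32;
        let (mut a, mut b) = (0u32, 0u32);
        for (i, &x) in window.iter().enumerate() {
            a = a.wrapping_add(x as u32) & MASK;
            // The first byte has weight `len`, the last byte has weight 1.
            b = b.wrapping_add((len - i as u32).wrapping_mul(x as u32)) & MASK;
        }
        Rolling { a, b, len }
    }

    /// Slide the window one byte to the right in O(1):
    /// `out` leaves on the left, `inc` enters on the right.
    fn roll(&mut self, out: u8, inc: u8) {
        self.a = self.a.wrapping_sub(out as u32).wrapping_add(inc as u32) & MASK;
        self.b = self
            .b
            .wrapping_sub(self.len.wrapping_mul(out as u32))
            .wrapping_add(self.a)
            & MASK;
    }

    fn digest(&self) -> u32 {
        (self.b << 16) | self.a
    }
}

/// Return every offset in `data` where `block` occurs, using the rolling
/// checksum as a cheap filter and a byte comparison to confirm each hit.
fn find_block(block: &[u8], data: &[u8]) -> Vec<usize> {
    let n = block.len();
    let mut hits = Vec::new();
    if n == 0 || data.len() < n {
        return hits;
    }
    let target = Rolling::new(block).digest();
    let mut r = Rolling::new(&data[..n]);
    let mut i = 0;
    loop {
        if r.digest() == target && &data[i..i + n] == block {
            hits.push(i);
        }
        if i + n >= data.len() {
            break;
        }
        r.roll(data[i], data[i + n]);
        i += 1;
    }
    hits
}
```

In a real binary diff, the checksums of the old file's blocks would typically be stored in a hash table and each window of the new file looked up there; the matched and unmatched regions would then be handed to a regular diff algorithm.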