A lot of things are currently happening in Pijul, but this short post is about a satellite project, Thrussh, which we use in Pijul and in the Nest.
Along with other authors of SSH implementations, I was warned a few weeks ago by Fabian Bäumer, Marcus Brinkmann and Jörg Schwenk, three security researchers at Ruhr University Bochum, that the SSH protocol had a flaw by which an attacker could, in some cases and with some specific algorithms, manipulate the identifier of packets in an SSH channel and break channel integrity. This is called the Terrapin Attack.
I just released a patch for Thrussh as well as a new version on Crates.io. This should be backwards-compatible and not break existing clients and servers. However, note that even though the fix is a tiny one, it has a strong impact on crucial parts of the protocol (key renegotiation).
The current flaw is an interesting one, as it touches some of the core concepts of the SSH protocol and makes it possible to explain its main ideas quite naturally.
The most important thing to understand is that what we generally want to build over a network is the illusion that there is a direct “pipe” between two computers, even though the network is a complex interconnected mesh. The first way to build a pipe is TCP, the Transmission Control Protocol. TCP splits a stream of bytes into packets and numbers them. The computer on the receiving end acknowledges the packets it has received, allowing the sender to retransmit any missing packet.
Like everything else, the correctness of such a pipe depends on our model of the network and its participants: is anyone interested in listening to the messages? in modifying them? do they have quantum computers? or maybe $5 wrenches?
With that in mind, SSH is actually not much more than a stronger pipe than TCP: it works on top of TCP and extends it with protections against some threats like eavesdroppers and people trying to manipulate the message.
The way the SSH protocol itself works is actually surprisingly simple: after an initial plain-text handshake made of a single line, the stream is split into packets (which do not necessarily match TCP packets). Packets are just a series of bytes, plus a packet type and a length, and each packet has a number, just like in TCP, except that there are more possible packet types and that the SSH packet number is never sent on the wire: each side simply counts the packets it sends and receives. Some of these types are related to cryptography, while others indicate specific things related to the most common use of the protocol: remote-controlling computers. For example, “start a shell command” is one type.
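To make the packet layout concrete, here is a minimal sketch of the unencrypted framing described in RFC 4253 (section 6): a 32-bit length, a padding length, the payload (whose first byte is the packet type), and some padding. The function names `frame_packet` and `parse_packet` are made up for this post and are not Thrussh's API. The padding is zeroed for readability, whereas a real implementation uses random bytes, and the block size is assumed to be small (at most 64 bytes).

```rust
/// Frame `payload` as an unencrypted SSH binary packet.
/// `payload[0]` is the packet type. Hypothetical helper, not Thrussh's API.
fn frame_packet(payload: &[u8], block_size: usize) -> Vec<u8> {
    // The total length must be a multiple of the block size (at least 8).
    let block = block_size.max(8);
    // 4 bytes of length + 1 byte of padding length + the payload.
    let unpadded = 4 + 1 + payload.len();
    let mut padding = block - (unpadded % block);
    // At least 4 bytes of padding are required.
    if padding < 4 {
        padding += block;
    }
    // The length field covers everything except itself.
    let packet_length = (1 + payload.len() + padding) as u32;
    let mut out = Vec::with_capacity(4 + packet_length as usize);
    out.extend_from_slice(&packet_length.to_be_bytes());
    out.push(padding as u8);
    out.extend_from_slice(payload);
    // Zero padding for readability only; real padding is random.
    out.extend(std::iter::repeat(0u8).take(padding));
    out
}

/// Parse one unencrypted packet, returning its type and the rest of the payload.
fn parse_packet(buf: &[u8]) -> Option<(u8, &[u8])> {
    if buf.len() < 5 {
        return None;
    }
    let packet_length = u32::from_be_bytes([buf[0], buf[1], buf[2], buf[3]]) as usize;
    let padding = buf[4] as usize;
    // Need the whole packet, and at least one payload byte (the type).
    if buf.len() < 4 + packet_length || packet_length < 1 + padding + 1 {
        return None;
    }
    let payload = &buf[5..4 + packet_length - padding];
    Some((payload[0], &payload[1..]))
}
```

Note that nothing in this layout carries the packet number: the receiver can only deduce it by counting.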
One common misunderstanding about SSH is that your SSH key is used to encrypt packets. It is not: an Ed25519 key is a signature key and cannot encrypt anything by itself, and RSA, the algorithm traditionally used in SSH (which is still secure and widely used), can only encrypt messages of a very limited size. And even where public-key encryption is possible, it is terribly slow compared to symmetric encryption (AES, ChaCha20…).
Therefore, an essential step in the SSH protocol is to create a shared secret. Among the packet “types” described above, a particularly important type is the one starting the creation of a shared secret via the Diffie-Hellman algorithm, one of the most important algorithms in cryptography, allowing two parties to create a shared secret in the presence of an eavesdropper. The SSH secret and public keys are used to sign the messages used during that phase, in order to make sure that the two sides (Alice and Bob) are creating a shared secret with each other, and not each with the same man-in-the-middle (Eve), since Eve could then pretend to be Alice when talking to Bob, and pretend to be Bob when talking to Alice. That way, she would be able to decrypt all the messages.
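As an illustration of the Diffie-Hellman idea only, here is a toy sketch with tiny numbers (the classic textbook parameters p = 23, g = 5). Real SSH key exchanges use very large groups or elliptic curves, and the signatures mentioned above are left out entirely. The helper name `mod_pow` is just a name chosen for this example.

```rust
/// Compute base^exp mod modulus by repeated squaring.
/// Uses u128 internally so intermediate products cannot overflow.
fn mod_pow(base: u64, mut exp: u64, modulus: u64) -> u64 {
    let m = modulus as u128;
    let mut result: u128 = 1 % m;
    let mut b = base as u128 % m;
    while exp > 0 {
        if exp & 1 == 1 {
            result = result * b % m;
        }
        b = b * b % m;
        exp >>= 1;
    }
    result as u64
}

/// Toy exchange: returns (Alice's shared secret, Bob's shared secret).
fn toy_diffie_hellman(p: u64, g: u64, alice_secret: u64, bob_secret: u64) -> (u64, u64) {
    // Each side publishes g^secret mod p; an eavesdropper sees only these.
    let alice_public = mod_pow(g, alice_secret, p);
    let bob_public = mod_pow(g, bob_secret, p);
    // Each side raises the other's public value to its own secret.
    let alice_shared = mod_pow(bob_public, alice_secret, p);
    let bob_shared = mod_pow(alice_public, bob_secret, p);
    (alice_shared, bob_shared)
}
```

Both sides end up with the same value without ever sending it. Nothing here tells Alice that the public value she received really came from Bob, which is exactly the gap the SSH keys fill by signing the exchange.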
The encoding of the packets is always the same, with a type, number and length. Initially, the packets are sent in plaintext. After negotiating the shared secret, they are still encoded in the same way, but are transmitted encrypted with a symmetric cipher (using the shared secret as the key), and authenticated using a MAC algorithm.
One important point to note is that the SSH packet number is a 32-bit unsigned integer that gets incremented with each packet, and wraps around when it reaches 2^32. This is so that even if you send the same plaintext with the same symmetric key (for example if you’re sending the same command twice), the encrypted messages will be completely different and an eavesdropper will gain no information from seeing these two messages.
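The wrap-around can be sketched in a couple of lines. The type `PacketCounter` below is a made-up name for illustration, not something taken from Thrussh.

```rust
/// A 32-bit packet counter that wraps around at 2^32 (hypothetical type).
struct PacketCounter(u32);

impl PacketCounter {
    /// Return the number to use for the current packet, then advance.
    fn advance(&mut self) -> u32 {
        let current = self.0;
        // wrapping_add goes from u32::MAX back to 0 instead of panicking.
        self.0 = self.0.wrapping_add(1);
        current
    }
}
```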
However, at some point we still need to create a new shared secret, since using the same symmetric key for a large number of messages may ultimately leak some information. Also, a key re-exchange is always slower than the normal course of the protocol, since re-creating a secret involves both public key algorithms and multiple network round-trips. But as long as we don’t do it too often, this is unnoticeable. In Thrussh, the limits (in time and number of bytes written) after which we initiate a key re-exchange are defined in the Limits type, and key re-exchange is decided when flushing packets, in functions called flush (there is one for the server and one for the client).
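The decision itself can be sketched as follows. This is not Thrussh's actual Limits type: the struct `RekeyPolicy`, its fields and the function `needs_rekey` are hypothetical names used only to show the idea of checking both a byte budget and a time budget when flushing.

```rust
use std::time::Duration;

/// Hypothetical limits after which a key re-exchange is started.
struct RekeyPolicy {
    /// Maximum number of bytes written under one key.
    max_bytes_written: u64,
    /// Maximum lifetime of one key.
    max_key_age: Duration,
}

impl RekeyPolicy {
    /// Called when flushing: should we start a new key exchange now?
    fn needs_rekey(&self, bytes_written: u64, key_age: Duration) -> bool {
        bytes_written >= self.max_bytes_written || key_age >= self.max_key_age
    }
}
```

Checking at flush time means the cost of the test is paid once per batch of packets rather than once per byte written.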
If you’re unfamiliar with cryptographic protocols, you may have glossed over the words “authenticated using a MAC algorithm”, thinking it was a small detail. It turns out that, on the contrary, message authentication is absolutely essential to our pipe metaphor. Indeed, encryption is only useful to make our messages unreadable, but we also want them to be unwritable. The most basic thing an attacker can do is modify arbitrary bytes, but this is likely to yield a scrambled plaintext output. In a more sophisticated attack, if they have more hypotheses about the packet contents (a simple thermometer next to a server is sometimes enough!), they could try to stitch parts of some packets together.
One basic way to authenticate messages is to hash them using a cryptographic hash function, append the hash, and encrypt the result: this defeats attacks where the attacker can’t read the cleartext contents of the packets. Note that even if they could read the plaintext of all the packets, they would still need to generate an encrypted version of the hash of the packet they’re crafting. Without knowing the encryption key, this requires either seeing a packet with the exact same hash, or finding a collision in the cryptographic hash function used in the authentication code. Since the hash includes the packet number, both options are quite unlikely.
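The role of the packet number in the authentication tag can be shown with a deliberately insecure toy. Instead of the hash-then-encrypt construction described above, this sketch uses a keyed hash built on the standard library's `DefaultHasher`, which is not a cryptographic hash; the names `toy_tag` and `toy_verify` are invented for this example. The only point is that the tag depends on the position of the packet in the stream.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Toy authentication tag over (key, packet number, payload).
/// NOT secure: DefaultHasher is not a cryptographic hash function.
fn toy_tag(key: &[u8], packet_number: u32, payload: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(key);
    hasher.write_u32(packet_number);
    hasher.write(payload);
    hasher.finish()
}

/// The receiver recomputes the tag with its own idea of the packet number.
fn toy_verify(key: &[u8], expected_number: u32, payload: &[u8], tag: u64) -> bool {
    toy_tag(key, expected_number, payload) == tag
}
```

A packet replayed, reordered or deleted shifts the receiver's count, so the tag no longer verifies, provided the two sides agree on the numbering in the first place.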
The designers of the SSH protocol did an excellent job in multiple ways: first, the RFCs describing the protocol are extremely clear. Then, they knew that cryptographic algorithms get broken over time and need to be replaced. Also, our uses of a protocol may change over time. For all these reasons, some packet types are reserved for future uses, with names like UNIMPLEMENTED. Others are meant to defeat traffic analysis (there’s an IGNORE message type, for example).
However, this forward compatibility comes with increased complexity. The usual way to model a protocol, especially when you try to implement it, is by having a “state machine” in mind: at any given time, the protocol is in some state, where it expects some messages to transition to another state. The number of lines of code you have to write is then roughly proportional to the number of all possible transitions from all states. If you add just one message type, you’re not just adding one case, but as many cases as there are states.
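Here is a minimal sketch of such a state machine. The states and message types are a simplification chosen for this post (a real implementation has more of both, and they differ between client and server), and `step` is a hypothetical function name.

```rust
/// Simplified connection states (an assumption, not the full protocol).
#[derive(Debug, Clone, Copy, PartialEq)]
enum State {
    Version,
    WaitKexInit,
    WaitDhReply,
    WaitNewKeys,
    Open,
}

/// Simplified message types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Msg {
    VersionLine,
    KexInit,
    DhReply,
    NewKeys,
    Ignore,
    Data,
}

/// One transition: the next state, or an error for an unexpected message.
fn step(state: State, msg: Msg) -> Result<State, String> {
    match (state, msg) {
        (State::Version, Msg::VersionLine) => Ok(State::WaitKexInit),
        (State::WaitKexInit, Msg::KexInit) => Ok(State::WaitDhReply),
        (State::WaitDhReply, Msg::DhReply) => Ok(State::WaitNewKeys),
        (State::WaitNewKeys, Msg::NewKeys) => Ok(State::Open),
        (State::Open, Msg::Data) => Ok(State::Open),
        // A key re-exchange can start from normal operation.
        (State::Open, Msg::KexInit) => Ok(State::WaitDhReply),
        // The extra message type is accepted in every state after the
        // version line, including the key exchange states.
        (s, Msg::Ignore) if s != State::Version => Ok(s),
        (s, m) => Err(format!("unexpected {:?} in state {:?}", m, s)),
    }
}
```

The single `Ignore` arm looks harmless, but it silently adds a transition to every state, including the ones in the middle of the key exchange.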
Fortunately, SSH has few states: the initial line and the “normal operation” are two of them, but then the Diffie-Hellman protocol has two or three more states, depending on who initiates the exchange. And what Bäumer, Brinkmann and Schwenk have found is that sending many of those extra packets while in those critical intermediate states can be used to break message authentication, at least in some message authentication algorithms. The trick is to send a number of these “packets from the future” during the initial key exchange, which allows the same number of actual packets to be deleted after the key exchange is over.
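The counting argument behind the trick can be sketched with two counters. This models only the packet numbers, under the assumption that the server's send counter and the client's receive counter must agree for authentication to succeed; it says nothing about which ciphers and MAC modes are actually affected. The function name and parameters are made up for illustration.

```rust
/// Returns true if the client's receive counter still matches the server's
/// send counter after the attacker has injected `injected` extra packets
/// during the unauthenticated handshake and deleted `deleted` packets after
/// the key exchange.
fn counters_still_match(handshake_packets: u32, injected: u32, deleted: u32) -> bool {
    let mut server_sent: u32 = 0;
    let mut client_received: u32 = 0;

    // Legitimate handshake packets advance both counters.
    for _ in 0..handshake_packets {
        server_sent = server_sent.wrapping_add(1);
        client_received = client_received.wrapping_add(1);
    }

    // Injected packets were never sent by the server, but the client
    // counts them anyway.
    client_received = client_received.wrapping_add(injected);

    // Deleted packets were sent by the server, but never reach the client.
    server_sent = server_sent.wrapping_add(deleted);

    server_sent == client_received
}
```

Deleting packets alone desynchronises the counters and would be detected. Injecting the same number of packets beforehand hides the deletion.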
As a fix, the authors of OpenSSH have suggested countermeasures that are easy to implement in a backwards-compatible way: negotiate the algorithms, and if the client and server both support this fix, drop the connection if unexpected packets are received during the key exchange, since these definitely come from attackers. Also, reset the packet number counter after each re-exchange. I have implemented these in Thrussh, but only tested them using the official patch against OpenSSH, so more tests may be useful. Please reach out if you want to write your first SSH client or server (less than 30 lines of Rust code required!).
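A rough sketch of the two countermeasures follows. It is not the code in Thrussh: `Receiver`, `Outcome` and the method names are hypothetical. The two marker strings are the pseudo-algorithm names documented in OpenSSH's PROTOCOL file for this extension; each side adds its marker to the list of key exchange algorithms it offers.

```rust
/// Markers advertised in the key exchange algorithm lists (OpenSSH extension).
const STRICT_CLIENT: &str = "kex-strict-c-v00@openssh.com";
const STRICT_SERVER: &str = "kex-strict-s-v00@openssh.com";

/// Strict mode applies only if both sides advertised their marker.
fn strict_negotiated(client_algs: &[&str], server_algs: &[&str]) -> bool {
    client_algs.contains(&STRICT_CLIENT) && server_algs.contains(&STRICT_SERVER)
}

#[derive(Debug, PartialEq)]
enum Outcome {
    Continue,
    Disconnect,
}

/// Hypothetical receiving side of a connection.
struct Receiver {
    strict: bool,
    in_key_exchange: bool,
    received: u32,
}

impl Receiver {
    /// `expected` is false for packets (such as IGNORE) that the key
    /// exchange does not call for at this point.
    fn on_packet(&mut self, expected: bool) -> Outcome {
        if self.strict && self.in_key_exchange && !expected {
            return Outcome::Disconnect;
        }
        self.received = self.received.wrapping_add(1);
        Outcome::Continue
    }

    /// When the new keys take effect, strict mode resets the counter, so
    /// nothing counted before the keys were in place can carry over.
    fn on_new_keys(&mut self) {
        self.in_key_exchange = false;
        if self.strict {
            self.received = 0;
        }
    }
}
```

Peers that do not advertise the marker keep the old behaviour, which is what makes the fix backwards-compatible.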
The impact is actually rather limited and only affects some algorithms. However, I find it really cool that we keep finding flaws after 30 years, especially for a protocol of that simplicity. This also serves as a reminder that SSH was designed at the beginning of the Internet, when users were more demanding and engineering companies were more ambitious: the creators of SSH certainly didn’t envision a future where such protocols would be tied to one central server for the entire internet, deciding who runs which version of the protocol, what device is too old to be useful, or what algorithm your secret key should use. Let’s not talk about those authorities forcing you to re-exchange keys.