September 8, 2020

Cryptography Introduction: Block Ciphers

by Jscrambler

Cryptography Introduction: Block Ciphers

Before we dive deep into what symmetric ciphers are and how important they are in our daily lives, there are two theoretical concepts that need to be reviewed to better understand them: (Pseudo)random Number Generators and Perfect Secrecy.

Random Numbers

Most of the cryptographic algorithms have their roots in mathematical operations performed over the data we are trying to hide from unauthorized eyes. As such, it is only logical to add random noise to help and increase the confusion created by the cryptographic algorithms.

But not any kind of noise is acceptable to be used in cryptography. It has to be secure and truly random because otherwise, it would be easy for an attacker to replicate the process and more easily decrypt your protected data. The main problem is that computers are deterministic in nature and incapable of producing truly random values on their own.

A place where the generation of random values can be observed is in a casino.

Flipping coins, rolling dice, or playing in the roulette are good examples of true randomness (if none of the items have been tampered with), as it is theoretically impossible to predict the results of those actions with 100% accuracy. However, this solution is too slow for modern cryptography, where sometimes it is necessary to generate thousands of random numbers quickly.

Another way of generating random values for physical properties is through the use of special hardware that measures thermal or electric noise. These types of hardware gather information over a small period of time and, once enough entropy has been generated, they are able to produce random values that are theoretically unpredictable by an attacker. The downside of this method is that it can also take a while to gather enough entropy and, as such, delay the cryptographic protocol from executing.

The other option to produce secure random numbers is through the use of Pseudo-Random Number Generators. These are computational and deterministic algorithms that can produce long sequences of apparently random numbers from a smaller value (often called seed or key). To be considered a pseudorandom generator, the algorithm should produce outputs indistinguishable from a uniform distribution when observed by statistical tests.

Perfect Secrecy

Claude Elwood Shannon was an American mathematician, electrical engineer, and cryptographer known as "the father of information theory". In 1945, he wrote a paper on Communication Theory of Secrecy Systems (that was only declassified in 1949), which is often used as the basis for theoretical cryptography. In this paper, he defines the concept of theoretically unbreakable ciphers as well as the concepts of confusion and diffusion.

In cryptography, confusion refers to making the relationship between ciphertext and the symmetric key as complex as possible. Diffusion is used to dissipate the statistical structure of plaintext over the bulk of ciphertext. What this means is that a single bit change on the plaintext should at least change half of the resulting ciphertext. Both these concepts are usually implemented with the use of substitutions and permutations, where substitutions increase the confusion of the cipher, deterministically changing some components, whilst permutations manipulate the order bits are represented in the final result. Both these steps are usually applied over several rounds - like it will be seen later in this post.

The term “perfect secrecy” is applied to encryption algorithms where there is no information that can correlate any particular ciphertext to the key used or the original plaintext. What this means is that the ciphertext can not be broken by any type of cryptanalysis as there is an equal probability of a given key being used to encrypt a plaintext message to a given ciphertext. However, in the literature, this is only possible in situations where the key is at least the same size as the plaintext message.

A technique that is used as a demonstration of this concept is the One-time pad. By generating a truly random key, at least as long as the message, never reusing it and always keeping it a secret between the two people communicating, it can then be used either as a modular addition or a bitwise XOR operation with the plaintext to create a perfectly secure ciphertext.

Since the early 1900s and until around the ‘80s, when computers became easier to manufacture and use, this method was often used across nations for secure communications, including the hotline between Moscow and Washington D.C. during the Cold War. These systems used two equal and long key tapes, one at each end of the communication, that were synchronized and used as messages were either sent or received.

However, this system is impractical for day-to-day usage. As such, simpler algorithms that make use of pseudorandom generators are needed - such is the case of Block Ciphers.

Block Ciphers

Unlike stream ciphers that are applied continuously bit by bit, block ciphers, like the name implies, are applied on a fixed-length bit block - which, for the example of AES, is 128 bits long.

Block ciphers are deterministic algorithms - so, for a specific key and input data block, the resulting block will always be the same. They are a secure way to protect a single block of data but, since we usually want to protect more information than just 128 bits, it is recommended to use cryptographic protocols that are built on top of this construction to increase the security of the information.

Block Cipher Modes of Operation

When discussing block ciphers, it is important to understand the different block cipher modes of operation. Block cipher modes are the overlaying algorithm that reuses the block ciphers constructions to encrypt multiple blocks of data with the same key, without compromising its security.

The first block cipher mode we are going to discuss is Electronic Codebook (ECB) mode. It is often mentioned as an example of what not to do when using block ciphers. The ECB mode applies the underlying block cipher algorithm over each block of data without any alteration to it or to the used key. As a result, if two blocks of the original data are the same, the same two blocks will be equal in terms of the encrypted version. Even if attackers can't decrypt the contents of the message, the knowledge that some blocks are equal to others might already give them insights on what the original message could be based on its context.

A better option is the Cipher Block Chaining mode (CBC) mode, which adds an extra operation before applying the block cipher algorithm. So, before applying the block cipher algorithm to a block of plaintext, a bitwise XOR operation is performed with the current block’s plaintext and last block's resulting ciphertext. In the case of the first block, a random block of data is used, which is often called an Initialization Vector (IV). There are two consequences of this extra operation - first, reusing the same key to encrypt two equal messages will produce two entirely different ciphertexts when using different IVs; second, even if two blocks on the same message are equal, their ciphered version won’t be since they will be mixed with different values before being encrypted. Also, a single bit difference on a block is propagated to all the following blocks, creating even more confusion when trying to compare two different messages. Because each block needs the result of the previous block, the encryption process is not parallelizable. However, the reverse process of decryption does not have such requirements, so multiple blocks can be decrypted at the same time. It is also possible to simply decrypt some blocks and not all of them for quick access to some information.

The CBC mode can also be reused to create a message authentication code (MAC). Instead of using a randomly generated IV, a 0 IV is used and the last block in the encryption process is used as the MAC. Any bit change on the message will result in a completely different MAC since every change is propagated until the last block. Even though it is tempting, it is not advisable to try and just execute a CBC algorithm once and keep the last block as MAC. It is always advisable to keep both operations separated and to use different cryptographic keys for each process.

A similar solution to CBC is the Cipher Feedback mode (CFB), which also uses an IV and the XOR operation over the plaintext, but this time after the block cipher algorithm. In this mode, the IV (for the first block) or the previous block (remaining blocks) is encrypted with the block cipher algorithm and its result is then mixed with the plaintext block. The restrictions and advantages over the encryption and decryption processes are the same as in CBC mode.

An alternative mode to CFB is the Output feedback mode (OFB) which, instead of using the ciphertext of the previous block in the current block’s encryption, uses the output of the block cipher applied to the IV of the previous block. The downside of this change is that the decryption process can no longer be parallelized.

The last mode we are going to discuss is the Counter mode (CTR). It is also one of the most used modes for block ciphers for both its security and performance properties. It is a mixture of ECB and OFB modes, working with the advantages of both. Like ECB, each block is independent so blocks can be encrypted, decrypted, and changed separately from each other, which facilitates its parallelizability. However, each block is not encrypted directly with the block cipher algorithm. Like in OFB, the plaintext block is XORed with the result of applying the block cipher to an IV block. The difference is that, in CTR mode, this block is not the result of the previous iteration. Instead, it is a block where the 64 first bits are a random nonce, shared across all blocks in the same message, and 64 bits of "counter". This counter portion can literally be a counter where, for each block, its value is increased by one; or it can be the result of a function that produces a sequence of results that do not repeat over a long time.

AES

Several block cipher algorithms have been created and used over the last few decades. Currently, the Advanced Encryption Standard (AES) is viewed as the standard in secure symmetric cryptography.

In the late ‘90s, the National Institute of Standards and Technology of the United States (NIST) started the process of creating a new block cipher to replace its predecessor, Data Encryption Standard (DES), since its 56-bit keysize was becoming more vulnerable to brute-force attacks as computers became faster. During this process, two Belgian cryptographers, Vincent Rijmen and Joan Daemen, came up with the Rijndael family of block ciphers with three algorithms chosen for the standard. All of them work on 128-bit blocks but have different key sizes: 128, 196, and 256 bits.

To better understand how the base algorithm works, it’s best to imagine a block of 128 bits (16 bytes) as a two dimensional array displayed as follows:

d0 d4 d8 d12
d1 d5 d9 d13
d2 d6 d10 d14
d3 d7 d11 d15

The algorithm performs operations on this matrix in 10,12 or 14 rounds, depending on the key size (128,196, and 256 - respectively). Using a specific Key expansion algorithm (which we will not approach in this post), multiple 128-bit blocks are generated from the original key, one for each round plus one to be applied at the beginning of the block transformation.

If we have the key block similarly organized as the following data block:

k0 k4 k8 k12
k1 k5 k9 k13
k2 k6 k10 k14
k3 k7 k11 k15

Then, the first operation on the data block is a simple bitwise xor operation between the data and the corresponding key, overriding the existing byte contents. This is often called the AddRoundKey step.

After this operation, we have 9, 11, or 13 rounds of applying the following operations:

1. SubBytes

For every byte on the matrix, replace it with a value on the Substitution Box, which behaves like a lookup table. This box is constructed in such a way that provides non-linearity properties to the cipher. The method to retrieve a value from this table is to split the byte we are trying to replace in half, where the first half defines the row and the second half defines the column. This step is important to increase the confusion of the encrypted text.

2. ShiftRows

This step moves bytes around by making no shifts on the first row, making a one-byte shift on the second row, two bytes on the third row, and three bytes on the fourth row. In other words, this matrix:

d0 d4 d8 d12
d1 d5 d9 d13
d2 d6 d10 d14
d3 d7 d11 d15

becomes this one:

d0 d4 d8 d12
d5 d9 d13 d1
d10 d14 d2 d6
d15 d3 d7 d11

3. MixColumns

In this step, each column is extracted from the matrix and multiplied by a static matrix with the following contents:

2 3 1 1
1 2 3 1
1 1 2 3
3 1 1 2

For those that still remember Algebra, multiplying a matrix with an array will result in an array of the same size. This array is replaced on the same place where the column was retrieved from.

4. AddRoundKey

The first and fourth steps are used to increase the confusion in the resulting cipher, whilst the second and third steps are used to increase its diffusion, propagating small bit changes to as many bytes as possible.

The final round simply executes these steps:

  1. SubBytes
  2. ShiftRows
  3. AddRoundKey

There is no cryptographical necessity for the MixColumns step, as it is easily reverted on this last round.

Usage

On the browser, there are two ways of adding cryptographic functionalities to your application: external cryptographic libraries or the browser's WebCrypto API.

A simple example of encrypting a message from the user using AES-CTR mode, if using the WebCrypto API, would look similar to this:

async function encryptMessage(message, password){
    let counter = new Uint8Array(16) // the counter to be used should have the size of a block. this creates a 16 byte buffer which is 128 bits
    window.crypto.getRandomValues(counter); // generates data to be used as a nonce for this encryption;
    let encryptionParameters = {
        name:"AES-CTR",
        counter,
        length:64 // define how many bits to be used as the counter portion. the default is set at 64 bits, but the number can be increased if the message is longer than 2^64
    };
    let encodedMessage = new TextEncoder().encode(message); // Web Crypto only works with TypedArrays (Uint8Array)
    
    let masterKey = await window.crypto.subtle.importKey(
    "raw",
    new TextEncoder().encode(password),
    "PBKDF2",
    false,
    ["deriveBits", "deriveKey"]
  ); //getting a CryptoKey object from a string like password using the PBKDF2 algorithm
        let salt = window.crypto.getRandomValues(new Uint8Array(16));
    const encodedKey = await window.crypto.subtle.deriveKey(
    {
      "name": "PBKDF2",
      salt: salt,
      "iterations": 100000,
      "hash": "SHA-256"
    },
    masterKey,
    { "name": "AES-CTR", "length": 256},
    true,
    [ "encrypt", "decrypt" ]
  ); // Derive the encryption key from the master key to increase its entropy
    
    const encryptedContent = await window.crypto.subtle.encrypt(
    encryptionParameters,
    encodedKey,
    encodedMessage
  );
    
    return encryptedContent;
}

encryptMessage("big message contents to protect", "safest password").then(encryptedContents => console.log(encryptedContents));

While this solution can be more secure, since the cryptographic algorithms are protected and isolated in the browser's native implementation, it is an asynchronous solution. This means that you have to use Promises if you intend to use this Web API. Another requirement is that it is only supported in HTTPS connections.

If a synchronous solution is required, or if you need support for legacy browser versions, a great library to use is the Stanford JavaScript Crypto Library (SJCL). The previous example can be written similarly with the SJCL:

function encryptMessage(message,password){
    sjcl.random = new sjcl.prng(8);
    let buffer = new Uint32Array(32);
    sjcl.random.addEntropy(buffer, 1024, "crypto.getRandomValues"); //initialize random generator
    let salt = sjcl.random.randomWords(8) // a word consists of 32 bits (4 bytes) 
    let iv = sjcl.random.randomWords(8)
    let encodedKey = sjcl.misc.pbkdf2(password,salt,1000);
    let aesAlgorithm = new sjcl.cipher.aes(encodedKey);
    let encondedMessage = sjcl.codec.utf8String.toBits(message);
    let encryptedContent = sjcl.mode['gcm'].encrypt(aesAlgorithm,encondedMessage,iv);
    return sjcl.codec.base64.fromBits(encryptedContent)
}

console.log(encryptMessage("big message contents to protect", "safest password"));

In this case, we are using the Galois Counter Mode (GCM), which is an adapted version of the Counter mode for improved performance since the normal counter mode is not available in the default SJCL script.

At the server-side, the Node.js Crypto API can be used as well to take care of the encryption algorithms, or simply reuse the SJCL:

const crypto = require('crypto');

function encryptMessage(message,password){
    const algorithm = 'aes-192-ctr';
    const key = crypto.scryptSync(password, 'salt', 24); // generate a secure crypto key from a string based password
    const iv = crypto.randomBytes(16);
    const aesCipher = crypto.createCipheriv(algorithm, key, iv);
    let encryptedContent = aesCipher.update(message, 'utf8','hex');
    encryptedContent += aesCipher.final('hex');
    return encryptedContent;
}

console.log(encryptMessage("big message contents to protect", "safest password"));

Final Thoughts

Encryption is very important in order to protect both our own and our user’s data in today’s world. It is easy to assume that either the browser (or the operating system it is being executed on) could become compromised, and the assumptions given by a HTTPS connection can occur only after an attacker has gained access to information.

For scenarios where it is important to guarantee sensitive information to send to the server-side as soon as possible, a solution can be to reuse the user’s password. That can be accomplished by deriving a secret key from it to be used in a symmetric cryptographic algorithm, like the AES described in this post. Since only the user and the server know the original password, and given that the operation is deterministic, it may be a useful and safe solution to implement.

The key aspect to retain from this post is to use the correct operation mode for your use case and to never try to invent a new mode or algorithm. Misuse of cryptographic structures can introduce vulnerabilities to the system that were not there before.

Finally, in case you’re developing web or mobile applications using JavaScript, make sure you are protecting their source code from code theft, tampering, and reverse engineering. You can try this in your own code with a free Jscrambler trial.