Moritz Höppner

How to truncate CBC ciphertext

Suppose you have a large CBC ciphertext, say AES-256-CBC encrypted, and you only need the first n bytes of the plaintext. You might know that it is a text document, and you want to display a preview. Or you might want to determine its format by some magic bytes at the beginning of the file.

You could of course decrypt the whole thing and use only the first n bytes, but for a large file that means a lot of wasted work.

I'll show two ways of truncating CBC ciphertext so that you don't need to decrypt the whole file to get the first bytes. The idea is very simple: Remove everything from the ciphertext but the first blocks. But there is one gotcha, and that's padding. So I'll write a little about that. To reproduce the example, you need a *nix shell, the OpenSSL command line utility, hexdump, basenc (from GNU coreutils) and dd.

Let's prepare an example plaintext and ciphertext:

$ echo -n "Lorem ipsum dolor sit amet, consetetur" > lorem.txt
$ openssl enc -aes-256-cbc \
    -in lorem.txt \
    -out lorem.enc \
    -iv be154e2343408caa1f11ab3445bdd34c \
    -K be154e2343408caa1f11ab3445bdd34cbe154e2343408caa1f11ab3445bdd34c

Decryption is done block by block. So instead of decrypting the first n bytes, we decrypt the first m blocks, where m is the smallest number of blocks that covers those bytes, i.e. the smallest integer so that m * { block size } >= n (the block size is 16 bytes in our case). In this tutorial, I'll decrypt the first block.
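
As a small sketch of that calculation: integer arithmetic in the shell can round n up to whole blocks. The function name blocks_needed is my own invention, and the block size of 16 bytes is assumed as above.

```sh
# Hypothetical helper: number of 16-byte blocks needed to cover
# the first n plaintext bytes (n rounded up to whole blocks).
blocks_needed() {
    n=$1
    block_size=16
    echo $(( (n + block_size - 1) / block_size ))
}

blocks_needed 1     # 1
blocks_needed 16    # 1
blocks_needed 17    # 2
```

The result can be used as the count= argument of dd together with bs=16.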

The naive approach

If we only want to decrypt the first block, we should be able to remove everything else from the ciphertext:

$ dd if=lorem.enc of=lorem-truncated.enc bs=16 count=1

And decrypt:

$ openssl enc -aes-256-cbc -d \
    -in lorem-truncated.enc \
    -iv be154e2343408caa1f11ab3445bdd34c \
    -K be154e2343408caa1f11ab3445bdd34cbe154e2343408caa1f11ab3445bdd34c
bad decrypt
408F800302000000:error:1C800064:Provider routines:ossl_cipher_unpadblock:bad decrypt:providers/implementations/ciphers/ciphercommon_block.c:107:

Okay, that didn't work. The command fails with a "bad decrypt" error, raised in OpenSSL's providers/implementations/ciphers/ciphercommon_block.c:107. We can see in the source code that OpenSSL looks at the last byte of the given block buf and raises an error if this byte is 0 or greater than the block size, i.e. greater than 16 (0x10).

Let's play a little with OpenSSL to find out why this happens.

Playing with OpenSSL

To simplify the following shell commands, we can make tiny shell scripts for encryption and decryption (they have to be executable, so run chmod +x enc.sh dec.sh after creating them):

# enc.sh

openssl enc -aes-256-cbc \
    -iv be154e2343408caa1f11ab3445bdd34c \
    -K be154e2343408caa1f11ab3445bdd34cbe154e2343408caa1f11ab3445bdd34c

# dec.sh

openssl enc -aes-256-cbc -d \
    -iv be154e2343408caa1f11ab3445bdd34c \
    -K be154e2343408caa1f11ab3445bdd34cbe154e2343408caa1f11ab3445bdd34c

Okay, so the first question is: Is buf a block of the plaintext or the ciphertext? Shouldn't it be part of the ciphertext because the error occurred during decryption? Let's find out.

The truncated ciphertext looks like this:

$ hexdump -C lorem-truncated.enc
00000000  81 90 91 f9 c1 33 1b fb  6d 49 3f d5 c5 04 43 89  |.....3..mI?...C.|
00000010

We can change the last byte to a value that would pass the check (greater than 0x00 and at most 0x10), say 0x0a, to see if the error message changes:

$ echo 819091f9c1331bfb6d493fd5c504430a | basenc --base16 -d | ./dec.sh

Nope, still the same.

Okay, now the same with the plaintext. Currently, we have:

$ hexdump -C lorem.txt
00000000  4c 6f 72 65 6d 20 69 70  73 75 6d 20 64 6f 6c 6f  |Lorem ipsum dolo|
00000010  72 20 73 69 74 20 61 6d  65 74 2c 20 63 6f 6e 73  |r sit amet, cons|
00000020  65 74 65 74 75 72                                 |etetur|
00000030

Since we truncate the second and third block of the ciphertext anyway, we ignore the corresponding plaintext blocks for the moment. We can reproduce the problem with only one input block:

$ echo 4c6f72656d20697073756d20646f6c6f \
    | basenc --base16 -d \
    | ./enc.sh \
    | dd bs=16 count=1 status=none \
    | ./dec.sh
bad decrypt
408F800302000000:error:1C800064:Provider routines:ossl_cipher_unpadblock:bad decrypt:providers/implementations/ciphers/ciphercommon_block.c:107:

Now our little experiment. We change the last byte to 0x0a:

$ echo 4c6f72656d20697073756d20646f6c0a \
    | basenc --base16 -d \
    | ./enc.sh \
    | dd bs=16 count=1 status=none \
    | ./dec.sh
bad decrypt
408F800302000000:error:1C800064:Provider routines:ossl_cipher_unpadblock:bad decrypt:providers/implementations/ciphers/ciphercommon_block.c:112:

Okay, we still get a "bad decrypt" error. But: It is raised now in line 112 instead of line 107! It seems that we passed the check in ciphercommon_block.c:106 and now run into a different problem. Let's play a little with this to make sure we got it right:

$ echo 4c6f72656d20697073756d20646f6c00 \
    | basenc --base16 -d \
    | ./enc.sh \
    | dd bs=16 count=1 status=none \
    | ./dec.sh
# Last byte is 0x00 -> error in line 107

$ echo 4c6f72656d20697073756d20646f6c10 \
    | basenc --base16 -d \
    | ./enc.sh \
    | dd bs=16 count=1 status=none \
    | ./dec.sh
# Last byte is 0x10 (block size) -> error in line 112

$ echo 4c6f72656d20697073756d20646f6c11 \
    | basenc --base16 -d \
    | ./enc.sh \
    | dd bs=16 count=1 status=none \
    | ./dec.sh
# Last byte is 0x11 (> block size) -> error in line 107

Great. So the decryption fails after a block has been decrypted, and the problem is the format of the resulting plaintext. First of all, the last byte of the plaintext has to be greater than 0 and less than or equal to the block size.

Now, if the last byte is correct, why is an error raised in ciphercommon_block.c:112? The for loop looks at the last pad bytes of the block, where pad is the last byte itself. If one of these bytes is not equal to pad, an error is raised. So if pad is 0x01, the for loop only looks at the last byte itself. Let's try it:

$ echo 4c6f72656d20697073756d20646f6c01 \
    | basenc --base16 -d \
    | ./enc.sh \
    | dd bs=16 count=1 status=none \
    | ./dec.sh
Lorem ipsum dol

That works! But, of course, we lost the last byte to satisfy OpenSSL. If pad is 0x02, the for loop looks at the last two bytes and requires both of them to be 0x02:

$ echo 4c6f72656d20697073756d20646f0202 \
    | basenc --base16 -d \
    | ./enc.sh \
    | dd bs=16 count=1 status=none \
    | ./dec.sh
Lorem ipsum do

Above, we tried 0x10 as last byte. Now we know that if the last byte is 0x10 (the block size), then every byte of the block has to be 0x10:

$ echo 10101010101010101010101010101010 \
    | basenc --base16 -d \
    | ./enc.sh \
    | dd bs=16 count=1 status=none \
    | ./dec.sh

As expected, no error occurs, but also no plaintext is left.
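
To summarize the two checks we just observed, here is a rough re-implementation as a shell function. It is only a sketch of the behavior seen in the experiments above, not OpenSSL's actual code, and the name check_block_tail is made up. It takes one decrypted block as 32 hex digits and prints pad if the block passes both checks:

```sh
# Hypothetical helper mimicking the observed checks on a decrypted block.
# Argument: one 16-byte block as 32 hex digits.
# Prints the value of the last byte (pad) on success, returns 1 otherwise.
check_block_tail() {
    hex=$1
    # Expect exactly one block, i.e. 32 hex digits.
    [ "${#hex}" -eq 32 ] || return 1
    # The last byte is pad.
    rest=${hex%??}
    last=${hex#"$rest"}
    pad=$(printf '%d' "0x$last")
    # First check: pad must not be 0 or greater than the block size.
    if [ "$pad" -eq 0 ] || [ "$pad" -gt 16 ]; then
        return 1
    fi
    # Second check: each of the last pad bytes must be equal to pad.
    i=0
    while [ "$i" -lt "$pad" ]; do
        rest=${hex%??}
        last=${hex#"$rest"}
        [ "$(printf '%d' "0x$last")" -eq "$pad" ] || return 1
        hex=$rest
        i=$((i + 1))
    done
    echo "$pad"
}

check_block_tail 4c6f72656d20697073756d20646f6c01                     # prints 1
check_block_tail 4c6f72656d20697073756d20646f6c0a || echo "rejected"  # fails the second check
```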

Padding

We found out that the last bytes of our plaintext block must be one of these, or else OpenSSL complains:

01
0202
030303
...
10101010101010101010101010101010

These are padding bytes. When encrypting data, OpenSSL adds them to the plaintext so that the length of the data is a multiple of 16 bytes. After all, block ciphers need blocks of a fixed length as input, 16 bytes in the case of AES. If the plaintext length is already a multiple of 16 bytes, a whole block of 0x10 padding bytes is added. So the last byte is always a padding byte. If this last byte is B, the last B bytes are removed from the plaintext after decryption.
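
The rule can be written down in a few lines of shell. This is just an illustration of the scheme described above, with a made-up function name padding_for and a fixed block size of 16 bytes. It prints, as hex, the padding bytes that would be appended to a plaintext of the given length:

```sh
# Hypothetical helper: hex-encoded padding for a plaintext of len bytes,
# assuming a block size of 16 bytes.
padding_for() {
    len=$1
    # Between 1 and 16 bytes are added, never 0.
    pad=$(( 16 - len % 16 ))
    i=0
    out=
    while [ "$i" -lt "$pad" ]; do
        out=$out$(printf '%02x' "$pad")
        i=$((i + 1))
    done
    echo "$out"
}

padding_for 38   # 0a0a0a0a0a0a0a0a0a0a (our 38-byte lorem.txt)
padding_for 15   # 01
padding_for 16   # 10101010101010101010101010101010
```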

This is one of many possible padding schemes, called PKCS #7. You might get a little confused when you try to find its specification. At least, I did. PKCS #7 is the name of a standard for a format to store encrypted data in. It describes something called "envelope encryption", which means data is encrypted, and then the encryption key is itself encrypted and sent together with the encrypted data. RFC 2315 describes this process and mentions the above padding scheme in passing. And even though symmetric encryption doesn't necessarily have anything to do with envelope encryption, the padding scheme is now called PKCS #7. Basically the same scheme (only for smaller block sizes) is described in the PKCS #5 standard (RFC 2898), again in a very specific context, here DES-CBC encryption. To add some more confusion, PKCS #7 is today obsoleted by something called CMS, described in RFC 5652. And this specification describes the same padding scheme.

We can see the padding bytes that OpenSSL added during encryption if we decrypt with the -nopad flag, which tells OpenSSL not to strip them:

$ openssl enc -aes-256-cbc -d \
    -in lorem.enc \
    -iv be154e2343408caa1f11ab3445bdd34c \
    -K be154e2343408caa1f11ab3445bdd34cbe154e2343408caa1f11ab3445bdd34c \
    -nopad \
    | hexdump -C
00000000  4c 6f 72 65 6d 20 69 70  73 75 6d 20 64 6f 6c 6f  |Lorem ipsum dolo|
00000010  72 20 73 69 74 20 61 6d  65 74 2c 20 63 6f 6e 73  |r sit amet, cons|
00000020  65 74 65 74 75 72 0a 0a  0a 0a 0a 0a 0a 0a 0a 0a  |etetur..........|
00000030

Back to the truncating problem

I think the best option for truncating the ciphertext is to use the -nopad flag. Then OpenSSL treats padding bytes as plaintext, meaning it doesn't expect any padding bytes:

$ openssl enc -aes-256-cbc -d \
    -in lorem-truncated.enc \
    -iv be154e2343408caa1f11ab3445bdd34c \
    -K be154e2343408caa1f11ab3445bdd34cbe154e2343408caa1f11ab3445bdd34c \
    -nopad
Lorem ipsum dolo

Great, we are done. The naive approach was almost right, we simply had to add -nopad.
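
Generalized to the first n bytes, this could look roughly like the following sketch. The function name cbc_preview and its argument order are my own choice, and it assumes AES-256-CBC with a hex key and IV as in the examples above. Note that if n reaches into the last block of the file, the output may contain padding bytes, because -nopad leaves them in place.

```sh
# Hypothetical helper: print the first n plaintext bytes of an
# AES-256-CBC encrypted file without decrypting the whole file.
# Usage: cbc_preview FILE N KEY_HEX IV_HEX
cbc_preview() {
    file=$1
    n=$2
    key=$3
    iv=$4
    # Round n up to whole 16-byte blocks.
    blocks=$(( (n + 15) / 16 ))
    # Decrypt only the first blocks, then cut the result down to n bytes.
    dd if="$file" bs=16 count="$blocks" 2>/dev/null \
        | openssl enc -aes-256-cbc -d -nopad -K "$key" -iv "$iv" \
        | head -c "$n"
}

# Example call (with the key and IV from above in $key and $iv):
# cbc_preview lorem.enc 5 "$key" "$iv"
```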

But just for fun, I want to show you another possibility I thought of before I knew the -nopad option existed.

We know that the last block of the ciphertext contains the encrypted padding bytes. So we can simply keep this last block. However, the last block can only be decrypted properly if the second-to-last ciphertext block is also present, because in CBC mode every decrypted block is XORed with the preceding ciphertext block. So we keep the last two blocks. This means that our truncated ciphertext has at least three blocks. Our original ciphertext had only three blocks to begin with, so to demonstrate the procedure, we need a slightly longer plaintext.

$ echo "Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed" > lorem.txt
$ openssl enc -aes-256-cbc \
    -in lorem.txt \
    -out lorem.enc \
    -iv be154e2343408caa1f11ab3445bdd34c \
    -K be154e2343408caa1f11ab3445bdd34cbe154e2343408caa1f11ab3445bdd34c
$ hexdump -C lorem.enc
00000000  81 90 91 f9 c1 33 1b fb  6d 49 3f d5 c5 04 43 89  |.....3..mI?...C.|
00000010  a2 47 23 26 af 09 04 6e  f3 f6 1a b7 16 b9 95 18  |.G#&...n........|
00000020  93 9c e7 f4 23 a2 2e c6  c5 49 e1 73 65 52 cc 92  |....#....I.seR..|
00000030  48 b3 8f c6 61 50 50 52  1d bf bf 43 b4 c8 d8 8a  |H...aPPR...C....|
00000040

Our new lorem-truncated.enc contains the first block and the last two blocks of lorem.enc:

$ dd if=lorem.enc of=lorem-truncated.enc bs=16 count=1
$ dd if=lorem.enc bs=16 skip=2 >> lorem-truncated.enc
$ hexdump -C lorem-truncated.enc
00000000  81 90 91 f9 c1 33 1b fb  6d 49 3f d5 c5 04 43 89  |.....3..mI?...C.|
00000010  93 9c e7 f4 23 a2 2e c6  c5 49 e1 73 65 52 cc 92  |....#....I.seR..|
00000020  48 b3 8f c6 61 50 50 52  1d bf bf 43 b4 c8 d8 8a  |H...aPPR...C....|
00000030

OpenSSL can decrypt this file without errors:

$ openssl enc -aes-256-cbc -d \
    -in lorem-truncated.enc \
    -iv be154e2343408caa1f11ab3445bdd34c \
    -K be154e2343408caa1f11ab3445bdd34cbe154e2343408caa1f11ab3445bdd34c \
    | hexdump -C
00000000  4c 6f 72 65 6d 20 69 70  73 75 6d 20 64 6f 6c 6f  |Lorem ipsum dolo|
00000010  46 a3 d7 ab 1b 48 3f e6  ff db 4c 12 a0 de bf ff  |F....H?...L.....|
00000020  67 20 65 6c 69 74 72 2c  20 73 65 64 0a           |g elitr, sed.|
0000002d

The second block is garbage because OpenSSL XORed the decrypted block with the first block of lorem-truncated.enc. But it should be XORed with the second block of lorem.enc, which isn't there anymore.
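
We can even check this with the bytes from the hexdumps above. The garbage block is the raw decryption of the third block of lorem.enc XORed with the wrong ciphertext block, so XORing it once more with the first block and then with the second block of lorem.enc should give the real third plaintext block. The helper xor_hex below is a made-up name and expects two hex strings of equal length:

```sh
# Hypothetical helper: XOR two hex strings of equal length byte by byte.
xor_hex() {
    a=$1
    b=$2
    out=
    while [ -n "$a" ]; do
        # Take the first two hex digits (one byte) of each string.
        x=${a%"${a#??}"}
        y=${b%"${b#??}"}
        out=$out$(printf '%02x' $(( 0x$x ^ 0x$y )))
        a=${a#??}
        b=${b#??}
    done
    echo "$out"
}

garbage=46a3d7ab1b483fe6ffdb4c12a0debfff   # second block of the decrypted output
c1=819091f9c1331bfb6d493fd5c5044389        # first block of lorem.enc
c2=a2472326af09046ef3f61ab716b99518        # second block of lorem.enc

# Undo the XOR with c1, then apply the XOR with c2.
xor_hex "$(xor_hex "$garbage" "$c1")" "$c2"
# 6574657475722073616469707363696e, which is ASCII for "etetur sadipscin"
```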

But we can of course simply remove the garbage block and the last block, which contains the padding, from the plaintext:

$ openssl enc -aes-256-cbc -d \
    -in lorem-truncated.enc \
    -iv be154e2343408caa1f11ab3445bdd34c \
    -K be154e2343408caa1f11ab3445bdd34cbe154e2343408caa1f11ab3445bdd34c \
    | dd bs=16 count=1 status=none
Lorem ipsum dolo
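
For completeness, here is a sketch of this second approach for an arbitrary number of leading blocks. Again, the function name cbc_preview_tail and its arguments are only assumptions for illustration. It keeps the first blocks and the last two blocks of the file, lets OpenSSL decrypt them with the usual padding check, and throws away everything after the leading blocks:

```sh
# Hypothetical helper: decrypt the first BLOCKS blocks of an AES-256-CBC
# file by keeping the last two ciphertext blocks to satisfy the padding check.
# Usage: cbc_preview_tail FILE BLOCKS KEY_HEX IV_HEX
cbc_preview_tail() {
    file=$1
    blocks=$2
    key=$3
    iv=$4
    # Total number of 16-byte blocks in the ciphertext.
    total=$(( $(wc -c < "$file") / 16 ))
    # We need the leading blocks plus the last two blocks.
    [ "$total" -ge $(( blocks + 2 )) ] || return 1
    {
        dd if="$file" bs=16 count="$blocks" 2>/dev/null
        dd if="$file" bs=16 skip=$(( total - 2 )) 2>/dev/null
    } | openssl enc -aes-256-cbc -d -K "$key" -iv "$iv" \
      | head -c $(( blocks * 16 ))
}

# Example call (with the key and IV from above in $key and $iv):
# cbc_preview_tail lorem.enc 1 "$key" "$iv"
```

Compared to -nopad this saves nothing, but it also works with tools that don't let you switch off the padding check.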
