DEV Community

Cover image for Encryption: ciphers, digests, salt, IV
Golf-Lang
Golf-Lang

Posted on • Edited on • Originally published at golf-lang.blogspot.com

Encryption: ciphers, digests, salt, IV

What is encryption

Encryption is a method of turning data into an unusable form that can be made useful only by means of decryption. The purpose is to make data available solely to those who can decrypt it (i.e. make it usable). Typically, data needs to be encrypted to make sure it cannot be obtained in case of unauthorized access. It is the last line of defense after an attacker has managed to break through authorization systems and access control.

This doesn't mean all data needs to be encrypted, because often times authorization and access systems may be enough, and in addition, there is a performance penalty for encrypting and decrypting data. If and when the data gets encrypted is a matter of application planning and risk assessment, and sometimes it is also a regulatory requirement, such as with HIPAA or GDPR.

Data can be encrypted at-rest, such as on disk, or in transit, such as between two parties communicating over the Internet.

Here you will learn how to encrypt and decrypt data using a password, also known as symmetrical encryption. This password must be known to both parties exchanging information.

Cipher, digest, salt, iterations, IV

To properly and securely use encryption, there are a few notions that need to be explained.

A cipher is the algorithm used for encryption. For example, AES256 is a cipher. The idea of a cipher is what most people will think of when it comes to encryption.

A digest is basically a hash function that is used to scramble and lengthen the password (i.e. the encryption key) before it's used by the cipher. Why is this done? For one, it creates a well randomized, uniform-length hash of a key that works better for encryption. It's also very suitable for "salting", which is the next one to talk about.

The "salt" is a method of defeating so-called "rainbow" tables. An attacker knows that two hashed values will still look exactly the same if the originals were. However, if you add the salt value to hashing, then they won't. It's called "salt" because it's sort of mixed with the key to produce something different. Now, a rainbow table will attempt to match known hashed values with precomputed data in an effort to guess a password. Usually, salt is randomly generated for each key and stored with it. In order to match known hashes, the attacker would have to precompute rainbow tables for great many random values, which is generally not feasible.

You will often hear about "iterations" in encryption. An iteration is a single cycle in which a key and salt are mixed in such a way to make guessing the key harder. This is done many times so to make it computationally difficult for an attacker to reverse-guess the key, hence "iterations" (plural). Typically, a minimum required number of iterations is 1000, but it can be different than that. If you start with a really strong password, generally you need less.

IV (or "Initialization Vector") is typically a random value that's used for encryption of each message. Now, salt is used for producing a key based on a password. And IV is used when you already have a key and now are encrypting messages. The purpose of IV is to make the same messages appear differently when encrypted. Sometimes, IV also has a sequential component, so it's made of a random string plus a sequence that constantly increases. This makes "replay" attacks difficult, which is where attacker doesn't need to decrypt a message; but rather an encrypted message was "sniffed" (i.e. intercepted between the sender and receiver) and then replayed, hoping to repeat the action already performed. Though in reality, most high-level protocols already have a sequence in place, where each message has, as a part of it, an increasing packet number, so in most cases IV doesn't need it.

Prerequisites

This example uses Golf framework. Install it first.

Encryption example

To run the examples here, create an application "enc" in a directory of its own (see mgrg for more on Golf's program manager):

mkdir enc_example
cd enc_example
gg -k enc
Enter fullscreen mode Exit fullscreen mode

To encrypt data use encrypt-data statement. The simplest form is to encrypt a null-terminated string. Create a file "encrypt.golf" and copy this:

begin-handler /encrypt public
    set-string str = "This contains a secret code, which is Open Sesame!"
    // Encrypt
    encrypt-data str to enc_str password "my_password"
    p-out enc_str
    @
    // Decrypt
    decrypt-data enc_str password "my_password" to dec_str
    p-out dec_str
    @
end-handler
Enter fullscreen mode Exit fullscreen mode

You can see the basic usage of encrypt-data and decrypt-data. You supply data (original or encrypted), the password, and off you go. The data is encrypted and then decrypted, yielding the original.

In the source code, a string variable "enc_str" (which is created as a "char *") will contain the encrypted version of "This contains a secret code, which is Open Sesame!" and "dec_str" will be the decrypted data which must be exactly the same.

To run this code from command line, make the application first:

gg -q
Enter fullscreen mode Exit fullscreen mode

Then have Golf produce the bash code to run it - the request path is "/encrypt", which in our case is handled by function "void encrypt()" defined in source file "encrypt.golf". In Golf, these names always match, making it easy to write, read and execute code. Use "-r" option in gg to specify the request path and get the code you need to run the program:

gg -r --req="/encrypt" --silent-header --exec
Enter fullscreen mode Exit fullscreen mode

You will get a response like this:

72ddd44c10e9693be6ac77caabc64e05f809290a109df7cfc57400948cb888cd23c7e98e15bcf21b25ab1337ddc6d02094232111aa20a2d548c08f230b6d56e9
This contains a secret code, which is Open Sesame!
Enter fullscreen mode Exit fullscreen mode

What you have here is the encrypted data, and then this encrypted data is decrypted using the same password. Unsurprisingly, the result matches the string you encrypted in the first place.

Note that by default encrypt-data will produce encrypted value in a human-readable hexadecimal form, which means consisting of hexadecimal characters "0" to "9" and "a" to "f". This way you can store the encrypted data into a regular string. For instance it may go to a JSON document or into a VARCHAR column in a database, or pretty much anywhere else. However you can also produce a binary encrypted data. More on that in a bit.

Encrypt data into a binary result

In the previous example, the resulting encrypted data is in a human-readable hexadecimal form. You can also create binary encrypted data, which is not a human-readable string and is also shorter. To do that, use "binary" clause. Replace the code in "encrypt.golf" with:

begin-handler /encrypt public
    set-string str = "This contains a secret code, which is Open Sesame!"
    // Encrypt
    encrypt-data str to enc_str password "my_password" binary
    // Save the encrypted data to a file
    write-file "encrypted_data" from enc_str
    get-app directory to app_dir
    @Encrypted data written to file <<p-out app_dir>>/encrypted_data
    // Decrypt data
    decrypt-data enc_str password "my_password" binary to dec_str
    p-out dec_str
    @
end-handler
Enter fullscreen mode Exit fullscreen mode

When you want to get binary encrypted data, you should get its length in bytes too, or otherwise you won't know where it ends, since it may contain null bytes in it. Use "output-length" clause for that purpose. In this code, the encrypted data in variable "enc_str" is written to file "encrypted_data", and the length written is "outlen" bytes. When a file is written without a path, it's always written in the application home directory (see directories), so you'd use get-app to get that directory.

When decrypting data, notice the use of "input-length" clause. It says how many bytes the encrypted data has. Obviously you can get that from "outlen" variable, where encrypt-data stored the length of encrypted data. When encryption and decryption are decoupled, i.e. running in separate programs, you'd make sure this length is made available.

Notice also that when data is encrypted as "binary" (meaning producing a binary output), the decryption must use the same.

Make the application:

gg -q
Enter fullscreen mode Exit fullscreen mode

Run it the same as before:

gg -r --req="/encrypt" --silent-header --exec
Enter fullscreen mode Exit fullscreen mode

The result is:

Encrypted data written to file /var/lib/gg/enc/app/encrypted_data
This contains a secret code, which is Open Sesame!
Enter fullscreen mode Exit fullscreen mode

The decrypted data is exactly the same as the original.

You can see the actual encrypted data written to the file by using "octal dump" ("od") Linux utility:

od -c /var/lib/gg/enc/app/encrypted_data
Enter fullscreen mode Exit fullscreen mode

with the result like:

$ od -c /var/lib/gg/enc/app/encrypted_data
0000000   r 335 324   L 020 351   i   ; 346 254   w 312 253 306   N 005
0000020 370  \t   )  \n 020 235 367 317 305   t  \0 224 214 270 210 315
0000040# 307 351 216 025 274 362 033   % 253 023   7 335 306 320
0000060 224#   ! 021 252     242 325   H 300 217#  \v   m   V 351
0000100
Enter fullscreen mode Exit fullscreen mode

There you have it. You will notice the data is binary and it actually contains the null byte(s).

Encrypt binary data

The data to encrypt in these examples is a string, i.e. null-delimited. You can encrypt binary data just as easily by specifying it whole (since Golf keeps track of how many bytes are there!), or specifying its length in "input-length" clause, for example copy this to "encrypt.golf":

begin-handler /encrypt public
    set-string str = "This c\000ontains a secret code, which is Open Sesame!"
    // Encrypt
    encrypt-data str to enc_str password "my_password" input-length 12
    p-out enc_str
    @
    // Decrypt
    decrypt-data enc_str password "my_password" to dec_str 
    // Output binary data; present null byte as octal \000

    string-length dec_str to res_len
    start-loop repeat res_len use i start-with 0
        if-true dec_str[i] equal 0
            p-out "\\000"
        else-if
            pf-out "%c", dec_str[i]
        end-if
    end-loop
    @
end-handler
Enter fullscreen mode Exit fullscreen mode

This will encrypt 12 bytes at memory location "enc_str" regardless of any null bytes. In this case that's "This c" followed by a null byte followed by "ontain" string, but it can be any kind of binary data, for example the contents of a JPG file.

On the decrypt side, you'd obtain the number of bytes decrypted in "output-length" clause. Finally, the decrypted data is shown to be exactly the original and the null byte is presented in a typical octal representation.

Make the application:

gg -q
Enter fullscreen mode Exit fullscreen mode

Run it the same as before:

gg -r --req="/encrypt" --silent-header --exec
Enter fullscreen mode Exit fullscreen mode

The result is:

6bea45c2f901c0913c87fccb9b347d0a
This c\000ontai
Enter fullscreen mode Exit fullscreen mode

The encrypted value is shorter because the data is shorter in this case too, and the result matches exactly the original.

Use any cipher or digest

The encryption used by default is AES256 and SHA256 hashing from the standard OpenSSL library, both of which are widely used in cryptography. You can however use any available cipher and digest (i.e. hash) that is supported by OpenSSL (even the custom ones you provide).

To see which algorithms are available, do this in command line:

#get list of cipher providers
openssl list -cipher-algorithms

#get list of digest providers
openssl list -digest-algorithms
Enter fullscreen mode Exit fullscreen mode

These two will provide a list of cipher and digest (hash) algorithms. Some of them may be weaker than the default ones chosen by Golf, and others may be there just for backward compatibility with older systems. Yet others may be quite new and did not have enough time to be validated to the extent you may want them to be. So be careful when choosing these algorithms and be sure to know why you're changing the default ones. That said, here's an example of using Camellia-256 (i.e. "CAMELLIA-256-CFB1") encryption with "SHA3-512" digest. Replace the code in "encrypt.golf" with:

begin-handler /encrypt public
    set-string str = "This contains a secret code, which is Open Sesame!"
    // Encrypt data
    encrypt-data str to enc_str password "my_password" \
        cipher "CAMELLIA-256-CFB1" digest "SHA3-512"
    p-out enc_str
    @
    // Decrypt data
    decrypt-data enc_str password "my_password"  to dec_str \
        cipher "CAMELLIA-256-CFB1" digest "SHA3-512"
    p-out dec_str
    @
end-handler
Enter fullscreen mode Exit fullscreen mode

Make the application:

gg -q
Enter fullscreen mode Exit fullscreen mode

Run it:

gg -r --req="/encrypt" --silent-header --exec
Enter fullscreen mode Exit fullscreen mode

In this case the result is:

f4d64d920756f7220516567727cef2c47443973de03449915d50a1d2e5e8558e7e06914532a0b0bf13842f67f0a268c98da6
This contains a secret code, which is Open Sesame!
Enter fullscreen mode Exit fullscreen mode

Again, you get the original data. Note you have to use the same cipher and digest in both encrypt-data and decrypt-data!

You can of course produce the binary encrypted value just like before by using "binary" and "output-length" clauses.

If you've got external systems that encrypt data, and you know which cipher and digest they use, you can match those and make your code interoperable. Golf uses standard OpenSSL library so chances are that other software may too.

Using salt

To add a salt to encryption, use "salt" clause. You can generate random salt by using random-string statement (or random-crypto if there is a need). Here is the code for "encrypt.golf":

begin-handler /encrypt public
    set-string str = "This contains a secret code, which is Open Sesame!"
    // Get salt
    random-string to rs length 16
    // Encrypt data
    encrypt-data str to enc_str password "my_password" salt rs
    @Salt used is <<p-out rs>>, and the encrypted string is <<p-out enc_str>>
    // Decrypt data
    decrypt-data enc_str password "my_password" salt rs to dec_str
    p-out dec_str
    @
end-handler
Enter fullscreen mode Exit fullscreen mode

Make the application:

gg -q
Enter fullscreen mode Exit fullscreen mode

Run it a few times:

gg -r --req="/encrypt" --silent-header --exec
gg -r --req="/encrypt" --silent-header --exec
gg -r --req="/encrypt" --silent-header --exec
Enter fullscreen mode Exit fullscreen mode

The result:

Salt used is VA9agPKxL9hf3bMd, and the encrypted string is 3272aa49c9b10cb2edf5d8a5e23803a5aa153c1b124296d318e3b3ad22bc911d1c0889d195d800c2bd92153ef7688e8d1cd368dbca3c5250d456f05c81ce0fdd
This contains a secret code, which is Open Sesame!
Salt used is FeWcGkBO5hQ1uo1A, and the encrypted string is 48b97314c1bc88952c798dfde7a416180dda6b00361217ea25278791c43b34f9c2e31cab6d9f4f28eea59baa70aadb4e8f1ed0709db81dff19f24cb7677c7371
This contains a secret code, which is Open Sesame!
Salt used is nCQClR0NMjdetTEf, and the encrypted string is f19cdd9c1ddec487157ac727b2c8d0cdeb728a4ecaf838ca8585e279447bcdce83f7f95fa53b054775be1bb2de3b95f2e66a8b26b216ea18aa8b47f3d177e917
This contains a secret code, which is Open Sesame!
Enter fullscreen mode Exit fullscreen mode

As you can see, a random salt value (16 bytes long in this case) is generated for each encryption, and the encrypted value is different each time, even though the data being encrypted was the same! This makes it difficult to crack encryption like this.

Of course, to decrypt, you must record the salt and use it exactly as you did when encrypting. In the code here, variable "rs" holds the salt. If you store the encrypted values in the database, you'd likely store the salt right next to it.

Initialization vector

In practice, you wouldn't use a different salt value for each message. It creates a new key every time, and that can reduce performance. And there's really no need for it: the use of salt is to make each key (even the same ones) much harder to guess. Once you've done that, you might not need to do it again, or often.

Instead, you'd use an IV (Initialization Vector) for each message. It's usually a random string that makes same messages appear different, and increases the computational cost of cracking the password. Here is the new code for "encrypt.golf":

begin-handler /encrypt public
    // Get salt
    random-string to rs length 16
    // Encrypt data
    start-loop repeat 10 use i start-with 0
        random-string to iv length 16
        encrypt-data "The same message" to enc_str password "my_password" salt rs iterations 2000 init-vector iv cache 
        @The encrypted string is <<p-out enc_str>>
        // Decrypt data
        decrypt-data enc_str password "my_password" salt rs iterations 2000 init-vector iv to dec_str cache
        p-out dec_str
        @
    end-loop
end-handler
Enter fullscreen mode Exit fullscreen mode

Make the application:

gg -q
Enter fullscreen mode Exit fullscreen mode

Run it a few times:

gg -r --req="/encrypt" --silent-header --exec
gg -r --req="/encrypt" --silent-header --exec
gg -r --req="/encrypt" --silent-header --exec
Enter fullscreen mode Exit fullscreen mode

The result may be:

The encrypted string is 787909d332fd84ba939c594e24c421b00ba46d9c9a776c47d3d0a9ca6fccb1a2
The same message
The encrypted string is 7fae887e3ae469b666cff79a68270ea3d11b771dc58a299971d5b49a1f7db1be
The same message
The encrypted string is 59f95c3e4457d89f611c4f8bd53dd5fa9f8c3bbe748ed7d5aeb939ad633199d7
The same message
The encrypted string is 00f218d0bbe7b618a0c2970da0b09e043a47798004502b76bc4a3f6afc626056
The same message
The encrypted string is 6819349496b9f573743f5ef65e27ac26f0d64574d39227cc4e85e517f108a5dd
The same message
The encrypted string is a2833338cf636602881377a024c974906caa16d1f7c47c78d9efdff128918d58
The same message
The encrypted string is 04c914cd9338fcba9acb550a79188bebbbb134c34441dfd540473dd8a1e6be40
The same message
The encrypted string is 05f0d51561d59edf05befd9fad243e0737e4a98af357a9764cba84bcc55cf4d5
The same message
The encrypted string is ae594c4d6e72c05c186383e63c89d93880c8a8a085bf9367bdfd772e3c163458
The same message
The encrypted string is 2b28cdf5a67a5a036139fd410112735aa96bc341a170dafb56818dc78efe2e00
The same message
Enter fullscreen mode Exit fullscreen mode

You can see that the same message appears different when encrypted, though when decrypted it's again the same. Of course, the password, salt, number of iterations, and init-vector must be the same for both encryption and decryption.

Note the use of "cache" clause in encrypt-data and decrypt-data. It effectively caches the computed key (given password, salt, cipher/digest algorithms and number of iterations), so it's not computed each time through the loop. With "cache" the key is computed once, and then a different IV (in "init-vector" clause) is used for each message.

If you want to occasionally rebuild the key, use "clear-cache" clause, which supplies a boolean. If true, the key is recomputed, otherwise it's left alone. See encrypt-data for more on this.

Conclusion

You have learned how to encrypt and decrypt data using different ciphers, digests, salt and IV values in Golf. You can also create a human-readable encrypted value and a binary output, as well as encrypt both strings and binary values (like documents).

Top comments (0)