Even if you are not into IT, you have probably heard about encryption. Even if you know it serves to protect your data, you might still wonder how this is possible. In this article we take nothing for granted, and we explain what encryption is in simple terms. We will see how it works, where you can use it, its advantages, and its flaws.
What is encryption?
We can start with the textbook definition of encryption. Encryption is the process of altering a piece of information so that only the intended receivers can understand it. While this may sound complex, it is an extremely simple concept. Imagine you want to communicate with another person without a third person in the room understanding. If the three of you can speak English, but only you and your receiver can speak French, you could simply switch to French. You formulate your thought in English, but say it in French. This way, only people speaking French will understand it.
We can also give a deeper explanation to the “what is encryption?” question. With encryption, you have a message that anybody could understand (the thought). However, before making it publicly available (saying it out loud), you alter it (convert it from English to French). The receiver will convert it back in a protected environment, receiving your original thought.
Some properties of Encryption
Now that we have an idea of what encryption is, we can start to understand some of its properties. First, it must be reversible. In other words, if you are the intended receiver, you must be able to convert the encrypted message back into its original state. Of course, encryption must be secure as well: only the intended receiver must understand the message, and no one else.
This, of course, makes our plan of converting our message into French worthless. In fact, millions of people can speak French. On top of that, new people learn it every day. Thus, converting the message to French is reversible but not secure. Surely, it is nothing we can use in computer science.
What is encryption used for?
Of course, we know that encryption protects our data. However, this is far too simple a definition of what encryption can do for us. With encryption, we can achieve three main objectives.
- Confidentiality (or privacy) is the one we already know. It means that only the intended readers can understand the message, and no one else.
- Authenticity is an important guarantee for the receiver. It means the receiver can validate the original sender, and be sure that he is who he claims to be.
- Integrity is another guarantee for the receiver. With it, the receiver is sure that the data he received is the same as the data sent, and that no one altered it along the path.
These are three pillars of information security and cryptography. We will see how we can use encryption to achieve them.
How does encryption work?
Understanding what encryption is makes for a good start. However, this raises many more questions about how it functions. How is it possible that only the receiver can understand the message? What is actually happening behind the scenes? Just read on…
The key and the algorithm
All modern encryption relies on keys. Just like a key can open and close a door, a key in encryption can make the message clear or unreadable. The other pillar of encryption is the algorithm. This is what does the actual work, converting the message back and forth from clear to unreadable and vice versa. To do that, the algorithm must know the message and the key.
Several encryption algorithms exist, and each has two functions. The encryption function will take the clear message and the key, and make it unreadable. Instead, the decryption function will take the unreadable message and the key, making the message clear.
For the whole process to work, both sender and receiver must agree on the encryption algorithm and have the same key. This is intuitive: the same key you use to close the message is the one you should use to open the message.
At this point, many people find it strange that encryption algorithms are public. If they are the ones making the message unreadable, how can they be public? This may seem to impact security, but it doesn’t. In fact, a good encryption algorithm relies exclusively on the confidentiality of the key, not of the algorithm. In other words, knowing how the algorithm works is useless if you don’t have the key.
What is a message, and what is a key?
Before we start diving into the operations of an encryption algorithm, we need to clarify the concept of message and key. At first, you may think about a message as a piece of text. Well, a message could be a piece of text, but it can also be many other things. In fact, computers have plenty of data formats: video, documents, binary files, and so on. The good news is, anything can be encrypted.
In the end, computers save files as a sequence of numbers (binary numbers). With encryption, you can encrypt such numbers, effectively encrypting any type of data. If you are curious about binary numbers, we got you covered.
A key, in the end, is another sequence of numbers. Most commonly, a key is a short piece of text (a few dozen characters) that the computer will understand as a sequence of numbers. Technically, you could use any piece of data as a key.
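To make this concrete, the short Python sketch below shows how a piece of text and a key both end up as sequences of numbers. The sample strings and the choice of UTF-8 are assumptions made only for illustration.

```python
# Both the message and the key are just bytes, i.e. numbers from 0 to 255.
message = "HELLO"
key = "BRAVO"  # hypothetical key, chosen only for this example

message_bytes = message.encode("utf-8")
key_bytes = key.encode("utf-8")

print(list(message_bytes))  # [72, 69, 76, 76, 79]
print(list(key_bytes))      # [66, 82, 65, 86, 79]

# The same numbers written in binary, as the computer stores them.
message_bits = [format(b, "08b") for b in message_bytes]
print(message_bits)
```

Any other file (a video, a document) could be read as bytes in the same way, which is why an algorithm that works on numbers can encrypt any type of data.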
Inside the encryption algorithm
At this point, you can answer the question “What is encryption?”. However, we also want to understand how it looks on the inside. To find this out, we need to see how encryption algorithms work.
An encryption algorithm combines the key with the message to produce an encrypted message.
To do that, the algorithm uses ciphers. A cipher is a type of operation that alters the message, based on the key. Modern encryption algorithms use several ciphers in sequence, even multiple times, to ensure that the encrypted message is far different from the original message. We can classify ciphers in two main categories.
- Transposition Ciphers permute the original text
- Substitution Ciphers replace characters in the original text
By combining the two multiple times you will get a strong encryption process. Continue reading to find out how they work.
The Transposition Cipher
The transposition cipher is probably the easiest encryption method to understand. It simply shifts the characters of a message, so if you have an E in your message and you shift it by 1, you will get an F in your encrypted message. This is because F is one position to the right of E. Strictly speaking, cryptographers call this kind of shifting a shift (or Caesar) cipher and file it under substitution, while a textbook transposition cipher rearranges the order of the characters; we keep the simpler naming in this article. Remember that this is a visual representation, because the PC will understand bits and bytes representing the E. Specifically, the message is divided into bytes, and each is a number from 0 to 255. The PC will shift the value in the byte, turning 14 into 15 and 255 into 0.
Even if the computer reasons only with bytes, and represents even characters with them, we won’t. In this article, we are going to present manipulations directly on characters, because they are easier to understand. As we already explained, you can apply the same concept to numbers.
We use the key to define the shift. To visualize, imagine we have a key of a single character: C. Here, the shift will be of three letters, because C is the third letter in the alphabet. With this key, HELLO becomes KHOOR.
As we will see, while this encrypted message looks random, we can improve it with a better key.
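The single-character shift can be sketched in a few lines of Python. This is a minimal illustration that assumes messages made only of the uppercase letters A to Z; the function names are hypothetical, and the rule that C means a shift of three follows the convention used in this article.

```python
import string

ALPHABET = string.ascii_uppercase  # "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def shift_for(key_char: str) -> int:
    # Following the article: A shifts by 1, B by 2, C by 3, and so on.
    return ALPHABET.index(key_char) + 1

def encrypt(message: str, key_char: str) -> str:
    shift = shift_for(key_char)
    # Move every letter to the right, wrapping around after Z.
    return "".join(ALPHABET[(ALPHABET.index(c) + shift) % 26] for c in message)

def decrypt(ciphertext: str, key_char: str) -> str:
    shift = shift_for(key_char)
    # Move every letter back to the left by the same amount.
    return "".join(ALPHABET[(ALPHABET.index(c) - shift) % 26] for c in ciphertext)

def shift_byte(value: int, shift: int) -> int:
    # On raw bytes the same idea wraps around at 256 instead of 26.
    return (value + shift) % 256

print(encrypt("HELLO", "C"))  # KHOOR
print(decrypt("KHOOR", "C"))  # HELLO
print(shift_byte(255, 1))     # 0
```

Note that the same key is used in both directions: decryption simply undoes the shift that encryption applied.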
A better transposition cipher
Now, we know the process of calculating a transposition cipher. This simple process of shifting characters is always valid, but we need to use a strong key. A strong key is simply a longer key. Using a key of a single character makes it extremely easy to brute force. The attacker can attempt to decode the message using any character, from A to Z (and so on), until he gets a message that makes sense. A script could do that in milliseconds. Plus, using a single character has another disadvantage: we preserve double letters. In fact, in the example above LL turned into OO.
By using a longer key, we repeat it for the entire length of the message. Actually, we did the same with the old key: we used C, then C, then C again. If we now use “ABC”, we will use A, then B, then C, and then A again.
Now a double character in the encrypted message no longer implies a double character in the original message. The longer key is also harder to brute force: with a 26-letter alphabet, a key of 3 characters gives 26 × 26 × 26 = 17,576 possible keys instead of 26, so the attacker has 676 times more keys to try.
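The repeating key can be sketched as follows. As before, this is only an illustration that assumes uppercase A to Z messages and the article's convention that A shifts by one; the function names are hypothetical.

```python
import string
from itertools import cycle

ALPHABET = string.ascii_uppercase

def encrypt(message: str, key: str) -> str:
    out = []
    # cycle() repeats the key for the entire length of the message.
    for char, key_char in zip(message, cycle(key)):
        shift = ALPHABET.index(key_char) + 1  # A=1, B=2, C=3, ...
        out.append(ALPHABET[(ALPHABET.index(char) + shift) % 26])
    return "".join(out)

def decrypt(ciphertext: str, key: str) -> str:
    out = []
    for char, key_char in zip(ciphertext, cycle(key)):
        shift = ALPHABET.index(key_char) + 1
        out.append(ALPHABET[(ALPHABET.index(char) - shift) % 26])
    return "".join(out)

print(encrypt("HELLO", "ABC"))  # IGOMQ
print(decrypt("IGOMQ", "ABC"))  # HELLO

# Number of keys a brute-force attacker has to try for 1 and 3 characters.
print(26 ** 1, 26 ** 3)  # 26 17576
```

With the key “ABC” the double L of HELLO becomes OM, so the repeated letter is no longer visible in the encrypted message.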
Substitution cipher
A substitution cipher is a somewhat mechanical approach. It works by creating a map of characters so that a given character in the original message will correspond to another character in the ciphertext. To generate this map starting from a key, we write the key and then all the letters in the alphabet, without the ones we already used in the key. In order for this to work, the key must have no repeated characters. However, an encryption algorithm may remove duplicates from the key before using it. For example, the key BRAVO will produce an alphabet of BRAVOCDEFGHIJKLMNPQSTUWXYZ. This is the substitution alphabet.
We can relate the substitution alphabet to the plain text alphabet as below.
ABCDEFGHIJKLMNOPQRSTUVWXYZ
BRAVOCDEFGHIJKLMNPQSTUWXYZ
We can read this so that A corresponds to B, B corresponds to R, and so on. Consequently, HELLO will render as EOIIL. This cipher is not very strong, despite the high number of substitution alphabets you can come up with. Even so, its concepts, combined with other approaches, are still in use in modern algorithms like AES.
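The construction of the substitution alphabet can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes uppercase A to Z keys and messages.

```python
import string

ALPHABET = string.ascii_uppercase

def substitution_alphabet(key: str) -> str:
    # Write the key first, then the rest of the alphabet. dict.fromkeys
    # keeps only the first occurrence of each letter, in order, so repeated
    # letters in the key and letters already used are dropped.
    return "".join(dict.fromkeys(key + ALPHABET))

def encrypt(message: str, key: str) -> str:
    table = str.maketrans(ALPHABET, substitution_alphabet(key))
    return message.translate(table)

def decrypt(ciphertext: str, key: str) -> str:
    # Decryption uses the same map, read in the opposite direction.
    table = str.maketrans(substitution_alphabet(key), ALPHABET)
    return ciphertext.translate(table)

print(substitution_alphabet("BRAVO"))  # BRAVOCDEFGHIJKLMNPQSTUWXYZ
print(encrypt("HELLO", "BRAVO"))       # EOIIL
print(decrypt("EOIIL", "BRAVO"))       # HELLO
```

Notice that the double L still shows up as a double I, which is one of the reasons this cipher is weak when used alone.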
Building blocks of encryption
If you run the same cipher several times on the message, always using the same key, you are not adding security. In fact, your security would then also rely on the confidentiality of the algorithm. If the algorithm was public, and everybody knew it runs, for example, 10 transpositions, transpositions from 2 to 10 would be worthless: ten shifts of 3 letters are exactly the same as a single shift of 30 letters, which on a 26-letter alphabet is a shift of 4.
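A quick way to convince yourself is to compare ten rounds of the same shift with one equivalent shift. The sketch below assumes uppercase A to Z messages, and the `shift` helper is a hypothetical name used only here.

```python
import string

ALPHABET = string.ascii_uppercase

def shift(message: str, amount: int) -> str:
    # Shift every letter to the right, wrapping around after Z.
    return "".join(ALPHABET[(ALPHABET.index(c) + amount) % 26] for c in message)

# Ten rounds, each shifting by 3 letters.
ten_rounds = "HELLO"
for _ in range(10):
    ten_rounds = shift(ten_rounds, 3)

# A single round with the combined shift: 30 letters, i.e. 4 after wrapping.
one_round = shift("HELLO", (10 * 3) % 26)

print(ten_rounds, one_round)  # LIPPS LIPPS
```

An attacker who knows the algorithm only needs to brute force the single combined shift, so the extra rounds add nothing.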
Modern algorithms don’t do that. Instead, they tend to divide the original message into blocks and encrypt them, in several rounds. Each round uses a portion of the key in some way, to generate a substitution alphabet or to define the shift. We won’t dive much further into that, as these are advanced concepts and are not required to answer our “what is encryption?” question.
Symmetric Keys and Asymmetric Keys
The Symmetric Key
Everything we saw so far is about symmetric encryption. This, as you can guess, relies on symmetric keys. An encryption algorithm working with symmetric keys is straightforward. You use a key to encrypt the message, and the exact same key to decrypt it; that is why it is called symmetric.
Symmetric encryption is awesome. It is fast, easy to understand, but can only grant confidentiality. You have no way to validate the identity of the sender (authenticity), nor to verify the integrity of the message. In fact, if a man-in-the-middle attacker knew the key, he could alter the message and you would have no way to know.
But this is not the biggest problem of this approach. The biggest problem is sharing the key. In fact, if both parties must have the same key, how can you communicate the key to your partner? This is not a problem in case of static configuration, where you configure both sides manually. You could even share the key on the phone with your partner on the other side, and always keep the same key. Instead, this is a problem when communication is dynamic, without administrative intervention.
What is Asymmetric Encryption?
Asymmetric encryption is somewhat more complex than what we have seen. However, it grants the authenticity of the message and doesn’t require sharing a key over an insecure medium. As you might guess, asymmetric encryption means that the key we use to encrypt the message cannot be used to decrypt it.
Each party has a pair of keys: a public key and a private key. A message encrypted with the private key can only be decrypted with the public key. Instead, messages encrypted with the public key can only be decrypted with the private key. As the name says, the private key is extremely private: you should give it to no one. Instead, the public key is simply public. You must make it available to the public in order for this encryption to work.
Of course, the public key and the private key are generated together. The same private key will correspond to the same public key, and vice versa.
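To see how the two keys belong together, here is a toy sketch based on textbook RSA, the asymmetric algorithm listed later in this article. The tiny primes, the exponent 17, and the message 65 are assumptions chosen only so the numbers stay readable; real keys are thousands of bits long and real implementations add padding, so do not use this for anything serious.

```python
# Toy RSA key pair built from two tiny primes (illustration only).
p, q = 61, 53
n = p * q                # 3233, shared by the public and the private key
phi = (p - 1) * (q - 1)  # 3120
e = 17                   # public exponent
d = pow(e, -1, phi)      # private exponent: modular inverse (Python 3.8+)

public_key = (e, n)
private_key = (d, n)

message = 65  # a message is just a number smaller than n

ciphertext = pow(message, e, n)    # encrypt with the public key
recovered = pow(ciphertext, d, n)  # decrypt with the private key

print(ciphertext, recovered)  # 2790 65
```

The private exponent is derived from the same primes as the public key, which is why the two keys only work as a pair.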
How does Asymmetric Encryption work?
From what we said above, we can understand two core concepts that will help you see how this approach works.
- If I encrypt a message with my private key, anyone will be able to decrypt it. Since people use my public key to do that and know that the key is mine, they can be sure I was the one encrypting the message. This grants authenticity.
- If I encrypt the message with the public key of my partner, I know he will be the only one able to decrypt it.
Thus, in order to send a message that only our partner is able to understand, we simply use his public key. Anyone could send a message to him this way and know that he will be the only one who can decrypt it. What if we also want our partner to be sure that the message came from us? We encrypt the message with our private key, and then we encrypt the result with our partner’s public key. He will be the only one able to decrypt the message, and then he will have to decrypt it again with our public key. This way, he will know that the message came from us, and only us.
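The double encryption described above can be sketched with two toy key pairs. This is a simplified illustration based on textbook RSA with tiny primes; the helper names `make_keys` and `apply_key`, the primes, and the message are all assumptions, and real systems use padding, hashing, and much larger keys.

```python
def make_keys(p: int, q: int, e: int = 17):
    # Build a toy (public key, private key) pair from two small primes.
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))  # modular inverse (Python 3.8+)
    return (e, n), (d, n)

def apply_key(value: int, key) -> int:
    # Encrypting and decrypting are the same operation with different keys.
    exponent, modulus = key
    return pow(value, exponent, modulus)

our_public, our_private = make_keys(61, 53)          # n = 3233
partner_public, partner_private = make_keys(89, 97)  # n = 8633

message = 65

# Sender: first our private key, then our partner's public key.
signed = apply_key(message, our_private)
sent = apply_key(signed, partner_public)

# Receiver: first his private key, then our public key.
opened = apply_key(sent, partner_private)
original = apply_key(opened, our_public)

print(original == message)  # True
```

In this sketch the partner's modulus is larger than ours, so the intermediate value always fits; handling that detail properly is one of the things real protocols take care of.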
As you can see, to accomplish this we need to encrypt and decrypt the message twice. On top of that, asymmetric operations are far more CPU-intensive than symmetric ones, so this is nothing we want to use for a high load of traffic. In fact, modern protocols use asymmetric encryption just to exchange a key, which they then use for symmetric encryption.
What about integrity?
When we started to explain what encryption is, we mentioned integrity. How can we grant it? With asymmetric encryption, to effectively alter the message you need to know the private key of both parties. Keeping these keys secret grants the integrity of the message. The attacker needs to know both, not just one. Since we are the only ones responsible for keeping our own key secret, we know if the integrity of our communication is in danger.
Encryption in use
We mainly use encryption in two ways. The first, and probably most critical, is encrypting data we want to send over an insecure medium. With encryption, we can achieve privacy even over the Internet. The second main use, important as well, is encrypting data we want to store. For example, many companies implement software that encrypts the whole disk of laptop computers, and only the employee who uses it knows the key.
Encryption Algorithms
We covered what encryption is in a theoretical way, but encryption is very practical. For reference, the following table shows the most popular encryption algorithms.
Algorithm | Symmetric/Asymmetric | Type | Key Length | Details |
---|---|---|---|---|
DES | Symmetric | Block cipher | 56 bits | Data Encryption Standard, an old algorithm whose short key can be broken with brute-force attacks. Refrain from using it. |
3DES | Symmetric | Block cipher | 168/112/56 bits | Triple DES (DES applied three times with a longer key). Deprecated; avoid it in new systems. |
AES | Symmetric | Substitution-permutation network block cipher | 256/192/128 bits | Industry standard for symmetric encryption (FIPS 197; see also RFC 3565). |
RSA | Asymmetric | Public-key cryptosystem | Generally 1024 to 4096 bits | Industry standard for asymmetric encryption. RSA is the acronym of Rivest-Shamir-Adleman, the inventors of the algorithm. Read more in RFC 3447. |
Summing it up
In this article, we discovered what encryption is at a conceptual level, and saw how it works. We presented its challenges, and how it can secure information over an insecure medium. If you are looking for something like a TL;DR, here are the key points.
- Encryption is the process of transforming a clear message into another message that only the intended receiver can convert back to its original state. This means only the intended receiver will be able to read it.
- Encryption algorithms produce an encrypted message by taking the original message and a key. They also use the key to decrypt the encrypted message. A good algorithm will rely on the confidentiality of the key, not of the algorithm.
- With asymmetric encryption, each party has two keys: one is public and the other is private. What you encrypt with the private key can only be decrypted with the public key and vice versa. You often use asymmetric encryption to exchange a key for symmetric encryption, as asymmetric encryption is far more CPU-intensive.
- The industry standard for symmetric encryption is AES; for asymmetric encryption it is RSA.
With this knowledge, you will be more aware of the privacy concerning your data. Hopefully, this will be useful in case you are a network engineer configuring VPNs, or in case you are a programmer implementing secure communication. As always, let me know what you think in the comments below.