Encryption refers to the coding of information in order to keep it secret. The string of characters comprising the information is called plaintext. It is transformed into another string by an algorithm that produces a coded form of the information called a cryptogram or ciphertext. The coded version may be safely stored or transmitted. At a later time it can be deciphered by reversing the encrypting process to recover the original information.
Data encryption has numerous uses in the real world.
I D E S O F M A R C H plaintext add 3 to each character position to get: L G H V R I P D U F K ciphertext
To decode the message, the receiver simply replaces each character in the cryptogram by the character which lies k positions earlier in the alphabet. This is obviously not a very secure scheme, since it is easy to break the code by simply trying the 25 possible values for the key value, k.
For example, if the positions of A, B, C, . . . ,Z are given by 0, 1, 2, . . ., 25 respectively, and the keyword is DAGGER, the message "IDESOFMARCH" is encrypted as follows:
I D E S O F M A R C H plaintext D A G G E R D A G G E keyword, repeated if necessary L D K Y S W P A X I L ciphertext 8 3 4 18 14 5 12 0 17 2 7 position of original characters
3 0 6 6 4 17 3 0 6 6 4 position of keyword character 11 3 10 24 18 22 15 0 23 8 11 (original+keyword) % 26
The receiver recovers the message by subtracting the characters in this keyword from those in the cryptogram.
For example:
A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z |
Q | W | E | R | T | Y | U | I | O | P | A | S | D | F | G | H | J | K | L | Z | X | C | V | B | N | M |
The string "IDESOFMARCH" would be encrypted using this substitution table as follows:
I D E S O F M A R C H plaintext O R T L G Y D Q K E I ciphertext
To decrypt, the receiver simply uses the substitution table in reverse.
Since there are 26! (26 factorial, or approximately 1028) possible substitution tables, this scheme is considerably more secure than the simple Caesar cipher scheme. Experienced cryptographers can easily break the code, however, by analyzing frequency counts of certain letters and combinations of letters.
For example, we might divide the message into blocks (substrings) of size 3 and permute the characters in each block as follows:
Original Position: 1 2 3
Permuted Position: 3 1 2
Thus, the message "IDESOFMARCH" would be encrypted (after the addition of a randomly selected character X so that the string length is a multiple of the block length) as:
I D E S O F M A R C H X plaintext D E I O F S A R M H X C ciphertext
To decrypt the cryptogram, the receiver must know the key permutation and its inverse.
Cryptogram's Position: 1 2 3
Permuted Position: 2 3 1 (this is the original string)
DES was adopted in 1977 by the National Institute of Standards and Technology as the standard encryption scheme for sensitive federal documents. It was revised in 1983, 1988 and 1993. In 2001, a new, even more complex standard known as the Advanced Encryption Standard (AES) was adopted.
Public key encryption schemes eliminate this problem by using two keys, one for encryption and one for decryption. These schemes are called public key because the encryption key is made public by the receiver to all those who transmit message to him or her; the decryption key, however, is known only to the receiver. The security of these schemes depends on it being nearly impossible to determine the decryption key if one knows only the encryption key.
M = M1, M2, . . . Mk becomes P1, P2, . . . Pk C = C1, C2, . . . Ck, where Ci = Pie % n
The cryptogram C is decrypted by raising each block Ci to the power d and reducing by using modulo n, where d is a secret decryption key.
To illustrate, suppose that the characters are converted to numeric values using the codes 0, 1, 2, . . . 25 for the letters A, B, C, . . ., Z, respectively, and the public-key pair (e, n) is (17, 2773). To encrypt the message M = "IDESOFMARCH" using the RSA algorithm, we divide M into two-character blocks M1, M2, . . ., M6. (We append a randomly selected character X to make the blocks come out evenly.) We represent each block Mi as an integer Pi. Pi is in the range 0 through 2772, and is found by concatenating the numeric codes of the characters that comprise the block:
where 0779 = 080317 % 2773, etc.ID ES OF MA RC HX Mi 0803 0418 1405 1200 1702 0723 Pi 0779 1983 2641 1444 0052 0802 Ci
For this encryption key, the corresponding decrypting key d = 157. Thus, we decode the cryptogram by calculating Pi = Ci157 % 2773.
Now it's time to discuss how we chose (17,2773) for our public-key pair (e,n). The number n is the product of two large "random" primes p and q:
In the preceding example, we used the small primes 47 and 59 to simplify the computations (2773 = 47 * 59). However, Rivest, Shamir, and Adelman suggest that p and q have several hundred digits. The decrypting key d is then selected to be some large integer that is relatively prime to p-1 and q-1; that is, one that has no factors in common with either number. In our example, d = 157 has this property. The number e is then selected to have the property thatn = p*q
To break this code, one must be able to determine the value of d from values of n and e. Because of the manner in which d and e are selected, this is possible if n can be factored into a product of primes. Thus, the security of RSA encryption is based on the difficulty of determining the prime factors of a large integer. Even with the best factorization algorithms known today, this is a prohibitively time-consuming task. Although research on factorization continues, no efficient algorithms have been found that allow even super computers to factor numbers several hundred digits long in less than years.e*d %((p-1)(q-1)) = 1