Intro-EC-in-Cryptography__2__Motivation_for_the_Use_of_Elliptic_Curves_in_Cryptography

140 days ago by maike.massierer

%hide %latex \section*{2. Motivation for the Use of Elliptic Curves in Cryptography} Before studying elliptic curves in Chapter 3 and ECC algorithms in Chapter 4, the {\it Diffie-Hellman key exchange} is introduced as a motivation for using elliptic curves in cryptography. We will see how a simple cryptographic protocol can be generalized to a setting where elliptic curves become useful, and how their use of elliptic curves may even improve the security of such a scheme. \subsection*{2.1 The Diffie-Hellman Key Exchange Protocol} Many cryptographic applications, such as encryption, depend on secret keys. These must be exchanged in advance and this must happen in a secret way, as the name suggests. Let's say Alice and Bob want to communicate via a voice-over-IP telephone. As data is transmitted across the unsafe Internet, a symmetric algorithm is used for encryption. That is one where both parties use the same secret key. Let's also suppose that this key is a (large) number. Evil Eve would also like to know the secret key so that she can decrypt Alice and Bob's communication data. How can Alice and Bob agree on a key without Eve knowing what it is? They could meet up in a room without Eve and agree on a number. But what if Alice was in Australia and Bob was in Brazil? In a scenario like this a method called {\it Diffie-Hellman key exchange} is often used. It allows the agreement on a shared secret key over an insecure network (that is one where we have eavesdroppers like Eve) such as the Internet. Before explaining how it works it is necessary to briefly introduce one mathematical term, the {\it primitive root.} Let $p$ be a prime. From now on we do all calculations modulo $p.$ This is called {\it modular arithmetic} and has similar properties to the ``normal'' arithmetic (addition and multiplication). Essentially, after every operation we divide the result by $p$ and continue working with the remainder. If you don't know how this works, check out [ModularArithmetic] or Section 4.4.1 of [CrypToolScript]. \medskip \noindent {\bf Definition.} An integer $g \in \{1,\ldots,p-1\}$ is called a {\bf primitive root} mod $p$ if for every $A \in \{1,2,\ldots,p-1\}$ there is an integer $a$ such that $A \equiv g^a \bmod{~p}.$ \medskip \noindent For example $2$ is a primitive root $\bmod{~5}$ because $$ 2^1 \equiv 2,~~~2^2 \equiv 4,~~~2^3 \equiv 3,~~~2^4 \equiv 1 \mod 5.$$ So the exponents $4, 1, 3, 2$ (in this order) give us the numbers $\{1,2,3,4\} = \{1,2,\ldots,p-1\}.$ However, $4$ is not a primitive root $\bmod{~5}$ because $$ 4^1 \equiv 4,~~~4^2 \equiv 1,~~~4^3 \equiv 4,~~~4^4 \equiv 1,~~~4^5 \equiv 4,~~\ldots \mod 5. $$ This sequence keeps repeating and only yields the values $1$ and $4.$ There are no exponents which result in $2$ or $3.$ (Also see Section 4.9 of [CrypToolScript].) Using this term, we now come back to the {\it Diffie-Hellman key exchange} protocol. \medskip \noindent {\bf Diffie-Hellman key exchange.} \begin{enumerate} \item Alice and Bob agree on a prime number $p$ and a natural number $g$ such that $g$ is a primitive root $\bmod{~p}.$ These numbers may be public. \item Alice chooses a secret natural number $a$, computes $A = g^a \bmod{p},$ and sends $A$ to Bob. \item Bob chooses a secret natural number $b$, computes $B = g^b \bmod{p},$ and sends $B$ to Alice. \item Alice computes the key $K_A = B^a \bmod{p}.$ \item Bob computes the key $K_B = A^b \bmod{p}.$ \end{enumerate} The two keys match because $$K_A \equiv B^a \equiv (g^b)^a \equiv g^{ba} \equiv g^{ab} \equiv (g^a)^b \equiv A^b \equiv K_B \mod p.$$ 
       
%hide html('<h3>Diffie-Hellman Key Exchange (1)</h3>') # step 0 p = [polygon([(-10,0),(10,0),(10,15),(-10,15)],rgbcolor=(0.5,0.8,1))] p[0] += polygon([(-2,0),(2,0),(2,15),(-2,15)],rgbcolor=(1,1,0.3)) p[0] += line([(-10,13),(10,13)],rgbcolor=[0,0,0]) p[0] += text('Alice',(-5,14),rgbcolor=[0,0,0],fontsize=14) p[0] += text('Bob',(5,14),rgbcolor=[0,0,0],fontsize=14) p[0] += text('Eve',(0,14),rgbcolor=[0,0,0],fontsize=14) # step 1 p += [text('agree on $p$ and $g$ with Bob',(-5,11),rgbcolor=[0,0,0],fontsize=9)] p[1] += text('agree on $p$ and $g$ with Alice',(5,11),rgbcolor=[0,0,0],fontsize=9) p[1] += arrow((-2,11),(2,11),rgbcolor=[0,0,0]) p[1] += arrow((2,11),(-2,11),rgbcolor=[0,0,0]) p[1] += text('$p, g$',(0,11.5),fontsize=12) # step 2 p += [text('choose random $a$',(-5,8),rgbcolor=[0,0,0])] p[2] += text('$A$ := $g^a$ mod $p$',(-5,6.7),rgbcolor=[0,0,0]) p[2] += arrow((-2,6.5),(2,3),rgbcolor=[0,0,0]) p[2] += text('$A$',(1.8,4),rgbcolor=[0,0,0],fontsize=12) # step 3 p += [text('choose random $b$',(5,8),rgbcolor=[0,0,0])] p[3] += text('$B$ := $g^b$ mod $p$',(5,6.7),rgbcolor=[0,0,0]) p[3] += arrow((2,6.5),(-2,3),rgbcolor=[0,0,0]) p[3] += text('$B$',(-1.8,4),rgbcolor=[0,0,0],fontsize=12) # step 4 p += [text('$K_A$ := $B^a$ mod $p$',(-5,2),rgbcolor=[0,0,0])] # step 5 p += [text('$K_B$ := $A^b$ mod $p$',(5,2),rgbcolor=[0,0,0])] t = ['', '1. Alice and Bob agree on a prime number $p$ and a primitive root $g.$', '2. Alice chooses a natural number $a$, computes $A = g^a \\bmod{p},$ and sends $A$ to Bob.', '3. Bob chooses a natural number $b$, computes $B = g^b \\bmod{p},$ and sends $B$ to Alice.', '4. Alice computes the key $K_A = B^a \\bmod{p}.$', '5. Bob computes the key $K_B = A^b \\bmod{p}.$'] @interact def _(step=selector([0,1,2,3,4,5],nrows=1)): html(t[step]) pl = p[0] for i in range(step+1): pl += p[i] show(pl,axes=false,xmin=-8,xmax=8,ymin=1,ymax=15) html('Figure 2.1 This graphic illustrates the Diffie-Hellman key exchange consisting of steps 0 to 5. Click on the numbers to see it <br>step by step. Note that everything in the blue boxes happens secretly (with Alice or Bob), and everything in the yellow box is <br>sent over the insecure network and may be seen by Eve.') 
       

Diffie-Hellman Key Exchange (1)

step 
Figure 2.1 This graphic illustrates the Diffie-Hellman key exchange consisting of steps 0 to 5. Click on the numbers to see it
step by step. Note that everything in the blue boxes happens secretly (with Alice or Bob), and everything in the yellow box is
sent over the insecure network and may be seen by Eve.

Click to the left again to hide and once more to show the dynamic interactive window

%hide %latex \noindent The number $g$ should be a primitive root $\bmod{~p}$ because this ensures that there are many different possibilities for the key. In the example above with $p=5,$ choosing $g=4$ (not a primitive root) gives only the values $1$ or $4$ for the key, whereas choosing $g=2$ (a primitive root) yields the values $K=1,2,3,4$ as possible keys. The more possible values there are, the harder it becomes for Eve to guess the correct key, which is what we want. In practice $g$ is not always required to be a primitive root, but it must be chosen in such a way that there are sufficiently many different values for the key. Such a sensible choice requires further investigation, which is why theorists often include the primitive root requirement. 
       
%hide html('<h3>Number of Possible Keys</h3>') @interact def _(p=input_box(default=5,width=6),g=input_box(default=4,width=6)): if not is_prime(p): html('<font size=6>Error: p must be a prime.</font>') elif Mod(g,p) == 0: html('<font size=6>Error: g must not be divisible by p.</font>') else: R = Integers(p) a = R(g) html('<font size=6>There are %s possible keys for the choice of p = %s and g = %s.</font>'%(multiplicative_order(a),p,g)) html('Figure 2.2 Choose values for p and g and let the program tell you how many possible values there are for the key. If you have made <br>a good choice and chosen g as a primitive root mod p, then there are p-1 possible keys.') 
       

Number of Possible Keys

Figure 2.2 Choose values for p and g and let the program tell you how many possible values there are for the key. If you have made
a good choice and chosen g as a primitive root mod p, then there are p-1 possible keys.

Click to the left again to hide and once more to show the dynamic interactive window

%hide %latex \noindent Unfortunately there is no simple general formula to compute primitive roots $\bmod{~p}.$ The program below will help you. 
       
%hide html('<h3>Calculation of Primitive Roots</h3>') @interact def _(p = input_box(23,width=6)): if not is_prime(p): html('<font size=6>Error: %s is not a prime.</font>'%p) else: html('<font size=6>g = %s is a primitive root mod %s</font>'%(primitive_root(p),p)) html('Figure 2.3 This program calculates a primitive root mod p for a given prime number p.') 
       

Calculation of Primitive Roots

Figure 2.3 This program calculates a primitive root mod p for a given prime number p.

Click to the left again to hide and once more to show the dynamic interactive window

%hide html('<h3>Diffie-Hellman Key Exchange (2)</h3>') @interact def _(p = selector([q for q in range(5,500) if is_prime(q)],default=23), g=input_box(5,width=6), s = checkbox(label='show secret values',default=false), c = checkbox(label='show calculations (with secret values)',default=false), lol=selector(['New example'],buttons=true,label='')): if c == true: s = true # check if g is a primitive root mod p R = Integers(p) if R(g).multiplicative_order() != p-1: html('<font size="6">Attention: $%s$ is not a primitive root mod $%s$.\n</font>'%(g,p)) a = floor(p*random()) b = floor(p*random()) a1 = '<FONT COLOR="#FF0000" SIZE="6">Alice chose $a = %s.$\n</FONT>'%a b1 = '<FONT COLOR="#0000FF" SIZE="6">Bob chose $b = %s.$\n</FONT>'%b A = Mod(g^a, p) B = Mod(g^b, p) A1 = '<FONT COLOR="#FF0000" SIZE="6">Alice calculated $A = %s.$\n</FONT>'%A B1 = '<FONT COLOR="#0000FF" SIZE="6">Bob calculated $B = %s.$\n\n</FONT>'%B A2 = '<FONT COLOR="#FF0000" SIZE="6">Alice calculated $A \equiv g^a \equiv %s^{%s} \equiv %s \\bmod{%s}.$\n</FONT>'%(g,a,A,p) B2 = '<FONT COLOR="#0000FF" SIZE="6">Bob calculated $B \equiv g^b \equiv %s^{%s} \equiv %s \\bmod{%s}.$\n\n</FONT>'%(g,b,B,p) KA = Mod(B^a, p) KB = Mod(A^b, p) KA1 = '<FONT COLOR="#FF0000" SIZE="6">Alice calculated $K_A = %s.$\n</FONT>'%KA KB1 = '<FONT COLOR="#0000FF" SIZE="6">Bob calculated $K_B = %s.$\n</FONT>'%KB KA2 = '<FONT COLOR="#FF0000" SIZE="6">Alice calculated $K_A = B^a = %s^{%s} = %s \\bmod{%s}.$\n</FONT>'%(B,a,KA,p) KB2 = '<FONT COLOR="#0000FF" SIZE="6">Bob calculated $K_B = A^b = %s^{%s} = %s \\bmod{%s}.$\n</FONT>'%(A,B,KB,p) if c: html(a1+b1+'\n'+A2+B2+KA2+KB2) else: if s: html(a1+b1) html(A1+B1+KA1+KB1) html('<font size="6", color="#000000">$\Rightarrow$ The secret key is $%s.$</font>'%KA) html('Figure 2.4 Try generating some secret keys. Remember that the size of the key depends on the size of the chosen prime p. Of <br>course the keys used in reality are much larger (typically 1024 or even 2048 bits), but this toy example, which allows $p$ to be <br>between 5 and 499, is a better illustration of the process as its output can be verified by hand.\n') 
       

Diffie-Hellman Key Exchange (2)

show secret values 
show calculations (with secret values) 
Figure 2.4 Try generating some secret keys. Remember that the size of the key depends on the size of the chosen prime p. Of
course the keys used in reality are much larger (typically 1024 or even 2048 bits), but this toy example, which allows p to be
between 5 and 499, is a better illustration of the process as its output can be verified by hand.

Click to the left again to hide and once more to show the dynamic interactive window

%hide html('<h3>Diffie-Hellman Key Exchange (3)</h3>') @interact def _(p = selector([q for q in range(5,500) if is_prime(q)],default=23), g=input_box(5,width=6), a=input_box(14,width=6), b=input_box(3,width=6), s = checkbox(label='show secret values',default=false), c = checkbox(label='show calculations (with secret values)',default=false)): if c == true: s = true # check if g is a primitive root mod p R = Integers(p) if R(g).multiplicative_order() != p-1: html('<font size="6">Attention: $%s$ is not a primitive root mod $%s$.\n</font>'%(g,p)) a1 = '<FONT COLOR="#FF0000" SIZE="6">Alice chose $a = %s.$\n</FONT>'%a b1 = '<FONT COLOR="#0000FF" SIZE="6">Bob chose $b = %s.$\n</FONT>'%b A = Mod(g^a, p) B = Mod(g^b, p) A1 = '<FONT COLOR="#FF0000" SIZE="6">Alice calculated $A = %s.$\n</FONT>'%A B1 = '<FONT COLOR="#0000FF" SIZE="6">Bob calculated $B = %s.$\n\n</FONT>'%B A2 = '<FONT COLOR="#FF0000" SIZE="6">Alice calculated $A \equiv g^a \equiv %s^{%s} \equiv %s \\bmod{%s}.$\n</FONT>'%(g,a,A,p) B2 = '<FONT COLOR="#0000FF" SIZE="6">Bob calculated $B \equiv g^b \equiv %s^{%s} \equiv %s \\bmod{%s}.$\n\n</FONT>'%(g,b,B,p) KA = Mod(B^a, p) KB = Mod(A^b, p) KA1 = '<FONT COLOR="#FF0000" SIZE="6">Alice calculated $K_A = %s.$\n</FONT>'%KA KB1 = '<FONT COLOR="#0000FF" SIZE="6">Bob calculated $K_B = %s.$\n</FONT>'%KB KA2 = '<FONT COLOR="#FF0000" SIZE="6">Alice calculated $K_A = B^a = %s^{%s} = %s \\bmod{%s}.$\n</FONT>'%(B,a,KA,p) KB2 = '<FONT COLOR="#0000FF" SIZE="6">Bob calculated $K_B = A^b = %s^{%s} = %s \\bmod{%s}.$\n</FONT>'%(A,B,KB,p) if c: html(a1+b1+'\n'+A2+B2+KA2+KB2) else: if s: html(a1+b1) html(A1+B1+KA1+KB1) html('<font size="6", color="#000000">$\Rightarrow$ The secret key is $%s.$</font>'%KA) html('Figure 2.5 This program is the same as the one above, except that you may now choose a and b yourself.\n') 
       

Diffie-Hellman Key Exchange (3)

show secret values 
show calculations (with secret values) 
Figure 2.5 This program is the same as the one above, except that you may now choose a and b yourself.

Click to the left again to hide and once more to show the dynamic interactive window

%hide %latex \subsection*{2.2 Security of the Diffie-Hellman Key Exchange} Now consider what information evil Eve can gain during the process of a Diffie-Hellman key exchange. Let's assume she has access to all the communication between the two parties. Hence she knows the values of $g, p, A$ and $B,$ but not those of $a$ or $b$ since $a$ is only known to Alice and $b$ is only known to Bob and they are never exchanged. Eve wants to know the key, i.e. the value $g^{ab} \bmod{p}.$ Eve can achieve this by finding $a$ or $b$ and then working out $g^{ab} \bmod{p}.$ The task of finding $a$ or $b$ is a well-known problem, called the {\it discrete logarithm problem.} \medskip \noindent {\bf Discrete Logarithm Problem (DLP).} Given a prime $p,$ a base $g$ and a number $A \equiv g^a \bmod{p},$ find the value of $a.$ The number $a$ is then called the {\bf discrete logarithm} (to base $g$) of $A \bmod{p}.$ \medskip \noindent So essentially, the difficulty is to find a suitable exponent $a$. Calculating logarithms is easy over the real numbers, but a hard problem when using modular arithmetic. Note that the DLP always has a solution if $g$ is a primitive root $\bmod{~p},$ meaning that the powers of $g$ take on all values from $1$ to $p-1 \bmod{p}.$ In other words for every number $A$ there exists an $a$ such that $g^a \equiv A \bmod{p}.$ Still it may be very difficult to find this solution as we will explain soon. If Eve can solve the DLP, she can obtain the secret key and thus break the Diffie-Hellman key exchange protocol. Therefore it is often said that the security of a Diffie-Hellman key exchange {\it depends} on the hardness of the DLP. So the most important question to answer at this point is: How difficult is solving the discrete logarithm problem? $**$You may have noticed that solving the DLP may not be the only way of breaking the Diffie-Hellman key exchange protocol. It is not explicitly required that Eve find out $a$ or $b,$ she really just has to find the key $K \equiv g^{ab} \bmod{p}$ somehow. If she can do this without the knowledge of $a$ or $b,$ that's fine. This problem is commonly known as the {\bf Diffie-Hellman problem (DHP)} and may be easier to solve than the DLP. However, the best currently known way of solving the DHP is by solving the DLP. Scientists have been studying these two prominent problems for decades without any significant progress, which is why it is commonly assumed that they are equivalent, and research has been concentrating on the more general DLP.$**$ In those years of study, scientists have come to the conclusion that solving the DLP is very difficult. Try solving it by generating a key in the above example and finding the value of $a$ only using your knowledge of $p, g$ and $A$ (be sure you uncheck both boxes). You will notice that as the numbers get larger, the problem becomes almost impossible to solve by hand. The truth is that it is also very difficult to solve using a computer. We will get a grasp of exactly how difficult it is by examining some algorithms which can solve the discrete logarithm problem. A common method of determining the difficulty of a problem is by studying how efficient the algorithms are that can solve it. The efficiency (or {\it complexity}) of an algorithm is in turn measured by its running time (i.e. the number of operations it performs) relative to the length (in bits) of its input. There are several classes of algorithms, three of which we discuss here. The running time of a {\it polynomial-time} algorithm can be expressed as a polynomial in the length of its input (e.g. $n^2$ if $n$ is the length of the input). Polynomial-time algorithms are generally considered efficient or {\it feasible.} On the contrary, {\it exponential} algorithms have a running time that can only be expressed by a term exponential in the length of the input (e.g. $2^n$ for an input of length $n$). Exponential algorithms are considered inefficient or {\it infeasible.} {\it Sub-exponential} algorithms are neither polynomial-time nor exponential, but somewhere in between. $**$Problems that can only be solved by exponential algorithms are considered very hard. Unfortunately, there is usually no way of {\it proving} that there is no sub-exponential or polynomial-time algorithm that solves a certain problem. It is rather an assumption that scientists make when they have studied a problem for years without finding a better algorithm. However, we should always be prepared for the unlikely case that an ingenious person presents a better than the currently known best algorithm. Nevertheless, most people feel save using a cryptographic system if its security relies on a well-studied {\it hard} problem in this sense. And no matter how paranoid we are, it's the best we can do at the moment.$**$ 
       
%hide %latex \noindent Let's now discuss some methods of solving the DLP. \medskip \noindent {\bf A naive algorithm.} The first method is very simple and immediately comes to mind when considering that there are only finitely many possibilities for $a,$ which again results from the fact that $g$ is a primitive root. More precisely the first $p$ consecutive powers of $g$ yield all numbers $1,\ldots,p-1,$ giving the equality of the sets $$\{ g^1 = g, g^2 \bmod{p}, g^3 \bmod{p}, \ldots, g^{p-1} \bmod{p} \} = \{ 1,2, \ldots, p-1 \}.$$ So the possibilities for $a$ are $1,2,\ldots,p-1$ and by trying them all we can solve the DLP. This method is called a {\it brute-force search} or {\it exhaustive search.} It is trivial for small values of $p$ but becomes impossible when $p$ is very large. Even a fast computer cannot do enough operations to try all values if $p$ has more than $80$ bits (is greater than $2^{80}$). In practice primes with $1024$ or even $2048$ bits (corresponding to approximately $300$ to $600$ decimal digits) are often chosen, making this attack infeasible. Nevertheless, observe how it can crack small problems. In the program below try the values $p=50021, g=3$ and notice that this relatively small example ($50021$ has $16$ bits) already takes a few seconds (turn off the output because that takes a long time by itself and will otherwise distort the values). This might give you an idea of the time this algorithm takes to find a discrete logarithm modulo a 1024-bit prime: For every extra bit, the computing time approximately doubles! More precisely, the algorithm is exponential in the length of $p.$ As we have learned, the term ``exponential'' is a synonym for ``inefficient'' in complexity theory. 
       
%hide html('<h3>DLP Brute-Force Search</h3>') @interact def _(p = input_box(default=23,width=10),g=input_box(default=5,width=10),A=input_box(default=6,width=10), output=checkbox(label='show output')): R = Integers(p) # check if p < 100000 if p >= 100000: html('<font size="6">Error: $p$ must be $< 100000.$\n</font>') # check if p is prime elif not is_prime(p): html('<font size="6">Error: $%s$ is not a prime.\n</font>'%p) # check if g is a primitive root mod p elif R(g).multiplicative_order() != p-1: html('<font size="6">Error: $%s$ is not a primitive root mod $%s$.\n</font>'%(g,p)) else: html('<font size="6">Task: Find $a$ such that $%s^a \equiv %s \\bmod %s.$\n</font>'%(g,A,p)) s = [] a = 0 tmp = 1 while tmp != A: s += ['$g^{%s} \equiv %s \\bmod{p}$\n'%(a,tmp)] a += 1 tmp = Mod(g^a,p) s += ['<font color="#FF0000">$g^{%s} \equiv %s \\bmod{p}$\n</font>'%(a,tmp)] if output == true and p > 200: html('<font size="6">Can only show output for p < 200. Sorry.</font>') elif output==true and p < 200: while len(s)%4 != 0: s += [''] n = len(s) t = '<table border=1>' for i in range(n/4): t += '<tr><td align=center width=250>%s</td><td align=center width=250>%s</td>\ <td align=center width=250>%s</td><td align=center width=250>%s</td></tr>\n'%(\ s[i],s[i+n/4],s[i+2*n/4],s[i+3*n/4]) t += '</table>' html(t) html('<font size="6">$\Rightarrow a = %s$ is the discrete logarithm (to base $%s$) of $%s \\bmod{%s}$</font>'%(a,g,A,p)) html('Figure 2.6 This application conducts a brute-force search. Enter a prime number p (up to 100 000), a generator g and a value A and observe <br>how the discrete logarithm is calculated by trying out possible values for a until the discrete logarithm of A to base g is found. Notice that it <br>takes longer for larger values of p. For very large values of p, turn off the output. If you do not know a generator g for your selected prime p, <br>you can calculate it with Figure 2.3.') 
       

DLP Brute-Force Search

show output 
Figure 2.6 This application conducts a brute-force search. Enter a prime number p (up to 100 000), a generator g and a value A and observe
how the discrete logarithm is calculated by trying out possible values for a until the discrete logarithm of A to base g is found. Notice that it
takes longer for larger values of p. For very large values of p, turn off the output. If you do not know a generator g for your selected prime p,
you can calculate it with Figure 2.3.

Click to the left again to hide and once more to show the dynamic interactive window

%hide %latex \noindent When the possible set of solutions to a problem is finite, a brute-force search is always possible. In almost all cases, this can solve the problem if enough time and computing power is available. However, this approach can always be thwarted by making the set of possible values so large that even a very fast computer cannot try out all solutions. For example, assuming that we're working with an 80-bit prime $p$ and that our computer can try out 100,000 values per second (i.e. do 100,000 modular exponentiations per second, a realistic assumption for a standard PC in 2008), it would take approximately 400 billion years to try all possible values. That's about 30 times the age of the universe! %Even though on average one only has to try out approximately half of the possible values, it is still beyond what any computer will be able to %achieve anytime soon. Exhaustive search is just not good enough, better methods are needed. \medskip \noindent {\bf More sophisticated attacks.} More sophisticated methods for solving the DLP exist. Some well known ones are the {\it baby-step giant-step algorithm,} the {\it Pohlig-Hellman algorithm} and {\it Pollard's rho/lambda method.} These algorithms run faster than the naive brute-force method, but they are still exponential in a parameter related $p.$ So again, $p$ can always be chosen in such a way that these attacks become irrelevant for practical purposes. They simply take too long. Nonetheless, they play an important role when choosing the right parameters for a cryptosystem. \medskip \noindent {\bf One sub-exponential algorithm.} There is only one algorithm which is actually faster than exponential time. It is called {\it index calculus algorithm} and is currently the most powerful method known for solving the DLP as we have defined it. Its running time is sub-exponential. The fastest variant of the index calculus algorithm for the DLP $\bmod{~p}$ is called the {\it number field sieve.} Its running time grows like the function $\exp(1.923 (\log p)^{\frac{1}{3}} (\log \log p)^{\frac{2}{3}}).$ Here, $\log p$ is the length of $p$ in bits. For a detailed explanation of the algorithm, see [Hankerson2004]. \medskip \noindent The running time of the number field sieve is plotted below. Compare it to the exponential running time of a brute-force search. 
       
%hide set_verbose(-1) t1 = text('exponential',(14,6000000),rgbcolor="blue") t2 = text('sub-exponential',(30,3000000),rgbcolor="red") x = var('x') min=0 max=20 # L_n[1/3,1.923] # x = log_2(n) p1 = plot(exp(1.923*x^(1/3)*log(x)^(2/3)),(min,max+15),rgbcolor=[1,0,0]) #show(p1) # c^p c=2.22 p2 = plot(c^x,(min,max),rgbcolor=[0,0,1]) (p1+p2+t1+t2).save('j1.png',axes_labels=['length of $p$','time'],figsize=5.5) ########################################################################### t1 = text('exponential',(9,25000),rgbcolor="blue") t2 = text('sub-exponential',(20,10000),rgbcolor="red") t3 = text('polynomial',(19,4000),rgbcolor=[0,1,0]) min=0 max=13 # L_n[1/3,1.923] # x = log_2(n) p1 = plot(exp(1.923*x^(1/3)*log(x)^(2/3)),(min,max+5.4),rgbcolor=[1,0,0]) #show(p1) # c^p c=2.22 p2 = plot(c^x,(min,max),rgbcolor=[0,0,1]) # p^c c = 2.5 p3 = plot(x^c,(min,max+7),rgbcolor="green") (p1+p2+p3+t1+t2+t3).save('j2.png',axes_labels=['length of $p$','time'],figsize=5.5) ############################################################################ html('<table bgcolor=lightgrey cellpadding=2>') html('<tr><td align="center"><img src="cell://j1.png"></td>') html('<td align="center"><img src="cell://j2.png"></td></tr>') html('</table>') html('Figure 2.7 Running time comparison of exponential, sub-exponential and polynomial-time algorithms.') 
       
Figure 2.7 Running time comparison of exponential, sub-exponential and polynomial-time algorithms.
Figure 2.7 Running time comparison of exponential, sub-exponential and polynomial-time algorithms.
%hide %latex \noindent The running time of a sub-exponential algorithm which solves the DLP still grows substantially with the length of $p.$ Hence running this algorithm can still be made infeasible by just choosing a large enough $p.$ However, the complexity of the calculations Alice and Bob have to do also depends on the length of $p.$ So if we choose $p$ very large, Alice and Bob eventually run into problems. Obviously, the Diffie-Hellman key exchange only works if the calculations Alice and Bob have to do are feasible, i.e. polynomial-time, and $p$ is not too large. An algorithm that's impossible to break is no good if it's impossible to execute right from the start. This is a general requirement for any cryptographic algorithm: that its execution be efficient for anyone who knows the appropriate key. %In addition, values like $p$ (on which the complexity of calculations depend) should be relatively short. This saves storage space and makes all %computations with $p$ efficient. We often compare the complexity of the algorithm itself to the complexity of breaking this algorithm in order to %determine how ``good'' a cryptographic algorithm is. The better this ratio, the smaller $p$ can be chosen. The ideal situation is that the algorithm is of a moderate polynomial-time complexity and any attack is exponential. Unfortunately, this is not the case with the DLP -- the index calculus algorithm has sub-exponential running time in our setup, which makes solving the DLP easier than we want it to be. For this reason we have to work with relatively large values for $p$ (1024 or 2048 bits). This poses an important question: \begin{center} {\bf Can we find mathematical structures where Diffie-Hellman still works\\ but the DLP is harder to solve, even for smaller values of $p$?} \end{center} The answer is yes. Elliptic curves are one such structure. They provide the same functionality as modulo arithmetic, e.g. they can be used to do the Diffie-Hellman key exchange, but index calculus cannot solve the elliptic curve discrete logarithm problem. All other algorithms known to solve this problem have exponential running time. The arithmetic involved with elliptic curves is still efficient though. This has the effect that much smaller values can be used for $p$ (approximately 160 bits instead of 1024) when replacing modulo arithmetic with elliptic curve arithmetic. 
       
%hide %latex \subsection*{2.3 Generalizing the Diffie-Hellman Key Exchange} We want to generalize the Diffie-Hellman key exchange protocol so that other mathematical structures (e.g. elliptic curve arithmetic instead of modulo arithmetic or hyperelliptic curves of small genus) can be used. In order to see how this can be done, we first learn about a few general concepts in algebra and number theory. \medskip \noindent {\bf Definition.} A {\bf (commutative) group} $(G,*)$ is a set $G$ equipped with an operation `$*$' that combines any two elements $a$ and $b \in G$ to another element $a*b \in G$ and has the following properties: \begin{itemize} \item {\it Associativity.} For all $a,b,c \in G:~~(a*b)*c = a*(b*c).$ \item {\it Identity element.} There exists a unique element $e \in G,$ such that for all $a \in G:~~a*e = e*a = a.$ \item {\it Inverse element.} For each $a \in G$ there exists an element $b \in G$ such that $a*b = b*a = e$ where $e$ is the identity element. \item {\it Commutativity.} For all $a,b \in G:~~a*b=b*a.$ \end{itemize} For further explanation, see [Groups]. It is a convention that the operation is not stated explicitly when it is clear which operation (e.g. addition or multiplication) is used. In this case the group is just referred to by $G.$ For example, $(\mathbb{Z},+)$ is a group. The identity element is $0$ and for $n \in \mathbb{Z}$ the inverse element is $-n$ (because $n + (-n) = 0$). Associativity and commutativity in $\mathbb{Z}$ are well-known. However, $(\mathbb{Z},\cdot)$ is not a group. The identity element is $1$ this time, but not all integers have inverses. This is because fractions are not in $\mathbb{Z}.$ For example, the multiplicative inverse of $5$ is $\frac{1}{5}$ (as $5 \cdot \frac{1}{5} = 1$), but $\frac{1}{5} \notin \mathbb{Z}.$ 0 also does not have a multiplicative inverse (there is no number which gives 1 when multiplied with 0). The integers $\{0,1,\ldots,p-1\}$ together with the operation ``addition $\bmod{~p}$'' also form a group. It is denoted by $(\mathbb{F}_p,+),$ where the `$+$'-symbol means ``addition $\bmod{~p}.$'' As above, the identity is $0$ and the inverse of $n \in \mathbb{F}_p$ is $p-n$ (because $n + (p-n) \equiv p \equiv 0 \bmod{p}$). Again, $(\mathbb{F}_p,\cdot)$ is not a group (if we take `$\cdot$' to mean ``multiplication $\bmod{~p}$'') because $0$ does not have an inverse element. However, if $p$ is prime (!) and if we take $0$ out of the set, we are left with $\{1,2,\ldots,p-1\},$ denoted by $\mathbb{F}_p^{\times},$ and $(\mathbb{F}_p^{\times},\cdot)$ is a group. The identity element is $1.$ Multiplicative inverses are hard to calculate by hand, but for a computer this is easy to do, e.g. using the {\it extended Euclidean algorithm} (or see Section 4.8.4 of [CrypToolScript]). A group $(G,+)$ with the operation ``addition'' is called an {\it additive} group. Here the {\it scalar} multiplication is defined as usual: for $g \in G$ we write $3g$ to mean $g+g+g.$ Note that there is still only {\it one} operation defined in the group (namely addition); group multiplication is {\it not} defined, we cannot multiply group elements. We can only multiply one group element by a {\it scalar} (i.e. an integer), hence the name {\it scalar multiplication}. Another convention for additive groups is to denote the inverse of $g \in G$ by $-g.$ In {\it multiplicative} groups $(G,\cdot)$ (i.e. ones where the operation is multiplication) similar conventions apply. We often write $g^3$ to mean $g \cdot g \cdot g$ and $\frac{1}{g}$ to mean the inverse of $g$ ($g^{-1}$ is also common). 
       
%hide %latex \medskip \noindent Now let $(G,\cdot)$ be a multiplicative group. \medskip \noindent {\bf Definition.} The {\bf order of an element} $g \in G$ is the smallest positive integer $r$ such that $g^r = 1,$ where 1 is the identity element of $G.$ (Similarly in an additive group, the order of $g$ is the smallest positive integer $r$ such that $rg = 0,$ where 0 is the identity element of the group.) \medskip \noindent The order of a group {\it element} is not to be confused with the order of the group itself. Groups can be infinite or finite. When dealing with finite groups, we are often interested in exactly how many elements the group has. \medskip \noindent {\bf Definition.} The number of elements in the set $G$ is called the {\bf order of the group} and is denoted $\# G$ or \nolinebreak $| G |.$ \medskip \noindent {\bf Definition.} An element $g \in G$ of order $\# G$ is called a {\bf generator} of $G.$ We often say that $g$ {\it generates} $G$ and write $\langle g \rangle = G.$ \medskip \noindent If $g$ has order $\#G$ then $\#G$ is the smallest number $r$ such that $g^r = 1$ and hence the consecutive powers of $g$ range through all $\#G$ group elements until $g^{\#G} = 1.$ In other words, $g$ is a generator of $G$ if and only if every element of $G$ can be written as a power of $g.$ (Similarly in an additive group, $g$ is a generator if and only if every element of the group can be written as a multiple of $g.$) Notice that a primitive root, as defined earlier, is nothing but a generator of $(\mathbb{F}_p^{\times},\cdot).$ Recall that finding such a generator of $(\mathbb{F}_p^{\times},\cdot)$ is not trivial. In $(\mathbb{F}_p,+)$ however, finding a generator is easy: 1 is a generator because every $n \in \mathbb{F}_p$ can be written as a multiple of $1$ (explicitly $n = n \cdot 1$). The groups that we have studied always have generators, however, this is not true in general. There are groups which are not generated by one single element. \medskip \noindent {\bf Definition.} Groups that have a generator are called {\bf cyclic.} \medskip \noindent Using this terminology we can now generalize the Diffie-Hellman key exchange protocol to arbitrary groups. \begin{enumerate} \item Alice and Bob agree on a group $(G,\cdot)$ and a generator $g$ of $G.$ These values may be public. \item Alice chooses a secret natural number $a$, computes $A = g^a,$ and sends $A$ to Bob. \item Bob chooses a secret natural number $b$, computes $B = g^b,$ and sends $B$ to Alice. \item Alice computes the key $K_A = B^a.$ \item Bob computes the key $K_B = A^b.$ \end{enumerate} The version of the Diffie-Hellman key exchange described in the beginning is identical to this one if we substitute $\mathbb{F}_p^{\times}$ for $G.$ In a similar way, many cryptographic protocols that use natural numbers and arithmetic $\bmod{~p}$ can be generalized to arbitrary groups. When the security of such an algorithm depends on the hardness of the DLP, then it is important that the discrete logarithm problem is difficult in that particular group. More precisely, we wish to find groups in which the group operation can be computed efficiently, but where the DLP can only be solved in exponential time (and not in sub-exponential time, as in $\mathbb{F}_p^{\times}$). %One good candidate for such groups are elliptic curves. As an exercise, let's now rewrite the Diffie-Hellman key exchange for {\it additive} groups. Remember that this is exactly the same as above, only {\it written} differently. Exponentiation is replaced by scalar multiplication (and we now denote the scalars by $\alpha$ and $\beta$ in order to distinguish them from group elements). \begin{enumerate} \item Alice and Bob agree on a group $(G,+)$ and a generator $g$ of $G.$ These values may be public. \item Alice chooses a secret natural number $\alpha$, computes $A = \alpha g,$ and sends $A$ to Bob. \item Bob chooses a secret natural number $\beta$, computes $B = \beta g,$ and sends $B$ to Alice. \item Alice computes the key $K_A = \alpha B.$ \item Bob computes the key $K_B = \beta A.$ \end{enumerate} 
       
%hide %latex The discrete logarithm problem now reads as follows. ``Knowing $G,g$ and $A,$ find a natural number $\alpha$ such that $\alpha g = A.$'' The problem looks simple: find $\alpha$! It's hard to believe that this task is so difficult, but it is. Could we use the group $(\mathbb{F}_p,+)$ for this version of the protocol? Yes we could, but it wouldn't be safe! The DLP is trivial in $(\mathbb{F}_p,+)$ because the equation can simply be solved for $\alpha.$ Let's consider the simple example $p=5,g=2,A=3.$ $g=2$ is a generator of the additive group $(\mathbb{F}_5,+)$ because $$ 1 \cdot 2 \equiv 2,~~~2 \cdot 2 \equiv 4,~~~3 \cdot 2 \equiv 1,~~~4 \cdot 2 \equiv 3,~~~5 \cdot 2 \equiv 0~~~\bmod 5, $$ i.e. all elements of $\mathbb{F}_5$ can be written as multiples of 2. We want to solve the DLP $$ 2 \alpha \equiv 3 \bmod 5,$$ so we simply solve the equation by dividing by 2, i.e. by multiplying with the {\it multiplicative} inverse of 2, which is 3 (because $2 \cdot 3 \equiv 1 \bmod 5$). Hence $$ \alpha \equiv \frac{3}{2} \equiv 3 \cdot 3 \equiv 4 \bmod 5.$$ We have solved the DLP, as $$ 2 \alpha \equiv 2 \cdot 4 \equiv 3 \bmod 5. $$ Now the general case works just the same, the equation can always be solved for $\alpha:$ $$ \alpha g = A \text{~~in~~} \mathbb{F}_p~~\Leftrightarrow~~\alpha g \equiv A \bmod {p}~~\Leftrightarrow~~\alpha \equiv \frac{A}{g} \bmod{p}. $$ The DLP can be solved by dividing by $g \bmod{p},$ i.e. by finding the {\it multiplicative} inverse of $g \bmod{p}.$ This can be done easily because $(\mathbb{F}_p^{\times},\cdot)$ is also a group and scalar multiplication in $(\mathbb{F}_p,+)$ is equivalent to the group operation (multiplication) in $(\mathbb{F}_p^{\times},\cdot).$ So in a way, the nice properties of $\mathbb{F}_p$ cause it to be unsafe in terms of the DLP. In spite of this negative example, there are additive groups that have a hard DLP. As mentioned before, elliptic curves are such groups. They are written additively, i.e. their group operation is defined as addition, but there does not seem to be a sensible way of defining a multiplication of group elements that coincides with scalar multiplication and allows such an attack. This is the reason why elliptic curves are a good candidate for cryptographic protocols. Elliptic curves as they are used in cryptography are explained in Chapter 3.