In embedded applications the code size is often critical, so if an application requires both xGCD and modular inverse, usually xGCD is implemented alone, because it can provide the modular inverse, as well. We show here that from the modular inverse the two xGCD factors can be reconstructed, even faster than it would take to compute them directly. Therefore, it is always better to implement a modular inverse algorithm than xGCD. These apply to subroutine libraries, too, there is no need for a full xGCD implementation.
The modular inverse algorithms return a positive result, while the xGCD factors can be negative. c = x^{−1} and c = x^{−1}–^{ }y provide the two minimal values of one xGCD factor. The other factor is d = (1^{ }− c·x)^{ }_{/} y, so d = (1^{ }− x·x^{−1})^{ }_{/} y and d = x^{ } +^{ }(1^{ }− x·x^{−1})^{ }_{/} y are the two minimal values. One of the c values is positive, the other is negative, likewise d. We pair the positive c with the negative d and vice versa to get the two sets of minimal factors.
To get d, calculating only the MS half of x·x^{−1}, plus a couple of guard digits, is sufficient. Division with y provides an approximate quotient, which rounded to the nearest integer gives d. This way there is no need for longer than ybit arithmetic (except two extra digits for the proper rounding). The division is essentially of the same complexity as multiplication (for operand lengths in cryptography it takes 0.65…1.2 times as long, see e.g. [17]).
For the general case g^{ }>^{ }1 we need a trivial modification of the modular inverse algorithms: return the last candidate for the inverse before one of the parameters becomes 0 (as noted in [22] for polynomials). It gives x^{*} such that x·x^{*} ≡ g mod y. Again c = x^{*} or c = x^{*}–^{ }y and d = (g^{ }− x·x^{*})^{ }_{/} y or d = x^{ }+^{ }(g^{ }− x·x^{*})^{ }_{/} y.
The extended GCD algorithm needs storage room for the 2 factors in addition to its internal variables. They get constantly updated during the course of the algorithm. As described above, one can compute the factors from the modular inverse and save the memory for one (long integer) factor and all of the algorithmic steps updating it. The xGCD algorithms applied for operand lengths in cryptography perform a number of iterations proportional to the length of the input, and so the operations on the omitted factor would add up to at least as much work as a shiftadd multiplication algorithm would take. With a better multiplication (or division) algorithm not only memory, but also some computational work can be saved.
1.4Cryptographic Applications
The modular inverse of long integers is used extensively in cryptography, like for RSA and ElGamal public key cryptosystems, but most importantly in Elliptic Curve Cryptography.
1.4.1RSA
RSA encryption (decryption) of a message (ciphertext) g is done by modular exponentiation: g^{e} mod m, with different encryption (e) and decryption (d) exponent, such that (g^{e})^{d} mod m = g. The exponent e is the public key, together with the modulus m = p·q, the product of 2 large primes. d is the corresponding private key. The security lies in the difficulty of factoring m. (See [10].) Modular inverse is used in

Modulus selection: in primality tests (excluding small prime divisors). If a random number has no modular inverse with respect to the product of many small primes, it proves that the random number is not prime. (In this case a simplified modular inverse algorithm suffice, which only checks if the inverse exists.)

Private key generation: computing the inverse of the chosen public key (similar to the signing/verification keys: the computation of the private signing key from the chosen public signature verification key). d = e^{−1} mod (p^{ }–1)(q^{ }–1)

Preparation for CRT (Chinese Remainder Theorem based computational speedup): the precalculated halfsize constant C_{2} = p^{−1} mod q (where the public modulus m^{ }=^{ }p·q) helps accelerating the modular exponentiation about 4fold. [10]

Signed bit exponent recoding: expressing the exponent with positive and negative bits facilitates the reduction of the number of nonzero signed bits. This way many multiplications can be saved in the multiplysquare binary exponentiation algorithm. At negative exponent bits the inverse of the message g^{−1} mod m – which almost always exists and precomputed in less time than 2 modular multiplications – is multiplied to the partial result [15]. (In embedded systems, like smart cards or security tokens RAM is expensive, so other exponentiations methods, like windowing, are often inapplicable.)
The public key is (p, α, α^{a}), fixed before the encrypted communication, with randomly chosen α, a and prime p. Encryption of the message m is done by choosing a random k [1, p−2] and computing γ = α^{k} mod p and δ = m∙(α^{a})^{k} mod p.
Decryption is done with the private key a, by computing first the modular inverse of γ, then (γ^{−}^{1})^{a} = (α^{−a})^{k} mod p, and multiplying it to δ: δ∙(α^{−a})^{k} mod p = m. (See also in [10].)
Share with your friends: 