Fully-homomorphic encryption, zero-knowledge proofs, and multiparty computation

Introduction

Cloud computing and storage have changed the way businesses and people use, store and manage their data. Data is securely stored in an encrypted form, typically using a symmetric key encryption scheme, such as AES or ChaCha20. However, to perform data analytics, we have to either give the key to the server so that it can decrypt it or we have to download it, decrypt it, and run the calculations on our own, which can be costly, requiring lots of time or memory. Fully homomorphic encryption (FHE) allows us to delegate computations involving encrypted data to untrusted third parties, without any need to decrypt them first.

Even if this is a very powerful cryptographic primitive, we still face a big challenge: how do we know that the third party performed the calculation it was supposed to do? This is where zero-knowledge proofs (ZKP) come into play. ZKP allow us to prove the integrity of a given computation, without the need to re-execute it. zk-SNARKs (succinct non-interactive arguments of knowledge) yield short proofs which can be verified very fast and have applications in decentralized ledgers (solving both privacy and scalability issues) and decentralized private computations. They also face some challenges: generating proofs for arbitrary computations can be expensive and users with less powerful devices may not be able to generate them. Many zk-SNARKs require trusted setups, which should be generated by an honest party to ensure that nobody can cheat and generate fake proofs.

Both of them can be solved by multiparty computation (MPC). In this scheme, the generation of the proof or the establishment of the trusted setup is entrusted to various parties, which could have partial access to the data. In the case of setup ceremonies, as long as one of the parties involved is honest, the setup is secure. MPC can also be used, with decentralized ledgers, to ensure that anyone can participate in setup ceremonies and prevent malicious parties from blocking honest participants, using proofs with transparent setups (such as STARKs - scalable, transparent argument of knowledge).

Proof generation can be carried out by multiple servers, each of them having partial information on the secret inputs. Each party can submit proof attesting to the correctness of the proof generation protocol. Multiparty computation can also be used to perform calculations among different parties, each of them having different pieces of information relevant to the problem, such as financial information between banks or health-related information for health service providers. FHE helps parties share information and make calculations without revealing it or train machine learning models without compromising sensitive data.

It is clear from all the above that FHE, ZKP, and MPC have many points in common and each has something to offer to the other. ZKP can provide integrity of computations, FHE allows data sharing and calculation without compromising it and MPC gives us the power to delegate expensive computations to other parties. These open the doors for many new and exciting applications in finance, health, and medical sectors, with an emphasis on data privacy and decentralization.

We will now explain the basic idea behind each of these primitives.

Fully Homomorphic encryption

Fully homomorphic encryption is a form of encryption where we can perform operations with encrypted data and the result of those operations is the encrypted form of an operation involving the ciphertext. For example, in the RSA cryptosystem, we encrypt a message m using the public key by taking

E(m)=me( mod n)

Now suppose that we have m1,m2 numbers and we want to compute their product m1×m2. We can see that if we perform the product and then encrypt it, we get

E(m1×m2)=(m1×m2)e  (mod   n)

If we take the product of the encrypted forms of m1,m2, then

E(m1)×E(m2)=me1×me2=(m1×m2)e   (modn)

which is the same as calculating first the product and then encrypting. The operation in the encrypted space need not be the same as the one in the original. Given this property of the RSA cryptosystem, many researchers started wondering whether it could be possible to build a fully homomorphic encryption scheme. The first FHE scheme was presented in 2009 by Craig Gentry.

Math interlude: Homomorphisms

To be more precise, a homomorphism is a function between two algebraic structures (such as two groups, two rings, two vector spaces) and preserves their structure. If you had a course on linear algebra, linear transformations are examples of homomorphisms. In the context of groups, suppose we have two groups (𝔾1,⋅) and (𝔾2,⊕), each with their binary operation (it could be multiplication, addition, elliptic curve addition, function composition, etc). A function f:𝔾1→𝔾2 is an homomorphism if, given x,y in 𝔾1 we have

f(x⋅y) = f(x) ⊕ f(y)

Note that the operation between the images f(x),f(y) is the operation over 𝔾2. We also saw examples of homomorphisms between rings, when we defined modular arithmetic: we have a function preserving addition and multiplication from the set of integers with the usual operations, (ℤ,+,×) and the ring of integers modulo p, (ℤ/pℤ,⊕,⋅) (we use different symbols for addition and multiplication to remember that these are modulo p). For example, if we take p = 7, we have ℤ/pℤ={0,1,2,3,4,5,6}. We can see that: −5+3=−2. This is related to 2 ⊕ 3 ≡ 5 (mod7), where 2 is the element corresponding to −5 in ℤ/pℤ, 3 corresponds to itself and 5 is congruent to −2.−3×4=−12 which relates to 4⋅4≡2(mod7) in the same way as before.

It is important to see that homomorphisms are not necessarily one-to-one functions (the last ring homomorphism shows a clear example). If the homomorphism is a bijective function, it is called an isomorphism. The following is an example of an isomorphism from the real numbers ℝ with addition to the positive real numbers equipped with multiplication, ℝ +f : ℝ→ℝ+ , f (x) = exp (x) , with its inverse, f−1 : ℝ+→  ℝ,  f−1 (z) = ln(z). You can easily check that f  (x + y )=f (x) ⋅ f (y)

In the context of cryptography, we would like to have encryption or commitment schemes preserving some operations. For example, the Kate-Zaverucha-Goldberg (KZG) commitment scheme is additively homomorphic. The commitment takes polynomials (which we can think of as a group with ordinary polynomial addition, (ℙ,+)) and maps them into elliptic curve points (which also have a group structure, with elliptic curve addition, (𝔾,⊕)). We can verify that

cm(p1 (x)  +  p2(x))  =  cm(p1  (x))  ⊕  cm(p2(x))

This property is useful for proof aggregation and folding schemes. Elliptic curve pairings also provide some way to compute multiplications between polynomials in committed form (using KZG).

To be able to construct an FHE scheme we need not only preserve operations but also have a way to decipher the result.

FHE fundamentals

There are many libraries for FHE nowadays, such as OpenFHE, Microsoft SEAL, Λ∘λ and many more. With FHE you can make private queries to search engines or pages, such as Wikipedia; see here.

FHE schemes are based on lattice cryptography. A lattice is given by linear combinations with integer coefficients of some base vectors. To fix ideas, imagine we have two vectors ex = (1,0) and ey = (0,1) and we consider all possible combinations p = xex + yey with x,y in ℤ, yielding points in space (0,0),(1,0),(1,1),(−1,−2),.... A lattice looks like this. Ideal lattices correspond to ideals in polynomial rings, inheriting the natural addition and multiplication operations of the ring (Ideals generalize the idea behind certain subsets of the integers, such as even numbers. The addition of any two even numbers is always even and, whenever we multiply any integer by an even, the result is also even -an absorption property-).

To build an FHE scheme we could picture having a ciphertext with some small attached to it, such that the decryption works as long as the noise is below a certain threshold. If we have ways to homomorphically multiply and add ciphertexts, but at the expense of increasing the noise parameters accordingly, that is, E(a + b) =E (a)  E (b) and E(a×b)=E(a)⋅E(b). We call this a somewhat homomorphic encryption scheme (SHE). If we could add a "recrypt" algorithm, which can take a given ciphertext E(m) and reduce its noise, obtaining a new ciphertext E′(m) that also encrypts m, then we can obtain an FHE scheme.

The SHE scheme can handle circuits of a certain depth (imagine this as the number of times you can multiply or add before the noise becomes too large). The SHE can be modified to have its decryption circuit have a lower multiplicative depth, making it "bootstrappable" and thus transforming it into an FHE scheme.

Some common schemes for FHE are:

  • Brakerski-Fan-Vercauteren (BFV) and Brakerski-Gentry-Vaikuntanathan (BGV) for integer arithmetic.
  • Cheon-Kim-Kim-Song (CKKS) for real number arithmetic.
  • Ducas-Micciancio (DM) and Chillotti-Gama-Georgieva-Izabachene (CGGI) for boolean circuits and arbitrary functions using lookup tables.

Many cryptographic primitives, such as public key cryptography, are based on the hardness or intractability of mathematical problems, such as integer factorization (RSA) or the discrete logarithm problem (elliptic curve cryptography). These problems cannot be solved efficiently with current computers (at least, provided that the integers involved are big enough or the groups have a large number of elements). However, quantum computers could easily handle these problems if certain conditions are met, via Shor's algorithm. FHE is based on the shortest vector problem or ring learning-with-error (RLWE) problem, which is an NP-hard problem that cannot be tackled with Shor's algorithm (FHE is considered to be quantum resistant).

Zero-knowledge proofs

Zero-knowledge proofs (ZKP) have been gaining a lot of attention during the last decade, especially after the first efficient SNARK constructions. ZKP play an important role in the solution of two of the main challenges in decentralized ledgers: scalability and privacy. To validate transactions, nodes have to re-execute them, leading to bottlenecks. Besides, all the information in the ledger is public, which can leak sensitive information about individuals and organizations.

zk-SNARKs allow one party to prove a statement, without revealing anything other than the validity of the statement. For example, we can prove that we have a given secret key, without revealing it. We can also prove that we have executed some transaction or computation, without exposing secret or sensitive information. An important property of SNARKs is their succinctness, which means that proofs:

  • Are short (occupy little memory, about 1 kB for some SNARKs).
  • Are fast to verify (typically, in the order of milliseconds).

Ethereum has been adding zero-knowledge proof technologies recently to solve scalability issues. Zcash implemented ZKP to provide private transactions, while Aleo uses them to enable running private computations in a decentralized way.

How do SNARKs work under the hood? Even if there are many different constructions (such as Marlin, Plonk, Halo, and STARKs), they have a common recipe. The building blocks of SNARKs are polynomial interactive oracle proofs (PIOP) and polynomial commitment schemes (PCS). Depending on the choices made, the resulting SNARK has different properties and requirements. For example, it may be transparent (does not need a trusted setup), post-quantum secure, need special (pairing-friendly) elliptic curves, take longer times to generate proofs, have shorter proofs (less than 1 kB), allow for easy recursion, etc. A comparison between different polynomial commitment schemes is shown here.

To be able to construct the proof, we first need to transform our computation into some SNARK-friendly format. We can prove the correctness of our execution by reducing it to some kind of NP-complete problem, such as graph coloring or circuit satisfiability. We will work with arithmetic circuits and the transformation of a program into a circuit is known as arithmetization. There are different forms or strategies for doing this transformation; an overview of the most commonly used is here.

Multiparty computation

In a secure multiparty computation, a group of participants, p1,p2,...,pm, each having some secret information s1,s2,...sm, want to compute a certain function that requires the knowledge of that secret information. For example, we could have m employees wanting to know their average salary without revealing their income. One easy way to do so would be that all of them trust another party and each sends their secret information and the "trusted" party computes the average. The drawback: the "trusted" party learns all the information and could leverage it. Or perhaps he is honest, but he gets hacked and an attacker obtains everything.

Luckily, there is a useful cryptographic primitive to deal with cases like these: additive secret sharing. Each of the participants can break their secret sk into m shares in such a way that no shareholder can, on his own, learn the secret. To be able to reconstruct the secret, all of the other parties have to collude and share their part. In the example above, each employee can break his salary, sk into m different, random shares. For example, if we have 4 employees and employee A earns 4500, he can have 4 shares, sAi: -1200, 1500, 3600, 600, such that ∑i sAi = 4500. He keeps one share and distributes the rest to B, C, and D. In turn, the rest break their secret and divide it. Afterward, each participant sums all the shares he has, obtaining a partial sum, sp,A = ∑k sk,A, and then shares these partial sums to compute the final average.

Secret sharing is secure whenever parties knowing at most m − 1 have no more information than anyone with no shares at all.

Now, how can we ensure that each party does what it is supposed to do? ZKP give us a way to guarantee that each participant does the computation as expected, by submitting proof that attests to the correct execution. If he cheats, the proof should fail and he could be penalized. Early MPC protocols had significant overhead; the last decade has seen many advances, making it efficient and leading to many applications.

Summary

Fully homomorphic encryption, zero-knowledge proofs, and multiparty computations are important cryptographic primitives that have been gaining more and more attention in recent years, with the introduction of decentralized ledgers and increasing concern over data privacy. Each has its unique features and applications and has something to offer to the other primitives. FHE allows us to make cloud computations on encrypted data without needing to hand our key to the server, which prevents third parties from gaining access to the specific contents of the data. ZKP allow us to prove the correctness of a given computation by submitting a short proof, which can be quickly verified. This is seen as one of the greatest tools to solve the privacy and scalability issues of decentralized ledgers. Multiparty computation helps us distribute a complex computation or calculate something when all the inputs are distributed among several parties in a secure way; it has applications for voting, private auctions, bidding, etc. FHE can help us improve existing ZKP, which in turn can make multiparty computation much simpler and more secure. In turn, MPC is needed for the setup ceremonies of zk-SNARKs and can also help provers reduce their proof generation time by delegating them to untrusted powerful servers. ZKP can also help us ensure that the computations involving encrypted data are carried out correctly. All these fields have seen great advances over the last decade and each will help the others advance, leading to new interesting applications, with a greater focus on decentralization and privacy. In upcoming posts, we will cover in more depth the mathematical foundations of FHE and further applications of MPC and ZKP.