Hybrid Public Key Encryption: My Involvement in Development and Analysis of a Cryptographic Standard
Contents
I am a PhD candidate in the Prosecco research team at Inria Paris. My research focusses on computeraided cryptography: I have been writing mechanized cryptographic proofs using the CryptoVerif proof assistant, verified implementations with the F* programming language, and have been working on linking CryptoVerif proofs to F* implementations.
I got involved into the analysis and development of a new Internet standard, HPKE (RFC 9180), as one case study on a cryptographic protocol. Eventually, I contributed many improvements and was lucky to become an official coauthor of the standard.
In the first section of this blog post, I describe some parts of the HPKE standard and its development journey from my point of view as a firsttime coauthor of an Internet standard. In the second section, I summarize a preliminary cryptographic analysis that I conducted. Finally, in the third section, I summarize an indepth analysis of one of HPKE’s modes that I did together with others and that is published at Eurocrypt 2021. Before we dive in, the next paragraph provides a short reminder about what hybrid publickey encryption is. Feel free to skip it if you are already familiar with the concept.
Hybrid publickey encryption as a generic term stands for cryptographic building blocks used to efficiently encrypt data to a receiver public key. They are called hybrid, because they use asymmetric and symmetric building blocks together for efficiency reasons: a public key encryption scheme or a key encapsulation mechanism is used to securely communicate a symmetric key to the receiver, and this symmetric key is used in a symmetric encryption scheme to encrypt the data. Thus, a message produced by a hybrid publickey encryption scheme consists of two parts: first, a ciphertext encapsulating the symmetric key, and second, a ciphertext encapsulating the actual data. Encrypting the data directly with a publickey encryption scheme would be much more expensive in computation time and network bandwidth.
The HPKE RFC
Hybrid Public Key Encryption, while it stands for the general concept, is also the name of a new Internet standard. More precisely, it is an IRTF Informational RFC, by the Crypto Forum Research Group (CFRG). It has been developed to provide hybrid publickey encryption as a standard building block with a specification based on modern cryptographic building blocks, with a security proof, and with test vectors to help interoperability. For a concise introduction to the HPKE standard, I encourage you to head over and read the companion “too long; didn’t read” blog post by Franziskus Kiefer. It provides a highlevel overview of the functions exposed by the HPKE standard, and mentions where HPKE is already used right now. The companion blog post by HPKE coauthor Christopher Wood provides a historical introduction into hybrid publickey introduction, and an overview over the construction and design goals of the new standard.
HighLevel Functionality
The HPKE standard defines how a sender can efficiently encrypt data to a recipient. The recipient is identified by a public key. This is the functionality that has been introduced above. Moving a bit further than that, HPKE also defines how to encrypt multiple messages to a same recipient by reusing ephemeral key material and by defining how nonces are computed for subsequent messages. Additionally, HPKE defines how a sender and a receiver can agree on almostarbitrarylength shared secrets. This functionality is called Secret Export. Secret Export is useful because key encapsulation mechanisms usually output only a fixedsize shared secret. This functionality provides a secure way to expand this to an almost arbitrary length – almost, because in practice, the specific key derivation functions have limits.
HPKE Defines 4 Modes
These two functionalities, encrypting multiple messages and secret export, are available in four modes. The modes vary in the involved longterm key material. The mode names somewhat reflect the authentication guarantees provided by the mode. In all modes, a receiver key pair is necessary. After all, this is publickey encryption, so the receiver needs to possess a key pair and the sender needs to know the public part of it. In Base mode, there is no other key material involved. The usual hybrid publickey encryption corresponds to Base mode.
The PSK mode allows mixing in a preshared key that sender and receiver already share. Coming from Base mode, this allows to strengthen the confidentiality security guarantee. Also, it allows to use a PSK in a more secure way, by mixing in fresh ephemeral keys each time. Finally, it permits the receiver to authenticate the sender.
The Auth mode adds a sender key pair through which the sender can be authenticated. From a cryptographic point of view, this is the most interesting part of the new HPKE standard. It is related to signcryption. With my coauthors, I presented an analysis of it in an Eurocrypt 2021 paper. I will describe this work in a later section of this blog post.
Finally, the AuthPSK mode combines PSK and Auth mode by using both a preshared key and a sender key pair.
The following table summarizes HPKE’s four modes, showing which kinds of longterm key material are used:
receiver key pair  sender key pair  preshared key  

Base  y  n  n 
Auth  y  y  n 
PSK  y  n  y 
AuthPSK  y  y  y 
Who Has Been Involved?
My PhD advisor Karthikeyan Bhargavan started the draft together with Richard Barnes back in January 2019. Christopher Wood joined shortly after as of version 3. I myself became a coauthor of the draft with version 5. InternetDrafts are a community effort: there have been many discussion and reviews shared on the CFRG mailing list, issues and pull requests in the Github repository, and offlist discussions. The acknowledgements section in the RFC reflects that many people participated in improving this RFC. Finally, after going through 12 draft versions, HPKE has been published as RFC 9180.
Cryptographic Standard, Not a Wire Format
HPKE defines a lowlevel cryptographic standard, not a complete protocol with wire format. You might know from IETF Internet standards like TLS that they define very precisely what messages are sent over the network and in exactly which encoding. HPKE is a standard by the Internet Research Task Force (IRTF), and provides a standard for only the cryptographic protocol. It is up to a higherlevel application to define how it uses HPKE’s Application Programming Interface (API), and in what format it sends messages. The HPKE standard defines which API implementations need to provide such that an application wanting to use HPKE can use them as a library.
There are also a couple of nongoals, or, certain properties that HPKE is not providing, that a highlevel application needs to implement itself if needed. Notably, the concepts that HPKE is not considering in its design are: message order and message loss, downgrade prevention, replay protection, forward secrecy, bad ephemeral randomness, and hiding plaintext length. These are all important properties, however it is not the task of HPKE as an individual Internet standard to provide a monolithic solution to all needs. The standard does contain a discussion of how to achieve bidirectional encryption and metadata protection, though. Error handling is important, too.
The API Defined by HPKE
The HPKE standard defines several interfaces for applications to use. However, to keep it simple, we’ll focus on the “multishot interface” first. It provides access to the complete functionality of HPKE – the other interfaces just provide simpler access to a subset.
To begin, an HPKE user needs to setup a socalled context, which we could as well have called state. The setup function takes as input all longterm key material that is involved in the mode, plus an optional applicationspecific bitstring. This way, the context is bound to the sender and receiver key material that it is meant for, and to their use case if a pair of sender and receiver use HPKE for multiple usages at once.
The context contains four values, that are initialized by the setup function: the symmetric key for data encryption, a base nonce for data encryption, a secret used to derive export secrets, and a sequence number for nonce derivation. The first three are bitstrings of a certain fixed length, and cryptographically speaking indistinguishable from uniformly random values. The sequence number is always initialized to zero.
Using this context, the user can do the following, in any order:
 Encrypt or decrypt a message. This will compute the nonce and increase the sequence number.
 Export an almostarbitrarylength shared secret.
The two other interfaces that are provided for easier access are:
 The singleshot encryption interface. It allows encrypting one message to a receiver public key. If used in Base mode, this is the usual use case for hybrid publickey encryption.
 The singleshot secret export interface. It allows generating one secret of (almost) arbitrary length, that will be known only by the sender and the receiver.
Available Implementations
The Github repository of the RFC lists available implementations in different programming languages and indicates which draft version and modes they support. Hopefully, these implementations will update to the final version now that the InternetDraft is officially published. You can create your own implementation and test it against the test vectors that are included in the RFC!
Security Guarantees
Naturally for an encryption system, one expects the message’s contents to stay secret, i.e., that they are only known to sender and receiver. HPKE guarantees that as long as the receiver private key is not compromised, or, if a preshared key is used, as long as the receiver private key and the preshared key are not both compromised. For the Base mode, the appropriate security notion is usually called INDCCA2 security. For the Auth mode, we call it OutsiderCCA and InsiderCCA security following signcryption literature (more on that further down in this post). Likewise, one expects that the export keys are secret, i.e., only known to sender and receiver. Just as for the messages, HPKE guarantees that as long as the receiver private key and an optional preshared key are not compromised. Formally, the security notion is that export keys are indistinguishable from uniformly random bitstrings of the same length.
In the PSK, Auth, and AuthPSK modes, the sender can be authenticated. HPKE guarantees this if longterm key material is not compromised at the time of message reception: sender private key, possibly a preshared key, and the receiver private key. The receiver private key must not be compromised because HPKE is fundamentally vulnerable to keycompromise impersonation. This is covered in detail in Section 9.1 “Security Properties” of the HPKE standard. For Auth mode, we call the respective security notion OutsiderAuth.
The HPKE Construction
In the beginning of the post, I mentioned publickey encryption (PKE) and key encapsulation mechanisms (KEMs). Traditionally in hybrid publickey encryption, we would generate a symmetric key and encrypt it under the receiver public key using a PKE scheme. The new HPKE RFC goes a different road and uses a KEM to encapsulate a symmetric key. KEMs are more specialized building blocks in that they are really only constructed to transport a uniformly random payload, unlike a generic PKE scheme that can encrypt arbitrary plaintexts. This specialization makes KEMs more efficient for this particular use case of transporting symmetric keys, and simplifies the security definitions. While for PKE, there are several subtlely different INDCCA security notions, it has been shown that they are all equivalent for KEMs and thus collapse into one. KEMs are particularly interesting for HPKE because it prepares the RFC for the use of specialized postquantumsecure building blocks: The NIST PostQuantum Cryptography competition called for the development of postquantumsecure KEMs, and chose 4 candidates to go forward.
Coming back to the initial question of how the RFC constructs HPKE. There are three parts: First, a KEM is used to generate or encapsulate a shared secret. Second, a key schedule function takes this shared secret as parameter and computes a context, as introduced above. The context contains a key and a base nonce for symmetric encryption, and an exporter secret from which the secret export interface will derive further shared secrets. Third, an Authenticated Encryption with Associated Data (AEAD) scheme for encryption and a Key Derivation Function (KDF) for deriving shared export secrets. Let’s visit these three in a bit more detail.
The KEM in HPKE
An HPKE instantiation needs two kinds of KEMs if it wants to implement all four modes.
A standard INDCCAsecure KEM is sufficient to provide the Base and PSK modes:
The KEM needs to provide an Encap()
function that takes a receiver public key as parameter, and returns
a KEM shared secret and its encapsulation.
The Decap()
function takes an encapsulation and a receiver private key as parameter,
and returns the KEM shared secret.
The preshared key only gets involved at the stage of the key schedule function.
In the Auth and AuthPSK modes, a sender key pair is involved, and it does so already at the KEM stage.
An “Authenticated KEM” needs to provide an AuthEncap()
function that takes a receiver public key and a sender private key as parameter,
and an AuthDecap()
function that takes an encapsulation, a receiver private key, and a sender public key as parameter.
The HPKE RFC defines a KEM that supports all modes, based on ellipticcurve DiffieHellman. It is called DHKEM and its idea is straightforward: The encapsulation algorithm generates a fresh ephemeral DiffieHellman key pair. Its public key is the KEM ciphertext, also called encapsulation. A DiffieHellman shared secret is computed between the ephemeral and the receiver key pair. This DH shared secret is fed through a KDF to compute the KEM shared secret. For the authenticated variant, an additional DiffieHellman shared secret is computed between the sender and receiver key pairs, and concatenated to the ephemeralstatic DH shared secret before feeding it through the KDF.
A Protocol Change Coming From Provable Security Analysis
Up to including draft 2 of the RFC, DHKEM did not use a KDF to process the DH shared secret(s), but returned them directly as KEM shared secret. In early 2020, I had finished a preliminary analysis of HPKE. I then went on to analyse DHKEM separately. The goal was to produce a security proof for HPKE assuming a generic INDCCAsecure KEM, and a separate proof that DHKEM is such an INDCCAsecure KEM. Such a modular analysis is beneficial when the need to change the KEM arises. (This was before the work on Outsider and Insider security that was published in the Eurocrypt 2021 paper and so it focussed mainly on Base mode.)
In this separate analysis of DHKEM, it turned out that it was not provably INDCCA secure using the Gap DiffieHellman assumption. We use this assumption because an HPKE recipient can be used as an oracle for the Decisional DiffieHellman problem. Trying to prove that the DiffieHellman shared secret is indistinguishable from a random group element fails, because the adversary can submit the ephemeral public key, the receiver public key, and the DiffieHellman shared secret (or the random group element) to the DDH oracle. Thus, the DiffieHellman shared secret cannot be used directly as KEM shared secret. The usual mitigation is to use a key derivation function to compute the KEM shared secret, like the final version of the RFC does now.
Another reason to use a key derivation function to derive the final KEM shared secret is that practical pseudorandom functions require a key that is a uniformly random bitstring, and not only a uniformly random group element. In general, the bitstring representation of DiffieHellman public keys is not a surjective function, and so, uniformly random group elements could be distinguished from uniformly random bitstrings.
I proposed a change that added the KDF step in a pull request that was accepted. It became part of HPKE as of draft version 3. This was not an exploitable security issue in HPKE: The key schedule function of HPKE applies a KDF to the DH shared secret anyway, so the nonuniformity was not exploitable anywhere. It was an inconsistency that hindered modularity: it would require a separate proof for DHKEMbased HPKE and for HPKE with other INDCCAsecure KEMs. It could also have possibly confused users of the standard: Asking future KEMs to be INDCCA secure and proposing a DHKEM that is not INDCCA secure itself is inconsistent – people might use the draftversion2 DHKEM believing it is INDCCA secure.
Avoiding Identity MisBinding Issues
Another change I contributed alongside this was to move the binding to the sender and receiver identities from the key schedule function into DHKEM. Since then, this binding happens during the key derivation step: the bitstring representations of ephemeral, receiver, and sender (for the authenticated KEM) public keys are used as info parameter of the HKDFExpand step. This serves to avoid identity misbinding issues, also known as unknown keyshare attacks. X25519 and X448 are particularly prone to this because they have “equivalent” public keys: These are public keys that are different, but will lead to the same DiffieHellman shared secret when used with the same private key. Putting the bitstring representation of the public keys explicitly into the context of the key derivation will make sure that the KEM shared secret ends up different, even for equivalent public keys.
The more identity information in the context of key derivation the better – it increases security and simplifies analysis. This already came up during my previous work on analysing the WireGuard VPN protocol. If you are interested, take a look at the discussion in the long version of our paper (Page 39 in the PDF) or the slides of the conference talk (Slide 12 on Page 87 in the PDF).
Implicit Authentication
The authenticated variant of DHKEM only provides implicit authentication:
AuthDecap()
cannot abort if the encapsulation was produced by a party who does not know the sender private key –
it will simply compute a KEM shared secret that can only later produce an error on a layer that can explicitly check authentication.
This is the case at the AEAD stage: it can explicitly check authentication because of its authentication tag.
The KeySchedule Function
The previous paragraphs were all about the KEM in HPKE – time to move to the next stage: the KeySchedule function. It mixes together the KEM shared secret, the optional preshared key, and an applicationspecific info bitstring that can be chosen freely by the application to separate use cases.
We took great care to ensure proper oracle separation throughout the entire construction of HPKE. DHKEM, KeySchedule, and the secret export all use a key derivation function that is possibly based on the same underlying hash function. The analysis is made easier if we can separate these hash calls somewhat. Thus, we do not use any hash function directly in HPKE, but always via an Extract/Expand construction known from HKDF. Each use of Extract and Expand has a unique label, to avoid collisions and ease the security analysis. A detailed description can be found in Section 9.6 “Domain Separation” of the RFC.
Data Encapsulation/AEAD
HPKE is currently only specified for AEAD schemes that use a nonce. The base nonce computed by the KeySchedule is xored to a sequence number (that starts at zero) to compute the nonce for the current encryption. Thus, the first nonce, which is the only one used in the singleshot encryption interface, is directly given by the base nonce.
Secret Export
The secret export function expects an “exporter context” bitstring that is mixed into the key derivation, and the desired secret length. For the HKDFbased KDFs specified for HPKE, this length is limited to 255 times the output length of the underlying hash function.
Input Length Restrictions
Discussions with my former PhD colleague Benjamin Beurdouche were the inspiration to indicate maximum lengths for all inputs to HPKE that will be fed through a key derivation function. And so, the HPKE RFC contains a section on input length restrictions for the preshared key and its identifier, the info bitstring, the exporter context bitstring, and the input key material used to derive DiffieHellman key pairs. The maximum input lengths are admittedly very large and only marginally smaller than the maximum input length of the underlying hash function. We decided to include them anyway for completeness and to raise awareness that these functions do have input limits. If you are interested in these kinds of problems, a relevant related work is another CFRG draft, Usage Limits on AEAD Algorithms. HPKE coauthor Christopher Wood is involved there, as well.
While working on the cryptographic analysis of HPKE, I have also been updating an F* implementation of HPKE to the newest drafts. Before myself, Aymeric Fromherz worked on it as part of the HACLxN paper. F* is a “generalpurpose functional programming language aimed at program verification”, with a rich type system including dependent and refinement types. HACL* is a formally verified crypto library implemented in F*. It is used in production in Mozilla's NSS library, the Windows kernel, the Linux kernel, Microsoft's implementation of the QUIC protocol, the Tezos blockchain, and WireGuard.
In Lines 348 to 393 of Spec.Agile.HPKE.fsti you can see how the maximum input lengths are computed depending on the maximum input lengths of the various hash functions. Finally, the types of the inputs are refined accordingly, for example in the case of the preshared key:


Successful type checking of the specification means that this computed maximum input length is indeed a valid input length. This is possible because the HACL* library defines maximum input lengths for all hash functions that it implements.
PreShared Keys and Partitioning Oracle Attacks
It is important to note that HPKE is not suitable for use with lowentropy preshared keys (PSKs), for example a password that is directly used as PSK. This is because HPKE uses HKDF as key derivation function, which is not designed to resist dictionary attacks. Additionally, Julia Len et al. discovered Partitioning Oracle Attacks that can be used to recover lowentropy PSKs in HPKE. See the Section “PreShared Key Recommendations” in the RFC for more details.
Analyzing HPKE with CryptoVerif
I started working on provable security of HPKE in approximately November 2019. After reading the draft, I went on to prototype a CryptoVerif model for the analysis of HPKE, to quickly and broadly assess its cryptographic security. I could have done this by hand, but as I am used to writing proofs with CryptoVerif, I did that. Also, as CryptoVerif forces the user to write down the protocol clearly with variable types, function input and output types, etc., this helps to catch ambiguities in or misunderstandings of the specification early. In February 2020, I uploaded this preliminary analysis as a preprint on the IACR eprint server. In this section of the blog post I summarize this work.
Short Interlude on CryptoVerif
CryptoVerif is a proof assistant for cryptographic gamehopping proofs.
It is developed by my PhD advisor Bruno Blanchet and has been used to
write proofs for realworld protocols like SSH, Signal, TLS and WireGuard.
When using CryptoVerif,
the user has to specify the initial game in a domainspecific language, an applied picalculus.
The user also specifies security queries to define the proof goals:
CryptoVerif can prove secrecy of values, authenticationlike properties via correspondence between events,
and indistinguishability between two games.
For some simple protocols, CryptoVerif can find
proofs automatically.
For more complex protocols, and that is the case for most interesting realworld
protocols, we have to use CryptoVerif’s
interactive mode to write a proof, game hop by game hop.
This works by using short commands to indicate what kind of game transformation
CryptoVerif shall do to rewrite the game.
Indeed, the user only needs to write down the initial game, and CryptoVerif
will write the intermediate games just based on the indication of a transformation.
In contrast, when using the EasyCrypt proof assistant, the user has to write all intermediary
games and prove that they are indistinguishable up to negligible probability by hand,
EasyCrypt only checks the proof.
EasyCrypt’s advantage is that it is more expressive.
In CryptoVerif, for example, the command crypto ind_cca(enc)
indicates that one wants
to use the INDCCA assumption on an encryption scheme enc
to transform the game.
If a proof concludes in CryptoVerif, we have asymptotic security.
CryptoVerif also computes an exact security bound.
However, usually, if the user does not craft the proof carefully, it is not
an optimal bound.
Scope of the Analysis
The models I wrote for this preliminary analyis cover all four modes of HPKE.
Concerning the interfaces, they cover only a single encrypted message and two
exported secrets.
They consider all compromise scenarios of longterm key material
that are reasonable, i.e., that do not lead to a trivial loss of security.
The model considers a twouser security notion, so there are two honest parties
S
(for sender), and R
(for receiver).
The adversary can instruct S
to use HPKE with some other party, which
can be R
or a party controlled by the adversary.
For session in which the honest parties S
and R
communicate, I prove secrecy of the message,
sender authentication if applicable, and secrecy of the exported secrets.
Cryptographic Assumptions
The proofs use the following cryptographic assumptions:
 Collision resistance for the hash function used in the KeySchedule. In version 2 of the draft, there were still direct calls to a hash function. This has later been replaced by calls to HKDFExtract, as mentioned above.
 HKDFExtract used with SHA256, SHA384, or SHA512 is a random oracle.
 HKDFExpand used with SHA256, SHA384, or SHA512 is a pseudorandom function.
 ChaCha20Poly1305, AES128GCM, and AES256GCM are INDCCA and INTCTXTsecure authenticated encryption schemes with additional data.
 The Gap DiffieHellman assumption holds in the primeorder groups P256, P384, and P521. The analysis is not valid for X25519 and X448, we included them only for our Eurocrypt 2021 work.
Section 4 “Cryptographic Assumptions” in the preprint references theorems that support these assumptions.
Limitations and Caveats
In this analysis, I only cared about asymptotic security and not about exact security. In practice, this just means that I did not look at the security bound that CryptoVerif computed. As mentioned before, this analysis is only valid for the three NIST curves for which HPKE is specified. Also, analysis of multiple messages and an arbitrary number of exported secrets is missing. The security notion is not as strong as it could be concerning compromise of keys: the adversary is not allowed to choose the sender private key but can only compromise it. We improve this in the Eurocrypt 2021 work. Finally, the model does not cover communication of a sender with themselves, which is interesting for local file encryption, for example.
Results and Outcomes
The analysis produced asymptotic security proofs for all covered modes, interfaces, and compromise cases. This is summarized in Table 2 in the preprint (Page 17 in the PDF). Therefore, the analysis confirms HPKE’s security goals, when looking at the whole protocol, and not at DHKEM separately. While building upon this work to analyse DHKEM separately, it became clear that it is not INDCCAsecure on its own, as described above in this post. This led to a clear improvement of the standard, by adapting DHKEM to include a KDF step. Finally, discussing this preliminary analysis helped me and my coauthors of the Eurocrypt 2021 paper to come together. There, we did a careful and stronger design of security notions for HPKE’s Auth mode, and provide cryptographic proofs with exact security bounds. This work is summarized in the next section.
If you want to take a look at CryptoVerif models of HPKE, you find the one’s of this preliminary analysis in this GitHub repository. A good starting point could be the model for Base mode. However, I encourage you to rather look at the models of the Eurocrypt paper. These are more polished, have stronger security notions and proofs with better bounds, and are uptodate with the standard! Links are in the next section.
Looking Deeper into HPKE’s Auth Mode
This is joint work with my coauthors Joël Alwen, Bruno Blanchet, Eduard Hauck, Eike Kiltz, and Doreen Riepel.
From a cryptographic point of view, Auth mode is the most novel and interesting part of HPKE. It turned out that designing and proving appropriate security notions for a generic Authenticated KEM (AKEM) and generic Authenticated Public Key Encryption (APKE) is a challenging and timeconsuming task, and so we had to limit our work to a certain subset of HPKE.
Scope and Limitations
We analyse HPKE’s Auth mode, however we only cover a single encrypted message
and do not consider secret export.
In contrast to the preliminary analysis presented above, we develop multiuser
security notions with n
honest parties, including the possibility to
encrypt a message to the sender.
Furthermore, we cover all elliptic curves for which HPKE is specified.
In the terms of the standard, the object of analysis is the
singleshot encryption interface of HPKE’s Auth mode.
Our analysis is modular: We define and prove security notions for DHKEM, we prove that the KeySchedule function is a PRF, and we prove composition theorems about security notions of HPKE assuming a secure Authenticated KEM and a PRF (the KeySchedule function).
Cryptographic Assumptions
An important contribution of independent interest is the new framework of (rerandomisable) nominal groups. We developed it to capture both primeorder (the three NIST curves) and nonprimeorder groups (Curve25519 and Curve448) in one model. This allows us to write only one single proof per security notion and then specialize the exact security bound for each particular curve. The very short summary of the framework is that we do not assume a group structure, but only an exponentiation function with certain properties.
In the paper, we define the GapDiffieHellman assumption for nominal groups and use it in our CryptoVerif proofs of DHKEM’s security. The other assumption we use for DHKEM is that HKDF is a random oracle.
The assumptions for the PRF proof of KeySchedule are that HKDFExtract is a collisionresistant hash function on one specific input domain and a PRF on a separate input domain, and that HKDFExpand is a PRF.
For the composition theorems of HPKE we assume a secure Authenticated KEM, a PRF (the KeySchedule function), and an AEAD encryption scheme that is INDCPA and INTCTXTsecure.
As our security notions for AKEM and APKE are multiuser notions, we equally use multiuser/multikey notions for PRF, INDCPA, and INTCTXT.
Security Notions for AKEM and APKE
Authenticated KEMs and Authenticated Public Key Encryption use a sender and a receiver key pair, and aim to provide confidentiality and authentication guarantees. Meaningful security notions have to cover these two security goals and different compromise cases of the keys pairs. In our paper, we design three security notions: OutsiderCCA, InsiderCCA, and OutsiderAuth, both for AKEM and APKE. For an intuition of these notions, I’ll cite from the summary in the HPKE standard:
Intuitively, OutsiderCCA security formalizes confidentiality, and OutsiderAuth security formalizes authentication of the KEM shared secret in case none of the sender or recipient private keys are compromised. InsiderCCA security formalizes confidentiality of the KEM shared secret in case the sender private key is known or chosen by the adversary. (If the recipient private key is known or chosen by the adversary, confidentiality is trivially broken, because then the adversary knows all secrets on the recipient’s side).
For APKE, these notions are defined respectively, where the confidentiality and authentication guarantees are about the message plaintext. The difference to the preliminary analysis summarized in the previous section is that here, the adversary is allowed to choose the private key in the InsiderCCA security notion instead of only being able to compromise it. This makes InsiderCCA a strictly stronger security notion.
It is also possible to define an InsiderAuth notion in the same spirit. However, we already know that DHKEM is vulnerable to keycompromise impersonation. Also, our work found that the generic HPKE construction is fundamentally vulnerable to keycompromise impersonation. I’ll cite how this is covered in the standard:
As shown by [ABHKLR20], keycompromise impersonation attacks are generally possible on HPKE because KEM ciphertexts are not bound to HPKE messages. An adversary who knows a recipient’s private key can decapsulate an observed KEM ciphertext, compute the key schedule, and encrypt an arbitrary message that the recipient will accept as coming from the original sender. Importantly, this is possible even with a KEM that is resistant to keycompromise impersonation attacks. As a result, mitigating this issue requires fundamental changes that are outofscope of this specification.
Results
Considering the HPKE standard, the result of the paper is that HPKE’s Auth mode satisfies its desired security properties with a maximum security level of 128 bit. This limitation is in place even if a curve or a hash function with a higher security level is used. The reason is the AEAD – its authentication tag has a length of 128 bit regardless of the key length. For more details, I encourage you to read Section 6.3 in the preprint, where we discuss the security level comprehensively. Let me just cite the last two sentences of the paper:
a greater security level could be obtained by using AEADs with longer tags. More research on the multikey security of AEAD schemes is still needed to confirm this analysis.
In terms of theorems the paper contains the following results:
 CryptoVerif proofs for OutsiderCCA, InsiderCCA, and OutsiderAuth of the standard’s DiffieHellmanbased instantiation of AKEM
 a CryptoVerif proof of PRFsecurity of HPKE’s KeySchedule
 CryptoVerif proofs of composition theorems for OutsiderCCA, InsiderCCA, and OutsiderAuth of the AKEM/KeySchedule/AEAD construction
 Handwritten nontight proofs establishing that the respective singleuser/twouser security notions of the AKEM imply the multiuser security notions. These theorems are helpful to close the gap to proofs of postquantumsecure KEMs which are usually only analysed in a twouser setting. By providing these proofs, we make our results accessible to a greater community.
Influence on the Standard
This work on analysing HPKE’s Auth mode influenced the standard in several ways. First, we proposed a small change in the KeySchedule function to make the modular cryptographic analysis nicer and easier. We suggested changing the following two lines in KeySchedule


to


The reasoning behind the change is documented in
this pull request.
In short, swapping shared_secret
and psk
allows us to use a PRF assumption
on HKDFExtract in a more straightforward way.
Second, the results of the analysis have been summarized in Section 9.1 “Security Properties” of the standard, notably to add text on the generic KCI attack that was surfaced by our work.
Third, the composition theorems proved in our work have informed Section 9.2 “Security Requirements on a KEM used within HPKE” of the standard, which is important when HPKE is used with other (Authenticated) KEMs.
More Resources on This Work
I gave a 5min short live conference talk during Eurocrypt 2021 in Zagreb, Croatia. You can take a look at the 6 slides. There is also a 26min long prerecorded version of the talk, with slides. If you want to look at the paper, I encourage you to use the IACR preprint, as we updated the nominal groups framework and the theorem’s security bounds since the conference version. The CryptoVerif models are available on GitHub. The README helps to navigate through the different files. Appendix B of the IACR preprint (as of Page 44 in the PDF) helps to understand how the security notions in mathematical notation used in the preprint relate to the CryptoVerif code.
Closing Words
As you could guess after reading the “Limitations” paragraphs in the two previous sections, there are several things left to analyse. One thing to add is the use of multiple modes with a same key pair, as noted in the Section “Kem Key Reuse”. Time will tell when and if we write proofs for this!
The contributions of my Eurocrypt coauthors and me have clearly shown that early involvement of formal cryptographic analysis during Internet standard development is welcome and can have a positive impact on their development. I am grateful for the collaboration with the HPKE coauthors and the coauthors of the Eurocrypt paper. It has been a great experience during which I learned a lot. It is motivating and feels rewarding to see that my work is relevant for realworld use cases.
Are you a cryptographer and interested in realworld protocols? Then I encourage you to take a look at the drafts that are currently workinprogress in the Crypto Forum Research Group (CFRG) and see if you are interested in contributing, for example by reviewing or formally analysing them!
I thank Franziskus Kiefer and Karthikeyan Bhargavan for providing feedback on drafts of this blog post.
Further Reading
Too long; didn’t read on HPKE, a short introduction to HPKE by Franziskus Kiefer.
HPKE: Standardizing publickey encryption (finally!) by HPKE coauthor Christopher Wood.
An Executable HPKE Specification, a blog post on a hacspec implementation of the HPKE RFC by Franziskus Kiefer.
Cloudflare and the IETF, coauthored by HPKE coauthor Christopher Wood. The post mentions HPKE, among others Internet standards.