1 Introduction

One-shot information theory investigates the behaviour of tasks in communication and cryptography under general unstructured processes, as opposed to independent and identically distributed (i.i.d.) processes, where the states or the tasks themselves have a certain tensor product structure. This is crucial for information-theoretically secure cryptography, where one cannot place any assumptions on the actions of the adversary (see, for example, [2, 3]). To prove security for such protocols, a common strategy is to show that some smooth min-entropy is sufficiently large. For this reason, the smooth min-entropy [4, 5] is one of the most important quantities in one-shot information theory.

The smooth min-entropy \(H^{\epsilon }_{\min }(K|E)_{\rho }\) for the classical-quantum state \({\rho = \sum _{k} p(k)|{k}\rangle \langle {k}|\otimes \rho _{E|k}}\) characterises the amount of randomness that can be extracted from the classical register K independently of the adversary's register E [6]. It behaves very differently from the von Neumann conditional entropy, which characterises tasks in the i.i.d. setting, and the difference between the two can be very large. Roughly speaking, the smooth min-entropy places a much higher weight on the worst possible scenario of the conditioning register, whereas the von Neumann entropy places an equal weight on all possible scenarios.
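As a simple illustration of this difference (a standard example, in the spirit of the examples discussed in Sect. 4), let K be a uniformly random n-bit string and suppose the register E contains a copy of K with probability \(\epsilon \) and is blank otherwise. Then

$$\begin{aligned} H(K|E)_{\rho } = (1-\epsilon ) n, \qquad \text {whereas} \qquad H_{\min }(K|E)_{\rho } \le \log \frac{1}{\epsilon } \end{aligned}$$

for the (unsmoothed) min-entropy, since the adversary can guess K from E with probability at least \(\epsilon \). The von Neumann entropy averages over the two scenarios, while the min-entropy is dominated entirely by the worst one.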

An important and interesting argument, which works with the von Neumann conditional entropy but fails with the smooth min-entropy, is that of proving lower bounds on the entropy using an approximation chain. We call a sequence of states \((\sigma ^{(k)}_{A_1^k B})_{k=1}^n\) an \(\epsilon \)-approximation chain (Footnote 1) for the state \(\rho _{A_1^n B}\) if for every k we can approximate the partial state \(\rho _{A_1^k B}\) as \(\Vert {\rho _{A_1^k B} - \sigma ^{(k)}_{A_1^k B}}\Vert _1 \le {\epsilon }\). If one can further prove that these states satisfy \(H(A_k | A_{1}^{k-1} B)_{\sigma ^{(k)}} \ge c\) for some \(c>0\) sufficiently large, then the following simple argument shows that \(H(A_1^n | B)_{\rho }\) is large:

$$\begin{aligned} H(A_1^n | B)_{\rho }&= \sum _{k=1}^n H(A_k | A_{1}^{k-1} B)_{\rho } \\&\ge \sum _{k=1}^n \left( H(A_k | A_{1}^{k-1} B)_{\sigma ^{(k)}} - g(\epsilon ) \right) \\&\ge n(c- g(\epsilon )) \end{aligned}$$

where we used the continuity of the von Neumann conditional entropy in the second line (\(g(\epsilon ) = O(\epsilon \log \frac{|A|}{\epsilon })\) is a “small” function of \(\epsilon \)). It is well known that a similar argument is not possible with the smooth min-entropy. Consequently, identities for the smooth min-entropy, like the chain rules [7], are much more restrictive. Tools like entropy accumulation [8, 9] also seem quite rigid, in the sense that they cannot be applied unless certain (Markov chain or non-signalling) conditions hold, and it is not clear how one could relax these conditions. In this paper, we consider scenarios consisting of approximation chains, similar to the one above, along with additional conditions, and we prove lower bounds on the appropriate smooth min-entropies.

We begin by considering the scenario of approximately independent registers, that is, a state \(\rho _{A_1^n B}\), which for every \(1 \le k \le n\) satisfies

$$\begin{aligned} \frac{1}{2}\left\| \rho _{A_1^k B} - \rho _{A_k} \otimes \rho _{A_1^{k-1} B} \right\| _1 \le \epsilon \end{aligned}$$
(1)

for some small \(\epsilon >0\) and arbitrarily large n (in particular, \(n \gg \frac{1}{\epsilon }\)). That is, for every k, the system \(A_k\) is almost independent of the system B and of everything that came before it. For simplicity, let us further assume that \(\rho _{A_k} = \rho _{A_1}\) for all k. Intuitively, one expects that the smooth min-entropy for such a state (with the smoothing parameter depending on \(\epsilon \) and not on n; see Footnote 2) will be large and close to \(n (H(A_1)-g'(\epsilon ))\) (for some small function \(g'(\epsilon )\)). However, it is not possible to prove this result using techniques that rely only on the triangle inequality and smoothing. The triangle inequality, in general, can only be used to bound the trace distance between \(\rho _{A_1^n B}\) and \(\otimes _{k=1}^n \rho _{A_k} \otimes \rho _B\) by \(n \epsilon \), which results in a trivial bound when \(n \gg \frac{1}{\epsilon }\) (Footnote 3). Instead, we show that a bound on the entropic distance given by the smooth max-relative entropy between these two states can be used to prove a lower bound for the smooth min-entropy in this scenario.
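For completeness, here is the telescoping estimate behind this \(n \epsilon \) bound (a standard hybrid argument): since tensoring both states with the same fixed state does not change the trace distance, the triangle inequality gives

$$\begin{aligned} \frac{1}{2}\left\| \rho _{A_1^n B} - \bigotimes _{k=1}^n \rho _{A_k} \otimes \rho _B \right\| _1 \le \sum _{k=1}^n \frac{1}{2}\left\| \rho _{A_1^k B} - \rho _{A_k} \otimes \rho _{A_1^{k-1} B} \right\| _1 \le n \epsilon \end{aligned}$$

where the kth term compares the hybrid states \(\bigotimes _{j=k+1}^n \rho _{A_j} \otimes \rho _{A_1^{k} B}\) and \(\bigotimes _{j=k}^n \rho _{A_j} \otimes \rho _{A_1^{k-1} B}\). Each step of the telescope costs \(\epsilon \), and nothing in this argument prevents the total from growing linearly in n.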

While an upper bound of \(n\epsilon \) is trivial and meaningless for the trace distance when n is large, it is still a meaningful bound for the relative entropy between two states, which is unbounded in general. We can show that the approximation conditions above (Eq. 1) also imply that the relative entropy between \(\rho _{A_1^n B}\) and \(\otimes _{k=1}^n \rho _{A_k} \otimes \rho _B\) is at most \(n f(\epsilon )\) for some small function \(f(\epsilon )\). The substate theorem [10] allows us to transform this relative entropy bound into a smooth max-relative entropy bound. For two general states \(\rho _{AB}\) and \(\eta _{AB}\) with \(d:= D^{\delta }_{\max }(\rho _{AB} || \eta _{AB})\), we can easily bound the smooth min-entropy of \(\rho \) in terms of the min-entropy of \(\eta \) by observing that

$$\begin{aligned} \rho _{AB} \approx _{\delta } \tilde{\rho }_{AB} \le 2^d \eta _{AB} \le 2^{-(H_{\min }(A|B)_{\eta } - d)} {{\,\mathrm{\mathbb {1}}\,}}_A \otimes \sigma _B \end{aligned}$$
(2)

for some state \(\sigma _B\), which satisfies \(D_{\max }(\eta _{AB} || {{\,\mathrm{\mathbb {1}}\,}}_A \otimes \sigma _B) = -H_{\min }(A|B)_{\eta }\). This implies that

$$\begin{aligned} H_{\min }^{\delta }(A|B)_{\rho } \ge H_{\min }(A|B)_{\eta } - D^{\delta }_{\max }(\rho _{AB} || \eta _{AB}) \end{aligned}$$
(3)

We call this an entropic triangle inequality, since it is based on the triangle inequality property of \(D_{\max }\). We can further improve this smooth min-entropy triangle inequality to (Lemma 3.5)

$$\begin{aligned} H_{\min }^{\epsilon +\delta }(A|B)_{\rho }&\ge \tilde{H}^{\uparrow }_{\alpha }(A|B)_{\eta } - \frac{\alpha }{\alpha -1} D^\epsilon _{\max } (\rho _{AB}|| \eta _{AB}) - \frac{g_1(\delta , \epsilon )}{\alpha -1} \end{aligned}$$
(4)

for some function \(g_1\), \(\epsilon + \delta <1\) and \(1 < \alpha \le 2\). Our general strategy for the scenarios considered in this paper is to first bound the “one-shot information theoretic” distance (the smooth max-relative entropy) between the real state \(\rho \) (\(\rho _{A_1^n B}\) in the above scenario) and a virtual but nicer state \(\eta \) (\(\otimes _{k=1}^n \rho _{A_k} \otimes \rho _B\) above) by \(nf(\epsilon )\) for some small \(f(\epsilon )\). Then, we use Eq. 4 above to reduce the problem of bounding the smooth min-entropy of the state \(\rho \) to that of bounding an \(\alpha \)-Rényi entropy of the state \(\eta \). Using this strategy, in Corollary 4.4, we prove that for states satisfying the approximately independent registers assumption, we have for \(\delta = O\left( \epsilon \log \frac{|A|}{\epsilon } \right) \) that

$$\begin{aligned} H_{\min }^{\delta ^{\frac{1}{4}}}(A_1^n|B)_{\rho }&\ge n \left( H (A_1)_{\rho } - O(\delta ^{\frac{1}{4}}) \right) - O\left( \frac{1}{\delta ^{3/4}} \right) . \end{aligned}$$
(5)

Another scenario we consider here is that of approximate entropy accumulation. In the setting for entropy accumulation, a sequence of channels \({{\,\mathrm{\mathcal {M}}\,}}_k: R_{k-1} \rightarrow A_k B_k R_k\) for \(1 \le k \le n\) acts sequentially on a state \(\rho ^{(0)}_{R_0 E}\) to produce the state \(\rho _{A_1^n B_1^n E} = {{\,\mathrm{\mathcal {M}}\,}}_n \circ \cdots \circ {{\,\mathrm{\mathcal {M}}\,}}_1(\rho ^{(0)}_{R_0 E})\). It is assumed that the channels \({{\,\mathrm{\mathcal {M}}\,}}_k\) are such that the Markov chain \(A_1^{k-1} \leftrightarrow B_1^{k-1} E \leftrightarrow B_k\) is satisfied for every k. This ensures that the register \(B_k\) does not reveal any additional information about \(A_1^{k-1}\) beyond what was previously revealed by \(B_1^{k-1}E\). The entropy accumulation theorem [8] then provides a tight lower bound for the smooth min-entropy \(H^{\delta }_{\min }(A_1^n | B_1^n E)\). We consider an approximate version of the above setting where the channels \({{\,\mathrm{\mathcal {M}}\,}}_k\) themselves do not necessarily satisfy the Markov chain condition, but can be \(\epsilon \)-approximated by a sequence of channels \({{\,\mathrm{\mathcal {M}}\,}}'_k\), which satisfy certain Markov chain conditions. Such relaxations are important for understanding the behaviour of cryptographic protocols, like device-independent quantum key distribution [11, 12], which are implemented with imperfect devices [13, 14]. Once again, we can model this scenario as an approximation chain: for every \(1 \le k \le n\), the state produced in the kth step satisfies

$$\begin{aligned} \rho _{A_1^k B_1^k E}&= {{\,\textrm{tr}\,}}_{R_k} \circ {{\,\mathrm{\mathcal {M}}\,}}_k \left( {{\,\mathrm{\mathcal {M}}\,}}_{k-1} \circ \cdots \circ {{\,\mathrm{\mathcal {M}}\,}}_1(\rho ^{(0)}_{R_0 E}) \right) \\&\approx _{\epsilon } {{\,\textrm{tr}\,}}_{R_k} \circ {{\,\mathrm{\mathcal {M}}\,}}'_k \left( {{\,\mathrm{\mathcal {M}}\,}}_{k-1} \circ \cdots \circ {{\,\mathrm{\mathcal {M}}\,}}_1(\rho ^{(0)}_{R_0 E}) \right) := \sigma ^{(k)}_{A_1^k B_1^k E}. \end{aligned}$$

Moreover, the assumptions on the channel \({{\,\mathrm{\mathcal {M}}\,}}'_k\) guarantee that the state \(\sigma ^{(k)}_{A_1^k B_1^k E}\) satisfies the Markov chain condition \(A_1^{k-1} \leftrightarrow B_1^{k-1} E \leftrightarrow B_k\), and so the chain rules and bounds used for entropy accumulation apply to it too. Roughly speaking, we use the chain rules for divergences [15] to show that the divergence between the real state \(\rho _{A_1^n B_1^n E} = {{\,\mathrm{\mathcal {M}}\,}}_n \circ \cdots \circ {{\,\mathrm{\mathcal {M}}\,}}_1(\rho ^{(0)}_{R_0 E})\) and the virtual state \(\sigma _{A_1^n B_1^n E} = {{\,\mathrm{\mathcal {M}}\,}}'_n \circ \cdots \circ {{\,\mathrm{\mathcal {M}}\,}}'_1(\rho ^{(0)}_{R_0 E})\) is relatively small, and then reduce the problem of lower bounding the smooth min-entropy of \(\rho _{A_1^n B_1^n E}\) to that of lower bounding an \(\alpha \)-Rényi entropy of \(\sigma _{A_1^n B_1^n E}\), which can be done using the chain rules developed for entropy accumulation (Footnote 4). In Theorem 5.1, we show the following smooth min-entropy lower bound for the state \(\rho _{A_1^n B_1^n E}\) for sufficiently small \(\epsilon \) and an arbitrary \(\delta >0\):

$$\begin{aligned} H_{\min }^{\delta }(A_1^n|B_1^n E)_{\rho } \ge \sum _{k=1}^n \inf _{\omega }&H(A_k | B_k \tilde{R}_{k-1})_{{{\,\mathrm{\mathcal {M}}\,}}'_k(\omega )} - nO(\epsilon ^{\frac{1}{24}}) - O\left( \frac{1}{\epsilon ^{\frac{1}{24}}} \right) \end{aligned}$$
(6)

where the infimum is over all possible input states \(\omega _{R_{k-1} \tilde{R}_{k-1}}\) for a reference register \(\tilde{R}_{k-1}\) isomorphic to \(R_{k-1}\), and the dimensions |A| and |B| are assumed to be constant in the asymptotic notation.

In the companion paper [1], we use the techniques developed in this paper to provide a solution to the source correlation problem in quantum key distribution (QKD) [16]. Briefly speaking, the security proofs of QKD require that one of the honest parties produce randomly and independently sampled quantum states in each round of the protocol. However, the states produced by a realistic quantum source will be somewhat correlated across different rounds due to imperfections. These correlations are called source correlations. Proving security for QKD with such a correlated source is challenging, and no satisfying general solution was known. In [1], we use the entropic triangle inequality to reduce the security of a QKD protocol with a correlated source to that of the QKD protocol with a depolarised variant of the perfect source, for which security can be proven using existing techniques.

2 Background and Notation

For n quantum registers \((X_1, X_2, \ldots , X_n)\), the notation \(X_i^j\) refers to the set of registers \((X_i, X_{i+1}, \ldots , X_{j})\). We use the notation [n] to denote the set \(\{1,2, \ldots , n\}\). For a register A, |A| represents the dimension of the underlying Hilbert space. If X and Y are Hermitian operators, then the operator inequality \(X \ge Y\) denotes the fact that \(X-Y\) is a positive semidefinite operator and \(X>Y\) denotes that \(X-Y\) is a strictly positive operator. A quantum state (or briefly just state) refers to a positive semidefinite operator with unit trace. At times, we will also need to consider positive semidefinite operators with trace less than or equal to 1. We call these operators subnormalised states. We will denote the set of registers a quantum state describes (equivalently, its Hilbert space) using a subscript. For example, a quantum state on registers A and B will be written as \(\rho _{AB}\), and its partial states on registers A and B will be denoted as \(\rho _{A}\) and \(\rho _{B}\). The identity operator on register A is denoted by \({{\,\mathrm{\mathbb {1}}\,}}_A\). A classical-quantum state on registers X and B is given by \(\rho _{XB} = \sum _{x} p(x) |{x}\rangle \langle {x}| \otimes \rho _{B|x}\), where \(\rho _{B|x}\) are normalised quantum states on register B.

The term “channel” is used for completely positive trace preserving (CPTP) linear maps between two spaces of Hermitian operators. A channel \({{\,\mathrm{\mathcal {N}}\,}}\) mapping registers A to B will be denoted by \({{\,\mathrm{\mathcal {N}}\,}}_{A \rightarrow B}\). We write \(\text {supp}(X)\) to denote the support of the Hermitian operator X and use \(X \ll Y\) to denote that \(\text {supp}(X) \subseteq \text {supp}(Y)\).

The trace norm is defined as \(\left\| X \right\| _1:= {{\,\textrm{tr}\,}}\big (\left( X^\dag X \right) ^{\frac{1}{2}}\big )\). The fidelity between two positive operators P and Q is defined as \(F(P,Q)= \left\| \sqrt{P}\sqrt{Q} \right\| _1^2\). The generalised fidelity between two subnormalised states \(\rho \) and \(\sigma \) is defined as

$$\begin{aligned} F_*(\rho , \sigma ) := \left( \left\| \sqrt{\rho }\sqrt{\sigma } \right\| _1 + \sqrt{(1- {{\,\textrm{tr}\,}}\rho )(1- {{\,\textrm{tr}\,}}\sigma )} \right) ^2. \end{aligned}$$
(7)

The purified distance between two subnormalised states \(\rho \) and \(\sigma \) is defined as

$$\begin{aligned} P(\rho , \sigma ) = \sqrt{1- F_{*}(\rho , \sigma )}. \end{aligned}$$
(8)

We will also use the diamond norm distance as a measure of the distance between two channels. For a linear transform \({{\,\mathrm{\mathcal {N}}\,}}_{A \rightarrow B}\) from operators on register A to operators on register B, the diamond norm distance is defined as

$$\begin{aligned} \left\| {{\,\mathrm{\mathcal {N}}\,}}_{A \rightarrow B} \right\| _{\diamond } := \max _{X_{AR}: \left\| X_{AR} \right\| _1 \le 1} \left\| {{\,\mathrm{\mathcal {N}}\,}}_{A \rightarrow B}(X_{AR}) \right\| _1 \end{aligned}$$
(9)

where the optimisation is over all Hilbert spaces R (fixing \(|R|= |A|\) is sufficient) and all operators \(X_{AR}\) such that \(\left\| X_{AR} \right\| _1 \le 1\).

Throughout this paper, we use base 2 for both the functions \(\log \) and \(\exp \). We follow the notation in Tomamichel’s book [17] for Rényi entropies. For \(\alpha \in (0,1) \cup (1,2)\), the Petz \(\alpha \)-Rényi relative entropy between the positive operators P and Q is defined as

$$\begin{aligned} \bar{D}_{\alpha }(P || Q) = {\left\{ \begin{array}{ll} \frac{1}{\alpha -1} \log \frac{{{\,\textrm{tr}\,}}\left( P^\alpha Q^{1-\alpha } \right) }{{{\,\textrm{tr}\,}}(P)} &{} \text { if } (\alpha <1 \text { and } P \not \perp Q) \text { or } (P \ll Q)\\ \infty &{} \text { else}. \end{array}\right. } \end{aligned}$$
(10)

The sandwiched \(\alpha \)-Rényi relative entropy for \(\alpha \in [\frac{1}{2},1) \cup (1,\infty ]\) between the positive operators P and Q is defined as

$$\begin{aligned} \tilde{D}_{\alpha }(P || Q) = {\left\{ \begin{array}{ll} \frac{1}{\alpha -1} \log \frac{{{\,\textrm{tr}\,}}(Q^{-\frac{\alpha '}{2}} P Q^{-\frac{\alpha '}{2}})^{\alpha }}{{{\,\textrm{tr}\,}}(P)} &{} \text { if } (\alpha <1 \text { and } P \not \perp Q) \text { or } (P \ll Q)\\ \infty &{} \text { else}. \end{array}\right. } \end{aligned}$$
(11)

where \(\alpha ' = \frac{\alpha -1}{\alpha }\). In the limit \(\alpha \rightarrow \infty \), the sandwiched divergence becomes equal to the max-relative entropy, \(D_{\max }\), which is defined as

$$\begin{aligned} D_{\max } (P|| Q) := \inf \left\{ \lambda \in \mathbb {R}: P \le 2^{\lambda }Q \right\} . \end{aligned}$$
(12)
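For commuting operators, \(D_{\max }\) reduces to a maximum ratio of eigenvalues. For instance (a small example of our own), for the classical distributions \(P = (p, 1-p)\) and \(Q = (q, 1-q)\),

$$\begin{aligned} D_{\max }(P || Q) = \log \max \left\{ \frac{p}{q}, \frac{1-p}{1-q} \right\} , \end{aligned}$$

so \(D_{\max }((0.5, 0.5) \,||\, (0.6, 0.4)) = \log 1.25 \approx 0.32\), even though the two distributions are only 0.1 apart in total variation distance.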

In the limit of \(\alpha \rightarrow 1\), both the Petz and the sandwiched relative entropies equal the quantum relative entropy, D(P||Q), which is defined as

$$\begin{aligned} D(P|| Q) := {\left\{ \begin{array}{ll} \frac{{{\,\textrm{tr}\,}}\left( P \log P - P \log Q \right) }{{{\,\textrm{tr}\,}}(P)} &{} \text { if } (P \ll Q)\\ \infty &{} \text { else}. \end{array}\right. } \end{aligned}$$
(13)

Given any divergence \(\mathbb {D}\), we can define the (stabilised) channel divergence based on \(\mathbb {D}\) between two channels \({{\,\mathrm{\mathcal {N}}\,}}_{A \rightarrow B}\) and \({{\,\mathrm{\mathcal {M}}\,}}_{A \rightarrow B}\) as [18, 19]

$$\begin{aligned} \mathbb {D}({{\,\mathrm{\mathcal {N}}\,}}|| {{\,\mathrm{\mathcal {M}}\,}}) := \sup _{\rho _{AR}} \mathbb {D}({{\,\mathrm{\mathcal {N}}\,}}_{A \rightarrow B}(\rho _{AR}) || {{\,\mathrm{\mathcal {M}}\,}}_{A \rightarrow B}(\rho _{AR})) \end{aligned}$$
(14)

where R is a reference register of arbitrary size (\(|R| = |A|\) can be chosen when \(\mathbb {D}\) satisfies the data processing inequality).

We can use the divergences defined above to define the following conditional entropies for the subnormalised state \(\rho _{AB}\):

$$\begin{aligned} \bar{H}_{\alpha }^{\uparrow } (A|B)_{\rho }&:= \sup _{\sigma _B} - \bar{D}_{\alpha }(\rho _{AB} || {{\,\mathrm{\mathbb {1}}\,}}_A \otimes \sigma _B) \end{aligned}$$
(15)
$$\begin{aligned} \tilde{H}_{\alpha }^{\uparrow } (A|B)_{\rho }&:= \sup _{\sigma _B} - \tilde{D}_{\alpha }(\rho _{AB} || {{\,\mathrm{\mathbb {1}}\,}}_A \otimes \sigma _B) \end{aligned}$$
(16)
$$\begin{aligned} \bar{H}_{\alpha }^{\downarrow } (A|B)_{\rho }&:= - \bar{D}_{\alpha }(\rho _{AB} || {{\,\mathrm{\mathbb {1}}\,}}_A \otimes \rho _B) \end{aligned}$$
(17)
$$\begin{aligned} \tilde{H}_{\alpha }^{\downarrow } (A|B)_{\rho }&:= - \tilde{D}_{\alpha }(\rho _{AB} || {{\,\mathrm{\mathbb {1}}\,}}_A \otimes \rho _B) \end{aligned}$$
(18)

for appropriate \(\alpha \) in the domain of the divergences. The supremum in the definition for \(\bar{H}_{\alpha }^{\uparrow }\) and \( \tilde{H}_{\alpha }^{\uparrow }\) is over all quantum states \(\sigma _B\) on register B.

In the limit \(\alpha \rightarrow 1\), all these conditional entropies equal the von Neumann conditional entropy H(A|B). The quantity \(\tilde{H}_{\infty }^{\uparrow } (A|B)_{\rho }\) is called the min-entropy. It is denoted by \(H_{\min }(A|B)_{\rho }\) and, for a subnormalised state, can also be defined as

$$\begin{aligned} H_{\min }(A|B)_{\rho }&:= \sup \left\{ \lambda \in \mathbb {R}: \text { there exists state }\sigma _B \text { such that } \rho _{AB} \le 2^{-\lambda } {{\,\mathrm{\mathbb {1}}\,}}_A \otimes \sigma _B \right\} . \end{aligned}$$
(19)
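When the register B is trivial, the definition above simplifies: the optimal choice is \(2^{-\lambda } = \left\| \rho _A \right\| _{\infty }\), so that

$$\begin{aligned} H_{\min }(A)_{\rho } = - \log \left\| \rho _A \right\| _{\infty }, \end{aligned}$$

which for a classical distribution is \(-\log \max _a p(a)\), determined by the single most likely outcome. More generally, it is known that for a classical-quantum state \(\rho _{XB}\), the quantity \(2^{-H_{\min }(X|B)_{\rho }}\) equals the optimal probability of guessing X given the register B.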

For the purpose of smoothing, define the \(\epsilon \)-ball around the subnormalised state \(\rho \) as the set

$$\begin{aligned} B_{\epsilon }(\rho ) := \{ \tilde{\rho } \ge 0 : P(\rho , \tilde{\rho }) \le \epsilon \text { and } {{\,\textrm{tr}\,}}\tilde{\rho } \le 1\}. \end{aligned}$$
(20)

We define the smooth max-relative entropy as

$$\begin{aligned} D_{\max }^{\epsilon }(\rho || \sigma ) := \min _{\tilde{\rho } \in B_{\epsilon }(\rho )} D_{\max }(\tilde{\rho } || \sigma ) \end{aligned}$$
(21)

The smooth min-entropy of \(\rho _{AB}\) is defined as

$$\begin{aligned} H_{\min }^{\epsilon }(A|B)_{\rho } := \max _{\tilde{\rho } \in B_{\epsilon }(\rho )} H_{\min }(A|B)_{\tilde{\rho }}. \end{aligned}$$
(22)

3 Entropic Triangle Inequality for the Smooth Min-entropy

In this section, we derive a simple entropic triangle inequality (Lemma 3.5) for the smooth min-entropy of the form in Eq. 4. This lemma is a direct consequence of the following triangle inequality for \(\tilde{D}_{\alpha }\) (see [20, Theorem 3.1] for a triangle inequality that changes the second argument of \(\tilde{D}_{\alpha }\)).

Lemma 3.1

Let \(\rho \) and \(\eta \) be subnormalised states and let Q be a positive operator. Then for \(\alpha >1\), we have

$$\begin{aligned} \tilde{D}_{\alpha }(\rho || Q) \le \tilde{D}_{\alpha }(\eta || Q) + \frac{\alpha }{\alpha -1} D_{\max }(\rho || \eta ) + \frac{1}{\alpha -1} \log \frac{{{\,\textrm{tr}\,}}(\eta )}{{{\,\textrm{tr}\,}}(\rho )} \end{aligned}$$

and for \(\alpha <1\) if one of \(\tilde{D}_{\alpha }(\eta || Q)\) and \(D_{\max }(\rho || \eta )\) is finite (otherwise we cannot define their difference), we have

$$\begin{aligned} \tilde{D}_{\alpha }(\rho || Q) \ge \tilde{D}_{\alpha }(\eta || Q) - \frac{\alpha }{1- \alpha } D_{\max }(\rho || \eta )- \frac{1}{1-\alpha } \log \frac{{{\,\textrm{tr}\,}}(\eta )}{{{\,\textrm{tr}\,}}(\rho )}. \end{aligned}$$

Proof

If \(D_{\max }(\rho || \eta )= \infty \), then both statements are trivially true. Otherwise, we have that \(\rho \le 2^{D_{\max }(\rho || \eta )} \eta \) and also \(\rho \ll \eta \). Now, if \(\rho \not \ll Q\), then \(\eta \not \ll Q\). Hence, for \(\alpha >1\), if \(\tilde{D}_{\alpha }(\rho || Q) = \infty \), then \(\tilde{D}_{\alpha }(\eta || Q) = \infty \), which means the Lemma is also satisfied in this case. For \(\alpha <1\), if \(\tilde{D}_{\alpha }(\rho || Q) = \infty \), then the Lemma is trivially satisfied. For the remaining cases, we have

$$\begin{aligned} 2^{(\alpha -1) \tilde{D}_{\alpha }(\rho || Q)}&= \frac{{{\,\textrm{tr}\,}}\left( Q^{-\frac{\alpha -1}{2\alpha }} \rho Q^{-\frac{\alpha -1}{2\alpha }} \right) ^{\alpha }}{{{\,\textrm{tr}\,}}(\rho )} \\&\le \frac{{{\,\textrm{tr}\,}}\left( Q^{-\frac{\alpha -1}{2\alpha }} 2^{D_{\max }(\rho || \eta )} \eta Q^{-\frac{\alpha -1}{2\alpha }} \right) ^{\alpha }}{{{\,\textrm{tr}\,}}(\rho )}\\&= \frac{{{\,\textrm{tr}\,}}(\eta )}{{{\,\textrm{tr}\,}}(\rho )} 2^{\alpha D_{\max }(\rho || \eta )} 2^{(\alpha -1) \tilde{D}_{\alpha }(\eta || Q)} \end{aligned}$$

where we used the fact that \(X \le Y\) implies \({{\,\textrm{tr}\,}}(f(X)) \le {{\,\textrm{tr}\,}}(f(Y))\) for a monotone increasing function f. Dividing by \((\alpha -1)\) now gives the result; for \(\alpha < 1\), the factor \(\alpha - 1\) is negative, which reverses the inequality. \(\square \)
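As a quick numerical sanity check of Lemma 3.1 in the commuting case (with numbers chosen purely for illustration), take \(\alpha = 2\), \(Q = {{\,\mathrm{\mathbb {1}}\,}}\), \(\rho = \mathrm {diag}(0.5, 0.5)\) and \(\eta = \mathrm {diag}(0.6, 0.4)\). Then

$$\begin{aligned} \tilde{D}_{2}(\rho || {{\,\mathrm{\mathbb {1}}\,}}) = \log (0.5^2 + 0.5^2) = -1, \qquad \tilde{D}_{2}(\eta || {{\,\mathrm{\mathbb {1}}\,}}) = \log (0.6^2 + 0.4^2) \approx -0.94, \end{aligned}$$

and \(D_{\max }(\rho || \eta ) = \log \frac{0.5}{0.4} \approx 0.32\), so the right-hand side of the first inequality evaluates to \(-0.94 + 2 \cdot 0.32 \approx -0.30 \ge -1\), as the lemma requires (the \(\log \frac{{{\,\textrm{tr}\,}}(\eta )}{{{\,\textrm{tr}\,}}(\rho )}\) term vanishes for normalised states).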

To incorporate smoothing into the above inequality, we define the smooth \(\alpha \)-Rényi conditional entropy as follows.

Definition 3.2

(\(\epsilon \)-smooth \(\alpha \)-Rényi conditional entropy). For \(\alpha \in (1, \infty ]\) and \(\epsilon \in [0,1]\), we define the \(\epsilon \)-smooth \(\alpha \)-Rényi conditional entropy as

$$\begin{aligned} \tilde{H}^{\uparrow }_{\alpha , \epsilon }(A|B)_{\rho }:= \max _{\tilde{\rho }_{AB} \in B_{\epsilon }(\rho _{AB})} \tilde{H}^{\uparrow }_{\alpha }(A|B)_{\tilde{\rho }}. \end{aligned}$$
(23)

Lemma 3.3

For \(\alpha \in (1, \infty ]\), \(\epsilon \in [0,1)\), and states \(\rho _{AB}\) and \(\eta _{AB}\), we have

$$\begin{aligned} \tilde{H}^{\uparrow }_{\alpha , \epsilon }(A|B)_{\rho } \ge \tilde{H}^{\uparrow }_{\alpha }(A|B)_{\eta } - \frac{\alpha }{\alpha -1} D^\epsilon _{\max } (\rho _{AB}|| \eta _{AB}) - \frac{1}{\alpha -1} \log \frac{1}{1-\epsilon ^2}. \end{aligned}$$

Proof

Let \(\tilde{\rho }_{AB} \in B_{\epsilon }(\rho _{AB})\) be a subnormalised state such that \(D_{\max }(\tilde{\rho }_{AB}|| \eta _{AB})= D^\epsilon _{\max } (\rho _{AB}|| \eta _{AB})\). Using Lemma 3.1 for \(\alpha >1\), we have that for every state \(\sigma _B\),

$$\begin{aligned} \tilde{D}_{\alpha } (\tilde{\rho }_{AB} || {{\,\mathrm{\mathbb {1}}\,}}_A \otimes \sigma _B)&\le \tilde{D}_{\alpha } (\eta _{AB} || {{\,\mathrm{\mathbb {1}}\,}}_A \otimes \sigma _B) + \frac{\alpha }{\alpha -1} D^{\epsilon }_{\max }({\rho }_{AB} || \eta _{AB}) + \frac{1}{\alpha -1}\log \frac{1}{1- \epsilon ^2} \end{aligned}$$
(24)

where we used the fact that \(\tilde{\rho }_{AB} \in B_{\epsilon }(\rho _{AB})\), which implies that \({{\,\textrm{tr}\,}}(\tilde{\rho }_{AB}) \ge 1- \epsilon ^2\). Since the above bound is true for arbitrary states \(\sigma _B\), we can multiply it by \(-1\) and take the supremum to derive

$$\begin{aligned} \tilde{H}^{\uparrow }_{\alpha }(A|B)_{\tilde{\rho }} \ge \tilde{H}^{\uparrow }_{\alpha }(A|B)_{\eta } - \frac{\alpha }{\alpha -1} D^\epsilon _{\max }(\rho _{AB}|| \eta _{AB}) - \frac{1}{\alpha -1} \log \frac{1}{1-\epsilon ^2}. \end{aligned}$$

The desired bound follows by using the fact that \(\tilde{H}^{\uparrow }_{\alpha , \epsilon }(A|B)_{{\rho }} \ge \tilde{H}^{\uparrow }_{\alpha }(A|B)_{\tilde{\rho }}\). \(\square \)

Lemma 3.4

For a state \(\rho _{AB}\), \(\epsilon \in [0,1)\), and \(\delta \in (0,1)\) such that \(\epsilon + \delta < 1\) and \(\alpha \in (1,2]\), we have

$$\begin{aligned} H_{\min }^{\epsilon +\delta }(A|B)_{\rho } \ge \tilde{H}^{\uparrow }_{\alpha , \epsilon }(A|B)_{\rho } - \frac{g_0(\delta )}{\alpha -1} \end{aligned}$$

where \(g_0(x):= - \log (1- \sqrt{1-x^2})\).

Proof

First, note that

$$\begin{aligned} H_{\min }^{\epsilon +\delta }(A|B)_{\rho }&\ge \sup _{\tilde{\rho } \in B_{\epsilon }(\rho _{AB})} H_{\min }^{\delta }(A|B)_{\tilde{\rho }}. \end{aligned}$$
(25)

To prove this, consider a \(\tilde{\rho }_{AB} \in B_{\epsilon }(\rho _{AB})\) and \(\rho '_{AB} \in B_{\delta }(\tilde{\rho }_{AB})\) such that \(H_{\min }(A|B)_{{\rho '}} = H^{\delta }_{\min }(A|B)_{\tilde{\rho }}\). Then, using the triangle inequality for the purified distance, we have

$$\begin{aligned} P(\rho _{AB}, \rho '_{AB})&\le P(\rho _{AB}, \tilde{\rho }_{AB}) + P(\tilde{\rho }_{AB}, \rho '_{AB}) \\&\le \epsilon + \delta \end{aligned}$$

which implies that \(H_{\min }^{\epsilon +\delta }(A|B)_{\rho } \ge H_{\min }(A|B)_{{\rho '}} = H^{\delta }_{\min }(A|B)_{\tilde{\rho }}\). Since this is true for all \(\tilde{\rho } \in B_{\epsilon }(\rho _{AB})\), the bound in Eq. 25 holds.

Using this, we have

$$\begin{aligned} H_{\min }^{\epsilon +\delta }(A|B)_{\rho }&\ge \sup _{\tilde{\rho } \in B_{\epsilon }(\rho _{AB})} H_{\min }^{\delta }(A|B)_{\tilde{\rho }}\\&\ge \sup _{\tilde{\rho } \in B_{\epsilon }(\rho _{AB})} \left\{ \tilde{H}^{\uparrow }_{\alpha }(A|B)_{\tilde{\rho }} - \frac{g_0(\delta )}{\alpha -1} \right\} \\&= \tilde{H}^{\uparrow }_{\alpha , \epsilon }(A|B)_{\rho } - \frac{g_0(\delta )}{\alpha -1} \end{aligned}$$

where we have used [8, Lemma B.10] (Footnote 5; originally proven in [21]) in the second step. \(\square \)

We can combine these two lemmas to derive the following result.

Lemma 3.5

For \(\alpha \in (1,2]\), \(\epsilon \in [0,1)\), and \(\delta \in (0,1)\) such that \(\epsilon + \delta < 1\), and for two states \(\rho _{AB}\) and \(\eta _{AB}\), we have

$$\begin{aligned} H_{\min }^{\epsilon +\delta }(A|B)_{\rho }&\ge \tilde{H}^{\uparrow }_{\alpha }(A|B)_{\eta } - \frac{\alpha }{\alpha -1} D^\epsilon _{\max } (\rho _{AB}|| \eta _{AB}) - \frac{g_1(\delta , \epsilon )}{\alpha -1} \end{aligned}$$
(26)

where \(g_1(x, y):= - \log (1- \sqrt{1-x^2}) - \log (1-y^2)\).

Proof

We can combine Lemmas 3.3 and 3.4 as follows to derive the bound in the Lemma:

$$\begin{aligned} H_{\min }^{\epsilon +\delta }(A|B)_{\rho }&\ge \tilde{H}^{\uparrow }_{\alpha , \epsilon }(A|B)_{\rho } - \frac{g_0(\delta )}{\alpha -1}\\&\ge \tilde{H}^{\uparrow }_{\alpha }(A|B)_{\eta } - \frac{\alpha }{\alpha -1} D^\epsilon _{\max } (\rho _{AB}|| \eta _{AB}) - \frac{1}{\alpha -1}\left( g_0(\delta ) + \log \frac{1}{1-\epsilon ^2} \right) . \end{aligned}$$

\(\square \)

We can use the asymptotic equipartition properties for the smooth min-entropy and the smooth max-relative entropy [21,22,23] to derive the following novel triangle inequality for the von Neumann conditional entropy. Although we do not use this inequality in this paper, we believe it is interesting and may prove useful in the future.
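Concretely, the two asymptotic statements used in the proof below are that, for any fixed \(\epsilon \in (0,1)\),

$$\begin{aligned} \lim _{n \rightarrow \infty } \frac{1}{n} H_{\min }^{\epsilon }(A_1^n|B_1^n)_{\rho ^{\otimes n}} = H(A|B)_{\rho } \qquad \text {and} \qquad \lim _{n \rightarrow \infty } \frac{1}{n} D^{\epsilon }_{\max } (\rho ^{\otimes n}|| \eta ^{\otimes n}) = D(\rho || \eta ). \end{aligned}$$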

Corollary 3.6

For \(\alpha \in (1,2]\) and states \(\rho _{AB}\) and \(\eta _{AB}\), we have that

$$\begin{aligned} H(A|B)_{\rho }&\ge \tilde{H}^{\uparrow }_{\alpha }(A|B)_{\eta } - \frac{\alpha }{\alpha -1} D(\rho _{AB}|| \eta _{AB}). \end{aligned}$$
(27)

Proof

Using Lemma 3.5 with \(\alpha \in (1,2]\), the states \(\rho ^{\otimes n}_{AB}\) and \(\eta ^{\otimes n}_{AB}\), and any \(\epsilon > 0\) and \(\delta > 0\) satisfying the conditions of the Lemma, we get

$$\begin{aligned}&H_{\min }^{\epsilon +\delta }(A_1^n|B_1^n)_{\rho ^{\otimes n}} \ge \tilde{H}^{\uparrow }_{\alpha }(A_1^n|B_1^n)_{\eta ^{\otimes n}} - \frac{\alpha }{\alpha -1} D^\epsilon _{\max } (\rho _{AB}^{\otimes n}|| \eta ^{\otimes n}_{AB}) - \frac{g_1(\delta , \epsilon )}{\alpha -1} \\&\quad \Rightarrow \frac{1}{n} H_{\min }^{\epsilon +\delta }(A_1^n|B_1^n)_{\rho ^{\otimes n}} \ge \tilde{H}^{\uparrow }_{\alpha }(A|B)_{\eta } - \frac{\alpha }{\alpha -1} \frac{1}{n} D^\epsilon _{\max } (\rho _{AB}^{\otimes n}|| \eta ^{\otimes n}_{AB}) - \frac{1}{n} \frac{g_1(\delta , \epsilon )}{\alpha -1}. \end{aligned}$$

Taking the limit of the above for \(n \rightarrow \infty \), we get

$$\begin{aligned}&\lim _{n \rightarrow \infty } \frac{1}{n} H_{\min }^{\epsilon +\delta }(A_1^n|B_1^n)_{\rho ^{\otimes n}} \ge \tilde{H}^{\uparrow }_{\alpha }(A|B)_{\eta } - \lim _{n \rightarrow \infty }\frac{\alpha }{\alpha -1} \frac{1}{n} D^\epsilon _{\max } (\rho _{AB}^{\otimes n}|| \eta ^{\otimes n}_{AB})\\&\quad - \lim _{n \rightarrow \infty } \frac{1}{n} \frac{g_1(\delta , \epsilon )}{\alpha -1} \\&\quad \Rightarrow H(A|B)_{\rho } \ge \tilde{H}^{\uparrow }_{\alpha }(A|B)_{\eta } - \frac{\alpha }{\alpha -1} D(\rho _{AB}|| \eta _{AB}) \end{aligned}$$

which proves the claim. \(\square \)

4 Approximately Independent Registers

In this section, we introduce our technique for using the smooth min-entropy triangle inequality with approximation chains by studying a state \(\rho _{A_1^n B}\) such that for every \(k \in [n]\)

$$\begin{aligned} \left\| \rho _{A_1^{k}B} - \rho _{A_k} \otimes \rho _{A_1^{k-1}B} \right\| _1 \le \epsilon . \end{aligned}$$
(28)

We assume that the registers \(A_k\) all have the same dimension, equal to |A|. One should think of the registers \(A_k\) as the secret information produced during some protocol, which also provides the register B to an adversary. We would like to prove that, under the above approximate independence conditions, \(H^{f(\epsilon )}_{\min }(A_1^n | B)\) is large (lower bounded by \(\Omega (n)\), and close to \(nH(A_1)\) if we assume the states \(\rho _{A_k}\) are identical) for some reasonably small function f of \(\epsilon \).

Let us first examine the case where the state \(\rho \) above is classical. We use the standard notation for probability distributions to address the elements of \(\rho \), so that \(\rho (a_1^n, b):= \langle {a_1^n, b}| \rho _{A_1^n B} |{a_1^n, b}\rangle \), where \(|{a_1^n, b}\rangle \) is a standard basis vector. To show that the smooth min-entropy is high in this case, we will use the Markov inequality to show that the set where the conditional probability \(\rho (a_1^n |b):= \frac{\rho (a_1^n b)}{\rho (b)}\) is large has small probability. We will use the following lemma for this purpose.

Lemma 4.1

Suppose p and q are probability distributions on \(\mathcal {X}\) such that \(\frac{1}{2}\left\| p-q \right\| _1 \le \epsilon \). Then the set \(S \subseteq \mathcal {X}\) defined as \(S:= \{x \in \mathcal {X}: p(x) \le (1+\epsilon ^{1/2}) q(x) \}\) satisfies \(q(S) \ge 1- \epsilon ^{1/2}\) and \(p(S)\ge 1- \epsilon ^{1/2}- \epsilon \).

Proof

Let \(S^c:= \mathcal {X} {\setminus } S\), where S is the set defined above. If \(q(S^c) = 0\), the statement in the lemma is satisfied. If \(q(S^c) > 0\), we have that

$$\begin{aligned} \epsilon \ge \frac{1}{2}\left\| p-q \right\| _1&= \max _{H \subseteq \mathcal {X}} \vert p(H)- q(H) \vert \\&\ge q(S^c) \left| \frac{p(S^c)}{q(S^c)} - 1 \right| \\&\ge q(S^c)\left( \frac{p(S^c)}{q(S^c)} - 1 \right) \\&= q(S^c)\left( \frac{\sum _{x \in S^c} p(x)}{\sum _{x \in S^c} q(x)} - 1 \right) \\&\ge q(S^c)\left( \frac{\sum _{x \in S^c} (1+\epsilon ^{\frac{1}{2}})q(x)}{\sum _{x \in S^c} q(x)} - 1 \right) \\&= q(S^c) \epsilon ^{\frac{1}{2}} \end{aligned}$$

which implies that \(q(S^c) \le \epsilon ^{\frac{1}{2}}\), that is, \(q(S) \ge 1 - \epsilon ^{\frac{1}{2}}\). The bound on p(S) follows since \(p(S) \ge q(S) - \epsilon \ge 1- \epsilon ^{\frac{1}{2}} - \epsilon \). \(\square \)

We will also assume, for the sake of simplicity, that the states \(\rho _{A_k}\) are identical for all \(k \in [n]\). Using the lemma above, for every \(k \in [n]\), we know that the set

$$\begin{aligned} B_k :&= \left\{ (a_1^n, b): \rho (a_1^k, b)> (1+\sqrt{\epsilon })\rho (a_1^{k-1}, b)\rho (a_k) \right\} \\&= \left\{ (a_1^n, b): \rho (a_k | a_1^{k-1}, b) > (1+\sqrt{\epsilon })\rho (a_k) \right\} \end{aligned}$$

satisfies \(\Pr _\rho (B_k) \le 2\sqrt{\epsilon }\). We now define \(L = \sum _{k=1}^n \chi _{B_k}\), a random variable that counts the number of bad sets \(B_k\) an element \((a_1^n, b)\) belongs to. Using the Markov inequality, we have

$$\begin{aligned} \Pr _{\rho }\left[ L > n\epsilon ^{\frac{1}{4}}\right] \le \frac{{{\,\mathrm{\mathbb {E}}\,}}_\rho [L]}{n\epsilon ^{\frac{1}{4}}} \le 2\epsilon ^{\frac{1}{4}}. \end{aligned}$$

We define the bad set \(\mathcal {B}:= \left\{ (a_1^n, b): L(a_1^n, b)>n\epsilon ^{\frac{1}{4}} \right\} \) and the subnormalised distribution \(\tilde{\rho }_{A_1^n B}\) as

$$\begin{aligned} \tilde{\rho }_{A_1^n B} (a_1^n, b) = {\left\{ \begin{array}{ll} \rho _{A_1^n B} (a_1^n, b) &{} (a_1^n, b) \not \in \mathcal {B} \\ 0 &{} \text {else} \end{array}\right. }. \end{aligned}$$

We have \(P(\tilde{\rho }_{A_1^n B}, \rho _{A_1^n B}) \le \sqrt{2}\epsilon ^{1/8}\). Further, note that for every \((a_1^n, b) \not \in \mathcal {B}\), we have

$$\begin{aligned} \rho (a_1^n | b)&= \prod _{k=1}^n \rho (a_k | a_1^{k-1}, b) \\&= \prod _{k: (a_1^n, b) \not \in B_k} \rho (a_k | a_1^{k-1}, b) \prod _{k: (a_1^n, b) \in B_k} \rho (a_k | a_1^{k-1}, b) \\&\le (1+\sqrt{\epsilon })^n \prod _{k: (a_1^n, b) \not \in B_k} \rho _{A_k}(a_k)\\&\le (1+\sqrt{\epsilon })^n 2^{-n(1-\epsilon ^{\frac{1}{4}})H_{\min }(A_1)} \end{aligned}$$

where in the third line we have used the fact that if \((a_1^n, b) \not \in B_k\), then \(\rho (a_k | a_1^{k-1}b) \le (1+\sqrt{\epsilon })\rho _{A_k}(a_k)\), and in the last line we have used the facts that for \((a_1^n, b) \not \in \mathcal {B}\) we have \(|\{k \in [n]: (a_1^{n},b) \not \in B_k\}| = n- L(a_1^n, b) \ge n(1-\epsilon ^{\frac{1}{4}})\), that all the states \(\rho _{A_k}\) are identical, and that \(2^{-H_{\min }(A_k)} = \max _{a_k} \rho _{A_k}(a_k)\). Note that we have essentially proven and used a \(D_{\max }\) bound above. This proves the following lower bound for the smooth min-entropy of \(\rho \):

$$\begin{aligned} H^{\sqrt{2}\epsilon ^{\frac{1}{8}}}_{\min } (A_1^n | B) \ge n(1-\epsilon ^{\frac{1}{4}})H_{\min }(A_1) - n \log (1+\sqrt{\epsilon }). \end{aligned}$$
(29)
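To get a feel for the parameters here (numbers chosen purely for illustration), take \(\epsilon = 10^{-8}\). Then the smoothing parameter is \(\sqrt{2}\epsilon ^{1/8} \approx 0.14\) and the bound reads

$$\begin{aligned} H^{0.14}_{\min } (A_1^n | B) \ge 0.99\, n H_{\min }(A_1) - n \log (1+10^{-4}) \approx n \left( 0.99\, H_{\min }(A_1) - 1.5 \times 10^{-4} \right) , \end{aligned}$$

so the entropy rate is essentially unaffected, at the cost of a fairly large smoothing parameter.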

The right-hand side above can be improved to obtain the Shannon entropy H instead of the min-entropy \(H_{\min }\). However, we will not pursue this here, since this bound is sufficient for the purpose of our discussion.

Although we are unable to generalise the above classical argument to the quantum case, it provides a great deal of insight into the approximately independent registers problem. Two important examples of distributions that satisfy the approximate independence conditions above were mentioned in Footnotes 2 and 3 earlier. To create the first distribution, we flip a biased coin B, which is 0 with probability \(\epsilon \) and 1 otherwise. If \(B=0\), then \(A_1^n\) is set to the constant all-zero string; otherwise, it is sampled randomly and independently. For this distribution, once the bad event (\(B=0\)) is removed, the new distribution has a high min-entropy. On the other hand, for the second distribution, \(Q_{A_1^{2n}B_1^{2n}}\), the random bits \(B_i\) are chosen independently, with each being equal to 0 with probability \(\epsilon \) and 1 otherwise. If the bit \(B_i\) is 0, then \(A_i\) is set equal to \(A_{i-1}\); otherwise, it is sampled independently. In this case, there is no small-probability event (small as a function of \(\epsilon \)) that one can simply remove so that the distribution becomes i.i.d. However, we expect that with high probability the number of indices with \(B_i = 0\) is close to \(2n \epsilon \). Given that the distribution samples all the other \(A_i\) independently, the smooth min-entropy of the distribution should be close to \(2n(1-\epsilon )H(A_1)\). The above argument shows that any distribution satisfying the approximate independence conditions in Eq. 28 can be handled by combining the methods used for these two example distributions, that is, deleting the bad part of the distribution and recognising that the probability of every element in the rest of the space behaves independently on average.

The above classical argument is difficult to generalise to quantum states, primarily because the quantum equivalents of Lemma 4.1 are not as nice and simple. Furthermore, quantum conditional probabilities themselves are also difficult to use. Fortunately, the substate theorem serves as the perfect tool for developing a smooth max-relative entropy bound, which we can then use with the min-entropy triangle inequality. The quantum substate theorem [10, 24] provides an upper bound on the smooth max-relative entropy \(D^{\epsilon }_{\max }(\rho || \sigma )\) between two states in terms of their relative entropy \(D(\rho || \sigma )\).

Theorem 4.2

(Quantum substate theorem [24]). Let \(\rho \) and \(\sigma \) be two states on the same Hilbert space. Then for any \(\epsilon \in (0,1)\), we have

$$\begin{aligned} D^{\sqrt{\epsilon }}_{\max }(\rho || \sigma ) \le \frac{D(\rho || \sigma )+1}{\epsilon } + \log \frac{1}{1-\epsilon }. \end{aligned}$$
(30)
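For instance, with \(D(\rho || \sigma ) = 1\) and \(\epsilon = 0.01\) (illustrative numbers), the theorem gives

$$\begin{aligned} D^{0.1}_{\max }(\rho || \sigma ) \le \frac{1+1}{0.01} + \log \frac{1}{0.99} \approx 200. \end{aligned}$$

This trade-off, a modest smoothing parameter \(\sqrt{\epsilon }\) against a \(\frac{1}{\epsilon }\) blow-up of the relative entropy bound, is what produces the \(n\sqrt{\epsilon } + \frac{1}{\sqrt{\epsilon }}\) terms in Eq. 38 below.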

In this section, we will also frequently use the multipartite mutual information [25,26,27]. For a state \(\rho _{X_1^n}\), the multipartite mutual information between the registers \((X_1, X_2, \ldots , X_n)\) is defined as

$$\begin{aligned} I(X_1: X_2: \cdots : X_n)_\rho := D(\rho _{X_1^n}|| \rho _{X_1} {{\,\mathrm{\otimes }\,}}\rho _{X_2} {{\,\mathrm{\otimes }\,}}\cdots {{\,\mathrm{\otimes }\,}}\rho _{X_n}). \end{aligned}$$
(31)

In other words, it is the relative entropy between \(\rho _{X_1^n}\) and \(\rho _{X_1} {{\,\mathrm{\otimes }\,}}\rho _{X_2} {{\,\mathrm{\otimes }\,}}\cdots {{\,\mathrm{\otimes }\,}}\rho _{X_n}\). It can easily be shown that the multipartite mutual information satisfies the following identities:

$$\begin{aligned} I(X_1: X_2: \cdots : X_n)_\rho&= \sum _{k=1}^n H(X_k)_\rho - H(X_1 \cdots \ X_n)_\rho \end{aligned}$$
(32)
$$\begin{aligned}&= \sum _{k=2}^n I(X_k : X_1^{k-1})_{\rho }. \end{aligned}$$
(33)
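Both identities can be verified directly (a short derivation for completeness): expanding the relative entropy in Eq. 31 against the product state gives Eq. 32, and Eq. 33 then follows by telescoping,

$$\begin{aligned} \sum _{k=2}^n I(X_k : X_1^{k-1})_{\rho } = \sum _{k=2}^n \left( H(X_k)_{\rho } + H(X_1^{k-1})_{\rho } - H(X_1^k)_{\rho } \right) = \sum _{k=1}^n H(X_k)_{\rho } - H(X_1^n)_{\rho }. \end{aligned}$$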

Going back to proving a bound for the quantum approximately independent registers problem, note that using the Alicki–Fannes–Winter (AFW) bound [28, 29] for mutual information [30, Theorem 11.10.4], Eq. 28 implies that for every \(k \in [n]\)

$$\begin{aligned} I(A_k : A_1^{k-1}B)_\rho \le \epsilon \log |A| + g_2\left( \frac{\epsilon }{2} \right) \end{aligned}$$
(34)

where \(g_2(x):= (x+1) \log (x+1) - x \log (x)\). With this in mind, we can now focus our efforts on proving the following theorem.

Theorem 4.3

Let registers \(A_k\) have dimension |A| for all \(k \in [n]\). Suppose a quantum state \(\rho _{A_1^n B}\) is such that for every \(k \in [n],\) we have

$$\begin{aligned} I(A_k : A_1^{k-1}B)_{\rho } \le \epsilon \end{aligned}$$
(35)

for some \(0< \epsilon < 1\). Then, we have that

$$\begin{aligned} H_{\min }^{\epsilon ^{\frac{1}{4}}+\epsilon }(A_1^n|B)_{\rho }&\ge \sum _{k=1}^n H (A_k)_{\rho } - 3 n \epsilon ^{\frac{1}{4}} \log (1+2|A|) \nonumber \\&\quad - \frac{2 \log (1+2|A|)}{\epsilon ^{3/4}} - \frac{2 \log (1+2|A|)}{\epsilon ^{1/4}} \left( \log \frac{1}{1- \sqrt{\epsilon }} + g_1(\epsilon , \epsilon ^{\frac{1}{4}}) \right) \end{aligned}$$
(36)

where \(g_1(x, y):= - \log (1- \sqrt{1-x^2}) - \log (1-y^2)\). In particular, when all the states \(\rho _{A_k}\) are identical, we have

$$\begin{aligned} H_{\min }^{\epsilon ^{\frac{1}{4}}+\epsilon }(A_1^n|B)_{\rho }&\ge n \left( H (A_1)_{\rho } - 3 \epsilon ^{\frac{1}{4}} \log (1+2|A|) \right) \nonumber \\&\quad - \frac{2 \log (1+2|A|)}{\epsilon ^{3/4}} - \frac{2 \log (1+2|A|)}{\epsilon ^{1/4}} \left( \log \frac{1}{1- \sqrt{\epsilon }} + g_1(\epsilon , \epsilon ^{\frac{1}{4}}) \right) . \end{aligned}$$
(37)

Proof

First note that we have,

$$\begin{aligned} I(A_1: A_2: \cdots : A_n: B)&= D(\rho _{A_1^n B} || \bigotimes _{k=1}^n \rho _{A_k} \otimes \rho _B) \\&= \sum _{k=1}^n I(A_k : A_1^{k-1}B) \\&\le n \epsilon . \end{aligned}$$

Using the substate theorem, we now have

$$\begin{aligned} D_{\max }^{\epsilon ^{\frac{1}{4}}}\left( \rho _{A_1^n B} \Bigg \Vert \bigotimes _{k=1}^n \rho _{A_k} \otimes \rho _B \right)&\le \frac{D(\rho _{A_1^n B} || \bigotimes _{k=1}^n \rho _{A_k} \otimes \rho _B) + 1}{\sqrt{\epsilon }} - \log (1- \sqrt{\epsilon }) \nonumber \\&\le n \sqrt{\epsilon } + \frac{1}{\sqrt{\epsilon }} - \log (1- \sqrt{\epsilon }). \end{aligned}$$
(38)

We now define the auxiliary state \(\eta _{A_1^n B}:= \bigotimes _{k=1}^n \rho _{A_k} \otimes \rho _B\). Using Lemma 3.5, for \(\alpha \in (1,2)\), we can transform the smooth min-entropy into an \(\alpha \)-Rényi entropy on the auxiliary product state \(\eta _{A_1^n B}\) as follows:

$$\begin{aligned}&H_{\min }^{\epsilon ^{\frac{1}{4}}+\epsilon }(A_1^n|B)_{\rho } \\&\quad \ge \tilde{H}^{\uparrow }_{\alpha }(A_1^n|B)_{\eta } - \frac{\alpha }{\alpha -1} D^{\epsilon ^{\frac{1}{4}}}_{\max } (\rho _{A_1^n B}|| \eta _{A_1^n B}) - \frac{g_1(\epsilon , \epsilon ^{\frac{1}{4}})}{\alpha -1} \\&\quad = \sum _{k=1}^n \tilde{H}^{\uparrow }_{\alpha }(A_k)_{\rho } - \frac{\alpha }{\alpha -1} D^{\epsilon ^{\frac{1}{4}}}_{\max } (\rho _{A_1^n B}|| \eta _{A_1^n B}) - \frac{g_1(\epsilon , \epsilon ^{\frac{1}{4}})}{\alpha -1} \\&\quad \ge \sum _{k=1}^n H (A_k)_{\rho } - n(\alpha -1) \log ^2(1+2|A|)- \frac{\alpha }{\alpha -1} D^{\epsilon ^{\frac{1}{4}}}_{\max } (\rho _{A_1^n B}|| \eta _{A_1^n B}) - \frac{g_1(\epsilon , \epsilon ^{\frac{1}{4}})}{\alpha -1} \\&\quad \ge \sum _{k=1}^n H (A_k)_{\rho } - n(\alpha -1) \log ^2(1+2|A|)- \frac{\alpha }{\alpha -1} n \sqrt{\epsilon }\\&\quad - \frac{\alpha }{\alpha -1} \frac{1}{\sqrt{\epsilon }} - \frac{\alpha }{\alpha -1} \log \frac{1}{1- \sqrt{\epsilon }}- \frac{g_1(\epsilon , \epsilon ^{\frac{1}{4}})}{\alpha -1}. \end{aligned}$$

In the third line above, we have used [8, Lemma B.9] (which is an improvement of [21, Lemma 8]); this is valid as long as \(\alpha <1+ \frac{1}{\log (1+ 2|A|)}\). We select \(\alpha = 1 + \frac{\epsilon ^{1/4}}{\log (1+ 2|A|)}\), for which the above condition on \(\alpha \) is satisfied; this gives us

$$\begin{aligned} H_{\min }^{\epsilon ^{\frac{1}{4}}+\epsilon }(A_1^n|B)_{\rho }&\ge \sum _{k=1}^n H (A_k)_{\rho } - 3 n \epsilon ^{\frac{1}{4}} \log (1+2|A|) - \frac{2 \log (1+2|A|)}{\epsilon ^{3/4}} \\&\quad - \frac{2 \log (1+2|A|)}{\epsilon ^{1/4}} \left( \log \frac{1}{1- \sqrt{\epsilon }} + g_1(\epsilon , \epsilon ^{\frac{1}{4}}) \right) . \end{aligned}$$
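To verify the arithmetic of this last step: with \(\alpha - 1 = \frac{\epsilon ^{1/4}}{\log (1+2|A|)}\), we have \(\frac{\alpha }{\alpha -1} \le \frac{2}{\alpha -1} = \frac{2\log (1+2|A|)}{\epsilon ^{1/4}}\), so

$$\begin{aligned} n(\alpha -1) \log ^2(1+2|A|) = n \epsilon ^{\frac{1}{4}} \log (1+2|A|), \qquad \frac{\alpha }{\alpha -1} n \sqrt{\epsilon } \le 2 n \epsilon ^{\frac{1}{4}} \log (1+2|A|), \qquad \frac{\alpha }{\alpha -1} \frac{1}{\sqrt{\epsilon }} \le \frac{2 \log (1+2|A|)}{\epsilon ^{3/4}}, \end{aligned}$$

and the remaining two terms are together bounded by \(\frac{2 \log (1+2|A|)}{\epsilon ^{1/4}} \left( \log \frac{1}{1- \sqrt{\epsilon }} + g_1(\epsilon , \epsilon ^{\frac{1}{4}}) \right) \).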

\(\square \)

We can now plug in the bound from Eq. 34 to derive the following corollary.

Corollary 4.4

Let registers \(A_k\) have dimension |A| for all \(k \in [n]\). Suppose a quantum state \(\rho _{A_1^n B}\) is such that for every \(k \in [n],\) we have

$$\begin{aligned} \left\| \rho _{A_1^k B} - \rho _{A_k} \otimes \rho _{A_1^{k-1} B} \right\| _1 \le \epsilon . \end{aligned}$$
(39)

Then, we have that for \(\delta = \epsilon \log |A| + g_2\left( \frac{\epsilon }{2} \right) \) such that \(0< \delta <1\),

$$\begin{aligned} H_{\min }^{\delta ^{\frac{1}{4}}+\delta }(A_1^n|B)_{\rho }&\ge \sum _{k=1}^n H (A_k)_{\rho } - 3 n \delta ^{\frac{1}{4}} \log (1+2|A|) \nonumber \\&\quad - \frac{2 \log (1+2|A|)}{\delta ^{3/4}} - \frac{2 \log (1+2|A|)}{\delta ^{1/4}} \left( \log \frac{1}{1- \sqrt{\delta }} + g_1(\delta , \delta ^{\frac{1}{4}}) \right) \end{aligned}$$
(40)

where \(g_1(x, y) = - \log (1- \sqrt{1-x^2}) - \log (1-y^2)\) and \(g_2(x) = (x+1)\log (x+1) - x \log (x)\). In particular, when all the states \(\rho _{A_k}\) are identical, we have

$$\begin{aligned} H_{\min }^{\delta ^{\frac{1}{4}}+\delta }(A_1^n|B)_{\rho }&\ge n \left( H (A_1)_{\rho } - 3 \delta ^{\frac{1}{4}} \log (1+2|A|) \right) \nonumber \\&\quad - \frac{2 \log (1+2|A|)}{\delta ^{3/4}} - \frac{2 \log (1+2|A|)}{\delta ^{1/4}} \left( \log \frac{1}{1- \sqrt{\delta }} + g_1(\delta , \delta ^{\frac{1}{4}}) \right) . \end{aligned}$$
(41)

4.1 Weak approximate asymptotic equipartition

We can modify the proof of Theorem 4.3 to prove a weak approximate asymptotic equipartition property (AEP).

Theorem 4.5

Let registers \(A_k\) have dimension |A| for all \(k \in [n]\) and the registers \(B_k\) have dimension |B| for all \(k \in [n]\). Suppose a quantum state \(\rho _{A_1^n B_1^n E}\) is such that for every \(k \in [n],\) we have

$$\begin{aligned} \left\| \rho _{A_1^k B_1^k E} - \rho _{A_k B_k} \otimes \rho _{A_1^{k-1} B_1^{k-1} E} \right\| _1 \le \epsilon . \end{aligned}$$
(42)

Then, we have that for \(\delta = \epsilon \log \left( |A||B| \right) + g_2\left( \frac{\epsilon }{2} \right) \) such that \(0< \delta <1\),

$$\begin{aligned} H_{\min }^{\delta ^{\frac{1}{4}}+\delta }(A_1^n|B_1^n E)_{\rho }&\ge \sum _{k=1}^n H (A_k|B_k)_{\rho } - 3 n \delta ^{\frac{1}{4}} \log (1+2|A|) \nonumber \\&\quad - \frac{2 \log (1+2|A|)}{\delta ^{3/4}} - \frac{2 \log (1+2|A|)}{\delta ^{1/4}} \left( \log \frac{1}{1- \sqrt{\delta }} + g_1(\delta , \delta ^{\frac{1}{4}}) \right) \end{aligned}$$
(43)

where \(g_1(x, y) = - \log (1- \sqrt{1-x^2}) - \log (1-y^2)\) and \(g_2(x) = (x+1)\log (x+1) - x \log (x)\). In particular, when all the states \(\rho _{A_k B_k}\) are identical, we have

$$\begin{aligned} H_{\min }^{\delta ^{\frac{1}{4}}+\delta }(A_1^n|B_1^n E)_{\rho }&\ge n \left( H (A_1|B_1)_{\rho } - 3 \delta ^{\frac{1}{4}} \log (1+2|A|) \right) \nonumber \\&\quad - \frac{2 \log (1+2|A|)}{\delta ^{3/4}} - \frac{2 \log (1+2|A|)}{\delta ^{1/4}} \left( \log \frac{1}{1- \sqrt{\delta }} + g_1(\delta , \delta ^{\frac{1}{4}}) \right) . \end{aligned}$$
(44)

Proof

To prove this, we use the auxiliary state \(\eta _{A_1^n B_1^n E}:= \bigotimes _{k=1}^n \rho _{A_k B_k} \otimes \rho _E\). Then, we have

$$\begin{aligned} D(\rho _{A_1^n B_1^n E} || \eta _{A_1^n B_1^n E})&= I(A_1 B_1: A_2 B_2: \cdots : A_n B_n: E)_{\rho } \\&= \sum _{k=1}^n I(A_k B_k: A_1^{k-1} B_1^{k-1} E)_{\rho } \\&\le n\left( \epsilon \log \left( |A||B| \right) + g_2\left( \frac{\epsilon }{2} \right) \right) = n \delta \end{aligned}$$

where we used the AFW bound for mutual information in the last line [30, Theorem 11.10.4]. The rest of the proof follows that of Theorem 4.3, the only difference being that we now have \(\tilde{H}^{\uparrow }_{\alpha }(A_1^n | B_1^n E)_{\eta } = \sum _{k=1}^n \tilde{H}^{\uparrow }_{\alpha }(A_k | B_k)_{\rho }\). \(\square \)

We call this generalisation weak because the smoothing term (\(\delta \)) depends on the size of the side information |B|. In Appendix E, we show that under the assumptions of the theorem, some bound on the dimension of the registers B is necessary; otherwise, one cannot have a non-trivial bound on the smooth min-entropy.

4.2 Simple security proof for sequential device-independent quantum key distribution

The approximately independent registers scenario and the associated min-entropy lower bound can be used to provide simple “proof of concept” security proofs for cryptographic protocols. In this section, we briefly sketch such a proof for sequential device-independent quantum key distribution (DIQKD) to demonstrate this idea. The protocol for sequential DIQKD used in [31] is presented as Protocol 1.

Protocol 1: Sequential DIQKD protocol

Parameters:

\(\bullet \) \(\omega _{\text {exp}}\) is the expected winning probability for the honest implementation of the device

\(\bullet \) \(n \ge 1\) is the number of rounds in the protocol

\(\bullet \) \(\gamma \in (0,1]\) is the fraction of test rounds

Protocol:

1. For every \(1 \le i \le n\), perform the following steps:

   (a) Alice chooses a random \(T_i \in \{ 0,1\}\) with \(\Pr [T_i =1] =\gamma \).

   (b) Alice sends \(T_i\) to Bob.

   (c) If \(T_i = 0\), Alice and Bob set the questions \((X_i,Y_i) = (0,2)\); otherwise, they sample \((X_i,Y_i)\) uniformly at random from \(\{ 0,1\}^2\).

   (d) Alice and Bob use their device with the questions \((X_i,Y_i)\) and obtain the outputs \(A_i, B_i\).

2. Alice announces her questions \(X_1^n\) to Bob.

3. Error correction: Alice and Bob use an error correction procedure, which lets Bob obtain the raw key \(\tilde{A}_1^n\) (if the error correction protocol succeeds, then \(A_1^n = \tilde{A}_1^n\)). In case the error correction protocol aborts, they abort the QKD protocol too.

4. Parameter Estimation: Bob uses \(B_1^n\) and \(\tilde{A}_1^n\) to compute the average winning probability \(\omega _{\text {avg}}\) on the test rounds. He aborts if \(\omega _{\text {avg}} < \omega _{\text {exp}}\).

5. Privacy Amplification: Alice and Bob use a privacy amplification protocol to create a secret key K from \(A_1^n\) (using \(\tilde{A}_1^n\) for Bob).


We consider a simple model for DIQKD, where Eve (the adversary) distributes a state \(\rho ^{(0)}_{E_A E_B E}\) between Alice and Bob at the beginning of the protocol. Alice and Bob then use their states sequentially as given in Protocol 1. The kth round of the protocol produces the questions \(X_k, Y_k\) and \(T_k\), the answers \(A_k\) and \( B_k\) and transforms the shared state from \(\rho ^{(k-1)}_{E_A E_B E}\) to \(\rho ^{(k)}_{E_A E_B E}\).

Given the questions and answers of the previous rounds, the state shared between Alice and Bob and their devices in each round can be viewed as a device for playing the CHSH game. Suppose that in the kth round, the random variables produced in the previous \(k-1\) rounds are \(r_{k-1}:= x_1^{k-1}, y_1^{k-1}, t_1^{k-1}, a_1^{k-1}, b_1^{k-1}\) and that the state shared between Alice and Bob is \(\rho ^{(k-1)}_{E_A E_B E| r_{k-1}}\). We can then define \(\Pr [W_k |r_{k-1}]\) to be the winning probability of the CHSH game played by Alice and Bob using this state and their devices in the kth round. Note that Alice's device cannot distinguish whether the CHSH game is played in a round or whether the round is used for key generation. We can further average over the random variables of the previous rounds to derive the probability of winning the kth game:

$$\begin{aligned} \Pr [W_k] = {{\,\mathrm{\mathbb {E}}\,}}_{r_{k-1}}\left[ \Pr [W_k |r_{k-1}]\right] . \end{aligned}$$
(45)

Alice and Bob randomly sample a subset of the rounds (using the random variable \(T_k\)) and play the CHSH game on this subset. If the average winning probability of the CHSH game on this subset is small, they abort the protocol. For simplicity and brevity, we will assume here that the state \(\rho ^{(0)}_{E_A E_B E}\) distributed between Alice and Bob at the start of the protocol by Eve has an average winning probability of at least \(\omega _{{\text {exp}}} - \delta \), that is,

$$\begin{aligned} \frac{1}{n}\sum _{k=1}^n \Pr [W_k] \ge \omega _{{\text {exp}}} - \delta \end{aligned}$$
(46)

for some small \(\delta >0\). Using standard sampling arguments it can be argued that either this is true or the protocol aborts with high probability.

For any shared state \(\sigma _{E_A E_B E}\) (where \(E_A\) is held by Alice, \(E_B\) by Bob, and E by the adversary) and local measurement devices, if Alice and Bob win the CHSH game with probability \(\omega \in (\frac{3}{4}, \frac{2+\sqrt{2}}{4}]\), then Alice's answer A to the game is random given the questions X, Y and the register E held by the adversary. This is quantified by the following entropic bound [32] (see [31, Lemma 5.3] for the form below):

$$\begin{aligned} H(A| XY E) \ge f(\omega ) = {\left\{ \begin{array}{ll} 1 - h\left( \frac{1}{2} + \frac{1}{2}\sqrt{16 \omega (\omega -1) +3} \right) &{} \text {if } \omega \in [\frac{3}{4}, \frac{2+\sqrt{2}}{4}] \\ 0 &{} \text {if } \omega \in [0, \frac{3}{4}) \end{array}\right. } \end{aligned}$$
(47)

where \(h(\cdot )\) is the binary entropy function. The function f is convex on the interval \(\left[ 0, \frac{2+\sqrt{2}}{4}\right] \). We plot it on the interval \([\frac{3}{4}, \frac{2+\sqrt{2}}{4}]\) in Fig. 1.

Fig. 1: The lower bound in Eq. 47 on the interval \([\frac{3}{4}, \frac{2+\sqrt{2}}{4}]\)
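For a sense of scale (values computed directly from Eq. 47): at the quantum optimum \(\omega = \frac{2+\sqrt{2}}{4} \approx 0.854\) the bound reaches \(f(\omega ) = 1\), while slightly below it, at \(\omega = 0.85\),

$$\begin{aligned} f(0.85) = 1 - h\left( \frac{1}{2} + \frac{1}{2}\sqrt{16 \cdot 0.85 \cdot (0.85-1) + 3} \right) \approx 1 - h(0.99) \approx 0.92, \end{aligned}$$

so even a modest gap below the optimal winning probability already costs a noticeable fraction of the certified entropy.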

For \(\epsilon >0\), we choose the parameter \(\omega _{\text {exp}} \in [\frac{3}{4} + \delta , \frac{2+\sqrt{2}}{4}]\) to be large enough so that

$$\begin{aligned} 1- f(\omega _{\text {exp}} - \delta ) = h\left( \frac{1}{2} + \frac{1}{2}\sqrt{16 (\omega _{\text {exp}}-\delta ) (\omega _{\text {exp}} - \delta -1) +3} \right) \le \epsilon ^4. \end{aligned}$$
(48)

We will now use Eq. 47 to bound the von Neumann entropy of the answers given Eve’s information for the sequential DIQKD protocol. We have

$$\begin{aligned} H(A_1^n | X_1^n Y_1^n T_1^n E)&= \sum _{k=1}^n H(A_k | A_1^{k-1} X_1^n Y_1^n T_1^n E) \\&\overset{(1)}{=}\ \sum _{k=1}^n H(A_k | A_1^{k-1} X_1^k Y_1^k T_1^k E) \\&\overset{(2)}{=}\ \sum _{k=1}^n H(A_k | X_k Y_k R_{k-1} E) \\&= \sum _{k=1}^n {{\,\mathrm{\mathbb {E}}\,}}_{r_{k-1} \sim R_{k-1}} \left[ H(A_k | X_k Y_k E)_{\rho ^{(k)}_{|r_{k-1}}}\right] \\&\overset{(3)}{\ge }\ \sum _{k=1}^n {{\,\mathrm{\mathbb {E}}\,}}_{r_{k-1} \sim R_{k-1}} \left[ f\left( \Pr [W_k | r_{k-1}] \right) \right] \\&\ge \sum _{k=1}^n f\left( \Pr [W_k] \right) \\&\ge n f\left( \frac{1}{n} \sum _{k=1}^n \Pr [W_k] \right) \\&\ge nf(\omega _{\text {exp}}-\delta ) \ge n(1- \epsilon ^4) \end{aligned}$$

where in (1) we have used the fact that the questions sampled in the rounds after the kth round are independent of the random variables of the previous rounds, in (2) we use the fact that Alice's answers are independent of the random variable \(T_k\) given the question \(X_k\), and we also group the random variables generated in the previous rounds into the random variable \(R_{k-1}:= A_1^{k-1} B_1^{k-1} X_1^{k-1} Y_1^{k-1} T_1^{k-1}\), in (3) we use the bound in Eq. 47, and in the next two steps we use the convexity of f. If instead of the von Neumann entropy on the left-hand side above we had the smooth min-entropy, then the bound above would be sufficient to prove the security of DIQKD. However, this argument cannot be easily generalised to the smooth min-entropy, because a chain rule like the one used in the first step does not exist for the smooth min-entropy (entropy accumulation [8, 9] generalises exactly such an argument). We can instead use the argument developed for the approximately independent registers case to transform this von Neumann entropy bound into a smooth min-entropy bound.

This bound results in the following bound on the multipartite mutual information

$$\begin{aligned} I(A_1: \cdots : A_n: X_1^n Y_1^n T_1^n E)&= \sum _{k=1}^n H(A_k) + H(X_1^n Y_1^n T_1^n E) - H(A_1^n X_1^n Y_1^n T_1^n E) \\&= \sum _{k=1}^n H(A_k) - H(A_1^n | X_1^n Y_1^n T_1^n E) \\&\le n - n(1-\epsilon ^4) = n \epsilon ^4 \end{aligned}$$

where we have used the dimension bound \(H(A_k) \le 1\) for every \(k \in [n]\). This is the same as the multipartite mutual information bound we derived while analysing approximately independent registers in Theorem 4.3 (with \(\epsilon ^4\) playing the role of \(\epsilon \)), so we can simply reuse the smooth min-entropy bound derived there. This gives us the bound

$$\begin{aligned} H^{2\epsilon }_{\min }(A_1^n | X_1^n Y_1^n T_1^n E)&\ge \sum _{k=1}^n H(A_k) - 3n \epsilon \log 5 - O\left( \frac{1}{\epsilon ^3} \right) \nonumber \\&=n(1- 3\epsilon \log 5) - O\left( \frac{1}{\epsilon ^3} \right) \end{aligned}$$
(49)

where we have used the fact that the answers \(A_k\) can always be assumed to be uniformly distributed [31, 32]. For every \(\epsilon >0\), we can choose a sufficiently large n so that this bound is large and positive.

We note that this method is only able to provide “proof of concept” or existence-type security proofs. This proof method couples the value of the security parameter for privacy amplification \(\epsilon \) with the average winning probability, which is not desirable. The parameter \(\epsilon \) is chosen according to the security requirements of the protocol and is typically very small. For such values of \(\epsilon \), the average winning probability of the protocol would have to be extremely close to the maximum, and we cannot realistically expect practical implementations to achieve such high winning probabilities. However, we do expect that this method will make it easier to create “proof of concept” type proofs for new cryptographic protocols in the future.

5 Approximate Entropy Accumulation

Fig. 2: The setting for entropy accumulation and Theorem 5.1. For \(k \in [n]\), the channels \({{\,\mathrm{\mathcal {M}}\,}}_k\) are repeatedly applied to the registers \(R_{k-1}\) to produce the “secret” information \(A_k\) and the side information \(B_k\)

In general, it is very difficult to estimate the smooth min-entropy of states produced during cryptographic protocols. The entropy accumulation theorem (EAT) [8] provides a tight and simple lower bound for the smooth min-entropy \(H^{\epsilon }_{\min }(A_1^n | B_1^n E)_{\rho }\) of sequential processes, under certain Markov chain conditions. The state \(\rho _{A_1^n B_1^n E}\) in the setting for EAT is produced by a sequential process of the form shown in Fig. 2. The process begins with the registers \(R_0\) and E. In the context of a cryptographic protocol, the register \(R_0\) is usually held by the honest parties, whereas the register E is held by the adversary. Then, in each round \(k \in [n]\) of the process, a channel \({{\,\mathrm{\mathcal {M}}\,}}_{k}: R_{k-1} \rightarrow A_k B_k R_k\) is applied on the register \(R_{k-1}\) to produce the registers \(A_k, B_k\) and \(R_k\). The registers \(A_1^n\) usually contain a partially secret raw key and the registers \(B_1^n\) contain the side information about \(A_1^n\) revealed to the adversary during the protocol. EAT requires that for every \(k \in [n]\), the side information \(B_k\) satisfies the Markov chain \(A_1^{k-1} \leftrightarrow B_1^{k-1}E \leftrightarrow B_k\), that is, the side information revealed in the kth round does not reveal anything more about the secret registers of the previous rounds than was already known to the adversary through \(B_1^{k-1}E\). Under this assumption, EAT provides the following lower bound for the smooth min-entropy

$$\begin{aligned} H^{\epsilon }_{\min }(A_1^n | B_1^n E)_{\rho } \ge \sum _{k=1}^n \inf _{\omega _{R_{k-1} \tilde{R}}} H(A_k | B_k \tilde{R})_{{{\,\mathrm{\mathcal {M}}\,}}_k (\omega )} - c\sqrt{n} \end{aligned}$$
(50)

where the infimum is taken over all input states to the channels \({{\,\mathrm{\mathcal {M}}\,}}_k\) and \(c>0\) is a constant depending only on |A| (size of registers \(A_k\)) and \(\epsilon \). We will state and prove an approximate version of EAT. Consider the sequential process in Fig. 2 again. Now, suppose that the channels \({{\,\mathrm{\mathcal {M}}\,}}_k\) do not necessarily satisfy the Markov chain conditions mentioned above, but each channel \({{\,\mathrm{\mathcal {M}}\,}}_k\) can be \(\epsilon \)-approximated by a channel \({{\,\mathrm{\mathcal {M}}\,}}'_k\) which satisfies the Markov chain \(A_1^{k-1} \leftrightarrow B_1^{k-1}E \leftrightarrow B_k\) for a certain collection of inputs. The approximate entropy accumulation theorem below provides a lower bound on the smooth min-entropy in such a setting. The proof of this theorem again uses the technique based on the smooth min-entropy triangle inequality developed in the previous section. In this setting too, we have a chain of approximations. For each \(k \in [n]\), we have

$$\begin{aligned} \rho _{A_1^k B_1^k E}&= {{\,\textrm{tr}\,}}_{R_k} \circ {{\,\mathrm{\mathcal {M}}\,}}_k \left( {{\,\mathrm{\mathcal {M}}\,}}_{k-1} \circ \cdots \circ {{\,\mathrm{\mathcal {M}}\,}}_1(\rho ^{(0)}_{R_0 E}) \right) \\&\approx _{\epsilon } {{\,\textrm{tr}\,}}_{R_k} \circ {{\,\mathrm{\mathcal {M}}\,}}'_k \left( {{\,\mathrm{\mathcal {M}}\,}}_{k-1} \circ \cdots \circ {{\,\mathrm{\mathcal {M}}\,}}_1(\rho ^{(0)}_{R_0 E}) \right) := \sigma ^{(k)}_{A_1^k B_1^k E}. \end{aligned}$$

According to the Markov chain assumption for the channels \({{\,\mathrm{\mathcal {M}}\,}}'_k\), the state \(\sigma ^{(k)}_{A_1^k B_1^k E}\) satisfies the Markov chain \(A_1^{k-1} \leftrightarrow B_1^{k-1}E \leftrightarrow B_k\). Therefore, we expect that the register \(A_k\) adds some entropy to the smooth min-entropy \(H^{\epsilon '}_{\min }(A_1^n | B_1^n E)_{\rho }\) and that the information leaked through \(B_1^n\) is not too large. We show that this is indeed the case in the approximate entropy accumulation theorem.

The approximate entropy accumulation theorem can be used to analyse and prove the security of cryptographic protocols under certain imperfections. The entropy accumulation theorem itself is used to prove the security of sequential device independent quantum key distribution (DIQKD) protocols [12]. In these protocols, the side information \(B_k\) produced during each round consists of the questions used in that round to play a non-local game, like the CHSH game. In the ideal case, these questions are sampled independently of everything which came before. As an example of an imperfection, we can imagine that some physical effect between the memory storing the secret bits \(A_1^{k-1}\) and the device producing the questions may lead to a small correlation between the side information produced during the kth round and the secret bits \(A_1^{k-1}\) (also see [13, 14]). The approximate entropy accumulation theorem below can be used to prove security of DIQKD under such imperfections. We do not, however, pursue this example here and leave applications of this theorem for future work. In Sect. 5.4, we modify this theorem to incorporate testing for EAT.
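As a toy illustration of such an imperfection (purely hypothetical, and classical for simplicity), the sketch below models a question channel that with probability \(\epsilon \) outputs the parity of the secret bits generated so far instead of a fresh uniform question; this channel is \(\epsilon \)-close, in the sense of Eq. 52, to the ideal channel that always samples the question freshly.

```python
import random

def round_k(past_parity, eps):
    """One round: with probability eps the question B_k leaks the parity of
    the answers A_1^{k-1} (the imperfect channel M_k); with probability
    1 - eps it is sampled freshly, as the ideal channel M'_k always would."""
    question = past_parity if random.random() < eps else random.randint(0, 1)
    answer = random.randint(0, 1)       # toy stand-in for the device's answer
    return answer, question

random.seed(0)
n, eps = 10**5, 0.05
parity, agree = 0, 0
for _ in range(n):
    answer, question = round_k(parity, eps)
    agree += (question == parity)
    parity ^= answer                    # running parity of A_1^k
# an ideal channel (eps = 0) would give 1/2; the leak shifts it to 1/2 + eps/2
print(f"Pr[B_k = parity(A_1^(k-1))] ~ {agree / n:.3f}, expect ~{0.5 + eps / 2}")
```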

Theorem 5.1

For \(k \in [n]\), let the registers \(A_k\) and \(B_k\) be such that \(|A_k| = |A|\) and \(|B_k| = |B|\). For \(k \in [n]\), let \({{\,\mathrm{\mathcal {M}}\,}}_k\) be channels from \(R_{k-1} \rightarrow R_{k} A_k B_k\) and

$$\begin{aligned} \rho _{A_1^n B_1^n E} = {{\,\textrm{tr}\,}}_{R_n} \circ {{\,\mathrm{\mathcal {M}}\,}}_n \circ \cdots \circ {{\,\mathrm{\mathcal {M}}\,}}_1 (\rho ^{(0)}_{R_0 E}) \end{aligned}$$
(51)

be the state produced by applying these maps sequentially. Suppose the channels \({{\,\mathrm{\mathcal {M}}\,}}_k\) are such that for every \(k \in [n]\), there exists a channel \({{\,\mathrm{\mathcal {M}}\,}}'_{k}\) from \(R_{k-1} \rightarrow R_{k} A_k B_k\) such that

  1.

    \({{\,\mathrm{\mathcal {M}}\,}}'_{k}\) \(\epsilon \)-approximates \({{\,\mathrm{\mathcal {M}}\,}}_{k}\) in the diamond norm:

    $$\begin{aligned} \frac{1}{2}\left\| {{\,\mathrm{\mathcal {M}}\,}}_k- {{\,\mathrm{\mathcal {M}}\,}}'_k \right\| _{\diamond } \le \epsilon \end{aligned}$$
    (52)
  2.

    For every choice of a sequence of channels \({{\,\mathrm{\mathcal {N}}\,}}_i \in \{ {{\,\mathrm{\mathcal {M}}\,}}_i, {{\,\mathrm{\mathcal {M}}\,}}'_i \}\) for \(i \in [k-1]\), the state \({{\,\mathrm{\mathcal {M}}\,}}'_k \circ {{\,\mathrm{\mathcal {N}}\,}}_{k-1} \circ \cdots \circ {{\,\mathrm{\mathcal {N}}\,}}_1 (\rho ^{(0)}_{R_0 E})\) satisfies the Markov chain

    $$\begin{aligned} A_1^{k-1} \leftrightarrow B_1^{k-1}E \leftrightarrow B_k. \end{aligned}$$
    (53)

Then, for \(0<\delta , \epsilon _1, \epsilon _2<1\) such that \(\epsilon _1 + \epsilon _2 <1\), \(\alpha \in \left( 1, 1+ \frac{1}{\log (1+2|A|)} \right) \) and \(\beta >1\), we have

$$\begin{aligned}&H_{\min }^{\epsilon _1+\epsilon _2}(A_1^n|B_1^n E)_{\rho } \ge \sum _{k=1}^n \inf _{\omega _{R_{k-1} \tilde{R}}} H(A_k | B_k \tilde{R})_{{{\,\mathrm{\mathcal {M}}\,}}'_k(\omega )} - n(\alpha -1) \log ^2(1+ 2 |A|) \nonumber \\&\quad - \frac{\alpha }{\alpha -1} n \log \left( 1 + \delta \left( 4^{\frac{\alpha -1}{\alpha }\log (|A||B|)}-1 \right) \right) \nonumber \\&\quad - \frac{\alpha }{\alpha -1} n z_{\beta }(\epsilon , \delta ) - \frac{1}{\alpha -1}\left( g_1(\epsilon _2, \epsilon _1)+ \frac{\alpha g_0(\epsilon _1)}{\beta -1} \right) . \end{aligned}$$
(54)

where

$$\begin{aligned} z_\beta (\epsilon , \delta ) := \frac{\beta +1}{\beta -1}\log \left( \left( 1+ \sqrt{(1-\delta )\epsilon } \right) ^{\frac{\beta }{\beta +1}} + \left( \frac{\sqrt{(1-\delta )\epsilon }}{\delta ^\beta } \right) ^{{\frac{1}{\beta +1}}} \right) \end{aligned}$$
(55)

and \(g_1(x,y) = - \log (1- \sqrt{1-x^2}) - \log (1-y^2)\) and the infimum in Eq. 54 is taken over all input states \(\omega _{R_{k-1} \tilde{R}}\) to the channels \({{\,\mathrm{\mathcal {M}}\,}}'_k\) where \(\tilde{R}\) is a reference register (\(\tilde{R}\) can be chosen isomorphic to \(R_{k-1}\)).

For the choice of \(\beta = 2\), \(\delta = \epsilon ^{\frac{1}{8}}\), we have

$$\begin{aligned} z_2(\epsilon , \delta ) \le 3 \log \left( \left( 1+\epsilon ^{\frac{1}{2}} \right) ^{\frac{2}{3}} + \epsilon ^{\frac{1}{12}} \right) . \end{aligned}$$

We also have that

$$\begin{aligned} \log \left( 1 + \delta 2^{\frac{\alpha -1}{\alpha } 2 \log (|A||B|)} \right) \le (|A||B|)^2\epsilon ^{\frac{1}{8}}. \end{aligned}$$

Finally, if we define \(\epsilon _r:= (|A||B|)^2 \epsilon ^{\frac{1}{8}} + 3 \log \left( \left( 1+\epsilon ^{\frac{1}{2}} \right) ^{\frac{2}{3}} + \epsilon ^{\frac{1}{12}} \right) \), and choose \(\alpha = 1 + \sqrt{\epsilon _r}\), we get the bound

$$\begin{aligned}&H_{\min }^{\epsilon _1+\epsilon _2}(A_1^n|B_1^n E)_{\rho } \ge \sum _{k=1}^n \inf _{\omega _{R_{k-1} \tilde{R}}} H(A_k | B_k \tilde{R})_{{{\,\mathrm{\mathcal {M}}\,}}'_k(\omega )} \nonumber \\&\quad - n\sqrt{\epsilon _r} (\log ^2(1+ 2 |A|) + 2) - \frac{1}{\sqrt{\epsilon _r}}\left( g_1(\epsilon _2, \epsilon _1)+ 2 g_0(\epsilon _1) \right) \end{aligned}$$
(56)

The entropy loss per round in the above bound behaves as \(\sim \epsilon ^{\frac{1}{24}}\). This dependence on \(\epsilon \) is very poor. In comparison, we can carry out a similar proof argument for classical probability distributions to get a dependence of \(O(\sqrt{\epsilon })\) (Theorem F.1). The exponent of \(\epsilon \) in our bound thus seems to be almost a factor of 12 off from the best possible bound. Roughly speaking, while carrying out the proof classically, we can bound the relevant channel divergences by \(O\left( \epsilon \right) \), whereas in Eq. 56, we were only able to bound the channel divergence by \(\sim \epsilon ^{1/12}\). This leads to the deterioration of performance we see here as compared to the classical case. We will discuss this further in Sect. 5.5.
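For concreteness, the following sketch evaluates \(\epsilon _r\) as defined above and the resulting per-round entropy loss from Eq. 56 for binary registers (\(|A| = |B| = 2\)); the sample values of \(\epsilon \) are arbitrary.

```python
import math

def eps_r(eps: float, dA: int = 2, dB: int = 2) -> float:
    """epsilon_r as defined before Eq. 56 (beta = 2, delta = eps^(1/8))."""
    z2 = 3 * math.log2((1 + math.sqrt(eps)) ** (2 / 3) + eps ** (1 / 12))
    return (dA * dB) ** 2 * eps ** (1 / 8) + z2

for eps in (1e-6, 1e-12, 1e-24):
    er = eps_r(eps)
    loss = math.sqrt(er) * (math.log2(1 + 2 * 2) ** 2 + 2)  # |A| = 2 in Eq. 56
    print(f"eps={eps:.0e}: eps_r={er:.3e}, per-round loss ~{loss:.2f} bits, "
          f"eps^(1/24)={eps ** (1 / 24):.3f}")
```

Even at \(\epsilon = 10^{-24}\) the per-round loss computed this way exceeds one bit, so for binary answers the bound only becomes non-trivial at astronomically small \(\epsilon \), in line with the discussion above.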

In order to prove this theorem, we will use a channel-divergence-based chain rule. A recently proven chain rule for the \(\alpha \)-Rényi relative entropy [15, Corollary 5.2] states that for \(\alpha >1\), states \(\rho _A\) and \(\sigma _A\), and channels \(\mathcal {E}_{A \rightarrow B}\) and \(\mathcal {F}_{A \rightarrow B}\), we have

$$\begin{aligned} \tilde{D}_{\alpha }(\mathcal {E}_{A \rightarrow B}(\rho _{A})|| \mathcal {F}_{A \rightarrow B}(\sigma _{A})) \le \tilde{D}_{\alpha }(\rho _{A}|| \sigma _{A}) + \tilde{D}_{\alpha }^{\text {reg}}(\mathcal {E}_{A \rightarrow B} || \mathcal {F}_{A \rightarrow B}) \end{aligned}$$
(57)

where \(\tilde{D}_{\alpha }^{\text {reg}}(\mathcal {E}_{A \rightarrow B} || \mathcal {F}_{A \rightarrow B}):= \lim _{n \rightarrow \infty } \frac{1}{n} \tilde{D}_{\alpha }(\mathcal {E}_{A \rightarrow B}^{\otimes n} || \mathcal {F}_{A \rightarrow B}^{\otimes n})\) and \(\tilde{D}_{\alpha }(\cdot || \cdot )\) is the channel divergence (see Eq. 14).

Now suppose we were guaranteed that, for the maps in Theorem 5.1 above, \(\tilde{D}_{\alpha }^{\text {reg}}({{\,\mathrm{\mathcal {M}}\,}}_k || {{\,\mathrm{\mathcal {M}}\,}}'_k) \le \epsilon \) for every k and some \(\alpha >1\). Then, we could use the chain rule in Eq. 57 as follows

$$\begin{aligned}&\tilde{D}_{\alpha }({{\,\mathrm{\mathcal {M}}\,}}_n \circ \cdots \circ {{\,\mathrm{\mathcal {M}}\,}}_1(\rho ^{(0)}_{R_0 E}) || {{\,\mathrm{\mathcal {M}}\,}}'_n \circ \cdots \circ {{\,\mathrm{\mathcal {M}}\,}}'_1(\rho ^{(0)}_{R_0 E})) \\&\quad \le \tilde{D}_{\alpha }({{\,\mathrm{\mathcal {M}}\,}}_{n-1} \circ \cdots \circ {{\,\mathrm{\mathcal {M}}\,}}_1(\rho ^{(0)}_{R_0 E}) || {{\,\mathrm{\mathcal {M}}\,}}'_{n-1} \circ \cdots \circ {{\,\mathrm{\mathcal {M}}\,}}'_1(\rho ^{(0)}_{R_0 E})) + \tilde{D}_{\alpha }^{\text {reg}}({{\,\mathrm{\mathcal {M}}\,}}_n || {{\,\mathrm{\mathcal {M}}\,}}'_n) \\&\quad \le \cdots \\&\quad \le \tilde{D}_{\alpha }(\rho ^{(0)}_{R_0 E}||\rho ^{(0)}_{R_0 E}) + \sum _{k=1}^n \tilde{D}_{\alpha }^{\text {reg}}({{\,\mathrm{\mathcal {M}}\,}}_k || {{\,\mathrm{\mathcal {M}}\,}}'_k)\\&\quad \le n \epsilon . \end{aligned}$$

Once we have the above result, we can simply use the well-known relation between the smooth max-relative entropy and the \(\alpha \)-Rényi relative entropy [17, Proposition 6.5] to get the bound

$$\begin{aligned}&{D}_{\max }^{\epsilon '}({{\,\mathrm{\mathcal {M}}\,}}_n \circ \cdots \circ {{\,\mathrm{\mathcal {M}}\,}}_1(\rho ^{(0)}_{R_0 E}) || {{\,\mathrm{\mathcal {M}}\,}}'_n \circ \cdots \circ {{\,\mathrm{\mathcal {M}}\,}}'_1(\rho ^{(0)}_{R_0 E})) \\&\quad \le \tilde{D}_{\alpha }({{\,\mathrm{\mathcal {M}}\,}}_n \circ \cdots \circ {{\,\mathrm{\mathcal {M}}\,}}_1(\rho ^{(0)}_{R_0 E}) || {{\,\mathrm{\mathcal {M}}\,}}'_n \circ \cdots \circ {{\,\mathrm{\mathcal {M}}\,}}'_1(\rho ^{(0)}_{R_0 E})) + \frac{g_0(\epsilon ')}{\alpha -1}\\&\quad \le n\epsilon + O(1). \end{aligned}$$

This bound can subsequently be used in Lemma 3.5 to relate the smooth min-entropy of the real state \({{\,\mathrm{\mathcal {M}}\,}}_n \circ \cdots \circ {{\,\mathrm{\mathcal {M}}\,}}_1(\rho ^{(0)}_{R_0 E})\) with the \(\alpha \)-Rényi conditional entropy of the auxiliary state \({{\,\mathrm{\mathcal {M}}\,}}'_n \circ \cdots \circ {{\,\mathrm{\mathcal {M}}\,}}'_1(\rho ^{(0)}_{R_0 E})\), for which we can use the original entropy accumulation theorem.
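The classical analogue of this telescoping argument is easy to check numerically. The following sketch verifies, for arbitrary randomly generated stochastic matrices, that the KL divergence between the outputs of two n-step chains is at most the sum of the worst-case per-step divergences, which is the classical counterpart of the channel divergence chain rule used above.

```python
import numpy as np

rng = np.random.default_rng(3)

def kl(p, q):
    """KL divergence in bits between distributions with full support."""
    return float(np.sum(p * np.log2(p / q)))

def random_channel(n):
    """Column-stochastic matrix; column x is the distribution M(.|x)."""
    m = rng.random((n, n)) + 0.1
    return m / m.sum(axis=0)

n, rounds = 4, 5
Ms = [random_channel(n) for _ in range(rounds)]            # the 'real' maps
Mps = [0.9 * M + 0.1 * random_channel(n) for M in Ms]      # nearby maps

p = rng.random(n); p /= p.sum()                            # common input
out, out_p, step_sum = p.copy(), p.copy(), 0.0
for M, Mp in zip(Ms, Mps):
    step_sum += max(kl(M[:, x], Mp[:, x]) for x in range(n))
    out, out_p = M @ out, Mp @ out_p
print(f"D(out||out') = {kl(out, out_p):.4f} <= telescoped sum = {step_sum:.4f}")
```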

In order to prove Theorem 5.1, we broadly follow this idea. However, the condition \(\left\| {{\,\mathrm{\mathcal {M}}\,}}_k- {{\,\mathrm{\mathcal {M}}\,}}'_k \right\| _{\diamond } \le \epsilon \) does not lead to any kind of bound on \(\tilde{D}_{\alpha }^{\text {reg}}\) or any other channel divergence. We will get around this issue by instead using the mixed channels \({{\,\mathrm{\mathcal {M}}\,}}^{\delta }_{k}:= (1-\delta ){{\,\mathrm{\mathcal {M}}\,}}'_{k} + \delta {{\,\mathrm{\mathcal {M}}\,}}_{k}\). Also, instead of trying to bound the channel divergence in terms of \(\tilde{D}_{\alpha }^{\text {reg}}\), we will bound the \(D^\#_{\alpha }\) channel divergence (defined in the next section) and use its chain rule. We develop the relevant \(\alpha \)-Rényi divergence bounds for this divergence in the next two subsections and then prove the theorem above in Sect. 5.3.

5.1 Divergence bound for approximately equal states

We will use the sharp Rényi divergence \(D^{\#}_{\alpha }\) defined in Ref. [15] (see [33] for the following equivalent definition) in this section. For \(\alpha > 1\) and two positive operators P and Q, it is defined as

$$\begin{aligned} D^{\#}_{\alpha }(P || Q) := \min _{A \ge P} \hat{D}_{\alpha } (A || Q) \end{aligned}$$
(58)

where \(\hat{D}_{\alpha } (A || Q)\) is the \(\alpha \)-Rényi geometric divergence [34]. For \(\alpha >1\), it is defined as

$$\begin{aligned} \hat{D}_{\alpha } (A || Q) = {\left\{ \begin{array}{ll} \frac{1}{\alpha -1}\log {{\,\textrm{tr}\,}}\left( Q \left( Q^{-\frac{1}{2}} A Q^{-\frac{1}{2}} \right) ^\alpha \right) &{} \text {if } A \ll Q\\ \infty &{}\text {otherwise.} \end{array}\right. } \end{aligned}$$
(59)

The operator A in the optimisation above ranges over all operators satisfying \(A \ge P\); in general, such an A is unnormalised. We will prove a bound on \(D^{\#}_{\alpha }\) between two states in terms of the distance between them and their max-relative entropy. In order to prove this bound, we require the following simple generalisation of the pinching inequality (see for example [17, Sect. 2.6.3]).

Lemma 5.2

(Asymmetric pinching). For \(t>0\), a positive semidefinite operator \(X \ge 0 \) and orthogonal projections \(\Pi \) and \(\Pi _\perp = {{\,\mathrm{\mathbb {1}}\,}}- \Pi \), we have that

$$\begin{aligned} X \le (1+t) \Pi X \Pi + \left( 1+ \frac{1}{t} \right) \Pi _\perp X \Pi _\perp . \end{aligned}$$
(60)

Proof

We will write the positive matrix X as the block matrix

$$\begin{aligned} X = \begin{pmatrix} X_1 & X_2 \\ X_2^* & X_3 \end{pmatrix} \end{aligned}$$

where the blocks are partitioned according to the direct sum \({{\,\mathrm{\text {im}}\,}}(\Pi ) \oplus {{\,\mathrm{\text {im}}\,}}(\Pi _\perp )\). Then, the statement in the Lemma is equivalent to proving that

$$\begin{aligned} \begin{pmatrix} X_1 & X_2 \\ X_2^* & X_3 \end{pmatrix} \le \begin{pmatrix} (1+t) X_1 & 0 \\ 0 & 0 \end{pmatrix} + \begin{pmatrix} 0 & 0 \\ 0 & \left( 1+ \frac{1}{t} \right) X_3 \end{pmatrix} \end{aligned}$$

which is equivalent to proving that

$$\begin{aligned} 0 \le \begin{pmatrix} t X_1 & - X_2 \\ - X_2^* & \frac{1}{t} X_3 \end{pmatrix}. \end{aligned}$$

This is true because

$$\begin{aligned} \begin{pmatrix} t X_1 & - X_2 \\ - X_2^* & \frac{1}{t} X_3 \end{pmatrix} = \begin{pmatrix} - t^{1/2} & 0 \\ 0 & t^{-1/2} \end{pmatrix} \begin{pmatrix} X_1 & X_2 \\ X_2^* & X_3 \end{pmatrix} \begin{pmatrix} - t^{1/2} & 0 \\ 0 & t^{-1/2} \end{pmatrix} \ge 0 \end{aligned}$$

since \(X \ge 0\). \(\square \)
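Lemma 5.2 is also easy to check numerically. The following sketch verifies the operator inequality for a random positive semidefinite matrix and a random projection; this is merely a sanity check, not part of the proof.

```python
import numpy as np

rng = np.random.default_rng(7)

def random_psd(n):
    """A random positive semidefinite matrix."""
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return m @ m.conj().T

n, k = 6, 2                              # total dimension, rank of Pi
X = random_psd(n)
Q, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
Pi = Q[:, :k] @ Q[:, :k].conj().T        # random rank-k orthogonal projection
Pi_perp = np.eye(n) - Pi

for t in (0.1, 1.0, 10.0):
    gap = (1 + t) * Pi @ X @ Pi + (1 + 1 / t) * Pi_perp @ X @ Pi_perp - X
    print(f"t={t:>4}: min eig of RHS - X = {np.linalg.eigvalsh(gap).min():.2e}")
```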

Lemma 5.3

Let \(\epsilon >0\), \(\alpha \in (1, \infty )\), and let \(\rho \) and \(\sigma \) be two normalised quantum states on the Hilbert space \(\mathbb {C}^n\) such that \(\frac{1}{2}\left\| \rho - \sigma \right\| _1 \le \epsilon \) and \(D_{\max }(\rho ||\sigma ) \le d < \infty \). Then, we have the bound

$$\begin{aligned} D^{\#}_{\alpha }(\rho || \sigma ) \le \frac{\alpha +1}{\alpha -1}\log \left( (1+ \sqrt{\epsilon })^{\frac{\alpha }{\alpha +1}} + \left( 2^{\alpha d} \sqrt{\epsilon } \right) ^{\frac{1}{\alpha +1}} \right) . \end{aligned}$$
(61)

Note: For a fixed \(\alpha \in (1, \infty )\), this upper bound tends to zero as \(\epsilon \rightarrow 0\). On the other hand, for a fixed \(\epsilon \in (0,1)\), the upper bound tends to infinity as \(\alpha \rightarrow 1\) (that is, the bound becomes trivial). In Appendix B, we show that a bound of this form for \(D^{\#}_{\alpha }\) necessarily diverges for \(\epsilon >0\) as \(\alpha \rightarrow 1\).

Proof

Since \(D_{\max }(\rho ||\sigma ) < \infty \), we have that \(\rho \ll \sigma \). We can assume that \(\sigma \) is invertible; if it were not, we could restrict our vector space to the subspace \(\text {supp}(\sigma )\).

Let \(\rho - \sigma = P- Q\), where \(P \ge 0\) is the positive part of the matrix \(\rho - \sigma \) and \(Q \ge 0\) is its negative part. Since \({{\,\textrm{tr}\,}}(\rho - \sigma ) = 0\) and \(\left\| \rho - \sigma \right\| _1 = {{\,\textrm{tr}\,}}(P) + {{\,\textrm{tr}\,}}(Q)\), we then have that \({{\,\textrm{tr}\,}}(P)= {{\,\textrm{tr}\,}}(Q) \le \epsilon \).

Further, let

$$\begin{aligned} \sigma ^{-\frac{1}{2}} P \sigma ^{-\frac{1}{2}} = \sum _{i=1}^n \lambda _i |{x_i}\rangle \langle {x_i}| \end{aligned}$$
(62)

be the eigenvalue decomposition of \(\sigma ^{-\frac{1}{2}} P \sigma ^{-\frac{1}{2}}\). Define the real vector \(q \in \mathbb {R}^n\) as

$$\begin{aligned} q(i) := \langle {x_i}| \sigma |{x_i}\rangle . \end{aligned}$$

Note that q is a probability distribution, since \(q(i) \ge 0\) and \(\sum _{i} q(i) = {{\,\textrm{tr}\,}}(\sigma ) = 1\). Observe that

$$\begin{aligned} \mathbb {E}_{I \sim q} \left[ \lambda _I\right]&= \sum _{i=1}^n \lambda _i \langle {x_i}| \sigma |{x_i}\rangle \\&= {{\,\textrm{tr}\,}}\left( \sigma \sum _{i=1}^n \lambda _i |{x_i}\rangle \langle {x_i}| \right) \\&= {{\,\textrm{tr}\,}}\left( \sigma \sigma ^{-\frac{1}{2}} P \sigma ^{-\frac{1}{2}} \right) \\&= {{\,\textrm{tr}\,}}(P)\\&\le \epsilon . \end{aligned}$$

Also, observe that \(\lambda _i \ge 0\) for all \(i \in [n]\) because \(\sigma ^{-\frac{1}{2}} P \sigma ^{-\frac{1}{2}} \ge 0\). Let us define

$$\begin{aligned} S := \{i \in [n]: \lambda _i \le \sqrt{\epsilon }\}. \end{aligned}$$
(63)

Since \(\lambda _i \ge 0\) for all \(i \in [n]\), we can use Markov's inequality to show:

$$\begin{aligned} \Pr _q (I \in S^c)&= \Pr _q (\lambda _I > \sqrt{\epsilon }) \\&\le \frac{{{\,\mathrm{\mathbb {E}}\,}}_{I \sim q}\left[ \lambda _I\right] }{\sqrt{\epsilon }} \\&\le \sqrt{\epsilon }. \end{aligned}$$

Thus, if we define the projectors \(\Pi := \sum _{i \in S} |{x_i}\rangle \langle {x_i}|\) and \(\Pi _{\perp }:= \sum _{i \in S^c} |{x_i}\rangle \langle {x_i}| = {{\,\mathrm{\mathbb {1}}\,}}- \Pi \), we have

$$\begin{aligned} {{\,\textrm{tr}\,}}(\sigma \Pi _\perp )&= \sum _{i \in S^c} \langle {x_i}| \sigma |{x_i}\rangle \nonumber \\&= \Pr _q (I \in S^c) \nonumber \\&\le \sqrt{\epsilon }. \end{aligned}$$
(64)

Moreover, by the definition of the set S (Eq. 63), we have

$$\begin{aligned} \Pi \sigma ^{-\frac{1}{2}} P \sigma ^{-\frac{1}{2}} \Pi&= \sum _{i \in S} \lambda _i |{x_i}\rangle \langle {x_i}| \le \sqrt{\epsilon } \Pi \end{aligned}$$
(65)

and using \(D_{\max }(\rho || \sigma ) \le d\), we have that

$$\begin{aligned} \sigma ^{-\frac{1}{2}} \rho \sigma ^{-\frac{1}{2}} \le 2^d {{\,\mathrm{\mathbb {1}}\,}}. \end{aligned}$$
(66)

Now, observe that since \(\sigma ^{-\frac{1}{2}} \rho \sigma ^{-\frac{1}{2}} \ge 0\), for an arbitrary \(t >0\), using Lemma 5.2 we have

$$\begin{aligned} \sigma ^{-\frac{1}{2}} \rho \sigma ^{-\frac{1}{2}}&\le (1+t) \Pi \sigma ^{-\frac{1}{2}} \rho \sigma ^{-\frac{1}{2}} \Pi + \left( 1+ \frac{1}{t} \right) \Pi _\perp \sigma ^{-\frac{1}{2}} \rho \sigma ^{-\frac{1}{2}} \Pi _\perp \\&\le (1+t) \Pi \left( {{\,\mathrm{\mathbb {1}}\,}}+ \sigma ^{-\frac{1}{2}} P \sigma ^{-\frac{1}{2}} \right) \Pi + \left( 1+ \frac{1}{t} \right) 2^d \Pi _\perp \\&\le (1+t)(1+ \sqrt{\epsilon }) \Pi + \left( 1+ \frac{1}{t} \right) 2^d \Pi _\perp \end{aligned}$$

where we have used \(\rho \le \sigma + P\) to bound the first term and Eq. 66 to bound the second term in the second line, and Eq. 65 to bound \(\Pi \sigma ^{-\frac{1}{2}} P \sigma ^{-\frac{1}{2}} \Pi \) in the last step.

We will define \(A_t:= (1+t)(1+ \sqrt{\epsilon }) \sigma ^{\frac{1}{2}}\Pi \sigma ^{\frac{1}{2}} + \left( 1+ \frac{1}{t} \right) 2^d \sigma ^{\frac{1}{2}}\Pi _\perp \sigma ^{\frac{1}{2}}\). Above, we have shown that \(A_t \ge \rho \) for every \(t>0\). Therefore, for each \(t>0\), \(D^\#_\alpha (\rho || \sigma ) \le \hat{D}_\alpha (A_t || \sigma )\). We will now bound \(\hat{D}_\alpha (A_t || \sigma )\) for \(\alpha \in (1, \infty )\) as:

$$\begin{aligned} \hat{D}_\alpha (A_t || \sigma )&= \frac{1}{\alpha -1} \log {{\,\textrm{tr}\,}}\left( \sigma \left( \sigma ^{-\frac{1}{2}} A_t \sigma ^{-\frac{1}{2}} \right) ^\alpha \right) \\&= \frac{1}{\alpha -1} \log {{\,\textrm{tr}\,}}\left( \sigma \left( (1+t)(1+ \sqrt{\epsilon }) \Pi + \left( 1+ \frac{1}{t} \right) 2^d \Pi _\perp \right) ^\alpha \right) \\&= \frac{1}{\alpha -1} \log {{\,\textrm{tr}\,}}\left( \sigma \left( (1+t)^\alpha (1+ \sqrt{\epsilon })^\alpha \Pi + \left( 1+ \frac{1}{t} \right) ^\alpha 2^{d\alpha } \Pi _\perp \right) \right) \\&= \frac{1}{\alpha -1} \log \left( (1+t)^\alpha (1+ \sqrt{\epsilon })^\alpha {{\,\textrm{tr}\,}}\left( \sigma \Pi \right) + \left( 1+ \frac{1}{t} \right) ^\alpha 2^{d\alpha } {{\,\textrm{tr}\,}}\left( \sigma \Pi _\perp \right) \right) \\&\le \frac{1}{\alpha -1} \log \left( (1+t)^\alpha (1+ \sqrt{\epsilon })^\alpha + \left( 1+ \frac{1}{t} \right) ^\alpha 2^{d\alpha } \sqrt{\epsilon } \right) \end{aligned}$$

where in the last line we use \({{\,\textrm{tr}\,}}(\sigma \Pi )\le 1\) and \({{\,\textrm{tr}\,}}(\sigma \Pi _\perp )\le \sqrt{\epsilon }\) (Eq. 64). Finally, since \(t>0\) was arbitrary, we can choose the \(t>0\) which minimises the right-hand side. For this choice of \(t_{\min }= \left( \frac{2^{\alpha d} \sqrt{\epsilon }}{\left( 1+ \sqrt{\epsilon } \right) ^{\alpha }} \right) ^{\frac{1}{\alpha +1}}\), we get

$$\begin{aligned} \hat{D}_\alpha (A_{t_{\min }} || \sigma )&\le \frac{\alpha +1}{\alpha -1}\log \left( (1+ \sqrt{\epsilon })^{\frac{\alpha }{\alpha +1}} + 2^{\frac{\alpha }{\alpha +1} d} \epsilon ^{\frac{1}{2(\alpha +1)}} \right) \end{aligned}$$

which proves the required bound. \(\square \)
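The construction in this proof can be traced numerically. The sketch below builds the projector \(\Pi \) and the operator \(A_{t_{\min }}\) exactly as above for a randomly generated pair of nearby states, and checks both that \(A_{t_{\min }} \ge \rho \) and that the geometric divergence stays below the bound in Eq. 61; the particular states are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_state(n):
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    s = m @ m.conj().T
    return s / np.trace(s).real

def mpow(h, p):
    """p-th power of a Hermitian PSD matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(h)
    w = np.clip(w, 1e-14, None) if p < 0 else np.maximum(w, 0.0)
    return (v * w ** p) @ v.conj().T

n, alpha = 4, 2.0
sigma = random_state(n)
rho = 0.95 * sigma + 0.05 * random_state(n)       # a state close to sigma

eps = 0.5 * np.abs(np.linalg.eigvalsh(rho - sigma)).sum()   # trace distance
sih = mpow(sigma, -0.5)                                     # sigma^(-1/2)
d = np.log2(np.linalg.eigvalsh(sih @ rho @ sih).max())      # D_max(rho||sigma)

w, v = np.linalg.eigh(rho - sigma)
P = (v * np.maximum(w, 0.0)) @ v.conj().T                   # positive part
lam, x = np.linalg.eigh(sih @ P @ sih)
Pi = (x * (lam <= np.sqrt(eps))) @ x.conj().T               # projector onto S
Pi_c = np.eye(n) - Pi

t = (2 ** (alpha * d) * np.sqrt(eps) / (1 + np.sqrt(eps)) ** alpha) ** (1 / (alpha + 1))
sh = mpow(sigma, 0.5)
A = ((1 + t) * (1 + np.sqrt(eps)) * sh @ Pi @ sh
     + (1 + 1 / t) * 2 ** d * sh @ Pi_c @ sh)

geo = np.log2(np.trace(sigma @ mpow(sih @ A @ sih, alpha)).real) / (alpha - 1)
bound = (alpha + 1) / (alpha - 1) * np.log2(
    (1 + np.sqrt(eps)) ** (alpha / (alpha + 1))
    + (2 ** (alpha * d) * np.sqrt(eps)) ** (1 / (alpha + 1)))
print("A >= rho:", np.linalg.eigvalsh(A - rho).min() > -1e-9)
print(f"hat_D_alpha(A||sigma) = {geo:.4f} <= Eq. 61 bound = {bound:.4f}")
```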

5.2 Bounding the channel divergence for two channels close to each other

Suppose there are two channels \({{\,\mathrm{\mathcal {N}}\,}}\) and \({{\,\mathrm{\mathcal {M}}\,}}\) mapping registers from the space A to B such that \(\frac{1}{2} \left\| {{\,\mathrm{\mathcal {N}}\,}}- {{\,\mathrm{\mathcal {M}}\,}} \right\| _\diamond \le \epsilon \). In general, the channel divergence between two such channels can be infinite because there may be states \(\rho \) such that \({{\,\mathrm{\mathcal {N}}\,}}(\rho ) \not \ll {{\,\mathrm{\mathcal {M}}\,}}(\rho )\). In order to get around this issue, we will use the \(\delta \)-mixed channel, \({{\,\mathrm{\mathcal {M}}\,}}_\delta \). For \(\delta \in (0,1)\), we define \({{\,\mathrm{\mathcal {M}}\,}}_\delta \) as

$$\begin{aligned} \mathcal {M}_\delta := (1- \delta ) \mathcal {M} + \delta \mathcal {N}. \end{aligned}$$

This guarantees that \(D_{\max }({{\,\mathrm{\mathcal {N}}\,}}|| {{\,\mathrm{\mathcal {M}}\,}}_\delta ) \le \log \frac{1}{\delta }\), which is enough to ensure that the divergences we are interested in are finite. Moreover, by mixing \({{\,\mathrm{\mathcal {M}}\,}}\) with \({{\,\mathrm{\mathcal {N}}\,}}\), we only decrease the distance:

$$\begin{aligned} \frac{1}{2}\left\| \mathcal {M}_\delta - {{\,\mathrm{\mathcal {N}}\,}} \right\| _{\diamond }&= \frac{1}{2}\left\| (1-\delta ){{\,\mathrm{\mathcal {M}}\,}}+ \delta {{\,\mathrm{\mathcal {N}}\,}}- {{\,\mathrm{\mathcal {N}}\,}} \right\| _{\diamond } \nonumber \\&= (1-\delta )\frac{1}{2}\left\| {{\,\mathrm{\mathcal {M}}\,}}- {{\,\mathrm{\mathcal {N}}\,}} \right\| _{\diamond }\nonumber \\&\le (1-\delta ) \epsilon . \end{aligned}$$
(67)

We will now show that \(D^\#_{\alpha }({{\,\mathrm{\mathcal {N}}\,}}|| {{\,\mathrm{\mathcal {M}}\,}}_{\delta })\) is small for an appropriately chosen \(\delta \). By the definition of channel divergence, we have that

$$\begin{aligned} D^\#_{\alpha }({{\,\mathrm{\mathcal {N}}\,}}|| {{\,\mathrm{\mathcal {M}}\,}}_{\delta }) = \sup _{\rho _{AR}} D^\#_{\alpha }({{\,\mathrm{\mathcal {N}}\,}}(\rho _{AR}) || {{\,\mathrm{\mathcal {M}}\,}}_{\delta }(\rho _{AR})) \end{aligned}$$

where R is an arbitrary reference system (\({{\,\mathrm{\mathcal {N}}\,}}, {{\,\mathrm{\mathcal {M}}\,}}_{\delta }\) map register A to register B). We will show that for every \(\rho _{AR}\), \(D^\#_{\alpha }({{\,\mathrm{\mathcal {N}}\,}}(\rho _{AR}) || {{\,\mathrm{\mathcal {M}}\,}}_{\delta }(\rho _{AR}))\) is small. Note that

$$\begin{aligned} {{\,\mathrm{\mathcal {M}}\,}}_{\delta }(\rho _{AR})&= (1-\delta ){{\,\mathrm{\mathcal {M}}\,}}(\rho _{AR}) + \delta {{\,\mathrm{\mathcal {N}}\,}}(\rho _{AR})\\&\ge \delta {{\,\mathrm{\mathcal {N}}\,}}(\rho _{AR}) \end{aligned}$$

which implies that \(D_{\max }({{\,\mathrm{\mathcal {N}}\,}}(\rho _{AR}) || {{\,\mathrm{\mathcal {M}}\,}}_{\delta }(\rho _{AR})) \le \log \frac{1}{\delta }\). Also, using Eq. 67, we have that

$$\begin{aligned} \frac{1}{2}\left\| \mathcal {M}_\delta (\rho _{AR})- {{\,\mathrm{\mathcal {N}}\,}}(\rho _{AR}) \right\| _1 \le (1-\delta )\epsilon . \end{aligned}$$

Using Lemma 5.3, we have for every \(\alpha \in (1, \infty )\)

$$\begin{aligned} D^\#_{\alpha }({{\,\mathrm{\mathcal {N}}\,}}(\rho _{AR}) || {{\,\mathrm{\mathcal {M}}\,}}_{\delta }(\rho _{AR})) \le \frac{\alpha +1}{\alpha -1}\log \left( \left( 1+ \sqrt{(1-\delta )\epsilon } \right) ^{\frac{\alpha }{\alpha +1}} + \left( \frac{\sqrt{(1-\delta )\epsilon }}{\delta ^\alpha } \right) ^{{\frac{1}{\alpha +1}}} \right) . \end{aligned}$$

Since, this is true for all \(\rho _{AR}\), for every \(\alpha \in (1, \infty )\) we have

$$\begin{aligned} D^\#_{\alpha }({{\,\mathrm{\mathcal {N}}\,}}|| {{\,\mathrm{\mathcal {M}}\,}}_\delta ) \le \frac{\alpha +1}{\alpha -1}\log \left( \left( 1+ \sqrt{(1-\delta )\epsilon } \right) ^{\frac{\alpha }{\alpha +1}} + \left( \frac{\sqrt{(1-\delta )\epsilon }}{\delta ^\alpha } \right) ^{{\frac{1}{\alpha +1}}} \right) . \end{aligned}$$

Note that since \(\delta \) was arbitrary, we can choose it appropriately to make sure that the above bound is small. For example, choosing \(\delta = \epsilon ^\frac{1}{4 \alpha }\), we get the bound

$$\begin{aligned} D^\#_{\alpha }({{\,\mathrm{\mathcal {N}}\,}}|| {{\,\mathrm{\mathcal {M}}\,}}_\delta ) \le \frac{\alpha +1}{\alpha -1}\log \left( (1+ \sqrt{\epsilon })^{\frac{\alpha }{\alpha +1}} + \epsilon ^{\frac{1}{4(\alpha +1)}} \right) \end{aligned}$$

which is a small function of \(\epsilon \) in the sense that it tends to 0 as \(\epsilon \rightarrow 0\). We summarise the bound derived above in the following lemma.

Lemma 5.4

Let \(\epsilon >0\). Suppose channels \({{\,\mathrm{\mathcal {N}}\,}}\) and \({{\,\mathrm{\mathcal {M}}\,}}\) from register A to B are such that \(\frac{1}{2}\left\| {{\,\mathrm{\mathcal {N}}\,}}- {{\,\mathrm{\mathcal {M}}\,}} \right\| _{\diamond } \le \epsilon \). For \(\delta \in (0,1)\), we can define the mixed channel \({{\,\mathrm{\mathcal {M}}\,}}_\delta := (1- \delta ) \mathcal {M} + \delta \mathcal {N}\). Then, for every \(\alpha \in (1, \infty )\), we have the following bound on the channel divergence

$$\begin{aligned} D^\#_{\alpha }({{\,\mathrm{\mathcal {N}}\,}}|| {{\,\mathrm{\mathcal {M}}\,}}_\delta ) \le \frac{\alpha +1}{\alpha -1}\log \left( \left( 1+ \sqrt{(1-\delta )\epsilon } \right) ^{\frac{\alpha }{\alpha +1}} + \left( \frac{\sqrt{(1-\delta )\epsilon }}{\delta ^\alpha } \right) ^{{\frac{1}{\alpha +1}}} \right) . \end{aligned}$$
(68)
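As a quick numerical illustration of Lemma 5.4, the following sketch evaluates the right-hand side of Eq. 68 for the choice \(\delta = \epsilon ^{1/(4\beta )}\) discussed above; the values confirm that the bound vanishes, albeit slowly, as \(\epsilon \rightarrow 0\).

```python
import math

def z(beta: float, eps: float, delta: float) -> float:
    """Right-hand side of Eq. 68, in bits."""
    a = (1 + math.sqrt((1 - delta) * eps)) ** (beta / (beta + 1))
    b = (math.sqrt((1 - delta) * eps) / delta ** beta) ** (1 / (beta + 1))
    return (beta + 1) / (beta - 1) * math.log2(a + b)

beta = 2.0
for eps in (1e-4, 1e-8, 1e-16):
    delta = eps ** (1 / (4 * beta))          # the choice suggested in the text
    print(f"eps={eps:.0e}, delta={delta:.3e}: "
          f"D#-bound = {z(beta, eps, delta):.4f} bits")
```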

5.3 Proof of the approximate entropy accumulation theorem

We use the mixed channels defined in the previous section to define the auxiliary state \({{\,\mathrm{\mathcal {M}}\,}}^\delta _n \circ \cdots \circ {{\,\mathrm{\mathcal {M}}\,}}^\delta _1(\rho ^{(0)}_{R_0 E})\) for our proof. It is easy to show, using the divergence bounds in Sect. 5.2 and the chain rule for the \(D^\#_{\alpha }\) divergence, that the relative entropy distance between the real state and this choice of the auxiliary state is small. However, the state \({{\,\mathrm{\mathcal {M}}\,}}^\delta _n \circ \cdots \circ {{\,\mathrm{\mathcal {M}}\,}}^\delta _1(\rho ^{(0)}_{R_0 E})\) does not necessarily satisfy the Markov chain conditions required for entropy accumulation. Thus, we also need to reprove the entropy lower bound on this state by modifying the approach used in the proof of the original entropy accumulation theorem.

Proof of Theorem 5.1

Using Lemma 5.4, for every \(\delta \in (0,1)\), \(\beta > 1\), and each \(k \in [n]\), the mixed maps \({{\,\mathrm{\mathcal {M}}\,}}^\delta _k:= (1-\delta ){{\,\mathrm{\mathcal {M}}\,}}'_k + \delta {{\,\mathrm{\mathcal {M}}\,}}_k\) satisfy

$$\begin{aligned} D^\#_\beta ({{\,\mathrm{\mathcal {M}}\,}}_k || {{\,\mathrm{\mathcal {M}}\,}}^\delta _k)&\le \frac{\beta +1}{\beta -1}\log \left( \left( 1+ \sqrt{(1-\delta )\epsilon } \right) ^{\frac{\beta }{\beta +1}} + \left( \frac{\sqrt{(1-\delta )\epsilon }}{\delta ^\beta } \right) ^{{\frac{1}{\beta +1}}} \right) \nonumber \\&:= z_\beta (\epsilon , \delta ) \end{aligned}$$
(69)

where we defined the right-hand side above as \(z_\beta (\epsilon , \delta )\). This can be made “small” by choosing \(\delta = \epsilon ^{\frac{1}{4\beta }}\) as was shown in the previous section. We use these maps to define the auxiliary state as

$$\begin{aligned} \sigma _{A_1^n B_1^n E} := {{\,\mathrm{\mathcal {M}}\,}}^\delta _n \circ \cdots \circ {{\,\mathrm{\mathcal {M}}\,}}^\delta _1(\rho ^{(0)}_{R_0 E}). \end{aligned}$$
(70)

Now, we have that for \(\beta >1\) and \(\epsilon _1 >0\)

$$\begin{aligned}&D^{\epsilon _1}_{\max }(\rho _{A_1^n B_1^n E}||\sigma _{A_1^n B_1^n E}) \nonumber \\&\quad \le \tilde{D}_\beta (\rho _{A_1^n B_1^n E}||\sigma _{A_1^n B_1^n E}) + \frac{g_0(\epsilon _1)}{\beta -1} \nonumber \\&\quad \le D^\#_\beta (\rho _{A_1^n B_1^n E}||\sigma _{A_1^n B_1^n E}) + \frac{g_0(\epsilon _1)}{\beta -1}\nonumber \\&\quad = D^\#_\beta ({{\,\mathrm{\mathcal {M}}\,}}_n \circ \cdots \circ {{\,\mathrm{\mathcal {M}}\,}}_1(\rho ^{(0)}_{R_0 E}) || {{\,\mathrm{\mathcal {M}}\,}}^\delta _n \circ \cdots \circ {{\,\mathrm{\mathcal {M}}\,}}^\delta _1(\rho ^{(0)}_{R_0 E})) + \frac{g_0(\epsilon _1)}{\beta -1}\nonumber \\&\quad \le D^\#_\beta ({{\,\mathrm{\mathcal {M}}\,}}_{n-1} \circ \cdots \circ {{\,\mathrm{\mathcal {M}}\,}}_1(\rho ^{(0)}_{R_0 E}) || {{\,\mathrm{\mathcal {M}}\,}}^\delta _{n-1} \circ \cdots \circ {{\,\mathrm{\mathcal {M}}\,}}^\delta _1(\rho ^{(0)}_{R_0 E})) + D^\#_\beta ({{\,\mathrm{\mathcal {M}}\,}}_n || {{\,\mathrm{\mathcal {M}}\,}}^\delta _n) + \frac{g_0(\epsilon _1)}{\beta -1}\nonumber \\&\quad \le \cdots \nonumber \\&\quad \le \sum _{k=1}^n D^\#_\beta ({{\,\mathrm{\mathcal {M}}\,}}_k || {{\,\mathrm{\mathcal {M}}\,}}^\delta _k) + \frac{g_0(\epsilon _1)}{\beta -1}\nonumber \\&\quad \le n z_\beta (\epsilon , \delta ) + \frac{g_0(\epsilon _1)}{\beta -1} \end{aligned}$$
(71)

where the first line follows from [17, Proposition 6.5], the second line follows from [15, Proposition 3.4], the fourth line follows from the chain rule for \(D^\#_\beta \) [15, Proposition 4.5], and the last line follows from Eq. 69.

For \(\epsilon _2 >0\) and \(\alpha \in (1, 1+ \frac{1}{\log (1+2|A|)})\), we can plug the above in the bound provided by Lemma 3.5 to get

$$\begin{aligned} H_{\min }^{\epsilon _1+\epsilon _2}(A_1^n|B_1^n E)_{\rho }&\ge \tilde{H}^{\uparrow }_{\alpha }(A_1^n|B_1^n E)_{\sigma } - \frac{\alpha }{\alpha -1} n z_{\beta }(\epsilon , \delta ) \nonumber \\&\quad - \frac{1}{\alpha -1}\left( g_1(\epsilon _2, \epsilon _1)+ \frac{\alpha g_0(\epsilon _1)}{\beta -1} \right) . \end{aligned}$$
(72)

We have now reduced our problem to lower bounding \(\tilde{H}^{\uparrow }_{\alpha }(A_1^n|B_1^n E)_{\sigma }\). Note that we cannot directly use entropy accumulation here: the mixed maps \({{\,\mathrm{\mathcal {M}}\,}}^\delta _k = (1-\delta ){{\,\mathrm{\mathcal {M}}\,}}'_k + \delta {{\,\mathrm{\mathcal {M}}\,}}_k\) contain \({{\,\mathrm{\mathcal {M}}\,}}_k\) with weight \(\delta \), which means that with probability \(\delta \) the register \(B_k\) may be correlated with \(A_1^{k-1}\) even given \(B_1^{k-1}E\), so the state may not satisfy the Markov chain required for entropy accumulation.

The application of the maps \({{\,\mathrm{\mathcal {M}}\,}}^\delta _k\) can be viewed as applying the channel \({{\,\mathrm{\mathcal {M}}\,}}'_k\) with probability \(1-\delta \) and the channel \({{\,\mathrm{\mathcal {M}}\,}}_k\) with probability \(\delta \). We can define the channels \({{\,\mathrm{\mathcal {N}}\,}}_k\) which map the registers \(R_{k-1}\) to \(R_k A_k B_k C_k\), where \(C_k\) is a binary register. The action of \({{\,\mathrm{\mathcal {N}}\,}}_k\) can be defined as:

  1.

    Sample the classical random variable \(C_k \in \{0,1\}\) independently. \(C_k= 1\) with probability \(1-\delta \) and 0 otherwise.

  2.

    If \(C_k =1\) apply the map \({{\,\mathrm{\mathcal {M}}\,}}'_k\) on \(R_{k-1}\), else apply \({{\,\mathrm{\mathcal {M}}\,}}_k\) on \(R_{k-1}\).

Let us call \(\theta _{A_1^n B_1^n C_1^n E} ={{\,\mathrm{\mathcal {N}}\,}}_n \circ \cdots \circ {{\,\mathrm{\mathcal {N}}\,}}_1 (\rho ^{(0)}_{R_0 E})\). Clearly \({{\,\textrm{tr}\,}}_{C_1^n}\left( \theta _{A_1^n B_1^n C_1^n E} \right) = \sigma _{A_1^n B_1^n E}\). Thus, we have

$$\begin{aligned} \tilde{H}^{\uparrow }_{\alpha }(A_1^n|B_1^n E)_{\sigma }&= \tilde{H}^{\uparrow }_{\alpha }(A_1^n|B_1^n E)_{\theta }\nonumber \\&\ge \tilde{H}^{\uparrow }_{\alpha }(A_1^n|B_1^n C_1^n E)_{\theta }. \end{aligned}$$
(73)

We will now focus on lower bounding \(\tilde{H}^{\uparrow }_{\alpha }(A_1^n|B_1^n C_1^n E)_{\theta }\). Using [17, Proposition 5.1], we have that

$$\begin{aligned} \tilde{H}^{\uparrow }_{\alpha }(A_1^n|B_1^n C_1^n E)_{\theta } = \frac{\alpha }{1-\alpha } \log \sum _{c_1^n} \theta (c_1^n) \exp _2 \left( \frac{1-\alpha }{\alpha }\tilde{H}^{\uparrow }_{\alpha }(A_1^n|B_1^n E)_{\theta _{|c_1^n}} \right) . \end{aligned}$$

We will show that for a given \(c_1^n\), the conditional entropy \(\tilde{H}^{\uparrow }_{\alpha }(A_1^n|B_1^n E)_{\theta _{|c_1^n}}\) accumulates whenever the “good” map \({{\,\mathrm{\mathcal {M}}\,}}'_k\) is used and loses some entropy for the rounds where the “bad” map \({{\,\mathrm{\mathcal {M}}\,}}_k\) is used. The fact that, with large probability, \(c_1^n\) contains far more 1s than 0s then allows us to prove a lower bound on \(\tilde{H}^{\uparrow }_{\alpha }(A_1^n|B_1^n C_1^n E)_{\theta }\).

Claim 5.5

Define \(h_k:= \inf _{\omega } \tilde{H}^{\downarrow }_{\alpha }(A_k | B_k \tilde{R}_{k-1})_{{{\,\mathrm{\mathcal {M}}\,}}'_k(\omega )}\) where the infimum is over all states \(\omega _{R_{k-1} \tilde{R}_{k-1}}\) for a register \(\tilde{R}_{k-1}\), which is isomorphic to \(R_{k-1}\), and \(s:= \log (|A||B|^2)\). Then, we have

$$\begin{aligned} \tilde{H}^{\uparrow }_{\alpha }(A_1^n|B_1^n E)_{\theta _{|c_1^n}} \ge \sum _{k=1}^n \left( \delta ({c_k,1}) h_k - \delta ({c_k,0})s \right) \end{aligned}$$
(74)

where \(\delta ({x,y})\) is the Kronecker delta function (\(\delta ({x,y})=1\) if \(x=y\) and 0 otherwise).

Proof

We will prove the statement

$$\begin{aligned} \tilde{H}^{\uparrow }_{\alpha }(A_1^k|B_1^k E)_{\theta _{|c_1^k}} \ge \tilde{H}^{\uparrow }_{\alpha }(A_1^{k-1}|B_1^{k-1} E)_{\theta _{|c_1^{k-1}}} + \left( \delta ({c_k,1}) h_k - \delta ({c_k,0})s \right) \end{aligned}$$

then the claim will follow inductively. We will consider two cases: \(c_k=0\) and \(c_k =1\). First suppose \(c_k=0\); then \(\theta _{A_1^k B_1^k E | c_1^k} = {{\,\textrm{tr}\,}}_{R_k} \circ {{\,\mathrm{\mathcal {M}}\,}}_k^{R_{k-1} \rightarrow R_{k} A_k B_k} \left( \theta _{R_{k-1} A_1^{k-1} B_1^{k-1} E| c_1^{k}} \right) \). In this case, we have

$$\begin{aligned} \tilde{H}^{\uparrow }_{\alpha }(A_1^k|B_1^k E)_{\theta _{|c_1^k}}&\ge \tilde{H}^{\uparrow }_{\alpha }(A_1^{k-1}|B_1^k E)_{\theta _{|c_1^k}} - \log |A| \\&\ge \tilde{H}^{\uparrow }_{\alpha }(A_1^{k-1}|B_1^{k-1} E)_{\theta _{|c_1^k}} - \log \left( |A||B|^2 \right) \\&= \tilde{H}^{\uparrow }_{\alpha }(A_1^{k-1}|B_1^{k-1} E)_{\theta _{|c_1^{k-1}}} - s \end{aligned}$$

where in the first line we have used the dimension bound in Lemma D.1, in the second line we have used the dimension bound in Lemma D.3 and in the last line we have used \(\theta _{A_1^{k-1}B_1^{k-1} E|c_1^k} = \theta _{A_1^{k-1}B_1^{k-1} E|c_1^{k-1}}\) (the bit \(C_k\) is sampled independently of the past, so conditioning on it does not change the state of the previous rounds).

Now, suppose that \(c_k =1\). In this case, we have that \(\theta _{A_1^k B_1^k E | c_1^k} = {{\,\textrm{tr}\,}}_{R_k} \circ {{\,\mathrm{\mathcal {M}}\,}}_k' \left( \theta _{R_{k-1} A_1^{k-1} B_1^{k-1} E| c_1^{k}} \right) \). Since \(\theta _{R_{k-1} A_1^{k-1} B_1^{k-1} E| c_1^{k}} = \Phi _{k-1} \circ \Phi _{k-2} \circ \cdots \circ \Phi _{1}(\rho ^{(0)}_{R_0 E})\), where each \(\Phi _i \in \{{{\,\mathrm{\mathcal {M}}\,}}_i, {{\,\mathrm{\mathcal {M}}\,}}'_i\}\), the hypothesis of the theorem implies that the state \(\theta _{A_1^k B_1^k E | c_1^k}\) satisfies the Markov chain

$$\begin{aligned} A_1^{k-1} \leftrightarrow B_1^{k-1} E \leftrightarrow B_k. \end{aligned}$$

Now, using Corollary C.5 (the \( \tilde{H}^{\uparrow }_{\alpha }\) counterpart for [8, Corollary 3.5], which is the main chain rule used for proving entropy accumulation), we have

$$\begin{aligned} \tilde{H}^{\uparrow }_{\alpha }(A_1^k|B_1^k E)_{\theta _{|c_1^k}}&\ge \tilde{H}^{\uparrow }_{\alpha }(A_1^{k-1}|B_1^{k-1} E)_{\theta _{|c_1^k}} + \inf _{\omega } \tilde{H}^{\downarrow }_{\alpha }(A_k | B_k \tilde{R}_{k-1})_{{{\,\mathrm{\mathcal {M}}\,}}'_k(\omega )} \\&= \tilde{H}^{\uparrow }_{\alpha }(A_1^{k-1}|B_1^{k-1} E)_{\theta _{|c_1^{k-1}}} + h_k \end{aligned}$$

where in the last line we have again used \(\theta _{A_1^{k-1}B_1^{k-1} E|c_1^k} = \theta _{A_1^{k-1}B_1^{k-1} E|c_1^{k-1}}\). Combining these two cases, we have

$$\begin{aligned} \tilde{H}^{\uparrow }_{\alpha }(A_1^k|B_1^k E)_{\theta _{|c_1^k}} \ge \tilde{H}^{\uparrow }_{\alpha }(A_1^{k-1}|B_1^{k-1} E)_{\theta _{|c_1^{k-1}}} + \left( \delta ({c_k,1}) h_k - \delta ({c_k,0})s \right) . \end{aligned}$$
(75)

Using this bound n times starting with \(\tilde{H}^{\uparrow }_{\alpha }(A_1^n|B_1^n E)_{\theta _{|c_1^n}}\) gives us the bound required in the claim. \(\square \)

For the sake of clarity, let \(l_k(c_k):= \left( \delta ({c_k,1}) h_k - \delta ({c_k,0})s \right) \). We now evaluate

$$\begin{aligned} \sum _{c_1^n} \theta (c_1^n) \exp _2 \left( \frac{1-\alpha }{\alpha }\tilde{H}^{\uparrow }_{\alpha }(A_1^n|B_1^n E)_{\theta _{|c_1^n}} \right)&\le \sum _{c_1^n} \theta (c_1^n) \exp _2 \left( \frac{1-\alpha }{\alpha }\sum _{k=1}^n l_k(c_k) \right) \nonumber \\&= \sum _{c_1^n} \prod _{k=1}^n \theta (c_k) 2^{\frac{1-\alpha }{\alpha }l_k(c_k)}\nonumber \\&= \prod _{k=1}^n \sum _{c_k} \theta (c_k) 2^{\frac{1-\alpha }{\alpha }l_k(c_k)}. \end{aligned}$$
(76)

Then, we have

$$\begin{aligned} \tilde{H}^{\uparrow }_{\alpha }(A_1^n|B_1^n C_1^n E)_{\theta }&= \frac{\alpha }{1-\alpha } \log \sum _{c_1^n} \theta (c_1^n) \exp _2 \left( \frac{1-\alpha }{\alpha }\tilde{H}^{\uparrow }_{\alpha }(A_1^n|B_1^n E)_{\theta _{|c_1^n}} \right) \nonumber \\&\ge \frac{\alpha }{1-\alpha } \sum _{k=1}^n \log \sum _{c_k} \theta (c_k) 2^{\frac{1-\alpha }{\alpha }l_k(c_k)} \nonumber \\&= \frac{\alpha }{1-\alpha } \sum _{k=1}^n \log \left( (1-\delta ) 2^{\frac{1-\alpha }{\alpha }h_k} + \delta 2^{-\frac{1-\alpha }{\alpha }s} \right) \nonumber \\&= \sum _{k=1}^n h_k - \frac{\alpha }{\alpha -1} \sum _{k=1}^n \log \left( 1-\delta + \delta 2^{\frac{\alpha -1}{\alpha }(s+ h_k)} \right) \nonumber \\&\ge \sum _{k=1}^n h_k - \frac{\alpha }{\alpha -1} n \log \left( 1+ \delta \left( 2^{\frac{\alpha -1}{\alpha }(s+ \log |A|)}-1 \right) \right) \end{aligned}$$
(77)

where in the second line we have used Eq. 76 and in the last line we have used the fact that \(h_k \le \log |A|\) for all \(k \in [n]\).
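To see how the last term of Eq. 77 behaves, the following sketch evaluates the per-round penalty \(\frac{\alpha }{\alpha -1} \log \left( 1+ \delta \left( 2^{\frac{\alpha -1}{\alpha }(s+ \log |A|)}-1 \right) \right) \) for binary registers; the chosen values of \(\alpha \) and \(\delta \) are illustrative only.

```python
import math

def penalty(alpha: float, delta: float, dA: int = 2, dB: int = 2) -> float:
    """Per-round loss from the delta-weighted 'bad' rounds in Eq. 77 (bits)."""
    s = math.log2(dA * dB ** 2)                     # s = log(|A||B|^2)
    expo = (alpha - 1) / alpha * (s + math.log2(dA))
    return alpha / (alpha - 1) * math.log2(1 + delta * (2 ** expo - 1))

for delta in (1e-2, 1e-4, 1e-6):
    print(f"delta={delta:.0e}: penalty ~ {penalty(1.05, delta):.6f} bits/round")
```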

We restricted the choice of \(\alpha \) to the region \(\left( 1, 1+ \frac{1}{\log (1+2 |A|)} \right) \) in the theorem, so that we can now use [8, Lemma B.9] to transform the above to

$$\begin{aligned}&\tilde{H}^{\uparrow }_{\alpha }(A_1^n|B_1^n C_1^n E)_{\theta } \ge \sum _{k=1}^n \inf _{\omega _{R_{k-1} \tilde{R}_{k-1}}} H(A_k | B_k \tilde{R}_{k-1})_{{{\,\mathrm{\mathcal {M}}\,}}'_k(\omega )} - n(\alpha -1) \log ^2(1+ 2 |A|) \nonumber \\&\quad - \frac{\alpha }{\alpha -1} n \log \left( 1 + \delta \left( 2^{\frac{\alpha -1}{\alpha }2\log (|A||B|)}-1 \right) \right) . \end{aligned}$$
(78)

Putting Eqs. 72, 73, and 78 together, we have

$$\begin{aligned}&H_{\min }^{\epsilon _1+\epsilon _2}(A_1^n|B_1^n E)_{\rho } \ge \sum _{k=1}^n \inf _{\omega _{R_{k-1} \tilde{R}_{k-1}}} H(A_k | B_k \tilde{R}_{k-1})_{{{\,\mathrm{\mathcal {M}}\,}}'_k(\omega )} - n(\alpha -1) \log ^2(1+ 2 |A|) \\&\quad - \frac{\alpha }{\alpha -1} n \log \left( 1 + \delta \left( 2^{\frac{\alpha -1}{\alpha }2\log (|A||B|)}-1 \right) \right) \\&\quad - \frac{\alpha }{\alpha -1} n z_{\beta }(\epsilon , \delta ) - \frac{1}{\alpha -1}\left( g_1(\epsilon _2, \epsilon _1)+ \frac{\alpha g_0(\epsilon _1)}{\beta -1} \right) . \end{aligned}$$

\(\square \)

5.4 Testing

We will now incorporate testing into the approximate entropy accumulation theorem proven in Theorem 5.1. For this purpose, we follow [9], which is itself based on [35]. Testing enables one to prove a lower bound on the smooth min-entropy of a state produced by the process in Fig. 2 conditioned on a classical event observed during the process. This is particularly useful for proving tight and practical bounds in cryptographic protocols.

In this section, we will consider the channels \({{\,\mathrm{\mathcal {M}}\,}}_k\) and \({{\,\mathrm{\mathcal {M}}\,}}'_k\) which map registers \(R_{k-1}\) to \(A_k B_k X_k R_k\) such that \(X_k\) is a classical value which is determined using the registers \(A_k\) and \(B_k\). Concretely, suppose that for every k, there exists a channel \(\mathcal {T}_k: A_k B_k \rightarrow A_k B_k X_k\) of the form

$$\begin{aligned} \mathcal {T}_k (\omega _{A_k B_k}) = \sum _{y, z} \Pi _{A_k}^y \otimes \Pi _{B_k}^z \omega _{A_k B_k} \Pi _{A_k}^y \otimes \Pi _{B_k}^z \otimes |{f(y,z)}\rangle \langle {f(y,z)}|_{X_k} \end{aligned}$$
(79)

where \(\{\Pi _{A_k}^y\}_y\) and \(\{\Pi _{B_k}^z\}_z\) are orthogonal projectors and f is some deterministic function which uses the measurement outcomes y and z to create the output register \(X_k\).
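To make the form of Eq. 79 concrete, here is a small numerical sketch of such a testing channel for two qubits, with computational-basis projectors and f recording whether the two outcomes agree; the input state and the projector families are arbitrary choices for illustration.

```python
import numpy as np

def testing_channel(omega, projs_A, projs_B, f):
    """Apply a channel of the form of Eq. 79 to a state omega on A_k ⊗ B_k.

    projs_A, projs_B: lists of orthogonal projectors summing to identity.
    f: deterministic function (y, z) -> classical symbol for X_k.
    Returns a dict mapping each symbol x to the (unnormalised) block, so the
    output state is sum_x block[x] ⊗ |x><x|_{X_k}.
    """
    blocks = {}
    for y, Pa in enumerate(projs_A):
        for z, Pb in enumerate(projs_B):
            Pyz = np.kron(Pa, Pb)
            x = f(y, z)
            blocks[x] = blocks.get(x, 0) + Pyz @ omega @ Pyz
    return blocks

# toy example: measure both qubits in the computational basis,
# X_k records whether the two outcomes agree
e0, e1 = np.diag([1.0, 0.0]), np.diag([0.0, 1.0])
phi = np.ones((4, 4)) / 4               # |+>|+> as a density matrix
blocks = testing_channel(phi, [e0, e1], [e0, e1], lambda y, z: int(y == z))
for x, blk in blocks.items():
    print(f"X_k = {x}: probability {np.trace(blk).real:.2f}")
```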

In order to define the min-tradeoff functions, we let \(\mathbb {P}\) be the set of probability distributions over the alphabet of the X registers. Let R be any register isomorphic to \(R_{k-1}\). For a probability distribution \(q \in \mathbb {P}\) and a channel \({{\,\mathrm{\mathcal {N}}\,}}_{k}: R_{k-1} \rightarrow A_k B_k X_k R_k\), we also define the set

$$\begin{aligned} \Sigma _k (q | {{\,\mathrm{\mathcal {N}}\,}}_{k}) := \left\{ \nu _{A_k B_k X_k R_k R} = {{\,\mathrm{\mathcal {N}}\,}}_{k}(\omega _{R_{k-1}R}): \text { for a state } \omega _{R_{k-1}R} \text { such that }\nu _{X_k}= q \right\} . \end{aligned}$$
(80)

Definition 5.6

A function \(f: \mathbb {P} \rightarrow \mathbb {R}\) is called a min-tradeoff function for the channels \(\{{{\,\mathrm{\mathcal {N}}\,}}_k \}_{k=1}^n\) if for every \(k \in [n]\), it satisfies

$$\begin{aligned} f(q) \le \inf _{\nu \in \Sigma _k (q| {{\,\mathrm{\mathcal {N}}\,}}_{k})} H(A_k | B_k R)_{\nu }. \end{aligned}$$
(81)

We will also need the definitions of the following simple properties of the min-tradeoff functions for our entropy accumulation theorem:

$$\begin{aligned}&\text {Max}(f) := \max _{q \in \mathbb {P}} f(q) \end{aligned}$$
(82)
$$\begin{aligned}&\text {Min}(f) := \min _{q \in \mathbb {P}} f(q)\end{aligned}$$
(83)
$$\begin{aligned}&\text {Min}_{\Sigma }(f) := \min _{q: \Sigma (q) \ne \varnothing } f(q)\end{aligned}$$
(84)
$$\begin{aligned}&\text {Var}(f) := \max _{q: \Sigma (q) \ne \varnothing } \sum _{x} q(x)f(\delta _x)^2 - \left( \sum _{x} q(x)f(\delta _x) \right) ^2 \end{aligned}$$
(85)

where \(\Sigma (q):= \bigcup _k \Sigma _k(q)\) and \(\delta _x\) is the distribution with unit weight on the symbol x.
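For an affine min-tradeoff function, these quantities can be computed from the values \(f(\delta _x)\) alone. The sketch below does this for a hypothetical f on a ternary alphabet; it upper bounds \(\text {Var}(f)\) by ignoring the feasibility restriction \(\Sigma (q) \ne \varnothing \) in Eq. 85.

```python
import numpy as np

# values f(delta_x) of a hypothetical affine min-tradeoff function; an
# affine f is determined by its values on the point distributions delta_x
v = np.array([0.10, 0.45, 0.80])

Max, Min = v.max(), v.min()             # Eqs. 82 and 83
# Dropping the restriction to feasible q (Eq. 85 maximises only over q with
# Sigma(q) nonempty), the variance of f(delta_X) under q is largest for the
# 50/50 mixture of the argmax and argmin symbols (Popoviciu's inequality):
Var_upper = (Max - Min) ** 2 / 4
print(f"Max(f) = {Max}, Min(f) = {Min}, Var(f) <= {Var_upper:.4f}")
```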

Theorem 5.7

For \(k \in [n]\), let the registers \(A_k\) and \(B_k\) be such that \(|A_k| = |A|\) and \(|B_k| = |B|\). For \(k \in [n]\), let \({{\,\mathrm{\mathcal {M}}\,}}_k\) be channels from \(R_{k-1} \rightarrow R_{k} A_k B_k X_k\) and

$$\begin{aligned} \rho _{A_1^n B_1^n X_1^n E} = {{\,\textrm{tr}\,}}_{R_n} \circ {{\,\mathrm{\mathcal {M}}\,}}_n \circ \cdots \circ {{\,\mathrm{\mathcal {M}}\,}}_1 (\rho ^{(0)}_{R_0 E}) \end{aligned}$$
(86)

be the state produced by applying these maps sequentially. Further, let \({{\,\mathrm{\mathcal {M}}\,}}_k\) be such that \({{\,\mathrm{\mathcal {M}}\,}}_k = \mathcal {T}_k \circ {{\,\mathrm{\mathcal {M}}\,}}^{(0)}_k\) for \(\mathcal {T}_k\) defined in Eq. 79 and some channel \({{\,\mathrm{\mathcal {M}}\,}}^{(0)}_k: R_{k-1} \rightarrow R_{k} A_k B_k\). Suppose the channels \({{\,\mathrm{\mathcal {M}}\,}}_k\) are such that for every \(k \in [n]\), there exists a channel \({{\,\mathrm{\mathcal {M}}\,}}'_{k}\) from \(R_{k-1} \rightarrow R_{k} A_k B_k X_k\) such that

  1.

    \({{\,\mathrm{\mathcal {M}}\,}}'_k = \mathcal {T}_k \circ {{\,\mathrm{\mathcal {M}}\,}}^{\prime (0)}_k\) for some channel \({{\,\mathrm{\mathcal {M}}\,}}^{\prime (0)}_k: R_{k-1} \rightarrow R_{k} A_k B_k\).

  2.

    \({{\,\mathrm{\mathcal {M}}\,}}'_{k}\) \(\epsilon \)-approximates \({{\,\mathrm{\mathcal {M}}\,}}_{k}\) in the diamond norm:

    $$\begin{aligned} \frac{1}{2}\left\| {{\,\mathrm{\mathcal {M}}\,}}_k- {{\,\mathrm{\mathcal {M}}\,}}'_k \right\| _{\diamond } \le \epsilon \end{aligned}$$
    (87)
  3.

    For every choice of a sequence of channels \({{\,\mathrm{\mathcal {N}}\,}}_i \in \{ {{\,\mathrm{\mathcal {M}}\,}}_i, {{\,\mathrm{\mathcal {M}}\,}}'_i \}\) for \(i \in [k-1]\), the state \({{\,\mathrm{\mathcal {M}}\,}}'_k \circ {{\,\mathrm{\mathcal {N}}\,}}_{k-1} \circ \cdots \circ {{\,\mathrm{\mathcal {N}}\,}}_1 (\rho ^{(0)}_{R_0 E})\) satisfies the Markov chain

    $$\begin{aligned} A_1^{k-1} \leftrightarrow B_1^{k-1}E \leftrightarrow B_k. \end{aligned}$$
    (88)

Then, for an event \(\Omega \) defined using \(X_1^n\), an affine min-tradeoff function f for \(\{{{\,\mathrm{\mathcal {M}}\,}}'_k\}_{k=1}^n\) such that for every \(x_1^n \in \Omega \), \(f(\text {freq}(x_1^n)) \ge h\), for parameters \(0<\delta , \epsilon _1, \epsilon _3<1\) and \(\epsilon _2:= 2\sqrt{\frac{\epsilon _1}{P_{\rho }(\Omega )}}\) such that \(\epsilon _2 + \epsilon _3 <1\), \(\alpha \in \left( 1,2 \right) \), and \(\beta >1\), we have

$$\begin{aligned}&H_{\min }^{\epsilon _2+\epsilon _3}(A_1^n|B_1^n E)_{\rho _{|\Omega }}\nonumber \\&\quad \ge nh - \frac{(\alpha -1)\ln (2)}{2}\left( \log (2 |A|^2 + 1) + \sqrt{2+ \text {Var}(f)} \right) ^2 - n(\alpha -1)^2 K_\alpha \nonumber \\&\qquad - \frac{\alpha }{\alpha -1} n \log \left( 1 + \delta \left( 4^{\frac{\alpha -1}{\alpha }(\log (|A||B|) + \text {Max}(f) - \text {Min}(f)+1)}-1 \right) \right) \nonumber \\&\qquad - \frac{\alpha }{\alpha -1} n z_{\beta }(\epsilon , \delta ) - \frac{1}{\alpha -1}\left( \alpha \log \frac{1}{P_{{\rho }}(\Omega ) - \epsilon _1}+ g_1(\epsilon _3, \epsilon _2)+ \frac{\alpha g_0(\epsilon _1)}{\beta -1} \right) . \end{aligned}$$
(89)

where

$$\begin{aligned}&z_\beta (\epsilon , \delta ) := \frac{\beta +1}{\beta -1}\log \left( \left( 1+ \sqrt{(1-\delta )\epsilon } \right) ^{\frac{\beta }{\beta +1}} + \left( \frac{\sqrt{(1-\delta )\epsilon }}{\delta ^\beta } \right) ^{{\frac{1}{\beta +1}}} \right) \end{aligned}$$
(90)
$$\begin{aligned}&K_\alpha := \frac{1}{6(2-\alpha )^3 \ln (2)} 2^{(\alpha -1)\left( 2\log |A| + (\text {Max}(f) - \text {Min}_{\Sigma }(f)) \right) }\nonumber \\&\quad \ln ^3\left( 2^{\left( 2\log |A| + (\text {Max}(f) - \text {Min}_{\Sigma }(f)) \right) } + e^2 \right) \end{aligned}$$
(91)

and \(g_1(x,y) = - \log (1- \sqrt{1-x^2}) - \log (1-y^2)\).

Proof

Just as in the proof of Theorem 5.1, we define

$$\begin{aligned} {{\,\mathrm{\mathcal {M}}\,}}^\delta _k := (1- \delta ) {{\,\mathrm{\mathcal {M}}\,}}'_k + \delta {{\,\mathrm{\mathcal {M}}\,}}_k \end{aligned}$$
(92)

for every k and the state

$$\begin{aligned} \sigma _{A_1^n B_1^n X_1^n E} := {{\,\mathrm{\mathcal {M}}\,}}^\delta _n \circ \cdots \circ {{\,\mathrm{\mathcal {M}}\,}}^\delta _1(\rho ^{(0)}_{R_0 E}). \end{aligned}$$
(93)

so that for \(\beta >1\) and \(\epsilon _1 >0\), we have

$$\begin{aligned} D^{\epsilon _1}_{\max }(\rho _{A_1^n B_1^n X_1^n E}||\sigma _{A_1^n B_1^n X_1^n E})&\le n z_\beta (\epsilon , \delta ) + \frac{g_0(\epsilon _1)}{\beta -1}. \end{aligned}$$
(94)

Define \(d_{\beta }:= n z_\beta (\epsilon , \delta ) + \frac{g_0(\epsilon _1)}{\beta -1}\). The bound above implies that there exists a state \(\tilde{\rho }_{A_1^n B_1^n X_1^n E}\), which is also classical on \(X_1^n\) such that

$$\begin{aligned} P\left( \rho _{A_1^n B_1^n X_1^n E}, \tilde{\rho }_{A_1^n B_1^n X_1^n E} \right) \le \epsilon _1 \end{aligned}$$
(95)

and

$$\begin{aligned} \tilde{\rho }_{A_1^n B_1^n X_1^n E} \le 2^{d_\beta } \sigma _{A_1^n B_1^n X_1^n E}. \end{aligned}$$
(96)

The registers \(X_1^n\) for \(\tilde{\rho }\) can be chosen to be classical, since the channel measuring \(X_1^n\) only decreases the distance between \(\tilde{\rho }\) and \(\rho \), and the new state produced would also satisfy Eq. 96. As the registers \(X_1^n\) are classical for both \(\sigma \) and \(\tilde{\rho }\), we can condition these states on the event \(\Omega \). We denote the probability of the event \(\Omega \) under the states \(\sigma \) and \(\tilde{\rho }\) by \(P_{\sigma }(\Omega )\) and \(P_{\tilde{\rho }}(\Omega )\), respectively. Using Lemma G.1 and the Fuchs-van de Graaf inequality, we have

$$\begin{aligned} P\left( \rho _{A_1^n B_1^n X_1^n E|\Omega } , \tilde{\rho }_{A_1^n B_1^n X_1^n E|\Omega } \right) \le 2 \sqrt{\frac{\epsilon _1}{P_{{\rho }}(\Omega )}}. \end{aligned}$$
(97)

Conditioning Eq. 96 on \(\Omega \), we get

$$\begin{aligned} P_{\tilde{\rho }}(\Omega ) \tilde{\rho }_{A_1^n B_1^n X_1^n E|\Omega } \le 2^{d_\beta } P_{\sigma }(\Omega ) \sigma _{A_1^n B_1^n X_1^n E|\Omega }. \end{aligned}$$
(98)

Together, the above two equations imply that

$$\begin{aligned} D^{\epsilon _2}_{\max }(\rho _{A_1^n B_1^n X_1^n E|\Omega } ||\sigma _{A_1^n B_1^n X_1^n E|\Omega }) \le n z_\beta (\epsilon , \delta ) + \frac{g_0(\epsilon _1)}{\beta -1} + \log \frac{P_{\sigma }(\Omega )}{P_{\tilde{\rho }}(\Omega )} \end{aligned}$$
(99)

for \(\epsilon _2:= 2\sqrt{\frac{\epsilon _1}{P_{{\rho }}(\Omega )}}\).

For \(\epsilon _3 >0\) and \(\alpha \in (1, 2)\), we can plug the above in the bound provided by Lemma 3.5 to get

$$\begin{aligned} H_{\min }^{\epsilon _2+\epsilon _3}(A_1^n|B_1^n E)_{\rho _{|\Omega }}&\ge \tilde{H}^{\uparrow }_{\alpha }(A_1^n|B_1^n E)_{\sigma _{|\Omega }} - \frac{\alpha }{\alpha -1} n z_{\beta }(\epsilon , \delta ) \nonumber \\&\quad - \frac{1}{\alpha -1}\left( \alpha \log \frac{P_{\sigma }(\Omega )}{P_{\tilde{\rho }}(\Omega )}+ g_1(\epsilon _3, \epsilon _2)+ \frac{\alpha g_0(\epsilon _1)}{\beta -1} \right) . \end{aligned}$$
(100)

Now, note that using Eq. 79 and [8, Lemma B.7] we have

$$\begin{aligned} \tilde{H}^{\uparrow }_{\alpha }(A_1^n|B_1^n E)_{\sigma _{|\Omega }} = \tilde{H}^{\uparrow }_{\alpha }(A_1^n X_1^n|B_1^n E)_{\sigma _{|\Omega }}. \end{aligned}$$
(101)

For every k, we introduce a register \(D_k\) of dimension \(|D_k| = \lceil 2^{\text {Max}(f)- \text {Min}(f)} \rceil \) and a channel \(\mathcal {D}_k: X_k \rightarrow X_k D_k\) as

$$\begin{aligned} \mathcal {D}_k(\omega ) := \sum _x \langle {x|\omega |x}\rangle |{x}\rangle \langle {x}|\otimes \tau _x \end{aligned}$$
(102)

where for every x, the state \(\tau _x\) is a mixture between a uniform distribution on \(\{1, 2, \ldots , \lfloor 2^{\text {Max}(f)- f(\delta _x)} \rfloor \}\) and a uniform distribution on \(\{1, 2, \ldots , \lceil 2^{\text {Max}(f)- f(\delta _x)} \rceil \}\), so that

$$\begin{aligned} H(D_k)_{\tau _x} = \text {Max}(f)- f(\delta _x) \end{aligned}$$
(103)

where \(\delta _x\) is the distribution with unit weight at element x.
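The mixture defining \(\tau _x\) can be computed explicitly. The following sketch finds the mixing weight by bisection so that the entropy in Eq. 103 is matched; the target value of 1.3 bits is an arbitrary illustration.

```python
import math

def shannon(p):
    return -sum(x * math.log2(x) for x in p if x > 0)

def tau(t: float):
    """Distribution tau_x with entropy t, as a mixture of the uniform
    distributions on {1..m} and {1..m+1}, where m = floor(2^t)."""
    m = math.floor(2 ** t)
    if m == 2 ** t:                        # t exactly achievable by U_m
        return [1 / m] * m
    def mix(lam):                          # lam * U_m + (1-lam) * U_{m+1}
        return [lam / m + (1 - lam) / (m + 1)] * m + [(1 - lam) / (m + 1)]
    lo, hi = 0.0, 1.0                      # H runs from log(m+1) down to log m
    for _ in range(60):                    # bisection on the mixing weight
        mid = (lo + hi) / 2
        if shannon(mix(mid)) > t:
            lo = mid
        else:
            hi = mid
    return mix((lo + hi) / 2)

p = tau(1.3)                               # example: Max(f) - f(delta_x) = 1.3
print(f"|D_k| = {len(p)}, H(D_k) = {shannon(p):.6f} (target 1.3)")
```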

Define the channels \(\bar{\mathcal {M}}_k:= \mathcal {D}_k \circ {{\,\mathrm{\mathcal {M}}\,}}_k\), \(\bar{\mathcal {M}}'_k:= \mathcal {D}_k \circ {{\,\mathrm{\mathcal {M}}\,}}'_k\) and \(\bar{\mathcal {M}}^\delta _k:= \mathcal {D}_k \circ {{\,\mathrm{\mathcal {M}}\,}}^\delta _k = (1-\delta )\bar{\mathcal {M}}'_k + \delta \bar{\mathcal {M}}_k\) and the state

$$\begin{aligned} \bar{\sigma }_{A_1^n B_1^n X_1^n D_1^n E} := \bar{\mathcal {M}}^\delta _n \circ \cdots \circ \bar{\mathcal {M}}^\delta _1(\rho ^{(0)}_{R_0 E}) \end{aligned}$$
(104)

Note that \(\bar{\sigma }_{A_1^n B_1^n X_1^n E} = {\sigma }_{A_1^n B_1^n X_1^n E}\). Now, [9, Lemma 4.5] implies that this state satisfies

$$\begin{aligned} \tilde{H}^{\uparrow }_{\alpha }(A_1^n X_1^n|B_1^n E)_{{\sigma }_{|\Omega }}&= \tilde{H}^{\uparrow }_{\alpha }(A_1^n X_1^n|B_1^n E)_{\bar{\sigma }_{|\Omega }} \nonumber \\&\ge \tilde{H}^{\uparrow }_{\alpha }(A_1^n X_1^n D_1^n |B_1^n E)_{\bar{\sigma }_{|\Omega }} - \max _{x_1^n \in \Omega } H_{\alpha }(D_1^n)_{\bar{\sigma }_{|x_1^n}} \end{aligned}$$
(105)

For \(x_1^n \in \Omega \), we have

$$\begin{aligned} H_{\alpha }(D_1^n)_{\bar{\sigma }_{|x_1^n}}&\le H (D_1^n)_{\bar{\sigma }_{|x_1^n}} \nonumber \\&\le \sum _{k=1}^n H(D_k)_{\tau _{x_k}} \nonumber \\&= \sum _{k=1}^n \left( \text {Max}(f)- f(\delta _{x_k}) \right) \nonumber \\&= n \,\text {Max}(f) - n f(\text {freq}(x_1^n)) \nonumber \\&\le n \,\text {Max}(f) - n h. \end{aligned}$$
(106)

where the penultimate equality uses the affineness of f. We can get rid of the conditioning on the right-hand side of Eq. 105 by using [8, Lemma B.5]

$$\begin{aligned} \tilde{H}^{\uparrow }_{\alpha }(A_1^n X_1^n D_1^n |B_1^n E)_{\bar{\sigma }_{|\Omega }} \ge \tilde{H}^{\uparrow }_{\alpha }(A_1^n X_1^n D_1^n |B_1^n E)_{\bar{\sigma }} - \frac{\alpha }{\alpha -1}\log \frac{1}{P_{\sigma }(\Omega )}. \end{aligned}$$
(107)

We now show that the channels \(\bar{\mathcal {M}}'_k\) satisfy the second condition in Theorem 5.1. For an arbitrary \(k \in [n]\) and a sequence of channels \({{\,\mathrm{\mathcal {N}}\,}}_i \in \{\bar{\mathcal {M}}_i, \bar{\mathcal {M}}'_i\}\) for every \(1\le i < k\), let

$$\begin{aligned} \eta _{A_1^k B_1^k X_1^k D_1^k E} = \bar{\mathcal {M}}'_k \circ {{\,\mathrm{\mathcal {N}}\,}}_{k-1} \cdots \circ {{\,\mathrm{\mathcal {N}}\,}}_1 (\rho ^{(0)}_{R_0 E}). \end{aligned}$$

For this state, we have

$$\begin{aligned} I(A_1^{k-1} D_1^{k-1}: B_k | B_1^{k-1} E)_{\eta }&= I(A_1^{k-1} : B_k | B_1^{k-1} E)_{\eta } + I(D_1^{k-1}: B_k | A_1^{k-1} B_1^{k-1} E)_{\eta } \\&= 0 \end{aligned}$$

where \(I(A_1^{k-1}: B_k | B_1^{k-1} E)_{\eta } = 0\) because of the condition in Eq. 88, and \(I(D_1^{k-1}: B_k | A_1^{k-1} B_1^{k-1} E)_{\eta } = 0\) since \(X_1^{k-1}\) is determined by \(A_1^{k-1} B_1^{k-1}\) and \(D_1^{k-1}\) is generated from \(X_1^{k-1}\) using randomness independent of everything else. This implies that for this state \(A_1^{k-1} D_1^{k-1} \leftrightarrow B_1^{k-1} E \leftrightarrow B_k\). Thus, the maps \(\bar{\mathcal {M}}_k\) and \(\bar{\mathcal {M}}'_k\) satisfy the conditions required for applying Theorem 5.1. Specifically, we can use the bounds in Eqs. 73 and 77 for bounding the \(\alpha \)-Rényi conditional entropy in Eq. 107

$$\begin{aligned}&\tilde{H}^{\uparrow }_{\alpha }(A_1^n X_1^n D_1^n |B_1^n E)_{\bar{\sigma }} \nonumber \\&\quad \ge \tilde{H}^{\uparrow }_{\alpha }(A_1^n D_1^n |B_1^n E)_{\bar{\sigma }} \nonumber \\&\quad \ge \sum _{k=1}^n \inf _{\omega _{R_{k-1} \tilde{R}_{k-1}}} \tilde{H}^{\downarrow }_{\alpha }(A_k D_k | B_k \tilde{R}_{k-1})_{\bar{\mathcal {M}}'_k(\omega )}\nonumber \\&\quad - \frac{\alpha }{\alpha -1} n \log \left( 1 + \delta \left( 2^{\frac{\alpha -1}{\alpha }2\log (|A||D||B|)}-1 \right) \right) . \end{aligned}$$
(108)
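For intuition about the magnitude of the last term in Eq. 108, the following sketch evaluates the per-round penalty for a few values of \(\alpha \) and \(\delta \) (the register dimensions are illustrative assumptions). For \(\alpha \) close to 1 the penalty tends to \(2 \delta \log (|A||D||B|)\), i.e. it is \(O(\delta )\) with no blow-up from the \(\frac{\alpha }{\alpha -1}\) prefactor:

```python
import numpy as np

# Evaluate the per-round penalty from the last term of Eq. (108),
#   (alpha/(alpha-1)) * log2(1 + delta * (2**(2*(alpha-1)/alpha * L) - 1)),
# with L = log2(|A||D||B|). Dimensions are illustrative assumptions; for
# alpha close to 1 the penalty tends to 2 * delta * L, i.e. O(delta).
def penalty(alpha, delta, dA=2, dD=4, dB=2):
    L = np.log2(dA * dD * dB)
    exponent = 2 * (alpha - 1) / alpha * L
    return alpha / (alpha - 1) * np.log2(1 + delta * (2**exponent - 1))

for delta in (1e-2, 1e-4, 1e-6):
    for alpha in (1.001, 1.01, 1.1):
        print(f"delta={delta:.0e}, alpha={alpha}: {penalty(alpha, delta):.3e}")
```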

The analysis in the proof of [35, Proposition V.3] shows that the first term above can be bounded as

$$\begin{aligned}&\inf _{\omega _{R_{k-1} \tilde{R}_{k-1}}} \tilde{H}^{\downarrow }_{\alpha }(A_k D_k | B_k \tilde{R}_{k-1})_{\bar{\mathcal {M}}'_k(\omega )} \nonumber \\&\quad \ge \text {Max}(f) - \frac{(\alpha -1)\ln (2)}{2}\left( \log (2 |A|^2 + 1) + \sqrt{2+ \text {Var}(f)} \right) ^2 - (\alpha -1)^2 K_\alpha . \end{aligned}$$
(109)

Combining Eqs. 105, 106, 107, 108 and 109, we have

$$\begin{aligned}&\tilde{H}^{\uparrow }_{\alpha }(A_1^n X_1^n|B_1^n E)_{\bar{\sigma }_{|\Omega }} \nonumber \\&\quad \ge nh - \frac{(\alpha -1)\ln (2)}{2}\left( \log (2 |A|^2 + 1) + \sqrt{2+ \text {Var}(f)} \right) ^2 - n(\alpha -1)^2 K_\alpha \nonumber \\&\quad - \frac{\alpha }{\alpha -1} n \log \left( 1 + \delta \left( 2^{\frac{\alpha -1}{\alpha }2\log (|A||D||B|)}-1 \right) \right) - \frac{\alpha }{\alpha -1}\log \frac{1}{P_{\sigma }(\Omega )}. \end{aligned}$$
(110)

Plugging this into Eq. 100, we get

$$\begin{aligned}&H_{\min }^{\epsilon _2+\epsilon _3}(A_1^n|B_1^n E)_{\rho _{|\Omega }}\nonumber \\&\quad \ge nh - \frac{(\alpha -1)\ln (2)}{2}\left( \log (2 |A|^2 + 1) + \sqrt{2+ \text {Var}(f)} \right) ^2 - n(\alpha -1)^2 K_\alpha \nonumber \\&\quad - \frac{\alpha }{\alpha -1} n \log \left( 1 + \delta \left( 4^{\frac{\alpha -1}{\alpha }(\log (|A||B|) + \text {Max}(f) - \text {Min}(f)+1)}-1 \right) \right) \nonumber \\&\quad - \frac{\alpha }{\alpha -1} n z_{\beta }(\epsilon , \delta ) - \frac{1}{\alpha -1}\left( \alpha \log \frac{1}{P_{{\rho }}(\Omega ) - \epsilon _1}+ g_1(\epsilon _3, \epsilon _2)+ \frac{\alpha g_0(\epsilon _1)}{\beta -1} \right) , \end{aligned}$$
(111)

where we have used \(P_{\tilde{\rho }}(\Omega )\ge P_{{\rho }}(\Omega ) - \epsilon _1\) since \(\frac{1}{2}\left\| \rho - \tilde{\rho } \right\| _1 \le P(\rho ,\tilde{\rho })\le \epsilon _1\). Note that the probability of \(\Omega \) under the auxiliary state \(\sigma \) cancels out. \(\square \)
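To illustrate how one trades off \(\alpha \) in a bound of this shape, consider the following minimal numerical sketch. It keeps only the term linear in \(\alpha -1\) and the \(\frac{1}{\alpha -1}\) correction from Eq. 111; the constants \(V\), \(p\) and \(n\) are placeholders, not values derived from the theorem:

```python
import numpy as np

# Hedged sketch of the alpha trade-off: keeping only the term linear in
# (alpha - 1) and the 1/(alpha - 1) correction from Eq. (111), the
# per-round loss has the generic entropy-accumulation form
#   loss(alpha) = (alpha - 1) * V + log2(1/p) / ((alpha - 1) * n),
# with V, p and n placeholder constants (not values from the paper).
V, p, n = 10.0, 1e-6, 10**6

alphas = 1 + np.logspace(-6, -0.5, 400)
loss = (alphas - 1) * V + np.log2(1 / p) / ((alphas - 1) * n)
best = alphas[np.argmin(loss)]

# The numerical optimum matches the textbook choice
# alpha - 1 ~ sqrt(log2(1/p) / (n * V)).
print(best - 1, np.sqrt(np.log2(1 / p) / (n * V)), loss.min())
```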

5.5 Limitations and further improvements

As we pointed out previously, the dependence of the entropy loss per round on \(\epsilon \) in Theorem 5.1 is very poor (it behaves as \(\sim \epsilon ^{1/24}\)). The classical version of this theorem has a much better dependence of \(O(\sqrt{\epsilon })\) on \(\epsilon \) (see Theorem F.1). The reason for the poor performance of the quantum version is that our bound on the channel divergence (Lemma 5.4) is much weaker than the bound available classically. It should be noted, however, that if Lemma 5.4 were improved in the future, one could simply plug the new bound into our proof and obtain a corresponding improvement of Theorem 5.1.
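To get a concrete sense of this gap (pure illustration; constants omitted):

```python
# The gap between an eps**(1/24) rate and a sqrt(eps) rate:
for eps in (1e-6, 1e-12, 1e-24):
    print(f"eps={eps:.0e}: eps^(1/24) = {eps**(1/24):.3f}, sqrt(eps) = {eps**0.5:.1e}")
```

Even at \(\epsilon = 10^{-24}\), an \(\epsilon ^{1/24}\) loss per round is still about 0.1 (up to constants), whereas \(\sqrt{\epsilon }\) is already negligible at \(\epsilon = 10^{-6}\).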

A better bound on the channel divergence would have an additional benefit: it could simplify the proof and the Markov chain assumption in our theorem. In particular, it would be much easier to carry out the proof if the mixed channels \({{\,\mathrm{\mathcal {M}}\,}}^{\delta }_k\) were defined as \((1-\delta ){{\,\mathrm{\mathcal {M}}\,}}'_k + \delta \tau _{A_k B_k} \otimes {{\,\textrm{tr}\,}}_{A_k B_k}\circ {{\,\mathrm{\mathcal {M}}\,}}_k\) (which is what is done classically), where \(\tau _{A_k B_k}\) is the completely mixed state on the registers \(A_k B_k\). Here, instead of mixing the channel \({{\,\mathrm{\mathcal {M}}\,}}'_k\) with \({{\,\mathrm{\mathcal {M}}\,}}_k\), we mix it with \(\tau _{A_k B_k} \otimes {{\,\textrm{tr}\,}}_{A_k B_k}\circ {{\,\mathrm{\mathcal {M}}\,}}_k\), which also keeps \(D_{\max }({{\,\mathrm{\mathcal {M}}\,}}_k || {{\,\mathrm{\mathcal {M}}\,}}^{\delta }_k)\) small enough. Moreover, this definition ensures that the registers \(B_k\) produced by the map \({{\,\mathrm{\mathcal {M}}\,}}_k^\delta \) always satisfy the Markov chain conditions. If it were possible to show that the divergence between the real state \({{\,\mathrm{\mathcal {M}}\,}}_n \circ \cdots \circ {{\,\mathrm{\mathcal {M}}\,}}_1 (\rho ^{(0)}_{R_0 E})\) and the auxiliary state \({{\,\mathrm{\mathcal {M}}\,}}^\delta _n \circ \cdots \circ {{\,\mathrm{\mathcal {M}}\,}}^\delta _1 (\rho ^{(0)}_{R_0 E})\) is small for this definition of \({{\,\mathrm{\mathcal {M}}\,}}_k^\delta \), then one could directly use the entropy accumulation theorem to lower bound the entropy of the auxiliary state. We cannot do this in our proof, because this definition of the mixed channel \({{\,\mathrm{\mathcal {M}}\,}}^\delta _k\) also increases its distance from the original channel \({{\,\mathrm{\mathcal {M}}\,}}_k\) to \(\epsilon +2\delta \), which makes the upper bound in Lemma 5.3 large (it remains bounded away from zero even in the limit \(\epsilon \rightarrow 0\)).
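To see why this alternative mixing keeps the channel divergence small, here is a classical toy check (all dimensions and channels are randomly generated assumptions): since \({{\,\mathrm{\mathcal {M}}\,}}^\delta _k \ge \delta \, \tau _{A_k B_k} \otimes {{\,\textrm{tr}\,}}_{A_k B_k}\circ {{\,\mathrm{\mathcal {M}}\,}}_k\), one gets \(D_{\max }({{\,\mathrm{\mathcal {M}}\,}}_k || {{\,\mathrm{\mathcal {M}}\,}}^{\delta }_k) \le \log \frac{|A_k B_k|}{\delta }\) in the classical case.

```python
import numpy as np

# Classical toy check (illustrative dimensions and random channels) that
# mixing with the "replace A_k B_k by uniform" channel keeps the channel
# divergence small: since M_delta >= delta * (uniform on AB) x (tr_AB o M),
# and p(ab, r) <= p_R(r), we get D_max(M || M_delta) <= log2(d_ab / delta).
rng = np.random.default_rng(1)
d_in, d_ab, d_r, delta = 3, 4, 5, 0.01

M = rng.random((d_in, d_ab, d_r))
M /= M.sum(axis=(1, 2), keepdims=True)     # M(x) is a distribution on (AB, R)
Mp = rng.random((d_in, d_ab, d_r))         # plays the role of M'
Mp /= Mp.sum(axis=(1, 2), keepdims=True)

M_r = M.sum(axis=1, keepdims=True)         # tr_AB of M(x)
repl = np.ones((1, d_ab, 1)) / d_ab * M_r  # tau_AB (x) tr_AB o M
M_delta = (1 - delta) * Mp + delta * repl

d_max = np.log2((M / M_delta).max())       # classical channel D_max
print(d_max, np.log2(d_ab / delta))
assert d_max <= np.log2(d_ab / delta) + 1e-9
```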

It seems that it should be possible to weaken the assumptions needed for approximate entropy accumulation. The classical counterpart of this theorem (Theorem F.1), for instance, can be proven very easily and requires a much weaker approximation assumption. It would also be interesting if one could remove the “memory” registers \(R_k\) from the assumptions required for approximate entropy accumulation, since these are typically not accessible to the users in applications.

Another troubling feature of the approximate entropy accumulation theorem is that it assumes that the size of the side information registers \(B_k\) is constant. One might wonder whether this is necessary, since continuity bounds like the Alicki-Fannes-Winter (AFW) inequality do not depend on the size of the side information. It turns out that a bound on the size of the side information is indeed necessary in this case; we demonstrate this with a simple classical example in Appendix E. The necessity of such a bound also rules out a similar approximate extension of the generalised entropy accumulation theorem (GEAT).