Optimizing T Counts Given an Error Budget#

Quantum algorithms typically only allow us to estimate properties to within some additive error \(\epsilon\). For quantum phase estimation via Trotterization there are at least three sources of error [1]: Trotter errors (\(\Delta_{TS}\)), circuit synthesis errors (\(\Delta_{HT}\)), and phase estimation errors (\(\Delta_{PE}\)). Here we will focus on the Hubbard model but many features are similar for different Hamiltonian types.

1. Trotter Errors#

Given a \(p\)-th order product formula \(S_p(t)\) for the unitary implementing \(e^{-i t H}\) we have that

\[ \Delta_U \equiv \lVert S_p(t) - e^{-i t H}\rVert_{W_\eta} = \xi(\eta, u, \tau) t^{p+1} \]
for some constant \(\xi\) which depends on the parameters of the system. The constant \(\xi\) can be either computed (through direct evaluation and extrapolating from small system sizes) or bounded using complicated commutator expressions.

Ref. [1] showed that

\[ \Delta_{TS} t \equiv |E - E_{TS}|t \le \arctan\left(\Delta_U \frac{\sqrt{4-\Delta_U^2}}{2-\Delta_U^2}\right) = \Delta_U + \frac{\Delta_U^3}{24} + \mathcal{O}(\Delta_U^5), \]

so that as long as \(\Delta_U \gg \frac{\Delta_U^3}{24}\) we can estimate

\[ \Delta_{TS} = \Delta_U / t \approx \xi(\eta, u, \tau) t^{p} \]

2. Circuit Synthesis Errors#

Circuit synthesis errors account for the approximate implementation of single qubit \(R_z(\theta)\) rotations when compiled to Clifford+\(T\) gates. A single qubit rotation gate can be synthesized to \(\epsilon_R\) error using

\[ T_\mathrm{synth} \approx 1.15 \log_2(1/\epsilon_R) + 9.2 \]
\(T\) gates. Assuming (see Appendix D corollary 1) these errors add at most linearly to the estimated phase then the cost is

\[ N_{HT} = 1.15 \log_2 \left(\frac{N_R}{\Delta_{HT} t}\right) + 9.2 \]

\(T\) gates per single qubit rotation, where \(N_R\) is the number of rotations per Trotter step.

3. QPE Errors#

Phase estimation errors \(\Delta_{PE}\) arise due to not computing enough bits of the phase. Adaptive phase estimation allows one to reach a RMS error of \(\Delta_{PE} t\) using

\[ N_{PE} \approx \frac{0.76 \pi}{\Delta_{PE} t} \]

repetitions of the simulation circuit.

Optimizing the T counts#

We assume \(\epsilon \ge \Delta_{PE} + \Delta_{TS} + \Delta_{HT}\). There is some freedom for the relative weighting of these terms and we can minimize the \(T\) count which this constraint. The total \(T\) count comes from the Trotter step costs and the number of QPE repetitions \(N_{PE}\)

\[ N_{\mathrm{tot}} = N_{PE} (N_{R} N_{HT} + N_T) \]

where \(N_T\) is the number of “direct” \(T\) gates (i.e. those that appear in the circuit before synthesis).

If we write \(t = \left(\frac{\Delta_{TS}}{\xi}\right)^{1/p}\), then

\[ N_\mathrm{tot} \approx \frac{0.76 \pi \xi^{1/p}}{\Delta_{PE}\Delta_{TS}^{1/p}}\left(N_R \left[\log\left(\frac{N_R \xi^{1/p}}{\Delta_{HT}\Delta_{TS}^{1/p}}\right) + 9.2\right] + N_T\right) \]

Ref. [1] minimized this expression numerically. Qualtran should be able to produce this expression and perform the optimization.

Cost analysis#

Let us first reproduce the results from [2] for the PLAQ Hubbard splitting through direct minimization. We will then see how Qualtran can help with this analysis.

from typing import Tuple

import numpy as np
import sympy
import attrs

from qualtran.cirq_interop.t_complexity_protocol import TComplexity
from qualtran.resource_counting import get_cost_value, QECGatesCost


def t_and_rot_counts_from_bloq(bloq) -> Tuple[int, int]:
    costs = get_cost_value(bloq, QECGatesCost())
    n_rot = costs.rotation
    n_t = costs.total_t_count(ts_per_rotation=0)
    return n_t, n_rot


def timestep_from_params(delta_ts: float, xi: float, prod_ord: int) -> float:
    """Get the timestep from the product formula spectral norm error.

    Args
        delta_ts: The allowed Suzuki-Trotter error.
        xi: The constant factor for the Trotter-Suzuki spectral norm error.
        prod_ord: The product formula order.
    Returns:
        timestep: The computed timestep.
    """
    return (delta_ts / xi) ** (1.0 / prod_ord)


def get_single_rot_eps(n_rot: int, delta_ht: float, timestep: float) -> int:
    """Get the precision required for single qubit rotations given n_rot of them.

    Args:
        delta_ht: The allowed circuit synthesis error.
        n_rot: The number of rotations in the circuit.
        timestep: The timestep for trotterization.

    Returns:
        eps: the precision for single qubit rotations
    """
    return (delta_ht * timestep) / n_rot


def qpe_steps(delta_pe: float, timestep: float) -> int:
    """Get the number of QPE steps from the RMS error and timestep.

    Args:
        delta_pe: The desired adaptive phase estimation RMS error.
        timestep: The timestep value.

    Returns:
        n_qpe: The number of QPE steps.
    """
    return (0.76 * np.pi) / (delta_pe * timestep)


def rotation_cost(n_rot: int, delta_ht: float, timestep: float) -> int:
    """Get the rotation costs.

    Args:
        n_rot: The number of rotations in the circuit.
        delta_ht: The allowed circuit synthesis error.
        timestep: The timestep value.
    """
    rot_cost = TComplexity(rotations=n_rot).t_incl_rotations(
        get_single_rot_eps(n_rot, delta_ht, timestep)
    )
    return rot_cost


def qpe_t_count(
    delta_pe: float,
    delta_ts: float,
    delta_ht: float,
    n_rot: int,
    n_t: int,
    xi: float,
    prod_ord: int,
) -> int:
    """Compute the total number of T gates used for Trotterized QPE.

    Args:
        delta_pe: The allowed phase estimation error.
        delta_ts: The allowed Suzuki-Trotter error.
        delta_ht: The allowed circuit synthesis error.
        n_rot: The number of rotations in the circuit.
        n_t: The number of direct T gates (before synthesis).
        xi: The constant factor for the Trotter-Suzuki spectral norm error.
        prod_ord: The product formula order.

    Returns:
        tot_t_cost: The total number of T gates.
    """
    timestep = timestep_from_params(delta_ts, xi, prod_ord)
    rot_cost = rotation_cost(n_rot, delta_ht, timestep)
    n_qpe = qpe_steps(delta_pe, timestep)
    tot_t_cost = n_qpe * (rot_cost + n_t)
    return tot_t_cost
# Get some system parameters from Ref. [2]
# 8x8 lattice
length = 8
hubb_u = 4

# From Ref[2] table 1.
xi_bound = 5.3e2 
# Fig 2. from Ref[2] uses this extensive size error (we're targeting some energy per lattice site)
epsilon = 0.0051 * length**2
# Arbitrary splitting of error for comparison purposes
delta_ts = 0.5 * epsilon 
delta_pe = 0.45 * epsilon 
delta_ht = 0.05 * epsilon 
# using 2nd order Suzuki (Strang) splitting
prod_ord = 2
timestep = timestep_from_params(delta_ts, xi_bound, prod_ord)
print(f"Computed timestep: {timestep:4.3e}")
print(f"Sum Error budget terms: {delta_ts + delta_ht + delta_pe}")
print(f"Expected Error budget: {epsilon}")
Computed timestep: 1.755e-02
Sum Error budget terms: 0.3264
Expected Error budget: 0.3264

Let’s first check the fixed costs from the Trotter step match our expectations

from qualtran.bloqs.chemistry.trotter.hubbard.trotter_step import build_plaq_unitary_second_order_suzuki

trotter_step = build_plaq_unitary_second_order_suzuki(length, hubb_u, timestep, eps=1e-10)
n_t, n_rot = t_and_rot_counts_from_bloq(trotter_step)
print(f"N_T = {n_t} vs {(3*length**2 // 2)*8}")
print(f"N_rot = {n_rot} vs {(3 * length**2 + 2*length**2)}")
N_T = 768 vs 768
N_rot = 320 vs 320

Now let’s look at the total cost and the error we incurred with our default parameter choices.

from qualtran.drawing import show_call_graph
from qualtran.resource_counting.generalizers import generalize_rotation_angle 
# get appropriate epsilon given our input parameters now we know the number of rotations
eps_single_rot = get_single_rot_eps(n_rot, delta_ht, timestep)
print(f"Adjusted eps_single_rot: {eps_single_rot}")
tot_t_count = qpe_t_count(delta_pe, delta_ts, delta_ht, n_rot, n_t, xi_bound, prod_ord)
# This doesn't really matter right now because the cost is determined directly
# from the formula assuming we used an appropriate delta_ht.
# But let's show the call graph anyway to check the parameters all all what we expect.
updated_eps_bloqs = tuple(attrs.evolve(b, eps=eps_single_rot) for b in trotter_step.bloqs)
trotter_step = attrs.evolve(trotter_step, bloqs=updated_eps_bloqs)
trotter_step_g, _ = trotter_step.call_graph(generalizer=generalize_rotation_angle)
show_call_graph(trotter_step_g)
Adjusted eps_single_rot: 8.949367006180983e-07
../../../../_images/d71136462162092554201c8f8b4eb09f4a82f86b2790f8355900d04c8c17e50c.svg
tot_t_count = qpe_t_count(delta_pe, delta_ts, delta_ht, n_rot, n_t, xi_bound, prod_ord)
print(f"N_{{T_tot}} = {tot_t_count:4.3e} T gates.")
N_{T_tot} = 1.049e+07 T gates.

Visualization#

It’s helpful at this point to do some visualization of the error dependencies.

The \(T\) count varies slowly with \(\Delta_{HT}\) so we let’s pick a value an look at the dependence of the \(T\) counts on just \(\Delta_{TS}\) and \(\Delta_{PE}\).

import matplotlib.pyplot as plt
from matplotlib import cm

X = np.logspace(-2, -0.5, 20)
Y = np.logspace(-2, -0.5, 20)
X, Y = np.meshgrid(X, Y)
results = []
delta_ht = 1e-5
for x, y in zip(X, Y):
    for xval, yval in zip(x, y):
        results.append(np.log10(qpe_t_count(xval, yval, delta_ht, n_rot, n_t, xi_bound, prod_ord)))
fig, ax = plt.subplots(subplot_kw={"projection": "3d"})
surf = ax.plot_surface(X, Y, np.array(results).reshape(20, 20), cmap=cm.coolwarm,
                       linewidth=0, antialiased=False)
ax.set_xlabel(r'$\Delta_{TS}$')
ax.set_ylabel(r'$\Delta_{PE}$')
Text(0.5, 0.5, '$\\Delta_{PE}$')
../../../../_images/e928134fe7f084772f63509196767b4feb2fb7774d85c4a87a0ea26b2f4d01d2.png
fig, ax = plt.subplots()
surf = ax.contour(
    X, Y, np.array(results).reshape(20, 20), 25, cmap=cm.coolwarm, 
)
cs = ax.contour(X, Y, X + Y, 10, colors="k", linestyles="solid")
ax.clabel(cs, inline=True, fontsize=10)
clb = fig.colorbar(surf, shrink=0.5, aspect=5)
clb.ax.set_title(r'$\log(T_\mathrm{tot}$)')
cs = ax.contour(X, Y, X+Y, levels=[epsilon], linestyles='dotted')
ax.clabel(cs, inline=True, fontsize=10)
ax.set_xlabel(r'$\Delta_{TS}$')
ax.set_ylabel(r'$\Delta_{PE}$')
ax.set_title(f'Desired $\epsilon$ = 0.0051 $L^2$ = {epsilon}')
Text(0.5, 1.0, 'Desired $\\epsilon$ = 0.0051 $L^2$ = 0.3264')
../../../../_images/03c31b44545c92bb9987d8e2bfaaa27ac075ceb0a1aad78dad9521184ca0200d.png

We can plot the cost function a long the line \(\Delta_{HT} + \Delta_{PE} + \Delta_{TS} = \epsilon\).

delta_ht_vals = np.array([1e-4, 1e-3, 1e-2, 1e-1]) * epsilon
fig, ax = plt.subplots()
for dht in delta_ht_vals:
    delta_ts_vals = np.linspace(0.05*epsilon , epsilon - dht, 100)[:-1]
    t_vals = []
    for dts in delta_ts_vals:
        t_vals.append(qpe_t_count(epsilon - dts - dht, dts, dht, n_rot, n_t, xi_bound, prod_ord))
    ax.plot(delta_ts_vals, t_vals, label=fr'$\Delta_{{HT}} / \epsilon$ = {dht/epsilon:4.3e}')
plt.yscale('log')
plt.ylabel('$T$ count')
plt.legend()
plt.title(r'$\Delta_{PE} + \Delta_{TS} + \Delta_{HT} = \epsilon$')
plt.xlabel(r'$\Delta_{TS}/\epsilon$')
Text(0.5, 0, '$\\Delta_{TS}/\\epsilon$')
../../../../_images/1c976786685909ec5698927c2902b731d0d4f12153c2f44f9ddeea01c7cfc15d.png

We can perform a very naive two step constrained optimization along this line to find a minimum which looks to be around \(9\times10^6\) T gates.

from scipy.optimize import minimize
def objective(delta_ts, delta_ht, n_rot, n_t, xi_bound, prod_ord):
    t_counts = qpe_t_count(epsilon - delta_ts - delta_ht, delta_ts, delta_ht, n_rot, n_t, xi_bound, prod_ord)
    return t_counts

def inner_objective(delta_ht, n_rot, n_t, xi_bound, prod_ord):
    min_delta_ts = minimize(objective, x0=0.7*(epsilon-delta_ht), bounds=[(1e-3*epsilon, 0.9*(epsilon-delta_ht))], args=(delta_ht, n_rot, n_t, xi_bound, prod_ord)).x[0]
    return objective(min_delta_ts, delta_ht, n_rot, n_t, xi_bound, prod_ord), min_delta_ts

def minimize_linesearch(n_rot, n_t, xi_bound, prod_ord):
    res_min = np.inf
    dht_min = epsilon
    for dht in np.linspace(5e-3*epsilon, 1e-1*epsilon, 100):
        res, _ = inner_objective(dht, n_rot, n_t, xi_bound, prod_ord)
        if res < res_min:
            res_min = res
            dht_min = dht
    t_opt, delta_ts_opt = inner_objective(dht_min, n_rot, n_t, xi_bound, prod_ord)
    return dht_min, delta_ts_opt, epsilon - delta_ts_opt - dht_min, t_opt

delta_ht_opt, delta_ts_opt, delta_pe_opt, t_opt = minimize_linesearch(n_rot, n_t, xi_bound, prod_ord)
print(rf"\Delta_{{HT}} = {delta_ht_opt}")
print(rf"\Delta_{{TS}} = {delta_ts_opt}")
print(rf"\Delta_{{PE}} = {delta_pe_opt}")
print(rf"T_{{opt}} = {t_opt:4.3e}")
\Delta_{HT} = 0.012907636363636364
\Delta_{TS} = 0.11015760318768891
\Delta_{PE} = 0.20333476044867474
T_{opt} = 9.226e+06

Using Hamming Weight Phasing#

We can compare this cost to that found using Hamming weight phasing for the equal angle rotations.

from qualtran.bloqs.chemistry.trotter.hubbard.trotter_step import build_plaq_hwp_unitary_second_order_suzuki
trotter_step_hwp = build_plaq_hwp_unitary_second_order_suzuki(length, hubb_u, timestep, eps=1e-10)
n_t_hwp, n_rot_hwp = t_and_rot_counts_from_bloq(trotter_step_hwp)
print(f"N_T(HWP) = {n_t_hwp} vs {(3*length**2 // 2)*8}")
print(f"N_rot(HWP) = {n_rot_hwp} vs {(3 * length**2 + 2*length**2)}")
delta_ht_opt, delta_ts_opt, delta_pe_opt, t_opt = minimize_linesearch(n_rot_hwp, n_t_hwp, xi_bound, prod_ord)
print(rf"T_{{OPT}}(HWP) = {t_opt:4.3e}")
# The reference counts Toffolis as 2 T gates, we count them as 4.
print(rf"Reference value = {1.7e6 + 4 * 1.8e5:4.3e}")
N_T(HWP) = 2008 vs 768
N_rot(HWP) = 60 vs 320
T_{OPT}(HWP) = 3.071e+06
Reference value = 2.420e+06

Our value is slightly higher as we included all terms in the Trotter step. The reference only counts one layer of interaction gates. Let’s correct for that.

trotter_step_hwp = build_plaq_hwp_unitary_second_order_suzuki(length, hubb_u, timestep, eps=1e-10, strip_layer=True)
n_t_hwp, n_rot_hwp = t_and_rot_counts_from_bloq(trotter_step_hwp)
print(f"N_T(HWP) = {n_t_hwp}")
print(f"N_rot(HWP) = {n_rot_hwp}")
delta_ht_opt, delta_ts_opt, delta_pe_opt, t_opt = minimize_linesearch(n_rot_hwp, n_t_hwp, xi_bound, prod_ord)
print(rf"T_{{OPT}}(HWP) = {t_opt:4.3e}")
print(rf"Reference value = {1.7e6 + 4 * 1.8e5:4.3e}")
N_T(HWP) = 1760
N_rot(HWP) = 48
T_{OPT}(HWP) = 2.565e+06
Reference value = 2.420e+06

Symbolic T counts#

We can avoid doing any manipulation ourselves using sympy to represent the error expression.

s_eps_r, s_length, s_hubb_u, s_timestep, s_tau = sympy.symbols(r'\epsilon_{R}, L, u, t, \tau')
s_delta_ht, s_delta_ts, s_delta_pe, s_p, s_xi = sympy.symbols(
    '\Delta_{HT}, \Delta_{TS}, \Delta_{PE}, p, xi'
)
s_n_rot, s_n_t, s_n_pe = sympy.symbols('N_R, N_T, N_PE')

First let’s check the Bloq counts look correct for the Trotter step, there are two sources: rotations from the interaction and hopping bloq, and some direct T gates from the TwoBitFFFT gate.

from qualtran.resource_counting.t_counts_from_sigma import t_counts_from_sigma

s_trotter_step = build_plaq_unitary_second_order_suzuki(
    s_length, s_hubb_u, s_timestep, eps=s_eps_r, hubb_t=s_tau
)
t_counts, n_rot = t_and_rot_counts_from_bloq(s_trotter_step)
t_counts + rotation_cost(n_rot, s_delta_ht, s_timestep)
\[\displaystyle \left\lceil{\left(2 L^{2} + 6 \left\lfloor{\frac{L^{2}}{2}}\right\rfloor\right) \left\lceil{1.149 \operatorname{log}_{2}{\left(\frac{1.0 \cdot \left(2 L^{2} + 6 \left\lfloor{\frac{L^{2}}{2}}\right\rfloor\right)}{\Delta_{HT} t} \right)} + 9.2}\right\rceil}\right\rceil + 24 \left\lfloor{\frac{L^{2}}{2}}\right\rfloor\]
# check the symbolic counts match the expected counts
t_counts_orig, n_rot = t_and_rot_counts_from_bloq(trotter_step)
# for some reason substituting both at once leads to a precision error
t_counts = t_counts.evalf(subs={s_eps_r: eps_single_rot})
t_counts_symb = t_counts.evalf(subs={s_length: length})
assert t_counts_orig == t_counts_symb

Now let’s reproduce the expression for the total cost

s_timestep = (s_delta_ts / s_xi) ** (1 / s_p)
s_eps_r = (s_delta_ht * s_timestep) / s_n_rot
s_n_pe = 0.76 * sympy.pi / (s_delta_pe * s_timestep)
s_trotter_step = build_plaq_unitary_second_order_suzuki(
    s_length, s_hubb_u, s_timestep, eps=s_eps_r, hubb_t=s_tau
)
# just use this cost in lieu of a QPE bloq
# See: https://github.com/quantumlib/Qualtran/issues/932 this should be replaced by a real bloq.
s_t_counts, s_n_rot =  t_and_rot_counts_from_bloq(s_trotter_step)
qpe_cost_symb = s_n_pe * (s_t_counts + rotation_cost(s_n_rot, s_delta_ht, s_timestep))
qpe_cost_symb
\[\displaystyle \frac{0.76 \pi \left(\frac{\Delta_{TS}}{\xi}\right)^{- \frac{1}{p}} \left(\left\lceil{\left(2 L^{2} + 6 \left\lfloor{\frac{L^{2}}{2}}\right\rfloor\right) \left\lceil{1.149 \operatorname{log}_{2}{\left(\frac{1.0 \left(\frac{\Delta_{TS}}{\xi}\right)^{- \frac{1}{p}} \left(2 L^{2} + 6 \left\lfloor{\frac{L^{2}}{2}}\right\rfloor\right)}{\Delta_{HT}} \right)} + 9.2}\right\rceil}\right\rceil + 24 \left\lfloor{\frac{L^{2}}{2}}\right\rfloor\right)}{\Delta_{PE}}\]
symb_t_count = qpe_cost_symb.evalf(
    subs={
        s_length: length,
        s_delta_ht: delta_ht,
        s_delta_pe: delta_pe,
        s_delta_ts: delta_ts,
        s_xi: xi_bound,
        s_n_rot: n_rot,
    }
)
symb_t_count = symb_t_count.evalf(subs={s_p: prod_ord}, maxn=500)
tot_t_count = qpe_t_count(delta_pe, delta_ts, delta_ht, n_rot, n_t, xi_bound, prod_ord)
assert int(symb_t_count) == int(tot_t_count)