Customizing the Contraction Path

Overview

If the simulated circuit has a large number of qubits, we recommend trying a customized contraction setup instead of the default greedy one.

Setup

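To install cotengra, please refer to its installation documentation. At the time of writing it had not been uploaded to PyPI, so it could not be obtained with a plain pip install. The simplest way to install it is pip install -U git+https://github.com/jcmgray/cotengra.git.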

[3]:

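import tensorcircuit as tc
import numpy as np
import cotengra as ctg

# K is the backend object used by testbed() below; the outputs in this
# note are tf.Tensor values, so the TensorFlow backend is selected here.
K = tc.set_backend("tensorflow")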

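We use the following example as a testbed for contraction; the actual contractor is invoked by the Circuit.expectation API. Contraction has two stages. The first is the contraction path search, which looks for a path that is efficient in both space and time. The second is the actual contraction, in which matrix multiplications are carried out through the ML backend API. This note focuses on the performance of the first stage. The contraction path solver can be replaced with any opt-einsum compatible path solver.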
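To make "opt-einsum compatible path solver" concrete, here is a minimal sketch of such a function in pure Python. The name smallest_output_path and its selection rule are illustrative assumptions, not part of TensorCircuit, cotengra, or opt-einsum. It only mirrors the calling convention used later in this note: the solver receives the tensors' index sets, the output indices, and a size dictionary, and returns a list of pairs. Each pair gives positions in the current tensor list; the two tensors are removed and their product is appended at the end.

```python
from itertools import combinations
from math import prod


def smallest_output_path(inputs, output, size_dict, memory_limit=None, **kws):
    """Toy path finder (hypothetical): always contract the pair of tensors
    whose result has the fewest elements.

    inputs    : list of sets of index labels, one set per tensor
    output    : set of index labels that remain open at the end
    size_dict : maps each index label to its dimension
    """
    tensors = [frozenset(t) for t in inputs]
    output = frozenset(output)
    path = []
    while len(tensors) > 1:
        best = None
        for i, j in combinations(range(len(tensors)), 2):
            rest = [t for k, t in enumerate(tensors) if k not in (i, j)]
            # An index survives if the output or any other tensor still needs it.
            keep = output.union(*rest)
            result = (tensors[i] | tensors[j]) & keep
            cost = prod(size_dict[x] for x in result)
            if best is None or cost < best[0]:
                best = (cost, i, j, rest, result)
        _, i, j, rest, result = best
        path.append((i, j))
        tensors = rest + [result]
    return path


# Matrix chain A[a,b] B[b,c] C[c,d] with open indices a and d.
chain = [{"a", "b"}, {"b", "c"}, {"c", "d"}]
dims = {"a": 2, "b": 10, "c": 10, "d": 2}
print(smallest_output_path(chain, {"a", "d"}, dims))
```

A function with this shape could presumably be passed as the optimizer argument of tc.set_contractor("custom", ...), in the same way as the cotengra-based solvers used later in this note, although this toy version has not been tested against TensorCircuit and checks every pair at every step, so it is only suitable for very small networks.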

[4]:

def testbed():
    n = 40
    d = 6
    param = K.ones([2 * d, n])
    c = tc.Circuit(n)
    c = tc.templates.blocks.example_block(c, param, nlayers=d, is_split=True)
    # is_split=True: two-qubit gates are split and truncated via SVD decomposition
    return c.expectation_ps(z=[n // 2], reuse=False)

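opt-einsum provides several contraction optimizers, and they ship with the TensorNetwork package. Since TensorCircuit is built on top of TensorNetwork, we can use these simple contraction optimizers directly. For any moderately sized system, however, only the greedy optimizer is practical: the other optimizers scale exponentially and fail in circuit simulation scenarios.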

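Throughout this note, we set contraction_info=True (the default is False) for the contractor, which prints a summary of the contraction including size, flops, and writes. For the definitions of these metrics, see also the cotengra documentation and the corresponding paper.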

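The metrics that measure the quality of a contraction path include:

* **FLOPs**: the total number of computational operations required by all matrix multiplications involved in contracting the tensor network along the given path. This metric characterizes the total simulation time.

* **WRITE**: the total size (number of elements) of all tensors computed during the contraction, including intermediate tensors.

* **SIZE**: the size of the largest intermediate tensor stored in memory.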

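Because simulation in TensorCircuit is AD-enabled (automatic differentiation), all intermediate results must be cached and tracked, so the space overhead to watch is WRITE rather than SIZE.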
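The three metrics can be illustrated on a tiny network with a pure-Python sketch. The helper path_costs below is hypothetical and uses simplified conventions that are assumptions of this example: FLOPs for one pairwise contraction is the product of the dimensions of all indices involved, WRITE sums the sizes of the tensors produced by each step, and SIZE is the largest of them. cotengra's exact conventions (constant factors, whether input tensors are counted) may differ.

```python
from math import prod


def path_costs(inputs, output, size_dict, path):
    """Return (flops, write, size) for a pairwise contraction path.

    The path uses the opt-einsum convention: each pair indexes the current
    tensor list, and the contracted result is appended at the end.
    """
    tensors = [frozenset(t) for t in inputs]
    output = frozenset(output)
    flops = write = size = 0
    for i, j in path:
        a, b = tensors[i], tensors[j]
        rest = [t for k, t in enumerate(tensors) if k not in (i, j)]
        # Indices kept are those still needed by the output or other tensors.
        result = (a | b) & output.union(*rest)
        flops += prod(size_dict[x] for x in a | b)
        n = prod(size_dict[x] for x in result)
        write += n
        size = max(size, n)
        tensors = rest + [result]
    return flops, write, size


# Matrix chain A[a,b] B[b,c] C[c,d]: small outer, large inner dimensions.
inputs = [{"a", "b"}, {"b", "c"}, {"c", "d"}]
output = {"a", "d"}
size_dict = {"a": 2, "b": 100, "c": 100, "d": 2}

good = path_costs(inputs, output, size_dict, [(0, 1), (0, 1)])  # A with B first
bad = path_costs(inputs, output, size_dict, [(0, 2), (0, 1)])   # outer product of A and C first
print("good path:", good)
print("bad path: ", bad)
```

Both paths produce the same final tensor, but the second one creates a four-index intermediate and is worse on all three metrics, which is the kind of difference the path search stage is meant to avoid.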

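In addition, we enable debug_level=2 in set_contractor (never use this option in real computations!). With this option enabled, the second stage of contraction, the actual contraction, does not take place, so the values returned by the cells below (0j) are placeholders rather than real expectation values. This lets us focus on the contraction path information, which shows the differences between the customized contractors.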

[5]:

tc.set_contractor("greedy", debug_level=2, contraction_info=True)
# the default contractor
testbed()

------ contraction cost summary ------
log10[FLOPs]:  12.393  log2[SIZE]:  30  log2[WRITE]:  35.125

[5]:

<tf.Tensor: shape=(), dtype=complex64, numpy=0j>

cotengra optimizer: see the cotengra documentation for hyperparameter tuning.

[7]:

opt = ctg.ReusableHyperOptimizer(
    methods=["greedy", "kahypar"],
    parallel=True,
    minimize="write",
    max_time=120,
    max_repeats=1024,
    progbar=True,
)
# Caution: for newer versions of Python, currently only the "ray" option works for the parallel argument
tc.set_contractor(
    "custom", optimizer=opt, preprocessing=True, contraction_info=True, debug_level=2
)
# An opt-einsum compatible function interface is passed as the optimizer argument.
# Also note that preprocessing=True merges single-qubit gates into neighboring two-qubit gates.
testbed()

log2[SIZE]: 15.00 log10[FLOPs]: 7.56:  45%|██████████████████▊                       | 458/1024 [02:03<02:32,  3.70it/s]

------ contraction cost summary ------
log10[FLOPs]:  7.565  log2[SIZE]:  15  log2[WRITE]:  19.192

[7]:

<tf.Tensor: shape=(), dtype=complex64, numpy=0j>
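To put the two summaries above side by side, the logarithmic figures can be converted into more familiar units. The short sketch below is only arithmetic on the printed values. It assumes 8 bytes per element, which follows from the complex64 dtype shown in the outputs, and the helper name log2_elements_to_bytes is made up for this example.

```python
BYTES_PER_ELEMENT = 8  # complex64 stores two 32-bit floats


def log2_elements_to_bytes(log2_count, itemsize=BYTES_PER_ELEMENT):
    """Convert a log2 element count (as printed in the summary) to bytes."""
    return 2.0 ** log2_count * itemsize


# log2[WRITE] values copied from the two summaries above.
greedy_write = log2_elements_to_bytes(35.125)
cotengra_write = log2_elements_to_bytes(19.192)

# log10[FLOPs] values copied from the two summaries above.
flops_ratio = 10 ** (12.393 - 7.565)

print(f"greedy WRITE:   {greedy_write / 2**30:.1f} GiB")
print(f"cotengra WRITE: {cotengra_write / 2**20:.1f} MiB")
print(f"FLOPs reduction factor: {flops_ratio:.3g}")
```

In other words, the greedy path would write hundreds of GiB of intermediates, while the path found by the cotengra hyper-optimizer writes only a few MiB and needs several orders of magnitude fewer operations.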

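We can even add a reconfigure step after the path search, which can further improve the contraction path. In the run below it lowers both SIZE and FLOPs compared with the previous result, although WRITE comes out slightly higher.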

[8]:

opt = ctg.ReusableHyperOptimizer(
    minimize="combo",
    max_repeats=1024,
    max_time=120,
    progbar=True,
)


def opt_reconf(inputs, output, size, **kws):
    tree = opt.search(inputs, output, size)
    tree_r = tree.subtree_reconfigure_forest(
        progbar=True, num_trees=10, num_restarts=20, subtree_weight_what=("size",)
    )
    return tree_r.get_path()


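# subtree_reconfigure_forest also has a parallel option, which defaults to True;
# as noted above, for newer versions of Python it can only be set to "ray".
# Note that the API on the last line of opt_reconf differs between cotengra versions: get_path or path.
# You may need to adjust that call for the example to work.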

tc.set_contractor(
    "custom",
    optimizer=opt_reconf,
    contraction_info=True,
    preprocessing=True,
    debug_level=2,
)
testbed()

log2[SIZE]: 15.00 log10[FLOPs]: 7.46:  32%|█████████████▍                            | 329/1024 [02:00<04:13,  2.74it/s]
log2[SIZE]: 14.00 log10[FLOPs]: 7.02: 100%|█████████████████████████████████████████████| 20/20 [01:05<00:00,  3.30s/it]

------ contraction cost summary ------
log10[FLOPs]:  7.021  log2[SIZE]:  14  log2[WRITE]:  19.953

[8]:

<tf.Tensor: shape=(), dtype=complex64, numpy=0j>