Contraction sequence optimization

When contracting a tensor network, the sequence of contractions makes a big difference in the computational cost. The complexity of determining the optimal sequence grows exponentially with the number of tensors, but there are many algorithms available for computing optimal sequences of small networks[1][2][3][4][5][6]. ITensors.jl provides some functionality to help you find the optimal contraction sequence for small tensor networks, as we show below.

The algorithm currently used in ITensors.jl is a modified version of the one in [1], with simplifications for outer product contractions similar to those used in TensorOperations.jl.

Functions

ITensors.ContractionSequenceOptimization.contraction_cost — Function
contraction_cost(A; sequence)

Return the cost of contracting the collection of ITensors according to the specified sequence, where the cost is measured in the number of floating point operations that would need to be performed to contract dense tensors of the dimensions specified by the indices of the tensors (so for now, sparsity is ignored in computing the costs). Pairwise costs are returned in a vector (contracting N tensors requires N-1 pairwise contractions). You can use sum(contraction_cost(A; sequence)) to get the total cost of the contraction.

If no sequence is specified, left-associative contraction is used; in other words, the sequence is equivalent to [[[[1, 2], 3], 4], …].
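For instance, here is a minimal sketch of computing pairwise and total costs for a chain of three tensors (the indices and dimensions are made up for illustration):

using ITensors
using ITensors: contraction_cost

i = Index(10, "i")
j = Index(20, "j")
k = Index(30, "k")
l = Index(40, "l")

# Only the indices matter for the cost, so tensors without storage suffice
A = ITensor(i, j)
B = ITensor(j, k)
C = ITensor(k, l)

# Default left-associative sequence, equivalent to Any[Any[1, 2], 3]
costs = contraction_cost([A, B, C])
total = sum(costs)

# Cost of a custom sequence that contracts B with C first
costs_bc = contraction_cost([A, B, C]; sequence = Any[1, Any[2, 3]])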

NDTensors.contract — Function
contract(ψ::MPS, A::MPO; kwargs...) -> MPS
*(::MPS, ::MPO; kwargs...) -> MPS

contract(A::MPO, ψ::MPS; kwargs...) -> MPS
*(::MPO, ::MPS; kwargs...) -> MPS

Contract the MPO A with the MPS ψ, returning an MPS with the unique site indices of the MPO.

For example, for an MPO with site indices with prime levels of 1 and 0, such as -s'-A-s-, and an MPS with site indices with prime level of 0, such as -s-x, the result is an MPS y with site indices with prime level of 1: -s'-y = -s'-A-s-x.

Since it is common to contract an MPO with prime levels of 1 and 0 with an MPS with prime level of 0 and to want a resulting MPS with prime level of 0, we provide a convenience function apply:

apply(A, x; kwargs...) = replaceprime(contract(A, x; kwargs...), 1 => 0)

Choose the contraction method with the method keyword, for example method="densitymatrix" or method="naive".

Keywords

  • cutoff::Float64=1e-13: the cutoff value for truncating the density matrix eigenvalues. Note that the default is somewhat arbitrary and subject to change; in general you should set a cutoff value.
  • maxdim::Int=maxlinkdim(A) * maxlinkdim(ψ): the maximal bond dimension of the resulting MPS.
  • mindim::Int=1: the minimal bond dimension of the resulting MPS.
  • normalize::Bool=false: whether or not to normalize the resulting MPS.
  • method::String="densitymatrix": the algorithm to use for the contraction. Currently the options are "densitymatrix", where the network formed by the MPO and MPS is squared and contracted down to a density matrix which is diagonalized iteratively at each site, and "naive", where the MPO and MPS tensors are contracted exactly at each site and then the resulting MPS is truncated.

See also apply.
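As a short sketch of contract versus apply (the lattice, the nearest-neighbor Sz-Sz MPO built with OpSum, and the cutoff are arbitrary choices for illustration):

using ITensors

N = 10
s = siteinds("S=1/2", N)
ψ = randomMPS(s; linkdims=4)

os = OpSum()
for n in 1:(N - 1)
  os += "Sz", n, "Sz", n + 1
end
H = MPO(os, s)

# The result carries the unique (primed) site indices of the MPO
Hψ = contract(H, ψ; method="densitymatrix", cutoff=1e-12)

# apply maps the prime levels back, returning an MPS with unprimed site indices
Hψ₀ = apply(H, ψ; cutoff=1e-12)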

contract(A::MPO, B::MPO; kwargs...) -> MPO
*(::MPO, ::MPO; kwargs...) -> MPO

Contract the MPO A with the MPO B, returning an MPO with the site indices that are not shared between A and B.

If you are contracting two MPOs with the same sets of indices, you likely want to call something like:

C = contract(A', B; cutoff=1e-12)
C = replaceprime(C, 2 => 1)

That is because if MPO A has the index structure -s'-A-s- and MPO B has the index structure -s'-B-s-, and we only want to contract over one set of the indices, we would do (-s'-A-s-)'-s'-B-s- = -s''-A-s'-s'-B-s- = -s''-C-s-, and then map the prime levels back to pairs of primed and unprimed indices with replaceprime(-s''-C-s-, 2 => 1) = -s'-C-s-.

Since this is a common use case, you can use the convenience function:

C = apply(A, B; cutoff=1e-12)

which is the same as the code above.
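For example, a minimal sketch contracting a small MPO with itself (the MPO and cutoff are arbitrary choices for illustration):

using ITensors

s = siteinds("S=1/2", 6)
os = OpSum()
for n in 1:5
  os += "Sz", n, "Sz", n + 1
end
A = MPO(os, s)

# Prime A so that only one set of site indices is contracted
C = contract(A', A; cutoff=1e-12)
C = replaceprime(C, 2 => 1)

# Equivalent one-step version
C′ = apply(A, A; cutoff=1e-12)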

If you are contracting MPOs that have diverging norms, such as MPOs representing sums of local operators, the truncation can become numerically unstable (see https://arxiv.org/abs/1909.06341 for a more numerically stable alternative). For now, you can use the following options to contract MPOs like that:

C = contract(A, B; alg="naive", truncate=false)
# Or use apply, which also maps the prime levels back to pairs of primed and unprimed indices:
C = apply(A, B; alg="naive", truncate=false)

Keywords

  • cutoff::Float64=1e-14: the cutoff value for truncating the density matrix eigenvalues. Note that the default is somewhat arbitrary and subject to change; in general you should set a cutoff value.
  • maxdim::Int=maxlinkdim(A) * maxlinkdim(B): the maximal bond dimension of the resulting MPO.
  • mindim::Int=1: the minimal bond dimension of the resulting MPO.
  • alg="zipup": Either "zipup" or "naive". "zipup" contracts pairs of site tensors and truncates with SVDs in a sweep across the sites, while "naive" first contracts pairs of tensors exactly and then truncates at the end if truncate=true.
  • truncate=true: Enable or disable truncation. If truncate=false, other truncation parameters like cutoff and maxdim are ignored. This is most relevant for the "naive" version, if you just want to contract the tensors pairwise exactly. This can be useful if you are contracting MPOs that have diverging norms, such as MPOs originating from sums of local operators.

See also apply for details about the arguments available.

*(As::ITensor...; sequence = default_sequence(), kwargs...)
*(As::Vector{<: ITensor}; sequence = default_sequence(), kwargs...)
contract(As::ITensor...; sequence = default_sequence(), kwargs...)

Contract the set of ITensors according to the contraction sequence.

The default sequence is "automatic" if ITensors.using_contraction_sequence_optimization() is true, otherwise it is "left_associative" (the ITensors are contracted from left to right).

You can change the default with ITensors.enable_contraction_sequence_optimization() and ITensors.disable_contraction_sequence_optimization().

For a custom sequence, the sequence should be provided as a binary tree where the leaves are integers n specifying the ITensor As[n] and branches are accessed by indexing with 1 or 2, i.e. sequence = Any[Any[1, 3], Any[2, 4]].
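For instance, here is a minimal sketch (with made-up indices) that contracts four tensors in a ring using a custom sequence, and then with sequence optimization enabled:

using ITensors

i = Index(2, "i")
j = Index(3, "j")
k = Index(4, "k")
l = Index(5, "l")

A₁ = randomITensor(i, j)
A₂ = randomITensor(j, k)
A₃ = randomITensor(k, l)
A₄ = randomITensor(l, i)

# Contract A₁ with A₂ and A₃ with A₄ first, then combine the results
x = contract(A₁, A₂, A₃, A₄; sequence = Any[Any[1, 2], Any[3, 4]])

# Make * and contract use automatic sequence optimization by default
ITensors.enable_contraction_sequence_optimization()
y = A₁ * A₂ * A₃ * A₄
ITensors.disable_contraction_sequence_optimization()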


Examples

In the following example we show how to compute the contraction sequence costs of a small tensor network of the kind that arises in two-site DMRG (an MPS tensor, left and right environment tensors, and two MPO tensors), using Symbolics.jl to keep the index dimensions symbolic:

using ITensors
using Symbolics

using ITensors: contraction_cost

@variables m, k, d

# MPS bond indices with dimension m
l = Index(m, "l")
r = Index(m, "r")
# MPO bond indices with dimension k
h₁ = Index(k, "h₁")
h₂ = Index(k, "h₂")
h₃ = Index(k, "h₃")
# Site indices with dimension d
s₁ = Index(d, "s₁")
s₂ = Index(d, "s₂")

# Two MPO tensors, the left and right environment tensors,
# and the two-site MPS tensor of a DMRG local update
H₁ = ITensor(dag(s₁), s₁', dag(h₁), h₂)
H₂ = ITensor(dag(s₂), s₂', dag(h₂), h₃)
L = ITensor(dag(l), l', h₁)
R = ITensor(dag(r), r', h₃)
ψ = ITensor(l, s₁, s₂, r)

TN = [ψ, L, H₁, H₂, R]
sequence1 = Any[2, Any[3, Any[4, Any[1, 5]]]]
sequence2 = Any[Any[4, 5], Any[1, Any[2, 3]]]
cost1 = contraction_cost(TN; sequence = sequence1)
cost2 = contraction_cost(TN; sequence = sequence2)

println("First sequence")
display(sequence1)
display(cost1)
@show sum(cost1)
@show substitute(sum(cost1), Dict(d => 4))

println("\nSecond sequence")
display(sequence2)
display(cost2)
@show sum(cost2)
@show substitute(sum(cost2), Dict(d => 4))

This example shows that in the limit of large MPS bond dimension m, the first contraction sequence is faster, while in the limit of large MPO bond dimension k, the second sequence is faster. This has practical implications for writing a DMRG algorithm that is efficient in both limits, which we plan to incorporate into ITensors.jl.
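To see this concretely, you can substitute extreme values of m and k into the two symbolic total costs (continuing from the code above; the particular values are arbitrary):

# Large MPS bond dimension m: the first sequence is cheaper
@show substitute(sum(cost1), Dict(m => 10_000, k => 10, d => 4))
@show substitute(sum(cost2), Dict(m => 10_000, k => 10, d => 4))

# Large MPO bond dimension k: the second sequence is cheaper
@show substitute(sum(cost1), Dict(m => 10, k => 10_000, d => 4))
@show substitute(sum(cost2), Dict(m => 10, k => 10_000, d => 4))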

Here is a more systematic example of searching through the parameter space to find optimal contraction sequences:

using ITensors
using Symbolics

using ITensors: contraction_cost, optimal_contraction_sequence

function tensor_network(; m, k, d)
  l = Index(m, "l")
  r = Index(m, "r")
  h₁ = Index(k, "h₁")
  h₂ = Index(k, "h₂")
  h₃ = Index(k, "h₃")
  s₁ = Index(d, "s₁")
  s₂ = Index(d, "s₂")

  ψ = ITensor(l, s₁, s₂, r)
  L = ITensor(dag(l), l', h₁)
  H₁ = ITensor(dag(s₁), s₁', dag(h₁), h₂)
  H₂ = ITensor(dag(s₂), s₂', dag(h₂), h₃)
  R = ITensor(dag(r), r', h₃)
  return [ψ, L, H₁, H₂, R]
end

function main()
  mrange = 50:10:80
  krange = 50:10:80
  sequence_costs = Matrix{Any}(undef, length(mrange), length(krange))
  for iₘ in eachindex(mrange), iₖ in eachindex(krange)
    m_val = mrange[iₘ]
    k_val = krange[iₖ]
    d_val = 4

    TN = tensor_network(; m = m_val, k = k_val, d = d_val)
    sequence = optimal_contraction_sequence(TN)
    cost = contraction_cost(TN; sequence = sequence)

    @variables m, k, d
    TN_symbolic = tensor_network(; m = m, k = k, d = d)
    cost_symbolic = contraction_cost(TN_symbolic; sequence = sequence)
    sequence_cost = (
      dims = (m = m_val, k = k_val, d = d_val),
      sequence = sequence,
      cost = cost,
      symbolic_cost = cost_symbolic,
    )
    sequence_costs[iₘ, iₖ] = sequence_cost
  end
  return sequence_costs
end

sequence_costs = main()

# Analyze the results.
println("Index dimensions")
display(getindex.(sequence_costs, :dims))

println("\nContraction sequences")
display(getindex.(sequence_costs, :sequence))

println("\nSymbolic contraction cost with d = 4")
# Fix d to a certain value (such as 4 for a Hubbard site)
@variables d
var_sub = Dict(d => 4)
display(substitute.(sum.(getindex.(sequence_costs, :symbolic_cost)), (var_sub,)))

A future direction will be to allow optimizing over contraction sequences with the dimensions specified symbolically, so that the optimal sequence in certain limits of the dimensions can be found. In addition, we plan to implement more algorithms that work for larger networks, as well as algorithms like [2] that take an optimal sequence for a closed network and generate optimal sequences for the environments of each tensor in the network, which is helpful for computing gradients of tensor networks.