How does the Grover diffusion operator work, and why is it optimal?

In this answer, Grover's algorithm is explained. The explanation indicates that the algorithm relies heavily on the Grover diffusion operator, but it does not give details on how this operator works.

Briefly, the Grover diffusion operator creates an 'inversion about the mean' to iteratively make the tiny differences from earlier steps large enough to be measurable.

The questions are now:

How does the Grover diffusion operator achieve this?
Why is the resulting $O(\sqrt{n})$ total time to search an unordered database optimal?

algorithm grovers-algorithm

— Discrete lizard

Just a comment on the second question. There are some works showing that the trajectory of the state in Grover's algorithm exactly follows the geodesic connecting the initial state of the algorithm and the target state. So it is optimal.

— XXDD

Answers:

$\newcommand{\bra}[1]{\left<#1\right|}\newcommand{\ket}[1]{\left|#1\right>}\newcommand{\braket}[2]{\left<#1\middle|#2\right>}\newcommand{\bke}[3]{\left<#1\middle|#2\middle|#3\right>}\newcommand{\proj}[1]{\left|#1\right>\left<#1\right|}$ Since the original question was about a layman's description, I offer a slightly different solution which is perhaps easier to understand (depending on background), based on a continuous time evolution. (I make no pretence that it is suitable for a layman, however.)

We start from an initial state that is the uniform superposition of all basis states,

$\ket{\psi}=\frac{1}{\sqrt{2^n}}\sum_{y\in\{0,1\}^n}\ket{y}$

and we aim to find a state $\ket{x}$ that can be recognised as the correct answer (assuming there is exactly one such state, although this can be generalised). To do this, we evolve in time under the action of the Hamiltonian

$H=\proj{x}+\proj{\psi}.$

The really beautiful feature of Grover's search is that, at this point, we can reduce the maths to a subspace spanned by just two states, $\{\ket{x},\ket{\psi}\}$, rather than requiring all $2^n$. It is easier to describe if we make an orthonormal basis from these states, $\{\ket{x},\ket{\psi^\perp}\}$, where

$\ket{\psi^{\perp}}=\frac{1}{\sqrt{2^n-1}}\sum_{y\in\{0,1\}^n:y\neq x}\ket{y}.$

Using this basis, the time evolution $e^{-iHt}\ket{\psi}$ can be written as

$e^{-it\left(\mathbb{I}+2^{-n}Z+\frac{\sqrt{2^n-1}}{2^{n}}X\right)}\cdot\left(\begin{array}{c}\frac{1}{\sqrt{2^n}} \\ \sqrt{1-\frac{1}{2^n}} \end{array}\right),$

where $X$ and $Z$ are the standard Pauli matrices. This can be rewritten as

$e^{-it}\left(\mathbb{I}\cos\left(\frac{t}{2^{n/2}}\right)-i\frac{1}{2^{n/2}}\sin\left(\frac{t}{2^{n/2}}\right)\left(Z+X\sqrt{2^n-1}\right)\right)\left(\begin{array}{c}\frac{1}{\sqrt{2^n}} \\ \sqrt{1-\frac{1}{2^n}} \end{array}\right).$

So, if we evolve for a time $t=\frac{\pi}{2}2^{n/2}$, and ignore global phases, the final state is

$\frac{1}{2^{n/2}}\left(Z+X\sqrt{2^n-1}\right)\left(\begin{array}{c}\frac{1}{\sqrt{2^n}} \\ \sqrt{1-\frac{1}{2^n}} \end{array}\right)=\left(\begin{array}{c}\frac{1}{2^n} \\ -\frac{\sqrt{2^n-1}}{2^n} \end{array}\right)+\left(\begin{array}{c} 1-\frac{1}{2^n} \\ \frac{\sqrt{2^n-1}}{2^n}\end{array}\right)=\left(\begin{array}{c} 1 \\ 0 \end{array}\right).$

In the basis $\{\ket{x},\ket{\psi^\perp}\}$, this vector is exactly $\ket{x}$, the state we were looking for.
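The continuous-time picture can be checked numerically for a small instance. The following is a minimal NumPy sketch, not a definitive implementation: the function name `continuous_grover_probability`, the choice of 4 qubits, and the marked index 5 are illustrative assumptions. It builds $H$ on the full $2^n$-dimensional space, exponentiates it through its eigendecomposition, and evaluates the probability of finding the marked state.

```python
import numpy as np

def continuous_grover_probability(n_qubits, marked, t):
    """Probability of measuring `marked` after evolving |psi> under H for time t."""
    dim = 2 ** n_qubits
    psi = np.full(dim, 1 / np.sqrt(dim))      # uniform superposition |psi>
    x = np.zeros(dim)
    x[marked] = 1.0                           # marked basis state |x>
    hamiltonian = np.outer(x, x) + np.outer(psi, psi)  # H = |x><x| + |psi><psi|
    # H is real symmetric, so exp(-iHt) follows from its eigendecomposition.
    energies, vectors = np.linalg.eigh(hamiltonian)
    evolved = vectors @ (np.exp(-1j * energies * t) * (vectors.T @ psi))
    return float(abs(evolved[marked]) ** 2)

n_qubits = 4                                  # illustrative size, 16 basis states
t_final = np.pi / 2 * 2 ** (n_qubits / 2)     # evolution time from the derivation
p_start = continuous_grover_probability(n_qubits, marked=5, t=0.0)
p_final = continuous_grover_probability(n_qubits, marked=5, t=t_final)
print(p_start, p_final)
```

At $t=0$ the probability is the uniform value $2^{-n}$, and at $t=\frac{\pi}{2}2^{n/2}$ it should equal 1 up to floating-point rounding, matching the final state above.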

One caveat: you could rescale the Hamiltonian, for example $\tilde H=5H$, and evolve using $\tilde H$, and the evolution time would be 5 times shorter. If you wanted to be really radical, replace the 5 with $2^{n/2}$, and Grover's search runs in constant time! But you're not allowed to do this arbitrarily. Any given experiment would have a fixed maximum coupling strength (i.e. a fixed multiplier). So, different experiments have different running times, but their scaling is the same, $2^{n/2}$. It's just like saying that the gate cost in the circuit model is constant, rather than assuming that if we use a circuit of depth $k$ each gate can be made to run in time $1/k$.

The optimality proof essentially involves showing that if you made detection of one possible marked state $\ket{x}$ any quicker, it would make detection of a different marked state, $\ket{y}$ , slower. Since the algorithm should work equally well whichever state is marked, this solution is the best one.

— DaftWullie

One way of defining the diffusion operator is¹ $D = -H^{\otimes n}U_0H^{\otimes n}$, where $U_0$ is the phase oracle

$U_0\left|0^{\otimes n}\right> = -\left|0^{\otimes n}\right>,\,U_0\left|x\right> = \left|x\right>\,\text{for} \left|x\right>\neq\left|0^{\otimes n}\right>.$

This shows that $U_0$ can also be written as $U_0 = I-2\left|0^{\otimes n}\rangle\langle0^{\otimes n}\right|$, giving

$D= 2\left|+\rangle\langle+\right| - I,$

where $\left|+\right> = 2^{-n/2}\left(\left|0\right> + \left|1\right>\right)^{\otimes n}$.
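The intermediate step can be spelled out using only two standard facts about the Hadamard transform, $H^{\otimes n}H^{\otimes n}=I$ and $H^{\otimes n}\left|0^{\otimes n}\right>=\left|+\right>$:

$D=-H^{\otimes n}\left(I-2\left|0^{\otimes n}\rangle\langle0^{\otimes n}\right|\right)H^{\otimes n}=-H^{\otimes n}H^{\otimes n}+2\,H^{\otimes n}\left|0^{\otimes n}\rangle\langle0^{\otimes n}\right|H^{\otimes n}=2\left|+\rangle\langle+\right|-I.$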

This shows² that the diffusion operator is a reflection about $\left|+\right>$.
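The two forms of $D$ and the 'inversion about the mean' reading can be compared numerically. This is a small NumPy sketch under stated assumptions: the 3-qubit size and the example amplitude vector are arbitrary choices for illustration, and the variable names are not taken from any library.

```python
import numpy as np
from functools import reduce

n_qubits = 3                                   # illustrative size
dim = 2 ** n_qubits

hadamard = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
hadamard_n = reduce(np.kron, [hadamard] * n_qubits)   # n-fold tensor power of H

u0 = np.eye(dim)
u0[0, 0] = -1.0                                # U_0 flips the sign of |0...0> only

diffusion = -hadamard_n @ u0 @ hadamard_n      # D as defined from H and U_0

plus = np.full(dim, 1 / np.sqrt(dim))          # |+> as a vector of equal amplitudes
reflection = 2 * np.outer(plus, plus) - np.eye(dim)   # 2|+><+| - I

# Arbitrary unit-norm example state with one amplitude of opposite sign.
amplitudes = np.array([0.5, 0.5, 0.5, -0.5, 0.0, 0.0, 0.0, 0.0])
inverted = diffusion @ amplitudes
mean = amplitudes.mean()

print(np.allclose(diffusion, reflection))
print(np.allclose(inverted, 2 * mean - amplitudes))
```

Both checks should print `True`: each amplitude $a_i$ is mapped to $2\bar a - a_i$, where $\bar a$ is the mean amplitude, which is the inversion about the mean.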

As the other part of Grover's algorithm (the oracle) is also a reflection, the two combine into a rotation that moves the current state closer to the 'searched-for' value $x_0$. The angle between the current state and $\left|x_0\right>$ decreases linearly with the number of rotations (until it overshoots the searched-for value), so the probability of measuring the correct value initially increases quadratically with the number of iterations.
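The rotation picture can be illustrated with a short simulation. The sketch below is a hedged illustration rather than a reference implementation: the function name `grover_success_probabilities`, the 6-qubit size, and the marked index 11 are assumptions. It applies the oracle reflection and the reflection about $\left|+\right>$ repeatedly and compares the success probability with the standard closed form $\sin^2\left((2k+1)\theta\right)$, where $\sin\theta = 2^{-n/2}$.

```python
import numpy as np

def grover_success_probabilities(n_qubits, marked, iterations):
    """Success probability after 0, 1, ..., `iterations` Grover iterations."""
    dim = 2 ** n_qubits
    plus = np.full(dim, 1 / np.sqrt(dim))          # uniform superposition |+>
    state = plus.copy()
    probabilities = [float(state[marked] ** 2)]
    for _ in range(iterations):
        state[marked] *= -1                        # oracle: flip the marked amplitude
        state = 2 * plus * (plus @ state) - state  # diffusion: reflect about |+>
        probabilities.append(float(state[marked] ** 2))
    return probabilities

n_qubits = 6                                       # illustrative size, 64 items
dim = 2 ** n_qubits
theta = np.arcsin(1 / np.sqrt(dim))                # initial angle to the unmarked subspace
best_k = int(np.floor(np.pi / (4 * theta)))        # last iteration before overshooting
probs = grover_success_probabilities(n_qubits, marked=11, iterations=best_k)
predicted = [np.sin((2 * k + 1) * theta) ** 2 for k in range(best_k + 1)]

for k, (p, q) in enumerate(zip(probs, predicted)):
    print(k, round(p, 4), round(q, 4))
```

Each iteration advances the angle by $2\theta$, so for small $k$ the success probability grows like $(2k+1)^2/2^n$, and roughly $\frac{\pi}{4}2^{n/2}$ iterations bring it close to 1.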

Bennett et al. showed that this is optimal. By taking a classical solution to an NP problem, Grover's algorithm can be used to speed this up quadratically. However, taking a language $\mathcal L_A = \left\lbrace y:\exists x\, A\left(x\right) = y\right\rbrace$ for a length-preserving function $A$ (here, an oracle), any bounded-error oracle-based quantum Turing machine cannot accept this language in a time $T\left(n\right)=o\left(2^{n/2}\right)$.

This is achieved by taking a set of oracles where $\left|1\right>^{\otimes n}$ has no inverse (so is not contained in the language). However, this is contained in some new language $\mathcal L_{A_y}$ by definition. The difference in probabilities of a machine accepting $\mathcal L_A$ and a different machine accepting $\mathcal L_{A_y}$ in time $T\left(n\right)$ is then less than $1/3$ and so neither language is accepted and Grover's algorithm is indeed asymptotically optimal.³

Zalka later showed that Grover's algorithm is exactly optimal.

¹ In Grover's algorithm, minus signs can be moved around, so where the minus sign goes is somewhat arbitrary and it doesn't necessarily have to be in the definition of the diffusion operator.

² Alternatively, defining the diffusion operator without the minus sign gives a reflection about $\left|+^\perp\right>$.

³ Defining the machine using the oracle $A$ as $M^A$ and the machine using oracle $A_y$ as $M^{A_y}$, this is due to the fact that there is a set $S$ of bit strings, where the states of $M^A$ and $M^{A_y}$ at a time $t$ are $\epsilon$-close⁴, with a cardinality $<2T^2/\epsilon^2$. Each oracle where $M^A$ correctly decides if $\left|1\right>^{\otimes n}$ is in $\mathcal L_A$ can be mapped to $2^n - \text{Card}\left(S\right)$ oracles where $M^A$ fails to correctly decide if $\left|1\right>^{\otimes n}$ is in that oracle's language. However, it must give one of the other $2^n-1$ potential answers and so if $T\left(n\right)=o\left(2^{n/2}\right)$, the machine is unable to determine membership of $\mathcal L_A$.

⁴ Using the Euclidean distance, twice the trace distance.

— Mithrandir24601