Automatic detection of the rotation angle of an arbitrary image with orthogonal features

I have a task where I need to detect the angle of an image like the following sample (part of a microchip photograph). The image does contain orthogonal features, but they can have different sizes, with different resolution/sharpness. The image will be slightly imperfect due to some optical distortion and aberrations. Sub-pixel angle detection accuracy is required (i.e. the error should be well under 0.1°; something like 0.01° would be tolerable). For reference, the optimal angle for this image is about 32.19°.

Currently I have tried 2 approaches. Both do a brute-force search for a local minimum with a 2° step, then gradient descend down to a 0.0001° step size.

1) The merit function is sum(pow(img(x+1)-img(x-1), 2) + pow(img(y+1)-img(y-1), 2)), calculated across the image. When horizontal/vertical lines are aligned, there is less change in the horizontal/vertical directions. Precision was about 0.2°.
2) The merit function is (max-min) over some stripe width/height of the image. This stripe is also looped across the image, and the merit function is accumulated. This approach also relies on the smaller change of brightness when horizontal/vertical lines are aligned, but it can detect smaller changes over a larger base (the stripe width, which could be around 100 pixels). This gives better precision, up to 0.01°, but has a lot of parameters to tweak (stripe width/height, for example, is quite sensitive), which could be unreliable in the real world.
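
As a rough illustration of approach 1 and of the shared search strategy, the sketch below computes the squared central-difference merit function on a NumPy array and minimizes a function of the angle with a coarse scan followed by step halving. The names `gradient_energy` and `coarse_to_fine_min` are hypothetical, the step halving is a simple stand-in for the gradient descent mentioned above, and the quadratic in the demo is a made-up placeholder for the real merit function of a rotated image.

```python
import numpy as np

def gradient_energy(img):
    """Sum of squared central differences in x and y over interior pixels."""
    img = np.asarray(img, dtype=float)
    dx = img[1:-1, 2:] - img[1:-1, :-2]   # img(x+1) - img(x-1)
    dy = img[2:, 1:-1] - img[:-2, 1:-1]   # img(y+1) - img(y-1)
    return float(np.sum(dx**2 + dy**2))

def coarse_to_fine_min(merit, lo=-45.0, hi=45.0, coarse_step=2.0, final_step=1e-4):
    """Coarse scan over [lo, hi], then halve the step around the best angle."""
    angles = np.arange(lo, hi + coarse_step/2, coarse_step)
    best = float(angles[int(np.argmin([merit(a) for a in angles]))])
    step = coarse_step
    while step > final_step:
        step /= 2.0
        best = min((best - step, best, best + step), key=merit)
    return best

# Demo with a placeholder merit function whose minimum is at 32.19 degrees.
# In practice merit(angle) would rotate the image by `angle` and return
# gradient_energy of the rotated image.
best_angle = coarse_to_fine_min(lambda a: (a - 32.19)**2)
print(best_angle)
```

The refinement only works if the coarse scan lands in the basin of the right minimum, which is the same risk the 2° brute-force step carries.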

Edge detection filters did not help much.

My concern is the very small change in the merit function in both cases between the worst and best angles (<2x difference).

Do you have any better suggestions on writing a merit function for angle detection?

Update: Full size sample image is uploaded here (51 MiB)

After all processing it will end up looking like this

image image-processing computer-vision

— BarsMonster

It is very sad that this was migrated from stackoverflow to dsp. I don't see a DSP-like solution here, and the chances are now much reduced. 99.9% of DSP algorithms and tricks are useless for this task. It seems a custom algorithm or approach is needed here, not an FFT.

— BarsMonster

I'm very happy to tell you that being sad about it is dead wrong; DSP.SE is exactly the right place to ask this! (Not so much stackoverflow. This isn't a programming question. You know your programming. You don't know how to process this image.) Images are signals, and DSP.SE is very much concerned with image processing! Also, a lot of general DSP tricks (even ones known from e.g. communication signals) are very applicable to your problem :)

— Marcus Müller

How important is efficiency?

— Cedron Dawg

By the way, even when running at 0.04° resolution, I'm pretty sure the rotation is exactly 32°, not 32.19°. What is the resolution of your original photograph? Because at 800 px width, an uncorrected rotation of 0.01° is only a 0.14 px height difference, and even with sinc interpolation that would be barely noticeable.
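
The arithmetic behind that estimate can be reproduced in a few lines. This is only a sketch; the variable names are illustrative.

```python
import math

width_px = 800      # image width from the comment above
angle_deg = 0.01    # residual (uncorrected) rotation

# Vertical offset accumulated across the full width by the residual rotation.
offset_px = width_px*math.tan(math.radians(angle_deg))
print(f"{offset_px:.2f} px")  # roughly 0.14 px
```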

— Marcus Müller

@CedronDawg Definitely no real-time requirements; I can tolerate 10-60 seconds of calculation on 8-12 cores.

— BarsMonster

Answers:

If I understand your method 1 correctly, then if you used a circularly symmetrical region and did the rotation about the center of that region, you would eliminate the region's dependency on the rotation angle and get a fairer comparison by the merit function between different rotation angles. I will suggest a method that is essentially equivalent to that, but uses the full image and does not require repeated image rotation, and includes low-pass filtering to remove pixel grid anisotropy and for denoising.
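
To make the circular-region idea concrete, here is a minimal sketch that restricts the squared central-difference merit function of method 1 to a centered disc, so that the same set of pixels contributes at every rotation angle. The function names are hypothetical, and the choice of the largest inscribed disc is an assumption.

```python
import numpy as np

def circular_mask(shape):
    """Boolean mask of the largest disc centered in an image of this shape."""
    rows, cols = shape
    cy, cx = (rows - 1)/2, (cols - 1)/2
    radius = min(cy, cx)
    y, x = np.ogrid[:rows, :cols]
    return (y - cy)**2 + (x - cx)**2 <= radius**2

def masked_gradient_energy(img):
    """Method-1 style merit function, summed only inside the centered disc."""
    img = np.asarray(img, dtype=float)
    dx = img[1:-1, 2:] - img[1:-1, :-2]
    dy = img[2:, 1:-1] - img[:-2, 1:-1]
    mask = circular_mask(img.shape)[1:-1, 1:-1]  # match the interior crop
    return float(np.sum((dx**2 + dy**2)[mask]))
```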

Gradient of isotropically low-pass filtered image

First, let's calculate a local gradient vector at each pixel for the green color channel in the full size sample image.

I derived horizontal and vertical differentiation kernels in three steps: by differentiating the continuous-space impulse response of an ideal low-pass filter with a flat circular frequency response, by sampling the resulting function, and by applying a rotated cosine window. The circular frequency response removes the effect of the choice of image axes by ensuring that there is no different level of detail diagonally compared to horizontally or vertically:

$\begin{gather}h_x[x, y] = \begin{cases}0&\text{if }x = y = 0,\\-\displaystyle\frac{\omega_c^2\,x\,J_2\left(\omega_c\sqrt{x^2 + y^2}\right)}{2 \pi\,(x^2 + y^2)}&\text{otherwise,}\end{cases}\\ h_y[x, y] = \begin{cases}0&\text{if }x = y = 0,\\-\displaystyle\frac{\omega_c^2\,y\,J_2\left(\omega_c\sqrt{x^2 + y^2}\right)}{2 \pi\,(x^2 + y^2)}&\text{otherwise,}\end{cases}\end{gather}\tag{1}$

where $J_2$ is the 2nd order Bessel function of the first kind, and $\omega_c$ is the cut-off frequency in radians. Python source (does not have the minus signs of Eq. 1):

import matplotlib.pyplot as plt
import scipy
import scipy.special
import numpy as np

def rotatedCosineWindow(N):  # N = horizontal size of the targeted kernel, also its vertical size, must be odd.
  return np.fromfunction(lambda y, x: np.maximum(np.cos(np.pi/2*np.sqrt(((x - (N - 1)/2)/((N - 1)/2 + 1))**2 + ((y - (N - 1)/2)/((N - 1)/2 + 1))**2)), 0), [N, N])

def circularLowpassKernelX(omega_c, N):  # omega = cutoff frequency in radians (pi is max), N = horizontal size of the kernel, also its vertical size, must be odd.
  kernel = np.fromfunction(lambda y, x: omega_c**2*(x - (N - 1)/2)*scipy.special.jv(2, omega_c*np.sqrt((x - (N - 1)/2)**2 + (y - (N - 1)/2)**2))/(2*np.pi*((x - (N - 1)/2)**2 + (y - (N - 1)/2)**2)), [N, N])
  kernel[(N - 1)//2, (N - 1)//2] = 0
  return kernel

def circularLowpassKernelY(omega_c, N):  # omega = cutoff frequency in radians (pi is max), N = horizontal size of the kernel, also its vertical size, must be odd.
  kernel = np.fromfunction(lambda y, x: omega_c**2*(y - (N - 1)/2)*scipy.special.jv(2, omega_c*np.sqrt((x - (N - 1)/2)**2 + (y - (N - 1)/2)**2))/(2*np.pi*((x - (N - 1)/2)**2 + (y - (N - 1)/2)**2)), [N, N])
  kernel[(N - 1)//2, (N - 1)//2] = 0
  return kernel

N = 41  # Horizontal size of the kernel, also its vertical size. Must be odd.
window = rotatedCosineWindow(N)

# Optional window function plot
#plt.imshow(window, vmin=-np.max(window), vmax=np.max(window), cmap='bwr')
#plt.colorbar()
#plt.show()

omega_c = np.pi/4  # Cutoff frequency in radians <= pi
kernelX = circularLowpassKernelX(omega_c, N)*window
kernelY = circularLowpassKernelY(omega_c, N)*window

# Optional kernel plot
#plt.imshow(kernelX, vmin=-np.max(kernelX), vmax=np.max(kernelX), cmap='bwr')
#plt.colorbar()
#plt.show()

Figure 1. 2-d rotated cosine window.

Figure 2. Windowed isotropic low-pass differentiation kernels, for different cut-off frequency $\omega_c$ settings. Top: omega_c = np.pi, middle: omega_c = np.pi/4, bottom: omega_c = np.pi/16. The minus sign of Eq. 1 was left out. The vertical kernels look the same but are rotated 90 degrees. A weighted sum of the horizontal and the vertical kernels, with weights $\cos(\phi)$ and $\sin(\phi)$, respectively, gives an analysis kernel of the same type for gradient angle $\phi$.

Differentiation of the impulse response does not affect the bandwidth, as can be seen from its 2-d fast Fourier transform (FFT), in Python:

# Optional FFT plot
absF = np.abs(np.fft.fftshift(np.fft.fft2(circularLowpassKernelX(np.pi, N)*window)))
plt.imshow(absF, vmin=0, vmax=np.max(absF), cmap='Greys', extent=[-np.pi, np.pi, -np.pi, np.pi])
plt.colorbar()
plt.show()

Figure 3. Magnitude of the 2-d FFT of $h_x$. In the frequency domain, differentiation appears as multiplication of the flat circular pass band by $\omega_x$, and as a 90 degree phase shift which is not visible in the magnitude.

To do the convolution for the green channel and to collect a 2-d gradient vector histogram, for visual inspection, in Python:

import scipy.ndimage

img = plt.imread('sample.tif').astype(float)
X = scipy.ndimage.convolve(img[:,:,1], kernelX)[(N - 1)//2:-(N - 1)//2, (N - 1)//2:-(N - 1)//2]  # Green channel only
Y = scipy.ndimage.convolve(img[:,:,1], kernelY)[(N - 1)//2:-(N - 1)//2, (N - 1)//2:-(N - 1)//2]  # ...

# Optional 2-d histogram
#hist2d, xEdges, yEdges = np.histogram2d(X.flatten(), Y.flatten(), bins=199)
#plt.imshow(hist2d**(1/2.2), vmin=0, cmap='Greys')
#plt.show()
#plt.imsave('hist2d.png', plt.cm.Greys(plt.Normalize(vmin=0, vmax=hist2d.max()**(1/2.2))(hist2d**(1/2.2))))  # To save the histogram image
#plt.imsave('histkey.png', plt.cm.Greys(np.repeat([(np.arange(200)/199)**(1/2.2)], 16, 0)))

This also crops the data, discarding (N - 1)//2 pixels from each edge that were contaminated by the rectangular image boundary, before the histogram analysis.

Figure 4. 2-d histograms of gradient vectors, for different low-pass filter cutoff frequency $\omega_c$ settings. In order: first with N=41: omega_c = np.pi, omega_c = np.pi/2, omega_c = np.pi/4 (same as in the Python listing), omega_c = np.pi/8, omega_c = np.pi/16, then: N=81: omega_c = np.pi/32, N=161: omega_c = np.pi/64. Denoising by low-pass filtering sharpens the circuit trace edge gradient orientations in the histogram.

Vector length weighted circular mean direction

There is the Yamartino method of finding the "average" wind direction from multiple wind vector samples in one pass through the samples. It is based on the mean of circular quantities, which is calculated as the shift of a cosine that is a sum of cosines, each shifted by a circular quantity of period $2\pi$. We can use a vector length weighted version of the same method, but first we need to bunch together all the directions that are equal modulo $\pi/2$. We can do this by multiplying the angle of each gradient vector $[X_k,Y_k]$ by 4, using a complex number representation:

$Z_k = \frac{(X_k + Y_k i)^4}{\sqrt{X_k^2 + Y_k^2}^3} = \frac{X_k^4 - 6X_k^2Y_k^2 + Y_k^4 + (4X_k^3Y_k - 4X_kY_k^3)i}{\sqrt{X_k^2 + Y_k^2}^3},\tag{2}$

satisfying $|Z_k| = \sqrt{X_k^2 + Y_k^2}$, and by later interpreting the phases of $Z_k$ from $-\pi$ to $\pi$ as representing angles from $-\pi/4$ to $\pi/4$, by dividing the calculated circular mean phase by 4:

$\phi = \frac{1}{4}\operatorname{atan2}\left(\sum_k\operatorname{Im}(Z_k), \sum_k\operatorname{Re}(Z_k)\right)\tag{3}$

where $\phi$ is the estimated image orientation.
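
As a sanity check of Eqs. 2 and 3, the following sketch applies them to synthetic gradient vectors whose directions are a known angle plus random multiples of 90 degrees and some jitter. The function name `fourfold_mean_angle`, the 12.5 degree test angle, and the noise level are assumptions made for this illustration only.

```python
import numpy as np

def fourfold_mean_angle(X, Y):
    """Length-weighted circular mean of gradient directions modulo pi/2 (Eqs. 2-3)."""
    length = np.hypot(X, Y)
    safe = np.where(length > 0, length, 1.0)             # avoid division by zero
    Z = np.where(length > 0, (X + 1j*Y)**4/safe**3, 0.0)  # |Z| equals the vector length
    return np.angle(np.sum(Z))/4                          # atan2(sum Im, sum Re)/4

rng = np.random.default_rng(0)
true_phi = np.deg2rad(12.5)
n = 10000
# Directions: the true angle plus k*90 degrees plus 3 degrees (std) of jitter.
angles = true_phi + rng.integers(0, 4, n)*np.pi/2 + rng.normal(0.0, np.deg2rad(3.0), n)
lengths = rng.uniform(0.5, 2.0, n)
X, Y = lengths*np.cos(angles), lengths*np.sin(angles)

phi_hat = fourfold_mean_angle(X, Y)
print(np.rad2deg(phi_hat))
```

The estimate is only defined modulo 90 degrees, which is why the result is reported in the range from -45 to 45 degrees.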

The quality of the estimate can be assessed by doing another pass through the data and calculating the mean weighted square circular distance, $\text{MSCD}$, between the phases of the complex numbers $Z_k$ and the estimated circular mean phase $4\phi$, with $|Z_k|$ as the weight:

$\begin{gather}\text{MSCD} = \frac{\sum_k|Z_k|\bigg(1 - \cos\Big(4\phi - \operatorname{atan2}\big(\operatorname{Im}(Z_k), \operatorname{Re}(Z_k)\big)\Big)\bigg)}{\sum_k|Z_k|}\\ = \frac{\sum_k\frac{|Z_k|}{2}\left(\left(\cos(4\phi) - \frac{\operatorname{Re}(Z_k)}{|Z_k|}\right)^2 + \left(\sin(4\phi) - \frac{\operatorname{Im}(Z_k)}{|Z_k|}\right)^2\right)}{\sum_k|Z_k|}\\ = \frac{\sum_k\big(|Z_k| - \operatorname{Re}(Z_k)\cos(4\phi) - \operatorname{Im}(Z_k)\sin(4\phi)\big)}{\sum_k|Z_k|},\end{gather}\tag{4}$

which is minimized by $\phi$ calculated per Eq. 3. In Python:

absZ = np.sqrt(X**2 + Y**2)
reZ = (X**4 - 6*X**2*Y**2 + Y**4)/absZ**3
imZ = (4*X**3*Y - 4*X*Y**3)/absZ**3
phi = np.arctan2(np.sum(imZ), np.sum(reZ))/4

sumWeighted = np.sum(absZ - reZ*np.cos(4*phi) - imZ*np.sin(4*phi))
sumAbsZ = np.sum(absZ)
mscd = sumWeighted/sumAbsZ

print("rotate", -phi*180/np.pi, "deg, RMSCD =", np.arccos(1 - mscd)/4*180/np.pi, "deg equivalent (weight = length)")
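
The three forms of Eq. 4 and the claim that the circular mean phase minimizes the MSCD can be checked numerically. The sketch below uses random complex numbers in place of real $Z_k$ values; `mscd` and the other names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
Z = rng.normal(size=1000) + 1j*rng.normal(size=1000)  # stand-in for the Z_k
w = np.abs(Z)                                          # weights |Z_k|
phase4 = np.angle(np.sum(Z))                           # 4*phi per Eq. 3

def mscd(a):
    """Third form of Eq. 4, evaluated for a candidate mean phase a = 4*phi."""
    return np.sum(w - Z.real*np.cos(a) - Z.imag*np.sin(a))/np.sum(w)

form1 = np.sum(w*(1 - np.cos(phase4 - np.angle(Z))))/np.sum(w)
form2 = np.sum(w/2*((np.cos(phase4) - Z.real/w)**2
                    + (np.sin(phase4) - Z.imag/w)**2))/np.sum(w)
form3 = mscd(phase4)
print(form1, form2, form3)
```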

Based on my mpmath experiments (not shown), I think we won't run out of numerical precision even for very large images. For different filter settings (annotated) the outputs are, as reported between -45 and 45 degrees:

rotate 32.29809399495655 deg, RMSCD = 17.057059965741338 deg equivalent (omega_c = np.pi)
rotate 32.07672617150525 deg, RMSCD = 16.699056648843566 deg equivalent (omega_c = np.pi/2)
rotate 32.13115293914797 deg, RMSCD = 15.217534399922902 deg equivalent (omega_c = np.pi/4, same as in the Python listing)
rotate 32.18444156018288 deg, RMSCD = 14.239347706786056 deg equivalent (omega_c = np.pi/8)
rotate 32.23705383489169 deg, RMSCD = 13.63694582160468 deg equivalent (omega_c = np.pi/16)

Strong low-pass filtering appears useful, reducing the root mean square circular distance (RMSCD) equivalent angle calculated as $\operatorname{acos}(1 - \text{MSCD})$ (the Python listing additionally divides by 4 to convert from the quadrupled phase back to an image angle). Without the 2-d rotated cosine window, some of the results would be off by a degree or so (not shown), which means that it is important to do proper windowing of the analysis filters. The RMSCD equivalent angle is not directly an estimate of the error in the angle estimate, which should be much smaller.

Alternative square-length weight function

Let's try the square of the vector length as an alternative weight function, by:

$Z_k = \frac{(X_k + Y_k i)^4}{\sqrt{X_k^2 + Y_k^2}^2} = \frac{X_k^4 - 6X_k^2Y_k^2 + Y_k^4 + (4X_k^3Y_k - 4X_kY_k^3)i}{X_k^2 + Y_k^2},\tag{5}$

In Python:

absZ_alt = X**2 + Y**2
reZ_alt = (X**4 - 6*X**2*Y**2 + Y**4)/absZ_alt
imZ_alt = (4*X**3*Y - 4*X*Y**3)/absZ_alt
phi_alt = np.arctan2(np.sum(imZ_alt), np.sum(reZ_alt))/4

sumWeighted_alt = np.sum(absZ_alt - reZ_alt*np.cos(4*phi_alt) - imZ_alt*np.sin(4*phi_alt))
sumAbsZ_alt = np.sum(absZ_alt)
mscd_alt = sumWeighted_alt/sumAbsZ_alt

print("rotate", -phi_alt*180/np.pi, "deg, RMSCD =", np.arccos(1 - mscd_alt)/4*180/np.pi, "deg equivalent (weight = length^2)")

The square length weight reduces the RMSCD equivalent angle by about a degree:

rotate 32.264713568426764 deg, RMSCD = 16.06582418749094 deg equivalent (weight = length^2, omega_c = np.pi, N = 41)
rotate 32.03693157762725 deg, RMSCD = 15.839593856962486 deg equivalent (weight = length^2, omega_c = np.pi/2, N = 41)
rotate 32.11471435914187 deg, RMSCD = 14.315371970649874 deg equivalent (weight = length^2, omega_c = np.pi/4, N = 41)
rotate 32.16968341455537 deg, RMSCD = 13.624896827482049 deg equivalent (weight = length^2, omega_c = np.pi/8, N = 41)
rotate 32.22062839958777 deg, RMSCD = 12.495324176281466 deg equivalent (weight = length^2, omega_c = np.pi/16, N = 41)
rotate 32.22385477783647 deg, RMSCD = 13.629915935941973 deg equivalent (weight = length^2, omega_c = np.pi/32, N = 81)
rotate 32.284350817263906 deg, RMSCD = 12.308297934977746 deg equivalent (weight = length^2, omega_c = np.pi/64, N = 161)

This seems like a slightly better weight function. I also added the cutoffs $\omega_c = \pi/32$ and $\omega_c = \pi/64$. They use a larger N, resulting in a different cropping of the image and MSCD values that are not strictly comparable.

1-d histogram

The benefit of the square-length weight function is more apparent with a 1-d weighted histogram of $Z_k$ phases. Python script:

# Optional histogram
hist_plain, bin_edges = np.histogram(np.arctan2(imZ, reZ), weights=np.ones(absZ.shape)/absZ.size, bins=900)
hist, bin_edges = np.histogram(np.arctan2(imZ, reZ), weights=absZ/np.sum(absZ), bins=900)
hist_alt, bin_edges = np.histogram(np.arctan2(imZ_alt, reZ_alt), weights=absZ_alt/np.sum(absZ_alt), bins=900)
bin_centers_deg = (bin_edges[:-1] + (bin_edges[1] - bin_edges[0])/2)*45/np.pi  # Bin centers, phase/4 converted to degrees
plt.plot(bin_centers_deg, hist_plain, "black")
plt.plot(bin_centers_deg, hist, "red")
plt.plot(bin_centers_deg, hist_alt, "blue")
plt.xlabel("angle (degrees)")
plt.show()

Figure 5. Linearly interpolated weighted histograms of gradient vector angles, wrapped to $-\pi/4\ldots\pi/4$ and weighted by (in order from bottom to top at the peak): no weighting (black), gradient vector length (red), square of gradient vector length (blue). The bin width is 0.1 degrees. The filter cutoff was omega_c = np.pi/4, same as in the Python listing. The bottom figure is zoomed in at the peaks.

Steerable filter math

We have seen that the approach works, but it would be good to have a better mathematical understanding. The $x$ and $y$ differentiation filter impulse responses given by Eq. 1 can be understood as the basis functions for forming the impulse response of a steerable differentiation filter that is sampled from a rotation of the right side of the equation for $h_x[x, y]$ (Eq. 1). This is more easily seen by converting Eq. 1 to polar coordinates:

$\begin{align}h_x(r, \theta) = h_x[r\cos(\theta), r\sin(\theta)] &= \begin{cases}0&\text{if }r = 0,\\-\displaystyle\frac{\omega_c^2\,r\cos(\theta)\,J_2\left(\omega_c r\right)}{2 \pi\,r^2}&\text{otherwise}\end{cases}\\ &= \cos(\theta)f(r),\\ h_y(r, \theta) = h_y[r\cos(\theta), r\sin(\theta)] &= \begin{cases}0&\text{if }r = 0,\\-\displaystyle\frac{\omega_c^2\,r\sin(\theta)\,J_2\left(\omega_c r\right)}{2 \pi\,r^2}&\text{otherwise}\end{cases}\\ &= \sin(\theta)f(r),\\ f(r) &= \begin{cases}0&\text{if }r = 0,\\-\displaystyle\frac{\omega_c^2\,r\,J_2\left(\omega_c r\right)}{2 \pi\,r^2}&\text{otherwise,}\end{cases}\end{align}\tag{6}$

where both the horizontal and the vertical differentiation filter impulse responses have the same radial factor function $f(r)$ . Any rotated version $h(r, \theta, \phi)$ of $h_x(r, \theta)$ by steering angle $\phi$ is obtained by:

$h(r, \theta, \phi) = h_x(r, \theta - \phi) = \cos(\theta - \phi)f(r)\tag{7}$

The idea was that the steered kernel $h(r, \theta, \phi)$ can be constructed as a weighted sum of $h_x(r, \theta)$ and $h_y(r, \theta)$ , with $\cos(\phi)$ and $\sin(\phi)$ as the weights, and that is indeed the case:

$\cos(\phi) h_x(r, \theta) + \sin(\phi) h_y(r, \theta) = \cos(\phi) \cos(\theta) f(r) + \sin(\phi) \sin(\theta) f(r) = \cos(\theta - \phi) f(r) = h(r, \theta, \phi).\tag{8}$
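
The steering identity of Eq. 8 can also be confirmed numerically on a sampled grid. This is a minimal sketch without the window function; the helper names, the grid size, and the steering angle are chosen only for the illustration.

```python
import numpy as np
from scipy.special import jv

def f_radial(r, omega_c):
    """Radial factor f(r) of Eq. 6, with f(0) = 0."""
    r = np.asarray(r, dtype=float)
    safe_r = np.where(r == 0, 1.0, r)  # avoid 0/0; the r = 0 value is overwritten below
    values = -omega_c**2*jv(2, omega_c*safe_r)/(2*np.pi*safe_r)
    return np.where(r == 0, 0.0, values)

def h_x(r, theta, omega_c):
    return np.cos(theta)*f_radial(r, omega_c)

def h_y(r, theta, omega_c):
    return np.sin(theta)*f_radial(r, omega_c)

omega_c = np.pi/4
phi = np.deg2rad(32.19)  # steering angle
y, x = np.mgrid[-20:21, -20:21].astype(float)
r, theta = np.hypot(x, y), np.arctan2(y, x)

# Left side of Eq. 8: weighted sum of the two basis kernels.
steered_sum = np.cos(phi)*h_x(r, theta, omega_c) + np.sin(phi)*h_y(r, theta, omega_c)
# Right side of Eq. 8: h_x evaluated in coordinates rotated by phi (Eq. 7).
rotated = h_x(r, theta - phi, omega_c)
max_abs_difference = float(np.max(np.abs(steered_sum - rotated)))
print(max_abs_difference)
```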

We will arrive at an equivalent conclusion if we think of the isotropically low-pass filtered signal as the input signal and construct a partial derivative operator with respect to the first of rotated coordinates $x_\phi$ , $y_\phi$ rotated by angle $\phi$ from coordinates $x$ , $y$ . (Differentiation can be considered a linear-time-invariant system.) We have:

$\begin{gather}x = \cos(\phi)x_\phi - \sin(\phi)y_\phi,\\ y = \sin(\phi)x_\phi + \cos(\phi)y_\phi\end{gather}\tag{9}$

Using the chain rule for partial derivatives, the partial derivative operator with respect to $x_\phi$ can be expressed as a cosine and sine weighted sum of partial derivatives with respect to $x$ and $y$ :

$\begin{gather}\frac{\partial}{\partial x_\phi} = \frac{\partial x}{\partial x_\phi}\frac{\partial}{\partial x} + \frac{\partial y}{\partial x_\phi}\frac{\partial}{\partial y} = \frac{\partial \big(\cos(\phi)x_\phi - \sin(\phi)y_\phi\big)}{\partial x_\phi}\frac{\partial}{\partial x} + \frac{\partial \big(\sin(\phi)x_\phi + \cos(\phi)y_\phi\big)}{\partial x_\phi}\frac{\partial}{\partial y} = \cos(\phi)\frac{\partial}{\partial x} + \sin(\phi)\frac{\partial}{\partial y}\end{gather}\tag{10}$
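
Eq. 10 can be checked against a finite difference taken along the rotated $x_\phi$ axis. The test function below is arbitrary and only serves as a smooth example; all names are hypothetical.

```python
import numpy as np

def f(x, y):
    return np.sin(1.3*x)*np.cos(0.7*y) + 0.2*x*y

def df_dx(x, y):
    return 1.3*np.cos(1.3*x)*np.cos(0.7*y) + 0.2*y

def df_dy(x, y):
    return -0.7*np.sin(1.3*x)*np.sin(0.7*y) + 0.2*x

phi = np.deg2rad(32.19)
x0, y0 = 0.4, -1.1
eps = 1e-5

# A step of eps along x_phi moves (x, y) by (eps*cos(phi), eps*sin(phi)), per Eq. 9.
dx, dy = eps*np.cos(phi), eps*np.sin(phi)
finite_difference = (f(x0 + dx, y0 + dy) - f(x0 - dx, y0 - dy))/(2*eps)

# Right side of Eq. 10 using the analytic partial derivatives.
chain_rule = np.cos(phi)*df_dx(x0, y0) + np.sin(phi)*df_dy(x0, y0)
print(finite_difference, chain_rule)
```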

A question that remains to be explored is how a suitably weighted circular mean of gradient vector angles is related to the angle $\phi$ of in some way the "most activated" steered differentiation filter.

Possible improvements

To possibly improve results further, the gradient can be calculated also for the red and blue color channels, to be included as additional data in the "average" calculation.

I have in mind possible extensions of this method:

1) Use a larger set of analysis filter kernels and detect edges rather than detecting gradients. This needs to be carefully crafted so that edges in all directions are treated equally, that is, an edge detector for any angle should be obtainable by a weighted sum of orthogonal kernels. A set of suitable kernels can (I think) be obtained by applying the differential operators of Eq. 11, Fig. 6 (see also my Mathematics Stack Exchange post) on the continuous-space impulse response of a circularly symmetric low-pass filter.

$\begin{gather}\lim_{h\to 0}\frac{\sum_{n=0}^{4N + 1} (-1)^n f\bigg(x + h\cos\left(\frac{2\pi n}{4N + 2}\right), y + h\sin\left(\frac{2\pi n}{4N + 2}\right)\bigg)}{h^{2N + 1}},\\ \lim_{h\to 0}\frac{\sum_{n=0}^{4N + 1} (-1)^n f\bigg(x + h\sin\left(\frac{2\pi n}{4N + 2}\right), y + h\cos\left(\frac{2\pi n}{4N + 2}\right)\bigg)}{h^{2N + 1}}\end{gather}\tag{11}$

Figure 6. Dirac delta relative locations in differential operators for construction of higher-order edge detectors.

2) The calculation of a (weighted) mean of circular quantities can be understood as summing of cosines of the same frequency shifted by samples of the quantity (and scaled by the weight), and finding the peak of the resulting function. If similarly shifted and scaled harmonics of the shifted cosine, with carefully chosen relative amplitudes, are added to the mix, forming a sharper smoothing kernel, then multiple peaks may appear in the total sum and the peak with the largest value can be reported. With a suitable mixture of harmonics, that would give a kind of local average that largely ignores outliers away from the main peak of the distribution.

Alternative approaches

It would also be possible to convolve the image by angle $\phi$ and angle $\phi + \pi/2$ rotated "long edge" kernels, and to calculate the mean square of the pixels of the two convolved images. The angle $\phi$ that maximizes the mean square would be reported. This approach might give a good final refinement for the image orientation finding, because it is risky to search the complete angle $\phi$ space at large steps.

Another approach is non-local methods, like cross-correlating distant similar regions, applicable if you know that there are long horizontal or vertical traces, or features that repeat many times horizontally or vertically.

— Olli Niemitalo

How accurate is the result you got?

— Royi

@Royi Maybe around 0.1 deg.

— Olli Niemitalo

@OlliNiemitalo which is pretty impressive, given the limited resolution!

— Marcus Müller

@OlliNiemitalo speaking of impressive: this. answer. is. that. word's. very. definition.

— Marcus Müller

@MarcusMüller Thanks Marcus, I anticipate the first extension to be very interesting too.

— Olli Niemitalo

There is a similar DSP trick here, but I don't remember the details exactly.

I read about it somewhere, some while ago. It has to do with figuring out fabric pattern matches regardless of the orientation. So you may want to research on that.

Grab a circle sample. Do sums along spokes of the circle to get a circumference profile. Then they did a DFT on that (it is inherently circular after all). Toss the phase information (make it orientation independent) and make a comparison.
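
A minimal sketch of that recipe, assuming bilinear sampling along the spokes via `scipy.ndimage.map_coordinates`; the function names and default sample counts are made up for the illustration and are not taken from the fabric-matching work being recalled.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def spoke_profile(img, center, radius, n_spokes=360, n_samples=64):
    """Sum image values along spokes from the center; one value per spoke angle."""
    cy, cx = center
    angles = np.linspace(0.0, 2*np.pi, n_spokes, endpoint=False)
    radii = np.linspace(0.0, radius, n_samples)
    rows = cy + np.outer(np.sin(angles), radii)
    cols = cx + np.outer(np.cos(angles), radii)
    samples = map_coordinates(np.asarray(img, dtype=float), [rows, cols], order=1)
    return samples.sum(axis=1)

def orientation_free_signature(profile):
    """DFT magnitude of the circumference profile; the phase (orientation) is discarded."""
    return np.abs(np.fft.fft(profile))
```

Rotating the image circularly shifts the profile, which changes only the DFT phase, so two patches of the same pattern give approximately the same signature regardless of orientation.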

Then they could tell whether two fabrics had the same pattern.

Your problem is similar.

It seems to me, without trying it first, that the characteristics of the pre DFT profile should reveal the orientation. Doing standard deviations along the spokes instead of sums should work better, maybe both.

Now, if you had an oriented reference image, you could use their technique.

Ced

Your precision requirements are rather strict.

I gave this a whack. Taking the sum of the absolute values of the differences between two subsequent points along the spoke for each color.

Here is a graph of around the circumference. Your value is plotted with the white markers.

You can sort of see it, but I don't think this is going to work for you. Sorry.

Progress Report: Some

I've decided on a three step process.

1) Find evaluation spot.

2) Coarse Measurement

3) Fine Measurement

Currently, the first step is user intervention. It should be automatable, but I'm not bothering. I have a rough draft of the second step. There's some tweaking I want to try. Finally, I have a few candidates for the third step, and it is going to take testing to see which works best.

The good news is it is lightning fast. If your only purpose is to make an image look level on a web page, then your tolerances are way too strict and the coarse measurement ought to be accurate enough.

This is the coarse measurement. Each pixel is about 0.6 degrees. (Edit, actually 0.3)

Progress Report: Able to get good results

Most aren't this good, but they are cheap (and fairly local) and finding spots to get good reads is easy..... for a human. Brute force should work fine for a program.

The results can be much improved on, this is a simple baseline test. I'm not ready to do any explaining yet, nor post the code, but this screen shot ain't photoshopped.

Progress Report: The code is posted, I'm done with this for a while.

This screenshot is the program working on Marcus' 45 degree shot.

The color channels are processed independently.

A point is selected as the sweep center.

A diameter is swept through 180 degrees at discrete angles

At each angle, "volatility" is measured across the diameter. A trace is made for each channel gathering samples. The sample value is a linear interpolation of the four corner values of whichever grid square the sample spot lands on.

For each channel trace

The samples are multiplied by a VonHann window function

A Smooth/Differ pass is made on the samples

The RMS of the Differ is used as a volatility measure
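
The steps above can be sketched for a single channel as follows. This is not the posted Gambas code: the function names are hypothetical, the "Smooth/Differ" pass is reduced to a plain first difference (an assumption, since its details are not described here), and `np.hanning` stands in for the VonHann window.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def trace_volatility(channel, center, angle_deg, half_length):
    """RMS of the first difference of a Hann-windowed trace along one diameter."""
    cy, cx = center
    t = np.arange(-half_length, half_length + 1, dtype=float)  # 1 px sample spacing
    a = np.deg2rad(angle_deg)
    rows = cy + t*np.sin(a)
    cols = cx + t*np.cos(a)
    # order=1 interpolates linearly between the four surrounding grid values.
    samples = map_coordinates(np.asarray(channel, dtype=float), [rows, cols], order=1)
    windowed = samples*np.hanning(samples.size)
    differ = np.diff(windowed)
    return float(np.sqrt(np.mean(differ**2)))

def sweep(channel, center, half_length, step_deg=0.5):
    """Volatility for diameters from 0 to 180 degrees (exclusive)."""
    angles = np.arange(0.0, 180.0, step_deg)
    vol = np.array([trace_volatility(channel, center, a, half_length) for a in angles])
    return angles, vol

# Demo on synthetic horizontal stripes: the trace along the stripes (0 degrees)
# should be far less volatile than a diagonal trace.
stripes = np.sin(0.5*np.arange(201.0))[:, None]*np.ones((1, 201))
v0 = trace_volatility(stripes, (100, 100), 0.0, 80)
v45 = trace_volatility(stripes, (100, 100), 45.0, 80)
print(v0, v45)
```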

The lower row graphs are:

First is the sweep of 0 to 180 degrees, each pixel is 0.5 degrees.
Second is the sweep around the selected angle, each pixel is 0.1 degrees.
Third is the sweep around the selected angle, each pixel is 0.01 degrees.
Fourth is the trace Differ curve.

The initial selection is the minimal average volatility of the three channels. This will be close, but usually not on, the best angle. The symmetry at the trough is a better indicator than the minimum. A best fit parabola in that neighborhood should yield a very good answer.
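
One way to do the parabola fit mentioned above is a least-squares quadratic through the volatility samples near the trough. The sketch below assumes `np.polyfit` and a hypothetical helper name; the sample values in the demo are synthetic.

```python
import numpy as np

def parabola_vertex(angles, values):
    """Angle at the minimum of a least-squares parabola through the samples."""
    angles = np.asarray(angles, dtype=float)
    center = angles.mean()  # center the abscissa for better conditioning
    a, b, _ = np.polyfit(angles - center, np.asarray(values, dtype=float), 2)
    if a <= 0:
        raise ValueError("samples do not describe a trough")
    return center - b/(2*a)

# Synthetic trough with its minimum at 32.19 degrees.
sample_angles = np.linspace(32.0, 32.4, 9)
sample_values = 3.0*(sample_angles - 32.19)**2 + 0.5
print(parabola_vertex(sample_angles, sample_values))
```

Because the fit uses all samples in the neighborhood, it reflects the symmetry of the trough rather than the single lowest sample.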

The source code (in Gambas, PPA gambas-team/gambas3) can be found at:

https://forum.gambas.one/viewtopic.php?f=4&t=707

It is an ordinary zip file, so you don't have to install Gambas to look at the source. The files are in the ".src" subdirectory.

Removing the VonHann window yields higher accuracy because it effectively lengthens the trace, but adds wobbles. Perhaps a double VonHann would be better as the center is unimportant and a quicker onset of "when the teeter-totter hits the ground" will be detected. Accuracy can easily be improved by increasing the trace length as far as the image allows (Yes, that's automatable). A better window function, sinc?

The measures I have taken at the current settings confirm the 3.19 value +/-.03 ish.

This is just the measuring tool. There are several strategies I can think of to apply it to the image. That, as they say, is an exercise for the reader. Or in this case, the OP. I'll be trying my own later.

There's head room for improvement in both the algorithm and the program, but already they are really useful.

Here is how the linear interpolation works

'---- Whole Number Portion

        x = Floor(rx)
        y = Floor(ry)

'---- Fractional Portions

        fx = rx - x
        fy = ry - y

        gx = 1.0 - fx
        gy = 1.0 - fy

'---- Weighted Average

        vtl = ArgValues[x, y] * gx * gy         ' Top Left
        vtr = ArgValues[x + 1, y] * fx * gy     ' Top Right
        vbl = ArgValues[x, y + 1] * gx * fy     ' Bottom Left
        vbr = ArgValues[x + 1, y + 1] * fx * fy ' Bottom Right

        v = vtl + vtr + vbl + vbr

Anybody know the conventional name for that?

— Cedron Dawg

hey, you don't need to be sorry for something that was a very clever approach, and might be super helpful for someone with a similar problem who'll come here later! +1

— Marcus Müller

@BarsMonster, I am making good progress. You will want to install Gambas (PPA: gambas-team/gambas3) on your Linux box. (Likely, you too Marcus and Olli, if you can.) I'm working on a program that will not only tackle this problem, but will also serve as a good base for other image processing tasks.

— Cedron Dawg

looking forward!

— Marcus Müller

@CedronDawg that's called bilinear interpolation; here's why, which also points to an alternative implementation.

— Olli Niemitalo

@OlliNiemitalo, Thanks Olli. In this situation, I don't think going bicubic would improve results over bilinear; in fact, it may even be detrimental. Later, I will play around with different volatility metrics along the diameter, and differently shaped window functions. At this point I am thinking of using a VonHann at the ends of the diameter like paddles or "teeter-totter seats hitting the mud". The flat bottom in the curve is where the teeter-totter hasn't hit the ground (edge) yet. Half way between the two corners is a good read. The current settings are good to less than 0.1 degrees.

— Cedron Dawg

Rather performance intensive, but should get you accuracy as wanted:

Edge detect the image
Hough transform to a space where you have enough pixels for the wanted accuracy.
Because there are enough orthogonal lines, the image in Hough space will contain maxima lying on two lines. These are easily detectable and give you the desired angle.
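
The steps above can be sketched without any image library. This is a toy illustration under stated assumptions, not a tuned implementation: the edge-detection step is skipped by generating edge points directly, the helper names `hough_fold90` and `synthetic_edges` are made up for this example, the peak sharpness score (sum of squared bin counts) is one possible choice, and the 0.5° step is far coarser than the 0.01° target, which would need a correspondingly finer angle grid or a coarse-to-fine search.

```python
import math
from collections import defaultdict

def hough_fold90(points, step_deg=0.5):
    """Return the angle (degrees, folded modulo 90) with the sharpest Hough peaks."""
    n_steps = round(180 / step_deg)
    score = defaultdict(float)
    for k in range(n_steps):
        theta_deg = k * step_deg
        theta = math.radians(theta_deg)
        c, s = math.cos(theta), math.sin(theta)
        bins = defaultdict(int)
        for x, y in points:
            bins[round(x * c + y * s)] += 1   # rho bin, 1 px wide
        # Sum of squared counts rewards votes concentrated in few rho bins;
        # folding modulo 90 lets both orthogonal line families vote together
        score[theta_deg % 90] += sum(n * n for n in bins.values())
    return max(score, key=score.get)

def synthetic_edges(angle_deg, offsets=(-60, -20, 20, 60)):
    """Edge points on two orthogonal families of lines rotated by angle_deg."""
    pts = []
    for family in (angle_deg, angle_deg + 90):
        a = math.radians(family)
        dx, dy = math.cos(a), math.sin(a)   # line direction
        nx, ny = -dy, dx                    # line normal
        for off in offsets:
            for t in range(-200, 201, 4):
                pts.append((off * nx + t * dx, off * ny + t * dy))
    return pts

print(hough_fold90(synthetic_edges(32.0)))  # 32.0
```

The accumulator size grows with the inverse of the angle step, which is where the "rather performance intensive" remark comes from.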

— RobAu

Nice, exactly my approach: I'm kind of sad that I didn't see it before I went on my train ride and thus didn't incorporate it in my answer. A clear +1!

— Marcus Müller

I went ahead and basically adjusted the Hough transform example of OpenCV to your use case. The idea is nice, but since your image already has plenty of edges due to its edgy nature, the edge detection shouldn't have much benefit.

So, what I did on top of said example was:

Omit the edge detection
decompose your input image into color channels and process them separately
count the occurrences of lines at a specific angle (after quantizing the angles and taking them modulo 90°, since you have plenty of right angles)
combine the counters of the color channels
correct these rotations

What you could do to further improve the quality of estimation (as you'll see below, the top guess wasn't right – the second was) would probably amount to converting the image to a grayscale image that best represents the actual differences between different materials – clearly, the RGB channels aren't the best. You're the semiconductor expert, so find a way to combine the color channels in a way that maximizes the difference between e.g. metallization and silicon.

My jupyter notebook is here. See the results below.

To increase the angular resolution, increase the QUANT_STEPS variable, and refine the angular resolution passed to the cv2.HoughLines call. I didn't, because I wanted this code to be written in < 20 min, and thus didn't want to invest a minute in computation.

import cv2
import numpy
from matplotlib import pyplot
import collections

QUANT_STEPS = 360*2

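def quantized_angle(line, quant = QUANT_STEPS):
    # Snap the line's angle theta [rad] to one of `quant` bins per full turn,
    # convert to degrees and fold modulo 90° (the features are orthogonal)
    theta = line[0][1]
    return numpy.round(theta / numpy.pi / 2 * quant) / quant * 360 % 90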

def detect_rotation(monochromatic_img):
    # edges = cv2.Canny(monochromatic_img, 50, 150, apertureSize = 3) #play with these parameters
    lines = cv2.HoughLines(monochromatic_img, #input
                           1, # rho resolution [px]
                           numpy.pi/180, # angular resolution [radian]
                           200) # accumulator threshold – higher = fewer candidates
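    if lines is None: # HoughLines returns None when no line passes the threshold
        return collections.Counter()
    counter = collections.Counter(quantized_angle(line) for line in lines)
    return counter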

img = cv2.imread("/tmp/HIKRe.jpg") #Image directly as grabbed from imgur.com
total_count = collections.Counter()
for channel in range(img.shape[-1]):
    total_count.update(detect_rotation(img[:,:,channel]))

most_common = total_count.most_common(5)

for angle,_ in most_common:
    pyplot.figure(figsize=(8,6), dpi=100)
    pyplot.title(f"{angle:.3f}°")
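    # getRotationMatrix2D expects the center as (x, y), warpAffine the size as (width, height)
    height, width = img.shape[:2]
    rotation = cv2.getRotationMatrix2D((width/2, height/2), -angle, 1)
    pyplot.imshow(cv2.warpAffine(img, rotation, (width, height)))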
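
To make the resolution remark concrete, here is a small standalone sketch of the quantization arithmetic. It is pure Python; `quantized_angle_deg` is a hypothetical stand-in that mirrors the formula of `quantized_angle` above but takes the angle directly instead of an OpenCV line. As far as I can tell from the code above, the `numpy.pi/180` argument limits the detected angles to 1° steps anyway, so both settings would have to be refined together: roughly `numpy.pi/18000` and at least 36000 quantization steps for 0.01°.

```python
import math

def quantized_angle_deg(theta, quant):
    """Snap theta [rad] to `quant` bins per full turn; return degrees folded modulo 90."""
    return round(theta / (2 * math.pi) * quant) * 360 / quant % 90

# Normal angle of a line in an image rotated by about 32.19 degrees
theta = math.radians(122.19)

print(360 / 720)                                    # bin width for QUANT_STEPS = 360*2: 0.5
print(quantized_angle_deg(theta, 720))              # 32.0
print(round(quantized_angle_deg(theta, 36000), 6))  # 32.19
```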

— Marcus Müller

This is a go at the first suggested extension of my previous answer.

Ideal circularly symmetric band-limiting filters

We construct an orthogonal bank of four filters bandlimited to inside a circle of radius $\omega_c$ on the frequency plane. The impulse responses of these filters can be linearly combined to form directional edge detection kernels. An arbitrarily normalized set of orthogonal filter impulse responses is obtained by applying the first two pairs of "beach-ball like" differential operators to the continuous-space impulse response $h(x,y)$ of the circularly symmetric ideal band-limiting filter:

$h(x,y) = \frac{\omega_c}{2\pi \sqrt{x^2 + y^2} } J_1 \big( \omega_c \sqrt{x^2 + y^2} \big)\tag{1}$

$\begin{align}h_{0x}(x, y) &\propto \frac{d}{dx}h(x, y),\\ h_{0y}(x, y) &\propto \frac{d}{dy}h(x, y),\\ h_{1x}(x, y) &\propto \left(\left(\frac{d}{dx}\right)^3-3\frac{d}{dx}\left(\frac{d}{dy}\right)^2\right)h(x, y),\\ h_{1y}(x, y) &\propto \left(\left(\frac{d}{dy}\right)^3-3\frac{d}{dy}\left(\frac{d}{dx}\right)^2\right)h(x, y)\end{align}\tag{2}$

$\begin{align}h_{0x}(x, y) &= \begin{cases}0&\text{if }x = y = 0,\\-\displaystyle\frac{\omega_c^2\,x\,J_2\left(\omega_c\sqrt{x^2 + y^2}\right)}{2 \pi\,(x^2 + y^2)}&\text{otherwise,}\end{cases}\\ h_{0y}(x, y) &= h_{0x}(y, x),\\ h_{1x}(x, y) &= \begin{cases}0&\text{if }x = y = 0,\\\frac{\begin{array}{l}\Big(\omega_c x(3y^2 - x^2)\big(J_0\left(\omega_c\sqrt{x^2 + y^2}\right)\omega_c\sqrt{x^2 + y^2}(\omega_c^2x^2 + \omega_c^2y^2 - 24)\\ - 8J_1\left(\omega_c\sqrt{x^2 + y^2}\right)(\omega_c^2x^2 + \omega_c^2y^2 - 6)\big)\Big)\end{array}}{2\pi(x^2 + y^2)^{7/2}}&\text{otherwise,}\end{cases}\\ h_{1y}(x, y) &= h_{1x}(y, x),\end{align}\tag{3}$

where $J_\alpha$ is a Bessel function of the first kind of order $\alpha$ and $\propto$ means "is proportional to". I used Wolfram Alpha queries ((ᵈ/dx)³; ᵈ/dx; ᵈ/dx(ᵈ/dy)²) to carry out differentiation, and simplified the result.
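
The special cases at the origin follow from the small-argument behaviour of the Bessel functions, $J_1(z) \approx z/2$ and $J_2(z) \approx z^2/8$ as $z \to 0$. As a quick sanity check, expanding Eq. 1 and the first line of Eq. 3 near the origin, with the shorthand $r = \sqrt{x^2 + y^2}$ (introduced only for this check), gives:

$\begin{align}h(x, y) &\approx \frac{\omega_c}{2\pi r}\cdot\frac{\omega_c r}{2} = \frac{\omega_c^2}{4\pi},\\ h_{0x}(x, y) &\approx -\frac{\omega_c^2\,x}{2\pi r^2}\cdot\frac{\omega_c^2 r^2}{8} = -\frac{\omega_c^4\,x}{16\pi} \to 0.\end{align}$

The first limit is the value that the Python code below assigns to the centre sample of the low-pass kernel.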

Truncated kernels in Python:

import matplotlib.pyplot as plt
import scipy
import scipy.special
import numpy as np

def h0x(x, y, omega_c):
  if x == 0 and y == 0:
    return 0
  return -omega_c**2*x*scipy.special.jv(2, omega_c*np.sqrt(x**2 + y**2))/(2*np.pi*(x**2 + y**2))

def h1x(x, y, omega_c):
  if x == 0 and y == 0:
    return 0
  return omega_c*x*(3*y**2 - x**2)*(scipy.special.j0(omega_c*np.sqrt(x**2 + y**2))*omega_c*np.sqrt(x**2 + y**2)*(omega_c**2*x**2 + omega_c**2*y**2 - 24) - 8*scipy.special.j1(omega_c*np.sqrt(x**2 + y**2))*(omega_c**2*x**2 + omega_c**2*y**2 - 6))/(2*np.pi*(x**2 + y**2)**(7/2))

def rotatedCosineWindow(N):  # N = horizontal size of the targeted kernel, also its vertical size, must be odd.
  return np.fromfunction(lambda y, x: np.maximum(np.cos(np.pi/2*np.sqrt(((x - (N - 1)/2)/((N - 1)/2 + 1))**2 + ((y - (N - 1)/2)/((N - 1)/2 + 1))**2)), 0), [N, N])

def circularLowpassKernel(omega_c, N):  # omega = cutoff frequency in radians (pi is max), N = horizontal size of the kernel, also its vertical size, must be odd.
  kernel = np.fromfunction(lambda x, y: omega_c*scipy.special.j1(omega_c*np.sqrt((x - (N - 1)/2)**2 + (y - (N - 1)/2)**2))/(2*np.pi*np.sqrt((x - (N - 1)/2)**2 + (y - (N - 1)/2)**2)), [N, N])
  kernel[(N - 1)//2, (N - 1)//2] = omega_c**2/(4*np.pi)
  return kernel

def prototype0x(omega_c, N):  # omega = cutoff frequency in radians (pi is max), N = horizontal size of the kernel, also its vertical size, must be odd.
  kernel = np.zeros([N, N])
  for y in range(N):
    for x in range(N):
      kernel[y, x] = h0x(x - (N - 1)/2, y - (N - 1)/2, omega_c)
  return kernel

def prototype0y(omega_c, N):  # omega = cutoff frequency in radians (pi is max), N = horizontal size of the kernel, also its vertical size, must be odd.
  return prototype0x(omega_c, N).transpose()

def prototype1x(omega_c, N):  # omega = cutoff frequency in radians (pi is max), N = horizontal size of the kernel, also its vertical size, must be odd.
  kernel = np.zeros([N, N])
  for y in range(N):
    for x in range(N):
      kernel[y, x] = h1x(x - (N - 1)/2, y - (N - 1)/2, omega_c)
  return kernel

def prototype1y(omega_c, N):  # omega = cutoff frequency in radians (pi is max), N = horizontal size of the kernel, also its vertical size, must be odd.
  return prototype1x(omega_c, N).transpose()

N = 321  # Horizontal size of the kernel, also its vertical size. Must be odd.
window = rotatedCosineWindow(N)

# Optional window function plot
#plt.imshow(window, vmin=-np.max(window), vmax=np.max(window), cmap='bwr')
#plt.colorbar()
#plt.show()

omega_c = np.pi/8  # Cutoff frequency in radians <= pi
lowpass = circularLowpassKernel(omega_c, N)
kernel0x = prototype0x(omega_c, N)
kernel0y = prototype0y(omega_c, N)
kernel1x = prototype1x(omega_c, N)
kernel1y = prototype1y(omega_c, N)

# Optional kernel image save
plt.imsave('lowpass.png', plt.cm.bwr(plt.Normalize(vmin=-lowpass.max(), vmax=lowpass.max())(lowpass)))
plt.imsave('kernel0x.png', plt.cm.bwr(plt.Normalize(vmin=-kernel0x.max(), vmax=kernel0x.max())(kernel0x)))
plt.imsave('kernel0y.png', plt.cm.bwr(plt.Normalize(vmin=-kernel0y.max(), vmax=kernel0y.max())(kernel0y)))
plt.imsave('kernel1x.png', plt.cm.bwr(plt.Normalize(vmin=-kernel1x.max(), vmax=kernel1x.max())(kernel1x)))
plt.imsave('kernel1y.png', plt.cm.bwr(plt.Normalize(vmin=-kernel1y.max(), vmax=kernel1y.max())(kernel1y)))
plt.imsave('kernelkey.png', plt.cm.bwr(np.repeat([(np.arange(321)/320)], 16, 0)))

Figure 1. Color-mapped 1:1 scale plot of circularly symmetric band-limiting filter impulse response, with cut-off frequency $\omega_c = \pi/8$ . Color key: blue: negative, white: zero, red: maximum.

Figure 2. Color-mapped 1:1 scale plots of sampled impulse responses of filters in the filter bank, with cut-off frequency $\omega_c = \pi/8$, in order: $h_{0x}$, $h_{0y}$, $h_{1x}$, $h_{1y}$. Color key: blue: minimum, white: zero, red: maximum.

Directional edge detectors can be constructed as weighted sums of these. In Python (continued):

composite = kernel0x-4*kernel1x
plt.imsave('composite0.png', plt.cm.bwr(plt.Normalize(vmin=-composite.max(), vmax=composite.max())(composite)))
plt.imshow(composite, vmin=-np.max(composite), vmax=np.max(composite), cmap='bwr')
plt.colorbar()
plt.show()

composite = (kernel0x+kernel0y) + 4*(kernel1x+kernel1y)
plt.imsave('composite45.png', plt.cm.bwr(plt.Normalize(vmin=-composite.max(), vmax=composite.max())(composite)))
plt.imshow(composite, vmin=-np.max(composite), vmax=np.max(composite), cmap='bwr')
plt.colorbar()
plt.show()

Figure 3. Directional edge detection kernels constructed as weighted sums of kernels of Fig. 2. Color key: blue: minimum, white: zero, red: maximum.

The filters of Fig. 3 should be better tuned for continuous edges, compared to gradient filters (first two filters of Fig. 2).
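
The two weighted sums above are special cases of a general steering rule. The sketch below is my reading of that rule, not something stated in the answer: the first-order pair turns with the angle $\phi$ and the third-order pair with $3\phi$. It reproduces the 0° composite exactly and the 45° composite up to a factor of $\sqrt{2}/2$. To keep it runnable without SciPy, it uses simple Gaussian-weighted stand-in kernels (an assumption) that share the angular structure of $h_{0x}$ and $h_{1x}$. The function names, the stand-in width of 2 and the scale 1/40 are arbitrary, while the weight 4 is taken from the code above.

```python
import math

def g(x, y, sigma=2.0):
    """Circularly symmetric Gaussian weight used by the stand-in kernels."""
    return math.exp(-(x * x + y * y) / (2 * sigma * sigma))

# Stand-in kernels (assumed shapes): the y versions are transposes of the x versions
def k0x(x, y):
    return -x * g(x, y)

def k1x(x, y):
    return -(x**3 - 3 * x * y**2) * g(x, y) / 40

def k0y(x, y):
    return k0x(y, x)

def k1y(x, y):
    return k1x(y, x)

def composite(x, y, phi):
    """Weighted sum that equals k0x - 4*k1x turned by phi (x axis towards y axis)."""
    return (math.cos(phi) * k0x(x, y) + math.sin(phi) * k0y(x, y)
            - 4 * (math.cos(3 * phi) * k1x(x, y) - math.sin(3 * phi) * k1y(x, y)))

def rotated_reference(x, y, phi):
    """k0x - 4*k1x sampled in coordinates turned back by phi, for comparison."""
    xr = math.cos(phi) * x + math.sin(phi) * y
    yr = -math.sin(phi) * x + math.cos(phi) * y
    return k0x(xr, yr) - 4 * k1x(xr, yr)

a = composite(1.3, -0.7, 0.3)
b = rotated_reference(1.3, -0.7, 0.3)
print(math.isclose(a, b, abs_tol=1e-12))  # True
```

The rule relies only on the underlying low-pass response being circularly symmetric and on each y kernel being the transpose of its x kernel with the same scale, which is how the kernels above are built.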

Gaussian filters

The filters of Fig. 2 have a lot of oscillation due to strict band limiting. Perhaps a better starting point would be a Gaussian function, as in Gaussian derivative filters. Relatively, they are much easier to handle mathematically. Let's try that instead. We start with the impulse response definition of a Gaussian "low-pass" filter:

$h(x, y, \sigma) = \frac{e^{-\displaystyle\frac{x^2 + y^2}{2 \sigma^2}}}{2\pi \sigma^2}.\tag{4}$

We apply the operators of Eq. 2 to $h(x, y, \sigma)$ and normalize each filter $h_{..}$ by:

$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}h_{..}(x, y, \sigma)^2\,dx\,dy = 1.\tag{5}$

$\begin{align}h_{0x}(x, y, \sigma) &= 2\sqrt{2\pi}σ^2 \frac{d}{dx}h(x, y, \sigma) = - \frac{\sqrt{2}}{\sqrt{\pi}σ^2} x e^{-\displaystyle\frac{x^2 + y^2}{2σ^2}},\\ h_{0y}(x, y, \sigma) &= h_{0x}(y, x, \sigma),\\ h_{1x}(x, y, \sigma) &= \frac{2\sqrt{3\pi}σ^4}{3}\left(\left(\frac{d}{dx}\right)^3-3\frac{d}{dx}\left(\frac{d}{dy}\right)^2\right)h(x, y, \sigma) = - \frac{\sqrt{3}}{3\sqrt{\pi}σ^4} (x^3 - 3xy^2) e^{-\displaystyle\frac{x^2 + y^2}{2σ^2}},\\ h_{1y}(x, y, \sigma) &= h_{1x}(y, x, \sigma).\end{align}\tag{6}$
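
The normalization of Eq. 5 can be checked numerically for the closed forms in Eq. 6. The sketch below approximates the integrals with a plain Riemann sum in pure Python. The helper name `inner`, the grid extent of 8 standard deviations, the grid size and the test value of `sigma` are arbitrary choices for this check.

```python
import math

def h0x(x, y, sigma):
    """First-order filter of Eq. 6."""
    return (-math.sqrt(2) / (math.sqrt(math.pi) * sigma**2)
            * x * math.exp(-(x * x + y * y) / (2 * sigma**2)))

def h1x(x, y, sigma):
    """Third-order filter of Eq. 6."""
    return (-math.sqrt(3) / (3 * math.sqrt(math.pi) * sigma**4)
            * (x**3 - 3 * x * y**2) * math.exp(-(x * x + y * y) / (2 * sigma**2)))

def inner(f, g, sigma, half_width=8.0, n=161):
    """Riemann-sum approximation of the integral of f*g over the plane."""
    step = 2 * half_width * sigma / (n - 1)
    coords = [-half_width * sigma + i * step for i in range(n)]
    total = sum(f(x, y, sigma) * g(x, y, sigma) for y in coords for x in coords)
    return total * step * step

sigma = 1.5
print(round(inner(h0x, h0x, sigma), 6))       # 1.0
print(round(inner(h1x, h1x, sigma), 6))       # 1.0
print(round(abs(inner(h0x, h1x, sigma)), 6))  # 0.0
```

The last line also confirms that the two filters are orthogonal to each other, so a weighted sum of them with unit-norm weights keeps the normalization of Eq. 5 when both use the same $\sigma$.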

We would like to construct from these, as their weighted sum, the impulse response of a vertical edge detector filter that maximizes specificity $S$, which is the mean sensitivity to a vertical edge over the possible edge shifts $s$ relative to the mean sensitivity over the possible edge rotation angles $\beta$ and possible edge shifts $s$:

$S = \frac{2\pi\displaystyle\int_{-\infty}^{\infty}\Bigg(\int_{-\infty}^{\infty}\bigg(\int_{-\infty}^{s}h_x(x, y, \sigma)dx - \int_{s}^{\infty}h_x(x, y, \sigma)dx\bigg)dy\Bigg)^2ds} {\Bigg(\displaystyle\int_{-\pi}^{\pi}\int_{-\infty}^{\infty}\bigg(\int_{-\infty}^{\infty}\Big(\int_{-\infty}^{s}h_x\big(\cos(\beta)x- \sin(\beta)y, \sin(\beta)x + \cos(\beta)y\big)dx \\- \displaystyle\int_{s}^{\infty}h_x\big(\cos(\beta)x - \sin(\beta)y, \sin(\beta)x + \cos(\beta)y\big)dx\Big)dy\bigg)^2ds\,d\beta\Bigg)}.\tag{7}$

We only need a weighted sum of $h_{0x}$ with variance $\sigma^2$ and $h_{1x}$ with optimal variance. It turns out that $S$ is maximized by an impulse response:

$\begin{align}h_x(x, y, \sigma) &= \frac{\sqrt{7625 - 2440\sqrt{5}}}{61} h_{0x}(x, y, \sigma) - \frac{2\sqrt{610\sqrt{5} - 976}}{61} h_{1x}(x, y, \sqrt{5}\sigma)\\ &= - \frac{\sqrt{15250 - 4880\sqrt{5}}}{61\sqrt{\pi}\sigma^2}xe^{-\displaystyle\frac{x^2 + y^2}{2\sigma^2}} + \frac{\sqrt{1830\sqrt{5} - 2928}}{4575 \sqrt{\pi} \sigma^4}(2x^3 - 6xy^2)e^{-\displaystyle\frac{x^2 + y^2}{10 \sigma^2}}\\ &= \frac{2\sqrt{\pi}\sigma^2\sqrt{15250 - 4880\sqrt{5}}}{61}\frac{d}{dx}h(x, y, \sigma) - \frac{100\sqrt{\pi}\sigma^4\sqrt{1830\sqrt{5} - 2928}}{183}\left(\left(\frac{d}{dx}\right)^3-3\frac{d}{dx}\left(\frac{d}{dy}\right)^2\right)h(x, y,\sqrt{5}\sigma)\\ &\approx 3.8275359956049814\,\sigma^2\frac{d}{dx}h(x, y, \sigma) - 33.044650082417731\,\sigma^4\left(\left(\frac{d}{dx}\right)^3-3\frac{d}{dx}\left(\frac{d}{dy}\right)^2\right)h(x, y,\sqrt{5}\sigma),\end{align}\tag{8}$

also normalized by Eq. 5. For vertical edges, this filter has a specificity of $S = \frac{10\times5^{1/4}}{9} + 2 \approx 3.661498645$, in contrast to the specificity $S = 2$ of a first-order Gaussian derivative filter with respect to $x$. The last part of Eq. 8 has normalization compatible with separable 2-d Gaussian derivative filters from Python's scipy.ndimage.gaussian_filter:

import matplotlib.pyplot as plt
import numpy as np
import scipy.ndimage

sig = 8
N = 161
x = np.zeros([N, N])
x[N//2, N//2] = 1
ddx = scipy.ndimage.gaussian_filter(x, sigma=[sig, sig], order=[0, 1], truncate=(N//2)/sig)
ddx3 = scipy.ndimage.gaussian_filter(x, sigma=[np.sqrt(5)*sig, np.sqrt(5)*sig], order=[0, 3], truncate=(N//2)/(np.sqrt(5)*sig))
ddxddy2 = scipy.ndimage.gaussian_filter(x, sigma=[np.sqrt(5)*sig, np.sqrt(5)*sig], order=[2, 1], truncate=(N//2)/(np.sqrt(5)*sig))

hx = 3.8275359956049814*sig**2*ddx - 33.044650082417731*sig**4*(ddx3 - 3*ddxddy2)
plt.imsave('hx.png', plt.cm.bwr(plt.Normalize(vmin=-hx.max(), vmax=hx.max())(hx)))

h = scipy.ndimage.gaussian_filter(x, sigma=[sig, sig], order=[0, 0], truncate=(N//2)/sig)
plt.imsave('h.png', plt.cm.bwr(plt.Normalize(vmin=-h.max(), vmax=h.max())(h)))
h1x = scipy.ndimage.gaussian_filter(x, sigma=[sig, sig], order=[0, 3], truncate=(N//2)/sig) - 3*scipy.ndimage.gaussian_filter(x, sigma=[sig, sig], order=[2, 1], truncate=(N//2)/sig)
plt.imsave('ddx.png', plt.cm.bwr(plt.Normalize(vmin=-ddx.max(), vmax=ddx.max())(ddx)))
plt.imsave('h1x.png', plt.cm.bwr(plt.Normalize(vmin=-h1x.max(), vmax=h1x.max())(h1x)))
plt.imsave('gaussiankey.png', plt.cm.bwr(np.repeat([(np.arange(161)/160)], 16, 0)))

Figure 4. Color-mapped 1:1 scale plots of, in order: A 2-d Gaussian function, derivative of the Gaussian function with respect to $x$ , a differential operator $\big(\frac{d}{dx}\big)^3-3\frac{d}{dx}\big(\frac{d}{dy}\big)^2$ applied to the Gaussian function, the optimal two-component Gaussian-derived vertical edge detection filter $h_x(x, y, \sigma)$ of Eq. 8. The standard deviation of each Gaussian was $\sigma = 8$ except for the hexagonal component in the last plot which had standard deviation $\sqrt{5}\times8$ . Color key: blue: minimum, white: zero, red: maximum.
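
The closed-form constants of Eq. 8 and the quoted specificity can be cross-checked with a few lines of arithmetic. This is only a consistency check of the printed numbers, not a derivation of the optimum. The variable names are made up for this example.

```python
import math

s5 = math.sqrt(5)

# Weights of h0x(x, y, sigma) and h1x(x, y, sqrt(5)*sigma) in the first line of Eq. 8
w0 = math.sqrt(7625 - 2440 * s5) / 61
w1 = 2 * math.sqrt(610 * s5 - 976) / 61
print(round(w0**2 + w1**2, 12))    # 1.0

# Coefficients of the derivative form (third line of Eq. 8), per sigma^2 and sigma^4
c0 = 2 * math.sqrt(math.pi) * math.sqrt(15250 - 4880 * s5) / 61
c1 = 100 * math.sqrt(math.pi) * math.sqrt(1830 * s5 - 2928) / 183
print(round(c0, 6), round(c1, 6))  # 3.827536 33.04465

# Specificity of the optimal filter for vertical edges
S = 10 * 5**0.25 / 9 + 2
print(round(S, 6))                 # 3.661499
```

The two weights have unit Euclidean norm and the decimal coefficients match those passed to the code above.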

TO BE CONTINUED...

— Olli Niemitalo