Advanced Computer Vision


1 Advanced Computer Vision
Chapter 4 Feature Detection and Matching Presented by: 傅楸善 & 董子嘉

2 Feature Detection and Matching (1/3)
4.1 Points and Patches 4.2 Edges 4.3 Lines

3 Feature Detection and Matching (2/3)
A variety of feature detectors and descriptors can be used to analyze, describe, and match images: (a) point-like interest operators; (b) region-like interest operators; (c) edges; (d) straight lines.

4 Feature Detection and Matching (3/3)
Two pairs of images to be matched. What kinds of features can be used to establish a set of correspondences between these images?

5 4.1 Points and Patches (1/2) There are two main approaches:
Find features in one image that can be accurately tracked using a local search technique (e.g., video sequences), or independently detect features in all the images under consideration and then match features based on their local appearance (e.g., stitching panoramas, establishing correspondences).

6 Points and Patches (2/2) Three stages:
Feature detection (extraction) stage; feature description stage; feature matching stage or feature tracking stage.

7 4.1.1 Feature Detectors Image pairs with extracted patches below. Note how some patches can be localized or matched with higher accuracy than others.

8 Aperture Problem 孔徑問題為了不同影像小塊 (A)穩定(像角)流動 (B)經典孔徑問題(理髮師桿錯覺) (C)無紋理區域
兩個圖像I0(黃色)和I1(紅色)被重疊。 紅色向量u表示小塊中心之間的位移,並且w(xi)權重函數(小塊視窗)顯示為黑圓

9 Weighted Summed Square Difference
E_WSSD(u) = Σ_i w(x_i) [I1(x_i + u) − I0(x_i)]^2. I0, I1: the two images being compared; u: the displacement vector; w(x): spatially varying weighting function; summation i: over all the pixels in the patch.
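The weighted SSD above can be sketched in a few lines of NumPy. This is a minimal illustration, not a full implementation: the function name, the integer-only displacement, and addressing the patch by its top-left corner are all simplifying assumptions.

```python
import numpy as np

def weighted_ssd(I0, I1, xy, u, w):
    """E_WSSD(u) = sum_i w(x_i) * [I1(x_i + u) - I0(x_i)]^2.

    xy : (row, col) of the patch's top-left corner in I0
    u  : integer displacement (drow, dcol) into I1
    w  : weighting window (e.g. a Gaussian); its shape defines the patch
    """
    r, c = xy
    dr, dc = u
    h, wd = w.shape
    p0 = I0[r:r + h, c:c + wd].astype(float)
    p1 = I1[r + dr:r + dr + h, c + dc:c + dc + wd].astype(float)
    return float(np.sum(w * (p1 - p0) ** 2))
```

A displacement that re-aligns the two patches drives the error to zero, which is exactly what a local search over u exploits.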

10 Auto-correlation Function or Surface
E_AC(∆u) = Σ_i w(x_i) [I0(x_i + ∆u) − I0(x_i)]^2. I0: the image being compared; ∆u: small variation in position; w(x): spatially varying weighting function; summation i: over all the pixels in the patch. When performing feature detection, we do not know which other image locations the feature will end up being matched against. Therefore, we can only compute how stable this metric is with respect to small variations in position ∆u by comparing an image patch against itself.

11 Three different auto-correlation surfaces (b-d), shown as grayscale images and
surface plots. The original image (a) is marked with three red crosses to indicate where these auto-correlation surfaces were computed. Patch (b) comes from the flower bed (unique minimum), patch (c) from the roof edge (one-dimensional aperture problem), and patch (d) from the cloud.

12 Approximation of Auto-correlation Function (1/3)
I0(xi + ∆u) ≈ I0(xi) + ∇I0(xi)·∆u. ∇I0(xi): image gradient at xi. This gradient can be computed using a variety of techniques. The classic "Harris" detector uses a [-2 -1 0 1 2] filter, but more modern variants convolve the image with horizontal and vertical derivatives of a Gaussian.

13 Approximation of Auto-correlation Function (2/3)
Auto-correlation matrix A: A = w ∗ [Ix^2 IxIy; IxIy Iy^2]. w: weighting kernel (the spatially varying weighting function); Ix, Iy: horizontal and vertical derivatives of Gaussians.

14 Approximation of Auto-correlation Function (3/3)
Assume (λ0, λ1) are the two eigenvalues of A, with λ0 <= λ1. Since the larger uncertainty depends on the smaller eigenvalue, it makes sense to find maxima in the smaller eigenvalue to locate good features to track.

15 Other Measurements (1/2)
Quantity proposed by Harris and Stephens: det(A) − α trace(A)^2 = λ0λ1 − α(λ0 + λ1)^2, with α = 0.06. It requires no square roots, it is still rotationally invariant, and it downweights edge-like features when λ1 >> λ0. Feature points are then found as local maxima of this measure.
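The Harris-Stephens measure can be sketched per pixel from the auto-correlation matrix. As a simplifying assumption, central-difference gradients and a box window stand in for the Gaussian derivative and weighting filters; the function names are illustrative, not from the source.

```python
import numpy as np

def box_filter(img, k):
    """Average over a k x k window (a box stand-in for the Gaussian
    weighting kernel w)."""
    r = k // 2
    pad = np.pad(img, r, mode='edge')
    out = np.zeros(img.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += pad[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def harris_response(I, alpha=0.06, k=3):
    """Per-pixel Harris-Stephens measure det(A) - alpha * trace(A)^2."""
    Iy, Ix = np.gradient(I.astype(float))     # vertical, horizontal derivatives
    a = box_filter(Ix * Ix, k)                # windowed Ix^2
    b = box_filter(Ix * Iy, k)                # windowed IxIy
    c = box_filter(Iy * Iy, k)                # windowed Iy^2
    return (a * c - b * b) - alpha * (a + c) ** 2
```

On a synthetic corner, the response is positive and peaks at the corner, while edges (where det(A) ≈ 0 but trace(A) > 0) get negative, downweighted scores.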

16 Other Measurements (2/2)
Quantity proposed by Brown, Szeliski, and Winder: the harmonic mean det(A)/trace(A) = λ0λ1/(λ0 + λ1), which can be used when λ1 ≈ λ0. Feature points are again found as local maxima.

17 Basic Feature Detection Algorithm (1/2)
Step 1: Compute the horizontal and vertical derivatives of the image, Ix and Iy, by convolving the original image with derivatives of Gaussians. Step 2: Compute the three images (Ix^2, Iy^2, IxIy) corresponding to the outer products of these gradients.

18 Basic Feature Detection Algorithm (2/2)
Step 3: Convolve each of these images with a larger Gaussian (the weighting kernel). Step 4: Compute a scalar interest measure using one of the formulas discussed above. Step 5: Find local maxima above a certain threshold and report them as detected feature point locations.
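Step 5 can be sketched as a brute-force scan over the interest-measure image. The strictness check that rejects flat plateaus is an added assumption for robustness, and the function name is illustrative.

```python
import numpy as np

def detect_features(R, threshold):
    """Report pixels whose interest measure exceeds `threshold` and is a
    strict local maximum in its 3x3 neighborhood (Step 5)."""
    keypoints = []
    for y in range(1, R.shape[0] - 1):
        for x in range(1, R.shape[1] - 1):
            nbhd = R[y - 1:y + 2, x - 1:x + 2]
            # strict maximum: equal to the neighborhood max, and unique
            if R[y, x] > threshold and R[y, x] == nbhd.max() \
                    and (nbhd == R[y, x]).sum() == 1:
                keypoints.append((y, x))
    return keypoints
```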

19 Adaptive Non-maximal Suppression (ANMS) (1/2)
Simply finding local maxima leads to an uneven distribution of feature points. Instead, detect features that are local maxima and whose response value is greater than that of all neighbors within a radius r.
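A plain O(n^2) sketch of the ANMS idea: each point's suppression radius is its distance to the nearest stronger point, and the points with the largest radii are kept. The robustness factor used by Brown et al. is omitted here as a simplification.

```python
import numpy as np

def anms(points, responses, n_keep):
    """Keep the n_keep points with the largest suppression radii, where
    a point's radius is the distance to the nearest stronger point."""
    pts = np.asarray(points, dtype=float)
    resp = np.asarray(responses, dtype=float)
    radii = np.full(len(pts), np.inf)   # the global maximum keeps radius = inf
    for i in range(len(pts)):
        stronger = resp > resp[i]
        if stronger.any():
            diffs = pts[stronger] - pts[i]
            radii[i] = np.hypot(diffs[:, 0], diffs[:, 1]).min()
    keep = np.argsort(-radii)[:n_keep]
    return [points[k] for k in keep]
```

Because a weaker point close to a strong one gets a small radius, clustered responses are suppressed and the surviving points spread out over the image.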

20 Adaptive Non-maximal Suppression (ANMS) (2/2)
Adaptive non-maximal suppression: the upper two images show the strongest 250 and 500 interest points, while the lower two images show the interest points selected with adaptive non-maximal suppression, along with the corresponding suppression radius r. Note how the latter features have a much more uniform spatial distribution across the image.

21 Measuring Repeatability
Which feature points should we use among the large number detected? Measure repeatability after applying rotations, scale changes, illumination changes, viewpoint changes, and added noise.

22 Scale Invariance (1/2) Problem: what if there are no good feature points in the image?
Solution: multi-scale. Multi-scale oriented patches (MOPS) are extracted at five pyramid levels; the boxes show the feature orientation and the region from which the descriptor vector is sampled.

23 Scale Invariance (2/2) Sample image (a) and two different interest operator responses: (b) Harris; (c) DoG.
Local maxima are shown as red dots.

24 4.1.2 Feature Descriptors Sum of squared difference
Normalized cross-correlation (Chapter 8). Even after compensating for these changes, the local appearance of image patches will usually still vary from image to image. How can we make image descriptors more invariant to such changes, while still preserving discriminability between different (non-corresponding) patches?

25 Scale Invariant Feature Transform (SIFT)
Scale-space feature detection using a sub-octave difference-of-Gaussian pyramid: adjacent levels of the sub-octave Gaussian pyramid are subtracted to produce difference-of-Gaussian images, and extrema are detected in the resulting 3D volume by comparing a pixel to its 26 neighbors. The same operations are performed at multiple resolutions in a pyramid, and features are then matched at the same level. Sub-pixel accuracy, removal of low-contrast responses, and removal of edge responses are also needed.

26 Multi-scale Oriented Patches (MSOP)
A simpler SIFT, without DoGs at every scale. Used for image stitching and similar applications. Detector: Harris corner detector. Multi-scale processing makes the method robust. (DoG: difference of Gaussian.)

27 Orientation Estimation
The simplest possible orientation estimate is the average gradient within a region around the keypoint. Ex: a Gaussian weighting function, or a histogram of orientations computed around the keypoint.

28 Gradient Location-orientation Histogram (GLOH)
A schematic representation of Lowe's (2004) scale invariant feature transform (SIFT). Gradient orientations and magnitudes are computed at each pixel and then weighted by a Gaussian fall-off (blue circle). A weighted gradient orientation histogram is then computed in each subregion, using trilinear interpolation. While this figure shows an 8x8 pixel patch and a 2x2 descriptor array, Lowe's actual implementation uses 16x16 patches and a 4x4 array of eight-bin histograms.

29 Maximally Stable Extremal Region (MSER)
Works only on grayscale images. Pixels are incrementally added as the threshold is changed. Maximally stable: the rate of change of region area with respect to the threshold is minimal.

30 4.1.3 Feature Matching Two issues: selecting a matching strategy, and
devising efficient data structures and algorithms to perform this matching.

31 Matching Strategy (1/7) Simplest method: set a distance threshold and match within this threshold. Examples of false positives and negatives: the black digits 1 and 2 are features being matched against a database of features in the other image. At the current threshold setting (black circles), the green 1 is a true positive (good match), the blue 1 is a false negative (failure to match), and the red 3 is a false positive (incorrect match). If we set the threshold higher (dashed circles), the blue 1 becomes a true positive but the brown 4 becomes an additional false positive.

32 Matching Strategy (2/7) Confusion matrix to estimate performance: AB:
A = T or F: whether the result is correct. B = P or N: whether the match is predicted positive or negative. TPR (true positive rate) = TP / (TP + FN) = TP / P. FPR (false positive rate) = FP / (FP + TN) = FP / N. PPV (positive predictive value) = TP / (TP + FP) = TP / P'. ACC (accuracy) = (TP + TN) / (P + N).
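The four measures follow directly from the confusion-matrix counts; a tiny helper makes the definitions concrete (the function name and dictionary output are illustrative assumptions):

```python
def matching_metrics(tp, fn, fp, tn):
    """Performance measures from confusion-matrix counts."""
    p, n = tp + fn, fp + tn          # actual positives / negatives
    p_pred = tp + fp                 # predicted positives (P')
    return {
        "TPR": tp / p,               # true positive rate (recall)
        "FPR": fp / n,               # false positive rate
        "PPV": tp / p_pred,          # positive predictive value (precision)
        "ACC": (tp + tn) / (p + n),  # accuracy
    }
```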

33 Matching Strategy (3/7) ROC: Receiver Operating Characteristic
Ideally, the true positive rate approaches 1 while the false positive rate approaches 0.

34 Matching Strategy (4/7) Other methods:
Other methods: nearest neighbor matching, and nearest neighbor distance ratio (NNDR) matching: NNDR = d1 / d2. d1, d2: nearest and second-nearest distances; DA: the target descriptor; DB and DC: the closest two neighbors of DA.
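NNDR matching can be sketched with a brute-force distance computation. The 0.8 default ratio is a commonly used value assumed here, not taken from the slide, and the function name is illustrative.

```python
import numpy as np

def nndr_matches(desc_a, desc_b, ratio=0.8):
    """Match each descriptor in desc_a to desc_b, accepting a match only
    if d1/d2 < ratio, where d1 and d2 are the Euclidean distances to the
    nearest and second-nearest descriptors in desc_b."""
    matches = []
    for i, da in enumerate(desc_a):
        d = np.linalg.norm(desc_b - da, axis=1)
        j1, j2 = np.argsort(d)[:2]          # nearest and second nearest
        if d[j1] / d[j2] < ratio:
            matches.append((i, j1))
    return matches
```

An ambiguous descriptor, whose two nearest neighbors are almost equally close, produces a ratio near 1 and is rejected; only distinctive matches survive.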

35 Matching Strategy (5/7) Fixed threshold, nearest neighbor, and nearest neighbor distance ratio matching. At a
fixed distance threshold (dashed circles), descriptor DA fails to match DB, and DD incorrectly matches DC and DE. If we pick the nearest neighbor, DA correctly matches DB, but DD incorrectly matches DC. Using nearest neighbor distance ratio (NNDR) matching, the small NNDR d1/d2 correctly matches DA with DB, and the large NNDR d1'/d2' correctly rejects the match for DD. NNDR is used when dealing with a collection of images: if the distance to the nearest feature is not MUCH less than that to the second-nearest feature, the descriptor has no precise match.

36 Matching Strategy (6/7)

37 Matching Strategy (7/7) Indexing structures:
Multi-dimensional search tree Multi-dimensional hashing

38 4.1.4 Feature Tracking (1/3) Find a set of likely feature locations in a first image and then search for their corresponding locations in subsequent images. The expected amount of motion and appearance deformation between adjacent frames is small.

39 Feature Tracking (2/3) Selecting good features to track is closely related to selecting good features to match. When searching for the corresponding patch, the weighted summed square difference works well enough.

40 Feature Tracking (3/3) If features are being tracked over longer image sequences, their appearance can undergo larger changes. Options: continuously match against the originally detected feature; re-sample each subsequent frame at the matching location; use an affine motion model to measure dissimilarity (e.g., the Kanade-Lucas-Tomasi (KLT) tracker). The first strategy is prone to failure because the original patch can undergo appearance changes such as foreshortening. The second runs the risk of the feature drifting from its original location to some other location in the image. Dissimilarity is measured with something like the weighted summed square difference (translation-only tracking combined with an affine dissimilarity measure).

41 4.2 Edges (1/2) What are edges in image: The boundaries of objects
Occlusion events in 3D; shadow boundaries or crease edges. Edges can be grouped into longer curves or contours.

42 Edges (2/2)

43 4.2.1 Edge Detection (1/3) An edge exhibits rapid intensity variation.
J: the local gradient vector. Its direction is perpendicular to the edge, and its magnitude is the strength of the intensity variation. I: the original image.

44 Edge Detection (2/3) Taking image derivatives accentuates noise.
Use a Gaussian filter to remove noise first: G: Gaussian filter; σ: width of the Gaussian filter. Since we would like the response of our edge detector to be independent of orientation, a circularly symmetric smoothing filter is desirable, which leads to the Gaussian.

45 Edge Detection (3/3) To thin such a continuous gradient image so that it returns only isolated edges: use the Laplacian of Gaussian (S: the second gradient operator), then find the zero crossings, which correspond to the maxima of the gradient.
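A 1D sketch of this pipeline may help: smooth with a Gaussian, take the discrete second derivative, and locate sign changes. The discrete approximations (finite differences, a truncated kernel, skipping border samples) are simplifying assumptions, as are the function names.

```python
import numpy as np

def gaussian_kernel(sigma, radius):
    x = np.arange(-radius, radius + 1, dtype=float)
    g = np.exp(-x**2 / (2 * sigma**2))
    return g / g.sum()

def log_zero_crossings(signal, sigma=2.0):
    """Zero crossings of the second derivative of the smoothed signal;
    these coincide with gradient maxima, i.e. edges."""
    r = int(3 * sigma)
    g = gaussian_kernel(sigma, r)
    smooth = np.convolve(signal, g, mode='same')
    d2 = np.diff(smooth, 2)          # d2[i] ~ curvature at sample i + 1
    crossings = []
    for i in range(r, len(d2) - r):  # skip borders (truncation artifacts)
        if d2[i] > 0 > d2[i + 1] or d2[i] < 0 < d2[i + 1]:
            crossings.append(i + 1)
    return crossings
```

On a step signal, the single reported zero crossing lands at the step location, illustrating how zero crossings "thin" a broad gradient response to an isolated edge.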

46 Scale Selection Minimum reliable scale: the smallest scale at which an edge can be reliably detected

47 Color Edge Detection Band separation and combination -> not good

48 4.2.2 Edge Linking To link isolated edgels into continuous contours.
If we use zero crossings to find edges, then linking is just a matter of picking up two unlinked edgels among the neighbors. Otherwise, the orientations of adjacent edgels can be compared. The smoother the curves formed by the grouped edges, the more robust the object recognition will be.

49 4.3 Lines The man-made world is full of straight lines, so detecting and matching these lines can be useful in a variety of applications. The grassfire transform is the name given to the image-processing concept of computing the distance of a pixel to the border of a region.

50 4.3.1 Successive Approximation
Line simplification: piecewise-linear polyline or B-spline curve. Spline: a mathematical function used for interpolation or smoothing.
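The slide does not name a specific simplification algorithm; as an illustrative choice, the classic Ramer-Douglas-Peucker scheme is a successive-approximation method for piecewise-linear polylines:

```python
import numpy as np

def rdp(points, eps):
    """Ramer-Douglas-Peucker simplification: if the farthest interior
    point lies more than eps from the chord between the endpoints, keep
    it and recurse on both halves; otherwise keep just the chord."""
    pts = np.asarray(points, dtype=float)
    if len(pts) < 3:
        return [tuple(p) for p in pts]
    a, b = pts[0], pts[-1]
    ab = b - a
    # perpendicular distance of each interior point to chord a-b
    d = np.abs(ab[0] * (pts[1:-1, 1] - a[1])
               - ab[1] * (pts[1:-1, 0] - a[0])) / np.linalg.norm(ab)
    k = int(np.argmax(d)) + 1
    if d[k - 1] > eps:
        return rdp(pts[:k + 1], eps)[:-1] + rdp(pts[k:], eps)
    return [tuple(a), tuple(b)]
```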

51 4.3.2 Hough Transforms (1/2) Original Hough transforms:

52 Hough Transforms (2/2) Oriented Hough transform:

53 Hough Transforms Algorithm (1/2)
Step 1: Clear the accumulator array. Step 2: For each detected edgel at location (x, y) and orientation θ = tan⁻¹(ny/nx), compute the value of d = x·nx + y·ny and increment the accumulator cell corresponding to (θ, d). The range of (θ, d) values is [-180°, 180°] × [-√2, √2], assuming that we are using normalized pixel coordinates that lie in [-1, 1]. The vector n is the gradient vector, so θ can be estimated from the local orientation.

54 Hough Transforms Algorithm (2/2)
Step 3: Find the peaks in the accumulator corresponding to lines. Step 4: Optionally re-fit the lines to the constituent edgels.
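Steps 1-3 can be sketched as follows. The bin counts, edgel format (normalized coordinates plus a normal orientation in radians), and function names are assumptions for illustration.

```python
import numpy as np

def hough_lines(edgels, n_theta=180, n_d=100):
    """Accumulate edgels (x, y, theta) into a (theta, d) array, with
    normalized coordinates in [-1, 1] so d lies in [-sqrt(2), sqrt(2)]
    and theta (the normal direction, radians) lies in [-pi, pi]."""
    acc = np.zeros((n_theta, n_d), dtype=int)
    sqrt2 = np.sqrt(2.0)
    for x, y, theta in edgels:
        nx, ny = np.cos(theta), np.sin(theta)
        d = x * nx + y * ny
        ti = int((theta + np.pi) / (2 * np.pi) * (n_theta - 1))
        di = int((d + sqrt2) / (2 * sqrt2) * (n_d - 1))
        acc[ti, di] += 1
    return acc

def hough_peak(acc):
    """Step 3: index of the strongest line, as (theta_bin, d_bin)."""
    return np.unravel_index(acc.argmax(), acc.shape)
```

Collinear edgels all fall into the same (θ, d) cell, so the line shows up as a single dominant peak above the clutter.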

55 RANSAC-based Line Detection
Another alternative to the Hough transform is the RANdom SAmple Consensus (RANSAC) algorithm. RANSAC randomly chooses pairs of edgels to form a line hypothesis and then tests how many other edgels fall onto this line. Lines with sufficiently large numbers of matching edgels are then selected as the desired line segments. Advantage: space efficient (no accumulator array). Disadvantage: many more hypotheses may need to be generated and tested.
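The hypothesize-and-verify loop can be sketched directly; the iteration count, inlier tolerance, and fixed seed are illustrative assumptions:

```python
import numpy as np

def ransac_line(points, n_iter=200, tol=0.05, seed=0):
    """Pick two points to form a line hypothesis, count points within
    `tol` of it, and keep the hypothesis with the most inliers."""
    rng = np.random.default_rng(seed)
    pts = np.asarray(points, dtype=float)
    best_inliers = np.zeros(len(pts), dtype=bool)
    for _ in range(n_iter):
        i, j = rng.choice(len(pts), size=2, replace=False)
        a, b = pts[i], pts[j]
        ab = b - a
        norm = np.linalg.norm(ab)
        if norm == 0:
            continue
        # perpendicular distance from every point to the line through a, b
        d = np.abs(ab[0] * (pts[:, 1] - a[1])
                   - ab[1] * (pts[:, 0] - a[0])) / norm
        inliers = d < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return best_inliers
```

Note there is no accumulator array, only the current best hypothesis, which is the space advantage mentioned above.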

56 4.3.3 Vanishing Points (1/5) Parallel lines in 3D have the same vanishing point.

57 Vanishing Points (2/5) An alternative to the 2D polar (θ, d) representation for lines is to use the full 3D line equation m, projected onto the unit sphere. The location of the vanishing point hypothesis: m: the line equations in their 3D representation.

58 Vanishing Points (3/5) Corresponding weight:
li, lj: corresponding line segment lengths This has the desirable effect of downweighting (near-)collinear line segments and short line segments.

59 Vanishing Points (4/5)

60 Vanishing Points (5/5) A robustified least squares estimate for the vanishing point:

