Shape(Structure) From X

Addresses the problem of going from 2D images to 2.5D surface shape (scene depth):
Shape from motion
Shape from stereo
Shape from monocular cues (shading, vanishing point, defocus, texture, ...)

Chapter 7: Scene Recovery Based on Motion Vision

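3D Motion Estimation
3D motion estimation means estimating an object's 3D motion parameters and its 3D structure from a sequence of 2D images: SFM (Structure From Motion).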

3D Rigid-Body Motion

Small-Angle Rotation
Small-angle rotation matrix
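The small-angle approximation can be illustrated with a minimal sketch. It assumes the rotation is composed as R = Rz(phi) Ry(psi) Rx(theta), a convention chosen here for illustration (the slide's own matrix is not preserved in the transcript), and compares the exact matrix with its first-order form.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def exact_rotation(theta, psi, phi):
    """Exact rotation Rz(phi) * Ry(psi) * Rx(theta) (assumed convention)."""
    cx, sx = math.cos(theta), math.sin(theta)
    cy, sy = math.cos(psi), math.sin(psi)
    cz, sz = math.cos(phi), math.sin(phi)
    rx = [[1, 0, 0], [0, cx, -sx], [0, sx, cx]]
    ry = [[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]]
    rz = [[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]]
    return matmul(rz, matmul(ry, rx))

def small_angle_rotation(theta, psi, phi):
    """First-order approximation: identity plus a skew-symmetric part."""
    return [[1.0, -phi, psi],
            [phi, 1.0, -theta],
            [-psi, theta, 1.0]]

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))

angles = (0.01, -0.02, 0.015)  # radians, arbitrary small test values
err = max_abs_diff(exact_rotation(*angles), small_angle_rotation(*angles))
print("largest entry-wise error:", err)
```

The error is second order in the angles, which is why the linearized matrix is acceptable for small inter-frame rotations.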

1. 3D Motion Estimation Based on Orthographic Projection
Small-angle rotation matrix: 6 unknowns, 3 point pairs

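3D Motion Estimation Based on Orthographic Projection: Aizawa, 1989
1. From the point correspondences and the current depth estimates, compute the motion parameters
2. From the motion parameters and the point correspondences, re-estimate the depth
Alternate the two steps until the estimates stabilize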

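3D Motion Estimation Based on Orthographic Projection: Bozdagi, 1994
Random perturbation of the depth estimates is used to escape local optima
1. From the point correspondences and the depth estimates, compute the motion parameters
2. From the motion parameters and the depth estimates, predict the coordinates of the corresponding points
3. Compute the estimation error
4. Randomly perturb the depth estimates
5. Repeat the steps above
Experiments show that this improved iterative algorithm converges well to the correct motion parameters even when the initial depth values are off by 50%.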

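2. 3D Motion Estimation Based on the Perspective Projection Model
Normalize the focal length to F = 1, then divide numerator and denominator by Z_k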
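As a sketch of what dividing by Z_k yields, the snippet below assumes the usual rigid motion X' = R X + T and normalized image coordinates x = X/Z, y = Y/Z. The function names and test values are illustrative only. It checks that the image-plane formula agrees with moving the 3D point and projecting it.

```python
def rigid_move(R, T, P):
    """Apply X' = R X + T to a 3D point."""
    return [sum(R[i][k] * P[k] for k in range(3)) + T[i] for i in range(3)]

def project(P):
    """Perspective projection with focal length F = 1."""
    return (P[0] / P[2], P[1] / P[2])

def moved_image_point(R, T, x, y, Z):
    """Image position after the motion, written in terms of (x, y) and depth Z.

    Obtained by dividing numerator and denominator of the projection by Z.
    """
    den = R[2][0] * x + R[2][1] * y + R[2][2] + T[2] / Z
    xn = (R[0][0] * x + R[0][1] * y + R[0][2] + T[0] / Z) / den
    yn = (R[1][0] * x + R[1][1] * y + R[1][2] + T[1] / Z) / den
    return (xn, yn)

# Arbitrary small-angle rotation, translation and 3D point for illustration.
R = [[1.0, -0.015, -0.02], [0.015, 1.0, -0.01], [0.02, 0.01, 1.0]]
T = [0.1, -0.05, 0.2]
P = [1.0, 2.0, 5.0]

x, y = project(P)
direct = project(rigid_move(R, T, P))
via_formula = moved_image_point(R, T, x, y, P[2])
# Scaling T and Z by the same factor leaves the image point unchanged.
scaled = moved_image_point(R, [3 * t for t in T], x, y, 3 * P[2])
print(direct, via_formula, scaled)
```

Translation and depth enter only through the ratios T/Z, so they can be recovered only up to a common scale factor.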

3. 3D Motion Estimation Based on Epipolar Lines
The epipolar equation and its geometric meaning

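3D Motion Estimation Based on Epipolar Lines: the epipolar equation
Starting from 3D rigid-body motion, introduce a skew-symmetric matrix: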
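A minimal sketch of this construction, assuming normalized image coordinates and the motion model X' = R X + T: the skew-symmetric matrix [T]x represents the cross product with T, and E = [T]x R satisfies x'^T E x = 0 for every pair of corresponding points. The helper names and numeric values are illustrative.

```python
import math

def skew(t):
    """Skew-symmetric matrix [t]x such that [t]x v equals the cross product t x v."""
    return [[0.0, -t[2], t[1]],
            [t[2], 0.0, -t[0]],
            [-t[1], t[0], 0.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rigid_move(R, T, P):
    return [sum(R[i][k] * P[k] for k in range(3)) + T[i] for i in range(3)]

def to_image(P):
    """Homogeneous normalized image point (x, y, 1)."""
    return [P[0] / P[2], P[1] / P[2], 1.0]

def epipolar_residual(E, x1, x2):
    """Value of x2^T E x1, which is zero for a perfect correspondence."""
    return sum(x2[i] * E[i][j] * x1[j] for i in range(3) for j in range(3))

a = 0.1  # rotation about the z axis, radians
R = [[math.cos(a), -math.sin(a), 0.0],
     [math.sin(a), math.cos(a), 0.0],
     [0.0, 0.0, 1.0]]
T = [1.0, 0.5, 0.2]
E = matmul(skew(T), R)

points = [[0.5, -0.3, 4.0], [-1.0, 0.8, 6.0], [2.0, 1.0, 5.0]]
residuals = [epipolar_residual(E, to_image(P), to_image(rigid_move(R, T, P)))
             for P in points]
print(residuals)  # all close to zero, up to rounding
```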

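3D Motion Estimation Based on Epipolar Lines: the essential matrix
Multiplying the translation vector by any nonzero factor leaves the epipolar equation valid,
so the motion parameters can be recovered only up to a scale factor.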

Applications of the essential matrix
It can be used to simplify the matching problem
It can be used to detect wrong matches

3D Motion Estimation Based on Epipolar Lines: the epipolar equation

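3D Motion Estimation Based on Epipolar Lines: properties of the essential matrix
Parameters to be solved for in the epipolar equation:
5 unknown independent parameters, which matches the number of degrees of freedom of the motion: three for rotation and two for translation (equivalently, three translation components known only up to a scale factor).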

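(1) Estimating motion from the essential matrix
1. Compute the essential matrix
Use 8 or more point correspondences to obtain a stable solution (in practice RANSAC is often used)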

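(1) Estimating motion from the essential matrix
1. Compute the essential matrix
In reality, instead of solving $A\mathbf{e} = 0$, we seek $\mathbf{e}$ (the entries of E stacked as a vector) that minimizes $\|A\mathbf{e}\|$ subject to $\|\mathbf{e}\| = 1$: the eigenvector of $A^{T}A$ with the least eigenvalue.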

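8-point algorithm
To enforce that E is of rank 2, E is replaced by E' that minimizes $\|E - E'\|_F$ subject to $\det E' = 0$. It is achieved by SVD. Let $E = U \Sigma V^{T}$, where $\Sigma = \mathrm{diag}(\sigma_1, \sigma_2, \sigma_3)$, and let $\Sigma' = \mathrm{diag}(\sigma_1, \sigma_2, 0)$; then $E' = U \Sigma' V^{T}$ is the solution.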

8-point algorithm
% x1, x2: 3xN homogeneous points in image 1 and image 2
npts = size(x1, 2);
% Build the constraint matrix
A = [x2(1,:)'.*x1(1,:)'   x2(1,:)'.*x1(2,:)'   x2(1,:)' ...
     x2(2,:)'.*x1(1,:)'   x2(2,:)'.*x1(2,:)'   x2(2,:)' ...
     x1(1,:)'             x1(2,:)'             ones(npts,1) ];
[U,D,V] = svd(A);
% Extract the matrix from the column of V
% corresponding to the smallest singular value.
E = reshape(V(:,9),3,3)';
% Enforce rank2 constraint
[U,D,V] = svd(E);
E = U*diag([D(1,1) D(2,2) 0])*V';

Problem with 8-point algorithm
Typical magnitudes of the nine columns of the data matrix: ~10000, ~10000, ~100, ~10000, ~10000, ~100, ~100, ~100, 1
Orders of magnitude difference between the columns of the data matrix → least-squares yields poor results!

Normalized 8-point algorithm
Normalized least squares yields good results
Transform image to ~[-1,1]x[-1,1]: for a 700x500 image, the corners (0,0), (700,0), (0,500), (700,500) map to (-1,-1), (1,-1), (-1,1), (1,1), and the image center becomes (0,0)

Normalized 8-point algorithm
[x1, T1] = normalise2dpts(x1);
[x2, T2] = normalise2dpts(x2);
A = [x2(1,:)'.*x1(1,:)'   x2(1,:)'.*x1(2,:)'   x2(1,:)' ...
     x2(2,:)'.*x1(1,:)'   x2(2,:)'.*x1(2,:)'   x2(2,:)' ...
     x1(1,:)'             x1(2,:)'             ones(npts,1) ];
[U,D,V] = svd(A);
E = reshape(V(:,9),3,3)';
[U,D,V] = svd(E);
E = U*diag([D(1,1) D(2,2) 0])*V';
% Denormalise
E = T2'*E*T1;

Normalization
function [newpts, T] = normalise2dpts(pts)
    c = mean(pts(1:2,:)')';            % Centroid
    newp(1,:) = pts(1,:)-c(1);         % Shift origin to centroid.
    newp(2,:) = pts(2,:)-c(2);
    meandist = mean(sqrt(newp(1,:).^2 + newp(2,:).^2));
    scale = sqrt(2)/meandist;          % Mean distance from origin becomes sqrt(2)
    T = [scale   0       -scale*c(1)
         0       scale   -scale*c(2)
         0       0        1         ];
    newpts = T*pts;
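The same normalization can be sketched in Python to make its effect easy to check. This is a hypothetical re-implementation for illustration (the function name and the test points are not from the slides): after the transform the centroid sits at the origin and the mean distance to it is sqrt(2).

```python
import math

def normalise_2d_points(pts):
    """Translate points to their centroid and scale the mean distance to sqrt(2).

    pts is a list of (x, y) pairs. Returns the normalized points and the
    3x3 similarity transform T that maps homogeneous (x, y, 1) onto them.
    """
    n = len(pts)
    cx = sum(p[0] for p in pts) / n
    cy = sum(p[1] for p in pts) / n
    mean_dist = sum(math.hypot(p[0] - cx, p[1] - cy) for p in pts) / n
    scale = math.sqrt(2) / mean_dist
    T = [[scale, 0.0, -scale * cx],
         [0.0, scale, -scale * cy],
         [0.0, 0.0, 1.0]]
    new_pts = [(scale * (x - cx), scale * (y - cy)) for x, y in pts]
    return new_pts, T

# Corners and center of a 700x500 image, as in the slide's example.
pts = [(0, 0), (700, 0), (0, 500), (700, 500), (350, 250)]
new_pts, T = normalise_2d_points(pts)
mean_after = sum(math.hypot(x, y) for x, y in new_pts) / len(new_pts)
print(new_pts, mean_after)
```

With all coordinates of order 1, the columns of the constraint matrix have comparable magnitudes, which is the point of the normalized 8-point algorithm.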

RANSAC
repeat
    select minimal sample (8 matches)
    compute solution(s) for F
    determine inliers
until (#inliers,#samples)<95% || too many times
compute E based on all inliers
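The loop structure can be sketched on a deliberately simpler model. The example below runs RANSAC on 2D line fitting (minimal sample of 2 points) instead of the 8-point matrix estimate, so it stays self-contained. The stopping rule is the common adaptive one, which stops once an all-inlier sample has been drawn with 95% confidence, and it is an assumption rather than the slide's exact criterion. The data, thresholds and function names are made up for illustration.

```python
import math
import random

def fit_line(p, q):
    """Line through two points as (a, b, c) with a*x + b*y + c = 0, a^2 + b^2 = 1."""
    a, b = q[1] - p[1], p[0] - q[0]
    norm = math.hypot(a, b)
    if norm == 0:
        return None
    a, b = a / norm, b / norm
    return (a, b, -(a * p[0] + b * p[1]))

def ransac_line(points, threshold=0.1, confidence=0.95, max_iters=1000, seed=0):
    rng = random.Random(seed)
    best_inliers, needed, iters = [], max_iters, 0
    while iters < min(needed, max_iters):
        iters += 1
        line = fit_line(*rng.sample(points, 2))      # minimal sample
        if line is None:
            continue
        a, b, c = line
        inliers = [pt for pt in points
                   if abs(a * pt[0] + b * pt[1] + c) < threshold]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
            w = len(inliers) / len(points)           # estimated inlier ratio
            if w >= 1.0:
                break
            denom = math.log(1.0 - w * w)
            # Samples needed to draw one all-inlier pair with the given confidence.
            needed = math.log(1.0 - confidence) / denom if denom < 0 else max_iters
    return best_inliers, iters

def least_squares_line(points):
    """Final refit y = m*x + k on all inliers (ordinary least squares)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    m = (sum((p[0] - mx) * (p[1] - my) for p in points)
         / sum((p[0] - mx) ** 2 for p in points))
    return m, my - m * mx

# 20 points near y = 2x + 1 with small deterministic noise, plus 8 gross outliers.
inlier_pts = [(float(i), 2.0 * i + 1.0 + ((i * 7) % 5 - 2) * 0.002) for i in range(20)]
outlier_pts = [(2.0, 30.0), (5.0, -10.0), (8.0, 40.0), (12.0, -5.0),
               (15.0, 60.0), (18.0, 0.0), (1.0, 20.0), (10.0, 50.0)]
inliers, iters = ransac_line(inlier_pts + outlier_pts)
slope, intercept = least_squares_line(inliers)
print(len(inliers), iters, slope, intercept)
```

For the fundamental or essential matrix, the sample size becomes 8, the residual becomes an epipolar distance, and the final refit is the (normalized) 8-point algorithm on all inliers.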

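Estimating motion from the essential matrix
2. Estimate the motion parameters
T: from the properties of the essential matrix ($E^{T}T = 0$)
R: from $E = [T]_{\times}R$ once T is known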

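(2) Estimating motion directly from the epipolar equation
In the ideal case the epipolar equation holds exactly for every correspondence; because of errors, we instead minimize the sum of squared residuals of the epipolar equation over all correspondences.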

Structure from motion
Unknown camera viewpoints
Structure from motion: automatic recovery of camera motion and scene structure from two or more images. It is a self-calibration technique, also called automatic camera tracking or matchmoving.

Coordinate Transformation
Model-view transformation: World Coordinate System → Camera Coordinate System

Camera Projection Matrix
World coordinate system → camera coordinate system
Camera parameters: the camera projection matrix combines the intrinsic and the extrinsic parameters
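As an illustration of how intrinsic and extrinsic parameters combine, here is a small sketch that assumes the standard pinhole form P = K [R | t] and projects a world point to pixel coordinates. The numeric values of K, R and t are made up for the example.

```python
def projection_matrix(K, R, t):
    """Build the 3x4 matrix P = K [R | t] from intrinsics K and extrinsics R, t."""
    Rt = [list(R[i]) + [t[i]] for i in range(3)]            # 3x4 [R | t]
    return [[sum(K[i][k] * Rt[k][j] for k in range(3)) for j in range(4)]
            for i in range(3)]

def project(P, X):
    """Project a 3D world point X = (X, Y, Z) to pixel coordinates (u, v)."""
    Xh = list(X) + [1.0]                                     # homogeneous point
    u, v, w = (sum(P[i][j] * Xh[j] for j in range(4)) for i in range(3))
    return (u / w, v / w)

# Illustrative intrinsics: focal length 800 px, principal point (320, 240), no skew.
K = [[800.0, 0.0, 320.0],
     [0.0, 800.0, 240.0],
     [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]     # camera aligned with world
t = [0.5, 0.0, 1.0]

P = projection_matrix(K, R, t)
pixel = project(P, (1.0, 0.5, 4.0))
print(pixel)
```

The world point is first moved into the camera frame by [R | t], then mapped to pixels by K; images taken with the same camera and settings share K and differ only in R and t.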

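For the same scene point, take one image

For the same scene point, take two images with the same camera settings

For the same scene point, take three images with the same camera settings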

Figure: Image 1 (R1,t1), Image 2 (R2,t2), Image 3 (R3,t3)

Same camera, same setting = same intrinsic parameters
Figure: Points 1, 2, 3 observed in Images 1, 2, 3

Triangulation
Figure: Image 1 (R1,t1), Image 2 (R2,t2), Image 3 (R3,t3)

Camera Intrinsic Parameter Matrix
Principal point offset: especially when images are cropped (Internet)
Skew
Radial distortion (due to optics of the lens)

Steps
Images → Points: Structure from Motion
Points → More points: Multiple View Stereo
Points → Meshes: Model Fitting
Meshes → Models: Texture Mapping
Images → Models: Image-based Modeling


Pipeline
Structure from Motion (SFM) → Multi-view Stereo (MVS)

Two-view Reconstruction

keypoints → match → fundamental matrix → essential matrix → [R|t] → triangulation

Keypoints Detection

Descriptor for each point
SIFT descriptor

Same for the other images
SIFT descriptor

Point Match for correspondences
Match the SIFT descriptors between the images

Fundamental Matrix
Figure: Image 1 (R1,t1), Image 2 (R2,t2)

Estimating Fundamental Matrix
Given a correspondence $\mathbf{x} \leftrightarrow \mathbf{x}'$, the basic incidence relation is $\mathbf{x}'^{T} F \mathbf{x} = 0$.
Need 8 points

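Estimating Fundamental Matrix
For 8 point correspondences, each correspondence gives one linear equation in the nine entries of F; stack the equations and solve the resulting homogeneous system: Direct Linear Transformation (DLT)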
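To make the linear system concrete, the sketch below builds one row of the constraint matrix per correspondence and checks that a known matrix, flattened row by row, lies in its null space. It assumes normalized cameras so that F equals E = [T]x R; the motion, the 3D points and all helper names are made up for the test.

```python
import math

def constraint_row(x1, x2):
    """Row of A for the correspondence x1 <-> x2 so that row . f = x2^T F x1.

    f holds the entries of F in row-major order.
    """
    u1, v1 = x1
    u2, v2 = x2
    return [u2 * u1, u2 * v1, u2, v2 * u1, v2 * v1, v2, u1, v1, 1.0]

def skew(t):
    return [[0.0, -t[2], t[1]], [t[2], 0.0, -t[0]], [-t[1], t[0], 0.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def project_after_motion(R, T, P):
    Q = [sum(R[i][k] * P[k] for k in range(3)) + T[i] for i in range(3)]
    return (Q[0] / Q[2], Q[1] / Q[2])

a = 0.1  # rotation about the y axis, radians
R = [[math.cos(a), 0.0, math.sin(a)], [0.0, 1.0, 0.0], [-math.sin(a), 0.0, math.cos(a)]]
T = [1.0, 0.2, 0.1]
F = matmul(skew(T), R)
f = [F[i][j] for i in range(3) for j in range(3)]            # row-major vector

points = [[0.5, -0.3, 4.0], [-1.0, 0.8, 6.0], [2.0, 1.0, 5.0], [0.0, 0.0, 7.0],
          [-2.0, -1.0, 8.0], [1.5, -1.2, 4.5], [-0.7, 1.9, 5.5], [0.3, 0.4, 6.5]]
A = [constraint_row((P[0] / P[2], P[1] / P[2]), project_after_motion(R, T, P))
     for P in points]
residuals = [sum(r * fi for r, fi in zip(row, f)) for row in A]
print(residuals)  # each entry is x2^T F x1, close to zero
```

In the estimation problem F is unknown, so one solves A f = 0 for f (up to scale) from at least 8 such rows, typically via the SVD of A.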

RANSAC to Estimate Fundamental Matrix
For many times
    Pick 8 points
    Compute a solution for F using these 8 points
    Count number of inliers
Pick the one with the largest number of inliers

Fundamental Matrix → Essential Matrix
Figure: Image 1 (R1,t1), Image 2 (R2,t2)

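Essential Matrix → [R|t]
For a given essential matrix $E$ and the first camera matrix $P = [I \mid 0]$, there are four possible choices for the second camera matrix $P'$: two candidate rotations, each combined with the translation direction $+\mathbf{t}$ or $-\mathbf{t}$.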

Four Possible Solutions

Triangulation
Figure: Image 1 (R1,t1), Image 2 (R2,t2)

Pipeline Taught Next

Merge Two Point Clouds
There can be only one

Oops: See From a Different Angle

Bundle Adjustment

Figure: Points 1, 2, 3 observed in Images 1, 2, 3

A valid solution for the cameras and the points must bring each re-projection close to its observation, i.e. it must minimize the reprojection error, summed over all observations, with respect to both the camera parameters and the 3D points.
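The quantity being minimized can be sketched as follows. This only evaluates the objective (the sum of squared distances between re-projections and observations); it does not perform the nonlinear optimization, and it assumes a shared zero-skew intrinsic matrix K. The data layout and names are hypothetical.

```python
def project(K, R, t, X):
    """Pixel coordinates of world point X in a camera with pose (R, t)."""
    Xc = [sum(R[i][k] * X[k] for k in range(3)) + t[i] for i in range(3)]
    u = K[0][0] * Xc[0] / Xc[2] + K[0][2]      # zero skew assumed
    v = K[1][1] * Xc[1] / Xc[2] + K[1][2]
    return (u, v)

def reprojection_error(K, cameras, points, observations):
    """Sum of squared pixel distances between re-projections and observations.

    cameras:      list of (R, t) pairs
    points:       list of 3D points
    observations: dict mapping (camera index, point index) -> observed (u, v)
    """
    total = 0.0
    for (j, i), (u_obs, v_obs) in observations.items():
        R, t = cameras[j]
        u, v = project(K, R, t, points[i])
        total += (u - u_obs) ** 2 + (v - v_obs) ** 2
    return total

K = [[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]]
I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
cameras = [(I3, [0.0, 0.0, 0.0]), (I3, [-1.0, 0.0, 0.0])]
points = [[0.5, -0.3, 4.0], [-1.0, 0.8, 6.0], [2.0, 1.0, 5.0]]

# Synthetic observations generated from the true cameras and points.
observations = {(j, i): project(K, R, t, X)
                for j, (R, t) in enumerate(cameras)
                for i, X in enumerate(points)}

error_true = reprojection_error(K, cameras, points, observations)
guess = [list(p) for p in points]
guess[0][2] += 0.5                              # a wrong depth for point 0
error_guess = reprojection_error(K, cameras, guess, observations)
print(error_true, error_guess)
```

Bundle adjustment would start from an initial guess such as `guess` and adjust cameras and points jointly, usually with a sparse Levenberg-Marquardt solver, until this error stops decreasing.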

Initialization Matters
Input: Observed 2D image positions
Output:
    Unknown camera parameters (with some guess)
    Unknown 3D point coordinates (with some guess)

Descriptor: ZNCC (Zero-mean Normalized Cross-Correlation)
Invariant to linear radiometric changes
More conservative than others, such as sum of absolute or squared differences, in uniform regions
More tolerant in textured areas where noise might be important
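A minimal sketch of the score itself, operating on two patches given as flat lists of intensities. Returning 0 for a perfectly uniform patch is a choice made here for illustration, since the score is undefined in that case.

```python
import math

def zncc(a, b):
    """Zero-mean normalized cross-correlation of two equally sized patches."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0:
        return 0.0          # uniform patch: no texture to correlate
    return sum(x * y for x, y in zip(da, db)) / denom

patch = [10, 20, 30, 40, 50, 60, 70, 80, 90]     # a 3x3 patch, flattened
brighter = [2 * p + 15 for p in patch]           # linear radiometric change
inverted = [255 - p for p in patch]
flat = [42] * 9

print(zncc(patch, brighter), zncc(patch, inverted), zncc(patch, flat))
```

Subtracting the means removes a brightness offset and dividing by the norms removes a gain, which is where the invariance to linear radiometric changes comes from.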

Seed for propagation

Matching Propagation (propagate.m)
Maintain a priority queue Q
Initialize: put all seeds into Q with their ZNCC values as scores
For each iteration:
    Pop the match with the best ZNCC score from Q
    Add new potential matches in their immediate spatial neighborhood into Q
Safety: handle uniqueness, and propagate only on matchable areas
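The steps above can be sketched as a best-first loop over a heap. This is a simplified illustration, not the contents of propagate.m: new candidates keep the same offset as the match they grow from, and the `score` and `matchable` callables, the threshold and the toy data are assumptions made for the example.

```python
import heapq

NEIGHBOURS = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]

def propagate(seeds, score, matchable, threshold=0.5):
    """Best-first match propagation.

    seeds:     list of ((x1, y1), (x2, y2)) initial matches
    score:     function (p, q) -> similarity, e.g. ZNCC of the two patches
    matchable: function (p, q) -> True if both pixels lie in a matchable area
    """
    # heapq is a min-heap, so scores are negated to pop the best match first.
    heap = [(-score(p, q), p, q) for p, q in seeds]
    heapq.heapify(heap)
    used1, used2, matches = set(), set(), []
    while heap:
        _, p, q = heapq.heappop(heap)
        if p in used1 or q in used2:
            continue                              # uniqueness constraint
        used1.add(p)
        used2.add(q)
        matches.append((p, q))
        for dx, dy in NEIGHBOURS:                 # immediate spatial neighbourhood
            p2 = (p[0] + dx, p[1] + dy)
            q2 = (q[0] + dx, q[1] + dy)
            if p2 in used1 or q2 in used2 or not matchable(p2, q2):
                continue
            s = score(p2, q2)
            if s > threshold:
                heapq.heappush(heap, (-s, p2, q2))
    return matches

# Toy setup: image 2 is image 1 shifted by 2 pixels in x, inside a 5x5 region.
def toy_score(p, q):
    return 1.0 if q == (p[0] + 2, p[1]) else 0.0

def toy_matchable(p, q):
    return 0 <= p[0] < 5 and 0 <= p[1] < 5

matches = propagate([((2, 2), (4, 2))], toy_score, toy_matchable)
print(len(matches))
```

Popping the highest score first means reliable matches are fixed before ambiguous ones, and the uniqueness sets prevent a pixel from being matched twice.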

Matchable Area
The area where the maximal gradient exceeds a threshold

Result (denseMath/run.m)

Triangulation
Figure: Image 1 (R1,t1), Image 2 (R2,t2), Image 3 (R3,t3)

Final Result

Colorize the Point Cloud
