7 3D Shape Matching for Retrieval and Recognition
7.2 Literature Review
To support similarity search between two multimedia objects, we must define a model that allows us to compare them. The comparison cannot be done directly, because we are interested in retrieving similar objects stored in a database rather than extracting identical copies. A lot of work has been done on assessing the visual similarity between 3D objects when the required model is global. By global, we mean that, given a 3D object, an algorithm retrieves those objects in the database that look visually similar, and the whole shape structure is used for the comparison. The most widely used model consists of obtaining an abstract representation of an object, such as a feature vector or a graph, and computing the similarity based upon these representations. For example, distance metrics are commonly used to calculate the similarity among feature vectors. Another similarity model, which has gained much interest, involves local features, especially in non-rigid and partial shape retrieval and recognition. Defining a similarity model is very challenging. For example, a visual model allows us to represent a shape by its appearance; however, such a model cannot discriminate between two shapes that are semantically similar but differ in appearance.
Bustos et al. [26] described a methodology in which 3D objects are represented as real-valued vectors of a certain dimension, obtained through a transformation function. These vectors can then be organized in a multidimensional index, where similarity corresponds to proximity in the space in which the vectors are defined. The Minkowski distance family is usually used to measure the proximity of feature vectors. The authors presented experimental results comparing several transformation functions, among which the depth-buffer descriptor proposed by Vranic [106] showed the best results. Iyer et al. [55] and Tangelder et al. [100] also discussed techniques for 3D object retrieval, identifying future trends and important issues to be addressed. In addition, the Princeton Shape Benchmark [96] provides a reference collection and experimental tests for several descriptors. Other surveys and reports were written by Funkhouser et al. [42], Del Bimbo et al. [15] and Bustos et al. [25, 27]. In 3D shape recognition, Campbell and Flynn [28] presented a comprehensive study of the representation and recognition of free-form shapes.
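The Minkowski distance family mentioned above is straightforward to compute. A minimal sketch in Python/NumPy, using two hypothetical feature vectors (the values are illustrative and not taken from any of the cited descriptors):

```python
import numpy as np

def minkowski(u, v, p):
    """Minkowski distance of order p between two feature vectors."""
    return np.sum(np.abs(u - v) ** p) ** (1.0 / p)

# Two hypothetical feature vectors extracted from 3D models.
a = np.array([0.2, 0.5, 0.1, 0.9])
b = np.array([0.3, 0.4, 0.3, 0.7])

d1 = minkowski(a, b, 1)  # L1 (Manhattan) distance
d2 = minkowski(a, b, 2)  # L2 (Euclidean) distance
```

With p = 1 this gives the Manhattan distance and with p = 2 the Euclidean distance; in a retrieval system, the query vector would be compared against every indexed vector this way.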
Since then, many other techniques have been proposed to improve effectiveness and efficiency. A brief summary of representative approaches to shape retrieval is given in Table 7.1. Due to space limitations, this table does not list every technique proposed to date; see Sect. 7.6 for further references.
Table 7.1 shows a simple taxonomy based on the shape information that is extracted and used in matching. Histogram-based methods summarize certain shape properties, such as the distribution of distances between points on the surface [82], angle information [86], the distribution of face normals [59], and so forth. These properties are used to build histograms that represent the shapes, and matching is done with common histogram measures. In contrast, transform-based methods apply some transform to a shape to convert it into a numerical representation of the underlying object. Examples of such transforms are the Fourier transform [39], the Radon transform [36] and the wavelet transform [67]. Image-based methods represent a 3D object as a set of projected images, so the matching becomes an image matching problem. For instance, Vranic [106] proposed to take the frequency spectrum of depth images, Chen [30] considered silhouettes taken from directions corresponding to the vertices of a dodecahedron, and Chaouch and Verroust-Blondet [29] converted a set of depth images into character strings, with the matching performed using variations of the well-known edit distance. Graph-based methods represent shapes by graph structures, such as Reeb graphs [51], which contain information about the connectivity of the shape's parts.

B. Bustos and I. Sipiran

Table 7.1 3D shape retrieval methods

  Histogram-based: Shape distributions [82]; Generalized shape distributions [75]; Angle histogram [86]; Extended Gaussian images [59]
  Transform-based: 3D Fourier [39]; Angular radial transform [91]; Spherical trace transform [114]; Rotation invariant spherical harmonics [60]; Concrete radialized spherical projection [84]; Spherical wavelet descriptor [67]
  Image-based: Depth-buffer descriptor [106]; Silhouette descriptor [106]; Light-field descriptor [30]; Depth-line descriptor [29]
  Graph-based: Reeb graph [104]; Skeleton graph [99]
  Local features: Salient geometric features [44]; Salient spectral features [52]; Segmentation-based visual vocabulary [103]; Heat kernel signatures [83]
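As an illustration of the histogram-based family, the following sketch computes a D2-style shape distribution in the spirit of the shape distributions of [82]: a normalized histogram of Euclidean distances between random pairs of surface points. It assumes the points have already been sampled uniformly from the surface, and the pair and bin counts are illustrative parameters:

```python
import numpy as np

def d2_histogram(points, n_pairs=10_000, bins=32, rng=None):
    """D2 shape distribution: normalized histogram of distances between
    randomly chosen pairs of surface points.

    `points` is an (N, 3) array assumed to be pre-sampled uniformly from
    the object's surface (the sampling step itself is omitted here)."""
    rng = np.random.default_rng(rng)
    i = rng.integers(0, len(points), n_pairs)
    j = rng.integers(0, len(points), n_pairs)
    d = np.linalg.norm(points[i] - points[j], axis=1)
    hist, _ = np.histogram(d, bins=bins, range=(0.0, d.max()))
    # Normalize so histograms with different pair counts are comparable.
    return hist / hist.sum()

# Toy usage: points on a unit sphere (a stand-in for a real scanned surface).
sphere = np.random.default_rng(0).normal(size=(500, 3))
sphere /= np.linalg.norm(sphere, axis=1, keepdims=True)
h = d2_histogram(sphere, rng=0)
```

Two such histograms can then be compared with a common histogram measure, e.g. an L1 distance; note that the descriptor is not scale-invariant unless the distances are normalized first.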
There are several drawbacks with the aforementioned techniques. On the one hand, many of them (especially the image-based and transform-based methods) are pose sensitive; that is, a pose normalization step must be applied before the feature extraction process. Clearly, partial and non-rigid matching cannot be addressed with these methods. On the other hand, graph-based methods rely on the topological properties of a 3D object, so topological changes affect the description process.
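The pose normalization step mentioned above is often implemented as PCA alignment. The following is a minimal sketch (not the method of any specific cited paper), treating the shape as a point cloud and ignoring the sign and ordering ambiguities of PCA that a production system would have to resolve:

```python
import numpy as np

def pca_pose_normalize(points):
    """Crude pose normalization: center the point set on its centroid,
    align its principal axes with the coordinate axes, and scale to unit
    root-mean-square radius."""
    centered = points - points.mean(axis=0)
    # Eigenvectors of the covariance matrix give the principal directions.
    eigvals, eigvecs = np.linalg.eigh(np.cov(centered.T))
    # Sort the axes by decreasing variance and rotate into that frame.
    order = np.argsort(eigvals)[::-1]
    rotated = centered @ eigvecs[:, order]
    # Scale so that the RMS distance of the points to the origin is 1.
    scale = np.sqrt((rotated ** 2).sum(axis=1).mean())
    return rotated / scale

# Hypothetical elongated cloud: after normalization its longest axis lies
# along x and its overall scale is 1.
cloud = np.random.default_rng(1).normal(size=(200, 3)) * np.array([3.0, 1.0, 0.5])
norm_cloud = pca_pose_normalize(cloud)
```

After this step, a pose-sensitive descriptor (e.g. a depth-buffer or silhouette descriptor) can be extracted in a canonical reference frame.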
Recently, approaches based on local descriptors have received special attention due to their ability to support non-rigid and partial matching. In these approaches, each shape is represented as a set of descriptors, and matching is performed by searching for the best correspondence between the two sets. Gal and Cohen-Or [44]
proposed to represent a 3D object as a set of salient geometric features that determine the complete object. Their scheme relies entirely on curvature information over the shape's surface, and the matching is done by indexing the salient features using geometric hashing [108], with a voting scheme to determine similar objects.
An interesting approach to the non-rigid and partial matching problem was presented by Hu and Hua [52]. Their method uses the Laplace-Beltrami operator to detect and describe interest points on 3D objects. This operator captures the information of a mesh at different scales and is also isometrically invariant, an important property for supporting geometric invariance. The authors proposed an energy function based on the Laplace-Beltrami spectrum to detect interest points together with their associated scales. Using these points, along with their respective scales, descriptors can be extracted for each interest point from the local Laplace-Beltrami spectrum. The matching is performed by solving an integer quadratic programming problem involving the two sets of features belonging to the shapes to be matched.
One of the most widely used approaches for managing local descriptors is the bag-of-features approach. It begins by clustering all the local descriptors from an entire collection and computing the centroid of each cluster. Each shape is then represented as a histogram with as many bins as there are clusters: each descriptor adds one to the bin corresponding to its closest centroid. Toldo et al. [103] proposed to segment a given mesh and build a descriptor for each segment; a bag-of-features approach then combines all the descriptors of the mesh. Similarly, Ovsjanikov et al. [83] proposed a soft version of the bag-of-features approach applied to dense descriptors based on the heat kernel signature, originally introduced by Sun et al. [97], which is related to the Laplace-Beltrami operator. The authors also presented a spatially sensitive bag-of-features technique, which gave good results in shape retrieval.
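The bag-of-features pipeline just described can be sketched as follows. The descriptors here are random placeholders standing in for real local descriptors, and the tiny k-means used to build the codebook is purely illustrative:

```python
import numpy as np

def kmeans(descriptors, k, iters=50, rng=0):
    """Tiny k-means: cluster the pooled descriptors of a collection and
    return the centroid of each cluster (the visual vocabulary)."""
    rng = np.random.default_rng(rng)
    centroids = descriptors[rng.choice(len(descriptors), k, replace=False)]
    for _ in range(iters):
        # Assign each descriptor to its nearest centroid, then re-estimate.
        d = np.linalg.norm(descriptors[:, None] - centroids[None], axis=2)
        labels = d.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centroids[c] = descriptors[labels == c].mean(axis=0)
    return centroids

def bag_of_features(shape_descriptors, centroids):
    """Histogram counting how many of one shape's local descriptors fall
    into each vocabulary bin (nearest centroid), normalized to sum to 1."""
    d = np.linalg.norm(shape_descriptors[:, None] - centroids[None], axis=2)
    hist = np.bincount(d.argmin(axis=1), minlength=len(centroids)).astype(float)
    return hist / hist.sum()

# Hypothetical local descriptors pooled from a whole collection.
collection = np.random.default_rng(42).normal(size=(300, 8))
codebook = kmeans(collection, k=16)
# Represent one shape (here, a subset of descriptors) as a histogram.
bof = bag_of_features(collection[:50], codebook)
```

Two shapes can then be compared by any histogram distance between their bag-of-features vectors, turning set-of-descriptors matching back into a simple vector comparison.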
The use of local features, however, raises a new problem: the amount of information involved in the matching. With these approaches, a shape is represented by a set of descriptors, and the matching problem becomes non-trivial. In addition, the matching step is more expensive than computing a single distance between feature vectors, as in global matching. Efficient matching of sets of local descriptors therefore remains a promising direction for future research.
With respect to 3D shape recognition, Table 7.2 shows a selection of techniques proposed to date.
As noted, most of the presented techniques make extensive use of local features, because these can mitigate the effect of occlusion in cluttered scenes. Nevertheless, image-based proposals have also been considered. Lee and Drew [69] extracted contours from 2D projections around a 3D object and subsequently obtained a scale-space curvature image for each projection. These images were used to identify the class of an object and then to determine the specific object within the selected class. In addition, Cyr and Kimia [35] extracted 2D views that were grouped into view sets called aspects. Each aspect was represented by a prototype view in order to accelerate recognition when views of new objects are given.
Table 7.2 3D shape recognition methods

  Image-based: Eigen-scale-space contours [69]; Aspect-graph [35]
  Local features: Local features histogram [50]; Spin images [56]; Spherical spin images [94]; 3D shape contexts [41]; Point signatures [33]; Point fingerprint [98]; Harmonic shape images [115]; Cone curvature [3]; Local surface patch [32]; Pyramid matching [72]
Although any approach from the shape retrieval literature can, in principle, be applied to object recognition, the general technique that has received the most attention is matching by local features. In their seminal work, Chua and Jarvis [33] presented the point signature, a 1D descriptor for points on a surface. To construct the descriptor around a 3D surface point, a 3D space curve is generated as the intersection of a sphere centered at that point with the object's surface. A plane is fitted to this curve and translated along its normal until it contains the sphere's center (i.e. the surface point). The distance profile of the space curve to this plane then forms the point's local surface descriptor. During matching, correspondences are found and a voting scheme determines the objects present in a scene.
Following the idea of describing the geometry surrounding a point, Johnson and Hebert [57] proposed their well-known and well-studied spin images. Given an object, the authors constructed 2D descriptors for points over the surface. As the name suggests, a spin image is obtained by spinning a half-plane around the analyzed point's normal and accumulating the surface points lying in the bins of that half-plane. The matching was performed by finding correspondences between the spin images of an object and a scene, followed by a geometric verification with a modified version of the iterative closest point (ICP) algorithm [14]. A variation of this technique, spherical spin images, was presented by Ruiz-Correa et al. [94].
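The spin image construction can be sketched directly from this description: each neighboring surface point is mapped to a radial distance alpha from the normal axis and a signed elevation beta along the normal, then accumulated into a 2D grid. The bin size and image dimensions below are illustrative parameters, not the authors' settings:

```python
import numpy as np

def spin_image(p, n, points, bin_size=0.1, image_width=10):
    """Spin image for the oriented point (p, n) over a set of surface points.

    Each point x is mapped to cylindrical coordinates: beta, the signed
    height of x along the normal n, and alpha, its radial distance from
    the normal axis; (alpha, beta) is then binned into a 2D histogram."""
    v = points - p
    beta = v @ n                                            # elevation
    alpha = np.sqrt(np.maximum((v ** 2).sum(axis=1) - beta ** 2, 0.0))
    img = np.zeros((2 * image_width, image_width))
    i = np.floor(image_width - beta / bin_size).astype(int)  # row: elevation
    j = np.floor(alpha / bin_size).astype(int)               # col: radius
    keep = (i >= 0) & (i < 2 * image_width) & (j < image_width)
    np.add.at(img, (i[keep], j[keep]), 1)                    # accumulate
    return img

# Toy usage: an oriented point at the origin with normal +z, and three
# hypothetical surface points (the last falls outside the image support).
p = np.zeros(3)
n = np.array([0.0, 0.0, 1.0])
neighbors = np.array([[0.0, 0.0, 0.05], [0.05, 0.0, 0.0], [5.0, 0.0, 0.0]])
img = spin_image(p, n, neighbors)
```

Because the half-plane is spun around the normal, the descriptor is invariant to rotation about that axis, which is what makes spin images usable without a global pose normalization.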
Simpler information has also been employed. For instance, Hetzel et al. [50] combined pixel depth, normal and curvature information into multi-dimensional histograms; the matching step was then performed using the χ²-divergence and an a posteriori Bayesian classifier. Sun et al. [98] proposed the point fingerprint, which consists of geodesic contours projected onto a point's tangent plane. Frome et al. [41] introduced 3D shape contexts and harmonic shape contexts. The idea behind the shape context approach is to accumulate the surrounding points using concentric spheres around the analyzed point. The authors proposed to use