7 3D Shape Matching for Retrieval and Recognition
B. Bustos and I. Sipiran
Table 7.7 Performance using space-sensitive bag of features with vocabulary size 48. Table reproduced from Ovsjanikov et al. [83]

| Transformation | EER | FPR @ FNR = 1 % | FPR @ FNR = 0.1 % |
|---|---|---|---|
| Null | 0.58 % | 0.33 % | 1.98 % |
| Isometry | 1.00 % | 1.07 % | 6.16 % |
| Topology | 1.12 % | 1.67 % | 4.77 % |
| Isometry+Topology | 1.41 % | 2.14 % | 6.80 % |
| Triangulation | 2.11 % | 3.43 % | 8.57 % |
| Partiality | 3.70 % | 6.19 % | 8.52 % |
| All | 1.44 % | 1.79 % | 11.09 % |
On the other hand, using the space-sensitive approach, the results in terms of EER were 0.58 % for null shapes, 1.00 % for isometry, 1.12 % for topology, 2.11 % for triangulation, and 3.70 % for partiality. A recent track of the Shape Retrieval Contest (SHREC 2010) experimented with large-scale retrieval [23], where the presented method and its variations were compared to other state-of-the-art techniques. In this report, heat kernel signatures obtained the best results, with mean average precision close to 100 % under almost all transformations except partiality.
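For reference, the equal error rate (EER) reported above is the operating point at which the false positive rate (impostors accepted) equals the false negative rate (genuine matches rejected). The following minimal sketch computes it from score distributions; all data here is synthetic and hypothetical, not taken from the experiments:

```python
import numpy as np

def eer(genuine, impostor):
    """Equal Error Rate: the threshold at which the false positive rate
    equals the false negative rate. Scores are dissimilarities, so lower
    means more similar."""
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    best_gap, best_rate = 1.0, 0.0
    for t in thresholds:
        fpr = np.mean(impostor <= t)   # impostors wrongly accepted
        fnr = np.mean(genuine > t)     # genuine pairs wrongly rejected
        if abs(fpr - fnr) < best_gap:
            best_gap, best_rate = abs(fpr - fnr), (fpr + fnr) / 2
    return best_rate

# Hypothetical score distributions: genuine pairs score lower on average.
rng = np.random.default_rng(0)
genuine = rng.normal(0.3, 0.1, 1000)
impostor = rng.normal(0.7, 0.1, 1000)
print(f"EER = {eer(genuine, impostor):.3f}")
```

FPR @ FNR = 1 % in the table is read off the same threshold sweep: the false positive rate at the threshold where the false negative rate first reaches 1 %.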
7.3.4.2 Complexity Analysis
Let S be a 3D object with n vertices. The computational complexity of each stage of the above method is:
• Computation of the Laplacian matrix: O(n²).
• Computation of the eigenvalues and eigenvectors: O(n³).
• Computation of the HKS: O(nm), where m is the dimension of each HKS.
• K-means clustering: O(IKNm), where I is the number of iterations until convergence, K is the number of clusters to be found, N is the number of descriptors in the entire collection, and m is the descriptor dimension.
• Bag of features: O(NKm).
The total complexity is dominated by the clustering process: the number of descriptors N can be extremely large, which makes the k-means clustering expensive. Therefore, the total computational complexity of this method is O(IKNm).
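To make the dominant cost concrete, here is a minimal sketch of the clustering and bag-of-features stages (all sizes and data are hypothetical; a real system would cluster HKS descriptors from the whole collection and use an optimized k-means implementation):

```python
import numpy as np

def kmeans(X, K, iters=10, seed=0):
    """Plain k-means. Each iteration evaluates all N-by-K descriptor-to-
    center distances, i.e. O(K*N*m) work, giving the O(I*K*N*m) total
    that dominates the pipeline."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), K, replace=False)]
    for _ in range(iters):
        # assign each descriptor to its nearest center: O(K*N*m)
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        for k in range(K):
            if np.any(labels == k):
                centers[k] = X[labels == k].mean(0)
    return centers, labels

def bag_of_features(desc, centers):
    """Normalized histogram of nearest-word assignments: O(N*K*m)."""
    d = ((desc[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    hist = np.bincount(d.argmin(1), minlength=len(centers)).astype(float)
    return hist / hist.sum()

# Hypothetical collection: N=500 descriptors of dimension m=6, K=8 words.
rng = np.random.default_rng(1)
descriptors = rng.random((500, 6))
vocab, _ = kmeans(descriptors, K=8)
h = bag_of_features(descriptors, vocab)
print(h.shape)
```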
7.4 Research Challenges
A review of the literature on shape retrieval and recognition, briefly surveyed in Sect. 7.2, shows that this is a relatively young field, and it therefore presents a number of areas which require further work and progress. This section presents the trends in future research and the challenges which concern the community.
Query specification. Research commonly focuses on testing the effectiveness and efficiency of the presented proposals; however, an important factor is left out: the user. As a result, little work has been done on query specification. It is generally assumed that a query object is available in the representation required by the application. Nevertheless, since the goal is to retrieve objects similar to the query, a natural question arises: if we already have an object (the query) that is visually similar to what we need, why search at all? A more interesting approach is to provide the query as images, video, sketches, or text. However, this will often require human interaction to periodically feed advisory information back into the search. For example, in content-based image retrieval, much research has turned to sketches as a more natural way of querying for an image. This trend has raised new challenges and research interests, which are also expected to emerge in the shape retrieval and recognition community.
Efficiency and large-scale retrieval. Although a reasonable level of effectiveness has recently been achieved in both shape retrieval and recognition, important issues related to efficiency still require attention, even as approaches such as local features and the Laplace-Beltrami operator have come into widespread use. Most techniques report results on publicly available datasets of no more than 2000 objects, and often do not report efficiency results at all. Moreover, Laplace-Beltrami-based approaches rely on extensive computations of eigenvalues and eigenvectors of huge matrices, so it is often necessary to simplify the meshes before processing, at the expense of losing detail. Efficient variants and alternatives therefore remain to be studied.
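One common way to mitigate the eigendecomposition cost is to compute only the few smallest eigenpairs of a sparse Laplacian with an iterative solver, instead of a full O(n³) dense decomposition. The sketch below assumes SciPy and uses the graph Laplacian of a cycle as a hypothetical stand-in for the cotangent Laplacian a real mesh pipeline would build:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

# Hypothetical stand-in for a mesh Laplacian: the graph Laplacian of a
# cycle on n vertices, stored sparsely.
n = 2000
i = np.arange(n)
A = sp.coo_matrix((np.ones(2 * n),
                   (np.concatenate([i, i]),
                    np.concatenate([(i + 1) % n, (i - 1) % n]))),
                  shape=(n, n)).tocsr()
L = sp.diags(np.asarray(A.sum(1)).ravel()) - A

# Only the k smallest eigenpairs are needed for spectral descriptors such
# as the HKS; shift-invert around a point just below zero targets them
# without a dense O(n^3) decomposition.
k = 20
evals, evecs = eigsh(L, k=k, sigma=-0.01, which="LM")
print(evals[:3])
```

The first eigenvalue is (numerically) zero, as expected for a connected graph Laplacian; the choice of `k` trades descriptor fidelity against cost.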
Object representation. As noted in previous sections of this chapter, many approaches rely on a boundary representation of 3D shapes. Perhaps this representation is preferred to others because of its simplicity and suitability for rendering. In addition, triangle meshes are widely available for processing, and the vast majority of 3D objects on the Internet are distributed in this form. Nevertheless, some potential applications use different representations, such as parametric surfaces in CAD and volumetric data in medicine. Each representation has intrinsic advantages which should be considered in order to exploit the information as it is.
Partial matching. A lot of work has been done on 3D objects when the required matching model is global visual similarity. By global, we mean that given a 3D object, an algorithm retrieves those objects in the database that look visually similar, and the whole shape structure is used for comparison. However, many of the presented methods do not allow partial matching, due to the restricted global model that they assume. So, given a part of a shape as a query, an interesting problem is to retrieve those objects in the database that contain parts visually similar to that query. Difficulties arise from the need to represent a model compactly, for instance with local information whose extent is unknown a priori. In addition, the matching becomes expensive because of the exponential number of possible part correspondences for the query. Moreover, an even harder problem is to quantify similarity and partiality, since the similarity strongly depends on the level of partiality allowed while searching.
Domain applications. With the increasing interest of the computer vision community in 3D shape retrieval and recognition, a current trend is to investigate the support these techniques can give to high-level vision tasks. Computer vision aims at recognizing and understanding the composition of a scene viewed through a camera, where the scene is part of a three-dimensional world. In the future, we could consider combining shape retrieval and recognition with 3D reconstruction of scenes from images, as an attempt to bridge the semantic gap between a three-dimensional scene and the image which represents it. In the same way, the field of medicine could benefit from automated 3D image analysis systems for modalities such as magnetic resonance imaging (MRI) and computed tomography (CT): three-dimensional representations are easy to obtain from this kind of data, and further processing can be beneficial. Another interesting application is modeling support, as required in videogames, 3D films and special effects, all of which involve a large amount of modeling work. These applications could use shape retrieval and recognition to reduce the time spent modeling.
Automatic 3D object annotation. In order to increase effectiveness, we may require semantic information to complement the geometric information extracted from the shape. Information about composition is a good choice, so it is necessary to maintain textual annotations that carry rich semantic information for use in retrieval tasks. Nevertheless, manual tagging of shapes is expensive, given the number of objects in a typical database. Shape retrieval and recognition can instead be used to assign textual tags automatically, based on visual similarity or functionality. In addition, this approach can add semantic information which improves the effectiveness of visual search.
7.5 Concluding Remarks
This chapter has introduced the 3D shape matching process from the point of view of representative approaches in the field, potential applications, and the main challenges which need to be addressed. The wide variety of available 3D data allows us to choose between different characteristics such as level of detail, shape classes, and so forth. In addition to the standard datasets, many shape recognition applications use custom-acquired data in order to test their proposals on domain-oriented information. However, it is important to use datasets widely employed and accepted by the community, in order to obtain consistent research results and valuable performance comparisons.
Just as the amount of available 3D data has grown considerably in recent times, researchers are increasingly interested in proposing new approaches to shape matching and studying potential applications in several fields. We have witnessed the benefits that the development of shape retrieval and recognition techniques has brought to fields such as medicine, engineering, art, entertainment, and security. What is more, the interest in computer vision applications based on shape matching is becoming increasingly evident. It is easy to see the great potential of 3D information and how it can complement 2D image and video processing to improve the effectiveness of high-level vision tasks. We believe that 3D information will become commonplace, and that processes such as retrieval and recognition will be the basis for cutting-edge applications.
Likewise, it is beneficial to have a large catalog of techniques, because we can then select an appropriate technique depending on the application context, and techniques can often be combined to improve overall performance. In this chapter, we selected four techniques, which were explained in detail in Sect. 7.3. The depth-buffer descriptor is an effective method based on extracting information from 2D projections. Interestingly, it is one of the most effective methods despite its simplicity, although it only supports a global similarity model. One way of supporting a certain level of partial similarity is to use local features extracted from the shapes. Although the amount of information to be extracted increases, this is the cost to be paid for supporting non-global similarity models.
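As a rough illustration of the idea behind the depth-buffer descriptor (a simplified sketch, not the exact method of Sect. 7.3.1), the following code renders orthographic depth buffers of a point cloud from three axis directions and keeps low-frequency 2D DFT magnitudes as the feature vector; the resolution, number of views, and frequency cutoff are all hypothetical choices:

```python
import numpy as np

def depth_maps(points, res=32):
    """Orthographic depth buffers of a normalized point cloud along the
    three coordinate axes (a full system renders the mesh from six views;
    this point-splatting version is a simplified stand-in)."""
    p = points - points.mean(0)
    p = p / np.abs(p).max()                   # scale into [-1, 1]^3
    maps = []
    for axis in range(3):
        u, v = [a for a in range(3) if a != axis]
        img = np.full((res, res), -1.0)       # background depth
        iu = np.clip(((p[:, u] + 1) / 2 * (res - 1)).astype(int), 0, res - 1)
        iv = np.clip(((p[:, v] + 1) / 2 * (res - 1)).astype(int), 0, res - 1)
        np.maximum.at(img, (iu, iv), p[:, axis])  # keep nearest depth
        maps.append(img)
    return maps

def depth_buffer_descriptor(points, res=32, keep=8):
    """Concatenate low-frequency 2D DFT magnitudes of each depth buffer;
    taking magnitudes gives invariance to image translation."""
    feats = [np.abs(np.fft.fft2(m))[:keep, :keep].ravel()
             for m in depth_maps(points, res)]
    return np.concatenate(feats)

# Hypothetical shape: random points on the unit sphere.
rng = np.random.default_rng(2)
pts = rng.normal(size=(5000, 3))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)
print(depth_buffer_descriptor(pts).shape)  # (3 * 8 * 8,) = (192,)
```

Two shapes are then compared by a simple vector distance between their descriptors, which is what makes the method fast but inherently global.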
The other three presented techniques assume a non-global similarity model by extracting local descriptors which can be used for matching. The first of these is the spin image approach, which has proven effective in 3D shape recognition. Its versatility for describing shapes from different aspects has made it a standard technique for recognition tasks, and new approaches often compare their results against those obtained with spin images. Nevertheless, its dependency on uniform meshes and on normal computation is restrictive: a small difference in the calculated normals can produce different spin images, limiting its effectiveness.
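The spin image construction can be sketched as follows: each neighbor of an oriented point (p, n) is mapped to cylindrical coordinates α (radial distance from the normal line) and β (signed height along the normal), then accumulated into a 2D histogram that conceptually "spins" around the normal. This is a simplified illustration, not the exact implementation of Sect. 7.3.2; the support size and bin count are hypothetical, and the dependence on n makes the sensitivity to normal estimation easy to see:

```python
import numpy as np

def spin_image(points, p, n, size=16, support=1.0):
    """Spin image at oriented point (p, n): alpha is the distance from
    the normal line, beta the signed height along the normal."""
    d = points - p
    beta = d @ n                                   # height along normal
    alpha = np.sqrt(np.maximum((d ** 2).sum(1) - beta ** 2, 0.0))
    keep = (alpha <= support) & (np.abs(beta) <= support)
    img, _, _ = np.histogram2d(alpha[keep], beta[keep], bins=size,
                               range=[[0, support], [-support, support]])
    return img / max(img.sum(), 1.0)               # normalized histogram

# Hypothetical surface patch: noisy samples of the plane z = 0.
rng = np.random.default_rng(3)
pts = np.column_stack([rng.uniform(-1, 1, (2000, 2)),
                       rng.normal(0, 0.01, 2000)])
si = spin_image(pts, p=np.zeros(3), n=np.array([0.0, 0.0, 1.0]))
print(si.shape)  # (16, 16)
```

Perturbing `n` shifts every (α, β) pair at once, which is exactly why small normal errors change the resulting image.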
Both the salient spectral geometric features and the heat kernel signature approaches make extensive use of a mathematical tool which has proven powerful for shape analysis, namely the Laplace-Beltrami operator. This operator has desirable properties which make it a valuable tool for shape matching, in addition to the high effectiveness it achieves in shape retrieval. Nevertheless, its weak point is its high computational cost, which makes it an interesting challenge to be tackled in the future.
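Once a truncated eigendecomposition of the Laplacian is available, the heat kernel signature itself reduces to a few lines: k_t(x, x) = Σ_i exp(−λ_i t) φ_i(x)². The sketch below uses a tiny path-graph Laplacian as a hypothetical stand-in for a mesh Laplacian:

```python
import numpy as np

def heat_kernel_signature(evals, evecs, times):
    """HKS per vertex: result[v, j] = sum_i exp(-evals[i] * times[j]) *
    evecs[v, i]**2, from a (truncated) Laplacian eigendecomposition."""
    phi2 = evecs ** 2                        # phi_i(x)^2, shape (n, k)
    decay = np.exp(-np.outer(evals, times))  # exp(-lambda_i t), shape (k, T)
    return phi2 @ decay                      # shape (n, T)

# Hypothetical Laplacian: path graph on 10 vertices.
n = 10
L = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
L[0, 0] = L[-1, -1] = 1                      # boundary vertices have degree 1
evals, evecs = np.linalg.eigh(L)

t = np.logspace(-1, 1, 5)                    # diffusion time scales
hks = heat_kernel_signature(evals, evecs, t)
print(hks.shape)  # (10, 5)
```

The cost of this final step is negligible; as the text notes, it is the eigendecomposition feeding it that dominates, and truncating to a few eigenpairs is the usual compromise.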
As can be noted, there is much work still to be done: proposing new approaches to improve the effectiveness and efficiency of 3D shape matching, and studying new paradigms, some of which we mention in Sect. 7.4. We are convinced that the future of this research field is promising, and that its scientific and technological productivity will keep growing thanks to the enormous efforts of the research communities in various fields.
7.6 Further Reading
As expected, the increasing interest of research communities in shape retrieval and recognition has enabled rapid advances, both in theory (new approaches) and in applications. Due to space limitations, not all of this material could be addressed in this chapter, so this section presents additional material for interested readers.
The surveys [26, 28, 55, 100] are a good starting point for readers who wish to go further into shape retrieval and recognition. Early evaluations of algorithms were presented in the reports [15, 25, 27, 42]. For recent experimentation with state-of-the-art techniques, we recommend the reports of the SHREC contest [1]. Recent SHREC tracks include: robust correspondence benchmark [22], robust large-scale shape retrieval benchmark [23], robust feature detection and description benchmark [17, 21], non-rigid 3D shape retrieval [73, 74], and generic 3D warehouse [105]. These reports are a good reference for reviewing recent approaches and their performance evaluation.
More advanced and recent approaches include retrieval of 3D articulated objects [4], retrieval by similarity score fusion [5], discriminative spherical wavelet features [65], spherical parameterizations [66], a compact hybrid shape descriptor [85], matching of point set surfaces [93], a probability density-based shape descriptor [6], and spin images for shape retrieval [7, 8, 37], to name just a few. Another interesting approach is to refine the retrieval results using information about how relevant the user found the results for a given query. This approach, commonly called relevance feedback, was applied to shape retrieval tasks by Leng and Qin [70] and by Giorgi et al. [48].
As stated in Sect. 7.4, partial matching is a challenging and still open problem. Nevertheless, several attempts have been made to tackle it. Among the main approaches are objects as metric spaces [20], priority-driven search [43], shape topics [76], Reeb pattern unfolding [102], partial matching for real textured 3D objects [58], regularized partial matching of rigid shapes [19], and matching of 3D shape subparts [78]. Additionally, Bronstein et al. [24] provide a good reference for non-rigid shape retrieval.
Machine learning techniques have also been applied to shape retrieval and recognition, for instance the boosting approach [63], supervised learning of similarity measures [64], an unsupervised learning approach [80], learning of semantic categories [81], learning of 3D face models [101], face recognition by SVM classification [13], and a neurofuzzy approach [68]. These approaches require some background in pattern recognition and machine learning theory.
For readers interested in the Laplace-Beltrami operator and its applications in shape retrieval and recognition, we recommend the papers by Belkin et al. [11, 12], Bobenko [16], Chuang et al. [34], Ghaderpanah et al. [46], Levy [71], Rustamov [95], Wu et al. [109], and Xu [110, 111]. These papers are highly mathematical, so they are best suited to more advanced readers.
Finally, in addition to the applications listed in Sect. 7.1, the papers by Perakis et al. [89], Zhou et al. [116] and Giorgi et al. [47] describe applications to face recognition, and Wessel et al. [107] presented a benchmark for the retrieval of architectural data.