- •Preface
- •Biological Vision Systems
- •Visual Representations from Paintings to Photographs
- •Computer Vision
- •The Limitations of Standard 2D Images
- •3D Imaging, Analysis and Applications
- •Book Objective and Content
- •Acknowledgements
- •Contents
- •Contributors
- •2.1 Introduction
- •Chapter Outline
- •2.2 An Overview of Passive 3D Imaging Systems
- •2.2.1 Multiple View Approaches
- •2.2.2 Single View Approaches
- •2.3 Camera Modeling
- •2.3.1 Homogeneous Coordinates
- •2.3.2 Perspective Projection Camera Model
- •2.3.2.1 Camera Modeling: The Coordinate Transformation
- •2.3.2.2 Camera Modeling: Perspective Projection
- •2.3.2.3 Camera Modeling: Image Sampling
- •2.3.2.4 Camera Modeling: Concatenating the Projective Mappings
- •2.3.3 Radial Distortion
- •2.4 Camera Calibration
- •2.4.1 Estimation of a Scene-to-Image Planar Homography
- •2.4.2 Basic Calibration
- •2.4.3 Refined Calibration
- •2.4.4 Calibration of a Stereo Rig
- •2.5 Two-View Geometry
- •2.5.1 Epipolar Geometry
- •2.5.2 Essential and Fundamental Matrices
- •2.5.3 The Fundamental Matrix for Pure Translation
- •2.5.4 Computation of the Fundamental Matrix
- •2.5.5 Two Views Separated by a Pure Rotation
- •2.5.6 Two Views of a Planar Scene
- •2.6 Rectification
- •2.6.1 Rectification with Calibration Information
- •2.6.2 Rectification Without Calibration Information
- •2.7 Finding Correspondences
- •2.7.1 Correlation-Based Methods
- •2.7.2 Feature-Based Methods
- •2.8 3D Reconstruction
- •2.8.1 Stereo
- •2.8.1.1 Dense Stereo Matching
- •2.8.1.2 Triangulation
- •2.8.2 Structure from Motion
- •2.9 Passive Multiple-View 3D Imaging Systems
- •2.9.1 Stereo Cameras
- •2.9.2 3D Modeling
- •2.9.3 Mobile Robot Localization and Mapping
- •2.10 Passive Versus Active 3D Imaging Systems
- •2.11 Concluding Remarks
- •2.12 Further Reading
- •2.13 Questions
- •2.14 Exercises
- •References
- •3.1 Introduction
- •3.1.1 Historical Context
- •3.1.2 Basic Measurement Principles
- •3.1.3 Active Triangulation-Based Methods
- •3.1.4 Chapter Outline
- •3.2 Spot Scanners
- •3.2.1 Spot Position Detection
- •3.3 Stripe Scanners
- •3.3.1 Camera Model
- •3.3.2 Sheet-of-Light Projector Model
- •3.3.3 Triangulation for Stripe Scanners
- •3.4 Area-Based Structured Light Systems
- •3.4.1 Gray Code Methods
- •3.4.1.1 Decoding of Binary Fringe-Based Codes
- •3.4.1.2 Advantage of the Gray Code
- •3.4.2 Phase Shift Methods
- •3.4.2.1 Removing the Phase Ambiguity
- •3.4.3 Triangulation for a Structured Light System
- •3.5 System Calibration
- •3.6 Measurement Uncertainty
- •3.6.1 Uncertainty Related to the Phase Shift Algorithm
- •3.6.2 Uncertainty Related to Intrinsic Parameters
- •3.6.3 Uncertainty Related to Extrinsic Parameters
- •3.6.4 Uncertainty as a Design Tool
- •3.7 Experimental Characterization of 3D Imaging Systems
- •3.7.1 Low-Level Characterization
- •3.7.2 System-Level Characterization
- •3.7.3 Characterization of Errors Caused by Surface Properties
- •3.7.4 Application-Based Characterization
- •3.8 Selected Advanced Topics
- •3.8.1 Thin Lens Equation
- •3.8.2 Depth of Field
- •3.8.3 Scheimpflug Condition
- •3.8.4 Speckle and Uncertainty
- •3.8.5 Laser Depth of Field
- •3.8.6 Lateral Resolution
- •3.9 Research Challenges
- •3.10 Concluding Remarks
- •3.11 Further Reading
- •3.12 Questions
- •3.13 Exercises
- •References
- •4.1 Introduction
- •Chapter Outline
- •4.2 Representation of 3D Data
- •4.2.1 Raw Data
- •4.2.1.1 Point Cloud
- •4.2.1.2 Structured Point Cloud
- •4.2.1.3 Depth Maps and Range Images
- •4.2.1.4 Needle map
- •4.2.1.5 Polygon Soup
- •4.2.2 Surface Representations
- •4.2.2.1 Triangular Mesh
- •4.2.2.2 Quadrilateral Mesh
- •4.2.2.3 Subdivision Surfaces
- •4.2.2.4 Morphable Model
- •4.2.2.5 Implicit Surface
- •4.2.2.6 Parametric Surface
- •4.2.2.7 Comparison of Surface Representations
- •4.2.3 Solid-Based Representations
- •4.2.3.1 Voxels
- •4.2.3.3 Binary Space Partitioning
- •4.2.3.4 Constructive Solid Geometry
- •4.2.3.5 Boundary Representations
- •4.2.4 Summary of Solid-Based Representations
- •4.3 Polygon Meshes
- •4.3.1 Mesh Storage
- •4.3.2 Mesh Data Structures
- •4.3.2.1 Halfedge Structure
- •4.4 Subdivision Surfaces
- •4.4.1 Doo-Sabin Scheme
- •4.4.2 Catmull-Clark Scheme
- •4.4.3 Loop Scheme
- •4.5 Local Differential Properties
- •4.5.1 Surface Normals
- •4.5.2 Differential Coordinates and the Mesh Laplacian
- •4.6 Compression and Levels of Detail
- •4.6.1 Mesh Simplification
- •4.6.1.1 Edge Collapse
- •4.6.1.2 Quadric Error Metric
- •4.6.2 QEM Simplification Summary
- •4.6.3 Surface Simplification Results
- •4.7 Visualization
- •4.8 Research Challenges
- •4.9 Concluding Remarks
- •4.10 Further Reading
- •4.11 Questions
- •4.12 Exercises
- •References
- •1.1 Introduction
- •Chapter Outline
- •1.2 A Historical Perspective on 3D Imaging
- •1.2.1 Image Formation and Image Capture
- •1.2.2 Binocular Perception of Depth
- •1.2.3 Stereoscopic Displays
- •1.3 The Development of Computer Vision
- •1.3.1 Further Reading in Computer Vision
- •1.4 Acquisition Techniques for 3D Imaging
- •1.4.1 Passive 3D Imaging
- •1.4.2 Active 3D Imaging
- •1.4.3 Passive Stereo Versus Active Stereo Imaging
- •1.5 Twelve Milestones in 3D Imaging and Shape Analysis
- •1.5.1 Active 3D Imaging: An Early Optical Triangulation System
- •1.5.2 Passive 3D Imaging: An Early Stereo System
- •1.5.3 Passive 3D Imaging: The Essential Matrix
- •1.5.4 Model Fitting: The RANSAC Approach to Feature Correspondence Analysis
- •1.5.5 Active 3D Imaging: Advances in Scanning Geometries
- •1.5.6 3D Registration: Rigid Transformation Estimation from 3D Correspondences
- •1.5.7 3D Registration: Iterative Closest Points
- •1.5.9 3D Local Shape Descriptors: Spin Images
- •1.5.10 Passive 3D Imaging: Flexible Camera Calibration
- •1.5.11 3D Shape Matching: Heat Kernel Signatures
- •1.6 Applications of 3D Imaging
- •1.7 Book Outline
- •1.7.1 Part I: 3D Imaging and Shape Representation
- •1.7.2 Part II: 3D Shape Analysis and Processing
- •1.7.3 Part III: 3D Imaging Applications
- •References
- •5.1 Introduction
- •5.1.1 Applications
- •5.1.2 Chapter Outline
- •5.2 Mathematical Background
- •5.2.1 Differential Geometry
- •5.2.2 Curvature of Two-Dimensional Surfaces
- •5.2.3 Discrete Differential Geometry
- •5.2.4 Diffusion Geometry
- •5.2.5 Discrete Diffusion Geometry
- •5.3 Feature Detectors
- •5.3.1 A Taxonomy
- •5.3.2 Harris 3D
- •5.3.3 Mesh DOG
- •5.3.4 Salient Features
- •5.3.5 Heat Kernel Features
- •5.3.6 Topological Features
- •5.3.7 Maximally Stable Components
- •5.3.8 Benchmarks
- •5.4 Feature Descriptors
- •5.4.1 A Taxonomy
- •5.4.2 Curvature-Based Descriptors (HK and SC)
- •5.4.3 Spin Images
- •5.4.4 Shape Context
- •5.4.5 Integral Volume Descriptor
- •5.4.6 Mesh Histogram of Gradients (HOG)
- •5.4.7 Heat Kernel Signature (HKS)
- •5.4.8 Scale-Invariant Heat Kernel Signature (SI-HKS)
- •5.4.9 Color Heat Kernel Signature (CHKS)
- •5.4.10 Volumetric Heat Kernel Signature (VHKS)
- •5.5 Research Challenges
- •5.6 Conclusions
- •5.7 Further Reading
- •5.8 Questions
- •5.9 Exercises
- •References
- •6.1 Introduction
- •Chapter Outline
- •6.2 Registration of Two Views
- •6.2.1 Problem Statement
- •6.2.2 The Iterative Closest Points (ICP) Algorithm
- •6.2.3 ICP Extensions
- •6.2.3.1 Techniques for Pre-alignment
- •Global Approaches
- •Local Approaches
- •6.2.3.2 Techniques for Improving Speed
- •Subsampling
- •Closest Point Computation
- •Distance Formulation
- •6.2.3.3 Techniques for Improving Accuracy
- •Outlier Rejection
- •Additional Information
- •Probabilistic Methods
- •6.3 Advanced Techniques
- •6.3.1 Registration of More than Two Views
- •Reducing Error Accumulation
- •Automating Registration
- •6.3.2 Registration in Cluttered Scenes
- •Point Signatures
- •Matching Methods
- •6.3.3 Deformable Registration
- •Methods Based on General Optimization Techniques
- •Probabilistic Methods
- •6.3.4 Machine Learning Techniques
- •Improving the Matching
- •Object Detection
- •6.4 Quantitative Performance Evaluation
- •6.5 Case Study 1: Pairwise Alignment with Outlier Rejection
- •6.6 Case Study 2: ICP with Levenberg-Marquardt
- •6.6.1 The LM-ICP Method
- •6.6.2 Computing the Derivatives
- •6.6.3 The Case of Quaternions
- •6.6.4 Summary of the LM-ICP Algorithm
- •6.6.5 Results and Discussion
- •6.7 Case Study 3: Deformable ICP with Levenberg-Marquardt
- •6.7.1 Surface Representation
- •6.7.2 Cost Function
- •Data Term: Global Surface Attraction
- •Data Term: Boundary Attraction
- •Penalty Term: Spatial Smoothness
- •Penalty Term: Temporal Smoothness
- •6.7.3 Minimization Procedure
- •6.7.4 Summary of the Algorithm
- •6.7.5 Experiments
- •6.8 Research Challenges
- •6.9 Concluding Remarks
- •6.10 Further Reading
- •6.11 Questions
- •6.12 Exercises
- •References
- •7.1 Introduction
- •7.1.1 Retrieval and Recognition Evaluation
- •7.1.2 Chapter Outline
- •7.2 Literature Review
- •7.3 3D Shape Retrieval Techniques
- •7.3.1 Depth-Buffer Descriptor
- •7.3.1.1 Computing the 2D Projections
- •7.3.1.2 Obtaining the Feature Vector
- •7.3.1.3 Evaluation
- •7.3.1.4 Complexity Analysis
- •7.3.2 Spin Images for Object Recognition
- •7.3.2.1 Matching
- •7.3.2.2 Evaluation
- •7.3.2.3 Complexity Analysis
- •7.3.3 Salient Spectral Geometric Features
- •7.3.3.1 Feature Points Detection
- •7.3.3.2 Local Descriptors
- •7.3.3.3 Shape Matching
- •7.3.3.4 Evaluation
- •7.3.3.5 Complexity Analysis
- •7.3.4 Heat Kernel Signatures
- •7.3.4.1 Evaluation
- •7.3.4.2 Complexity Analysis
- •7.4 Research Challenges
- •7.5 Concluding Remarks
- •7.6 Further Reading
- •7.7 Questions
- •7.8 Exercises
- •References
- •8.1 Introduction
- •Chapter Outline
- •8.2 3D Face Scan Representation and Visualization
- •8.3 3D Face Datasets
- •8.3.1 FRGC v2 3D Face Dataset
- •8.3.2 The Bosphorus Dataset
- •8.4 3D Face Recognition Evaluation
- •8.4.1 Face Verification
- •8.4.2 Face Identification
- •8.5 Processing Stages in 3D Face Recognition
- •8.5.1 Face Detection and Segmentation
- •8.5.2 Removal of Spikes
- •8.5.3 Filling of Holes and Missing Data
- •8.5.4 Removal of Noise
- •8.5.5 Fiducial Point Localization and Pose Correction
- •8.5.6 Spatial Resampling
- •8.5.7 Feature Extraction on Facial Surfaces
- •8.5.8 Classifiers for 3D Face Matching
- •8.6 ICP-Based 3D Face Recognition
- •8.6.1 ICP Outline
- •8.6.2 A Critical Discussion of ICP
- •8.6.3 A Typical ICP-Based 3D Face Recognition Implementation
- •8.6.4 ICP Variants and Other Surface Registration Approaches
- •8.7 PCA-Based 3D Face Recognition
- •8.7.1 PCA System Training
- •8.7.2 PCA Training Using Singular Value Decomposition
- •8.7.3 PCA Testing
- •8.7.4 PCA Performance
- •8.8 LDA-Based 3D Face Recognition
- •8.8.1 Two-Class LDA
- •8.8.2 LDA with More than Two Classes
- •8.8.3 LDA in High Dimensional 3D Face Spaces
- •8.8.4 LDA Performance
- •8.9 Normals and Curvature in 3D Face Recognition
- •8.9.1 Computing Curvature on a 3D Face Scan
- •8.10 Recent Techniques in 3D Face Recognition
- •8.10.1 3D Face Recognition Using Annotated Face Models (AFM)
- •8.10.2 Local Feature-Based 3D Face Recognition
- •8.10.2.1 Keypoint Detection and Local Feature Matching
- •8.10.2.2 Other Local Feature-Based Methods
- •8.10.3 Expression Modeling for Invariant 3D Face Recognition
- •8.10.3.1 Other Expression Modeling Approaches
- •8.11 Research Challenges
- •8.12 Concluding Remarks
- •8.13 Further Reading
- •8.14 Questions
- •8.15 Exercises
- •References
- •9.1 Introduction
- •Chapter Outline
- •9.2 DEM Generation from Stereoscopic Imagery
- •9.2.1 Stereoscopic DEM Generation: Literature Review
- •9.2.2 Accuracy Evaluation of DEMs
- •9.2.3 An Example of DEM Generation from SPOT-5 Imagery
- •9.3 DEM Generation from InSAR
- •9.3.1 Techniques for DEM Generation from InSAR
- •9.3.1.1 Basic Principle of InSAR in Elevation Measurement
- •9.3.1.2 Processing Stages of DEM Generation from InSAR
- •The Branch-Cut Method of Phase Unwrapping
- •The Least Squares (LS) Method of Phase Unwrapping
- •9.3.2 Accuracy Analysis of DEMs Generated from InSAR
- •9.3.3 Examples of DEM Generation from InSAR
- •9.4 DEM Generation from LIDAR
- •9.4.1 LIDAR Data Acquisition
- •9.4.2 Accuracy, Error Types and Countermeasures
- •9.4.3 LIDAR Interpolation
- •9.4.4 LIDAR Filtering
- •9.4.5 DTM from Statistical Properties of the Point Cloud
- •9.5 Research Challenges
- •9.6 Concluding Remarks
- •9.7 Further Reading
- •9.8 Questions
- •9.9 Exercises
- •References
- •10.1 Introduction
- •10.1.1 Allometric Modeling of Biomass
- •10.1.2 Chapter Outline
- •10.2 Aerial Photo Mensuration
- •10.2.1 Principles of Aerial Photogrammetry
- •10.2.1.1 Geometric Basis of Photogrammetric Measurement
- •10.2.1.2 Ground Control and Direct Georeferencing
- •10.2.2 Tree Height Measurement Using Forest Photogrammetry
- •10.2.2.2 Automated Methods in Forest Photogrammetry
- •10.3 Airborne Laser Scanning
- •10.3.1 Principles of Airborne Laser Scanning
- •10.3.1.1 Lidar-Based Measurement of Terrain and Canopy Surfaces
- •10.3.2 Individual Tree-Level Measurement Using Lidar
- •10.3.2.1 Automated Individual Tree Measurement Using Lidar
- •10.3.3 Area-Based Approach to Estimating Biomass with Lidar
- •10.4 Future Developments
- •10.5 Concluding Remarks
- •10.6 Further Reading
- •10.7 Questions
- •References
- •11.1 Introduction
- •Chapter Outline
- •11.2 Volumetric Data Acquisition
- •11.2.1 Computed Tomography
- •11.2.1.1 Characteristics of 3D CT Data
- •11.2.2 Positron Emission Tomography (PET)
- •11.2.2.1 Characteristics of 3D PET Data
- •Relaxation
- •11.2.3.1 Characteristics of the 3D MRI Data
- •Image Quality and Artifacts
- •11.2.4 Summary
- •11.3 Surface Extraction and Volumetric Visualization
- •11.3.1 Surface Extraction
- •Example: Curvatures and Geometric Tools
- •11.3.2 Volume Rendering
- •11.3.3 Summary
- •11.4 Volumetric Image Registration
- •11.4.1 A Hierarchy of Transformations
- •11.4.1.1 Rigid Body Transformation
- •11.4.1.2 Similarity Transformations and Anisotropic Scaling
- •11.4.1.3 Affine Transformations
- •11.4.1.4 Perspective Transformations
- •11.4.1.5 Non-rigid Transformations
- •11.4.2 Points and Features Used for the Registration
- •11.4.2.1 Landmark Features
- •11.4.2.2 Surface-Based Registration
- •11.4.2.3 Intensity-Based Registration
- •11.4.3 Registration Optimization
- •11.4.3.1 Estimation of Registration Errors
- •11.4.4 Summary
- •11.5 Segmentation
- •11.5.1 Semi-automatic Methods
- •11.5.1.1 Thresholding
- •11.5.1.2 Region Growing
- •11.5.1.3 Deformable Models
- •Snakes
- •Balloons
- •11.5.2 Fully Automatic Methods
- •11.5.2.1 Atlas-Based Segmentation
- •11.5.2.2 Statistical Shape Modeling and Analysis
- •11.5.3 Summary
- •11.6 Diffusion Imaging: An Illustration of a Full Pipeline
- •11.6.1 From Scalar Images to Tensors
- •11.6.2 From Tensor Image to Information
- •11.6.3 Summary
- •11.7 Applications
- •11.7.1 Diagnosis and Morphometry
- •11.7.2 Simulation and Training
- •11.7.3 Surgical Planning and Guidance
- •11.7.4 Summary
- •11.8 Concluding Remarks
- •11.9 Research Challenges
- •11.10 Further Reading
- •Data Acquisition
- •Surface Extraction
- •Volume Registration
- •Segmentation
- •Diffusion Imaging
- •Software
- •11.11 Questions
- •11.12 Exercises
- •References
- •Index
Chapter 8
3D Face Recognition
Ajmal Mian and Nick Pears
Abstract Face recognition using standard 2D images struggles to cope with changes in illumination and pose. 3D face recognition algorithms have been more successful in dealing with these challenges. 3D face shape data is used as an independent cue for face recognition and has also been combined with texture to facilitate multimodal face recognition. Additionally, 3D face models have been used for pose correction and calculation of the facial albedo map, which is invariant to illumination. Finally, 3D face recognition has also achieved significant success towards expression invariance by modeling non-rigid surface deformations, removing facial expressions or by using parts-based face recognition. This chapter gives an overview of 3D face recognition and details both well-established and more recent state-of-the-art 3D face recognition techniques in terms of their implementation and expected performance on benchmark datasets.
8.1 Introduction
Measurement of the intrinsic characteristics of the human face is a socially acceptable biometric method that can be implemented in a non-intrusive way [48]. Face recognition from 2D images has been studied extensively for over four decades [96]. Over the last decade, however, 3D face recognition has attracted considerable research activity and media publicity. With the increased availability of affordable 3D scanners, researchers have proposed many algorithms, and a number of competitions have been arranged to benchmark their performance. Some commercial products have also appeared on the market, and one can now purchase a range of Commercial Off-The-Shelf (COTS) 3D face recognition systems.
A. Mian
School of Computer Science and Software Engineering, University of Western Australia, 35 Stirling Highway, Crawley, WA 6009, Australia
e-mail: ajmal.mian@uwa.edu.au
N. Pears
Department of Computer Science, University of York, Deramore Lane, York, YO10 5GH, UK
e-mail: nick.pears@york.ac.uk
N. Pears et al. (eds.), 3D Imaging, Analysis and Applications, DOI 10.1007/978-1-4471-4063-4_8, © Springer-Verlag London 2012
This chapter will introduce the main concepts behind 3D face recognition algorithms, give an overview of the literature, and elaborate upon some carefully selected representative and seminal techniques. Note that we do not intend to give a highly comprehensive literature review, due to the size of the field and the tutorial nature of this text.
A 2D image is a function of the scene geometry, the imaging geometry, the scene reflectance and the illumination conditions. The same scene appears completely different from different viewpoints or under different illuminations. For images of human faces, it is known that the variations due to pose and illumination changes are greater than the variations between images of different subjects under the same pose and illumination conditions [3]. Therefore, 2D image-based face recognition algorithms usually struggle to cope with such imaging variations.
On the other hand, a captured face surface¹ much more directly represents the geometry of the viewed scene, and is much less dependent on ambient illumination and the viewpoint (or, equivalently, the facial pose). Therefore, 3D face recognition algorithms have been more successful in dealing with the challenges of varying illumination and pose.
Strictly, however, we observe that 3D imaging is not fully independent of pose, because when imaging with a single 3D camera with its limited field of view, the part of the face imaged is clearly dependent on pose. In other words, self-occlusion is a problem, and research issues concerning the fact that the surface view is partial come into play. Additionally, 3D cameras do have some sensitivity to strong ambient lighting as, in the active imaging case, it is more difficult to detect the projected light pattern, sometimes leading to missing parts in the 3D data. Camera designers often attempt to counter this by the use of optical filters and modulated light schemes. Finally, as pose varies, the orientation of the imaged surface affects the footprint of the projected light and how much light is reflected back to the camera. This varies the amount of noise on the measured surface geometry.
Despite these issues, 3D imaging for face recognition still provides clear benefits over 2D imaging. 3D facial shape is used as an independent cue for face recognition, as one modality in multimodal 2D/3D recognition schemes [18, 64], or to assist (pose-correct) 2D image-based face recognition. These last two are possible because most 3D cameras also capture color-texture in the form of a standard 2D image, along with the 3D shape, and the data from these two modalities (2D and 3D) is registered.
3D face recognition developments have also achieved significant success towards robust operation in the presence of facial expression variations. This is achieved either by building expression-invariant face surface representations [15], or modeling non-rigid surface deformations [58], or by avoiding expression deformations by only considering the more rigid upper parts of the face [64] or regions around the nose [19].
¹ This may be referred to as a 3D model, a 3D scan or a 3D image, depending on the mode of capture and how it is stored, as discussed in Chap. 1, Sect. 1.1. Be careful to distinguish between a specific face model relating to a single specific 3D capture instance and a general face model, such as Blanz and Vetter’s morphable face model [11], which is generated from many registered 3D face captures.
Data from 3D scanners can be used to construct generative 3D face models offline, where such models can synthesize the face under novel pose and illumination conditions. Using such models, an online recognition system can employ standard 2D image probes, obviating the need for a 3D scanner in the live system. For example, 3D face models have been used to estimate the illumination-invariant facial albedo map. Once a 3D model and face albedo are available, any new image can be synthetically rendered under novel poses and illumination conditions. A large number of such images under different illuminations are used to build the illumination cones of a face [38], which are subsequently used for illumination-invariant face recognition. Similarly, synthesized images under different poses are used to sample the pose space of human faces and to train classifiers for pose-invariant face recognition [38]. In another approach of this general modeling type, Blanz and Vetter [11] built a statistical, morphable model, which is learnt offline from a set of textured 3D head scans acquired with a laser scanner. This model can be fitted to single probe images, and the model parameters of shape and texture are used to represent and recognize faces.
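To make the idea of rendering a face under novel illumination concrete, the following is a minimal sketch, not the method of [38]. It assumes a Lambertian reflectance model, a face stored as a depth map with a per-pixel albedo map, and a single distant light source. The function names `normals_from_depth` and `render_lambertian` are hypothetical and chosen for illustration only.

```python
import numpy as np

def normals_from_depth(depth):
    """Estimate unit surface normals from a depth map z = f(x, y).

    Assumes unit pixel spacing and a camera looking along the -z axis,
    so that a flat, fronto-parallel surface has normal (0, 0, 1).
    """
    dz_dy, dz_dx = np.gradient(depth.astype(float))  # row and column derivatives
    n = np.dstack((-dz_dx, -dz_dy, np.ones_like(dz_dx)))
    return n / np.linalg.norm(n, axis=2, keepdims=True)

def render_lambertian(depth, albedo, light_dir):
    """Render an intensity image: albedo * max(0, n . l) (assumed model)."""
    l = np.asarray(light_dir, dtype=float)
    l = l / np.linalg.norm(l)
    shading = np.clip(normals_from_depth(depth) @ l, 0.0, None)
    return albedo * shading
```

Calling `render_lambertian` repeatedly with different `light_dir` vectors yields a set of synthetic images of the same face under different illuminations. Cast shadows and specular reflection are ignored in this sketch.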
Thus there are many possibilities when using 3D face captures in face recognition systems. Although we examine several of these, the emphasis of this chapter is on recognition from 3D shape data only, where geometric features are extracted from the 3D face and matched against a dataset to determine the identity of an unknown subject or to verify his/her claimed identity. Even within these 3D facial shape recognition systems, there are many different approaches in the literature, and these can be categorized in several ways. One of the main categorizations distinguishes holistic face representations from feature-based face representations. Holistic methods encode the full visible facial surface after its pose has been normalized (to a canonical frontal pose) and the surface and its properties have been resampled to produce a standard-size feature vector. The feature vector could contain raw depth values and/or any combination of surface properties, such as gradients and curvature. This approach has been employed, for example, using depth maps and the associated surface feature maps in nearest neighbor schemes within both Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) derived subspaces [45]. (Note that the counterparts of these methods for 2D images are often called appearance-based methods, since their low-dimensional representation is faithful to the original image.)
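As an illustration of a holistic scheme of this kind, the sketch below performs nearest neighbor identification in a PCA subspace. It assumes that each face has already been pose-normalized and resampled to a fixed-length vector (for example, a flattened depth map), one row per gallery scan. The function names and the choice of Euclidean distance are assumptions made for illustration, not the specific system of [45].

```python
import numpy as np

def train_pca(gallery, n_components):
    """Learn a PCA subspace from gallery feature vectors (one per row)."""
    mean = gallery.mean(axis=0)
    # SVD of the centered data: the rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(gallery - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(x, mean, basis):
    """Project one vector, or a matrix of row vectors, into the subspace."""
    return (x - mean) @ basis.T

def identify(probe, gallery, labels, mean, basis):
    """Return the label of the gallery face nearest to the probe."""
    g = project(gallery, mean, basis)
    p = project(probe, mean, basis)
    distances = np.linalg.norm(g - p, axis=1)  # Euclidean distance (assumed)
    return labels[int(np.argmin(distances))]
```

For verification rather than identification, the same projected distance could instead be compared against a threshold for the claimed identity.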
Although holistic methods often do extract features, for example to localize a triplet of fiducial points (landmarks) for pose normalization, and a feature vector (e.g. of depths, normals, curvatures, etc.) is extracted for 3D face matching, the term feature-based method usually refers to those techniques that only encode the facial surface around extracted points of interest, also known as keypoints. For example, these could be the local extrema of curvature on the facial surface or keypoints whose local properties have been learnt. Structural matching (e.g. graph matching) approaches can then be employed, where the relative spatial relations of features are key to the face matching process [65]. Alternatively, a ‘bag of features’ approach could be employed, where spatial relations are completely discarded and the content of more complex ‘information rich’ features is the key to the face matching process.
An advantage of holistic methods is that they try to use all of the visible facial surface for discrimination. However, when 3D scans are noisy or low resolution, accurate and reliable pose normalization is difficult and feature-based approaches may perform better.
The broad steps involved in a typical 3D face recognition system are as follows:
1. 3D face scan acquisition. A 3D face model is acquired using one of the techniques described in Chap. 2 (passive techniques) or Chap. 3 (active techniques). Currently, most cameras used for 3D face recognition are active, due to the lack of sufficiently large-scale texture on most subjects’ facial surfaces.
2. 3D face scan preprocessing. Unlike 2D images from a digital camera, the 3D data is visibly imperfect and usually contains spikes, missing data and significant surface noise. These anomalies are removed during preprocessing and any small holes, both those in the original scan and those created by removal of data spikes and pits (negative spikes), are filled by some form of surface interpolation process. Surface smoothing, for example with Gaussians, is often performed as a final stage of 3D data preprocessing.
3. Fiducial point localization and pose normalization. Holistic face recognition approaches require pose normalization so that when a feature vector is generated, specific parts of the feature vector represent properties of specific parts of the facial surface. Generally this can be done by localizing a set of three fiducial points (e.g. inner eye corners and tip of the nose), mapping them into a canonical frame and then refining the pose by registering the rigid upper face region to a 3D facial template in canonical pose, using some form of Iterative Closest Points (ICP) [9] variant. Note that many feature-based methods avoid this pose normalization stage as, in challenging scenarios, it is often difficult to get a sufficiently accurate normalization.
4. Feature vector extraction. A set of features is extracted from the refined 3D scan. These features represent the geometry of the face rather than the 2D color-texture appearance. The choice of features extracted is crucial to the system performance and is often a trade-off between invariance properties and the richness of information required for discrimination. Example ‘features’ include the raw depth values themselves, normals, curvatures, spin images [49], 3D adaptations of the Scale-Invariant Feature Transform (SIFT) descriptor [57] and many others.
5. Facial feature vector matching/classification. The final step of feature matching/classification is similar to any other pattern classification problem, and most, if not all, of the well-known classification techniques have been applied to 3D face recognition. Examples include k-nearest neighbors (k-NN) in various subspaces, such as those derived from PCA and LDA, neural nets, Support Vector Machines (SVM), AdaBoost and many others.
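The coarse part of step 3, mapping three localized fiducial points into a canonical frame, can be sketched as a least-squares rigid alignment. The code below uses the standard SVD-based closed-form solution for rotation and translation between corresponding 3D points. It is an illustrative sketch only: the function names are hypothetical, the canonical landmark coordinates must be supplied by the user, and the subsequent ICP refinement is not shown.

```python
import numpy as np

def rigid_transform(src, dst):
    """Least-squares rotation R and translation t with R @ src_i + t ~ dst_i.

    src, dst: (N, 3) arrays of corresponding points, N >= 3, not collinear.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    # Cross-covariance of the centered point sets.
    h = (src - src_c).T @ (dst - dst_c)
    u, _, vt = np.linalg.svd(h)
    # Guard against a reflection so that det(R) = +1.
    d = np.sign(np.linalg.det(vt.T @ u.T))
    r = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    t = dst_c - r @ src_c
    return r, t

def normalize_pose(scan_points, landmarks, canonical_landmarks):
    """Move a scan into the canonical frame defined by its landmarks."""
    r, t = rigid_transform(landmarks, canonical_landmarks)
    return np.asarray(scan_points, dtype=float) @ r.T + t
```

Here `landmarks` would hold, for example, the two inner eye corners and the nose tip detected on the scan, and `canonical_landmarks` the positions of the same three points in the chosen frontal frame. Because three points constrain the pose only coarsely, the result would normally serve as the initialization for the ICP-based refinement described in step 3.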