A statistical shape modeling approach for predicting subject-specific human skull from head surface.
Original Publication: Stevenage, Eng., Peregrinus.
The human skull is an important structure for jaw-movement and facial-mimic simulations. The head surface can be reconstructed straightforwardly with 3D scanners, but the internal skull is difficult to generate when only external information is available. Very few studies in the literature have addressed skull generation from outside-head information, especially in a subject-specific manner with a complete skull. This study therefore proposes a novel process for predicting a subject-specific, fully detailed skull from a given head surface using a statistical shape modeling approach based on partial least squares regression (PLSR). A CT image database of 209 subjects (160 males and 49 females; ages 34-88 years) was used to learn the head-to-skull relationship. Heads and skulls were reconstructed from the CT images to extract head/skull feature points, head/skull feature distances, head-skull thicknesses, and head/skull volume descriptors for the learning process. A hyperparameter tuning process determined the optimal numbers of head/skull feature points, PLSR components, and deformation control points, as well as the most appropriate learning strategy. Two learning strategies were proposed (point-to-thickness and distance-to-thickness, each with and without the volume descriptor), and a 10-fold cross-validation procedure was conducted to evaluate their accuracy. Finally, the best and worst reconstructed skulls were analyzed using the best learning strategy with its optimal parameters. The optimal numbers of head/skull feature points and deformation control points are 2300 and 1300, respectively. The optimal number of PLSR components ranges from 4 to 8 across all learning configurations.
Cross-validation showed that the grand means and standard deviations of the point-to-thickness, point-to-thickness with volumes, distance-to-thickness, and distance-to-thickness with volumes learning configurations are 2.48 ± 0.27 mm, 2.46 ± 0.19 mm, 2.46 ± 0.15 mm, and 2.48 ± 0.22 mm, respectively. Thus, distance-to-thickness is the best learning configuration for our head-to-skull prediction problem. Moreover, the mean Hausdorff distances are 2.09 ± 0.15 mm and 2.64 ± 0.26 mm for the best and worst predicted skulls, respectively. A novel head-to-skull prediction process based on the PLSR method was thus developed and evaluated. For the first time, this process allows 3D subject-specific human skulls to be predicted from head-surface information with a very good accuracy level. As a perspective, the proposed head-to-skull prediction process will be integrated into our real-time computer-aided vision system for facial animation and rehabilitation.
Graph: Fig. 2 Segmentation and surface reconstruction of the head and skull from CT images using 3D Slicer software
Graph: Fig. 3 Post-processing procedure for outlier removal, head cutting, hole filling, surface reconstructing, and skull cutting
Before applying the regression model, all head and skull shapes were registered to the same coordinate system (Fig. 4). The first head–skull dataset was translated so that its centroid coincided with the origin of the global coordinate system; its orientation axes were left unchanged. Because this first dataset was chosen as the reference for all datasets, its local coordinate system became the global coordinate system. The singular value decomposition (SVD) rigid registration method [[39]] and the iterative-closest-point (ICP) algorithm [[40]] were then applied to align each remaining dataset with the first head–skull dataset. Anatomical landmarks (left and right ears, forehead center, top of the nose, and mouth center) were added to each head model as reference points for the SVD registration, so the head model was initially best aligned only in the facial regions. To optimally align all head regions, ICP registration was then applied to all vertices of the head model; the number of head vertices in every dataset had been normalized to about 100,000 points in the post-processing procedure (Fig. 3). Finally, the SVD and ICP registration matrices estimated from the head model were applied to the associated skull model. By combining the SVD and ICP methods, the differences among head–skull datasets were minimized and the accuracy of the registration, and hence of the normalization, was optimized. Moreover, any holes in the heads and skulls were closed using a shape generation procedure (Fig. 5). The convex hull of each head/skull (Fig. 5c) was first generated so that it covered all head/skull vertices (Fig. 5b).
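As an illustration of the SVD-based landmark registration step, the following sketch (Python/NumPy; the landmark coordinates are illustrative, not taken from the study's data) estimates the rigid transform aligning paired landmark sets via the classical Kabsch method:

```python
import numpy as np

def svd_rigid_register(src, dst):
    """Estimate the rigid transform (R, t) aligning paired landmark sets
    src -> dst using the SVD (Kabsch) method. src, dst: (N, 3) arrays."""
    c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)
    H = (src - c_src).T @ (dst - c_dst)        # 3x3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                   # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = c_dst - R @ c_src
    return R, t

# Paired anatomical landmarks (hypothetical coordinates) on two head models
src = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.],
                [0., 0., 1.], [1., 1., 0.]])
theta = np.deg2rad(30)
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.],
                   [np.sin(theta),  np.cos(theta), 0.],
                   [0., 0., 1.]])
dst = src @ R_true.T + np.array([2., -1., 0.5])

R, t = svd_rigid_register(src, dst)
print(np.allclose(src @ R.T + t, dst))  # True
```

The recovered transform can then be applied to every vertex of the associated skull model, mirroring how the paper propagates the head registration matrices to the skull.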
The convex hull was then re-meshed using the isotropic surface remeshing method [[41]] with a target edge length equal to the average edge length of the convex hull (Fig. 5d). All vertices of the re-meshed convex hull were projected onto the nearest vertices of the head/skull model using the KdTree-FLANN nearest-neighbor search in the Point Cloud Library (PCL) [[42]]. After projection, the head/skull convex hull was no longer isometric (Fig. 5e), so it was isometrically re-meshed to form the head/skull shape (Fig. 5f), with a target edge length equal to the average edge length of the projected head/skull convex hull.
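The nearest-vertex projection step can be sketched as follows (Python, with SciPy's `cKDTree` standing in for PCL's KdTree-FLANN search; the point-cloud sizes are illustrative):

```python
import numpy as np
from scipy.spatial import cKDTree

def project_to_nearest(hull_vertices, model_vertices):
    """Snap each re-meshed convex-hull vertex to its nearest model vertex,
    mirroring the KdTree nearest-neighbour search used with PCL."""
    tree = cKDTree(model_vertices)
    _, idx = tree.query(hull_vertices)          # index of nearest model vertex
    return model_vertices[idx]

rng = np.random.default_rng(0)
model = rng.random((1000, 3))   # stand-in for ~100,000 head/skull vertices
hull = rng.random((50, 3))      # stand-in for the re-meshed convex hull
projected = project_to_nearest(hull, model)
# every projected point now coincides with some vertex of the model
```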
Graph: Fig. 4 Illustration of the head and skull model registration process: the first head/skull was registered into a global coordinate system and then all other heads/skulls were registered into the first head/skull
Graph: Fig. 5 Illustration of the head/skull shapes generated from CT-based reconstructed heads/skulls
In addition, because each head/skull model had different numbers of vertices and facets, a sampling process was established to obtain a unified number of feature points across all head/skull datasets (Fig. 6). Sampling rays were generated from a sampling surface (Fig. 6b): each ray started at the center of the sampling surface, pointed from this center toward one of its vertices (Fig. 6c), and was normalized to unit length. The sampling surface itself was built by estimating the convex hull of the first skull shape and isometrically re-meshing it with an adjustable edge length; the number of vertices of the sampling surface can thus be controlled by adjusting the edge length during re-meshing. Because all head–skull datasets were registered to the coordinate system of the first head–skull dataset, the sampling surface was by default in the same coordinate system as all registered head–skull datasets. The intersection points between the sampling rays and the facets of the head/skull models were taken as the head/skull feature points.
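The construction of unit-length sampling rays described above can be sketched as follows (Python/NumPy; the random surface is only a stand-in for the re-meshed convex hull of the first skull shape):

```python
import numpy as np

def sampling_rays(surface_vertices):
    """Build unit-length sampling rays from the centre of a sampling surface
    towards each of its vertices, as in the feature-point sampling step."""
    center = surface_vertices.mean(axis=0)
    dirs = surface_vertices - center
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)  # normalise to length 1
    return center, dirs

rng = np.random.default_rng(1)
surf = rng.random((2300, 3))    # 2300 vertices = the paper's optimal count
center, rays = sampling_rays(surf)
print(np.allclose(np.linalg.norm(rays, axis=1), 1.0))  # True
```

Intersecting these rays with the model facets (e.g. via a ray-triangle intersection routine) then yields one feature point per ray, giving every dataset the same number of feature points.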
Graph: Fig. 6 Head–skull surface sampling process
Head-to-skull learning using the PLSR method
Two learning strategies were proposed for predicting skull shape from head shape. The first strategy was to study the relationship between the positions of head feature points and the head–skull thickness values at these feature points. The second strategy was to study the relationship between the distances from head feature points to the head center and the head–skull thickness values at these feature points. In addition, each learning strategy also included two configurations in which the training input data were integrated with/without volume descriptor (head volumes).
After the head and skull shape sampling, the head and skull feature points of each dataset were stored in head and skull feature-point matrices, together with the sampling-ray directions from the sampling surface. The head feature-point matrix of the i-th dataset is

$$\boldsymbol{V}_{H}^{i}\,[M \times 3] = \begin{bmatrix} \boldsymbol{v}_{1}^{H} \\ \vdots \\ \boldsymbol{v}_{M}^{H} \end{bmatrix} = \begin{bmatrix} x_{i1}^{H} & y_{i1}^{H} & z_{i1}^{H} \\ \vdots & \vdots & \vdots \\ x_{iM}^{H} & y_{iM}^{H} & z_{iM}^{H} \end{bmatrix}, \quad i = 1, 2, \ldots, N \tag{1}$$

in which M is the number of feature points and N is the number of datasets. [Equations (2)–(6), which define the corresponding skull feature-point, sampling-direction, and related matrices, are not recoverable from the source.]
Strategy 1: point-to-thickness relationship learning
In this strategy, the relationship between the positions of head feature points and their respective head–skull thickness values was analyzed. [Equation (7), which defines the head–skull thickness values at the feature points, is not recoverable from the source.]
The head volume descriptor was also computed for each dataset. [Equation (8), which defines this descriptor, is not recoverable from the source.]
The predictor variable was assembled from the head feature-point data. [Equations (9) and (10), which define the predictor variables without and with the head volume descriptor, respectively, are not recoverable from the source.]
The response variable was assembled from the head–skull thickness values. [Equations (11) and (12), which define the response variables, are not recoverable from the source.]
Strategy 2: distance-to-thickness relationship learning
In this strategy, the relationship between the distances from the head feature points to the head center and the respective head–skull thickness values at these feature points was analyzed. [Equation (13), which defines these distances, is not recoverable from the source.]
The predictor variable was assembled from these distances. [Equations (14) and (15), which define the predictor variables without and with the head volume descriptor, are not recoverable from the source.]
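A minimal sketch of how the distance-to-thickness predictors and responses could be assembled (Python/NumPy). Note that the exact matrix definitions of Eqs. (13)–(15) are assumptions here rather than taken from the text: for illustration only, the thickness is taken as the distance between corresponding head and skull feature points along a sampling ray, and the head center as the centroid of the head feature points:

```python
import numpy as np

def distance_predictors(head_points, head_center):
    """Scalar distance from each head feature point to the head centre:
    one predictor per feature point instead of three coordinates."""
    return np.linalg.norm(head_points - head_center, axis=1)

def thickness_responses(head_points, skull_points):
    """Assumed head-skull thickness at each feature point: the distance
    between corresponding head and skull feature points."""
    return np.linalg.norm(head_points - skull_points, axis=1)

rng = np.random.default_rng(2)
head = rng.random((2300, 3)) * 100.0
skull = head * 0.9                  # toy "skull" shrunk towards the origin
X_row = distance_predictors(head, head.mean(axis=0))  # 2300 scalar predictors
y_row = thickness_responses(head, skull)              # 2300 scalar responses
print(X_row.shape, y_row.shape)  # (2300,) (2300,)
```

Stacking one such predictor row and response row per subject yields the training matrices; this is why the configuration scales well with the number of feature points, as discussed later.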
PLSR and improved PLSR algorithm
A multivariate linear model is defined as the following equation:

$$\boldsymbol{Y} = \boldsymbol{X}\boldsymbol{B} + \boldsymbol{E} \tag{16}$$

in which X is the predictor matrix, Y is the response matrix, B is the matrix of model coefficients, and E is the residual matrix. The objective of the training stage was to estimate the model coefficient matrix B from the training predictors and responses. [Equation (17), which gives this estimate, is not recoverable from the source.]

Using the PLSR method, the predictor datasets and the response datasets were decomposed into the following matrices:

$$\boldsymbol{X} = \boldsymbol{T}\boldsymbol{P}^{\mathsf{T}} + \boldsymbol{E}_{X} \tag{18}$$

$$\boldsymbol{Y} = \boldsymbol{U}\boldsymbol{Q}^{\mathsf{T}} + \boldsymbol{F}_{Y} \tag{19}$$

where T and U are the score matrices and P and Q are the loading matrices of X and Y, respectively. [Equations (20)–(22), which relate the score matrices and yield the final estimate of B, are not recoverable from the source.]
Regarding the skull reconstruction process, a generic skull (6122 vertices, 9537 facets), provided by Free3D [[46]], was deformed so that its covering shape fitted the regressed skull shape, forming the final predicted skull (Fig. 7). The generation procedure includes three main steps: (1) registration, (2) initial deformation, and (3) refinement. First, the generic skull was registered to the regressed skull shape using the SVD [[39]], ICP [[40]], and coherent point drift (CPD) [[47]] registration methods. In particular, the generic skull model was initially registered to the centroid of the new head model using the SVD method with manually picked markers on both the generic skull and head models. The skull shape of the registered generic skull was then generated using the skull shape generation procedure in Fig. 5 and registered to best fit the regressed skull shape predicted from the new head model using both the ICP and CPD rigid registration methods: ICP first centered the SVD-registered generic skull shape inside the regressed skull shape, but, because of differing vertex numbers, the two shapes were not optimally fitted, so the CPD rigid registration method was then used to compute the optimal rigid transformation. The SVD-registered generic skull was finally transformed with the combined ICP–CPD rigid transform so that it best fitted the regressed skull shape inside the new head model. Second, the registered skull was parameterized using mean value coordinates [[48]] with a cage of 14 vertices before being deformed toward the regressed skull shape; this number of cage vertices was chosen so that only the general shape of the skull model would be affected. Using the mean-value coordinate matrix, the generic skull model can be deformed by moving the cage vertices.

The cage of the registered generic skull model was then deformed so that the difference between the regressed skull shape and the registered generic skull shape was minimized. After this initial deformation, the deformed skull model fitted the regressed skull shape reasonably well, but some detailed regions, such as the forehead, eyes, chin, and teeth, were not perfectly fitted. Third, the parameterization and deformation were repeated with more control points: the deformed skull model was parameterized again using mean value coordinates [[48]] with a larger cage, and this cage was deformed so that the skull shape was optimally fitted to the regressed skull shape. The number of cage vertices in this refinement step affects the accuracy of the final predicted skull; we experimentally chose 1300 cage vertices, which gave the best-fitted predicted skull model.
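Once the mean-value coordinate matrix is available, the cage-based deformation principle reduces to a matrix product. The sketch below (Python/NumPy) uses a random row-stochastic matrix as a hypothetical stand-in for true mean-value coordinates, which is enough to show why moving the cage moves the whole model:

```python
import numpy as np

# Cage-based deformation sketch: every skull vertex is a fixed affine
# combination of the cage vertices (rows of W sum to 1), so moving the
# cage moves the whole parameterised model.
rng = np.random.default_rng(4)
n_verts, n_cage = 6122, 14        # generic skull: 6122 vertices, 14-vertex cage
W = rng.random((n_verts, n_cage))
W /= W.sum(axis=1, keepdims=True)  # hypothetical stand-in for mean-value coords
cage = rng.random((n_cage, 3))
skull = W @ cage                   # parameterised skull vertices

# Translating every cage vertex translates the whole skull rigidly,
# because the weights form a partition of unity (affine combination).
offset = np.array([5.0, 0.0, 0.0])
skull_moved = W @ (cage + offset)
print(np.allclose(skull_moved, skull + offset))  # True
```

In the actual pipeline the cage vertices are optimized so that the deformed skull shape matches the regressed skull shape; the refinement step repeats this with a 1300-vertex cage to capture detail regions.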
Graph: Fig. 7 The skull model generation process from regressed skull shape: (1) registration, (2) initial deformation, and (3) refinement deformation
Hyperparameter tuning and cross-validation
A convergence study was conducted to determine the optimal number of head/skull feature points, the optimal number of PLSR components for each learning strategy, and the optimal number of control points for the skull reconstruction process. Throughout the convergence analysis, the full database was divided into training head–skull datasets (70%) and testing datasets (30%).
Once the optimal numbers of components and head/skull feature points were determined, the PLSR models with the four learning configurations were cross-validated to determine their accuracy. A 10-fold cross-validation process was performed, and mean errors were estimated and reported. Note that all learning configurations (point-to-thickness, point-to-thickness with volume descriptor, distance-to-thickness, and distance-to-thickness with volume descriptor) were cross-validated with the same protocol to select the best configuration. With the best learning configuration and its optimal parameters, the best and worst regressed outputs among all testing datasets were selected to reconstruct the corresponding skulls using the optimal number of control points.
The comparison between the regressed skull shape and the CT-based skull shape was evaluated using several metrics, including the Hausdorff distance and the volume deviation; the comparison between the reconstructed skull and the CT-based skull was evaluated using only the Hausdorff distance criterion. All training, regression, and reconstruction procedures were executed on a mobile workstation (Intel® Xeon® E-2176M CPU @ 2.7 GHz, 64-bit, 12 cores, 32 GB RAM). All procedures were developed in Microsoft Visual Studio C++ 2015.
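The Hausdorff-distance evaluation can be sketched with SciPy (Python; the toy point sets are illustrative):

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff

def symmetric_hausdorff(a, b):
    """Symmetric Hausdorff distance between two vertex sets (N,3) and (M,3)."""
    return max(directed_hausdorff(a, b)[0], directed_hausdorff(b, a)[0])

# Two toy "surfaces": a unit cube's corners and the same corners shifted by 0.1
a = np.array([[x, y, z] for x in (0., 1.) for y in (0., 1.) for z in (0., 1.)])
b = a + 0.1
print(round(symmetric_hausdorff(a, b), 6))  # 0.173205 (= sqrt(3) * 0.1)
```

In practice the distance would be evaluated between the dense vertex sets of the regressed/reconstructed skull and the CT-based skull, and averaged per vertex to obtain the mean distances reported below.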
Results
Head and skull volume
The distributions of head, skull, and head–skull volumes computed from the head/skull shapes are shown in Fig. 8. The average head volume is 3987 cm³, the average skull volume is 2607 cm³, and the average head–skull difference volume is 1379 cm³. Note that the order of magnitude of the average head volume is in agreement with the mass properties of the adult human head [[36]].
Graph: Fig. 8 Volume distributions of all reconstructed subjects
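The volumes above can be computed from closed triangulated shapes via the divergence theorem, summing signed tetrahedra spanned by the origin and each triangle. A minimal sketch (Python/NumPy, checked on a unit cube):

```python
import numpy as np

def mesh_volume(vertices, faces):
    """Volume of a closed, consistently oriented triangulated surface:
    sum of signed tetrahedra spanned by the origin and each triangle."""
    v = vertices[faces]                                      # (F, 3, 3)
    triple = np.einsum('ij,ij->i', v[:, 0], np.cross(v[:, 1], v[:, 2]))
    return np.abs(triple.sum()) / 6.0

# Unit cube (8 corners, 12 outward-oriented triangles) as a sanity check
verts = np.array([[x, y, z] for x in (0., 1.) for y in (0., 1.) for z in (0., 1.)])
faces = np.array([[0, 1, 3], [0, 3, 2], [4, 6, 7], [4, 7, 5],
                  [0, 4, 5], [0, 5, 1], [2, 3, 7], [2, 7, 6],
                  [0, 2, 6], [0, 6, 4], [1, 5, 7], [1, 7, 3]])
print(mesh_volume(verts, faces))  # 1.0
```

Applying this to the hole-free head and skull shapes yields the head, skull, and head-skull difference volumes.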
Hyperparameter tuning outcomes
The mean Hausdorff distances between the best regressed skull shapes and their CT-based skull shapes for all learning configurations (point-to-thickness, point-to-thickness with volume descriptor, distance-to-thickness, and distance-to-thickness with volume descriptor) for different numbers of head/skull feature points are shown in Fig. 9. Overall, the mean distances of all configurations decrease significantly as the number of head/skull feature points increases from 100 to 500 points, then decrease slowly and become stable between 500 and 3100 points. The average mean distance of the four learning configurations gradually decreases and reaches its minimum of 1.08 mm at 2300 feature points; beyond this number, it stops decreasing.
Graph: Fig. 9 Mean Hausdorff distances computed from the convergence analysis to determine the optimal number of head/skull feature points: the number of head/skull feature points was increased from 100 to 3100 with a step size of 100 points. Note that for each learning configuration, the mean Hausdorff distance was computed between the best regressed skull shape and its corresponding CT-based skull shape
Using the optimal number of head/skull feature points (2300), we trained and tested our PLSR models to determine the optimal number of components. The root mean square errors of all tested learning configurations are shown in Fig. 10. The optimal numbers of components for the point-to-thickness, point-to-thickness with volume descriptor, distance-to-thickness, and distance-to-thickness with volume descriptor configurations are 8, 7, 5, and 4, respectively. Note that the optimal number of components decreases as the number of predictor variables decreases; moreover, for the point-to-thickness strategy, the configuration with the head volume descriptor has a smaller optimal number of components than the one without it.
Graph: Fig. 10 RMS errors computed from the convergence analysis to determine the optimal number of components: the number of components was increased from 1 to 20 components with the step size of one component. The number of head/skull feature points was kept at the optimal value of 2300
The outcomes of the convergence analysis used to determine the optimal number of control points for the refinement step of the skull reconstruction process are shown in Fig. 11. Overall, as the number of control points increases from 100 to 3100, the average differences over the three randomly selected testing datasets decrease from 2.5 mm to a first minimum of 1.8 mm at 1300 points; from 1300 to 3100 points, the average errors fluctuate around 1.9 mm. Consequently, according to this analysis, we chose 1300 control points as the optimum.
Graph: Fig. 11 Mean Hausdorff distances between the generated skull shapes and the CT-based skull shapes in different number of control points during the refinement process of skull deformation. For each number of control points, three randomly selected testing datasets were evaluated
Tenfold cross-validation
The outcomes of the 10-fold cross-validation using the optimal parameter values for each learning configuration are shown in Fig. 12. The regressed outputs of each learning strategy were relatively stable across the iterations of the cross-validation process. The grand means and standard deviations of the mean differences for the point-to-thickness, point-to-thickness with volumes, distance-to-thickness, and distance-to-thickness with volumes configurations across the 10 folds are 2.48 ± 0.27 mm, 2.46 ± 0.19 mm, 2.46 ± 0.15 mm, and 2.48 ± 0.22 mm, respectively. Thus, distance-to-thickness is the best learning configuration for our head-to-skull prediction problem. Note that the training and testing time for each run of this optimal learning configuration is 9 min 4 s ± 10 s.
Graph: Fig. 12 Mean errors between the regressed outputs and the testing outputs from the 10-fold cross-validation. The mean errors were evaluated in different learning configurations with the optimal numbers of components and head/skull feature points
Best and worst prediction cases
The best and worst regressed skull shapes from the 10-fold cross-validation, obtained with the optimal set of hyperparameters and the best learning configuration from the previous steps, were analyzed. The regressed skull shapes were compared with their respective CT-based skull shapes using the Hausdorff distance and volume metrics; the results are shown in Fig. 13. The mean Hausdorff distances are 1.2 ± 0.14 mm and 3.1 ± 0.29 mm for the best and worst cases, respectively. The volume deviations are 25.3 ± 26.4 cm³ (1 ± 1%) and 222.5 ± 56.9 cm³ (8.7 ± 1.8%) for the best and worst cases, respectively. When reconstructing skulls from the best and worst regressed skull shapes (Fig. 14), the mean Hausdorff distances are 2.09 ± 0.15 mm and 2.64 ± 0.26 mm, respectively; in the muscle attachment regions, they are 1.68 ± 0.36 mm and 1.98 ± 0.36 mm. The muscle attachment regions were defined based on the standard definitions of facial muscles in MPEG-4 [[50]], the facial action coding system (FACS) [[49]], and facial anatomical landmarks [[51]]; based on these standards, we defined the regions by manually selecting skull vertices on both the predicted and CT-based skull models. Note that the skull reconstruction time from skull shapes is 40.27 ± 3.06 s.
Graph: Fig. 13 Comparisons of the best and worst cases across the 10-fold cross-validation: (a) mean Hausdorff distances and (b) volume deviations between the regressed skull shapes and CT-based skull shapes
Graph: Fig. 14 Comparison of the best and worst reconstructed skulls with the respective CT-based skulls from the 10-fold cross-validation: (a) best case and (b) worst case
Discussion
The three-dimensional skull is commonly reconstructed in computer-aided vision systems to perform jaw movements and facial expression/mimic animations, so skull reconstruction directly affects the accuracy of such systems. Medical imaging techniques like CT or MRI can produce very accurate skull reconstructions, but they are not suitable for a real-time computer-aided vision system in which only visual sensors such as 2D cameras [[7]], 3D scanners [[8]], and Kinect sensors [[10]] are available. Consequently, skull generation from the head surface is an attractive and challenging problem. In this paper, we proposed a novel process for predicting the 3D skull from the head using a statistical shape modeling approach. Our approach showed a high level of approximation, with a mean Hausdorff distance ranging from 2.09 ± 0.15 mm to 2.64 ± 0.26 mm between the statistics-based regressed skulls and their respective CT-based reconstructed skulls; in the muscle attachment regions, the mean Hausdorff distances ranged from 1.68 ± 0.36 mm to 1.98 ± 0.36 mm. It is important to note that our skull-from-head reconstruction is a complex process with many steps, from medical image processing through geometrical transformation to regression, all of which contribute to the reported errors. In particular, manual interventions (e.g., head and skull cutting) were performed during medical image processing and geometrical transformation; future work should investigate automating these interventions to further reduce the errors. Another possible improvement is to increase the number of samples used in the training process to obtain better regression outcomes.
In the literature, only one study has tried to estimate the skull from the head, and the generated skull was partial and lacked detail [[2]]. That study approximated the skull's forehead and jaws by simply down-scaling and down-sampling a head model; an affine transformation was used to deform an available skull to fit approximately inside a new head. However, the method was not validated against a ground-truth skull, and affine transforms cannot describe the nonlinear deformations and surface variations of the head–skull relationship. The skull-to-head relationship, in contrast, has commonly been studied for forensic facial reconstruction [[13]–[21]]. Compared with these studies, our study regressed the head–skull relationship with different learning strategies: point-to-thickness (with/without head volumes) and distance-to-thickness (with/without head volumes). Moreover, the improved PLSR method [[45]] was used in our study because it is more suitable for training with large numbers of predictor and response variables than the classical PLSR method [[43]] mostly used in previous studies. In addition, a hyperparameter tuning process was conducted, leading to more robust prediction outcomes with optimal numbers of head/skull feature points and PLSR components. Finally, we applied the cage-based deformation method to deform a generic skull to fit the regressed skull shape; this method is more suitable than the free-form deformation method [[52]] for deforming models with complex internal structures like skulls.
In addition, in our prediction problem, the number of training datasets (209) is much smaller than the number of predictor variables, which motivates the use of a dimension-reduction regression method such as PLSR.
Regarding the skull-to-head or head-to-skull relationships, because the number of predictor variables is often larger than the number of training datasets, dimension-reduction methods such as PCA [[13], [16]–[18], [20]] and PLSR [[19]] have mostly been investigated. While PCA is an unsupervised pattern recognition technique that analyzes the variance of the predictor variables without considering the responses, PLSR is a supervised technique that analyzes the intrinsic relationship between the variance of the predictor variables and that of the response variables. Consequently, fewer components are needed to fit the response variables with PLSR than with PCA [[43], [54]]. In this study, the improved PLSR method proposed by Dayal et al. [[45]] was used; it is faster and more efficient than classical PLSR when analyzing large numbers of predictor and response variables. Moreover, instead of directly studying skull–face thickness distributions, most studies have learned the positions of landmarks manually marked on the face and skull [[13], [16]]; faces and skulls can also be divided into ROIs whose vertices serve as predictor and response variables for training PCA or PLSR models [[19]]. However, directly using the positions of face–skull markers or ROI vertices as predictor and response variables significantly increases the complexity of the trained models and degrades their accuracy, especially when the number of head/face feature points is large. In this study, our best learning strategy (the distance-to-thickness configuration) has a minimal number of predictor and response variables: the vector values of the head/skull feature points were converted to scalar values, so the dimensions of the predictor and response variables were significantly reduced.
Consequently, this learning configuration is suitable for analyzing a large number of feature points. Furthermore, manual sampling approaches limit the number of feature points and datasets (38 points on 25 datasets in [[16]], 52 points on 118 datasets in [[13]], 78 points on 200 datasets in [[17]]); the automatic sampling procedure proposed here is more suitable for acquiring a large number of feature points on a large number of datasets (2300 points on 209 head–skull datasets). Finally, in most previous works, the back regions of the head were missing when learning the head–skull relationship. Our study learns the head-to-skull relationship with full head/skull shapes, so our approach can be used to learn both head-to-skull and skull-to-head relationships in a straightforward way.
Three-dimensional geometrical shape learning has become an attractive research field with many applications, such as 3D object segmentation, 3D shape correspondence, and 3D shape prediction. There are two common approaches: classical statistical approaches like the PLSR used in this study, and machine-learning approaches [[55]–[57]]. With the progress of efficient and robust machine-learning algorithms, 3D shape learning can now be achieved effectively, but this requires specific machine-learning libraries like TensorFlow or Keras, so integrating such an approach into a real-time computer-vision system requires dedicated implementation work. In the present study, a classical statistical approach was selected because of its straightforward integration into our computer vision system. Although the PLSR method is particularly suitable for our head-to-skull prediction problem, it also has drawbacks: the variance of the predictor and response variables is computed only across the training shapes, and the intrinsic variance within each shape is not taken into account. Further studies will investigate incorporating this behavior into the prediction model; for this purpose, geometric deep learning, a new class of deep learning approaches [[58]], should be adopted.
One limitation of this study is the use of a CT image database. Medical imaging data are commonly acquired in the supine position, leading to shape artifacts due to gravity [[59]]; a postural transformation should therefore be investigated in the future to correct the reconstructed head/skull shapes before using them for learning. Moreover, a larger database focusing on subjects younger than 35 years would complete our study. Finally, increasing the number of datasets will improve the statistical regression capacity for our head-to-skull prediction problem.
Conclusions
A novel head-to-skull prediction process based on the PLSR method was developed and evaluated in this study. The distance-to-thickness learning configuration was identified as the best learning strategy for our prediction problem, and hyperparameter tuning was performed to select the optimal set of learning parameters. The developed process allows a 3D human skull to be predicted from head surface information with a very good accuracy level (mean Hausdorff distances ranging from 2.09 ± 0.15 mm to 2.64 ± 0.26 mm). As a perspective, the proposed head-to-skull prediction process will be integrated into our real-time computer-aided vision system for facial animation and rehabilitation [[53]].
Acknowledgments
This work was carried out and funded in the framework of the Labex MS2T. It was supported by the French Government, through the program "Investments for the future" managed by the National Agency for Research (Reference ANR-11-IDEX-0004-02). We also acknowledge the "Hauts-de-France" region for funding.
Compliance with ethical standards
Conflict of interest
The authors declare that they have no conflicts of interest.
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
1 Lee Y, Terzopoulos D, Waters K. Constructing physics-based facial models of individuals. Proc Graphics Interface '93. 1993: 1-8. 10.20380/GI1993.01
2 Kähler K, Haber J, Seidel H-P. Geometry-based muscle modeling for facial animation. Proc Graphics Interface. 2001: 37-46. 10.20380/GI2001.05
3 Claes P, Vandermeulen D, De Greef S, Willems G, Clement JG, Suetens P. Computerized craniofacial reconstruction: conceptual framework and review. Forensic Sci Int. 2010; 201: 138-145. 10.1016/j.forsciint.2010.03.008
4 Wei M, Liu Y, Dong H, El Saddik A. Human head stiffness rendering. IEEE Trans Instrum Meas. 2017; 66: 2083-2096. 10.1109/TIM.2017.2676258
5 Ping HY, Abdullah LN, Sulaiman PS, Halin AA. Computer facial animation: a review. Int J Comput Theory Eng. 2013; 5: 658-662. 10.7763/ijcte.2013.v5.770
6 Waters K, Terzopoulos D. A physical model of facial tissue and muscle articulation. In: Proc First Conf Vis Biomed Comput. 1990: 77-82. 10.1109/VBC.1990.109305
7 Wang SF, Lai SH. Reconstructing 3D face model with associated expression deformation from a single face image via constructing a low-dimensional expression deformation manifold. IEEE Trans Pattern Anal Mach Intell. 2011; 33: 2115-2121. 10.1109/TPAMI.2011.88
8 Marcos S, Gómez-García-Bermejo J, Zalama E. A realistic, virtual head for human–computer interaction. Interact Comput. 2010; 22: 176-192. 10.1016/j.intcom.2009.12.002
9 Matsuoka A, Yoshioka F, Ozawa S, Takebe J. Development of three-dimensional facial expression models using morphing methods for fabricating facial prostheses. J Prosthodont Res. 2019; 63: 66-72. 10.1016/j.jpor.2018.08.003
10 Turban L, Girard D, Kose N, Dugelay JL. From Kinect video to realistic and animatable MPEG-4 face model: a complete framework. In: 2015 IEEE Int Conf Multimed Expo Workshops (ICMEW). 2015: 1-6. 10.1109/ICMEW.2015.7169783
11 Nielsen JD, Madsen KH, Puonti O, Siebner HR, Bauer C, Madsen CG, Saturnino GB, Thielscher A. Automatic skull segmentation from MR images for realistic volume conductor models of the head: assessment of the state-of-the-art. Neuroimage. 2018; 174: 587-598. 10.1016/j.neuroimage.2018.03.001
12 Minnema J, van Eijnatten M, Kouw W, Diblen F, Mendrik A, Wolff J. CT image segmentation of bone for medical additive manufacturing using a convolutional neural network. Comput Biol Med. 2018; 103: 130-139. 10.1016/j.compbiomed.2018.10.012
13 Claes P, Vandermeulen D, Suetens P, Willems G, De Greef S. Volumetric deformable face models for cranio-facial reconstruction. In: Proc 4th Int Symp Image Signal Process Anal (ISPA 2005). 2005: 353-358. 10.1109/ispa.2005.195437
14 Liang R, Lin Y, Jiang L, Bao J, Huang X. Craniofacial model reconstruction from skull data based on feature points. In: Proc 11th IEEE Int Conf Comput Des Comput Graph (CAD/Graphics 2009). 2009: 602-605. 10.1109/CADCG.2009.5246828
15 Pei Y, Zha H, Yuan Z. Facial feature estimation from the local structural diversity of skulls. In: 2008 19th Int Conf Pattern Recognit. IEEE. 2008: 1-4. 10.1109/ICPR.2008.4761858
16 Zhang YF, Zhou MQ, Geng GH, Feng J. Face appearance reconstruction based on a regional statistical craniofacial model (RCSM). In: Proc Int Conf Pattern Recognit. 2010: 1670-1673. 10.1109/ICPR.2010.413
17 Shui W, Zhou M, Deng Q, Wu Z, Duan F. 3D craniofacial reconstruction using reference skull-face database. In: Int Conf Image Vis Comput New Zealand. 2010: 1-7. 10.1109/IVCNZ.2010.6148864
18 Yang Y, Zhou M, Lu K, Duan F, Li Y, Tian Y, Wu Z. Skull identification via correlation measure between skull and face shape. IEEE Trans Inf Forensics Secur. 2014; 9: 1322-1332. 10.1109/tifs.2014.2332981
19 Duan F, Huang D, Tian Y, Lu K, Wu Z, Zhou M. 3D face reconstruction from skull by regression modeling in shape parameter spaces. Neurocomputing. 2015; 151: 674-682. 10.1016/j.neucom.2014.04.089
20 Bai F, Zhengxin C, Qiao X, Qingqiong D, Duan F, Tian Y. Face reconstruction from skull based on least squares canonical dependency analysis. In: 2016 IEEE Int Conf Syst Man Cybern (SMC). 2016: 3612-3617. 10.1109/SMC.2016.7844794
21 Madsen D, Lüthi M, Schneider A, Vetter T. Probabilistic joint face-skull modelling for facial reconstruction. In: Proc IEEE Conf Comput Vis Pattern Recognit. 2018: 5295-5303
22 Pearson K. LIII. On lines and planes of closest fit to systems of points in space. London, Edinburgh, Dublin Philos Mag J Sci. 1901; 2: 559-572
23 Hotelling H. Analysis of a complex of statistical variables into principal components. J Educ Psychol. 1933; 24: 417. 10.1037/h0071325
24 Soummer R, Pueyo L, Larkin J. Detection and characterization of exoplanets and disks using projections on Karhunen-Loève eigenimages. Astrophys J Lett. 2012; 755: L28. 10.1088/2041-8205/755/2/L28
25 Wold S, Geladi P, Esbensen K, Öhman J. Multi-way principal components- and PLS-analysis. J Chemom. 1987; 1: 41-56. 10.1002/cem.1180010107
26 Garthwaite PH. An interpretation of partial least squares. J Am Stat Assoc. 1994; 89: 122-127. 10.1080/01621459.1994.10476452
27 Lee DD, Seung HS. Learning the parts of objects by non-negative matrix factorization. Nature. 1999; 401: 788-791. 10.1038/44565
28 Paatero P, Tapper U. Positive matrix factorization: a non-negative factor model with optimal utilization of error estimates of data values. Environmetrics. 1994; 5: 111-126. 10.1002/env.3170050203
29 Ju T, Schaefer S, Warren J. Mean value coordinates for closed triangular meshes. In: ACM SIGGRAPH 2005 Papers (SIGGRAPH '05). ACM Press, New York, NY, USA. 2005: 561
30 Clark K, Vendt B, Smith K, Freymann J, Kirby J, Koppel P, Moore S, Phillips S, Maffitt D, Pringle M, Tarbox L, Prior F. The Cancer Imaging Archive (TCIA): maintaining and operating a public information repository. J Digit Imaging. 2013; 26: 1045-1057. 10.1007/s10278-013-9622-7
31 Vallières M, Kay-Rivest E, Perrin LJ, Liem X, Furstoss C, Aerts HJWL, Khaouam N, Nguyen-Tan PF, Wang C-S, Sultanem K, Seuntjens J, El Naqa I. Radiomics strategies for risk assessment of tumour failure in head-and-neck cancer. Sci Rep. 2017; 7: 10117. 10.1038/s41598-017-10371-5
32 Pieper S, Halle M, Kikinis R. 3D Slicer. 2005: 632-635
33 Dao TT, Pouletaut P, Charleux F, Lazáry Á, Eltes P, Varga PP, Ho Ba Tho MC. Multimodal medical imaging (CT and dynamic MRI) data and computer-graphics multi-physical model for the estimation of patient specific lumbar spine muscle forces. Data Knowl Eng. 2015; 96-97: 3-18. 10.1016/j.datak.2015.04.001
34 Lorensen WE, Cline HE. Marching cubes: a high resolution 3D surface construction algorithm. ACM SIGGRAPH Comput Graph. 1987; 21: 163-169. 10.1145/37402.37422
35 Field DA. Laplacian smoothing and Delaunay triangulations. Commun Appl Numer Methods. 1988; 4: 709-712. 10.1002/cnm.1630040603
36 Smith KE, Bhatia G, Vannier MW. Assessment of mass properties of human head using various three-dimensional imaging modalities. Med Biol Eng Comput. 1995; 33: 278-284. 10.1007/BF02510500
37 Bernardini F, Mittleman J, Rushmeier H, Silva C, Taubin G. The ball-pivoting algorithm for surface reconstruction. IEEE Trans Vis Comput Graph. 1999; 5: 349-359. 10.1109/2945.817351
38 Fabri A, Giezeman G-J, Kettner L, Schirra S, Schönherr S. On the design of CGAL, a computational geometry algorithms library. Softw Pract Exp. 2000; 30: 1167-1202. 10.1002/1097-024X(200009)30:11<1167:AID-SPE337>3.0.CO;2-B
39 Marden S, Guivant J. Improving the performance of ICP for real-time applications using an approximate nearest neighbour search. In: Australas Conf Robot Autom (ACRA). 2012: 3-5
40 Besl PJ, McKay ND. A method for registration of 3-D shapes. In: Schenker PS (ed) Sensor Fusion IV: Control Paradigms and Data Structures. 1992: 586-606
41 Alliez P, De Verdière ÉC, Devillers O, Isenburg M. Isotropic surface remeshing. In: Proc Shape Model Int (SMI 2003). 2003: 49-58. 10.1109/SMI.2003.1199601
42 Rusu RB, Cousins S. 3D is here: Point Cloud Library (PCL). In: 2011 IEEE Int Conf Robot Autom. IEEE. 2011: 1-4. 10.1109/ICRA.2011.5980567
43 Geladi P, Kowalski BR. Partial least-squares regression: a tutorial. Anal Chim Acta. 1986; 185: 1-17. 10.1016/0003-2670(86)80028-9
44 Lindgren F, Geladi P, Wold S. The kernel algorithm for PLS. J Chemom. 1993; 7: 45-59. 10.1002/cem.1180070104
45 Dayal BS, MacGregor JF. Improved PLS algorithms. J Chemom. 1997; 11: 73-85. 10.1002/(SICI)1099-128X(199701)11:1<73:AID-CEM435>3.0.CO;2-#
46 Free 3D Models (n.d.). www.free3d.com
47 Myronenko A, Song X. Point set registration: coherent point drift. IEEE Trans Pattern Anal Mach Intell. 2010; 32: 2262-2275. 10.1109/TPAMI.2010.46
48 Floater MS. Mean value coordinates. Comput Aided Geom Des. 2003; 20: 19-27. 10.1016/S0167-8396(03)00002-5
49 Aspert N, Santa-Cruz D, Ebrahimi T. MESH: measuring errors between surfaces using the Hausdorff distance. In: Proc IEEE Int Conf Multimed Expo. IEEE. 2002: 705-708. 10.1109/ICME.2002.1035879
50 Pandzic IS, Forchheimer R. MPEG-4 Facial Animation. 2003
51 Prendergast PM. Facial anatomy. In: Advanced Surgical Facial Rejuvenation: Art and Clinical Practice. 2012: 3-14. 10.1007/978-3-642-17838-2_1
52 Sederberg TW, Parry SR. Free-form deformation of solid geometric models. ACM SIGGRAPH Comput Graph. 1986; 20: 151-160. 10.1145/15886.15903
53 Nguyen TN, Dakpe S, Ho Ba Tho MC, Dao TT. Real-time computer vision system for tracking simultaneously subject-specific rigid head and non-rigid facial mimic movements using a contactless sensor and system of systems approach. Comput Methods Programs Biomed. 2020: 105410. 10.1016/j.cmpb.2020.105410
54 Godoy JL, Vega JR, Marchetti JL. Relationships between PCA and PLS-regression. Chemom Intell Lab Syst. 2014; 130: 182-191. 10.1016/j.chemolab.2013.11.008
55 Scarselli F, Gori M, Tsoi AC, Hagenbuchner M, Monfardini G. The graph neural network model. IEEE Trans Neural Netw. 2009; 20: 61-80. 10.1109/TNN.2008.2005605
56 Bhalodia R, Elhabian SY, Kavan L, Whitaker RT. DeepSSM: a deep learning framework for statistical shape modeling from raw images. In: Int Workshop Shape Med Imaging. 2018: 244-257
57 Reuter M, Wachinger C, Lombaert H, Paniagua B, Lüthi M, Egger B (eds). Shape in Medical Imaging: International Workshop, ShapeMI 2018, held in conjunction with MICCAI 2018, Granada, Spain, September 20, 2018, Proceedings. Springer. 2018
58 Bronstein MM, Bruna J, LeCun Y, Szlam A, Vandergheynst P. Geometric deep learning: going beyond Euclidean data. IEEE Signal Process Mag. 2017; 34: 18-42. 10.1109/MSP.2017.2693418
59 Dao TT, Pouletaut P, Lazáry Á, Ho Ba Tho MC. Multimodal medical imaging fusion for patient specific musculoskeletal modeling of the lumbar spine system in functional posture. J Med Biol Eng. 2017; 37: 739-749
By Tan-Nhu Nguyen, Vi-Do Tran, Ho-Quang Nguyen, and Tien-Tuan Dao
Tan-Nhu Nguyen is currently a PhD student in Biomedical Engineering. His current research interest is muscle modeling coupled with serious game for facial rehabilitation.
Vi-Do Tran received the PhD in BioRobotics, Scuola Superiore Sant'Anna di Pisa, Italy, in 2019. His research interests are in the fields of rehabilitation robotics, assistive technologies, human–robot interaction, and biomechanical simulation.
Ho-Quang Nguyen obtained PhD in Biomechanics at the Université de technologie de Compiègne (UTC, France) in 2017. His current research interests are patient-specific modeling of musculoskeletal system, computational biomechanics, and rehabilitation engineering.
Tien-Tuan Dao received the Habilitation diploma in 2015 at the University of Technology of Compiègne (France). His research interests concern computational biomechanics, knowledge and system engineering, and in silico medicine.