Given a facial image, the objective is to estimate the age of the person in the image.
Example images from three popular datasets (MORPH, FG-NET and CACD), with the age of each subject shown above.
The network is based on a concept called differentiable regression forests. Unlike traditional regression forests, which perform hard data partitions, differentiable regression forests perform soft data partitions, so that an input-dependent partition function can be learned to handle heterogeneous data. In addition, differentiable regression forests can be seamlessly integrated with any deep network, which makes it possible to build an end-to-end deep age estimation model, named Deep Regression Forests (DRFs). The outline of the architecture is given below.
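To make the idea of a soft data partition concrete, here is a minimal NumPy sketch of how a single tree could route one input. It is an illustration under stated assumptions, not the authors' implementation: the function names, the breadth-first node layout, the sigmoid split function and the use of one scalar age per leaf are all choices made for this example. In a full model the split logits would come from the CNN output.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def leaf_probabilities(split_logits):
    """Soft routing through a full binary tree.

    split_logits: shape (2**depth - 1,), one logit per split node in
    breadth-first order (hypothetical layout chosen for this sketch).
    Returns the probability of reaching each leaf, shape (2**depth,).
    """
    n_split = split_logits.shape[0]
    n_leaf = n_split + 1
    go_left = sigmoid(split_logits)  # probability of taking the left branch
    # reach[i] = probability of arriving at node i (heap indexing, root = 0)
    reach = np.ones(n_split + n_leaf)
    for i in range(n_split):
        reach[2 * i + 1] = reach[i] * go_left[i]
        reach[2 * i + 2] = reach[i] * (1.0 - go_left[i])
    return reach[n_split:]  # the last n_leaf nodes are the leaves

def predict_age(split_logits, leaf_ages):
    """Expected age: leaf estimates weighted by the routing probabilities."""
    return float(leaf_probabilities(split_logits) @ leaf_ages)
```

Because every sample reaches every leaf with some probability, the prediction is a smooth function of the split logits, which is what allows gradients to flow back into the network. A hard partition would instead send each sample down exactly one path.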
Training and Implementation details
An alternating optimization strategy is adopted:
First, the leaf nodes are fixed, and the data partitions at the split nodes are optimized together with the CNN parameters (feature learning) by back-propagation.
Then, the split nodes are fixed, and the data abstractions at the leaf nodes (local regressors) are optimized by variational bounding.
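The second step of the alternation can be sketched as follows. This is a simplified stand-in, not the exact variational bounding update: it assumes each leaf is a Gaussian with a shared, fixed standard deviation `sigma`, and updates only the leaf means with a responsibility-weighted average while the routing probabilities stay fixed. The function name, the argument layout and the value of `sigma` are hypothetical.

```python
import numpy as np

def update_leaf_means(routing, ages, leaf_means, sigma=5.0):
    """One leaf update with the split nodes (routing) held fixed.

    routing:    (n_samples, n_leaves), probability of each sample reaching
                each leaf; rows sum to 1.
    ages:       (n_samples,), ground-truth ages.
    leaf_means: (n_leaves,), current leaf estimates.
    sigma:      assumed shared standard deviation (illustrative value).
    """
    # Gaussian likelihood of each age under each leaf; the normalizing
    # constant cancels because sigma is shared across leaves.
    diff = ages[:, None] - leaf_means[None, :]
    likelihood = np.exp(-0.5 * (diff / sigma) ** 2)
    # Responsibility of each leaf for each sample.
    resp = routing * likelihood
    resp = resp / np.maximum(resp.sum(axis=1, keepdims=True), 1e-300)
    weight = resp.sum(axis=0)
    weighted_sum = (resp * ages[:, None]).sum(axis=0)
    # Leaves that receive no samples keep their previous estimate.
    return np.where(weight > 1e-12,
                    weighted_sum / np.maximum(weight, 1e-12),
                    leaf_means)
```

In a full training loop this update would alternate with gradient steps on the split nodes and CNN parameters, each step treating the other set of parameters as constant.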
The implementation of DRFs is based on the publicly available Caffe framework.
VGG-16 Net is used for the CNN part of the proposed DRFs.
As part of the pre-processing, faces are first detected with a standard face detector, and facial landmarks are then localized with an active appearance model (AAM).
The performance of age estimation is evaluated in terms of mean absolute error (MAE) as well as Cumulative Score (CS).
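Both metrics are simple to compute from predicted and ground-truth ages. The sketch below assumes the usual definitions: MAE is the average absolute difference in years, and CS at a threshold is the fraction of test images whose absolute error does not exceed that threshold. The function names and the default threshold of 5 years are illustrative choices, not taken from the text above.

```python
import numpy as np

def mean_absolute_error(true_ages, predicted_ages):
    """Average absolute difference between predicted and true ages (years)."""
    true_ages = np.asarray(true_ages, dtype=float)
    predicted_ages = np.asarray(predicted_ages, dtype=float)
    return float(np.mean(np.abs(predicted_ages - true_ages)))

def cumulative_score(true_ages, predicted_ages, threshold=5):
    """Fraction of samples whose absolute error is at most `threshold` years."""
    true_ages = np.asarray(true_ages, dtype=float)
    predicted_ages = np.asarray(predicted_ages, dtype=float)
    errors = np.abs(predicted_ages - true_ages)
    return float(np.mean(errors <= threshold))

# Usage with made-up values: absolute errors are 2, 0, 7 and 6 years.
true_ages = [20, 30, 40, 50]
predicted_ages = [22, 30, 47, 44]
mae = mean_absolute_error(true_ages, predicted_ages)
cs5 = cumulative_score(true_ages, predicted_ages, threshold=5)
```

A lower MAE is better, while a higher CS is better. CS is often reported as a percentage, which is the value returned here multiplied by 100.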