In this approach, features are obtained by combining the responses of local edge detectors over neighboring positions and multiple orientations, mimicking complex cells in the primary visual cortex. These features are both flexible, tolerating small distortions of the input, and selective, preserving local feature geometry. For an input image, a set of features learned from a training set is computed, and a standard classifier is then run on the resulting feature vector. The approach is able to learn from very few examples while being simpler than existing recognition methods.
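The combination of edge-detector responses over neighboring positions and several orientations can be illustrated with a minimal NumPy sketch. The text does not specify the filters or the combining operation, so the choices here are assumptions made only for illustration: the filters are simple odd-symmetric Gaussian-weighted edge kernels, the combination is a maximum over non-overlapping neighborhoods, and the orientations are kept as separate channels. The names `oriented_edge_filters`, `edge_responses`, and `pool_neighbors` are hypothetical.

```python
import numpy as np

def oriented_edge_filters(size=5, n_orient=4):
    """Build unit-norm oriented edge kernels (illustrative stand-ins, not the source's filters)."""
    half = size // 2
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1]
    filters = []
    for k in range(n_orient):
        theta = np.pi * k / n_orient
        # Signed distance along the direction across the edge.
        u = xs * np.cos(theta) + ys * np.sin(theta)
        f = u * np.exp(-(xs ** 2 + ys ** 2) / (2.0 * (size / 4.0) ** 2))
        f -= f.mean()                 # zero mean: flat regions give no response
        f /= np.linalg.norm(f)        # unit norm: orientations are comparable
        filters.append(f)
    return np.stack(filters)          # (n_orient, size, size)

def edge_responses(image, filters):
    """Absolute filter response at every valid position, for every orientation."""
    size = filters.shape[-1]
    windows = np.lib.stride_tricks.sliding_window_view(image, (size, size))
    # windows: (H', W', size, size) -> responses: (n_orient, H', W')
    return np.abs(np.einsum("ijkl,okl->oij", windows, filters))

def pool_neighbors(responses, pool=4):
    """Assumed combining step: max over non-overlapping pool x pool neighborhoods."""
    n, h, w = responses.shape
    h2, w2 = h // pool, w // pool
    trimmed = responses[:, :h2 * pool, :w2 * pool]
    return trimmed.reshape(n, h2, pool, w2, pool).max(axis=(2, 4))

# Usage: a vertical step edge, and the same edge shifted by one pixel.
image = np.zeros((20, 20))
image[:, 10:] = 1.0
shifted = np.zeros((20, 20))
shifted[:, 11:] = 1.0

filters = oriented_edge_filters()
pooled = pool_neighbors(edge_responses(image, filters))
pooled_shifted = pool_neighbors(edge_responses(shifted, filters))
```

In this sketch the peak pooled response per orientation is the same for the original and the shifted edge, which is one way to read "flexible", while keeping a coarse grid of pooled values per orientation retains some of the local geometry.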
The system follows the standard model of object recognition in the primate cortex, a feedforward hierarchy for visual processing. In its simplest version, the standard model consists of four layers of computation units in which simple S units, which increase object selectivity, alternate with complex C units, which introduce gradual invariance to scale and translation. The model has been able to quantitatively reproduce the generalization properties of neurons in the inferotemporal cortex, which remain highly selective for particular objects while being invariant over ranges of scales and positions. The Inventors extend the standard model to learn a vocabulary of visual features from natural images. The extended model can robustly recognize many object categories and is competitive with state-of-the-art object recognition systems.
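The alternation of selective S units and invariant C units, together with a vocabulary learned from images, can be sketched as follows. This is a simplified illustration under stated assumptions, not the described system: the vocabulary is built by sampling patches at random from training images, the S units use a Gaussian similarity to each stored patch, and the C units take a maximum over all positions only (scale pooling is omitted). The names `sample_vocabulary`, `s_layer`, and `c_layer`, and the parameter `sigma`, are hypothetical, and random arrays stand in for natural images.

```python
import numpy as np

def sample_vocabulary(images, n_prototypes, patch, rng):
    """Assumed learning step: store patches sampled at random from training images."""
    protos = []
    for _ in range(n_prototypes):
        img = images[int(rng.integers(len(images)))]
        r = int(rng.integers(img.shape[0] - patch + 1))
        c = int(rng.integers(img.shape[1] - patch + 1))
        protos.append(img[r:r + patch, c:c + patch].ravel())
    return np.stack(protos)                      # (n_prototypes, patch * patch)

def s_layer(image, prototypes, patch, sigma=1.0):
    """S-type units: Gaussian similarity of every local patch to every prototype."""
    windows = np.lib.stride_tricks.sliding_window_view(image, (patch, patch))
    flat = windows.reshape(windows.shape[0], windows.shape[1], -1)
    # Squared distance between each window and each prototype.
    d2 = ((flat[:, :, None, :] - prototypes[None, None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))      # (H', W', n_prototypes)

def c_layer(s_maps):
    """C-type units: max over all positions, one value per prototype."""
    return s_maps.max(axis=(0, 1))

# Usage: the feature vector is unchanged when the object is translated.
rng = np.random.default_rng(0)
train = [rng.random((16, 16)) for _ in range(3)]   # placeholder "natural images"
vocab = sample_vocabulary(train, n_prototypes=5, patch=4, rng=rng)

image = np.zeros((24, 24))
image[8:16, 8:16] = rng.random((8, 8))             # object on an empty background
shifted = np.roll(image, (3, 2), axis=(0, 1))      # same object, translated

features = c_layer(s_layer(image, vocab, patch=4))
features_shifted = c_layer(s_layer(shifted, vocab, patch=4))
```

The resulting vector has one entry per vocabulary item and could be passed to any standard classifier. Taking the maximum over positions is what makes the two vectors agree here; a fuller version would also pool over several scales.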