Non-linear feature extraction from image data in neural networks
The goal of this project is to investigate the feature extraction capabilities of neural networks. Traditionally, solving a pattern recognition task involved two steps. First, a set of features had to be found that describes the object(s) to be classified. Only then was a classification mechanism chosen and optimized. These two steps are highly interdependent, since the choice of features influences the conditions under which a classifier operates and vice versa.
With the advent of neural networks, however, more and more problems are solved by simply feeding large amounts of 'raw data' (e.g. images, sound signals, stock market index ranges) to a neural network. During training, the network learns how much weight to give to each feature. The exact nature of this feature extraction process is not clear, however, owing to the vast number of interconnections, and thus weights, in neural networks.
The goal of this project therefore is to study the (non-linear) feature extraction processes taking place in neural networks, putting special emphasis on using image data. Comparisons can be made between man- and machine-generated filters or templates in different tasks.
This project resulted in D. de Ridder's M.Sc. thesis. The main conclusion of this thesis is that while this type of network performs well, it is still outperformed by the traditional 1-nearest neighbour method and by large non-restrained feed-forward neural networks. In light of this fact, it seems that the claim of many researchers that a network is a local shift-invariant feature extraction mechanism since it performs well:
    local shift-invariant feature extraction <-> good performance
is not valid. The 1-nearest neighbour method performs better, yet it clearly is not extracting features. Therefore:
    local shift-invariant feature extraction -> good performance
    good performance -/-> local shift-invariant feature extraction
Furthermore, an attempt was made to study feature extraction using artificially generated images. It was shown that a trade-off exists between the trainability of a simple network and its understandability: the larger the network, the easier it is to train but the harder it is to understand.
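To illustrate why the 1-nearest neighbour method mentioned above cannot be said to extract features, the following minimal Python sketch classifies an image by comparing its raw pixel values directly with every stored training image. The function name, the toy 2x2 images and the choice of Euclidean distance are assumptions made for illustration only; they are not taken from the thesis.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def nearest_neighbour_label(train_images, train_labels, query):
    """Return the label of the training image closest to `query`.

    Images are flat sequences of pixel intensities. No features are
    computed: raw pixel vectors are compared directly.
    """
    best_index = min(
        range(len(train_images)),
        key=lambda i: dist(train_images[i], query),
    )
    return train_labels[best_index]


# Toy 2x2 "images", flattened row by row (made-up data for illustration).
train_images = [
    [1.0, 1.0, 0.0, 0.0],  # bright top row
    [0.0, 0.0, 1.0, 1.0],  # bright bottom row
]
train_labels = ["top", "bottom"]

# A noisy image with a bright top row.
predicted = nearest_neighbour_label(train_images, train_labels,
                                    [0.9, 0.8, 0.1, 0.0])
```

The classifier stores the training set as is and has no weights or filters to inspect, which is the sense in which good performance alone does not imply feature extraction.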
e-mail: dick@ph.tn.tudelft.nl
Last update: October 23, 2000