Clausthal University of Technology, 2015. — 156 p.
Data Restructuring as Formal Preprocessing for Machine Learning with Neural Networks

Artificial neural networks are used in the field of machine learning to build functions that emulate expert knowledge. Feedforward networks map between data of fixed structure, recurrent networks handle sequential data such as time series, and recursive networks are used for structured data such as chemical structural formulas. Training these networks, that is, adapting their free parameters, is often difficult in practice. It is therefore an ongoing subject of research to develop special network architectures that are well suited for practical use. The LSTM architecture, for example, was designed specifically to counter the vanishing gradient, which renders gradient-descent training of recurrent networks virtually impossible.

Such a network processes its data in one fixed direction. If a learning task requires the output at a given point to depend on the following points, the task cannot be learned. The compromise of using an input window of fixed size is difficult to implement for recursive networks. Non-causal network architectures therefore exist that take context into account, meaning they include input from successors. Furthermore, bidirectional recurrent networks (BRN) were defined, which use an already given network architecture to process a sequence in two directions simultaneously. Contextual networks require constraints on their internal structure. Both architectures, contextual and bidirectional, keep the form of the input data and maintain the sequential nature of the processing.

In this work it is shown that a sequence can be mapped to tree structures such that a recurrent Elman BRN on the sequence performs the same task as a recursive Elman network (also: Simple Recurrent Network) on the trees. This sequence-to-tree mapping is generalised to tree structures, so that these can themselves be restructured bidirectionally. The restructuring is interpreted as a form-based preprocessing of the input data. Novel restructuring methods, i.e. algorithms for mapping sequences to trees, are defined. One result is a computationally efficient method for the classification of translation-invariant sequences. Furthermore, the possibility of defining a non-causal sequence-to-sequence mapping, invertible under certain conditions, is derived. One method is presented that is very easy to implement and realises the concept of divide and conquer; it is also combined with bidirectional restructuring.

All presented methods are compared against the recurrent default method, based on LSTM and Elman networks, by learning different classification problems. Networks with only three to five neurons are used. To cover a wide range of usage scenarios, synthetic and real-world data of symbolic and continuous nature serve as input. The quality of training is compared across the methods. For pattern sets with an unbalanced ratio of positive to negative patterns, an auto-balancing variant of gradient descent is presented. Furthermore, a special initialisation for the training method Resilient Backpropagation is specified. It turns out that the restructuring methods outperform the recurrent default and can succeed even where recurrent networks fail; they should therefore be considered essential for optimisation.
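To illustrate the kind of restructuring the abstract describes, the following is a minimal sketch, in Python, of a divide-and-conquer mapping from a sequence to a balanced binary tree, so that a recursive network could process the tree instead of the sequence. This is an assumption-laden illustration, not the thesis's actual algorithm; the names Node and seq_to_tree are hypothetical.

from dataclasses import dataclass
from typing import Any, Optional, Sequence

# Hypothetical tree node; not the thesis's own data structure.
@dataclass
class Node:
    label: Any                      # input element attached to this node
    left: Optional["Node"] = None   # subtree built from the left half
    right: Optional["Node"] = None  # subtree built from the right half

def seq_to_tree(seq: Sequence[Any]) -> Optional[Node]:
    """Divide and conquer: the middle element becomes the root,
    the two halves recursively become the left and right subtrees."""
    if not seq:
        return None
    mid = len(seq) // 2
    return Node(label=seq[mid],
                left=seq_to_tree(seq[:mid]),
                right=seq_to_tree(seq[mid + 1:]))

if __name__ == "__main__":
    tree = seq_to_tree("abcdefg")
    print(tree.label)        # 'd' -- the root receives context from both halves
    print(tree.left.label)   # 'b'
    print(tree.right.label)  # 'f'

A mapping of this shape reduces the path length between any two sequence elements from linear to logarithmic in the sequence length, which is one intuition for why restructuring can help where purely sequential processing fails.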