Books on Demand GmbH, 2000 — 252 p. — ISBN: 9783898118613,3898118614.
Today, there is an increased need to extract information for decision-making from a large collection of data. This transformation of data into knowledge is an interactive and iterative process of various subtasks and decisions and is called Knowledge Discovery from Data. The central part of Knowledge Discovery is Data Mining.
For more sophisticated data mining, the most important goal is to limit user involvement in the entire process to supplying well-known a priori knowledge, making the process more automated and more objective. Soft computing, i.e., Fuzzy Modeling, Neural Networks, Genetic Algorithms, and other methods of automatic model generation, is a way to mine data by generating mathematical models from empirical data more or less automatically.
In recent years there has been much publicity about the ability of Artificial Neural Networks to learn and to generalize, despite important problems with the design, development, and application of Neural Networks:
Neural Networks have no explanatory power by default to describe why results are as they are. This means that the knowledge (models) extracted by Neural Networks is still hidden and distributed over the network.
There is no systematic approach to designing and developing Neural Networks. It is a trial-and-error process.
Training of Neural Networks is a kind of statistical estimation often using algorithms that are slower and less effective than algorithms used in statistical software.
If noise is considerable in a data sample, the generated models systematically tend to be overfitted.
In contrast to Neural Networks that use Genetic Algorithms as an external procedure to optimize the network architecture and several pruning techniques to counteract overtraining, the new approach described in this book introduces principles of evolution - inheritance, mutation, and selection - to generate a network structure systematically, enabling automatic model structure synthesis and model validation. Models are generated from the data in the form of networks of active neurons in an evolutionary fashion: populations of competing models of growing complexity are repeatedly generated, validated, and selected until a model of optimal complexity - not too simple and not too complex - has been created. That is, a treelike network grows out of seed information (input and output variables data) through pairwise combination and survival-of-the-fittest selection, from a simple single individual (neuron) to a desired final, not overspecialized behavior (model). Neither the number of neurons and layers in the network nor the actual behavior of each created neuron is predefined. All this is adjusted during the process of self-organization, which is why the approach is called self-organizing data mining.
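The layer-by-layer growth described above can be illustrated with a minimal sketch. It is not the book's implementation: the two-input neuron with an interaction term, the hold-out mean squared error used as the selection criterion, and all names (`fit_neuron`, `grow_network`, `keep`, `max_layers`) are assumptions chosen for illustration, and the data are synthetic.

```python
import itertools
import numpy as np


def fit_neuron(xi, xj, y):
    """Least-squares fit of y ~ a0 + a1*xi + a2*xj + a3*xi*xj (assumed neuron form)."""
    design = np.column_stack([np.ones_like(xi), xi, xj, xi * xj])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def apply_neuron(coef, xi, xj):
    return coef[0] + coef[1] * xi + coef[2] * xj + coef[3] * xi * xj


def next_layer_inputs(X, survivors):
    """Outputs of the surviving neurons become the inputs of the next layer."""
    return np.column_stack(
        [apply_neuron(c, X[:, i], X[:, j]) for _, i, j, c in survivors]
    )


def grow_network(X_train, y_train, X_val, y_val, keep=3, max_layers=5):
    """Add layers of pairwise neurons until the validation error stops improving."""
    layers, best_err = [], np.inf
    for _ in range(max_layers):
        candidates = []
        # Pairwise combination: one candidate neuron per pair of current inputs.
        for i, j in itertools.combinations(range(X_train.shape[1]), 2):
            coef = fit_neuron(X_train[:, i], X_train[:, j], y_train)
            pred = apply_neuron(coef, X_val[:, i], X_val[:, j])
            candidates.append((float(np.mean((pred - y_val) ** 2)), i, j, coef))
        if not candidates:
            break  # fewer than two inputs left, nothing to combine
        # Selection: keep the neurons with the lowest hold-out error.
        candidates.sort(key=lambda c: c[0])
        survivors = candidates[:keep]
        if survivors[0][0] >= best_err:
            break  # more complexity no longer helps on unseen data
        best_err = survivors[0][0]
        layers.append(survivors)
        X_train = next_layer_inputs(X_train, survivors)
        X_val = next_layer_inputs(X_val, survivors)
    return layers, best_err


def predict(layers, X):
    """Pass inputs through all layers (at least one); the best final neuron is the output."""
    for survivors in layers:
        X = next_layer_inputs(X, survivors)
    return X[:, 0]


# Synthetic example: the target mixes an interaction term and a linear term.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))
y = 1.0 + 2.0 * X[:, 0] * X[:, 1] - X[:, 2] + 0.1 * rng.normal(size=300)
X_tr, y_tr, X_va, y_va = X[:200], y[:200], X[200:], y[200:]
layers, best_err = grow_network(X_tr, y_tr, X_va, y_va)
print(len(layers), best_err)
```

In this sketch neither the number of layers nor the inputs of any neuron is fixed in advance. Both follow from which candidates survive the comparison on data not used for fitting, and this is the mechanism that keeps the final model from becoming overspecialized.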
Knowledge Discovery from Data
Models and their application in decision-making.
Relevance and value of forecasts.
Theory-driven approach.
Data-driven approach.
Data mining.
Self-organizing Data Mining
Involvement of users in the data mining process.
Automatic model generation.
Regression-based models.
Rule-based modeling.
Symbolic modeling.
Nonparametric models.
Self-organizing data mining.
Self-organizing Modeling Technologies
Statistical Learning Networks.
Inductive approach - The GMDH algorithm.
Induction.
Principles.
Model of optimal complexity.
Parametric GMDH Algorithms
Elementary models (neurons).
Generation of alternate model variants.
Nets of active neurons.
Criteria of model selection.
Validation.
Nonparametric Algorithms
Objective Cluster Analysis.
Analog Complexing.
Self-organizing Fuzzy Rule Induction.
Logic-based rules.
Application of Self-organizing Data Mining
Spectrum of self-organizing data mining methods.
Choice of appropriate modeling methods.
Application fields.
Synthesis.
Software tools.
KnowledgeMiner
General features.
GMDH implementation.
Elementary models and active neurons.
Generation of alternate model variants.
Criteria of model selection.
Systems of equations.
Analog Complexing implementation.
Features.
Example.
Fuzzy Rule Induction implementation
Fuzzification.
Rule induction.
Defuzzification.
Example.
Using models
The model base.
Finance module.
Sample Applications.
From Economics
National economy.
Stock prediction.
Balance sheet.
Sales prediction.
Solvency checking.
Energy consumption.
From Ecology
Water pollution.
Water quality.
From other Fields
Heart disease.
U.S. congressional voting behavior.