Rijeka: InTech, 2011. — 596 p. — ISBN: 978-953-307-547-1.
Data mining, a branch of computer science and artificial intelligence, is the process of extracting patterns from data. It is seen as an increasingly important tool for transforming huge amounts of data into knowledge that gives an informational advantage. Reflecting this view, data mining is often considered just one step in a larger process known as knowledge discovery in databases (KDD). Data mining is currently used in a wide range of practices, from business to scientific discovery. The progress of data mining technology and its wide public popularity establish a need for a comprehensive text on the subject. The series of books entitled ‘Data Mining’ addresses this need by presenting in-depth descriptions of novel mining algorithms and many useful applications.
The first book (New Fundamental Technologies in Data Mining) is organized into two parts. The first part presents database management systems (DBMS). Before data mining algorithms can be used, a target data set must be assembled. As data mining can only uncover patterns already present in the data, the target data set must be large enough to contain these patterns. For this purpose, some unique DBMS have been developed over the past decades. They consist of software that operates databases, providing storage, access, security, backup and other facilities. DBMS can be categorized according to the database model they support, such as relational or XML; the types of computer they support, such as a server cluster or a mobile phone; the query languages that access the database, such as SQL or XQuery; and performance trade-offs, such as maximum scale or maximum speed, among other criteria.
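As a rough illustration of assembling a target data set from a relational DBMS with SQL, the following minimal sketch uses Python's built-in sqlite3 module. The table names, column names, and sample rows are hypothetical and do not come from the book; the sketch only shows the general idea of joining stored records into one row per entity before any mining algorithm is applied.

```python
import sqlite3


def build_target_dataset():
    """Join two hypothetical tables into one row per customer for later mining."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT)")
    conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(1, "north"), (2, "south"), (3, "north")],
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(1, 20.0), (1, 35.0), (2, 15.0)],
    )
    # LEFT JOIN keeps customers without orders, so the data set stays complete.
    rows = conn.execute(
        """
        SELECT c.id, c.region, COUNT(o.amount), COALESCE(SUM(o.amount), 0.0)
        FROM customers AS c
        LEFT JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.id, c.region
        ORDER BY c.id
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in build_target_dataset():
        print(row)
```

Each returned tuple holds the customer id, region, number of orders, and total amount, which is the kind of flat table that clustering or classification routines usually expect as input.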
The second part explains new data analysis techniques. Data mining involves the use of sophisticated data analysis techniques to discover relationships in large data sets. In general, these techniques involve four classes of tasks: (1) Clustering is the task of discovering groups and structures in the data that are in some way similar, without using known structures in the data. Data visualization tools are typically applied after clustering operations. (2) Classification is the task of generalizing known structure to apply to new data. (3) Regression attempts to find a function that models the data with the least error. (4) Association rule learning searches for relationships between variables.
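The four classes of tasks can be sketched on toy data as follows. This is a minimal illustration using only the Python standard library, not the algorithms presented in the book: the function names, the one-dimensional k-means variant, the nearest-neighbor classifier, and the sample values are all assumptions chosen for brevity.

```python
from statistics import mean


def kmeans_1d(values, centers, iterations=10):
    """Clustering: group 1-D values around the nearest center (no labels used)."""
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Keep the old center if a group ends up empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


def nearest_neighbor(train, point):
    """Classification: give a new point the label of its closest labeled example."""
    _, label = min(train, key=lambda pair: abs(pair[0] - point))
    return label


def fit_line(xs, ys):
    """Regression: least-squares slope and intercept for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


def rule_stats(transactions, antecedent, consequent):
    """Association rules: support and confidence of antecedent -> consequent."""
    both = sum(1 for t in transactions if antecedent in t and consequent in t)
    ante = sum(1 for t in transactions if antecedent in t)
    support = both / len(transactions)
    confidence = both / ante if ante else 0.0
    return support, confidence


if __name__ == "__main__":
    print(kmeans_1d([1, 2, 3, 10, 11, 12], centers=[0, 5]))
    print(nearest_neighbor([(1, "low"), (2, "low"), (11, "high")], 9))
    print(fit_line([0, 1, 2, 3], [1, 3, 5, 7]))
    baskets = [{"bread", "butter"}, {"bread"}, {"bread", "butter", "milk"}, {"milk"}]
    print(rule_stats(baskets, "bread", "butter"))
```

Clustering works without labels, classification reuses known labels for new points, regression minimizes squared error, and the association rule statistics measure how often two items occur together.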
The second book (Knowledge-Oriented Applications in Data Mining) introduces several scientific applications of data mining. Data mining is used for a variety of purposes in both the private and public sectors. Industries such as banking, insurance, medicine, and retailing use data mining to reduce costs, enhance research, and increase sales. For example, pharmaceutical companies use data mining of chemical compounds and genetic material to help guide research on new treatments for diseases. In the public sector, data mining applications were initially used as a means to detect fraud and waste, but they have also grown to be used for purposes such as measuring and improving program performance. It has been reported that data mining has helped the federal government recover millions of dollars in fraudulent Medicare payments.
In data mining, there are implementation and oversight issues that can influence the success of an application. The first issue is data quality, which refers to the accuracy and completeness of the data. The second issue is the interoperability of the data mining techniques and databases being used by different people. The third issue is mission creep, or the use of data for purposes other than those for which the data were originally collected. The fourth issue is privacy. Questions that may be considered include the degree to which government agencies should use and mix commercial data with government data, and whether data sources are being used for purposes other than those for which they were originally designed.
In addition to covering each part in depth, the two books present useful hints and strategies for solving problems in the following chapters. The contributing authors have highlighted many future research directions that will foster multi-disciplinary collaborations and hence lead to significant development in the field of data mining.
Database Management Systems
Derya Birant
Service-Oriented Data Mining
Filipe Mota Pinto and Teresa Guarda
Database Marketing Process Supported by Ontologies: A Data Mining System Architecture Proposal
Sujni Paul
Parallel and Distributed Data Mining
Ying Su
Modeling Information Quality Risk for Data Mining and Case Studies
Simon Fong and Yang Hang
Enabling Real-Time Business Intelligence by Stream Mining
Oscar Marban, José Gallardo, Gonzalo Mariscal and Javier Segovia
From the Business Decision Modeling to the Use Case Modeling in Data Mining Projects
David He, Eric Bechhoefer, Mohammed Al-Kateb, Jinghua Ma, Pradnya Joshi and Mahindra Imadabathuni
A Novel Configuration-Driven Data Mining Framework for Health and Usage Monitoring Systems
Jing-song Li, Hai-yan Yu and Xiao-guang Zhang
Data Mining in Hospital Information System
Habiba Mejhed, Samia Boussaa and Nour el houda Mejhed
Data Warehouse and the Deployment of Data Mining Process to Make Decision for Leishmaniasis in Marrakech City
Viswanathan, Whangbo, and Yang
Data Mining in Ubiquitous Healthcare
Roberto Llorente and Maria Morant
Data Mining in Higher Education
M. Šimůnek and J. Rauch
EverMiner – Towards Fully Automated KDD Process
Georges Edouard KOUAMOU
A Software Architecture for Data Mining Environment
Henrique Santos, Manuel Filipe Santos and Wesley Mathew
Supervised Learning Classifier System for Grid Data Mining
New Data Analysis Techniques
Jean-Charles LAMIREL
A New Multi-Viewpoint and Multi-Level Clustering Paradigm for Efficient Data Mining Tasks
Yuichi Yaguchi, Takashi Wagatsuma and Ryuichi Oka
Spatial Clustering Technique for Data Mining
Angel Kuri-Morales and Edwyn Aldana-Bobadilla
The Search for Irregularly Shaped Clusters in Data Mining
Bo Long and Zhongfei (Mark) Zhang
A General Model for Relational Clustering
Marcel Jirina and Marcel Jirina, Jr.
Classifiers Based on Inverted Distances
Keiji Gyohten, Hiroaki Kizu and Naomichi Sueda
2D Figure Pattern Mining
Sai Peck Lee and Chuan Ho Loh
Quality Model based on Object-oriented Metrics and Naive Bayes
Deok Hee Nam
Extraction of Embedded Image Segment Data Using Data Mining with Reduced Neurofuzzy Systems
Mehdi Toloo and Soroosh Nalchigar
On Ranking Discovered Rules of Data Mining by Data Envelopment Analysis: Some New Models with Applications
Paul Cotofrei and Kilian Stoffel
Temporal Rules Over Time Structures with Different Granularities - a Stochastic Approach
Donald E. Brown
Data Mining for Problem Discovery
Hidenao Abe
Development of a Classification Rule Mining Framework by Using Temporal Pattern Extraction
Rasha Shaker Abdul-Wahab
Evolutionary-Based Classification Techniques
Akira Oyama and Kozo Fujii
Multiobjective Design Exploration in Space Engineering
Xinjing Ge and Jianming Zhu
Privacy Preserving Data Mining
Jean-François Mari, Florence Le Ber, El Ghali Lazrak, Marc Benoît, Catherine Eng, Annabelle Thibessard and Pierre Leblond
Using Markov Models to Mine Temporal and Spatial Data