Springer, 2018. — 373 p. — ISBN 978-3-319-91814-3.
This book discusses text mining and the ways this type of data mining can be used to extract implicit knowledge from text collections. Alongside concepts and approaches, the author provides guidelines for implementing text mining systems in Java. The book begins with detailed text preprocessing techniques, then covers the concepts, techniques, implementation, and evaluation of text categorization. It then moves on to more advanced topics, including text summarization, text segmentation, topic mapping, and automatic text management.
Foundation
Introduction
Definition of Text Mining
Texts
Data Mining Tasks
Data Mining Types
Text Indexing
Overview of Text Indexing
Steps of Text Indexing
Text Indexing: Implementation
Additional Steps
Text Encoding
Overview of Text Encoding
Feature Selection
Feature Value Assignment
Issues of Text Encoding
Text Association
Overview of Text Association
Data Association
Word Association
Text Association
Overall Summary
Text Categorization
Text Categorization: Conceptual View
Definition of Text Categorization
Data Classification
Classification Types
Variants of Text Categorization
Summary and Further Discussions
Text Categorization: Approaches
Machine Learning
Lazy Learning
Probabilistic Learning
Kernel Based Classifier
Summary and Further Discussions
Text Categorization: Implementation
System Architecture
Class Definitions
Method Implementations
Graphic User Interface and Demonstration
Summary and Further Discussions
Text Categorization: Evaluation
Evaluation Overview
Text Collections
F1 Measure
Statistical t-Test
Summary and Further Discussions
Text Clustering
Text Clustering: Conceptual View
Definition of Text Clustering
Data Clustering
Clustering Types
Derived Tasks from Text Clustering
Summary and Further Discussions
Text Clustering: Approaches
Unsupervised Learning
Simple Clustering Algorithms
K Means Algorithm
Competitive Learning
Summary and Further Discussions
Text Clustering: Implementation
System Architecture
Class Definitions
Method Implementations
Class: ClusterAnalysisAPI
Summary and Further Discussions
Text Clustering: Evaluation
Cluster Validations
Clustering Index
Parameter Tuning
Summary and Further Discussions
Advanced Topics
Text Summarization
Definition of Text Summarization
Text Summarization Types
Approaches to Text Summarization
Combination with Other Text Mining Tasks
Summary and Further Discussions
Text Segmentation
Definition of Text Segmentation
Text Segmentation Type
Machine Learning-Based Approaches
Derived Tasks
Summary and Further Discussions
Taxonomy Generation
Definition of Taxonomy Generation
Relevant Tasks to Taxonomy Generation
Taxonomy Generation Schemes
Taxonomy Governance
Summary and Further Discussions
Dynamic Document Organization
Definition of Dynamic Document Organization
Online Clustering
Dynamic Organization
Issues of Dynamic Document Organization
Summary and Further Discussions