Final rel. - Manning Publications, 2022. - 305 p. - ISBN 1617298034.
In Algorithms and Data Structures for Massive Datasets, you'll discover methods for reducing and sketching data so it fits in small memory without losing accuracy, and unlock the algorithms and data structures that form the backbone of a big data system. Data structures and algorithms that are great for traditional software may quickly slow or fail altogether when applied to huge datasets.
Algorithms and Data Structures for Massive Datasets introduces a toolbox of new techniques that are perfect for handling modern big data applications. Filled with fun illustrations and examples from real-world businesses, you'll learn how each of these complex techniques can be practically applied to maximize the accuracy and throughput of big data processing and analytics.
About this book.
About the cover illustration.
Hash-Based Sketches
Review of hash tables and modern hashing.
Approximate membership: Bloom and quotient filters.
Frequency estimation and count-min sketch.
Cardinality estimation and HyperLogLog.
Real-Time Analytics
Streaming data: Bringing everything together.
Sampling from data streams.
Approximate quantiles on data streams.
Data Structures for Databases and External Memory Algorithms
Introducing the external memory model.
Data structures for databases: B-trees, Bε-trees, and LSM-trees.
External memory sorting.
References.
Index.
True PDF