O’Reilly Media, Inc., 2022. — 193 p. — ISBN: 978-1-492-07706-0.
Most data scientists and engineers today rely on high-quality labeled data to train their machine learning models. However, building training sets manually is time-consuming and expensive, leaving many companies with unfinished ML projects. There's a more practical approach. In this book, Amit Bahree, Senja Filipi, and Wee Hyong Tok from Microsoft show you how to create products using weakly supervised learning models.
You'll learn how to build natural language processing and computer vision projects using weakly labeled datasets from Snorkel, a spin-off from the Stanford AI Lab. Because so many companies pursue ML projects that never go beyond their labs, this book also provides a guide on how to ship the deep learning models you build.
Get up to speed on the field of weak supervision, including ways to use it as part of the data science process.
Use Snorkel AI for weak supervision and data programming.
Get code examples for using Snorkel to label text and image datasets.
Use a weakly labeled dataset for text and image classification.
Learn practical considerations for using Snorkel with large datasets and using Spark clusters to scale labeling.