Cross D. Data Munging with Perl

Manning, 2001. 304 p.

Perl is something of a weekend warrior. Outside of business hours you’ll find it indulging in all kinds of extreme sports: writing haiku; driving GUIs; reviving Lisp, Prolog, Forth, Latin, and other dead languages; playing psychologist; shovelling MUDs; inflecting English; controlling neural nets; bringing you the weather; playing with Lego; even running quantum computations. But that’s not its day job.
Nine-to-five it earns its keep far more prosaically: storing information in databases, extracting it from files, reorganizing rows and columns, converting to and from bizarre formats, summarizing documents, tracking data in real time, creating statistics, doing back-up and recovery, merging and splitting data streams, logging and checkpointing computations.
In other words, munging data. It’s a dirty job, but someone has to do it. If that someone is you, you’re definitely holding the right book. In the following pages, Dave will show you dozens of useful ways to get those everyday data manipulation chores done better, faster, and more reliably. Whether you deal with fixed-format data, or binary, or SQL databases, or CSV, or HTML/XML, or some bizarre proprietary format that was obviously made up on a drunken bet, there’s help right here.
Perl is so good for the extreme stuff, that we sometimes forget how powerful it is for mundane data manipulation as well. As this book so ably demonstrates, in addition to the hundreds of esoteric tools it offers, our favourite Swiss Army Chainsaw also sports a set of simple blades that are ideal for slicing and dicing ordinary data.
Now that’s a knife!

Part I Foundations
Data, data munging, and Perl
General munging practices
Useful Perl idioms
Pattern matching
Part II Data Munging
Unstructured data
Record-oriented data
Fixed-width and binary data
Part III Simple Data Parsing
Complex data formats
HTML
XML
Building your own parsers
Part IV The Big Picture
Looking back — and ahead