2nd ed. — SAS Institute Inc., 2008. — 273 p. — ISBN: 1599946599, 9781599946597
Although this book is titled Cody’s Data Cleaning Techniques Using SAS, I hope that it is more than that. Beyond discovering ways to detect data errors, you will also be exposed to some DATA step programming techniques and SAS procedures that might be new to you.
I have been teaching a two-day data cleaning workshop for SAS, based on the first edition of this book, for several years. I have thoroughly enjoyed traveling to interesting places and meeting other SAS programmers who have a need to find and fix errors in their data. This experience has also helped me identify techniques that other SAS users will find useful.
There have been some significant changes in SAS since the first edition was published, specifically SAS 9. SAS 9 includes many new functions that make the task of finding and correcting data errors much easier. In addition, SAS 9 allows you to create integrity constraints and audit trails. Integrity constraints are rules about your data that are stored in the data descriptor portion of a SAS data set. These rules cause data that violates any of the constraints to be rejected when you try to add it to an existing data set. SAS can also create an audit trail data set that shows which new observations were added and which observations were rejected, along with the reason for their rejection.
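
As a rough illustration of integrity constraints and audit trails, here is a minimal sketch. The data set names (patients, new_patients), the variables (patno, gender, hr), the constraint names, and the allowed ranges are all hypothetical and chosen only for this example; they are not taken from the book.

```sas
/* Hypothetical sample data: patient number, gender, heart rate */
data work.patients;
   input patno $ gender $ hr;
datalines;
001 M 72
002 F 68
;
run;

proc datasets library=work nolist;
   modify patients;
   /* Check constraints are stored in the data set's descriptor portion */
   ic create gender_chk = check (where=(gender in ('F','M')))
      message = "Gender must be F or M";
   ic create hr_chk = check (where=(hr between 40 and 100 or hr is missing))
      message = "HR must be between 40 and 100 or missing";
run;
   /* Start an audit trail for the same data set */
   audit patients;
   initiate;
run;
quit;

/* Hypothetical new observations; two of them break a constraint */
data work.new_patients;
   input patno $ gender $ hr;
datalines;
003 F 80
004 X 75
005 M 250
;
run;

/* Only observations that satisfy every constraint should be added */
proc append base=work.patients data=work.new_patients;
run;

/* List the audit trail, including the _ATOPCODE_ and _ATMESSAGE_ variables */
proc print data=work.patients(type=audit);
run;
```

Under these assumptions, the observations for patients 004 and 005 should be rejected by the APPEND procedure, and the audit trail listing should show each attempted addition together with an operation code and, for rejected observations, the message explaining the rejection.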