Book review: Pro Perl Parsing by Christopher M. Frenz


Pro Perl Parsing is a well-written book on the various ways of pulling information out of sources such as HTML, RSS, XML, CSV, the command line and plain text. More precisely, the book discusses the extraction of data, and some analysis of it, via Perl. The author, Christopher M. Frenz, understands the value of using CPAN modules and describes parsing through pleasantly brief code examples.

The book’s cover

My first impression of the book was one of to-the-core efficiency. Pro Perl Parsing covers what is required to understand concepts such as context-free grammars and regular expression pattern matching, and even manages to squeeze in a little data mining.

Accurate and focused, this book does not have an ounce of fat.

The contents

Frenz’s well-balanced book packs a great deal of information into its relatively short 252 pages. He starts with a “show us the money” chapter on the important subject of parsing and regular expressions and then moves quickly through the details of how grammar parsers actually work. I got the distinct impression of being led straight to the main points, with the author then dealing accurately with the required background concepts.

There was a slight slowdown on reaching chapters four and five, which were harder to digest due to the underlying complexities of correctly configuring Parse::Yapp and Parse::RecDescent. The pace picks up again after this, with the fun chapters on pulling information from the Internet and data mining. I particularly liked example 6-1 (page 147), which, within a few lines of easy-to-understand code, pulled and processed the temperature for a particular location from Weather.com.

Sadly for me, I have to admit that, despite thinking I knew this subject area well, I managed to pick up a trick or two from Pro Perl Parsing. Data mining is a pet love of mine, and chapter ten delivered a couple of practical hints that I will add to some of my prototype code later. For example, listing 10-1 (page 221) briefly explains the use of Statistics::Descriptive for finding the mean and standard deviation of a dataset. Had I known of this module earlier, I would not have written a number of custom functions myself (and on more than one occasion).
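To give a flavour of what Statistics::Descriptive offers, here is a minimal sketch of computing a mean and standard deviation with it. This is not the book's listing 10-1; the helper name `summarize` and the sample readings are hypothetical, made up purely for illustration, and the module must be installed from CPAN first.

```perl
use strict;
use warnings;
use Statistics::Descriptive;

# Hypothetical helper (not from the book): returns the mean and the
# standard deviation of a list of numbers.
sub summarize {
    my @values = @_;
    my $stat = Statistics::Descriptive::Full->new();
    $stat->add_data(@values);
    return ($stat->mean(), $stat->standard_deviation());
}

# Made-up sample readings, for illustration only.
my @readings = (2, 4, 4, 4, 5, 5, 7, 9);
my ($mean, $sd) = summarize(@readings);
printf "mean = %.2f, standard deviation = %.2f\n", $mean, $sd;
```

Note that the module reports the sample standard deviation (dividing by n - 1), which is one of those details that is easy to get subtly wrong in a hand-rolled function.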

Who’s this book for?

This book has a number of target audiences. System administrators who script in Perl may gain some insight into improving their parsing of command line options or configuration files. Scientists, teachers and students should find value in the coverage of data manipulation and mining.
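As a rough idea of the kind of command line parsing a system administrator might tidy up, the sketch below uses the core Getopt::Long module. It is not taken from the book; the option names (`--input`, `--limit`, `--verbose`) and the helper `parse_options` are hypothetical choices for this example.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: parses a list of arguments and returns a hash
# reference of options plus an array reference of leftover arguments.
sub parse_options {
    my @args = @_;
    my %opt  = (verbose => 0, input => undef, limit => 10);

    GetOptionsFromArray(
        \@args,
        'verbose!' => \$opt{verbose},   # boolean flag, also accepts --noverbose
        'input=s'  => \$opt{input},     # option taking a string value
        'limit=i'  => \$opt{limit},     # option taking an integer value
    ) or die "Could not parse command line options\n";

    # Anything that was not recognised as an option is left in @args.
    return (\%opt, \@args);
}

my ($opt, $rest) = parse_options(@ARGV);
printf "input=%s limit=%d verbose=%d extra=%d\n",
    $opt->{input} // '(none)', $opt->{limit}, $opt->{verbose}, scalar @$rest;
```

Declaring the options in one table like this keeps defaults, types and names together, which tends to be easier to maintain than hand-written loops over @ARGV.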

Relevance to free software

Since its inception in 1987, Perl has been free and has accumulated a particularly large following of addicted developers. This thoroughly RAD (rapid application development) scripting language has been ported to possibly more platforms than Java. Pro Perl Parsing addresses one of Perl’s significant strengths: the language is well known as a leading choice for report generation and ad hoc data parsing.

This quick-to-script language has great influence on the proper running of many Linux servers. If you look around a typical server, you will find code for log rotation, report generation, general cleanup scripts and a number of command line tools. Even in this mundane everyday environment, an understanding of Perl parsing is handy at the very least.

Looking further afield, programming data parsing and crunching is a fundamental skill for most scientists, who also need to choose between proprietary and open languages or tools. This book therefore helps the momentum of the free software movement at a number of levels.

Pros

Accurate and to the point, this book describes the parsing of data via CPAN modules correctly and without fuss. It is a good, solid read and not so thick as to be off-putting.

Cons

If you are looking for a hardcore Perl regex book, then you may find this book light on the subject. However, you may be surprised by how relevant parsing is to your problem domain, so please read the table of contents before passing this one over.

Title: Pro Perl Parsing
Author: Christopher M. Frenz
Publisher: Apress
ISBN: 1590595041
Year: 2005
Pages: 272
CD included: No
FS Oriented: 8
Overall score: 9

Author information

Biography

Alan Berg, BSc, MSc, PGCE, has been a lead developer at Central Computer Services at the University of Amsterdam for the last eight years. In his spare time, he writes computer articles. He has a degree, two master’s degrees and a teaching qualification. In previous incarnations, he was a technical writer, an Internet/Linux course writer, and a science teacher. He likes to get his hands dirty with the building and gluing of systems. He remains agile by playing computer games with his kids, who (sadly) consistently beat him physically, mentally and morally.

You may contact him at reply.to.berg At chello.nl
