Book review: Regular Expression Recipes by Nathan A. Good


I’ll admit right up front that I am something of a regular expression junkie. Years before I even knew such a system existed (before the days of the internet) I wrote my own regular expression system to handle the needs of a free-text database management package. Today, we are all familiar with regular expressions in Perl, sed, awk/gawk and even in “user” applications like email and word processors.

Despite the utility of the regular expression library used in these systems, getting your regular expression right can be a major exercise. With the wrong expression you can match, or replace, the wrong text and in some applications that could have dire consequences.
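
As a rough illustration of how easily an expression can hit the wrong text, here is a minimal Python sketch. The sample sentence and the words being swapped are invented for this example and do not come from the book.

```python
import re

text = "concatenate the cat pictures"

# Too loose: "cat" also matches inside "concatenate".
loose = re.sub(r"cat", "dog", text)

# Word boundaries (\b) restrict the match to the standalone word.
strict = re.sub(r"\bcat\b", "dog", text)

print(loose)   # condogenate the dog pictures
print(strict)  # concatenate the dog pictures
```

The only difference between the two calls is the pair of `\b` anchors, which is exactly the kind of detail that is easy to forget.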

Apress have released a new title, Regular Expression Recipes, written by Nathan A. Good, that attempts to solve some of the riddles and complexities of the regular expression writing process by providing example regular expressions to be used in different situations and against a variety of problems.

For each problem, a full range of solutions is provided in different languages (Perl, sed, etc.), using regular expressions and/or scripts relevant to that environment.

The contents

Good has used an interesting approach to organizing the book. One of the major problems with regular expressions is that they are used in a variety of applications, sometimes with slight differences. Although the majority of modern applications use a derivation of the Perl regular expression library, some use their own, often out of a need for historical compatibility.

The book therefore starts off with a look at the various regular expression systems and a range of handy tables summarizing the differences between Perl, grep and Vim, which support the three main regular expression systems. Handier still, the book goes on to add details on how to use regular expressions in popular environments, such as Perl, Python, PHP, Vim, grep and sed.

This list of tools is relevant to the rest of the book as each example problem and regular expression solution is followed by two or more examples using the tools. In many cases these are full scripts or examples, although for some a simple fragment would be enough to get the idea. In addition, all of the scripts and regular expressions are explained and their operation detailed so that you understand why they work.

The bulk of the book is then split up into specific areas of issues that can be resolved with regular expressions, beginning with the more obvious area of basic text manipulation. Here we find the common (but not always straightforward) issues of finding words and lines and then finding and replacing text, for example capitalizing the first letter of a word, before moving on to the more complex issues of finding text in, or around, quotes.
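
To give a flavour of this kind of recipe, here is a minimal Python sketch of two of the tasks mentioned: capitalizing the first letter of each word and pulling out text between quotes. The function names and patterns are my own assumptions, not the book's solutions.

```python
import re

def capitalize_words(text):
    """Upper-case the first letter of every word."""
    # \b matches at a word boundary; [a-z] is the lower-case letter after it.
    return re.sub(r"\b[a-z]", lambda m: m.group(0).upper(), text)

def quoted_phrases(text):
    """Return the text found between pairs of straight double quotes."""
    # [^"]* matches everything up to the next closing quote.
    return re.findall(r'"([^"]*)"', text)

print(capitalize_words("regular expression recipes"))
print(quoted_phrases('say "hello" and "good bye"'))
```

Note that `\b` also treats an apostrophe as a boundary, so a word such as "don't" would need a stricter pattern. That is a good example of why these problems are not always straightforward.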

Also handy in this section are a series of examples on replacing ASCII and non-ASCII characters with their equivalents. For example, replacing smart quotes with straight quotes or copyright and trademark symbols with a textual equivalent (e.g. ™ with (tm)).
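
A replacement of this kind might look something like the following Python sketch. The mapping table is a hypothetical one chosen for illustration, and the `(c)` substitute for the copyright sign is my assumption; the book's own recipes may handle a different set of characters.

```python
import re

# Hypothetical mapping; extend it with whatever characters your text contains.
REPLACEMENTS = {
    "\u201c": '"',     # left double quotation mark
    "\u201d": '"',     # right double quotation mark
    "\u2018": "'",     # left single quotation mark
    "\u2019": "'",     # right single quotation mark
    "\u2122": "(tm)",  # trademark sign
    "\u00a9": "(c)",   # copyright sign
}

# One alternation that matches any of the characters in the table.
PATTERN = re.compile("|".join(re.escape(ch) for ch in REPLACEMENTS))

def to_plain_text(text):
    """Replace each mapped character with its plain-text equivalent."""
    return PATTERN.sub(lambda m: REPLACEMENTS[m.group(0)], text)

print(to_plain_text("\u201cRecipes\u201d\u2122"))
```

Building the pattern from the dictionary keys keeps the expression and the replacement table in step with each other.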

With the basics out of the way, the book moves on to more interesting topics. I won’t list all the examples, or even all the different topic groups into which they are placed. I will say, however, that there isn’t an example here that I thought was superfluous. Some highlights and personal favourites include validating credit card numbers, extracting HTML attributes and a whole bunch on reformatting code.
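
For readers who have not met these problems before, the Python sketch below shows roughly what two of them involve. It is not the book's code: the 16-digit card layout, the added Luhn checksum step and the double-quoted attribute syntax are all simplifying assumptions of mine.

```python
import re

# Four groups of four digits, separated consistently by nothing, a space or a hyphen.
CARD_SHAPE = re.compile(r"^\d{4}([ -]?)\d{4}\1\d{4}\1\d{4}$")

# name="value" pairs; single-quoted and unquoted values are ignored here.
ATTRIBUTE = re.compile(r'([\w-]+)\s*=\s*"([^"]*)"')

def looks_like_card_number(value):
    """Check the layout with a regex, then apply the Luhn checksum."""
    if not CARD_SHAPE.match(value):
        return False
    digits = [int(ch) for ch in value if ch.isdigit()]
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:      # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0

def html_attributes(tag):
    """Return the double-quoted attributes of a single tag as a dict."""
    return dict(ATTRIBUTE.findall(tag))

print(looks_like_card_number("4111 1111 1111 1111"))
print(html_attributes('<a href="index.html" class="nav-link">'))
```

The regular expression can only confirm that the input has the right shape. The checksum is ordinary arithmetic, which is why recipes like this usually pair an expression with a short script.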

As stated earlier, for each problem, a full range of solutions is provided in different languages (Perl, sed, etc.), using different regular expressions and scripts relevant to that environment. For example, one of the recipes validates dates, and example scripts and expressions are provided in Perl, PHP, grep and Vim.
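
A date check along these lines could be sketched in Python as follows. The YYYY-MM-DD format and the function name are assumptions made for this example; the book may validate other formats.

```python
import re
from datetime import date

# Assumed format: YYYY-MM-DD with a month of 01-12 and a day of 01-31.
DATE_SHAPE = re.compile(r"^(\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$")

def is_valid_date(value):
    """Check the shape with a regex, then the calendar with datetime."""
    match = DATE_SHAPE.match(value)
    if match is None:
        return False
    year, month, day = (int(part) for part in match.groups())
    try:
        date(year, month, day)  # rejects impossible dates such as 31 February
    except ValueError:
        return False
    return True

print(is_valid_date("2005-02-28"))
print(is_valid_date("2005-02-31"))
```

The expression alone would accept 2005-02-31, so the sketch hands the captured numbers to `datetime.date` for the final check.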

Who’s this book for?

Regular expressions are one of the interesting parts of the computing world. Technically not a programming language, regular expressions are heavily used in programming. They are also common in a number of command line tools like grep and sed. All of this makes regular expressions, and by association this book, useful for just about any “power” user. Administrators will appreciate some of the tools for help in their scripts, log file parsing and when searching for information.

For programmers in particular, the book offers a wide range of examples and samples that can be used or adapted in applications. Most of the samples can be used verbatim; others will probably benefit from direct modification according to your needs.

Pros

The amazing range of problems and their solutions would be my first reason to recommend the purchase of this book. There is a regular expression example here for everybody. My second reason for recommendation would be the range of environments demonstrated through the examples. Use regular expressions, but not a Perl programmer? No problem: not only do you get the Perl sample, you get examples in an environment with which you might be more familiar, such as Vim or PHP. By covering each regular expression, and also details about why it works and examples for key environments like Perl or sed, the book becomes more than just a regular expression tool. This range means that the book is also an advanced scripting, programmer’s and administrator’s toolkit for performing a variety of tasks.

Cons

I really couldn’t find anything wrong with this book. Occasionally, I thought a sample in a particular environment was missing, but with such detailed information on the regular expression it really isn’t that difficult to embed the expression into your own script. I’m really scraping the barrel here though; the book is without a doubt one of the best and I highly recommend it.

Title: Regular Expression Recipes
Author: Nathan A. Good
Publisher: Apress
ISBN: 159059441X
Year: 2005
Pages: 289
CD included: No
Mark: 9

Author information

Biography

Martin “MC” Brown is a member of the documentation team at MySQL and a freelance writer. He has worked with Microsoft as a Subject Matter Expert (SME), is a featured blogger for ComputerWorld, a founding member of AnswerSquad.com, Technical Director of Foodware.net, and has written books on topics as diverse as Microsoft Certification, iMacs, and free software programming.
