When the command line "grep" wins

When the command line "grep" wins


I was working on a big website recently and faced a really tedious job in editing the content. I needed to find and replace certain words, like 'southeastern' for 'southeast', scattered over something like 140 files in half a dozen folders.

What to do? Install a powerful Content Management System with lots of menus and a global editing tool? Nope. I took the easy way out, using the tools at hand. The trick was to remember that webpages, regardless of how complicated they look in a browser, are really only plain text files.

I opened a terminal and cd'ed to the directory containing the website. Then I entered

grep -r "southeast " .

grep finds patterns in text. In this case, it looked for 'southeast', and the space in "southeast " made sure that grep didn't look for 'southeastern' or 'southeasterly'. The -r option meant that grep would look through all the files in the website directory, recursively. The trailing period . meant that grep would start searching from the current directory.

And report it did, in about a zillionth of a second. It gave the path to each file containing 'southeast', followed by the complete line of text containing that word. Scanning the results, I could see on which pages I needed to do southeast -> southeastern.

I fired up my favourite text editor, gedit, in the same workspace as the terminal. gedit is a multi-tab editor, meaning you can work on a lot of text files at a time, switching from tab to tab. I opened each of the webpages identified by grep and me as needing a fix, one webpage per tab.

Gedit's search and replace function

Next, a cool bit. gedit has a search-and-replace tool which opens in a little window above the text to be edited. This window is persistent - it stays open and usable until you close it. I opened the search-and-replace window Ctrl + h over the first tab, entered 'southeast' in the 'Search for' space and 'southeastern' in the 'Replace with' space. Now for keyboard fun: Alt+a (does the replacing), Ctrl+s (saves the revised text on that tab), Ctrl+w (closes that tab). The search-and-replace window is still there, waiting to do its job over the next tab. Do Alt+a, Crtl+s, Ctrl+w again. And again, until all tabs are closed and all pages edited.

That took much longer to write about than to do. My tedious website editing job was done in just a few minutes. Thank you, grep, and thank you, gedit.

Category: 
License: 

Comments

imgx64's picture
Submitted by imgx64 on

A better solution would be using sed (with the '-i' option) instead of Gedit. Also, if you combine it with find (with '-exec' or with xargs), you can automate the whole process completely in one line. No need to apply it to each file manually.

Another thing, instead of relying on 'southeast' having a space after it, you can use '\b', which matches a word boundary even if it's not followed by a space (for example, in "southeast, southwestern, or southern", your method won't work). '\b' is available in Perl and GNU sed, but I don't know about grep or other sed implementations.

Bob Mesibov's picture

Hi, imgx64.

sed wasn't an option, and neither was automating that particular process. As I mentioned in the note, I had to scan the grep results to see where and whether that replacement was appropriate. I wasn't replacing every instance of the searched-for string, just selected ones, something that I often do in webpage edits. I've done sed replacements and found I'd messed up strings that should have been left alone!

Author information

Bob Mesibov's picture

Biography

I'm retired and live in Penguin, Tasmania, Australia. I love tinkering and prefer working with a keyboard, not a mouse.