Learn some command line: using du, df, file, find to make your life easier


I love the command line. If the command line were a dog, it would be a hard-headed labrador: big and somewhat intimidating, but really kind of even-tempered and friendly once she gets to know you.

I just compared the command line to my dog Roscoe. I love them both, and they both frustrate me.

I can't do much with Roscoe, but I can help out a bit with the command line. And so allow me to introduce four of my favorite utilities: df, du, file, and find.

Filesystem sizes with df

This one is easy. According to the man page, df stands for, "report file system disk space usage." I say it stands for, "disk free." But what do I know?

$ df -h

The -h tells df to report in human-readable numbers. Here, "human-readable" means "human-readable if you know the difference between G and M and K." You can also use -k (report in kilobytes) or -m (report in megabytes) if you desire. It's all up to you.

df -h gives us something like this:

Filesystem            Size  Used Avail Use% Mounted on
/dev/sda2             7.4G  4.6G  2.4G  66% /
varrun               1014M  128K 1014M   1% /var/run
varlock              1014M     0 1014M   0% /var/lock
procbususb           1014M  108K 1014M   1% /proc/bus/usb
udev                 1014M  108K 1014M   1% /dev
devshm               1014M     0 1014M   0% /dev/shm
/dev/sda4              61G  7.3G   51G  13% /home
/dev/sda1              40G   17G   23G  43% /media/sda1
/dev/scd0             7.8G  7.8G     0 100% /media/cdrom0

The first column is the device. For disks, this will be something like /dev/sdaN or /dev/hdaN, where N is a small number. Those other filesystems with names like udev or devshm or varrun are OS-specific. This output was taken from a GNU/Linux box running a 2.6.20 kernel.

The middle three columns show the total size, the amount used, and the amount available, just like the title says. The Use% column indicates the total percentage used. Generally, you don't want that to read 100%, except for CDs and DVDs, which will always show 100%. The final column tells you where in your directory hierarchy the filesystem is mounted.

That's it for the Very Short Tour of df.
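
If you would rather have the machine watch the Use% column for you, here is a minimal sketch. It assumes a df that accepts the POSIX -P option (one line per filesystem, fixed columns). The function name full_filesystems and the limit of 90 percent are made up for illustration; they are not something df provides.

```shell
# Print the mount point and Use% of every filesystem above a limit.
# Reads df -P style output on stdin; the limit (a percentage) is $1.
# Caveat: a device or mount point containing spaces would shift the columns.
full_filesystems() {
    awk -v limit="$1" '
        NR > 1 {                    # skip the header line
            use = $5
            sub(/%/, "", use)       # "66%" becomes "66"
            if (use + 0 > limit) print $6, $5
        }'
}

df -P | full_filesystems 90
```

On a box like the one above, this would be expected to single out the CD-ROM, which is harmless, and stay quiet about everything else.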

Directory sizes with du

Suppose df reports a filesystem is full, and you need to find the culprit fast. Let's say for illustrative purposes the filesystem is /home. Here's one of my favorite commands of all time:

$ du -k /home | sort -n

Now, technically that's two commands. du stands for "estimate file space usage," though I hate the word "usage," because "use" will almost always work instead. I like to call it "disk use," for hopefully obvious reasons. The -k specifies reporting in kilobytes, rather than filesystem blocks. You can also use -m, which specifies megabytes, if you like smaller numbers. Do not use the -h option. -h means, "print in human-readable form," which will break our nifty sort operation.

The '|' (official name: "bar thingy") means "pipe." "Pipe" means, "take the output of this command, and pass it to the next command." In even simpler terms, this means "route STDOUT (standard out) of the first program to STDIN (standard in) of the next program."

sort sorts lines of data, just as the name implies. It isn't short for "somehow order random text" or anything like that. It just means, "sort." The -n option specifies to sort as if the first word were a number, rather than to sort it ASCIIbetically. For fun, try the sort without the -n. You'll quickly observe that "1" sorts before "101" which sorts before "2." For our purposes, the -n is quite important.
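
The difference is easy to see on a toy list. This sketch pipes the three numbers mentioned above through sort both ways; nothing in it is specific to du.

```shell
# Three numbers, deliberately out of order.
printf '2\n101\n1\n' | sort       # text order: 1, 101, 2
printf '2\n101\n1\n' | sort -n    # numeric order: 1, 2, 101
```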

On my machine, the du command above gives this output:

4       /home/tony/.config/xfce4/orage
4       /home/tony/.config/xfce4/xfwm4
4       /home/tony/docs/fsm
4       /home/tony/docs/stories/speleology
4       /home/tony/.gimp-2.2/brushes
     .
     .
     .
512564  /home/tony/src
685672  /home/tony/tmp/zips
714508  /home/tony/tmp/iso
789240  /home/tony/tmp/tony
813236  /home/tony/video/roscoe
881512  /home/tony/video/family
1694756 /home/tony/video
3835596 /home/tony/tmp
7442492 /home/tony
7442496 /home

As you can see, I have a lot of stuff in /home/tony/tmp. I would look there for things to remove to free up space.
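
Most of the time only the big end of that list matters. A minimal sketch, assuming you want the largest directories first: reverse the sort with -r and trim the list with head. The function name biggest_dirs is made up for this example.

```shell
# Show the N largest directories under a starting point, biggest first.
# Usage: biggest_dirs DIRECTORY [COUNT]   (COUNT defaults to 10)
biggest_dirs() {
    # 2>/dev/null hides "Permission denied" noise from unreadable directories.
    du -k "$1" 2>/dev/null | sort -rn | head -n "${2:-10}"
}

biggest_dirs /home 10
```

The first line will always be the starting directory itself, since its total includes everything beneath it.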

What kind of file is it?

Unlike some operating systems, GNU/Linux (like Unix-like operating systems in general) doesn't use filename extensions to determine the type of a file. So, a text file does not have to end in .txt, and a jpeg-encoded image file does not have to end in .jpg. Instead, there is a nifty utility called file that will report the filetype for you.

It's really pretty easy to use:

$ file blah.c
blah.c: ASCII C program text

It's really that simple.

Of course, it uses magic. /etc/magic. Really. I'm not kidding.
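
To see that file really does ignore the name, here is a small sketch that dresses a text file up as a JPEG. The scratch directory and the file name are made up for the example, and the exact wording file prints varies between versions, so treat the result as "some flavor of text" rather than a fixed string.

```shell
# Work in a scratch directory so nothing real is touched.
scratch=$(mktemp -d)

# A plain text file wearing a misleading .jpg extension.
printf 'Roscoe is a good dog.\n' > "$scratch/roscoe.jpg"

# file looks at the contents, not the name; -b drops the "name:" prefix.
kind=$(file -b "$scratch/roscoe.jpg")
echo "$kind"

rm -r "$scratch"
```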

Finding files

Find is one of the unsung heroes of the Free software world. Many do not appreciate the functional finesse, the streamlined beauty of this perfect utility. Find can search for files based on name, on size, on ownership, on permissions, on modification time, on access time, on... well, just about anything. Combined with other utilities, you can search on content or file type.

For instance, to find all files ending in .c:

$ find /home -name \*.c -print

The /home tells find to start the search in the /home directory. The -name \*.c specifies the pattern for which to search. The * means "anything," followed by .c, which means just that: search for anything ending in .c. (The backslash keeps the shell from expanding the * before find gets to see it.) The -print is the "predicate;" that is, the action we wish to perform on the things we find. We can do more than just print out filenames.

This gives the following output:

/home/tony/src/gnome/gnome-columns/src/jewel.c
/home/tony/src/gnome/gnome-columns/src/texture.c
/home/tony/src/gnome/gnome-columns/src/renderable.c
/home/tony/src/gnome/gnome-columns/src/rectangle.c
/home/tony/src/gnome/gnome-columns/src/gnome-columns.c
/home/tony/src/gnome/gnome-columns/src/gameboard.c

(There was really a lot more output, but I wanted to keep the display simple.)
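
If the backslash looks odd, quoting the pattern does the same job. The sketch below tries it on a tiny scratch tree (all the names are made up for this example) and adds -type f so that only regular files are reported.

```shell
# Build a tiny scratch tree to search.
tree=$(mktemp -d)
mkdir "$tree/src"
touch "$tree/src/jewel.c" "$tree/src/jewel.h" "$tree/notes.c"

# Quoting the pattern works like the backslash: the shell leaves the *
# alone and find does the matching. -type f limits hits to regular files.
c_files=$(find "$tree" -type f -name '*.c' | sort)
echo "$c_files"

rm -r "$tree"
```

Only the two .c files should be listed; jewel.h is left out.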

Search for files that have been recently changed:

$ find . -ctime -1 -print

This time I specified the start directory as ".", which is the current directory. I've specified the search criteria as -ctime -1, which means "change time, less than one day ago." Again, I specified -print. Here is the output:

.
./blah

It returned only one file, blah (plus ".", the current directory itself). How boring.
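
One thing worth knowing: the "change time" that -ctime tests is updated whenever the file's contents or its metadata (permissions, ownership, timestamps) change, while -mtime looks only at when the contents were last modified. A minimal sketch of the difference, using touch -t to backdate a file in a scratch directory (the file names are made up):

```shell
scratch=$(mktemp -d)
touch "$scratch/fresh"                  # modified just now
touch -t 200701010000 "$scratch/stale"  # pretend it was last modified in 2007

# -mtime -1: contents modified less than one day ago. Only fresh qualifies.
recent=$(find "$scratch" -type f -mtime -1)
echo "$recent"

# -ctime -1: both files qualify, because backdating stale with touch
# changed its metadata just now.
changed=$(find "$scratch" -type f -ctime -1 | sort)
echo "$changed"

rm -r "$scratch"
```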

Let's do something a little more interesting. Let's look for all PDFs in my home directory:

$ find ~ -exec file {} \; | grep PDF

I commanded find to start in my home directory by using the squiggly, '~'. (Actually, it's called a "tilde.") Then I told it to execute a command, using -exec file {} \;. The -exec is a predicate to cause find to execute a command, in this case, file. The '{}' bit means, "substitute the filename here." When find generates the command, it'll be something like, "file /home/tony/stupidname.ext", because the shell has already expanded the tilde. The '\;' bit marks the end of the executable command. Then I pipe the output to grep, which prints only the lines containing "PDF".

There are better ways of doing this, especially using a command called xargs, but I don't cotton to those new-fangled methods. Well, I do, but you must first learn to crawl before you can fly on the space shuttle.

Here's the output:

/home/tony/src/beerhacker/Documentation/BeerXML_v2_01.pdf: PDF document, version 1.3
/home/tony/src/e17/docs/ewlbook/pre-rendered/ewlbook.pdf: PDF document, version 1.3
/home/tony/src/e17/docs/ewlbook/pre-rendered/ewlbook.es.pdf: PDF document, version 1.3
/home/tony/src/e17/docs/cookbook/pre-rendered/eflcookbook.pdf: PDF document, version 1.3
/home/tony/src/e17/docs/cookbook/pre-rendered/eflcookbook.fr.pdf: PDF document, version 1.3
/home/tony/src/e17/docs/cookbook/pre-rendered/eflcookbook.es.pdf: PDF document, version 1.3
/home/tony/src/e17/docs/cookbook/pre-rendered/eflcookbook.pt-BR.pdf: PDF document, version 1.3
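
For the curious, here is one sketch of a quicker variant. Ending -exec with + instead of \; lets find hand many filenames to each run of file, rather than starting file once per name. The function name find_pdfs is made up, and the scratch file holds nothing but a PDF header line, which should be enough for file to call it a PDF.

```shell
# List files under a directory whose contents look like PDF, whatever
# they are named. Usage: find_pdfs DIRECTORY
find_pdfs() {
    # -type f skips directories; grepping for "PDF document" cuts down on
    # false hits from files that merely have PDF in their names.
    find "$1" -type f -exec file {} + | grep 'PDF document'
}

# Try it on a scratch directory holding one disguised PDF.
scratch=$(mktemp -d)
printf '%%PDF-1.3\n' > "$scratch/stupidname.ext"
printf 'just text\n' > "$scratch/plain.txt"
pdfs=$(find_pdfs "$scratch")
echo "$pdfs"
rm -r "$scratch"
```

For the real thing, run find_pdfs ~ instead. The xargs route looks like find ~ -type f -print0 | xargs -0 file, where -print0 and -0 (supported by the GNU and BSD tools) keep filenames with spaces intact.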

Finally, let's use find to delete all our old emacs backup files. WARNING! DANGER, WILL ROBINSON! THIS IS VERY DANGEROUS! Be very careful when using find to do file manipulation. Always print out the results of find before executing a dangerous command.

First, do this:

$ find . -name \*~ -print

This prints all the files that end in ~, starting in the current directory. Once you are sure you won't miss these files, do this:

$ find . -name \*~ -exec rm {} \;
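
If you want to rehearse before doing it for real, here is a minimal sketch that runs the same two steps in a scratch directory. The file names are made up, and the -type f test is an addition that keeps find from ever handing a directory to rm.

```shell
# Practice in a scratch directory first.
scratch=$(mktemp -d)
touch "$scratch/notes.txt" "$scratch/notes.txt~" "$scratch/odd name.c~"

# Step 1: look before you leap.
find "$scratch" -type f -name '*~' -print

# Step 2: the same search, with rm as the action. {} + passes many
# names to a single rm, and copes with spaces in filenames.
find "$scratch" -type f -name '*~' -exec rm {} +

# Only notes.txt should be left.
remaining=$(ls "$scratch")
echo "$remaining"
rm -r "$scratch"
```

For a confirmation prompt before each deletion, swap -exec for -ok and end the command with \; instead of +.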

That's it! You are now wise in the ways of a couple of minor file utilities. As always, enjoy playing around with them. Be safe. Don't run with scissors, or shave with a rusty razor. Remember that cats have five pointy ends, and that with powerful knowledge comes powerful responsibility. Don't abuse these tools, and they will treat you right until the end of your days.

Comments

Submitted by Anonymous visitor (not verified)

Anthony!

Although you mention xargs, I would like to give an example of the simplest implementation of a recursive grep (PATTERN stands for whatever text you are searching for):
find . -type f | xargs grep -i PATTERN

Submitted by Anonymous visitor (not verified)

Maybe it's the resolution or something else on my computer but I can't seem to make out what the sign in front of each command is. It doesn't look like an "&" or "$". So what is it?

Submitted by Anthony Taylor

The '$' ("dollar sign") signifies a standard shell prompt. That is, several of the canonical shells (bash and ksh, for instance) use the dollar sign to indicate the shell is waiting for input.

You don't type it in. It's used in this article to signify that the user types in the rest of the line.

Output from various commands is designated by lack of the dollar sign.

Submitted by Scott Carpenter

I've dimly known about df from using it at work. ("Hey, what's that command again that shows disk usage?") But now I will have more occasion to use it on my emerging home GNU/Linux systems. I also think discus is a neat little df "prettifier."

----
http://www.movingtofreedom.org/

Submitted by Anonymous visitor (not verified)

Just a note that a lot of these commands will give you feedback on what they're doing if you ask them for it. For instance,

find . -name \*~ -print | xargs rm -v

will tell which files it's deleting. Just an extra bit of reassurance that you've done the right thing. :)

Lawrence D'Oliveiro

Submitted by Anonymous visitor (not verified)

I use find with the -exec arg:
find . -ctime +10 -exec ls -l {} \;

Meaning:
Find, from this point (.), the files whose change time (-ctime) is more than 10 days ago (+10), and execute (-exec) a command (ls -l) on each one. The argument for this command is passed from find as "{}". To end the command, write an escaped ";" (\;).

On my servers I use this to process a lot of files, in combination with awk (I missed it in this article), to make batch job files.

Flavio Camus

Submitted by mobilemail

I've been a Windows user since DOS/3.1, but I'm just now learning the guts of Linux. Practical clues like this make it easier.

Submitted by Anonymous visitor (not verified)

Thanks for a wonderful article.
It will definitely help to make my life in Linux easier.
The find command especially seems really powerful.

Submitted by Anonymous visitor (not verified)

How do I execute a command and also print its args into a file, using xargs and find?

Submitted by Anthony Taylor

That's an easy one. Use the "tee" command.

"tee" will copy STDIN to a file, and spit it back to STDOUT. This essentially makes a copy of STDIN in the new file. So, pipe STDIN through "tee" before passing it on to "xargs", like this:

find . -name \*\.c | tee c.list | xargs grep -l sprintf

This would list all the files that contain the (slightly dangerous) "sprintf" function, while saving a list of all the files checked in c.list.

Submitted by Anonymous visitor (not verified)

I'm more used to the AIX version than the Linux version for "find". I'll have to try a couple of your other commands (du vs. df) as they seem pretty relevant to our current situation. (AIX guru retired, all the kids are left to play and learn on our own).

On a different note, is this Tony Taylor that used to work / go to school at UAF ?

-KS

Submitted by Anthony Taylor

Well, I imagine it depends on who you are, but since I went to UAF, I am most likely that very same Tony Taylor.

If you want, contact me at tony (at the following domain) paperdove.org. I'd love to get in contact with old friends from UAF.

Author information

Anthony Taylor

Biography

Tony Taylor was born, causing his mother great discomfort, and has lived his life ever since. He expects to die some day. Until that day, he hopes to continue writing, and living out his childhood dream of being a geek.