Saturday, August 30, 2008

xargs and find


find and xargs go very well together: find locates what you're looking for, and xargs runs the same command on each of the things found.

Traditionally, an advantage of xargs was its ability to cope with argument lists too long to fit on a single command line. This command:

rm `find tmp -maxdepth 1 -name '*.mp3'`
is intended to remove all tmp/*.mp3 files (and ignore any subdirectories), but can fail with an "Argument list too long" message. This near equivalent:
find tmp -maxdepth 1 -name '*.mp3' | xargs rm
does the same job but avoids the problem by batching the arguments up, running rm more than once if necessary. More modern kernels (since 2.6.23) allow much longer argument lists, so the error is rarer there, but it's wise to make your scripts as portable as possible; and the xargs version is also easier on the eye.
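
To try this without risking real files, here is a minimal sketch that works in a scratch directory. The names demo_dir, arg_max and remaining are made up for the example, and it assumes mktemp -d is available and that your find supports -maxdepth:

```shell
# Scratch directory so nothing real is deleted (demo_dir is a hypothetical name).
demo_dir=$(mktemp -d)
mkdir "$demo_dir/sub"
touch "$demo_dir/a.mp3" "$demo_dir/b.mp3" "$demo_dir/notes.txt" "$demo_dir/sub/c.mp3"

# Upper bound, in bytes, on the arguments plus environment passed to one command.
arg_max=$(getconf ARG_MAX)
echo "ARG_MAX on this system: $arg_max"

# Remove only the top-level .mp3 files; xargs splits the list into as many
# rm invocations as it needs to stay under the limit.
find "$demo_dir" -maxdepth 1 -name '*.mp3' | xargs rm

# List what survived: notes.txt and sub/c.mp3 should still be there.
remaining=$(find "$demo_dir" -type f | sort)
echo "$remaining"
```

When you're done, rm -r "$demo_dir" clears the scratch directory away.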

You can also manually batch arguments if needed, using the -n option.

find tmp -maxdepth 1 -name '*.mp3' | xargs -n1 rm
will pass one argument at a time to rm. This is also useful if you're using the -p option, which prompts before each command is run, as you can confirm one file at a time rather than all at once.
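
A harmless way to see what -n does is to swap rm for echo. This is only an illustration; the five words and the variable name batched are made up for the example:

```shell
# Five sample arguments, at most two per invocation of echo.
batched=$(printf '%s\n' one two three four five | xargs -n 2 echo)
echo "$batched"
# one two
# three four
# five
```

Each output line corresponds to one run of echo, so the five arguments are spread over three runs.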

Filenames containing whitespace can also cause problems; xargs and find can deal with this, using GNU extensions to both to break on the null character rather than on whitespace:

find tmp -maxdepth 1 -name '*.mp3' -print0 | xargs -0 rm
You must use these options either on both find and xargs or on neither, or you'll get odd results.
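
The difference is easy to see in a scratch directory. The sketch below is an illustration only: ws_dir, naive_count and safe_count are hypothetical names, and it assumes mktemp -d plus a find and xargs that support -print0 and -0. It counts arguments with echo first, so nothing is deleted until the safe version runs:

```shell
# Scratch directory with one awkward filename (ws_dir is a hypothetical name).
ws_dir=$(mktemp -d)
touch "$ws_dir/my song.mp3" "$ws_dir/plain.mp3"

# Splitting on whitespace: "my song.mp3" arrives as two separate arguments.
naive_count=$(find "$ws_dir" -maxdepth 1 -name '*.mp3' | xargs -n 1 echo | wc -l | tr -d ' ')

# Splitting on the null character: one argument per file.
safe_count=$(find "$ws_dir" -maxdepth 1 -name '*.mp3' -print0 | xargs -0 -n 1 echo | wc -l | tr -d ' ')

echo "whitespace-split arguments: $naive_count, null-split arguments: $safe_count"

# The null-delimited form removes both files cleanly.
find "$ws_dir" -maxdepth 1 -name '*.mp3' -print0 | xargs -0 rm
```

With two files on disk, the first count comes out one higher than the second, which is exactly the kind of odd result that bites when rm is on the receiving end.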

Another common use of xargs with find is to combine it with grep. For example,

find . -name '*.pl' | xargs grep -L '^use strict'
will search all the *.pl files in the current directory and subdirectories, and print the names of any that don't have a line starting with 'use strict'. Enforce good practice in your scripting!
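
Here is a small sketch of that check against two throwaway scripts. The directory pl_dir and the files good.pl and bad.pl are invented for the example, and it assumes mktemp -d is available:

```shell
# Two sample Perl scripts in a scratch directory (all names are hypothetical).
pl_dir=$(mktemp -d)
printf 'use strict;\nprint "ok\\n";\n' > "$pl_dir/good.pl"
printf 'print "no strict here\\n";\n' > "$pl_dir/bad.pl"

# grep -L prints the names of files with NO matching line.
missing=$(find "$pl_dir" -name '*.pl' | xargs grep -L '^use strict')
echo "$missing"
```

Only bad.pl should be listed. If your script names might contain whitespace, the -print0 and -0 pair described above applies here too.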

