Processing Batches of Files

Objectives

  • Manipulate files by using pattern matching.

Batch Processing

In computing, batch processing has different meanings depending on the context. In general, it means running a series of tasks without user intervention between them; the tasks can vary in complexity. Without batch processing, a user must run the same task repeatedly, each time with different arguments.

Processing Batches of Files

Globbing is the process of interpreting and expanding metacharacters or wildcards to specify different files or directories. Globbing is a type of batch processing because a single command acts on many files, which avoids repetitive tasks.

To manipulate several files with one command execution, you can use metacharacters. A metacharacter is a special character that the shell uses as a placeholder for something more complex.

The most common metacharacter is the tilde (~), which the shell expands to the path of your home directory. When the command prompt displays a tilde (~), your current working directory is your home directory. In commands, a tilde (~) at the start of a path stands for the absolute path to your home directory. Add a forward slash (/) after the tilde (~) to refer to a file or directory inside the home directory.

[user@host Downloads]$ ls ~
Desktop  Documents  Downloads  Music  Pictures  Public  Templates  Videos
[user@host Downloads]$ ls ~/Desktop/commands.txt
/home/user/Desktop/commands.txt
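
As a minimal sketch of the tilde expansion described above, the following commands print what the shell substitutes for the tilde (~). The variable names home_path and desktop_path are chosen only for this illustration, and the echo command is used because it prints its arguments after the shell has expanded them, whether or not the path exists.

```shell
# The shell replaces an unquoted leading tilde before echo runs.
home_path=$(echo ~)

# A tilde followed by a slash refers to a path inside the home directory.
desktop_path=$(echo ~/Desktop)

echo "Home directory: $home_path"
echo "Desktop path:   $desktop_path"
```

The second path is the first one with /Desktop appended, which is why the tilde (~) can stand in for the absolute path to the home directory.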

The shell interprets the asterisk (*) and question mark (?) metacharacters as wildcards. Before it runs the command, the shell replaces each pattern that contains a wildcard with the names of the existing files and directories that match it. An asterisk (*) matches zero or more of any character, and a question mark (?) matches exactly one of any character.

For example, the Do*s pattern matches the Documents and Downloads directories, and the D* pattern matches the Documents, Downloads, and Desktop directories.

[user@host ~]$ file Do*s
Documents: directory
Downloads: directory
[user@host ~]$ file D*
Desktop:   directory
Documents: directory
Downloads: directory

In this example, the Document? pattern matches the Documents directory, but not the Document file.

[user@host ~]$ echo "hello world" > Document
[user@host ~]$ file Document?
Documents: directory
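
The patterns above can be tried safely in a scratch directory. The following is a sketch under stated assumptions: the workdir variable and the mktemp scratch directory are hypothetical and exist only so that no real files are touched, and echo replaces the file command because it prints the expanded names on one line.

```shell
# Hypothetical scratch directory, so that no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the three directories and the Document file from the examples.
mkdir Desktop Documents Downloads
echo "hello world" > Document

# The shell expands each pattern before echo runs.
do_s=$(echo Do*s)          # names that start with Do and end with s
d_star=$(echo D*)          # names that start with D
doc_q=$(echo Document?)    # Document followed by exactly one character

echo "Do*s      -> $do_s"      # Documents Downloads
echo "D*        -> $d_star"    # Desktop Document Documents Downloads
echo "Document? -> $doc_q"     # Documents

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```

Because the Document file already exists in this sketch, the D* pattern matches it as well, whereas the Document? pattern still skips it: the question mark (?) requires one more character.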

You can combine metacharacters. For example, the ar*t?z pattern matches the archive.tgz, archive.txz, and artwork.tgz files, but not the artwork.tar.xz file. However, the ar*t*z pattern matches the archive.tgz, archive.txz, artwork.tgz, and artwork.tar.xz files.
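
The difference between the two combined patterns can be checked with a short sketch. It assumes a hypothetical scratch directory (the workdir variable, created with mktemp) that holds four empty files named as in the paragraph above.

```shell
# Hypothetical scratch directory with the four example file names.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch archive.tgz archive.txz artwork.tgz artwork.tar.xz

# t?z: exactly one character between the t and the final z.
one_char=$(echo ar*t?z)

# t*z: any number of characters between a t and the final z.
any_chars=$(echo ar*t*z)

echo "ar*t?z -> $one_char"     # archive.tgz archive.txz artwork.tgz
echo "ar*t*z -> $any_chars"    # archive.tgz archive.txz artwork.tar.xz artwork.tgz

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```

The artwork.tar.xz file fails the first pattern because the character two places before its final z is a period, not a t.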

Batch Processes

You can use globbing to run a single command on several files. For example, you can move only PDF documents from your Downloads directory to your Documents directory.

[user@host ~]$ mv ~/Downloads/*.pdf ~/Documents
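
To see the effect of such a batch move without touching a real home directory, the following sketch repeats it in a scratch directory. The workdir variable and the file names report.pdf, invoice.pdf, and photo.png are hypothetical and are used only for illustration.

```shell
# Hypothetical scratch directory that stands in for the home directory.
workdir=$(mktemp -d)
mkdir "$workdir/Downloads" "$workdir/Documents"
touch "$workdir/Downloads/report.pdf" \
      "$workdir/Downloads/invoice.pdf" \
      "$workdir/Downloads/photo.png"

# One mv command moves every file whose name ends in .pdf.
# The asterisk stays outside the quotes so that the shell expands it.
mv "$workdir"/Downloads/*.pdf "$workdir/Documents"

# List what each directory contains after the move.
moved=$(cd "$workdir/Documents" && echo *)
left=$(cd "$workdir/Downloads" && echo *)
echo "Documents: $moved"    # invoice.pdf report.pdf
echo "Downloads: $left"     # photo.png

# Clean up the scratch directory.
rm -rf "$workdir"
```

Only the files that match the pattern are moved; photo.png stays in the Downloads directory.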

Preventing Expansion

When you need to use an asterisk (*) or a question mark (?) as a literal character, you must place an escape character before it. On the command line, you can use the backslash (\) as an escape character. The shell treats the character immediately after the backslash as a literal character instead of as a metacharacter.

In the following examples, you use wildcards to match files, but you also use escape characters to interpret the question mark (?) literally. From the following list of files, you want to list all the files that have a question mark (?) after the word Linux in the name.

'Any thoughts?.txt'
'Are you Mr. Brown?.mp4'
'Examples in Linux'
'Linux commands.pdf'
'Linux shells and consoles.pdf'
'Where is my Linux distro?.pdf'
'Which Linux?.pdf'

You start by listing all the files that have the word Linux in the name:

[user@host ~]$ ls *Linux*
'Examples in Linux'  'Linux commands.pdf'  'Linux shells and consoles.pdf'  'Where is my Linux distro?.pdf'  'Which Linux?.pdf'

Then, you try to list only the files that have a question mark after the word Linux in the name. In the following example, the question mark (?) is not escaped, so the shell interprets it as a metacharacter. As a result, the output includes every file that has at least one character after the word Linux.

[user@host ~]$ ls *Linux*?*
'Linux commands.pdf'  'Linux shells and consoles.pdf'  'Where is my Linux distro?.pdf'  'Which Linux?.pdf'

In the following example, you escape the question mark (?) so that the character is interpreted literally.

[user@host ~]$ ls *Linux*\?*
'Where is my Linux distro?.pdf'  'Which Linux?.pdf'
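
The two patterns can be compared side by side with a short sketch. It assumes a hypothetical scratch directory (the workdir variable, created with mktemp) that holds four of the file names from the list above, and it counts matches by loading them into the positional parameters with set, which is a choice made for this illustration only.

```shell
# Hypothetical scratch directory with four of the example file names.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch 'Any thoughts?.txt' 'Examples in Linux' 'Linux commands.pdf' 'Which Linux?.pdf'

# Unescaped: the question mark matches any single character.
set -- *Linux*?*
unescaped_count=$#

# Escaped: the backslash makes the question mark a literal character.
set -- *Linux*\?*
escaped_count=$#
escaped_match=$1

echo "Unescaped pattern matches: $unescaped_count"    # 2
echo "Escaped pattern matches:   $escaped_count"      # 1
echo "Escaped match: $escaped_match"                  # Which Linux?.pdf

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```

The unescaped pattern also matches 'Linux commands.pdf', because any character after the word Linux satisfies the question mark (?) metacharacter.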

References

bash(1), cd(1), and glob(3) man pages

Revision: rh104-9.1-3d1f2bc