Red Hat System Administration II
Manage Compressed tar Archives
Abstract
| Goal | Archive and copy files from one system to another. |
Archive files and directories into a compressed file with tar, and extract the contents of an existing tar archive.
An archive is a single regular file or device file that contains multiple files. The device file could be a tape drive, flash drive, or other removable media. When using a regular file, archiving is analogous to the zip utility and similar variations that are popular on most operating systems.
Note
The original, ubiquitous zip compression and file packaging utility uses the PKZIP (Phil Katz's ZIP for MSDOS systems) algorithm, which has evolved significantly, and is supported on RHEL with the zip and unzip commands. Many other compression algorithms have been developed since zip was introduced, and each has its advantages. For creating compressed archives for general use, any tar-supported compression algorithm is acceptable.
Archive files are used to create manageable personal backups, or to simplify transferring a set of files across a network when other methods, such as rsync, are unavailable or might be more complex. Archive files can be created with or without using compression to reduce the archive file size.
On Linux, the tar utility is the common command to create, manage, and extract archives. Use the tar command to gather multiple files into a single archive file. A tar archive is a structured sequence of members, each stored as a header with the file's metadata followed by the file's data, so you can list or extract individual files.
Files can be compressed during creation by using one of the supported compression algorithms. The tar command can list the contents of an archive without extracting, and can extract original files directly from both compressed and uncompressed archives.
One of the following tar command actions is required to perform a tar operation:
- -c or --create: Create an archive file.
- -t or --list: List the contents of an archive.
- -x or --extract: Extract an archive.
The following tar command general options are often included:
- -v or --verbose: Show the files that are being archived or extracted during the tar operation.
- -f or --file: Follow this option with the archive file name to create or open.
- -p or --preserve-permissions: Preserve the original file permissions when extracting.
- --xattrs: Enable extended attribute support, and store extended file attributes.
- --selinux: Enable SELinux context support, and store SELinux file contexts.
The following tar command compression options are used to select an algorithm:
- -a or --auto-compress: Use the archive's suffix to determine the algorithm to use.
- -z or --gzip: Use the gzip compression algorithm, which results in a .tar.gz suffix.
- -j or --bzip2: Use the bzip2 compression algorithm, which results in a .tar.bz2 suffix.
- -J or --xz: Use the xz compression algorithm, which results in a .tar.xz suffix.
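As a minimal sketch of the -a option, the following commands build a small scratch directory and let tar choose gzip from the .tar.gz suffix. The scratch directory and the project and notes.txt names are hypothetical and serve only as an illustration.

```shell
# Hypothetical scratch area; none of these names come from the course text.
workdir=$(mktemp -d)
cd "$workdir"
mkdir project
echo "sample data" > project/notes.txt

# -a (--auto-compress) selects gzip because the archive name ends in .tar.gz.
tar -caf project.tar.gz project

# gzip -t succeeds only if the file is a valid gzip stream.
gzip -t project.tar.gz && echo "gzip-compressed archive created"
```

Changing the suffix to .tar.bz2 or .tar.xz would select the corresponding algorithm, provided that the matching compression program is installed.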
Note
The tar command still supports the legacy option style that does not use a dash (-) character. You might find this syntax in legacy scripts or documentation, and the behavior is essentially the same. For command consistency, Red Hat recommends using the short- or long-option styles instead.
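To illustrate the three option styles side by side, this sketch creates the same archive three times in a scratch directory. The file and archive names are hypothetical.

```shell
# Hypothetical scratch files used only to compare the option styles.
workdir=$(mktemp -d)
cd "$workdir"
echo "one" > a.txt
echo "two" > b.txt

tar cf legacy.tar a.txt b.txt               # legacy style, no dash
tar -cf short.tar a.txt b.txt               # short-option style
tar --create --file=long.tar a.txt b.txt    # long-option style

# Each archive lists the same two members.
tar -tf legacy.tar
tar -tf short.tar
tar --list --file=long.tar
```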
To create an archive with the tar command, use the create and file options with the archive file name as the first argument, followed by a list of files and directories to include in the archive.
The tar command recognizes absolute and relative file name syntax. By default, tar removes the leading forward slash (/) character from absolute file names, so that files are stored internally with relative path names. This technique is safer, because extracting absolute path names always overwrites existing files. With files that are archived with relative path names, files can be extracted to a new directory without overwriting existing files.
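The removal of the leading slash can be observed with a small experiment. This sketch assumes that mktemp returns an absolute path, which is its usual behavior; the app.log and abs.tar names are hypothetical.

```shell
# Hypothetical scratch directory, normally an absolute path under /tmp.
workdir=$(mktemp -d)
echo "log line" > "$workdir/app.log"

# Archive the file by its absolute path. tar reports on standard error
# that it is removing the leading slash from member names.
tar -cf "$workdir/abs.tar" "$workdir/app.log"

# The stored member name is relative: it no longer starts with a slash.
member=$(tar -tf "$workdir/abs.tar")
echo "$member"
```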
The following command creates the mybackup.tar archive to contain the myapp1.log, myapp2.log, and myapp3.log files from the user's home directory. If a file with the same name as the requested archive exists in the target directory, then the tar command overwrites the file.
[user@host ~]$ tar -cf mybackup.tar myapp1.log myapp2.log myapp3.log
[user@host ~]$ ls mybackup.tar
mybackup.tar
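Because tar silently replaces an existing archive of the same name, a script might check for the file first. The following is one possible guard, not something that the tar command provides; the safe_archive function is a hypothetical helper.

```shell
# Hypothetical scratch files that mimic the example above.
workdir=$(mktemp -d)
cd "$workdir"
echo "first" > myapp1.log
echo "second" > myapp2.log

# Hypothetical helper: create the archive only if its name is unused.
safe_archive() {
    archive=$1
    shift
    if [ -e "$archive" ]; then
        echo "refusing to overwrite $archive" >&2
        return 1
    fi
    tar -cf "$archive" "$@"
}

safe_archive mybackup.tar myapp1.log                       # creates the archive
safe_archive mybackup.tar myapp2.log || echo "kept the existing archive"
```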
A user must have read permissions on the target files that are being archived. For example, creating a complete archive of the /etc directory requires root privileges, because only privileged users can read all /etc files. An unprivileged user can create an archive of the /etc directory, but the archive excludes files that the user cannot read, and directories for which the user lacks the read and execute permissions.
In this example, the root user creates the /root/etc.tar archive of the /etc directory.
[root@host ~]# tar -cf /root/etc.tar /etc
tar: Removing leading `/' from member names
Important
Extended file attributes, such as access control lists (ACL) and SELinux file contexts, are not preserved by default in an archive. Use the --acls, --selinux, and --xattrs options to include POSIX ACLs, SELinux file contexts, and other extended attributes, respectively.
Use the tar command t option to list the file names from within the archive that is specified with the f option. The files are listed with relative path names, because the leading forward slash was removed during archive creation.
[root@host ~]# tar -tf /root/etc.tar
etc/
etc/fstab
etc/crypttab
etc/mtab
...output omitted...
Extract a tar archive into an empty directory to avoid overwriting existing files. When the root user extracts an archive, the extracted files preserve the original user and group ownership. If a regular user extracts files, then the user becomes the owner of the extracted files.
List the contents of the /root/etc.tar archive and then extract its files to the /root/etcbackup directory:
[root@host ~]# mkdir /root/etcbackup
[root@host ~]# cd /root/etcbackup
[root@host etcbackup]# tar -tf /root/etc.tar
etc/
etc/fstab
etc/crypttab
etc/mtab
...output omitted...
[root@host etcbackup]# tar -xf /root/etc.tar
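The same pattern also works for a single member: naming a member after the archive extracts only that file. This sketch uses a hypothetical etc-sample directory rather than the real /etc directory so that it can run without root privileges.

```shell
# Hypothetical sample tree standing in for /etc.
workdir=$(mktemp -d)
cd "$workdir"
mkdir etc-sample
echo "fstab data" > etc-sample/fstab
echo "crypttab data" > etc-sample/crypttab
tar -cf sample.tar etc-sample

# Extract into a new, empty directory to avoid overwriting anything.
mkdir restore
cd restore

# Only the named member is extracted; crypttab stays in the archive.
tar -xf ../sample.tar etc-sample/fstab
ls etc-sample
```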
When you extract files from an archive, tar applies the current umask to each extracted file's permissions. To preserve the original archived permissions instead, use the tar command p option. The --preserve-permissions option is enabled by default for a superuser.
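The effect of the umask can be compared directly. This sketch assumes GNU tar and GNU coreutils: the --no-same-permissions option, which is not covered in this section, forces the umask behavior even when the commands run as root, and stat -c %a prints the octal mode. The file names are hypothetical.

```shell
# Hypothetical script file with deliberately wide permissions.
workdir=$(mktemp -d)
cd "$workdir"
echo '#!/bin/sh' > myscript.sh
chmod 777 myscript.sh
tar -cf myscripts.tar myscript.sh

umask 027
mkdir masked preserved

# The umask is applied to the archived mode (assumption: GNU tar option).
(cd masked && tar -xf ../myscripts.tar --no-same-permissions)

# -p restores the archived mode regardless of the umask.
(cd preserved && tar -xpf ../myscripts.tar)

masked_mode=$(stat -c %a masked/myscript.sh)
preserved_mode=$(stat -c %a preserved/myscript.sh)
echo "umask applied: $masked_mode"
echo "with -p:       $preserved_mode"
```

With a umask of 027, the first extraction yields mode 750 and the second keeps mode 777.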
[user@host scripts]$ tar -xpf /home/user/myscripts.tar
...output omitted...
The tar command supports these compression methods, and others:
- gzip compression is the oldest and fastest method, and is widely available across platforms.
- bzip2 compression creates smaller archives but is less widely available than gzip.
- xz compression is newer, and offers the best compression ratio of the available methods.
The effectiveness of any compression algorithm depends on the type of data that is compressed. Previously compressed data files, such as picture formats or RPM files, typically do not significantly compress further.
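A rough way to see this dependence is to compress two files of equal size with very different content. In this sketch, random bytes stand in for previously compressed data, which is an assumption made for illustration; the file names are hypothetical.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# 100000 bytes of repetitive text versus 100000 random bytes.
yes "the same log line again" | head -c 100000 > text.dat
head -c 100000 /dev/urandom > random.dat

# Compress to standard output and count the resulting bytes.
text_size=$(gzip -c text.dat | wc -c)
random_size=$(gzip -c random.dat | wc -c)

echo "text:   100000 -> $text_size bytes"
echo "random: 100000 -> $random_size bytes"
```

The repetitive text shrinks to a small fraction of its size, while the random data stays close to its original size.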
Create the /root/etcbackup.tar.gz archive with gzip compression from the contents of the /etc directory:
[root@host ~]# tar -czf /root/etcbackup.tar.gz /etc
tar: Removing leading `/' from member names
Create the /root/logbackup.tar.bz2 archive with bzip2 compression from the contents of the /var/log directory:
[root@host ~]# tar -cjf /root/logbackup.tar.bz2 /var/log
tar: Removing leading `/' from member names
Create the /root/sshconfig.tar.xz archive with xz compression from the contents of the /etc/ssh directory:
[root@host ~]# tar -cJf /root/sshconfig.tar.xz /etc/ssh
tar: Removing leading `/' from member names
After creating an archive, verify its table of contents with the tar command tf options. It is not necessary to specify the compression option when listing a compressed archive file, because the compression type is read from the archive's header. List the archived content in the /root/etcbackup.tar.gz file, which uses the gzip compression:
[root@host ~]# tar -tf /root/etcbackup.tar.gz
etc/
etc/fstab
etc/crypttab
etc/mtab
...output omitted...
The tar command can automatically determine which compression was used, so it is not necessary to specify the compression option. If you do include an incorrect compression type, tar reports that the specified compression type does not match the file's type. In the following example, the tar command uses the -z option, which indicates gzip compression, but the file name extension is .xz, which indicates xz compression:
[root@host ~]# tar -xzf /root/etcbackup.tar.xz
gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now
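Both behaviors can be reproduced on a small gzip archive. This sketch uses hypothetical names and forces the bzip2 option on a gzip archive; the exact error text depends on the system, so the script checks only the exit status.

```shell
# Hypothetical sample directory and archive.
workdir=$(mktemp -d)
cd "$workdir"
mkdir conf
echo "setting=1" > conf/app.conf
tar -czf conf.tar.gz conf

# Listing needs no compression option; tar detects gzip on its own.
tar -tf conf.tar.gz

# Forcing the wrong algorithm (-j selects bzip2) makes tar fail.
if tar -tjf conf.tar.gz; then
    result="unexpected success"
else
    result="tar rejected the mismatched option"
fi
echo "$result"
```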
Listing a compressed tar archive works in the same way as listing an uncompressed tar archive. Use the tar command with the tf option to verify the content of the compressed archive before extracting its contents:
[root@host logbackup]# tar -tf /root/logbackup.tar.bz2
var/log/
var/log/lastlog
var/log/README
var/log/private/
...output omitted...
The gzip, bzip2, and xz algorithms are also implemented as stand-alone commands for compressing individual files without creating an archive. With these commands, you cannot create a single compressed file of multiple files, such as a directory. As previously discussed, to create a compressed archive of multiple files, use the tar command with your preferred compression option. To uncompress a single compressed file or a compressed archive file without extracting its contents, use the gunzip, bunzip2, and unxz stand-alone commands.
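The stand-alone commands can be tried on a single file and on a compressed archive. This sketch uses gzip and gunzip only, with hypothetical file names; bzip2 and xz follow the same pattern where they are installed.

```shell
workdir=$(mktemp -d)
cd "$workdir"
echo "single file" > report.txt
mkdir data
echo "member" > data/item.txt

# gzip replaces report.txt with report.txt.gz; gunzip reverses the step.
gzip report.txt
ls
gunzip report.txt.gz

# gunzip on a compressed archive leaves an uncompressed data.tar;
# the archive members are not extracted.
tar -czf data.tar.gz data
gunzip data.tar.gz
tar -tf data.tar
```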
The gzip and xz commands provide an -l option to view the uncompressed size of a compressed single or archive file. Use this option to verify that enough space is available before uncompressing or extracting a file.
[user@host ~]$ gzip -l file.tar.gz
         compressed        uncompressed  ratio uncompressed_name
          221603125           303841280  27.1% file.tar
[user@host ~]$ xz -l file.xz
Strms  Blocks   Compressed Uncompressed  Ratio  Check   Filename
    1       1    195.7 MiB    289.8 MiB  0.675  CRC64   file.xz
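A script could use the gzip -l output to compare the uncompressed size with the free space before uncompressing. The following is a sketch under stated assumptions: it parses the second line of the gzip -l output shown above, reads the available space from the POSIX df -Pk format, and uses a hypothetical payload.dat file.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical payload with a known size of 50000 bytes.
head -c 50000 /dev/zero > payload.dat
gzip payload.dat                      # produces payload.dat.gz

# Second line, second column of gzip -l: uncompressed size in bytes.
need_bytes=$(gzip -l payload.dat.gz | awk 'NR==2 {print $2}')

# Available space for the current directory, in 1 KiB blocks.
avail_kb=$(df -Pk . | awk 'NR==2 {print $4}')

if [ "$avail_kb" -gt $((need_bytes / 1024)) ]; then
    echo "enough space for $need_bytes bytes"
else
    echo "not enough space for $need_bytes bytes" >&2
fi
```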
References
tar(1), gzip(1), gunzip(1), bzip2(1), bunzip2(1), xz(1), and unxz(1) man pages