Compressing and Extracting Data

Objectives

  • Back up a subset of local files and extract compressed files.

Compressing and Extracting Data

Not all software is distributed in a repository. Sometimes, a software project distributes an application as a compressed archive. An archive is a single file that bundles multiple files and directories. Several archive formats are available, such as the zip, 7z, and tar formats. The tar format is the most popular format on Linux systems. A tar archive is often referred to as a tarball. A tar archive can be compressed by using tools such as gzip or xz, but compression is optional.

Working with Archives on the Desktop

Archives are a convenient way to send several files to someone, or to generate consolidated backups of important files. The GNOME Files application enables you to extract archives and to create archives from your files or directories.

You can create or extract an archive by right-clicking an item in the Files application. If the Files application identifies a compressed archive, then the Files application displays the option to extract its contents. If the Files application identifies a regular file or directory, then the Files application displays the option to compress the target.

Creating an Archive on the Desktop

To create an archive, right-click the directory that you intend to archive and select Compress from the context menu.

Figure 6.6: Creating an archive from the Files application

In the Create Archive dialog that displays, provide a name for the archive, and then choose a format from the drop-down list. By default, GNOME Files creates a .zip file, but the Files application can also produce a password-protected .zip file, a .tar.xz file, or a .7z file.

Figure 6.7: Selecting the name of the archive

After you create the archive, it appears in your current directory.

Extracting an Archive on the Desktop

The Files application can also extract an archive. In Files, double-click an archive file to extract its contents into the current directory. If the archive contains a directory, then a new directory is created. If the archive contains only files, then the files are written directly into the existing directory.

After extracting the contents of an archive, both the original archive file and the expanded contents are on your system. You can either keep the archive as a backup, or move it to the Trash directory.

Working with Archives on the Command Line

Create and extract archives on the command line with the tar and zip commands. Creating and extracting archives on the command line enables you to perform additional tasks, such as updating files in the archive with the latest changes, and using other compression tools.

Note

Some advanced features, such as appending files to an existing archive or updating files that are already in it, work only when the archive file is not compressed.
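As an illustration of a feature that needs an uncompressed archive, the following sketch adds a file to an existing .tar file with the --append option. It assumes GNU tar; the scratch directory and file names are hypothetical and exist only for the demonstration.

```shell
# Sketch: add a file to an existing, uncompressed tar archive (GNU tar assumed).
set -eu

workdir=$(mktemp -d)    # hypothetical scratch directory for the demonstration
cd "$workdir"

mkdir example
echo "first" > example/file1.txt

# Create an uncompressed archive: the --gzip option is deliberately omitted.
tar --create --file example.tar example

# Add a new file to the archive that already exists.
echo "second" > example/file2.txt
tar --append --file example.tar example/file2.txt

# List the members to confirm that file2.txt is now in the archive.
tar --list --file example.tar
```

If you try the same --append command against a .tar.gz file, GNU tar is expected to refuse, because it cannot update a compressed archive in place.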

Creating an Archive on the Command Line

You can create a compressed .tar archive on the command line by using the tar command. Required arguments are the --create, --gzip, and --file options, followed by a name for your archive, and then the path to the directory that you want to copy into your new archive.

--create

This option generates a new archive file.

--gzip

This option compresses the archive.

--file

This option specifies the name of the archive.

[user@host Downloads]$ tar --create --gzip --file example.tar.gz example
[user@host Downloads]$ ls
example   example.tar.gz

Verify the contents of your archive by using the tar command with the --list option and the --file option, followed by the name of the archive.

[user@host Downloads]$ tar --list --file example.tar.gz
example/
example/file1.txt
example/file2.txt
example/file3.txt

The gzip format is a common compression format on Linux. However, you can compress archives with other formats, such as xz or zstd. The xz format is available when using the Files application. On the command line, use the tar command with the --xz option to select the xz format. Recent versions of GNU tar also accept the --zstd option for the zstd format.
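The following sketch shows the --xz option in use. It assumes GNU tar, which typically relies on the external xz program for this format, so the script checks for xz and falls back to gzip when it is missing. The scratch directory and the archive variable are hypothetical names for the demonstration.

```shell
# Sketch: create an xz-compressed archive, falling back to gzip if xz is missing.
set -eu

workdir=$(mktemp -d)    # hypothetical scratch directory
cd "$workdir"

mkdir example
printf 'sample data\n' > example/file1.txt

# Check that the xz program is installed before asking tar to use it.
if command -v xz >/dev/null 2>&1; then
    tar --create --xz --file example.tar.xz example
    archive=example.tar.xz
else
    tar --create --gzip --file example.tar.gz example
    archive=example.tar.gz
fi

# Listing works the same way for either compression format.
tar --list --file "$archive"
```

The xz format usually produces smaller archives than gzip at the cost of slower compression, so gzip can remain a reasonable choice for quick, frequent backups.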

Extracting an Archive on the Command Line

For some archive formats, you can use the same command to create an archive and to extract the contents of the archive, but with different options. For other archive formats, you use a specific command to extract the contents. For example, you can use the unzip command to extract the contents of a .zip file.

[user@host Downloads]$ unzip example.zip
[user@host Downloads]$ ls
example   example.zip
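For completeness, a .zip file such as the one above could be produced and inspected as in the following sketch. It assumes the Info-ZIP zip and unzip commands, which might not be installed on every system, so the script skips the steps if either command is missing. The scratch directory, the restored directory, and the status variable are hypothetical.

```shell
# Sketch: create, list, and extract a .zip archive (Info-ZIP tools assumed).
set -eu

workdir=$(mktemp -d)    # hypothetical scratch directory
cd "$workdir"

mkdir example
echo "one" > example/file1.txt

if command -v zip >/dev/null 2>&1 && command -v unzip >/dev/null 2>&1; then
    # -r recurses into the directory so that its files are included.
    zip -r example.zip example
    # -l lists the archive members without extracting them.
    unzip -l example.zip
    # -d extracts into the named directory instead of the current one.
    unzip example.zip -d restored
    status=extracted
else
    echo "zip or unzip is not installed; skipping"
    status=skipped
fi
```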

To extract a .tar file, use the tar command with the --extract option, followed by the --file option and the file path of the archive. Most .tar archives are also compressed, so they end with extensions like .tar.gz or .tar.xz. GNU tar usually detects the compression format automatically when it extracts from a file, but you can also state the format explicitly: use the --gzip option for .gz files and the --xz option for .xz files. The --file option must be immediately followed by the archive name.

[user@host Downloads]$ tar --extract --gzip --file example.tar.gz
[user@host Downloads]$ ls
example   example.tar.gz
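One way to gain confidence in an archive is a round trip: create it, extract it, and compare the result with the original files. The following is a minimal sketch of that check; the scratch directory and the example.orig name are hypothetical.

```shell
# Sketch: verify that an extracted archive matches the original files.
set -eu

workdir=$(mktemp -d)    # hypothetical scratch directory
cd "$workdir"

mkdir example
echo "one" > example/file1.txt
echo "two" > example/file2.txt

tar --create --gzip --file example.tar.gz example

# Move the original aside so that the extraction re-creates the example directory.
mv example example.orig
tar --extract --gzip --file example.tar.gz

# diff exits with status 0 and prints nothing when the two trees are identical.
diff -r example.orig example && echo "contents match"
```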

Installing Software and Restoring Backups

Archives are sometimes used as backups for important files, and are also an alternative to distributing software through a Linux distribution's repository.

When installing software or restoring a backup, you extract an archived file tree and distribute the files throughout an existing file system. For example, an installable archive could contain the bin and etc subdirectories; the bin directory contains an executable file and the etc directory contains configuration files. If you extract the archive to the ~/.local directory, which contains the same subdirectories, then the executable and configuration files are distributed into the corresponding subdirectories.

To extract an archive into a specific directory, use the tar command with the --directory option, followed by the path to the destination directory.

[user@host Downloads]$ tar --extract --file example.tar.gz \
--directory ~/.local
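The installation scenario described above can be sketched end to end. In this example, a scratch directory named prefix stands in for ~/.local so that nothing in your home directory is changed, and the hello application and its configuration file are hypothetical.

```shell
# Sketch: extract an archive with bin and etc subdirectories into a prefix.
set -eu

workdir=$(mktemp -d)    # hypothetical scratch directory
cd "$workdir"

# Build a small installable archive for a hypothetical application named hello.
mkdir -p pkg/bin pkg/etc
printf '#!/bin/sh\necho "hello"\n' > pkg/bin/hello
chmod +x pkg/bin/hello
echo "greeting=hello" > pkg/etc/hello.conf

# --directory pkg makes the member paths start at bin/ and etc/ rather than pkg/.
tar --create --gzip --file hello.tar.gz --directory pkg bin etc

# The prefix stands in for ~/.local and already has the same subdirectories.
prefix="$workdir/prefix"
mkdir -p "$prefix/bin" "$prefix/etc"

tar --extract --file hello.tar.gz --directory "$prefix"

# Show where the files were placed.
find "$prefix" -type f | sort
```

Listing the archive with the --list option before you extract it shows the paths that will be created under the destination, which helps to avoid overwriting files by accident.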

You can extract just one file from an archive by specifying the path of a file as it appears in the archive. For example, if you accidentally remove the file1.txt file, then you can recover it from your backup archive. Because the archive contains a directory called example, any file that is extracted from the archive into the current directory (.) is placed into the example directory.

[user@host Downloads]$ rm example/file1.txt
[user@host Downloads]$ ls example
file2.txt   file3.txt
[user@host Downloads]$ tar --extract --file example.tar.gz example/file1.txt
[user@host Downloads]$ ls example
file1.txt   file2.txt   file3.txt

If the directory does not already exist in the current directory, then it is created.

[user@host Downloads]$ rm -r example
[user@host Downloads]$ ls example
ls: cannot access 'example': No such file or directory
[user@host Downloads]$ tar --extract --file example.tar.gz example/file1.txt \
--directory .
[user@host Downloads]$ ls example
file1.txt
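Because the file path must match the path stored in the archive, it can help to look the path up before you extract. The following sketch combines the --list option with grep to find the stored path and then restores that one file. The scratch directory and the member variable are hypothetical.

```shell
# Sketch: look up a member path in an archive, then restore only that file.
set -eu

workdir=$(mktemp -d)    # hypothetical scratch directory
cd "$workdir"

mkdir example
for n in 1 2 3; do
    echo "file $n" > "example/file$n.txt"
done
tar --create --gzip --file example.tar.gz example

# Simulate an accidental deletion.
rm example/file1.txt

# Find the path exactly as it is stored in the archive.
member=$(tar --list --file example.tar.gz | grep 'file1\.txt$')
echo "restoring: $member"

# Extract only that member; the other files are left untouched.
tar --extract --file example.tar.gz "$member"
ls example
```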

References

tar(1) and unzip(1L) man pages

For more information, refer to How to Unzip a tar.gz File at https://opensource.com/article/17/7/how-unzip-targz-file

Revision: rh104-9.1-3d1f2bc