How to zip and unzip files using tar on Linux

Tar is more than just an archiving utility: it comes with built-in options that let you compress and decompress files while archiving them. This post covers how to use them.

What is tar and how do I install it?

According to the tar manual (which you can access by typing man tar once tar is installed), tar is an archiving utility. It supports many functions, including compressing and decompressing files on the fly when archiving. Let's start by installing tar:

To install tar on a Debian-based Linux distribution that uses Apt (like Ubuntu and Mint), run the following command in your terminal:

sudo apt install tar

To install tar on a Red Hat-based Linux distribution that uses Yum (like RHEL, CentOS and Fedora), run the following command in your terminal:

sudo yum install tar
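Most distributions ship tar by default, so it can be worth checking before installing anything. A minimal sketch of such a check, assuming a POSIX-style shell:

```shell
# Check whether a tar binary is already on the PATH before installing it.
if command -v tar >/dev/null 2>&1; then
    # Print only the first line of the version banner.
    tar --version | head -n 1
else
    echo "tar not found; install it with apt or yum as shown above"
fi
```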

Next, we will create some sample data:

mkdir test; cd test
touch a b c d e f 
echo 1 > a; echo 5 > e; echo '22222222222222222222' > b

Setting up sample data to compress

Here we create a directory named test and create six empty files in it using the touch command. We also add some numbers to the files a, e, and b. Notably, file b contains repetitive data, which will compress well.

If you would like to learn more about how compression works, you can check out our How does file compression work? post.

Creating an uncompressed archive

Simple uncompressed tar archive creation

tar -hcf all_files.tar *
ls -l | grep -v total | awk '{print $5"\tbytes for: "$9}' | sort -n

Here we create an uncompressed archive using the tar -hcf all_files.tar * command. Let's take a look at the options used in this command.

First, we have -h which, even though it is not necessary in this particular case, I recommend always including in your tar commands. This option stands for dereference: tar follows symbolic links and archives the files they point to, rather than storing the links themselves.
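To see what dereferencing changes in practice, the following sketch archives the same symbolic link once without -h and once with it. The temporary directory and the file names target.txt, link.txt, keep_link.tar and follow_link.tar are made up for this illustration.

```shell
# Work in a throwaway directory so nothing else is touched.
work=$(mktemp -d)
echo 'real data' > "$work/target.txt"
ln -s target.txt "$work/link.txt"

# Without -h the archive stores the symbolic link itself.
tar -C "$work" -cf "$work/keep_link.tar" link.txt

# With -h tar follows the link and stores the file it points to.
tar -C "$work" -hcf "$work/follow_link.tar" link.txt

# Verbose listings: compare the first character of each line.
tar -tvf "$work/keep_link.tar"
tar -tvf "$work/follow_link.tar"
```

In the verbose listing, the entry in keep_link.tar starts with l (a symbolic link), while the entry in follow_link.tar starts with - (a regular file).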

Next we have the -c and -f options. Note that they are written together with the -h behind a single -. In other words, instead of giving each option its own -, we simply append them to the other shorthand options. Quick and easy.

The -c option stands for create a new archive. Note that, by default, directories are archived recursively unless the --no-recursion option is also used. The -f option lets us specify the name of the archive file. Because it takes an argument, it has to be the last option in our chain of shorthand options, so that we can add the archive file name directly after it. Using tar -fch test.tar * will not work:

Shorthand options that require an argument cannot be placed first
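The ordering rule can be tried out safely in a scratch directory. In this sketch (the names good.tar and bad.tar are illustrative only), the second command fails because -f takes the rest of the option cluster, ch, as the archive name, which leaves tar without an operation such as -c:

```shell
work=$(mktemp -d)
echo 1 > "$work/a"

# Correct order: -f comes last, so the next word is the archive name.
tar -C "$work" -hcf "$work/good.tar" a && echo "hcf: archive created"

# Wrong order: -f swallows "ch" as the archive name and no operation is left.
if tar -C "$work" -fch "$work/bad.tar" a 2>/dev/null; then
    echo "fch: unexpectedly succeeded"
else
    echo "fch: failed, as described above"
fi
```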

After the tar archive is created, we use an ls command whose output clearly shows the number of bytes per file. As you can see, the tar archive is much larger than all of our files combined. The files are only being archived, not compressed, and some overhead for tar is added.
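The overhead comes from the tar format itself: every member gets a 512-byte header, file data is padded to a multiple of 512 bytes, and the archive is padded out to a full record. A small sketch that makes this visible for a single 2-byte file (the name one_file.tar is made up for the example):

```shell
work=$(mktemp -d)
echo 1 > "$work/a"

# Archive a single tiny file.
tar -C "$work" -hcf "$work/one_file.tar" a

data=$(wc -c < "$work/a")
size=$(wc -c < "$work/one_file.tar")

echo "file a: $data bytes; one_file.tar: $size bytes"
# tar archives are always a whole number of 512-byte blocks.
echo "archive size in 512-byte blocks: $((size / 512))"
```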

As an interesting side note, we can also see what types of files we are working with simply using the file command at the command prompt:

file c
file b
file all_files.tar

Using file to see the file type

Creating a compressed archive

A very common compression algorithm is GZIP. Let's add the option for it (-z) to our chain of shorthand command-line options and see how this affects the archive size:

tar -zhcf all_files.tar.gz [a-f]
ls -l | grep -v total | awk '{print $5"\tbytes for: "$9}' | sort -n

Comparing the size of a compressed archive with an uncompressed one

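This time we specify a pattern, [a-f], to include only the files named a through f, preventing the tar command from including the all_files.tar file inside the new all_files.tar.gz archive. Strictly speaking this is a shell glob pattern rather than a regular expression, although the bracket syntax looks the same in both.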
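The [a-f] pattern is expanded by the shell before tar ever runs, so tar only receives the matching file names. A quick way to check what a pattern will match is to hand it to echo first. This sketch recreates the sample files in a temporary directory (the variable names are illustrative):

```shell
work=$(mktemp -d)
touch "$work/a" "$work/b" "$work/c" "$work/d" "$work/e" "$work/f"
touch "$work/all_files.tar"

# Expand the pattern inside the sample directory; the subshell keeps
# the current directory of the calling shell unchanged.
matched=$(cd "$work" && echo [a-f])
echo "tar would receive: $matched"
```

Only the single-letter names a through f match. all_files.tar is left out because [a-f] matches exactly one character.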

See How Regex is actually used? and Modify text using regex using sed if you want to learn more about regex.

We also include the -z option, which uses GZIP compression to compress the resulting .tar archive. It's great to see that we end up with a file of 186 bytes, which tells us that, in this case, the tar header and overhead of approximately 10 KB compress very well.

The archive is 7.44 times larger than the combined size of our files, but that matters little, since this contrived example is not representative of compressing large files, where you almost always see gains rather than losses, unless the data is already compressed or is in a format that does not compress well under a range of algorithms. Even so, one algorithm (like GZIP) may do better than another (for example BZIP2), and vice versa, for different data sets.
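The comparison can be reproduced without reading ls output by hand. The sketch below rebuilds the sample data in a temporary directory, creates both archives, and computes the ratio between the compressed archive and the raw data. Exact archive sizes depend on the tar and gzip versions, so none are hard-coded here.

```shell
work=$(mktemp -d)
touch "$work/c" "$work/d" "$work/f"
echo 1 > "$work/a"; echo 5 > "$work/e"
echo '22222222222222222222' > "$work/b"

# One uncompressed and one gzip-compressed archive of the same files.
tar -C "$work" -hcf  "$work/all_files.tar"    a b c d e f
tar -C "$work" -zhcf "$work/all_files.tar.gz" a b c d e f

# Raw data: 2 + 2 + 21 bytes in a, e and b; c, d and f are empty.
raw=$(cat "$work"/[a-f] | wc -c)
plain=$(wc -c < "$work/all_files.tar")
packed=$(wc -c < "$work/all_files.tar.gz")

echo "raw data: $raw bytes, tar: $plain bytes, tar.gz: $packed bytes"
awk -v p="$packed" -v r="$raw" \
    'BEGIN { printf "tar.gz is %.2f times the raw data size\n", p / r }'
```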

Saving more bytes with higher-level compression

Can we make the archive even smaller? Yes. We can select GZIP's maximum compression level using the -I option to tar, which lets us specify a compression program to use (with thanks to Stack Overflow user ideasman42):

tar -I 'gzip -9' -hcf all_files.tar.gz [a-f]
ls -l | grep -v total | awk '{print $5"\tbytes for: "$9}' | sort -n

Using the -I option to tar to specify a compression program

Here we specify -I 'gzip -9' as the compression program to use, and we drop the -z option (since we are now specifying a custom program to use instead of tar's built-in GZIP configuration). The result is that we have 12 fewer bytes thanks to a better (but generally slower) compression attempt (at level -9) by GZIP.

In general, the faster the compression (a lower compression level, such as -1), the larger the resulting file, and the slower the compression (a higher compression level, such as -9), the smaller the file. You can find your own preferred balance by varying the compression level from -1 (fast) to -9 (slow).

Other compression programs

There are two other common compression algorithms that you can explore and test (different algorithms give different size results and may have additional compression options): bzip2, which can be used by specifying the -j option to tar, and xz, which can be used by specifying the -J option.

Alternatively, you can use the -I option to select the maximum compression level for bzip2 (-9):

Example using the bzip2 -9 compression program

And -9e for xz:

Example using the xz -9e compression program

As you can see, the results are not as good in this case as with the more standard GZIP algorithm. Even so, the bzip2 and xz algorithms may show improvements with other data sets.
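A side-by-side run of the three programs might look like the sketch below. It assumes GNU tar for -I, skips any compressor that is not installed, and picks the archive names all_files.tar.bz2 and all_files.tar.xz purely for illustration. Sizes will vary between versions, so none are quoted here.

```shell
work=$(mktemp -d)
touch "$work/c" "$work/d" "$work/f"
echo 1 > "$work/a"; echo 5 > "$work/e"
echo '22222222222222222222' > "$work/b"

# Each entry is "<compression program>:<archive extension>".
for spec in 'gzip -9:gz' 'bzip2 -9:bz2' 'xz -9e:xz'; do
    prog=${spec%%:*}     # e.g. "bzip2 -9"
    ext=${spec##*:}      # e.g. "bz2"
    tool=${prog%% *}     # e.g. "bzip2"
    if command -v "$tool" >/dev/null 2>&1; then
        tar -C "$work" -I "$prog" -hcf "$work/all_files.tar.$ext" a b c d e f
        printf '%s\tbytes for: %s\n' \
            "$(wc -c < "$work/all_files.tar.$ext")" "all_files.tar.$ext"
    else
        echo "$tool is not installed; skipping"
    fi
done
```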

Unzip a file

Unzipping a file is very easy, whatever method was originally used to compress it, as long as that compression program is present on your computer. For example, if the original compression algorithm was bzip2 (indicated by a .bz2 extension on the tar file name), then you will want to have run sudo apt install bzip2 (or sudo yum install bzip2) on the target computer that will decompress the archive.

rm a b c d e f
tar -xf all_files.tar.gz
ls

Unzipping a compressed (or uncompressed) tar archive

We simply specify -x to extract (decompress) our all_files.tar.gz archive, and indicate the name of the archive using the -f shorthand option, as before.
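A full round trip can be checked without deleting the originals by extracting into a separate directory. In this sketch the src and dest directory names are hypothetical. It also uses -t, which lists an archive's contents without extracting them, and -C, which makes tar change into a directory first.

```shell
work=$(mktemp -d)
mkdir "$work/src" "$work/dest"
echo 1 > "$work/src/a"
echo '22222222222222222222' > "$work/src/b"

# Create a gzip-compressed archive from the files in src.
tar -C "$work/src" -zhcf "$work/all_files.tar.gz" a b

# List the contents first, then extract them into dest.
tar -tf "$work/all_files.tar.gz"
tar -C "$work/dest" -xf "$work/all_files.tar.gz"

# cmp is silent when two files are identical.
cmp "$work/src/a" "$work/dest/a" && cmp "$work/src/b" "$work/dest/b" \
    && echo "round trip OK"
```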

Compressing files can help you save a lot of space on your storage devices, and knowing how to use tar in combination with the available compression options will help you do so. When you need to extract the archive again, it's easy to do as long as the correct decompression software is available on the computer used to extract the data from your archive. Enjoy!
