Tar is more than just an archiving utility: it comes with some great built-in functions that allow you to compress and decompress files while archiving them. Learn all about it, and more, in this post!
What is tar and how do I install it?
According to the tar manual (which you can access by typing man tar once it is installed), tar is an archiving utility. It supports many functions, including compressing and decompressing files on the fly while archiving. Let's start by installing tar:
To install tar on a Debian-based Linux distribution that uses Apt (like Ubuntu and Mint), run the following command in your terminal:
sudo apt install tar
To install tar on a RedHat-based Linux distribution that uses Yum (like RHEL, CentOS and Fedora), run the following command in your terminal:
sudo yum install tar
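tar is usually preinstalled, so it can be worth checking before installing anything. A minimal sketch of such a check (the tar_status variable and the messages are purely illustrative):

```shell
# Check whether a tar binary is already on the PATH before installing it.
if command -v tar >/dev/null 2>&1; then
    tar_status="installed"
    # Print the first line of the version banner (for example GNU tar or bsdtar).
    tar --version | head -n 1
else
    tar_status="missing"
    echo "tar not found; install it with your package manager"
fi
```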
Next, we will create some sample data:
mkdir test; cd test
touch a b c d e f
echo 1 > a; echo 5 > e; echo '22222222222222222222' > b
Here we create a directory named test and create six empty files in it using the touch command. We also add some numbers to files a, e, and b. Notably, file b contains repetitive data, which will compress well.
If you would like to learn more about how compression works, you can check out our How Does File Compression Work? post.
Creating an uncompressed archive
tar -hcf all_files.tar *
ls -l | grep -v total | awk '{print $5"\tbytes for: "$9}' | sort -n
Here we create an uncompressed archive using the tar -hcf all_files.tar * command. Let's take a look at the options used in this command.
First, we have -h, which is not necessary in this particular case, but which I recommend always including in your tar commands. This option stands for dereference: tar follows symbolic links and archives the files they point to, rather than the links themselves.
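The effect of -h is easiest to see with an actual symbolic link. The following is a small sketch under assumed names (target.txt, link.txt and the two archive names are made up for illustration), run in a temporary directory:

```shell
# Work in a throwaway directory so nothing in the current directory is touched.
demo_dir=$(mktemp -d)
(
    cd "$demo_dir" || exit 1

    echo 'real data' > target.txt
    ln -s target.txt link.txt            # symbolic link pointing at target.txt

    tar -cf keep_links.tar link.txt      # without -h: the link itself is stored
    tar -hcf follow_links.tar link.txt   # with -h: the file it points to is stored

    # In the verbose listing, the first character of each line is the entry
    # type: 'l' for a symbolic link, '-' for a regular file.
    tar -tvf keep_links.tar
    tar -tvf follow_links.tar
)
```

Without -h, extracting the archive elsewhere would recreate only the link, which may then point at nothing.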
Next we have the -c and -f options. Note that they are written directly after the -h, sharing its single - prefix. In other words, instead of specifying another -, we simply tack them onto the other shorthand options. Quick and easy.
The -c option stands for create a new archive. Note that, by default, directories are archived recursively, unless the --no-recursion option is also used. The -f option allows us to specify the name of the archive file. Because -f requires an argument, it has to be the last in our chain of options, so that we can add the archive file name directly after it. Using tar -fch test.tar * will not work, because tar takes ch as the archive name and is then left without an operation to perform.
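The ordering rule can be tried out safely in a scratch directory. This sketch assumes a tar that parses bundled short options the usual getopt way (GNU tar does); the file and archive names are only examples:

```shell
demo_dir=$(mktemp -d)
(
    cd "$demo_dir" || exit 1
    touch a b c

    # Wrong order: -f consumes "ch" as the archive name, so no operation
    # (-c, -x, ...) is left and tar exits with an error.
    if tar -fch test.tar a b c 2>/dev/null; then
        echo "unexpected: tar accepted -fch"
    else
        echo "tar -fch failed, as expected"
    fi

    # Correct order: -f comes last, directly followed by the archive name.
    tar -hcf test.tar a b c

    # The same command spelled out with long options (GNU tar).
    tar --dereference --create --file=test_long.tar a b c
)
```

The long-option form is more verbose, but it removes the ordering pitfall because the file name is attached to --file explicitly.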
After the tar file is created, we use ls, piped through grep, awk and sort, to produce output that clearly shows the number of bytes per file. As the output shows, the tar file is much larger than all of our files combined. The files are merely being archived, and some overhead for tar is added.
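To see where the overhead comes from, here is a minimal sketch. It relies on the tar format storing a 512-byte header per file, padding file data to 512-byte blocks, and ending the archive with zero-filled blocks. The final size also depends on the record size your tar pads to (GNU tar defaults to 10240-byte records, which is an assumption about your build). The variable names are illustrative.

```shell
demo_dir=$(mktemp -d)
(
    cd "$demo_dir" || exit 1
    touch a b c d e f
    echo 1 > a; echo 5 > e; echo '22222222222222222222' > b
    tar -hcf all_files.tar a b c d e f
)

# Combined size of the six input files, in bytes.
data_bytes=$(cat "$demo_dir"/[a-f] | wc -c)
# Size of the resulting archive, in bytes.
archive_bytes=$(wc -c < "$demo_dir/all_files.tar")

echo "data:    $data_bytes bytes"
echo "archive: $archive_bytes bytes"
# tar archives are built from 512-byte blocks, so this divides evenly.
echo "blocks:  $((archive_bytes / 512))"
```

For many tiny files, the per-file headers and padding dominate; for large files, the same overhead becomes negligible.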
As an interesting side note, we can also see what types of files we are working with by simply using the file command at the command prompt:
file c
file b
file all_files.tar
Creating a compressed archive
A very common compression algorithm is GZIP. Let's add the option for it (-z) to our chain of shorthand command-line options and see how this affects the file size:
tar -zhcf all_files.tar.gz [a-f]
ls -l | grep -v total | awk '{print $5"\tbytes for: "$9}' | sort -n
This time we specify a bracket pattern, [a-f], which works like a regular-expression character class, to select only the files named a through f. This prevents the tar command from including the all_files.tar file inside the new all_files.tar.gz archive!
See our posts How Regex is actually used? and Modify text using regex using sed if you want to learn more about regex.
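It can help to see what the shell actually hands to tar for each pattern. A quick sketch in a scratch directory that mimics the article's files (the matched variable is only there for illustration):

```shell
demo_dir=$(mktemp -d)
(
    cd "$demo_dir" || exit 1
    touch a b c d e f all_files.tar

    # The shell expands both patterns before tar ever sees them.
    echo "with *:     " *        # includes all_files.tar
    echo "with [a-f]: " [a-f]    # only the single-letter names a through f
)
# Capture the expansion of [a-f] on its own.
matched=$(cd "$demo_dir" && echo [a-f])
echo "$matched"
```

Because [a-f] matches exactly one character, a longer name such as all_files.tar can never match it.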
We also include the -z option, which uses GZIP to compress the resulting .tar data. It's great to see that we end up with a file of 186 bytes, which tells us that, in this case, the tar header and padding overhead of approximately 10 KB compresses very well.
The archive is still 7.44 times larger than the combined size of the files it contains, but that matters little, since this contrived example is not representative of compressing large files. With large files you almost always see gains rather than losses, unless the data is already compressed or is in a format that common algorithms cannot easily condense. Even so, one algorithm (like GZIP) may do better than another (for example BZIP2), and vice versa, for different data sets.
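The 7.44 figure follows from the sample data: a and e each hold one digit plus a newline, b holds twenty digits plus a newline, and the other three files are empty. Worked through as a short sketch (the variable names are illustrative):

```shell
# File sizes from the sample data, in bytes.
size_a=2; size_e=2; size_b=21
total=$((size_a + size_e + size_b))

# Archive size reported in the text for all_files.tar.gz.
archive=186

# Shell arithmetic is integer-only, so use awk for the division.
ratio=$(awk -v a="$archive" -v t="$total" 'BEGIN { printf "%.2f", a / t }')
echo "$archive / $total = $ratio"   # prints: 186 / 25 = 7.44
```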
Saving more bytes with higher-level compression
Can we make the file even smaller? Yes. We can select GZIP's maximum compression level using the -I option to tar, which lets us specify a compression program to use (with thanks to Stack Overflow user ideasman42):
tar -I 'gzip -9' -hcf all_files.tar.gz [a-f]
ls -l | grep -v total | awk '{print $5"\tbytes for: "$9}' | sort -n
Here we specify -I 'gzip -9' as the compression program to use, and we drop the -z option (since we are now specifying a custom program instead of using tar's built-in GZIP setting). The result is a file that is 12 bytes smaller, thanks to a better (but generally slower) compression attempt at level -9 by GZIP.
In general terms, the faster the compression (a lower compression level, for example -1), the larger the file. Conversely, the slower the compression (a higher compression level, for example -9), the smaller the file. You can tune this to your own preference by varying the compression level from -1 (fast) to -9 (slow).
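The sample files are too small to show much difference between levels, so the sketch below generates a larger file first. It assumes GNU tar (for -I with arguments) and the seq utility; the file names and the choice of levels 1, 6 and 9 are arbitrary:

```shell
demo_dir=$(mktemp -d)
(
    cd "$demo_dir" || exit 1
    # Generate some larger sample data so the levels have something to work on.
    seq 1 200000 > numbers.txt

    for level in 1 6 9; do
        # -I with arguments is a GNU tar feature (assumption: GNU tar is in use).
        tar -I "gzip -$level" -hcf "numbers_$level.tar.gz" numbers.txt
        printf 'gzip -%s: %s bytes\n' "$level" "$(wc -c < "numbers_$level.tar.gz")"
    done
)
# Keep the two extremes around for comparison.
size_fast=$(wc -c < "$demo_dir/numbers_1.tar.gz")
size_slow=$(wc -c < "$demo_dir/numbers_9.tar.gz")
```

Prefixing the tar command with time would show the other side of the trade-off, namely how much longer the higher levels take.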
Other compression programs
There are two other common compression algorithms worth exploring and testing (different algorithms give different size results and may have additional compression options): bzip2, which can be used by specifying the -j option to tar, and XZ, which can be used by specifying the -J option.
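As a sketch of how these two options slot into the commands used so far, the following recreates the sample files in a scratch directory and builds one archive per algorithm. The .tar.bz2 and .tar.xz archive names are conventional choices, not taken from the article:

```shell
demo_dir=$(mktemp -d)
(
    cd "$demo_dir" || exit 1
    touch a b c d e f
    echo 1 > a; echo 5 > e; echo '22222222222222222222' > b

    # GNU tar runs the external bzip2 and xz programs, so skip whichever
    # one is not installed.
    if command -v bzip2 >/dev/null 2>&1; then
        tar -jhcf all_files.tar.bz2 [a-f]
    fi
    if command -v xz >/dev/null 2>&1; then
        tar -Jhcf all_files.tar.xz [a-f]
    fi

    # Same size listing as used earlier in the article.
    ls -l | grep -v total | awk '{print $5"\tbytes for: "$9}' | sort -n
)
```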
Alternatively, you can use the -I option to select the maximum compression level for bzip2 (-9) and for xz (-9e).
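The exact commands are not preserved in this post. Modeled on the tar -I 'gzip -9' command shown earlier, they would plausibly look like the sketch below; treat the archive names and the overall shape as an assumed reconstruction, and note that -I with arguments relies on GNU tar:

```shell
demo_dir=$(mktemp -d)
(
    cd "$demo_dir" || exit 1
    touch a b c d e f
    echo 1 > a; echo 5 > e; echo '22222222222222222222' > b

    # Assumed reconstruction: maximum compression level for bzip2.
    if command -v bzip2 >/dev/null 2>&1; then
        tar -I 'bzip2 -9' -hcf all_files.tar.bz2 [a-f]
    fi
    # Assumed reconstruction: level 9 plus the "extreme" flag for xz.
    if command -v xz >/dev/null 2>&1; then
        tar -I 'xz -9e' -hcf all_files.tar.xz [a-f]
    fi
)
```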
In this case, the results are worse than with the more or less standard GZIP algorithm. Even so, the bzip2 and xz algorithms can show improvements with other data sets.
Decompressing an archive
Decompressing an archive is very easy, whatever method was originally used to compress it, as long as that compression program is present on your computer. For example, if the original compression algorithm was bzip2 (indicated by a .bz2 extension on the tar file name), then you will want to have run sudo apt install bzip2 (or sudo yum install bzip2) on the target computer that will decompress the archive.
rm a b c d e f
tar -xf all_files.tar.gz
ls
We simply specify -x to extract (decompress) our all_files.tar.gz archive, and indicate the archive name using the -f shorthand option, as before. No -z is needed here, because GNU tar detects the compression format when reading an archive.
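Two related options are handy when extracting: -t lists an archive's contents without unpacking it, and -C extracts into a different directory. A minimal sketch, where the restored directory name is hypothetical:

```shell
demo_dir=$(mktemp -d)
(
    cd "$demo_dir" || exit 1
    touch a b c d e f
    echo 1 > a; echo 5 > e; echo '22222222222222222222' > b
    tar -zhcf all_files.tar.gz [a-f]

    # -t lists the archive contents without extracting anything.
    tar -tf all_files.tar.gz

    # -C extracts into another directory (here a hypothetical "restored"
    # directory) instead of the current one.
    mkdir restored
    tar -xf all_files.tar.gz -C restored
    ls restored
)
```

Listing first is a cheap way to check what an archive will write before it overwrites files of the same name.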
Compressing files can help you save a lot of space on your storage devices, and knowing how to use tar in combination with the available compression options will help you do so. When you need to extract the files again, it is easy to do, as long as the correct decompression software is available on the computer used to extract the data from your archive. Enjoy!