Guide on Tar files
What are tar files?
Tar is a program that packages multiple files into a single archive file called a tar file. Just like zip files, tar files enjoy the following benefits:
portability - instead of distributing multiple files separately, we can package the relevant files and distribute a single file. Tar files also store metadata such as the file hierarchy, so the original folder structure can be recreated on whichever operating system unpacks the archive.
compressibility - once the files are packaged into a single tar file, we can compress that file so that it takes up much less storage.
Difference between tar and zip files
The key difference between tar and zip files is that:
all the bundled files in a tar file are compressed as a whole.
each file within a zip file is compressed separately.
This means that to access a single original file within a compressed tar file, the archive has to be decompressed up to the point where that file is stored. In the case of a zip file, we can simply uncompress that one file without having to uncompress the rest of the files.
The advantage that compressed tar files have over zip files is that they typically take up less space. Because all the files are compressed as a whole, the compression ratio is usually better.
Tar files are not compressed by default, so we use a separate compression program such as gzip to compress them. We will see an example of this later.
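To illustrate that archiving and compression are two separate steps, here is a minimal sketch that first bundles a folder into a plain tar file and then compresses that file with gzip. It assumes a Unix-like shell with tar and gzip installed. The scratch directory, the .tar file name and the sample file contents are made up for illustration.

```shell
# Work in a throwaway directory so nothing existing is touched (assumption).
work=$(mktemp -d)
cd "$work"

# Hypothetical sample data: a folder holding one repetitive text file.
mkdir my_folder
yes "hello tar" | head -n 10000 > my_folder/my_file.txt

# Step 1: bundle only. The result is an uncompressed archive.
tar -cf my_archived_folder.tar ./my_folder

# Step 2: compress the archive as a whole with gzip.
# -c writes to standard output, so the original .tar file is kept.
gzip -c my_archived_folder.tar > my_archived_folder.tar.gz

# Compare the sizes of the two files.
ls -l my_archived_folder.tar my_archived_folder.tar.gz
```

With highly repetitive data like this, the .tar.gz file should come out much smaller than the .tar file. Real-world savings depend on the content.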
Creating a tar file
Suppose we have the following setup:
my_folder
└── my_file.txt
To package my_folder into a tar file called my_archived_folder.tar.gz:
tar -czvf my_archived_folder.tar.gz ./my_folder
a ./my_folder
a ./my_folder/my_file.txt
Let's explain what each letter in the flag -czvf means:
c means create an archive file.
z means perform compression using gzip.
v means run in verbose mode, that is, print the progress and some other useful information.
f allows us to specify the file name of the created tar file (my_archived_folder.tar.gz in this case).
After running the above command, our file structure should be as follows:
my_folder
└── my_file.txt
my_archived_folder.tar.gz
Even though it's not required, we should add the .gz extension to indicate that this tar file has been gzipped. This is important later on when we unpack the tar file because we need to specify an additional flag to uncompress gzipped tar files.
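As a quick check, the sketch below recreates the setup in a scratch directory, builds the archive, and then lists what was stored. Note that the -t (list) flag is not covered elsewhere in this guide. The scratch directory and the placeholder file content are also assumptions made for illustration.

```shell
# Work in a throwaway directory (assumption) and recreate the setup.
work=$(mktemp -d)
cd "$work"
mkdir my_folder
echo "hello" > my_folder/my_file.txt   # placeholder content

# Create the gzipped archive; v is left out to keep the output quiet.
tar -czf my_archived_folder.tar.gz ./my_folder

# -t lists the entries in the archive without extracting anything.
listing=$(tar -tzf my_archived_folder.tar.gz)
echo "$listing"
```

The listing should include both the folder and the file inside it, each with the leading ./ that was used when the archive was created.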
Unpacking a tar file
For demonstration, suppose we have a tar file that contains the my_folder from before. To unpack our my_archived_folder.tar.gz, run the following command:
tar -xzf my_archived_folder.tar.gz
Let's explain what each letter in the flag -xzf means:
x means extract (unpack) the tar file.
z means uncompress using gzip.
f allows us to specify the name of the tar file to unpack (my_archived_folder.tar.gz in this case).
After running the command, we should have our my_folder like so:
my_folder
└── my_file.txt
my_archived_folder.tar.gz
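The following sketch runs a full round trip to show that extraction really does recreate the folder. It works in a scratch directory with placeholder file content, both of which are assumptions made for illustration.

```shell
# Work in a throwaway directory (assumption) and build the archive.
work=$(mktemp -d)
cd "$work"
mkdir my_folder
echo "hello" > my_folder/my_file.txt   # placeholder content
tar -czf my_archived_folder.tar.gz ./my_folder

# Remove the original folder so that only the archive remains.
rm -r my_folder

# Extract: this recreates my_folder in the current directory.
tar -xzf my_archived_folder.tar.gz

# The file is back with its original content.
cat my_folder/my_file.txt
```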
Extracting files within folders using the strip-components option
Instead of extracting the entire my_folder structure, we could also extract only the content within my_folder like so:
tar -xzf my_archived_folder.tar.gz --strip-components=2
my_archived_folder.tar.gz
my_file.txt
Note the following:
the path of my_folder is ./my_folder. This has 2 path components - . and my_folder.
the path of my_file.txt is ./my_folder/my_file.txt. This has 3 path components - ., my_folder and my_file.txt.
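The flag --strip-components=2 removes the first 2 path components from every entry as it is extracted. As a result, ./my_folder/my_file.txt is extracted as my_file.txt. An entry with no more than 2 path components has nothing left after stripping and is skipped, which is why my_folder was not extracted in this case.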
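To see how the number of stripped components changes the result, the sketch below extracts the same archive twice, once with --strip-components=1 and once with --strip-components=2. It uses the -C flag to send each result to its own folder. The folder names strip1 and strip2, the scratch directory and the file content are hypothetical.

```shell
# Work in a throwaway directory (assumption) and build the archive.
work=$(mktemp -d)
cd "$work"
mkdir my_folder
echo "hello" > my_folder/my_file.txt   # placeholder content
tar -czf my_archived_folder.tar.gz ./my_folder

# Hypothetical destination folders, one per experiment.
mkdir strip1 strip2

# Strip 1 component: ./my_folder/my_file.txt becomes my_folder/my_file.txt
tar -xzf my_archived_folder.tar.gz --strip-components=1 -C strip1

# Strip 2 components: ./my_folder/my_file.txt becomes my_file.txt
tar -xzf my_archived_folder.tar.gz --strip-components=2 -C strip2

# Show what ended up in each destination.
find strip1 strip2 | sort
```

With one component stripped, my_folder is still created inside strip1. With two stripped, only my_file.txt lands in strip2.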
Extracting to a specific location using the -C option
Suppose we wanted to extract a tar file into the my_dest folder:
my_dest/
my_archived_folder.tar.gz
By default, tar files will be extracted to the current directory. To extract a tar file to the my_dest folder, add the -C flag like so:
tar -xzf my_archived_folder.tar.gz -C my_dest
We will end up with the following structure:
my_dest
└── my_folder
    └── my_file.txt
my_archived_folder.tar.gz
Note that the my_dest folder must exist before running the command - otherwise tar will report an error.
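One way to avoid that error is to create the destination folder just before extracting. The sketch below does this with mkdir -p, which succeeds whether or not the folder already exists. The scratch directory and file content are assumptions made for illustration.

```shell
# Work in a throwaway directory (assumption) and build the archive.
work=$(mktemp -d)
cd "$work"
mkdir my_folder
echo "hello" > my_folder/my_file.txt   # placeholder content
tar -czf my_archived_folder.tar.gz ./my_folder

# Create the destination first; -p makes this safe to run repeatedly.
mkdir -p my_dest

# Extract into my_dest instead of the current directory.
tar -xzf my_archived_folder.tar.gz -C my_dest

# my_folder now also exists inside my_dest.
ls my_dest/my_folder
```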