How I Learned to Love tar

Tar stands for tape archive. If you think of tar files as tape archives, they become a lot easier to manage. Or so I think, because I worked with tapes for years.

Create an Archive

Creating an archive is as simple as -cf: the c is for create and the f is for file, followed by the archive name and then your data paths.

Example

tar -cf file.name.tar Dir.To.Tar or.Files.too
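If you want to convince yourself this works, build a throwaway archive and list it back. This is only a sketch: the scratch directory and the names demo.tar, notes.txt and photos are made up for illustration, and it assumes GNU tar.

```shell
# Build a small scratch tree to archive (all names here are illustrative).
work=$(mktemp -d)
mkdir -p "$work/photos"
echo "hello" > "$work/notes.txt"
echo "not really a png" > "$work/photos/cat.png"

# -c create, -f write to this file. -C changes into $work first, so the
# archive stores relative paths (notes.txt, photos/...) instead of /tmp/...
tar -cf "$work/demo.tar" -C "$work" notes.txt photos

# -t lists the members without extracting anything.
tar -tf "$work/demo.tar"
```

Listing right after creating is a cheap habit that catches wrong paths before you ship the archive somewhere.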

Compression & Decompression

Tar has had an automatic mode for a while now: as long as the compression program is installed, tar will pick it for you based on the archive's file extension. Let's compress something with the new hotness, zstd. In this example -avcf stands for automatic, verbose, create, file.

Example

tar -avcf file.name.tar.zst Dir.To.Tar and.or.Files.too

Now how about extracting a file?

Example

tar -avxf file.name.tar.gz
# or to somewhere else
tar -avxf file.name.tar.gz -C /your/path/here

In that example we told tar to use the flags -avxf, which stand for automatic, verbose, extract, file. The -C flag tells tar to change to the given directory before extracting.
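Here is a small round trip you can run to see the automatic mode in action. It is a sketch with made-up names, and it uses a .tar.gz suffix only because gzip is installed almost everywhere; with zstd installed, a .tar.zst suffix should behave the same way. GNU tar is assumed.

```shell
# Scratch data; names are illustrative only.
work=$(mktemp -d)
mkdir -p "$work/src" "$work/out"
echo "some data" > "$work/src/a.txt"

# -a picks the compressor from the suffix: .tar.gz here means gzip.
tar -acf "$work/demo.tar.gz" -C "$work" src

# Check that the result really is a gzip stream (gzip -t tests integrity).
gzip -t "$work/demo.tar.gz" && echo "gzip stream OK"

# When reading from a file, tar detects the compression on its own,
# so plain -xf is enough here.
tar -xf "$work/demo.tar.gz" -C "$work/out"
cat "$work/out/src/a.txt"
```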

Using Custom Compression

If you need to use something that isn't built into tar, for instance if you're using a version of tar that does not support zstd, you can use -I to set the program to use for compression.

Example

tar -I zstd -cf filename.tar.zst /your/path/here
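The same -I flag works in the other direction too: on extraction GNU tar runs the named program with -d to decompress. A minimal sketch, using gzip as a stand-in for zstd so it runs on machines without zstd; the file names are made up.

```shell
# Scratch data; names are illustrative only.
work=$(mktemp -d)
mkdir -p "$work/src" "$work/out"
echo "payload" > "$work/src/file.txt"

# Create: tar pipes the archive through the named program.
tar -I gzip -cf "$work/custom.tar.gz" -C "$work" src

# Extract: same flag; tar calls the program with -d to decompress.
tar -I gzip -xf "$work/custom.tar.gz" -C "$work/out"
cat "$work/out/src/file.txt"
```

Any program that compresses stdin to stdout and accepts -d should slot in the same way.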

Note

  • Keep in mind, although this doesn't apply to tar itself, that some other archivers (7z and unrar, for example) draw a big distinction between e and x: x recreates the paths stored in the archive, while e throws the paths away and dumps every file into one directory.

    So if Alice archived her mytardis directory and Bob wants the same layout under /home/bob/mytardis, Bob needs to run x from /home/bob. With e, all of Alice's files would land loose in whatever directory Bob is in.

Adding and Removing Data

So you want to add or remove data from your tar file? Good news, you can! ⭐

However, it's not as easy as you might like it to be. In most cases it is a lot easier to use the --exclude flag and remake the file. Still, let's solve this problem one step at a time.

Did you know the --exclude flag also works for extracting?

First things first, let's get a list of all the files in the tar. Here are two examples of how to list and search the contents of a tar file.

Example

tar -atf file.name.here.tar.bz2
tar -atf file.name.here.tar.zst --wildcards '*.png'

Tip

  • Don't worry if you leave off the -a, as tar is pretty smart and will figure out that the file is compressed. Just remember to use the main part of the command, e.g. -tf, at a minimum.
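To see listing and searching side by side, here is a self-contained sketch. The site directory and its files are invented for the demo, gzip stands in for whatever compressor you use, and GNU tar is assumed (--wildcards is a GNU option).

```shell
# Scratch data; names are illustrative only.
work=$(mktemp -d)
mkdir -p "$work/site/img"
echo "x" > "$work/site/index.html"
echo "x" > "$work/site/img/logo.png"
echo "x" > "$work/site/img/banner.png"
tar -acf "$work/site.tar.gz" -C "$work" site

# Full listing of every member.
tar -tf "$work/site.tar.gz"

# Only members matching the pattern. Quote it so the shell leaves it alone.
tar -tf "$work/site.tar.gz" --wildcards '*.png'

# Capture the matches, for example to reuse the exact paths later.
pngs=$(tar -tf "$work/site.tar.gz" --wildcards '*.png')
```

The paths printed here are exactly what tar expects when you name a member in another command, so copy them as-is.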

Now that you know the path to the data inside the archive, you can use the --delete flag to remove it or the --append flag to add new data.

Keep in mind, there is no way to append to or delete from a compressed archive.

If you have a compressed tar you will need to decompress it first. Let's use file.name.here.tar.zst as an example. This works the same way whatever the file was compressed with, e.g. zstd or gzip.

Example

zstd -d file.name.here.tar.zst
tar -vf file.name.here.tar --delete path/here/oops.foo
tar -vf file.name.here.tar --append path/here/yes_this.bar
# -f overwrites the old .zst, which zstd -d left in place
zstd -f file.name.here.tar -o file.name.here.tar.zst

So what we did was decompress file.name.here.tar.zst, which gave us file.name.here.tar. Then we removed oops.foo and added yes_this.bar to the tar file. Lastly we compressed file.name.here.tar to create a shiny new file.name.here.tar.zst.
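If you'd like to try the whole dance on disposable data first, this sketch runs it end to end. It uses gzip instead of zstd so it works without extra packages (note that gzip, unlike zstd, replaces the file it is given), and the file names are invented. GNU tar is assumed, since --delete is a GNU option.

```shell
# Scratch data; names are illustrative only.
work=$(mktemp -d)
cd "$work"
mkdir -p path/here
echo "oops" > path/here/oops.foo
echo "keep" > path/here/keep.txt
tar -acf demo.tar.gz path
echo "new" > path/here/yes_this.bar

gzip -d demo.tar.gz                               # leaves demo.tar
tar -vf demo.tar --delete path/here/oops.foo      # remove one member
tar -vf demo.tar --append path/here/yes_this.bar  # add one member
gzip demo.tar                                     # back to demo.tar.gz

# Confirm the result: oops.foo is gone, yes_this.bar is in.
tar -tf demo.tar.gz
```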

Backups with Tar

Tar is, or was, a backup tool first and foremost, so it goes without saying that you can do the normal differential and incremental backups with it. Differential backups are hard to automate, so let's just stick with incremental backups for now.

Give it the -g

When you want to create an incremental backup you will need to first create a full backup and a snapshot file.

Example

tar -I zstd -g /path/to/backup/backup.snar -cf /path/to/backup/full_backup.tar.zst /path/to/data/here

This command will compress the backup using zstd, create the snar (snapshot) file, and make a full backup. Once that's done you can make as many incremental backups as you want; tar updates the snar on every run so it knows what has changed since the previous backup.

Now to make our first incremental backup!

Example

tar -I zstd -g /path/to/backup/backup.snar -cf /path/to/backup/$(date '+%F').tar.zst /path/to/data/here
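To watch the snapshot file do its job, here is a sketch that makes a full backup, changes the data, makes an incremental, and then restores both. Everything lives in a scratch directory with made-up names, gzip stands in for zstd, and GNU tar is assumed. The restore step uses -g /dev/null, which is how the GNU tar manual suggests extracting incremental archives.

```shell
# Scratch data; names are illustrative only.
work=$(mktemp -d)
mkdir -p "$work/data" "$work/backup" "$work/restore"
echo "old" > "$work/data/old.txt"

# Level 0: the snapshot file does not exist yet, so this is a full backup.
tar -g "$work/backup/backup.snar" -acf "$work/backup/full.tar.gz" -C "$work" data

sleep 1
echo "new" > "$work/data/new.txt"

# Level 1: same snapshot file, so only what changed since level 0 is stored.
tar -g "$work/backup/backup.snar" -acf "$work/backup/inc1.tar.gz" -C "$work" data
tar -tf "$work/backup/inc1.tar.gz"

# Restore: extract the full backup first, then each incremental in order.
tar -g /dev/null -xf "$work/backup/full.tar.gz" -C "$work/restore"
tar -g /dev/null -xf "$work/backup/inc1.tar.gz" -C "$work/restore"
ls "$work/restore/data"
```

Keep the snar next to the archives it belongs to; if you lose it, the next run with a fresh snar is a full backup again.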

Excluding Data

Tar can exclude data in a number of ways.
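Two of those ways in GNU tar are patterns given on the command line with --exclude and patterns read from a file with --exclude-from. The sketch below tries both on an invented project tree; the directory and file names are assumptions for the demo only.

```shell
# Scratch data; names are illustrative only.
work=$(mktemp -d)
mkdir -p "$work/proj/cache" "$work/proj/src"
echo "code" > "$work/proj/src/main.c"
echo "junk" > "$work/proj/src/main.o"
echo "tmp"  > "$work/proj/cache/blob"

# 1. Patterns on the command line (quoted so the shell does not expand them).
tar -cf "$work/by_flag.tar" --exclude='*.o' --exclude='proj/cache' -C "$work" proj

# 2. The same patterns read from a file, one per line.
printf '%s\n' '*.o' 'proj/cache' > "$work/exclude.txt"
tar -cf "$work/by_file.tar" --exclude-from="$work/exclude.txt" -C "$work" proj

# Neither archive should contain main.o or anything under cache/.
tar -tf "$work/by_flag.tar"
```

Put the exclude options before the paths you are archiving, since GNU tar applies them to the names that follow.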

File Streaming with Tar

Tar was created to stream data to tapes, so you can do a lot of things with it that you wouldn't think you should be able to do. That's thanks to the Unix way… EVERYTHING is a file.

Copying a Large Amount of Small Files

Let's say we have 10 million small files that we need to move from /var/omg/why/did/you/do/this to /mnt/this/is/where/it/should/have/been. That is going to take a while, right? Well….

Example

# -C on the left keeps the stored paths relative to the source directory
tar -cf - -C /var/omg/why/did/you/do/this . | tar -xf - -C /mnt/this/is/where/it/should/have/been
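Here is a tiny version of that pipe you can run safely. It is a sketch on two scratch directories (the names are made up) and uses -C on both sides so the files land directly in the destination instead of under a copy of the full source path.

```shell
# Two scratch directories stand in for the real source and destination.
src=$(mktemp -d)
dst=$(mktemp -d)
mkdir -p "$src/sub"
echo "one" > "$src/a.txt"
echo "two" > "$src/sub/b.txt"

# Left tar: change into $src and archive "." to stdout (-f -).
# Right tar: read the stream from stdin and unpack it under $dst.
# -p on the extracting side keeps the original permissions.
tar -cf - -C "$src" . | tar -xpf - -C "$dst"

# Compare the two trees; diff prints nothing when they match.
diff -r "$src" "$dst" && echo "copy matches"
```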

You can also do this over ssh! Now let's add some compression too!

Example

# Make a file on the other end
tar -cf - /var/lib/data | zstd | ssh user@hostname "cat >/var/backups/backup.tar.zst"
# Send the data compressed, with a progress bar! tar can't auto-detect compression on a pipe, so name it with -I
tar -cf - -C /var/lib/data . | pv | zstd | ssh user@hostname "tar -I zstd -xf - -C /var/lib/data/"
# Send the data uncompressed, with a progress bar
tar -cf - -C /var/lib/data . | pv | ssh user@hostname "tar -xf - -C /var/lib/data/"
# Get the data
ssh user@hostname "tar -cf - -C /var/lib/data ." | tar -xvf - -C /var/lib/data/

Note

  • The pipe viewer command, pv, will give you different information depending on where it's put and what flags are used. The examples above show the raw data being passed. If you want to see the compressed data being passed instead, put pv after zstd.

The above examples show how to use tar with ssh. You can mix and match as needed. Your only limit is your creativity!

Oh, and that being said, rsync is almost always the better tool for sending data around. The edge cases are special files like symbolic links, special devices, sockets, named pipes, etc. If you're sending or backing up anything like that, or other data rsync is bad for, tar is your go-to.