How I Learned to Love
Tar stands for tape archive. If you think of tar files as tape archives, they become a lot easier to manage. Or so I think, because I worked with tapes for years.
Create an Archive
Creating an archive is as simple as
-cf where the c is for create and the f is for file, followed by the archive name and then your data paths
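Here is a minimal sketch of that in action. The directory and file names (photos, notes, backup.tar) are made up for illustration, and everything happens in a throwaway temp directory.

```shell
# Work in a scratch directory so nothing real is touched.
work=$(mktemp -d)
cd "$work"

# Hypothetical sample data to archive.
mkdir -p photos notes
echo "hello" > notes/todo.txt
echo "pixels" > photos/cat.txt

# -c creates an archive, -f names the file to write; the data paths follow.
tar -cf backup.tar photos notes

# -t lists the members without extracting anything.
tar -tf backup.tar
```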
Compression & Decompression
Tar has had an automatic mode for a while now. As long as you have a way to do it, tar will figure it out for you most of the time. Let's compress something with the new hotness of zstd. In this example
-avcf stands for automatic, verbose, create, file
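A small sketch of the automatic mode, assuming a tar that understands the .zst suffix. The project directory is hypothetical, and the fallback to .gz is only there so the sketch still runs on a machine without zstd.

```shell
work=$(mktemp -d)
cd "$work"

# Hypothetical data to compress.
mkdir -p project
echo "some data" > project/readme.txt

# Assumption for portability: use .zst when zstd is installed, otherwise .gz.
ext=zst
command -v zstd >/dev/null 2>&1 || ext=gz

# -a picks the compressor from the archive suffix, -v prints each file as it is added.
tar -avcf "project.tar.$ext" project
```

Change the suffix and tar changes the compressor; the rest of the command stays the same.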
Now how about extracting a file?
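Something like the following should do it. This is a sketch with made-up names (project, restore); it builds a small compressed archive first so there is something to extract, and the -C flag is my addition to keep the output in its own directory.

```shell
work=$(mktemp -d)
cd "$work"

# Build a small compressed archive to practice on.
mkdir -p project restore
echo "some data" > project/readme.txt
ext=zst
command -v zstd >/dev/null 2>&1 || ext=gz   # fall back to gzip if zstd is missing
tar -acf "project.tar.$ext" project

# -x extracts; -C (an extra here) chooses the directory to extract into.
tar -avxf "project.tar.$ext" -C restore

cat restore/project/readme.txt
```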
In that example we told tar to use the flags
-avxf that stands for automatic, verbose, extract, file.
Using Custom Compression
If you need to use something that isn't built into tar, for instance if you're using a version of tar that does not support zstd, you can use
-I to set the program to use for compression
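As a rough sketch, assuming GNU tar (where -I is short for --use-compress-program; other tars may treat -I differently). The names are hypothetical, and gzip stands in when zstd is not installed so the example still runs.

```shell
work=$(mktemp -d)
cd "$work"
mkdir -p project restore
echo "some data" > project/readme.txt

# Pick the external compressor; gzip is only a stand-in if zstd is missing.
comp=zstd
command -v zstd >/dev/null 2>&1 || comp=gzip

# On create, tar pipes the archive through the program named by -I.
tar -I "$comp" -cf project.tar.custom project

# On extract, tar runs the same program with -d to decompress.
tar -I "$comp" -xf project.tar.custom -C restore

cat restore/project/readme.txt
```

On recent GNU tar versions the -I value can reportedly also carry options, for example -I 'zstd -19', but check your own tar before relying on that.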
Keep in mind, although this doesn’t apply to tar itself, when dealing with most compression programs there is a big difference between
-e will preserve paths.
So if Alice made a tar file in
/home/alice/mytardis that means that when Bob uses
-e it will try to extract the data to the same path. So if Bob wants
/home/alice/tardis Bob needs to use the
Adding and Removing Data
So you want to add or remove data from your tar file? Good news, you can!
However, it's not as easy as you may like it to be. In most cases it might be a lot easier to use the
--exclude flag and remake the file. However let's solve this problem one step at a time.
Did you know the
--exclude flag also works for extracting?
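A quick sketch of skipping data at extraction time. The site directory and its contents are invented for the example, and GNU tar's default pattern matching is assumed.

```shell
work=$(mktemp -d)
cd "$work"

# Hypothetical archive with something we do not want back.
mkdir -p site/cache restore
echo "keep" > site/index.html
echo "junk" > site/cache/tmp.bin
tar -cf site.tar site

# Extract everything except the cache directory and what is inside it.
tar -xf site.tar --exclude='site/cache' -C restore

ls -R restore
```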
First things first, let's get a list of all the files in the tar. Here are two examples of how to list and search things in a tar file.
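A sketch of both, using a made-up docs directory. The fallback to gzip exists only so this runs where zstd is not installed.

```shell
work=$(mktemp -d)
cd "$work"

# Hypothetical archive containing one file we will want to find later.
mkdir -p docs
echo "wanted" > docs/keep.txt
echo "mistake" > docs/oops.foo
ext=zst
command -v zstd >/dev/null 2>&1 || ext=gz
tar -acf "docs.tar.$ext" docs

# Example 1: list every member; -v adds sizes, owners and dates.
tar -atvf "docs.tar.$ext"

# Example 2: search the plain listing for a name.
tar -atf "docs.tar.$ext" | grep 'oops'
```

The path that grep prints is the member path exactly as tar stores it, which is the form you need when naming that member in later commands.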
Don't worry if you leave off the
-a as tar is pretty smart and will figure out that it's compressed. Just remember to use the main part of the command e.g.
Now that you know the path in the file to the data you're going to use the
--delete flag to remove the data or the
--append flag to append data.
Keep in mind, there is no way to append or delete from a compressed archive.
If you have a compressed tar you will need to decompress it, let's use
file.name.here.tar.zstd as an example. This works the same for anything the file was compressed with e.g.
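The whole round trip can be sketched like this, assuming GNU tar (for --delete). The starting archive is built on the spot with hypothetical contents, the compressor reads and writes through redirects so the file suffix does not matter, and gzip stands in if zstd is missing.

```shell
work=$(mktemp -d)
cd "$work"
echo "mistake" > oops.foo
echo "keep me" > keep.txt
echo "wanted" > yes_this.bar

# Compressor to use; gzip is only a stand-in when zstd is not installed.
comp=zstd
command -v zstd >/dev/null 2>&1 || comp=gzip

# Starting point: a compressed tar that wrongly contains oops.foo.
tar -cf - oops.foo keep.txt | "$comp" -c > file.name.here.tar.zstd

# 1. Decompress to a plain tar.
"$comp" -dc < file.name.here.tar.zstd > file.name.here.tar

# 2. Remove the unwanted member.
tar -f file.name.here.tar --delete oops.foo

# 3. Append the missing file.
tar -f file.name.here.tar --append yes_this.bar

# 4. Compress again and clean up the intermediate tar.
"$comp" -c < file.name.here.tar > file.name.here.tar.zstd
rm file.name.here.tar

# Check the result.
"$comp" -dc < file.name.here.tar.zstd | tar -tf -
```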
So what we did is decompress
file.name.here.tar.zstd and that became
file.name.here.tar. Now we removed
oops.foo and added
yes_this.bar to the tar file. Lastly we compressed
file.name.here.tar to create a shiny new
Backups with Tar
Tar is, or was, a backup tool first and foremost, so it goes without saying that you can do the normal differential and incremental backups with it. Differential backups are hard to automate, so let's just stick with incremental backups for now.
Give it the
When you want to create an incremental backup you will need to first create a full backup and a snapshot file.
This command will compress the backup using zstd, create the snar file, and make a full backup. Once that's done you can make as many backups as you want, and the snar will store lists of changes between backups.
Now to make our first incremental backup!
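Here is a sketch of the full cycle, assuming GNU tar, whose -g flag (long form --listed-incremental) maintains the snapshot file. The data directory, archive names, and data.snar are all hypothetical, and gzip is used only if zstd is not installed.

```shell
work=$(mktemp -d)
cd "$work"
mkdir -p data
echo "day one" > data/old.txt

ext=zst
command -v zstd >/dev/null 2>&1 || ext=gz

# Level 0: data.snar does not exist yet, so this is a full backup
# and tar creates the snapshot file as it goes.
tar -g data.snar -acf "full.tar.$ext" data

# Something changes after the full backup.
echo "day two" > data/new.txt

# Level 1: same snar file, new archive name. Only what changed since
# the snar was last updated is stored.
tar -g data.snar -acf "inc1.tar.$ext" data

# The incremental archive should list new.txt but not old.txt.
tar -tf "inc1.tar.$ext"
```

To restore, the GNU tar manual suggests extracting the full archive and then each incremental in order, passing -g /dev/null on every extract.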
Tar can exclude data a number of ways
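Two of those ways, sketched with an invented app directory: a pattern on the command line and a file of patterns. GNU tar also has ready-made switches such as --exclude-vcs and --exclude-caches, which are not shown here.

```shell
work=$(mktemp -d)
cd "$work"
mkdir -p app/logs app/src
echo "code" > app/src/main.c
echo "noise" > app/logs/run.log
echo "scratch" > app/src/main.c.swp

# 1. A pattern on the command line, placed before the paths it applies to.
tar --exclude='*.log' -cf no-logs.tar app

# 2. A file holding one pattern per line (skip.list is a made-up name).
printf '%s\n' '*.log' '*.swp' > skip.list
tar --exclude-from=skip.list -cf clean.tar app

tar -tf clean.tar
```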
File Streaming with Tar
Tar was created to stream data to tapes. You can do a lot of things that you wouldn't think you should be able to do with it. However thanks to the Unix way… EVERYTHING is a file.
Copying a Large Amount of Small Files
Let's say we have 10 million small files that we need to move from
/mnt/this/is/where/it/should/have/been. That is going to take a while, right? Well…
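The classic trick is to pipe one tar into another. This is a sketch using two temp directories as stand-ins for the real source and destination, and five files instead of 10 million.

```shell
src=$(mktemp -d)
dst=$(mktemp -d)

# A handful of small files standing in for the real data set.
for i in 1 2 3 4 5; do
  echo "file $i" > "$src/small_$i.txt"
done

# The first tar writes the archive to stdout (-f -) from inside $src;
# the second reads it from stdin and unpacks it inside $dst.
tar -cf - -C "$src" . | tar -xf - -C "$dst"

ls "$dst"
```

Nothing is ever written to disk as an archive; the data streams straight from one tar to the other.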
You can also do this over ssh! Now let's add some compression too!
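# Make a file on the other end
tar -cf - /var/lib/data | zstd | ssh user@hostname "cat >/var/backups/backup.tar.zst"

# Send the data compressed with a progress bar!
# GNU tar will not auto-decompress an archive read from a pipe, so name the format with --zstd
tar -cf - /var/lib/data | pv | zstd | ssh user@hostname tar --zstd -xf - -C /var/lib/data/

# Send the data with a progress bar
tar -cf - /var/lib/data | pv | ssh user@hostname tar -xf - -C /var/lib/data/

# Get the data
ssh user@hostname "tar -cf - /var/lib/data" | tar -xvf - -C /var/lib/data/

# Note: by default tar drops the leading / from member names, so these extract
# under /var/lib/data/var/lib/data/; use -C / to restore to the original location.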
The pipe view command
pv will give you different information depending on where it's put and what flags are used. The example above shows the raw data being passed. If you wanted to see the compressed data passed then you should put it after
The above examples show how to use tar with ssh. You can mix and match as needed. Your only limit is your creativity!
Oh and that being said,
rsync is almost always the better tool for sending data around. The edge cases are special files like symbolic links, special devices, sockets, named pipes, etc. If you're sending or backing up anything like that, or other data rsync is bad for, tar is your go-to.