De-duplication, remote backup, Perl and SSH

I got sick of trying to figure out how to use Subversion, so I decided to use my new DigitalOcean droplet as a file server for backing up my local files. I was originally going to run an HTTP server but decided it would be easier (if slower) to send files over SSH using scp or an SSH pipe.

It was quite a bit more difficult than I thought it would be, and it runs much slower on my Pinebook than I would have liked.

I chose to use the SHA256 hexadecimal digest of each file as its name on the remote server. That sidesteps the usual file-naming problems (long file names, weird characters, spaces!). It also de-duplicates the files on my computer: identical files have the same digest, so they are stored and uploaded only once, which cuts down on both remote storage and bandwidth.
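
To make the idea concrete, here is a minimal sketch in Python (not the Perl that hashtree.pl is written in, and not taken from the script). It hashes every file under a directory and groups paths by digest, so each group would need only one remote copy. The function names `sha256_hex` and `group_by_digest` are made up for this illustration.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_hex(path, chunk_size=1 << 20):
    """Return the SHA256 hexadecimal digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_by_digest(root):
    """Map each digest to the list of files under root that have that content."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            groups[sha256_hex(path)].append(str(path))
    return dict(groups)
```

Each key of the returned dictionary is a safe remote file name (64 hexadecimal characters), and any key with more than one path is a set of duplicates.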

De-duplication is a core feature. Computing the SHA256 checksum takes time, but a changed file gets a new digest, which lets me detect changes to files and upload them automatically.
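
Change detection of this kind can be sketched as a comparison between two path-to-digest maps, one from the previous run and one from the current run. This is an assumed design written in Python for illustration, not necessarily how hashtree.pl does it; `plan_uploads` and both dictionary arguments are hypothetical.

```python
def plan_uploads(current, previous):
    """Compare two {path: digest} maps.

    Returns (changed, to_upload): the paths that are new or modified, and the
    sorted digests that were not present in the previous run at all.
    """
    already_stored = set(previous.values())
    # A path counts as changed if it is new or its digest differs from last time.
    changed = {
        path: digest
        for path, digest in current.items()
        if previous.get(path) != digest
    }
    # Content that already exists under another name needs no upload.
    to_upload = sorted(
        {digest for digest in changed.values() if digest not in already_stored}
    )
    return changed, to_upload
```

A renamed or copied file shows up in `changed` but not in `to_upload`, because its content is already stored under the same digest.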

The script is on GitHub:
https://github.com/wilyarti/hashtree/blob/master/hashtree.pl

You will need an unprivileged remote user that accepts private-key-based login. It's called hashtree because it takes a hash of the file tree.
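
As a rough illustration of why key-based login matters, the Python sketch below builds a non-interactive scp command that stores a file under its digest. It is not code from hashtree.pl; the function names, the remote account, the remote directory and the key path are all placeholders invented for the example.

```python
import os
import subprocess


def build_scp_command(local_path, digest, remote, remote_dir, key_file):
    """Build an scp argument list that uploads local_path as remote_dir/digest."""
    return [
        "scp",
        "-q",            # quiet: no progress meter
        "-B",            # batch mode: never prompt for a password
        "-i", key_file,  # private key used to log in
        # An absolute path keeps scp from reading a ':' in the name as a host.
        os.path.abspath(local_path),
        f"{remote}:{remote_dir}/{digest}",
    ]


def upload(local_path, digest, remote, remote_dir, key_file):
    """Run the upload and raise CalledProcessError if scp fails."""
    command = build_scp_command(local_path, digest, remote, remote_dir, key_file)
    subprocess.run(command, check=True)
```

A call might look like `upload("notes.txt", digest, "backup@example.com", "store", "/home/me/.ssh/id_ed25519")`, where every argument is a placeholder. Because the remote name is only hexadecimal characters, no quoting of odd local file names is needed on the server side.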
