I’m particularly interested in low-bandwidth solutions. My connection to the internet is pretty rough: 20 Mbps down and 1 Mbps up, with no option to upgrade.
That said, this isn’t limited to low-bandwidth solutions.
I’m planning on redoing my entire setup soon to run on Kubernetes, followed by expanding the scope of what my server does (currently Plex, an SFTP server and local client backups). Before I do that I need a proper offsite backup solution.
Sorry, not a direct answer to your question, but still something you may consider: I don’t, my backups never leave my home network. I have redundancy on several machines, and my “the house is on fire” solution is to have one of those backups on an SD card that I keep in a 3D-printed amulet I made and wear around my neck. When people ask me what it is, I tell them it’s an amulet that protects me against memory loss, it’s always a good laugh. :) And if that burns in the fire, well, I probably don’t need backups anymore anyway.
I wish I had data that could fit on an SD card - I too don’t have off-site backups mostly due to expense. I have one other friend that is into homelabbing but for us to each backup on each other’s hardware would be ~$2k/each. Probably more on his end because I believe he’s using a consumer NAS without the room for additional expansion whereas I have a 25 bay commercial setup that’s only 1/5 populated at the moment
Tbh my current plan was to just put the data on a hard drive and post it to my parents once a week/month.
Saving to an SD card definitely seems kind of sketchy though. They are notoriously unreliable.
The trick with SD cards is to not buy the cheap ones, they are horribly fragile (similarly faulty cheap USB sticks are more and more common, sadly). I’ve been using a 512 GB SanDisk Ultra microSDXC for 2 years, and it’s still rocking. It would not be a problem if it were to fail, anyway: I have several other backup storages, and I update that one daily, so I’ll know immediately when it fails, it’s not like I’ll only realize it the day I need it. On the plus side, it’s small enough to fit in a small 3D-printed object, so that I can both keep it on myself and keep it hidden, just in case (it’s fully encrypted anyway, but still, better safe than sorry).
Might not fit into your plans, but if you run Proxmox you can easily back up to an offsite computer (or VM) running Proxmox Backup Server (PBS).
From their website:
By supporting incremental, fully deduplicated backups, Proxmox Backup Server significantly reduces network load and saves valuable storage space. With strong encryption and methods of ensuring data integrity, you can feel safe when backing up data, even to targets which are not fully trusted.
deleted by creator
So you self-host PBS on the same server as your Proxmox VE, and just use Backblaze as the storage for PBS?
deleted by creator
That’s awesome. Do you care to share said hacks to run PBS on ARM? 😊
I’m currently running PBS in a VM on PVE with a dedicated hard drive passed through, but I’m looking to change to something more robust.
deleted by creator
My setup is running in k8s.
I have autorestic running in a container in a pod. The container mounts the pod’s volumes that I want backed up, and the backup runs every 6 hours (easily changed via a cron expression).
The autorestic config describes the backends (servers) that it should back up to. Currently it backs everything up to 3 servers.
I have also added extra functionality to autorestic so that it creates a dump of a database before the backup runs.
I use Proxmox Backup Server to back up my VMs.
For Kubernetes PVCs I save a snapshot of the underlying block device (Ceph RBD) to a borg repo on my shared filesystem (CephFS), which is backed up to Backblaze.
For Kubernetes you can use Velero. I tried it, but I didn’t like it (overly complex for my use case), so I wrote my own tool.
Essentially the strategy for me is fairly straightforward, but it depends on the data you have.
I have mostly 2 types:
- manifests and configuration. This I have all in git (as I am using flux).
- persistent volumes. I use OpenEBS, but for a low-resource cluster I use host volumes only. For these I have written a tool that simply runs as a DaemonSet with the whole root of the host mounted read-only and the DAC_READ_SEARCH capability, queries the API for volumes, and backs up each whole PV to Backblaze using restic. Incidentally, this is also how I do all my other backups outside K8s (i.e. borg or restic to B2).
I chose b2 mostly for the price, but any s3 will do. Since all I am uploading there is encrypted anyway, I don’t need to worry about the privacy implication of having a third party potentially having access to my data.
Multiple ~2 TB seedboxes. They can get shut down at random, but I haven’t had that happen to me.
Cheap storage
I use the AWS CLI to sync the data to S3 on an hourly scheduled task, then I lifecycle it down to Glacier Instant Retrieval after a day. This winds up being relatively cheap and only uploads changed data, which keeps bandwidth utilization low.
I originally did this as a really cheap Dropbox alternative, but it works pretty well for backup files too.
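For reference, the "lifecycle it down after a day" part can be written as a bucket lifecycle rule. This is only a minimal sketch: it assumes the target storage class is S3 Glacier Instant Retrieval (`GLACIER_IR`) and that the rule covers the whole bucket. The rule ID and the helper name `lifecycle_config` are made up for illustration, they are not taken from the setup described above.

```python
import json


def lifecycle_config(days: int = 1, storage_class: str = "GLACIER_IR", prefix: str = "") -> dict:
    """Build an S3 lifecycle configuration with a single transition rule.

    The rule ID is a hypothetical name; an empty prefix matches every object.
    """
    return {
        "Rules": [
            {
                "ID": "archive-after-upload",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [{"Days": days, "StorageClass": storage_class}],
            }
        ]
    }


if __name__ == "__main__":
    # Print the JSON so it can be saved to a file and handed to the AWS CLI.
    print(json.dumps(lifecycle_config(), indent=2))
```

The printed JSON has the shape that `aws s3api put-bucket-lifecycle-configuration --lifecycle-configuration file://lifecycle.json` expects. The hourly upload itself would stay a plain `aws s3 sync`.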
I am a fan of Restic, and more specifically Autorestic, a wrapper that lets you easily configure restic using YAML files. Since all of my services are in Docker containers, I just have a hook that shuts down all my containers, does the backup, and then starts them again. Downtime is not an issue since it runs while I’m asleep. I have it back up to Backblaze B2, where I think you get 10 GB free, which is plenty for me right now.
I don’t back up anything I can rebuild. I have multiple half-assed methods in use together for the rest of it:
- Backups daily of homedirs on desktops and laptops using Borg and Vorta to external usb drives. These devices get rotated out annually. I used to run 2-disk RAID1 and when I rotated the disks out, split them and sent them to family but now I’m taking my chances on having them local and putting them in a fireproof box.
- Code repos are synced to github or srht.
- Monthly backups of homedirs are sent via borg to rsync.net.
- Desktop and laptop homedirs get periodic (roughly monthly) burns to Dual-Layer BDRs which I put in the fireproof box and sometimes hand off to family.
Not my solution, but I liked the idea and am thinking of using it too: copy backups to an external HDD and keep it in your car trunk. Maybe have two drives in rotation.
It eliminates the need to drive somewhere for rotation, and the cost of renting a safe deposit box.
It doesn’t protect against a serious disaster like a forest fire, earthquake or nuclear war, but I keep the most important data in the cloud, and if my house and car both burn I’ll have bigger problems than some homelab snapshots.
Very neat idea, but I’d explicitly add strong encryption to that method, cars do get broken into.
I’d encrypt every off-site backup, but a car is a bit more exposed than a rented safe box.
Actually not a bad idea. I live in a flat, so my car is parked in a car park about 200 m away from my property. If my entire town goes up in smoke then I imagine that losing data would be the least of my problems.
I do it 3 ways.
1. Critical stuff (photos, documents etc) is synced in realtime to Backblaze. Low RPO. Low RTO.
2. Critical stuff is also backed up to a secondary NAS 2x per day for versioned backups. And that data is synced nightly to Backblaze. Higher RPO but also higher RTO.
3. All data from secondary NAS/Backup NAS is backed up nightly to 1 of 3 large external hard drives that are rotated monthly. Each disk holds ~30 days of backup archives from #2.
   - Most recently pulled disk is stored off site. Oldest disk is brought back on site but stored in a UL rated fire safe.
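The three-disk rotation can be sketched as a simple modulo schedule. This is only an illustration of the scheme as described: the disk labels and the function name `disk_roles` are made up, and it assumes exactly one swap per month.

```python
def disk_roles(month_index: int, disks=("A", "B", "C")) -> dict:
    """Return which of the three disks plays which role in a given month.

    month_index counts swaps since the rotation started (0, 1, 2, ...).
    The disk pulled last month goes off site; the one pulled the month
    before sits in the fire safe and is next in line to be used.
    """
    if len(disks) != 3:
        raise ValueError("this sketch assumes exactly three disks")
    return {
        "in_use": disks[month_index % 3],
        "off_site": disks[(month_index - 1) % 3],
        "fire_safe": disks[(month_index - 2) % 3],
    }


if __name__ == "__main__":
    for month in range(4):
        print(month, disk_roles(month))
```

With this layout the disk coming out of the fire safe is always the one that goes into service next, so every disk cycles through all three roles.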
I primarily rely on ZFS for file system replication.
I have a primary and a redundant NAS on site, then a single node offsite connected via VPN.
On my list of things to tinker with is zettarepl.
Any backup software that supports incremental backups should behave similarly bandwidth-wise. I like Restic. You can even do incremental backups with plain rsync, if you want. If your data does not change much, then you should be okay. For the initial backup run it would help to have physical access to the remote location, so you can bring a full backup there without having to push it through your slow uplink.
Definitely an option if I’m a bit more selective with what I back up. At the moment, for the client backups, I’m zipping and encrypting the entire home folder for each client once a week. I could probably write something that looks for file changes and uploads just those.
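A rough sketch of what "looks for file changes" could mean: hash every file, keep the result as a manifest, and compare it with the manifest from the previous run. The function names are made up for illustration, and the upload step is deliberately left out.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file, read in chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file path (relative to root, with / separators) to its hash."""
    return {
        path.relative_to(root).as_posix(): file_digest(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """Paths that are new or whose contents differ from the previous manifest."""
    return sorted(path for path, digest in new.items() if old.get(path) != digest)
```

Only the paths returned by `changed_files` would need to be encrypted and uploaded. Deleted files are not tracked here, and hashing everything still reads the whole home folder from disk on each run.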
rsync is your friend. No need to write something that already exists! A simple “rsync -aP /directory/folder/ /backup/solutionFolder” is all it really needs. The trailing / on the source path tells rsync to copy the contents of that folder into the destination folder, rather than the folder itself. On later runs it only transfers files that have changed.
EDIT: Sorry, I misunderstood this question ~~ I have a raspberry pi connected to a 1 TB SSD. This has the following cron job:
00 8 * * * /usr/bin/bash /home/user/backup/backup.sh
And the command in backup.sh is:
rsync --bwlimit=3200 -avHe ssh user-ip:/var/www/mander/volumes /home/user/backup/$(date | awk '{print $3"-" $2 "-"$6}')
In my case, my home network has a download speed of 1 Gbps, and the server has an upload speed of 50 Mbps, so I use --bwlimit=3200 to limit the download to roughly 25.6 Mbps and avoid overloading my server’s bandwidth.
So every morning at 8 am the command is run and a full backup copy is created.
It seems that you have a different problem than me. In your case, rather than doing a full copy like me, you can do incremental backups. The incremental backup is done by using rsync to synchronize the same folder every time. So, instead of the variable folder name $(date | awk '{print $3"-" $2 "-"$6}'), you can simply use a fixed name such as instance_backup. You can copy the folder locally after synchronizing if you would like to keep a record of backups over a period of a few days.
On a second thought, I would also benefit from doing incremental backups and making the copies locally after synchronizing… ~~
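One way to get both dated copies and incremental transfers is rsync's `--link-dest` option, which hard-links unchanged files against a previous snapshot instead of fetching them again. The sketch below only builds the argument list and does not run anything. The function name, the ISO date folder naming, and the example paths are assumptions for illustration, not the script from the comment above.

```python
from datetime import date
from pathlib import PurePosixPath
from typing import Optional


def rsync_snapshot_cmd(
    source: str,
    backup_root: str,
    today: date,
    previous: Optional[str] = None,
    bwlimit_kib: int = 3200,
) -> list[str]:
    """Build an rsync command that writes a dated snapshot folder.

    If `previous` (an absolute path to the last snapshot) is given, unchanged
    files are hard-linked from it rather than transferred again.
    """
    dest = PurePosixPath(backup_root) / today.isoformat()
    cmd = ["rsync", f"--bwlimit={bwlimit_kib}", "-avH", "-e", "ssh"]
    if previous is not None:
        cmd.append(f"--link-dest={previous}")
    # Trailing slashes: copy the contents of source into the snapshot folder.
    cmd += [source.rstrip("/") + "/", f"{dest}/"]
    return cmd


if __name__ == "__main__":
    print(" ".join(rsync_snapshot_cmd(
        "user-ip:/var/www/mander/volumes",
        "/home/user/backup",
        date.today(),
        previous="/home/user/backup/latest",  # hypothetical path to the last snapshot
    )))
```

The resulting list can be passed to `subprocess.run` from a cron-driven script. Each dated folder then looks like a full copy, but only changed files cross the network or take up extra disk space.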
Remote backups might be rough with that upload speed. For example, you will be looking at over 2 hours per GiB uploaded.
I personally have a 3-node setup using Kubernetes and I run Longhorn for volume management. I do hourly snapshots, and then daily backups of all volumes to an additional drive on one of my 3 nodes, served by a simple NFS server which is also running in Kubernetes. In Longhorn I keep 2 replicas of every volume as well, so losing one doesn’t hurt anything.
I would imagine it would be pretty easy in this case to replace my local NFS with AWS storage and then I would have remote backups, but since I back up roughly 100 GiB per day that would be a little time consuming. At my 50 Mbps that’s about 4.5 hours, though remote backups could be done less often as a last resort backup.
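For anyone who wants to check these numbers, here is a back-of-the-envelope calculation. It assumes the link is fully saturated and ignores protocol overhead, so real transfers will be somewhat slower. `transfer_hours` is just an illustrative helper name.

```python
def transfer_hours(size_gib: float, link_mbps: float) -> float:
    """Hours needed to move size_gib gibibytes over a link_mbps megabit/s link."""
    bits = size_gib * 1024**3 * 8          # GiB -> bytes -> bits
    seconds = bits / (link_mbps * 1_000_000)
    return seconds / 3600


if __name__ == "__main__":
    print(f"1 GiB at 1 Mbps:    {transfer_hours(1, 1):.2f} h")
    print(f"100 GiB at 50 Mbps: {transfer_hours(100, 50):.2f} h")
```

That gives about 2.4 hours per GiB on a 1 Mbps uplink, and about 4.8 hours for 100 GiB at 50 Mbps (closer to 4.4 hours if the 100 is counted in decimal GB).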
Yeah, it is pretty rough, although the files don’t necessarily change all that much. If I can set up a backup somewhere, prepopulate it with my data as it stands now, and then keep it updated incrementally with nightly jobs, I’m hoping each run will mostly be done by the morning.
My backup backup plan would be to buy a couple high capacity solid state disks and either take them myself or mail them to my parents once a week. The mailman has pretty high bandwidth, even if the latency is rather rough