Well, here is my story; it might be useful to others too.
I have a home server with a 6TB RAID1 (OS on a dedicated NVMe). I was playing with a BIOS update and adding more RAM, and out of the blue, after the last reboot, my RAID had somehow been shut down uncleanly and needed a fix. I probably unplugged the power cord too soon while the system was shutting down containers.
Well, no biggie, I'll just run fsck and mount it, so here goes: “mkfs.ext4 /dev/md0”
Then I hit “y” quickly when it said “the partition contains an ext4 signature blah blah”. I was in a hurry, so…
Guess what? Read that command again, carefully.
I hit Ctrl+C, but it was already too late. I could recover some of the files, but many were corrupted anyway.
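One way to make a slip like this harder is to refuse a bare “y” and require retyping the target device. Here is a minimal Python sketch of that idea. The helper names and the choice of which commands count as destructive are my own assumptions, and it only checks strings; it does not run anything.

```python
import shlex

def is_destructive(command: str) -> bool:
    """Return True if the program name looks like a formatting/wiping tool.

    Assumption: only mkfs* and wipefs are treated as destructive here.
    """
    argv = shlex.split(command)
    if not argv:
        return False
    program = argv[0].rsplit("/", 1)[-1]  # drop any directory part
    return program.startswith("mkfs") or program == "wipefs"

def confirmed(command: str, typed: str) -> bool:
    """Allow a destructive command only if the user retyped its last argument.

    In the simple case (e.g. "mkfs.ext4 /dev/md0") the last argument is the
    target device, so a quick "y" is never enough.
    """
    if not is_destructive(command):
        return True
    target = shlex.split(command)[-1]
    return typed.strip() == target
```

With this, `confirmed("mkfs.ext4 /dev/md0", "y")` is rejected, while typing `/dev/md0` in full is accepted; that extra second is usually enough to notice that the command is not fsck.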
Lucky for me, I was able to recover 85% of everything from my backups (restic + Backrest to the rescue!), recreate another 5% (mostly docker compose files located in odd folders that were not backed up), and recover the last 10% from the old 4TB disk I replaced some time ago to increase space. Luckily, that last part was old personal stuff that never changes, which I would have regretted losing but didn't consider critical enough to back up.
The cold shivers I had before I checked my restic backup and discovered that I hadn't actually postponed the backup of those additional folders…
Today I will add another layer of backup in the form of an external USB drive to store never-changing data like… My ISOs…
This was my backup strategy up to yesterday, with Backrest automating restic:
- 1 local backup of the important stuff (personal data mostly)
- 1 second copy of the important stuff on a USB drive connected to an OpenWrt router on the other side of the house
- 1 third copy of the important stuff on a remote VPS
And since this morning I have added:
- a few git repos (pushed and backed up with the important stuff) with all docker compose files, keys and such (the 5%)
- an additional local USB drive where I will back up ALL files, even the 10% that never changes and is not “important” but that I would miss if I lost it.
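Since the 5% I had to recreate lived in folders that no backup covered, a quick coverage check seems worth having. This is only a rough Python sketch under my own assumptions: the function name and all paths below are hypothetical, and it just compares path strings rather than asking restic what it actually contains.

```python
from pathlib import PurePosixPath

def uncovered(folders, backup_roots):
    """Return the folders that are not equal to, or inside, any backup root."""
    roots = [PurePosixPath(r) for r in backup_roots]
    missing = []
    for folder in folders:
        path = PurePosixPath(folder)
        covered = any(path == root or root in path.parents for root in roots)
        if not covered:
            missing.append(folder)
    return missing

# Hypothetical layout: only the photos are part of the backup plan.
folders = ["/srv/photos/2020", "/srv/compose/app", "/srv/isos"]
print(uncovered(folders, ["/srv/photos"]))
```

Anything this prints is a folder you would have to recreate by hand after a bad day.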
Tools like restic and Borg are so critical that you will regret not having had them sooner.
Set up your backups like yesterday. If you haven't already, do it now.
“There are two types of people: those who back up and those who haven’t lost data … yet.”
That’s really true. I was lucky enough to lose data, but be able to recover it. Very lucky.
And you find out you are not really backing up enough!
“a few git repos (pushed and backup in the important stuff) with all docker compose, keys and such (the 5%)”
Um, maybe I’m misunderstanding, but you’re storing keys in git repositories which are where…?
And remember, if you haven’t tested your backups then you don’t have backups!
All my git repos are on my server, not public. Then they are backed up with restic, encrypted.
Only the public keys are backed up though; for the private ones, I prefer having to regenerate them rather than have them stolen.
I mean, like when you add public keys in Forgejo for git push and such.
How do we test backups, honestly? I just open a folder or two on my backup drive. If it opens, I am OK. Is there any better way to do this?
Hahhh… well really, the only way to test backups is to try to restore from them. VMs are extremely helpful for this - you can try to restore a VM mirror of your production system, see if it works as expected, wipe it and start over if it doesn’t.
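For the file-level part of a restore test, one option is to restore into a scratch directory and compare checksums against the originals. A minimal Python sketch, assuming both trees are reachable as local directories (the function names are mine, not from any backup tool):

```python
import hashlib
from pathlib import Path

def file_digest(path: Path) -> str:
    """SHA-256 of a file, read in 1 MiB chunks to keep memory use flat."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def tree_digests(root: Path) -> dict:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        str(p.relative_to(root)): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }

def compare_restore(source: Path, restored: Path):
    """Return (missing, changed): files absent from, or different in, the restore."""
    src, dst = tree_digests(source), tree_digests(restored)
    missing = sorted(set(src) - set(dst))
    changed = sorted(n for n in src.keys() & dst.keys() if src[n] != dst[n])
    return missing, changed
```

Two empty lists mean the restored copy matches. Files modified after the backup ran will show up as changed, so this works best on data that does not move much.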
Makes sense. Thanks for the reply.
In 2010 I self hosted a Xen hypervisor and used it for everything I could. It was fun!
I had a drive failure in my main raid 5 array so bought a new disk. When it came to swap it out, I pulled the wrong disk.
It was hardware RAID using an Adaptec card, and the card threw the RAID out and refused to bring it together again. I couldn't afford recovery. I remember I just sat there, dumbfounded, in disbelief. I went through all the stages of grief.
I was in the middle of a migration from OVH to Hetzner and it occurred at a time where I had yet to reconfigure remote backups.
I lost all my photos of our first child. Luckily some of them were digitised from developed physical media which we still had. But a lot was lost.
This was my lesson.
I now have Veeam, Proxmox Backup Server, BackupPC and Borg. Backups are hosted in 2 online locations and on a separate physical server elsewhere in the house.
Backups are super, SUPER important. I’ve lost count how many times I’ve logged into backuppc to restore a file or folder because I did a silly. And it’s always reassuring how easy it is to restore.
I feel you man! Lessons are really learnt only the hard way!
I have all my photos on my NAS, with backups to OVH storage for recent data and AWS Glacier for folders of pictures older than two years. I can afford to lose all my ISOs but not all the pictures of my child. I have never tried to recover data from AWS Glacier, but I trust them.
How much storage are you using and how much does it cost per month/year?
For 280GB on Glacier: around 1 USD each month. For 400GB of hot storage on OVH public cloud: around 5 EUR per month.
In my process I have to sort pictures and videos before sending them to cold storage, because I don't want to cold-store all the failed footage, so obviously there is some delay here. That is something I usually do during the long winter evenings 😊
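To put those figures on a per-GB basis, here is the arithmetic as a small Python sketch. It only uses the approximate numbers quoted above; the two prices are in different currencies, and storage is the only cost considered (retrieving data from Glacier is billed separately).

```python
def per_gb(monthly_cost: float, size_gb: float) -> float:
    """Monthly cost divided by stored size, i.e. cost per GB per month."""
    return monthly_cost / size_gb

glacier = per_gb(1.0, 280)  # USD per GB per month, from "around 1 USD" for 280GB
ovh = per_gb(5.0, 400)      # EUR per GB per month, from "around 5 EUR" for 400GB

print(f"Glacier: ~{glacier:.4f} USD/GB/month, ~{1.0 * 12:.0f} USD/year")
print(f"OVH:     ~{ovh:.4f} EUR/GB/month, ~{5.0 * 12:.0f} EUR/year")
```

That works out to roughly 0.0036 USD per GB per month for the cold tier and 0.0125 EUR per GB per month for the hot tier, or about 12 USD and 60 EUR per year in total.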
Plus, even if you manage to never, ever have a drive fail, accidentally delete something that you wanted to keep, inadvertently screw up a filesystem, crash into a corruption bug, have malware destroy stuff, make an error in writing a script that causes it to wipe data, realize that an old version of something you overwrote was still something you wanted, or run into any of the other ways in which you could lose data…
You gain the peace of mind of knowing that your data isn’t a single point of failure away from being gone. I remember some pucker-inducing moments before I ran backups. Even aside from not losing data on a number of occasions, I could sleep a lot more comfortably on the times that weren’t those occasions.
“Tools like restic and Borg and so critical that you will regret not having had them sooner.”
100000%
I just experienced this when a little mini PC I bought <2y ago cooked its nvme and died this month. Guess who has been meaning to set up backups on that guy for months?
Unfortunately, that nvme is D. E. D. And even more unfortunately, that had a few critical systems for the local network (like my network controller/DHCP). Thankfully it was mostly docker containers so the services came up pretty easy, but I lost my DBs so configs and data need to be replicated :(
The first task on the new box was figuring out and automating Borg so this doesn't happen again. I also set up backups via my new Proxmox server, so my VMs won't have that problem either.
Now to do the whole ‘actually testing the backups’ thing.
I think these kinds of situations are where ZFS snapshots shine: you're back in a matter of seconds with no data loss (assuming you have a recent snapshot from before the mistake).
Edit: yeah no, if you operate at the disk level directly, no local ZFS snapshot could save you…
How would ZFS snapshots help in a situation like this, where you have accidentally formatted your drive?
I’m not sure, I read that ZFS can help in the case of ransomware, so I assumed it would extend to accidental formatting but maybe there’s a key difference.
ZFS is fantastic and it can indeed restore files that have been encrypted as long as you have an earlier snapshot.
However, it would not have helped in this scenario. In fact, it might have actually made recovery efforts much more difficult.
It could have helped by automatically sending incremental snapshots to a secondary drive, which you could then have restored the original drive from. However, this would have required the foresight to set that up in the first place. This process also would not have been quick; you would need to copy all of the data back just like any other complete drive restoration.
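For anyone wondering what that setup roughly involves, here is a Python sketch that only builds the command lines for one incremental round; it does not run them. The dataset, snapshot and target names are hypothetical, and a real setup also needs an initial full send plus scheduling and pruning of old snapshots.

```python
def incremental_send_commands(dataset, prev_snap, new_snap, target):
    """Build argv lists for one incremental replication step.

    The output of the "send" command is meant to be piped into "receive".
    """
    snapshot = ["zfs", "snapshot", f"{dataset}@{new_snap}"]
    send = ["zfs", "send", "-i", f"{dataset}@{prev_snap}", f"{dataset}@{new_snap}"]
    receive = ["zfs", "receive", target]
    return snapshot, send, receive

# Hypothetical names: "tank/data" is the live dataset, "backup/data" is on the second drive.
for argv in incremental_send_commands("tank/data", "monday", "tuesday", "backup/data"):
    print(" ".join(argv))
```

The copy on the second drive is what survives a formatted source pool; the local snapshots on the source do not.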
Tell me, how would that have helped at all? Can ZFS unformat a drive? I don't think so…
ZFS is not a backup, guys. Snapshots are not backups either!