datahoarder

Who are we?

We are digital librarians. Among us are represented the various reasons to keep data -- legal requirements, competitive requirements, uncertainty of permanence of cloud services, distaste for transmitting your data externally (e.g. government or corporate espionage), cultural and familial archivists, internet collapse preppers, and people who do it themselves so they're sure it's done right. Everyone has their reasons for curating the data they have decided to keep (either forever or For A Damn Long Time). Along the way we have sought out like-minded individuals to exchange strategies, war stories, and cautionary tales of failures.

We are one. We are legion. And we're trying really hard not to forget.

-- 5-4-3-2-1-bang from this thread

founded 4 years ago

MODERATORS

archivist@lemmy.ml

Overwhelmed with backups and I don't know how to improve my backup strategy. (beehaw.org)

submitted 1 year ago* (last edited 1 year ago) by Nyla_Smokeyface@beehaw.org to c/datahoarder@lemmy.ml

I should really start doing regular backups again, follow the 3-2-1 backup strategy again, and organize my backups, but there are so many files on my devices and external drives that it's overwhelming and I don't know how to properly sort them. ADHD kicks my ass too, and I know a backup takes a while, which means my computer needs to sit idle so the backup finishes faster and doesn't slow down whatever else I'm doing.

Also, I'm struggling with cloud backups too. Google Drive and OneDrive often had errors in the middle of uploading and I'm hesitant to spend money on a cloud service. And doesn't Backblaze need you to have your device connected at least once every 30 days or your data will be deleted or something? I have ADHD and I can't guarantee I'll be on top of that. And do I upload image backups to these servers? Is that even possible? What about video game backups and other large files? Or all the videos I have? Aren't these cloud services really slow as well? I feel like I'll end up having the same problem...I could use zip files, but I keep worrying something will get removed in the middle of it.

And how do I check backups? I can't reasonably check every single file I've ever made.

I don't know how to handle all of this. I mainly use FreeFileSync to copy drives over, with anything that gets overwritten moved to a Revisions folder. I also sometimes use Macrium Reflect for image backups, but the free version is being retired... I tried Veeam once, but it didn't back up the AppData folder when I tried a file-level backup about a year ago.

I have a Mac and a Windows computer by the way. I do want to check out Linux someday though.

I don't know if I can do a NAS either. I don't think I have an extra computer lying around, and I'm a college student who needs to travel between home and my college campus. I also don't have a lot of room on my bedroom desk... And I hate how backups often prevent me from using one of my computers.

top 4 comments

[–] Sikeen@lemmy.ml 6 points 1 year ago

I couldn't speak to how you might do it on Windows or Mac, but as you mentioned Linux... most popular Linux backup applications have features such as backup on shutdown or backup on startup, which might help as you don't have to think about it (at least with the backup at shutdown).

But just a word of caution as you're still in school: DON'T use a distribution (version of Linux) that is called bleeding edge or unstable; use a "stable" distribution. The bleeding edge is fun, but it's more trouble than it's worth when you don't have a lot of time.

[–] ChojinDSL@discuss.tchncs.de 3 points 1 year ago* (last edited 1 year ago)

How much data in total do you need to back up?

Secondly, backups need to be set up to happen automatically, otherwise you will forget about them.

In terms of a NAS, you could just get a Raspberry Pi and attach a big portable drive via USB. It should be small enough to carry around, plus you can set it up to do automatic backups to the cloud.

In fact, you can even set up different backup scripts that run whenever you plug a particular drive into the Raspberry Pi. Each drive's filesystem has a unique ID (UUID), so you can easily create a script that reacts when that drive is plugged in and runs a different script when a different drive is plugged in. E.g., let's say you have one drive you always want to sync with the cloud, and another that you just want to sync with a drive on your NAS.
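As a rough sketch of the per-drive dispatch described above (not a finished setup): the snippet below maps a filesystem UUID to the command that should run for that drive. The UUIDs, script paths, and function names are hypothetical placeholders, and it assumes something else (for example a udev rule on Linux) detects the plug-in event and passes the UUID along.

```python
import subprocess
from typing import Callable, Dict, List, Optional

# Hypothetical mapping: filesystem UUID -> backup command for that drive.
# Both the UUIDs and the script paths are made-up placeholders.
BACKUP_JOBS: Dict[str, List[str]] = {
    "1111-AAAA": ["/home/pi/bin/sync-to-cloud.sh"],
    "2222-BBBB": ["/home/pi/bin/sync-to-nas-drive.sh"],
}


def job_for_drive(fs_uuid: str) -> Optional[List[str]]:
    """Return the command configured for this drive, or None if unknown."""
    return BACKUP_JOBS.get(fs_uuid)


def handle_plug_event(fs_uuid: str, run: Callable = subprocess.run) -> bool:
    """Run the backup job for the plugged-in drive.

    Returns True if a job was found and started, False for unknown drives.
    `run` is injectable so the dispatch logic can be tested without
    actually executing any script.
    """
    cmd = job_for_drive(fs_uuid)
    if cmd is None:
        return False
    run(cmd, check=True)  # raises CalledProcessError if the script fails
    return True
```

Whatever detects the drive would call `handle_plug_event("<uuid>")`; unknown drives are simply ignored, so plugging in a random USB stick does nothing.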

Lastly, you should always consider how important the stuff you are backing up really is. If it's easier to re-create or re-download from other sources than to spend the time and effort on backup and restore, then that stuff doesn't necessarily need to be backed up.

[–] alpenb@discuss.tchncs.de 1 point 1 year ago* (last edited 1 year ago)

"I have a Mac and a Windows computer by the way. I do want to check out Linux someday though."

Either way, start using rclone. There is a bit of a learning curve, but it makes life so much easier.

Also, it depends on the total size of your data. In addition to local backups, I would recommend, let's say, 3 x 1 TB accounts: one with Google, one with B2, and one with MS. That way you are never locked out.
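A minimal sketch of what pushing one folder to several cloud accounts with rclone could look like when driven from a script. The remote names used in the usage note are hypothetical and would first have to be created with `rclone config`; `rclone sync` and its `--dry-run` flag are real, but note that `sync` makes the destination match the source, including deleting files on the destination that no longer exist locally, so this sketch defaults to a dry run.

```python
import subprocess
from typing import Callable, Iterable, List


def rclone_sync_command(source: str, remote: str, dry_run: bool = True) -> List[str]:
    """Build the argument list for `rclone sync SOURCE REMOTE`."""
    cmd = ["rclone", "sync", source, remote]
    if dry_run:
        # Only report what would change; nothing is uploaded or deleted.
        cmd.append("--dry-run")
    return cmd


def sync_to_all(
    source: str,
    remotes: Iterable[str],
    dry_run: bool = True,
    run: Callable = subprocess.run,
) -> None:
    """Sync one local folder to every remote in turn.

    `run` is injectable so the loop can be exercised without rclone installed.
    """
    for remote in remotes:
        run(rclone_sync_command(source, remote, dry_run), check=True)
```

For example, `sync_to_all("/home/me/Documents", ["gdrive:backup", "b2:my-bucket/backup", "onedrive:backup"])` would do a dry run against three (hypothetical) remotes; pass `dry_run=False` once the output looks right.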

[–] chkno@lemmy.ml 1 point 1 year ago* (last edited 1 year ago)

There are so many ways to handle backups, so many tools, etc. You'll find something that works for you.

In the spirit of sharing a neat tool that works well for me, addressing many of the concerns you raised, in case it might work for you too: Maybe check out git annex. Especially if you already know git, and maybe even if you don't yet.

I have one huge git repository that in spirit holds all my stuff. All my storage devices have a check-out of this git repo. So all my storage devices know about all my files, but only contain some of them (files not present show up as dangling symlinks). git annex tracks which drives have which data and enforces policies like "all data must live on at least two drives" and "this more-important data must live on at least three drives" by refusing to delete copies unless it can verify that enough other copies exist elsewhere.
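To make the "refuse to delete unless enough other copies exist" rule concrete, here is a toy model of the idea. It is not how git annex is implemented, and the function name, data layout, and drive names are all made up for illustration.

```python
from typing import Dict, Set


def can_drop(file_key: str, drive: str, locations: Dict[str, Set[str]],
             min_copies: int = 2) -> bool:
    """Decide whether `drive` may delete its copy of `file_key`.

    `locations` maps each file to the set of drives known to hold it.
    Dropping is allowed only if at least `min_copies` copies would
    remain on *other* drives afterwards.
    """
    holders = locations.get(file_key, set())
    if drive not in holders:
        return True  # this drive has no copy, so there is nothing to lose
    remaining = holders - {drive}
    return len(remaining) >= min_copies


# Hypothetical example data: which drives hold which files.
locations = {
    "photos/2019.tar": {"laptop", "usb-a", "usb-b"},
    "thesis.pdf": {"laptop", "usb-a"},
}
```

With this data, the laptop may drop `photos/2019.tar` (two other copies remain) but not `thesis.pdf` (only one other copy would remain). The real tool additionally verifies that the other copies are actually reachable before deleting, which this sketch does not model.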

- I can always see everything I'm tracking -- the filenames and directory structure for everything are on every drive.
- I don't have to keep track of where things live. When I want to find something, I can just ask which drives it's on.
  - (I also have one machine with a bunch of drives in it which I union-mount together, then NFS mount from other machines, as a way to care even less where individual files live)
- Running git annex fsck on a drive will verify that
  - All the content that's supposed to live on that drive is in fact present and has the correct sha256 checksum, and
  - All policies are satisfied -- all files have enough copies.
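The same "is everything present and does the checksum match" idea can be sketched without git annex, which also speaks to the question of how to check backups without opening every file. The following is a minimal standalone illustration using only the Python standard library; the function names and manifest format are assumptions for this example and have nothing to do with git annex's internals.

```python
import hashlib
from pathlib import Path
from typing import Dict


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files don't need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: str) -> Dict[str, str]:
    """Map each file's path (relative to root) to its sha256 checksum."""
    base = Path(root)
    return {
        str(p.relative_to(base)): sha256_of(p)
        for p in sorted(base.rglob("*"))
        if p.is_file()
    }


def verify(root: str, manifest: Dict[str, str]) -> Dict[str, str]:
    """Check a backup copy against a manifest.

    Returns a dict of problems: relative path -> "missing" or
    "checksum mismatch". An empty dict means every listed file is
    present and intact.
    """
    base = Path(root)
    problems: Dict[str, str] = {}
    for rel, expected in manifest.items():
        candidate = base / rel
        if not candidate.is_file():
            problems[rel] = "missing"
        elif sha256_of(candidate) != expected:
            problems[rel] = "checksum mismatch"
    return problems
```

One way to use it: build the manifest from the original folder, then run `verify` against the backup drive's copy of that folder; anything it reports is a file worth re-copying.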
