this post was submitted on 16 Jul 2023
27 points (100.0% liked)

datahoarder


Has anyone used ArchiveBox for self-hosted web archiving? If so, what are your thoughts on it compared to the Internet Archive or other publicly available services?

ThorrJo@lemmy.sdf.org 2 points 1 year ago (1 child)

I have been experimenting with it. For what it is, it works pretty well ... for now. My concern is that it's a ton of moving parts basically duct-taped together by an abuse of the Django admin (Django is the web app platform it's based on, which I was a developer for long ago). Also, the search function is primitive at best. I don't think it's going to be my long-term solution for this need, but maybe I'm wrong.

oldfart@lemm.ee 1 point 1 year ago

The archived pages are available as files on disk. I also added a script which generates index.html so I can browse the archive without starting the program. Basically the only time I run ArchiveBox code is when adding a new site. And I never look at the GUI, it brings nothing to the table.
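
For anyone wanting to try the same approach, here is a rough sketch of what such an index generator could look like. It is not the script mentioned above, only an illustration. It assumes every snapshot lives in its own subfolder of one archive directory and that each subfolder has an index.html entry page; check your own data folder before relying on that. The names `build_index` and `write_index` are made up for this example.

```python
import html
from pathlib import Path
from urllib.parse import quote


def build_index(archive_root: Path, entry_name: str = "index.html") -> str:
    """Return an HTML page linking to every snapshot folder under archive_root.

    Assumption: each snapshot is a direct subfolder containing `entry_name`.
    """
    rows = []
    for snapshot in sorted(p for p in archive_root.iterdir() if p.is_dir()):
        if not (snapshot / entry_name).is_file():
            continue  # skip folders that have no browsable entry page
        href = f"{quote(snapshot.name)}/{quote(entry_name)}"
        label = html.escape(snapshot.name)
        rows.append(f'<li><a href="{href}">{label}</a></li>')
    body = "\n".join(rows)
    return (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8"><title>Archive index</title></head>\n'
        f"<body><h1>Archive index</h1>\n<ul>\n{body}\n</ul>\n</body></html>\n"
    )


def write_index(archive_root: Path) -> Path:
    """Write the generated page to archive_root/index.html and return its path."""
    out = archive_root / "index.html"
    out.write_text(build_index(archive_root), encoding="utf-8")
    return out
```

Calling `write_index(Path("/path/to/archive"))` (the path is a placeholder) after adding a new site would refresh the page. The links are relative, so the result opens straight from disk in a browser with no server running.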