When it comes to their stuff, people often have a hard time letting go. When the object of their obsession is rooms full of old clothes or newspapers, it can be unhealthy, even dangerous. But what about a stash that fits on ten 5-inch hard drives?
Online, you’ll find people who use hashtags like “#digitalhoarder” and hang out in the 120,000-subscriber Reddit forum called /r/datahoarder, where they trade tips on building home data servers, share collections of rare files from video game manuals to ambient audio records, and discuss the best cloud services for backing up files.
Stereotypical hoarders, who let heaps of physical items of questionable utility dominate their homes and lives, often suffer social stigma and anxiety as a result. By contrast, many self-proclaimed digital hoarders say they enjoy their collections, can keep them contained in a relatively small amount of physical space, and often take pleasure in sharing them with other hobbyists or anyone who wants access to the same public data.
“Data hoarder means to me simply someone who collects and curates digital data,” said the user -Archivist, one of the moderators of /r/datahoarder, in a private message on Reddit. “It’s a little removed from the disorder we usually see from traditional hoarders.”
He and many of his fellow subreddit users also take pride in keeping their data well organized into folders and subfolders. Some even take pains to keep the forum itself from getting bogged down with dubious material: One of the most popular recent threads begs users to stop spamming the subreddit with photos of their hard drives.
![Screenshot](https://cdn.statically.io/img/gizmodo.com/app/uploads/2019/03/yx9w9qejv9dc7eqwf3mv.jpg)
“Data hoarding isn’t about just buying $3,000 worth of hard drives just for posting them here,” wrote user Nooco24, one of the site’s moderators. “What’s interesting is what you do with your storage.”
What users seem to prefer to see are discussions of unusual and intricate storage setups, guides to using complex archive software and, of course, interesting datasets, from public-domain collections of vintage scientific papers to old BBC sound effect samples. Public archives, naturally, are a plus.
In addition to roughly 2.6 petabytes stored on a system of servers in his spare room (data collection size is the one fact each moderator highlights on the forum’s mod list), -Archivist is also the data curator and server manager of The Eye, a sprawling online archive of everything from vintage movie posters to beer-brewing guides to video games from short-lived console systems of the 1980s. A German resident in his late 20s who restores historic paintings and documents for a living, -Archivist said he got his start collecting printed and digitized medical journals.
“After that came piracy, which I was introduced to early on by my stepfather,” he quipped. That led him to start developing collections of movies and TV shows. Today, he personally prefers to collect digital books and texts, which he said are often quick to disappear from the internet.
“Most other data types aren’t so rare,” he said. “Weird and obscure books and texts seem to vanish first.”
Many people active in the data hoarding community take pride in tracking down esoteric files of the kind that often quietly disappear from the internet (manuals for older technologies that get taken down when manufacturers redesign their websites, obscure punk show flyers whose only physical copies have long since been pulled from telephone poles and thrown in the trash, or episodes of old TV shows too obscure for streaming services to bid on) and making them available to those who want them.
GitHub, owned by Microsoft since late last year, is mostly known for hosting source code for collaborative programming projects. But it’s also home to a collection of works by the Polish surrealist painter Zdzisław Beksiński uploaded by the user itdaniher, a Midwesterner and /r/datahoarder user who’s been collecting data for a decade and asked to be identified only by their username.
“I’ve been in touch with his estate a little bit, and they’re fine with me hosting a mirror of his works,” said itdaniher, who first obtained the images from a shared BitTorrent file, in a phone interview. Another file they uploaded to GitHub is a database mapping more than 2,000 common names of plants to their Latin scientific names, with entries from “Abe Lincoln Tomato” to “Zuni Gold Bean.” Itdaniher, who also enjoys gardening and doesn’t identify as a true “hoarder” (“I try to exercise a certain level of judiciousness,” they say, usually spending three or four hours a week archiving), hopes to expand the list into a larger project documenting ideal temperatures, soil and other conditions for growing the various plants. They hope to find that data scattered across the internet, just as the list of names initially was.
“The internet is a big place, and a lot of times I will find other people who have HTML tables on their web pages that have some information, but a small fraction of the information that I want,” itdaniher said. “Sometimes it’s finding personal sites where [someone’s said] here’s the list of the common and Latin names for the plants I’m growing this year.”
Itdaniher, an experienced Linux system administrator, also runs software provided by the group Archive Team to help download materials at risk of disappearing from the internet and help them make their way to the nonprofit Internet Archive. Founded by the digital archivist and filmmaker Jason Scott in 2009, Archive Team calls itself “a loose collective of rogue archivists, programmers, writers and loudmouths dedicated to saving our digital heritage.” Members frequently scramble to preserve aspects of internet history before they disappear as sites fade from the web. Through a mix of manual labor and distributed bots, the project has archived large swaths of sites including the classic free web host Geocities, the text-hosting platform Etherpad and the blog platform Xanga.
“Since 2009 this variant force of nature has caught wind of shutdowns, shutoffs, mergers, and plain old deletions - and done our best to save the history before it’s lost forever,” the group says on its official site.
Itdaniher shared with Scott a collection of Tumblr postings linked from Reddit and tagged as “not safe for work” as part of a global effort to preserve adult content on the now-Verizon-owned blogging network, after the company controversially announced it would no longer allow such material. At least 344,000 archived Tumblr sites marked for deletion are en route to the Internet Archive or already uploaded, where they’ll be publicly accessible, Scott said.
“I was able to contribute to that larger project of saving that aspect of internet culture for future generations,” said itdaniher.
Some /r/datahoarder users acknowledge they collect files that other people might not find interesting: HeloRising, a man in his mid-30s from the Pacific Northwest, said via Reddit PM that he’s built up a collection of high-quality digital copies of illuminated manuscripts, which he said he finds fascinating, though he has yet to find other users interested in sharing them. The files sometimes get posted by institutions that house and scan the medieval documents, but they’re often difficult to download and can disappear over time or live on only in obscure online archives.
“The illuminated manuscripts are unicorns,” he said. “They turn up in odd places.”
HeloRising, who has about 30 terabytes in total of data and spends five or six hours per week on the hobby, said the Reddit community has been a “treasure trove” of useful advice and information. It’s a common sentiment from users, who enjoy solidarity and support on the subreddit, where a recent comment thread was filled with excitement about a newly organized collection of thousands of vintage video game manuals.
“Having a community is great,” said itdaniher. “It makes me feel like the time that I spend, I’m working towards of a common goal of not throwing things down the proverbial memory hole, the 1984 trash disposal of uncomfortable facts.”
While people with hoarding disorders are often isolated, embarrassed and overwhelmed by disorganized piles of clutter, members of /r/datahoarder tend to take pride in their digital collections and thrive on keeping them organized, whether for sharing or personal use. More than a few work in technology or simply enjoy tinkering with computers, so tweaking download scripts and data storage networks is a fun part of their hobby, not a chore. Some also share custom-crafted archiving tools and other software they’ve created on GitHub, which can serve as a portfolio for those seeking programming jobs or just a high-tech social outlet.
“With time flying, we aren’t just people archiving data together, we are more than that,” said Corentin Barreau, a 19-year-old administrator on The Eye who is nicknamed “The French Guy,” in a Twitter direct message. “Beside that, I have an affection to everything that links to collections, even IRL, I like to collect, and it’s peaceful to sort data, it’s satisfying. And the joy of people when you share something [is] worth more than everything.”
His most prized archive is a set of “family memories,” digitized from analog photos and VHS tapes taken by his loved ones over the years. Barreau keeps local copies of the digital versions, as well as looking after cloud backups and the analog originals.
“That’s the most exciting thing [I’ve] done, and the collection I’m most proud of,” he said.
Barreau said he doesn’t see himself as a hoarder in a negative sense, since it doesn’t negatively impact his personal life.
“It’s just a passion, like people doing sports every day, or painting,” he said with an ASCII wink.
As with other mental health issues, experts say hoarding really becomes an issue when it interferes with people’s happiness or gets in the way of everyday life. Collecting, on the other hand, can be a perfectly healthy hobby, whether people are collecting baseball cards or rare Frank Zappa MP3s.
“The collections tend to give pride and positive feelings, whereas hoarding tends to be associated with stress and disorganization,” said Gregory Chasson, an associate professor of psychology at the Illinois Institute of Technology who has studied hoarding disorder. “There doesn’t tend to be a sense of cohesion or a theme.”
And digital media’s small physical footprint means it’s harder for even disorganized files on hard drives or USB sticks to grow unmanageable and dominate spaces the way physical collections of clothes, books or other materials can.
“I walk into homes where I can’t discern where sleeping, bathing and eating takes place because of the volume of the stuff,” said Regina Lark, owner of the Los Angeles area professional organizing firm A Clear Path, which helps people with physical hoarding problems. “I would imagine the uber-acquiring of digital media is not impairing the quality of your life, unless that is what you’re spending your life on, is acquiring.”
Still, problem digital hoarding, where massive collections of files, inbox messages and other digital data bring stress to their owners, isn’t unheard of, including among people who already struggle with hoarding tangible objects. Chasson said that, anecdotally, it’s not uncommon to see people with hoarding issues also have computer desktops riddled with icons or email accounts stuffed with unread messages. There hasn’t yet been much formal research into digital hoarding, he said. But a recent paper he coauthored does suggest a connection with physical hoarding, finding “higher levels of physical acquiring behaviors were significantly related to increased distress” when experimental subjects were falsely told a digital item from their Pinterest collections would be deleted.
“Ultimately, I think it’s tapping into the same mechanisms for a lot of people,” he said.
Both physical and digital hoarding can be motivated by the fear of permanently losing something important, even if others might think it’s easily replaceable or simply trash, said the creator of the YouTube channel I am a Compulsive Hoarder, a self-proclaimed “disposophobic” (referring to her fear of throwing out something that might prove valuable) who asked that her name not be used.
“I start thinking, but that particular article has such good information, I’m not going to find it again,” she said. “We can’t even consider the possibility we could find a better article.”
She said she has a tendency to store disorganized collections of web articles describing exercises she’s never done, foods she’s never prepared and even treatments for hoarding. Managing text messages can also be stressful, since she worries about deleting conversation histories en masse without going through each individual message. Even e-commerce can bring challenges for people with hoarding issues, she said, as websites guilt them into signing up for inbox-clogging discount newsletters they hesitate to delete or unsubscribe from.
“They get inundated about marketing emails,” she said. “Once you’re there, it’s hard to get unsubscribed, because now you’ve got FOMO.”
When old files do turn out to be valuable (like old Christmas newsletters that bring back old memories, or a wedding speech she recently unearthed and shared with a delighted friend), she has to remind herself it’s not a reason to stockpile every bit of data.
“When I found something else everyone else is so glad I kept, I really have to splash cold water on my face and tell myself, don’t let this be a reason to start saving stuff,” she said. “I don’t want to keep getting more hard drives.”
The fact is, though, it is often genuinely difficult for users without a decent amount of technical experience to find the right balance. Many systems don’t make it easy to find, organize and back up valuable files, while shunting more ephemeral data to the digital trash heap. Social networking sites are notoriously difficult to search, let alone download content from. Cloud services, said the Archive Team’s Jason Scott, often shut down or change policies with little notice, as with Tumblr’s about-face on erotic pictures, Google’s move to shut down social network Google+ or the venerable photo-sharing site Flickr’s recent announcement that it would begin purging images from legacy free accounts with more than 1,000 pictures uploaded as of March 12.
“We have consistently been working since the mid-80s to turn every single aspect of life into a digital file in one way or another,” Scott said. “People are suddenly discovering they don’t own their data, and all your life is data.”
Archive Team sometimes finds itself effectively the last stop before data disappears from shuttering services. That means there’s often little time or desire to distinguish between trash and treasure. But many of the group’s volunteer archivists, some of whom also frequent forums like /r/datahoarder, are more inclined to find joy and pride than frustration in loading their hard drives and public online archives with as much data as they can save for posterity.
“People are like really, you’re gonna save a bunch of furry art?” Scott said. “Well, we don’t know, and we’re not going to be the ones to make that decision.”
Steven Melendez is an independent journalist living in New Orleans.