I’m curious what people think about the Library of Congress’s decision to digitally archive every public tweet.
Every public tweet, ever, since Twitter’s inception in March 2006, will be archived digitally at the Library of Congress. That’s a LOT of tweets, by the way: Twitter processes more than 50 million tweets every day, with the total numbering in the billions.
To be honest, I don’t have a Twitter account, I don’t “follow” anyone, and I don’t really “get” the whole tweeting thing. Obviously, I don’t know enough to have an opinion on this, but I couldn’t help but laugh at this comment made by “Uncle Fred” on a post at the Atlantic about the Twitter archive:
Great, now even future historians can muse over my failed toasted tomato sandwiches.
My questions are for those of you who are, or ever have been, on Twitter: Do you think tweets are something worth archiving? Are there privacy concerns? Will knowledge that your tweets will be archived change the nature of what you write? Any other thoughts or concerns?


Oooh, goodie, I can respond to this as both a tweeter AND a historian. Both of those parts of me think that of course it’s a good thing that tweets are being archived. As anyone who’s done serious archival work before knows, you spend a whole lot of time digging through irrelevant material to find the gem that’ll be the center of your fourth chapter. But that gem is only a gem because of how you contextualize it and relate it to other bits of information you’ve gathered. So, Uncle Fred’s tweet about his failed sandwich won’t be noteworthy in isolation; but as part of, say, a complex database compiled from millions of tweets about food habits cross-checked against location and date, I could see it being part of a scholarly argument. (To say nothing of research on the role of social media in the uprisings in Iran and Moldova last year, which are already being studied.)
A Twitter archive is going to have a lot of fat in it, to be sure, but it will no doubt be of use. There are questions to be answered, however, that will determine how useful: What happens to the links that people include in their tweets? How easy will it be to follow conversations that are happening on Twitter? Will deleted or private tweets be archived? The LOC is working through these issues now. I think that as long as tweets aren’t filed and served in isolation, and you can access them as the entries in a conversational stream that they so often are, this can be a very rich resource for future generations.
Another good thing about this development is that it gives the lie to the sense that tweets are ephemeral, which is an easy trap to fall into.
Thanks, Luke–I was hoping a historian would chime in! Do you think the LOC should be archiving other forms of mundane digital media besides Twitter? Flickr pages? Facebook status updates? Comments on blog posts?
On thinking about this a bit more over the weekend, I think the tone of my original comment may have been a tad too cavalier towards the concerns raised by this move. I still think it’s on the whole a good thing, and that the archive will prove useful to future historians; but I think I underplayed the ethical implications of a system that would make opting out impossible. The relationship between Twitter’s current TOS and this deal is a bit fuzzy; you grant Twitter the right to use, reproduce, and disseminate your tweets, and I assume this includes deleted tweets as well. But if those tweets immediately filter to an archive they’ll be searchable, which they’re not if you delete them from your account. I believe (strongly) that control over the life of user-generated content should lie permanently with the user. Unfortunately, that’s rarely the case with much web content, and in fact one of the core missions of Blogs@Baruch is to nurture awareness of these ethical issues among members of our community.
As for other social media platforms, I don’t think so. Archive.org does some of that, though FB is out (they’re retaining all rights to your info so that they can monetize it going forward). I think Twitter is different primarily because the leanness of the content it serves lowers the barriers to archiving; because its user base and the range of content it touches are so vast that an archive will be useful; and because it, more than any other platform, has enabled a new kind of connectivity. Twitter in some ways maps the Internet on a minute-by-minute basis; to the extent an archive can capture that process, I think it will be singularly useful.
Interesting conversation. Lauren, I’m glad you brought up the question of where this archiving will end, since it seems to me that there are many kinds of info-sharing “nodes” (is that the right word for this? perhaps not) that could be argued to be worthy of archiving.
I don’t know enough about the Library of Congress and what it already archives, but I’m curious to know more. For example, when I heard that they were archiving every Storycorps interview, it boggled my mind. I think it’s an important project, and Storycorps tries to respond to the weight of that project by, say, spreading its booths out, doing outreach into various communities and throughout various regions. Twitter doesn’t have that kind of centralized push (as far as I know),
I’m interested, Luke, that you feel that Twitter “maps” the Internet; I wouldn’t imagine it to be so comprehensive, and I wonder if a map is the best metaphor; does Twitter really trace the contours of what goes on on the web? Either way, it suggests that this archival project really is about capturing Internet activity– as long as it’s looked at through that lens (rather than representative of _____), it’s even more interesting to me.
I don’t personally feel that my own tweets are worth archiving, although I do also tweet for an advocacy organization, and when I think about it through that lens, I think it’s excellent…
The confidentiality stuff is a little questionable, sure. But Luke summed that up better than I could.
@Hillary: I would definitely concede that it’s a very limited map that misses the majority of the valleys of interaction that take place on the web. But I do think you can discern a certain logic from the linking that goes on. For instance, check out Digital Humanities Now, which uses a curated list of Twits to identify common interests, based primarily on who’s linking to what.
Interesting conversation, everyone. Working in antiques, I just wanted to add that documenting tweets could eventually help determine the age of some items. For example, let’s say in 60 years a man named Bob finds an old television in his attic and wants to sell it. In those 60 years, televisions will have changed dramatically in shape and form. If we documented these tweets, we could roughly pinpoint the era in which the television was made and used. If Bob searches through tweets for the television he just found and notices that many of the tweets mentioning it date from the 1990s, he could reasonably assume that the television was made in the 1990s. Of course this would not be 100% accurate, but it sure is an idea!