Right now, you and I have access to more information than anyone else in the history of humanity. The richest man alive in the year 1800 could not get the amount and quality of information that a janitor with a $20/month DSL connection has at his fingertips today. This is all so amazing and wonderful that we mostly take it for granted. But it brings up new problems. No one can argue about the amount of information that's just a Google search away, but the quality of what comes up can be a big question mark. Luckily, we also have in place the most successful model for judging the quality of information in the history of man: the peer-reviewed academic journal.

So we have an embarrassment of riches, and a great model to follow that has brought constant improvements in science and technology. So what's the problem? Actually performing academic research is horribly broken, and what's worse, there's no good reason for it. Read on to find out just how broken the system is.

1. There is absolutely no excuse for why I can't get immediate access to every journal article ever published.

I'm serious. If I want to learn all about the misinformation effect, there's no doubt it will take me some time to read all the current research, let alone acquire the background in psychology needed to follow along. But even if I have the time, the motivation, and the background, I can't, not without spending a ton of money or being affiliated with a college or university (which translates to "spending a ton of money"). Unlike the web, with its search engines, directories, and billions of pages hyperlinked to each other, academic research articles are not all available in one place to anyone who wants them.

Why not? "Oh, digitizing those articles is such a huge job." "Oh, it will cost so much money." No, it won't. The real problem is the tangled interests of various publishers like Elsevier.
They are actively preventing access to information and convoluting the search and linking process. If everyone gave Google the rights to digitize this stuff, they would do it in a second. If they gave it to the Internet Archive, or Project Gutenberg, or the Wikimedia Foundation, or created a new open-source project to work on it, it would get done. Virtually everything from the past 20 years was already typed up on a computer for submission or layout in the journal. A huge amount from earlier than that has already been digitized and sits in some database somewhere. And anything that hasn't been digitized yet can be taken care of with scanning, OCR, and a few dozen graduate assistantships (and those poor bastards get very little pay).

2. There is absolutely no excuse for requiring people to search this database for this, that database for that, ask their institution to purchase access to this other database to find some other thing, etc., etc.

Twenty years ago, there were reasons for things to be in different databases, and for some things to not show up in any database at all. In fact, twenty or thirty years ago it was a nice bonus for anything to be available via a computer search. This has not been true for years. The web model, where everything is available if the search engine is smart enough to find it, is in every way better than the little-empire, walled-garden approach we have now, with various publishers and organizations each having their own exclusive, semi-overlapping databases. Expecting searchers to know enough about a subject to come up with good keywords, evaluate how relevant the results of a search are, and understand what they find is reasonable. Expecting them to learn about the quirks and coverage of proprietary databases is dumb. Meta-search engines and multi-database searches are a poor solution.

3. There is no excuse for breaking the web with your database interface.
If you feel so strongly about your need to protect content, or mirror your ancient telnet interface, or whatever, that you break the back/forward buttons, you should quit your job now and let someone with a clue take over. There is no technical excuse for breaking the back/forward buttons. Don't get me wrong, there are genuine reasons to change the behavior of a page depending on how the user gets to it. You don't want people to be able to skip the login screen through a bookmark. But searching for and viewing documents are not those kinds of situations.

Do me a favor. If the idea of flipping me out of a page because I haven't clicked on something within 15 minutes ever enters your head, I want you to stop, take a deep breath, and smack yourself in the face. You deserve it! If you ever ask your web developers if there's a way to disable right-clicking, you get two smacks. In the face.

Also, to all my brothers and sisters out there, the programmers, the web developers, the designers, the database admins -- none of this is aimed at you. I know where you all are at; we've all been there. The fact of the matter is that we spend a lot of our careers implementing things we know are stupid, wrong-headed, or counter-productive.

4. The way that citations are done is archaic and simple-minded.

Citations are the hyperlinks of academic research. So why are they so much crappier and more difficult than hyperlinks? Listen, I understand how difficult this must have been to figure out and get organized 100 years ago, when everything was bound in volumes. It is no longer the year 1907, so that is no longer a good excuse. Why are there:
- Lame, arcane rules specifying that this goes here, unless it's one of these, but not one of those, or if there's more than 2 but less than 6 authors, on every other Thursday... Here's a rule of thumb: as soon as you have more than two strict rules for a string of text, it probably shouldn't be a string of text.
- One thousand different citation formats: APA, MLA, CBE, Chicago, blah, blah, blah.
- No automatic way to click from one reference to the next to the next? Some databases implement this, and Google Scholar tries to do this, but why don't we have something crazy... like a URL or ISBN... that makes these things automatically easy?
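The rule of thumb above points at the fix: a citation shouldn't be a string of text at all, it should be structured data, with the arcane per-style rules living in code and a stable link coming along for free. Here's a minimal sketch in Python -- the field names and the two formatting functions are my own illustration, not any real standard's API, and the URL on the example record is hypothetical:

```python
from dataclasses import dataclass

@dataclass
class Citation:
    authors: list[str]   # "Last, F. M." strings
    year: int
    title: str
    journal: str
    volume: int
    pages: str
    url: str             # stable identifier -- the hyperlink half of the problem

def format_apa(c: Citation) -> str:
    # Each style is just a renderer over the same data.
    authors = ", & ".join(c.authors)
    return f"{authors} ({c.year}). {c.title}. {c.journal}, {c.volume}, {c.pages}."

def format_mla(c: Citation) -> str:
    authors = ", and ".join(c.authors)
    # One of those "strict rules": no doubled period after trailing initials.
    sep = "" if authors.endswith(".") else "."
    return f'{authors}{sep} "{c.title}." {c.journal} {c.volume} ({c.year}): {c.pages}.'

# A real paper on the misinformation effect; the url is made up for illustration.
loftus = Citation(
    authors=["Loftus, E. F.", "Palmer, J. C."],
    year=1974,
    title="Reconstruction of automobile destruction",
    journal="Journal of Verbal Learning and Verbal Behavior",
    volume=13,
    pages="585-589",
    url="https://example.org/loftus1974",
)

print(format_apa(loftus))
print(format_mla(loftus))
```

Same record, any style you want, and because the identifier travels with the data, "click from one reference to the next" stops being crazy and starts being a one-line lookup.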