Four Reasons Why Academic Research is Broken

Right now, you and I have access to more information than anyone else in the history of humanity. The richest man alive in the year 1800 could not get the amount and quality of information that a janitor with a $20/month DSL connection has at his fingertips today. This is all so amazing and wonderful that we mostly take it for granted.

But it brings up new problems. No one can argue about the amount of information that's just a Google search away, but the quality of what comes up can be a big question mark. Luckily, we also have in place the most successful model for judging the quality of information in the history of man: the peer-reviewed academic journal.

So we have an embarrassment of riches, and a great model to follow that has brought constant improvements in science and technology. So what's the problem? Actually performing academic research is horribly broken, and what's worse, there's no good reason for it. Read on to find out just how broken the system is.

1. There is absolutely no excuse for why I can't get immediate access to every journal article ever published.
I'm serious. If I want to learn all about the misinformation effect, there's no doubt it will take me some time to read all the current research, let alone acquire the background in psychology needed to follow along. But even if I have the time, the motivation, and the background, I can't, not without spending a ton of money or being affiliated with a college or university (which translates to "spending a ton of money").

Unlike the web, with its search engines, directories, and billions of pages hyperlinked to each other, academic research articles are not all available in one place to anyone who wants them. Why not? "Oh, digitizing those articles is such a huge job." "Oh, it will cost so much money." No, it won't. No, it won't. The real problem is the tangled interests of various publishers like Elsevier.
They are actively preventing access to information and complicating the search and linking process. If everyone gave Google the rights to digitize this stuff, they would do it in a second. If they gave it to the Internet Archive, or Project Gutenberg, or the Wikimedia Foundation, or created a new open-source project to work on it, it would get done.

Virtually everything from the past 20 years was already typed up on a computer for submission or layout in the journal. A huge amount from earlier than that has already been digitized and sits in some database somewhere. And anything that hasn't been digitized yet can be taken care of with scanning, OCR, and a few dozen graduate assistantships (and those poor bastards get very little pay).

2. There is absolutely no excuse for requiring people to search this database for this, that database for that, ask their institution to purchase access to this other database to find some other thing, etc., etc.
Twenty years ago, there were reasons for things to be in different databases, and for some things to not show up in any database at all. In fact, twenty or thirty years ago it was a nice bonus for anything to be available via a computer search. This has not been true for years. The web model, where everything is available if the search engine is smart enough to find it, is in every way better than the little-empire, walled-garden approach we have now, with various publishers and organizations each having their own exclusive, semi-overlapping databases.

Expecting searchers to know enough about a subject to come up with good keywords, evaluate how relevant the results of a search are, and understand what they find is good. Expecting them to learn about the quirks and coverage of proprietary databases is dumb. Meta-search engines and multi-database searches are a poor solution.

3. There is no excuse for breaking the web with your database interface.
If you feel so strongly about your need to protect content, or mirror your ancient telnet interface, or whatever, that you break the back/forward buttons, you should quit your job now and let someone with a clue take over. There is no technical excuse for breaking the back/forward buttons. Don't get me wrong, there are genuine reasons to change the behavior of a page depending on how the user gets to it. You don't want people to be able to skip the login screen through a bookmark. But searching for and viewing documents are not those kinds of situations.

Do me a favor. If the idea of flipping me out of a page because I haven't clicked on something within 15 minutes ever enters your head, I want you to stop, take a deep breath, and smack yourself in the face. You deserve it! If you ever ask your web developers if there's a way to disable right-clicking, you get two smacks. In the face.

Also, to all my brothers and sisters out there, the programmers, the web developers, the designers, the database admins: none of this is aimed at you. I know where you're all at; we've all been there. The fact of the matter is that we spend a lot of our careers implementing things we know are stupid, wrong-headed, or counter-productive.

4. The way that citations are done is archaic and simple-minded.
Citations are the hyperlinks of academic research. So why are they so much crappier and more difficult than hyperlinks? Listen, I understand how difficult this must have been to figure out and get organized 100 years ago, when everything was bound in volumes. It is no longer the year 1907, so that is no longer a good excuse. Why are there:
  • Lame, arcane rules specifying that this goes here, unless it's one of these, but not one of those, or if there's more than 2 but less than 6 authors, on every other Thursday... Here's a rule of thumb: as soon as you have more than two strict rules for a string of text, it probably shouldn't be a string of text.
  • One thousand different citation formats: APA, MLA, CBE, Chicago, blah, blah, blah.
  • No automatic way to click from one reference to the next to the next? Some databases implement this, and Google Scholar tries to do this, but why don't we have something crazy... like a URL or ISBN... that makes these things automatically easy?
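
To make that rule of thumb concrete, here is a rough sketch in Python of what a citation might look like once it stops being a string of text. Everything in it is an assumption made up for illustration: the Citation record and its field names, the resolver.example.org address, and the placeholder article. The two formatting functions are deliberately simplified and are not faithful implementations of APA, MLA, or any other real style guide. The point is only that if the data is stored once with a stable identifier, each "format" becomes a small function and the link to the article comes for free.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical resolver address, invented for this sketch.
RESOLVER = "https://resolver.example.org/"


@dataclass(frozen=True)
class Citation:
    """One reference stored as data instead of as a formatted string."""
    identifier: str     # hypothetical stable ID, playing the role of a URL or ISBN
    authors: List[str]  # each author as "Family, Initials"
    year: int
    title: str
    journal: str
    volume: int
    pages: str

    def link(self) -> str:
        # The clickable address is derived from the identifier alone.
        return RESOLVER + self.identifier


def join_authors(authors: List[str]) -> str:
    """Join author names with commas and a final ampersand."""
    if len(authors) == 1:
        return authors[0]
    return ", ".join(authors[:-1]) + " & " + authors[-1]


def author_year_style(c: Citation) -> str:
    """A simplified author-and-year layout (not a real style guide)."""
    return (f"{join_authors(c.authors)} ({c.year}). {c.title}. "
            f"{c.journal}, {c.volume}, {c.pages}.")


def numbered_style(n: int, c: Citation) -> str:
    """A simplified numbered layout (not a real style guide)."""
    return (f"[{n}] {join_authors(c.authors)}, \"{c.title},\" "
            f"{c.journal} {c.volume} ({c.year}): {c.pages}.")


if __name__ == "__main__":
    # Placeholder entry; not a real article.
    example = Citation(
        identifier="example-0001",
        authors=["Doe, J.", "Roe, R."],
        year=2007,
        title="An Example Article",
        journal="Journal of Examples",
        volume=12,
        pages="34-56",
    )
    print(author_year_style(example))
    print(numbered_style(1, example))
    print(example.link())
```

Switching citation styles then means calling a different function on the same record, and following a reference means following its link, with no retyping involved.
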

Why is this important?
This is not just a string of complaints from a grad student averse to the hard work of research. I like the hard work of research. I don't like spending time working around horrible interfaces and limitations imposed by copyright holders, who often have nothing to do with the actual production of knowledge.

This craziness has come to the point where it is limiting progress, because right now your average know-nothing, head-in-the-sand, flat-earth creationist is more likely to show up in a Google search than real work by real scientists. Your average citizen is under no requirement to cite any particular kind of source when making decisions, and many people, even graduate students, have a hard time evaluating sources. It's hard enough to get most people interested enough in a topic to do any kind of research, even on politically hot items like stem cell research and climate change. People are busy leading their lives. But when someone does take an interest, they have no chance of finding some of the best information out there unless they are already at a university. This is broken.

  1. I agree that open electronic access to academic journals would at least help quell the spread of misinformation online. Even if you cite an academic journal in, say, a Wikipedia article, there is still no way to validate that citation unless the reader can read the cited article themselves.

    From what I’ve read (in Wired – particularly here, here and here), the biggest roadblocks are profits made by the journals by forcing payments for both publishing and subscription and the “that’s the way we’ve always done it” attitude of the academic community. There is indeed quite a movement for having web-based journals. The attitude there is more of “if past innovation and knowledge were more accessible, it could lead to a new wealth of innovation and knowledge.”

    Unfortunately for everyone, it does take a lot of money to set up such large archives, organization to set up a trusted peer-review system, and the willingness of enough researchers and granters to look outside the box and agree to a new system.

    The government, by the way, needs to get on this train too. Especially with case documents.

    March 10th, 2007 at 5:37 pm
  2. Great blog and great rant on the publishing situation. I think things are changing, unfortunately too slowly for most of us. PLoS is a great example of people trying to do the right thing.
    Your rant was linked at

    April 19th, 2007 at 9:00 am

