Should academic paper publishing embrace EPUB?

Sometime last year I was considering home improvement options for our house, and I was thinking about building a large, built-in bookshelf in our upstairs study area. I have always loved seeing lots of books on the wall, and really enjoyed pulling down a book from my own personal library to browse whatever subject interested me. But there was all this discussion about ebooks, and I was thinking that if they ever caught on big time, printed books would eventually go the way of the dodo – the end of their 400-year cycle of greatness was at hand, and the new way to read anything was going to be on a digital screen.

I’ve since come to my senses. I love books – the binding, the texture of fine paper, the fact that they don’t require a battery or power cord, and even the smell are all plusses in my book. Books have been around a long time, and they’re here to stay. Ebooks are just another channel of distribution for such content, and I believe that both have their place in the modern era.

However, for academic research papers, I think we can safely kill the paper. Particularly, I think it should all begin moving towards the EPUB format. I read a lot of academic papers in my work, and I find myself wishing that more of this stuff were published as EPUBs. In contrast to my love of books above, I think academic research would largely be much better served in a purely electronic format. It’s already going that way from the reader’s point of view, right?

Typically, when academic papers get published electronically, the format of choice is PDF. Or in earlier days, PostScript. If you’re lucky, someone had the foresight to publish their paper as HTML. The advantage of a flexible format such as HTML is that you can resize the fonts. Text can flow. It’s easier to get a clean copy of a text or data segment out of HTML than it is from PDF for quoting in one’s own paper, because copying from PDF tends to yield horrific line break issues and other artifacts on the clipboard.

PDF is, I’m sorry to say, hard to read on smaller screens. PDF expects paper, and refuses to reflow itself into smaller screen sizes such as an iPhone or Android device form factor. It barely passes on the 1024 x 768 iPad screen. Anything smaller, such as most ebook readers, is going to be unacceptable. Having to zoom in and scroll left to right to read one line of text at a time on a mobile device is not what anyone would call a user-friendly reading experience.

EPUB by contrast works great on mobile devices. Using the Stanza reader on iPhone is quite comfortable. iBooks on the iPad platform is a joy to use.

After reading this tweet by Dave Gutelius today, I was reminded of how much I hate printing out all my academic papers in preparation for travel. Flying is reading time, and printing this stuff out and stuffing it in my backpack is time-consuming, a waste of paper, and added weight that I don’t want to carry.

Stuffing those papers onto my iPad and using GoodReader is a step in the right direction. But still, all too often the PDFs are formatted for paper, not for screen, and I am still cursing the format. PDF usually assumes letter-sized or A4-sized paper, and most ebook readers have physically far smaller screens. Far better, I think, to start providing EPUB options for academic research, so that folks like me who need ginormous fonts and such can read with greater ease.

Or, should it just go to straight HTML? At that point, papers might even be able to add a little functionality to the electronic reading experience – change variables in information graphics, show rendered 3D representations of models, and so on. EPUB doesn’t support anything fun like HTML5 DOM handling or Flash, although CSS3 might work depending on the EPUB reader’s implementation. Either way, PDF ain’t fitting the bill for ebook readers, and I think a reflowable format like EPUB or HTML will be far more important in the coming months and years as ebook-capable mobile devices become more and more commonplace.

6 thoughts on “Should academic paper publishing embrace EPUB?”

  1. Thanks so much for this article. I’ve been having similar thoughts about the reading of academic papers and PDF books myself. I’ve got an iPad and it just isn’t very nice to read wide PDFs. I didn’t know about GoodReader. This looks like a great interim solution to me. I’ve been considering writing my own PDF -> ebook-format tool (so it’s quite a relief to be spared that burden).

    Looking forward to seeing some academic publications published in .epub alongside .html, .ps.gz and .pdf!

  2. While you’re of course right in condemning use of PDF for screen-reading purposes, you are making a common mistake of evaluating the current EPUB format in terms of some client instances (e.g. iBooks and Stanza), and then promoting it as a solution for said duties. It is clear that what is needed for academic digital papers/textbooks of all kinds is some derivative XML-document scheme, which includes, but does not end with, the EPUB. Besides being capable of adapting itself to all kinds and sizes of screen displays (in single- and multicolumn layout styles), that future format will also have to intelligently accommodate the three kinds of textbook “extras”: highlighting, quotes, and cumulative annotations (in-context and margin-notes). Furthermore, the annotations layer needs to be stored with the document and suitable for device-independent export and merging with other annotation instances. EPUB 3.0 is nowhere near that capability, alas.

    For more on the lack of a native/global annotations layer on the iPad, the lodestar of lean-back screen-reading, see these two of my earlier notes elsewhere:

    http://ipad4edu.com/questions/161/ipad-and-etextbooks-and-annotations/310#310
    http://ipad4edu.com/questions/161/ipad-and-etextbooks-and-annotations/190#190

    BTW. You don’t need to apologize for Adobe’s PDF not being suitable for screen-reading duties – it wasn’t designed primarily for that.

  3. It’s not a mistake – ePub is a better solution than PDF for reading papers, and it works now on mobile devices. Client instances solve my needs. Yes, it would be great to have annotations and such independently supported by the spec, but the fact of the matter is that the reading experience is vastly improved on these devices by using a common ePub format over PDF, and the client tools get the job done. Adding capability with a protocol that extends ePub to support those good features you mention is a fine idea, but in danger of adoption scarcity. Will that ship in readers? Will authors support it? Until then: ePub works.

  4. My complaint about epub for academic works is that it reflows the text and does not maintain consistent page numbers – you can’t even HAVE page numbers in the table of contents. How the heck do you cite an academic work that has a different pagination on every type of device, or even for each person reading it, given that it changes pagination with text size! It is also terrible for works with bulleted lists, diagrams, charts, etc. And it doesn’t always keep diagrams on the same page as the text referring to them, again because it reflows the text. Epub is fine for novels or nonfiction works with straightforward prose, but for anything with a more complex layout, I still prefer the control that PDF gives me.

  5. Couldn’t agree more. The focus on layout (as opposed to content) in academia is a waste of energy. Every element in an academic paper – footnotes, tables, figures, equations, etc. – can be tucked neatly into its own little XML container to be displayed as the client desires. E.g., you can touch the words “Figure 8a” and have Figure 8a appear. Or parse it to a three-column layout resembling your favorite peer-reviewed paper publication from the last century, if that makes you happy. And who cares about citing page numbers when you can link directly to the location within the source? Layout is about publisher control (profit). Epub returns us to a focus on content…not to mention man’s FIRST instinct on how to organize text: the scroll.
