Thursday, October 02, 2014

BioStor and JournalMap: a geographic interface to articles from the Biodiversity Heritage Library


The recent jump from ~11000 to ~17000 articles in JournalMap is mostly due to JournalMap ingesting content from my BioStor database. BioStor extracts articles from the Biodiversity Heritage Library (BHL), and in turn these get fed back into BHL as "parts" (you can see these in the "Table of Contents" tab when viewing a scanned volume in BHL).

In addition to extracting articles, BioStor pulls out latitude and longitude pairs mentioned in the OCR text and creates little Google Maps for articles that have geotagged content. Working with Jason Karl (@jwkarl), JournalMap now talks to BioStor and grabs all its geotagged articles so that you can browse them in JournalMap. As a consequence, journals such as Proceedings of The Biological Society of Washington now appear on their map (this journal is third most geotagged journal in JournalMap).

As an example of what you can do in JournalMap, here's a screenshot showing localities in Tanzania, and an article from BioStor being displayed:

Bj
JournalMap is an elegant interface to the biodiversity literature, and adding BioStor as a source is a nice example of how the Biodiversity Heritage Library's content is becoming more widely used. BioStor would only be possible if BHL made its content and metadata available for easy downloading. This is a lesson I wish other projects would learn. Instead of focussing on building flash-looking portals, make sure (a) you have lots of content, and (b) make it easy for developers to get that content so they can do cool things with it. BHL does well in this regard — other projects, such as BHL-Europe, not so much.