Thursday, March 06, 2008

PageRank for biodiversity

This will probably tempt fate, but I have an invited manuscript in review for Briefings in Bioinformatics on the topic of identifiers in biodiversity informatics. Readers of this blog will find much of it familiar (DOIs, LSIDs, etc.). For fun I constructed a graph for three ant specimens of Probolomyrmex tani, together with the images, DNA sequences, and publications that link to these specimens.

Based on this graph I computed the PageRank of each specimen. The motivation for this exercise is that AntWeb lists 43 specimens of this species in alphabetical order, which is arbitrary. What if we could order them by their "importance"? One way to do this is to count how many times each specimen has been sequenced, photographed, or cited in scientific papers. This gives us a metric for ordering lists of specimens, and it also demonstrates the "value" of a collection (based on people actually using it in their work). I think there is considerable scope for applying PageRank-like ideas to questions in biodiversity informatics. Robert Huber has an intriguing post on TaxonRank that explores this idea further.
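To make the idea concrete, here is a minimal sketch of PageRank by power iteration over a small graph of this kind. It is an illustration only, not the code used for the figures in this post: the node names (specimen_A, image_1, paper_1 and so on), the links between them, and the damping factor of 0.85 are all made-up assumptions. Images, sequences, and papers link to the specimens they are derived from or cite, and specimens themselves link to nothing.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Plain power-iteration PageRank over a dict of node -> list of link targets."""
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    n = len(nodes)

    # Start with every node equally ranked.
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(max_iter):
        # Nodes with no outgoing links (here, the specimens) spread their
        # rank evenly over all nodes so that the total rank stays at 1.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}

        # Each linking node shares its rank equally among its targets.
        for source, targets in links.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share

        change = sum(abs(new_rank[node] - rank[node]) for node in nodes)
        rank = new_rank
        if change < tol:
            break
    return rank


# Hypothetical graph: all identifiers and links below are invented for illustration.
links = {
    "image_1": ["specimen_A"],
    "image_2": ["specimen_A"],
    "sequence_1": ["specimen_A"],
    "paper_1": ["specimen_A", "specimen_B", "sequence_1"],
    "image_3": ["specimen_B"],
    "specimen_C": [],  # a specimen nobody has imaged, sequenced, or cited
}

ranks = pagerank(links)

# Order the specimens from most to least "important".
ranked_specimens = sorted(
    (node for node in ranks if node.startswith("specimen_")),
    key=ranks.get,
    reverse=True,
)

for name in ranked_specimens:
    print(f"{name}\t{ranks[name]:.4f}")
```

In this toy graph the specimen with the most images, sequences, and citations pointing at it comes first, and the specimen that nothing links to comes last, which is the kind of ordering that could replace an alphabetical list.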

1 comment:

Anonymous said...

Actually very interesting. But your idea of a "biodiversity rank" applies only to usefulness in research, right? Functional importance in the ecosystem is not quantifiable this way, I assume.

Did you calculate it for other genera or species?