Friday, March 8, 2013

Considerations for automated functional comparisons

Congratulations on a great paper from Sean Mooney's lab and colleagues at the National Center for Biomedical Ontology at Stanford. It shows that when performing high-throughput analysis of omics datasets such as microarrays and proteomics, combining manually curated GO terms with additional terms found by text mining is more effective than using GO alone. As a curator this doesn't surprise me - we do a great job, but with the volume of work out there we will never catch everything. Often when manually curating a paper there is a concept that doesn't map exactly to the ontology in use, or one that maps to many terms, and a curator may be prevented from picking them all by the limitations of their curation tool. It is not always possible for a human curator to examine all the ontologies out there to find the best annotation in a given circumstance. And curators are scientists too, so it is essential for us to be able to incorporate free text, especially when it doesn't seem to follow the rules exactly. Methods and tools that account for these limitations by employing a mixed approach therefore seem like a good idea - and over 2000 people have accessed the article in the past 2 weeks, so it's getting some attention!