Evaluation standard for PPI extraction
NEW! Data and software implementing the PPI extraction method evaluation standard proposed in Sampo Pyysalo, Rune Sætre, Jun'ichi Tsujii and Tapio Salakoski, Why Biomedical Relation Extraction Results are Incomparable and What to do about it, Proceedings of SMBM'08, are available on the Protein-protein interaction extraction evaluation page.
Graph kernel for PPI extraction
Data and software for the work described in Antti Airola, Sampo Pyysalo, Jari Björne, Tapio Pahikkala, Filip Ginter and Tapio Salakoski, All-paths graph kernel for protein-protein interaction extraction with evaluation of cross-corpus learning, BMC Bioinformatics, 9 (Suppl 11)(S2), 2008, are available on this page. The work was originally presented at BioNLP 2008.
General information
This page provides the software necessary to convert the AIMed [1], BioInfer [2], HPRD50 [3], IEPA [4], and LLL [5] corpora into the common format as described in Pyysalo S, Airola A, Heimonen J, Björne J, Ginter F, Salakoski T, Comparative Analysis of Five Protein-protein Interaction Corpora, LBM'07. 2007. We kindly ask that users of the conversion software cite this paper.
Please download the corpora from their respective homepages, download the conversion software, and follow the instructions given in the readme.txt file. The converted version of BioInfer can also be downloaded directly from this page.
Download
The conversion software and the converted BioInfer corpus are distributed under the GPL.
Download the conversion software.
Download the converted version of BioInfer. This is a new and much improved binarization of BioInfer, as reported in Heimonen et al., Complex-to-Pairwise Mapping of Biological Relationships using a Semantic Network Representation. The previous binarization, as reported in Pyysalo et al., Comparative Analysis of Five Protein-protein Interaction Corpora, is still available here.
Contact
Contact sampo.pyysalo at it.utu.fi if you experience any problems with the conversion.
References
[1] Bunescu R, Ge R, Kate RJ, Marcotte EM, Mooney RJ, Ramani AK, Wong YW: Comparative Experiments on Learning Information Extractors for Proteins and their Interactions. Artif Intell Med, Summarization and Information Extraction from Medical Documents 2005, 33(2):139-155.
[2] Pyysalo S, Ginter F, Heimonen J, Björne J, Boberg J, Järvinen J, Salakoski T: BioInfer: A Corpus for Information Extraction in the Biomedical Domain. BMC Bioinformatics 2007, 8(50).
[3] Fundel K, Küffner R, Zimmer R: RelEx - Relation extraction using dependency parse trees. Bioinformatics 2007, 23(3):365-371.
[4] Ding J, Berleant D, Nettleton D, Wurtele E: Mining MEDLINE: abstracts, sentences, or phrases? In Proceedings of PSB'02 2002:326-337.
[5] Nédellec C: Learning language in logic - genic interaction extraction challenge. In Proceedings of LLL'05 2005.