Corpus data

The corpus is a developing resource, and there may be annotation errors in the data. If you identify any issues in the corpus data, we would like to know about them! Please address any comments and questions to

Sampo Pyysalo
sampo.pyysalo at
Filip Ginter
ginter at


Creative Commons License
BioInfer annotation is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.


The 1.1.1 version of the corpus can be downloaded here: This version features dependency graphs following the Stanford scheme and fully manually corrected LG dependency types in the parallel linkage.

The original 1.0.1 version of the corpus can still be downloaded here:

Binarized BioInfer

The 1.2.0b (binarised) version of the corpus can be downloaded here: BioInfer_corpus_1.2.0b.binarised.xml.gz (minor bugfix updates on November 20, 2008). In this version, all relationships have been converted into binary relations between proteins, as described in the paper "Complex-to-Pairwise Mapping of Biological Relationships using a Semantic Network Representation" (SMBM'08 proceedings, pp. 45-52). Please note that the binary relations, while more readily usable in some applications, are not intended as a replacement for the original relationship annotation. For the full annotation, version 1.1.1 of the corpus remains current.
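Since the corpus is distributed as a gzip-compressed XML file, a first look at its contents does not require unpacking it. The following is a minimal sketch that makes no assumptions about the BioInfer XML schema: it only counts how often each element tag occurs. The function name count_element_tags is hypothetical, and the file path assumes the download was saved under its original name in the current directory.

```python
import gzip
import os
import xml.etree.ElementTree as ET
from collections import Counter


def count_element_tags(path):
    """Count how often each XML element tag occurs in a gzipped XML file."""
    counts = Counter()
    with gzip.open(path, "rb") as handle:
        # iterparse streams the file, so the whole corpus need not fit in memory.
        for _event, element in ET.iterparse(handle, events=("end",)):
            counts[element.tag] += 1
            element.clear()  # drop the element's children once it has been counted
    return counts


if __name__ == "__main__":
    # Assumption: the corpus was saved locally under its original file name.
    corpus_path = "BioInfer_corpus_1.2.0b.binarised.xml.gz"
    if os.path.exists(corpus_path):
        for tag, n in count_element_tags(corpus_path).most_common():
            print(f"{tag}\t{n}")
```

The tag counts give a quick overview of which element types the file contains before any schema-specific processing is written.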

Supporting software

The data and software used for the binarisation (see above) are available at BioInfer_binarisation_data.tar.gz and BioInfer_binarisation_software.tar.gz.
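The .tar.gz archives can also be unpacked from Python, which may be convenient on systems without a tar command. This is a minimal sketch using the standard tarfile module; the helper name unpack_archive is hypothetical, and the archive is assumed to have been saved in the current directory under the name given above.

```python
import os
import tarfile


def unpack_archive(archive_path, dest_dir="."):
    """Extract a gzip-compressed tar archive and return its member names."""
    with tarfile.open(archive_path, "r:gz") as archive:
        names = archive.getnames()
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions: refuse members that would escape dest_dir.
            archive.extractall(dest_dir, filter="data")
        else:
            # Older Python versions: only unpack archives from a trusted source.
            archive.extractall(dest_dir)
    return names


if __name__ == "__main__":
    # Assumption: the archive was saved locally under its original file name.
    archive_path = "BioInfer_binarisation_data.tar.gz"
    if os.path.exists(archive_path):
        for name in unpack_archive(archive_path):
            print(name)
```

The same helper should work for any of the other .tar.gz packages, given the path to the downloaded file.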

The supporting software package can be downloaded here:


To install the supporting software, download the package linked above, and then simply unpack the package. This should create a directory called BioInfer_software_1.0.1. On a UNIX system, the following command can be used for unpacking:


The supporting software programs will be contained in the BioInfer_software_1.0.1 directory, and can be run with the Python interpreter (if you do not already have Python installed, it can be downloaded from the download page). On a Windows system, this should be as simple as double-clicking on the program you wish to run.

On UNIX systems:

cd BioInfer_software_1.0.1

Descriptions and usage instructions for the extract and visualize programs can be found through the supporting software page.


The program liblp2lp and the rules for the conversion from LG linkages to dependency graphs following the Stanford scheme can be downloaded here: