A C++ app for working with phenotype ontologies as well as the GA4GH Phenopackets standard.
First, install the latest version of the protobuf library and compiler.
Download protobuf from https://github.com/protocolbuffers/protobuf/releases/. At the time of this writing, the latest version was 3.8.0.
Installation instructions are available at https://github.com/protocolbuffers/protobuf/blob/master/src/README.md.
On an Ubuntu system, one can install the requirements as follows.
$ sudo apt-get install autoconf automake libtool curl make g++ unzip
Following this, download the latest source code, then compile and install it. Briefly:
$ tar xvfz protobuf-cpp-3.8.0.tar.gz
$ cd protobuf-3.8.0/
$ ./configure
$ make
$ make check
$ sudo make install
$ sudo ldconfig # refresh shared library cache.
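After the install step, it can be worth confirming that the shell actually picks up the new compiler. The sketch below is one way to do that. `version_ge` and `check_protoc` are hypothetical helper names, and 3.8.0 is used as the baseline only because it is the version these instructions were written against, not because it is a documented minimum.

```shell
# version_ge A B: succeed when dotted version A is at least B.
# Hypothetical helper; assumes purely numeric fields and compares up to
# three of them (missing fields count as 0).
version_ge() {
  IFS=. read -r a1 a2 a3 <<EOF
$1
EOF
  IFS=. read -r b1 b2 b3 <<EOF
$2
EOF
  if [ "${a1:-0}" -gt "${b1:-0}" ]; then return 0; fi
  if [ "${a1:-0}" -lt "${b1:-0}" ]; then return 1; fi
  if [ "${a2:-0}" -gt "${b2:-0}" ]; then return 0; fi
  if [ "${a2:-0}" -lt "${b2:-0}" ]; then return 1; fi
  [ "${a3:-0}" -ge "${b3:-0}" ]
}

# check_protoc [MIN]: report whether protoc is on PATH and at least MIN
# (default 3.8.0, an assumption rather than a stated requirement).
check_protoc() {
  wanted="${1:-3.8.0}"
  if ! command -v protoc >/dev/null 2>&1; then
    echo "protoc not found on PATH" >&2
    return 1
  fi
  # protoc --version prints a line such as "libprotoc 3.8.0"
  found=$(protoc --version | awk '{print $2}')
  if version_ge "$found" "$wanted"; then
    echo "protoc $found is OK"
  else
    echo "protoc $found is older than $wanted" >&2
    return 1
  fi
}
```

Calling check_protoc with no argument compares the installed compiler against 3.8.0; pass a different version string to change the baseline.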
Phenotools uses the rapidjson C++ JSON library to parse (OBO) ontology files in JSON format. See the rapidjson home page for details. Installation on *NIX systems and Mac should work roughly as follows.
$ git clone https://github.com/Tencent/rapidjson.git
$ cd rapidjson
$ mkdir build
$ cd build
$ cmake ..
$ make
$ make install
First, download the phenopacket-schema code. In the following, we download the code directly into the phenotools repository. If you download it somewhere else, adjust the path accordingly.
git clone https://github.com/phenopackets/phenopacket-schema.git
The build process first generates C++ code to represent the Phenopacket on the basis of the protobuf file.
PROTO_PATH='phenopacket-schema/src/main/proto'
protoc --proto_path=${PROTO_PATH} --cpp_out=lib ${PROTO_PATH}/base.proto
protoc --proto_path=${PROTO_PATH} --cpp_out=lib ${PROTO_PATH}/interpretation.proto
protoc --proto_path=${PROTO_PATH} --cpp_out=lib ${PROTO_PATH}/phenopackets.proto
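The three protoc calls differ only in the name of the .proto file, so they can also be written as a loop. The following is a minimal sketch under that assumption. `generate_cpp` and its `--dry-run` argument are hypothetical and not part of phenotools. The dry-run mode only prints the commands, which makes it easy to check the paths before generating any code.

```shell
# Sketch only: generate_cpp is a hypothetical helper, not part of phenotools.
PROTO_PATH='phenopacket-schema/src/main/proto'

# generate_cpp [--dry-run]: run protoc once per schema file.
# With --dry-run the commands are printed instead of executed.
generate_cpp() {
  for name in base interpretation phenopackets; do
    if [ "${1:-}" = "--dry-run" ]; then
      echo "protoc --proto_path=${PROTO_PATH} --cpp_out=lib ${PROTO_PATH}/${name}.proto"
    else
      protoc --proto_path="${PROTO_PATH}" --cpp_out=lib \
        "${PROTO_PATH}/${name}.proto" || return 1
    fi
  done
}

# Show what would be run without invoking protoc.
generate_cpp --dry-run
```

Running generate_cpp without arguments from the phenotools directory executes the same three commands listed above and stops at the first failure.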
Following this, we can build phenotools using cmake.
mkdir build
cd build
cmake ..
make
This will generate a static library file as well as unit testing code and an app called phenotools.
To run the unit tests following the build, enter
./lib/tests/phenopacket_tests
Following the build, the app phenotools is located in the app subdirectory. The following command runs phenotools with the --help option to show available commands and arguments.
./app/phenotools --help
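If either command fails with "No such file or directory", a quick check of the build directory can show which executable is missing. The sketch below assumes only the two relative paths mentioned above; `check_build` is a hypothetical helper name.

```shell
# check_build [DIR]: hypothetical helper that reports whether the unit-test
# binary and the phenotools app exist (and are executable) under DIR.
check_build() {
  build_dir="${1:-.}"
  status=0
  for exe in lib/tests/phenopacket_tests app/phenotools; do
    if [ -f "${build_dir}/${exe}" ] && [ -x "${build_dir}/${exe}" ]; then
      echo "found   ${exe}"
    else
      echo "missing ${exe}"
      status=1
    fi
  done
  return "$status"
}
```

From inside the build directory, check_build with no argument inspects the current directory; the return status is non-zero if either file is absent.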
There is a setup script that works on Macintosh and Linux systems only. Enter the following to perform all steps of the build process, starting from the download of the phenopacket-schema repository.
$ ./setup.sh
Phenotools is in a beta stage, but it can already do a few useful things.
The software currently decodes a Phenopacket from JSON format, does some Q/C, and outputs a summary of the phenopacket to the shell. To run it, enter
$ ./phenotools phenopacket -p Gebbia-1997-ZIC3.json
This reads the Phenopacket included in the demo and prints the following output.
Phenopacket at: Gebbia-1997-ZIC3.json
subject.id: III-1
subject.age: 7W
subject.sex: male
id: HP:0002139: Arrhinencephaly
id: HP:0001750: Single ventricle
id: HP:0001643: Patent ductus arteriosus
id: HP:0001746: Asplenia
id: HP:0004971: Pulmonary artery hypoplasia
id: HP:0001674: Complete atrioventricular canal defect
id: HP:0001669: Transposition of the great arteries
id: HP:0012890: Posteriorly placed anus
id: HP:0001629: Ventricular septal defect
id: HP:0012262: Abnormal ciliary motility
id: HP:0004935: Pulmonary artery atresia
id: HP:0003363: Abdominal situs inversus
Gene: ZIC3[ENTREZ:7547]
GRCh37: X:136649818C>T[]
Disease: HETEROTAXY, VISCERAL, 1, X-LINKED; HTX1 [OMIM:306955]
Metadata:
Hpo Case Annotator : 1.0.13-SNAPSHOT(1970-01-01T00:00:00Z)
human phenotype ontology: hp(HP;http://purl.obolibrary.org/obo/hp.owl;2018-03-08;http://purl.obolibrary.org/obo/HP_)
Phenotype And Trait Ontology: pato(PATO;http://purl.obolibrary.org/obo/pato.owl;2018-03-28;http://purl.obolibrary.org/obo/PATO_)
Genotype Ontology: geno(GENO;http://purl.obolibrary.org/obo/geno.owl;19-03-2018;http://purl.obolibrary.org/obo/GENO_)
NCBI organismal classification: ncbitaxon(NCBITaxon;http://purl.obolibrary.org/obo/ncbitaxon.owl;2018-03-02;)
Evidence and Conclusion Ontology: eco(ECO;http://purl.obolibrary.org/obo/eco.owl;2018-11-10;http://purl.obolibrary.org/obo/ECO_)
Online Mendelian Inheritance in Man: omim(OMIM;https://www.omim.org;;)
#### We identified 1 Q/C issue ####
[ERROR] phenopacket id missing
In this case, our example phenopacket was missing the id element but was otherwise well-formed.
Phenotools can read the HPO JSON file (hp.json) and output descriptive statistics (-s, --stats) and/or perform quality assessment (-q, --qc).
To run it, download the hp.json file from the HPO GitHub page and execute the following command.
$ ./phenotools hpo -j hp.json [-s/--stats][-q/--qc]
Phenotools will be able to analyze and work with other JSON-serialized OBO ontologies, including MONDO. This feature is still experimental.
For the adventurous, download the mondo.json file from the MONDO GitHub page.
To run the app, enter:
$ ./phenotools mondo -j mondo.json [-s/--stats][-q/--qc]