We standardize global health data and create rich, machine-interpretable metadata to advance interoperability between global health datasets and between global health and related datasets.

Standardize global health data

Standardizing data is a cornerstone of improved interoperability, i.e. the ability of datasets to be used together. Interoperable datasets and software applications can also be combined without reformatting or data cleaning. For our Project Tycho standard data format, we use standard SNOMED-CT codes to describe disease conditions, ISO 3166 and Apollo Location Service codes to describe geographical locations, NCBI Taxonomy identifiers to describe viruses, bacteria, and other pathogenic organisms, and a variety of other standard names and ontologies, such as the Apollo Standard Vocabulary and the Apollo XML Schema Document.
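As a rough illustration of what coded fields look like in practice, the sketch below models a single record and checks that each code has a plausible shape. The class name, field names, and example values are assumptions made for this illustration; they are not the actual Project Tycho data format, and the checks only test the form of each code, not whether it exists in the corresponding terminology.

```python
import re
from dataclasses import dataclass


# Hypothetical record layout for illustration; not the Project Tycho format.
@dataclass(frozen=True)
class CaseCountRecord:
    condition_snomed: str   # SNOMED-CT concept identifier (digits only)
    country_iso: str        # ISO 3166-1 alpha-2 country code
    pathogen_taxon_id: int  # NCBI Taxonomy identifier
    count: int              # number of reported cases


def format_problems(record):
    """Return the names of fields whose values are not plausibly shaped codes."""
    problems = []
    # SNOMED-CT identifiers are strings of 6 to 18 digits.
    if not re.fullmatch(r"\d{6,18}", record.condition_snomed):
        problems.append("condition_snomed")
    # ISO 3166-1 alpha-2 codes are two uppercase letters.
    if not re.fullmatch(r"[A-Z]{2}", record.country_iso):
        problems.append("country_iso")
    # NCBI Taxonomy identifiers are positive integers.
    if record.pathogen_taxon_id <= 0:
        problems.append("pathogen_taxon_id")
    if record.count < 0:
        problems.append("count")
    return problems


# Illustrative values only; confirm real codes against each terminology.
record = CaseCountRecord("14189004", "US", 11234, 12)
print(format_problems(record))
```

Because every dataset describes conditions, locations, and organisms with the same code systems, records from different sources can be joined on these fields directly instead of on free-text names.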

Machine-interpretable metadata

To improve FAIR (Findable, Accessible, Interoperable, Reusable) compliance, we describe datasets using rich metadata and encode these metadata in a standard, machine-interpretable format. Metadata elements that organizations such as bioCADDIE, DataCite, and the Apollo project have developed for biomedical data can also be used for global health data. These elements include dataset creators, contributors, content description, creative works that used the dataset, spatial coverage, temporal coverage, access protocols, and the license agreement. We use standard metadata format definitions such as the Data Tag Suite (DATS) developed by the bioCADDIE DataMed project, the DataCite XML schema, and the Apollo XML Schema Document.
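A minimal sketch of a machine-interpretable metadata record covering the elements listed above might look like the following. The key names and all values are placeholders chosen for this illustration; they do not follow the actual DATS, DataCite, or Apollo schemas, which define their own element names and structure.

```python
import json

# Hypothetical key names for the metadata elements named in the text.
REQUIRED_ELEMENTS = (
    "creators",
    "contributors",
    "description",
    "spatial_coverage",
    "temporal_coverage",
    "access",
    "license",
)


def missing_elements(metadata):
    """Return the required elements that are absent or empty."""
    return [key for key in REQUIRED_ELEMENTS if not metadata.get(key)]


# Placeholder values only; not a real dataset description.
example = {
    "creators": ["Example Creator"],
    "contributors": ["Example Contributor"],
    "description": "Weekly case counts for an example condition.",
    "used_by": [],  # creative works that used the dataset, if any
    "spatial_coverage": {"country_iso": "US"},
    "temporal_coverage": {"start": "2000-01-01", "end": "2000-12-31"},
    "access": {"protocol": "https", "url": "https://example.org/dataset"},
    "license": "CC-BY-4.0",
}

print(missing_elements(example))
print(json.dumps(example, indent=2))
```

Serializing the record to JSON (or XML, for the schema-based formats) is what lets search indexes and other software read the description without human interpretation.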

If you can use our help in making your data interoperable, let us know!