Data Builders

Description of builders used to produce Materials Project data.

Builders that produce MongoDB collections for specific categories of materials data are implemented in the emmet-builders software package. Each builder takes a set of MongoDB collections as input, extracts and transforms data from them, and outputs a new collection. The schema of the input and output collections is defined by a set of standardized document models. We use the Python library pydantic to structure these document models, and we store them in the emmet-core software package. These models also define the schema for the Materials Project API, whose server-side code is implemented in emmet-api.
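The extract, transform, and output flow described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not code from emmet: `ToyThermoDoc` and `ToyThermoBuilder` are hypothetical names, a standard-library dataclass stands in for a pydantic document model, and in-memory lists and dicts stand in for MongoDB collections. The method names mirror the `get_items`, `process_item`, and `update_targets` split used by maggma builders.

```python
from dataclasses import asdict, dataclass
from typing import Dict, Iterable, Iterator, List


@dataclass
class ToyThermoDoc:
    """Hypothetical document model (stand-in for a pydantic model in emmet-core)."""

    material_id: str
    formula: str
    energy_per_atom: float  # eV/atom


class ToyThermoBuilder:
    """Hypothetical builder: reads a source collection, writes a target collection."""

    def __init__(self, source: List[dict], target: Dict[str, dict]) -> None:
        self.source = source  # stands in for an input MongoDB collection
        self.target = target  # stands in for the output MongoDB collection

    def get_items(self) -> Iterator[dict]:
        # Extract: yield only records that carry the fields we need.
        for record in self.source:
            if "energy" in record and "nsites" in record:
                yield record

    def process_item(self, item: dict) -> ToyThermoDoc:
        # Transform: map a raw record onto the output document model.
        return ToyThermoDoc(
            material_id=item["material_id"],
            formula=item["formula"],
            energy_per_atom=item["energy"] / item["nsites"],
        )

    def update_targets(self, docs: Iterable[ToyThermoDoc]) -> None:
        # Output: upsert each document into the target, keyed by material_id.
        for doc in docs:
            self.target[doc.material_id] = asdict(doc)

    def run(self) -> None:
        self.update_targets(self.process_item(item) for item in self.get_items())


# Made-up input records, for illustration only.
tasks = [
    {"material_id": "toy-1", "formula": "NaCl", "energy": -6.0, "nsites": 2},
    {"material_id": "toy-2", "formula": "Si", "nsites": 2},  # no energy: skipped
]
thermo: Dict[str, dict] = {}
ToyThermoBuilder(tasks, thermo).run()
```

Keeping the document model separate from the builder is what allows the same schema to describe both the output collection and an API response.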

To browse the document models defined in emmet, see here. For example, the ThermoDoc defined here is a document model that powers both the ThermoBuilder and the Materials Project's thermo API endpoint.

The figure below illustrates the entire Materials Project build pipeline including builders and all input/output collections:

For information on how to run any of the emmet builders, see the Running Builders section of the maggma software package, which defines much of the core builder-related code and the CLI.
