Contribute Data
How to contribute your own user-created data to the Materials Project (MP) website/database.
Interactively
Under Construction. We are working on interactive interfaces that will let our users contribute data in addition to the programmatic approach below.
Programmatically
Quickstart
Read the Concepts section, create a project, install mpcontribs-client, and set the MPCONTRIBS_API_KEY environment variable to the API key shown on the profile page or your dashboard. The following code snippet outlines the general process of adding data to your project. See the next section for step-by-step instructions.
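A minimal sketch of that process, assuming the Client class from mpcontribs.client and its init_columns() and submit_contributions() methods behave as their doc strings describe. The project name, identifier, and column names are placeholders, and the network calls are wrapped in a function so nothing is sent until you call it:

```python
# Hypothetical example values; replace with your own project and data.
PROJECT = "my_test_project"

contributions = [
    {
        "identifier": "mp-4",  # placeholder MP ID
        "data": {
            "a": "3 eV",  # numerical value given as a string with its unit
            "b": {"c": "hello", "d": 3},  # nested keys, i.e. b.c and b.d
        },
    }
]

# Column name -> unit: "eV" is a pint unit, None marks non-numerical
# values, and "" marks a dimensionless number.
columns = {"a": "eV", "b.c": None, "b.d": ""}


def submit(project, columns, contributions):
    """Initialize the columns and submit the contributions.

    Needs network access, an installed mpcontribs-client, and the
    MPCONTRIBS_API_KEY environment variable.
    """
    from mpcontribs.client import Client  # imported lazily on purpose

    client = Client(project=project)
    client.init_columns(columns)
    client.submit_contributions(contributions)
    return client
```

Calling submit(PROJECT, columns, contributions) would then perform the upload for a project you own.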
Step-by-step instructions
The general process of preparing and contributing data using the mpcontribs-client python library follows the steps below. Make sure to read the Concepts section before continuing.
Install dependencies. You might not need all of them depending on your use case:
pip install mpcontribs-client mp_api pandas flatten_dict pymatgen monty
Import commonly used libraries (some might be optional in your case):
Initialize MPRester and create a project. Take an extra minute to think about the name for your project. It needs to be 3-31 characters long and can only use alphanumeric characters or the underscore (_). You'll use it to refer to your project in the python client, and it'll be used as part of the URL for your project's landing page.
You can either create the project by completing this form or programmatically as shown below. Replace the example values below with the info pertinent to your project. Projects created with these example values or with insufficient information will be removed immediately. Also check out the doc strings for more info about any functions used here.
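As a hedged sketch of the programmatic route: the helper below checks the naming rule locally (assuming "alphanumeric" means ASCII letters and digits) before calling MPRester. The create_project() keyword arguments are an assumption based on the project fields mentioned in this guide, so check the doc string before relying on them.

```python
import re

# 3-31 characters, ASCII letters, digits, or underscore only.
PROJECT_NAME_PATTERN = re.compile(r"[A-Za-z0-9_]{3,31}")


def is_valid_project_name(name):
    """Return True if `name` satisfies the length and character rules."""
    return PROJECT_NAME_PATTERN.fullmatch(name) is not None


def create_project(api_key, name, title, authors, description, url):
    """Create a project via MPRester (needs network access and mp_api).

    Assumption: `mpr.contribs.create_project()` accepts these keyword
    arguments; check its doc string.
    """
    if not is_valid_project_name(name):
        raise ValueError(f"invalid project name: {name!r}")

    from mp_api.client import MPRester  # imported lazily on purpose

    with MPRester(api_key=api_key) as mpr:
        mpr.contribs.create_project(
            name=name,
            title=title,
            authors=authors,
            description=description,
            url=url,
        )
```

Validating the name first avoids a round trip to the server for a request that would be rejected anyway.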
After the project is created, you can use the MPContribs client directly to exclusively interact with your project going forward:
Update submitted project information if needed. For instance, we might want to add another first author to authors or change the label for the default reference and add another URL to references:
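A sketch of such an update, assuming update_project() takes a dictionary of the fields to change and that each reference is a dictionary with label and url keys. The author names, labels, and URLs are placeholders:

```python
# Placeholder values for illustration only.
update = {
    "authors": "A. Bcd, I. Jkl, E. Fgh",
    "references": [
        {"label": "Paper", "url": "https://example.com/paper"},
        {"label": "Website", "url": "https://example.com"},
    ],
}


def apply_update(project, update):
    """Send the update (needs network access and mpcontribs-client)."""
    from mpcontribs.client import Client  # imported lazily on purpose

    Client(project=project).update_project(update)
```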
Load the data you plan to contribute from disk. As an example,
we load the following CSV file (main.csv) with each line containing the values intended for the queryable columns in the data component for a material. Use mpr.find_structure() and/or mpr.summary.search() to identify MP materials matching your structure, if possible. Only contributions with MP IDs as identifier will show up on MP's materials details pages.
we load a list of associated spectra intended for the tables component from a folder named spectra with the following contents:
we load a list of associated CIF files intended for the structures component as pymatgen structures (folder named structures contains a list of CIF files named by MP ID).
we load data intended for the attachments component directly from files/images on disk using the following directory structure. Attachments can also be created dynamically from standard python lists or dictionaries. See the use of Attachments.from_list() below or its doc string for more details.
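As a rough illustration of the first item, the sketch below parses a small CSV with the standard library csv module (pandas, installed above, would work equally well). The column names and values are placeholders invented for this sketch, not real data:

```python
import csv
import io

# Stand-in for the contents of main.csv; all values are made up.
MAIN_CSV = """identifier,formula,bandgap
mp-0000001,AB2,1.2
mp-0000002,CD,0.0
"""


def load_rows(text):
    """Parse CSV text into a list of dictionaries, one per material."""
    return list(csv.DictReader(io.StringIO(text)))


rows = load_rows(MAIN_CSV)

# For a file on disk, the equivalent would be along these lines:
#     with open("main.csv", newline="") as f:
#         rows = list(csv.DictReader(f))
```

Each row keeps its values as strings, which makes it easy to attach units before submission.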
Initialize the columns for the queryable data component of your contributions. This involves deciding on MPContribs-compatible field names, appropriately grouping / categorizing related columns, setting their units, and adding descriptions to be included in the project info.
Use dot (.) notation to group/nest columns up to 4 levels deep. Most non-alphanumeric characters are disallowed (including underscores and spaces) to encourage better data organization by grouping columns and using short, readable, and type-able column names. Use a good description to explain column names, and use the pipe (|) character to indicate conditions for a column (e.g. max, 300K, ...) where nesting might not be desired. Falling back on CamelCase is also an option for column naming.
The column unit can either be an empty string ("") to indicate a dimensionless number, a string representing a unit supported by pint, or None to indicate that the column values are not numerical.
To remove keys from the other field, explicitly set those keys to None in update_project().
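The naming rules can be checked locally before anything is sent to the API. The sketch below is a conservative approximation, not the API's own validation: it reads "4 levels deep" as at most four dot-separated segments and accepts only letters, digits, and the pipe character. The column names are hypothetical:

```python
import re

MAX_DEPTH = 4
# Conservative approximation: letters, digits, and the pipe character.
SEGMENT = re.compile(r"[A-Za-z0-9|]+")


def check_column_name(name):
    """Return a list of problems found in a dotted column name."""
    problems = []
    parts = name.split(".")
    if len(parts) > MAX_DEPTH:
        problems.append(f"nested deeper than {MAX_DEPTH} levels")
    for part in parts:
        if not SEGMENT.fullmatch(part):
            problems.append(f"disallowed or empty segment {part!r}")
    return problems


# Hypothetical columns: name -> unit.
columns = {
    "formula": None,       # values are not numerical
    "bandgap|max": "eV",   # pint unit; the pipe marks a condition
    "ratio": "",           # dimensionless number
    "elastic.e11": "GPa",  # grouped via dot notation
}

for name in columns:
    assert not check_column_name(name), name

# Removing a key from the project's `other` field: set it to None
# in the dictionary passed to update_project().
removal_update = {"other": {"obsoleteKey": None}}
```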
Prepare the list of contributions.
The preferred data format is to provide numerical data as strings with units rather than naked numerical values. This will be parsed by the API to yield a value and a unit for display and promotes the correct reporting of significant figures.
If you're trying to include list objects in the data component, first convert them into dictionaries by giving each element in the list a name. For instance, in the case of tensors, convert [[1, 2], [3, 4]] to {"e11": 1, "e12": 2, "e21": 3, "e22": 4}. This will make the tensor components queryable.
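Both conventions can be captured in two small helpers. This is a sketch with hypothetical helper names, and the identifier and values are placeholders:

```python
def with_unit(value, unit):
    """Format a value as a 'value unit' string.

    Pass `value` as a string (e.g. "3.10") to keep trailing zeros and
    thereby the intended number of significant figures.
    """
    return f"{value} {unit}".strip()


def name_tensor_components(tensor, prefix="e"):
    """Map a nested list to {'e11': ..., 'e12': ...} using 1-based indices.

    Only unambiguous for dimensions below 10, since the row and column
    indices are concatenated without a separator.
    """
    return {
        f"{prefix}{i}{j}": value
        for i, row in enumerate(tensor, start=1)
        for j, value in enumerate(row, start=1)
    }


contribution = {
    "identifier": "mp-0000001",  # placeholder ID
    "data": {
        "bandgap": with_unit("3.10", "eV"),
        "elastic": name_tensor_components([[1, 2], [3, 4]]),
    },
}
```

With this layout, each tensor component ends up as its own nested key (elastic.e11, elastic.e12, and so on) and can be given a column unit like any other field.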
Submit the contributions and publish the project when ready.
API Docs Page
The mpcontribs-client python library interacts with the MPContribs API to programmatically access or retrieve experimental and theoretical data contributed by the MP community. Project information is retrievable through the projects resource, and the corresponding contributed data through the contributions resource. Each project can contain many contributions for an MP material or composition. Each contribution in turn consists of four (optional) components: free-form hierarchical data, tabular data, crystal structures, and attachments. There are separate dedicated resource endpoints for tables, structures, and attachments. See the Concepts section for more details. Check out the "Models" section on the API Docs page for descriptions of available fields and to try out the API in the browser.