Where did each of the initial structures come from?
Each material in the QMOF Database, and thereby the MOF Explorer, was taken from an existing dataset of MOF structures. Some of these datasets are dedicated to experimentally synthesized MOF structures, whereas others contain hypothetical (i.e. computationally constructed) MOF structures. Below, we outline the various datasets of MOF structures used in constructing the QMOF Database.
The Cambridge Structural Database (CSD) contains experimentally derived crystal structures for over a million materials. Of the crystal structures published on the CSD, approximately 100,000 are included in what is referred to as the CSD MOF Subset. It should be noted that the definition of a MOF in the CSD MOF Subset is more inclusive than that used by many other databases: in addition to more conventional MOF structures, it includes non-porous materials that are arguably best described as coordination polymers.
In the QMOF Database, structures were taken directly from the CSD MOF Subset with free (i.e. unbound) solvent removed from the pores. ConQuest was used to download the structures, and we excluded materials that were flagged as having charge-balancing ions, any errors in the crystal structure, or disorder in the framework. Additionally, we excluded any structures that lacked carbon or hydrogen atoms, had atoms with close interatomic distances, had lone (i.e. unbonded) atoms, or had terminal oxo ligands on metals where such ligands are typically OH groups or water. Several scripts to carry out these fidelity checks can be found here.
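Two of the checks above (presence of carbon and hydrogen, and no overly close interatomic distances) can be illustrated with a short sketch. This is not the code used for the QMOF Database: the function names, the 0.75 Å cutoff, and the input layout (element symbols, fractional coordinates, and a 3x3 lattice of row vectors in Å) are all assumptions chosen for illustration. Distances are evaluated over the 27 neighboring cell images, which is adequate for cells that are not strongly skewed.

```python
import itertools
import math


def min_image_distance(f1, f2, lattice):
    """Shortest distance (in Å) between two sites given in fractional coordinates.

    The difference vector is wrapped into the home cell, then the 27
    neighboring periodic images are scanned. Assumption: the cell is not so
    skewed that a closer image lies outside this neighborhood.
    """
    wrapped = [(f2[k] - f1[k]) - round(f2[k] - f1[k]) for k in range(3)]
    best = math.inf
    for shift in itertools.product((-1, 0, 1), repeat=3):
        delta = [wrapped[k] + shift[k] for k in range(3)]
        # Fractional -> Cartesian with lattice vectors stored as rows.
        cart = [sum(delta[i] * lattice[i][j] for i in range(3)) for j in range(3)]
        best = min(best, math.sqrt(sum(c * c for c in cart)))
    return best


def passes_basic_checks(species, frac_coords, lattice, min_dist=0.75):
    """Return True if the structure has both C and H and no close contacts.

    min_dist is a hypothetical cutoff in Å, not a value taken from the text.
    """
    if "C" not in species or "H" not in species:
        return False
    for i, j in itertools.combinations(range(len(species)), 2):
        if min_image_distance(frac_coords[i], frac_coords[j], lattice) < min_dist:
            return False
    return True
```

The remaining checks described above (lone atoms, suspicious terminal oxo ligands, disorder flags) require bonding analysis or database metadata and are not covered by this sketch.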
The Computation-Ready, Experimental (CoRE) MOF Database contains experimentally derived crystal structures for ~14,000 porous, three-dimensional MOFs. The materials in the CoRE MOF Database were derived from the CSD but are not directly associated with the CSD MOF Subset, although many of the CoRE MOFs can be found in the CSD MOF Subset as well. Unlike the CSD MOF Subset, which provides as-reported crystal structures, the CoRE MOF Database was constructed using a suite of automated and manual structural corrections. As with any automated approach, these corrections are not always executed perfectly and can result in materials with misplaced atoms, under- and over-bonded atoms, charge imbalances, and similar structural fidelity issues that can be detrimental to DFT calculations.
In the QMOF Database, we considered CoRE MOFs that were included in curated lists provided by Chan and Manz and by Kancharlapalli and coworkers to increase the likelihood of having high-fidelity CoRE MOF structures. For consistency, the free solvent-removed (FSR) subset of the CoRE MOF Database was considered. We emphasize that there are many MOFs present in the CoRE MOF Database that we instead adopted from the CSD MOF Subset directly. As such, a user who is specifically interested in which MOFs in the QMOF Database are also present in the CoRE MOF Database should compare the CSD reference codes and/or MOFids for the materials in these two datasets.
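The reference-code comparison suggested above amounts to a set intersection. The following is a minimal sketch, assuming each dataset's CSD reference codes have already been extracted into plain lists of strings; the function name and the example codes are made up for illustration, and any dataset-specific filename suffixes are assumed to have been stripped beforehand.

```python
def shared_refcodes(qmof_refcodes, core_refcodes):
    """Return the sorted CSD reference codes that appear in both lists.

    Codes are compared after trimming whitespace and upper-casing, so that
    trivial formatting differences do not hide a match.
    """
    def normalize(code):
        return code.strip().upper()

    qmof = {normalize(code) for code in qmof_refcodes}
    core = {normalize(code) for code in core_refcodes}
    return sorted(qmof & core)


# Placeholder codes, not real database entries.
overlap = shared_refcodes(["abcdef", "GHIJKL01 "], ["ABCDEF", "MNOPQR"])
```

The same pattern applies to MOFids, although those strings should be compared exactly rather than case-normalized.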
The Topology-Based Crystal Constructor (ToBaCCo) code can generate hypothetical MOFs from known inorganic and organic building blocks (and topologies). Here, the "ToBaCCo" dataset of MOFs specifically refers to those found in the original ToBaCCo paper by Colón, Gómez-Gualdrón, and Snurr. In the QMOF Database, MOFs with triangular Cu-containing nodes were selected from the ToBaCCo dataset, as found here.
The Anderson and Gómez-Gualdrón dataset contains hypothetical MOFs constructed using ToBaCCo. In the QMOF Database, we selected Zr-containing MOFs from this dataset. We also expanded the dataset to include hypothetical Hf-containing MOFs by exchanging the Zr species for Hf.
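At its simplest, a metal exchange of this kind is a relabeling of atomic sites. The sketch below assumes a structure represented as a list of (element, coordinates) tuples; the function name and data layout are hypothetical and do not reflect the tooling actually used. It only swaps element labels and leaves atomic positions and cell parameters unchanged.

```python
def exchange_metal(sites, old="Zr", new="Hf"):
    """Return a new list of sites with every `old` element relabeled as `new`.

    Each site is an (element, coordinates) tuple; coordinates are passed
    through untouched and the input list is not modified.
    """
    return [(new if element == old else element, coords) for element, coords in sites]


# Illustrative fragment only, not a real MOF node.
zr_sites = [("Zr", (0.0, 0.0, 0.0)), ("O", (0.2, 0.0, 0.0)), ("C", (0.3, 0.1, 0.0))]
hf_sites = exchange_metal(zr_sites)
```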
Hypothetical MOFs in the QMOF Database were also adopted from the work of Boyd et al. using the dataset of structures uploaded to the Materials Cloud here. These MOFs were constructed using the TOBASCCO code, as described in prior work by Boyd and Woo. As a result, we refer to these hypothetical MOFs as coming from the Boyd & Woo dataset.
In the QMOF Database, we adopted MOFs from select families in the Boyd & Woo dataset and modified some of these MOFs to diversify our collection. For instance, we occasionally exchanged the metals in the inorganic node, and we constructed Al rod MOFs by exchanging the metals in the pre-existing V rod MOFs and protonating the bridging oxo ligands. We still refer to these structures as being derived from the Boyd & Woo dataset even though custom modifications have been made.