Materials Project Documentation

Related Materials

How related materials are identified on the Materials Project (MP) website.

Last updated 8 months ago

Introduction

The similarity between two structures i and j is assessed on the basis of local coordination information from all sites in the two structures. The four basic steps involved are:

  1. Find near(est) neighbors of all sites in both structures.

  2. Evaluate each coordination pattern via coordination descriptors observed at each site to define site fingerprints.

  3. Compute statistics of the descriptor values across all sites in a structure to define structure fingerprints.

  4. Use structure fingerprints to rate the (dis)similarity between the two (vectors representing the two) structures.

Near-neighbor finding

We use a novel method called CrystalNN to find near(est) neighbors in periodic structures. While the method will be introduced shortly [1,2], it is already available through the Python package pymatgen. A benchmarking framework has been developed to evaluate CrystalNN and compare it to other near-neighbor finding algorithms [3].
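
To illustrate what near-neighbor finding in a periodic structure involves, the following minimal sketch counts the neighbors of one site within a fixed distance cutoff in a cubic cell, including atoms in adjacent periodic images. It is a deliberately simplified stand-in and not the CrystalNN algorithm. The helper name `neighbors_within_cutoff`, the rocksalt lattice parameter, and the cutoff value are assumptions chosen for illustration.

```python
import itertools
import math


def neighbors_within_cutoff(frac_coords, a, center, cutoff):
    """Return (site index, image shift, distance) for every site within
    `cutoff` of site `center` in a cubic cell of edge length `a`.

    Hypothetical helper: a plain distance cutoff, NOT the CrystalNN algorithm.
    Only the 27 nearest periodic images are searched, so `cutoff` must be
    smaller than `a`.
    """
    cx, cy, cz = frac_coords[center]
    found = []
    for j, (x, y, z) in enumerate(frac_coords):
        for shift in itertools.product((-1, 0, 1), repeat=3):
            dx = (x + shift[0] - cx) * a
            dy = (y + shift[1] - cy) * a
            dz = (z + shift[2] - cz) * a
            dist = math.sqrt(dx * dx + dy * dy + dz * dz)
            # Skip the central site itself (distance zero).
            if 1e-8 < dist <= cutoff:
                found.append((j, shift, dist))
    return found


# Conventional rocksalt cell in fractional coordinates (4 Na + 4 Cl).
A_NACL = 5.64  # assumed lattice parameter in angstrom, for illustration only
species = ["Na"] * 4 + ["Cl"] * 4
frac_coords = [
    (0.0, 0.0, 0.0), (0.5, 0.5, 0.0), (0.5, 0.0, 0.5), (0.0, 0.5, 0.5),
    (0.5, 0.0, 0.0), (0.0, 0.5, 0.0), (0.0, 0.0, 0.5), (0.5, 0.5, 0.5),
]

# The cutoff sits between the Na-Cl distance (0.5 a) and the Na-Na distance (~0.71 a).
nn = neighbors_within_cutoff(frac_coords, A_NACL, center=0, cutoff=0.6 * A_NACL)
coordination_number = len(nn)
neighbor_species = {species[j] for j, _, _ in nn}
print(coordination_number, neighbor_species)
```

A fixed cutoff works for this ideal structure because the first and second neighbor shells are well separated. Distorted or low-symmetry environments are where a dedicated near-neighbor algorithm becomes necessary.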

Site Fingerprints

The second step of the structure similarity calculation is the computation of a crystal site fingerprint, $v^{site}$, for each site in the two structures. The fingerprint is a 61-dimensional vector in which each element carries information about the local coordination environment, computed with the site module of the Python package matminer. For example, the first two elements, "wt $\text{CN}_1$" and "single bond $\text{CN}_1$", provide estimates of the likelihood (or weight) with which the given site should be considered 1-fold coordinated (i.e., $w|_{CN=1}$). The third element, "wt $\text{CN}_2$", provides a 2-fold coordination likelihood, whereas the fourth element, "L-shaped $\text{CN}_2$", holds the resemblance to an L-shaped coordination geometry (also called a local structure order parameter), given that we find a coordination configuration with 2 atoms ($q_{L}|_{CN=2}$). The local structure order parameters can assume values between 0, meaning that the observed local environment has no resemblance to the target motif to which it is compared, and 1, which stands for a perfect motif match.

The remaining elements are, in order:

  • "water-like $\text{CN}_2$", "bent 120 degrees $\text{CN}_2$", "bent 150 degrees $\text{CN}_2$", "linear $\text{CN}_2$"

  • "wt $\text{CN}_3$", "trigonal planar $\text{CN}_3$", "trigonal non-coplanar $\text{CN}_3$", "T-shaped $\text{CN}_3$"

  • "wt $\text{CN}_4$", "square co-planar $\text{CN}_4$", "tetrahedral $\text{CN}_4$", "rectangular see-saw-like $\text{CN}_4$", "see-saw-like $\text{CN}_4$", "trigonal pyramidal $\text{CN}_4$"

  • "wt $\text{CN}_5$", "pentagonal planar $\text{CN}_5$", "square pyramidal $\text{CN}_5$", "trigonal bipyramidal $\text{CN}_5$"

  • "wt $\text{CN}_6$", "hexagonal planar $\text{CN}_6$", "octahedral $\text{CN}_6$", "pentagonal pyramidal $\text{CN}_6$"

  • "wt $\text{CN}_7$", "hexagonal pyramidal $\text{CN}_7$", "pentagonal bipyramidal $\text{CN}_7$"

  • "wt $\text{CN}_8$", "body-centered cubic $\text{CN}_8$", "hexagonal bipyramidal $\text{CN}_8$"

  • "wt $\text{CN}_9$", "q2 $\text{CN}_9$", "q4 $\text{CN}_9$", "q6 $\text{CN}_9$"

  • "wt $\text{CN}_{10}$", "q2 $\text{CN}_{10}$", "q4 $\text{CN}_{10}$", "q6 $\text{CN}_{10}$"

  • "wt $\text{CN}_{11}$", "q2 $\text{CN}_{11}$", "q4 $\text{CN}_{11}$", "q6 $\text{CN}_{11}$"

  • "wt $\text{CN}_{12}$", "cuboctahedral $\text{CN}_{12}$", "q2 $\text{CN}_{12}$", "q4 $\text{CN}_{12}$", "q6 $\text{CN}_{12}$"

  • "wt $\text{CN}_{13}$", "wt $\text{CN}_{14}$", "wt $\text{CN}_{15}$", "wt $\text{CN}_{16}$", "wt $\text{CN}_{17}$", "wt $\text{CN}_{18}$", "wt $\text{CN}_{19}$", "wt $\text{CN}_{20}$", "wt $\text{CN}_{21}$", "wt $\text{CN}_{22}$", "wt $\text{CN}_{23}$", and "wt $\text{CN}_{24}$"

Note that $q_n$ refers to the Steinhardt bond orientational order parameter of order n.

The resulting site fingerprint is thus defined as:

$$\mathbf{v}^\text{site} = [w|_{\text{CN}=1}, \quad w|_{\text{CN}=2}, \quad q_\text{L}|_{\text{CN}=2}, \quad q_\text{water}|_{\text{CN}=2}, \quad \dots, \quad w|_{\text{CN}=24}]^\text{T}$$
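
As a quick consistency check on the layout described above, the sketch below rebuilds the ordered list of element labels from the motif names given in the text and confirms that it has 61 entries. The dictionary `MOTIFS_BY_CN` and the function `site_fingerprint_labels` are hypothetical names introduced here. The label strings follow this page's wording and are not guaranteed to match the feature labels that matminer itself emits.

```python
# Motif order parameters per coordination number (CN), transcribed from the text.
# Hypothetical container for illustration; not part of matminer or pymatgen.
MOTIFS_BY_CN = {
    1: ["single bond"],
    2: ["L-shaped", "water-like", "bent 120 degrees", "bent 150 degrees", "linear"],
    3: ["trigonal planar", "trigonal non-coplanar", "T-shaped"],
    4: ["square co-planar", "tetrahedral", "rectangular see-saw-like",
        "see-saw-like", "trigonal pyramidal"],
    5: ["pentagonal planar", "square pyramidal", "trigonal bipyramidal"],
    6: ["hexagonal planar", "octahedral", "pentagonal pyramidal"],
    7: ["hexagonal pyramidal", "pentagonal bipyramidal"],
    8: ["body-centered cubic", "hexagonal bipyramidal"],
    9: ["q2", "q4", "q6"],
    10: ["q2", "q4", "q6"],
    11: ["q2", "q4", "q6"],
    12: ["cuboctahedral", "q2", "q4", "q6"],
}


def site_fingerprint_labels(max_cn=24):
    """Return the ordered labels: for each CN, the weight first, then its motifs."""
    labels = []
    for cn in range(1, max_cn + 1):
        labels.append(f"wt CN{cn}")
        labels.extend(f"{motif} CN{cn}" for motif in MOTIFS_BY_CN.get(cn, []))
    return labels


labels = site_fingerprint_labels()
print(len(labels))
print(labels[:4])
```

Coordination numbers 13 through 24 contribute only a weight, which is why the tail of the fingerprint consists of twelve consecutive "wt" entries.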

Structure Fingerprints

The fingerprints from sites in a given structure are subsequently statistically processed to yield the minimum, maximum, mean, and standard deviation of each coordination information element. The resultant ordered vector defines a structure fingerprint, $v^{struct}$:

$$\mathbf{v}^\text{struct} = [ \min(w|_{\text{CN}=1}), \quad \max(w|_{\text{CN}=1}), \quad \text{mean}(w|_{\text{CN}=1}), \quad \text{std}(w|_{\text{CN}=1}), \quad \dots, \quad \min(w|_{\text{CN}=24}), \quad \max(w|_{\text{CN}=24}), \quad \text{mean}(w|_{\text{CN}=24}), \quad \text{std}(w|_{\text{CN}=24}) ]^\text{T}$$
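
The statistical step can be sketched in a few lines of plain Python. The function `structure_fingerprint` and the toy input are hypothetical. The sketch also assumes the population standard deviation (`statistics.pstdev`), since the text does not say whether the population or the sample form is used.

```python
import statistics


def structure_fingerprint(site_fingerprints):
    """Collapse per-site fingerprints into one flat vector holding
    [min, max, mean, std] for each fingerprint element, in that order.

    Hypothetical helper; the population standard deviation is an assumption.
    """
    n_elements = len(site_fingerprints[0])
    fingerprint = []
    for k in range(n_elements):
        values = [site[k] for site in site_fingerprints]
        fingerprint.extend([
            min(values),
            max(values),
            statistics.fmean(values),
            statistics.pstdev(values),
        ])
    return fingerprint


# Toy input: three sites with two fingerprint elements each.
# A real site fingerprint has 61 elements, giving 4 * 61 = 244 entries.
toy_sites = [
    [1.0, 0.0],
    [0.5, 0.5],
    [0.0, 1.0],
]
v_struct = structure_fingerprint(toy_sites)
print(v_struct)
```

The ordering of the four statistics does not affect the distance between two structures, provided the same ordering is used for both fingerprints.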

Structure Distance/Dissimilarity

Finally, structure similarity is determined by the distance, d, between two structure fingerprints $v_{i}^{struct}$ and $v_{j}^{struct}$:

$$d = || \mathbf{v}_{i}^\text{struct} - \mathbf{v}_{j}^\text{struct} ||$$

A small distance value indicates high similarity between two structures, whereas a large distance (>1) suggests that the structures are very dissimilar. The spinel example below gives an approximate threshold (0.9) up to which two structures can still be considered similar. Anything beyond 0.9 is most certainly not the same structure prototype.
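
A minimal sketch of the distance test follows, assuming the Euclidean norm (the norm computed by `np.linalg.norm` in the example code on this page) and treating 0.9 as a rough rule of thumb, not a hard limit. The names `SIMILARITY_THRESHOLD`, `fingerprint_distance`, and `is_similar`, as well as the toy vectors, are hypothetical.

```python
import math

# Approximate threshold quoted in the text; a rule of thumb, not a hard limit.
SIMILARITY_THRESHOLD = 0.9


def fingerprint_distance(v_i, v_j):
    """Euclidean distance between two structure fingerprints of equal length."""
    return math.dist(v_i, v_j)


def is_similar(v_i, v_j, threshold=SIMILARITY_THRESHOLD):
    """Return True if the two fingerprints are within the similarity threshold."""
    return fingerprint_distance(v_i, v_j) <= threshold


# Toy fingerprints with four entries each (real ones are much longer).
v_a = [0.0, 1.0, 0.5, 0.4]
v_b = [0.0, 1.0, 0.8, 0.0]
v_c = [2.0, 1.0, 0.5, 0.4]

d_ab = fingerprint_distance(v_a, v_b)  # sqrt(0.3**2 + 0.4**2)
d_ac = fingerprint_distance(v_a, v_c)  # only the first entry differs, by 2
print(d_ab, is_similar(v_a, v_b))
print(d_ac, is_similar(v_a, v_c))
```

Because the distance is unbounded above, it is best read as a ranking tool: candidates are sorted by d, and the threshold only marks where matches stop being plausible.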

Examples

  • Diamond (mp-66) vs. $\text{GaAs}$ (mp-2534) $\rightarrow$ d = 0

  • Diamond (mp-66) vs. rocksalt (mp-22862) $\rightarrow$ d = 3.5724

  • Diamond (mp-66) vs. perfect $\text{CaTiO}_3$ perovskite (mp-5827) $\rightarrow$ d = 3.5540

  • Rocksalt (mp-22862) vs. perfect $\text{CaTiO}_3$ perovskite (mp-5827) $\rightarrow$ d = 2.7417

  • $\text{Ca(CoS}_2\text{)}_2$-spinel (mvc-12728) vs. $\text{Si(CdO}_2\text{)}_2$-spinel (mp-560842) $\rightarrow$ d = 0.8877

Below is a Python code snippet that allows you to quickly reproduce the above results. You will need to install pymatgen and matminer for this to work. Both are easily accessible via the Python Package Index. The snippet also imports MPRester from the mp-api package to download the structures.

```python
import numpy as np
from mp_api.client import MPRester
from matminer.featurizers.site import CrystalNNFingerprint
from matminer.featurizers.structure import SiteStatsFingerprint

# An MP API key is required; pass it as MPRester(api_key="...") if it is not already configured.
with MPRester() as mpr:

    # Get structures.
    diamond = mpr.get_structure_by_material_id("mp-66")
    gaas = mpr.get_structure_by_material_id("mp-2534")
    rocksalt = mpr.get_structure_by_material_id("mp-22862")
    perovskite = mpr.get_structure_by_material_id("mp-5827")
    spinel_caco2s4 = mpr.get_structure_by_material_id("mvc-12728")
    spinel_sicd2O4 = mpr.get_structure_by_material_id("mp-560842")

    # Calculate structure fingerprints.
    ssf = SiteStatsFingerprint(
        CrystalNNFingerprint.from_preset('ops', distance_cutoffs=None, x_diff_weight=0),
        stats=('mean', 'std_dev', 'minimum', 'maximum'))
    v_diamond = np.array(ssf.featurize(diamond))
    v_gaas = np.array(ssf.featurize(gaas))
    v_rocksalt = np.array(ssf.featurize(rocksalt))
    v_perovskite = np.array(ssf.featurize(perovskite))
    v_spinel_caco2s4 = np.array(ssf.featurize(spinel_caco2s4))
    v_spinel_sicd2O4 = np.array(ssf.featurize(spinel_sicd2O4))

    # Print out distance between structures.
    print('Distance between diamond and GaAs: {:.4f}'.format(np.linalg.norm(v_diamond - v_gaas)))
    print('Distance between diamond and rocksalt: {:.4f}'.format(np.linalg.norm(v_diamond - v_rocksalt)))
    print('Distance between diamond and perovskite: {:.4f}'.format(np.linalg.norm(v_diamond - v_perovskite)))
    print('Distance between rocksalt and perovskite: {:.4f}'.format(np.linalg.norm(v_rocksalt - v_perovskite)))
    print('Distance between Ca(CoS2)2-spinel and Si(CdO2)2-spinel: {:.4f}'.format(np.linalg.norm(v_spinel_caco2s4 - v_spinel_sicd2O4)))
```

StructureMatcher

Another tool that is used to group materials is the StructureMatcher. There are multiple comparators (for example, SpinComparator, ElementComparator, etc.) that determine how structures are compared when assessing their similarity.

References

[1]: Zimmermann, N. E. R. and Jain, A. (2020). Local structure order parameters and site fingerprints for quantification of coordination environment and crystal structure similarity. RSC Adv., 10, 6063-6081.

[2]: Zimmermann, N. E. R., Horton, M. K., Jain, A. and Haranczyk, M. (2017). Assessing Local Structure Motifs Using Order Parameters for Motif Recognition, Interstitial Identification, and Diffusion Path Characterization. Front. Mater., 4:34. doi: 10.3389/fmats.2017.00034

[3]: Pan, H., Ganose, A. M., Horton, M., Aykol, M., Persson, K. A., Zimmermann, N. E. and Jain, A. (2021). Benchmarking coordination number prediction algorithms on inorganic crystal structures. Inorganic Chemistry, 60(3), 1590-1603.

Authors

Nils Zimmermann, Donny Winston, Handong Ling, Oxana Andriuc