# Related Materials

## Introduction

The similarity between two structures *i* and *j* is assessed on the basis of local coordination information from all sites in the two structures. [\[1,2\]](#references) The four basic steps involved are:

1. Find near(est) neighbors of all sites in both structures.
2. Evaluate each coordination pattern via coordination descriptors observed at each site to define site fingerprints.
3. Compute statistics of the descriptor values across all sites in a structure to define structure fingerprints.
4. Use structure fingerprints to rate the (dis)similarity between the two (vectors representing the two) structures.

## Near-neighbor finding

We use a method called [CrystalNN](https://github.com/materialsproject/pymatgen/blob/master/pymatgen/analysis/local_env.py#L3751) to find near(est) neighbors in periodic structures. The method is available through the Python package [pymatgen](https://github.com/materialsproject/pymatgen). It is described in [\[3\]](#references), together with a benchmarking framework developed to evaluate CrystalNN and compare it to other near-neighbor finding algorithms.

## Site Fingerprints

The second step of the structure similarity calculation is the computation of a crystal site fingerprint, $$\mathbf{v}^\text{site}$$, for each site in the two structures. The fingerprint is a 61-dimensional vector in which each element carries information about the local coordination environment, computed with the *site* module of the Python package [matminer](https://github.com/hackingmaterials/matminer). For example, the first two elements, "wt $$\text{CN}\_1$$" and "single bond $$\text{CN}\_1$$", provide estimates of the likelihood (or weight) with which the given site should be considered 1-fold coordinated (i.e., $$w|\_{\text{CN}=1}$$). The third element, "wt $$\text{CN}\_2$$", provides a 2-fold coordination likelihood, whereas the fourth element, "L-shaped $$\text{CN}\_2$$", holds the resemblance to an L-shaped coordination geometry (also called a local structure order parameter), given that we find a coordination configuration with 2 atoms ($$q\_\text{L}|\_{\text{CN}=2}$$). The local structure order parameters can assume values between 0, meaning that the observed local environment has no resemblance to the target motif to which it is compared, and 1, which stands for a perfect motif match.
The remaining elements are:

* "water-like $$\text{CN}\_2$$", "bent 120 degrees $$\text{CN}\_2$$", "bent 150 degrees $$\text{CN}\_2$$", "linear $$\text{CN}\_2$$"
* "wt $$\text{CN}\_3$$", "trigonal planar $$\text{CN}\_3$$", "trigonal non-coplanar $$\text{CN}\_3$$", "T-shaped $$\text{CN}\_3$$"
* "wt $$\text{CN}\_4$$", "square co-planar $$\text{CN}\_4$$", "tetrahedral $$\text{CN}\_4$$", "rectangular see-saw-like $$\text{CN}\_4$$", "see-saw-like $$\text{CN}\_4$$", "trigonal pyramidal $$\text{CN}\_4$$"
* "wt $$\text{CN}\_5$$", "pentagonal planar $$\text{CN}\_5$$", "square pyramidal $$\text{CN}\_5$$", "trigonal bipyramidal $$\text{CN}\_5$$"
* "wt $$\text{CN}\_6$$", "hexagonal planar $$\text{CN}\_6$$", "octahedral $$\text{CN}\_6$$", "pentagonal pyramidal $$\text{CN}\_6$$"
* "wt $$\text{CN}\_7$$", "hexagonal pyramidal $$\text{CN}\_7$$", "pentagonal bipyramidal $$\text{CN}\_7$$"
* "wt $$\text{CN}\_8$$", "body-centered cubic $$\text{CN}\_8$$", "hexagonal bipyramidal $$\text{CN}\_8$$"
* "wt $$\text{CN}\_9$$", "q2 $$\text{CN}\_9$$", "q4 $$\text{CN}\_9$$", "q6 $$\text{CN}\_9$$"
* "wt $$\text{CN}\_{10}$$", "q2 $$\text{CN}\_{10}$$", "q4 $$\text{CN}\_{10}$$", "q6 $$\text{CN}\_{10}$$"
* "wt $$\text{CN}\_{11}$$", "q2 $$\text{CN}\_{11}$$", "q4 $$\text{CN}\_{11}$$", "q6 $$\text{CN}\_{11}$$"
* "wt $$\text{CN}\_{12}$$", "cuboctahedral $$\text{CN}\_{12}$$", "q2 $$\text{CN}\_{12}$$", "q4 $$\text{CN}\_{12}$$", "q6 $$\text{CN}\_{12}$$"
* "wt $$\text{CN}\_{13}$$", "wt $$\text{CN}\_{14}$$", "wt $$\text{CN}\_{15}$$", "wt $$\text{CN}\_{16}$$", "wt $$\text{CN}\_{17}$$", "wt $$\text{CN}\_{18}$$", "wt $$\text{CN}\_{19}$$", "wt $$\text{CN}\_{20}$$", "wt $$\text{CN}\_{21}$$", "wt $$\text{CN}\_{22}$$", "wt $$\text{CN}\_{23}$$", "wt $$\text{CN}\_{24}$$"

Note that $$q\_n$$ refers to the Steinhardt bond orientational order parameter of order *n*. The resulting site fingerprint is thus defined as:

$$
\mathbf{v}^\text{site} = \[w|\_{\text{CN}=1}, \quad w|\_{\text{CN}=2}, \quad q\_\text{L}|\_{\text{CN}=2}, \quad q\_\text{water}|\_{\text{CN}=2}, \quad \dots, \quad w|\_{\text{CN}=24}]^\text{T}
$$

## Structure Fingerprints

The fingerprints from sites in a given structure are subsequently statistically processed to yield the minimum, maximum, mean, and standard deviation of each coordination information element. The resultant ordered vector defines a structure fingerprint, $$\mathbf{v}^\text{struct}$$:

$$
\mathbf{v}^\text{struct} = \[ \min(w|\_{\text{CN}=1}), \quad \max(w|\_{\text{CN}=1}), \quad \text{mean}(w|\_{\text{CN}=1}), \quad \text{std}(w|\_{\text{CN}=1}), \quad \dots, \quad \min(w|\_{\text{CN}=24}), \quad \max(w|\_{\text{CN}=24}), \quad \text{mean}(w|\_{\text{CN}=24}), \quad \text{std}(w|\_{\text{CN}=24}) ]^\text{T}
$$
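The statistical processing can be sketched with plain NumPy, independently of matminer. The helper name `structure_fingerprint` and the toy input are hypothetical, and the use of the population standard deviation (`ddof=0`) is an assumption about the convention. The output is ordered per element as minimum, maximum, mean, and standard deviation, following the equation above.

```python
import numpy as np


def structure_fingerprint(site_fingerprints):
    """Collapse per-site fingerprints (n_sites x n_features) into one vector.

    For every feature column, the minimum, maximum, mean, and standard
    deviation across sites are computed and stored next to each other.
    """
    sites = np.asarray(site_fingerprints, dtype=float)
    if sites.ndim != 2 or sites.shape[0] == 0:
        raise ValueError("expected a non-empty 2D array of site fingerprints")

    # Shape (n_features, 4): one row per feature, one column per statistic.
    stats = np.stack(
        [sites.min(axis=0), sites.max(axis=0), sites.mean(axis=0), sites.std(axis=0)],
        axis=1,
    )
    # Row-major flattening interleaves the four statistics feature by feature.
    return stats.ravel()


# Toy input: two sites with two features each (not real fingerprint data).
toy_sites = np.array([[0.0, 1.0], [1.0, 1.0]])
v_struct = structure_fingerprint(toy_sites)
print(v_struct)

# With 61 features per site, the structure fingerprint has 4 * 61 = 244 entries.
print(structure_fingerprint(np.zeros((3, 61))).shape)
```

Because each of the 61 site-fingerprint elements contributes four statistics, a structure fingerprint built this way has 244 elements regardless of the number of sites.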

## Structure Distance/Dissimilarity

Finally, structure dissimilarity is determined by the distance, *d*, between two structure fingerprints $$\mathbf{v}\_{i}^\text{struct}$$ and $$\mathbf{v}\_{j}^\text{struct}$$:

$$
d = || \mathbf{v}\_{i}^\text{struct} - \mathbf{v}\_{j}^\text{struct} ||
$$

A small distance value indicates high similarity between two structures, whereas a large distance (>1) suggests that the structures are very dissimilar. The spinel example below gives an approximate threshold (**0.9**) up to which you can still consider two structures to be similar. Anything beyond 0.9 is most certainly not the same structure prototype.

The following function may also be used to convert this distance metric to a similarity value between 0 and 1:

$$
s = e^{-|| \mathbf{v}\_{i}^\text{struct} - \mathbf{v}\_{j}^\text{struct} ||}
$$

This similarity metric, *s*, is positively correlated with the similarity between the structures and can be easily converted to a percentage.
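The distance, the similarity conversion, and the approximate 0.9 threshold can be combined in a few lines. This is a minimal sketch using only the standard library: the function names are hypothetical, the distance is taken to be the Euclidean norm, and the distance values are the ones listed in the Examples section of this page.

```python
import math


def fingerprint_distance(v_i, v_j):
    """Euclidean distance d between two structure fingerprints."""
    if len(v_i) != len(v_j):
        raise ValueError("fingerprints must have the same length")
    return math.dist(v_i, v_j)


def similarity(distance):
    """Map a distance d >= 0 to a similarity s = exp(-d) in (0, 1]."""
    return math.exp(-distance)


def within_similarity_threshold(distance, threshold=0.9):
    """Apply the approximate threshold quoted in the text (0.9)."""
    return distance <= threshold


# Distances as reported in the Examples section.
reported_distances = {
    "diamond vs. GaAs": 0.0,
    "diamond vs. rocksalt": 3.5724,
    "diamond vs. perovskite": 3.5540,
    "rocksalt vs. perovskite": 2.7417,
    "Ca(CoS2)2-spinel vs. Si(CdO2)2-spinel": 0.8877,
}

for pair, d in reported_distances.items():
    print(
        f"{pair}: d = {d:.4f}, "
        f"s = {similarity(d) * 100:.2f}%, "
        f"within threshold: {within_similarity_threshold(d)}"
    )
```

Identical fingerprints give *d* = 0 and therefore a similarity of 100%, while the similarity decays exponentially as the distance grows.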

## Examples

* Diamond ([mp-66](https://materialsproject.org/materials/mp-66/)) vs. $$\text{GaAs}$$ ([mp-2534](https://materialsproject.org/materials/mp-2534/)) $$\rightarrow$$ *d* = 0
* Diamond ([mp-66](https://materialsproject.org/materials/mp-66/)) vs. rocksalt ([mp-22862](https://materialsproject.org/materials/mp-22862/)) $$\rightarrow$$ *d* = 3.5724
* Diamond ([mp-66](https://materialsproject.org/materials/mp-66/)) vs. perfect $$\text{CaTiO}\_3$$ perovskite ([mp-5827](https://materialsproject.org/materials/mp-5827/)) $$\rightarrow$$ *d* = 3.5540
* Rocksalt ([mp-22862](https://materialsproject.org/materials/mp-22862/)) vs. perfect $$\text{CaTiO}\_3$$ perovskite ([mp-5827](https://materialsproject.org/materials/mp-5827/)) $$\rightarrow$$ *d* = 2.7417
* $$\text{Ca(CoS}\_2\text{)}\_2$$-spinel ([mp-1408976](https://next-gen.materialsproject.org/materials/mp-1408976?material_ids=mp-1408976)) vs. $$\text{Si(CdO}\_2\text{)}\_2$$-spinel ([mp-560842](https://materialsproject.org/materials/mp-560842/)) $$\rightarrow$$ *d* = 0.8877

Below is a Python code snippet that allows you to quickly reproduce the above results. You will need to install [pymatgen](https://github.com/materialsproject/pymatgen) and [matminer](https://github.com/hackingmaterials/matminer) for this to work. The snippet also imports `MPRester` from `mp_api.client`, which is provided by the `mp-api` package. All of these are easily accessible via the [Python Package Index](https://pypi.python.org/pypi).

```python
import numpy as np
from mp_api.client import MPRester
from matminer.featurizers.site import CrystalNNFingerprint
from matminer.featurizers.structure import SiteStatsFingerprint

with MPRester() as mpr:

    # Get structures
    diamond = mpr.get_structure_by_material_id("mp-66")
    gaas = mpr.get_structure_by_material_id("mp-2534")
    rocksalt = mpr.get_structure_by_material_id("mp-22862")
    perovskite = mpr.get_structure_by_material_id("mp-5827")
    spinel_caco2s4 = mpr.get_structure_by_material_id("mp-1408976")
    spinel_sicd2O4 = mpr.get_structure_by_material_id("mp-560842")

# Calculate structure fingerprints
ssf = SiteStatsFingerprint(
    CrystalNNFingerprint.from_preset('ops', distance_cutoffs=None, x_diff_weight=0),
    stats=('mean', 'std_dev', 'minimum', 'maximum'))
v_diamond = np.array(ssf.featurize(diamond))
v_gaas = np.array(ssf.featurize(gaas))
v_rocksalt = np.array(ssf.featurize(rocksalt))
v_perovskite = np.array(ssf.featurize(perovskite))
v_spinel_caco2s4 = np.array(ssf.featurize(spinel_caco2s4))
v_spinel_sicd2O4 = np.array(ssf.featurize(spinel_sicd2O4))

# Print out distance between structures
print('Distance between diamond and GaAs: {:.4f}'.format(np.linalg.norm(v_diamond - v_gaas)))
print('Distance between diamond and rocksalt: {:.4f}'.format(np.linalg.norm(v_diamond - v_rocksalt)))
print('Distance between diamond and perovskite: {:.4f}'.format(np.linalg.norm(v_diamond - v_perovskite)))
print('Distance between rocksalt and perovskite: {:.4f}'.format(np.linalg.norm(v_rocksalt - v_perovskite)))
print('Distance between Ca(CoS2)2-spinel and Si(CdO2)2-spinel: {:.4f}'.format(np.linalg.norm(v_spinel_caco2s4 - v_spinel_sicd2O4)))
    
# Print out structure similarity percentages
print('Diamond and GaAs Similarity: {:.2f}%'.format(np.exp(-np.linalg.norm(v_diamond - v_gaas)) * 100))
print('Diamond and rocksalt Similarity: {:.2f}%'.format(np.exp(-np.linalg.norm(v_diamond - v_rocksalt)) * 100))
print('Diamond and perovskite Similarity: {:.2f}%'.format(np.exp(-np.linalg.norm(v_diamond - v_perovskite)) * 100))
print('Rocksalt and perovskite Similarity: {:.2f}%'.format(np.exp(-np.linalg.norm(v_rocksalt - v_perovskite)) * 100))
print('Ca(CoS2)2-spinel and Si(CdO2)2-spinel Similarity: {:.2f}%'.format(np.exp(-np.linalg.norm(v_spinel_caco2s4 - v_spinel_sicd2O4)) * 100))
```

## StructureMatcher

Another tool that is used to group materials is the [StructureMatcher](https://github.com/materialsproject/pymatgen/blob/master/pymatgen/analysis/structure_matcher.py#L292). There are multiple comparators (for example: [SpinComparator](https://github.com/materialsproject/pymatgen/blob/master/pymatgen/analysis/structure_matcher.py#L135), [ElementComparator](https://github.com/materialsproject/pymatgen/blob/master/pymatgen/analysis/structure_matcher.py#L176), etc.) that can be used to determine how to make comparisons between structures when determining their similarity.

## References

\[1]: Zimmermann, N. E. R. and Jain, A., Local structure order parameters and site fingerprints for quantification of coordination environment and crystal structure similarity, ***RSC Adv.***, 2020, **10**, 6063-6081

\[2]: Zimmermann NER, Horton MK, Jain A and Haranczyk M (2017) Assessing Local Structure Motifs Using Order Parameters for Motif Recognition, Interstitial Identification, and Diffusion Path Characterization. *Front. Mater.* 4:34. doi: 10.3389/fmats.2017.00034

\[3]: Pan, H., Ganose, A. M., Horton, M., Aykol, M., Persson, K. A., Zimmermann, N. E., & Jain, A. (2021). Benchmarking coordination number prediction algorithms on inorganic crystal structures. *Inorganic chemistry*, *60*(3), 1590-1603.

## Authors

Nils Zimmermann, Donny Winston, Handong Ling, Oxana Andriuc

