
Document

CrySPY (pronounced as crispy) is a crystal structure prediction tool written in Python.
CrySPY automates the following:

  • Structure generation
  • Submitting jobs for structure optimization
  • Collecting data for structure optimization
  • Selecting candidates using machine learning

CrySPY can be installed with pip install csp-cryspy.


Latest version

CrySPY 1.4.0 (2025 June 17)

News

Discussions

Discussions in GitHub (questions and comments)

License

CrySPY is distributed under the MIT License
Copyright (c) 2018 CrySPY Development Team

Code contributors

  • Tomoki Yamashita and Lab members (Nagaoka University of Technology)
  • Nobuya Sato (National Institute of Advanced Industrial Science and Technology)
  • Hiori Kino (National Institute for Materials Science)
  • Kei Terayama (Yokohama City University)
  • Hikaru Sawahata (Kanazawa University)
  • Shinichi Kanehira (Osaka University)

Reference

  • CrySPY(software)
    • T. Yamashita, S. Kanehira, N. Sato, H. Kino, H. Sawahata, T. Sato, F. Utsuno, K. Tsuda, T. Miyake, and T. Oguchi,
      “CrySPY: a crystal structure prediction tool accelerated by machine learning”,
      Sci. Technol. Adv. Mater. Meth. 1, 87 (2021). Link
  • Bayesian optimization
    • T. Yamashita, N. Sato, H. Kino, T. Miyake, K. Tsuda, and T. Oguchi,
      “Crystal structure prediction accelerated by Bayesian optimization”,
      Phys. Rev. Mater. 2, 013803 (2018). Link
    • N. Sato, T. Yamashita, T. Oguchi, K. Hukushima, and T. Miyake,
      “Adjusting the descriptor for a crystal structure search using Bayesian optimization”,
      Phys. Rev. Mater. 4, 033801 (2020). Link
  • Bayesian optimization and evolutionary algorithm
    • T. Yamashita, H. Kino, K. Tsuda, T. Miyake, and T. Oguchi,
      “Hybrid algorithm of Bayesian optimization and evolutionary algorithm in crystal structure prediction”,
      Sci. Technol. Adv. Mater. Meth. 2, 67 (2022). Link
  • LAQA
    • K. Terayama, T. Yamashita, T. Oguchi, and K. Tsuda,
      “Fine-grained optimization method for crystal structure prediction”,
      npj Comput. Mater. 4, 32 (2018). Link
    • T. Yamashita and H. Sekine,
      “Improvement of look ahead based on quadratic approximation for crystal structure prediction”,
      Sci. Technol. Adv. Mater. Meth. 2, 84 (2022). Link



Subsections of Version information

Version 1.4.0

Important change

New algorithm: EA-vc

Interactive mode

EA

  • tot_struc is no longer used in EA. The number of structures in the first generation is now determined by n_pop.

Interatomic distance check after structure optimization

  • Added check_mindist_opt to the [option] section in cryspy.in.
  • Default: check_mindist_opt = True.
  • After structure relaxation, a check is performed to ensure that the minimum interatomic distance constraint is satisfied.
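As an illustration of what such a check involves, the sketch below tests whether every interatomic distance in a periodic cell stays above a per-pair threshold. It is not CrySPY's implementation: the function name min_distance_ok, the dictionary layout of mindist, and the restriction to the 27 nearest periodic images are assumptions made for this example.

```python
import itertools
import math


def min_distance_ok(lattice, frac_coords, species, mindist):
    """Return True if every atom pair is at least mindist[(a, b)] apart.

    lattice:     3x3 nested list, rows are lattice vectors in Angstrom
    frac_coords: list of N fractional coordinates [x, y, z]
    species:     list of N element symbols
    mindist:     dict mapping a sorted (elem_a, elem_b) tuple to a distance
                 (hypothetical layout, chosen for this sketch)
    """
    # Only the 27 neighboring images are scanned; this is an assumption that
    # holds for cells that are not strongly skewed.
    shifts = list(itertools.product((-1, 0, 1), repeat=3))
    n = len(frac_coords)
    for i, j in itertools.combinations_with_replacement(range(n), 2):
        limit = mindist[tuple(sorted((species[i], species[j])))]
        for shift in shifts:
            if i == j and shift == (0, 0, 0):
                continue  # an atom is not compared with itself in the same cell
            # Fractional difference vector including the periodic shift
            df = [frac_coords[j][k] + shift[k] - frac_coords[i][k] for k in range(3)]
            # Fractional -> Cartesian using the lattice vectors (rows)
            cart = [sum(df[k] * lattice[k][c] for k in range(3)) for c in range(3)]
            if math.sqrt(sum(x * x for x in cart)) < limit:
                return False
    return True


if __name__ == "__main__":
    cell = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]  # hypothetical cubic cell
    positions = [[0.0, 0.0, 0.0], [0.5, 0.5, 0.5]]
    print(min_distance_ok(cell, positions, ["Cu", "Cu"], {("Cu", "Cu"): 2.0}))
```

A structure that fails such a test after relaxation would violate the minimum interatomic distance constraint described above.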

Common

  • The natot parameter in cryspy.in has been removed.
  • Ctrl_ext has been removed.

Fixed

  • Fixed a bug related to using TS as the score in BO.
  • Several other minor fixes.

Version 1.3.0

Important change

Common

  • working directory name
    work000000 –> work0
  • Data used to be pickled in tuples that grouped several items together; each item is now pickled individually.
    For example, rs_id_data.pkl –> id_queueing.pkl and id_running.pkl

BO

Fixed

soiap

  • support for recent pymatgen

Added

  • Random structure generation and structure generation by EA are now available as libraries. See Features > As library.

for developer

  • We stopped using global variables (rin); input data is now held in a dataclass.
  • Many of the input variables were lists, but we changed them to tuples.

Version 1.2.5

Bug fix

  • simple bug fix

Version 1.2.4

Bug fix

EA

  • default value of cls_lat: equal –> random

EA-vc

  • Test version of variable-composition EA (EA-vc). Only binary systems are supported for now.

Version 1.2.2

Enthalpy

You can use enthalpy instead of energy for VASP and QE.

See also

Version 1.2.1

ASE interface

Bug fixed for multiple stages.

Version 1.1.1

Bug fix for spg_error

In random structure generation, when a structure could not be generated for a certain space group, the space group number was recorded in the variable spg_error and skipped thereafter. A bug was found in which the number was, in rare cases, registered incorrectly. The spg_error function has therefore been removed.

Version 1.1.0

Parallelization with MPI

Random structure generation using MPI is now available.

See also

LAQA

Updated the score formula to take the stress term into account (T. Yamashita and H. Sekine, Sci. Technol. Adv. Mater. Meth. 2, 84 (2022)).

See also

Backup

Files are copied to a directory named by the date and time inside the “backup” directory.
See features/backup for details.

Version 1.0.0

Install and run

CrySPY is now available in PyPI. You can install by

pip install csp-cryspy

The executable script, cryspy, is automatically installed in your PATH. To run CrySPY, just type cryspy:

cryspy &

CrySPY stops once before going to the next selection (BO, LAQA) or the next generation (EA). For example, in the EA case:

[old version]

  • cryspy run
    • check jobs (finish current generation?)
    • structure generation by EA automatically starts

[CrySPY 1.0.0]

  • cryspy run
    • check jobs (finish current generation?)
    • stop
  • cryspy run
    • auto backup
    • structure generation by EA automatically starts

Auto and manual backup

Backups are made automatically:

  • before going to next selection or next generation
  • structure generation

To manually back up:

cryspy -b

See features/backup for details.

Clean

cryspy -c

See features/clean for details.

Directory tree

Changed the directory tree.

  • genstruc/RS –> RS/
  • genstruc/EA –> EA/
  • genstruc/struc_util.py –> util/
  • utility.py –> util/

IO

  • Fixed standard output file and standard error file: log_cryspy and err_cryspy
  • cryspy.out is obsolete

Moved to CrySPY Utility

With the change in installation method, examples and cal_fingerprint have been moved to the CrySPY Utility.

COMBO

The Python library COMBO is now optional in CrySPY. If you do not use Bayesian optimization, you do not need to install it.

New calc_code

  • ext: Deprecated in version 1.4.0.

cryspy.in

fppath

New input variable for cal_fingerprint. See Installation/cryspy/cryspy_1.0

fwpath

New input variable for find_wy. See Installation/requirements/find_wy

mindist

  • mindist can be omitted in cryspy.in
  • mindist_ea is obsolete
  • added mindist_mol_bs and mindist_mol_bs_factor in cryspy.in

Version 0.10.3 or earlier

  • [2022 May 17] version 0.10.3 released
    • Bug fixed: LAMMPS IO.
  • [2022 January 24] version 0.10.2 released
    • Added nrot: maximum number of times to rotate molecules in mol_bs
  • [2021 September 30] version 0.10.1 released
    • Fixed the problem of numpy.random.seed in multiprocessing
  • [2021 July 25] version 0.10.0 released
    • Support PyXtal 0.2.9 or later
    • LAQA can be used with QE
    • Upper and lower limits of energy for EA and BO
  • [2021 July 13] paper published
    • Our paper on CrySPY software has been published in STAM:Methods
  • [2021 March 18] version 0.9.2 released
    • Support pymatgen v2022.
  • [2021 February 7] version 0.9.0 released
    • Interfaced with OpenMX
    • Employ PyXtal library to generate initial structures
    • If you use PyXtal (default), find_wy program is not required
    • LAQA can be used with soiap
    • Change the name: [lattice] section –> [structure] section
    • Several input variables move to [structure] section
      • natot: [basic] –> [structure]
      • atype: [basic] –> [structure]
      • nat: [basic] –> [structure]
      • maxcnt: [option] –> [structure]
      • symprec: [option] –> [structure]
      • spgnum: [option] –> [structure]
    • New features
      • Molecular crystal structure generation
      • Scale volume
  • [2020 March 19] paper published
  • [2020 February 16] version 0.8.0 released
    • Migrate to Python 3
    • CrySPY logo created
    • Change several variable names and data formats
    • Change style of output for energy: eV/cell –> eV/atom
    • IDs of working directories corresponds to structure IDs
    • New features
      • recalculation
      • manual select in BO
  • [2018 December 5] version 0.7.0 released
    • New features
      • Evolutionary algorithm
  • [2018 August 20] version 0.6.4 released
  • [2018 July 2] version 0.6.3 released
  • [2018 June 26] Version 0.6.2 released
  • [2018 March 1] Version 0.6.1 released
  • [2018 January 9] paper published

Subsections of Installation

Subsections of System requirements

Python


CrySPY 1.3.0 or later

  • Python >= 3.8
    • PyXtal (>= 0.5.3)
    • (optional) mpi4py
    • (optional, required if algo is BO) PHYSBO (Not COMBO)
    • (optional, required if algo is BO) dscribe
    • (optional) nglview

If you install csp-cryspy with pip, necessary libraries such as PyXtal, pymatgen, and ASE will be installed automatically. Go to Installation > CrySPY for detail.

2025 June 17: tested and confirmed to work in the following environment (as installed via pip install csp-cryspy)

  • python 3.13.5
  • CrySPY 1.4.0
  • numpy 1.26.4 (with physbo) and 2.3.0 (without physbo; physbo requires numpy < 2.0)
  • pandas 2.3.0
  • pymatgen 2025.6.14
  • pyxtal 1.0.9
  • scipy 1.15.3
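To compare your own environment with the versions listed above, a small script along the following lines may help. It is only a sketch: the helper names installed_version and numpy_is_compatible_with_physbo are made up for this example, and the numpy check simply encodes the note above that physbo requires numpy < 2.0.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def numpy_is_compatible_with_physbo(numpy_version):
    """True if the numpy major version is below 2 (the constraint noted above)."""
    return int(numpy_version.split(".")[0]) < 2


if __name__ == "__main__":
    # Distribution names as used on PyPI; physbo and dscribe are only needed for BO
    for name in ("csp-cryspy", "numpy", "pandas", "pymatgen",
                 "pyxtal", "scipy", "physbo", "dscribe"):
        print(f"{name:12s} {installed_version(name)}")
```

Packages that are not installed are reported as None, which is expected for the optional ones.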

Quick install

pip install csp-cryspy

When using BO

pip install dscribe physbo

CrySPY 1.1.0 – 1.2.5

  • Python >= 3.8
    • PyXtal (>= 0.5.3)
    • (optional) mpi4py
    • (optional, required if algo is BO) COMBO

If you install csp-cryspy with pip, necessary libraries such as PyXtal will be installed automatically. Go to Installation > CrySPY. Manual installation of COMBO is required when using Bayesian optimization.

CrySPY 1.0.0

  • Python >= 3.8
    • PyXtal (>= 0.5.3)
    • (optional, required if algo is BO) COMBO

Info

[2023 April 22] We figured out how to install PyXtal (pyshtools) on arm64 MacOS. See Arm64 on MacOS (without Rosetta 2)
[2023 March 15] On MacOS, it is difficult to install PyXtal in the arm64 environment, so it is recommended to use the x86_64 environment with Rosetta 2.

CrySPY 0.10.0 – 0.10.3

Tested with Homebrew Python 3.8.x and 3.9.x on Mac and Python 3.8.x on Linux.

CrySPY 0.9.2

Tested with Homebrew Python 3.8.x and 3.9.x on Mac and Python 3.8.x on Linux.

Info

[2021 July 15] If you use PyXtal >= 0.2.9, update CrySPY to the version 0.10.0 or later.

Info

[2021 March 18] There is a breaking change in pymatgen 2022.x.x. CrySPY 0.9.2 and PyXtal 0.2.2 support this change in pymatgen.

Info

[2021 Feb. 5] PyXtal depends on numba, but numba does not support Python 3.9. So you should use Python 3.8.x for a while.
[2021 March 18] Currently numba supports Python 3.9.x.

Info

[2021 Feb. 7] PyXtal requires SciPy, but the latest version of SciPy (v1.6.0) might include a bug for deepcopy. You should use SciPy v1.5.4 for a while.
[2021 March 18] This bug has been fixed in SciPy v1.6.1.

CrySPY 0.9.0 – 0.9.1

CrySPY 0.8.0 or earlier

See the old documentation, which is included with CrySPY itself.

Structure optimizer


At least one optimizer is required.

find_wy (optional)

CrySPY originally used find_wy to generate a random structure for a given space group (symmetry). Since version 0.9.0, however, CrySPY uses the PyXtal library for structure generation by default. In CrySPY 0.9.0 or later you can skip installing find_wy, although you may still use it. For CrySPY 0.8.x or earlier, find_wy is required to generate a random structure for a given space group.

Info

You can skip installing find_wy in CrySPY 0.9.0 or later.

Installation of find_wy

m_tspace

First, you need to compile m_tspace for find_wy. Check these sites to compile it.

Download the source code of m_tspace in an arbitrary directory. For example:

$ mkdir -p ~/local
$ cd ~/local
$ git clone https://github.com/nim-hrkn/m_tspace.git

Two additional files are required to compile m_tspace. Download the following files into ~/local/m_tspace from TSPACE:

$ cd m_tspace
$ wget http://phoenix.mp.es.osaka-u.ac.jp/~tspace/tspace_main/tsp07/tsp98.f
$ wget http://phoenix.mp.es.osaka-u.ac.jp/~tspace/tspace_main/tsp07/prmtsp.f

Edit the makefile and run the make command. If you use ifort, it is better to remove the -check all option and use the -O2 option.

$ emacs makefile
$ head -n 4 makefile
#FC=gfortran
#FFLAGS=-g -cpp -DUSE_GEN -ffixed-line-length-255
FC=ifort
FFLAGS=-O2 -g -traceback -cpp -DUSE_GEN -132
$ make

If you use gfortran, you might encounter the following error:

tsp98.f:9839:32:

       CALL SUBGRP(MG,JG,MGT,JGT,NTAB,IND)
                                1
Error: Actual argument contains too few elements for dummy argument 'ntab' (12/48) at (1)
make: *** [tsp98.o] Error 1

In that case, change line 9925 of tsp98.f as follows:

Before:

9913: C SUBROUTINE SUBGRP ====*====3====*====4====*====5====*====6====*====7
9914: C
9915: C    IF (JG(I),I=1,MG) IS A SUBGROUP OF (JGT(J),J=1,MGT) THEN
9916: C          TABLE (NTAB(I),I=1,MG) IS MADE HERE AND IND=0
9917: C    ELSE
9918: C          IND=-1
9919: C
9920: C                 1993/12/25
9921: C                   BY  S.TANAKA AND A. YANASE
9922: C---*----1----*----2----*----3----*----4----*----5----*----6----*----7
9923: C
9924:       SUBROUTINE SUBGRP(MG,JG,MGT,JGT,NTAB,IND)
9925:       DIMENSION NTAB(48),JG(48),JGT(48)

After:

9913: C SUBROUTINE SUBGRP ====*====3====*====4====*====5====*====6====*====7
9914: C
9915: C    IF (JG(I),I=1,MG) IS A SUBGROUP OF (JGT(J),J=1,MGT) THEN
9916: C          TABLE (NTAB(I),I=1,MG) IS MADE HERE AND IND=0
9917: C    ELSE
9918: C          IND=-1
9919: C
9920: C                 1993/12/25
9921: C                   BY  S.TANAKA AND A. YANASE
9922: C---*----1----*----2----*----3----*----4----*----5----*----6----*----7
9923: C
9924:       SUBROUTINE SUBGRP(MG,JG,MGT,JGT,NTAB,IND)
9925:       DIMENSION NTAB(12),JG(48),JGT(48)

If the compilation succeeds, you get m_tsp.a.

find_wy

Check these sites to compile find_wy:

Download the source code of find_wy in an arbitrary directory. For example:

$ mkdir -p ~/local
$ cd ~/local
$ git clone https://github.com/nim-hrkn/find_wy.git

Edit make.inc and set the path to the m_tsp.a that you just prepared.

$ cd find_wy
$ emacs make.inc
$ head -n 4 make.inc
TSPPATH=~/local/m_tspace
#INCPATH = -I $(TSPPATH)
TSP=$(TSPPATH)/m_tsp.a

You can remove the -check all option and use the -O2 option. Then run the make command.

$ make

Once you have the find_wy executable, run the following command as a test:

$ ./find_wy input_sample/input_si4o8.txt

If there is no problem, the file POS_WY_SKEL_ALL.json is generated.

Executable file of find_wy

CrySPY 1.0.0 or later

Put the find_wy executable in a directory on your PATH, or specify its path in cryspy.in as follows:

[structure]
use_find_wy = True
fwpath = /xxx/xxx/xxx/find_wy

CrySPY 0.10.3 or earlier

When you use find_wy, put the executable file of find_wy in ~/CrySPY_root/CrySPY-x.x.x/CrySPY/find_wy/, so that the executable file path is ~/CrySPY_root/CrySPY-x.x.x/CrySPY/find_wy/find_wy.

Subsections of CrySPY

CrySPY 1.3.0 or later

CrySPY

pip

pip install csp-cryspy

Please note that the name on PyPI is csp-cryspy, not cryspy. The executable script, cryspy, is automatically installed in your PATH. You can check this with:

which cryspy
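The same check can be made from Python, which is convenient inside a notebook or a setup script. This is a minimal sketch using only the standard library; the helper name find_executable is an assumption for this example.

```python
import shutil


def find_executable(name):
    """Return the full path of an executable found on PATH, or None if not found."""
    return shutil.which(name)


if __name__ == "__main__":
    # Prints the path of the cryspy script, or None if it is not on PATH
    print(find_executable("cryspy"))
```

If the result is None, the directory into which pip installs scripts is probably not on your PATH.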

Editable mode

If you want to change the source code of CrySPY, you can use pip’s editable mode (-e option).

git clone https://github.com/Tomoki-YAMASHITA/CrySPY.git
pip install -e ./CrySPY

Instead of git clone, you can download the compressed file from the release page

PHYSBO and DScribe (optional)

If you use Bayesian optimization, PHYSBO and DScribe are required.

pip install physbo dscribe

Info

cal_fingerprint program and COMBO are obsolete.

mpi4py (optional)

When performing random structure generation with MPI parallelization, mpi4py is required.

pip install mpi4py

Jupyter and nglview (optional)

For analysis on a local PC or in interactive mode, Jupyter is required. If you want to visualize crystal structures using nglview in interactive mode, install nglview with pip.

pip install nglview

CrySPY 1.0.0 -- 1.2.5

CrySPY

pip

CrySPY 1.0.0 or later can be installed by pip.

pip install csp-cryspy

The executable script, cryspy, is automatically installed in your PATH. You can check this with:

which cryspy

Editable mode

If you want to change the source code of CrySPY, you can use pip’s editable mode (-e option).

git clone https://github.com/Tomoki-YAMASHITA/CrySPY.git
pip install -e ./CrySPY

Instead of git clone, you can download the compressed file from the release page

cal_fingerprint (optional)

cal_fingerprint is a program to calculate structure descriptors and is required if algo is BO. From CrySPY 1.0.0, the cal_fingerprint program is included in CrySPY utility. See Installation/CrySPY_utility/Compile cal_fingerprint for compilation.

Put the cal_fingerprint executable in a directory on your PATH, or specify its path in cryspy.in as follows:

[BO]
fppath = /xxx/xxx/xxx/cal_fingerprint

Arm64 on MacOS (without Rosetta 2)

Info

In PyXtal, starting from version 0.6.3, pyshtools is no longer mandatory. Therefore, you can ignore the information written below if you are using a recent version of PyXtal.

  1. Install miniforge3 (We do not know how to install pyshtools with homebrew python.)
  2. Install pymatgen, pyshtools by conda (recent versions of pyshtools are available in conda-forge)
conda install pymatgen
conda install pyshtools
  3. Install CrySPY
pip3 install csp-cryspy

CrySPY 0.10.3 or earlier

Installation of CrySPY is very simple. Just download it!

Download

You can put the source code of CrySPY in an arbitrary directory. For example, let us put the source code in ~/CrySPY_root/CrySPY-x.x.x (x.x.x means the version). Use git or download the compressed file.

Git

mkdir ~/CrySPY_root
cd ~/CrySPY_root
git clone https://github.com/Tomoki-YAMASHITA/CrySPY.git CrySPY-x.x.x

zip or tar.gz file

Download the source as a zip or tar.gz file from the GitHub release page.
Then place the source at a path such as ~/CrySPY_root/CrySPY-x.x.x

Directory tree

Directory tree in ~/CrySPY_root/CrySPY-x.x.x/:

CrySPY-x.x.x
├── CHANGELOG.md
├── CrySPY/
│   ├── BO/
│   ├── EA/
│   ├── IO/
│   ├── LAQA/
│   ├── RS/
│   ├── __init__.py
│   ├── calc_dscrpt/
│   ├── f-fingerprint/
│   ├── find_wy/
│   ├── gen_struc/
│   ├── interface/
│   ├── job/
│   ├── start/
│   └── utility.py
├── LICENSE
├── README.md
├── cryspy.py
├── docs/
├── example/
└── utility/

Info

The main script is cryspy.py.

Setup (optional)

find_wy (optional)

When you use find_wy, put the executable file of find_wy in ~/CrySPY_root/CrySPY-x.x.x/CrySPY/find_wy/, so that the executable file path is ~/CrySPY_root/CrySPY-x.x.x/CrySPY/find_wy/find_wy.

cd ~/CrySPY_root/CrySPY-x.x.x/CrySPY/find_wy
cp ~/local/find_wy/find_wy .

Compile cal_fingerprint (optional)

When you use Bayesian optimization, compile the cal_fingerprint program, which calculates structure descriptors.

cd ~/CrySPY_root/CrySPY-x.x.x/CrySPY/f-fingerprint
emacs Makefile
make

Make sure that the executable file of cal_fingerprint exists in ~/CrySPY_root/CrySPY-x.x.x/CrySPY/f-fingerprint/.

CrySPY utility (optional)

Setting up a Python environment on your local PC is useful for analyzing CrySPY results. Utility tools (Jupyter notebooks and Python scripts) are available for analysis and visualization. Input examples are also included in CrySPY utility.

Info

You need several Python libraries such as

Download CrySPY utility

Git

$ git clone https://github.com/Tomoki-YAMASHITA/CrySPY_utility.git

zip

Go to CrySPY utility and click the green Code button, then choose Download ZIP.

Subsections of Tutorial

Random Search (RS)

Info

ASE is an easy starting point for beginners because it is installed automatically together with CrySPY. Although not highly accurate, the interatomic potentials provided by ASE are lightweight and fast, making them suitable for testing on a laptop or other low-spec machines.

Preparation of input files

Follow one of the instructions below, then proceed to the section on running CrySPY.

Running CrySPY

  1. Check cryspy.in
  2. (version 0.10.3 or earlier) Script to run
  3. First run
  4. Submit job
  5. Check results
  6. Append structures
  7. Analysis and visualization

Subsections of Random Search (RS)

ASE on your local PC

2025 June 16, updated

ASE provides interfaces to different codes. ASE also includes a pure-Python EMT calculator, which is suitable for testing CrySPY because its structure optimization is fast and easy.

In this tutorial, we use CrySPY on your local PC (Mac or Linux). The target system is Cu 8 atoms.

Assumption

Here, we assume the following conditions:

  • CrySPY 1.2.0 or later on your local PC
  • CrySPY job filename: job_cryspy
  • ase input filename: ase_in.py

Input files

Move to your working directory, and copy the example files by one of the following methods.

cd ase_Cu8_RS
tree
.
├── calc_in
│   ├── ase_in.py_1
│   └── job_cryspy
└── cryspy.in

cryspy.in

cryspy.in is the input file of CrySPY.

[basic]
algo = RS
calc_code = ASE
tot_struc = 5
nstage = 1
njob = 5
jobcmd = zsh
jobfile = job_cryspy

[structure]
atype = Cu
nat = 8

[ASE]
ase_python = ase_in.py

[option]

In [basic] section, jobcmd = zsh can be changed to jobcmd = sh or jobcmd = bash in accordance with your environment. CrySPY runs zsh job_cryspy as a background job internally.
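Because cryspy.in uses an INI-style layout, it can be inspected with Python's standard configparser module, for example to see which command the jobcmd and jobfile settings combine into. The sketch below is only an illustration of the file layout shown above, not a description of how CrySPY itself reads the file; the helper name job_command is hypothetical.

```python
import configparser
import shlex

# Same content as the cryspy.in example above
CRYSPY_IN = """\
[basic]
algo = RS
calc_code = ASE
tot_struc = 5
nstage = 1
njob = 5
jobcmd = zsh
jobfile = job_cryspy

[structure]
atype = Cu
nat = 8

[ASE]
ase_python = ase_in.py

[option]
"""


def job_command(text):
    """Build the command line '<jobcmd> <jobfile>' from the [basic] section."""
    config = configparser.ConfigParser()
    config.read_string(text)
    basic = config["basic"]
    # shlex.split allows jobcmd values that contain options
    return shlex.split(basic["jobcmd"]) + [basic["jobfile"]]


if __name__ == "__main__":
    print(job_command(CRYSPY_IN))
```

Changing jobcmd to sh or bash in the text changes only the first element of the resulting command.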

[ASE] section is required when you use ASE.

You can name the following files whatever you want:

  • jobfile: job_cryspy
  • ase_python: ase_in.py

The other input variables are discussed later.

calc_in directory

The job file and input files for ASE are prepared in this directory.

Job file

The name of the job file must match the value of jobfile in cryspy.in. An example job file (here, job_cryspy) is shown below.

#!/bin/sh

# ---------- ASE
python3 ase_in.py

# ---------- CrySPY
sed -i -e '3 s/^.*$/done/' stat_job

You can choose the input file name (here, ase_in.py), but it must match the value of ase_python in cryspy.in. In CrySPY, you must add sed -i -e '3 s/^.*$/done/' stat_job at the end of the job file.

Note

sed -i -e '3 s/^.*$/done/' stat_job is required at the end of the job file.
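To make clear what this sed command does, the following sketch performs the same edit in Python: it overwrites the entire third line of stat_job with the word done and leaves all other lines unchanged. The function name mark_job_done is an assumption for this example; the sed line in the job file remains what CrySPY expects.

```python
from pathlib import Path


def mark_job_done(stat_job="stat_job"):
    """Replace the whole third line of the file with 'done', like the sed command."""
    path = Path(stat_job)
    lines = path.read_text().splitlines(keepends=True)
    if len(lines) < 3:
        return  # like sed, do nothing if there is no third line
    # Keep the original line ending of line 3, if it had one
    newline = "\n" if lines[2].endswith("\n") else ""
    lines[2] = "done" + newline
    path.write_text("".join(lines))
```

Calling mark_job_done() in the working directory of a job has the same effect on stat_job as the sed line.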

Tip

In the job file of CrySPY, the string CrySPY_ID is automatically replaced with the structure ID. When you use a job scheduler such as PBS or SLURM, it is useful to include the structure ID in the job name. For example, in the PBS system, #PBS -N Si_CrySPY_ID becomes #PBS -N Si_10 for structure ID 10. Note that a job name starting with a number results in an error, so add a prefix such as Si_.
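The substitution described in this tip amounts to a plain string replacement. The sketch below reproduces the example from the tip; the function name fill_structure_id is hypothetical and the code is not taken from CrySPY.

```python
def fill_structure_id(job_text, structure_id):
    """Replace every occurrence of the placeholder CrySPY_ID with the structure ID."""
    return job_text.replace("CrySPY_ID", str(structure_id))


if __name__ == "__main__":
    print(fill_structure_id("#PBS -N Si_CrySPY_ID", 10))  # prints "#PBS -N Si_10"
```

Without the Si_ prefix the resulting job name would be just the number, which is why the tip recommends a prefix.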

Input for ASE

Input files based on the number of stages (nstage in cryspy.in) are required. Name the input file(s) with a suffix _x. Here x means the stage number.

We are using nstage = 1 in this ASE tutorial, so we need only ase_in.py_1. ase_in.py_1 is listed below. Refer to the ASE documentation for details.
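The naming rule can be written down in a few lines, which is handy for checking that the calc_in directory is complete before the first run. This is a sketch under the naming convention described above; the helper names stage_input_names and missing_stage_inputs are assumptions for this example.

```python
from pathlib import Path


def stage_input_names(base_name, nstage):
    """Return the per-stage input file names, e.g. ase_in.py_1 for stage 1."""
    return [f"{base_name}_{stage}" for stage in range(1, nstage + 1)]


def missing_stage_inputs(calc_in, base_name, nstage):
    """List the stage input files that are not present in the calc_in directory."""
    return [name for name in stage_input_names(base_name, nstage)
            if not (Path(calc_in) / name).is_file()]


if __name__ == "__main__":
    print(stage_input_names("ase_in.py", 1))
    print(missing_stage_inputs("calc_in", "ase_in.py", 1))
```

An empty list from missing_stage_inputs means that every stage has its input file.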

from ase.constraints import FixSymmetry
from ase.filters import FrechetCellFilter
from ase.calculators.emt import EMT
from ase.optimize import BFGS
import numpy as np
from ase.io import read, write

# ---------- input structure
# CrySPY outputs 'POSCAR' as an input file in work/xxxxxx directory
atoms = read('POSCAR', format='vasp')

# ---------- setting and run
atoms.calc = EMT()
atoms.set_constraint([FixSymmetry(atoms)])
cell_filter = FrechetCellFilter(atoms, hydrostatic_strain=False)
opt = BFGS(cell_filter)
opt.run(fmax=0.01, steps=2000)

# ---------- opt. structure and energy
# [rule in ASE interface]
# output file for energy: 'log.tote' in eV/cell
#                         CrySPY reads the last line of 'log.tote'
# output file for structure: 'CONTCAR' in vasp format
e = cell_filter.atoms.get_total_energy()
with open('log.tote', mode='w') as f:
    f.write(str(e))

# ------ write structure
opt_atoms = cell_filter.atoms.copy()
opt_atoms.set_constraint(None)    # remove constraint for pymatgen
write('CONTCAR', opt_atoms, format='vasp', direct=True)

Unlike the inputs for VASP and QE, the ASE input (a Python script) is flexible, but CrySPY imposes two rules:

  1. The energy is written in units of eV/cell to the log.tote file. CrySPY reads the last line of it.
  2. The optimized structure is written to the CONTCAR file in the VASP format.
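To verify by hand that a finished job followed the first rule, you can read log.tote yourself. The sketch below takes the last non-empty line as the energy in eV/cell and divides by the number of atoms to obtain eV/atom. The function name read_energy_per_atom and the skipping of empty lines are assumptions for this example, not CrySPY's own reader.

```python
from pathlib import Path


def read_energy_per_atom(log_path, natoms):
    """Read the last non-empty line of log.tote (eV/cell) and convert it to eV/atom."""
    lines = [ln for ln in Path(log_path).read_text().splitlines() if ln.strip()]
    energy_per_cell = float(lines[-1])
    return energy_per_cell / natoms
```

For the Cu 8-atom example, read_energy_per_atom("log.tote", 8) would be called inside the working directory of a finished structure.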

Running CrySPY

Go to Running CrySPY

soiap on your local PC

2025 March 6 update

soiap (Structure Optimization with InterAtomic Potential) is suitable for testing CrySPY because of its fast structure optimization. See instructions to install soiap.

In this tutorial, we use CrySPY on your local PC (Mac or Linux). The target system is Si 8 atoms.

Assumption

Here, we assume the following conditions:

  • (only version 0.10.3 or earlier) CrySPY main script: ~/CrySPY_root/CrySPY-0.9.0/cryspy.py
  • CrySPY job filename: job_cryspy
  • soiap executable file: ~/local/soiap-0.3.0/src/soiap
  • soiap input filename: soiap.in
  • soiap output filename: soiap.out
  • soiap input structure filename: initial.cif

Input files

Move to your working directory, and copy input example files by one of the following methods.

  • Download from CrySPY_utility/examples/soiap_Si8_RS
  • Copy from CrySPY utility that you installed
  • (only version 0.10.3 or earlier) cp -r ~/CrySPY_root/CrySPY-0.9.0/example/v0.9.0/soiap_RS_Si8 .

cd soiap_RS_Si8
tree
.
├── calc_in
│   ├── job_cryspy
│   └── soiap.in_1
└── cryspy.in

cryspy.in

cryspy.in is the input file of CrySPY.

[basic]
algo = RS
calc_code = soiap
tot_struc = 5
nstage = 1
njob = 2
jobcmd = zsh
jobfile = job_cryspy

[structure]
atype = Si
nat = 8

[soiap]
soiap_infile = soiap.in
soiap_outfile = soiap.out
soiap_cif = initial.cif

[option]

In [basic] section, jobcmd = zsh can be changed to jobcmd = sh or jobcmd = bash in accordance with your environment. CrySPY runs zsh job_cryspy as a background job internally.

[soiap] section is required when you use soiap.

You can name the following files whatever you want:

  • jobfile
  • soiap_infile
  • soiap_outfile
  • soiap_cif

The other input variables are discussed later.

calc_in directory

The job file and input files for soiap are prepared in this directory.

Job file

The name of the job file must match the value of jobfile in cryspy.in. An example job file (here, job_cryspy) is shown below.

#!/bin/sh

# ---------- soiap
EXEPATH=/path/to/soiap
$EXEPATH/soiap soiap.in 2>&1 > soiap.out

# ---------- CrySPY
sed -i -e '3 s/^.*$/done/' stat_job

Change /path/to/soiap to the correct path for your environment. You can choose the input (soiap.in) and output (soiap.out) file names, but they must match the values of soiap_infile and soiap_outfile in cryspy.in. The job file is written in the same way as the one you usually use, except for the last line. In CrySPY, you must add sed -i -e '3 s/^.*$/done/' stat_job at the end of the job file.

Note

sed -i -e '3 s/^.*$/done/' stat_job is required at the end of the job file.

Tip

In the job file of CrySPY, the string “CrySPY_ID” is automatically replaced with the structure ID. When you use a job scheduler such as PBS or SLURM, it is useful to include the structure ID in the job name. For example, in the PBS system, #PBS -N Si_CrySPY_ID becomes #PBS -N Si_10 for structure ID 10. Note that a job name starting with a number results in an error, so add a prefix such as Si_.

Input for soiap

Input files based on the number of stages (nstage in cryspy.in) are required. Name the input file(s) with a suffix _x. Here x means the stage number.

We are using nstage = 1, so we need only soiap.in_1.

soiap.in_1 is listed below.

crystal initial.cif ! CIF file for the initial structure
symmetry 1 ! 0: not symmetrize displacements of the atoms or 1: symmetrize

md_mode_cell 3 ! cell-relaxation method
               ! 0: FIRE, 2: quenched MD, or 3: RFC5
number_max_relax_cell 100 ! max. number of the cell relaxation
number_max_relax 1 ! max. number of the atom relaxation
max_displacement 0.1 ! max. displacement of atoms in Bohr

external_stress_v 0.0 0.0 0.0 ! external pressure in GPa

th_force 5d-5 ! convergence threshold for the force in Hartree a.u.
th_stress 5d-7 ! convergence threshold for the stress in Hartree a.u.

force_field 1 ! force field
              ! 1: Stillinger-Weber for Si, 2: Tsuneyuki potential for SiO2,
              ! 3: ZRL for Si-O-N-H, 4: ADP for Nd-Fe-B, 5: Jmatgen, or
              ! 6: Lennard-Jones

The input structure file is specified on the first line. Use the same name as the value of soiap_cif in cryspy.in.
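A mismatch between the crystal line of soiap.in_1 and soiap_cif in cryspy.in is easy to overlook, so a small consistency check can be useful. The sketch below extracts the file name that follows the crystal keyword, ignoring everything after the ! comment marker used in the listing above. The function name crystal_file_in_soiap_input is hypothetical, and the parsing covers only the simple layout shown in this tutorial.

```python
def crystal_file_in_soiap_input(soiap_text):
    """Return the file name given by the 'crystal' keyword, ignoring '!' comments."""
    for line in soiap_text.splitlines():
        fields = line.split("!", 1)[0].split()
        if len(fields) >= 2 and fields[0] == "crystal":
            return fields[1]
    return None


if __name__ == "__main__":
    first_line = "crystal initial.cif ! CIF file for the initial structure"
    soiap_cif = "initial.cif"  # value of soiap_cif in cryspy.in
    print(crystal_file_in_soiap_input(first_line) == soiap_cif)
```

The comparison should be true for the example files used in this tutorial.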

Running CrySPY

Go to Running CrySPY

VASP

2025 March 6 update

In this tutorial, we use CrySPY on a PC cluster with a job scheduler system such as PBS. Here we employ VASP. The target system is Na8Cl8, 16 atoms.

Assumption

Here, we assume the following conditions:

  • CrySPY 1.2.0 or later on your PC cluster
  • CrySPY job command: qsub
  • CrySPY job filename: job_cryspy
  • VASP executable file: vasp_std on your PC cluster

Input files

Move to your working directory, and copy the example files by one of the following methods.

cd vasp_Na8Cl8_RS
tree
.
├── calc_in
│   ├── INCAR_1
│   ├── INCAR_2
│   ├── POTCAR
│   ├── POTCAR_is_dummy
│   └── job_cryspy
└── cryspy.in

cryspy.in

cryspy.in is the input file of CrySPY.

[basic]
algo = RS
calc_code = VASP
tot_struc = 5
nstage = 2
njob = 2
jobcmd = qsub
jobfile = job_cryspy

[structure]
atype = Na Cl
nat = 8 8
mindist_1 = 2.5 1.5
mindist_2 = 1.5 2.5

[VASP]
kppvol = 40 80

[option]

In [basic] section, jobcmd = qsub can be changed in accordance with your environment. CrySPY runs qsub job_cryspy as a background job internally in this setting. You can name the following file whatever you want:

  • jobfile

We adopt a stage-based system for structure optimization calculations. Here, we use nstage = 2. For example, users can configure the following settings. In the first stage, only the ionic positions are relaxed, fixing the cell shape, with low k-point grid density. Next, the ionic positions and cell shape are fully relaxed with high accuracy in the second stage.

The [VASP] section is required when you use VASP. You have to specify the k-point grid density (Å^-3) for each stage in kppvol.

Info

See Input file > Kpoint for details of kppvol

The other input variables are discussed later.

calc_in directory

The job file and input files for VASP are prepared in this directory.

Job file

The name of the job file must match the value of jobfile in cryspy.in. An example job file (here, job_cryspy) is shown below.

#!/bin/sh
#$ -cwd
#$ -V -S /bin/bash
####$ -V -S /bin/zsh
#$ -N Na8Cl8_CrySPY_ID
#$ -pe smp 20
####$ -q ibis1.q
####$ -q ibis2.q
####$ -q ibis3.q
####$ -q ibis4.q

# ---------- vasp
VASPROOT=/usr/local/vasp/vasp.6.4.2/bin
mpirun -np $NSLOTS $VASPROOT/vasp_std

# ---------- CrySPY
sed -i -e '3 s/^.*$/done/' stat_job

Change VASPROOT to the appropriate path for your environment. The job file is written in the same way as the one you usually use, except for the last line. In CrySPY, you must add sed -i -e '3 s/^.*$/done/' stat_job at the end of the job file.

Note

sed -i -e '3 s/^.*$/done/' stat_job is required at the end of the job file.

Tip

In the job file of CrySPY, the string “CrySPY_ID” is automatically replaced with the structure ID. When you use a job scheduler such as PBS or SLURM, it is useful to include the structure ID in the job name. For example, in the PBS system, #PBS -N Si_CrySPY_ID becomes #PBS -N Si_10 for structure ID 10. Note that a job name starting with a number results in an error, so add a prefix such as Si_.

Input for VASP

Input files based on the number of stages (nstage in cryspy.in) are required. Name the input file(s) with a suffix _x. Here x means the stage number.

We are using nstage = 2, so we need INCAR_1 and INCAR_2. Here, INCAR_1 is set to fix the cell and relax only the ionic positions, while INCAR_2 is configured to fully relax both the cell and ionic positions.

INCAR_1

SYSTEM = NaCl
!!!LREAL = Auto
Algo = Fast
NSW = 40

LWAVE = .FALSE.
!LCHARG = .FALSE.

ISPIN =  1

ISMEAR = 0
SIGMA = 0.1

IBRION = 2
ISIF = 2

EDIFF = 1e-5
EDIFFG = -0.01

INCAR_2

SYSTEM = NaCl
!!LREAL = Auto
Algo = Fast
NSW = 200

ENCUT = 341

!!LWAVE = .FALSE.
!!LCHARG = .FALSE.


ISPIN =  1

ISMEAR = 0
SIGMA = 0.1

IBRION = 2
ISIF = 3

EDIFF = 1e-5
EDIFFG = -0.01

CrySPY automatically generates the POSCAR and KPOINTS files. You have to prepare the POTCAR file yourself; the POTCAR included in this example is empty.

Warning

POTCAR in this example is empty. We cannot distribute it.

Running CrySPY

Go to Running CrySPY

QE

2025 March 6, updated

In this tutorial, we use CrySPY on a machine with a job scheduler system such as PBS. Here we employ QUANTUM ESPRESSO (QE). The target system is Si with 8 atoms.

Assumption

Here, we assume the following conditions:

  • CrySPY job command: qsub
  • CrySPY job filename: job_cryspy
  • QE executable file: /usr/local/qe-6.5/bin/pw.x
  • QE input filename: pwscf.in
  • QE output filename: pwscf.out

Input files

Move to your working directory, and copy input example files by one of the following methods.

  • Download from CrySPY_utility/examples/qe_Si8_RS
  • Copy from CrySPY utility that you installed
  • (only version 0.10.3 or earlier) cp -r ~/CrySPY_root/CrySPY-0.9.0/example/v0.9.0/QE_Si8_RS .
cd QE_RS_Si8
tree
.
├── calc_in
│   ├── job_cryspy
│   ├── pwscf.in_1
│   └── pwscf.in_2
└── cryspy.in

cryspy.in

cryspy.in is the input file of CrySPY.

[basic]
algo = RS
calc_code = QE
tot_struc = 5
nstage = 2
njob = 2
jobcmd = qsub
jobfile = job_cryspy

[structure]
atype = Si
nat = 8

[QE]
qe_infile = pwscf.in
qe_outfile = pwscf.out
kppvol =  40  80

[option]

In [basic] section, jobcmd = qsub can be changed in accordance with your environment. CrySPY runs qsub job_cryspy as a background job internally in this setting.

We adopt a stage-based system for structure optimization calculations. Here, we use nstage = 2. For example, users can configure the following settings. In the first stage, only the ionic positions are relaxed, fixing the cell shape, with low k-point grid density. Next, the ionic positions and cell shape are fully relaxed with high accuracy in the second stage.

[QE] section is required when you use QE. You have to specify k-point grid density (Å^-3) for each stage in kppvol.

Info

See Input file > Kpoint for details of kppvol

You can name the following files whatever you want:

  • jobfile
  • qe_infile
  • qe_outfile

The other input variables are discussed later.
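Because kppvol takes one value per stage, a quick consistency check against nstage can catch a mismatch early. The sketch below reads the example cryspy.in with Python's standard configparser. This is an assumption made for illustration; it is not necessarily how CrySPY itself parses the file.

```python
import configparser

CRYSPY_IN = """\
[basic]
algo = RS
calc_code = QE
tot_struc = 5
nstage = 2
njob = 2
jobcmd = qsub
jobfile = job_cryspy

[structure]
atype = Si
nat = 8

[QE]
qe_infile = pwscf.in
qe_outfile = pwscf.out
kppvol =  40  80

[option]
"""

config = configparser.ConfigParser()
config.read_string(CRYSPY_IN)

nstage = config.getint("basic", "nstage")
kppvol = [float(v) for v in config.get("QE", "kppvol").split()]
if len(kppvol) != nstage:
    raise ValueError(f"kppvol has {len(kppvol)} values but nstage = {nstage}")

# Map each stage number to its k-point grid density.
kppvol_by_stage = {stage: kppvol[stage - 1] for stage in range(1, nstage + 1)}
print(kppvol_by_stage)
```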

calc_in directory

The job file and input files for QE are prepared in this directory.

Job file

The name of the job file must match the value of jobfile in cryspy.in. An example job file (here, job_cryspy) is shown below.

#!/bin/sh
#$ -cwd
#$ -V -S /bin/bash
####$ -V -S /bin/zsh
#$ -N Si8_CrySPY_ID
#$ -pe smp 20
####$ -q ibis1.q
####$ -q ibis2.q

mpirun -np $NSLOTS /path/to/pw.x < pwscf.in > pwscf.out


if [ -e "CRASH" ]; then
    sed -i -e '3 s/^.*$/skip/' stat_job
    exit 1
fi

sed -i -e '3 s/^.*$/done/' stat_job

Change /path/to/pw.x to the appropriate path suitable for your environment. You can specify the input (pwscf.in) and output (pwscf.out) file names, but they must match the values of qe_infile and qe_outfile in cryspy.in.

The job file is written in the same way as the one you usually use except for the last line. You must add sed -i -e '3 s/^.*$/done/' stat_job at the end of the file in CrySPY.

Note

sed -i -e '3 s/^.*$/done/' stat_job is required at the end of the job file.

Tip

In the CrySPY job file, the string “CrySPY_ID” is automatically replaced with the structure ID. When you use a job scheduler such as PBS or SLURM, it is useful to include the structure ID in the job name. For example, in the PBS system, #PBS -N Si_CrySPY_ID is replaced with #PBS -N Si_10 for structure ID 10. Note that a job name starting with a number results in an error, so you should add a prefix such as Si_.

Input for QE

Input files based on the number of stages (nstage in cryspy.in) are required. Name the input file(s) with a suffix _x. Here x means the stage number.

We are using nstage = 2, so we need pwscf.in_1 and pwscf.in_2. Here, pwscf.in_1 is set to fix the cell and relax only the ionic positions, while pwscf.in_2 is configured to fully relax both the cell and ionic positions.

pwscf.in_1

 &control
    title = 'Si8'
    calculation = 'relax'
    nstep = 100
    restart_mode = 'from_scratch',
    pseudo_dir = '/usr/local/pslibrary.1.0.0/pbe/PSEUDOPOTENTIALS/'
    outdir='./out.d/'
 /

 &system
    ibrav = 0
    nat = 8
    ntyp = 1
    ecutwfc = 44.0
    occupations = 'smearing'
    degauss = 0.01
 /

 &electrons
 /

 &ions
 /

 &cell
 /

ATOMIC_SPECIES
  Si  28.086  Si.pbe-n-kjpaw_psl.1.0.0.UPF

pwscf.in_2

 &control
    title = 'Si8'
    calculation = 'vc-relax'
    nstep = 200
    restart_mode = 'from_scratch',
    pseudo_dir = '/usr/local/pslibrary.1.0.0/pbe/PSEUDOPOTENTIALS/'
    outdir='./out.d/'
 /

 &system
    ibrav = 0
    nat = 8
    ntyp = 1
    ecutwfc = 44.0
    occupations = 'smearing'
    degauss = 0.01
 /

 &electrons
 /

 &ions
 /

 &cell
 /

ATOMIC_SPECIES
  Si  28.086  Si.pbe-n-kjpaw_psl.1.0.0.UPF

Change pseudo_dir to the appropriate directory for your environment. Inputs for structure data and k-points, such as ATOMIC_POSITIONS and K_POINTS, are automatically appended by CrySPY using pymatgen. Users do not have to prepare them in pwscf.in_x.

Running CrySPY

Go to Running CrySPY

OpenMX

Coming soon.

LAMMPS

Coming soon.

Check cryspy.in

2025 June 16, updated

See Input file for details.

Let’s take a look at cryspy.in again. It may differ slightly depending on the calc_code you chose.

[basic]
algo = RS
calc_code = ASE
tot_struc = 5
nstage = 1
njob = 5
jobcmd = zsh
jobfile = job_cryspy

[structure]
atype = Cu
nat = 8

[ASE]
ase_python = ase_in.py

[option]

[basic] section

  • algo: Algorithm. Set RS for Random Search.
  • calc_code: Structure optimizer. Choose from VASP, QE, OMX, soiap, LAMMPS, ASE
  • tot_struc: The total number of structures. In this case, 5 random structures are generated on the first run.
  • nstage: The number of stages. It’s up to you.
  • njob: The number of jobs running at the same time. For example, with njob = 2, CrySPY keeps 2 slots for structure optimization; in other words, it optimizes 2 structures at a time.
  • jobcmd: Command for jobs. Use bash, zsh, qsub, and so on.
  • jobfile: File name of the job file.

[structure] section

  • atype: Atom type. e.g. for Na8Cl8: atype = Na Cl.
  • nat: The number of atoms corresponding to each atype. e.g. for Na8Cl8: nat = 8 8
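The pairing between atype and nat is positional: the first count belongs to the first element, and so on. A small sketch makes this explicit. The helper name is hypothetical and is not part of CrySPY.

```python
def composition(atype: str, nat: str) -> dict[str, int]:
    """Pair each element in atype with the matching count in nat."""
    symbols = atype.split()
    counts = [int(n) for n in nat.split()]
    if len(symbols) != len(counts):
        raise ValueError("atype and nat must have the same number of entries")
    return dict(zip(symbols, counts))


nacl = composition("Na Cl", "8 8")
print(nacl, "total atoms:", sum(nacl.values()))
```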

Script to run

Note

For version 1.0.0 or later, skip this page. The executable script is automatically installed.

Assumption

Here, we assume the following condition:

  • CrySPY main script: ~/CrySPY_root/CrySPY-0.9.0/cryspy.py

Make script

Let’s make a convenient shell script to avoid typing long commands over and over again. Here, we create the script, cryspy (any file name will do).

$ emacs cryspy
$ chmod 744 cryspy
$ cat cryspy
#!/bin/sh

python3 -u ~/CrySPY_root/CrySPY-0.9.0/cryspy.py 1>> log 2>> err

The -u option (unbuffered output) can be omitted.

You can put this script in your $PATH, or simply run it with bash ./cryspy.

First run

2025 March 6, updated

Make sure you have the following in your working directory.

  • calc_in/
  • (cryspy)
  • cryspy.in
$ ls
calc_in/  cryspy.in

Then, run CrySPY!

cryspy

If you use an old version (0.10.3 or earlier):

bash ./cryspy

On the first run, CrySPY goes into structure generation mode. CrySPY stops after generating the 5 structures.

If it worked properly, the following output appears on the screen:

[2025-03-06 18:52:21,495][cryspy_init][INFO] 


Start CrySPY 1.4.0b10


[2025-03-06 18:52:21,495][cryspy_init][INFO] # ---------- Library version info
[2025-03-06 18:52:21,495][cryspy_init][INFO] pandas version: 2.2.2
[2025-03-06 18:52:21,495][cryspy_init][INFO] pymatgen version: 2025.1.24
[2025-03-06 18:52:21,495][cryspy_init][INFO] pyxtal version: 1.0.6
[2025-03-06 18:52:21,495][cryspy_init][INFO] # ---------- Read input file, cryspy.in
[2025-03-06 18:52:21,496][write_input][INFO] [basic]
[2025-03-06 18:52:21,496][write_input][INFO] algo = RS
[2025-03-06 18:52:21,496][write_input][INFO] calc_code = ASE
[2025-03-06 18:52:21,496][write_input][INFO] tot_struc = 5
[2025-03-06 18:52:21,496][write_input][INFO] nstage = 1
[2025-03-06 18:52:21,496][write_input][INFO] njob = 2
[2025-03-06 18:52:21,496][write_input][INFO] jobcmd = zsh
[2025-03-06 18:52:21,496][write_input][INFO] jobfile = job_cryspy
...
(omitted)
...
[2025-03-06 18:52:21,497][rs_gen][INFO] # ---------- Initial structure generation
[2025-03-06 18:52:21,497][rs_gen][INFO] # ------ mindist
[2025-03-06 18:52:21,498][struc_util][INFO] Cu - Cu: 1.32
[2025-03-06 18:52:21,498][rs_gen][INFO] # ------ generate structures
[2025-03-06 18:52:21,519][gen_pyxtal][INFO] Structure ID      0: (8,) Space group:  31 -->  31 Pmn2_1
[2025-03-06 18:52:21,525][gen_pyxtal][INFO] Structure ID      1: (8,) Space group: 198 --> 198 P2_13
[2025-03-06 18:52:21,554][gen_pyxtal][INFO] Structure ID      2: (8,) Space group:   4 -->   4 P2_1
[2025-03-06 18:52:21,580][gen_pyxtal][INFO] Structure ID      3: (8,) Space group: 193 --> 191 P6/mmm
[2025-03-06 18:52:21,581][gen_pyxtal][WARNING] Compoisition [8] not compatible with symmetry 172:     spg = 172 retry.
[2025-03-06 18:52:21,625][gen_pyxtal][INFO] Structure ID      4: (8,) Space group:  64 -->  64 Cmce
[2025-03-06 18:52:22,013][cryspy_init][INFO] Elapsed time for structure generation: 0:00:00.516183

Several output files are also generated.

  • (cryspy.out): Short log. only version 0.10.3 or earlier.
  • cryspy.stat: Status file.
  • data/init_POSCARS: Initial structure file in POSCAR format. You can open this file using VESTA.
  • data/pkl_data: Directory to save pickled data.
  • log_cryspy: log.
  • err_cryspy: error and warning.

Let’s take a look at cryspy.stat file.

[status]
id_queueing = 0 1 2 3 4

Structure IDs 0 – 4 are queueing because we have just generated the structures and have not submitted them yet.

Tip

Check the initial structures. If the distances between atoms are too short, you should set mindist in cryspy.in.
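One way to check interatomic distances is to compute the shortest distance in a cell, including periodic images. The following is a minimal pure-Python sketch under stated assumptions: the lattice is given as three row vectors in Å, the coordinates are fractional, and only the 27 nearest images are searched. That is adequate for reasonably shaped cells but not for strongly skewed ones. In practice a library such as pymatgen or ASE would normally be used, and the function name here is hypothetical.

```python
import itertools
import math


def min_distance(lattice, frac_coords):
    """Shortest interatomic distance, checking the 27 nearest periodic images."""
    best = math.inf
    shifts = list(itertools.product((-1, 0, 1), repeat=3))
    pairs = itertools.combinations_with_replacement(range(len(frac_coords)), 2)
    for i, j in pairs:
        for shift in shifts:
            if i == j and shift == (0, 0, 0):
                continue  # skip an atom paired with itself in the same cell
            d_frac = [frac_coords[j][k] + shift[k] - frac_coords[i][k]
                      for k in range(3)]
            # Cartesian vector = sum over k of d_frac[k] * lattice vector k
            d_cart = [sum(d_frac[k] * lattice[k][c] for k in range(3))
                      for c in range(3)]
            best = min(best, math.sqrt(sum(x * x for x in d_cart)))
    return best


# Hypothetical test cell: a 4 Å cube with two atoms.
cubic = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]
d_half = min_distance(cubic, [[0.0, 0.0, 0.0], [0.5, 0.0, 0.0]])
d_wrap = min_distance(cubic, [[0.0, 0.0, 0.0], [0.9, 0.0, 0.0]])
print(d_half, d_wrap)
```

The second case shows why the images matter: the atoms look 3.6 Å apart inside the cell, but the nearest periodic image is much closer.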

Submit job

2023 July 10, updated

Continue

CrySPY continues the simulation if the cryspy.stat file exists.

Tip

Continue if you have cryspy.stat
Start from the beginning if you don’t have cryspy.stat

Submit job

Run CrySPY again.

cryspy

Check the screen or log_cryspy file.

[2023-07-10 18:52:51,859][cryspy_restart][INFO] 


Restart CrySPY 1.2.0


[2023-07-10 18:52:51,869][ctrl_job][INFO] # ---------- job status
[2023-07-10 18:52:51,904][ctrl_job][INFO] ID      0: submit job, Stage 1
[2023-07-10 18:52:51,931][ctrl_job][INFO] ID      1: submit job, Stage 1

Also check the cryspy.stat file.

...
(omit)
...
[status]
id_queueing = 2 3 4
id      0 = Stage 1
id      1 = Stage 1

CrySPY submitted two jobs for structure ID 0 and 1 as you set njob = 2 in cryspy.in. Calculations are performed in the work directory. These directory names correspond to their structure ID.

tree -d work
work
├── 000000
├── 000001
└── fin
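The [status] section of cryspy.stat can be inspected programmatically to see which IDs are queued and which are running. The sketch below uses Python's configparser on the [status] excerpt shown above. The real file contains further sections that are omitted here. The six-digit, zero-padded directory names are inferred from the tree output and should be treated as an assumption.

```python
import configparser

CRYSPY_STAT = """\
[status]
id_queueing = 2 3 4
id      0 = Stage 1
id      1 = Stage 1
"""

stat = configparser.ConfigParser()
stat.read_string(CRYSPY_STAT)
status = stat["status"]

# IDs waiting to be submitted.
queueing = [int(i) for i in status.get("id_queueing", "").split()]

# IDs currently running: keys of the form "id <number>".
running = {}
for key, value in status.items():
    if key != "id_queueing" and key.startswith("id"):
        running[int(key.split()[1])] = value

# Work directories, assuming six-digit zero-padded names.
work_dirs = [f"work/{sid:06d}" for sid in sorted(running)]
print(queueing, running, work_dirs)
```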

When the two jobs are done, run CrySPY again.

cryspy
[2023-07-10 18:55:01,053][cryspy_restart][INFO] 


Restart CrySPY 1.2.0


[2023-07-10 18:55:01,058][ctrl_job][INFO] # ---------- job status
[2023-07-10 18:55:01,058][ctrl_job][INFO] ID      0: Stage 1 Done!
[2023-07-10 18:55:01,093][ctrl_job][INFO]     collect results: E = -0.00696997755502915 eV/atom
[2023-07-10 18:55:01,132][ctrl_job][INFO] ID      1: Stage 1 Done!
[2023-07-10 18:55:01,133][ctrl_job][INFO]     collect results: E = 0.4934076667166454 eV/atom
[2023-07-10 18:55:01,144][cryspy][INFO] 

recheck 1

[2023-07-10 18:55:01,145][ctrl_job][INFO] # ---------- job status
[2023-07-10 18:55:01,153][ctrl_job][INFO] ID      2: submit job, Stage 1
[2023-07-10 18:55:01,161][ctrl_job][INFO] ID      3: submit job, Stage 1

If you set nstage = 2 or more, new stage 2 jobs for ID 0 and 1 are submitted. If you set nstage = 1, CrySPY collects the calculation data of ID 0 and 1, then submits the jobs for the next IDs. Directories of finished structures are moved to the fin directory.

Repeat cryspy several times until all 5 structures are done. You can delete the work directory when the simulation is done if you do not need it.

The auto script (repeat_cryspy) may help you.
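The idea of rerunning cryspy at intervals can be sketched in Python as follows. This is not the repeat_cryspy script, whose contents are not shown in this tutorial; the function name, the fixed number of repetitions, and the interval are all assumptions. The demo uses a harmless stand-in command so that it runs anywhere.

```python
import subprocess
import sys
import time


def repeat_command(cmd, n_times, interval_s):
    """Run cmd n_times, sleeping interval_s seconds between runs."""
    return_codes = []
    for i in range(n_times):
        completed = subprocess.run(cmd, check=False)
        return_codes.append(completed.returncode)
        if i < n_times - 1:
            time.sleep(interval_s)
    return return_codes


# Stand-in command for the demo; it does nothing and exits with code 0.
demo_codes = repeat_command([sys.executable, "-c", "pass"],
                            n_times=2, interval_s=0.0)
print(demo_codes)
```

For an actual search, a call such as repeat_command(["cryspy"], n_times=20, interval_s=60) would rerun CrySPY once a minute. Choose the interval according to how long one optimization takes.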

Check results

Move to data directory. There should be a few more files.

$ cd data
$ ls
cryspy_rslt  cryspy_rslt_energy_asc  init_POSCARS  opt_POSCARS  pkl_data/
  • cryspy_rslt: Result file.
  • cryspy_rslt_energy_asc: Result file sorted in energy ascending order.
  • init_POSCARS: Initial structure file in POSCAR format.
  • opt_POSCARS: Optimized structure file in POSCAR format.
  • pkl_data/: Directory to save pickled data.

The results are written to text files, cryspy_rslt and cryspy_rslt_energy_asc (and also saved in pickle data in pkl_data directory).

Results are appended to the cryspy_rslt file in the order in which the calculations finish.

cat cryspy_rslt
   Spg_num Spg_sym  Spg_num_opt Spg_sym_opt  E_eV_atom  Magmom      Opt
0      139  I4/mmm          139      I4/mmm  -3.000850     NaN     done
1       98  I4_122           12        C2/m  -3.978441     NaN  not_yet
2       16    P222           16        P222  -3.348616     NaN  not_yet
3       36  Cmc2_1           36      Cmc2_1  -3.520306     NaN  not_yet
4       36  Cmc2_1            4        P2_1  -3.304168     NaN  not_yet
Info

Not ID order in cryspy_rslt

In cryspy_rslt_energy_asc file, the results are sorted in energy ascending order.

cat cryspy_rslt_energy_asc
   Spg_num Spg_sym  Spg_num_opt Spg_sym_opt  E_eV_atom  Magmom      Opt
1       98  I4_122           12        C2/m  -3.978441     NaN  not_yet
3       36  Cmc2_1           36      Cmc2_1  -3.520306     NaN  not_yet
2       16    P222           16        P222  -3.348616     NaN  not_yet
4       36  Cmc2_1            4        P2_1  -3.304168     NaN  not_yet
0      139  I4/mmm          139      I4/mmm  -3.000850     NaN     done

Spg_num and Spg_sym show space group information on initial structures. Spg_num_opt and Spg_sym_opt are those of optimized structures. The last column Opt indicates whether or not optimization reached required accuracy.
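The relationship between the two result files can be reproduced with a short script: parse cryspy_rslt and sort the rows by E_eV_atom. The sketch below uses only the standard library and assumes that no field contains whitespace. With pandas installed, reading the file into a DataFrame would be the more usual route.

```python
RSLT = """\
   Spg_num Spg_sym  Spg_num_opt Spg_sym_opt  E_eV_atom  Magmom      Opt
0      139  I4/mmm          139      I4/mmm  -3.000850     NaN     done
1       98  I4_122           12        C2/m  -3.978441     NaN  not_yet
2       16    P222           16        P222  -3.348616     NaN  not_yet
3       36  Cmc2_1           36      Cmc2_1  -3.520306     NaN  not_yet
4       36  Cmc2_1            4        P2_1  -3.304168     NaN  not_yet
"""


def parse_rslt(text):
    """Parse the whitespace-separated table into a list of dicts."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        fields = line.split()
        row = dict(zip(header, fields[1:]))  # first field is the structure ID
        row["ID"] = int(fields[0])
        row["E_eV_atom"] = float(row["E_eV_atom"])
        rows.append(row)
    return rows


ranked = sorted(parse_rslt(RSLT), key=lambda r: r["E_eV_atom"])
order = [r["ID"] for r in ranked]
print(order)
```

The resulting ID order matches the row order of cryspy_rslt_energy_asc shown above.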

Append structures

Of course, 5 structures are not enough to find stable structures. You can append structures whenever you want. Here, let’s append 5 more structures.

For Si-Si mindist, the default value of 1.11 Å is used in the first structure generation (see log_cryspy), which is a little too close. Let us try to set the mindist to 2.0 Å.

Edit cryspy.in: change the value of tot_struc to 10 and add mindist_1 = 2.0.

emacs cryspy.in
cat cryspy.in
[basic]
algo = RS
calc_code = soiap
tot_struc = 10
nstage = 1
njob = 2
jobcmd = zsh
jobfile = job_cryspy

[structure]
atype = Si
nat = 8
mindist_1 = 2.0

[soiap]
soiap_infile = soiap.in
soiap_outfile = soiap.out
soiap_cif = initial.cif

[option]

Then run cryspy, and check log_cryspy file.

cryspy &
cat log_cryspy
...
(omit)
...

2023/03/19 00:01:47
CrySPY 1.0.0
Restart cryspy.py


Changed tot_struc from 5 to 10
Changed mindist from None to [[2.0]]

Backup data

# ---------- Append structures
# ------ mindist
Si - Si 2.0
Structure ID      5 was generated. Space group: 218 --> 221 Pm-3m
Structure ID      6 was generated. Space group:  86 --> 129 P4/nmm
Structure ID      7 was generated. Space group: 129 --> 129 P4/nmm
Structure ID      8 was generated. Space group: 191 --> 191 P6/mmm
Structure ID      9 was generated. Space group:  31 -->  31 Pmn2_1

Remember that CrySPY goes into structure generation mode whenever you change the value of tot_struc. In this mode, CrySPY does not do any other action such as collecting data, submitting jobs, and so on.

Note

CrySPY goes into structure generation mode whenever you change the value of tot_struc.
From version 1.0.0, CrySPY automatically backs up when adding structures. See features/backup.

Repeat cryspy & several times until all appended structures are done. The auto script (repeat_cryspy) may help you.

Analysis and visualization

Download data

It is assumed here that you analyze and visualize CrySPY data on your local PC. If you use CrySPY on a supercomputer or workstation, download the data to your local machine. You can delete the work and backup directories if they are not needed, as their file size can be very large.

Jupyter notebook

Move to the data/ directory in the results you downloaded earlier. Then, if the CrySPY utility has already been downloaded locally, copy cryspy_analyzer_RS.ipynb. Alternatively, you can download it directly from GitHub (CrySPY_utility/notebook/). Launch Jupyter (e.g., VS Code, Jupyter Lab, or Jupyter Notebook), and simply run the cells in order to obtain a figure like the one shown below.

Figure: Cu8_RS

Subsections of Evolutionary Algorithm (EA)

ASE on your local PC

2025 April 5

The files used in this tutorial can be downloaded from CrySPY_utility/examples/ase_Cu8Au8_EA. This tutorial demonstrates a test run on a local machine using ASE’s lightweight Pure Python EMT calculator. The target system is Cu8Au8.

cryspy.in

Example of cryspy.in.

[basic]
algo = EA
calc_code = ASE
nstage = 1
njob = 5
jobcmd = zsh
jobfile = job_cryspy

[structure]
atype = Cu Au
nat = 8 8

[EA]
n_pop = 10
n_crsov = 5
n_perm = 2
n_strain = 2
n_rand = 1
n_elite = 1
n_fittest = 5
slct_func = TNM
t_size = 2
maxgen_ea = 0

[ASE]
ase_python = ase_in.py

[option]

calc_in/

The contents under calc_in/ are the same as in Tutorial > Random Search (RS) > ASE on your local PC.

calc_in/ase_in.py_1

from ase.constraints import FixSymmetry
from ase.filters import FrechetCellFilter
from ase.calculators.emt import EMT
from ase.optimize import BFGS
import numpy as np
from ase.io import read, write

# ---------- input structure
# CrySPY outputs 'POSCAR' as an input file in work/xxxxxx directory
atoms = read('POSCAR', format='vasp')

# ---------- setting and run
atoms.calc = EMT()
atoms.set_constraint([FixSymmetry(atoms)])
cell_filter = FrechetCellFilter(atoms, hydrostatic_strain=False)
opt = BFGS(cell_filter)
opt.run(fmax=0.01, steps=2000)

# ---------- opt. structure and energy
# [rule in ASE interface]
# output file for energy: 'log.tote' in eV/cell
#                         CrySPY reads the last line of 'log.tote'
# output file for structure: 'CONTCAR' in vasp format
e = cell_filter.atoms.get_total_energy()
with open('log.tote', mode='w') as f:
    f.write(str(e))

# ------ write structure
opt_atoms = cell_filter.atoms.copy()
opt_atoms.set_constraint(None)    # remove constraint for pymatgen
write('CONTCAR', opt_atoms, format='vasp', direct=True)
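The comments in the script state the convention of the ASE interface: the energy goes to log.tote in eV/cell, and the last line is the one that is read. The sketch below illustrates that convention by reading the last non-empty line and converting it to eV/atom. It is not CrySPY's own reader, and the file contents in the demo are hypothetical.

```python
import tempfile
from pathlib import Path


def energy_per_atom(log_tote: Path, natoms: int) -> float:
    """Read the last non-empty line of log.tote (eV/cell) and return eV/atom."""
    lines = [ln for ln in log_tote.read_text().splitlines() if ln.strip()]
    return float(lines[-1]) / natoms


# Hypothetical file with two energies; only the last one is used.
demo_log = Path(tempfile.mkdtemp()) / "log.tote"
demo_log.write_text("-1.0\n-1.6\n")
e_atom = energy_per_atom(demo_log, natoms=16)
print(e_atom)
```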

calc_in/job_cryspy

#!/bin/sh

# ---------- ASE
python3 ase_in.py > out.log

# ---------- CrySPY
sed -i -e '3 s/^.*$/done/' stat_job

Create next generation

2025 April 6

First run

When you run cryspy, the program enters structure generation mode. It generates the first generation of random structures and then exits.

cryspy

log

...
[2025-04-06 09:15:34,720][ea_init][INFO] # ---------- Initialize evolutionary algorithm
[2025-04-06 09:15:34,720][ea_init][INFO] # ------ Generation 1
[2025-04-06 09:15:34,720][ea_init][INFO] 10 structures by random

In EA, running cryspy appends the current generation’s information to cryspy.stat.

[status]
generation = 1
id_queueing = 0 1 2 3 4 5 6 7 8 9

Optimize structures

After running cryspy several times and completing the structure optimization for the first generation, the output will appear as shown below.

...
[2025-04-06 09:20:26,218][ctrl_job][INFO] Done generation 1
[2025-04-06 09:20:26,218][ctrl_job][INFO] 
EA is ready

Create next generation

Once all preparations are complete, running cryspy again automatically creates a backup and starts generating the next-generation structures.

cryspy
...
[2025-04-06 09:35:11,546][cryspy_restart][INFO] read input, cryspy.in
[2025-04-06 09:35:11,554][ctrl_job][INFO] # ---------- job status
[2025-04-06 09:35:11,554][ctrl_job][INFO] Done generation 1
[2025-04-06 09:35:11,554][utility][INFO] Backup data
[2025-04-06 09:35:11,611][ea_next_gen][INFO] # ---------- Evolutionary algorithm
[2025-04-06 09:35:11,611][ea_next_gen][INFO] Generation 2
[2025-04-06 09:35:11,613][ea_next_gen][INFO] # ------ natural selection
[2025-04-06 09:35:11,687][ea_next_gen][INFO] ranking without duplication (including elite):
[2025-04-06 09:35:11,687][ea_next_gen][INFO] Structure ID      1, fitness:   -0.00530
[2025-04-06 09:35:11,687][ea_next_gen][INFO] Structure ID      3, fitness:    0.01490
[2025-04-06 09:35:11,687][ea_next_gen][INFO] Structure ID      4, fitness:    0.04485
[2025-04-06 09:35:11,687][ea_next_gen][INFO] Structure ID      7, fitness:    0.11501
[2025-04-06 09:35:11,687][ea_next_gen][INFO] Structure ID      8, fitness:    0.15254
[2025-04-06 09:35:11,687][ea_next_gen][INFO] # ------ Generate children
[2025-04-06 09:35:11,687][ea_child][INFO] # -- mindist
[2025-04-06 09:35:11,689][struc_util][INFO] Cu - Cu: 1.32
[2025-04-06 09:35:11,689][struc_util][INFO] Cu - Au: 1.34
[2025-04-06 09:35:11,689][struc_util][INFO] Au - Au: 1.36
[2025-04-06 09:35:11,740][crossover][INFO] Structure ID     10 (8, 8) was generated from      3 and      1 by crossover. Space group:   1 P1
[2025-04-06 09:35:11,764][crossover][WARNING] remove_within_mindist: some atoms within mindist. retry.
[2025-04-06 09:35:11,774][crossover][INFO] Structure ID     11 (8, 8) was generated from      3 and      1 by crossover. Space group:   1 P1
[2025-04-06 09:35:11,789][crossover][INFO] Structure ID     12 (8, 8) was generated from      1 and      4 by crossover. Space group:   1 P1
[2025-04-06 09:35:11,833][crossover][INFO] Structure ID     13 (8, 8) was generated from      1 and      3 by crossover. Space group:   1 P1
[2025-04-06 09:35:11,852][crossover][WARNING] mindist in _add_border_line: Cu - Cu, 0.567032320824818. retry.
[2025-04-06 09:35:11,861][crossover][INFO] Structure ID     14 (8, 8) was generated from      7 and      1 by crossover. Space group:   1 P1
[2025-04-06 09:35:11,875][permutation][INFO] Structure ID     15 (8, 8) was generated from      1 by permutation. Space group: 146 R3
[2025-04-06 09:35:11,888][permutation][INFO] Structure ID     16 (8, 8) was generated from      3 by permutation. Space group:   1 P1
[2025-04-06 09:35:11,890][strain][WARNING] mindist in strain: Cu - Cu, 1.3050485787603692. retry.
[2025-04-06 09:35:11,903][strain][INFO] Structure ID     17 (8, 8) was generated from      3 by strain. Space group:   1 P1
[2025-04-06 09:35:11,917][strain][INFO] Structure ID     18 (8, 8) was generated from      1 by strain. Space group:   1 P1
[2025-04-06 09:35:12,513][ea_child][INFO] # ------ Random structure generation
[2025-04-06 09:35:12,513][rs_gen][INFO] # ------ mindist
[2025-04-06 09:35:12,515][struc_util][INFO] Cu - Cu: 1.32
[2025-04-06 09:35:12,515][struc_util][INFO] Cu - Au: 1.34
[2025-04-06 09:35:12,515][struc_util][INFO] Au - Au: 1.36
[2025-04-06 09:35:12,516][rs_gen][INFO] # ------ generate structures
[2025-04-06 09:35:12,530][gen_pyxtal][INFO] Structure ID     19: (8, 8) Space group:  86 -->  86 P4_2/n
[2025-04-06 09:35:12,533][ea_next_gen][INFO] # ------ Select elites
[2025-04-06 09:35:12,533][ea_next_gen][INFO] Structure ID      9 keeps as the elite

After that, simply running cryspy repeatedly will advance the structure search.
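To give a feel for how parents might be picked from the ranking in the log, here is a generic tournament selection sketch. It assumes that slct_func = TNM denotes tournament selection, that t_size is the tournament size, and that a lower fitness is better, as the ascending ranking above suggests. It is not CrySPY's implementation, and the function name is hypothetical.

```python
import random

# Fitness values of the ranked structures, taken from the log above.
FITNESS = {1: -0.00530, 3: 0.01490, 4: 0.04485, 7: 0.11501, 8: 0.15254}


def tournament_select(fitness, t_size, rng):
    """Draw t_size distinct candidates and return the one with lowest fitness."""
    entrants = rng.sample(sorted(fitness), t_size)
    return min(entrants, key=fitness.__getitem__)


rng = random.Random(0)  # fixed seed so the demo is reproducible
parents = [tournament_select(FITNESS, 2, rng) for _ in range(4)]
print(parents)
```

With t_size = 2, the structure with the worst fitness can never win a tournament, while the others are chosen with a probability that grows with their rank.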

Check results

cryspy_rslt

The following is an example of cryspy_rslt after completing calculations up to the third generation. In EA, generation information (Gen) is also included.

    Gen  Spg_num Spg_sym  Spg_num_opt Spg_sym_opt  E_eV_atom  Magmom      Opt
0     1      214  I4_132          230       Ia-3d   1.168743     NaN  no_file
1     1      198   P2_13          198       P2_13  -0.005303     NaN  no_file
2     1       95  P4_322           95      P4_322   0.389566     NaN  no_file
3     1       27    Pcc2           27        Pcc2   0.014898     NaN  no_file
4     1       60    Pbcn           60        Pbcn   0.044852     NaN  no_file
5     1      116   P-4c2          116       P-4c2   0.403246     NaN  no_file
6     1      187   P-6m2          187       P-6m2   1.054706     NaN  no_file
7     1      161     R3c          160         R3m   0.115009     NaN  no_file
8     1      146      R3          146          R3   0.152535     NaN  no_file
9     1       60    Pbcn           47        Pmmm  -0.005676     NaN  no_file
10    2        1      P1            1          P1   0.026070     NaN  no_file
11    2        1      P1            7          Pc   0.005898     NaN  no_file
12    2        1      P1            1          P1   0.005208     NaN  no_file
13    2        1      P1            1          P1   0.005506     NaN  no_file
14    2        1      P1            1          P1   0.024364     NaN  no_file
15    2      146      R3          146          R3   0.011525     NaN  no_file
16    2        1      P1            1          P1   0.014590     NaN  no_file
17    2        1      P1            1          P1   0.015236     NaN  no_file
18    2        1      P1            2         P-1  -0.012335     NaN  no_file
19    2       86  P4_2/n          140      I4/mcm   0.274548     NaN  no_file
20    3        1      P1            1          P1   0.013611     NaN  no_file
21    3        1      P1           10        P2/m  -0.014166     NaN  no_file
22    3        1      P1            1          P1   0.019472     NaN  no_file
23    3        1      P1            1          P1   0.011641     NaN  no_file
24    3        1      P1            1          P1   0.000297     NaN  no_file
25    3        1      P1            1          P1   0.005596     NaN  no_file
26    3        1      P1            1          P1   0.013374     NaN  no_file
27    3        1      P1            2         P-1   0.005055     NaN  no_file
28    3        2     P-1           12        C2/m  -0.012396     NaN  no_file
29    3      182  P6_322          182      P6_322   0.711472     NaN  no_file

ea_info

The EA parameters used in each generation are recorded in ea_info.

Gen Population Crossover Permutation Strain Random Elite crs_lat slct_func
  1         10         0           0      0     10     0  random       TNM
  2         10         5           2      2      1     1  random       TNM
  3         10         5           2      2      1     1  random       TNM
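In the ea_info rows above, the Crossover, Permutation, Strain, and Random counts add up to Population in every generation, with Elite listed separately. The following sketch checks that relation for the table shown. It is based only on this example, so treat the relation as an observation about these data and not as a documented rule.

```python
EA_INFO = """\
Gen Population Crossover Permutation Strain Random Elite crs_lat slct_func
  1         10         0           0      0     10     0  random       TNM
  2         10         5           2      2      1     1  random       TNM
  3         10         5           2      2      1     1  random       TNM
"""

OPERATIONS = ("Crossover", "Permutation", "Strain", "Random")


def check_ea_info(text):
    """Return {generation: True/False} for 'operator counts sum to Population'."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    results = {}
    for line in lines[1:]:
        row = dict(zip(header, line.split()))
        created = sum(int(row[name]) for name in OPERATIONS)
        results[int(row["Gen"])] = created == int(row["Population"])
    return results


consistent = check_ea_info(EA_INFO)
print(consistent)
```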

ea_origin

Information about the structure generation method and parent individuals is output to ea_origin.

Gen Struc_ID   Operation   Parent
  1        0      random     None
  1        1      random     None
  1        2      random     None
  1        3      random     None
  1        4      random     None
  1        5      random     None
  1        6      random     None
  1        7      random     None
  1        8      random     None
  1        9      random     None
  2       10   crossover   (3, 1)
  2       11   crossover   (3, 1)
  2       12   crossover   (1, 4)
  2       13   crossover   (1, 3)
  2       14   crossover   (7, 1)
  2       15 permutation     (1,)
  2       16 permutation     (3,)
  2       17      strain     (3,)
  2       18      strain     (1,)
  2       19      random     None
  2        9       elite    elite
  3       20   crossover (18, 12)
  3       21   crossover  (12, 9)
  3       22   crossover (12, 18)
  3       23   crossover  (18, 9)
  3       24   crossover (13, 18)
  3       25 permutation    (18,)
  3       26 permutation     (9,)
  3       27      strain    (18,)
  3       28      strain    (18,)
  3       29      random     None
  3       18       elite    elite
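Since ea_origin records the parents of every structure, the ancestry of an individual can be traced back to the random first generation. The sketch below does this for an excerpt of the table above. The parsing rules are assumptions based on the format shown: four whitespace-separated columns, with elite rows skipped because they only carry an existing structure over.

```python
import ast

# Excerpt of ea_origin containing structure 21 and all of its ancestors.
EA_ORIGIN = """\
Gen Struc_ID   Operation   Parent
  1        1      random     None
  1        4      random     None
  1        9      random     None
  2       12   crossover   (1, 4)
  2        9       elite    elite
  3       21   crossover  (12, 9)
"""


def parse_origin(text):
    """Map structure ID -> (generation, operation, tuple of parent IDs)."""
    origin = {}
    for line in text.strip().splitlines()[1:]:
        gen, sid, operation, parent = line.split(maxsplit=3)
        if operation == "elite":
            continue  # carried over from an earlier generation
        parents = ast.literal_eval(parent.strip())  # "None" or "(1, 4)"
        origin[int(sid)] = (int(gen), operation, parents or ())
    return origin


def ancestors(origin, sid):
    """Collect all ancestors of a structure by walking up the parent links."""
    found, stack = set(), list(origin[sid][2])
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(origin[parent][2])
    return found


origin = parse_origin(EA_ORIGIN)
lineage = ancestors(origin, 21)
print(sorted(lineage))
```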

Analysis and visualization

Download data

It is assumed here that you analyze and visualize CrySPY data on your local PC. If you use CrySPY on a supercomputer or workstation, download the data to your local machine. You can delete the work and backup directories if they are not needed, as their file size can be very large.

Jupyter notebook

Move to the data/ directory in the results you downloaded earlier. Then, if the CrySPY utility has already been downloaded locally, copy cryspy_analyzer_EA.ipynb. Alternatively, you can download it directly from GitHub (CrySPY_utility/notebook/). Launch Jupyter (e.g., VS Code, Jupyter Lab, or Jupyter Notebook), and simply run the cells in order to obtain a figure like the one shown below.

Figure: Cu8Au8_EA

Variable-composition evolutionary algorithm (EA-vc)

Info

System requirements

  • CrySPY 1.4.0 or later
  • As of June 2025, only the ASE interface is supported (see CrySPY > Interface).

Preparation of input files

Follow one of the instructions below, then proceed to the section on running CrySPY.

Running CrySPY

Subsections of Variable-composition evolutionary algorithm (EA-vc)

ASE on your local PC (Cu-Ag-Au)

2025 June 16

The files used in this tutorial can be downloaded from CrySPY_utility/examples/ase_Cu-Ag-Au_EA-vc. This tutorial demonstrates a test run on a local machine using ASE’s lightweight Pure Python EMT calculator. The target system is the ternary Cu-Ag-Au system.

cryspy.in

Example of cryspy.in.

[basic]
algo = EA-vc
calc_code = ASE
nstage = 1
njob = 5
jobcmd = zsh
jobfile = job_cryspy

[structure]
atype = Cu Ag Au
ll_nat = 0 0 0
ul_nat = 8 8 8

[ASE]
ase_python = ase_in.py

[EA]
n_pop = 20
n_crsov = 5
n_perm = 2
n_strain = 2
n_rand = 2
n_add = 3
n_elim = 3
n_subs = 3
target = random
n_elite = 2
n_fittest = 10
slct_func = TNM
t_size = 2
end_point = 0.0 0.0 0.0
maxgen_ea = 0

[option]

calc_in/

The contents under calc_in/ are the same as those in Tutorial > Random Search (RS) > ASE on your local PC.

calc_in/ase_in.py_1

from ase.constraints import FixSymmetry
from ase.filters import FrechetCellFilter
from ase.calculators.emt import EMT
from ase.optimize import BFGS
import numpy as np
from ase.io import read, write

# ---------- input structure
# CrySPY outputs 'POSCAR' as an input file in work/xxxxxx directory
atoms = read('POSCAR', format='vasp')

# ---------- setting and run
atoms.calc = EMT()
atoms.set_constraint([FixSymmetry(atoms)])
cell_filter = FrechetCellFilter(atoms, hydrostatic_strain=False)
opt = BFGS(cell_filter)
opt.run(fmax=0.01, steps=2000)

# ---------- opt. structure and energy
# [rule in ASE interface]
# output file for energy: 'log.tote' in eV/cell
#                         CrySPY reads the last line of 'log.tote'
# output file for structure: 'CONTCAR' in vasp format
e = cell_filter.atoms.get_total_energy()
with open('log.tote', mode='w') as f:
    f.write(str(e))

# ------ write structure
opt_atoms = cell_filter.atoms.copy()
opt_atoms.set_constraint(None)    # remove constraint for pymatgen
write('CONTCAR', opt_atoms, format='vasp', direct=True)

calc_in/job_cryspy

#!/bin/sh

# ---------- ASE
python3 ase_in.py > out.log

# ---------- CrySPY
sed -i -e '3 s/^.*$/done/' stat_job

ASE-CHGNet (Cu-Au)

2025 June 16

Info

CHGNet needs to be installed.

The files used in this tutorial can be downloaded from CrySPY_utility/examples/ase_chgnet_Cu-Au_EA-vc. In this tutorial, we assume that a computing cluster with a job scheduler is used together with the machine learning potential CHGNet. The calculation can also be performed on a local PC, so if you prefer this, please modify the input settings accordingly. The target system is the binary Cu-Au system.

Pre-calculation

In EA-vc, the per-atom energies of each elemental phase must be used as the reference in the end_point setting of cryspy.in, so they need to be calculated beforehand. There should be two directories inside the example.

Au-fcc
├── POSCAR
├── chgnet_in.py
└── job_cryspy
Cu-fcc
├── POSCAR
├── chgnet_in.py
└── job_cryspy

Each directory contains a crystal structure file (POSCAR), a Python script (chgnet_in.py) to perform structure relaxation and calculate energy, and a job script (job_cryspy). Please modify these files according to your computing environment.

Submit the job (replace the job submission command as appropriate for your system).

cd Au-fcc
qsub job_cryspy
cd ../Cu-fcc
qsub job_cryspy
cd ..

When the calculations finish successfully, a file named end_point will be created in each directory, containing the energy per atom (eV/atom) after structure relaxation.

cat Au-fcc/end_point
-3.2357187271118164
cat Cu-fcc/end_point
-4.083529472351074

These values are then used as input for the cryspy.in file.

cryspy.in

[basic]
algo = EA-vc
calc_code = ASE
nstage = 1
njob = 20
jobcmd = qsub
jobfile = job_cryspy

[structure]
atype = Cu Au
ll_nat = 0 0
ul_nat = 8 8

[ASE]
ase_python = chgnet_in.py

[EA]
n_pop = 20
n_crsov = 5
n_perm = 2
n_strain = 2
n_rand = 2
n_add = 3
n_elim = 3
n_subs = 3
target = random
n_elite = 2
n_fittest = 10
slct_func = TNM
t_size = 2
maxgen_ea = 0
end_point = -4.08352709  -3.23571777

[option]
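
In the input above, the end_point values are listed in the same order as atype (Cu, then Au). As a minimal sketch, assuming each pre-calculation directory holds an end_point file with a single number as described earlier, the line could be assembled as follows. The helper build_end_point_line is hypothetical and not part of CrySPY, and the demo uses temporary directories in place of the real ones.

```python
import tempfile
from pathlib import Path


def build_end_point_line(dirs):
    """Return an 'end_point = ...' line from the end_point file in each directory.

    dirs must be given in the same order as atype in cryspy.in.
    """
    energies = []
    for d in dirs:
        # Each end_point file is assumed to hold one number (eV/atom)
        text = Path(d, 'end_point').read_text().strip()
        energies.append(float(text))
    return 'end_point = ' + '  '.join(f'{e:.8f}' for e in energies)


# Demo with temporary directories standing in for Cu-fcc and Au-fcc
with tempfile.TemporaryDirectory() as tmp:
    for name, value in [('Cu-fcc', '-4.083529472351074'),
                        ('Au-fcc', '-3.2357187271118164')]:
        Path(tmp, name).mkdir()
        Path(tmp, name, 'end_point').write_text(value + '\n')
    line = build_end_point_line([Path(tmp, 'Cu-fcc'), Path(tmp, 'Au-fcc')])
print(line)
```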

calc_in/

The contents under calc_in/ are the same as those in Tutorial > Random Search (RS) > ASE on your local PC, with minor modifications for CHGNet. Be sure to adjust paths such as the Python executable in the job script to match your computing environment.

calc_in/chgnet_in.py_1

# ---------- import
from ase.constraints import FixSymmetry
from ase.filters import FrechetCellFilter
from ase.io import read, write
from ase.optimize import FIRE, BFGS, LBFGS
from chgnet.model import CHGNetCalculator

# ---------- input structure
# CrySPY outputs 'POSCAR' as an input file in work/xxxxxx directory
atoms = read('POSCAR')

# ---------- set up
atoms.calc = CHGNetCalculator()
atoms.set_constraint([FixSymmetry(atoms)])
cell_filter = FrechetCellFilter(atoms)
opt = BFGS(cell_filter, trajectory='opt.traj')

# ---------- run
opt.run(fmax=0.01, steps=2000)

# ---------- write structure
write('opt_struc.vasp', cell_filter.atoms, format='vasp', direct=True)

# ---------- opt. structure and energy
# [rule in ASE interface]
# output file for energy: 'log.tote' in eV/cell
#                         CrySPY reads the last line of 'log.tote'
# output file for structure: 'CONTCAR' in vasp format
# ------ energy
e = cell_filter.atoms.get_total_energy()    # eV/cell
with open('log.tote', mode='w') as f:
    f.write(str(e))

# ------ struc
opt_atoms = cell_filter.atoms.copy()
opt_atoms.set_constraint(None)    # remove constraint for pymatgen
write('CONTCAR', opt_atoms, format='vasp', direct=True)

calc_in/job_cryspy

#!/bin/sh
#$ -cwd
#$ -V -S /bin/bash
####$ -V -S /bin/zsh
#$ -N CuAu_CrySPY_ID
#$ -pe smp 2

# ---------- OpenMP
export OMP_NUM_THREADS=2

# ---------- ASE
/usr/local/Python-3.10.13/bin/python3 chgnet_in.py > out.log

# --------- CrySPY
sed -i -e '3 s/^.*$/done/' stat_job

Create next generation

2025 June 16

First run

When you run cryspy, the program enters structure generation mode. It generates the first generation of random structures and then exits.

cryspy

It can be confirmed from the output that structures are generated with the number of atoms within the range specified by ll_nat and ul_nat.

...
[2025-06-16 10:04:45,648][cryspy_init][INFO] # ---------- Initial structure generation
[2025-06-16 10:04:45,648][rs_gen][INFO] # ------ mindist
[2025-06-16 10:04:45,650][struc_util][INFO] Cu - Cu: 1.32
[2025-06-16 10:04:45,650][struc_util][INFO] Cu - Ag: 1.385
[2025-06-16 10:04:45,650][struc_util][INFO] Cu - Au: 1.34
[2025-06-16 10:04:45,650][struc_util][INFO] Ag - Ag: 1.45
[2025-06-16 10:04:45,650][struc_util][INFO] Ag - Au: 1.405
[2025-06-16 10:04:45,650][struc_util][INFO] Au - Au: 1.36
[2025-06-16 10:04:45,650][rs_gen][INFO] # ------ generate structures
[2025-06-16 10:04:45,659][gen_pyxtal][WARNING] Compoisition [1 4] not compatible with symmetry 34:     spg = 34 retry.
[2025-06-16 10:04:45,662][gen_pyxtal][WARNING] Compoisition [ 2  2 12] not compatible with symmetry 39:     spg = 39 retry.
[2025-06-16 10:04:45,691][gen_pyxtal][INFO] Structure ID      0: (3, 1, 2) Space group:  82 --> 119 I-4m2
[2025-06-16 10:04:45,694][gen_pyxtal][WARNING] Compoisition [6 6 2] not compatible with symmetry 57:     spg = 57 retry.
[2025-06-16 10:04:45,749][gen_pyxtal][INFO] Structure ID      1: (1, 8, 5) Space group:  71 -->  71 Immm
[2025-06-16 10:04:45,857][gen_pyxtal][INFO] Structure ID      2: (3, 7, 8) Space group: 174 --> 174 P-6
...

The file cryspy.stat shows that information on the current generation is recorded during the EA process.

[status]
generation = 1
id_queueing = 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19

Optimize structures

After running cryspy several times and completing the structure optimization for the first generation, the output will appear as shown below.

...
[2025-06-16 10:25:56,962][ctrl_job][INFO] Done generation 1
[2025-06-16 10:25:56,962][ctrl_job][INFO] Calculate convex hull for generation 1
[2025-06-16 10:25:57,854][ctrl_job][INFO] 
EA is ready

Convex hull

At this point, the hull distance data and the convex hull plot have been output to ./data/convex_hull/.

  • hull_dist_all_gen_1
    ID    hull distance (eV/atom)    Num_atom
     7                   0.000000    (0, 2, 6)
    14                   0.036510    (1, 7, 6)
    17                   0.064702    (0, 1, 5)
    19                   0.113649    (0, 0, 8)
    16                   0.168530    (6, 4, 8)
     9                   0.186497    (8, 4, 6)
     1                   0.187379    (1, 8, 5)
    11                   0.233893    (4, 5, 4)
     3                   0.273365    (6, 5, 5)
    10                   0.326759    (1, 4, 4)
     2                   0.330749    (3, 7, 8)
     8                   0.359543    (6, 2, 7)
     4                   0.404169    (4, 4, 2)
    18                   0.422989    (0, 6, 8)
    13                   0.428456    (0, 6, 3)
     5                   0.444792    (7, 4, 7)
     6                   0.464305    (7, 7, 7)
    12                   0.556654    (3, 0, 0)
    15                   0.560062    (6, 7, 1)
     0                   0.644278    (3, 1, 2)
  • conv_hull_gen_1.svg

Create next generation

Once all preparations are complete, running cryspy again automatically creates a backup and starts generating the next-generation structures.

cryspy
...
[2025-06-16 10:37:19,860][ctrl_job][INFO] Done generation 1
[2025-06-16 10:37:20,136][utility][INFO] Backup data
[2025-06-16 10:37:20,173][ea_next_gen][INFO] # ---------- Evolutionary algorithm
[2025-06-16 10:37:20,174][ea_next_gen][INFO] Generation 2
[2025-06-16 10:37:20,174][ea_next_gen][INFO] # ------ natural selection
[2025-06-16 10:37:20,177][ea_next_gen][INFO] ranking without duplication (including elite):
[2025-06-16 10:37:20,177][ea_next_gen][INFO] Structure ID      7, fitness:    0.00000
[2025-06-16 10:37:20,177][ea_next_gen][INFO] Structure ID     14, fitness:    0.03651
[2025-06-16 10:37:20,177][ea_next_gen][INFO] Structure ID     17, fitness:    0.06470
[2025-06-16 10:37:20,177][ea_next_gen][INFO] Structure ID     19, fitness:    0.11365
[2025-06-16 10:37:20,177][ea_next_gen][INFO] Structure ID     16, fitness:    0.16853
[2025-06-16 10:37:20,177][ea_next_gen][INFO] Structure ID      9, fitness:    0.18650
[2025-06-16 10:37:20,177][ea_next_gen][INFO] Structure ID      1, fitness:    0.18738
[2025-06-16 10:37:20,177][ea_next_gen][INFO] Structure ID     11, fitness:    0.23389
[2025-06-16 10:37:20,177][ea_next_gen][INFO] Structure ID      3, fitness:    0.27336
[2025-06-16 10:37:20,177][ea_next_gen][INFO] Structure ID     10, fitness:    0.32676
[2025-06-16 10:37:20,177][ea_next_gen][INFO] # ------ Generate children
[2025-06-16 10:37:20,177][ea_child][INFO] # -- mindist
[2025-06-16 10:37:20,179][struc_util][INFO] Cu - Cu: 1.32
[2025-06-16 10:37:20,179][struc_util][INFO] Cu - Ag: 1.385
[2025-06-16 10:37:20,179][struc_util][INFO] Cu - Au: 1.34
[2025-06-16 10:37:20,179][struc_util][INFO] Ag - Ag: 1.45
[2025-06-16 10:37:20,179][struc_util][INFO] Ag - Au: 1.405
[2025-06-16 10:37:20,179][struc_util][INFO] Au - Au: 1.36
[2025-06-16 10:37:20,217][crossover][INFO] Structure ID     20 (0, 4, 7) was generated from     19 and     14 by crossover. Space group:   1 P1
[2025-06-16 10:37:20,219][crossover][INFO] Structure ID     21 (0, 1, 7) was generated from      7 and     17 by crossover. Space group:   1 P1
[2025-06-16 10:37:20,221][crossover][INFO] Structure ID     22 (3, 0, 8) was generated from     16 and     19 by crossover. Space group:   1 P1
[2025-06-16 10:37:20,225][crossover][INFO] Structure ID     23 (0, 1, 7) was generated from      7 and     17 by crossover. Space group:   1 P1
...
[2025-06-16 10:37:20,809][ea_next_gen][INFO] # ------ Select elites
[2025-06-16 10:37:20,809][ea_next_gen][INFO] Structure ID      7 keeps as the elite
[2025-06-16 10:37:20,809][ea_next_gen][INFO] Structure ID     14 keeps as the elite

After that, simply running cryspy repeatedly will advance the structure search.
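
The natural selection log above can be read as follows: structures are ranked by fitness (here the hull distance, smaller is better), the best n_fittest = 10 are kept in the ranking, and the best n_elite = 2 are retained as elites. The sketch below reproduces that ranking from the values in hull_dist_all_gen_1. It is an illustration only: duplicate removal is omitted, and rank_structures is a hypothetical name rather than CrySPY's implementation.

```python
def rank_structures(fitness, n_fittest, n_elite):
    """Rank IDs by ascending fitness and return (survivors, elites)."""
    ranking = sorted(fitness, key=lambda cid: fitness[cid])
    survivors = ranking[:n_fittest]
    elites = ranking[:n_elite]
    return survivors, elites


# Hull distances (eV/atom) of generation 1, taken from hull_dist_all_gen_1
hull_dist = {
    7: 0.000000, 14: 0.036510, 17: 0.064702, 19: 0.113649, 16: 0.168530,
    9: 0.186497, 1: 0.187379, 11: 0.233893, 3: 0.273365, 10: 0.326759,
    2: 0.330749, 8: 0.359543, 4: 0.404169, 18: 0.422989, 13: 0.428456,
    5: 0.444792, 6: 0.464305, 12: 0.556654, 15: 0.560062, 0: 0.644278,
}
survivors, elites = rank_structures(hull_dist, n_fittest=10, n_elite=2)
print(survivors)  # [7, 14, 17, 19, 16, 9, 1, 11, 3, 10]
print(elites)     # [7, 14]
```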

Check results

This section focuses on the differences from the EA method.

cryspy_rslt

Below is an example of a cryspy_rslt file after completing calculations up to the third generation. In EA-vc, formation energy (Ef_eV_atom) and number of atoms (Num_atom) are also included.

    Gen  Spg_num Spg_sym  Spg_num_opt Spg_sym_opt  E_eV_atom  Ef_eV_atom   Num_atom  Magmom      Opt
0     1      119   I-4m2          119       I-4m2   0.639865    0.639865  (3, 1, 2)     NaN  no_file
1     1       71    Immm           71        Immm   0.182650    0.182650  (1, 8, 5)     NaN  no_file
2     1      174     P-6          187       P-6m2   0.324864    0.324864  (3, 7, 8)     NaN  no_file
3     1       71    Immm           71        Immm   0.269227    0.269227  (6, 5, 5)     NaN  no_file
4     1       12    C2/m           65        Cmmm   0.401521    0.401521  (4, 4, 2)     NaN  no_file
7     1      123  P4/mmm          123      P4/mmm  -0.009930   -0.009930  (0, 2, 6)     NaN  no_file
10    1      107    I4mm          107        I4mm   0.320875    0.320875  (1, 4, 4)     NaN  no_file
5     1      121   I-42m          121       I-42m   0.439643    0.439643  (7, 4, 7)     NaN  no_file
6     1      115   P-4m2          115       P-4m2   0.459892    0.459892  (7, 7, 7)     NaN  no_file
8     1       81     P-4           81         P-4   0.354247    0.354247  (6, 2, 7)     NaN  no_file
9     1       11  P2_1/m           11      P2_1/m   0.182084    0.182084  (8, 4, 6)     NaN  no_file
11    1       10    P2/m           10        P2/m   0.229819    0.229819  (4, 5, 4)     NaN  no_file
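
For reference, a formation energy per atom is commonly defined relative to the elemental reference energies given in end_point. The sketch below assumes that standard definition, Ef = (E_total - sum_i n_i * mu_i) / N; whether CrySPY evaluates it in exactly this way is an assumption. Under this definition Ef equals E_eV_atom when all end_point values are 0.0, which would be consistent with the identical columns in the table above. The function name and the numbers in the example are hypothetical.

```python
def formation_energy_per_atom(e_total, nat, end_point):
    """Formation energy per atom (eV/atom).

    e_total   : total energy of the cell (eV/cell)
    nat       : number of atoms of each element, e.g. (1, 8, 5)
    end_point : reference energy per atom of each element (eV/atom)
    """
    n_total = sum(nat)
    if n_total == 0:
        raise ValueError('the cell contains no atoms')
    # Energy of the same atoms in their elemental reference phases
    e_ref = sum(n * mu for n, mu in zip(nat, end_point))
    return (e_total - e_ref) / n_total


# Hypothetical total energy for a cell with two atoms of each element
ef = formation_energy_per_atom(-14.8, (2, 2), (-4.08352709, -3.23571777))
print(round(ef, 6))
```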

nat_data

Information on the number of atoms is also included in nat_data.

    ID    ('Cu', 'Ag', 'Au')
     0    (3, 1, 2)
     1    (1, 8, 5)
     2    (3, 7, 8)
     3    (6, 5, 5)
     4    (4, 4, 2)
     5    (7, 4, 7)
     6    (7, 7, 7)
     7    (0, 2, 6)
     8    (6, 2, 7)
     9    (8, 4, 6)
    10    (1, 4, 4)
...

hull_dist_all_gen_x

For example, after the third generation is completed, the hull distance data is output to the file ./data/convex_hull/hull_dist_all_gen_3.

    ID    hull distance (eV/atom)    Num_atom
    43                   0.000000    (0, 2, 5)
    42                   0.000000    (0, 5, 5)
    48                   0.000000    (0, 1, 5)
    46                   0.000009    (0, 1, 5)
    28                   0.000011    (0, 1, 5)
    41                   0.000360    (0, 4, 6)
    47                   0.001838    (0, 1, 5)
    36                   0.001992    (1, 1, 6)
    21                   0.002544    (0, 1, 7)
    23                   0.002551    (0, 1, 7)
    24                   0.002795    (0, 4, 7)
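
The hull distance is the energy of a structure above the convex hull at its composition. The sketch below illustrates the idea for a binary system only, using a lower convex hull in (composition, formation energy) space. It is a simplified illustration with made-up data, not the routine CrySPY uses, and the function names are hypothetical.

```python
def lower_hull(points):
    """Lower convex hull of (x, e) points (Andrew's monotone chain)."""
    # Keep only the lowest energy at each composition
    lowest = {}
    for x, e in points:
        if x not in lowest or e < lowest[x]:
            lowest[x] = e
    hull = []
    for p in sorted(lowest.items()):
        # Pop the last vertex while it lies on or above the line hull[-2] -> p
        while len(hull) >= 2:
            (x1, e1), (x2, e2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - e1) - (e2 - e1) * (p[0] - x1)
            if cross > 0:
                break
            hull.pop()
        hull.append(p)
    return hull


def hull_distance(point, hull):
    """Energy of point above the hull at the same composition."""
    x, e = point
    for (x1, e1), (x2, e2) in zip(hull, hull[1:]):
        if x1 <= x <= x2:
            e_hull = e1 + (e2 - e1) * (x - x1) / (x2 - x1)
            return e - e_hull
    raise ValueError('composition outside the range of the hull')


# x = fraction of the second element, e = formation energy (eV/atom)
data = [(0.0, 0.0), (1.0, 0.0), (0.5, -0.10), (0.5, 0.05), (0.25, -0.02)]
hull = lower_hull(data)
for p in data:
    print(p, round(hull_distance(p, hull), 6))
```

Structures with a hull distance of zero lie on the hull, as in the first rows of the table above.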

conv_hull_gen_x.svg

The convex hull plot at the end of generation 3 is saved as ./data/convex_hull/conv_hull_gen_3.svg. Although svg is the default format, it can be changed to pdf or png by setting fig_format in the input file.

conv_hull_gen_3.svg

Analysis and visualization

Automatic convex hull plotting

In EA-vc simulations of binary and ternary systems, a convex hull plot is automatically generated at the end of each generation. For further customization, you can edit the plot yourself using a Jupyter notebook. For quaternary systems, visualization using Plotly with Jupyter is available (Plotly should already be installed automatically, as it is a dependency of pymatgen). Below are some usage examples.

Binary system

conv_hull_gen_3_with_desc.svg

The above figure shows an example after searching up to the third generation, with red labels added for explanation. The input file settings related to convex hull plotting are listed below (default values in parentheses).

  • show_max: Upper limit of the y-axis (0.2)
  • label_stable: Whether to display compositions of stable phases (True)
  • vmax: Maximum value of the colorbar on the right (0.2)
  • bottom_margin: Margin between the minimum value and the lower end of the y-axis (0.02)
  • fig_format: File format of the output figure. Supported formats: svg, png, pdf. (svg)

Each marker corresponding to the latest generation is marked with a cross.

Ternary system

conv_hull_gen_3_with_desc.svg

The above figure shows an example after searching up to the third generation, with red labels added for explanation. The input file settings related to convex hull plotting are listed below (default values in parentheses).

  • show_max: Only entries with a hull distance less than or equal to show_max are plotted. (0.2)
  • label_stable: Whether to display compositions of stable phases (True)
  • vmax: Maximum value of the colorbar on the right (0.2)
  • bottom_margin: Not applicable to ternary systems
  • fig_format: File format of the output figure. Supported formats: svg, png, pdf. (svg)

Each marker corresponding to the latest generation is marked with a cross.

Download data

It is assumed here that you analyze and visualize CrySPY data on your local PC. If you use CrySPY on a supercomputer or workstation, download the data to your local machine. You can delete the work and backup directories if they are not needed, as their file size can be very large.

Jupyter notebook

Move to the data/ directory in the results you downloaded earlier. Then, if the CrySPY utility has already been downloaded locally, copy cryspy_analyzer_EA-vc.ipynb. Alternatively, you can download it directly from GitHub (CrySPY_utility/notebook/).

The Jupyter notebook includes the same functions as the CrySPY code, so you can freely customize the convex hull plots. Execute the cells in order; choosing one of the following options produces the same plot as the automatic output.

  • Binary system, matplotlib
  • Ternary system, matplotlib

In the section

  • Interactive plot using Plotly,

interactive plots using Plotly are available for binary, ternary, and quaternary systems. See CrySPY > Tutorial > Interactive Mode (Jupyter Notebook) #Interactive plot using Plotly for example plots.

Bayesian Optimization (BO)

BO

LAQA

May 15th, 2023

Info

First, see Tutorial > Random Search (RS) for basic usage of CrySPY.
Here, we assume CrySPY 1.1.0 or later.

The example files used here can be downloaded from CrySPY_utility/examples/qe_Si16_LAQA. In this tutorial, only 50 initial structures are generated, but LAQA is designed to select candidates from a much larger pool of structures.

Input

cryspy.in

Here is an example of cryspy.in.

[basic]
algo = LAQA
calc_code = QE
tot_struc = 50
nstage = 1
njob = 10
jobcmd = qsub
jobfile = job_cryspy

[structure]
atype = Si
nat = 16
mindist_1 = 1.5

[QE]
qe_infile = pwscf.in
qe_outfile = pwscf.out
kppvol =  80

[LAQA]
nselect_laqa = 4

[option]
  • nstage must be 1 in LAQA
  • nselect_laqa must be specified in the [LAQA] section. It is the number of candidates selected at one time.

To change the weights in the LAQA score, set wf and ws as below. If omitted, the default values are used (0.1 and 10.0, respectively). See Search algorithms > LAQA for the definition of the score.

[LAQA]
nselect_laqa = 4
wf = 0.1
ws = 10.0

calc_in/pwscf.in_1

&control
    calculation = 'vc-relax'
    pseudo_dir = '/usr/local/gbrv/all_pbe_UPF_v1.5/'
    outdir='./outdir/'
    nstep = 10
/

&system
    ibrav = 0
    nat = 16
    ntyp = 1
    ecutwfc = 40
    ecutrho = 200
    occupations = 'smearing'
    degauss = 0.01
/

&electrons
/

&ions
/

&cell
/

ATOMIC_SPECIES
  Si -1.0 si_pbe_v1.uspp.F.UPF
  • nstep controls how many structure optimization steps proceed in one selection (the equivalent setting for VASP is NSW).

calc_in/job_cryspy

#!/bin/sh
#$ -cwd
#$ -V -S /bin/bash
####$ -V -S /bin/zsh
#$ -N Si_CrySPY_ID
#$ -pe smp 20
####$ -q ibis1.q
####$ -q ibis2.q

mpirun -np $NSLOTS pw.x -nk 4 < pwscf.in > pwscf.out

if [ -e "CRASH" ]; then
    sed -i -e '3 s/^.*$/skip/' stat_job
    exit 1
fi

sed -i -e '3 s/^.*$/done/' stat_job
  • The job file is the same as in normal usage.

Run

Tip

An automatic script is also available. See the bottom of this page.

Just type cryspy for the 1st run.

cryspy &

Check log_cryspy. 50 random structures are generated.

2023/05/13 13:02:07
CrySPY 1.1.0
Start cryspy.py
Number of MPI processes: 1

Read input file, cryspy.in
Save input data in cryspy.stat

# --------- Generate initial structures
# ------ mindist
Si - Si 1.5
Structure ID      0 was generated. Space group: 165 --> 165 P-3c1
Structure ID      1 was generated. Space group:  66 -->  66 Cccm
Structure ID      2 was generated. Space group: 146 --> 146 R3
Structure ID      3 was generated. Space group:  82 -->  82 I-4
Structure ID      4 was generated. Space group: 162 --> 162 P-31m
...
...
...
Structure ID     47 was generated. Space group:  90 -->  90 P42_12
Structure ID     48 was generated. Space group: 214 --> 214 I4_132
Structure ID     49 was generated. Space group:  23 -->  23 I222

Elapsed time for structure generation: 0:00:10.929030


# ---------- Initialize LAQA
# ---------- Selection 0
selected_id: 50 IDs

In LAQA, structure optimization jobs for all structures are submitted once at the beginning. Note that only 10 optimization steps are performed here because we set nstep = 10. Repeat the cryspy command until all of these 10-step runs are completed. If necessary, you can also submit all jobs at once by increasing the value of njob.

After all the initial optimizations are finished, the message "LAQA is ready" is displayed at the end of log_cryspy.

2023/05/13 13:23:31
CrySPY 1.1.0
Restart cryspy.py
Number of MPI processes: 1



# ---------- job status
ID     41: Stage 1 Done!

LAQA is ready

The next cryspy run makes the first selection.

2023/05/13 13:23:33
CrySPY 1.1.0
Restart cryspy.py
Number of MPI processes: 1



# ---------- job status

Backup data

# ---------- Selection 1
selected_id: 37 8 10 48

Here, only as many structures as specified by nselect_laqa are selected. Type cryspy to submit the jobs (next 10 steps).

cryspy &
2023/05/13 13:23:36
CrySPY 1.1.0
Restart cryspy.py
Number of MPI processes: 1



# ---------- job status
ID     37: submit job, Stage 1
ID      8: submit job, Stage 1
ID     10: submit job, Stage 1
ID     48: submit job, Stage 1

By repeating this, the optimization of the structures selected according to the score advances by 10 steps each time. Proceed until several structures are completed, and stop whenever you like.
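
The selection step can be pictured as choosing the nselect_laqa structures with the highest LAQA score. The sketch below uses hypothetical scores and a hypothetical helper select_candidates; excluding already finished structures is an assumption, and the actual score is defined in Search algorithms > LAQA.

```python
def select_candidates(scores, finished, nselect):
    """Return up to nselect unfinished IDs with the highest score."""
    candidates = [cid for cid in scores if cid not in finished]
    candidates.sort(key=lambda cid: scores[cid], reverse=True)
    return candidates[:nselect]


# Hypothetical LAQA scores for a few structure IDs
scores = {8: 5.91, 10: 5.88, 12: 5.10, 37: 6.02, 41: 5.95, 48: 5.70}
finished = {41}    # assumed to be fully optimized already
selected = select_candidates(scores, finished, nselect=4)
print(selected)    # [37, 8, 10, 48]
```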

Status

If you want to check the LAQA score during the simulation, you can look at the status file:

  • ./data/LAQA_status

Other files for LAQA will be output:

  • ./data/LAQA_bias
  • ./data/LAQA_energy
  • ./data/LAQA_score
  • ./data/LAQA_selected_id
  • ./data/LAQA_step

Analysis and visualization

It is assumed here that you analyze and visualize CrySPY data on your local PC. If you use CrySPY on a supercomputer or workstation, download the data to your local PC. You can delete the work and backup directories if you do not need them, because their file size can be very large. You may gzip the pkl data to reduce the file size.

jupyter notebook

Move to the data/ directory in the results you just downloaded. Then copy cryspy_analyzer_LAQA.ipynb from the CrySPY utility.

You can obtain the graph and animation with the notebook. In the gif below, all of the optimizations were completed. This is just for animation. (When all of the optimizations are completed, the computational cost is the same as random search.)

fig_LAQA

This graph shows the energy as a function of the optimization step. The red lines indicate the three structures with the lowest energy. The most stable one reached the diamond structure. The structures that eventually became stable were selected at an early stage.

Info

If algo = LAQA, the following are automatically set in the [option] section.

  • force_step_flag = True
  • stress_step_flag = True

Force and stress data are collected step by step. Energy and structure data are NOT. They are collected for each selection. In other words, in this case, energy and structure data are saved once every 10 steps. If you want to collect energy and structure data step by step, manually set up as follows:

[option]
energy_step_flag = True
struc_step_flag = True

Auto script

You may find it tedious to run cryspy over and over again. The auto script could help you.

repeat_cryspy
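
The idea behind such a script is simply to call cryspy at a fixed interval. The sketch below shows that loop in Python. It is not the repeat_cryspy script itself, and the interval, the number of repetitions, and the function name are placeholders.

```python
import subprocess
import time


def repeat_command(cmd, n_repeat, interval,
                   runner=subprocess.run, sleep=time.sleep):
    """Run cmd n_repeat times, waiting interval seconds between runs."""
    for i in range(n_repeat):
        runner(cmd, check=False)
        # No need to wait after the final run
        if i < n_repeat - 1:
            sleep(interval)


# Example (commented out so that nothing is launched by accident):
# repeat_command(['cryspy'], n_repeat=100, interval=60)
```

The runner and sleep arguments exist only so that the loop can be tried without launching real jobs.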

Molecular crystal structure prediction

Info

First, see Tutorial > Random Search (RS) for basic usage of CrySPY.

In this section, we give a tutorial on the molecular structure generation part only. Since version 0.9.0, CrySPY has been able to generate random molecular crystal structures using PyXtal.

You need to either use a molecule pre-defined in PyXtal’s database (see https://pyxtal.readthedocs.io/en/latest/Usage.html?highlight=benzene#pyxtal-molecule-pyxtal-molecule) or create molecule files that define the molecular structures.

Pre-defined molecule

PyXtal currently supports C60, H2O, CH4, NH3, benzene, naphthalene, anthracene, tetracene, pentacene, coumarin, resorcinol, benzamide, aspirin, ddt, lindane, glycine, glucose, and ROY.

Let us generate molecular crystal structures that consist of 2 benzenes.

Move to your working directory, and copy input example files by one of the following methods.

Take a look at cryspy.in.

$ cat cryspy.in
[basic]
algo = RS
calc_code = QE
tot_struc = 6
nstage = 2
njob = 2
jobcmd = qsub
jobfile = job_cryspy

[structure]
struc_mode = mol
atype = H C
nat = 12 12
mol_file = benzene
nmol = 2

[QE]
qe_infile = pwscf.in
qe_outfile = pwscf.out
kppvol = 40  60

[option]

In generating molecular crystal structures, you have to set struc_mode = mol in the [structure] section. Molecule file(s) and the number of molecule(s) are specified as:

  • mol_file = benzene
  • nmol = 2

Run CrySPY and see the initial structures (./data/init_POSCARS).

User-defined molecule

Move to your working directory, and copy input example files for 2 formula units of Li3PS4.

  • version 1.0.0 or later
    • Copy from CrySPY utility
  • version 0.10.3 or earlier
    • cp -r ~/CrySPY_root/CrySPY-0.9.0/example/QE_Li3PS4_2fu_RS_mol .
$ cd QE_Li3PS4_2fu_RS_mol
$ ls
Li.xyz  PS4.xyz  calc_in/  cryspy.in

Molecule files of Li and PS4 are included. Supported formats in PyXtal are .xyz, .gjf, .g03, .g09, .com, .inp, .out, and pymatgen’s JSON serialized molecules.

$ cat Li.xyz
1
New structure
 Li  0.000  0.000  0.000
$ cat PS4.xyz
5
New structure
 P    0.000000    0.000000    0.000000
 S    1.200000    1.200000   -1.200000
 S    1.200000   -1.200000    1.200000
 S   -1.200000    1.200000    1.200000
 S   -1.200000   -1.200000   -1.200000

Check cryspy.in.

$ cat cryspy.in
[basic]
algo = RS
calc_code = QE
tot_struc = 4
nstage = 2
njob = 1
jobcmd = qsub
jobfile = job_cryspy

[structure]
struc_mode = mol
atype = Li P S
nat = 6 2 8
mol_file = ./Li.xyz  ./PS4.xyz
nmol = 6 2

[QE]
qe_infile = pwscf.in
qe_outfile = pwscf.out
kppvol = 40  60

[option]

A single atom (Li atom in this case) is treated as a molecule in the molecular crystal structure generation mode. In this example, a random molecular structure is composed of six Li molecules (atoms) and two PS4 molecules specified as:

  • mol_file = ./Li.xyz ./PS4.xyz
  • nmol = 6 2

In mol_file, specify the paths of the molecule files relative to cryspy.in. Here the molecule files are placed in the same directory.
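
In this example, nat is consistent with mol_file and nmol: six Li atoms plus two PS4 units give Li 6, P 2, S 8. The sketch below checks this with plain Python. The parser handles only the simple xyz layout shown above, and count_atoms and total_nat are hypothetical helpers, not CrySPY or PyXtal functions.

```python
from collections import Counter

LI_XYZ = """1
New structure
 Li  0.000  0.000  0.000
"""

PS4_XYZ = """5
New structure
 P    0.000000    0.000000    0.000000
 S    1.200000    1.200000   -1.200000
 S    1.200000   -1.200000    1.200000
 S   -1.200000    1.200000    1.200000
 S   -1.200000   -1.200000   -1.200000
"""


def count_atoms(xyz_text):
    """Count atoms per element in a simple xyz string."""
    lines = xyz_text.strip().splitlines()
    natoms = int(lines[0])
    # Line 0: number of atoms, line 1: comment, then one atom per line
    symbols = [ln.split()[0] for ln in lines[2:2 + natoms]]
    return Counter(symbols)


def total_nat(molecules, nmol, atype):
    """Total number of atoms per element, in the order of atype."""
    total = Counter()
    for xyz_text, n in zip(molecules, nmol):
        for symbol, count in count_atoms(xyz_text).items():
            total[symbol] += count * n
    return [total[a] for a in atype]


nat = total_nat([LI_XYZ, PS4_XYZ], nmol=[6, 2], atype=['Li', 'P', 'S'])
print(nat)  # [6, 2, 8]
```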

Run CrySPY and see the initial structures (./data/init_POSCARS).

timeout_mol

Molecular crystal structure generation can be time-consuming because PyXtal calculates the molecule directions according to a specified space group, and it sometimes gets stuck. A time limit is therefore imposed on the generation of each structure. The time limit (timeout_mol) is set to 120 seconds by default. If the limit is insufficient, increase it as follows (see the last line):

struc_mode = mol
atype = Li P S
nat = 6 2 8
mol_file = ./Li.xyz  ./PS4.xyz
nmol = 6 2
timeout_mol = 300.0

Volume of unit cell

You can control the volume of unit cells by changing the value(s) of scaling factor, vol_factor, in cryspy.in. By default, vol_factor is set to 1.0. It is also possible to specify a range of factors. Set minimum and maximum values as follows:

struc_mode = mol
atype = Li P S
nat = 6 2 8
mol_file = ./Li.xyz  ./PS4.xyz
nmol = 6 2
timeout_mol = 300.0
vol_factor = 0.8 1.5

Random structure generation with MPI

Oct. 21 2023, update

Info

First, see Tutorial > Random Search (RS) for basic usage of CrySPY.

Info

Requirements:

  • CrySPY 1.2.3 or later
  • mpi4py
  • MPI library (Open MPI, Intel MPI, MPICH, etc.)
Warning

CrySPY versions 1.1.0 through 1.2.2 have a bug: when bash or zsh is used to run a job with MPI (e.g., jobcmd = zsh, jobfile = job_cryspy), the MPI job does not run. There is no problem when a job scheduler (qsub, sbatch) is used. This has been fixed in version 1.2.3.

mpi4py

Install mpi4py if it is not already installed.

pip install mpi4py

Input

cryspy.in is the same as normal usage and does not need to be changed. Here we try structure generation with MPI using the following settings:

[basic]
algo = RS
calc_code = soiap
tot_struc = 100
nstage = 1
njob = 2
jobcmd = zsh
jobfile = job_cryspy

[structure]
atype = Si
nat = 8

[soiap]
soiap_infile = soiap.in
soiap_outfile = soiap.out
soiap_cif = initial.cif

[option]

All settings except tot_struc, atype, and nat are irrelevant for structure generation and can be ignored here.

Run

To generate structures with 4 MPI processes, use mpiexec -n with the -p option:

mpiexec -n 4 cryspy -p

In 1.1.0 <= CrySPY <= 1.2.2, run it without the -p option:

mpiexec -n 4 cryspy

If you submit the job through a job scheduler, prepare a job file. Here is an example:

#!/bin/sh
#$ -cwd
#$ -V -S /bin/bash
#$ -N n_nproc
#$ -pe smp 4


mpirun -np $NSLOTS ~/.local/bin/cryspy

Please edit the location of the executable script cryspy.

Result

CrySPY simply divides the task (number of structures) by the number of processes:

  • Rank 0: IDs 0 – 24
  • Rank 1: IDs 25 – 49
  • Rank 2: IDs 50 – 74
  • Rank 3: IDs 75 – 99
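
The division above corresponds to a simple contiguous block split. A minimal sketch follows. How CrySPY distributes a remainder when tot_struc is not divisible by the number of processes is not stated here, so the handling below (earlier ranks receive one extra structure) is an assumption, and split_ids is a hypothetical name.

```python
def split_ids(tot_struc, nproc):
    """Split IDs 0..tot_struc-1 into nproc contiguous blocks."""
    base, extra = divmod(tot_struc, nproc)
    blocks = []
    start = 0
    for rank in range(nproc):
        # Assumption: the first `extra` ranks take one more structure
        size = base + (1 if rank < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


blocks = split_ids(100, 4)
for rank, ids in enumerate(blocks):
    print(f'Rank {rank}: IDs {ids[0]} - {ids[-1]}')
```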

CrySPY writes the log entries in the order in which the structures are generated:

2023/04/24 22:47:51
CrySPY 1.1.0
Start cryspy.py
Number of MPI processes: 4

Read input file, cryspy.in
Save input data in cryspy.stat

# --------- Generate initial structures
# ------ mindist
Si - Si 1.11
Structure ID     25 was generated. Space group: 138 --> 123 P4/mmm
Structure ID     75 was generated. Space group:  99 -->  99 P4mm
Structure ID      0 was generated. Space group: 127 --> 123 P4/mmm
Structure ID      1 was generated. Space group:  61 -->  61 Pbca
Structure ID     50 was generated. Space group:  38 -->  38 Amm2
Structure ID     51 was generated. Space group: 134 --> 123 P4/mmm
Structure ID     26 was generated. Space group: 111 --> 123 P4/mmm
Structure ID      2 was generated. Space group:   9 -->   9 Cc
Structure ID      3 was generated. Space group:  80 -->  80 I4_1
Structure ID      4 was generated. Space group: 107 --> 107 I4mm
Structure ID      5 was generated. Space group:  75 -->  75 P4
Structure ID     76 was generated. Space group: 108 --> 108 I4cm
Structure ID     77 was generated. Space group: 100 --> 100 P4bm
Structure ID     27 was generated. Space group: 207 --> 221 Pm-3m

However, the order in init_POSCARS is by structure ID since CrySPY outputs after all structures have been generated.

ID_0
1.0
   2.9636956737951818    0.0000000000000002    0.0000000000000002
   0.0000000000000000    2.9636956737951818    0.0000000000000002
   0.0000000000000000    0.0000000000000000    6.2634106638053080
Si
8
direct
  -0.1602734164607877   -0.1602734164607877   -0.0000000000000000 Si
   0.1602734164607877    0.1602734164607877    0.5000000000000000 Si
   0.6602734164607877    0.3397265835392123    0.7500000000000000 Si
   0.3397265835392122    0.6602734164607877    0.2500000000000000 Si
   0.4469739273741755    0.4469739273741755   -0.0000000000000000 Si
   0.5530260726258245    0.5530260726258244    0.5000000000000000 Si
   0.0530260726258245    0.9469739273741754    0.7500000000000000 Si
   0.9469739273741754    0.0530260726258245    0.2500000000000000 Si
ID_1
1.0
   7.2751506682509657    0.0000000000000004    0.0000000000000004
   0.0000000000000000    7.2751506682509657    0.0000000000000004
   0.0000000000000000    0.0000000000000000    5.1777634169924873
Si
8
direct
  -0.3845341807505553   -0.3845341807505553    0.4999999999999999 Si
   0.3845341807505553    0.3845341807505553    0.5000000000000000 Si
   0.3845341807505553   -0.3845341807505553    0.0000000000000000 Si
  -0.3845341807505553    0.3845341807505553   -0.0000000000000000 Si
   0.0000000000000000    0.5000000000000000    0.2500000000000000 Si
   0.5000000000000000    0.0000000000000000    0.7500000000000000 Si
   0.0000000000000000    0.5000000000000000    0.7500000000000000 Si
   0.5000000000000000    0.0000000000000000    0.2500000000000000 Si
ID_2
1.0
  -4.3660398676292269   -4.3660398676292269    0.0000000000000000
  -4.3660398676292269   -0.0000000000000003   -4.3660398676292269
   0.0000000000000000   -4.3660398676292269   -4.3660398676292269
Si
8
direct
   0.8700001548800920    0.8700001548800920    0.1299998451199080 Si
   0.1299998451199080    0.1299998451199080    0.8700001548800920 Si
   0.8700001548800920    0.1299998451199080    0.8700001548800920 Si
   0.1299998451199080    0.8700001548800920    0.1299998451199080 Si
   0.1299998451199080    0.8700001548800920    0.8700001548800920 Si
   0.8700001548800920    0.1299998451199080    0.1299998451199080 Si
   0.7500000000000000    0.7500000000000000    0.7500000000000000 Si
   0.2500000000000000    0.2500000000000000    0.2500000000000000 Si
Note

There is no point in using MPI for anything other than random structure generation, because the other parts are not parallelized.

Interactive mode (Jupyter Notebook)

2025 March 6

Info

Requirements:

  • CrySPY 1.4.0 or later
  • Jupyter
  • Structure optimization software compatible with ASE (e.g., machine learning potentials).
  • nglview (optional)

Preparation

When CrySPY is installed, ASE is automatically installed as well. Set up Jupyter so that it can be used on a workstation or local PC. In this tutorial, the pure Python EMT calculator in ASE is used for structure optimization. Note that the accuracy of the EMT potential is poor; it is used here for demonstration purposes only.

The example notebook also includes code for using the machine learning potential CHGNet. If you want to try CHGNet, make sure to install it in advance using pip.

Input file

Move to your working directory, and copy the example files by one of the following methods.

Even in interactive mode, cryspy.in is used as the input file. The calc_in directory is not used in interactive mode. You can refer to the examples of cryspy.in in the input_examples directory.

Here, the following cryspy.in using EA-vc will be used. For more details on EA-vc, refer to the EA-vc tutorial.

[basic]
algo = EA-vc
calc_code = ASE
nstage = 1
njob = 10
jobcmd = zsh
jobfile = job_cryspy

[structure]
atype = Cu Au
ll_nat = 0 0
ul_nat = 8 8

[ASE]
ase_python = ase_in.py

[EA]
n_pop = 20
n_crsov = 5
n_perm = 2
n_strain = 2
n_rand = 2
n_add = 3
n_elim = 3
n_subs = 3
target = random
n_elite = 2
n_fittest = 10
slct_func = TNM
t_size = 2
maxgen_ea = 5
end_point = 0.0 0.0

[option]

Notebook

Open cryspy_interactive.ipynb and execute the cells from the top.

Check current working directory

The first cell only checks the files and the contents of cryspy.in.

!pwd
print()
!ls
print()
!cat cryspy.in

Import

Ignore the commented-out sections this time and execute the cell that imports the core libraries for CrySPY’s interactive mode.

# ---------- import
from cryspy.interactive import action

Initialize CrySPY

This cell corresponds to a standard initial run. It reads cryspy.in and generates the initial structures.

# ---------- initial structure generation
action.initialize()

Set calculator

This cell sets the ASE calculator. Here, ASE’s EMT is used.

# ---------- EMT in ASE
from ase.calculators.emt import EMT
calculator = EMT()

# ---------- CHGNet
#from chgnet.model import CHGNetCalculator
#calculator = CHGNetCalculator()

Restart CrySPY

Executing this cell starts the optimization of the previously generated initial structures. In interactive mode, structure optimization calculations are performed sequentially, one by one. A progress bar is also displayed during the process.

# ---------- structure optimization
action.restart(
    njob=20,    # njob=0: njob in cryspy.in will be used
    calculator=calculator,
    optimizer='BFGS',    # 'FIRE', 'BFGS' or 'LBFGS'
    symmetry=True,       # default: True
    fmax=0.01,           # default: 0.01 eV/Å
    steps=2000,          # default: 2000
)
  • njob: The number of structures to be optimized in a single execution. If set to 0, the value specified in cryspy.in is used.
  • calculator: Assign the previously set calculator.
  • optimizer: Select from FIRE, BFGS, or LBFGS. Specify as a string.
  • symmetry: If True, structure optimization is performed while preserving symmetry.
  • fmax: The maximum atomic force (eV/Å) used for convergence criteria.
  • steps: Maximum optimization steps.

If the njob value is set to a small number, execute this cell multiple times to complete the optimization of all initial structures. When using EA-vc, the following message will be displayed upon completion.

EA is ready

Executing this cell again will trigger generational turnover. Once the next-generation structures are generated, continue executing this cell repeatedly in the same manner.

Show results

Running this cell allows you to display files such as cryspy_rslt_energy_asc.

# ---------- show results
#!cat ./data/cryspy_rslt    # Order of structure optimization completion
!cat ./data/cryspy_rslt_energy_asc    # show energy ascending order
#!sed -n 2,4p ./data/cryspy_rslt    # show i--jth lines
#!tail -n 5 ./data/cryspy_rslt    # show last 5 lines

Structure visualization

You can interactively visualize both the initial and optimized structures.

from ase.visualize import view
atoms = action.get_atoms('opt', cid=0)    # 'init' or 'opt'
view(atoms, viewer='ngl')    # viewer = 'ngl', 'ase', or 'x3d'

Changing opt to init in action.get_atoms('opt', cid=0) allows you to check the initial structure. The cid parameter specifies the structure ID. Since this utilizes ASE’s functionality, the viewer option supports ngl, ase, and x3d. To use ngl, you need to install nglview, so make sure to install it via pip in advance.

fig_struc_visu fig_struc_visu

Energy plot for RS, EA

For random search (RS) and evolutionary algorithm (EA), an energy graph shown below can be displayed. In the case of EA-vc, direct energy comparison is not possible due to differences in the number of atoms, so the convex hull plot, discussed later, is used instead.

fig, ax = action.plot_E(
              title=None,
              ymax=2.0,
              ymin=-0.5,
              markersize=12,
              marker_edge_width=1.0,
              marker_edge_color='black',
              alpha=1.0,
          )

fig_eplot fig_eplot

Convex hull plot for EA-vc

Interactive plot using Plotly

For EA-vc, an interactive convex hull plot using Plotly is available. When CrySPY is installed, Plotly is automatically installed as well. This convex hull plot utilizes pymatgen’s functionality.

action.interactive_plot_convex_hull(cgen=None, show_unstable=0.2, ternary_style='2d')
  • cgen: Which generation’s data to plot up to. If None, data will be plotted up to the latest generation.
  • show_unstable: The maximum hull distance value to display on the plot
  • ternary_style
    • Binary system: ternary_style = '2d'
    • Ternary system: ternary_style = '2d', '3d'
    • Quaternary system: ternary_style = '3d'

fig_convplot fig_convplot

When performing calculations with ternary or quaternary systems instead of binary systems, you can obtain the following interactive plots.

From left to right:

  • Ternary system (ternary_style = '2d')
  • Ternary system (ternary_style = '3d')
  • Quaternary system (ternary_style = '3d')

fig_conv_plotly3d_ternary fig_conv_plotly3d_ternary fig_conv_plotly3d_ternary fig_conv_plotly3d_ternary fig_conv_plotly3d_ternary fig_conv_plotly3d_ternary

Binary system using matplotlib

Running this cell plots the binary convex hull using matplotlib.

fig, ax = action.plot_convex_hull_binary(
              cgen=None,
              show_max=0.2,
              label_stable=True,
              vmax=0.2,
              bottom_margin=0.02,
          )
fig    # to show plot in jupyter
  • cgen: Which generation’s data to plot up to. If None, data will be plotted up to the latest generation.
  • show_max: The maximum formation energy to display on the plot
  • label_stable: Whether to display the labels (compositions) of stable structures
  • vmax: The maximum hull distance in the color bar
  • bottom_margin: Bottom margin of y-axis

fig_convplotmat fig_convplotmat

Ternary system using matplotlib

If exploring a ternary system, running this cell will generate a convex hull plot using matplotlib.

fig, ax = action.plot_convex_hull_ternary(
              cgen=None,
              show_max=0.2,
              label_stable=True,
              vmax=0.2,
          )
fig    # to show plot in jupyter
  • cgen: Which generation’s data to plot up to. If None, data will be plotted up to the latest generation.
  • show_max: The maximum formation energy to display on the plot
  • label_stable: Whether to display the labels (compositions) of stable structures
  • vmax: The maximum hull distance in the color bar

For example, the following plot can be obtained.

fig_convplotmat fig_convplotmat

Interface

2025 June 16

CrySPY supports multiple structure optimizers:

At least one optimizer is required.

Algorithm compatibility

CrySPY 1.4.0
Additional interfaces for EA-vc are under development.

RS    EA    EA-vc    BO    LAQA
VASP×
Quantum Espresso×
OpenMX××
soiap×
LAMMPS××
ASE×
Interactive (ASE)××

Option compatibility

energy_step_flag    struc_step_flag    force_step_flag    stress_step_flag
VASP
Quantum Espresso
OpenMX××××
soiap
LAMMPS××××
ASE××××

Subsections of Search algorithms

Random search (RS)

under construction

Evolutionary algorithm (EA)

2025 April 2

Background

Evolutionary algorithms (EAs) are metaheuristic methods inspired by the theory of evolution. An EA can effectively generate new structures (offspring) by inheriting the local environments of the stable structures (parents) explored so far. USPEX, developed by the Oganov group, is a well-known EA code, and there are others such as XtalOpt. Representative previous studies include the following papers.

  • T. S. Bush, C. R. A. Catlow, and P. D. Battle, J. Mater. Chem. 5, 1269 (1995).
  • A. R. Oganov and C. W. Glass, J. Chem. Phys. 124, 244704 (2006).
  • A. R. Oganov, A. O. Lyakhov, and M. Valle, Acc. Chem. Res. 44, 227 (2011).
  • A. O. Lyakhov, A. R. Oganov, H. T. Stokes, and Q. Zhu, Comput. Phys. Commun. 184, 1172 (2013).

fig_EA_pes fig_EA_pes

Procedure

  1. Initialize population
  2. Evaluate fitness
  3. Natural selection
  4. Select parents
  5. Create next generation
  6. Repeat from step 2: Evaluate fitness

Initialize population

In the first generation, a set of random structures is generated according to the number specified by n_pop. tot_struc is not used in EA or EA-vc.

Evaluate fitness

Currently, energy is the only property that can be used as fitness in CrySPY. By setting fit_reverse = False, the algorithm is configured to search for the minimum value. The fit_reverse setting is designed for future cases where fitness may be based on properties other than energy.

Natural selection

DFT calculations occasionally fail and produce extremely unreasonable energy values. emax_ea and emin_ea can be used to filter out structures with unreasonably low (or high) energy values:    $$ \mathrm{emin\_ea} \le E \ (\mathrm{eV/atom}) \le \mathrm{emax\_ea} $$ For example, if emin_ea is set, any structure with an energy lower than that value will be ignored.

In natural selection, the current population and elite individuals preserved from previous generations are first ranked based on fitness. The number of elite individuals used here is specified by n_elite. Only the top n_fittest individuals among the current population and elite individuals are selected, while all others are eliminated. During the natural selection process, duplicates are removed using the StructureMatcher class provided by pymatgen, and then the top n_fittest individuals are selected. n_fittest is often set to about half of n_pop (the population size). Note that in the current implementation, if n_fittest = 0, all individuals are retained. The figure below shows an example of the natural selection process when n_fittest = 5.

fig_EA_natural_selection fig_EA_natural_selection
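The selection step described above can be sketched as follows. This is a minimal illustration, not CrySPY's implementation: individuals are assumed to be plain (structure ID, energy per atom) pairs, the function name is hypothetical, and the duplicate removal that CrySPY performs with StructureMatcher is omitted.

```python
def natural_selection(current, elites, n_fittest, emin_ea=None, emax_ea=None):
    """Rank current + elite individuals by energy and keep the top n_fittest.

    current, elites: lists of (structure_id, energy_per_atom) tuples.
    Duplicate removal (StructureMatcher in CrySPY) is omitted in this sketch.
    """
    pool = current + elites
    # Drop individuals outside the optional energy window
    if emin_ea is not None:
        pool = [p for p in pool if p[1] >= emin_ea]
    if emax_ea is not None:
        pool = [p for p in pool if p[1] <= emax_ea]
    # Lower energy means higher fitness (fit_reverse = False)
    ranked = sorted(pool, key=lambda p: p[1])
    # n_fittest = 0 keeps every individual, as noted in the text
    return ranked if n_fittest == 0 else ranked[:n_fittest]


# Hypothetical data: ID 12 has an unreasonably low energy and is filtered out
current = [(10, -3.2), (11, -2.9), (12, -55.0), (13, -3.4), (14, -1.0), (15, -3.0)]
elites = [(3, -3.5), (7, -3.3)]
survivors = natural_selection(current, elites, n_fittest=5, emin_ea=-10.0)
print([sid for sid, _ in survivors])
```

With n_fittest = 5, only the five lowest-energy individuals among the current population and the elites remain as candidate parents.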

Select parents

Two parent selection methods are implemented in CrySPY to select a single parent individual from the candidate parents. Both methods are designed so that individuals with higher fitness have a higher probability of being selected. Setting slct_func = TNM enables tournament selection, while slct_func = RLT enables roulette selection. Tournament selection requires fewer parameters and is easier to use.

Create next generation

The next generation consists of offspring produced by evolutionary operations on candidate parents, along with some randomly generated structures. Random structures are added in each generation to maintain diversity and to help escape from local minima.

Evolutionary operations

Here, we introduce the operations of the fixed-composition EA implemented in CrySPY.

Population size

The sum of structures from crossover, permutation, strain, and random generation must be equal to n_pop.

  • n_pop = n_crsov + n_perm + n_strain + n_rand

Subsections of Evolutionary algorithm (EA)

Crossover

Overview

Crossover is an evolutionary operation that creates a new structure (offspring) by exchanging sliced regions between two parent structures. This promotes structural diversity and enables the inheritance of locally stable features. It is one of the main operators used to explore low-energy configurations in structure search.

How it works

  1. Select two distinct individuals from the candidate parents
  2. Perform a random translation
  3. Randomly select a lattice vector
  4. Slice the parents near the center
  5. Swap the sliced halves
  6. Select the offspring with more atoms
  7. Adjust the number of atoms near the border
  8. Perform a minimum interatomic distance check

fig_EA_crossover fig_EA_crossover

4. Slice the parents near the center

The slice point is placed near the center and slightly varied each time.

slice_point = np.clip(np.random.normal(loc=0.5, scale=0.1), 0.3, 0.7)

If any of the subsequent steps fail, the process may be restarted from step 4. However, the number of retries is limited to maxcnt_ea, and if this limit is exceeded, the parent selection step is repeated.

5. Swap the sliced halves

When crs_lat is set to random, the lattice vectors of one of the two parent structures are randomly selected. When crs_lat is set to equal, the average of the lattice vectors of the two parent structures is used. The default is random.
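As a rough sketch of the two crs_lat modes, assuming lattices are given as 3x3 nested lists (the function name is hypothetical and not part of CrySPY):

```python
import random


def choose_offspring_lattice(lat_a, lat_b, crs_lat="random", rng=random):
    """Return the 3x3 lattice of the offspring from two parent lattices."""
    if crs_lat == "random":
        # Take the lattice of one parent, chosen at random
        return rng.choice([lat_a, lat_b])
    if crs_lat == "equal":
        # Element-wise average of the two parent lattices
        return [[(x + y) / 2.0 for x, y in zip(row_a, row_b)]
                for row_a, row_b in zip(lat_a, lat_b)]
    raise ValueError(f"unknown crs_lat: {crs_lat}")


lat_a = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]
lat_b = [[6.0, 0.0, 0.0], [0.0, 6.0, 0.0], [0.0, 0.0, 6.0]]
print(choose_offspring_lattice(lat_a, lat_b, crs_lat="equal"))
```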

6. Select the offspring with more atoms

Swapping the sliced parts of the parent structures results in two structures with different numbers of atoms. Temporarily, we select the structure with more atoms.

fig_EA_crossover_natoms fig_EA_crossover_natoms

However, if the composition differs too much from the target, the process restarts from step 4 (Slice the parents near the center). The tolerance for the difference in the number of atoms is set by nat_diff_tole. The default value of nat_diff_tole is 4, which allows a tolerance of ±4 atoms per element. In the figure above, the number of blue atoms is -1 and the number of green atoms is +2 relative to the original composition.

7. Adjust the number of atoms near the border

Deletion

When adjusting the number of atoms, the process starts with atom deletion. The number of green atoms is excessive and needs to be reduced. As illustrated in the figure below, atoms that do not satisfy the minimum interatomic distance defined by mindist are preferentially removed.

fig_EA_crossover_rm_mindist fig_EA_crossover_rm_mindist

As shown below, if atoms that violate the minimum interatomic distance remain after the deletion process, the procedure is restarted from step 4 (Slice the parents near the center).

fig_EA_crossover_rm_mindist2 fig_EA_crossover_rm_mindist2

If there are no atoms violating the minimum interatomic distance but atoms still need to be deleted, atoms are removed in order of increasing distance from the border, as shown in the figure below. Note that, in addition to the central slicing point, positions with internal coordinates of 0 are also considered borders.

fig_EA_crossover_rm_border fig_EA_crossover_rm_border
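The ordering used for border-based deletion can be illustrated with a short sketch. It works only with the fractional coordinate along the slicing axis, and it assumes that coordinate 1 is equivalent to 0 under periodic boundary conditions; the function names are hypothetical.

```python
def border_distance(frac, slice_point):
    """Fractional distance along the slicing axis to the nearest border."""
    frac = frac % 1.0
    d_slice = abs(frac - slice_point)
    # Assumption: 0 and 1 are the same plane under periodicity
    d_zero = min(frac, 1.0 - frac)
    return min(d_slice, d_zero)


def deletion_order(fracs, slice_point):
    """Indices of atoms sorted from closest to farthest from a border."""
    return sorted(range(len(fracs)),
                  key=lambda i: border_distance(fracs[i], slice_point))


fracs = [0.05, 0.48, 0.25, 0.90, 0.70]
print(deletion_order(fracs, slice_point=0.5))
```

Atoms at the front of the returned list would be removed first when excess atoms remain.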

Addition

When atoms are lacking, the missing atoms are added near the border. The internal coordinate along the selected axis is determined as shown below. Here, mean refers to either the slice point or 0.

coords[axis] = np.random.normal(loc=mean, scale=0.08)

The remaining two components of the coordinate are determined randomly. Atoms are added until the target number is reached, while checking for violations of the minimum interatomic distance.

fig_EA_crossover_addition fig_EA_crossover_addition
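A minimal sketch of how a trial position near a border might be drawn, using only the standard library. Choosing between the two borders at random and wrapping the coordinate back into the cell are assumptions made for this illustration; the minimum interatomic distance check is left out.

```python
import random


def new_atom_coords(axis, slice_point, rng=random, scale=0.08):
    """Draw fractional coordinates for an atom added near a border."""
    # Assumption: pick one of the two borders (slice point or 0) at random
    mean = rng.choice([slice_point, 0.0])
    # The two remaining components are uniformly random
    coords = [rng.random() for _ in range(3)]
    # Gaussian around the border along the slicing axis, wrapped into the cell
    coords[axis] = rng.gauss(mean, scale) % 1.0
    return coords


rng = random.Random(0)
print(new_atom_coords(axis=2, slice_point=0.5, rng=rng))
```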

Permutation

Overview

Permutation is an evolutionary operation that generates new structures (offspring) by modifying the atomic arrangement within a single structure. It enables the exploration of alternative configurations without changing the lattice or the overall composition.

How it works

The positions of atoms of different elements are swapped. The number of swaps can be specified by ntimes, which is set to 1 by default. After the swap, a minimum interatomic distance check is performed.

fig_EA_permutation fig_EA_permutation
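The swap itself can be sketched in a few lines. This is an illustration under simple assumptions (species as a list of element symbols, positions as tuples, uniform random choice of the pair); the function name is hypothetical and the minimum interatomic distance check is omitted.

```python
import random


def permute(species, positions, ntimes=1, rng=random):
    """Swap positions between atoms of different elements ntimes times."""
    positions = list(positions)
    for _ in range(ntimes):
        i = rng.randrange(len(species))
        # Candidates: atoms whose element differs from that of atom i
        others = [j for j, s in enumerate(species) if s != species[i]]
        if not others:
            raise ValueError("permutation needs at least two elements")
        j = rng.choice(others)
        positions[i], positions[j] = positions[j], positions[i]
    return positions


species = ["Na", "Na", "Cl", "Cl"]
positions = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.0), (0.5, 0.0, 0.5), (0.0, 0.5, 0.5)]
print(permute(species, positions, ntimes=1, rng=random.Random(1)))
```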

Strain

Overview

Strain is an evolutionary operation that generates a new structure (offspring) by applying a small random distortion to the lattice of a parent structure. It helps to explore nearby regions of the configuration space while preserving atomic connectivity and composition. This operator is useful for fine-tuning structural candidates and escaping local minima.

How it works

The lattice vectors $ \mathbf{a} $ are transformed to $ \mathbf{a}' $ by applying a strain matrix, as follows:

$$ \mathbf{a}' = \begin{pmatrix} 1 + \eta_1 & \frac{1}{2} \eta_6 & \frac{1}{2} \eta_5 \\ \frac{1}{2} \eta_6 & 1 + \eta_2 & \frac{1}{2} \eta_4 \\ \frac{1}{2} \eta_5 & \frac{1}{2} \eta_4 & 1 + \eta_3 \end{pmatrix} \mathbf{a}. $$

Here, $ \eta_i $ are drawn from a Gaussian distribution $ \mathcal{N}\left( 0, \ \sigma_{\mathrm{st}}^2 \right) $. $ \sigma_{\mathrm{st}} $ is specified by the input parameter sigma_st (by default, sigma_st = 0.5). As shown in the figure below, the lattice is deformed and then rescaled to restore the original volume. Finally, the minimum interatomic distance constraint is checked.

fig_EA_strain fig_EA_strain
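The strain operation can be sketched with plain Python lists. In this illustration the lattice is stored with one lattice vector per row and the strain matrix is applied to each vector; that convention, and all function names, are assumptions rather than CrySPY's code. The minimum interatomic distance check is omitted.

```python
import random


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def strain_matrix(eta):
    """Symmetric strain matrix built from eta_1 ... eta_6."""
    e1, e2, e3, e4, e5, e6 = eta
    return [[1 + e1, e6 / 2, e5 / 2],
            [e6 / 2, 1 + e2, e4 / 2],
            [e5 / 2, e4 / 2, 1 + e3]]


def apply_strain(lattice, eta):
    """Strain each lattice vector, then rescale to the original volume."""
    s = strain_matrix(eta)
    # a' = S a for each lattice vector (rows of `lattice`)
    strained = [[sum(s[i][k] * vec[k] for k in range(3)) for i in range(3)]
                for vec in lattice]
    v0, v1 = abs(det3(lattice)), abs(det3(strained))
    scale = (v0 / v1) ** (1.0 / 3.0)
    return [[scale * x for x in vec] for vec in strained]


def random_eta(sigma_st, rng=random):
    """Six strain components drawn from N(0, sigma_st^2)."""
    return [rng.gauss(0.0, sigma_st) for _ in range(6)]


lattice = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]
new_lattice = apply_strain(lattice, random_eta(0.5, random.Random(0)))
print(abs(det3(new_lattice)))  # volume after rescaling
```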

Tournament selection

Overview

Tournament selection is a method used to choose parent individuals from the candidate parents based on their fitness. It is designed to balance selection pressure and diversity in the population. The figure below shows an example with n_fittest = 10 and t_size = 3.

fig_EA_tournament fig_EA_tournament

How it works

  1. A fixed number of individuals (t_size) are randomly selected from the candidate parents.
  2. Among them, the individual with the highest fitness (i.e., lowest energy) is chosen as the parent.
  3. This process is repeated until the required number of parents is selected.
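The steps above can be sketched as follows, assuming candidates are (structure ID, energy) pairs and that the t_size entrants are drawn without replacement. The function name is hypothetical.

```python
import random


def tournament_select(candidates, t_size=3, rng=random):
    """Pick one parent: sample t_size candidates, return the lowest-energy one."""
    entrants = rng.sample(candidates, t_size)
    return min(entrants, key=lambda c: c[1])


# Hypothetical candidate parents (n_fittest = 10), sorted by energy
candidates = [(i, -3.0 + 0.1 * i) for i in range(10)]
rng = random.Random(0)
parents = [tournament_select(candidates, t_size=3, rng=rng) for _ in range(2)]
print(parents)
```

Because the entrants are distinct, the winner of a tournament always has at least t_size - 1 individuals ranked below it.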

Advantages

  • Simple and efficient
  • Requires only one parameter (t_size)
  • Can control selection pressure by adjusting t_size

Notes

  • The default value of t_size is 3.
  • If t_size is small, diversity is promoted.
  • If t_size is large, selection pressure increases, favoring the fittest individuals.
  • Unlike roulette selection, tournament selection never chooses the bottom (t_size - 1) individuals from the candidate parents.

Roulette selection

Overview

Roulette selection is a probabilistic method used to select parent individuals from the candidate parents based on their fitness. In roulette selection, each individual’s chance of being selected is proportional to its fitness.

How it works

  1. When fit_reverse is set to False (default), corresponding to minimization mode where energy is used as fitness, the fitness values of the candidate parents are multiplied by –1.
  2. The fitness values $ f_i $ are linearly scaled into $ f'_i $ using the following equation, where $ a $ and $ b $ are parameters specified by a_rlt and b_rlt, respectively (with the condition that $ a > b $). $$ f_i' = \frac{a - b}{f_{\mathrm{max}} - f_{\mathrm{min}}} f_i + \frac{b f_{\mathrm{max}} - a f_{\mathrm{min}}}{f_{\mathrm{max}} - f_{\mathrm{min}}} $$
  3. The scaled fitness values $ f_i' $ are converted into selection probabilities using the following equation:    $$ p_i = \frac{f_i'}{\sum_k f_k'} $$ Each probability $ p_i $ represents the likelihood of selecting the $ i $-th individual.
  4. Parent individuals are then selected one by one according to the probabilities $ p_i $ using roulette wheel sampling, until the required number of parents is obtained.
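The scaling and probability steps above can be written out as a short sketch. The function name is hypothetical, and the handling of the case where all fitness values are equal is an assumption added to avoid division by zero.

```python
import random


def roulette_probabilities(energies, a_rlt=10.0, b_rlt=1.0):
    """Selection probabilities from energies (minimization mode)."""
    # Step 1: fitness = -energy when fit_reverse is False
    f = [-e for e in energies]
    f_max, f_min = max(f), min(f)
    if f_max == f_min:
        # Assumption: equal fitness gives uniform probabilities
        return [1.0 / len(f)] * len(f)
    # Step 2: linear scaling so that f_max -> a and f_min -> b
    span = f_max - f_min
    scaled = [(a_rlt - b_rlt) / span * fi
              + (b_rlt * f_max - a_rlt * f_min) / span for fi in f]
    # Step 3: normalize to probabilities
    total = sum(scaled)
    return [s / total for s in scaled]


energies = [-3.5, -3.0, -2.5]
p = roulette_probabilities(energies)
# Step 4: roulette wheel sampling of one parent index
parent = random.Random(0).choices(range(len(energies)), weights=p, k=1)[0]
print(p, parent)
```

With the default a_rlt = 10.0 and b_rlt = 1.0, the best individual is ten times more likely to be selected than the worst one.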

Advantages

  • All individuals have a non-zero chance of being selected
  • Selection pressure can be adjusted by scaling the fitness values

Notes

  • By default, a_rlt = 10.0 and b_rlt = 1.0
  • Proper scaling of fitness values is important to ensure meaningful selection pressure. The figure below shows examples of $ p_i $ when $ a $ is relatively small (left) and relatively large (right). If $ a $ is too small, the selection pressure becomes weak, making it more difficult to favor individuals with higher fitness.

fig_EA_roulette fig_EA_roulette

Variable-composition evolutionary algorithm (EA-vc)

2025 April 4

Overview

Since CrySPY 1.4.0, a variable-composition EA (EA-vc) has been available as an extension of the fixed-composition EA. Refer to the following page for the supported interfaces (Interface). Although the overall flow is similar to the fixed-composition EA, EA-vc differs in how fitness is evaluated and how offspring are generated in order to handle varying compositions. Here, we describe the parts that have been modified from the original EA.

Procedure

  1. Initialize population
  2. Evaluate fitness
  3. Natural selection
  4. Select parents
  5. Create next generation
  6. Repeat from step 2: Evaluate fitness

Initialize population

In the first generation, a set of random structures is generated according to the number specified by n_pop. tot_struc is not used in EA or EA-vc. In EA-vc, the number of atoms for each atom type is randomly determined within a user-defined range. The minimum (ll_nat) and maximum (ul_nat) number of atoms per type can be specified in cryspy.in as shown below.

[structure]
atype = Cu Au
ll_nat = 0 0
ul_nat = 8 8
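As an illustration of drawing a random composition within these limits, the sketch below samples each atom count uniformly between ll_nat and ul_nat. The uniform distribution and the redraw of an empty cell are assumptions; the function name is hypothetical.

```python
import random


def random_nat(ll_nat, ul_nat, rng=random):
    """Draw the number of atoms per type within [ll_nat, ul_nat]."""
    while True:
        nat = [rng.randint(lo, hi) for lo, hi in zip(ll_nat, ul_nat)]
        # Assumption: a cell with no atoms at all is redrawn
        if sum(nat) > 0:
            return nat


rng = random.Random(0)
print([random_nat([0, 0], [8, 8], rng) for _ in range(3)])
```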

Evaluate fitness

The convex hull computed from formation energies is used to evaluate the phase stability of different compositions, since directly comparing total energies of structures with different numbers of atoms is not meaningful. Information on formation energy, the convex hull, and phase diagrams can be found online. For example, see Materials Project Documentation. In EA-vc, the fitness is defined as the energy above hull (also referred to as hull distance).

fig_EA-vc_phase_diagram_binary.svg fig_EA-vc_phase_diagram_binary.svg

Formation energy

Formation energy is calculated based on the reference energies (in eV/atom) of stable pure elements, which are specified as end_point in cryspy.in. For example, in the case of the Cu–Au binary system, the end_point should contain the per-atom energies (in eV/atom) of fcc-Cu and fcc-Au, in that order. Note that even if a structure with the same composition as end_point is found during the structure search and has a total energy lower than the corresponding end_point value, the formation energy is still currently calculated based on the original end_point values defined in cryspy.in.
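For reference, the standard definition of the formation energy per atom can be written as a small function. It is assumed here that this standard definition is the one being applied; the numerical values are hypothetical and the function name is not part of CrySPY.

```python
def formation_energy_per_atom(total_energy, nat, end_point):
    """Formation energy (eV/atom) relative to the pure-element references.

    total_energy: total energy of the cell (eV)
    nat: number of atoms of each type, in the order of atype
    end_point: reference energies (eV/atom), in the same order
    """
    n_total = sum(nat)
    reference = sum(n * e for n, e in zip(nat, end_point))
    return (total_energy - reference) / n_total


# Hypothetical Cu3Au cell with hypothetical reference energies
e_form = formation_energy_per_atom(-14.5, nat=[3, 1], end_point=[-3.7, -3.2])
print(round(e_form, 6))
```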

Convex hull

The energy difference between a given structure’s formation energy and the convex hull is called the energy above hull, also known as the hull distance. This value indicates how much higher the formation energy of a structure is compared to the most stable combination of phases at the same composition. Structures with a hull distance of zero are on the convex hull and are thus thermodynamically stable.

Unlike in the fixed-composition EA, EA-vc filters structures based on their per-atom energy when computing the convex hull, using the condition:    $$ \mathrm{emin\_ea} \le E \le \mathrm{emax\_ea} $$ Note that this filtering is based only on the total energy per atom, not on the formation energy.

To compute the convex hull, CrySPY uses the PhaseDiagram class provided by the pymatgen library. Unlike in the case of formation energy, if a structure with the same composition as a pure element has a total energy lower than the corresponding end_point value, that structure is used as the reference for computing the convex hull and hull distance.
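To make the hull distance concrete for a binary system, the following hand-written sketch builds the lower convex hull of (composition, formation energy) points and measures each point's height above it. It is only an illustration: CrySPY relies on pymatgen's PhaseDiagram, and this sketch assumes that both end compositions (x = 0 and x = 1) are present in the data. The data values are hypothetical.

```python
def lower_hull(points):
    """Lower convex hull of (x, e_form) points (Andrew's monotone chain)."""
    hull = []
    for p in sorted(set(points)):
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # cross <= 0: hull[-1] lies on or above the segment hull[-2] -> p
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull


def hull_energy(hull, x):
    """Hull energy at composition x by linear interpolation."""
    for (x1, y1), (x2, y2) in zip(hull, hull[1:]):
        if x1 <= x <= x2:
            return y1 + (y2 - y1) * (x - x1) / (x2 - x1)
    raise ValueError("composition outside the hull range")


def hull_distances(points):
    """Energy above hull for each (x, e_form) point."""
    # Keep only the lowest energy per composition when building the hull
    best = {}
    for x, e in points:
        best[x] = min(e, best.get(x, e))
    hull = lower_hull(list(best.items()))
    return [e - hull_energy(hull, x) for x, e in points]


# x = fraction of the second element, e_form in eV/atom
points = [(0.0, 0.0), (1.0, 0.0), (0.5, -0.10), (0.25, -0.02), (0.75, -0.08)]
print([round(d, 6) for d in hull_distances(points)])
```

Points with a distance of zero lie on the hull; the structure at x = 0.25 sits above the line connecting its stable neighbours.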

Natural selection

As shown in the figure below, EA-vc can produce multiple stable structures (i.e., with a hull distance of 0). In such cases, multiple individuals share the top rank in terms of hull distance. If the number of elite structures specified by n_elite is smaller than the number of equally ranked individuals, the selection becomes non-deterministic. Currently, CrySPY randomly selects n_elite individuals from those with a hull distance less than 0.001 eV/atom. If the number of individuals with a hull distance less than 0.001 eV/atom is smaller than n_elite, elite structures are selected in the usual way, based on fitness ranking. When selecting elite individuals as well, duplicate structures are removed using the StructureMatcher class provided by the pymatgen library.

fig_EA-vc_elite.svg fig_EA-vc_elite.svg
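The elite selection rule for EA-vc can be sketched as follows, assuming individuals are (structure ID, hull distance) pairs with duplicates already removed. The function name is hypothetical.

```python
import random


def select_elites(individuals, n_elite, tol=0.001, rng=random):
    """Choose elite individuals from (structure_id, hull_distance) pairs."""
    # Individuals regarded as stable: hull distance below tol (eV/atom)
    stable = [ind for ind in individuals if ind[1] < tol]
    if len(stable) >= n_elite:
        # Enough stable individuals: pick n_elite of them at random
        return rng.sample(stable, n_elite)
    # Otherwise fall back to the usual fitness ranking
    return sorted(individuals, key=lambda ind: ind[1])[:n_elite]


individuals = [(0, 0.0), (1, 0.0005), (2, 0.0), (3, 0.0002), (4, 0.05)]
print(select_elites(individuals, n_elite=2, rng=random.Random(0)))
```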

Elite individuals are selected based on the best structures from previous generations. However, because hull distance can vary from one generation to the next, the values for elite individuals are recalculated using the current convex hull before natural selection is applied.

As described in the Convex hull section, emin_ea and emax_ea are not used for natural selection in EA-vc.

Select parents

The method for selecting parents is the same as in the fixed-composition EA.

Create next generation

Evolutionary operations

The crossover (vc) operation is slightly different from that in the fixed-composition EA, while permutation and strain are the same. EA-vc introduces several new operations to enable compositional variation.

Population size

The sum of structures from crossover, permutation, strain, addition, elimination, substitution, and random generation must be equal to n_pop.

  • n_pop = n_crsov + n_perm + n_strain + n_add + n_elim + n_subs + n_rand
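Before starting a run, it may help to confirm that the counts in cryspy.in add up. The sketch below parses an input fragment with Python's configparser and checks the sum; the checking function is hypothetical and not part of CrySPY. The values are those of the EA-vc input used in the interactive mode tutorial.

```python
import configparser

CRYSPY_IN = """
[basic]
algo = EA-vc

[EA]
n_pop = 20
n_crsov = 5
n_perm = 2
n_strain = 2
n_rand = 2
n_add = 3
n_elim = 3
n_subs = 3
"""


def check_n_pop(text):
    """Return (is_consistent, total) for the [EA] section of cryspy.in."""
    cfg = configparser.ConfigParser()
    cfg.read_string(text)
    ea = cfg["EA"]
    keys = ["n_crsov", "n_perm", "n_strain", "n_rand"]
    if cfg["basic"]["algo"] == "EA-vc":
        # EA-vc adds the composition-changing operations
        keys += ["n_add", "n_elim", "n_subs"]
    total = sum(ea.getint(k) for k in keys)
    return total == ea.getint("n_pop"), total


print(check_n_pop(CRYSPY_IN))
```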

Subsections of Variable-composition evolutionary algorithm (EA-vc)

Crossover (vc)

The variable-composition crossover is almost the same as the fixed-composition version, but it differs in that the adjustment of the number of atoms is minimized.

In step 6 of the fixed-composition crossover, the difference in the number of atoms in each atom type is calculated directly. In contrast, in crossover (vc), the difference is calculated based on the allowed range defined by ll_nat and ul_nat. For example:

ll_nat = [4, 4, 4]
ul_nat = [8, 8, 8]
offspring_nat = [2, 6, 12]
nat_diff = [-2, 0, 4]

If this difference in the number of atoms (nat_diff in the example above) exceeds the allowed tolerance (nat_diff_tole), the operation is retried. Otherwise, the number of atoms is adjusted to fall within the range defined by ll_nat and ul_nat.
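The range-based difference in the example above can be reproduced with a short sketch. The function names are hypothetical, and applying the tolerance to each atom type separately is an assumption.

```python
def nat_diff_vc(offspring_nat, ll_nat, ul_nat):
    """Difference of each atom count from the allowed range [ll_nat, ul_nat]."""
    diff = []
    for n, lo, hi in zip(offspring_nat, ll_nat, ul_nat):
        if n < lo:
            diff.append(n - lo)   # too few atoms: negative value
        elif n > hi:
            diff.append(n - hi)   # too many atoms: positive value
        else:
            diff.append(0)        # already within the range
    return diff


def within_tolerance(nat_diff, nat_diff_tole=4):
    """True if every per-type difference is within the tolerance."""
    return all(abs(d) <= nat_diff_tole for d in nat_diff)


nat_diff = nat_diff_vc([2, 6, 12], ll_nat=[4, 4, 4], ul_nat=[8, 8, 8])
print(nat_diff, within_tolerance(nat_diff))
```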

Addition

An atom type whose current count does not exceed the limit specified by ul_nat is randomly selected, and one atom of that type is added at a random position.

  • One atom is added, and if it does not violate the minimum interatomic distance defined by mindist, the structure is accepted.
  • If the distance condition is not satisfied, the atom is placed again at a different random position. This process is repeated up to maxcnt_ea times.
  • If no valid offspring is obtained, the volume is expanded by 10%, and the same procedure is retried up to maxcnt_ea times.
  • If that also fails, the volume is expanded up to 20% and the structure generation is attempted again. If it still fails, the parent is replaced.

fig_EA-vc_addition.svg fig_EA-vc_addition.svg
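The retry logic above can be sketched under strong simplifications: a cubic cell, a single mindist value for all pairs, and volume expansions of 10% and 20% both taken relative to the original volume. These choices, the default value of maxcnt_ea used here, and the function names are assumptions for illustration only.

```python
import math
import random


def min_image_dist(p, q, edge):
    """Distance between fractional points p and q in a cubic cell."""
    d2 = 0.0
    for a, b in zip(p, q):
        d = abs(a - b) % 1.0
        d = min(d, 1.0 - d)          # minimum image along this axis
        d2 += (d * edge) ** 2
    return math.sqrt(d2)


def add_atom(fracs, edge, mindist, maxcnt_ea=50, rng=random):
    """Try to add one atom; return (new_fracs, new_edge) or None on failure."""
    for volume_factor in (1.0, 1.1, 1.2):
        # Expand the volume while keeping fractional coordinates fixed
        new_edge = edge * volume_factor ** (1.0 / 3.0)
        for _ in range(maxcnt_ea):
            trial = [rng.random() for _ in range(3)]
            if all(min_image_dist(trial, f, new_edge) >= mindist for f in fracs):
                return fracs + [trial], new_edge
    return None  # the caller would then choose another parent


fracs = [[0.0, 0.0, 0.0], [0.5, 0.5, 0.5]]
result = add_atom(fracs, edge=4.0, mindist=1.5, rng=random.Random(0))
print(result is not None)
```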

Elimination

An atom type whose current count is above the lower limit specified by ll_nat is randomly selected, and one atom of that type is removed.

fig_EA-vc_elimination.svg fig_EA-vc_elimination.svg

Substitution

Substitution randomly selects two different atom types: one whose current count is above the lower limit specified by ll_nat, and another whose current count is below the upper limit specified by ul_nat. Then, one atom of the former type is replaced with an atom of the latter type. Finally, the minimum interatomic distance defined by mindist is checked, and if no violations are found, the structure is accepted as an offspring.

fig_EA-vc_substitution.svg fig_EA-vc_substitution.svg
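The choice of atom types for substitution can be sketched as follows. Picking uniformly among all valid pairs is an assumption, the function names are hypothetical, and the minimum interatomic distance check is omitted.

```python
import random


def choose_substitution(nat, ll_nat, ul_nat, rng=random):
    """Return (type_removed, type_added) indices, or None if no pair is valid."""
    pairs = [(i, j)
             for i in range(len(nat)) for j in range(len(nat))
             if i != j and nat[i] > ll_nat[i] and nat[j] < ul_nat[j]]
    return rng.choice(pairs) if pairs else None


def substitute(nat, pair):
    """Atom counts after replacing one atom of type i with one of type j."""
    i, j = pair
    new_nat = list(nat)
    new_nat[i] -= 1
    new_nat[j] += 1
    return new_nat


pair = choose_substitution([0, 8], ll_nat=[0, 0], ul_nat=[8, 8])
print(pair, substitute([0, 8], pair))
```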

LAQA

LAQA is one of the selection-type algorithms.

fig_LAQA fig_LAQA

fig_LAQA fig_LAQA

Score $ L $

$$ L = -E + w_F \frac{F^2}{2\Delta F} + w_S S. $$
  • $ E $: Energy (eV/atom)
  • $ w_F $: Weight of the force term. Default: $ w_F = 0.1 $
  • $ F $: Averaged norm of the atomic force (eV/Å)
  • $ \Delta F $: Absolute difference of $ F $ from the previous step. $ \Delta F = 1 $ for the first step. $ \Delta F = 10^{-6} $ if $ \Delta F \le 10^{-6} $.
  • $ w_S $: Weight of the stress term. Default: $ w_S = 10.0 $
  • $ S $: Average of the absolute values of the components of the stress tensor (eV/Å^3).
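The score can be evaluated with a few lines of Python. This is a direct transcription of the formula and the rules for $ \Delta F $ given above; the function name and the input values are hypothetical.

```python
def laqa_score(energy, force, prev_force=None, stress=0.0, w_f=0.1, w_s=10.0):
    """LAQA score L = -E + w_F * F^2 / (2 * dF) + w_S * S."""
    if prev_force is None:
        # First step: dF is set to 1
        delta_f = 1.0
    else:
        # dF is bounded from below by 1e-6
        delta_f = max(abs(force - prev_force), 1e-6)
    return -energy + w_f * force ** 2 / (2.0 * delta_f) + w_s * stress


# First optimization step of a hypothetical structure
print(laqa_score(energy=-3.0, force=0.5, stress=0.01))
```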

Reference

Subsections of Structure generation

struc_mode = crystal

under construction

struc_mode = mol

under construction

struc_mode = mol_bs

CrySPY uses pyxtal in normal molecular crystal structure generation mode (struc_mode = mol). The molecules are arranged to fit a point group at a selected Wyckoff position in the space group to keep the symmetry. (Sometimes it takes a long time to generate.)

In mol_bs mode (bs means break symmetry), dummy atoms are placed in Wyckoff positions as in ordinary crystals, and then the dummy atoms are replaced by molecules without considering symmetry and rotated randomly. The structure generation is relatively fast.

under construction

Subsections of Features

Logging

2023 July 10

Since version 1.2.0, CrySPY has used Python’s logging library. CrySPY logs are output to both the screen and files (log_cryspy and err_cryspy).

  • log –> screen and log_cryspy
  • error and warning –> screen and err_cryspy

Here is the example:

[2023-07-10 18:40:54,389][cryspy_init][INFO] 


Start CrySPY 1.2.0


[2023-07-10 18:40:54,389][cryspy_init][INFO] # ---------- Read input file, cryspy.in
[2023-07-10 18:40:54,390][read_input][INFO] Save input data in cryspy.stat
[2023-07-10 18:40:54,391][cryspy_init][INFO] # ---------- Initial structure generation
[2023-07-10 18:40:54,391][cryspy_init][INFO] Number of MPI processes: 1
[2023-07-10 18:40:54,391][gen_init_struc][INFO] # ------ mindist
[2023-07-10 18:40:54,395][struc_util][INFO] Cu - Cu: 1.32
[2023-07-10 18:40:54,395][gen_init_struc][INFO] # ------ generate structures
[2023-07-10 18:40:54,481][gen_pyxtal][INFO] Structure ID      0 was generated. Space group:   1 -->   1 P1
[2023-07-10 18:40:54,493][gen_pyxtal][INFO] Structure ID      1 was generated. Space group:  28 -->  28 Pma2
[2023-07-10 18:40:54,498][gen_pyxtal][INFO] Structure ID      2 was generated. Space group:  29 -->  29 Pca2_1
[2023-07-10 18:40:54,704][gen_pyxtal][INFO] Structure ID      3 was generated. Space group: 137 --> 137 P4_2/nmc
[2023-07-10 18:40:54,725][gen_pyxtal][INFO] Structure ID      4 was generated. Space group: 212 --> 214 I4_132
[2023-07-10 18:40:54,800][cryspy_init][INFO] Elapsed time for structure generation: 0:00:00.408367
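A comparable setup can be built with Python's standard logging module. The sketch below is not CrySPY's configuration: the format string is inferred from the sample output above, and sending only messages below WARNING to log_cryspy is an assumption. The function names and logger name are hypothetical.

```python
import logging
import os
import sys
import tempfile


def build_logger(workdir, name="cryspy_sketch", to_screen=True):
    """Logger writing to log_cryspy, err_cryspy, and optionally the screen."""
    fmt = logging.Formatter("[%(asctime)s][%(module)s][%(levelname)s] %(message)s")
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False

    log_file = logging.FileHandler(os.path.join(workdir, "log_cryspy"))
    # Assumption: log_cryspy receives only messages below WARNING
    log_file.addFilter(lambda record: record.levelno < logging.WARNING)
    err_file = logging.FileHandler(os.path.join(workdir, "err_cryspy"))
    err_file.setLevel(logging.WARNING)
    handlers = [log_file, err_file]
    if to_screen:  # to_screen=False mimics the effect of `cryspy -n`
        handlers.append(logging.StreamHandler(sys.stdout))
    for handler in handlers:
        handler.setFormatter(fmt)
        logger.addHandler(handler)
    return logger


def close_logger(logger):
    """Close and detach all handlers so that files are flushed."""
    for handler in list(logger.handlers):
        handler.close()
        logger.removeHandler(handler)


with tempfile.TemporaryDirectory() as workdir:
    logger = build_logger(workdir)
    logger.info("# ---------- Read input file, cryspy.in")
    logger.warning("example warning")
    close_logger(logger)
```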

If you do not want output in the console, run cryspy with the -n option as follows:

cryspy -n

Backup

2024 Dec. 22 updated

CrySPY has a simple backup function. The following files are backed up:

  • cryspy.in
  • cryspy.stat
  • log_cryspy
  • err_cryspy
  • debug_cryspy
  • cryspy_interactive.ipynb
  • calc_in/*
  • data/*

work/* are NOT included.

  • (v1.1.0 or later) The above files are copied to a directory named by date and time inside the “backup” directory. Previous backups are NOT automatically deleted.
  • (v1.0.0) only one generation is backed up, and previous backups will be deleted.

Auto backup

The timing of the automatic backup is as follows:

  • before going to next selection (BO, LAQA) or next generation (EA)
  • append structures

Manual backup

To back up manually, run cryspy with the -b or --backup option:

cryspy -b

This command only performs backups, unlike the normal execution.

Clean

2024 Dec. 22 updated

CrySPY has a simple clean function, which just moves files. It is useful when you want to start over from the beginning. The following files are cleaned up:

  • cryspy.stat
  • log_cryspy
  • err_cryspy
  • lock_cryspy
  • data/*
  • work/*
  • tmp_gen_struc/*

To clean up, run cryspy with the -c or --clean option:

$ ls
calc_in  cryspy.in  cryspy.stat  data  err_cryspy  log_cryspy
$ cryspy -c
Are you sure you want to clean the data? 'yes' or 'no' [y/n]: y
$ ls
calc_in  cryspy.in  trash
$ ls trash
20230318_100728

Files other than calc_in/* and cryspy.in are moved to trash and grouped into a directory named by date and time. If you do not need them, you can delete them manually.
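The move-to-trash behavior can be mimicked with the standard library. This sketch is not CrySPY's code: the function name is hypothetical, the timestamp format is inferred from the directory name in the example above, and the confirmation prompt is omitted.

```python
import os
import shutil
import tempfile
from datetime import datetime

CLEAN_TARGETS = ["cryspy.stat", "log_cryspy", "err_cryspy", "lock_cryspy",
                 "data", "work", "tmp_gen_struc"]


def clean(workdir, now=None):
    """Move the clean targets into trash/<date_time>/ inside workdir."""
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    dest = os.path.join(workdir, "trash", stamp)
    moved = []
    for name in CLEAN_TARGETS:
        src = os.path.join(workdir, name)
        if os.path.exists(src):
            os.makedirs(dest, exist_ok=True)
            shutil.move(src, os.path.join(dest, name))
            moved.append(name)
    return dest, moved


with tempfile.TemporaryDirectory() as workdir:
    # Build a small fake working directory
    for fname in ("cryspy.in", "cryspy.stat", "log_cryspy"):
        open(os.path.join(workdir, fname), "w").close()
    os.makedirs(os.path.join(workdir, "data"))
    dest, moved = clean(workdir, now=datetime(2023, 3, 18, 10, 7, 28))
    remaining = sorted(os.listdir(workdir))
    print(moved, remaining)
```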

Restriction on interatomic distances

2024 April 23, updated

You can restrict the interatomic distances in structure generation. Here is an example of the [structure] section in the input file that limits the minimum interatomic distances for an A-B binary system.

[structure]
natot = 8
atype = A B
nat = 4 4
mindist_1 = 2.0 1.8
mindist_2 = 1.8 1.5

This means that minimum interatomic distances of A-A, A-B, and B-B are limited to 2.0, 1.8, and 1.5 Å, respectively. Structures with interatomic distances shorter than these values are automatically eliminated.

For ternary systems, you will need mindist_1, mindist_2, and mindist_3. The mindist matrix must be symmetric.
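The way the mindist rows form a symmetric matrix, and how a pair distance is compared against it, can be sketched as follows. The function names are hypothetical; the values are those of the A-B example above.

```python
def build_mindist(rows):
    """Validate that the mindist rows form a square, symmetric matrix."""
    n = len(rows)
    for i in range(n):
        if len(rows[i]) != n:
            raise ValueError("mindist must be a square matrix")
        for j in range(n):
            if rows[i][j] != rows[j][i]:
                raise ValueError("mindist must be a symmetric matrix")
    return rows


def violates_mindist(type_i, type_j, distance, mindist):
    """True if two atoms of the given types are closer than allowed."""
    return distance < mindist[type_i][type_j]


# mindist_1 = 2.0 1.8 and mindist_2 = 1.8 1.5 (atom types: 0 = A, 1 = B)
mindist = build_mindist([[2.0, 1.8], [1.8, 1.5]])
print(violates_mindist(0, 1, 1.7, mindist))  # A-B pair at 1.7 Å
print(violates_mindist(1, 1, 1.6, mindist))  # B-B pair at 1.6 Å
```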

Since CrySPY version 1.4.0, the minimum interatomic distance check is also performed after structure relaxation. This feature was introduced because, with machine learning potentials, structures with nearly overlapping atoms can sometimes be obtained. You can disable this feature by adding the following line to cryspy.in (it is enabled by default):

[option]
check_mindist_opt = False

Example: Na8Cl8

Without mindist

cryspy.in

[basic]
algo = RS
calc_code = VASP
tot_struc = 5
nstage = 2
njob = 2
jobcmd = qsub
jobfile = job_cryspy

[structure]
natot = 16
atype = Na Cl
nat = 8 8

[VASP]
kppvol = 40 80

[option]

log_cryspy

[2024-04-23 13:46:28,598][cryspy_init][INFO] 


Start CrySPY 1.2.3


[2024-04-23 13:46:28,598][cryspy_init][INFO] # ---------- Read input file, cryspy.in
[2024-04-23 13:46:28,598][read_input][INFO] Save input data in cryspy.stat
[2024-04-23 13:46:28,599][gen_init_struc][INFO] # ------ mindist
[2024-04-23 13:46:28,601][struc_util][INFO] Na - Na: 1.66
[2024-04-23 13:46:28,602][struc_util][INFO] Na - Cl: 1.3399999999999999
[2024-04-23 13:46:28,602][struc_util][INFO] Cl - Cl: 1.02
...

fig_mindist fig_mindist

In the default settings of PyXtal, atoms can sometimes be too close to each other, as shown in the figure above, so it is recommended to set the mindist parameter. That would help simplify DFT calculations.

With mindist

cryspy.in

[basic]
algo = RS
calc_code = VASP
tot_struc = 5
nstage = 2
njob = 2
jobcmd = qsub
jobfile = job_cryspy

[structure]
natot = 16
atype = Na Cl
nat = 8 8
mindist_1 = 2.5 1.5
mindist_2 = 1.5 2.5

[VASP]
kppvol = 40 80

[option]

log_cryspy

[2024-04-23 14:06:21,955][cryspy_init][INFO] 


Start CrySPY 1.2.3


[2024-04-23 14:06:21,955][cryspy_init][INFO] # ---------- Read input file, cryspy.in
[2024-04-23 14:06:21,956][read_input][INFO] Save input data in cryspy.stat
[2024-04-23 14:06:21,956][gen_init_struc][INFO] # ------ mindist
[2024-04-23 14:06:21,956][struc_util][INFO] Na - Na: 2.5
[2024-04-23 14:06:21,956][struc_util][INFO] Na - Cl: 1.5
[2024-04-23 14:06:21,956][struc_util][INFO] Cl - Cl: 2.5

In cases like ionic crystals, it is advisable to set up the configuration in such a way that cations and anions are kept apart from each other.

CrySPY_ID in job files

In the job file of CrySPY, the string “CrySPY_ID” is automatically replaced with the structure ID. When you use a job scheduler such as PBS or SLURM, it is useful to include the structure ID in the job name. For example, in the PBS system, #PBS -N Si_CrySPY_ID is replaced with #PBS -N Si_10 for structure ID 10. Note that a job name starting with a number will result in an error, so you should add a prefix such as Si_.

#!/bin/sh
#$ -cwd
#$ -V -S /bin/bash
####$ -V -S /bin/zsh
#$ -N Si8_CrySPY_ID
#$ -pe smp 8
####$ -q ibis1.q
####$ -q ibis2.q

mpirun -np $NSLOTS pw.x -nk 4 -nb 2 < pwscf.in > pwscf.out


if [ -e "CRASH" ]; then
    exit 1
fi

sed -i -e '3 s/^.*$/done/' stat_job
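
The substitution described above amounts to a plain string replacement. A minimal sketch, assuming nothing about how CrySPY performs it internally (fill_job_template is a hypothetical helper):

```python
def fill_job_template(template: str, struc_id: int) -> str:
    """Replace every occurrence of the placeholder with the structure ID."""
    return template.replace("CrySPY_ID", str(struc_id))


template = "#!/bin/sh\n#PBS -N Si_CrySPY_ID\n"
job_text = fill_job_template(template, 10)
print(job_text)
```

The Si_ prefix in the template keeps the resulting job name from starting with a digit.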

Structure generation with MPI parallelization

Oct. 21 2023, update

Random structure generation using MPI has been available since version 1.1.0 (CrySPY 1.2.3 or later is recommended). You need to install mpi4py in your Python environment for MPI parallelization. An MPI library such as Open MPI, Intel MPI, or MPICH is also required on your workstation.

Info

Requirements:

  • CrySPY 1.2.3 or later
  • mpi4py
  • MPI library (Open MPI, Intel MPI, MPICH, etc.)
Warning

CrySPY versions 1.1.0 through 1.2.2 have a bug: when you use bash (zsh) to run a job with MPI (e.g., jobcmd = zsh, jobfile = job_cryspy), the MPI job does not run. There is no problem when you use a job scheduler (qsub, sbatch). This has already been fixed in version 1.2.3.

The figure below shows the relationship between elapsed time and the number of processes for 1000 structures of Si8 with the following setting:

[basic]
algo = RS
calc_code = soiap
tot_struc = 1000
nstage = 1
njob = 2
jobcmd = zsh
jobfile = job_cryspy

[structure]
natot = 8
atype = Si
nat = 8
mindist_1 = 2.2

Structure generation takes a long time here because of the slightly strict setting mindist_1 = 2.2. The structure generation was performed 10 times for each number of processes.

fig_MPI fig_MPI

Run

mpiexec -n 4 cryspy -p

Enthalpy

2023/10/18

Info

Requirements:

  • CrySPY 1.2.2 or later
  • VASP or QE

When performing CSP at high pressure, enthalpy results can be collected instead of total energy. This is not yet compatible with software other than VASP and QE.

E_eV_atom in cryspy_rslt and cryspy_rslt_energy_asc turns into enthalpy (eV/atom). Here is an example of CSP results under 40 GPa pressure for Sr4O4. The CsCl-type structure (ID 5) is more stable than the NaCl-type structure (ID 6).

   Spg_num Spg_sym  Spg_num_opt Spg_sym_opt  E_eV_atom  Magmom      Opt
5       26  Pmc2_1          221       Pm-3m  -2.276790     NaN     done
6      225   Fm-3m          225       Fm-3m  -2.244800     NaN     done
1      101  P4_2cm          107        I4mm  -2.181115     NaN     done
4      123  P4/mmm          123      P4/mmm  -2.034509     NaN  not_yet
3       20  C222_1           63        Cmcm  -0.686541     NaN     done
2       75      P4           75          P4  -0.008713     NaN  not_yet
9       51    Pmma           47        Pmmm   0.096430     NaN     done
8       65    Cmmm          123      P4/mmm   1.099657     NaN     done
0      187   P-6m2          187       P-6m2   1.292124     NaN     done
7       53    Pmna           53        Pmna   5.153504     NaN  not_yet

VASP

CrySPY reads energy (enthalpy) from an OSZICAR file. This automatically changes to enthalpy when PSTRESS is set in INCAR_x as follows:

PSTRESS = 400

You do not have to do anything in cryspy.in. energy_step_flag is also supported for enthalpy.

Example: CrySPY utility > examples > qe_Sr4O4_RS_pv_term

QE

Add pv_term = True in the QE section of cryspy.in to use enthalpy:

[QE]
qe_infile = pwscf.in
qe_outfile = pwscf.out
kppvol =  40  80
pv_term = True

Don’t forget to write press in the QE input:

 &cell
    press = 400
 /
Warning

In QE, energy_step_flag is not supported yet for enthalpy.
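
The value 400 in both inputs corresponds to the 40 GPa mentioned above, since both codes take the pressure in kbar. The sketch below shows the unit conversion and the size of the PV term in H = E + PV. The energy, volume, and atom count in the demo are hypothetical numbers chosen for illustration, not results from the Sr4O4 run, and the function names are not part of CrySPY.

```python
EV_PER_A3_IN_GPA = 160.2176634  # 1 eV/Å^3 expressed in GPa
KBAR_PER_GPA = 10.0


def pv_term_ev(pressure_kbar: float, volume_a3: float) -> float:
    """Return P*V in eV for a pressure in kbar and a cell volume in Å^3."""
    pressure_gpa = pressure_kbar / KBAR_PER_GPA
    return pressure_gpa * volume_a3 / EV_PER_A3_IN_GPA


def enthalpy_ev_per_atom(energy_ev, pressure_kbar, volume_a3, natot):
    """Return H = E + PV per atom (eV/atom)."""
    return (energy_ev + pv_term_ev(pressure_kbar, volume_a3)) / natot


# Hypothetical cell: E = -40.0 eV, V = 100.0 Å^3, 8 atoms, 400 kbar (= 40 GPa).
pv = pv_term_ev(400.0, 100.0)
h_atom = enthalpy_ev_per_atom(-40.0, 400.0, 100.0, 8)
print(f"PV = {pv:.3f} eV, H = {h_atom:.3f} eV/atom")
```

At tens of GPa the PV term is comparable in size to the total energy itself, which is why the ranking of structures can change under pressure.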

As library

2024 May 31

Info

Requirements:

  • CrySPY 1.3.0 or later

CrySPY can be used as a library to generate random structures or structures by evolutionary algorithm. The Jupyter notebook is available in CrySPY utility > notebook > as_library.

Random structure generation

####
#### when you change set_logger(), you need to restart the kernel
####
from cryspy.util.utility import set_logger    # optional
set_logger()    # optional
#set_logger(noprint=True, logfile='log_cryspy', errfile='err_cryspy')    # write log and err messages to files

from cryspy.RS.gen_struc_RS import gen_pyxtal

nstruc = 10
atype = ('Na', 'Cl')
nat = (4, 4)
mindist = ((2.0, 1.5),
           (1.5, 2.0))
spgnum = 'all'

init_struc_data = gen_pyxtal.gen_struc(
    nstruc=nstruc,
    atype=atype,
    nat=nat,
    mindist=mindist,
    spgnum=spgnum,
)

You can get init_struc_data (dict: {ID: pymatgen Structure, …})

Structure generation by evolutionary algorithm

Situation: parent A (, parent B) –> child

Prepare two (or one) parent structures as pymatgen Structure objects.
In this example, we simply use the results of RS for Cu4Au4 (see CrySPY utility > notebook > as_library).

import pickle
with open('./Cu4Au4_sample/opt_struc_data.pkl', 'rb') as f:
    opt_struc_data = pickle.load(f)

Crossover

from cryspy.EA.gen_struc_EA import crossover

# you can change parent_A and parent_B
parent_A = opt_struc_data[0]
parent_B = opt_struc_data[1]

atype = ('Cu', 'Au')
nat = (4, 4)
mindist = ((1.5, 1.5),
           (1.5, 1.5))

child = crossover.gen_child(
    atype=atype,
    nat=nat,
    mindist=mindist,
    parent_A=parent_A,
    parent_B=parent_B,
)

# child: pymatgen Structure

Permutation

from cryspy.EA.gen_struc_EA import permutation

# you can change parent_A
parent_A = opt_struc_data[0]

atype = ('Cu', 'Au')
nat = (4, 4)
mindist = ((1.5, 1.5),
           (1.5, 1.5))
ntimes = 1    # number of times to perform permutation

child = permutation.gen_child(
    atype=atype,
    mindist=mindist,
    parent_A=parent_A,
    ntimes=ntimes,
)

# child: pymatgen Structure

Strain

from cryspy.EA.gen_struc_EA import strain

# you can change parent_A
parent_A = opt_struc_data[0]

atype = ('Cu', 'Au')
nat = (4, 4)
mindist = ((1.5, 1.5),
           (1.5, 1.5))
sigma_st = 0.05    # standard deviation of strain

child = strain.gen_child(
    atype=atype,
    mindist=mindist,
    parent_A=parent_A,
    sigma_st=sigma_st,
)

# child: pymatgen Structure

Situation: parent group, fitness –> children

Data set

Prepare structure and fitness (energy) data as dict. The key is the structure ID. In this example, we simply use the results of RS for Cu4Au4 (see CrySPY utility > notebook > as_library).

e.g.
struc_data = {0: (pymatgen Structure), 1: (pymatgen Structure), …}
fitness = {0: 0.019632287242441926, 1: -0.005437509701440302, …}

import pickle
with open('./Cu4Au4_sample/opt_struc_data.pkl', 'rb') as f:
    opt_struc_data = pickle.load(f)
with open('./Cu4Au4_sample/rslt_data.pkl', 'rb') as f:
    rslt_data = pickle.load(f)

struc_data = opt_struc_data    # dict
fitness = rslt_data['E_eV_atom'].to_dict()    # you may include None or np.nan for values

Survival of the fittest

from cryspy.EA.survival import survival_fittest
from cryspy.EA.gen_struc_EA.select_parents import SelectParents
from cryspy.EA.gen_struc_EA import crossover, permutation, strain

n_fittest = 5    # number of survivors

ranking, _, _ = survival_fittest(
    fitness=fitness,
    struc_data=struc_data,
    elite_struc=None,
    elite_fitness=None,
    n_fittest=n_fittest,
    fit_reverse=False,
    emax_ea=None,
    emin_ea=None,
)

# ranking <-- e.g. [2, 1, 0, 7, 9] without structure duplication

Select parents class

sp = SelectParents(ranking)    # after set_xxx, we can use sp.get_parents(n_parent)
sp.set_tournament(t_size=2)
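
For readers unfamiliar with the terms, here is a minimal pure-Python sketch of ranking by fitness followed by tournament selection. It only illustrates the general technique: rank_by_fitness and tournament are hypothetical helpers, not the CrySPY functions used above, and unlike survival_fittest this sketch does not remove duplicate structures.

```python
import random


def rank_by_fitness(fitness, n_fittest, fit_reverse=False):
    """Return IDs from best to worst, skipping None/NaN, truncated to n_fittest."""
    valid = {k: v for k, v in fitness.items() if v is not None and v == v}  # v != v for NaN
    ordered = sorted(valid, key=valid.get, reverse=fit_reverse)  # lower is better by default
    return ordered[:n_fittest] if n_fittest > 0 else ordered


def tournament(ranking, t_size, rng):
    """Draw t_size distinct entrants at random and return the best-ranked one."""
    entrants = rng.sample(ranking, t_size)
    return min(entrants, key=ranking.index)


# Hypothetical fitness values (eV/atom); IDs 3 and 4 have no usable result.
fitness = {0: 0.0196, 1: -0.0054, 2: -0.0300, 3: None, 4: float("nan")}
ranking = rank_by_fitness(fitness, n_fittest=3)
rng = random.Random(0)
parent = tournament(ranking, t_size=2, rng=rng)
print(ranking, parent)
```

A larger t_size makes the selection more elitist: when t_size equals the length of the ranking, the best individual always wins.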

Crossover

atype = ('Cu', 'Au')
nat = (4, 4)
mindist = ((1.5, 1.5),
           (1.5, 1.5))
n_crsov = 5    # number of structures to be generated by crossover
#id_start = len(init_struc_data)  # next Structure ID
id_start = 10

co_children, co_parents, co_operation = crossover.gen_crossover(
    atype=atype,
    nat=nat,
    mindist=mindist,
    struc_data=struc_data,
    sp=sp,
    n_crsov=n_crsov,
    id_start=id_start,
)

# co_children <-- dict {ID: pymatgen Structure, ID: pymatgen Structure, ...}
# co_parents  <-- e.g. {10: (2, 7), 11: (2, 1), 12: (2, 1), 13: (0, 2), 14: (2, 1)}
# co_operation <-- e.g. {10: 'crossover', 11: 'crossover', ...}

Permutation

n_perm = 5    # number of structures to be generated by permutation
#id_start = len(init_struc_data) + n_crsov   # next Structure ID
id_start = 15
ntimes = 1    # number of times to perform permutation

pm_children, pm_parents, pm_operation = permutation.gen_permutation(
    atype=atype,
    mindist=mindist,
    struc_data=struc_data,
    sp=sp,
    n_perm=n_perm,
    id_start=id_start,
    ntimes=ntimes,
)

# pm_children <-- dict {ID: pymatgen Structure, ID: pymatgen Structure, ...}
# pm_parents  <-- e.g. {15: (2,), 16: (1,), 17: (2,), 18: (1,), 19: (1,)}
# pm_operation <-- e.g. {15: 'permutation', 16: 'permutation', ...}

Strain

n_strain = 5    # number of structures to be generated by strain
#id_start = len(init_struc_data) + n_crsov + n_perm   # next Structure ID
id_start = 20
sigma_st = 0.05    # standard deviation of strain

st_children, st_parents, st_operation = strain.gen_strain(
    atype=atype,
    mindist=mindist,
    struc_data=struc_data,
    sp=sp,
    n_strain=n_strain,
    id_start=id_start,
    sigma_st=sigma_st,
)

# st_children <-- dict {ID: pymatgen Structure, ID: pymatgen Structure, ...}
# st_parents  <-- e.g. {20: (1,), 21: (2,), 22: (0,), 23: (2,), 24: (2,)}
# st_operation <-- e.g. {20: 'strain', 21: 'strain', ...}

Interactive mode

2025 March 6

Info

Requirements:

  • CrySPY 1.4.0 or later
  • Jupyter
  • Structure optimization software compatible with ASE (e.g., machine learning potentials).
  • nglview (optional)

An interactive mode using Jupyter Notebook has been made available to ensure ease of use, even for those unfamiliar with PC clusters or supercomputers. Since the structure optimization calculations are designed for ASE, compatible machine learning potentials can be used.

For detailed usage, please refer to Tutorial > Interactive mode (Jupyter Notebook).

fig_struc_visu fig_eplot fig_convplot

fig_conv_plotly3d_ternary

fig_convplotmat

Subsections of Input file

File format

CrySPY uses the configparser module to read the input file, cryspy.in. cryspy.in consists of sections, each led by a [section] header and followed by name = value or name : value entries. Section names and values are case sensitive, but names are not. Lines beginning with # or ; are ignored and may be used for comments. Accepted bool values are 1, yes, true, and on, which are interpreted as True, and 0, no, false, and off, which are interpreted as False. These bool strings are checked in a case-insensitive manner. Some values are given as space-separated lists.

Info

See the configparser documentation for details.

Note

section name: case sensitive
name: case insensitive
value: case sensitive except for bool
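
These rules can be checked directly with Python's standard configparser. The sketch below uses the default ConfigParser settings on a small cryspy.in-like string. The fragment is illustrative only, and how CrySPY post-processes the values is not shown.

```python
import configparser

text = """
[basic]
algo = RS
# comment line
TOT_STRUC = 5
jobcmd : qsub

[structure]
atype = Na Cl
nat = 8 8

[option]
energy_step_flag = Yes
"""

config = configparser.ConfigParser()
config.read_string(text)

tot_struc = config.getint("basic", "tot_struc")            # names are case insensitive
jobcmd = config.get("basic", "jobcmd")                      # "name : value" also works
atype = config.get("structure", "atype").split()            # space-separated values
nat = [int(x) for x in config.get("structure", "nat").split()]
energy_step_flag = config.getboolean("option", "energy_step_flag")  # bool strings: any case
has_upper_section = config.has_section("BASIC")             # section names are case sensitive

print(tot_struc, jobcmd, atype, nat, energy_step_flag, has_upper_section)
```

Values such as RS or Na Cl are returned exactly as written, which is why they are case sensitive.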

[basic] section

2025 March 6 updated

Name | Value | Default | Description
algo | RS, EA, EA-vc, BO, LAQA | | Algorithm
calc_code | VASP, QE, OMX, soiap, LAMMPS, ASE | | Calculation code for structure optimization
tot_struc | int | | The total number of structures. Not used in the case of EA or EA-vc.
nstage | int | | The number of stages
njob | int | | The number of jobs running at the same time.
jobcmd | str | | Command to submit jobs such as qsub and sbatch.
jobfile | str | | File name of the job file.

[structure] section

2025 March 6, updated

Name | Value | Default | Description
struc_mode | crystal, mol, mol_bs | crystal | Structure generation mode
atype | atomic symbol [atomic symbol …] | | Atom type. e.g. atype = Na Cl.
nat | int [int …] | | The number of atoms corresponding to each atype. e.g. nat = 8 8. Not used in EA-vc.
mindist (mindist_?) | float [float …] | None | Constraint on minimum interatomic distance [Å].
mindist_factor | float | 1.0 | Scaling factor for mindist.
vol_factor | float | 1.0 | Volume scaling factor.
vol_mu | float | None | Mean of volume if you want to specify the volume of cells.
vol_sigma | float | None | Standard deviation of volume if you want to specify the volume of cells.
symprec | float | 0.01 | Precision for symmetry finding.
spgnum | all, space group number, 0 | all | Constraint on space group. If all, 1–230. If 0, random structure without space group information (no symmetry).
use_find_wy | bool | False | Structure generation with find_wy.

mindist


if algo is EA-vc

Name | Value | Default | Description
ll_nat | int [int …] | | Lower limit of nat. e.g. ll_nat = 0 0.
ul_nat | int [int …] | | Upper limit of nat. e.g. ul_nat = 8 8.

if struc_mode is mol or mol_bs

Name | Value | Default | Description
mol_file | str [str …] | | Path of molecule files or molecule names.
nmol | int [int …] | | The number of molecules.
timeout_mol | float | None | Time out for molecular structure generation.
rot_mol | random, random_mol, random_wyckoff | random_wyckoff | Only used in mol_bs. Mode for rotation of molecules.
nrot | int | 20 | Only used in mol_bs. Maximum number of trials to rotate molecules.
mindist_mol_bs (mindist_mol_bs_?) | float [float …] | None | Only used in mol_bs. Constraint on minimum intermolecular distance [Å].
mindist_mol_bs_factor | float | 1.0 | Only used in mol_bs. Scaling factor for mindist_mol_bs.

if use_find_wy is True or spgnum = 0

Name | Value | Default | Description
fwpath | str | None | Only used with find_wy. Path of find_wy. If None, fwpath is automatically searched in your $PATH.
minlen | float | | Only used with find_wy or spgnum = 0. Minimum length of lattice vector [Å].
maxlen | float | | Only used with find_wy or spgnum = 0. Maximum length of lattice vector [Å].
dangle | float | | Only used with find_wy or spgnum = 0. Delta angle for alpha, beta, and gamma in degree unit.
maxcnt | int | 50 | Only used with find_wy or spgnum = 0. Maximum number of trials to determine atom positions.

[VASP] section

2024 April 22

[VASP] section is required only if you use VASP (calc_code = VASP)

Name | Value | Default | Description
kppvol | int [int …] | | Grid density per Å**(-3) of reciprocal cell in each stage.
force_gamma | bool | False | If true, force gamma-centered mesh.

kppvol and force gamma

[QE] section

[QE] section is required only if you use QE (calc_code = QE)

Name | Value | Default | Description
kppvol | int [int …] | | Grid density per Å**(-3) of reciprocal cell in each stage
qe_infile | str | | File name of QE input file.
qe_outfile | str | | File name of QE output file.
pv_term | bool | False | If true, read enthalpy instead of total energy.

kppvol

pv_term

[OMX] section

[OMX] section is required only if you use OpenMX (calc_code = OMX)

Name | Value | Default | Description
kppvol | int [int …] | | Grid density per Å**(-3) of reciprocal cell in each stage
OMX_infile | str | | File name of OpenMX input file.
OMX_outfile | str | | File name of OpenMX output file.
ValenceElectrons | str float float [str float float …] | | The number of initial charges for up and down spin states.

kppvol

ValenceElectrons

e.g. in NaCl: ValenceElectrons = Na 4.5 4.5 Cl 3.5 3.5.

[soiap] section

[soiap] section is required only if you use soiap (calc_code = soiap)

Name | Value | Default | Description
soiap_infile | str | | File name of soiap input file.
soiap_outfile | str | | File name of soiap output file.
soiap_cif | str | | File name of soiap CIF-formatted initial structure.

[LAMMPS] section

[LAMMPS] section is required only if you use LAMMPS (calc_code = LAMMPS)

Name | Value | Default | Description
lammps_infile | str | | File name of LAMMPS input file.
lammps_outfile | str | | File name of LAMMPS output file.
lammps_potential | str [str …], None | None | Potential.
lammps_data | str | | File name of LAMMPS data file.

[ASE] section

[ASE] section is required only if you use ASE (calc_code = ASE)

Name | Value | Default | Description
ase_python | str | | File name of ASE input file.

[EA] section

2025 June 15, updated

Name | Value | Default | Description
n_pop | int | | Population (see also Population size)
n_crsov | int | | Number of offspring created by crossover
n_perm | int | | Number of offspring created by permutation
n_strain | int | | Number of offspring created by strain
n_rand | int | | Number of structures created randomly
n_elite | int | | Number of elite individuals (see also Natural selection)
fit_reverse | bool | False | If False, minimal search (see also Evaluate fitness)
n_fittest | int | 0 | Number of individuals that survive natural selection. If set to 0, all individuals are retained.
slct_func | TNM, RLT | | Function to select parents
t_size | int | 3 | Tournament size. Used only with slct_func = TNM. (see also Tournament selection)
a_rlt | float | 10.0 | Parameter for linear scaling. Used only with slct_func = RLT. (see also Roulette selection)
b_rlt | float | 1.0 | Parameter for linear scaling. Used only with slct_func = RLT. (see also Roulette selection)
crs_lat | equal, random | random | How to mix lattice vectors (see also crossover > 5. Swap the sliced halves)
nat_diff_tole | int | 4 | Tolerance for difference in the number of atoms in crossover. (see also crossover > 6. Select the offspring with more atoms)
ntimes | int | 1 | Number of times in permutation.
sigma_st | float | 0.5 | Standard deviation for strain.
maxcnt_ea | int | 50 | Maximum number of trials in EA.
maxgen_ea | int | 0 | Maximum generation. If set to 0, no upper limit is applied.
emax_ea | float | None | Energy upper limit (eV/atom) for natural selection.
emin_ea | float | None | Energy lower limit (eV/atom) for natural selection.

if algo is EA-vc

Name | Value | Default | Description
n_add | int | | Number of offspring created by addition.
n_elim | int | | Number of offspring created by elimination.
n_subs | int | | Number of offspring created by substitution.
target | str | random | Target. Only random for now.
end_point | (float, …, float) | | Energy of end points for formation energy.
emax_ea | float | None | Energy upper limit (eV/atom) for computing the convex hull.
emin_ea | float | None | Energy lower limit (eV/atom) for computing the convex hull.
show_max | float | 0.2 | When plotting the convex hull, the maximum value of the y-axis (for binary systems) or the maximum hull distance (for ternary systems) is set by show_max.
lable_stable | bool | True | Whether to show stable compositions when plotting the convex hull.
vmax | float | 0.2 | Maximum value of the colorbar representing hull distance.
bottom_margin | float | 0.02 | Bottom margin of the y-axis for binary convex hull plot.
fig_format | str | svg | Figure format for convex hull plot: svg, png, or pdf.

[BO] section

2024 May 27th, updated

[BO] section is required only if you use BO (algo = BO)

Name | Value | Default | Description
nselect_bo | int | | The number of structures to be selected at once.
score | TS, EI, PI | | Acquisition function.
num_rand_basis | int | 0 | The number of basis functions. If 0, a Gaussian process is used.
cdev | float | 0.001 | Cutoff of deviation for standardization.
dscrpt | FP | | Structure descriptor.
max_select_bo | int | 0 | Maximum number of selection.
manual_select_bo | int [int …] | None | Structure IDs to be selected manually.
emax_bo | float | None | Upper limit of energy in BO.
emin_bo | float | None | Lower limit of energy in BO.

if dscrpt is FP

CrySPY 1.3.0 or later

fppath and fp_rmin are obsolete.

Name | Value | Default | Description
fp_rmax | float | 8.0 | Only used with dscrpt = FP. Maximum cutoff of r in fingerprint.
fp_npoints | int | 20 | Only used with dscrpt = FP. Number of discretized points for each pair in fingerprint.
fp_sigma | float | 0.7 | Only used with dscrpt = FP. Sigma parameter [Å] in Gaussian smearing function.

CrySPY 1.2.5 or earlier

Name | Value | Default | Description
fppath | str | None | Only used with dscrpt = FP. Path of cal_fingerprint. If None, fppath is automatically searched in your $PATH.
fp_rmin | float | 0.5 | Only used with dscrpt = FP. Minimum cutoff of r in fingerprint.
fp_rmax | float | 5.0 | Only used with dscrpt = FP. Maximum cutoff of r in fingerprint.
fp_npoints | int | 20 | Only used with dscrpt = FP. Number of discretized points for each pair in fingerprint.
fp_sigma | float | 1.0 | Only used with dscrpt = FP. Sigma parameter [Å] in Gaussian smearing function.

[LAQA] section

[LAQA] section is required only if you use LAQA (algo = LAQA)

Name | Value | Default | Description
nselect_laqa | int | | The number of structures to be selected at once.
wf | float | 0.1 | Weight of the force term.
ws | float | 10.0 | Weight of the stress term.
Info

If algo = LAQA, the following are automatically set in the [option] section.

  • force_step_flag = True
  • stress_step_flag = True

Force and stress data are collected step by step. Energy and structure data are NOT. They are collected for each selection. In other words, in this case, energy and structure data are saved once every 10 steps. If you want to collect energy and structure data step by step, manually set up as follows:

[option]
energy_step_flag = True
struc_step_flag = True

[option] section

Name | Value | Default | Description
check_mindist_opt | bool | True | If True, a mindist constraint is checked after structure relaxation.
stop_chkpt | int | 0 | CrySPY stops at a specified check point.
load_struc_flag | bool | False | If True, load initial structures from ./data/pkl_data/init_struc_data.pkl.
stop_next_struc | bool | False | If True, CrySPY does not submit jobs for next structures, but jobs for next stage are submitted.
recalc | int [int …] | (empty list) | Specify structure IDs if you want to recalculate or continue optimization.
append_struc_ea | bool | False | If True, append structures by EA.
energy_step_flag | bool | False | If True, save energy_step_data in ./data/pkl_data/energy_step_data.pkl.
struc_step_flag | bool | False | If True, save struc_step_data in ./data/pkl_data/struc_step_data.pkl.
force_step_flag | bool | False | If True, save force_step_data in ./data/pkl_data/force_step_data.pkl.
stress_step_flag | bool | False | If True, save stress_step_data in ./data/pkl_data/stress_step_data.pkl.

Kpoint

2024 April 22

CrySPY automatically generates the k-point setting using the pymatgen.io.vasp.Kpoints.automatic_density_by_vol function from pymatgen. An example in cryspy.in with nstage = 2 is as follows:

[VASP]
kppvol = 40 120
  • stage 1: kppvol = 40
  • stage 2: kppvol = 120

kppvol means a grid density per Å$^{-3}$ of reciprocal cell.
VASP: gamma centered meshes are used for hexagonal cells and face-centered cells; otherwise, Monkhorst-Pack grids are employed.
QE and OMX: only a k-mesh is provided, no offset.
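
To give a feel for how kppvol translates into a mesh, here is a simplified sketch for orthogonal cells only. It assumes the density-by-volume rule works as follows: the reciprocal-cell volume includes the 2π factor, the target number of k-points is kppvol times that volume, and the divisions along each axis are rounded down with a minimum of 1. This reflects our understanding of the pymatgen function named above, not a copy of it. The function kmesh_orthogonal is hypothetical, and pymatgen itself handles general lattices and may apply further adjustments.

```python
import math


def kmesh_orthogonal(a: float, b: float, c: float, kppvol: float) -> list:
    """Estimate the k-mesh of an orthogonal cell (lengths in Å) for a given kppvol."""
    lengths = (a, b, c)
    volume = a * b * c                              # real-space volume, Å^3
    recip_volume = (2.0 * math.pi) ** 3 / volume    # reciprocal-cell volume, Å^-3
    ngrid = kppvol * recip_volume                   # target number of k-points
    mult = (ngrid * a * b * c) ** (1.0 / 3.0)
    # Fewer divisions along longer axes; never fewer than one.
    return [int(math.floor(max(mult / length, 1))) for length in lengths]


# Conventional cell of diamond Si (a = 5.431 Å) and Nd2Fe14B (a = 8.804 Å, c = 12.205 Å).
for kppvol in (0, 20, 100, 800):
    print(kppvol, kmesh_orthogonal(5.431, 5.431, 5.431, kppvol))
print(400, kmesh_orthogonal(8.804, 8.804, 12.205, 400))
```

For an orthogonal cell the estimate reduces to floor(2π·kppvol^(1/3) / length) along each axis, so doubling a cell length roughly halves the number of divisions along it.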

What is the appropriate value for kppvol?

Here are the guidelines. We use VESTA for visualizing crystal structures.

Primitive cell of diamond Si

fig_prim_diamond fig_prim_diamond

a = b = c = 3.836 Å

kppvol | k-mesh
0 | [1, 1, 1]
20 | [4, 4, 4]
40 | [6, 6, 6]
60 | [7, 7, 7]
80 | [7, 7, 7]
100 | [8, 8, 8]
120 | [9, 9, 9]
140 | [9, 9, 9]
160 | [9, 9, 9]
180 | [10, 10, 10]
200 | [10, 10, 10]
400 | [13, 13, 13]
600 | [15, 15, 15]
800 | [17, 17, 17]

Conventional cell of diamond Si

fig_conv_diamond fig_conv_diamond

a = b = c = 5.431 Å

kppvol | k-mesh
0 | [1, 1, 1]
20 | [3, 3, 3]
40 | [3, 3, 3]
60 | [4, 4, 4]
80 | [4, 4, 4]
100 | [5, 5, 5]
120 | [5, 5, 5]
140 | [6, 6, 6]
160 | [6, 6, 6]
180 | [6, 6, 6]
200 | [6, 6, 6]
400 | [8, 8, 8]
600 | [9, 9, 9]
800 | [10, 10, 10]

Nd2Fe14B

fig_Nd2Fe12B fig_Nd2Fe12B

a = b = 8.804 Å
c = 12.205 Å

kppvol | k-mesh
0 | [1, 1, 1]
20 | [1, 1, 1]
40 | [2, 2, 1]
60 | [2, 2, 2]
80 | [3, 3, 2]
100 | [3, 3, 2]
120 | [3, 3, 2]
140 | [3, 3, 2]
160 | [3, 3, 2]
180 | [4, 4, 2]
200 | [4, 4, 3]
400 | [5, 5, 3]
600 | [6, 6, 4]
800 | [6, 6, 4]

Subsections of Data format

Subsections of Common data

Initial and optimized structure data

Initial and optimized structure data are saved in init_struc_data.pkl and opt_struc_data.pkl, respectively. pymatgen library is required to analyze these data files.

Data format

  • type: dict
    • key: structure ID
    • value: structure data
  • string form
    • {0: Structure Summary …,
      1: Structure Summary …,
      …}
  • structure data format

How to access

import pickle
with open('init_struc_data.pkl', 'rb') as f:
    init_struc_data = pickle.load(f)
with open('opt_struc_data.pkl', 'rb') as f:
    opt_struc_data = pickle.load(f)

# init_struc_data[ID], opt_struc_data[ID]

# ---------- initial structure of ID 0
cid = 0      # ID
init_struc_data[cid]    # to show initial structure of ID 0
Structure Summary
Lattice
    abc : 5.727301 5.727301 4.405757
 angles : 90.0 90.0 90.0
 volume : 144.5175386563631
      A : 5.727301 0.0 0.0
      B : 0.0 5.727301 0.0
      C : 0.0 0.0 4.405757
PeriodicSite: Si (0.2506, 5.4767, 1.1014) [0.0438, 0.9562, 0.2500]
PeriodicSite: Si (2.6130, 3.1143, 1.1014) [0.4562, 0.5438, 0.2500]
PeriodicSite: Si (3.1143, 0.2506, 1.1014) [0.5438, 0.0438, 0.2500]
PeriodicSite: Si (5.4767, 2.6130, 1.1014) [0.9562, 0.4562, 0.2500]
PeriodicSite: Si (5.4767, 0.2506, 3.3043) [0.9562, 0.0438, 0.7500]
PeriodicSite: Si (3.1143, 2.6130, 3.3043) [0.5438, 0.4562, 0.7500]
PeriodicSite: Si (2.6130, 5.4767, 3.3043) [0.4562, 0.9562, 0.7500]
PeriodicSite: Si (0.2506, 3.1143, 3.3043) [0.0438, 0.5438, 0.7500]

Result data

Common result data such as space group, energies, etc. are saved in rslt_data.pkl. pandas library is required to analyze this data file.

Data format

  • type: pandas.core.frame.DataFrame
    • row label: structure ID
  • string form
    • see below

How to access

import pickle
with open('rslt_data.pkl', 'rb') as f:
   rslt_data = pickle.load(f)


# ---------- sort by Energy
# top 5
rslt_data.sort_values(by=['E_eV_atom']).head(5)
   Spg_num Spg_sym  Spg_num_opt Spg_sym_opt  E_eV_atom  Magmom      Opt
1       98  I4_122           12        C2/m  -3.978441     NaN  not_yet
3       36  Cmc2_1           36      Cmc2_1  -3.520306     NaN  not_yet
2       16    P222           16        P222  -3.348616     NaN  not_yet
4       36  Cmc2_1            4        P2_1  -3.304168     NaN  not_yet
0      139  I4/mmm          139      I4/mmm  -3.000850     NaN     done

Random Search (RS)

Evolutionary algorithm (EA)

Bayesian Optimization (BO)

LAQA

          Subsections of Optional data

          Energy step data

          Energy step data is saved in energy_step_data.pkl if you set energy_step_flag = True in [option] section of cryspy.in. NumPy library is required to analyze this data file.

          Warning

          energy_step_flag = True is currently available only with VASP, QE, and soiap.

          Info

          In soiap, energy_step_data is collected only if loopa == 1. This is because other data (struc, force, and stress) are output only when loopa == 1. See, https://github.com/nbsato/soiap/blob/master/doc/instructions.md

          Data format

          • type: dict
            • key: structure ID
            • value: list of energy step data in each stage
          • string form
            • {0: [array([-3.4439912 , -3.55040935, -3.66697038, ..]), array([-4.0613393 , -4.05445631, -4.06159641, …]), …],
              1: [array([-2.68209823, -2.69012487, -2.68364907, ..]), array([-2.79140967, -2.79183827, -2.79206508, …]), …],
              …}
          • unit of energy
            • eV/atom

          How to access

          import pickle
          with open('energy_step_data.pkl', 'rb') as f:
              energy_step_data = pickle.load(f)
          
          # energy_step_data[ID][stage][step]
          # energy_step_data[ID][0] <-- stage 1
          # energy_step_data[ID][1] <-- stage 2
          #
          # in LAQA
          # energy_step_data[ID][selection][step]
          # energy_step_data[ID][0] <-- 1st selection
          # energy_step_data[ID][1] <-- 2nd selection
          
          # ---------- energy step data of ID 3, stage 1
          cid = 3      # ID
          stage = 1    # stage
          energy_step_data[cid][stage-1][:10]    # show only 10 energies in jupyter
          
          array([-3.4439912 , -3.55040935, -3.66697038, -3.77192063, -3.84320717,
                 -3.80679245, -3.84633935, -3.87374706, -3.89123193, -3.90422926])
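
Once loaded, the nested structure is easy to summarize. The sketch below uses a small stand-in dict with the same shape (ID → list of per-stage energy sequences) so that it runs without a pickle file. The values are the truncated numbers shown in the string form above, plain lists stand in for the NumPy arrays, and summarize is a hypothetical helper.

```python
# Stand-in for energy_step_data: {ID: [stage-1 energies, stage-2 energies, ...]}
energy_step_data = {
    0: [[-3.4439912, -3.55040935, -3.66697038],
        [-4.0613393, -4.05445631, -4.06159641]],
    1: [[-2.68209823, -2.69012487, -2.68364907],
        [-2.79140967, -2.79183827, -2.79206508]],
}


def summarize(step_data):
    """Return {ID: [(n_steps, first_energy, last_energy), ...]} with one tuple per stage."""
    summary = {}
    for cid, stages in step_data.items():
        summary[cid] = [(len(steps), float(steps[0]), float(steps[-1])) for steps in stages]
    return summary


summary = summarize(energy_step_data)
for cid, stages in summary.items():
    for i, (n_steps, e_first, e_last) in enumerate(stages, start=1):
        print(f"ID {cid}, stage {i}: {n_steps} steps, {e_first:.4f} -> {e_last:.4f} eV/atom")
```

The same loop works on the real data because NumPy arrays support len() and indexing in the same way as lists.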
          

          Structure step data

          Structure step data is saved in struc_step_data.pkl if you set struc_step_flag = True in [option] section of cryspy.in. pymatgen library is required to analyze this data file.

          Warning

          struc_step_flag = True is currently available only with VASP, QE, and soiap.

          Info

          struc_step_data includes initial structures. For example, struc_step_data[cid][0][0] is the initial structure of ID = cid.

          Data format

          • type: dict
            • key: structure ID
            • value: list of structure step data in each stage
          • string form
            • {0: [[Structure Summary …, Structure Summary, …], […], …],
              1: [[Structure Summary …, Structure Summary, …], […], …],
              …}
          • structure data format

          How to access

          import pickle
          with open('struc_step_data.pkl', 'rb') as f:
              struc_step_data = pickle.load(f)
          
          # struc_step_data[ID][stage][step]
          # struc_step_data[ID][0] <-- stage 1
          # struc_step_data[ID][1] <-- stage 2
          #
          #
          # in LAQA
          # struc_step_data[ID][selection][step]
          # struc_step_data[ID][0] <-- 1st selection
          # struc_step_data[ID][1] <-- 2nd selection
          
          # ---------- structure step data of ID 0, stage 1, step 0
          cid = 0      # ID
          stage = 1    # stage
          step = 0     # step index (start from 0)
          struc_step_data[cid][stage-1][step]    # to show initial structure of ID 0 at stage 1 in jupyter
          
          Structure Summary
          Lattice
              abc : 5.727301 5.727301 4.405757
           angles : 90.0 90.0 90.0
           volume : 144.5175386563631
                A : 5.727301 0.0 0.0
                B : 0.0 5.727301 0.0
                C : 0.0 0.0 4.405757
          PeriodicSite: Si (0.2506, 5.4767, 1.1014) [0.0438, 0.9562, 0.2500]
          PeriodicSite: Si (2.6130, 3.1143, 1.1014) [0.4562, 0.5438, 0.2500]
          PeriodicSite: Si (3.1143, 0.2506, 1.1014) [0.5438, 0.0438, 0.2500]
          PeriodicSite: Si (5.4767, 2.6130, 1.1014) [0.9562, 0.4562, 0.2500]
          PeriodicSite: Si (5.4767, 0.2506, 3.3043) [0.9562, 0.0438, 0.7500]
          PeriodicSite: Si (3.1143, 2.6130, 3.3043) [0.5438, 0.4562, 0.7500]
          PeriodicSite: Si (2.6130, 5.4767, 3.3043) [0.4562, 0.9562, 0.7500]
          PeriodicSite: Si (0.2506, 3.1143, 3.3043) [0.0438, 0.5438, 0.7500]
          

          Force step data

          Force step data is saved in force_step_data.pkl if you set force_step_flag = True in [option] section of cryspy.in. NumPy library is required to analyze this data file.

          Warning

          force_step_flag = True is currently available only with VASP, QE, and soiap.

          Data format

          • type: dict
            • key: structure ID
            • value: list of force step data in each stage
          • string form
            • {0: [array([[ 0.26314927, -0.26314927, -0. ], […], …, […]]), array([[…], …, […]]), …],
              1: [array([[ 0. , 0. , 0. ], […], …, […]]), array([[…], …, […]]), …],
              …}
          • unit of force
            • eV/Å

          How to access

          import pickle
          with open('force_step_data.pkl', 'rb') as f:
              force_step_data = pickle.load(f)
          
          # force_step_data[ID][stage][step][atom]
          # force_step_data[ID][0] <-- stage 1
          # force_step_data[ID][1] <-- stage 2
          #
          # in LAQA
          # force_step_data[ID][selection][step][atom]
          # force_step_data[ID][0] <-- 1st selection
          # force_step_data[ID][1] <-- 2nd selection
          
          # ---------- force step data of ID 0, stage 1
          cid = 0      # ID
          stage = 1    # stage
          force_step_data[cid][stage-1][:3]    # to show only 3 steps in jupyter 
          
          [array([[ 0.26314927, -0.26314927, -0.        ],
                  [-0.26314927,  0.26314927, -0.        ],
                  [ 0.26314927,  0.26314927,  0.        ],
                  [-0.26314927, -0.26314927, -0.        ],
                  [-0.26314927,  0.26314927, -0.        ],
                  [ 0.26314927, -0.26314927,  0.        ],
                  [-0.26314927, -0.26314927, -0.        ],
                  [ 0.26314927,  0.26314927,  0.        ]]),
           array([[-0.12103692,  0.12103692,  0.        ],
                  [ 0.12103692, -0.12103692, -0.        ],
                  [-0.12103692, -0.12103692, -0.        ],
                  [ 0.12103692,  0.12103692,  0.        ],
                  [ 0.12103692, -0.12103692, -0.        ],
                  [-0.12103692,  0.12103692,  0.        ],
                  [ 0.12103692,  0.12103692,  0.        ],
                  [-0.12103692, -0.12103692, -0.        ]]),
           array([[-0.29801618,  0.29801618,  0.        ],
                  [ 0.29801618, -0.29801618, -0.        ],
                  [-0.29801618, -0.29801618, -0.        ],
                  [ 0.29801618,  0.29801618,  0.        ],
                  [ 0.29801618, -0.29801618, -0.        ],
                  [-0.29801618,  0.29801618,  0.        ],
                  [ 0.29801618,  0.29801618,  0.        ],
                  [-0.29801618, -0.29801618, -0.        ]])]
          
          step = 0     # step index (start from 0)
          atom = 2     # atom index (start from 0)
          force_step_data[cid][stage-1][step][atom]
          
          array([0.26314927, 0.26314927, 0.        ])
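
As a minimal sketch of further analysis, the largest force magnitude at each step can be computed with NumPy to follow how a relaxation converges. The dictionary below is a small synthetic stand-in with the same layout as force_step_data (two atoms, made-up values); in practice you would use the object loaded from force_step_data.pkl. The helper name max_force_per_step is hypothetical and not part of CrySPY.

```python
import numpy as np

# Synthetic stand-in with the documented layout:
# force_step_data[ID][stage][step] is an (n_atoms, 3) array in eV/Å.
force_step_data = {
    0: [  # ID 0
        [  # stage 1
            np.array([[0.3, -0.4, 0.0], [-0.3, 0.4, 0.0]]),  # step 0
            np.array([[0.0, 0.1, 0.0], [0.0, -0.1, 0.0]]),   # step 1
        ],
    ],
}


def max_force_per_step(steps):
    """Return the largest per-atom force norm (eV/Å) for each step."""
    return [float(np.linalg.norm(f, axis=1).max()) for f in steps]


cid, stage = 0, 1
fmax = max_force_per_step(force_step_data[cid][stage - 1])
for step, value in enumerate(fmax):
    print(f"step {step}: max |F| = {value:.3f} eV/Å")
```

The same function can be applied to any ID and stage of the real data, since each step is stored as one array with one row per atom.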
          

          Stress step data

          Stress step data is saved in stress_step_data.pkl if you set stress_step_flag = True in the [option] section of cryspy.in. The NumPy library is required to analyze this data file.

          Warning

          stress_step_flag = True is currently available only with VASP, QE, and soiap.

          Data format

          • type: dict
            • key: structure ID
            • value: list of stress step data in each stage
          • string form
            • {0: [array([[-0.16770062, 0. , 0. ], […], […]]), array([[…], […], […]]), …],
              1: [array([[ 0.39260083, -0. , -0. ], […], […]]), array([[…], […], […]]), …],
              …}
          • unit of stress
            • eV/(Å**3)

          How to access

          import pickle
          with open('stress_step_data.pkl', 'rb') as f:
              stress_step_data = pickle.load(f)
          
          # stress_step_data[ID][stage][step]    <-- 3x3 stress tensor
          # stress_step_data[ID][0] <-- stage 1
          # stress_step_data[ID][1] <-- stage 2
          #
          # in LAQA
          # stress_step_data[ID][selection][step]
          # stress_step_data[ID][0] <-- 1st selection
          # stress_step_data[ID][1] <-- 2nd selection
          
          # ---------- stress step data of ID 0, stage 1
          cid = 0      # ID
          stage = 1    # stage
          stress_step_data[cid][stage-1][:3]    # to show only 3 steps in jupyter 
          
          [array([[-0.16770062,  0.        ,  0.        ],
                  [ 0.        , -0.16770062, -0.        ],
                  [ 0.        ,  0.        ,  0.21823358]]),
           array([[-0.16020785, -0.        , -0.        ],
                  [-0.        , -0.16020785,  0.        ],
                  [-0.        ,  0.        ,  0.18646321]]),
           array([[-0.13572003, -0.        ,  0.        ],
                  [-0.        , -0.13572003,  0.        ],
                  [-0.        ,  0.        ,  0.15953926]])]
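
Since the stress is stored in eV/(Å**3), converting it to GPa is often convenient. The sketch below uses the conversion 1 eV/Å**3 ≈ 160.21766 GPa and the first step of the output above as sample input; the variable names are illustrative only.

```python
import numpy as np

EV_PER_A3_TO_GPA = 160.21766208  # 1 eV/Å**3 expressed in GPa

# First step of ID 0, stage 1, copied from the output above (eV/Å**3).
stress = np.array([
    [-0.16770062, 0.0, 0.0],
    [0.0, -0.16770062, 0.0],
    [0.0, 0.0, 0.21823358],
])

stress_gpa = stress * EV_PER_A3_TO_GPA
mean_diag_gpa = float(np.trace(stress_gpa)) / 3.0  # mean of the diagonal

print(np.round(stress_gpa, 3))
print(f"mean diagonal stress: {mean_diag_gpa:.3f} GPa")
```

Whether a positive value means tension or compression depends on the sign convention of the calculation code, so check that before interpreting the mean diagonal as a pressure.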
          

          Subsections of CrySPY utility

          repeat_cryspy

          Link: CrySPY_utility/script/repeat_cryspy

          Running cryspy by hand over and over again is tedious; the repeat_cryspy script automates it.
          By default, the script runs cryspy every 5 minutes. The time interval can be adjusted by editing the following line of the script.

              sleep 300    # seconds
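
For readers who prefer Python, the same idea (run a command, wait, repeat) can be sketched as follows. This is not the content of repeat_cryspy, which is a shell script; the function run_repeatedly and its parameters are hypothetical, and a harmless placeholder command is used so that the example runs anywhere.

```python
import subprocess
import sys
import time


def run_repeatedly(cmd, interval, max_runs):
    """Run cmd max_runs times, sleeping interval seconds between runs."""
    return_codes = []
    for i in range(max_runs):
        result = subprocess.run(cmd, check=False)
        return_codes.append(result.returncode)
        if i < max_runs - 1:
            time.sleep(interval)
    return return_codes


# Placeholder command; for CrySPY this would be ["cryspy"] with
# interval=300 to mirror the default "sleep 300".
codes = run_repeatedly([sys.executable, "-c", "pass"], interval=0.1, max_runs=2)
print(codes)
```

Bounding the loop with max_runs is a design choice of this sketch so that it terminates on its own.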
          

          Usage

          1. copy repeat_cryspy to the working directory
          2. (optional) edit the time interval in repeat_cryspy
          3. run the script

          You can use the nohup command to keep the job running even after logging out.

          [bash]

          nohup ./repeat_cryspy &
          

          [zsh]

          nohup ./repeat_cryspy &!
          

          extract_struc.py

          2023 April 16 update

          Link: CrySPY_utility/script/extract_struc.py

          Script to extract structures from init_struc_data.pkl or opt_struc_data.pkl. This script can print structure information and output cif files.

          Specify structure ID(s) with the -i option. The top k structures (the k most stable structures) can be extracted with the -t option. The -a option outputs all the structures (note that many cif files will be written). Symmetrized cif files are generated with the -s option; when doing so, you can also specify a tolerance with --tolerance. Structure information is printed with -p; in that case, cif files are not output. Gzipped files (e.g., opt_struc_data.pkl.gz) can also be read.

          Update History

          • 2024 April 16: --tolerance option, gzip
          • 2023 July 21: --print option

          Usage

          python3 extract_struc.py -h
          

          or if you put the script in your PATH, you can omit python3

          extract_struc.py -h
          
          usage: extract_struc.py [-h] [-p] [-a] [-i [INDEX ...]] [-t TOP] [-r] [-s] [--tolerance TOLERANCE] infile
          
          positional arguments:
            infile                input file
          
          options:
            -h, --help            show this help message and exit
            -p, --print           just print, e.g., extract_struc.py opt_struc_data.pkl -i 7 10 12 -ps
            -a, --all_id          all structures, e.g., extract_struc.py opt_struc_data.pkl -as
            -i [INDEX ...], --index [INDEX ...]
                                  structure ID, e.g., extract_struc.py opt_struc_data.pkl -i 7 10 12 -s
            -t TOP, --top TOP     top k structures, e.g. (k = 3), extract_struc.py opt_struc_data.pkl -t 3 -s
            -r, --rank            add rank in file names, e.g., extract_struc.py opt_struc_data.pkl -t 3 -rs
            -s, --symmetrized     symmetrized structure, e.g., extract_struc.py opt_struc_data.pkl -i 7 10 12 -s
            --tolerance TOLERANCE
                                  tolerance for symmetrization (default 0.01), e.g., extract_struc.py opt_struc_data.pkl -i 0 1 -s --tolerance 0.01
          

          Examples

          Print

          The -p option can be used in combination with any option except for -s option.

          extract_struc.py -p opt_struc_data.pkl -i 0 1
          
          ID 0
          Full Formula (Na8 Cl8)
          Reduced Formula: NaCl
          abc   :   6.823618   6.823618   7.566454
          angles:  90.000000  90.000000  96.650518
          pbc   :       True       True       True
          Sites (16)
            #  SP           a         b         c
          ---  ----  --------  --------  --------
            0  Na    0         0         1
            1  Na    0         0         0.5
            2  Na    0.704707  0.295293  0.75
            3  Na    0.295293  0.704707  0.25
            4  Na    0.5       0         1
            5  Na    0.5       0         0.5
            6  Na    0         0.5       0.5
            7  Na    0         0.5       0
            8  Cl    0.5       0.5       0
            9  Cl    0.5       0.5       0.5
           10  Cl    0.484753  0.515247  0.75
           11  Cl    0.515247  0.484753  0.25
           12  Cl    0.828247  0.171753  0.851096
           13  Cl    0.171753  0.828247  0.351096
           14  Cl    0.828247  0.171753  0.648904
           15  Cl    0.171753  0.828247  0.148904
          
          ID 1
          Full Formula (Na8 Cl8)
          Reduced Formula: NaCl
          abc   :   8.145021   8.145021   4.324235
          angles:  90.000000  90.000000 120.000000
          pbc   :       True       True       True
          Sites (16)
            #  SP            a          b         c
          ---  ----  ---------  ---------  --------
            0  Na     0.666667   0.333333  0.736206
            1  Na     0.666667   0.333333  0.263794
            2  Na     0.913147   0.086853  0.5
            3  Na     0.913147   0.826295  0.5
            4  Na     0.173705   0.086853  0.5
            5  Na     0.77711    0.22289   0
            6  Na     0.77711    0.55422   0
            7  Na     0.44578    0.22289   0
            8  Cl     0.027675   0.423376  0.5
            9  Cl    -0.423376  -0.395701  0.5
           10  Cl     0.395701  -0.027675  0.5
           11  Cl    -0.423376  -0.027675  0.5
           12  Cl     0.395701   0.423376  0.5
           13  Cl     0.027675  -0.395701  0.5
           14  Cl     0.333333   0.666667  0.5
           15  Cl     0          0         0
          

          Structure ID

          extract_struc.py opt_struc_data.pkl -i 7 10 12
          

          7.cif, 10.cif, and 12.cif are output.

          For symmetrized cif,

          extract_struc.py opt_struc_data.pkl -i 7 10 12 -s
          

          2024 April 16
          With the tolerance parameter (default 0.01)

          extract_struc.py opt_struc_data.pkl -i 7 10 12 -s --tolerance 0.01
          

          Top k structures

          Info

          rslt_data.pkl is required in the same directory as the input.

          Let us suppose

          • ./data/pkl_data/opt_struc_data.pkl
          • ./data/pkl_data/rslt_data.pkl

          and cryspy_rslt_energy_asc file is as follows:

              Spg_num     Spg_sym  Spg_num_opt Spg_sym_opt    E_eV_atom  Magmom      Opt
          9       110      I4_1cd          110      I4_1cd -1284.708037     NaN  not_yet
          16        4        P2_1            4        P2_1 -1284.693651     NaN     done
          97       92    P4_12_12           91      P4_122 -1284.692494     NaN     done
          8        57        Pbcm           57        Pbcm -1284.668504     NaN     done
          81       19  P2_12_12_1           19  P2_12_12_1 -1284.635684     NaN     done
          ...
          

          Top k(=3) structures can be extracted with:

          extract_struc.py ./data/pkl_data/opt_struc_data.pkl -t 3
          

          In this example, rslt_data.pkl must be in ./data/pkl_data/. 9.cif, 16.cif, and 97.cif are output.

          The rank can be included in cif file names with -r option:

          extract_struc.py ./data/pkl_data/opt_struc_data.pkl -t 3 -r
          

          1_9.cif, 2_16.cif, and 3_97.cif are output.
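
The mapping from the energy table to the output file names can be illustrated with a few lines of Python. This is only a sketch of the naming rule described above, using the energies from the table; top_k_names is a hypothetical helper and does not reflect the internals of extract_struc.py.

```python
# (structure ID: energy in eV/atom) taken from the table above
energies = {
    9: -1284.708037,
    16: -1284.693651,
    97: -1284.692494,
    8: -1284.668504,
    81: -1284.635684,
}


def top_k_names(energy_by_id, k, with_rank=False):
    """Return cif file names of the k lowest-energy structures."""
    ranked = sorted(energy_by_id, key=energy_by_id.get)[:k]
    if with_rank:
        return [f"{rank}_{cid}.cif" for rank, cid in enumerate(ranked, start=1)]
    return [f"{cid}.cif" for cid in ranked]


print(top_k_names(energies, 3))                  # like -t 3
print(top_k_names(energies, 3, with_rank=True))  # like -t 3 -r
```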

          For symmetrized cif:

          extract_struc.py ./data/pkl_data/opt_struc_data.pkl -t 3 -rs
          

          All the structures

          Because one cif file is written per structure, create a dedicated directory first.

          mkdir init_cifs
          cd init_cifs
          extract_struc.py /path/to/init_struc_data.pkl -a
          

          For symmetrized cif,

          extract_struc.py /path/to/init_struc_data.pkl -as
          

          Gzipped files

          2024 April 16
          Gzipped files (end with .gz) can be read:

          extract_struc.py opt_struc_data.pkl.gz -i 0 1 -s
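
If you want to read a gzipped pickle file in your own analysis script, the standard gzip module is sufficient. The following is a minimal sketch with placeholder data written to a temporary directory; load_pkl is a hypothetical helper, not a CrySPY function. Loading the real structure files is expected to additionally require pymatgen, since they hold structure objects.

```python
import gzip
import os
import pickle
import tempfile


def load_pkl(path):
    """Load a pickle file, transparently handling a .gz suffix."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rb") as f:
        return pickle.load(f)


# Round-trip demonstration with placeholder data in a temporary directory.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example_data.pkl.gz")
    with gzip.open(path, "wb") as f:
        pickle.dump({0: "structure 0", 1: "structure 1"}, f)
    data = load_pkl(path)

print(sorted(data))
```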
          

          print_pkl.py

          2024 May 31

          When you want to quickly check the pickled files under data/pkl_data/, using print_pkl.py is convenient.

          Usage

          python3 print_pkl.py xxxx.pkl
          

          or if you put the script in your PATH, you can omit python3

          print_pkl.py xxxx.pkl
          

          Example

          print_pkl.py init_struc_data.pkl
          
          Number of structures: 10
          dict_keys([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
          
          print_pkl.py input_data.pkl 
          
          [basic]
          algo = RS
          calc_code = ASE
          tot_struc = 10
          nstage = 1
          njob = 5
          jobcmd = zsh
          jobfile = job_cryspy
          
          [structure]
          struc_mode = crystal
          natot = 8
          atype = ('Cu', 'Au')
          nat = (4, 4)
          mindist_factor = 1.0
          vol_factor = 1.1
          symprec = 0.01
          spgnum = all
          use_find_wy = False
          
          [option]
          stop_chkpt = 0
          load_struc_flag = False
          stop_next_struc = False
          append_struc_ea = False
          energy_step_flag = False
          struc_step_flag = False
          force_step_flag = False
          stress_step_flag = False
          
          [ASE]
          kpt_flag = False
          force_gamma = False
          ase_python = ase_in.py
          
          print_pkl.py elite_struc.pkl
          
          Number of structures: 2
          dict_keys([3, 6])
          
          print_pkl.py elite_fitness.pkl
          
          {3: -325.79973412221455, 6: -324.8381948581405}
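
The kind of quick inspection shown above can also be done directly in Python. The sketch below is not the implementation of print_pkl.py; summarize_pkl is a hypothetical helper that reports the type, size, and keys of a pickled object, demonstrated on a temporary file holding the elite_fitness values from the example.

```python
import os
import pickle
import tempfile


def summarize_pkl(path):
    """Return a short text summary of a pickled object."""
    with open(path, "rb") as f:
        obj = pickle.load(f)
    lines = [f"type: {type(obj).__name__}"]
    if isinstance(obj, dict):
        lines.append(f"Number of entries: {len(obj)}")
        lines.append(f"keys: {list(obj)}")
    elif hasattr(obj, "__len__"):
        lines.append(f"length: {len(obj)}")
    else:
        lines.append(repr(obj))
    return "\n".join(lines)


# Demonstration on a temporary file with the values shown above.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example_fitness.pkl")
    with open(path, "wb") as f:
        pickle.dump({3: -325.79973412221455, 6: -324.8381948581405}, f)
    summary = summarize_pkl(path)

print(summary)
```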
          

          pos2pkl.py

          2023 July 23 update

          Script to convert structure data into init_struc_data.pkl. The default input format is init_POSCARS. Single-structure files such as POSCAR and cif files can optionally be converted as well. The output is init_struc_data.pkl. Structure data can be appended to an existing init_struc_data.pkl. Structure IDs in the input are ignored and new IDs are assigned. If the number of atoms differs among the structures, an error is raised.

          init_struc_data.pkl can be loaded at the start of the simulation in CrySPY.

          You can remove and sort species with the -f option. Note that without this option, pymatgen sorts the species in order of electronegativity!

          Usage

          usage: pos2pkl.py [-h] [-s [SINGLE ...]] [-f [FILTER ...]] [-p] [infile ...]
          
          positional arguments:
            infile                input file: init_POSCARS
          
          options:
            -h, --help            show this help message and exit
            -s [SINGLE ...], --single [SINGLE ...]
                                  input file: single structure file (POSCAR, cif)
            -f [FILTER ...], --filter [FILTER ...]
                                  filter (sort): remove species and sort
            -p, --permit_diff_comp
                                  flag for permitting different composition
          

          Examples

          init_POSCARS --> init_struc_data.pkl

          It can be used to convert init_POSCARS generated by CrySPY to init_struc_data.pkl on another machine such as a supercomputer. Multiple input files can be converted.

          python3 pos2pkl.py init_POSCARS
          

          If you put the pos2pkl.py in your PATH, you can omit python3.

          pos2pkl.py init_POSCARS
          
          Composition: Na8 Cl8
          
          Converted. The number of structures: 4
          Save init_struc_data.pkl
          

          Multiple inputs:

          python3 pos2pkl.py init_POSCARS init_POSCARS2 init_POSCARS3
          
          Composition: Na8 Cl8
          
          Converted. The number of structures: 12
          Save init_struc_data.pkl
          

          If init_struc_data.pkl already exists in the current directory and you want to append to it:

          python3 pos2pkl.py init_POSCARS
          
          init_struc_data.pkl already exists.
          Append to init_struc_data.pkl? [y/n]: y
          
          Load init_struc_data
          Composition: Na8 Cl8
          The number of structures: 12
          
          Converted. The number of structures: 16
          Save init_struc_data.pkl
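
The statement that IDs are newly assigned when appending can be pictured with a small dictionary example. This is a sketch under the assumption that new IDs simply continue after the existing ones; append_structures is a hypothetical helper, strings stand in for structure objects, and the actual behavior of pos2pkl.py should be checked in the script itself.

```python
def append_structures(existing, new_structures):
    """Append structures, assigning fresh consecutive IDs (assumed rule)."""
    merged = dict(existing)
    next_id = max(merged, default=-1) + 1
    for struc in new_structures:
        merged[next_id] = struc
        next_id += 1
    return merged


existing = {i: f"old_{i}" for i in range(12)}  # 12 structures already stored
new = [f"new_{i}" for i in range(4)]           # 4 structures to append
merged = append_structures(existing, new)
print(len(merged), list(merged)[-4:])
```

The counts mirror the example above, where 12 existing structures plus 4 new ones give 16 in total.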
          

          POSCAR or cif --> init_struc_data.pkl

          Single structure data such as POSCAR and cif files can also be converted. -s/--single option is required.

          python3 pos2pkl.py -s POSCAR test.cif
          
          Composition: Na8 Cl8
          
          Converted. The number of structures: 2
          Save init_struc_data.pkl
          

          init_POSCARS, POSCAR --> init_struc_data.pkl

          python3 pos2pkl.py init_POSCARS -s POSCAR
          
          Composition: Na8 Cl8
          
          Converted. The number of structures: 5
          Save init_struc_data.pkl
          
          Warning

          The following order is wrong: because init_POSCARS comes after -s, it is also treated as a single-structure file.

          python3 pos2pkl.py -s POSCAR init_POSCARS
          

          Filter (remove and sort)

          Here we consider a cif file with the composition of Sr8 Co8 O20 X4, including 4 dummy atoms (X4). The -f/--filter option can be used to remove and sort species. Specify the species in the same order as atype in cryspy.in.

          python3 pos2pkl.py -s Sr8Co8O20X4.cif -f Sr Co O
          
          Removed species: {'X0+'}
          Composition: Sr8 Co8 O20
          
          Converted. The number of structures: 1
          Save init_struc_data.pkl
          

          With extract_struc.py you can see how it was registered in init_struc_data.pkl.

          python3 extract_struc.py init_struc_data.pkl -pa
          
          ID 0
          Full Formula (Sr8 Co8 O20)
          Reduced Formula: Sr2Co2O5
          ...
          

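          The -f option also lets you sort the species. Species that are not listed are removed, so Sr is dropped as well in the following example.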

          python3 pos2pkl.py -s Sr8Co8O20X4.cif -f O Co 
          
          Removed species: {'Sr', 'X0+'}
          Composition: O20 Co8
          
          Converted. The number of structures: 1
          Save init_struc_data.pkl
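
The remove-and-sort behavior seen in the two runs above can be mimicked on a plain list of species labels. This is only an illustration of the rule, not the code of pos2pkl.py; filter_and_sort is a hypothetical helper, and the dummy species is written as "X" for simplicity.

```python
from collections import Counter


def filter_and_sort(species, keep_order):
    """Drop species not in keep_order and sort the rest in that order."""
    removed = {s for s in species if s not in keep_order}
    kept = sorted(
        (s for s in species if s in keep_order),
        key=keep_order.index,
    )
    return kept, removed


# Site labels of a made-up Sr8 Co8 O20 X4 cell ("X" is the dummy species).
species = ["O"] * 20 + ["X"] * 4 + ["Co"] * 8 + ["Sr"] * 8

kept, removed = filter_and_sort(species, ["Sr", "Co", "O"])
print(removed, dict(Counter(kept)))

kept2, removed2 = filter_and_sort(species, ["O", "Co"])
print(removed2, dict(Counter(kept2)))
```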
          

          kpt_check.py

          kpt_check.py checks the k-point mesh obtained with a given kppvol. This script supports POSCAR, CONTCAR, and init_struc_data.pkl. The pymatgen library is required.

          After generating initial structures, you can use it to judge what value of kppvol is appropriate.

          Usage

          python3 kpt_check.py -h
          

          or if you put the script in your PATH, you can omit python3

          kpt_check.py -h
          
          usage: kpt_check.py [-h] [-w] [-n NSTRUC] infile kppvol
          
          positional arguments:
            infile                input file: POSCAR, CONTCAR, or init_struc_data.pkl
            kppvol                kppvol
          
          options:
            -h, --help            show this help message and exit
            -w, --write           write KPOINTS
            -n NSTRUC, --nstruc NSTRUC
                                  number of structure to check
          

          Example

          POSCAR with a given kppvol

          kpt_check.py POSCAR 100
          
          a = 10.689217
          b = 10.689217
          c = 10.730846
              Lattice vector
          10.689217 0.000000 0.000000
          0.000000 10.689217 0.000000
          0.000000 0.000000 10.730846
          
          kppvol:  100
          k-points:  [2, 2, 2]
          

          Write KPOINTS file

          You can generate a KPOINTS file using -w option.

          kpt_check.py -w POSCAR 100
          
          $ cat KPOINTS
          pymatgen 4.7.6+ generated KPOINTS with grid density = 607 / atom
          0
          Monkhorst
          2 2 2
          

          Check k-point meshes for init_struc_data.pkl

          When checking k-point meshes for init_struc_data.pkl, the first five structures are checked by default. You can change the number of structures with the -n option.

          kpt_check.py -n 3 init_struc_data.pkl 100
          
          # ---------- 0th structure
          a = 8.0343076893
          b = 8.03430768936
          c = 9.1723323373
              Lattice vector
          8.034308 0.000000 0.000000
          -4.017154 6.957915 0.000000
          0.000000 0.000000 9.172332
          
          kppvol:  100
          k-points:  [3, 3, 3]
          
          
          # ---------- 1th structure
          a = 9.8451944096
          b = 9.84519440959
          c = 6.8764313585
              Lattice vector
          9.845194 0.000000 0.000000
          -4.922597 8.526188 0.000000
          0.000000 0.000000 6.876431
          
          kppvol:  100
          k-points:  [3, 3, 4]
          
          
          # ---------- 2th structure
          a = 7.5760383679
          b = 7.57603836797
          c = 6.6507478296
              Lattice vector
          7.576038 0.000000 0.000000
          -3.788019 6.561042 0.000000
          0.000000 0.000000 6.650748
          
          kppvol:  100
          k-points:  [4, 4, 4]
          

          Subsections of FAQ

          Can I change njob in the middle of the simulation?

          2024 May 7

          Yes, you can change it whenever you want.

          Below is an example of how the behavior changes when you reduce njob.

          Warning

          CrySPY version 1.2.3 and earlier have a bug, so avoid reducing njob with those versions.

          Currently, with njob = 4, jobs for structures with IDs 0, 1, 2, and 3 are running.
          Let’s say we change njob from 4 to 2.

          $ cryspy
          [2024-04-28 18:27:41,847][cryspy_restart][INFO] 
          
          
          Restart CrySPY 1.2.4
          
          
          [2024-04-28 18:27:41,848][read_input][INFO] Changed njob from 4 to 2
          [2024-04-28 18:27:42,335][ctrl_job][INFO] # ---------- job status
          [2024-04-28 18:27:42,335][ctrl_job][INFO] ID      0: still queueing or running
          [2024-04-28 18:27:42,335][ctrl_job][INFO] ID      1: still queueing or running
          

          Because njob was reduced to 2, only IDs 0 and 1 are checked; IDs 2 and 3 are ignored for now.

          $ cryspy
          [2024-04-28 18:29:25,250][cryspy_restart][INFO] 
          
          
          Restart CrySPY 1.2.4
          
          
          [2024-04-28 18:29:25,744][ctrl_job][INFO] # ---------- job status
          [2024-04-28 18:29:25,744][ctrl_job][INFO] ID      0: Stage 1 Done!
          [2024-04-28 18:29:25,757][ctrl_job][INFO]     submitted job, ID      0 Stage 2
          [2024-04-28 18:29:25,758][ctrl_job][INFO] ID      1: Stage 1 Done!
          [2024-04-28 18:29:25,767][ctrl_job][INFO]     submitted job, ID      1 Stage 2
          

          Once the jobs for IDs 0 and 1 are finished, CrySPY proceeds to check the next two jobs (IDs 2 and 3).

          $ cryspy
          [2024-04-28 18:31:30,830][cryspy_restart][INFO]
          
          
          Restart CrySPY 1.2.4
          
          
          [2024-04-28 18:31:31,329][ctrl_job][INFO] # ---------- job status
          [2024-04-28 18:31:31,329][ctrl_job][INFO] ID      0: Stage 2 Done!
          [2024-04-28 18:31:31,329][collect_vasp][WARNING]     Structure ID 0, could not obtain energy from OSZICAR
          [2024-04-28 18:31:31,333][ctrl_job][INFO]     collect results: E = nan eV/atom
          [2024-04-28 18:31:31,341][ctrl_job][INFO] ID      1: Stage 2 Done!
          [2024-04-28 18:31:31,341][collect_vasp][WARNING]     Structure ID 1, could not obtain energy from OSZICAR
          [2024-04-28 18:31:31,342][ctrl_job][INFO]     collect results: E = nan eV/atom
          [2024-04-28 18:31:31,347][cryspy][INFO] 
          
          recheck 1
          
          [2024-04-28 18:31:31,347][ctrl_job][INFO] # ---------- job status
          [2024-04-28 18:31:31,347][ctrl_job][INFO] ID      2: Stage 1 Done!
          [2024-04-28 18:31:31,358][ctrl_job][INFO]     submitted job, ID      2 Stage 2
          [2024-04-28 18:31:31,358][ctrl_job][INFO] ID      3: Stage 1 Done!
          [2024-04-28 18:31:31,368][ctrl_job][INFO]     submitted job, ID      3 Stage 2
          

          PNG (transparent background)

          logo_png1

          logo_png2

          logo_png3

          logo_png4

          JPG

          logo_jpg1

          logo_jpg2

          logo_jpg3

          logo_jpg4