Document

CrySPY (pronounced as crispy) is a crystal structure prediction tool written in Python.
CrySPY automates the following:

Structure generation
Submitting jobs for structure optimization
Collecting data for structure optimization
Selecting candidates using machine learning

CrySPY can be installed with pip install csp-cryspy.

Latest version

CrySPY 1.4.1 (2025 July 7)

News

[2025 July 7] CrySPY 1.4.1 released. Version information/version 1.4.1
- add_max, elim_max, and subs_max in EA-vc
- Charge neutral condition in EA-vc
- cryspy-Eplot subcommand
[2025 June 17] CrySPY 1.4.0 released. Version information/version 1.4.0
- Variable-composition evolutionary algorithm
- Interactive mode
[2024 May 31] CrySPY 1.3.0 released. Version information/version 1.3.0
- There are important changes. See version information.
[2024 May 10] CrySPY 1.2.5 released. Version information/version 1.2.5
- Bug fix
[2024 May 7] (Document) FAQ page
[2024 May 7] CrySPY 1.2.4 released. Version information/version 1.2.4
- Bug fix
[2024 April 24] (Document) Tutorial > Random Search (RS) > VASP
[2023 October 21] CrySPY 1.2.3 released. Version information/version 1.2.3
- Bug fix for MPI
[2023 October 18] CrySPY 1.2.2 released. Version information/version 1.2.2
- Enthalpy
[2023 September 27] CrySPY 1.2.1 released. Version information/version 1.2.1
- Bug fix for ASE interface
[2023 August 8] (Document, Utility) Upload the example of CHGNET
[2023 July 10] CrySPY 1.2.0 released. Version information/version 1.2.0
- Interface for ASE
- Adoption of logging
[2023 June 14] CrySPY 1.1.1 released. Version information/version 1.1.1
- bug fix
[2023 May 16] CrySPY 1.1.0 released. Version information/version 1.1.0
- MPI parallelization (optional)
- New score of LAQA
[2023 March 16] CrySPY 1.0.0 released. Version information/version 1.0.0
- CrySPY is available in PyPI, so you can install by pip.

Discussions

Discussions in GitHub (questions and comments)

License

CrySPY is distributed under the MIT License
Copyright (c) 2018 CrySPY Development Team

Code contributors

Tomoki Yamashita and Lab members (Nagaoka University of Technology)
Nobuya Sato (National Institute of Advanced Industrial Science and Technology)
Hiori Kino (National Institute for Materials Science)
Kei Terayama (Yokohama City University)
Hikaru Sawahata (Kanazawa University)
Shinichi Kanehira (Osaka University)

Reference

CrySPY (software)
- T. Yamashita, S. Kanehira, N. Sato, H. Kino, H. Sawahata, T. Sato, F. Utsuno, K. Tsuda, T. Miyake, and T. Oguchi,
  “CrySPY: a crystal structure prediction tool accelerated by machine learning”,
  Sci. Technol. Adv. Mater. Meth. 1, 87 (2021). Link
Bayesian optimization
- T. Yamashita, N. Sato, H. Kino, T. Miyake, K. Tsuda, and T. Oguchi,
  “Crystal structure prediction accelerated by Bayesian optimization”,
  Phys. Rev. Mater. 2, 013803 (2018). Link
- N. Sato, T. Yamashita, T. Oguchi, K. Hukushima, and T. Miyake,
  “Adjusting the descriptor for a crystal structure search using Bayesian optimization”,
  Phys. Rev. Mater. 4, 033801 (2020). Link
Bayesian optimization and evolutionary algorithm
- T. Yamashita, H. Kino, K. Tsuda, T. Miyake, and T. Oguchi,
  “Hybrid algorithm of Bayesian optimization and evolutionary algorithm in crystal structure prediction”,
  Sci. Technol. Adv. Mater. Meth. 2, 67 (2022). Link
LAQA
- K. Terayama, T. Yamashita, T. Oguchi, and K. Tsuda,
  “Fine-grained optimization method for crystal structure prediction”,
  npj Comput. Mater. 4, 32 (2018). Link
- T. Yamashita and H. Sekine,
  “Improvement of look ahead based on quadratic approximation for crystal structure prediction”,
  Sci. Technol. Adv. Mater. Meth. 2, 84 (2022). Link

Link

GitHub repo GitHub discussions CrySPY utility

Version information

Version 1.4.1

Important change

EA-vc

The number of atoms operated on in addition, elimination, and substitution can now be changed. The number is randomly selected up to 3 by default (in 1.4.0, it was always 1).
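
As a rough illustration of this change, the sketch below draws the number of atoms that one operation acts on from 1 up to an upper limit. The function name and the uniform distribution are assumptions made for illustration; only the default upper limit of 3 and the former fixed value of 1 come from the text above.

```python
import random


def pick_num_atoms(max_atoms=3, rng=None):
    """Return how many atoms one operation acts on (hypothetical helper)."""
    if max_atoms < 1:
        raise ValueError("max_atoms must be at least 1")
    if rng is None:
        rng = random.Random()
    # Assumption: every count from 1 to max_atoms is equally likely.
    return rng.randint(1, max_atoms)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
counts = [pick_num_atoms(3, rng) for _ in range(10)]
```

Setting max_atoms=1 in this sketch reproduces the 1.4.0 behavior, where exactly one atom was always operated on.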

Added

Charge neutral condition

It is now possible to impose a charge neutrality condition during structure generation in EA-vc.
- Input file > [structure] section
- Features > Charge neutrality condition

Subcommand

cryspy-Eplot: (CrySPY > Features > Re-plot convex hull)

Version 1.4.0

Important change

New algorithm: EA-vc

Currently, only the ASE interface is supported. Support for VASP and others is planned. CrySPY > Interface
CrySPY > Tutorial > Variable-composition evolutionary algorithm (EA-vc)
CrySPY > Search algorithms > Variable-composition evolutionary algorithm (EA-vc)

Interactive mode

RS, EA, and EA-vc can be run from the Jupyter environment.
Only the ASE interface is supported.
CrySPY > Tutorial > Interactive mode (Jupyter Notebook)
CrySPY > Features > Interactive mode

EA

tot_struc is no longer used in EA. The number of structures in the first generation is now determined by n_pop.

Interatomic distance check after structure optimization

Added check_mindist_opt to the [option] section in cryspy.in.
Default: check_mindist_opt = True.
After structure relaxation, a check is performed to ensure that the minimum interatomic distance constraint is satisfied.

Common

The natot parameter in cryspy.in has been removed.
Ctrl_ext has been removed.

Fixed

Fixed a bug related to using TS as the score in BO.
Several other minor fixes.

Version 1.3.0

Important change

Common

working directory name
work000000 –> work0
We used to pickle data by grouping several items into tuples, but now each item is pickled individually.
For example, rs_id_data.pkl –> id_queueing.pkl and id_running.pkl
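
The idea of one pickle file per item can be sketched as follows. This is a minimal illustration, not CrySPY's actual code: the helper names and the list contents are hypothetical, and only the file names id_queueing.pkl and id_running.pkl come from the text above.

```python
import pickle
import tempfile
from pathlib import Path


def save_item(directory, name, obj):
    """Pickle one object into its own file, e.g. id_queueing.pkl."""
    path = Path(directory) / f"{name}.pkl"
    with open(path, "wb") as f:
        pickle.dump(obj, f)
    return path


def load_item(directory, name):
    """Load a single pickled object without touching the other files."""
    with open(Path(directory) / f"{name}.pkl", "rb") as f:
        return pickle.load(f)


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical contents; the real data layout is not specified here.
    save_item(tmp, "id_queueing", [3, 4])
    save_item(tmp, "id_running", [0, 1, 2])
    running = load_item(tmp, "id_running")
    saved_files = sorted(p.name for p in Path(tmp).iterdir())
```

With separate files, one item can be read or updated without unpacking a tuple that bundles unrelated data.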

BO

Using cal_fingerprint program is obsolete. dscribe is required instead.
- fppath and fp_rmin in cryspy.in are obsolete.
Changed the Bayesian optimization library from COMBO to PHYSBO
See Installation > CrySPY > CrySPY 1.3.0 or later

Fixed

soiap

support for recent pymatgen

Added

Random structure generation and structure generation by EA are now available as libraries. See Features > As library

for developer

We stopped using global variables (rin); input data is now stored in a dataclass.
Many of the input variables were lists, but we changed them to tuples.
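
A minimal sketch of this pattern is shown below, assuming a frozen dataclass with tuple fields. The class name StructureInput and the choice of fields are hypothetical and do not reflect the actual class used in CrySPY.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StructureInput:
    """Hypothetical stand-in for part of the [structure] section."""

    atype: tuple  # element symbols, e.g. ("Na", "Cl")
    nat: tuple    # number of atoms per element, e.g. (8, 8)

    def __post_init__(self):
        # Validate once at construction time instead of at every use.
        if len(self.atype) != len(self.nat):
            raise ValueError("atype and nat must have the same length")


struc_in = StructureInput(atype=("Na", "Cl"), nat=(8, 8))
total_atoms = sum(struc_in.nat)
```

Tuples and a frozen dataclass make the input read-only after parsing, so functions receive it as an explicit argument and cannot modify it by accident.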

Version 1.2.5

Bug fix

simple bug fix

Version 1.2.4

Bug fix

ASE interface: e.args[0] –> str(e.args[0])
ext mode: noprint
njob ( see also FAQ > Can I change njob in the middle of the simulation )

EA

default value of cls_lat: equal –> random

EA-vc

Test version of variable-composition EA (EA-vc). Only binary systems are supported for now.

Version 1.2.3

MPI

Bug fix
In using MPI, -p option is required:

mpiexec -n 4 cryspy -p

See also

Features > Structure generation with MPI parallelization

Version 1.2.2

Enthalpy

You can use enthalpy instead of energy for VASP and QE.

See also

Features > Enthalpy

Version 1.2.1

ASE interface

Bug fixed for multiple stages.

Version 1.2.0

ASE interface

ASE interface is now available.

See also

Tutorial > Random Search (RS) > ASE in your local PC

Adoption of logging

CrySPY logs are output to both the screen and files (log_cryspy and err_cryspy).

See also

Features > Logging

Version 1.1.1

Bug fix for spg_error

In random structure generation, when a structure could not be generated for a certain space group, the space group number was recorded in the variable spg_error and skipped thereafter. However, a bug was found in which the number was registered incorrectly in rare cases, so the spg_error function has been removed.

Version 1.1.0

Parallelization with MPI

Random structure generation using MPI is now available.

See also

LAQA

Updated score formula to take into account the stress term (T. Yamashita and H. Sekine, Sci. Technol. Adv. Mater. Meth. 2, 84 (2022).).

See also

Backup

Files are copied to a directory named with the date and time inside the “backup” directory.
See features/backup for details.
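
The backup layout described above can be sketched as follows. This is an illustration only: the timestamp format and the helper name are assumptions, and the set of files that CrySPY actually copies is not specified here.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path


def backup_files(workdir, filenames, now=None):
    """Copy files into backup/<timestamp>/ under workdir (illustrative)."""
    now = now or datetime.now()
    # Assumption: the timestamp format used by CrySPY may differ.
    dest = Path(workdir) / "backup" / now.strftime("%Y%m%d_%H%M%S")
    dest.mkdir(parents=True, exist_ok=False)
    for name in filenames:
        src = Path(workdir) / name
        if src.is_file():
            shutil.copy2(src, dest / name)
    return dest


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "cryspy.in").write_text("[basic]\n")
    dest = backup_files(tmp, ["cryspy.in"], now=datetime(2023, 5, 16, 12, 0, 0))
    backed_up = sorted(p.name for p in dest.iterdir())
    dest_name = dest.name
```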

Version 1.0.0

Install and run

CrySPY is now available in PyPI. You can install by

pip install csp-cryspy

The executable script, cryspy is automatically installed in your PATH. To run CrySPY, just type cryspy:

cryspy &

CrySPY stops once before going to next selection (BO, LAQA) or next generation (EA). For example, EA case:

[old version]

cryspy run
- check jobs (finish current generation?)
- structure generation by EA automatically starts

[CrySPY 1.0.0]

cryspy run
- check jobs (finish current generation?)
- stop
cryspy run
- auto backup
- structure generation by EA automatically starts

Auto and manual backup

Automatically backup:

before going to next selection or next generation
structure generation

To manually back up:

cryspy -b

See features/backup for details.

Clean

cryspy -c

See features/clean for details.

Directory tree

Changed the directory tree.

genstruc/RS –> RS/
genstruc/EA –> EA/
genstruc/struc_util.py –> util/
utility.py –> util/

IO

Fixed standard output file and standard error file: log_cryspy and err_cryspy
cryspy.out is obsolete

Moved to CrySPY Utility

With the change in installation method, examples and cal_fingerprint have been moved to the CrySPY Utility.

COMBO

The Python library COMBO is now optional in CrySPY. If you do not use Bayesian optimization, you do not need to install it.

New calc_code

ext: Deprecated in version 1.4.0.

cryspy.in

fppath

New input variable for cal_fingerprint. See Installation/cryspy/cryspy_1.0

fwpath

New input variable for find_wy. See Installation/requirements/find_wy

mindist

mindist can be omitted in cryspy.in
mindist_ea is obsolete
added mindist_mol_bs and mindist_mol_bs_factor in cryspy.in

Version 0.10.3 or earlier

[2022 May 17] version 0.10.3 released
- Bug fixed: LAMMPS IO.
[2022 January 24] version 0.10.2 released
- Added nrot: maximum number of times to rotate molecules in mol_bs
[2021 September 30] version 0.10.1 released
- Fixed the problem of numpy.random.seed in multiprocessing
[2021 July 25] version 0.10.0 released
- Support PyXtal 0.2.9 or later
- LAQA can be used with QE
- Upper and lower limits of energy for EA and BO
[2021 July 13] paper published
- Our paper on CrySPY software has been published in STAM:Methods
[2021 March 18] version 0.9.2 released
- Support pymatgen v2022.
[2021 February 7] version 0.9.0 released
- Interfaced with OpenMX
- Employ PyXtal library to generate initial structures
- If you use PyXtal (default), find_wy program is not required
- LAQA can be used with soiap
- Change the name: [lattice] section –> [structure] section
- Several input variables move to [structure] section
  - natot: [basic] –> [structure]
  - atype: [basic] –> [structure]
  - nat: [basic] –> [structure]
  - maxcnt: [option] –> [structure]
  - symprec: [option] –> [structure]
  - spgnum: [option] –> [structure]
- New features
  - Molecular crystal structure generation
  - Scale volume
[2020 March 19] paper published
- Our paper on adjusting the descriptor for CSP Bayesian optimization has been published in Physical Review Materials
[2020 February 16] version 0.8.0 released
- Migrate to Python 3
- CrySPY logo created
- Change several variable names and data formats
- Change style of output for energy: eV/cell –> eV/atom
- IDs of working directories corresponds to structure IDs
- New features
  - recalculation
  - manual select in BO
[2018 December 5] version 0.7.0 released
- New features
  - Evolutionary algorithm
[2018 August 20] version 0.6.4 released
[2018 July 2] version 0.6.3 released
[2018 June 26] Version 0.6.2 released
[2018 March 1] Version 0.6.1 released
[2018 January 9] paper published
- Our paper on CrySPY has been published in Physical Review Materials

Installation

Note

You need (CrySPY + Python environment + structure optimizer) on your workstation, supercomputer, etc.

System requirements

Python

2025 July 3, updated

Python

CrySPY 1.3.0 or later

Python >= 3.9
- PyXtal (>= 0.5.3)
- (optional) mpi4py
- (optional, required if algo is BO) PHYSBO (Not COMBO)
- (optional, required if algo is BO) dscribe
- (optional) nglview

If you install csp-cryspy with pip, necessary libraries such as PyXtal, pymatgen, and ASE will be installed automatically. Go to Installation > CrySPY for detail.

2025 June 17, Tested and confirmed to work in the following environments (as installed via pip install csp-cryspy)

python 3.13.5
CrySPY 1.4.0
numpy 1.26.4 (with physbo) and 2.3.0 (without physbo; physbo requires numpy < 2.0)
pandas 2.3.0
pymatgen 2025.6.14
pyxtal 1.0.9
scipy 1.15.3
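
To compare your own environment with the tested versions above, a short check such as the following may help. It is a sketch that uses only the standard library; the helper names are hypothetical, and the numpy < 2.0 rule for physbo is taken from the list above.

```python
from importlib import metadata


def major_version(version):
    """Return the leading integer of a version string such as '1.26.4'."""
    return int(version.split(".")[0])


def numpy_ok_for_physbo(numpy_version):
    """Apply the numpy < 2.0 constraint quoted in the list above."""
    return major_version(numpy_version) < 2


def installed_version(package):
    """Return the installed version string, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Print what is installed in the current environment (None if absent).
for pkg in ("csp-cryspy", "numpy", "physbo", "pyxtal", "pymatgen"):
    print(pkg, installed_version(pkg))
```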

Quick install

pip install csp-cryspy

When using BO

pip install dscribe physbo

CrySPY 1.1.0 – 1.2.5

Python >= 3.8
- PyXtal (>= 0.5.3)
- (optional) mpi4py
- (optional, required if algo is BO) COMBO

If you install csp-cryspy with pip, necessary libraries such as PyXtal will be installed automatically. Go to Installation > CrySPY. Manual installation of COMBO is required when using Bayesian optimization.

CrySPY 1.0.0

Python >= 3.8
- PyXtal (>= 0.5.3)
- (optional, required if algo is BO) COMBO

Info

[2023 April 22] We figured out how to install PyXtal (pyshtools) on arm64 MacOS. See Arm64 on MacOS (without Rosetta 2)
[2023 March 15] On MacOS, it is difficult to install PyXtal in the arm64 environment, so it is recommended to use the x86_64 environment with Rosetta 2.

CrySPY 0.10.0 – 0.10.3

Tested with Homebrew Python 3.8.x and 3.9.x on Mac and Python 3.8.x on Linux.

Python 3.x.x
- COMBO
- PyXtal (>= 0.2.2)
- (PyXtal requires pymatgen) pymatgen (>= 2022.x.x)

CrySPY 0.9.2

Tested with Homebrew Python 3.8.x and 3.9.x on Mac and Python 3.8.x on Linux.

Python 3.x.x
- COMBO
- pymatgen (>= 2022.x.x)
- PyXtal (>= 0.2.2)

Info

[2021 July 15] If you use PyXtal >= 0.2.9, update CrySPY to the version 0.10.0 or later.

Info

[2021 March 18] There is a breaking change in pymatgen 2022.x.x. CrySPY 0.9.2 and PyXtal 0.2.2 support this change in pymatgen.

Info

[2021 Feb. 5] PyXtal depends on numba, but numba does not support Python 3.9. So you should use Python 3.8.x for a while.
[2021 March 18] Currently numba supports Python 3.9.x.

Info

[2021 Feb. 7] PyXtal requires SciPy, but the latest version of SciPy (v1.6.0) might include a bug for deepcopy. You should use SciPy v1.5.4 for a while.
[2021 March 18] This bug has been fixed in SciPy v1.6.1.

CrySPY 0.9.0 – 0.9.1

Python 3.8.x
- COMBO
- pymatgen (<= 2021.x.x)
- PyXtal 0.1.6 - 0.2.1

CrySPY 0.8.0 or earlier

See the old documentation, which is included with CrySPY itself.

Structure optimizer

At least one optimizer is required.

First-principles calculation
- VASP
- QUANTUM ESPRESSO
- OpenMX (CrySPY 0.9.0 or later)
Interatomic potential
- soiap
- LAMMPS
Other
- ASE (CrySPY 1.2.0 or later)

find_wy (optional)

CrySPY originally used find_wy to generate random structures for a given space group (symmetry). Since version 0.9.0, however, CrySPY uses the PyXtal library for structure generation by default. In CrySPY 0.9.0 or later you can skip installing find_wy, although you may still use it. For CrySPY 0.8.x or earlier, find_wy is required to generate a random structure for a given space group.

Info

You can skip to install find_wy in CrySPY 0.9.0 or later.

Installation of find_wy

m_tspace

First, you need to compile m_tspace for find_wy. Check these sites to compile it.

Download the source code of m_tspace in an arbitrary directory. For example:

$ mkdir -p ~/local
$ cd ~/local
$ git clone https://github.com/nim-hrkn/m_tspace.git

Two additional files are required to compile m_tspace. Download the following files into ~/local/m_tspace from TSPACE:

$ cd m_tspace
$ wget http://phoenix.mp.es.osaka-u.ac.jp/~tspace/tspace_main/tsp07/tsp98.f
$ wget http://phoenix.mp.es.osaka-u.ac.jp/~tspace/tspace_main/tsp07/prmtsp.f

Edit the makefile and run the make command. If you use ifort, it is better to remove the -check all option and use the -O2 option.

$ emacs makefile
$ head -n 4 makefile
#FC=gfortran
#FFLAGS=-g -cpp -DUSE_GEN -ffixed-line-length-255
FC=ifort
FFLAGS=-O2 -g -traceback -cpp -DUSE_GEN -132
$ make

If you used gfortran, you might face the following problem:

tsp98.f:9839:32:

       CALL SUBGRP(MG,JG,MGT,JGT,NTAB,IND)
                                1
Error: Actual argument contains too few elements for dummy argument 'ntab' (12/48) at (1)
make: *** [tsp98.o] Error 1

Then change the source file of tsp98.f like this (line 9925):

Before:

9913: C SUBROUTINE SUBGRP ====*====3====*====4====*====5====*====6====*====7
9914: C
9915: C    IF (JG(I),I=1,MG) IS A SUBGROUP OF (JGT(J),J=1,MGT) THEN
9916: C          TABLE (NTAB(I),I=1,MG) IS MADE HERE AND IND=0
9917: C    ELSE
9918: C          IND=-1
9919: C
9920: C                 1993/12/25
9921: C                   BY  S.TANAKA AND A. YANASE
9922: C---*----1----*----2----*----3----*----4----*----5----*----6----*----7
9923: C
9924:       SUBROUTINE SUBGRP(MG,JG,MGT,JGT,NTAB,IND)
9925:       DIMENSION NTAB(48),JG(48),JGT(48)

After:

9913: C SUBROUTINE SUBGRP ====*====3====*====4====*====5====*====6====*====7
9914: C
9915: C    IF (JG(I),I=1,MG) IS A SUBGROUP OF (JGT(J),J=1,MGT) THEN
9916: C          TABLE (NTAB(I),I=1,MG) IS MADE HERE AND IND=0
9917: C    ELSE
9918: C          IND=-1
9919: C
9920: C                 1993/12/25
9921: C                   BY  S.TANAKA AND A. YANASE
9922: C---*----1----*----2----*----3----*----4----*----5----*----6----*----7
9923: C
9924:       SUBROUTINE SUBGRP(MG,JG,MGT,JGT,NTAB,IND)
9925:       DIMENSION NTAB(12),JG(48),JGT(48)

If you succeed in compiling, you get m_tsp.a.

find_wy

Check these sites to compile find_wy:

Download the source code of find_wy in an arbitrary directory. For example:

$ mkdir -p ~/local
$ cd ~/local
$ git clone https://github.com/nim-hrkn/find_wy.git

Edit make.inc and set the path to the m_tsp.a that you just prepared.

$ cd find_wy
$ emacs make.inc
$ head -n 4 make.inc
TSPPATH=~/local/m_tspace
#INCPATH = -I $(TSPPATH)
TSP=$(TSPPATH)/m_tsp.a

You can remove the -check all option and use the -O2 option instead. Then run the make command.

$ make

When you get the executable file of find_wy, run the following command for test:

$ ./find_wy input_sample/input_si4o8.txt

If there is no problem, POS_WY_SKEL_ALL.json file is generated.

Executable file of find_wy

CrySPY 1.0.0 or later

Put the executable file of find_wy in your PATH. Or, specify the path of the executable file in cryspy.in as follows:

[structure]
use_find_wy = True
fwpath = /xxx/xxx/xxx/find_wy

CrySPY 0.10.3 or earlier

When you use find_wy, put the executable file of find_wy in ~/CrySPY_root/CrySPY-x.x.x/CrySPY/find_wy/, so that the executable file path is ~/CrySPY_root/CrySPY-x.x.x/CrySPY/find_wy/find_wy.

CrySPY

CrySPY (>= 1.0.0) is available in PyPI. You can install by pip.

CrySPY 1.3.0 or later

2025 July 3, updated

CrySPY

pip

pip install csp-cryspy

Please note that the name is csp-cryspy on PyPI, not cryspy. The main command, cryspy, is automatically installed in your PATH. You can check by

which cryspy

Starting from version 1.4.1, the following subcommand is also installed:

cryspy-Eplot: (CrySPY > Features > Re-plot convex hull)

Editable mode

If you want to change the source code of CrySPY, you can use pip’s editable mode (-e option).

git clone https://github.com/Tomoki-YAMASHITA/CrySPY.git
pip install -e ./CrySPY

Instead of git clone, you can download the compressed file from the release page

PHYSBO and DScribe (optional)

If you use Bayesian optimization, PHYSBO and DScribe are required.

pip install physbo dscribe

Info

cal_fingerprint program and COMBO are obsolete.

mpi4py (optional)

When performing random structure generation with MPI parallelization, mpi4py is required.

pip install mpi4py

Jupyter and nglview (optional)

For analysis on a local PC or in interactive mode, Jupyter is required. If you want to visualize crystal structures using nglview in interactive mode, install nglview by pip.

pip install nglview

CrySPY 1.0.0 -- 1.2.5

CrySPY

pip

CrySPY 1.0.0 or later can be installed by pip.

pip install csp-cryspy

The executable script, cryspy, is automatically installed in your PATH. You can check by

which cryspy

Editable mode

If you want to change the source code of CrySPY, you can use pip’s editable mode (-e option).

git clone https://github.com/Tomoki-YAMASHITA/CrySPY.git
pip install -e ./CrySPY

Instead of git clone, you can download the compressed file from the release page

cal_fingerprint (optional)

cal_fingerprint is a program to calculate structure descriptors and is required if algo is BO. From CrySPY 1.0.0, the cal_fingerprint program is included in CrySPY utility. See Installation/CrySPY_utility/Compile cal_fingerprint for compilation.

Put the executable file of cal_fingerprint in your PATH. Or, specify the path of the executable file in cryspy.in as follows:

[BO]
fppath = /xxx/xxx/xxx/cal_fingerprint

Arm64 on MacOS (without Rosetta 2)

Info

In PyXtal, starting from version 0.6.3, pyshtools is no longer mandatory. Therefore, you can ignore the information written below if you are using a recent version of PyXtal.

Install miniforge3 (We do not know how to install pyshtools with homebrew python.)
Install pymatgen, pyshtools by conda (recent versions of pyshtools are available in conda-forge)

conda install pymatgen
conda install pyshtools

Install CrySPY

pip3 install csp-cryspy

CrySPY 0.10.3 or earlier

Installation of CrySPY is very simple. Just download it!

Download

You can put the source code of CrySPY in an arbitrary directory. For example, let us put the source code in ~/CrySPY_root/CrySPY-x.x.x (x.x.x means the version). Use git or download the compressed file.

Git

mkdir ~/CrySPY_root
cd ~/CrySPY_root
git clone https://github.com/Tomoki-YAMASHITA/CrySPY.git CrySPY-x.x.x

zip or tar.gz file

Download the source as a zip or tar.gz file from the GitHub release page.
Then put the source like ~/CrySPY_root/CrySPY-x.x.x

Directory tree

Directory tree in ~/CrySPY_root/CrySPY-x.x.x/:

CrySPY-x.x.x
├── CHANGELOG.md
├── CrySPY/
│   ├── BO/
│   ├── EA/
│   ├── IO/
│   ├── LAQA/
│   ├── RS/
│   ├── __init__.py
│   ├── calc_dscrpt/
│   ├── f-fingerprint/
│   ├── find_wy/
│   ├── gen_struc/
│   ├── interface/
│   ├── job/
│   ├── start/
│   └── utility.py
├── LICENSE
├── README.md
├── cryspy.py
├── docs/
├── example/
└── utility/

Info

Main script is cryspy.py.

Setup (optional)

find_wy (optional)

When you use find_wy, put the executable file of find_wy in ~/CrySPY_root/CrySPY-x.x.x/CrySPY/find_wy/, so that the executable file path is ~/CrySPY_root/CrySPY-x.x.x/CrySPY/find_wy/find_wy.

cd ~/CrySPY_root/CrySPY-x.x.x/CrySPY/find_wy
cp ~/local/find_wy/find_wy .

Compile cal_fingerprint (optional)

When you use Bayesian optimization, compile the cal_fingerprint program, which calculates structure descriptors.

cd ~/CrySPY_root/CrySPY-x.x.x/CrySPY/f-fingerprint
emacs Makefile
make

Make sure that the executable file of cal_fingerprint exists in ~/CrySPY_root/CrySPY-x.x.x/CrySPY/f-fingerprint/.

CrySPY utility (optional)

Setting up Python environment in your local PC is useful to analyze CrySPY results. Utility tools (jupyter notebook and python scripts) are available for analysis and visualization. Input examples are also included in CrySPY utility.

Info

Download CrySPY utility

Git

$ git clone https://github.com/Tomoki-YAMASHITA/CrySPY_utility.git

zip

Go to CrySPY utility and click green Code button, then choose Download ZIP.

Tutorial

Info

Beginners are encouraged to start with a random search.
Additional example files not included in the tutorials are also available in CrySPY utility.

Random Search (RS)

Info

ASE is easy to start for beginners because when you install CrySPY, ASE is also automatically installed. Although not highly accurate, ASE provides very lightweight and fast interatomic potentials, making it suitable for testing on a laptop or other low-spec machines.

Preparation of input files

Follow one of the instructions below, then proceed to the section on running CrySPY.

Running CrySPY

ASE on your local PC

2025 June 16, updated

ASE provides interfaces to different codes. ASE also includes Pure Python EMT calculator, which is suitable for testing CrySPY because of its fast and easy structure optimization.

In this tutorial, we use CrySPY on your local PC (Mac or Linux). The target system is Cu with 8 atoms.

Assumption

Here, we assume the following conditions:

CrySPY 1.2.0 or later in your local PC
CrySPY job filename: job_cryspy
ase input filename: ase_in.py

Input files

Move to your working directory, and copy the example files by one of the following methods.

Download from CrySPY_utility/examples/ase_Cu8_RS
Copy from CrySPY utility that you installed

cd ase_Cu8_RS
tree

.
├── calc_in
│   ├── ase_in.py_1
│   └── job_cryspy
└── cryspy.in

cryspy.in

cryspy.in is the input file of CrySPY.

[basic]
algo = RS
calc_code = ASE
tot_struc = 5
nstage = 1
njob = 5
jobcmd = zsh
jobfile = job_cryspy

[structure]
atype = Cu
nat = 8

[ASE]
ase_python = ase_in.py

[option]

In [basic] section, jobcmd = zsh can be changed to jobcmd = sh or jobcmd = bash in accordance with your environment. CrySPY runs zsh job_cryspy as a background job internally.

[ASE] section is required when you use ASE.

You can name the following files whatever you want:

jobfile: job_cryspy
ase_python: ase_in.py

The other input variables are discussed later.
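
Because the example above follows plain INI syntax, it can be inspected with Python's standard configparser before a run. The following is only a sketch for sanity-checking an input file; it does not claim to reproduce how CrySPY itself reads or validates cryspy.in.

```python
import configparser

# Same content as the cryspy.in example above.
CRYSPY_IN = """
[basic]
algo = RS
calc_code = ASE
tot_struc = 5
nstage = 1
njob = 5
jobcmd = zsh
jobfile = job_cryspy

[structure]
atype = Cu
nat = 8

[ASE]
ase_python = ase_in.py

[option]
"""

config = configparser.ConfigParser()
config.read_string(CRYSPY_IN)

# Space-separated values become tuples, one entry per element.
atype = tuple(config["structure"]["atype"].split())
nat = tuple(int(n) for n in config["structure"]["nat"].split())
if len(atype) != len(nat):
    raise ValueError("atype and nat must list the same number of species")

# Command line built from jobcmd and jobfile, i.e. "zsh job_cryspy" here.
job_command = [config["basic"]["jobcmd"], config["basic"]["jobfile"]]
```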

calc_in directory

The job file and input files for ASE are prepared in this directory.

Job file

The name of the job file must match the value of jobfile in cryspy.in. The example of job file (here, job_cryspy) is shown below.

#!/bin/sh

# ---------- ASE
python3 ase_in.py

# ---------- CrySPY
sed -i -e '3 s/^.*$/done/' stat_job

You can change the input file name (ase_in.py), but it must match the value of ase_python in cryspy.in. You must add sed -i -e '3 s/^.*$/done/' stat_job at the end of the job file for CrySPY.

Note

sed -i -e '3 s/^.*$/done/' stat_job is required at the end of the job file.
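
For readers less familiar with sed, the command replaces the entire third line of stat_job with the word done. A Python sketch of the same operation is shown below; the function name and the sample file contents are hypothetical, and the job file itself should keep using the sed line.

```python
import tempfile
from pathlib import Path


def mark_job_done(stat_job_path):
    """Replace the third line of stat_job with 'done', like the sed command."""
    path = Path(stat_job_path)
    lines = path.read_text().splitlines()
    # Unlike sed, which silently does nothing, this sketch reports short files.
    if len(lines) < 3:
        raise ValueError("stat_job has fewer than three lines")
    lines[2] = "done"
    path.write_text("\n".join(lines) + "\n")


with tempfile.TemporaryDirectory() as tmp:
    stat_job = Path(tmp) / "stat_job"
    # Hypothetical file contents; only the third line matters here.
    stat_job.write_text("line one\nline two\nsubmitted\n")
    mark_job_done(stat_job)
    result = stat_job.read_text().splitlines()
```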

Tip

In the job file of CrySPY, the string CrySPY_ID is automatically replaced with the structure ID. When you use a job scheduler such as PBS and SLURM, it is useful to set the structure ID to the job name. For example, in the PBS system, #PBS -N Si_CrySPY_ID in ID 10 is replaced with #PBS -N Si_10. Note that starting with a number will result in an error. You should add a prefix like Si_.
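
The substitution described in this tip amounts to a plain string replacement, which can be sketched as follows. The function name is hypothetical and the snippet is not taken from CrySPY's source.

```python
def fill_structure_id(job_text, structure_id):
    """Replace every occurrence of the CrySPY_ID placeholder."""
    return job_text.replace("CrySPY_ID", str(structure_id))


template = "#!/bin/sh\n#PBS -N Si_CrySPY_ID\n"
filled = fill_structure_id(template, 10)
```

Here the job name becomes Si_10, which starts with a letter thanks to the Si_ prefix.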

Input for ASE

Input files based on the number of stages (nstage in cryspy.in) are required. Name the input file(s) with a suffix _x. Here x means the stage number.
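
The naming rule can be checked with a small helper that lists the file names expected for a given nstage. This is a sketch with a hypothetical function name, useful for verifying the contents of calc_in before a run.

```python
def stage_input_names(base_name, nstage):
    """Return base_name_1 ... base_name_nstage."""
    if nstage < 1:
        raise ValueError("nstage must be at least 1")
    return [f"{base_name}_{stage}" for stage in range(1, nstage + 1)]


ase_inputs = stage_input_names("ase_in.py", 1)
two_stage_inputs = stage_input_names("INCAR", 2)
```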

We are using nstage = 1 in this ASE tutorial, so we need only ase_in.py_1. ase_in.py_1 is listed below. Refer to the ASE documentation for details.

from ase.constraints import FixSymmetry
from ase.filters import FrechetCellFilter
from ase.calculators.emt import EMT
from ase.optimize import BFGS
import numpy as np
from ase.io import read, write

# ---------- input structure
# CrySPY outputs 'POSCAR' as an input file in work/xxxxxx directory
atoms = read('POSCAR', format='vasp')

# ---------- setting and run
atoms.calc = EMT()
atoms.set_constraint([FixSymmetry(atoms)])
cell_filter = FrechetCellFilter(atoms, hydrostatic_strain=False)
opt = BFGS(cell_filter)
opt.run(fmax=0.01, steps=2000)

# ---------- opt. structure and energy
# [rule in ASE interface]
# output file for energy: 'log.tote' in eV/cell
#                         CrySPY reads the last line of 'log.tote'
# output file for structure: 'CONTCAR' in vasp format
e = cell_filter.atoms.get_total_energy()
with open('log.tote', mode='w') as f:
    f.write(str(e))

# ------ write structure
opt_atoms = cell_filter.atoms.copy()
opt_atoms.set_constraint(None)    # remove constraint for pymatgen
write('CONTCAR', opt_atoms, format='vasp', direct=True)

Compared with VASP and QE inputs, the ASE input (a Python script) is more flexible. CrySPY imposes only two rules:

The energy is written in units of eV/cell to the log.tote file. CrySPY reads the last line of this file.
The optimized structure is written to the CONTCAR file in the VASP format.
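
To check that a custom ASE script satisfies the first rule, you can read log.tote back yourself. The sketch below takes the last non-empty line as the energy in eV/cell and converts it to eV/atom; the function name, the handling of blank lines, and the sample value are assumptions for illustration.

```python
import tempfile
from pathlib import Path


def read_energy_per_atom(log_path, natoms):
    """Read the last non-empty line of log.tote (eV/cell), return eV/atom."""
    text = Path(log_path).read_text()
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("log.tote is empty")
    energy_cell = float(lines[-1])
    return energy_cell / natoms


with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "log.tote"
    # Hypothetical energy value for an 8-atom cell.
    log.write_text("-0.4\n")
    energy_atom = read_energy_per_atom(log, natoms=8)
```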

Running CrySPY

Go to Running CrySPY

soiap on your local PC

2025 March 6 update

soiap is Structure Optimization with InterAtomic Potential. It is suitable for testing CrySPY because of its fast structure optimization. See instructions to install soiap.

In this tutorial, we use CrySPY on your local PC (Mac or Linux). The target system is Si with 8 atoms.

Assumption

Here, we assume the following conditions:

(only version 0.10.3 or earlier) CrySPY main script: ~/CrySPY_root/CrySPY-0.9.0/cryspy.py
CrySPY job filename: job_cryspy
soiap executable file: ~/local/soiap-0.3.0/src/soiap
soiap input filename: soiap.in
soiap output filename: soiap.out
soiap input structure filename: initial.cif

Input files

Move to your working directory, and copy input example files by one of the following methods.

Download from CrySPY_utility/examples/soiap_Si8_RS
Copy from CrySPY utility that you installed
(only version 0.10.3 or earlier) cp -r ~/CrySPY_root/CrySPY-0.9.0/example/v0.9.0/soiap_RS_Si8 .

cd soiap_RS_Si8
tree

.
├── calc_in
│   ├── job_cryspy
│   └── soiap.in_1
└── cryspy.in

cryspy.in

cryspy.in is the input file of CrySPY.

[basic]
algo = RS
calc_code = soiap
tot_struc = 5
nstage = 1
njob = 2
jobcmd = zsh
jobfile = job_cryspy

[structure]
atype = Si
nat = 8

[soiap]
soiap_infile = soiap.in
soiap_outfile = soiap.out
soiap_cif = initial.cif

[option]

In [basic] section, jobcmd = zsh can be changed to jobcmd = sh or jobcmd = bash in accordance with your environment. CrySPY runs zsh job_cryspy as a background job internally.

[soiap] section is required when you use soiap.

You can name the following files whatever you want:

jobfile
soiap_infile
soiap_outfile
soiap_cif

The other input variables are discussed later.

calc_in directory

The job file and input files for soiap are prepared in this directory.

Job file

The name of the job file must match the value of jobfile in cryspy.in. The example of job file (here, job_cryspy) is shown below.

#!/bin/sh

# ---------- soiap
EXEPATH=/path/to/soiap
$EXEPATH/soiap soiap.in 2>&1 > soiap.out

# ---------- CrySPY
sed -i -e '3 s/^.*$/done/' stat_job

Change /path/to/soiap to the correct path for your environment. You can change the input (soiap.in) and output (soiap.out) file names, but they must match the values of soiap_infile and soiap_outfile in cryspy.in. The job file is written in the same way as the one you usually use, except for the last line. You must add sed -i -e '3 s/^.*$/done/' stat_job at the end of the job file for CrySPY.

Note

sed -i -e '3 s/^.*$/done/' stat_job is required at the end of the job file.

Tip

In the job file of CrySPY, the string “CrySPY_ID” is automatically replaced with the structure ID. When you use a job scheduler such as PBS and SLURM, it is useful to set the structure ID to the job name. For example, in the PBS system, #PBS -N Si_CrySPY_ID in ID 10 is replaced with #PBS -N Si_10. Note that starting with a number will result in an error. You should add a prefix like Si_.

Input for soiap

Input files based on the number of stages (nstage in cryspy.in) are required. Name the input file(s) with a suffix _x. Here x means the stage number.

We are using nstage = 1, so we need only soiap.in_1.

soiap.in_1 is listed below.

crystal initial.cif ! CIF file for the initial structure
symmetry 1 ! 0: not symmetrize displacements of the atoms or 1: symmetrize

md_mode_cell 3 ! cell-relaxation method
               ! 0: FIRE, 2: quenched MD, or 3: RFC5
number_max_relax_cell 100 ! max. number of the cell relaxation
number_max_relax 1 ! max. number of the atom relaxation
max_displacement 0.1 ! max. displacement of atoms in Bohr

external_stress_v 0.0 0.0 0.0 ! external pressure in GPa

th_force 5d-5 ! convergence threshold for the force in Hartree a.u.
th_stress 5d-7 ! convergence threshold for the stress in Hartree a.u.

force_field 1 ! force field
              ! 1: Stillinger-Weber for Si, 2: Tsuneyuki potential for SiO2,
              ! 3: ZRL for Si-O-N-H, 4: ADP for Nd-Fe-B, 5: Jmatgen, or
              ! 6: Lennard-Jones

The input structure file is specified at the first line. Use the same name as the value of soiap_cif in cryspy.in.

Running CrySPY

Go to Running CrySPY

VASP

2025 March 6, updated

In this tutorial, we try to use CrySPY on a PC cluster with a job scheduler system such as PBS. Here we employ VASP. The target system is Na8Cl8 (16 atoms).

Assumption

Here, we assume the following conditions:

CrySPY 1.2.0 or later is installed on your PC cluster
CrySPY job command: qsub
CrySPY job filename: job_cryspy
VASP executable file: vasp_std on your PC cluster

Input files

Move to your working directory, and copy the example files by one of the following methods.

Download from CrySPY_utility/examples/vasp_Na8Cl8_RS
Copy from CrySPY utility that you installed

cd vasp_Na8Cl8_RS
tree

.
├── calc_in
│   ├── INCAR_1
│   ├── INCAR_2
│   ├── POTCAR
│   ├── POTCAR_is_dummy
│   └── job_cryspy
└── cryspy.in

cryspy.in

cryspy.in is the input file of CrySPY.

[basic]
algo = RS
calc_code = VASP
tot_struc = 5
nstage = 2
njob = 2
jobcmd = qsub
jobfile = job_cryspy

[structure]
atype = Na Cl
nat = 8 8
mindist_1 = 2.5 1.5
mindist_2 = 1.5 2.5

[VASP]
kppvol = 40 80

[option]

In [basic] section, jobcmd = qsub can be changed in accordance with your environment. CrySPY runs qsub job_cryspy as a background job internally in this setting. You can name the following file whatever you want:

jobfile

We adopt a stage-based system for structure optimization calculations. Here, we use nstage = 2. For example, users can configure the following settings. In the first stage, only the ionic positions are relaxed, fixing the cell shape, with low k-point grid density. Next, the ionic positions and cell shape are fully relaxed with high accuracy in the second stage.

[VASP] section is required when you use VASP. You have to specify k-point grid density (Å^-3) for each stage in kppvol.
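Because kppvol needs one value per stage, a quick consistency check before running can save a failed start. The sketch below reads the example with Python's configparser, which is used here only because cryspy.in follows INI syntax. The helper name kppvol_per_stage is hypothetical, and reading the values as integers matches this example only.

```python
import configparser

CRYSPY_IN = """
[basic]
algo = RS
calc_code = VASP
tot_struc = 5
nstage = 2
njob = 2
jobcmd = qsub
jobfile = job_cryspy

[VASP]
kppvol = 40 80
"""


def kppvol_per_stage(text, section="VASP"):
    """Return {stage: kppvol}; raise if the number of values differs from nstage."""
    cfg = configparser.ConfigParser()
    cfg.read_string(text)
    nstage = cfg.getint("basic", "nstage")
    values = [int(v) for v in cfg.get(section, "kppvol").split()]
    if len(values) != nstage:
        raise ValueError(f"kppvol has {len(values)} values but nstage = {nstage}")
    return dict(enumerate(values, start=1))


print(kppvol_per_stage(CRYSPY_IN))  # {1: 40, 2: 80}
```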

Info

See Input file > Kpoint for details of kppvol

The other input variables are discussed later.

calc_in directory

The job file and input files for VASP are prepared in this directory.

Job file

The name of the job file must match the value of jobfile in cryspy.in. An example job file (here, job_cryspy) is shown below.

#!/bin/sh
#$ -cwd
#$ -V -S /bin/bash
####$ -V -S /bin/zsh
#$ -N Na8Cl8_CrySPY_ID
#$ -pe smp 20
####$ -q ibis1.q
####$ -q ibis2.q
####$ -q ibis3.q
####$ -q ibis4.q

# ---------- vasp
VASPROOT=/usr/local/vasp/vasp.6.4.2/bin
mpirun -np $NSLOTS $VASPROOT/vasp_std

# ---------- CrySPY
sed -i -e '3 s/^.*$/done/' stat_job

Change VASPROOT to the appropriate path suitable for your environment. The job file is written in the same way as the one you usually use except for the last line. You must add sed -i -e '3 s/^.*$/done/' stat_job at the end of the file in CrySPY.

Note

sed -i -e '3 s/^.*$/done/' stat_job is required at the end of the job file.

Input for VASP

Input files based on the number of stages (nstage in cryspy.in) are required. Name the input file(s) with a suffix _x. Here x means the stage number.

We are using nstage = 2, so we need INCAR_1 and INCAR_2. Here, INCAR_1 is set to fix the cell and relax only the ionic positions, while INCAR_2 is configured to fully relax both the cell and ionic positions.

INCAR_1

SYSTEM = NaCl
!!!LREAL = Auto
Algo = Fast
NSW = 40

LWAVE = .FALSE.
!LCHARG = .FALSE.

ISPIN =  1

ISMEAR = 0
SIGMA = 0.1

IBRION = 2
ISIF = 2

EDIFF = 1e-5
EDIFFG = -0.01

INCAR_2

SYSTEM = NaCl
!!LREAL = Auto
Algo = Fast
NSW = 200

ENCUT = 341

!!LWAVE = .FALSE.
!!LCHARG = .FALSE.


ISPIN =  1

ISMEAR = 0
SIGMA = 0.1

IBRION = 2
ISIF = 3

EDIFF = 1e-5
EDIFFG = -0.01
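The two stages differ mainly in ISIF: in VASP, ISIF = 2 relaxes the ionic positions in a fixed cell, while ISIF = 3 also relaxes the cell shape and volume. The following Python sketch shows one way to confirm that each INCAR_x carries the intended setting. The parser is a deliberately simplified assumption (one tag per line, text after ! or # treated as a comment), not a full INCAR reader, and read_incar_tags is a hypothetical name.

```python
def read_incar_tags(text):
    """Parse 'KEY = value' lines; text after '!' or '#' is treated as a comment."""
    tags = {}
    for raw in text.splitlines():
        line = raw.split("!", 1)[0].split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        tags[key.strip().upper()] = value.strip()
    return tags


incar_1 = "IBRION = 2\nISIF = 2\n!LCHARG = .FALSE.\n"
incar_2 = "IBRION = 2\nISIF = 3\n"
print(read_incar_tags(incar_1)["ISIF"])  # 2
print(read_incar_tags(incar_2)["ISIF"])  # 3
```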

CrySPY automatically generates the POSCAR and KPOINTS files. You have to prepare the POTCAR file yourself; note that the POTCAR included in this example is empty.

Warning

POTCAR in this example is empty. We cannot distribute it.

Running CrySPY

Go to Running CrySPY

QE

2025 March 6, updated

In this tutorial, we try to use CrySPY on a machine with a job scheduler system such as PBS. Here we employ Quantum ESPRESSO (QE). The target system is Si8 (8 atoms).

Assumption

Here, we assume the following conditions:

CrySPY job command: qsub
CrySPY job filename: job_cryspy
QE executable file: /usr/local/qe-6.5/bin/pw.x
QE input filename: pwscf.in
QE output filename: pwscf.out

Input files

Move to your working directory, and copy input example files by one of the following methods.

Download from CrySPY_utility/examples/qe_Si8_RS
Copy from CrySPY utility that you installed
(only version 0.10.3 or earlier) cp -r ~/CrySPY_root/CrySPY-0.9.0/example/v0.9.0/QE_Si8_RS .

cd qe_Si8_RS
tree

.
├── calc_in
│   ├── job_cryspy
│   ├── pwscf.in_1
│   └── pwscf.in_2
└── cryspy.in

cryspy.in

cryspy.in is the input file of CrySPY.

[basic]
algo = RS
calc_code = QE
tot_struc = 5
nstage = 2
njob = 2
jobcmd = qsub
jobfile = job_cryspy

[structure]
atype = Si
nat = 8

[QE]
qe_infile = pwscf.in
qe_outfile = pwscf.out
kppvol =  40  80

[option]

In [basic] section, jobcmd = qsub can be changed in accordance with your environment. CrySPY runs qsub job_cryspy as a background job internally in this setting.

[QE] section is required when you use QE. You have to specify k-point grid density (Å^-3) for each stage in kppvol.

Info

See Input file > Kpoint for details of kppvol

You can name the following files whatever you want:

jobfile
qe_infile
qe_outfile

The other input variables are discussed later.

calc_in directory

The job file and input files for QE are prepared in this directory.

Job file

The name of the job file must match the value of jobfile in cryspy.in. An example job file (here, job_cryspy) is shown below.

#!/bin/sh
#$ -cwd
#$ -V -S /bin/bash
####$ -V -S /bin/zsh
#$ -N Si8_CrySPY_ID
#$ -pe smp 20
####$ -q ibis1.q
####$ -q ibis2.q

mpirun -np $NSLOTS /path/to/pw.x < pwscf.in > pwscf.out


if [ -e "CRASH" ]; then
    sed -i -e '3 s/^.*$/skip/' stat_job
    exit 1
fi

sed -i -e '3 s/^.*$/done/' stat_job

Change /path/to/pw.x to the appropriate path suitable for your environment. You can specify the input (pwscf.in) and output (pwscf.out) file names, but they must match the values of qe_infile and qe_outfile in cryspy.in.

The job file is written in the same way as the one you usually use except for the last line. You must add sed -i -e '3 s/^.*$/done/' stat_job at the end of the file in CrySPY.

Note

sed -i -e '3 s/^.*$/done/' stat_job is required at the end of the job file.
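The job file above writes skip instead of done when QE leaves a CRASH file behind. As a minimal sketch, the same decision can be expressed in Python as follows; job_status is a hypothetical helper that mirrors only the if-block of the job file.

```python
from pathlib import Path


def job_status(work_dir):
    """Return 'skip' if QE left a CRASH file in work_dir, otherwise 'done'."""
    return "skip" if (Path(work_dir) / "CRASH").exists() else "done"
```

The returned string is the value that the shell version writes to line 3 of stat_job.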

Input for QE

Input files based on the number of stages (nstage in cryspy.in) are required. Name the input file(s) with a suffix _x. Here x means the stage number.

We are using nstage = 2, so we need pwscf.in_1 and pwscf.in_2. Here, pwscf.in_1 is set to fix the cell and relax only the ionic positions, while pwscf.in_2 is configured to fully relax both the cell and ionic positions.

pwscf.in_1

 &control
    title = 'Si8'
    calculation = 'relax'
    nstep = 100
    restart_mode = 'from_scratch',
    pseudo_dir = '/usr/local/pslibrary.1.0.0/pbe/PSEUDOPOTENTIALS/'
    outdir='./out.d/'
 /

 &system
    ibrav = 0
    nat = 8
    ntyp = 1
    ecutwfc = 44.0
    occupations = 'smearing'
    degauss = 0.01
 /

 &electrons
 /

 &ions
 /

 &cell
 /

ATOMIC_SPECIES
  Si  28.086  Si.pbe-n-kjpaw_psl.1.0.0.UPF

pwscf.in_2

 &control
    title = 'Si8'
    calculation = 'vc-relax'
    nstep = 200
    restart_mode = 'from_scratch',
    pseudo_dir = '/usr/local/pslibrary.1.0.0/pbe/PSEUDOPOTENTIALS/'
    outdir='./out.d/'
 /

 &system
    ibrav = 0
    nat = 8
    ntyp = 1
    ecutwfc = 44.0
    occupations = 'smearing'
    degauss = 0.01
 /

 &electrons
 /

 &ions
 /

 &cell
 /

ATOMIC_SPECIES
  Si  28.086  Si.pbe-n-kjpaw_psl.1.0.0.UPF

Change pseudo_dir to the appropriate directory for your environment. Structure and k-point inputs such as ATOMIC_POSITIONS and K_POINTS are automatically appended by CrySPY using pymatgen, so users do not have to prepare them in pwscf.in_x.

Running CrySPY

Go to Running CrySPY

OpenMX

Coming soon.

LAMMPS

Coming soon.

Check cryspy.in

2025 June 16, updated

See Input file for details.

Let’s take a look at cryspy.in again. This may be slightly different depending on calc_code you chose.

[basic]
algo = RS
calc_code = ASE
tot_struc = 5
nstage = 1
njob = 2
jobcmd = zsh
jobfile = job_cryspy

[structure]
atype = Cu
nat = 8

[ASE]
ase_python = ase_in.py

[option]

[basic] section

algo: Algorithm. Set RS for Random Search.
calc_code: Structure optimizer. Choose from VASP, QE, OMX, soiap, LAMMPS, or ASE.
tot_struc: The total number of structures. In this case, 5 random structures are generated on the first run.
nstage: The number of stages. It’s up to you.
njob: The number of jobs running at the same time. In this example, CrySPY sets 2 slots for structure optimization; in other words, it optimizes 2 structures at a time.
jobcmd: Command for jobs. Use bash, zsh, qsub, and so on.
jobfile: File name of the job file.

[structure] section

atype: Atom type. e.g. for Na8Cl8: atype = Na Cl.
nat: The number of atoms corresponding to each atype. e.g. for Na8Cl8: nat = 8 8
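Since atype and nat are matched entry by entry, a short Python sketch can pair them up and catch a length mismatch early. composition_label is a hypothetical helper written for this illustration; it is not a CrySPY function.

```python
def composition_label(atype, nat):
    """Join element symbols and atom counts, e.g. 'Na Cl' and '8 8' -> 'Na8Cl8'."""
    symbols = atype.split()
    counts = [int(n) for n in nat.split()]
    if len(symbols) != len(counts):
        raise ValueError("atype and nat must have the same number of entries")
    return "".join(f"{s}{n}" for s, n in zip(symbols, counts))


print(composition_label("Na Cl", "8 8"))  # Na8Cl8
print(composition_label("Cu", "8"))       # Cu8
```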

Script to run

Note

For version 1.0.0 or later, skip this page. The executable script is automatically installed.

Assumption

Here, we assume the following condition:

CrySPY main script: ~/CrySPY_root/CrySPY-0.9.0/cryspy.py

Make script

Let’s make a convenient shell script to avoid typing long commands over and over again. Here, we create the script, cryspy (any file name will do).

$ emacs cryspy
$ chmod 744 cryspy
$ cat cryspy

#!/bin/sh

python3 -u ~/CrySPY_root/CrySPY-0.9.0/cryspy.py 1>> log 2>> err

The -u option (unbuffered output) can be omitted.

You can put this script in a directory on your $PATH, or simply run it with bash ./cryspy.

First run

2025 March 6, updated

Make sure you have the following in your working directory.

calc_in/
(cryspy)
cryspy.in

$ ls
calc_in/  cryspy.in

Then, run CrySPY!

cryspy

If you use an old version (0.10.3 or earlier):

bash ./cryspy

On the first run, CrySPY goes into structure generation mode. CrySPY stops after generating 5 structures.

If it worked properly, the following output appears on the screen:

[2025-03-06 18:52:21,495][cryspy_init][INFO] 


Start CrySPY 1.4.0b10


[2025-03-06 18:52:21,495][cryspy_init][INFO] # ---------- Library version info
[2025-03-06 18:52:21,495][cryspy_init][INFO] pandas version: 2.2.2
[2025-03-06 18:52:21,495][cryspy_init][INFO] pymatgen version: 2025.1.24
[2025-03-06 18:52:21,495][cryspy_init][INFO] pyxtal version: 1.0.6
[2025-03-06 18:52:21,495][cryspy_init][INFO] # ---------- Read input file, cryspy.in
[2025-03-06 18:52:21,496][write_input][INFO] [basic]
[2025-03-06 18:52:21,496][write_input][INFO] algo = RS
[2025-03-06 18:52:21,496][write_input][INFO] calc_code = ASE
[2025-03-06 18:52:21,496][write_input][INFO] tot_struc = 5
[2025-03-06 18:52:21,496][write_input][INFO] nstage = 1
[2025-03-06 18:52:21,496][write_input][INFO] njob = 2
[2025-03-06 18:52:21,496][write_input][INFO] jobcmd = zsh
[2025-03-06 18:52:21,496][write_input][INFO] jobfile = job_cryspy
...
(omitted)
...
[2025-03-06 18:52:21,497][rs_gen][INFO] # ---------- Initial structure generation
[2025-03-06 18:52:21,497][rs_gen][INFO] # ------ mindist
[2025-03-06 18:52:21,498][struc_util][INFO] Cu - Cu: 1.32
[2025-03-06 18:52:21,498][rs_gen][INFO] # ------ generate structures
[2025-03-06 18:52:21,519][gen_pyxtal][INFO] Structure ID      0: (8,) Space group:  31 -->  31 Pmn2_1
[2025-03-06 18:52:21,525][gen_pyxtal][INFO] Structure ID      1: (8,) Space group: 198 --> 198 P2_13
[2025-03-06 18:52:21,554][gen_pyxtal][INFO] Structure ID      2: (8,) Space group:   4 -->   4 P2_1
[2025-03-06 18:52:21,580][gen_pyxtal][INFO] Structure ID      3: (8,) Space group: 193 --> 191 P6/mmm
[2025-03-06 18:52:21,581][gen_pyxtal][WARNING] Compoisition [8] not compatible with symmetry 172:     spg = 172 retry.
[2025-03-06 18:52:21,625][gen_pyxtal][INFO] Structure ID      4: (8,) Space group:  64 -->  64 Cmce
[2025-03-06 18:52:22,013][cryspy_init][INFO] Elapsed time for structure generation: 0:00:00.516183
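Each log line above follows the pattern [timestamp][module][LEVEL] message, which makes it easy to filter. The sketch below is a small Python example that splits a line into its parts and keeps only the warnings. The regular expression is inferred from the output shown here, and the function names are hypothetical.

```python
import re

# [timestamp][module][LEVEL] message, as seen in the screen output above
LOG_LINE = re.compile(
    r"^\[(?P<time>[^\]]+)\]\[(?P<module>[^\]]+)\]\[(?P<level>[A-Z]+)\]\s?(?P<msg>.*)$"
)


def parse_log_line(line):
    """Return a dict with time, module, level, and msg, or None if no match."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None


def warnings_only(lines):
    """Keep only the parsed records whose level is WARNING."""
    records = (parse_log_line(line) for line in lines)
    return [rec for rec in records if rec and rec["level"] == "WARNING"]
```

Lines without the bracketed prefix, such as the version banner, are returned as None and can be skipped.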

Several output files are also generated.

(cryspy.out): Short log (only in version 0.10.3 or earlier).
cryspy.stat: Status file.
data/init_POSCARS: Initial structure file in POSCAR format. You can open this file using VESTA.
data/pkl_data: Directory to save pickled data.
log_cryspy: log.
err_cryspy: error and warning.

Let’s take a look at cryspy.stat file.

[status]
id_queueing = 0 1 2 3 4

Structure IDs 0 to 4 are queued because we have just generated the structures and have not submitted them yet.
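If you want to inspect the queue from a script, the [status] section can be read with Python's configparser, since cryspy.stat uses INI syntax. This is only a reading sketch with a hypothetical helper name; cryspy.stat is managed by CrySPY and should not be edited this way.

```python
import configparser


def queued_ids(stat_text):
    """Return the structure IDs listed under id_queueing in cryspy.stat."""
    cfg = configparser.ConfigParser()
    cfg.read_string(stat_text)
    value = cfg.get("status", "id_queueing", fallback="")
    return [int(i) for i in value.split()]


print(queued_ids("[status]\nid_queueing = 0 1 2 3 4\n"))  # [0, 1, 2, 3, 4]
```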

Tip

Check the initial structures. If the distances between atoms are too short, you should set mindist in cryspy.in.

Features > Restriction on interatomic distances

Submit job

2023 July 10, updated

Continue

CrySPY continues the simulation if you have cryspy.stat file.

Tip

Continue if you have cryspy.stat
Start from the beginning if you don’t have cryspy.stat

Submit job

Run CrySPY again.

cryspy

Check the screen or log_cryspy file.

[2023-07-10 18:52:51,859][cryspy_restart][INFO] 


Restart CrySPY 1.2.0


[2023-07-10 18:52:51,869][ctrl_job][INFO] # ---------- job status
[2023-07-10 18:52:51,904][ctrl_job][INFO] ID      0: submit job, Stage 1
[2023-07-10 18:52:51,931][ctrl_job][INFO] ID      1: submit job, Stage 1

And also cryspy.stat file.

...
(omit)
...
[status]
id_queueing = 2 3 4
id      0 = Stage 1
id      1 = Stage 1

CrySPY submitted two jobs for structure ID 0 and 1 as you set njob = 2 in cryspy.in. Calculations are performed in the work directory. These directory names correspond to their structure ID.

tree -d work

work
├── 000000
├── 000001
└── fin

When the two jobs are done, run CrySPY again.

cryspy

[2023-07-10 18:55:01,053][cryspy_restart][INFO] 


Restart CrySPY 1.2.0


[2023-07-10 18:55:01,058][ctrl_job][INFO] # ---------- job status
[2023-07-10 18:55:01,058][ctrl_job][INFO] ID      0: Stage 1 Done!
[2023-07-10 18:55:01,093][ctrl_job][INFO]     collect results: E = -0.00696997755502915 eV/atom
[2023-07-10 18:55:01,132][ctrl_job][INFO] ID      1: Stage 1 Done!
[2023-07-10 18:55:01,133][ctrl_job][INFO]     collect results: E = 0.4934076667166454 eV/atom
[2023-07-10 18:55:01,144][cryspy][INFO] 

recheck 1

[2023-07-10 18:55:01,145][ctrl_job][INFO] # ---------- job status
[2023-07-10 18:55:01,153][ctrl_job][INFO] ID      2: submit job, Stage 1
[2023-07-10 18:55:01,161][ctrl_job][INFO] ID      3: submit job, Stage 1

If you set nstage = 2 (or more), new jobs on stage 2 for ID 0 and 1 are submitted. If you set nstage = 1, CrySPY collects the calculation data of ID 0 and 1, then submits the jobs for the next IDs. Directories of the finished structures are moved to the fin directory.

Repeat cryspy several times until all 5 structures are done. You can delete the work directory when the simulation is done if you do not need it.

The auto script (repeat_cryspy) may help you.

Check results

Move to data directory. There should be a few more files.

$ cd data
$ ls
cryspy_rslt  cryspy_rslt_energy_asc  init_POSCARS  opt_POSCARS  pkl_data/

cryspy_rslt: Result file.
cryspy_rslt_energy_asc: Result file sorted in energy ascending order.
init_POSCARS: Initial structure file in POSCAR format.
opt_POSCARS: Optimized structure file in POSCAR format.
pkl_data/: Directory to save pickled data.

The results are written to text files, cryspy_rslt and cryspy_rslt_energy_asc (and also saved in pickle data in pkl_data directory).

Each result is appended to the cryspy_rslt file in the order in which the calculations finish.

cat cryspy_rslt

   Spg_num Spg_sym  Spg_num_opt Spg_sym_opt  E_eV_atom  Magmom      Opt
0      139  I4/mmm          139      I4/mmm  -3.000850     NaN     done
1       98  I4_122           12        C2/m  -3.978441     NaN  not_yet
2       16    P222           16        P222  -3.348616     NaN  not_yet
3       36  Cmc2_1           36      Cmc2_1  -3.520306     NaN  not_yet
4       36  Cmc2_1            4        P2_1  -3.304168     NaN  not_yet

Info

The entries in cryspy_rslt are not in ID order.

In cryspy_rslt_energy_asc file, the results are sorted in energy ascending order.

cat cryspy_rslt_energy_asc

   Spg_num Spg_sym  Spg_num_opt Spg_sym_opt  E_eV_atom  Magmom      Opt
1       98  I4_122           12        C2/m  -3.978441     NaN  not_yet
3       36  Cmc2_1           36      Cmc2_1  -3.520306     NaN  not_yet
2       16    P222           16        P222  -3.348616     NaN  not_yet
4       36  Cmc2_1            4        P2_1  -3.304168     NaN  not_yet
0      139  I4/mmm          139      I4/mmm  -3.000850     NaN     done

Spg_num and Spg_sym show the space group information of the initial structures. Spg_num_opt and Spg_sym_opt are those of the optimized structures. The last column, Opt, indicates whether the optimization reached the required accuracy.
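Because these result files are plain whitespace-separated tables, they are easy to post-process. The following Python sketch parses the cryspy_rslt listing shown above and sorts it by energy, which reproduces the ID order of cryspy_rslt_energy_asc. read_rslt is a hypothetical helper, and it assumes that no field contains whitespace.

```python
RSLT = """\
   Spg_num Spg_sym  Spg_num_opt Spg_sym_opt  E_eV_atom  Magmom      Opt
0      139  I4/mmm          139      I4/mmm  -3.000850     NaN     done
1       98  I4_122           12        C2/m  -3.978441     NaN  not_yet
2       16    P222           16        P222  -3.348616     NaN  not_yet
3       36  Cmc2_1           36      Cmc2_1  -3.520306     NaN  not_yet
4       36  Cmc2_1            4        P2_1  -3.304168     NaN  not_yet
"""


def read_rslt(text):
    """Parse the table into a list of dicts, one per structure."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    rows = []
    for ln in lines[1:]:
        fields = ln.split()
        row = dict(zip(header, fields[1:]))  # the first field is the structure ID
        row["ID"] = int(fields[0])
        row["E_eV_atom"] = float(row["E_eV_atom"])
        rows.append(row)
    return rows


by_energy = sorted(read_rslt(RSLT), key=lambda r: r["E_eV_atom"])
print([r["ID"] for r in by_energy])  # [1, 3, 2, 4, 0]
```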

Append structures

Of course, 5 structures are not enough to find stable structures. You can append structures whenever you want. Here, let’s append 5 more structures.

For the Si-Si mindist, the default value of 1.11 Å is used in the first structure generation (see log_cryspy), which is a little too short. Let us set the mindist to 2.0 Å.

Info

For mindist, see also Features > Restriction on interatomic distances.

Edit cryspy.in, change the value of tot_struc to 10, and add mindist_1 = 2.0.

emacs cryspy.in
cat cryspy.in

[basic]
algo = RS
calc_code = soiap
tot_struc = 10
nstage = 1
njob = 2
jobcmd = zsh
jobfile = job_cryspy

[structure]
atype = Si
nat = 8
mindist_1 = 2.0

[soiap]
soiap_infile = soiap.in
soiap_outfile = soiap.out
soiap_cif = initial.cif

[option]

Then run cryspy, and check log_cryspy file.

cryspy &
cat log_cryspy

...
(omit)
...

2023/03/19 00:01:47
CrySPY 1.0.0
Restart cryspy.py


Changed tot_struc from 5 to 10
Changed mindist from None to [[2.0]]

Backup data

# ---------- Append structures
# ------ mindist
Si - Si 2.0
Structure ID      5 was generated. Space group: 218 --> 221 Pm-3m
Structure ID      6 was generated. Space group:  86 --> 129 P4/nmm
Structure ID      7 was generated. Space group: 129 --> 129 P4/nmm
Structure ID      8 was generated. Space group: 191 --> 191 P6/mmm
Structure ID      9 was generated. Space group:  31 -->  31 Pmn2_1
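The log prints mindist as a nested list, [[2.0]], that is, one row per atom type. The sketch below shows in Python how the mindist_1, mindist_2, ... lines of cryspy.in map onto such a matrix. The helper name is hypothetical, and the symmetry check is an assumption of this sketch (the A-B distance should equal the B-A distance), not a statement about CrySPY's own validation.

```python
def mindist_matrix(rows):
    """Build the matrix from the mindist_1, mindist_2, ... values (given in order)."""
    matrix = [[float(x) for x in row.split()] for row in rows]
    n = len(matrix)
    if any(len(r) != n for r in matrix):
        raise ValueError("need one row per atom type, each with one value per atom type")
    for i in range(n):
        for j in range(i):
            if matrix[i][j] != matrix[j][i]:
                raise ValueError(f"mindist[{i}][{j}] differs from mindist[{j}][{i}]")
    return matrix


print(mindist_matrix(["2.0"]))                 # [[2.0]]
print(mindist_matrix(["2.5 1.5", "1.5 2.5"]))  # [[2.5, 1.5], [1.5, 2.5]]
```

The second call uses the values from the Na8Cl8 example (atype = Na Cl).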

Remember that CrySPY goes into structure generation mode whenever you change the value of tot_struc. In this mode, CrySPY does not do any other action such as collecting data, submitting jobs, and so on.

Note

CrySPY enters structure generation mode whenever you change the value of tot_struc.
From version 1.0.0, CrySPY automatically backs up when adding structures. See features/backup.

Repeat cryspy & several times until all appended structures are done. The auto script (repeat_cryspy) may help you.

Analysis and visualization

Download data

It is assumed here that you analyze and visualize CrySPY data on your local PC. If you use CrySPY on a supercomputer or workstation, download the data to your local machine. You can delete the work and backup directories if they are not needed, as their file size can be very large.

Jupyter notebook

Move to the data/ directory in the results you downloaded earlier. Then, if the CrySPY utility has already been downloaded locally, copy cryspy_analyzer_RS.ipynb. Alternatively, you can download it directly from GitHub (CrySPY_utility/notebook/). Launch Jupyter (e.g., VS Code, Jupyter Lab, or Jupyter Notebook), and simply run the cells in order to obtain a figure like the one shown below.

Evolutionary Algorithm (EA)

Info

For the basic usage of CrySPY, start by referring to Tutorial > Random Search (RS).
For details of the algorithm, refer to Search Algorithms > Evolutionary algorithm (EA).

Preparation of input files

Follow one of the instructions below, then proceed to the section on running CrySPY.

ASE on your local PC

Running CrySPY

ASE on your local PC

2025 April 5

The files used in this tutorial can be downloaded from CrySPY_utility/examples/ase_Cu8Au8_EA. This tutorial demonstrates a test run on a local machine using ASE’s lightweight Pure Python EMT calculator. The target system is Cu8Au8.

cryspy.in

Example of cryspy.in.

[basic]
algo = EA
calc_code = ASE
nstage = 1
njob = 5
jobcmd = zsh
jobfile = job_cryspy

[structure]
atype = Cu Au
nat = 8 8

[EA]
n_pop = 10
n_crsov = 5
n_perm = 2
n_strain = 2
n_rand = 1
n_elite = 1
n_fittest = 5
slct_func = TNM
t_size = 2
maxgen_ea = 0

[ASE]
ase_python = ase_in.py

[option]

Use algo = EA.
When using a bash environment, set jobcmd to bash.
In EA, tot_struc is not used. The number of structures is determined by n_pop.
n_pop = n_crsov + n_perm + n_strain + n_rand
For parameters in the [EA] section, refer to Input file > [EA] section and Search algorithms > Evolutionary algorithm (EA).
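The population constraint above is easy to get wrong when tuning the individual counts. The following is a minimal Python check, with the values taken from the example cryspy.in; check_population is a hypothetical helper, not part of CrySPY.

```python
def check_population(ea):
    """Verify n_pop = n_crsov + n_perm + n_strain + n_rand and return the sum."""
    parts = ("n_crsov", "n_perm", "n_strain", "n_rand")
    total = sum(ea[key] for key in parts)
    if total != ea["n_pop"]:
        raise ValueError(f"n_pop = {ea['n_pop']}, but the operations sum to {total}")
    return total


ea = {"n_pop": 10, "n_crsov": 5, "n_perm": 2, "n_strain": 2, "n_rand": 1}
print(check_population(ea))  # 10
```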

calc_in/

The contents under calc_in/ are the same as in Tutorial > Random Search (RS) > ASE on your local PC.

calc_in/ase_in.py_1

from ase.constraints import FixSymmetry
from ase.filters import FrechetCellFilter
from ase.calculators.emt import EMT
from ase.optimize import BFGS
import numpy as np
from ase.io import read, write

# ---------- input structure
# CrySPY outputs 'POSCAR' as an input file in work/xxxxxx directory
atoms = read('POSCAR', format='vasp')

# ---------- setting and run
atoms.calc = EMT()
atoms.set_constraint([FixSymmetry(atoms)])
cell_filter = FrechetCellFilter(atoms, hydrostatic_strain=False)
opt = BFGS(cell_filter)
opt.run(fmax=0.01, steps=2000)

# ---------- opt. structure and energy
# [rule in ASE interface]
# output file for energy: 'log.tote' in eV/cell
#                         CrySPY reads the last line of 'log.tote'
# output file for structure: 'CONTCAR' in vasp format
e = cell_filter.atoms.get_total_energy()
with open('log.tote', mode='w') as f:
    f.write(str(e))

# ------ write structure
opt_atoms = cell_filter.atoms.copy()
opt_atoms.set_constraint(None)    # remove constraint for pymatgen
write('CONTCAR', opt_atoms, format='vasp', direct=True)
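The script comments state that log.tote holds the energy in eV/cell and that CrySPY reads its last line, while the result files report E_eV_atom. The sketch below shows that conversion in Python under those two statements. The helper name is hypothetical, skipping trailing blank lines is an assumption of the sketch, and CrySPY's actual reader may differ in detail.

```python
def energy_per_atom(log_tote_text, natoms):
    """Take the last non-empty line (eV/cell) and divide by the number of atoms."""
    lines = [ln for ln in log_tote_text.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("log.tote contains no energy")
    return float(lines[-1]) / natoms
```

For example, a last line of -0.8 for a 16-atom cell corresponds to -0.05 eV/atom.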

calc_in/job_cryspy

#!/bin/sh

# ---------- ASE
python3 ase_in.py > out.log

# ---------- CrySPY
sed -i -e '3 s/^.*$/done/' stat_job

Create next generation

2025 April 6

First run

When you run cryspy, the program enters structure generation mode. It generates the first generation of random structures and then exits.

cryspy

log

...
[2025-04-06 09:15:34,720][ea_init][INFO] # ---------- Initialize evolutionary algorithm
[2025-04-06 09:15:34,720][ea_init][INFO] # ------ Generation 1
[2025-04-06 09:15:34,720][ea_init][INFO] 10 structures by random

In EA, running cryspy appends the current generation’s information to cryspy.stat.

[status]
generation = 1
id_queueing = 0 1 2 3 4 5 6 7 8 9

Optimize structures

After running cryspy several times and completing the structure optimization for the first generation, the output will appear as shown below.

...
[2025-04-06 09:20:26,218][ctrl_job][INFO] Done generation 1
[2025-04-06 09:20:26,218][ctrl_job][INFO] 
EA is ready

Create next generation

Once all preparations are complete, running cryspy again automatically creates a backup and starts generating the next-generation structures.

cryspy

...
[2025-04-06 09:35:11,546][cryspy_restart][INFO] read input, cryspy.in
[2025-04-06 09:35:11,554][ctrl_job][INFO] # ---------- job status
[2025-04-06 09:35:11,554][ctrl_job][INFO] Done generation 1
[2025-04-06 09:35:11,554][utility][INFO] Backup data
[2025-04-06 09:35:11,611][ea_next_gen][INFO] # ---------- Evolutionary algorithm
[2025-04-06 09:35:11,611][ea_next_gen][INFO] Generation 2
[2025-04-06 09:35:11,613][ea_next_gen][INFO] # ------ natural selection
[2025-04-06 09:35:11,687][ea_next_gen][INFO] ranking without duplication (including elite):
[2025-04-06 09:35:11,687][ea_next_gen][INFO] Structure ID      1, fitness:   -0.00530
[2025-04-06 09:35:11,687][ea_next_gen][INFO] Structure ID      3, fitness:    0.01490
[2025-04-06 09:35:11,687][ea_next_gen][INFO] Structure ID      4, fitness:    0.04485
[2025-04-06 09:35:11,687][ea_next_gen][INFO] Structure ID      7, fitness:    0.11501
[2025-04-06 09:35:11,687][ea_next_gen][INFO] Structure ID      8, fitness:    0.15254
[2025-04-06 09:35:11,687][ea_next_gen][INFO] # ------ Generate children
[2025-04-06 09:35:11,687][ea_child][INFO] # -- mindist
[2025-04-06 09:35:11,689][struc_util][INFO] Cu - Cu: 1.32
[2025-04-06 09:35:11,689][struc_util][INFO] Cu - Au: 1.34
[2025-04-06 09:35:11,689][struc_util][INFO] Au - Au: 1.36
[2025-04-06 09:35:11,740][crossover][INFO] Structure ID     10 (8, 8) was generated from      3 and      1 by crossover. Space group:   1 P1
[2025-04-06 09:35:11,764][crossover][WARNING] remove_within_mindist: some atoms within mindist. retry.
[2025-04-06 09:35:11,774][crossover][INFO] Structure ID     11 (8, 8) was generated from      3 and      1 by crossover. Space group:   1 P1
[2025-04-06 09:35:11,789][crossover][INFO] Structure ID     12 (8, 8) was generated from      1 and      4 by crossover. Space group:   1 P1
[2025-04-06 09:35:11,833][crossover][INFO] Structure ID     13 (8, 8) was generated from      1 and      3 by crossover. Space group:   1 P1
[2025-04-06 09:35:11,852][crossover][WARNING] mindist in _add_border_line: Cu - Cu, 0.567032320824818. retry.
[2025-04-06 09:35:11,861][crossover][INFO] Structure ID     14 (8, 8) was generated from      7 and      1 by crossover. Space group:   1 P1
[2025-04-06 09:35:11,875][permutation][INFO] Structure ID     15 (8, 8) was generated from      1 by permutation. Space group: 146 R3
[2025-04-06 09:35:11,888][permutation][INFO] Structure ID     16 (8, 8) was generated from      3 by permutation. Space group:   1 P1
[2025-04-06 09:35:11,890][strain][WARNING] mindist in strain: Cu - Cu, 1.3050485787603692. retry.
[2025-04-06 09:35:11,903][strain][INFO] Structure ID     17 (8, 8) was generated from      3 by strain. Space group:   1 P1
[2025-04-06 09:35:11,917][strain][INFO] Structure ID     18 (8, 8) was generated from      1 by strain. Space group:   1 P1
[2025-04-06 09:35:12,513][ea_child][INFO] # ------ Random structure generation
[2025-04-06 09:35:12,513][rs_gen][INFO] # ------ mindist
[2025-04-06 09:35:12,515][struc_util][INFO] Cu - Cu: 1.32
[2025-04-06 09:35:12,515][struc_util][INFO] Cu - Au: 1.34
[2025-04-06 09:35:12,515][struc_util][INFO] Au - Au: 1.36
[2025-04-06 09:35:12,516][rs_gen][INFO] # ------ generate structures
[2025-04-06 09:35:12,530][gen_pyxtal][INFO] Structure ID     19: (8, 8) Space group:  86 -->  86 P4_2/n
[2025-04-06 09:35:12,533][ea_next_gen][INFO] # ------ Select elites
[2025-04-06 09:35:12,533][ea_next_gen][INFO] Structure ID      9 keeps as the elite

After that, simply running cryspy repeatedly will advance the structure search.

Check results

cryspy_rslt

The following is an example of cryspy_rslt after completing calculations up to the third generation. In EA, generation information (Gen) is also included.

    Gen  Spg_num Spg_sym  Spg_num_opt Spg_sym_opt  E_eV_atom  Magmom      Opt
0     1      214  I4_132          230       Ia-3d   1.168743     NaN  no_file
1     1      198   P2_13          198       P2_13  -0.005303     NaN  no_file
2     1       95  P4_322           95      P4_322   0.389566     NaN  no_file
3     1       27    Pcc2           27        Pcc2   0.014898     NaN  no_file
4     1       60    Pbcn           60        Pbcn   0.044852     NaN  no_file
5     1      116   P-4c2          116       P-4c2   0.403246     NaN  no_file
6     1      187   P-6m2          187       P-6m2   1.054706     NaN  no_file
7     1      161     R3c          160         R3m   0.115009     NaN  no_file
8     1      146      R3          146          R3   0.152535     NaN  no_file
9     1       60    Pbcn           47        Pmmm  -0.005676     NaN  no_file
10    2        1      P1            1          P1   0.026070     NaN  no_file
11    2        1      P1            7          Pc   0.005898     NaN  no_file
12    2        1      P1            1          P1   0.005208     NaN  no_file
13    2        1      P1            1          P1   0.005506     NaN  no_file
14    2        1      P1            1          P1   0.024364     NaN  no_file
15    2      146      R3          146          R3   0.011525     NaN  no_file
16    2        1      P1            1          P1   0.014590     NaN  no_file
17    2        1      P1            1          P1   0.015236     NaN  no_file
18    2        1      P1            2         P-1  -0.012335     NaN  no_file
19    2       86  P4_2/n          140      I4/mcm   0.274548     NaN  no_file
20    3        1      P1            1          P1   0.013611     NaN  no_file
21    3        1      P1           10        P2/m  -0.014166     NaN  no_file
22    3        1      P1            1          P1   0.019472     NaN  no_file
23    3        1      P1            1          P1   0.011641     NaN  no_file
24    3        1      P1            1          P1   0.000297     NaN  no_file
25    3        1      P1            1          P1   0.005596     NaN  no_file
26    3        1      P1            1          P1   0.013374     NaN  no_file
27    3        1      P1            2         P-1   0.005055     NaN  no_file
28    3        2     P-1           12        C2/m  -0.012396     NaN  no_file
29    3      182  P6_322          182      P6_322   0.711472     NaN  no_file

ea_info

The EA parameters used in each generation are recorded in ea_info.

Gen Population Crossover Permutation Strain Random Elite crs_lat slct_func
  1         10         0           0      0     10     0  random       TNM
  2         10         5           2      2      1     1  random       TNM
  3         10         5           2      2      1     1  random       TNM

ea_origin

Information about the structure generation method and parent individuals is output to ea_origin.

Gen Struc_ID   Operation   Parent
  1        0      random     None
  1        1      random     None
  1        2      random     None
  1        3      random     None
  1        4      random     None
  1        5      random     None
  1        6      random     None
  1        7      random     None
  1        8      random     None
  1        9      random     None
  2       10   crossover   (3, 1)
  2       11   crossover   (3, 1)
  2       12   crossover   (1, 4)
  2       13   crossover   (1, 3)
  2       14   crossover   (7, 1)
  2       15 permutation     (1,)
  2       16 permutation     (3,)
  2       17      strain     (3,)
  2       18      strain     (1,)
  2       19      random     None
  2        9       elite    elite
  3       20   crossover (18, 12)
  3       21   crossover  (12, 9)
  3       22   crossover (12, 18)
  3       23   crossover  (18, 9)
  3       24   crossover (13, 18)
  3       25 permutation    (18,)
  3       26 permutation     (9,)
  3       27      strain    (18,)
  3       28      strain    (18,)
  3       29      random     None
  3       18       elite    elite
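The ea_origin listing can be cross-checked against ea_info by counting operations per generation. Here is a minimal Python sketch using the generation 2 rows from above; the parser is a hypothetical helper that splits each line into four fields so that parent tuples such as (3, 1) stay intact.

```python
from collections import Counter

EA_ORIGIN = """\
Gen Struc_ID   Operation   Parent
  2       10   crossover   (3, 1)
  2       11   crossover   (3, 1)
  2       12   crossover   (1, 4)
  2       13   crossover   (1, 3)
  2       14   crossover   (7, 1)
  2       15 permutation     (1,)
  2       16 permutation     (3,)
  2       17      strain     (3,)
  2       18      strain     (1,)
  2       19      random     None
  2        9       elite    elite
"""


def operations_per_generation(text):
    """Return {generation: Counter of operations} from an ea_origin listing."""
    counts = {}
    for line in text.splitlines()[1:]:  # skip the header line
        if not line.strip():
            continue
        gen, _struc_id, operation, _parent = line.split(None, 3)
        counts.setdefault(int(gen), Counter())[operation] += 1
    return counts


print(dict(operations_per_generation(EA_ORIGIN)[2]))
# {'crossover': 5, 'permutation': 2, 'strain': 2, 'random': 1, 'elite': 1}
```

These counts agree with the generation 2 row of ea_info (Crossover 5, Permutation 2, Strain 2, Random 1, Elite 1).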

Analysis and visualization

Download data

Jupyter notebook

Move to the data/ directory in the results you downloaded earlier. Then, if the CrySPY utility has already been downloaded locally, copy cryspy_analyzer_EA.ipynb. Alternatively, you can download it directly from GitHub (CrySPY_utility/notebook/). Launch Jupyter (e.g., VS Code, Jupyter Lab, or Jupyter Notebook), and simply run the cells in order to obtain a figure like the one shown below.

Variable-composition evolutionary algorithm (EA-vc)

Info

For the basic usage of CrySPY, start by referring to CrySPY > Tutorial > Random Search (RS).
For the basic usage of EA, start by referring to CrySPY > Tutorial > Evolutionary algorithm (EA).
For details of the algorithm, refer to CrySPY > Search algorithms > Variable-composition Evolutionary algorithm (EA-vc).

Info

System requirements

CrySPY 1.4.0 or later
As of June 2025, only the ASE interface is supported (see CrySPY > Interface)

Preparation of input files

Follow one of the instructions below, then proceed to the section on running CrySPY.

Running CrySPY

ASE on your local PC (Cu-Ag-Au)

2025 June 16

The files used in this tutorial can be downloaded from CrySPY_utility/examples/ase_Cu-Ag-Au_EA-vc. This tutorial demonstrates a test run on a local machine using ASE’s lightweight Pure Python EMT calculator. The target system is the ternary Cu-Ag-Au system.

cryspy.in

Example of cryspy.in.

[basic]
algo = EA-vc
calc_code = ASE
nstage = 1
njob = 5
jobcmd = zsh
jobfile = job_cryspy

[structure]
atype = Cu Ag Au
ll_nat = 0 0 0
ul_nat = 8 8 8

[ASE]
ase_python = ase_in.py

[EA]
n_pop = 20
n_crsov = 5
n_perm = 2
n_strain = 2
n_rand = 2
n_add = 3
n_elim = 3
n_subs = 3
target = random
n_elite = 2
n_fittest = 10
slct_func = TNM
t_size = 2
end_point = 0.0 0.0 0.0
maxgen_ea = 0

[option]

Use algo = EA-vc.
When using a bash environment, set jobcmd to bash.
In the [structure] section, use ll_nat to specify the minimum number of atoms for each element, and ul_nat for the maximum.
n_pop = n_crsov + n_perm + n_strain + n_rand + n_add + n_elim + n_subs
The end_point field should be set to the energy per atom (eV/atom) of each pure element (Cu, Ag, and Au). When using ASE’s Pure Python EMT calculator, the energies of the pure elements are zero, so enter 0.0 in each case. (For an example using CHGNet, refer to ASE-CHGNet (Cu-Au).)
For the parameters in the [EA] section, refer to CrySPY > Input file > [EA] Section and CrySPY > Search algorithms > Variable-composition evolutionary algorithm (EA-vc).
Refer also to CrySPY > Tutorial > Variable-composition evolutionary algorithm (EA-vc) > Analysis and visualization for the parameters of the automatically generated convex hull plot.
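
The n_pop constraint in the list above is easy to break when editing the operator counts. As a minimal sketch (not part of CrySPY; the helper name check_n_pop is hypothetical), the [EA] section can be parsed with Python's standard configparser and the sum checked before running:

```python
import configparser

# Operator counts that must add up to n_pop in EA-vc (from the list above)
OPERATOR_KEYS = ("n_crsov", "n_perm", "n_strain", "n_rand",
                 "n_add", "n_elim", "n_subs")


def check_n_pop(text):
    """Return (n_pop, operator_sum) parsed from cryspy.in-style text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    ea = parser["EA"]
    n_pop = ea.getint("n_pop")
    total = sum(ea.getint(key) for key in OPERATOR_KEYS)
    return n_pop, total


# The [EA] values from the example cryspy.in above
EXAMPLE = """
[EA]
n_pop = 20
n_crsov = 5
n_perm = 2
n_strain = 2
n_rand = 2
n_add = 3
n_elim = 3
n_subs = 3
"""

n_pop, total = check_n_pop(EXAMPLE)
print(f"n_pop = {n_pop}, sum of operators = {total}")
```

If the two numbers differ, adjust n_pop or the operator counts before starting the run.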

calc_in/

The contents under calc_in/ are the same as those in Tutorial > Random Search (RS) > ASE on your local PC.

calc_in/ase_in.py_1

from ase.constraints import FixSymmetry
from ase.filters import FrechetCellFilter
from ase.calculators.emt import EMT
from ase.optimize import BFGS
import numpy as np
from ase.io import read, write

# ---------- input structure
# CrySPY outputs 'POSCAR' as an input file in work/xxxxxx directory
atoms = read('POSCAR', format='vasp')

# ---------- setting and run
atoms.calc = EMT()
atoms.set_constraint([FixSymmetry(atoms)])
cell_filter = FrechetCellFilter(atoms, hydrostatic_strain=False)
opt = BFGS(cell_filter)
opt.run(fmax=0.01, steps=2000)

# ---------- opt. structure and energy
# [rule in ASE interface]
# output file for energy: 'log.tote' in eV/cell
#                         CrySPY reads the last line of 'log.tote'
# output file for structure: 'CONTCAR' in vasp format
e = cell_filter.atoms.get_total_energy()
with open('log.tote', mode='w') as f:
    f.write(str(e))

# ------ write structure
opt_atoms = cell_filter.atoms.copy()
opt_atoms.set_constraint(None)    # remove constraint for pymatgen
write('CONTCAR', opt_atoms, format='vasp', direct=True)

calc_in/job_cryspy

#!/bin/sh

# ---------- ASE
python3 ase_in.py > out.log

# ---------- CrySPY
sed -i -e '3 s/^.*$/done/' stat_job
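
The sed line overwrites the third line of stat_job with done once the calculation has finished. On systems where sed -i behaves differently (BSD/macOS sed treats the argument after -i as a backup suffix), the same edit could be made from Python instead. A minimal sketch, with a hypothetical helper name:

```python
from pathlib import Path


def mark_stat_job(path, status="done"):
    """Replace the third line of the file at `path` with `status`."""
    file = Path(path)
    lines = file.read_text().splitlines()
    if len(lines) < 3:
        raise ValueError(f"{path} has fewer than 3 lines")
    lines[2] = status            # same effect as: sed '3 s/^.*$/done/'
    file.write_text("\n".join(lines) + "\n")
```

In the job script, this would be called as mark_stat_job("stat_job") after the ASE calculation finishes.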

ASE-CHGNet (Cu-Au)

2025 June 16

Info

CHGNet needs to be installed.

The files used in this tutorial can be downloaded from CrySPY_utility/examples/ase_chgnet_Cu-Au_EA-vc. In this tutorial, we assume that a computing cluster with a job scheduler is used together with the machine learning potential CHGNet. The calculation can also be performed on a local PC, so if you prefer this, please modify the input settings accordingly. The target system is the binary Cu-Au system.

Pre-calculation

In EA-vc, the per-atom energies of each elemental phase must be used as the reference in the end_point setting of cryspy.in, so they need to be calculated beforehand. There should be two directories inside the example.

Au-fcc
├── POSCAR
├── chgnet_in.py
└── job_cryspy

Cu-fcc
├── POSCAR
├── chgnet_in.py
└── job_cryspy

Each directory contains a crystal structure file (POSCAR), a Python script (chgnet_in.py) to perform structure relaxation and calculate energy, and a job script (job_cryspy). Please modify these files according to your computing environment.

Submit the job (replace the job submission command as appropriate for your system).

cd Au-fcc
qsub job_cryspy
cd ../Cu-fcc
qsub job_cryspy
cd ..

When the calculations finish successfully, a file named end_point will be created in each directory, containing the energy per atom (eV/atom) after structure relaxation.

cat Au-fcc/end_point
-3.2357187271118164

cat Cu-fcc/end_point
-4.083529472351074

These values are then used as input for the cryspy.in file.
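
To avoid copy-and-paste mistakes, the end_point line can be assembled from the two files. The following is a minimal sketch, assuming each end_point file contains a single number as shown above; the function name and the eight-decimal formatting are choices made for this example only:

```python
from pathlib import Path


def end_point_line(atype, directories):
    """Build the end_point line for cryspy.in.

    atype: element symbols in the same order as atype in cryspy.in.
    directories: mapping from element symbol to its pre-calculation directory.
    """
    energies = []
    for element in atype:
        text = (Path(directories[element]) / "end_point").read_text()
        energies.append(float(text.split()[0]))
    return "end_point = " + "  ".join(f"{e:.8f}" for e in energies)
```

For the directory layout above, end_point_line(["Cu", "Au"], {"Cu": "Cu-fcc", "Au": "Au-fcc"}) returns the line in atype order.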

cryspy.in

[basic]
algo = EA-vc
calc_code = ASE
nstage = 1
njob = 20
jobcmd = qsub
jobfile = job_cryspy

[structure]
atype = Cu Au
ll_nat = 0 0
ul_nat = 8 8

[ASE]
ase_python = chgnet_in.py

[EA]
n_pop = 20
n_crsov = 5
n_perm = 2
n_strain = 2
n_rand = 2
n_add = 3
n_elim = 3
n_subs = 3
target = random
n_elite = 2
n_fittest = 10
slct_func = TNM
t_size = 2
maxgen_ea = 0
end_point = -4.08352709  -3.23571777

[option]

Use algo = EA-vc.
Change jobcmd according to your computing environment.
In the [structure] section, use ll_nat to specify the minimum number of atoms for each element, and ul_nat for the maximum.
n_pop = n_crsov + n_perm + n_strain + n_rand + n_add + n_elim + n_subs
In end_point, enter the per-atom energies (in eV/atom) of each pure element, Cu and Au. Use the values obtained from the pre-calculations, and make sure the order follows that of atype.
For the parameters in the [EA] section, refer to CrySPY > Input file > [EA] Section and CrySPY > Search algorithms > Variable-composition evolutionary algorithm (EA-vc).
Refer also to CrySPY > Tutorial > Variable-composition evolutionary algorithm (EA-vc) > Analysis and visualization for the parameters of the automatically generated convex hull plot.

calc_in/

The contents under calc_in/ are the same as those in Tutorial > Random Search (RS) > ASE on your local PC, with minor modifications for CHGNet. Be sure to adjust paths such as the Python executable in the job script to match your computing environment.

calc_in/chgnet_in.py_1

# ---------- import
from ase.constraints import FixSymmetry
from ase.filters import FrechetCellFilter
from ase.io import read, write
from ase.optimize import FIRE, BFGS, LBFGS
from chgnet.model import CHGNetCalculator

# ---------- input structure
# CrySPY outputs 'POSCAR' as an input file in work/xxxxxx directory
atoms = read('POSCAR')

# ---------- set up
atoms.calc = CHGNetCalculator()
atoms.set_constraint([FixSymmetry(atoms)])
cell_filter = FrechetCellFilter(atoms)
opt = BFGS(cell_filter, trajectory='opt.traj')

# ---------- run
opt.run(fmax=0.01, steps=2000)

# ---------- write structure
write('opt_struc.vasp', cell_filter.atoms, format='vasp', direct=True)

# ---------- opt. structure and energy
# [rule in ASE interface]
# output file for energy: 'log.tote' in eV/cell
#                         CrySPY reads the last line of 'log.tote'
# output file for structure: 'CONTCAR' in vasp format
# ------ energy
e = cell_filter.atoms.get_total_energy()    # eV/cell
with open('log.tote', mode='w') as f:
    f.write(str(e))

# ------ struc
opt_atoms = cell_filter.atoms.copy()
opt_atoms.set_constraint(None)    # remove constraint for pymatgen
write('CONTCAR', opt_atoms, format='vasp', direct=True)

calc_in/job_cryspy

#!/bin/sh
#$ -cwd
#$ -V -S /bin/bash
####$ -V -S /bin/zsh
#$ -N CuAu_CrySPY_ID
#$ -pe smp 2

# ---------- OpenMP
export OMP_NUM_THREADS=2

# ---------- ASE
/usr/local/Python-3.10.13/bin/python3 chgnet_in.py > out.log

# --------- CrySPY
sed -i -e '3 s/^.*$/done/' stat_job

Create next generation

2025 June 16

First run

When you run cryspy, the program enters structure generation mode. It generates the first generation of random structures and then exits.

cryspy

The output confirms that each structure is generated with the number of atoms within the range specified by ll_nat and ul_nat.

...
[2025-06-16 10:04:45,648][cryspy_init][INFO] # ---------- Initial structure generation
[2025-06-16 10:04:45,648][rs_gen][INFO] # ------ mindist
[2025-06-16 10:04:45,650][struc_util][INFO] Cu - Cu: 1.32
[2025-06-16 10:04:45,650][struc_util][INFO] Cu - Ag: 1.385
[2025-06-16 10:04:45,650][struc_util][INFO] Cu - Au: 1.34
[2025-06-16 10:04:45,650][struc_util][INFO] Ag - Ag: 1.45
[2025-06-16 10:04:45,650][struc_util][INFO] Ag - Au: 1.405
[2025-06-16 10:04:45,650][struc_util][INFO] Au - Au: 1.36
[2025-06-16 10:04:45,650][rs_gen][INFO] # ------ generate structures
[2025-06-16 10:04:45,659][gen_pyxtal][WARNING] Compoisition [1 4] not compatible with symmetry 34:     spg = 34 retry.
[2025-06-16 10:04:45,662][gen_pyxtal][WARNING] Compoisition [ 2  2 12] not compatible with symmetry 39:     spg = 39 retry.
[2025-06-16 10:04:45,691][gen_pyxtal][INFO] Structure ID      0: (3, 1, 2) Space group:  82 --> 119 I-4m2
[2025-06-16 10:04:45,694][gen_pyxtal][WARNING] Compoisition [6 6 2] not compatible with symmetry 57:     spg = 57 retry.
[2025-06-16 10:04:45,749][gen_pyxtal][INFO] Structure ID      1: (1, 8, 5) Space group:  71 -->  71 Immm
[2025-06-16 10:04:45,857][gen_pyxtal][INFO] Structure ID      2: (3, 7, 8) Space group: 174 --> 174 P-6
...

As in EA, the current generation is recorded in the file cryspy.stat.

[status]
generation = 1
id_queueing = 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19

Optimize structures

After running cryspy several times and completing the structure optimization for the first generation, the output will appear as shown below.

...
[2025-06-16 10:25:56,962][ctrl_job][INFO] Done generation 1
[2025-06-16 10:25:56,962][ctrl_job][INFO] Calculate convex hull for generation 1
[2025-06-16 10:25:57,854][ctrl_job][INFO] 
EA is ready

Convex hull

At this point, the hull distance data and the convex hull plot have been output to ./data/convex_hull/.

hull_dist_all_gen_1

    ID    hull distance (eV/atom)    Num_atom
     7                   0.000000    (0, 2, 6)
    14                   0.036510    (1, 7, 6)
    17                   0.064702    (0, 1, 5)
    19                   0.113649    (0, 0, 8)
    16                   0.168530    (6, 4, 8)
     9                   0.186497    (8, 4, 6)
     1                   0.187379    (1, 8, 5)
    11                   0.233893    (4, 5, 4)
     3                   0.273365    (6, 5, 5)
    10                   0.326759    (1, 4, 4)
     2                   0.330749    (3, 7, 8)
     8                   0.359543    (6, 2, 7)
     4                   0.404169    (4, 4, 2)
    18                   0.422989    (0, 6, 8)
    13                   0.428456    (0, 6, 3)
     5                   0.444792    (7, 4, 7)
     6                   0.464305    (7, 7, 7)
    12                   0.556654    (3, 0, 0)
    15                   0.560062    (6, 7, 1)
     0                   0.644278    (3, 1, 2)
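
To make the hull distance column concrete, here is a minimal sketch for a binary system, written in plain Python. It is not CrySPY's implementation: the function names and the sample data are hypothetical, and it assumes formation energies are given against the fraction of one element, with the pure elements at zero.

```python
def lower_hull(points):
    """Lower convex hull of (x, Ef) points (Andrew's monotone chain)."""
    # Keep only the lowest energy at each composition
    best = {}
    for x, e in points:
        if x not in best or e < best[x]:
            best[x] = e
    hull = []
    for p in sorted(best.items()):
        while len(hull) >= 2:
            (x1, e1), (x2, e2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - e1) - (e2 - e1) * (p[0] - x1)
            if cross <= 0:      # hull[-1] lies on or above the chord: drop it
                hull.pop()
            else:
                break
        hull.append(p)
    return hull


def hull_distance(x, e, hull):
    """Energy of (x, e) above the piecewise-linear hull, in the units of e."""
    for (x1, e1), (x2, e2) in zip(hull, hull[1:]):
        if x1 <= x <= x2:
            e_hull = e1 + (e2 - e1) * (x - x1) / (x2 - x1)
            return e - e_hull
    raise ValueError("x is outside the hull range")


# Hypothetical formation energies (eV/atom) versus the fraction of one element
data = [(0.0, 0.0), (1.0, 0.0), (0.5, -0.10), (0.25, -0.02), (0.75, 0.05)]
hull = lower_hull(data)
for x, e in data:
    print(f"x = {x:.2f}  hull distance = {hull_distance(x, e, hull):.3f}")
```

Structures on the hull have a distance of zero, which corresponds to the 0.000000 entries at the top of the file.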

conv_hull_gen_1.svg

Create next generation

Once all preparations are complete, running cryspy again automatically creates a backup and starts generating the next-generation structures.

cryspy

...
[2025-06-16 10:37:19,860][ctrl_job][INFO] Done generation 1
[2025-06-16 10:37:20,136][utility][INFO] Backup data
[2025-06-16 10:37:20,173][ea_next_gen][INFO] # ---------- Evolutionary algorithm
[2025-06-16 10:37:20,174][ea_next_gen][INFO] Generation 2
[2025-06-16 10:37:20,174][ea_next_gen][INFO] # ------ natural selection
[2025-06-16 10:37:20,177][ea_next_gen][INFO] ranking without duplication (including elite):
[2025-06-16 10:37:20,177][ea_next_gen][INFO] Structure ID      7, fitness:    0.00000
[2025-06-16 10:37:20,177][ea_next_gen][INFO] Structure ID     14, fitness:    0.03651
[2025-06-16 10:37:20,177][ea_next_gen][INFO] Structure ID     17, fitness:    0.06470
[2025-06-16 10:37:20,177][ea_next_gen][INFO] Structure ID     19, fitness:    0.11365
[2025-06-16 10:37:20,177][ea_next_gen][INFO] Structure ID     16, fitness:    0.16853
[2025-06-16 10:37:20,177][ea_next_gen][INFO] Structure ID      9, fitness:    0.18650
[2025-06-16 10:37:20,177][ea_next_gen][INFO] Structure ID      1, fitness:    0.18738
[2025-06-16 10:37:20,177][ea_next_gen][INFO] Structure ID     11, fitness:    0.23389
[2025-06-16 10:37:20,177][ea_next_gen][INFO] Structure ID      3, fitness:    0.27336
[2025-06-16 10:37:20,177][ea_next_gen][INFO] Structure ID     10, fitness:    0.32676
[2025-06-16 10:37:20,177][ea_next_gen][INFO] # ------ Generate children
[2025-06-16 10:37:20,177][ea_child][INFO] # -- mindist
[2025-06-16 10:37:20,179][struc_util][INFO] Cu - Cu: 1.32
[2025-06-16 10:37:20,179][struc_util][INFO] Cu - Ag: 1.385
[2025-06-16 10:37:20,179][struc_util][INFO] Cu - Au: 1.34
[2025-06-16 10:37:20,179][struc_util][INFO] Ag - Ag: 1.45
[2025-06-16 10:37:20,179][struc_util][INFO] Ag - Au: 1.405
[2025-06-16 10:37:20,179][struc_util][INFO] Au - Au: 1.36
[2025-06-16 10:37:20,217][crossover][INFO] Structure ID     20 (0, 4, 7) was generated from     19 and     14 by crossover. Space group:   1 P1
[2025-06-16 10:37:20,219][crossover][INFO] Structure ID     21 (0, 1, 7) was generated from      7 and     17 by crossover. Space group:   1 P1
[2025-06-16 10:37:20,221][crossover][INFO] Structure ID     22 (3, 0, 8) was generated from     16 and     19 by crossover. Space group:   1 P1
[2025-06-16 10:37:20,225][crossover][INFO] Structure ID     23 (0, 1, 7) was generated from      7 and     17 by crossover. Space group:   1 P1
...
[2025-06-16 10:37:20,809][ea_next_gen][INFO] # ------ Select elites
[2025-06-16 10:37:20,809][ea_next_gen][INFO] Structure ID      7 keeps as the elite
[2025-06-16 10:37:20,809][ea_next_gen][INFO] Structure ID     14 keeps as the elite

After that, simply running cryspy repeatedly will advance the structure search.

Check results

This section focuses on the differences from the EA method.

cryspy_rslt

Below is an example of a cryspy_rslt file after completing calculations up to the third generation. In EA-vc, formation energy (Ef_eV_atom) and number of atoms (Num_atom) are also included.

    Gen  Spg_num Spg_sym  Spg_num_opt Spg_sym_opt  E_eV_atom  Ef_eV_atom   Num_atom  Magmom      Opt
0     1      119   I-4m2          119       I-4m2   0.639865    0.639865  (3, 1, 2)     NaN  no_file
1     1       71    Immm           71        Immm   0.182650    0.182650  (1, 8, 5)     NaN  no_file
2     1      174     P-6          187       P-6m2   0.324864    0.324864  (3, 7, 8)     NaN  no_file
3     1       71    Immm           71        Immm   0.269227    0.269227  (6, 5, 5)     NaN  no_file
4     1       12    C2/m           65        Cmmm   0.401521    0.401521  (4, 4, 2)     NaN  no_file
7     1      123  P4/mmm          123      P4/mmm  -0.009930   -0.009930  (0, 2, 6)     NaN  no_file
10    1      107    I4mm          107        I4mm   0.320875    0.320875  (1, 4, 4)     NaN  no_file
5     1      121   I-42m          121       I-42m   0.439643    0.439643  (7, 4, 7)     NaN  no_file
6     1      115   P-4m2          115       P-4m2   0.459892    0.459892  (7, 7, 7)     NaN  no_file
8     1       81     P-4           81         P-4   0.354247    0.354247  (6, 2, 7)     NaN  no_file
9     1       11  P2_1/m           11      P2_1/m   0.182084    0.182084  (8, 4, 6)     NaN  no_file
11    1       10    P2/m           10        P2/m   0.229819    0.229819  (4, 5, 4)     NaN  no_file
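
In this run, Ef_eV_atom equals E_eV_atom because end_point is 0.0 for every element. The sketch below assumes the usual definition of formation energy per atom relative to the elemental references; the function name is hypothetical and the code is not taken from CrySPY's source.

```python
def formation_energy_per_atom(e_per_atom, nat, end_point):
    """Formation energy (eV/atom) relative to elemental reference energies.

    e_per_atom: total energy per atom of the structure (eV/atom)
    nat: number of atoms of each element, in atype order
    end_point: reference energy of each pure element (eV/atom), same order
    """
    n_total = sum(nat)
    # Composition-weighted average of the elemental reference energies
    reference = sum(n * e for n, e in zip(nat, end_point)) / n_total
    return e_per_atom - reference


# Structure ID 0 from the table above: (3, 1, 2) atoms of (Cu, Ag, Au)
print(formation_energy_per_atom(0.639865, (3, 1, 2), (0.0, 0.0, 0.0)))
```

With nonzero references, as in the CHGNet example, the two columns would differ.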

nat_data

Information on the number of atoms is also included in nat_data.

    ID    ('Cu', 'Ag', 'Au')
     0    (3, 1, 2)
     1    (1, 8, 5)
     2    (3, 7, 8)
     3    (6, 5, 5)
     4    (4, 4, 2)
     5    (7, 4, 7)
     6    (7, 7, 7)
     7    (0, 2, 6)
     8    (6, 2, 7)
     9    (8, 4, 6)
    10    (1, 4, 4)
...

hull_dist_all_gen_x

For example, after the third generation is completed, the hull distance data is output to the file ./data/convex_hull/hull_dist_all_gen_3.

    ID    hull distance (eV/atom)    Num_atom
    43                   0.000000    (0, 2, 5)
    42                   0.000000    (0, 5, 5)
    48                   0.000000    (0, 1, 5)
    46                   0.000009    (0, 1, 5)
    28                   0.000011    (0, 1, 5)
    41                   0.000360    (0, 4, 6)
    47                   0.001838    (0, 1, 5)
    36                   0.001992    (1, 1, 6)
    21                   0.002544    (0, 1, 7)
    23                   0.002551    (0, 1, 7)
    24                   0.002795    (0, 4, 7)

conv_hull_gen_x.svg

The convex hull plot at the end of generation 3 is saved as ./data/convex_hull/conv_hull_gen_3.svg. Although svg is the default format, it can be changed to pdf or png by setting fig_format in the input file.

Analysis and visualization

Automatic convex hull plotting

In EA-vc simulations of binary and ternary systems, a convex hull plot is automatically generated at the end of each generation. For further customization, you can edit the plot yourself using a Jupyter notebook. For quaternary systems, visualization using Plotly with Jupyter is available (Plotly should already be installed automatically, as it is a dependency of pymatgen). Below are some usage examples.

Binary system

The above figure shows an example after searching up to the third generation, with red labels added for explanation. The input file settings related to convex hull plotting are listed below (default values in parentheses).

show_max: Upper limit of the y-axis (0.2)
label_stable: Whether to display compositions of stable phases (True)
vmax: Maximum value of the colorbar on the right (0.2)
bottom_margin: Margin between the minimum value and the lower end of the y-axis (0.02)
fig_format: File format of the output figure. Supported formats: svg, png, pdf. (svg)

Each marker corresponding to the latest generation is marked with a cross.

Ternary system

show_max: Only entries with a hull distance less than or equal to show_max are plotted. (0.2)
label_stable: Whether to display compositions of stable phases (True)
vmax: Maximum value of the colorbar on the right (0.2)
bottom_margin: Not applicable to ternary systems
fig_format: File format of the output figure. Supported formats: svg, png, pdf. (svg)

Each marker corresponding to the latest generation is marked with a cross.

Download data

Jupyter notebook

Move to the data/ directory in the results you downloaded earlier. Then, if the CrySPY utility has already been downloaded locally, copy cryspy_analyzer_EA-vc.ipynb. Alternatively, you can download it directly from GitHub (CrySPY_utility/notebook/).

The Jupyter notebook file includes the same functions as the CrySPY code, allowing you to freely customize the convex hull plots. Execute the cells in order; choosing one of the following options produces the same plot as the automatic output.

Binary system, matplotlib
Ternary system, matplotlib

In the section "Interactive plot using Plotly", interactive plots are available for binary, ternary, and quaternary systems. See CrySPY > Tutorial > Interactive Mode (Jupyter Notebook) #Interactive plot using Plotly for example plots.

Bayesian Optimization (BO)

LAQA

May 15th, 2023

Info

First, see Tutorial > Random Search (RS) for basic usage of CrySPY.
Here, we assume CrySPY 1.1.0 or later.

The example files used here can be downloaded from CrySPY_utility/examples/qe_Si16_LAQA. In this tutorial, only 50 initial structures are generated, but LAQA is designed to select candidates from many more structures.

Input

cryspy.in

Here is an example of cryspy.in.

[basic]
algo = LAQA
calc_code = QE
tot_struc = 50
nstage = 1
njob = 10
jobcmd = qsub
jobfile = job_cryspy

[structure]
atype = Si
nat = 16
mindist_1 = 1.5

[QE]
qe_infile = pwscf.in
qe_outfile = pwscf.out
kppvol =  80

[LAQA]
nselect_laqa = 4

[option]

nstage must be 1 in LAQA.
You must set nselect_laqa in the [LAQA] section. nselect_laqa is the number of candidates selected at one time.

If you want to change the weights in the LAQA score, edit wf and ws as below. If omitted, the default values are used (0.1 and 10.0, respectively). See Search algorithms > LAQA for the score.

[LAQA]
nselect_laqa = 4
wf = 0.1
ws = 10.0

calc_in/pwscf.in_1

&control
    calculation = 'vc-relax'
    pseudo_dir = '/usr/local/gbrv/all_pbe_UPF_v1.5/'
    outdir='./outdir/'
    nstep = 10
/

&system
    ibrav = 0
    nat = 16
    ntyp = 1
    ecutwfc = 40
    ecutrho = 200
    occupations = 'smearing'
    degauss = 0.01
/

&electrons
/

&ions
/

&cell
/

ATOMIC_SPECIES
  Si -1.0 si_pbe_v1.uspp.F.UPF

nstep controls how many steps of structure optimization are performed per selection (the VASP equivalent is NSW).

calc_in/job_cryspy

#!/bin/sh
#$ -cwd
#$ -V -S /bin/bash
####$ -V -S /bin/zsh
#$ -N Si_CrySPY_ID
#$ -pe smp 20
####$ -q ibis1.q
####$ -q ibis2.q

mpirun -np $NSLOTS pw.x -nk 4 < pwscf.in > pwscf.out

if [ -e "CRASH" ]; then
    sed -i -e '3 s/^.*$/skip/' stat_job
    exit 1
fi

sed -i -e '3 s/^.*$/done/' stat_job

The job file is the same as in normal usage.

Run

Tip

An automatic script is also available. See the bottom of this page.

Just type cryspy for the 1st run.

cryspy &

Check log_cryspy. 50 random structures are generated.

2023/05/13 13:02:07
CrySPY 1.1.0
Start cryspy.py
Number of MPI processes: 1

Read input file, cryspy.in
Save input data in cryspy.stat

# --------- Generate initial structures
# ------ mindist
Si - Si 1.5
Structure ID      0 was generated. Space group: 165 --> 165 P-3c1
Structure ID      1 was generated. Space group:  66 -->  66 Cccm
Structure ID      2 was generated. Space group: 146 --> 146 R3
Structure ID      3 was generated. Space group:  82 -->  82 I-4
Structure ID      4 was generated. Space group: 162 --> 162 P-31m
...
...
...
Structure ID     47 was generated. Space group:  90 -->  90 P42_12
Structure ID     48 was generated. Space group: 214 --> 214 I4_132
Structure ID     49 was generated. Space group:  23 -->  23 I222

Elapsed time for structure generation: 0:00:10.929030


# ---------- Initialize LAQA
# ---------- Selection 0
selected_id: 50 IDs

In LAQA, structure optimization jobs for all structures are submitted once at the beginning. Note that only 10 steps are performed here because we set nstep = 10. Repeat the cryspy command until all of these 10-step jobs are completed. If necessary, you can also submit all jobs at once by increasing the value of njob.

After all the initial optimizations are finished, "LAQA is ready" is displayed at the end of log_cryspy.

2023/05/13 13:23:31
CrySPY 1.1.0
Restart cryspy.py
Number of MPI processes: 1



# ---------- job status
ID     41: Stage 1 Done!

LAQA is ready

The next cryspy run makes the first selection.

2023/05/13 13:23:33
CrySPY 1.1.0
Restart cryspy.py
Number of MPI processes: 1



# ---------- job status

Backup data

# ---------- Selection 1
selected_id: 37 8 10 48

Here, only as many structures as set in nselect_laqa are selected. Type cryspy to submit the jobs (the next 10 steps).

cryspy &

2023/05/13 13:23:36
CrySPY 1.1.0
Restart cryspy.py
Number of MPI processes: 1



# ---------- job status
ID     37: submit job, Stage 1
ID      8: submit job, Stage 1
ID     10: submit job, Stage 1
ID     48: submit job, Stage 1

By repeating this, the optimization of the structures selected according to the score advances by 10 steps each time. Continue until several structures are fully optimized, and stop whenever you like.

Status

If you want to check the LAQA score during the simulation, you can look at the status file:

./data/LAQA_status

Other files for LAQA will be output:

./data/LAQA_bias
./data/LAQA_energy
./data/LAQA_score
./data/LAQA_selected_id
./data/LAQA_step

Analysis and visualization

It is assumed here that you analyze and visualize CrySPY data on your local PC. If you use CrySPY on supercomputers or workstations, download the data to your local PC. You can delete the work and backup directories if you do not need them, because they can be very large. You may gzip the pkl data to decrease the file size.

jupyter notebook

Move to the data/ directory in the results you just downloaded. Then copy cryspy_analyzer_LAQA.ipynb from the CrySPY utility.

You can obtain the graph and animation with the notebook. In the gif below, all of the optimizations were completed. This is just for animation. (When all of the optimizations are completed, the computational cost is the same as random search.)

This graph shows the energy as a function of optimization step. The red lines indicate the three structures with the lowest energy. The most stable one reached the diamond structure. The structures that eventually became stable were selected at an early stage.

Info

If algo = LAQA, the following are automatically set in the [option] section.

force_step_flag = True
stress_step_flag = True

Force and stress data are collected step by step. Energy and structure data are NOT. They are collected for each selection. In other words, in this case, energy and structure data are saved once every 10 steps. If you want to collect energy and structure data step by step, manually set up as follows:

[option]
energy_step_flag = True
struc_step_flag = True

Auto script

You may find it tedious to run cryspy over and over again. The auto script could help you.

repeat_cryspy
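
The idea behind such a script is simply to call cryspy at a fixed interval. The following is a hypothetical minimal sketch of that idea in Python, not the repeat_cryspy script itself; the function name, interval, and number of repetitions are assumptions to adapt to your environment.

```python
import subprocess
import time


def repeat_command(command, n_times, interval_sec):
    """Run `command` n_times, sleeping interval_sec between runs.

    Returns the list of exit codes. Stops early if a run fails.
    """
    codes = []
    for i in range(n_times):
        result = subprocess.run(command)
        codes.append(result.returncode)
        if result.returncode != 0:
            break
        if i < n_times - 1:
            time.sleep(interval_sec)
    return codes
```

For example, repeat_command(["cryspy"], n_times=100, interval_sec=60) would run cryspy once a minute, up to 100 times.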

Molecular crystal structure prediction

Info

First, see Tutorial > Random Search (RS) for basic usage of CrySPY.

In this section, we give a tutorial on the molecular structure generation part only. Since version 0.9.0, CrySPY has been able to generate random molecular crystal structures using PyXtal.

You need to use either a molecule pre-defined in PyXtal’s database (see https://pyxtal.readthedocs.io/en/latest/Usage.html?highlight=benzene#pyxtal-molecule-pyxtal-molecule) or molecule files that define the molecular structures.

Pre-defined molecule

PyXtal currently supports C60, H2O, CH4, NH3, benzene, naphthalene, anthracene, tetracene, pentacene, coumarin, resorcinol, benzamide, aspirin, ddt, lindane, glycine, glucose, and ROY.

Let us generate molecular crystal structures that consist of two benzene molecules.

Move to your working directory, and copy input example files by one of the following methods.

Download from CrySPY utility > examples > qe_benzene_2_RS_mol
Copy from CrySPY utility that you installed
(only version 0.10.3 or earlier) cp -r ~/CrySPY_root/CrySPY-0.9.0/example/QE_benzene_2_RS_mol .

Take a look at cryspy.in.

$ cat cryspy.in
[basic]
algo = RS
calc_code = QE
tot_struc = 6
nstage = 2
njob = 2
jobcmd = qsub
jobfile = job_cryspy

[structure]
struc_mode = mol
atype = H C
nat = 12 12
mol_file = benzene
nmol = 2

[QE]
qe_infile = pwscf.in
qe_outfile = pwscf.out
kppvol = 40  60

[option]

In generating molecular crystal structures, you have to set struc_mode = mol in the [structure] section. Molecule file(s) and the number of molecule(s) are specified as:

mol_file = benzene
nmol = 2

Run CrySPY and see the initial structures (./data/init_POSCARS).

User-defined molecule

Move to your working directory, and copy input example files for 2 formula units of Li3PS4.

version 1.0.0 or later
- Copy from CrySPY utility
version 0.10.3 or earlier
- cp -r ~/CrySPY_root/CrySPY-0.9.0/example/QE_Li3PS4_2fu_RS_mol .

$ cd QE_Li3PS4_2fu_RS_mol
$ ls
Li.xyz  PS4.xyz  calc_in/  cryspy.in

Molecule files of Li and PS4 are included. Supported formats in PyXtal are .xyz, .gjf, .g03, .g09, .com, .inp, .out, and pymatgen’s JSON serialized molecules.

$ cat Li.xyz
1
New structure
 Li  0.000  0.000  0.000

$ cat PS4.xyz
5
New structure
 P    0.000000    0.000000    0.000000
 S    1.200000    1.200000   -1.200000
 S    1.200000   -1.200000    1.200000
 S   -1.200000    1.200000    1.200000
 S   -1.200000   -1.200000   -1.200000

Check cryspy.in.

$ cat cryspy.in
[basic]
algo = RS
calc_code = QE
tot_struc = 4
nstage = 2
njob = 1
jobcmd = qsub
jobfile = job_cryspy

[structure]
struc_mode = mol
atype = Li P S
nat = 6 2 8
mol_file = ./Li.xyz  ./PS4.xyz
nmol = 6 2

[QE]
qe_infile = pwscf.in
qe_outfile = pwscf.out
kppvol = 40  60

[option]

A single atom (Li atom in this case) is treated as a molecule in the molecular crystal structure generation mode. In this example, a random molecular structure is composed of six Li molecules (atoms) and two PS4 molecules specified as:

mol_file = ./Li.xyz ./PS4.xyz
nmol = 6 2

In mol_file, set the path of each molecule file relative to cryspy.in. Here the molecule files are placed in the same directory.
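
The values of nat must be consistent with the molecules and nmol: six Li atoms plus two PS4 units give 6 Li, 2 P, and 8 S. As a minimal sketch (the helper names are hypothetical and not part of CrySPY or PyXtal), this can be checked by counting atoms in the .xyz files:

```python
from collections import Counter


def count_atoms(xyz_text):
    """Count atoms per element in the text of an .xyz file."""
    lines = xyz_text.strip().splitlines()
    n_atoms = int(lines[0])
    # Line 2 is a comment; the atom records follow
    symbols = [line.split()[0] for line in lines[2:2 + n_atoms]]
    return Counter(symbols)


def expected_nat(atype, molecules, nmol):
    """Atoms of each element implied by the molecules and their counts."""
    total = Counter()
    for xyz_text, n in zip(molecules, nmol):
        for symbol, count in count_atoms(xyz_text).items():
            total[symbol] += count * n
    return [total[symbol] for symbol in atype]


LI_XYZ = """1
New structure
 Li  0.000  0.000  0.000
"""

PS4_XYZ = """5
New structure
 P    0.000000    0.000000    0.000000
 S    1.200000    1.200000   -1.200000
 S    1.200000   -1.200000    1.200000
 S   -1.200000    1.200000    1.200000
 S   -1.200000   -1.200000   -1.200000
"""

print(expected_nat(["Li", "P", "S"], [LI_XYZ, PS4_XYZ], [6, 2]))  # [6, 2, 8]
```

The result matches nat = 6 2 8 in the cryspy.in above.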

Run CrySPY and see the initial structures (./data/init_POSCARS).

timeout_mol

Molecular crystal structure generation can be time consuming because PyXtal calculates the molecule directions according to a specified space group. Sometimes the generation gets stuck, so a time limit is imposed on the generation of each structure. The time limit (timeout_mol) is 120 seconds by default. If that is insufficient, increase it as in the last line below:

struc_mode = mol
atype = Li P S
nat = 6 2 8
mol_file = ./Li.xyz  ./PS4.xyz
nmol = 6 2
timeout_mol = 300.0

Volume of unit cell

You can control the volume of unit cells by changing the value(s) of scaling factor, vol_factor, in cryspy.in. By default, vol_factor is set to 1.0. It is also possible to specify a range of factors. Set minimum and maximum values as follows:

struc_mode = mol
atype = Li P S
nat = 6 2 8
mol_file = ./Li.xyz  ./PS4.xyz
nmol = 6 2
timeout_mol = 300.0
vol_factor = 0.8 1.5

Random structure generation with MPI

Oct. 21 2023, update

Info

First, see Tutorial > Random Search (RS) for basic usage of CrySPY.

Info

Requirements:

CrySPY 1.2.3 or later
mpi4py
MPI library (Open MPI, Intel MPI, MPICH, etc.)

Warning

1.1.0 <= CrySPY <= 1.2.2 has a bug. When you use bash (zsh) to run a job with MPI (e.g., jobcmd = zsh, jobfile = job_cryspy), the MPI job does not run. There is no problem when you use a job scheduler (qsub, sbatch). This has been fixed in version 1.2.3.

mpi4py

Install mpi4py if it is not already installed.

pip install mpi4py

Input

cryspy.in is the same as normal usage and does not need to be changed. Here we try structure generation with MPI using the following settings:

[basic]
algo = RS
calc_code = soiap
tot_struc = 100
nstage = 1
njob = 2
jobcmd = zsh
jobfile = job_cryspy

[structure]
atype = Si
nat = 8

[soiap]
soiap_infile = soiap.in
soiap_outfile = soiap.out
soiap_cif = initial.cif

[option]

All settings except tot_struc, atype, and nat are irrelevant for structure generation and can be ignored here.

Run

If you want to generate structures with 4 MPI processes, use mpiexec -n with the -p option:

mpiexec -n 4 cryspy -p

In 1.1.0 <= CrySPY <= 1.2.2, omit the -p option:

mpiexec -n 4 cryspy

If you submit the job through a job scheduler, prepare a job file. Here is an example:

#!/bin/sh
#$ -cwd
#$ -V -S /bin/bash
#$ -N n_nproc
#$ -pe smp 4


mpirun -np $NSLOTS ~/.local/bin/cryspy

Please edit the location of the executable script cryspy.

Result

CrySPY simply divides the task (number of structures) by the number of processes:

Rank 0: IDs 0 – 24
Rank 1: IDs 25 – 49
Rank 2: IDs 50 – 74
Rank 3: IDs 75 – 99
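
The division above can be reproduced with a few lines of Python. This is only an illustrative sketch with a hypothetical function name; in particular, the handling of a total that is not divisible by the number of processes is an assumption of this example, not a statement about CrySPY.

```python
def split_ids(tot_struc, nproc):
    """Split IDs 0..tot_struc-1 into contiguous blocks, one per MPI rank.

    Assumption for this sketch: when tot_struc is not divisible by nproc,
    the first (tot_struc % nproc) ranks get one extra ID.
    """
    base, extra = divmod(tot_struc, nproc)
    blocks = []
    start = 0
    for rank in range(nproc):
        size = base + (1 if rank < extra else 0)
        blocks.append((start, start + size - 1))
        start += size
    return blocks


for rank, (first, last) in enumerate(split_ids(100, 4)):
    print(f"Rank {rank}: IDs {first} - {last}")
```

For tot_struc = 100 and 4 processes, this gives the same four blocks of 25 IDs listed above.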

CrySPY writes log entries in the order in which the structures are generated, as follows:

2023/04/24 22:47:51
CrySPY 1.1.0
Start cryspy.py
Number of MPI processes: 4

Read input file, cryspy.in
Save input data in cryspy.stat

# --------- Generate initial structures
# ------ mindist
Si - Si 1.11
Structure ID     25 was generated. Space group: 138 --> 123 P4/mmm
Structure ID     75 was generated. Space group:  99 -->  99 P4mm
Structure ID      0 was generated. Space group: 127 --> 123 P4/mmm
Structure ID      1 was generated. Space group:  61 -->  61 Pbca
Structure ID     50 was generated. Space group:  38 -->  38 Amm2
Structure ID     51 was generated. Space group: 134 --> 123 P4/mmm
Structure ID     26 was generated. Space group: 111 --> 123 P4/mmm
Structure ID      2 was generated. Space group:   9 -->   9 Cc
Structure ID      3 was generated. Space group:  80 -->  80 I4_1
Structure ID      4 was generated. Space group: 107 --> 107 I4mm
Structure ID      5 was generated. Space group:  75 -->  75 P4
Structure ID     76 was generated. Space group: 108 --> 108 I4cm
Structure ID     77 was generated. Space group: 100 --> 100 P4bm
Structure ID     27 was generated. Space group: 207 --> 221 Pm-3m

However, the order in init_POSCARS is by structure ID since CrySPY outputs after all structures have been generated.

ID_0
1.0
   2.9636956737951818    0.0000000000000002    0.0000000000000002
   0.0000000000000000    2.9636956737951818    0.0000000000000002
   0.0000000000000000    0.0000000000000000    6.2634106638053080
Si
8
direct
  -0.1602734164607877   -0.1602734164607877   -0.0000000000000000 Si
   0.1602734164607877    0.1602734164607877    0.5000000000000000 Si
   0.6602734164607877    0.3397265835392123    0.7500000000000000 Si
   0.3397265835392122    0.6602734164607877    0.2500000000000000 Si
   0.4469739273741755    0.4469739273741755   -0.0000000000000000 Si
   0.5530260726258245    0.5530260726258244    0.5000000000000000 Si
   0.0530260726258245    0.9469739273741754    0.7500000000000000 Si
   0.9469739273741754    0.0530260726258245    0.2500000000000000 Si
ID_1
1.0
   7.2751506682509657    0.0000000000000004    0.0000000000000004
   0.0000000000000000    7.2751506682509657    0.0000000000000004
   0.0000000000000000    0.0000000000000000    5.1777634169924873
Si
8
direct
  -0.3845341807505553   -0.3845341807505553    0.4999999999999999 Si
   0.3845341807505553    0.3845341807505553    0.5000000000000000 Si
   0.3845341807505553   -0.3845341807505553    0.0000000000000000 Si
  -0.3845341807505553    0.3845341807505553   -0.0000000000000000 Si
   0.0000000000000000    0.5000000000000000    0.2500000000000000 Si
   0.5000000000000000    0.0000000000000000    0.7500000000000000 Si
   0.0000000000000000    0.5000000000000000    0.7500000000000000 Si
   0.5000000000000000    0.0000000000000000    0.2500000000000000 Si
ID_2
1.0
  -4.3660398676292269   -4.3660398676292269    0.0000000000000000
  -4.3660398676292269   -0.0000000000000003   -4.3660398676292269
   0.0000000000000000   -4.3660398676292269   -4.3660398676292269
Si
8
direct
   0.8700001548800920    0.8700001548800920    0.1299998451199080 Si
   0.1299998451199080    0.1299998451199080    0.8700001548800920 Si
   0.8700001548800920    0.1299998451199080    0.8700001548800920 Si
   0.1299998451199080    0.8700001548800920    0.1299998451199080 Si
   0.1299998451199080    0.8700001548800920    0.8700001548800920 Si
   0.8700001548800920    0.1299998451199080    0.1299998451199080 Si
   0.7500000000000000    0.7500000000000000    0.7500000000000000 Si
   0.2500000000000000    0.2500000000000000    0.2500000000000000 Si

Note

Only the random structure generation is parallelized with MPI. The other parts of CrySPY are not parallelized, so running them with MPI brings no benefit.

Info

Interactive mode (Jupyter Notebook)

2025 March 6

Info

Requirements:

CrySPY 1.4.0 or later
Jupyter
Structure optimization software compatible with ASE (e.g., machine learning potentials).
nglview (optional)

Preparation

When CrySPY is installed, ASE is automatically installed as well. Set up Jupyter so that it can be used on a workstation or local PC. In this tutorial, ASE's pure Python EMT calculator is used for structure optimization. Note that the accuracy of the EMT potential is poor, as it is intended for demonstration purposes only.

The example notebook also includes code for using the machine learning potential CHGNet. If you want to try CHGNet, make sure to install it in advance using pip.

Input file

Move to your working directory, and copy the example files by one of the following methods.

Download from CrySPY_utility/examples/interactive
Copy from CrySPY utility that you installed

Even in interactive mode, cryspy.in is used as the input file. The calc_in directory is not used in interactive mode. You can refer to the examples of cryspy.in in the input_examples directory.

Here, the following cryspy.in using EA-vc will be used. For more details on EA-vc, refer to the EA-vc tutorial.

[basic]
algo = EA-vc
calc_code = ASE
nstage = 1
njob = 10
jobcmd = zsh
jobfile = job_cryspy

[structure]
atype = Cu Au
ll_nat = 0 0
ul_nat = 8 8

[ASE]
ase_python = ase_in.py

[EA]
n_pop = 20
n_crsov = 5
n_perm = 2
n_strain = 2
n_rand = 2
n_add = 3
n_elim = 3
n_subs = 3
target = random
n_elite = 2
n_fittest = 10
slct_func = TNM
t_size = 2
maxgen_ea = 5
end_point = 0.0 0.0

[option]

Notebook

Open cryspy_interactive.ipynb and execute the cells from the top.

Check current working directory

The first cell only checks the files and the contents of cryspy.in.

!pwd
print()
!ls
print()
!cat cryspy.in

Import

Ignore the commented-out sections this time and execute the cell that imports the core libraries for CrySPY’s interactive mode.

# ---------- import
from cryspy.interactive import action

Initialize CrySPY

This cell corresponds to a standard initial run. It reads cryspy.in and generates the initial structures.

# ---------- initial structure generation
action.initialize()

Set calculator

This cell sets the ASE calculator. Here, ASE’s EMT is used.

# ---------- EMT in ASE
from ase.calculators.emt import EMT
calculator = EMT()

# ---------- CHGNet
#from chgnet.model import CHGNetCalculator
#calculator = CHGNetCalculator()

Restart CrySPY

Executing this cell starts the optimization of the previously generated initial structures. In interactive mode, structure optimization calculations are performed sequentially, one by one. A progress bar is also displayed during the process.

# ---------- structure optimization
action.restart(
    njob=20,    # njob=0: njob in cryspy.in will be used
    calculator=calculator,
    optimizer='BFGS',    # 'FIRE', 'BFGS' or 'LBFGS'
    symmetry=True,       # default: True
    fmax=0.01,           # default: 0.01 eV/Å
    steps=2000,          # default: 2000
)

njob: The number of structures to be optimized in a single execution. If set to 0, the value specified in cryspy.in is used.
calculator: Assign the previously set calculator.
optimizer: Select from FIRE, BFGS, or LBFGS. Specify as a string.
symmetry: If True, structure optimization is performed while preserving symmetry.
fmax: The maximum atomic force (eV/Å) used for convergence criteria.
steps: Maximum optimization steps.

If the njob value is set to a small number, execute this cell multiple times to complete the optimization of all initial structures. When using EA-vc, the following message will be displayed upon completion.

EA is ready

Executing this cell again will trigger generational turnover. Once the next-generation structures are generated, continue executing this cell repeatedly in the same manner.

Show results

Running this cell allows you to display files such as cryspy_rslt_energy_asc.

# ---------- show results
#!cat ./data/cryspy_rslt    # Order of structure optimization completion
!cat ./data/cryspy_rslt_energy_asc    # show energy ascending order
#!sed -n 2,4p ./data/cryspy_rslt    # show i--jth lines
#!tail -n 5 ./data/cryspy_rslt    # show last 5 lines

Structure visualization

You can interactively visualize both the initial and optimized structures.

from ase.visualize import view
atoms = action.get_atoms('opt', cid=0)    # 'init' or 'opt'
view(atoms, viewer='ngl')    # viewer = 'ngl', 'ase', or 'x3d'

Changing opt to init in action.get_atoms('opt', cid=0) allows you to check the initial structure. The cid parameter specifies the structure ID. Since this utilizes ASE’s functionality, the viewer option supports ngl, ase, and x3d. To use ngl, you need to install nglview, so make sure to install it via pip in advance.

Energy plot for RS, EA

For random search (RS) and the evolutionary algorithm (EA), an energy plot like the one shown below can be displayed. In the case of EA-vc, direct energy comparison is not possible due to differences in the number of atoms, so the convex hull plot, discussed later, is used instead.

fig, ax = action.plot_E(
              title=None,
              ymax=2.0,
              ymin=-0.5,
              markersize=12,
              marker_edge_width=1.0,
              marker_edge_color='black',
              alpha=1.0,
          )

Convex hull plot for EA-vc

Interactive plot using Plotly

For EA-vc, an interactive convex hull plot using Plotly is available. When CrySPY is installed, Plotly is automatically installed as well. This convex hull plot utilizes pymatgen’s functionality.

action.interactive_plot_convex_hull(cgen=None, show_unstable=0.2, ternary_style='2d')

cgen: Which generation’s data to plot up to. If None, data will be plotted up to the latest generation.
show_unstable: The maximum hull distance value to display on the plot
ternary_style
- Binary system: ternary_style = '2d'
- Ternary system: ternary_style = '2d', '3d'
- Quaternary system: ternary_style = '3d'

When performing calculations with ternary or quaternary systems instead of binary systems, you can obtain the following interactive plots.

From left to right:

Ternary system (ternary_style = '2d')
Ternary system (ternary_style = '3d')
Quaternary system (ternary_style = '3d')

Binary system using matplotlib

Running this cell plots the binary convex hull using matplotlib.

fig, ax = action.plot_convex_hull_binary(
              cgen=None,
              show_max=0.2,
              label_stable=True,
              vmax=0.2,
              bottom_margin=0.02,
          )
fig    # to show plot in jupyter

cgen: Which generation’s data to plot up to. If None, data will be plotted up to the latest generation.
show_max: The maximum formation energy to display on the plot
label_stable: Whether to display the labels (compositions) of stable structures
vmax: The maximum hull distance in the color bar
bottom_margin: Bottom margin of y-axis

Ternary system using matplotlib

If exploring a ternary system, running this cell will generate a convex hull plot using matplotlib.

fig, ax = action.plot_convex_hull_ternary(
              cgen=None,
              show_max=0.2,
              label_stable=True,
              vmax=0.2,
          )
fig    # to show plot in jupyter

cgen: Which generation’s data to plot up to. If None, data will be plotted up to the latest generation.
show_max: The maximum formation energy to display on the plot
label_stable: Whether to display the labels (compositions) of stable structures
vmax: The maximum hull distance in the color bar

For example, the following plot can be obtained.

Interface

2025 June 16

CrySPY supports multiple structure optimizers:

First-principles calculation
- VASP
- Quantum Espresso
- OpenMX (CrySPY 0.9.0 or later)
Interatomic potential
- soiap
- LAMMPS
Other (Machine learning potentials and other models that support ASE)
- ASE (CrySPY 1.2.0 or later)
- Interactive mode (ASE) (CrySPY 1.4.0 or later)

At least one optimizer is required.

Algorithm compatibility

CrySPY 1.4.0
Additional interfaces for EA-vc are under development.

	RS	EA	EA-vc	BO	LAQA
VASP	✓	✓	×	✓	✓
Quantum Espresso	✓	✓	×	✓	✓
OpenMX	✓	✓	×	✓	×
soiap	✓	✓	×	✓	✓
LAMMPS	✓	✓	×	✓	×
ASE	✓	✓	✓	✓	×
Interactive (ASE)	✓	✓	✓	×	×

Option compatibility

	energy_step_flag	struc_step_flag	force_step_flag	stress_step_flag
VASP	✓	✓	✓	✓
Quantum Espresso	✓	✓	✓	✓
OpenMX	×	×	×	×
soiap	✓	✓	✓	✓
LAMMPS	×	×	×	×
ASE	×	×	×	×

Search algorithms

Random search (RS)

under construction

Evolutionary algorithm (EA)

2025 April 2

Background

Evolutionary algorithms (EAs) are metaheuristic methods inspired by the theory of evolution. An EA can effectively generate new structures (offspring) by inheriting the local environments of the stable structures (parents) explored so far. USPEX, developed by the Oganov group, is a well-known implementation, and others such as XtalOpt also exist. Previous studies include the following papers.

T. S. Bush, C. R. A. Catlow, and P. D. Battle, J. Mater. Chem. 5, 1269 (1995).
A. R. Oganov and C. W. Glass, J. Chem. Phys. 124, 244704 (2006).
A. R. Oganov, A. O. Lyakhov, and M. Valle, Acc. Chem. Res. 44, 227 (2011).
A. O. Lyakhov, A. R. Oganov, H. T. Stokes, and Q. Zhu, Comput. Phys. Commun. 184, 1172 (2013).

Procedure

Initialize population
Evaluate fitness
Natural selection
Select parents
Create next generation
Repeat from step 2: Evaluate fitness

Initialize population

In the first generation, a set of random structures is generated according to the number specified by n_pop. tot_struc is not used in EA or EA-vc.

Evaluate fitness

Currently, energy is the only property that can be used as fitness in CrySPY. By setting fit_reverse = False, the algorithm is configured to search for the minimum value. The fit_reverse setting is designed for future cases where fitness may be based on properties other than energy.

Natural selection

DFT calculations occasionally fail and produce extremely unreasonable energy values. emax_ea and emin_ea can be used to filter out structures with unreasonably low (or high) energy values: $$ \mathrm{emin\_ea} \le E \ (\mathrm{eV/atom}) \le \mathrm{emax\_ea} $$ For example, if emin_ea is set, any structure with an energy lower than that value will be ignored.

In natural selection, the current population and elite individuals preserved from previous generations are first ranked based on fitness. The number of elite individuals used here is specified by n_elite. Only the top n_fittest individuals among the current population and elite individuals are selected, while all others are eliminated. During the natural selection process, duplicates are removed using the StructureMatcher class provided by pymatgen, and then the top n_fittest individuals are selected. n_fittest is often set to about half of n_pop (the population size). Note that in the current implementation, if n_fittest = 0, all individuals are retained. The figure below shows an example of the natural selection process when n_fittest = 5.
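
The filtering and ranking described above can be sketched as follows. This is a minimal illustration, not CrySPY's implementation: the function name natural_selection and the dictionary layout are hypothetical, the input is assumed to already contain both the current population and the elite individuals, and the duplicate removal with pymatgen's StructureMatcher is omitted.

```python
def natural_selection(energies, n_fittest, emin_ea=None, emax_ea=None):
    """Return surviving structure IDs, best (lowest energy) first.

    energies: dict {structure ID: energy in eV/atom} for the current
    population plus the elite individuals (hypothetical layout).
    """
    survivors = {}
    for cid, energy in energies.items():
        # Discard energies outside the window emin_ea <= E <= emax_ea
        if emin_ea is not None and energy < emin_ea:
            continue
        if emax_ea is not None and energy > emax_ea:
            continue
        survivors[cid] = energy
    # Minimization: lower energy means higher fitness
    ranked = sorted(survivors, key=survivors.get)
    # n_fittest = 0 keeps all individuals, as noted in the text
    if n_fittest > 0:
        ranked = ranked[:n_fittest]
    return ranked


# Illustrative energies only; ID 2 mimics a failed calculation
energies = {0: -5.0, 1: -4.8, 2: -120.0, 3: -4.9, 4: -3.0}
print(natural_selection(energies, n_fittest=2, emin_ea=-10.0))
```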

Select parents

Two parent selection methods are implemented in CrySPY to select a single parent individual from the candidate parents. Both methods are designed so that individuals with higher fitness have a higher probability of being selected. Setting slct_func = TNM enables tournament selection, while slct_func = RLT enables roulette selection. Tournament selection requires fewer parameters and is easier to use.

Create next generation

The next generation consists of offspring produced by evolutionary operations on candidate parents, along with some randomly generated structures. Random structures are added in each generation to maintain diversity and to help escape from local minima.

Evolutionary operations

Here, we introduce the operations of the fixed-composition EA implemented in CrySPY.

Population size

The sum of structures from crossover, permutation, strain, and random generation must be equal to n_pop.

n_pop = n_crsov + n_perm + n_strain + n_rand

Crossover

Overview

Crossover is an evolutionary operation that creates a new structure (offspring) by exchanging sliced regions between two parent structures. This promotes structural diversity and enables the inheritance of locally stable features. It is one of the main operators used to explore low-energy configurations in structure search.

How it works

Select two distinct individuals from the candidate parents
Perform a random translation
Randomly select a lattice vector
Slice the parents near the center
Swap the sliced halves
Select the offspring with more atoms
Adjust the number of atoms near the border
Perform a minimum interatomic distance check

4. Slice the parents near the center

The slice point is placed near the center and slightly varied each time.

slice_point = np.clip(np.random.normal(loc=0.5, scale=0.1), 0.3, 0.7)

If any of the subsequent steps fail, the process may be restarted from step 4. However, the number of retries is limited to maxcnt_ea, and if this limit is exceeded, the parent selection step is repeated.

5. Swap the sliced halves

When crs_lat is set to random, the lattice vectors of one of the two parent structures are randomly selected. When crs_lat is set to equal, the average of the lattice vectors of the two parent structures is used. The default is random.

6. Select the offspring with more atoms

Swapping the sliced parts of the parent structures results in two structures with different numbers of atoms. Temporarily, we select the structure with more atoms.

However, if the composition differs too much from the target, the process restarts from step 4 (Slice the parents near the center). The tolerance for the difference in the number of atoms is set by nat_diff_tole. The default value of nat_diff_tole is 4, which allows a tolerance of ±4 atoms per element. In the figure above, the number of blue atoms is -1 and the number of green atoms is +2 relative to the original composition.

7. Adjust the number of atoms near the border

Deletion

When adjusting the number of atoms, the process starts with atom deletion. The number of green atoms is excessive and needs to be reduced. As illustrated in the figure below, atoms that do not satisfy the minimum interatomic distance defined by mindist are preferentially removed.

As shown below, if atoms that violate the minimum interatomic distance remain after the deletion process, the procedure is restarted from step 4 (Slice the parents near the center).

If there are no atoms violating the minimum interatomic distance but atoms still need to be deleted, atoms are removed in order of increasing distance from the border, as shown in the figure below. Note that, in addition to the central slicing point, positions with internal coordinates of 0 are also considered borders.

Addition

When atoms are lacking, the missing atoms are added near the border. The internal coordinate along the selected axis is determined as shown below. Here, mean refers to either the slice point or 0.

coords[axis] = np.random.normal(loc=mean, scale=0.08)

The remaining two components of the coordinate are determined randomly. Atoms are added until the target number is reached, while checking for violations of the minimum interatomic distance.

Permutation

Overview

Permutation is an evolutionary operation that generates new structures (offspring) by modifying the atomic arrangement within a single structure. It enables the exploration of alternative configurations without changing the lattice or the overall composition.

How it works

The positions of atoms of different elements are swapped. The number of swaps can be specified by ntimes, which is set to 1 by default. After the swap, a minimum interatomic distance check is performed.
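
A minimal sketch of the swap step, assuming the structure is represented by a list of element symbols and a parallel list of fractional coordinates. The function name permute and this data layout are hypothetical, and the minimum interatomic distance check is omitted.

```python
import random


def permute(species, coords, ntimes=1, rng=None):
    """Swap the positions of two atoms of different elements, ntimes times."""
    rng = rng or random.Random()
    coords = list(coords)  # work on a copy
    for _ in range(ntimes):
        i = rng.randrange(len(species))
        # Only atoms of a different element are valid swap partners
        partners = [j for j, s in enumerate(species) if s != species[i]]
        if not partners:
            raise ValueError("permutation requires at least two different elements")
        j = rng.choice(partners)
        coords[i], coords[j] = coords[j], coords[i]
    return coords


species = ["Na", "Na", "Cl", "Cl"]
coords = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.0), (0.5, 0.0, 0.5), (0.0, 0.5, 0.5)]
print(permute(species, coords, ntimes=1, rng=random.Random(0)))
```

The lattice and the composition are untouched; only the assignment of elements to sites changes.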

Strain

Overview

Strain is an evolutionary operation that generates a new structure (offspring) by applying a small random distortion to the lattice of a parent structure. It helps to explore nearby regions of the configuration space while preserving atomic connectivity and composition. This operator is useful for fine-tuning structural candidates and escaping local minima.

How it works

The lattice vectors $ \mathbf{a} $ are transformed to $ \mathbf{a}' $ by applying a strain matrix, as follows:

$$ \mathbf{a}' = \begin{pmatrix} 1 + \eta_1 & \frac{1}{2} \eta_6 & \frac{1}{2} \eta_5 \\ \frac{1}{2} \eta_6 & 1 + \eta_2 & \frac{1}{2} \eta_4 \\ \frac{1}{2} \eta_5 & \frac{1}{2} \eta_4 & 1 + \eta_3 \end{pmatrix} \mathbf{a}. $$

Here, $ \eta_i $ are drawn from a Gaussian distribution $ \mathcal{N}\left( 0, \ \sigma_{\mathrm{st}}^2 \right) $. $ \sigma_{\mathrm{st}} $ is specified by the input parameter sigma_st (by default, sigma_st = 0.5). As shown in the figure below, the lattice is deformed and then rescaled to restore the original volume. Finally, the minimum interatomic distance constraint is checked.
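
The strain matrix and the volume rescaling can be sketched with NumPy as follows. This is an illustration under stated assumptions: the function name apply_strain is hypothetical, the lattice vectors are assumed to be stored as the rows of a 3x3 array, and taking the absolute value of the determinant is a choice made here so that the rescaling also works if the deformation flips the handedness of the cell. The minimum interatomic distance check is omitted.

```python
import numpy as np


def apply_strain(lattice, sigma_st=0.5, rng=None):
    """Apply a random symmetric strain to a lattice and restore its volume."""
    rng = np.random.default_rng() if rng is None else rng
    lattice = np.asarray(lattice, dtype=float)
    eta = rng.normal(0.0, sigma_st, size=6)  # eta_1 ... eta_6
    strain = np.array([
        [1.0 + eta[0], 0.5 * eta[5], 0.5 * eta[4]],
        [0.5 * eta[5], 1.0 + eta[1], 0.5 * eta[3]],
        [0.5 * eta[4], 0.5 * eta[3], 1.0 + eta[2]],
    ])
    # Each row is one lattice vector a; a' = strain @ a
    strained = lattice @ strain.T
    vol_old = abs(np.linalg.det(lattice))
    vol_new = abs(np.linalg.det(strained))
    # Isotropic rescaling so that the cell volume equals the original one
    return strained * (vol_old / vol_new) ** (1.0 / 3.0)


cubic = 4.0 * np.eye(3)
new_lattice = apply_strain(cubic, rng=np.random.default_rng(0))
print(new_lattice)
print(abs(np.linalg.det(new_lattice)))  # same volume as the input cell
```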

Tournament selection

Overview

Tournament selection is a method used to choose parent individuals from the candidate parents based on their fitness. It is designed to balance selection pressure and diversity in the population. The figure below shows an example with n_fittest = 10 and t_size = 3.

How it works

A fixed number of individuals (t_size) are randomly selected from the candidate parents.
Among them, the individual with the highest fitness (i.e., lowest energy) is chosen as the parent.
This process is repeated until the required number of parents is selected.
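
The steps above can be sketched in a few lines. The function name tournament_select and the data layout are hypothetical. Sampling the entrants without replacement is an assumption; it is consistent with the statement in this section that the bottom t_size - 1 individuals are never chosen.

```python
import random


def tournament_select(candidates, energies, t_size=3, rng=None):
    """Pick one parent: draw t_size candidates and return the lowest-energy one."""
    rng = rng or random.Random()
    entrants = rng.sample(candidates, t_size)  # without replacement
    return min(entrants, key=lambda cid: energies[cid])


# Ten candidate parents (n_fittest = 10) with illustrative energies
candidates = list(range(10))
energies = {cid: -5.0 + 0.1 * cid for cid in candidates}
rng = random.Random(1)
parents = [tournament_select(candidates, energies, t_size=3, rng=rng) for _ in range(5)]
print(parents)
```

Because every tournament contains t_size distinct individuals, the winner always has at least t_size - 1 individuals ranked below it.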

Advantages

Simple and efficient
Requires only one parameter (t_size)
Can control selection pressure by adjusting t_size

Notes

The default value of t_size is 3.
If t_size is small, diversity is promoted.
If t_size is large, selection pressure increases, favoring the fittest individuals.
Unlike roulette selection, tournament selection never chooses the bottom (t_size - 1) individuals from the candidate parents.

Roulette selection

Overview

Roulette selection is a probabilistic method used to select parent individuals from the candidate parents based on their fitness. In roulette selection, each individual’s chance of being selected is proportional to its fitness.

How it works

When fit_reverse is set to False (default), corresponding to minimization mode where energy is used as fitness, the fitness values of the candidate parents are multiplied by –1.
The fitness values $ f_i $ are linearly scaled into $ f'_i $ using the following equation, where $ a $ and $ b $ are parameters specified by a_rlt and b_rlt, respectively (with the condition that $ a > b $). $$ f_i' = \frac{a - b}{f_{\mathrm{max}} - f_{\mathrm{min}}} f_i + \frac{b f_{\mathrm{max}} - a f_{\mathrm{min}}}{f_{\mathrm{max}} - f_{\mathrm{min}}} $$
The scaled fitness values $ f'_i $ are converted into selection probabilities using the following equation: $$ p_i = \frac{f'_i}{\sum_k f'_k} $$ Each probability $ p_i $ represents the likelihood of selecting the $ i $-th individual.
Parent individuals are then selected one by one according to the probabilities $ p_i $ using roulette wheel sampling, until the required number of parents is obtained.
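
The scaling and the probabilities can be sketched as follows. The function name roulette_probabilities is hypothetical, and returning uniform probabilities when all energies are equal is an assumption made here to avoid a division by zero. With the linear scaling above, the best individual receives the scaled fitness $ a $ and the worst receives $ b $.

```python
import random


def roulette_probabilities(energies, a_rlt=10.0, b_rlt=1.0):
    """Return selection probabilities for a list of energies (minimization mode)."""
    fitness = [-e for e in energies]  # fit_reverse = False: multiply by -1
    f_max, f_min = max(fitness), min(fitness)
    span = f_max - f_min
    if span == 0.0:
        # Assumption: identical fitness values give uniform probabilities
        return [1.0 / len(fitness)] * len(fitness)
    scaled = [
        (a_rlt - b_rlt) / span * f + (b_rlt * f_max - a_rlt * f_min) / span
        for f in fitness
    ]
    total = sum(scaled)
    return [s / total for s in scaled]


energies = [-5.0, -4.5, -4.0]  # illustrative values in eV/atom
probs = roulette_probabilities(energies)
print(probs)

# Roulette wheel sampling of two parents by index
parents = random.Random(0).choices(range(len(energies)), weights=probs, k=2)
print(parents)
```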

Advantages

All individuals have a non-zero chance of being selected
Selection pressure can be adjusted by scaling the fitness values

Notes

By default, a_rlt = 10.0 and b_rlt = 1.0
Proper scaling of fitness values is important to ensure meaningful selection pressure. The figure below shows examples of $ p_i $ when $ a $ is relatively small (left) and relatively large (right). If $ a $ is too small, the selection pressure becomes weak, making it more difficult to favor individuals with higher fitness.

Variable-composition evolutionary algorithm (EA-vc)

2025 July 7, updated

Overview

Since CrySPY 1.4.0, a variable-composition EA (EA-vc) has been available as an extension of the fixed-composition EA. See the Interface page for the supported interfaces. Although the overall flow is similar to the fixed-composition EA, EA-vc differs in how fitness is evaluated and how offspring are generated in order to handle varying compositions. Here, we describe the parts that have been modified from the original EA.

From version 1.4.1, it is possible to generate structures under the charge neutrality condition.

Procedure

Initialize population
Evaluate fitness
Natural selection
Select parents
Create next generation
Repeat from step 2: Evaluate fitness

Initialize population

In the first generation, a set of random structures is generated according to the number specified by n_pop. tot_struc is not used in EA or EA-vc. In EA-vc, the number of atoms for each atom type is randomly determined within a user-defined range. The minimum (ll_nat) and maximum (ul_nat) number of atoms per type can be specified in cryspy.in as shown below.

[structure]
atype = Cu Au
ll_nat = 0 0
ul_nat = 8 8

Evaluate fitness

The convex hull computed from formation energies is used to evaluate the phase stability of different compositions, since directly comparing total energies of structures with different numbers of atoms is not meaningful. Information on formation energy, the convex hull, and phase diagrams can be found online. For example, see Materials Project Documentation. In EA-vc, the fitness is defined as the energy above hull (also referred to as hull distance).

Formation energy

Formation energy is calculated based on the reference energies (in eV/atom) of stable pure elements, which are specified as end_point in cryspy.in. For example, in the case of the Cu–Au binary system, the end_point should contain the per-atom energies (in eV/atom) of fcc-Cu and fcc-Au, in that order. Note that even if a structure with the same composition as end_point is found during the structure search and has a total energy lower than the corresponding end_point value, the formation energy is still currently calculated based on the original end_point values defined in cryspy.in.
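
As a worked illustration, the sketch below computes a per-atom formation energy from a total energy and the end_point values. It uses the standard definition (total energy minus the composition-weighted reference energies, divided by the number of atoms). The function name formation_energy and the numbers are hypothetical, and the normalization per atom is an assumption about the convention used.

```python
def formation_energy(total_energy, nat, end_point):
    """Formation energy in eV/atom.

    total_energy: total energy of the cell (eV)
    nat: number of atoms of each type, in the order of atype
    end_point: reference energies (eV/atom), in the same order
    """
    n_atoms = sum(nat)
    reference = sum(n * e for n, e in zip(nat, end_point))
    return (total_energy - reference) / n_atoms


# Illustrative numbers only: a Cu2Au2 cell with made-up energies
print(formation_energy(-14.0, nat=(2, 2), end_point=(-3.0, -3.5)))
```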

Convex hull

The energy difference between a given structure’s formation energy and the convex hull is called the energy above hull, also known as the hull distance. This value indicates how much higher the formation energy of a structure is compared to the most stable combination of phases at the same composition. Structures with a hull distance of zero are on the convex hull and are thus thermodynamically stable.

Unlike in the fixed-composition EA, EA-vc filters structures based on their per-atom energy when computing the convex hull, using the condition: $$ \mathrm{emin\_ea} \le E \le \mathrm{emax\_ea} $$ Note that this filtering is based only on the total energy per atom, not on the formation energy.

To compute the convex hull, CrySPY uses the PhaseDiagram class provided by the pymatgen library. Unlike in the case of formation energy, if a structure with the same composition as a pure element has a total energy lower than the corresponding end_point value, that structure is used as the reference for computing the convex hull and hull distance.

Natural selection

As shown in the figure below, EA-vc can produce multiple stable structures (i.e., with a hull distance of 0). In such cases, multiple individuals share the top rank in terms of hull distance. If the number of elite structures specified by n_elite is smaller than the number of equally ranked individuals, the selection becomes non-deterministic. Currently, CrySPY randomly selects n_elite individuals from those with a hull distance less than 0.001 eV/atom. If the number of individuals with a hull distance less than 0.001 eV/atom is smaller than n_elite, elite structures are selected in the usual way, based on fitness ranking. When selecting elite individuals as well, duplicate structures are removed using the StructureMatcher class provided by the pymatgen library.
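
The elite selection rule described above can be sketched as follows. The function name select_elites_vc and the dictionary layout are hypothetical, and the duplicate removal with StructureMatcher is omitted.

```python
import random


def select_elites_vc(hull_dist, n_elite, tol=0.001, rng=None):
    """Select elite IDs from {structure ID: hull distance in eV/atom}."""
    rng = rng or random.Random()
    stable = [cid for cid, dist in hull_dist.items() if dist < tol]
    if len(stable) >= n_elite:
        # Enough (nearly) stable structures: choose among them at random
        return rng.sample(stable, n_elite)
    # Otherwise fall back to the usual ranking by fitness
    ranked = sorted(hull_dist, key=hull_dist.get)
    return ranked[:n_elite]


hull_dist = {0: 0.0, 1: 0.0005, 2: 0.0, 3: 0.05, 4: 0.2}
print(select_elites_vc(hull_dist, n_elite=2, rng=random.Random(0)))
print(select_elites_vc(hull_dist, n_elite=4))
```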

Elite individuals are selected based on the best structures from previous generations. However, because hull distance can vary from one generation to the next, the values for elite individuals are recalculated using the current convex hull before natural selection is applied.

As described in the Convex hull section, emin_ea and emax_ea are not used for natural selection in EA-vc.

Select parents

The method for selecting parents is the same as in the fixed-composition EA.

Create next generation

Evolutionary operations

The crossover (vc) operation is slightly different from that in the fixed-composition EA, while permutation and strain are the same. EA-vc introduces several new operations to enable compositional variation.

Population size

The sum of structures from crossover, permutation, strain, addition, elimination, substitution, and random generation must be equal to n_pop.

n_pop = n_crsov + n_perm + n_strain + n_add + n_elim + n_subs + n_rand
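
A quick way to check this condition before a run is to parse the [EA] section and compare the sum with n_pop. The sketch below uses Python's standard configparser on the [EA] values from the interactive-mode tutorial; the helper name check_population_size is hypothetical and is not part of CrySPY.

```python
import configparser

# [EA] values taken from the cryspy.in of the interactive-mode tutorial
CRYSPY_IN = """
[EA]
n_pop = 20
n_crsov = 5
n_perm = 2
n_strain = 2
n_rand = 2
n_add = 3
n_elim = 3
n_subs = 3
"""

OPERATION_KEYS = ("n_crsov", "n_perm", "n_strain", "n_add", "n_elim", "n_subs", "n_rand")


def check_population_size(text):
    """Return (n_pop, total), where total is the sum over all operations."""
    config = configparser.ConfigParser()
    config.read_string(text)
    ea = config["EA"]
    total = sum(ea.getint(key) for key in OPERATION_KEYS)
    return ea.getint("n_pop"), total


n_pop, total = check_population_size(CRYSPY_IN)
print(n_pop, total, n_pop == total)
```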

Crossover (vc)

The variable-composition crossover is almost the same as the fixed-composition version, but it differs in that the adjustment of the number of atoms is minimized.

In step 6 of the fixed-composition crossover, the difference in the number of atoms in each atom type is calculated directly. In contrast, in crossover (vc), the difference is calculated based on the allowed range defined by ll_nat and ul_nat. For example:

ll_nat = [4, 4, 4]
ul_nat = [8, 8, 8]
offspring_nat = [2, 6, 12]
nat_diff = [-2, 0, 4]

If this difference in the number of atoms (nat_diff in the example above) exceeds the allowed tolerance (nat_diff_tole), the operation is retried. Otherwise, the number of atoms is adjusted to fall within the range defined by ll_nat and ul_nat.
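
The difference calculation in the example above can be written as a small function. The names nat_diff_vc and within_tolerance are hypothetical. The per-element tolerance check follows the description of nat_diff_tole in the fixed-composition crossover.

```python
def nat_diff_vc(offspring_nat, ll_nat, ul_nat):
    """Distance of each atom count from the allowed range [ll_nat, ul_nat]."""
    diff = []
    for nat, lower, upper in zip(offspring_nat, ll_nat, ul_nat):
        if nat < lower:
            diff.append(nat - lower)  # negative: atoms are missing
        elif nat > upper:
            diff.append(nat - upper)  # positive: atoms are in excess
        else:
            diff.append(0)            # already within the range
    return diff


def within_tolerance(nat_diff, nat_diff_tole=4):
    """True if every element is within the tolerance; otherwise retry the crossover."""
    return all(abs(d) <= nat_diff_tole for d in nat_diff)


diff = nat_diff_vc([2, 6, 12], ll_nat=[4, 4, 4], ul_nat=[8, 8, 8])
print(diff, within_tolerance(diff))
```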

Addition

2025 July 7, updated

An atom type whose current count does not exceed the limit specified by ul_nat is randomly selected, and one atom of that type is added at a random position.

From version 1.4.1, the functionality to add multiple atoms has been implemented. The number of atoms to be added is randomly selected from natural numbers up to add_max. The default value is add_max = 3.

Add one atom and check whether it satisfies the minimum interatomic distance specified by mindist.
If the distance condition is not satisfied, the atom is placed again at a different random position. This process is repeated up to maxcnt_ea times.
(since version 1.4.1) Repeat until the randomly determined number of atoms (up to add_max) have been added.
If no valid offspring is obtained, the volume is expanded by 10%, and the same procedure is retried up to maxcnt_ea times.
If that also fails, the volume is expanded up to 20% and the structure generation is attempted again. If it still fails, the parent is replaced.

Elimination

2025 July 7, updated

An atom type whose current count is above the lower limit specified by ll_nat is randomly selected, and one atom of that type is removed.

From version 1.4.1, the functionality to remove multiple atoms has been implemented. The number of atoms to be removed is randomly selected from natural numbers up to elim_max. The default value is elim_max = 3.

Substitution

2025 July 7, updated

Substitution is an operation in which two different atom types are randomly selected and their positions are substituted.

From version 1.4.1, the functionality to substitute multiple atoms has been implemented. The number of atoms to be substituted is randomly selected from natural numbers up to subs_max. The default value is subs_max = 3.

The number of atoms after the substitution is restricted so that it does not fall below the minimum (ll_nat) and does not exceed the maximum (ul_nat) number of atoms.
Finally, the minimum interatomic distance specified by mindist is checked, and if there are no issues, the structure is accepted as an offspring.

Bayesian optimization (BO)

under construction

One of the selection-type algorithms.

Reference

LAQA

One of the selection-type algorithms.

Score $ L $

$$ L = -E + w_F \frac{F^2}{2\Delta F} + w_S S. $$

Symbol	Note
$$ E $$	Energy (eV/atom)
$$ w_F $$	Weight of the force term. Default: $ w_F = 0.1$
$$ F $$	Averaged norm of the atomic force (eV/Å)
$$ \Delta F $$	Absolute difference of $ F $ from the previous step. $ \Delta F = 1$ for the first step. $ \Delta F = 10^{-6}$ if $ \Delta F \le 10^{-6} $.
$$ w_S $$	Weight of the stress term. Default: $ w_S = 10.0$
$$ S $$	Average of the absolute values of the components of the stress tensor (eV/Å^3).
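
The score can be evaluated directly from the definitions in the table. In the sketch below, the function name laqa_score and its argument names are hypothetical. The default weights and the rules for $ \Delta F $ are taken from the table.

```python
def laqa_score(energy, force, stress, prev_force=None, w_f=0.1, w_s=10.0):
    """LAQA score L = -E + w_F * F^2 / (2 * dF) + w_S * S.

    energy: E (eV/atom)
    force: F, averaged norm of the atomic forces (eV/A)
    stress: S, average absolute stress component (eV/A^3)
    prev_force: F of the previous step, or None for the first step
    """
    if prev_force is None:
        delta_f = 1.0  # first step
    else:
        # The difference is bounded from below by 1e-6
        delta_f = max(abs(force - prev_force), 1e-6)
    return -energy + w_f * force**2 / (2.0 * delta_f) + w_s * stress


# Illustrative values only
print(laqa_score(-5.0, 2.0, 0.01))
print(laqa_score(-5.0, 2.0, 0.01, prev_force=1.5))
```

A small $ \Delta F $ together with a large $ F $ yields a large score, because the force term is divided by $ \Delta F $.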

Reference

Structure generation

CrySPY currently has three random structure generation modes: crystal (default), mol, and mol_bs. PyXtal (or find_wy) is used for the structure generation.

struc_mode = crystal

under construction

struc_mode = mol

under construction

struc_mode = mol_bs

CrySPY uses PyXtal in the normal molecular crystal structure generation mode (struc_mode = mol). To preserve symmetry, the molecules are arranged so that they fit the point group of a selected Wyckoff position in the space group. (Generation sometimes takes a long time.)

In mol_bs mode (bs stands for break symmetry), dummy atoms are placed at Wyckoff positions as in ordinary crystals. The dummy atoms are then replaced by randomly rotated molecules without considering symmetry. Structure generation is relatively fast.

under construction

Features

Logging

2023 July 10

Since version 1.2.0, CrySPY uses Python's logging library. CrySPY logs are output to both the screen and files (log_cryspy and err_cryspy).

log –> screen and log_cryspy
error and warning –> screen and err_cryspy

Here is an example:

[2023-07-10 18:40:54,389][cryspy_init][INFO] 


Start CrySPY 1.2.0


[2023-07-10 18:40:54,389][cryspy_init][INFO] # ---------- Read input file, cryspy.in
[2023-07-10 18:40:54,390][read_input][INFO] Save input data in cryspy.stat
[2023-07-10 18:40:54,391][cryspy_init][INFO] # ---------- Initial structure generation
[2023-07-10 18:40:54,391][cryspy_init][INFO] Number of MPI processes: 1
[2023-07-10 18:40:54,391][gen_init_struc][INFO] # ------ mindist
[2023-07-10 18:40:54,395][struc_util][INFO] Cu - Cu: 1.32
[2023-07-10 18:40:54,395][gen_init_struc][INFO] # ------ generate structures
[2023-07-10 18:40:54,481][gen_pyxtal][INFO] Structure ID      0 was generated. Space group:   1 -->   1 P1
[2023-07-10 18:40:54,493][gen_pyxtal][INFO] Structure ID      1 was generated. Space group:  28 -->  28 Pma2
[2023-07-10 18:40:54,498][gen_pyxtal][INFO] Structure ID      2 was generated. Space group:  29 -->  29 Pca2_1
[2023-07-10 18:40:54,704][gen_pyxtal][INFO] Structure ID      3 was generated. Space group: 137 --> 137 P4_2/nmc
[2023-07-10 18:40:54,725][gen_pyxtal][INFO] Structure ID      4 was generated. Space group: 212 --> 214 I4_132
[2023-07-10 18:40:54,800][cryspy_init][INFO] Elapsed time for structure generation: 0:00:00.408367
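The routing described above (ordinary messages to the screen and log_cryspy, warnings and errors to the screen and err_cryspy) can be reproduced with the standard logging module. The following is a minimal sketch, not CrySPY's actual logger setup; the function name `make_logger`, the filter class, and the format string are assumptions chosen to resemble the sample output.

```python
import logging
import sys


class BelowWarning(logging.Filter):
    """Pass only records below WARNING (e.g. INFO)."""

    def filter(self, record):
        return record.levelno < logging.WARNING


def make_logger(name, logfile, errfile):
    """Build a logger: everything to screen, INFO to logfile, WARNING+ to errfile."""
    fmt = logging.Formatter('[%(asctime)s][%(module)s][%(levelname)s] %(message)s')
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()      # avoid duplicated handlers on repeated calls
    logger.propagate = False

    screen = logging.StreamHandler(sys.stdout)
    log_handler = logging.FileHandler(logfile)
    log_handler.addFilter(BelowWarning())
    err_handler = logging.FileHandler(errfile)
    err_handler.setLevel(logging.WARNING)

    for handler in (screen, log_handler, err_handler):
        handler.setFormatter(fmt)
        logger.addHandler(handler)
    return logger
```

Calling `make_logger('demo', 'log_demo', 'err_demo')` and then `logger.info(...)` or `logger.warning(...)` writes each message to the screen and to exactly one of the two files.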

If you do not want output in the console, run cryspy with the -n option as follows:

cryspy -n

Backup

2024 Dec. 22 updated

CrySPY has a simple backup function. The following files are backed up:

cryspy.in
cryspy.stat
log_cryspy
err_cryspy
debug_cryspy
cryspy_interactive.ipynb
calc_in/*
data/*

work/* are NOT included.

(v1.1.0 or later) The above files are copied to a directory named by date and time inside the “backup” directory. Previous backups are NOT automatically deleted.
(v1.0.0) Only one generation is backed up, and the previous backup is deleted.

Auto backup

The timing of the automatic backup is as follows:

before going to next selection (BO, LAQA) or next generation (EA)
append structures

Manual backup

To back up manually, run cryspy with the -b or --backup option:

cryspy -b

This command only performs backups, unlike the normal execution.

Clean

2024 Dec. 22 updated

CrySPY has a simple clean (just move files) function. It is useful when you want to start over from the beginning. The following files are cleaned up:

cryspy.stat
log_cryspy
err_cryspy
lock_cryspy
data/*
work/*
tmp_gen_struc/*

To clean up, run cryspy with the -c or --clean option:

$ ls
calc_in  cryspy.in  cryspy.stat  data  err_cryspy  log_cryspy

$ cryspy -c
Are you sure you want to clean the data? 'yes' or 'no' [y/n]: y

$ ls
calc_in  cryspy.in  trash

$ ls trash
20230318_100728

Files other than calc_in/* and cryspy.in are moved to trash and grouped into a directory named by date and time. If you do not need them, you can delete them manually.

Restriction on interatomic distances

2024 April 23, updated

You can restrict the interatomic distances in structure generation. Here is an example of the [structure] section in the input file that limits the minimum interatomic distances for an A-B binary system.

[structure]
natot = 8
atype = A B
nat = 4 4
mindist_1 = 2.0 1.8
mindist_2 = 1.8 1.5

This means that the minimum interatomic distances of A-A, A-B, and B-B are limited to 2.0, 1.8, and 1.5 Å, respectively. Structures with interatomic distances shorter than these values are automatically eliminated.

For ternary systems, you will need mindist_1, mindist_2, and mindist_3. The mindist matrix must be symmetric.
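To make the matrix layout concrete, the sketch below turns the mindist_1, mindist_2, ... rows into a matrix, checks that it is square and symmetric, and lists the limit for each element pair. The helper names `parse_mindist` and `pair_limits` are hypothetical; this is not how CrySPY itself reads the input.

```python
def parse_mindist(rows):
    """rows: the values of mindist_1, mindist_2, ... as strings."""
    matrix = [[float(v) for v in row.split()] for row in rows]
    n = len(matrix)
    for i, row in enumerate(matrix):
        if len(row) != n:
            raise ValueError(f'mindist_{i + 1} must have {n} values')
        for j in range(i):
            if row[j] != matrix[j][i]:
                raise ValueError(f'mindist is not symmetric at ({i + 1}, {j + 1})')
    return matrix


def pair_limits(atype, matrix):
    """Map each unordered element pair to its minimum distance (Å)."""
    n = len(atype)
    return {(atype[i], atype[j]): matrix[i][j]
            for i in range(n) for j in range(i, n)}


matrix = parse_mindist(['2.0 1.8', '1.8 1.5'])
limits = pair_limits(('A', 'B'), matrix)
print(limits)
```

For a ternary system the same functions take three rows of three values each.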

Since CrySPY version 1.4.0, the minimum interatomic distance check is also performed after structure relaxation. This feature was introduced because, with machine learning potentials, structures with nearly overlapping atoms can sometimes be obtained. You can disable this feature by adding the following line to cryspy.in (it is enabled by default):

[option]
check_mindist_opt = False

Example: Na8Cl8

Without mindist

cryspy.in

[basic]
algo = RS
calc_code = VASP
tot_struc = 5
nstage = 2
njob = 2
jobcmd = qsub
jobfile = job_cryspy

[structure]
natot = 16
atype = Na Cl
nat = 8 8

[VASP]
kppvol = 40 80

[option]

log_cryspy

[2024-04-23 13:46:28,598][cryspy_init][INFO] 


Start CrySPY 1.2.3


[2024-04-23 13:46:28,598][cryspy_init][INFO] # ---------- Read input file, cryspy.in
[2024-04-23 13:46:28,598][read_input][INFO] Save input data in cryspy.stat
[2024-04-23 13:46:28,599][gen_init_struc][INFO] # ------ mindist
[2024-04-23 13:46:28,601][struc_util][INFO] Na - Na: 1.66
[2024-04-23 13:46:28,602][struc_util][INFO] Na - Cl: 1.3399999999999999
[2024-04-23 13:46:28,602][struc_util][INFO] Cl - Cl: 1.02
...

In the default settings of PyXtal, atoms can sometimes be too close to each other, as shown in the figure above, so it is recommended to set the mindist parameter. That would help simplify DFT calculations.

With mindist

cryspy.in

[basic]
algo = RS
calc_code = VASP
tot_struc = 5
nstage = 2
njob = 2
jobcmd = qsub
jobfile = job_cryspy

[structure]
natot = 16
atype = Na Cl
nat = 8 8
mindist_1 = 2.5 1.5
mindist_2 = 1.5 2.5

[VASP]
kppvol = 40 80

[option]

log_cryspy

[2024-04-23 14:06:21,955][cryspy_init][INFO] 


Start CrySPY 1.2.3


[2024-04-23 14:06:21,955][cryspy_init][INFO] # ---------- Read input file, cryspy.in
[2024-04-23 14:06:21,956][read_input][INFO] Save input data in cryspy.stat
[2024-04-23 14:06:21,956][gen_init_struc][INFO] # ------ mindist
[2024-04-23 14:06:21,956][struc_util][INFO] Na - Na: 2.5
[2024-04-23 14:06:21,956][struc_util][INFO] Na - Cl: 1.5
[2024-04-23 14:06:21,956][struc_util][INFO] Cl - Cl: 2.5

In cases like ionic crystals, it is advisable to set up the configuration in such a way that cations and anions are kept apart from each other.

CrySPY_ID in job files

#!/bin/sh
#$ -cwd
#$ -V -S /bin/bash
####$ -V -S /bin/zsh
#$ -N Si8_CrySPY_ID
#$ -pe smp 8
####$ -q ibis1.q
####$ -q ibis2.q

mpirun -np $NSLOTS pw.x -nk 4 -nb 2 < pwscf.in > pwscf.out


if [ -e "CRASH" ]; then
    exit 1
fi

sed -i -e '3 s/^.*$/done/' stat_job

Structure generation with MPI parallelization

Oct. 21 2023, update

Random structure generation using MPI has been available since version 1.1.0 (CrySPY 1.2.3 or later is recommended). You need to install mpi4py in your Python environment for MPI parallelization. An MPI library such as Open MPI, Intel MPI, or MPICH is also required on your workstation.

Info

Requirements:

CrySPY ~~1.1.0~~ 1.2.3 or later
mpi4py
MPI library (Open MPI, Intel MPI, MPICH, etc.)

Warning

The figure below shows the relationship between elapsed time and the number of processes for 1000 structures of Si8 with the following setting:

[basic]
algo = RS
calc_code = soiap
tot_struc = 1000
nstage = 1
njob = 2
jobcmd = zsh
jobfile = job_cryspy

[structure]
natot = 8
atype = Si
nat = 8
mindist_1 = 2.2

Structure generation takes a long time here because of the slightly strict setting mindist_1 = 2.2. The structure generation was performed 10 times for each number of processes.

Run

mpiexec -n 4 cryspy -p

Info

Enthalpy

2023/10/18

Info

Requirements:

CrySPY 1.2.2 or later
VASP or QE

When performing CSP at high pressure, enthalpy can be collected instead of total energy. This feature is not yet compatible with software other than VASP and QE.

E_eV_atom in cryspy_rslt and cryspy_rslt_energy_asc then holds the enthalpy (eV/atom). Here is an example of CSP results for Sr4O4 under a pressure of 40 GPa. The CsCl-type structure (ID 5) is more stable than the NaCl-type structure (ID 6).

   Spg_num Spg_sym  Spg_num_opt Spg_sym_opt  E_eV_atom  Magmom      Opt
5       26  Pmc2_1          221       Pm-3m  -2.276790     NaN     done
6      225   Fm-3m          225       Fm-3m  -2.244800     NaN     done
1      101  P4_2cm          107        I4mm  -2.181115     NaN     done
4      123  P4/mmm          123      P4/mmm  -2.034509     NaN  not_yet
3       20  C222_1           63        Cmcm  -0.686541     NaN     done
2       75      P4           75          P4  -0.008713     NaN  not_yet
9       51    Pmma           47        Pmmm   0.096430     NaN     done
8       65    Cmmm          123      P4/mmm   1.099657     NaN     done
0      187   P-6m2          187       P-6m2   1.292124     NaN     done
7       53    Pmna           53        Pmna   5.153504     NaN  not_yet

VASP

CrySPY reads the energy (enthalpy) from an OSZICAR file. The value automatically changes to enthalpy when PSTRESS is set in INCAR_x as follows:

PSTRESS = 400

You do not have to do anything in cryspy.in. energy_step_flag is also supported for enthalpy.

Example: CrySPY utility > examples > qe_Sr4O4_RS_pv_term

QE

Add pv_term = True in the QE section of cryspy.in to use enthalpy:

[QE]
qe_infile = pwscf.in
qe_outfile = pwscf.out
kppvol =  40  80
pv_term = True

Don’t forget to write press in the QE input:

 &cell
    press = 400
 /
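Both PSTRESS in VASP and press in QE are specified in kbar, so the value 400 used above corresponds to the 40 GPa of the Sr4O4 example. A tiny helper (hypothetical, for illustration only) makes the conversion explicit:

```python
GPA_TO_KBAR = 10.0  # 1 GPa = 10 kbar


def gpa_to_kbar(pressure_gpa):
    """Convert a pressure in GPa to kbar for PSTRESS (VASP) or press (QE)."""
    return pressure_gpa * GPA_TO_KBAR


print(gpa_to_kbar(40))
```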

Warning

In QE, energy_step_flag is not supported yet for enthalpy.

As library

2024 May 31

Info

Requirements:

CrySPY 1.3.0 or later

CrySPY can be used as a library to generate random structures or structures by evolutionary algorithm. The Jupyter notebook is available in CrySPY utility > notebook > as_library.

Random structure generation

####
#### when you change set_logger(), you need to restart the kernel
####
from cryspy.util.utility import set_logger    # optional
set_logger()    # optional
#set_logger(noprint=True, logfile='log_cryspy', errfile='err_cryspy')    # write log and err messages to files

from cryspy.RS.gen_struc_RS import gen_pyxtal

nstruc = 10
atype = ('Na', 'Cl')
nat = (4, 4)
mindist = ((2.0, 1.5),
           (1.5, 2.0))
spgnum = 'all'

init_struc_data = gen_pyxtal.gen_struc(
    nstruc=nstruc,
    atype=atype,
    nat=nat,
    mindist=mindist,
    spgnum=spgnum,
)

You can get init_struc_data (dict: {ID: pymatgen Structure, …})

Structure generation by evolutionary algorithm

Situation: parent A (, parent B) –> child

Prepare two parent structures (or one) as pymatgen Structure objects.
In this example, just use the results of RS for Cu4Au4 (see CrySPY utility > notebook > as_library).

import pickle
with open('./Cu4Au4_sample/opt_struc_data.pkl', 'rb') as f:
    opt_struc_data = pickle.load(f)

Crossover

from cryspy.EA.gen_struc_EA import crossover

# you can change parent_A and parent_B
parent_A = opt_struc_data[0]
parent_B = opt_struc_data[1]

atype = ('Cu', 'Au')
nat = (4, 4)
mindist = ((1.5, 1.5),
           (1.5, 1.5))

child = crossover.gen_child(
    atype=atype,
    nat=nat,
    mindist=mindist,
    parent_A=parent_A,
    parent_B=parent_B,
)

# child: pymatgen Structure

Permutation

from cryspy.EA.gen_struc_EA import permutation

# you can change parent_A
parent_A = opt_struc_data[0]

atype = ('Cu', 'Au')
nat = (4, 4)
mindist = ((1.5, 1.5),
           (1.5, 1.5))
ntimes = 1    # number of times to perform permutation

child = permutation.gen_child(
    atype=atype,
    mindist=mindist,
    parent_A=parent_A,
    ntimes=ntimes,
)

# child: pymatgen Structure

Strain

from cryspy.EA.gen_struc_EA import strain

# you can change parent_A
parent_A = opt_struc_data[0]

atype = ('Cu', 'Au')
nat = (4, 4)
mindist = ((1.5, 1.5),
           (1.5, 1.5))
sigma_st = 0.05    # standard deviation of strain

child = strain.gen_child(
    atype=atype,
    mindist=mindist,
    parent_A=parent_A,
    sigma_st=sigma_st,
)

Situation: parent group, fitness –> children

Data set

Prepare structure and fitness (energy) data as dicts. The key is the structure ID. In this example, just use the results of RS for Cu4Au4 (see CrySPY utility > notebook > as_library).

e.g.
struc_data = {0: (pymatgen Structure), 1: (pymatgen Structure), …}
fitness = {0: 0.019632287242441926, 1: -0.005437509701440302, …}

import pickle
with open('./Cu4Au4_sample/opt_struc_data.pkl', 'rb') as f:
    opt_struc_data = pickle.load(f)
with open('./Cu4Au4_sample/rslt_data.pkl', 'rb') as f:
    rslt_data = pickle.load(f)

struc_data = opt_struc_data    # dict
fitness = rslt_data['E_eV_atom'].to_dict()    # you may include None or np.nan for values

Survival of the fittest

from cryspy.EA.survival import survival_fittest
from cryspy.EA.gen_struc_EA.select_parents import SelectParents
from cryspy.EA.gen_struc_EA import crossover, permutation, strain

n_fittest = 5    # number of survivors

ranking, _, _ = survival_fittest(
    fitness=fitness,
    struc_data=struc_data,
    elite_struc=None,
    elite_fitness=None,
    n_fittest=n_fittest,
    fit_reverse=False,
    emax_ea=None,
    emin_ea=None,
)

# ranking <-- e.g. [2, 1, 0, 7, 9] without structure duplication
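The core idea of the ranking step can be sketched in plain Python: drop entries without a valid fitness, sort the IDs, and keep the best n_fittest. This is only an illustration under those assumptions. The hypothetical function `rank_by_fitness` below does not remove duplicate structures and ignores elites and the emax_ea/emin_ea limits, all of which survival_fittest takes as arguments.

```python
import math


def rank_by_fitness(fitness, n_fittest=0, fit_reverse=False):
    """Return structure IDs sorted from best to worst.

    fitness: dict {ID: value}; None and NaN values are skipped.
    fit_reverse=False means a minimum search (lower fitness is better).
    n_fittest=0 keeps all valid individuals.
    """
    valid = {cid: f for cid, f in fitness.items()
             if f is not None and not math.isnan(f)}
    ranking = sorted(valid, key=valid.get, reverse=fit_reverse)
    return ranking[:n_fittest] if n_fittest > 0 else ranking


demo_fitness = {0: 0.02, 1: -0.005, 2: -0.03, 3: None, 4: float('nan')}
demo_ranking = rank_by_fitness(demo_fitness, n_fittest=2)
print(demo_ranking)
```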

Select parents class

sp = SelectParents(ranking)    # after set_xxx, we can use sp.get_parents(n_parent)
sp.set_tournament(t_size=2)
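For readers unfamiliar with tournament selection, here is a generic sketch of the technique: t_size entrants are drawn at random from the ranking, and the best-ranked entrant becomes a parent. The function `tournament_select` is hypothetical and is not the SelectParents implementation. In particular, drawing distinct parents is an assumption made here for simplicity.

```python
import random


def tournament_select(ranking, t_size=3, n_parent=2, rng=random):
    """Pick n_parent distinct IDs from ranking (best first) by tournaments."""
    if n_parent > len(ranking):
        raise ValueError('not enough individuals for the requested parents')
    pool = list(range(len(ranking)))      # positions in the ranking
    parents = []
    for _ in range(n_parent):
        entrants = rng.sample(pool, min(t_size, len(pool)))
        best = min(entrants)              # smallest position = best rank
        parents.append(ranking[best])
        pool.remove(best)                 # assumption: no parent is reused
    return tuple(parents)


demo_parents = tournament_select([2, 1, 0, 7, 9], t_size=2, n_parent=2,
                                 rng=random.Random(0))
print(demo_parents)
```

A larger t_size makes it more likely that top-ranked structures win, which increases the selection pressure.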

Crossover

atype = ('Cu', 'Au')
nat = (4, 4)
mindist = ((1.5, 1.5),
           (1.5, 1.5))
n_crsov = 5    # number of structures to be generated by crossover
#id_start = len(init_struc_data)  # next Structure ID
id_start = 10

co_children, co_parents, co_operation = crossover.gen_crossover(
    atype=atype,
    nat=nat,
    mindist=mindist,
    struc_data=struc_data,
    sp=sp,
    n_crsov=n_crsov,
    id_start=id_start,
)

# co_children <-- dict {ID: pymatgen Structure, ID: pymatgen Structure, ...}
# co_parents  <-- e.g. {10: (2, 7), 11: (2, 1), 12: (2, 1), 13: (0, 2), 14: (2, 1)}
# co_operation <-- e.g. {10: 'crossover', 11: 'crossover', ...}

Permutation

n_perm = 5    # number of structures to be generated by permutation
#id_start = len(init_struc_data) + n_crsov   # next Structure ID
id_start = 15
ntimes = 1    # number of times to perform permutation

pm_children, pm_parents, pm_operation = permutation.gen_permutation(
    atype=atype,
    mindist=mindist,
    struc_data=struc_data,
    sp=sp,
    n_perm=n_perm,
    id_start=id_start,
    ntimes=ntimes,
)

# pm_children <-- dict {ID: pymatgen Structure, ID: pymatgen Structure, ...}
# pm_parents  <-- e.g. {15: (2,), 16: (1,), 17: (2,), 18: (1,), 19: (1,)}
# pm_operation <-- e.g. {15: 'permutation', 16: 'permutation', ...}

Strain

n_strain = 5    # number of structures to be generated by strain
#id_start = len(init_struc_data) + n_crsov + n_perm   # next Structure ID
id_start = 20
sigma_st = 0.05    # standard deviation of strain

st_children, st_parents, st_operation = strain.gen_strain(
    atype=atype,
    mindist=mindist,
    struc_data=struc_data,
    sp=sp,
    n_strain=n_strain,
    id_start=id_start,
    sigma_st=sigma_st,
)

# st_children <-- dict {ID: pymatgen Structure, ID: pymatgen Structure, ...}
# st_parents  <-- e.g. {20: (1,), 21: (2,), 22: (0,), 23: (2,), 24: (2,)}
# st_operation <-- e.g. {20: 'strain', 21: 'strain', ...}

Interactive mode

2025 March 6

Info

Requirements:

CrySPY 1.4.0 or later
Jupyter
Structure optimization software compatible with ASE (e.g., machine learning potentials).
nglview (optional)

An interactive mode using Jupyter Notebook has been made available to ensure ease of use, even for those unfamiliar with PC clusters or supercomputers. Since the structure optimization calculations are designed for ASE, compatible machine learning potentials can be used.

For detailed usage, please refer to Tutorial > Interactive mode (Jupyter Notebook).

Re-plot convex hull

2025 July 3

Starting from CrySPY version 1.4.1, the cryspy-Eplot command is installed. In EA-vc, a convex hull plot is automatically generated and saved at the end of each generation’s calculation. If you wish to adjust the plotting parameters afterward, you can re-plot the graph using the cryspy-Eplot command.

Usage

Modify the following parameters directly in the [EA] section of cryspy.in, and then run cryspy-Eplot.

cgen
show_max
label_stable
vmax
bottom_margin
fig_format

cgen specifies the maximum generation number included in the plot. The default is None, meaning up to the latest generation. In EA-vc, the convex hull is computed at the end of each generation, so specifying an unfinished generation with cgen may result in an error.

The output image file will be overwritten at ./data/convex_hull/conv_hull_gen_{cgen}.{fig_format}.

For details on other parameters, refer to CrySPY > Tutorial > EA-vc > Analysis and visualization.

Charge neutrality condition

July 5, 2025

From version 1.4.1, it is possible to impose a charge neutrality condition during structure generation in EA-vc. This can be applied to both random structure generation and structure generation by evolutionary operations.

In cryspy.in, the charge corresponding to each atype is specified as charge.

Example of cryspy.in:

...
[structure]
atype = Li Ca Cl
ll_nat = 0 0 0
ul_nat = 8 8 8
charge = 1 2 -1
...

For example, in this case, only structures that satisfy the charge neutrality condition, such as (Li, Ca, Cl) = (4, 0, 4) or (Li, Ca, Cl) = (4, 2, 8), will be generated.

Note that if add_max or elim_max is too small as shown below, there may be no combinations of atom numbers that satisfy the charge neutrality condition when adding or removing atoms, making structure generation impossible.

...
[structure]
atype = Li Ca Cl
ll_nat = 0 0 0
ul_nat = 8 8 8
charge = 1 2 -1
...
...
[EA]
add_max = 1
elim_max = 1

For example, in the above case, if add_max = 1, there are no combinations of atom numbers that satisfy the charge neutrality condition. If add_max = 2, (Li, Ca, Cl) = (1, 0, 1), that is, adding one Li and one Cl atom, satisfies the charge neutrality condition because (+1) + (-1) = 0. If add_max = 3, there are combinations such as (Li, Ca, Cl) = (1, 0, 1) and (0, 1, 2).
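The counting argument above can be checked by brute force. The sketch below enumerates how many atoms of each type could be added so that the net charge is zero. It assumes that add_max caps the total number of added atoms, which matches the examples in the text. The function `neutral_additions` is hypothetical and not part of CrySPY.

```python
from itertools import product


def neutral_additions(charge, add_max):
    """List per-type atom counts with 1..add_max atoms in total and zero net charge."""
    combos = []
    for counts in product(range(add_max + 1), repeat=len(charge)):
        total = sum(counts)
        net = sum(n * q for n, q in zip(counts, charge))
        if 0 < total <= add_max and net == 0:
            combos.append(counts)
    return combos


charge = (1, 2, -1)   # Li, Ca, Cl
for add_max in (1, 2, 3):
    print(add_max, neutral_additions(charge, add_max))
```

An empty list for a given add_max means that no charge-neutral addition exists, which is the situation to avoid when choosing add_max and elim_max.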

Input file

Description of the input file, cryspy.in.

File format

CrySPY uses the configparser module to read the input file, cryspy.in. The file consists of sections, each led by a [section] header and followed by name = value or name : value entries. Section names and values are case sensitive, but names are not. Lines beginning with # or ; are ignored and may be used for comments. Accepted bool values are 1, yes, true, and on (interpreted as True) and 0, no, false, and off (interpreted as False); these strings are checked in a case-insensitive manner. Some values are given as space-separated lists.

Info

See the configparser documentation for details.

Note

section name: case sensitive
name: case insensitive
value: case sensitive except for bool
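These rules are the default behavior of Python's configparser and can be tried directly. The snippet below parses a small made-up input with the standard library only. CrySPY's own reader may apply further validation and defaults on top of this.

```python
import configparser

SAMPLE = """\
[basic]
algo = RS
# lines beginning with # or ; are comments
; like this one
NJOB : 2

[option]
check_mindist_opt = No
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE)

algo = parser.get('basic', 'algo')                          # value keeps its case
njob = parser.getint('basic', 'njob')                       # name is case insensitive
check = parser.getboolean('option', 'check_mindist_opt')    # 'No' -> False
has_upper = parser.has_section('BASIC')                     # section is case sensitive

print(algo, njob, check, has_upper)   # prints: RS 2 False False
```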

[basic] section

2025 March 6 updated

Name	Value	Description
`algo`	`RS`, `EA`, `EA-vc`, `BO`, `LAQA`	Algorithm
`calc_code`	`VASP`, `QE`, `OMX`, `soiap`, `LAMMPS`, `ASE`	Calculation code for structure optimization
`tot_struc`	int	The total number of structures. Not used in the case of EA or EA-vc.
`nstage`	int	The number of stages
`njob`	int	The number of jobs running at the same time.
`jobcmd`	str	Command to submit jobs such as qsub and sbatch.
`jobfile`	str	File name of the job file.

[structure] section

2025 July 4, updated

Name	Value	Default	Description
`struc_mode`	`crystal`, `mol`, `mol_bs`	`crystal`	Structure generation mode
`atype`	atomic symbol [atomic symbol …]		Atom type. e.g. `atype = Na Cl`.
`nat`	int [int …]		The number of atoms corresponding to each atype. e.g. `nat = 8 8`. Not used in `EA-vc`.
`mindist` (`mindist_?`)	float [float …]	`None`	Constraint on minimum interatomic distance [Å].
`mindist_factor`	float	1.0	Scaling factor for `mindist`.
`vol_factor`	float	1.0	Volume scaling factor.
`vol_mu`	float	`None`	Mean of volume if you want to specify the volume of cells.
`vol_sigma`	float	`None`	Standard deviation of volume if you want to specify the volume of cells.
`symprec`	float	0.01	Precision for symmetry finding.
`spgnum`	`all`, space group number, 0	`all`	Constraint on space group. If `all`, 1–230. If 0, random structure without space group information (no symmetry).
`use_find_wy`	bool	`False`	Structure generation with find_wy.

mindist

Features > Restriction on interatomic distances

if algo is EA-vc

Name	Value	Default	Description
`ll_nat`	int [int …]		Lower limit of `nat`. e.g. `ll_nat = 0 0`.
`ul_nat`	int [int …]		Upper limit of `nat`. e.g. `ul_nat = 8 8`.
`charge`	int [int …]	`None`	Used to impose the charge neutrality condition. Charge of each atom type. e.g. `charge = 1 -1` for NaCl.

if struc_mode is mol or mol_bs

Name	Value	Default	Description
`mol_file`	str [str …]		Path of molecule files or molecule names.
`nmol`	int [int …]		The number of molecules.
`timeout_mol`	float	`None`	Time out for molecular structure generation.
`rot_mol`	`random`, `random_mol`, `random_wyckoff`	`random_wyckoff`	Only used in `mol_bs`. Mode for rotation of molecules.
`nrot`	int	20	Only used in `mol_bs`. Maximum number of trials to rotate molecules.
`mindist_mol_bs` (`mindist_mol_bs_?`)	float [float …]	`None`	Only used in `mol_bs`. Constraint on minimum intermolecular distance [Å].
`mindist_mol_bs_factor`	float	1.0	Only used in `mol_bs`. Scaling factor for `mindist_mol_bs`.

if use_find_wy is True or spgnum = 0

Name	Value	Default	Description
`fwpath`	str	`None`	Only used with find_wy. Path of find_wy. If `None`, `fwpath` is automatically searched in your $PATH.
`minlen`	float		Only used with find_wy or `spgnum = 0`. Minimum length of lattice vector [Å].
`maxlen`	float		Only used with find_wy or `spgnum = 0`. Maximum length of lattice vector [Å].
`dangle`	float		Only used with find_wy or `spgnum = 0`. Delta angle for alpha, beta, and gamma in degree unit.
`maxcnt`	int	50	Only used with find_wy or `spgnum = 0`. Maximum number of trials to determine atom positions.

[VASP] section

2024 April 22

[VASP] section is required only if you use VASP (calc_code = VASP)

Name	Value	Default	Description
`kppvol`	int [int …]		Grid density per Å**(-3) of reciprocal cell in each stage.
`force_gamma`	bool	`False`	If true, force gamma-centered mesh.

kppvol and force gamma

Input file > Kpoint

[QE] section

[QE] section is required only if you use QE (calc_code = QE)

Name	Value	Default	Description
`kppvol`	int [int …]		Grid density per Å**(-3) of reciprocal cell in each stage
`qe_infile`	str		File name of QE input file.
`qe_outfile`	str		File name of QE output file.
`pv_term`	bool	`False`	If true, read enthalpy instead of total energy.

kppvol

Input file > Kpoint

pv_term

Features > Enthalpy > QE

[OMX] section

[OMX] section is required only if you use OpenMX (calc_code = OMX)

Name	Value	Description
`kppvol`	int [int …]	Grid density per Å**(-3) of reciprocal cell in each stage
`OMX_infile`	str	File name of OpenMX input file.
`OMX_outfile`	str	File name of OpenMX output file.
`ValenceElectrons`	str float float [str float float …]	The number of initial charges for up and down spin states.

kppvol

Input file > Kpoint

ValenceElectrons

e.g. in NaCl: ValenceElectrons = Na 4.5 4.5 Cl 3.5 3.5.

[soiap] section

[soiap] section is required only if you use soiap (calc_code = soiap)

Name	Value	Description
`soiap_infile`	str	File name of soiap input file.
`soiap_outfile`	str	File name of soiap output file.
`soiap_cif`	str	File name of soiap CIF-formatted initial structure.

[LAMMPS] section

[LAMMPS] section is required only if you use LAMMPS (calc_code = LAMMPS)

Name	Value	Default	Description
`lammps_infile`	str		File name of LAMMPS input file.
`lammps_outfile`	str		File name of LAMMPS output file.
`lammps_potential`	str [str …], `None`	`None`	Potential.
`lammps_data`	str		File name of LAMMPS data file.

[ASE] section

[ASE] section is required only if you use ASE (calc_code = ASE)

Name	Value	Default	Description
`ase_python`	str		File name of ASE input file.

[EA] section

2025 July 7, updated

[EA] section is required only if you use EA or EA-vc
See also CrySPY > Search algorithms > EA

Name	Value	Default	Description
`n_pop`	int		Population (see also Population size)
`n_crsov`	int		Number of offspring created by crossover
`n_perm`	int		Number of offspring created by permutation
`n_strain`	int		Number of offspring created by strain
`n_rand`	int		Number of structures created randomly
`n_elite`	int		Number of elite individuals (see also Natural selection)
`fit_reverse`	bool	`False`	If `False`, minimal search (see also Evaluate fitness)
`n_fittest`	int	0	Number of individuals that survive natural selection. If set to 0, all individuals are retained.
`slct_func`	`TNM`, `RLT`		Function to select parents
`t_size`	int	3	Tournament size. Used only with `slct_func = TNM`. (see also Tournament selection)
`a_rlt`	float	10.0	Parameter for linear scaling. Used only with `slct_func = RLT`. (see also Roulette selection)
`b_rlt`	float	1.0	Parameter for linear scaling. Used only with `slct_func = RLT`. (see also Roulette selection)
`crs_lat`	`equal`, `random`	`random`	How to mix lattice vectors (see also crossover > 5. Swap the sliced halves)
`nat_diff_tole`	int	4	Tolerance for difference in the number of atoms in crossover. (see also crossover > 6. Select the offspring with more atoms)
`ntimes`	int	1	Number of times in permutation.
`sigma_st`	float	0.5	Standard deviation for strain.
`maxcnt_ea`	int	50	Maximum number of trials in EA.
`maxgen_ea`	int	0	Maximum generation. If set to 0, no upper limit is applied.
`emax_ea`	float	`None`	Energy upper limit (eV/atom) for natural selection.
`emin_ea`	float	`None`	Energy lower limit (eV/atom) for natural selection.

EA-vc requires the following additional variables on top of the standard EA.
Note that in EA-vc, emax_ea and emin_ea are used not in natural selection but when computing the convex hull.
See also CrySPY > Search algorithms > EA-vc
See also CrySPY > Tutorial > EA-vc > Analysis and visualization for the convex hull plot.

Name	Value	Default	Description
`n_add`	int		Number of offspring created by addition.
`add_max`	int	3	(since version 1.4.1) Maximum number of atoms to add in addition
`n_elim`	int		Number of offspring created by elimination.
`elim_max`	int	3	(since version 1.4.1) Maximum number of atoms to eliminate in elimination
`n_subs`	int		Number of offspring created by substitution.
`subs_max`	int	3	(since version 1.4.1) Maximum number of atoms to substitute in substitution
`target`	str	`random`	Target. Only `random` for now.
`end_point`	(float, …, float)		Energy of end points for formation energy.
`emax_ea`	float	`None`	Energy upper limit (eV/atom) for computing the convex hull.
`emin_ea`	float	`None`	Energy lower limit (eV/atom) for computing the convex hull.
`cgen`	int	`None`	(since version 1.4.1) Which generation’s data to plot up to. If None, data will be plotted up to the latest generation.
`show_max`	float	0.2	When plotting the convex hull, the maximum value of the y-axis (for binary systems) or the maximum hull distance (for ternary systems) is set by show_max.
`label_stable`	bool	`True`	Whether to show stable compositions when plotting the convex hull.
`vmax`	float	0.2	Maximum value of the colorbar representing hull distance.
`bottom_margin`	float	0.02	Bottom margin of the y-axis for binary convex hull plot.
`fig_format`	str	`svg`	Figure format for convex hull plot: `svg`, `png`, or `pdf`.

[BO] section

2024 May 27th, updated

[BO] section is required only if you use BO (algo = BO)

Name	Value	Default	Description
`nselect_bo`	int		The number of structures to be selected at once.
`score`	`TS`, `EI`, `PI`		Acquisition function.
`num_rand_basis`	int	0	If `0`, Gaussian process. The number of basis function.
`cdev`	float	0.001	Cutoff of deviation for standardization.
`dscrpt`	`FP`		Structure descriptor.
`max_select_bo`	int	0	Maximum number of selection.
`manual_select_bo`	int [int …]	`None`	Structure IDs to be selected manually.
`emax_bo`	float	`None`	Upper limit of energy in BO.
`emin_bo`	float	`None`	Lower limit of energy in BO.

if dscrpt is FP

CrySPY 1.3.0 or later

fppath and fp_rmin are obsolete.

Name	Value	Default	Description
`fp_rmax`	float	8.0	Only used with `dscrpt = FP`. Maximum cutoff of r in fingerprint.
`fp_npoints`	int	20	Only used with `dscrpt = FP`. Number of discretized points for each pair in fingerprint.
`fp_sigma`	float	0.7	Only used with `dscrpt = FP`. Sigma parameter [Å] in Gaussian smearing function.

CrySPY 1.2.5 or earlier

Name	Value	Default	Description
`fppath`	str	`None`	Only used with `dscrpt = FP`. Path of cal_fingerprint. If `None`, `fppath` is automatically searched in your $PATH.
`fp_rmin`	float	0.5	Only used with `dscrpt = FP`. Minimum cutoff of r in fingerprint.
`fp_rmax`	float	5.0	Only used with `dscrpt = FP`. Maximum cutoff of r in fingerprint.
`fp_npoints`	int	20	Only used with `dscrpt = FP`. Number of discretized points for each pair in fingerprint.
`fp_sigma`	float	1.0	Only used with `dscrpt = FP`. Sigma parameter [Å] in Gaussian smearing function.

[LAQA] section

[LAQA] section is required only if you use LAQA (algo = LAQA)

Name	Value	Default	Description
`nselect_laqa`	int		The number of structures to be selected at once.
`wf`	float	0.1	Weight of the force term.
`ws`	float	10.0	Weight of the stress term.

Info

[option] section

Name	Value	Default	Description
`check_mindist_opt`	bool	`True`	If True, a mindist constraint is checked after structure relaxation.
`stop_chkpt`	int	0	CrySPY stops at a specified check point.
`load_struc_flag`	bool	`False`	If True, load initial structures from `./data/pkl_data/init_struc_data.pkl`.
`stop_next_struc`	bool	`False`	If True, CrySPY does not submit jobs for next structures, but jobs for next stage are submitted.
`recalc`	int [int …]	(empty list)	Specify structure IDs if you want to recalculate or continue optimization.
`append_struc_ea`	bool	`False`	If True, append structures by EA.
`energy_step_flag`	bool	`False`	If True, save energy_step_data in `./data/pkl_data/energy_step_data.pkl`.
`struc_step_flag`	bool	`False`	If True, save struc_step_data in `./data/pkl_data/struc_step_data.pkl`.
`force_step_flag`	bool	`False`	If True, save force_step_data in `./data/pkl_data/force_step_data.pkl`.
`stress_step_flag`	bool	`False`	If True, save stress_step_data in `./data/pkl_data/stress_step_data.pkl`.

Kpoint

2024 April 22

CrySPY automatically generates the k-point setting using the pymatgen.io.vasp.Kpoints.automatic_density_by_vol function from pymatgen. An example in cryspy.in with nstage = 2 is as follows:

[VASP]
kppvol = 40 120

stage 1: kppvol = 40
stage 2: kppvol = 120

kppvol means a grid density per Å ${}^{-3} $ of reciprocal cell.
VASP: Gamma-centered meshes are used for hexagonal cells and face-centered cells; otherwise, Monkhorst-Pack grids are employed.
QE and OMX: only a k-mesh is provided, no offset.
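To see how kppvol translates into a mesh, the sketch below computes the number of divisions from the cell edge lengths and volume. It assumes the rule that pymatgen is understood to apply (target number of k-points = kppvol × reciprocal-cell volume, then a length-scaled division count, rounded down, with a minimum of 1). The function `kmesh_from_kppvol` is hypothetical and does not decide between Gamma-centered and Monkhorst-Pack grids. Use pymatgen itself for authoritative results.

```python
import math


def kmesh_from_kppvol(kppvol, abc, volume):
    """Estimate the k-mesh for a cell with edge lengths abc (Å) and volume (Å^3)."""
    recip_volume = (2.0 * math.pi) ** 3 / volume   # reciprocal-cell volume (Å^-3)
    ngrid = kppvol * recip_volume                  # target number of k-points
    a, b, c = abc
    mult = (ngrid * a * b * c) ** (1.0 / 3.0)
    return [max(int(math.floor(mult / length)), 1) for length in abc]


# Tetragonal cell with a = b = 8.804 Å, c = 12.205 Å
abc = (8.804, 8.804, 12.205)
volume = abc[0] * abc[1] * abc[2]
for kppvol in (0, 40, 400):
    print(kppvol, kmesh_from_kppvol(kppvol, abc, volume))
```

Because the number of divisions scales inversely with the edge length, large cells need only a coarse mesh for the same kppvol.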

What is the appropriate value for kppvol?

Here are the guidelines. We use VESTA for visualizing crystal structures.

Primitive cell of diamond Si

a = b = c = 3.836 Å

kppvol	k-mesh
0	[1, 1, 1]
20	[4, 4, 4]
40	[6, 6, 6]
60	[7, 7, 7]
80	[7, 7, 7]
100	[8, 8, 8]
120	[9, 9, 9]
140	[9, 9, 9]
160	[9, 9, 9]
180	[10, 10, 10]
200	[10, 10, 10]
400	[13, 13, 13]
600	[15, 15, 15]
800	[17, 17, 17]

Conventional cell of diamond Si

a = b = c = 5.431 Å

kppvol	k-mesh
0	[1, 1, 1]
20	[3, 3, 3]
40	[3, 3, 3]
60	[4, 4, 4]
80	[4, 4, 4]
100	[5, 5, 5]
120	[5, 5, 5]
140	[6, 6, 6]
160	[6, 6, 6]
180	[6, 6, 6]
200	[6, 6, 6]
400	[8, 8, 8]
600	[9, 9, 9]
800	[10, 10, 10]

Nd2Fe14B

a = b = 8.804 Å
c = 12.205 Å

kppvol	k-mesh
0	[1, 1, 1]
20	[1, 1, 1]
40	[2, 2, 1]
60	[2, 2, 2]
80	[3, 3, 2]
100	[3, 3, 2]
120	[3, 3, 2]
140	[3, 3, 2]
160	[3, 3, 2]
180	[4, 4, 2]
200	[4, 4, 3]
400	[5, 5, 3]
600	[6, 6, 4]
800	[6, 6, 4]

Data format

All pickle data are stored in the ./data/pkl_data/ directory. A Jupyter notebook for analyzing the pkl data (pkl_data.ipynb) is also provided.

Common data

Initial and optimized structure data

Initial and optimized structure data are saved in init_struc_data.pkl and opt_struc_data.pkl, respectively. The pymatgen library is required to analyze these data files.

Data format

type: dict
- key: structure ID
- value: structure data
string form
- {0: Structure Summary …,
  1: Structure Summary …,
  …}
structure data format
- pymatgen.core.structure.Structure

How to access

import pickle
with open('init_struc_data.pkl', 'rb') as f:
   init_struc_data = pickle.load(f)
with open('opt_struc_data.pkl', 'rb') as f:
   opt_struc_data = pickle.load(f)

# init_struc_data[ID] and opt_struc_data[ID] are pymatgen Structure objects

# ---------- initial structure data of ID 0
cid = 0      # ID
init_struc_data[cid]    # to show initial structure of ID 0

Structure Summary
Lattice
    abc : 5.727301 5.727301 4.405757
 angles : 90.0 90.0 90.0
 volume : 144.5175386563631
      A : 5.727301 0.0 0.0
      B : 0.0 5.727301 0.0
      C : 0.0 0.0 4.405757
PeriodicSite: Si (0.2506, 5.4767, 1.1014) [0.0438, 0.9562, 0.2500]
PeriodicSite: Si (2.6130, 3.1143, 1.1014) [0.4562, 0.5438, 0.2500]
PeriodicSite: Si (3.1143, 0.2506, 1.1014) [0.5438, 0.0438, 0.2500]
PeriodicSite: Si (5.4767, 2.6130, 1.1014) [0.9562, 0.4562, 0.2500]
PeriodicSite: Si (5.4767, 0.2506, 3.3043) [0.9562, 0.0438, 0.7500]
PeriodicSite: Si (3.1143, 2.6130, 3.3043) [0.5438, 0.4562, 0.7500]
PeriodicSite: Si (2.6130, 5.4767, 3.3043) [0.4562, 0.9562, 0.7500]
PeriodicSite: Si (0.2506, 3.1143, 3.3043) [0.0438, 0.5438, 0.7500]

Result data

Common result data such as space groups and energies are saved in rslt_data.pkl. The pandas library is required to analyze this data file.

Data format

type: pandas.core.frame.DataFrame
- row label: structure ID
string form
- see below

How to access

import pickle
with open('rslt_data.pkl', 'rb') as f:
   rslt_data = pickle.load(f)


# ---------- sort by Energy
# top 5
rslt_data.sort_values(by=['E_eV_atom']).head(5)

   Spg_num Spg_sym  Spg_num_opt Spg_sym_opt  E_eV_atom  Magmom      Opt
1       98  I4_122           12        C2/m  -3.978441     NaN  not_yet
3       36  Cmc2_1           36      Cmc2_1  -3.520306     NaN  not_yet
2       16    P222           16        P222  -3.348616     NaN  not_yet
4       36  Cmc2_1            4        P2_1  -3.304168     NaN  not_yet
0      139  I4/mmm          139      I4/mmm  -3.000850     NaN     done

Random Search (RS)

Evolutionary algorithm (EA)

Bayesian Optimization (BO)

LAQA

Optional data

Energy step data

Energy step data is saved in energy_step_data.pkl if you set energy_step_flag = True in the [option] section of cryspy.in. The NumPy library is required to analyze this data file.

Warning

energy_step_flag = True is currently available only with VASP, QE, and soiap.

Info

In soiap, energy_step_data is collected only if loopa == 1, because the other data (struc, force, and stress) are output only when loopa == 1. See https://github.com/nbsato/soiap/blob/master/doc/instructions.md

Data format

type: dict
- key: structure ID
- value: list of energy step data in each stage
string form
- {0: [array([-3.4439912 , -3.55040935, -3.66697038, ..]), array([-4.0613393 , -4.05445631, -4.06159641, …]), …],
  1: [array([-2.68209823, -2.69012487, -2.68364907, ..]), array([-2.79140967, -2.79183827, -2.79206508, …]), …],
  …}
unit of energy
- eV/atom

How to access

import pickle
with open('energy_step_data.pkl', 'rb') as f:
    energy_step_data = pickle.load(f)

# energy_step_data[ID][stage][step]
# energy_step_data[ID][0] <-- stage 1
# energy_step_data[ID][1] <-- stage 2
#
# in LAQA
# energy_step_data[ID][selection][step]
# energy_step_data[ID][0] <-- 1st selection
# energy_step_data[ID][1] <-- 2nd selection

# ---------- energy step data of ID 3, stage 1
cid = 3      # ID
stage = 1    # stage
energy_step_data[cid][stage-1][:10]    # show only 10 energies in jupyter

array([-3.4439912 , -3.55040935, -3.66697038, -3.77192063, -3.84320717,
       -3.80679245, -3.84633935, -3.87374706, -3.89123193, -3.90422926])
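
The nested layout energy_step_data[ID][stage][step] can be reduced to one number per stage. The following is a minimal sketch, not part of CrySPY: the helper name final_energy_per_stage is hypothetical, and the demo dictionary is a truncated stand-in built from the first values shown in the string form above, using plain lists instead of NumPy arrays so that it runs without extra libraries.

```python
import os
import pickle
import tempfile


def final_energy_per_stage(energy_step_data):
    """Return {ID: [last energy of each stage]} in eV/atom (hypothetical helper)."""
    return {
        cid: [float(stage[-1]) for stage in stages if len(stage) > 0]
        for cid, stages in energy_step_data.items()
    }


# Truncated stand-in data: two IDs, two stages, three steps per stage
demo = {
    0: [[-3.4439912, -3.55040935, -3.66697038],
        [-4.0613393, -4.05445631, -4.06159641]],
    1: [[-2.68209823, -2.69012487, -2.68364907],
        [-2.79140967, -2.79183827, -2.79206508]],
}

# Round-trip through a pickle file, as one would do with energy_step_data.pkl
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'energy_step_data.pkl')
    with open(path, 'wb') as f:
        pickle.dump(demo, f)
    with open(path, 'rb') as f:
        energy_step_data = pickle.load(f)

last = final_energy_per_stage(energy_step_data)
print(last)
```

With a real energy_step_data.pkl, only the open/load part is needed; indexing with [-1] works the same way on NumPy arrays.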

Structure step data

Structure step data is saved in struc_step_data.pkl if you set struc_step_flag = True in the [option] section of cryspy.in. The pymatgen library is required to analyze this data file.

Warning

struc_step_flag = True is currently available only with VASP, QE, and soiap.

Info

struc_step_data includes initial structures. For example, struc_step_data[cid][0][0] is the initial structure of ID = cid.

Data format

type: dict
- key: structure ID
- value: list of structure step data in each stage
string form
- {0: [[Structure Summary …, Structure Summary, …], […], …],
  1: [[Structure Summary …, Structure Summary, …], […], …],
  …}
structure data format
- pymatgen.core.structure.Structure

How to access

import pickle
with open('struc_step_data.pkl', 'rb') as f:
    struc_step_data = pickle.load(f)

# struc_step_data[ID][stage][step]
# struc_step_data[ID][0] <-- stage 1
# struc_step_data[ID][1] <-- stage 2
#
#
# in LAQA
# struc_step_data[ID][selection][step]
# struc_step_data[ID][0] <-- 1st selection
# struc_step_data[ID][1] <-- 2nd selection

# ---------- structure step data of ID 0, stage 1, step 0
cid = 0      # ID
stage = 1    # stage
step = 0     # step index (start from 0)
struc_step_data[cid][stage-1][step]    # to show initial structure of ID 0 at stage 1 in jupyter

Structure Summary
Lattice
    abc : 5.727301 5.727301 4.405757
 angles : 90.0 90.0 90.0
 volume : 144.5175386563631
      A : 5.727301 0.0 0.0
      B : 0.0 5.727301 0.0
      C : 0.0 0.0 4.405757
PeriodicSite: Si (0.2506, 5.4767, 1.1014) [0.0438, 0.9562, 0.2500]
PeriodicSite: Si (2.6130, 3.1143, 1.1014) [0.4562, 0.5438, 0.2500]
PeriodicSite: Si (3.1143, 0.2506, 1.1014) [0.5438, 0.0438, 0.2500]
PeriodicSite: Si (5.4767, 2.6130, 1.1014) [0.9562, 0.4562, 0.2500]
PeriodicSite: Si (5.4767, 0.2506, 3.3043) [0.9562, 0.0438, 0.7500]
PeriodicSite: Si (3.1143, 2.6130, 3.3043) [0.5438, 0.4562, 0.7500]
PeriodicSite: Si (2.6130, 5.4767, 3.3043) [0.4562, 0.9562, 0.7500]
PeriodicSite: Si (0.2506, 3.1143, 3.3043) [0.0438, 0.5438, 0.7500]

Force step data

Force step data is saved in force_step_data.pkl if you set force_step_flag = True in the [option] section of cryspy.in. The NumPy library is required to analyze this data file.

Warning

force_step_flag = True is currently available only with VASP, QE, and soiap.

Data format

type: dict
- key: structure ID
- value: list of force step data in each stage
string form
- {0: [array([[ 0.26314927, -0.26314927, -0. ], […], …, […]]), array([[…], …, […]]), …],
  1: [array([[ 0. , 0. , 0. ], […], …, […]]), array([[…], …, […]]), …],
  …}
unit of force
- eV/Å

How to access

import pickle
with open('force_step_data.pkl', 'rb') as f:
    force_step_data = pickle.load(f)

# force_step_data[ID][stage][step][atom]
# force_step_data[ID][0] <-- stage 1
# force_step_data[ID][1] <-- stage 2
#
# in LAQA
# force_step_data[ID][selection][step][atom]
# force_step_data[ID][0] <-- 1st selection
# force_step_data[ID][1] <-- 2nd selection

# ---------- force step data of ID 0, stage 1
cid = 0      # ID
stage = 1    # stage
force_step_data[cid][stage-1][:3]    # to show only 3 steps in jupyter

[array([[ 0.26314927, -0.26314927, -0.        ],
        [-0.26314927,  0.26314927, -0.        ],
        [ 0.26314927,  0.26314927,  0.        ],
        [-0.26314927, -0.26314927, -0.        ],
        [-0.26314927,  0.26314927, -0.        ],
        [ 0.26314927, -0.26314927,  0.        ],
        [-0.26314927, -0.26314927, -0.        ],
        [ 0.26314927,  0.26314927,  0.        ]]),
 array([[-0.12103692,  0.12103692,  0.        ],
        [ 0.12103692, -0.12103692, -0.        ],
        [-0.12103692, -0.12103692, -0.        ],
        [ 0.12103692,  0.12103692,  0.        ],
        [ 0.12103692, -0.12103692, -0.        ],
        [-0.12103692,  0.12103692,  0.        ],
        [ 0.12103692,  0.12103692,  0.        ],
        [-0.12103692, -0.12103692, -0.        ]]),
 array([[-0.29801618,  0.29801618,  0.        ],
        [ 0.29801618, -0.29801618, -0.        ],
        [-0.29801618, -0.29801618, -0.        ],
        [ 0.29801618,  0.29801618,  0.        ],
        [ 0.29801618, -0.29801618, -0.        ],
        [-0.29801618,  0.29801618,  0.        ],
        [ 0.29801618,  0.29801618,  0.        ],
        [-0.29801618, -0.29801618, -0.        ]])]

step = 0     # step index (start from 0)
atom = 2     # atom index (start from 0)
force_step_data[cid][stage-1][step][atom]

array([0.26314927, 0.26314927, 0.        ])
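
A common use of force_step_data[ID][stage][step][atom] is to follow the largest atomic force along an optimization. The sketch below is an illustration only: max_force_per_step is a hypothetical helper, and the demo data keeps just two atoms from each of the first two steps printed above, written as plain lists.

```python
import math


def max_force_per_step(steps):
    """For each step, return the largest atomic force norm in eV/Å (hypothetical helper)."""
    return [
        max(math.sqrt(fx * fx + fy * fy + fz * fz) for fx, fy, fz in forces)
        for forces in steps
    ]


# Stand-in for force_step_data[cid][stage-1]: two steps, two atoms each
demo_steps = [
    [[0.26314927, -0.26314927, -0.0], [-0.26314927, 0.26314927, -0.0]],
    [[-0.12103692, 0.12103692, 0.0], [0.12103692, -0.12103692, -0.0]],
]

fmax = max_force_per_step(demo_steps)
print(fmax)
```

The same function accepts the NumPy arrays stored in force_step_data.pkl, because it only iterates over rows of three components.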

Stress step data

Stress step data is saved in stress_step_data.pkl if you set stress_step_flag = True in the [option] section of cryspy.in. The NumPy library is required to analyze this data file.

Warning

stress_step_flag = True is currently available only with VASP, QE, and soiap.

Data format

type: dict
- key: structure ID
- value: list of stress step data in each stage
string form
- {0: [array([[-0.16770062, 0. , 0. ], […], […]]), array([[…], […], […]]), …],
  1: [array([[ 0.39260083, -0. , -0. ], […], […]]), array([[…], […], […]]), …],
  …}
unit of stress
- eV/(Å**3)

How to access

import pickle
with open('stress_step_data.pkl', 'rb') as f:
    stress_step_data = pickle.load(f)

# stress_step_data[ID][stage][step] <-- 3x3 stress tensor
# stress_step_data[ID][0] <-- stage 1
# stress_step_data[ID][1] <-- stage 2
#
# in LAQA
# stress_step_data[ID][selection][step] <-- 3x3 stress tensor
# stress_step_data[ID][0] <-- 1st selection
# stress_step_data[ID][1] <-- 2nd selection

# ---------- stress step data of ID 0, stage 1
cid = 0      # ID
stage = 1    # stage
stress_step_data[cid][stage-1][:3]    # to show only 3 steps in jupyter

[array([[-0.16770062,  0.        ,  0.        ],
        [ 0.        , -0.16770062, -0.        ],
        [ 0.        ,  0.        ,  0.21823358]]),
 array([[-0.16020785, -0.        , -0.        ],
        [-0.        , -0.16020785,  0.        ],
        [-0.        ,  0.        ,  0.18646321]]),
 array([[-0.13572003, -0.        ,  0.        ],
        [-0.        , -0.13572003,  0.        ],
        [-0.        ,  0.        ,  0.15953926]])]
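
Because the stress is stored in eV/(Å**3), a unit conversion is often the first step of an analysis. The following sketch converts a 3x3 tensor to GPa and averages its diagonal. The helper names are hypothetical, the conversion factor follows from 1 eV = 1.602176634e-19 J, and no sign convention (tension versus compression) is assumed, since that depends on the calculator.

```python
EV_PER_A3_TO_GPA = 160.2176634  # 1 eV/Å^3 expressed in GPa


def to_gpa(stress):
    """Convert a 3x3 stress tensor from eV/Å^3 to GPa (hypothetical helper)."""
    return [[component * EV_PER_A3_TO_GPA for component in row] for row in stress]


def mean_diagonal(stress):
    """Average of the three diagonal components, in the units of the input."""
    return (stress[0][0] + stress[1][1] + stress[2][2]) / 3.0


# First tensor printed above, written as a plain nested list
first = [
    [-0.16770062, 0.0, 0.0],
    [0.0, -0.16770062, -0.0],
    [0.0, 0.0, 0.21823358],
]

first_gpa = to_gpa(first)
print(first_gpa[0][0], mean_diagonal(first_gpa))
```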

CrySPY utility

See Installation/CrySPY utility to download.

Examples

Various examples are available in CrySPY_utility/examples (GitHub).

Scripts

You can find useful scripts in CrySPY_utility/script (GitHub).

repeat_cryspy

Link: CrySPY_utility/script/repeat_cryspy

Running cryspy by hand over and over can be tedious; this script automates it.
By default, the script runs cryspy every 5 minutes. To change the time interval, edit the following line of the script.

    sleep 300    # seconds

Usage

Copy repeat_cryspy to the working directory.
(Optional) Edit the time interval in repeat_cryspy.
Run the script.

You can use the nohup command to keep the job running even after logging out.

[bash]

nohup ./repeat_cryspy &

[zsh]

nohup ./repeat_cryspy &!
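
For readers who prefer Python to a shell script, the run-then-sleep loop described above can be sketched as follows. This is not the repeat_cryspy script itself: the function repeat_command and its max_runs stopping condition are assumptions added for illustration, and the demo runs a harmless stand-in command instead of cryspy.

```python
import subprocess
import sys
import time


def repeat_command(cmd, interval=300, max_runs=None):
    """Run cmd, wait interval seconds, and repeat; return the number of runs.

    max_runs=None repeats indefinitely (hypothetical helper).
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        subprocess.run(cmd, check=False)
        runs += 1
        if max_runs is not None and runs >= max_runs:
            break
        time.sleep(interval)
    return runs


# Stand-in command so the sketch runs anywhere; in practice cmd would be ['cryspy']
n_runs = repeat_command([sys.executable, '-c', 'pass'], interval=0, max_runs=2)
print(n_runs)
```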

extract_struc.py

2024 April 16 update

Link: CrySPY_utility/script/extract_struc.py

Script to extract structures from init_struc_data.pkl or opt_struc_data.pkl. This script can print structure information and output cif files.

Structure ID(s) can be specified with the -i option. The top k structures (the k most stable structures) can be extracted with the -t option. The -a option outputs all the structures (note that many cif files will be output). Symmetrized cif files can be generated with the -s option; in that case, a tolerance can also be specified with --tolerance. Structure information is printed with the -p option, in which case cif files are not output. Gzipped files (e.g., opt_struc_data.pkl.gz) can also be read.

Update History

2024 April 16: --tolerance option, gzip
2023 July 21: --print option

Usage

python3 extract_struc.py -h

or if you put the script in your PATH, you can omit python3

extract_struc.py -h

usage: extract_struc.py [-h] [-p] [-a] [-i [INDEX ...]] [-t TOP] [-r] [-s] [--tolerance TOLERANCE] infile

positional arguments:
  infile                input file

options:
  -h, --help            show this help message and exit
  -p, --print           just print, e.g., extract_struc.py opt_struc_data.pkl -i 7 10 12 -ps
  -a, --all_id          all structures, e.g., extract_struc.py opt_struc_data.pkl -as
  -i [INDEX ...], --index [INDEX ...]
                        structure ID, e.g., extract_struc.py opt_struc_data.pkl -i 7 10 12 -s
  -t TOP, --top TOP     top k structures, e.g. (k = 3), extract_struc.py opt_struc_data.pkl -t 3 -s
  -r, --rank            add rank in file names, e.g., extract_struc.py opt_struc_data.pkl -t 3 -rs
  -s, --symmetrized     symmetrized structure, e.g., extract_struc.py opt_struc_data.pkl -i 7 10 12 -s
  --tolerance TOLERANCE
                        tolerance for symmetrization (default 0.01), e.g., extract_struc.py opt_struc_data.pkl -i 0 1 -s --tolerance 0.01

Examples

Print

The -p option can be used in combination with any option except for -s option.

extract_struc.py -p opt_struc_data.pkl -i 0 1

ID 0
Full Formula (Na8 Cl8)
Reduced Formula: NaCl
abc   :   6.823618   6.823618   7.566454
angles:  90.000000  90.000000  96.650518
pbc   :       True       True       True
Sites (16)
  #  SP           a         b         c
---  ----  --------  --------  --------
  0  Na    0         0         1
  1  Na    0         0         0.5
  2  Na    0.704707  0.295293  0.75
  3  Na    0.295293  0.704707  0.25
  4  Na    0.5       0         1
  5  Na    0.5       0         0.5
  6  Na    0         0.5       0.5
  7  Na    0         0.5       0
  8  Cl    0.5       0.5       0
  9  Cl    0.5       0.5       0.5
 10  Cl    0.484753  0.515247  0.75
 11  Cl    0.515247  0.484753  0.25
 12  Cl    0.828247  0.171753  0.851096
 13  Cl    0.171753  0.828247  0.351096
 14  Cl    0.828247  0.171753  0.648904
 15  Cl    0.171753  0.828247  0.148904

ID 1
Full Formula (Na8 Cl8)
Reduced Formula: NaCl
abc   :   8.145021   8.145021   4.324235
angles:  90.000000  90.000000 120.000000
pbc   :       True       True       True
Sites (16)
  #  SP            a          b         c
---  ----  ---------  ---------  --------
  0  Na     0.666667   0.333333  0.736206
  1  Na     0.666667   0.333333  0.263794
  2  Na     0.913147   0.086853  0.5
  3  Na     0.913147   0.826295  0.5
  4  Na     0.173705   0.086853  0.5
  5  Na     0.77711    0.22289   0
  6  Na     0.77711    0.55422   0
  7  Na     0.44578    0.22289   0
  8  Cl     0.027675   0.423376  0.5
  9  Cl    -0.423376  -0.395701  0.5
 10  Cl     0.395701  -0.027675  0.5
 11  Cl    -0.423376  -0.027675  0.5
 12  Cl     0.395701   0.423376  0.5
 13  Cl     0.027675  -0.395701  0.5
 14  Cl     0.333333   0.666667  0.5
 15  Cl     0          0         0

Structure ID

extract_struc.py opt_struc_data.pkl -i 7 10 12

7.cif, 10.cif, and 12.cif are output.

For symmetrized cif,

extract_struc.py opt_struc_data.pkl -i 7 10 12 -s

2024 April 16
With the tolerance parameter (default 0.01)

extract_struc.py opt_struc_data.pkl -i 7 10 12 -s --tolerance 0.01

Top k structures

Info

rslt_data.pkl is required in the same directory as the input.

Let us suppose

./data/pkl_data/opt_struc_data.pkl
./data/pkl_data/rslt_data.pkl

and that the cryspy_rslt_energy_asc file is as follows:

    Spg_num     Spg_sym  Spg_num_opt Spg_sym_opt    E_eV_atom  Magmom      Opt
9       110      I4_1cd          110      I4_1cd -1284.708037     NaN  not_yet
16        4        P2_1            4        P2_1 -1284.693651     NaN     done
97       92    P4_12_12           91      P4_122 -1284.692494     NaN     done
8        57        Pbcm           57        Pbcm -1284.668504     NaN     done
81       19  P2_12_12_1           19  P2_12_12_1 -1284.635684     NaN     done
...

Top k(=3) structures can be extracted with:

extract_struc.py ./data/pkl_data/opt_struc_data.pkl -t 3

In this example, rslt_data.pkl must be in ./data/pkl_data/. 9.cif, 16.cif, and 97.cif are output.

The rank can be included in cif file names with -r option:

extract_struc.py ./data/pkl_data/opt_struc_data.pkl -t 3 -r

1_9.cif, 2_16.cif, and 3_97.cif are output.

For symmetrized cif:

extract_struc.py ./data/pkl_data/opt_struc_data.pkl -t 3 -rs
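
The file names produced by -t and -r follow from sorting the structure IDs by energy. The sketch below reproduces that naming for the five energies listed above. It is an illustration under assumptions, not the code of extract_struc.py: top_k_cif_names is a hypothetical helper, and skipping NaN energies is a choice made here.

```python
def top_k_cif_names(energies, k, rank=False):
    """Return cif file names of the k lowest-energy IDs (hypothetical helper)."""
    # NaN is the only value not equal to itself, so this drops missing energies
    valid = {cid: e for cid, e in energies.items() if e == e}
    best = sorted(valid, key=valid.get)[:k]
    if rank:
        return ['{}_{}.cif'.format(i, cid) for i, cid in enumerate(best, start=1)]
    return ['{}.cif'.format(cid) for cid in best]


# E_eV_atom values from the listing above, keyed by structure ID
energies = {
    9: -1284.708037,
    16: -1284.693651,
    97: -1284.692494,
    8: -1284.668504,
    81: -1284.635684,
}

print(top_k_cif_names(energies, 3))
print(top_k_cif_names(energies, 3, rank=True))
```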

All the structures

Because many cif files are output, create a dedicated directory first.

mkdir init_cifs
cd init_cifs
extract_struc.py /path/to/init_struc_data.pkl -a

For symmetrized cif,

extract_struc.py /path/to/init_struc_data.pkl -as

Gzipped files

2024 April 16
Gzipped files (end with .gz) can be read:

extract_struc.py opt_struc_data.pkl.gz -i 0 1 -s
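
In your own analysis scripts, plain and gzipped pickle files can be read through one function by choosing the opener from the file extension. This is a minimal sketch using only the standard library; load_pkl is a hypothetical name, and the demo writes a small stand-in dictionary rather than real structure data.

```python
import gzip
import os
import pickle
import tempfile


def load_pkl(path):
    """Load a pickle file, transparently handling a .gz suffix (hypothetical helper)."""
    opener = gzip.open if str(path).endswith('.gz') else open
    with opener(path, 'rb') as f:
        return pickle.load(f)


# Stand-in data written both as .pkl and as .pkl.gz
demo = {0: 'structure 0', 1: 'structure 1'}

with tempfile.TemporaryDirectory() as tmp:
    plain_path = os.path.join(tmp, 'opt_struc_data.pkl')
    gz_path = plain_path + '.gz'
    with open(plain_path, 'wb') as f:
        pickle.dump(demo, f)
    with gzip.open(gz_path, 'wb') as f:
        pickle.dump(demo, f)
    plain = load_pkl(plain_path)
    zipped = load_pkl(gz_path)

print(plain == zipped)
```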

print_pkl.py

2024 May 31

When you want to quickly check the pickled files under data/pkl_data/, using print_pkl.py is convenient.

Usage

python3 print_pkl.py xxxx.pkl

or if you put the script in your PATH, you can omit python3

print_pkl.py xxxx.pkl

Example

print_pkl.py init_struc_data.pkl

Number of structures: 10
dict_keys([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

print_pkl.py input_data.pkl

[basic]
algo = RS
calc_code = ASE
tot_struc = 10
nstage = 1
njob = 5
jobcmd = zsh
jobfile = job_cryspy

[structure]
struc_mode = crystal
natot = 8
atype = ('Cu', 'Au')
nat = (4, 4)
mindist_factor = 1.0
vol_factor = 1.1
symprec = 0.01
spgnum = all
use_find_wy = False

[option]
stop_chkpt = 0
load_struc_flag = False
stop_next_struc = False
append_struc_ea = False
energy_step_flag = False
struc_step_flag = False
force_step_flag = False
stress_step_flag = False

[ASE]
kpt_flag = False
force_gamma = False
ase_python = ase_in.py

print_pkl.py elite_struc.pkl

Number of structures: 2
dict_keys([3, 6])

print_pkl.py elite_fitness.pkl

{3: -325.79973412221455, 6: -324.8381948581405}

pos2pkl.py

2023 July 23 update

Script to convert structure data into init_struc_data.pkl. The default input format is init_POSCARS. Single-structure files such as POSCAR and cif files can optionally be converted as well. The output is init_struc_data.pkl. Structures can also be appended to an existing init_struc_data.pkl. Structure IDs in the input are ignored and new IDs are assigned. If the number of atoms differs, an error is raised.

init_struc_data.pkl can be loaded at the start of the simulation in CrySPY.

You can remove and sort species with -f option. Note that without this option, pymatgen will sort the species in electronegativity order!

Usage

usage: pos2pkl.py [-h] [-s [SINGLE ...]] [-f [FILTER ...]] [-p] [infile ...]

positional arguments:
  infile                input file: init_POSCARS

options:
  -h, --help            show this help message and exit
  -s [SINGLE ...], --single [SINGLE ...]
                        input file: single structure file (POSCAR, cif)
  -f [FILTER ...], --filter [FILTER ...]
                        filter (sort): remove species and sort
  -p, --permit_diff_comp
                        flag for permitting different composition

Examples

init_POSCARS --> init_struc_data.pkl

This is useful for converting an init_POSCARS file generated by CrySPY to init_struc_data.pkl on another machine, such as a supercomputer. Multiple input files can be converted at once.

python3 pos2pkl.py init_POSCARS

If you put the pos2pkl.py in your PATH, you can omit python3.

pos2pkl.py init_POSCARS

Composition: Na8 Cl8

Converted. The number of structures: 4
Save init_struc_data.pkl

Multiple inputs:

python3 pos2pkl.py init_POSCARS init_POSCARS2 init_POSCARS3

Composition: Na8 Cl8

Converted. The number of structures: 12
Save init_struc_data.pkl

If init_struc_data.pkl already exists in the current directory and you want to append to it:

python3 pos2pkl.py init_POSCARS

init_struc_data.pkl already exists.
Append to init_struc_data.pkl? [y/n]: y

Load init_struc_data
Composition: Na8 Cl8
The number of structures: 12

Converted. The number of structures: 16
Save init_struc_data.pkl
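
The ID handling when appending can be pictured with a small dictionary example. The sketch below assumes that new IDs simply continue after the largest existing ID, which is consistent with the counts in the output above (12 existing plus 4 new gives 16) but is not taken from the source of pos2pkl.py. The helper name append_structures is hypothetical, and strings stand in for pymatgen Structure objects.

```python
def append_structures(existing, new_structures):
    """Return a new dict with new_structures appended under fresh IDs (hypothetical helper)."""
    merged = dict(existing)
    next_id = max(merged) + 1 if merged else 0
    for struc in new_structures:
        merged[next_id] = struc
        next_id += 1
    return merged


# 12 structures already stored, 4 structures read from a new init_POSCARS
existing = {i: 'old structure {}'.format(i) for i in range(12)}
new = ['new structure {}'.format(i) for i in range(4)]

merged = append_structures(existing, new)
print(len(merged), sorted(merged)[-4:])
```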

POSCAR or cif --> init_struc_data.pkl

Single-structure files such as POSCAR and cif files can also be converted. The -s/--single option is required.

python3 pos2pkl.py -s POSCAR test.cif

Composition: Na8 Cl8

Converted. The number of structures: 2
Save init_struc_data.pkl

init_POSCARS, POSCAR --> init_struc_data.pkl

python3 pos2pkl.py init_POSCARS -s POSCAR

Composition: Na8 Cl8

Converted. The number of structures: 5
Save init_struc_data.pkl

Warning

The following is wrong: every file given after -s, including init_POSCARS, is treated as a single-structure file.

python3 pos2pkl.py -s POSCAR init_POSCARS

Filter (remove and sort)

Here we consider a cif file with the composition Sr8 Co8 O20 X4, which includes 4 dummy atoms (X4). The -f/--filter option can be used to remove and sort species. Specify the same species as atype in cryspy.in.

python3 pos2pkl.py -s Sr8Co8O20X4.cif -f Sr Co O

Removed species: {'X0+'}
Composition: Sr8 Co8 O20

Converted. The number of structures: 1
Save init_struc_data.pkl

With extract_struc.py you can see how it was registered in init_struc_data.pkl.

python3 extract_struc.py init_struc_data.pkl -pa

ID 0
Full Formula (Sr8 Co8 O20)
Reduced Formula: Sr2Co2O5
...

The -f option also lets you sort the species.

python3 pos2pkl.py -s Sr8Co8O20X4.cif -f O Co

Removed species: {'Sr', 'X0+'}
Composition: O20 Co8

Converted. The number of structures: 1
Save init_struc_data.pkl
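
The effect of the filter on a composition can be summarized as "keep only the listed species, in the listed order". The following sketch works on a plain dictionary of species counts and is only an illustration: filter_species is a hypothetical helper, whereas pos2pkl.py operates on pymatgen structures.

```python
def filter_species(composition, keep):
    """Return (ordered list of kept (species, count) pairs, set of removed species)."""
    removed = set(composition) - set(keep)
    kept = [(species, composition[species]) for species in keep if species in composition]
    return kept, removed


# Composition of the cif file used above; 'X0+' is the dummy species
composition = {'Sr': 8, 'Co': 8, 'O': 20, 'X0+': 4}

kept_a, removed_a = filter_species(composition, ['Sr', 'Co', 'O'])
kept_b, removed_b = filter_species(composition, ['O', 'Co'])
print(kept_a, removed_a)
print(kept_b, removed_b)
```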

kpt_check.py

kpt_check.py checks the k-point mesh produced by a given kppvol. This script supports POSCAR, CONTCAR, and init_struc_data.pkl. The pymatgen library is required.

After generating initial structures, you can use it to find a suitable value of kppvol.

Usage

python3 kpt_check.py -h

or if you put the script in your PATH, you can omit python3

kpt_check.py -h

usage: kpt_check.py [-h] [-w] [-n NSTRUC] infile kppvol

positional arguments:
  infile                input file: POSCAR, CONTCAR, or init_struc_data.pkl
  kppvol                kppvol

options:
  -h, --help            show this help message and exit
  -w, --write           write KPOINTS
  -n NSTRUC, --nstruc NSTRUC
                        number of structure to check

Example

POSCAR with a given kppvol

kpt_check.py POSCAR 100

a = 10.689217
b = 10.689217
c = 10.730846
    Lattice vector
10.689217 0.000000 0.000000
0.000000 10.689217 0.000000
0.000000 0.000000 10.730846

kppvol:  100
k-points:  [2, 2, 2]
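
To see where a mesh such as [2, 2, 2] comes from, the relation between kppvol and the number of divisions can be sketched without pymatgen. The formula below is an assumption based on how pymatgen's density-by-volume k-point generation is understood to work (grid points per reciprocal volume, including the factor 2π, distributed in inverse proportion to the cell lengths and rounded down); kmesh_from_kppvol is a hypothetical name, and the real script should be preferred for production use.

```python
import math


def kmesh_from_kppvol(lengths, volume, kppvol):
    """Estimate the k-mesh for cell lengths (Å), cell volume (Å^3), and kppvol."""
    rec_volume = (2.0 * math.pi) ** 3 / volume      # reciprocal cell volume in Å^-3
    ngrid = kppvol * rec_volume                     # target number of grid points
    mult = (ngrid * lengths[0] * lengths[1] * lengths[2]) ** (1.0 / 3.0)
    return [int(math.floor(max(mult / length, 1.0))) for length in lengths]


# Orthogonal cell from the POSCAR example above
a, b, c = 10.689217, 10.689217, 10.730846
print(kmesh_from_kppvol((a, b, c), a * b * c, 100))

# Conventional cell of diamond Si (a = 5.431 Å)
si = 5.431
print(kmesh_from_kppvol((si, si, si), si ** 3, 100))
```

For non-orthogonal cells the true cell volume must be passed; for a hexagonal cell, for example, it is a * a * sin(120°) * c.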

Write KPOINTS file

You can generate a KPOINTS file using -w option.

kpt_check.py -w POSCAR 100

$ cat KPOINTS
pymatgen 4.7.6+ generated KPOINTS with grid density = 607 / atom
0
Monkhorst
2 2 2

Check k-point meshes for init_struc_data.pkl

When checking k-point meshes for init_struc_data.pkl, the first five structures are checked by default. You can change the number of structures with the -n option.

kpt_check.py -n 3 init_struc_data.pkl 100

# ---------- 0th structure
a = 8.0343076893
b = 8.03430768936
c = 9.1723323373
    Lattice vector
8.034308 0.000000 0.000000
-4.017154 6.957915 0.000000
0.000000 0.000000 9.172332

kppvol:  100
k-points:  [3, 3, 3]


# ---------- 1th structure
a = 9.8451944096
b = 9.84519440959
c = 6.8764313585
    Lattice vector
9.845194 0.000000 0.000000
-4.922597 8.526188 0.000000
0.000000 0.000000 6.876431

kppvol:  100
k-points:  [3, 3, 4]


# ---------- 2th structure
a = 7.5760383679
b = 7.57603836797
c = 6.6507478296
    Lattice vector
7.576038 0.000000 0.000000
-3.788019 6.561042 0.000000
0.000000 0.000000 6.650748

kppvol:  100
k-points:  [4, 4, 4]

FAQ

Can I change njob in the middle of the simulation?

2024 May 7

Yes, you can change njob at any time.

Below is an example of how the behavior changes when you reduce njob.

Warning

CrySPY version 1.2.3 and earlier have a bug in this case, so avoid reducing njob with those versions.

Suppose that njob = 4 and that jobs for the structures with IDs 0, 1, 2, and 3 are currently running.
Let’s say we change njob from 4 to 2.

$ cryspy
[2024-04-28 18:27:41,847][cryspy_restart][INFO] 


Restart CrySPY 1.2.4


[2024-04-28 18:27:41,848][read_input][INFO] Changed njob from 4 to 2
[2024-04-28 18:27:42,335][ctrl_job][INFO] # ---------- job status
[2024-04-28 18:27:42,335][ctrl_job][INFO] ID      0: still queueing or running
[2024-04-28 18:27:42,335][ctrl_job][INFO] ID      1: still queueing or running

Because njob is now 2, CrySPY checks IDs 0 and 1 and ignores IDs 2 and 3.

$ cryspy
[2024-04-28 18:29:25,250][cryspy_restart][INFO] 


Restart CrySPY 1.2.4


[2024-04-28 18:29:25,744][ctrl_job][INFO] # ---------- job status
[2024-04-28 18:29:25,744][ctrl_job][INFO] ID      0: Stage 1 Done!
[2024-04-28 18:29:25,757][ctrl_job][INFO]     submitted job, ID      0 Stage 2
[2024-04-28 18:29:25,758][ctrl_job][INFO] ID      1: Stage 1 Done!
[2024-04-28 18:29:25,767][ctrl_job][INFO]     submitted job, ID      1 Stage 2

Once the jobs for IDs 0 and 1 are finished, CrySPY proceeds to check the next two jobs (IDs 2 and 3).

$ cryspy
[2024-04-28 18:31:30,830][cryspy_restart][INFO]


Restart CrySPY 1.2.4


[2024-04-28 18:31:31,329][ctrl_job][INFO] # ---------- job status
[2024-04-28 18:31:31,329][ctrl_job][INFO] ID      0: Stage 2 Done!
[2024-04-28 18:31:31,329][collect_vasp][WARNING]     Structure ID 0, could not obtain energy from OSZICAR
[2024-04-28 18:31:31,333][ctrl_job][INFO]     collect results: E = nan eV/atom
[2024-04-28 18:31:31,341][ctrl_job][INFO] ID      1: Stage 2 Done!
[2024-04-28 18:31:31,341][collect_vasp][WARNING]     Structure ID 1, could not obtain energy from OSZICAR
[2024-04-28 18:31:31,342][ctrl_job][INFO]     collect results: E = nan eV/atom
[2024-04-28 18:31:31,347][cryspy][INFO] 

recheck 1

[2024-04-28 18:31:31,347][ctrl_job][INFO] # ---------- job status
[2024-04-28 18:31:31,347][ctrl_job][INFO] ID      2: Stage 1 Done!
[2024-04-28 18:31:31,358][ctrl_job][INFO]     submitted job, ID      2 Stage 2
[2024-04-28 18:31:31,358][ctrl_job][INFO] ID      3: Stage 1 Done!
[2024-04-28 18:31:31,368][ctrl_job][INFO]     submitted job, ID      3 Stage 2
