Document

CrySPY (pronounced as crispy) is a crystal structure prediction tool written in Python.
CrySPY automates the following:

  • Structure generation
  • Submitting jobs for structure optimization
  • Collecting data for structure optimization
  • Selecting candidates using machine learning

CrySPY can be installed with pip install csp-cryspy.

Latest version

CrySPY 1.2.3 (2023 October 21)

News

Discussions

Discussions in GitHub (questions and comments)

License

CrySPY is distributed under the MIT License
Copyright (c) 2018 CrySPY Development Team

Code contributors

  • Tomoki Yamashita and Lab members (Nagaoka University of Technology)
  • Nobuya Sato (Tokyo Institute of Technology)
  • Hiori Kino (National Institute for Materials Science)
  • Kei Terayama (Yokohama City University)
  • Hikaru Sawahata (Kanazawa University)
  • Shinichi Kanehira (Osaka University)

Reference

  • CrySPY(software)
    • T. Yamashita, S. Kanehira, N. Sato, H. Kino, H. Sawahata, T. Sato, F. Utsuno, K. Tsuda, T. Miyake, and T. Oguchi,
      “CrySPY: a crystal structure prediction tool accelerated by machine learning”,
      Sci. Technol. Adv. Mater. Meth. 1, 87 (2021). Link
  • Bayesian optimization
    • T. Yamashita, N. Sato, H. Kino, T. Miyake, K. Tsuda, and T. Oguchi,
      “Crystal structure prediction accelerated by Bayesian optimization”,
      Phys. Rev. Mater. 2, 013803 (2018). Link
    • N. Sato, T. Yamashita, T. Oguchi, K. Hukushima, and T. Miyake,
      “Adjusting the descriptor for a crystal structure search using Bayesian optimization”,
      Phys. Rev. Mater. 4, 033801 (2020). Link
  • Bayesian optimization and evolutionary algorithm
    • T. Yamashita, H. Kino, K. Tsuda, T. Miyake, and T. Oguchi,
      “Hybrid algorithm of Bayesian optimization and evolutionary algorithm in crystal structure prediction”,
      Sci. Technol. Adv. Mater. Meth. 2, 67 (2022). Link
  • LAQA
    • K. Terayama, T. Yamashita, T. Oguchi, and K. Tsuda,
      “Fine-grained optimization method for crystal structure prediction”,
      npj Comput. Mater. 4, 32 (2018). Link
    • T. Yamashita and H. Sekine,
      “Improvement of look ahead based on quadratic approximation for crystal structure prediction”,
      Sci. Technol. Adv. Mater. Meth. 2, 84 (2022). Link

Subsections of About CrySPY

Crystal structure prediction

[Figure: fig_csp]

Input

  • Elements
  • The number of atoms

Output

  • Stable structure (global minimum)

Searching algorithms

The following searching algorithms are available in CrySPY:

  • Random Search (RS)
  • Evolutionary Algorithm (EA)
  • Bayesian Optimization (BO)
  • Look Ahead based on Quadratic Approximation (LAQA)

In a nutshell

Random Search (RS)

Random.

Evolutionary Algorithm (EA)

EA for crystal structure prediction has been developed by Oganov's group.
We also employ EA in CrySPY, and support the following:

  • Selection methods
    • Tournament selection
    • Roulette selection
    • Elite selection
  • Evolutionary operations
    • Crossover
    • Permutation
    • Strain
  • etc.
    • Survival of the fittest
    • Dedupe structures in survival of the fittest
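
The three selection methods listed above can be illustrated with a small, self-contained sketch. This is not CrySPY's implementation: the function names, the energy values, and the way roulette weights are derived from energies are all assumptions made for illustration. Lower energy is treated as better.

```python
import random

def tournament_select(energies, t_size, rng):
    """Pick the lowest-energy ID among t_size randomly drawn candidates."""
    contestants = rng.sample(sorted(energies), t_size)
    return min(contestants, key=energies.get)

def roulette_select(energies, rng):
    """Pick an ID at random, with larger weight for lower energy."""
    e_max = max(energies.values())
    e_min = min(energies.values())
    span = (e_max - e_min) or 1.0
    ids = sorted(energies)
    # The small floor keeps the highest-energy candidate selectable.
    weights = [(e_max - energies[i]) / span + 1e-3 for i in ids]
    return rng.choices(ids, weights=weights, k=1)[0]

def elite_select(energies, n_elite):
    """Return the n_elite lowest-energy IDs, best first."""
    return sorted(energies, key=energies.get)[:n_elite]

# Hypothetical energies (eV/atom) keyed by structure ID.
energies = {0: -3.51, 1: -3.72, 2: -3.40, 3: -3.69, 4: -3.55}
rng = random.Random(0)
parent_a = tournament_select(energies, t_size=3, rng=rng)
parent_b = roulette_select(energies, rng=rng)
elites = elite_select(energies, n_elite=2)
```

Tournament selection only compares candidates within a small random subset, so it needs no fitness scaling, whereas roulette selection depends on how energies are mapped to weights.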

Bayesian Optimization (BO)

One of the selection-type algorithms.

[Figure: fig_BO]

Look Ahead based on Quadratic Approximation (LAQA)

One of the selection-type algorithms.

[Figure: fig_LAQA]

[Figure: fig_LAQA]

Interface

CrySPY is interfaced with several structure optimizers:

At least one optimizer is required.

Logo

PNG (transparent background)

[Image: logo_png1]

[Image: logo_png2]

[Image: logo_png3]

[Image: logo_png4]

JPG

[Image: logo_jpg1]

[Image: logo_jpg2]

[Image: logo_jpg3]

[Image: logo_jpg4]

Subsections of Version information

Version 1.2.2

Enthalpy

You can use enthalpy instead of energy for VASP and QE.

See also

Version 1.2.1

ASE interface

Bug fixed for multiple stages.

Version 1.1.1

Bug fix for spg_error

In random structure generation, when a structure could not be generated for a certain space group, the space group number was recorded in the variable spg_error and skipped thereafter. However, a bug was found in which the number was registered incorrectly in rare cases. Therefore, the spg_error function has been removed.

Version 1.1.0

Parallelization with MPI

Random structure generation using MPI is now available.

See also

LAQA

Updated the score formula to take the stress term into account (T. Yamashita and H. Sekine, Sci. Technol. Adv. Mater. Meth. 2, 84 (2022)).

See also

Backup

Files are copied to a directory named with the date and time inside the “backup” directory.
See features/backup for details.
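
As a rough illustration of a timestamped backup, the sketch below copies files into a directory named with the date and time under backup/. It is not CrySPY's code: the function name backup_files, the timestamp format, and the choice of which files to copy are assumptions.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path

def backup_files(workdir, names, now=None):
    """Copy the given files or directories into backup/<timestamp>/ and return that path."""
    now = now or datetime.now()
    # The timestamp format is a guess; CrySPY may name the directory differently.
    dest = Path(workdir) / "backup" / now.strftime("%Y_%m_%d_%H_%M_%S")
    dest.mkdir(parents=True, exist_ok=False)
    for name in names:
        src = Path(workdir) / name
        if src.is_dir():
            shutil.copytree(src, dest / name)
        elif src.is_file():
            shutil.copy2(src, dest / name)
    return dest

# Demonstration in a throwaway directory with a fixed timestamp.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "cryspy.in").write_text("[basic]\nalgo = RS\n")
    backup_dir = backup_files(tmp, ["cryspy.in"], now=datetime(2023, 10, 21, 9, 30, 0))
    backup_name = backup_dir.name
    backed_up = sorted(p.name for p in backup_dir.iterdir())
```

Using exist_ok=False makes the sketch fail loudly rather than overwrite an earlier backup made within the same second.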

Version 1.0.0

Install and run

CrySPY is now available on PyPI. You can install it with:

pip install csp-cryspy

The executable script, cryspy, is automatically installed in your PATH. To run CrySPY, just type cryspy:

cryspy &

CrySPY now stops once before going to the next selection (BO, LAQA) or the next generation (EA). For example, in the EA case:

[old version]

  • cryspy run
    • check jobs (finish current generation?)
    • structure generation by EA automatically starts

[CrySPY 1.0.0]

  • cryspy run
    • check jobs (finish current generation?)
    • stop
  • cryspy run
    • auto backup
    • structure generation by EA automatically starts

Auto and manual backup

Automatic backups are made at the following points:

  • before going to the next selection or next generation
  • structure generation

To manually back up:

cryspy -b

See features/backup for details.

Clean

cryspy -c

See features/clean for details.

Directory tree

Changed the directory tree.

  • genstruc/RS –> RS/
  • genstruc/EA –> EA/
  • genstruc/struc_util.py –> util/
  • utility.py –> util/

IO

  • Fixed standard output file and standard error file: log_cryspy and err_cryspy
  • cryspy.out is obsolete

Moved to CrySPY Utility

With the change in installation method, examples and cal_fingerprint have been moved to the CrySPY Utility.

COMBO

The Python library COMBO is now optional in CrySPY. If you do not use Bayesian optimization, you do not need to install it.

New calc_code

cryspy.in

fppath

New input variable for cal_fingerprint. See Installation/cryspy/cryspy_1.0

fwpath

New input variable for find_wy. See Installation/requirements/find_wy

mindist

  • mindist can be omitted in cryspy.in
  • mindist_ea is obsolete
  • added mindist_mol_bs and mindist_mol_bs_factor in cryspy.in

Version 0.10.3 or earlier

  • [2022 May 17] version 0.10.3 released
    • Bug fixed: LAMMPS IO.
  • [2022 January 24] version 0.10.2 released
    • Added nrot: maximum number of times to rotate molecules in mol_bs
  • [2021 September 30] version 0.10.1 released
    • Fixed the problem of numpy.random.seed in multiprocessing
  • [2021 July 25] version 0.10.0 released
    • Support PyXtal 0.2.9 or later
    • LAQA can be used with QE
    • Upper and lower limits of energy for EA and BO
  • [2021 July 13] paper published
    • Our paper on CrySPY software has been published in STAM:Methods
  • [2021 March 18] version 0.9.2 released
    • Support pymatgen v2022.
  • [2021 February 7] version 0.9.0 released
    • Interfaced with OpenMX
    • Employ PyXtal library to generate initial structures
    • If you use PyXtal (default), find_wy program is not required
    • LAQA can be used with soiap
    • Change the name: [lattice] section –> [structure] section
    • Several input variables move to [structure] section
      • natot: [basic] –> [structure]
      • atype: [basic] –> [structure]
      • nat: [basic] –> [structure]
      • maxcnt: [option] –> [structure]
      • symprec: [option] –> [structure]
      • spgnum: [option] –> [structure]
    • New features
      • Molecular crystal structure generation
      • Scale volume
  • [2020 March 19] paper published
  • [2020 February 16] version 0.8.0 released
    • Migrate to Python 3
    • CrySPY logo created
    • Change several variable names and data formats
    • Change style of output for energy: eV/cell –> eV/atom
    • IDs of working directories correspond to structure IDs
    • New features
      • recalculation
      • manual select in BO
  • [2018 December 5] version 0.7.0 released
    • New features
      • Evolutionary algorithm
  • [2018 August 20] version 0.6.4 released
  • [2018 July 2] version 0.6.3 released
  • [2018 June 26] Version 0.6.2 released
  • [2018 March 1] Version 0.6.1 released
  • [2018 January 9] paper published

Subsections of Installation

Subsections of System requirements

Python

Python

CrySPY 1.1.0 or later

  • Python >= 3.8
    • PyXtal (>= 0.5.3)
    • (optional) mpi4py
    • (optional, required if algo is BO) COMBO

If you install cryspy with pip, necessary libraries such as PyXtal will be installed automatically. Go to Installation > CrySPY. Manual installation of COMBO is required when using Bayesian optimization.
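
The requirements above can be checked from Python before a run. The following is a minimal sketch, not part of CrySPY: the function name check_environment is hypothetical, and the import names pyxtal, combo, and mpi4py are assumed to match the installed packages.

```python
import sys
from importlib.util import find_spec

def check_environment(algo="RS", use_mpi=False):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < (3, 8):
        problems.append("Python >= 3.8 is required")
    required = ["pyxtal"]
    if algo == "BO":
        required.append("combo")   # import name assumed for COMBO
    if use_mpi:
        required.append("mpi4py")  # optional, only for MPI runs
    for module in required:
        if find_spec(module) is None:
            problems.append(f"missing Python module: {module}")
    return problems

problems = check_environment(algo="BO")
for message in problems:
    print(message)
```

The check only tests whether the modules can be found; it does not verify their versions, such as PyXtal >= 0.5.3.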

CrySPY 1.0.0

  • Python >= 3.8
    • PyXtal (>= 0.5.3)
    • (optional, required if algo is BO) COMBO
Info

[2023 April 22] A way to install PyXtal (pyshtools) on arm64 MacOS has been worked out. See Arm64 on MacOS (without Rosetta 2)
[2023 March 15] On MacOS, it is difficult to install PyXtal in the arm64 environment, so using the x86_64 environment with Rosetta 2 is recommended.

CrySPY 0.10.0 – 0.10.3

Tested with Homebrew Python 3.8.x and 3.9.x on Mac and Python 3.8.x on Linux.

CrySPY 0.9.2

Tested with Homebrew Python 3.8.x and 3.9.x on Mac and Python 3.8.x on Linux.

Info

[2021 July 15] If you use PyXtal >= 0.2.9, update CrySPY to the version 0.10.0 or later.

Info

[2021 March 18] There is a breaking change in pymatgen 2022.x.x. CrySPY 0.9.2 and PyXtal 0.2.2 support this change in pymatgen.

Info

[2021 Feb. 5] PyXtal depends on numba, but numba does not support Python 3.9. So you should use Python 3.8.x for a while.
[2021 March 18] Currently numba supports Python 3.9.x.

Info

[2021 Feb. 7] PyXtal requires SciPy, but the latest version of SciPy (v1.6.0) might include a bug for deepcopy. You should use SciPy v1.5.4 for a while.
[2021 March 18] This bug has been fixed in SciPy v1.6.1.

CrySPY 0.9.0 – 0.9.1

CrySPY 0.8.0 or earlier

See the old documentation, which is included in CrySPY itself.

Structure optimizer

Structure optimizer

At least one optimizer is required.

find_wy (optional)

CrySPY originally used find_wy to generate a random structure for a given space group (symmetry). Since version 0.9.0, however, CrySPY uses the PyXtal library for structure generation by default. In CrySPY 0.9.0 or later, you can skip installing find_wy, although you may still use it. For CrySPY 0.8.x or earlier, find_wy is required to generate a random structure for a given space group.

Info

You can skip installing find_wy in CrySPY 0.9.0 or later.

Installation of find_wy

m_tspace

First, you need to compile m_tspace for find_wy. Check these sites for how to compile it.

Download the source code of m_tspace in an arbitrary directory. For example:

$ mkdir -p ~/local
$ cd ~/local
$ git clone https://github.com/nim-hrkn/m_tspace.git

Two additional files are required to compile m_tspace. Download the following files into ~/local/m_tspace from TSPACE:

$ cd m_tspace
$ wget http://phoenix.mp.es.osaka-u.ac.jp/~tspace/tspace_main/tsp07/tsp98.f
$ wget http://phoenix.mp.es.osaka-u.ac.jp/~tspace/tspace_main/tsp07/prmtsp.f

Edit the makefile and run the make command. If you use ifort, it is better to remove the -check all option and use the -O2 option.

$ emacs makefile
$ head -n 4 makefile
#FC=gfortran
#FFLAGS=-g -cpp -DUSE_GEN -ffixed-line-length-255
FC=ifort
FFLAGS=-O2 -g -traceback -cpp -DUSE_GEN -132
$ make

If you use gfortran, you might encounter the following error:

tsp98.f:9839:32:

       CALL SUBGRP(MG,JG,MGT,JGT,NTAB,IND)
                                1
Error: Actual argument contains too few elements for dummy argument 'ntab' (12/48) at (1)
make: *** [tsp98.o] Error 1

In that case, edit line 9925 of tsp98.f as follows:

Before:

9913: C SUBROUTINE SUBGRP ====*====3====*====4====*====5====*====6====*====7
9914: C
9915: C    IF (JG(I),I=1,MG) IS A SUBGROUP OF (JGT(J),J=1,MGT) THEN
9916: C          TABLE (NTAB(I),I=1,MG) IS MADE HERE AND IND=0
9917: C    ELSE
9918: C          IND=-1
9919: C
9920: C                 1993/12/25
9921: C                   BY  S.TANAKA AND A. YANASE
9922: C---*----1----*----2----*----3----*----4----*----5----*----6----*----7
9923: C
9924:       SUBROUTINE SUBGRP(MG,JG,MGT,JGT,NTAB,IND)
9925:       DIMENSION NTAB(48),JG(48),JGT(48)

After:

9913: C SUBROUTINE SUBGRP ====*====3====*====4====*====5====*====6====*====7
9914: C
9915: C    IF (JG(I),I=1,MG) IS A SUBGROUP OF (JGT(J),J=1,MGT) THEN
9916: C          TABLE (NTAB(I),I=1,MG) IS MADE HERE AND IND=0
9917: C    ELSE
9918: C          IND=-1
9919: C
9920: C                 1993/12/25
9921: C                   BY  S.TANAKA AND A. YANASE
9922: C---*----1----*----2----*----3----*----4----*----5----*----6----*----7
9923: C
9924:       SUBROUTINE SUBGRP(MG,JG,MGT,JGT,NTAB,IND)
9925:       DIMENSION NTAB(12),JG(48),JGT(48)
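
If you prefer to script this edit rather than make it by hand, the change can be sketched in Python as below. This is only an illustration on an in-memory string: the function name patch_subgrp_dimension is hypothetical, and the sketch assumes that the DIMENSION statement immediately follows the SUBROUTINE SUBGRP line, as in the listing above.

```python
import re

def patch_subgrp_dimension(source):
    """Change NTAB(48) to NTAB(12) in the DIMENSION line right after SUBROUTINE SUBGRP."""
    lines = source.splitlines(keepends=True)
    for i, line in enumerate(lines[:-1]):
        # Fixed-form Fortran: the statement starts after leading blanks,
        # so comment lines beginning with "C" do not match.
        if re.match(r"\s+SUBROUTINE SUBGRP\(", line):
            nxt = lines[i + 1]
            if "DIMENSION" in nxt and "NTAB(48)" in nxt:
                lines[i + 1] = nxt.replace("NTAB(48)", "NTAB(12)", 1)
            break
    return "".join(lines)

before = (
    "C\n"
    "      SUBROUTINE SUBGRP(MG,JG,MGT,JGT,NTAB,IND)\n"
    "      DIMENSION NTAB(48),JG(48),JGT(48)\n"
)
after = patch_subgrp_dimension(before)
print(after)
```

Reading and writing tsp98.f itself is left out on purpose; keep a copy of the original file before applying any automated change.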

If the compilation succeeds, you get m_tsp.a.

find_wy

Check these sites to compile find_wy:

Download the source code of find_wy in an arbitrary directory. For example:

$ mkdir -p ~/local
$ cd ~/local
$ git clone https://github.com/nim-hrkn/find_wy.git

Edit make.inc and set the path to the m_tsp.a that you just prepared.

$ cd find_wy
$ emacs make.inc
$ head -n 4 make.inc
TSPPATH=~/local/m_tspace
#INCPATH = -I $(TSPPATH)
TSP=$(TSPPATH)/m_tsp.a

You can remove the -check all option and use the -O2 option. Then run the make command.

$ make

Once you have the find_wy executable, run the following command as a test:

$ ./find_wy input_sample/input_si4o8.txt

If there is no problem, the file POS_WY_SKEL_ALL.json is generated.

Executable file of find_wy

CrySPY 1.0.0 or later

Put the executable file of find_wy in your PATH. Or, specify the path of the executable file in cryspy.in as follows:

[structure]
use_find_wy = True
fwpath = /xxx/xxx/xxx/find_wy
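
The two options described above (an explicit fwpath, or an executable found on PATH) can be sketched as a simple lookup. This is an illustration, not CrySPY's own logic: resolve_executable is a hypothetical helper, and it assumes that cryspy.in can be read with Python's configparser.

```python
import configparser
import shutil

def resolve_executable(config_text, section, option, default_name):
    """Return the path given in cryspy.in if present, else the first match on PATH (or None)."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    if parser.has_option(section, option):
        return parser.get(section, option)
    return shutil.which(default_name)

example = """
[structure]
use_find_wy = True
fwpath = /xxx/xxx/xxx/find_wy
"""
fw = resolve_executable(example, "structure", "fwpath", "find_wy")
print(fw)
```

The same kind of lookup would apply to fppath in the [BO] section for cal_fingerprint.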

CrySPY 0.10.3 or earlier

When you use find_wy, put the executable file of find_wy in ~/CrySPY_root/CrySPY-x.x.x/CrySPY/find_wy/, so that the executable file path is ~/CrySPY_root/CrySPY-x.x.x/CrySPY/find_wy/find_wy.

Subsections of CrySPY

CrySPY 1.0.0 or later

CrySPY

pip

CrySPY 1.0.0 or later can be installed by pip.

pip install csp-cryspy

The executable script, cryspy, is automatically installed in your PATH. You can check this with:

which cryspy

Editable mode

If you want to change the source code of CrySPY, you can use pip’s editable mode (-e option).

git clone https://github.com/Tomoki-YAMASHITA/CrySPY.git
pip install -e ./CrySPY

Instead of git clone, you can download the compressed file from the release page.

cal_fingerprint (optional)

cal_fingerprint is a program that calculates structure descriptors and is required if algo is BO. Since CrySPY 1.0.0, the cal_fingerprint program has been included in CrySPY utility. See Installation/CrySPY_utility/Compile cal_fingerprint for compilation.

Put the executable file of cal_fingerprint in your PATH. Or, specify the path of the executable file in cryspy.in as follows:

[BO]
fppath = /xxx/xxx/xxx/cal_fingerprint

Arm64 on MacOS (without Rosetta 2)

Info

In PyXtal, starting from version 0.6.3, pyshtools is no longer mandatory. Therefore, you can ignore the information written below if you are using a recent version of PyXtal.

  1. Install miniforge3 (We do not know how to install pyshtools with Homebrew Python.)
  2. Install pymatgen and pyshtools with conda (recent versions of pyshtools are available in conda-forge)
conda install pymatgen
conda install pyshtools
  3. Install CrySPY
pip3 install csp-cryspy

CrySPY 0.10.3 or earlier

Installation of CrySPY is very simple. Just download it!

Download

You can put the source code of CrySPY in an arbitrary directory. For example, let us put the source code in ~/CrySPY_root/CrySPY-x.x.x (x.x.x means the version). Use git or download the compressed file.

Git

mkdir ~/CrySPY_root
cd ~/CrySPY_root
git clone https://github.com/Tomoki-YAMASHITA/CrySPY.git CrySPY-x.x.x

zip or tar.gz file

Download the source as a zip or tar.gz file from the GitHub release page.
Then place the source at a path such as ~/CrySPY_root/CrySPY-x.x.x.

Directory tree

Directory tree in ~/CrySPY_root/CrySPY-x.x.x/:

CrySPY-x.x.x
├── CHANGELOG.md
├── CrySPY/
│   ├── BO/
│   ├── EA/
│   ├── IO/
│   ├── LAQA/
│   ├── RS/
│   ├── __init__.py
│   ├── calc_dscrpt/
│   ├── f-fingerprint/
│   ├── find_wy/
│   ├── gen_struc/
│   ├── interface/
│   ├── job/
│   ├── start/
│   └── utility.py
├── LICENSE
├── README.md
├── cryspy.py
├── docs/
├── example/
└── utility/
Info

Main script is cryspy.py.

Setup (optional)

find_wy (optional)

When you use find_wy, put the executable file of find_wy in ~/CrySPY_root/CrySPY-x.x.x/CrySPY/find_wy/, so that the executable file path is ~/CrySPY_root/CrySPY-x.x.x/CrySPY/find_wy/find_wy.

cd ~/CrySPY_root/CrySPY-x.x.x/CrySPY/find_wy
cp ~/local/find_wy/find_wy .

Compile cal_fingerprint (optional)

When you use Bayesian optimization, compile the cal_fingerprint program, which calculates structure descriptors.

cd ~/CrySPY_root/CrySPY-x.x.x/CrySPY/f-fingerprint
emacs Makefile
make

Make sure that the executable file of cal_fingerprint exists in ~/CrySPY_root/CrySPY-x.x.x/CrySPY/f-fingerprint/.

CrySPY utility (optional)

Setting up a Python environment on your local PC is useful for analyzing CrySPY results. Utility tools (Jupyter notebooks and Python scripts) are available for analysis and visualization. Input examples are also included in CrySPY utility.

Info

You need several Python libraries such as

Download CrySPY utility

Git

$ git clone https://github.com/Tomoki-YAMASHITA/CrySPY_utility.git

zip

Go to CrySPY utility and click the green Code button, then choose Download ZIP.

Compile cal_fingerprint

When you use Bayesian optimization, compile the cal_fingerprint program, which calculates structure descriptors. A Fortran compiler is needed. Install it in the environment where CrySPY is used, such as a workstation or supercomputer.

cd CrySPY_utility/f-fingerprint
emacs Makefile
make

See also Installation/CrySPY.

Subsections of Tutorial

Random Search (RS)

Info

ASE is an easy starting point for beginners because it is installed automatically when you install CrySPY (csp-cryspy).

Preparation of input files

Follow any one of the examples and then go to “Running CrySPY” section.

Running CrySPY

  1. Check cryspy.in
  2. (version 0.10.3 or earlier) Script to run
  3. First run
  4. Submit job
  5. Check results
  6. Append structures
  7. Analysis and visualization

Loading external data

Only if calc_code == ext.

Subsections of Random Search (RS)

ASE in your local PC

2023 July 10

ASE provides interfaces to different codes. ASE also includes a pure-Python EMT calculator, which is suitable for testing CrySPY because its structure optimization is fast and easy.

In this tutorial, we use CrySPY on your local PC (Mac or Linux). The target system is Cu with 8 atoms.

Assumption

Here, we assume the following conditions:

  • CrySPY 1.2.0 or later in your local PC
  • CrySPY job filename: job_cryspy
  • ase input filename: ase_in.py

Input files

Move to your working directory, and copy the example files by one of the following methods.

cd ase_Cu8_RS
tree
.
├── calc_in
│   ├── ase_in.py_1
│   └── job_cryspy
└── cryspy.in

cryspy.in

cryspy.in is the input file of CrySPY.

[basic]
algo = RS
calc_code = ASE
tot_struc = 5
nstage = 1
njob = 2
jobcmd = zsh
jobfile = job_cryspy

[structure]
natot = 8
atype = Cu
nat = 8

[ASE]
ase_python = ase_in.py

[option]

In [basic] section, jobcmd = zsh can be changed to jobcmd = sh or jobcmd = bash in accordance with your environment. CrySPY runs zsh job_cryspy as a background job internally.

[ASE] section is required when you use ASE.

You can name the following files whatever you want:

  • jobfile: job_cryspy
  • ase_python: ase_in.py

The other input variables are discussed later.
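
Before the first run, it can help to sanity-check the [structure] section. The sketch below is not a CrySPY feature: it assumes that cryspy.in can be read with Python's configparser, and the two consistency rules (one nat entry per atype entry, natot equal to the sum of nat) are inferred from the examples in this tutorial rather than taken from a specification.

```python
import configparser

CRYSPY_IN = """
[basic]
algo = RS
calc_code = ASE
tot_struc = 5
nstage = 1
njob = 2
jobcmd = zsh
jobfile = job_cryspy

[structure]
natot = 8
atype = Cu
nat = 8

[ASE]
ase_python = ase_in.py

[option]
"""

def check_structure_section(text):
    """Return a list of inconsistencies found in the [structure] section."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    natot = parser.getint("structure", "natot")
    atype = parser.get("structure", "atype").split()
    nat = [int(n) for n in parser.get("structure", "nat").split()]
    errors = []
    if len(atype) != len(nat):
        errors.append("atype and nat must have the same number of entries")
    if sum(nat) != natot:
        errors.append("natot must equal the sum of nat")
    return errors

errors = check_structure_section(CRYSPY_IN)
print(errors)
```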

calc_in directory

The job file and input files for ASE are prepared in this directory.

Job file

The name of the job file must match the value of jobfile in cryspy.in. The example of job file (here, job_cryspy) is shown below.

#!/bin/sh

# ---------- ASE
python3 ase_in.py

# ---------- CrySPY
sed -i -e '3 s/^.*$/done/' stat_job

You can choose the name of the input file (ase_in.py), but it must match the value of ase_python in cryspy.in. In CrySPY, you must add sed -i -e '3 s/^.*$/done/' stat_job at the end of the job file.

Note

sed -i -e '3 s/^.*$/done/' stat_job is required at the end of the job file.
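
To make the effect of that sed command concrete: it replaces the entire third line of stat_job with the word done. The Python sketch below reproduces this on an in-memory string. The first two lines of the sample are placeholders, since the actual contents of stat_job are not described here.

```python
def mark_job_done(stat_job_text, status="done"):
    """Replace the whole third line with status, like: sed -e '3 s/^.*$/done/'."""
    lines = stat_job_text.splitlines(keepends=True)
    if len(lines) >= 3:
        # Preserve the original line ending of the third line, if any.
        ending = "\n" if lines[2].endswith("\n") else ""
        lines[2] = status + ending
    return "".join(lines)

# Placeholder contents; only the position of the third line matters.
sample = "placeholder line 1\nplaceholder line 2\nsubmitted\n"
updated = mark_job_done(sample)
print(updated)
```

If the file has fewer than three lines, nothing is changed, which matches the behavior of the sed expression.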

Tip

In the job file of CrySPY, the string CrySPY_ID is automatically replaced with the structure ID. When you use a job scheduler such as PBS and SLURM, it is useful to set the structure ID to the job name. For example, in the PBS system, #PBS -N Si_CrySPY_ID in ID 10 is replaced with #PBS -N Si_10. Note that starting with a number will result in an error. You should add a prefix like Si_.
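
The substitution described in this tip amounts to a plain string replacement. The following sketch shows the idea together with a check that no job name starts with a digit. The helper names are hypothetical and this is not CrySPY's code.

```python
def fill_structure_id(job_text, structure_id):
    """Replace every occurrence of the string CrySPY_ID with the structure ID."""
    return job_text.replace("CrySPY_ID", str(structure_id))

def pbs_job_names(job_text):
    """Return the names given on '#PBS -N' lines."""
    return [line.split()[-1] for line in job_text.splitlines()
            if line.startswith("#PBS -N")]

template = "#!/bin/sh\n#PBS -N Si_CrySPY_ID\n"
filled = fill_structure_id(template, 10)
names = pbs_job_names(filled)
# A name that begins with a digit would be rejected, hence the Si_ prefix.
names_ok = all(not name[0].isdigit() for name in names)
print(filled, names_ok)
```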

Input for ASE

Input files based on the number of stages (nstage in cryspy.in) are required. Name the input file(s) with a suffix _x. Here x means the stage number.

We are using nstage = 1 in this ASE tutorial, so we need only ase_in.py_1. ase_in.py_1 is listed below. Refer to the ASE documentation for details.

from ase.constraints import ExpCellFilter, StrainFilter
from ase.calculators.emt import EMT
from ase.calculators.lj import LennardJones
from ase.optimize.sciopt import SciPyFminCG
from ase.optimize import BFGS
from ase.spacegroup.symmetrize import FixSymmetry
import numpy as np
from ase.io import read, write

# ---------- input structure
# CrySPY outputs 'POSCAR' as an input file in work/xxxxxx directory
atoms = read('POSCAR', format='vasp')

# ---------- setting and run
atoms.calc = EMT()
atoms.set_constraint([FixSymmetry(atoms)])
atoms = ExpCellFilter(atoms, hydrostatic_strain=False)
opt = BFGS(atoms)
#opt=SciPyFminCG(atoms)
opt.run()

# ---------- opt. structure and energy
# [rule in ASE interface]
# output file for energy: 'log.tote' in eV/cell
#                         CrySPY reads the last line of 'log.tote'
# output file for structure: 'CONTCAR' in vasp format
e = atoms.atoms.get_total_energy()
with open('log.tote', mode='w') as f:
    f.write(str(e))

write('CONTCAR', atoms.atoms, format='vasp')

Unlike the VASP and QE inputs, the ASE input (a Python script) is flexible. CrySPY imposes only two rules:

  1. The energy is written in units of eV/cell to the log.tote file. CrySPY reads its last line.
  2. The optimized structure is written to the CONTCAR file in the VASP format.
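
To illustrate the first rule, the sketch below reads the last line of a log.tote file and converts the value from eV/cell to eV/atom. It is not CrySPY's reader: the function name is hypothetical, the energies are made up, and skipping blank trailing lines is an added assumption.

```python
import tempfile
from pathlib import Path

def read_last_energy(path):
    """Return the value on the last non-empty line of log.tote (eV/cell)."""
    lines = [ln for ln in Path(path).read_text().splitlines() if ln.strip()]
    return float(lines[-1])

with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "log.tote"
    log.write_text("-0.012\n-0.045\n")   # made-up energies in eV/cell
    e_cell = read_last_energy(log)
    e_atom = e_cell / 8                  # natot = 8 in this tutorial
print(e_cell, e_atom)
```

Because only the last line is read, a script may append intermediate energies to log.tote as long as the final value comes last.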

Running CrySPY

Go to Running CrySPY

soiap in your local PC

soiap (Structure Optimization with InterAtomic Potential) is suitable for testing CrySPY because its structure optimization is fast. See instructions to install soiap.

In this tutorial, we use CrySPY on your local PC (Mac or Linux). The target system is Si with 8 atoms.

Assumption

Here, we assume the following conditions:

  • (only version 0.10.3 or earlier) CrySPY main script: ~/CrySPY_root/CrySPY-0.9.0/cryspy.py
  • CrySPY job filename: job_cryspy
  • soiap executable file: ~/local/soiap-0.3.0/src/soiap
  • soiap input filename: soiap.in
  • soiap output filename: soiap.out
  • soiap input structure filename: initial.cif

Input files

Move to your working directory, and copy input example files by one of the following methods.

  • Download from cryspy_utility/examples/soiap_Si8_RS
  • Copy from CrySPY utility that you installed
  • (only version 0.10.3 or earlier) cp -r ~/CrySPY_root/CrySPY-0.9.0/example/v0.9.0/soiap_RS_Si8 .

cd soiap_RS_Si8
tree
.
├── calc_in
│   ├── job_cryspy
│   └── soiap.in_1
└── cryspy.in

cryspy.in

cryspy.in is the input file of CrySPY.

[basic]
algo = RS
calc_code = soiap
tot_struc = 5
nstage = 1
njob = 2
jobcmd = zsh
jobfile = job_cryspy

[structure]
natot = 8
atype = Si
nat = 8

[soiap]
soiap_infile = soiap.in
soiap_outfile = soiap.out
soiap_cif = initial.cif

[option]

In [basic] section, jobcmd = zsh can be changed to jobcmd = sh or jobcmd = bash in accordance with your environment. CrySPY runs zsh job_cryspy as a background job internally.

[soiap] section is required when you use soiap.

You can name the following files whatever you want:

  • jobfile
  • soiap_infile
  • soiap_outfile
  • soiap_cif

The other input variables are discussed later.

calc_in directory

The job file and input files for soiap are prepared in this directory.

Job file

The name of the job file must match the value of jobfile in cryspy.in. The example of job file (here, job_cryspy) is shown below.

#!/bin/sh

# ---------- soiap
EXEPATH=/path/to/soiap
$EXEPATH/soiap soiap.in 2>&1 > soiap.out

# ---------- CrySPY
sed -i -e '3 s/^.*$/done/' stat_job

Change /path/to/soiap to the correct path for your environment. You can choose the input (soiap.in) and output (soiap.out) file names, but they must match the values of soiap_infile and soiap_outfile in cryspy.in. The job file is written in the same way as the one you usually use, except for the last line. In CrySPY, you must add sed -i -e '3 s/^.*$/done/' stat_job at the end of the file.

Note

sed -i -e '3 s/^.*$/done/' stat_job is required at the end of the job file.

Tip

In the job file of CrySPY, the string “CrySPY_ID” is automatically replaced with the structure ID. When you use a job scheduler such as PBS and SLURM, it is useful to set the structure ID to the job name. For example, in the PBS system, #PBS -N Si_CrySPY_ID in ID 10 is replaced with #PBS -N Si_10. Note that starting with a number will result in an error. You should add a prefix like Si_.

Input for soiap

Input files based on the number of stages (nstage in cryspy.in) are required. Name the input file(s) with a suffix _x. Here x means the stage number.

We are using nstage = 1, so we need only soiap.in_1.

soiap.in_1 is listed below.

crystal initial.cif ! CIF file for the initial structure
symmetry 1 ! 0: not symmetrize displacements of the atoms or 1: symmetrize

md_mode_cell 3 ! cell-relaxation method
               ! 0: FIRE, 2: quenched MD, or 3: RFC5
number_max_relax_cell 100 ! max. number of the cell relaxation
number_max_relax 1 ! max. number of the atom relaxation
max_displacement 0.1 ! max. displacement of atoms in Bohr

external_stress_v 0.0 0.0 0.0 ! external pressure in GPa

th_force 5d-5 ! convergence threshold for the force in Hartree a.u.
th_stress 5d-7 ! convergence threshold for the stress in Hartree a.u.

force_field 1 ! force field
              ! 1: Stillinger-Weber for Si, 2: Tsuneyuki potential for SiO2,
              ! 3: ZRL for Si-O-N-H, 4: ADP for Nd-Fe-B, 5: Jmatgen, or
              ! 6: Lennard-Jones

The input structure file is specified at the first line. Use the same name as the value of soiap_cif in cryspy.in.

Running CrySPY

Go to Running CrySPY

VASP

2024 April 24

In this tutorial, we try to use CrySPY in a PC cluster with a job scheduler system such as PBS. Here we employ VASP. The target system is Na8Cl8, 16 atoms.

Assumption

Here, we assume the following conditions:

  • CrySPY 1.2.0 or later in your PC cluster
  • CrySPY job command: qsub
  • CrySPY job filename: job_cryspy
  • executable file, vasp_std in your PC cluster

Input files

Move to your working directory, and copy the example files by one of the following methods.

cd vasp_Na8Cl8_RS
tree
.
├── calc_in
│   ├── INCAR_1
│   ├── INCAR_2
│   ├── POTCAR
│   ├── POTCAR_is_dummy
│   └── job_cryspy
└── cryspy.in

cryspy.in

cryspy.in is the input file of CrySPY.

[basic]
algo = RS
calc_code = VASP
tot_struc = 5
nstage = 2
njob = 2
jobcmd = qsub
jobfile = job_cryspy

[structure]
natot = 16
atype = Na Cl
nat = 8 8
mindist_1 = 2.5 1.5
mindist_2 = 1.5 2.5

[VASP]
kppvol = 40 80

[option]

In [basic] section, jobcmd = qsub can be changed in accordance with your environment. CrySPY runs qsub job_cryspy as a background job internally in this setting. You can name the following file whatever you want:

  • jobfile

We adopt a stage-based system for structure optimization calculations. Here, we use nstage = 2. For example, users can configure the following settings. In the first stage, only the ionic positions are relaxed, fixing the cell shape, with low k-point grid density. Next, the ionic positions and cell shape are fully relaxed with high accuracy in the second stage.

[VASP] section is required when you use VASP. You have to specify k-point grid density (Å^-3) for each stage in kppvol.
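
Since kppvol needs one value per stage, a quick consistency check against nstage can catch mistakes early. The helper below is a hypothetical sketch, not part of CrySPY.

```python
def parse_kppvol(value, nstage):
    """Split a kppvol string into one float per stage, or raise ValueError."""
    kppvol = [float(v) for v in value.split()]
    if len(kppvol) != nstage:
        raise ValueError(f"kppvol has {len(kppvol)} value(s) but nstage = {nstage}")
    return kppvol

kppvol = parse_kppvol("40 80", nstage=2)
print(kppvol)
```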

Info

See Input file > Kpoint for details of kppvol

The other input variables are discussed later.

calc_in directory

The job file and input files for VASP are prepared in this directory.

Job file

The name of the job file must match the value of jobfile in cryspy.in. The example of job file (here, job_cryspy) is shown below.

#!/bin/sh
#$ -cwd
#$ -V -S /bin/bash
####$ -V -S /bin/zsh
#$ -N Na8Cl8_CrySPY_ID
#$ -pe smp 20
####$ -q ibis1.q
####$ -q ibis2.q
####$ -q ibis3.q
####$ -q ibis4.q

# ---------- vasp
VASPROOT=/usr/local/vasp/vasp.6.4.2/bin
mpirun -np $NSLOTS $VASPROOT/vasp_std

# ---------- CrySPY
sed -i -e '3 s/^.*$/done/' stat_job

Change VASPROOT to the appropriate path suitable for your environment. The job file is written in the same way as the one you usually use except for the last line. You must add sed -i -e '3 s/^.*$/done/' stat_job at the end of the file in CrySPY.

Note

sed -i -e '3 s/^.*$/done/' stat_job is required at the end of the job file.

Tip

In the job file of CrySPY, the string “CrySPY_ID” is automatically replaced with the structure ID. When you use a job scheduler such as PBS and SLURM, it is useful to set the structure ID to the job name. For example, in the PBS system, #PBS -N Si_CrySPY_ID in ID 10 is replaced with #PBS -N Si_10. Note that starting with a number will result in an error. You should add a prefix like Si_.

Input for VASP

Input files based on the number of stages (nstage in cryspy.in) are required. Name the input file(s) with a suffix _x. Here x means the stage number.

We are using nstage = 2, so we need INCAR_1 and INCAR_2. Here, INCAR_1 is set to fix the cell and relax only the ionic positions, while INCAR_2 is configured to fully relax both the cell and ionic positions.
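
The naming rule for stage inputs can be written down in a few lines. This sketch, with hypothetical helper names, lists the files expected for a given nstage and reports any that are missing from calc_in.

```python
def stage_input_names(base_name, nstage):
    """Return the per-stage input names: base_1, base_2, ..., base_nstage."""
    return [f"{base_name}_{stage}" for stage in range(1, nstage + 1)]

def missing_stage_inputs(base_name, nstage, present):
    """Return the expected stage inputs that are not in the given collection of file names."""
    return [name for name in stage_input_names(base_name, nstage) if name not in present]

# File names in calc_in for this VASP example.
present = {"INCAR_1", "INCAR_2", "POTCAR", "POTCAR_is_dummy", "job_cryspy"}
needed = stage_input_names("INCAR", 2)
missing = missing_stage_inputs("INCAR", 2, present)
print(needed, missing)
```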

INCAR_1

SYSTEM = NaCl
!!!LREAL = Auto
Algo = Fast
NSW = 40

LWAVE = .FALSE.
!LCHARG = .FALSE.

ISPIN =  1

ISMEAR = 0
SIGMA = 0.1

IBRION = 2
ISIF = 2

EDIFF = 1e-5
EDIFFG = -0.01

INCAR_2

SYSTEM = NaCl
!!LREAL = Auto
Algo = Fast
NSW = 200

ENCUT = 341

!!LWAVE = .FALSE.
!!LCHARG = .FALSE.


ISPIN =  1

ISMEAR = 0
SIGMA = 0.1

IBRION = 2
ISIF = 3

EDIFF = 1e-5
EDIFFG = -0.01

CrySPY automatically generates the POSCAR and KPOINTS files. You have to prepare the POTCAR file yourself; note that the POTCAR included in this example is empty.

Warning

POTCAR in this example is empty. We cannot distribute it.

Running CrySPY

Go to Running CrySPY

QE

2024 April 24, updated

In this tutorial, we use CrySPY on a machine with a job scheduler system such as PBS. Here we employ QUANTUM ESPRESSO (QE). The target system is Si with 8 atoms.

Assumption

Here, we assume the following conditions:

  • CrySPY job command: qsub
  • CrySPY job filename: job_cryspy
  • QE executable file: /usr/local/qe-6.5/bin/pw.x
  • QE input filename: pwscf.in
  • QE output filename: pwscf.out

Input files

Move to your working directory, and copy input example files by one of the following methods.

  • Download from cryspy_utility/examples/qe_Si8_RS
  • Copy from CrySPY utility that you installed
  • (only version 0.10.3 or earlier) cp -r ~/CrySPY_root/CrySPY-0.9.0/example/v0.9.0/QE_Si8_RS .

cd QE_RS_Si8
tree
.
├── calc_in
│   ├── job_cryspy
│   ├── pwscf.in_1
│   └── pwscf.in_2
└── cryspy.in

cryspy.in

cryspy.in is the input file of CrySPY.

[basic]
algo = RS
calc_code = QE
tot_struc = 5
nstage = 2
njob = 2
jobcmd = qsub
jobfile = job_cryspy

[structure]
natot = 8
atype = Si
nat = 8

[QE]
qe_infile = pwscf.in
qe_outfile = pwscf.out
kppvol =  40  80

[option]

In [basic] section, jobcmd = qsub can be changed in accordance with your environment. CrySPY runs qsub job_cryspy as a background job internally in this setting.

We adopt a stage-based system for structure optimization calculations. Here, we use nstage = 2. For example, users can configure the following settings. In the first stage, only the ionic positions are relaxed, fixing the cell shape, with low k-point grid density. Next, the ionic positions and cell shape are fully relaxed with high accuracy in the second stage.

[QE] section is required when you use QE. You have to specify k-point grid density (Å^-3) for each stage in kppvol.

Info

See Input file > Kpoint for details of kppvol
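Since kppvol needs one value per stage, it can be worth checking cryspy.in before the first run. The following is a minimal sketch, assuming that cryspy.in can be read with Python's standard configparser; the helper name check_kppvol is hypothetical and not part of CrySPY.

```python
import configparser


def check_kppvol(cryspy_in_text, section="QE"):
    """Return the kppvol values after checking that there is one per stage."""
    config = configparser.ConfigParser()
    config.read_string(cryspy_in_text)
    nstage = config.getint("basic", "nstage")
    kppvol = [int(value) for value in config.get(section, "kppvol").split()]
    if len(kppvol) != nstage:
        raise ValueError(f"kppvol has {len(kppvol)} value(s), but nstage = {nstage}")
    return kppvol


# Excerpt of the cryspy.in used in this tutorial
example = """
[basic]
nstage = 2

[QE]
kppvol =  40  80
"""

print(check_kppvol(example))
```

To check a real file, pass the text of cryspy.in, e.g. open("cryspy.in").read(), instead of the excerpt.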

You can name the following files whatever you want:

  • jobfile
  • qe_infile
  • qe_outfile

The other input variables are discussed later.

calc_in directory

The job file and input files for QE are prepared in this directory.

Job file

The name of the job file must match the value of jobfile in cryspy.in. An example job file (here, job_cryspy) is shown below.

#!/bin/sh
#$ -cwd
#$ -V -S /bin/bash
####$ -V -S /bin/zsh
#$ -N Si8_CrySPY_ID
#$ -pe smp 20
####$ -q ibis1.q
####$ -q ibis2.q

mpirun -np $NSLOTS /path/to/pw.x < pwscf.in > pwscf.out


if [ -e "CRASH" ]; then
    sed -i -e '3 s/^.*$/skip/' stat_job
    exit 1
fi

sed -i -e '3 s/^.*$/done/' stat_job

Change /path/to/pw.x to the path appropriate for your environment. You can choose the input (pwscf.in) and output (pwscf.out) file names, but they must match the values of qe_infile and qe_outfile in cryspy.in.

The job file is written in the same way as the one you usually use except for the last line. You must add sed -i -e '3 s/^.*$/done/' stat_job at the end of the file in CrySPY.

Note

sed -i -e '3 s/^.*$/done/' stat_job is required at the end of the job file.
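For readers less familiar with sed: the command overwrites the whole third line of stat_job with the word done. The Python sketch below illustrates the same operation on a throwaway file. The function name replace_line and the three placeholder lines are assumptions made for this illustration; they do not reproduce the real content of stat_job.

```python
import tempfile
from pathlib import Path


def replace_line(path, line_number, new_text):
    """Overwrite one line (1-based) of a text file, like sed 'N s/^.*$/text/'."""
    path = Path(path)
    lines = path.read_text().splitlines()
    if line_number > len(lines):
        raise ValueError(f"{path} has fewer than {line_number} lines")
    lines[line_number - 1] = new_text
    path.write_text("\n".join(lines) + "\n")


# Demo on a temporary file; the three lines are placeholders,
# not the real content of stat_job.
with tempfile.TemporaryDirectory() as tmp:
    stat_job = Path(tmp) / "stat_job"
    stat_job.write_text("first line\nsecond line\nthird line\n")
    replace_line(stat_job, 3, "done")
    result = stat_job.read_text().splitlines()

print(result)
```

One difference from sed: if the file has fewer than three lines, sed leaves it unchanged, whereas this sketch raises an error.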

Tip

In the job file of CrySPY, the string “CrySPY_ID” is automatically replaced with the structure ID. When you use a job scheduler such as PBS or SLURM, it is useful to include the structure ID in the job name. For example, in the PBS system, #PBS -N Si_CrySPY_ID is replaced with #PBS -N Si_10 for structure ID 10. Note that a job name starting with a number will result in an error, so you should add a prefix like Si_.
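The effect of this substitution can be illustrated with a few lines of Python. This is only a sketch of the behaviour described in the tip, not CrySPY's own code; render_job_file and job_names are hypothetical helper names.

```python
import re


def render_job_file(template, struc_id):
    """Replace every occurrence of the string CrySPY_ID with the structure ID."""
    return template.replace("CrySPY_ID", str(struc_id))


def job_names(job_text):
    """Collect the names given by '#PBS -N' or '#$ -N' directives."""
    return re.findall(r"^#(?:PBS|\$)\s+-N\s+(\S+)", job_text, flags=re.MULTILINE)


good = render_job_file("#PBS -N Si_CrySPY_ID\n", 10)
bad = render_job_file("#PBS -N CrySPY_ID\n", 10)

print(good.strip())                    # #PBS -N Si_10
print(job_names(bad)[0][0].isdigit())  # True: this job name starts with a digit
```

Running job_names on your rendered job file is a quick way to spot a job name that would start with a number.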

Input for QE

Input files based on the number of stages (nstage in cryspy.in) are required. Name the input file(s) with a suffix _x. Here x means the stage number.

We are using nstage = 2, so we need pwscf.in_1 and pwscf.in_2. Here, pwscf.in_1 is set to fix the cell and relax only the ionic positions, while pwscf.in_2 is configured to fully relax both the cell and ionic positions.
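A missing stage file is an easy mistake to make, so a small check can help. The sketch below lists which of the expected files (qe_infile plus the suffix _x) are absent from a directory. The function name missing_stage_files is hypothetical, and the demo uses a temporary directory in place of calc_in.

```python
import tempfile
from pathlib import Path


def missing_stage_files(calc_in, infile, nstage):
    """Return the stage input files (infile_1 ... infile_nstage) that do not exist."""
    expected = [f"{infile}_{stage}" for stage in range(1, nstage + 1)]
    return [name for name in expected if not (Path(calc_in) / name).is_file()]


# Demo: only the first stage file is present in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "pwscf.in_1").write_text("&control\n/\n")
    missing = missing_stage_files(tmp, "pwscf.in", nstage=2)

print(missing)  # ['pwscf.in_2']
```

In a real working directory, call missing_stage_files("calc_in", "pwscf.in", 2) and expect an empty list.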

pwscf.in_1

 &control
    title = 'Si8'
    calculation = 'relax'
    nstep = 100
    restart_mode = 'from_scratch',
    pseudo_dir = '/usr/local/pslibrary.1.0.0/pbe/PSEUDOPOTENTIALS/'
    outdir='./out.d/'
 /

 &system
    ibrav = 0
    nat = 8
    ntyp = 1
    ecutwfc = 44.0
    occupations = 'smearing'
    degauss = 0.01
 /

 &electrons
 /

 &ions
 /

 &cell
 /

ATOMIC_SPECIES
  Si  28.086  Si.pbe-n-kjpaw_psl.1.0.0.UPF

pwscf.in_2

 &control
    title = 'Si8'
    calculation = 'vc-relax'
    nstep = 200
    restart_mode = 'from_scratch',
    pseudo_dir = '/usr/local/pslibrary.1.0.0/pbe/PSEUDOPOTENTIALS/'
    outdir='./out.d/'
 /

 &system
    ibrav = 0
    nat = 8
    ntyp = 1
    ecutwfc = 44.0
    occupations = 'smearing'
    degauss = 0.01
 /

 &electrons
 /

 &ions
 /

 &cell
 /

ATOMIC_SPECIES
  Si  28.086  Si.pbe-n-kjpaw_psl.1.0.0.UPF

Change pseudo_dir to the directory that contains your pseudopotential files. The inputs for the structure data and k-points, such as ATOMIC_POSITIONS and K_POINTS, are automatically appended by CrySPY using pymatgen. Users do not have to prepare them in pwscf.in_x.

Running CrySPY

Go to Running CrySPY

OpenMX

Coming soon.

LAMMPS

Coming soon.

External program

Available from CrySPY 0.11.0.

If you use an external program not supported by CrySPY, the optimized energy and structure data can be loaded semi-manually in CrySPY. You have to prepare two files, ext_opt_struc_data.pkl and ext_energy_data.pkl.

Assumption

Here, we assume the following conditions:

  • (version 0.10.3 or earlier) CrySPY main script: ~/CrySPY_root/CrySPY-0.11.0/cryspy.py

(calc_in directory is not required.)

Input files

Move to your working directory, and copy input example files.

  • version 1.0.0 or later
    • Copy from CrySPY utility
  • version 0.10.3 or earlier
    • cp -r ~/CrySPY_root/CrySPY-0.9.0/example/ext_Si8_RS .
cd ext_Si8_RS
tree
.
└── cryspy.in

cryspy.in

cryspy.in is the input file of CrySPY.

[basic]
algo = RS
calc_code = ext
tot_struc = 5

[structure]
natot = 8
atype = Si
nat = 8

[option]

If calc_code == ext, nstage, njob, jobcmd, and jobfile are ignored.

Running CrySPY

This mode is different from the normal use of CrySPY. Go to Load external data.

Check cryspy.in

See Input file for details.

Let’s take a look at cryspy.in again. This may be slightly different depending on calc_code you chose.

[basic]
algo = RS
calc_code = soiap
tot_struc = 5
nstage = 1
njob = 2
jobcmd = zsh
jobfile = job_cryspy

[structure]
natot = 8
atype = Si
nat = 8

[soiap]
soiap_infile = soiap.in
soiap_outfile = soiap.out
soiap_cif = initial.cif

[option]

[basic] section

  • algo: Algorithm. Set RS for Random Search.
  • calc_code: Structure optimizer. Choose from VASP, QE, OMX, soiap, LAMMPS
  • tot_struc: The total number of structures. In this case, 5 random structures are generated at 1st run.
  • nstage: The number of stages. It’s up to you.
  • njob: The number of jobs running at the same time. In this example, CrySPY sets 2 slots for structure optimization; in other words, it optimizes 2 structures at a time.
  • jobcmd: Command for jobs. Use bash, zsh, qsub, and so on.
  • jobfile: File name of the job file.

[structure] section

  • natot: The total number of atoms. e.g. for Na8Cl8: natot = 16.
  • atype: Atom type. e.g. for Na8Cl8: atype = Na Cl.
  • nat: The number of each atom. e.g. for Na8Cl8: nat = 8 8
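The three entries must be consistent with each other: nat needs one number per element in atype, and the numbers must add up to natot. A minimal sketch of such a check is shown below, assuming cryspy.in can be parsed with Python's configparser; the function name composition is hypothetical.

```python
import configparser


def composition(cryspy_in_text):
    """Return {element: count} after checking natot, atype, and nat for consistency."""
    config = configparser.ConfigParser()
    config.read_string(cryspy_in_text)
    structure = config["structure"]
    natot = structure.getint("natot")
    atype = structure["atype"].split()
    nat = [int(value) for value in structure["nat"].split()]
    if len(atype) != len(nat):
        raise ValueError("atype and nat must have the same number of entries")
    if sum(nat) != natot:
        raise ValueError(f"sum(nat) = {sum(nat)} does not match natot = {natot}")
    return dict(zip(atype, nat))


# The Na8Cl8 example from the list above
na8cl8 = """
[structure]
natot = 16
atype = Na Cl
nat = 8 8
"""

print(composition(na8cl8))  # {'Na': 8, 'Cl': 8}
```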

Script to run

Note

For version 1.0.0 or later, skip this page. The executable script is automatically installed.

Assumption

Here, we assume the following condition:

  • CrySPY main script: ~/CrySPY_root/CrySPY-0.9.0/cryspy.py

Make script

Let’s make a convenient shell script to avoid typing long commands over and over again. Here, we create the script, cryspy (any file name will do).

$ emacs cryspy
$ chmod 744 cryspy
$ cat cryspy
#!/bin/sh

python3 -u ~/CrySPY_root/CrySPY-0.9.0/cryspy.py 1>> log 2>> err

-u option (unbuffered option) can be omitted.

You can put this script in your $PATH, or just run it like bash ./cryspy.

First run

2023 July 10, update

Make sure you have the following in your working directory.

  • calc_in/
  • (cryspy)
  • cryspy.in
$ ls
calc_in/  cryspy.in

Then, run CrySPY!

cryspy

If you use old version (0.10.3 or earlier):

bash ./cryspy

At the first run, CrySPY goes into structure generation mode. CrySPY stops after generating 5 structures.

If it worked properly, the following output appears on the screen:

[2023-07-10 18:40:54,389][cryspy_init][INFO] 


Start CrySPY 1.2.0


[2023-07-10 18:40:54,389][cryspy_init][INFO] # ---------- Read input file, cryspy.in
[2023-07-10 18:40:54,390][read_input][INFO] Save input data in cryspy.stat
[2023-07-10 18:40:54,391][cryspy_init][INFO] # ---------- Initial structure generation
[2023-07-10 18:40:54,391][cryspy_init][INFO] Number of MPI processes: 1
[2023-07-10 18:40:54,391][gen_init_struc][INFO] # ------ mindist
[2023-07-10 18:40:54,395][struc_util][INFO] Cu - Cu: 1.32
[2023-07-10 18:40:54,395][gen_init_struc][INFO] # ------ generate structures
[2023-07-10 18:40:54,481][gen_pyxtal][INFO] Structure ID      0 was generated. Space group:   1 -->   1 P1
[2023-07-10 18:40:54,493][gen_pyxtal][INFO] Structure ID      1 was generated. Space group:  28 -->  28 Pma2
[2023-07-10 18:40:54,498][gen_pyxtal][INFO] Structure ID      2 was generated. Space group:  29 -->  29 Pca2_1
[2023-07-10 18:40:54,704][gen_pyxtal][INFO] Structure ID      3 was generated. Space group: 137 --> 137 P4_2/nmc
[2023-07-10 18:40:54,725][gen_pyxtal][INFO] Structure ID      4 was generated. Space group: 212 --> 214 I4_132
[2023-07-10 18:40:54,800][cryspy_init][INFO] Elapsed time for structure generation: 0:00:00.408367
cryspy  4.35s user 1.04s system 145% cpu 3.697 total

Several output files are also generated.

  • (cryspy.out): Short log. only version 0.10.3 or earlier.
  • cryspy.stat: Status file.
  • data/init_POSCARS: Initial structure file in POSCAR format. You can open this file using VESTA.
  • data/pkl_data: Directory to save pickled data.
  • log_cryspy: log.
  • err_cryspy: error and warning.

Let’s take a look at cryspy.stat file.

...
(omit)
...
[status]
id_queueing = 0 1 2 3 4

Structure IDs 0 – 4 are queueing because we have just generated the structures and have not submitted any jobs yet.
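If you want to read the queue from a script, the [status] section can be parsed in a few lines. This is a sketch under the assumption that cryspy.stat follows the INI syntax understood by Python's configparser, as the excerpt above suggests; queued_ids is a hypothetical helper name.

```python
import configparser


def queued_ids(stat_text):
    """Return the structure IDs listed in id_queueing as a list of integers."""
    config = configparser.ConfigParser()
    config.read_string(stat_text)
    value = config.get("status", "id_queueing", fallback="")
    return [int(item) for item in value.split()]


# Excerpt of cryspy.stat after the first run
stat_excerpt = """
[status]
id_queueing = 0 1 2 3 4
"""

print(queued_ids(stat_excerpt))  # [0, 1, 2, 3, 4]
```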

Tip

Check the initial structures. If the distances between atoms are too short, you should set mindist in cryspy.in.
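One way to check this is to compute the shortest interatomic distance of a structure, including distances to periodic images. The following pure-Python sketch does this for lattice vectors and fractional coordinates as they appear in a POSCAR file. It is an illustration only, not the routine CrySPY uses, and it only looks at neighbouring cells (shifts of -1, 0, +1), which may miss the shortest distance in strongly skewed cells.

```python
import itertools
import math


def min_distance(lattice, frac_coords):
    """Shortest distance between atoms, including images in neighbouring cells.

    lattice: three lattice vectors (rows), frac_coords: fractional coordinates.
    """
    shifts = list(itertools.product((-1, 0, 1), repeat=3))
    best = math.inf
    for i in range(len(frac_coords)):
        for j in range(i, len(frac_coords)):
            for shift in shifts:
                if i == j and shift == (0, 0, 0):
                    continue  # skip the distance of an atom to itself
                d_frac = [frac_coords[j][k] + shift[k] - frac_coords[i][k] for k in range(3)]
                # Convert the fractional difference to Cartesian coordinates
                d_cart = [sum(d_frac[m] * lattice[m][k] for m in range(3)) for k in range(3)]
                best = min(best, math.sqrt(sum(c * c for c in d_cart)))
    return best


# Made-up test cell: cubic, a = 3 Å, two atoms (body-centered arrangement)
cubic = [[3.0, 0.0, 0.0], [0.0, 3.0, 0.0], [0.0, 0.0, 3.0]]
print(min_distance(cubic, [[0.0, 0.0, 0.0], [0.5, 0.5, 0.5]]))
```

If the value printed for one of your initial structures is clearly shorter than a realistic bond length, a larger mindist is probably appropriate.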

Submit job

2023 July 10, update

Continue

CrySPY continues the simulation if you have cryspy.stat file.

Tip

Continue if you have cryspy.stat
Start from the beginning if you don’t have cryspy.stat

Submit job

Run CrySPY again.

cryspy

Check the screen or log_cryspy file.

[2023-07-10 18:52:51,859][cryspy_restart][INFO] 


Restart CrySPY 1.2.0


[2023-07-10 18:52:51,869][ctrl_job][INFO] # ---------- job status
[2023-07-10 18:52:51,904][ctrl_job][INFO] ID      0: submit job, Stage 1
[2023-07-10 18:52:51,931][ctrl_job][INFO] ID      1: submit job, Stage 1

And also cryspy.stat file.

...
(omit)
...
[status]
id_queueing = 2 3 4
id      0 = Stage 1
id      1 = Stage 1

CrySPY submitted two jobs for structure ID 0 and 1 as you set njob = 2 in cryspy.in. Calculations are performed in the work directory, where each subdirectory name corresponds to a structure ID.

tree -d work
work
├── 000000
├── 000001
└── fin

When the two jobs are done, run CrySPY again.

cryspy
[2023-07-10 18:55:01,053][cryspy_restart][INFO] 


Restart CrySPY 1.2.0


[2023-07-10 18:55:01,058][ctrl_job][INFO] # ---------- job status
[2023-07-10 18:55:01,058][ctrl_job][INFO] ID      0: Stage 1 Done!
[2023-07-10 18:55:01,093][ctrl_job][INFO]     collect results: E = -0.00696997755502915 eV/atom
[2023-07-10 18:55:01,132][ctrl_job][INFO] ID      1: Stage 1 Done!
[2023-07-10 18:55:01,133][ctrl_job][INFO]     collect results: E = 0.4934076667166454 eV/atom
[2023-07-10 18:55:01,144][cryspy][INFO] 

recheck 1

[2023-07-10 18:55:01,145][ctrl_job][INFO] # ---------- job status
[2023-07-10 18:55:01,153][ctrl_job][INFO] ID      2: submit job, Stage 1
[2023-07-10 18:55:01,161][ctrl_job][INFO] ID      3: submit job, Stage 1

If you set nstage = 2 (or more), new jobs on stage 2 for ID 0 and 1 are submitted. If you set nstage = 1, CrySPY collects calculation data of ID 0 and 1, then submits next ID’s jobs. Directories of the finished structures are moved to the fin directory.

Repeat cryspy several times until all 5 structures are done. You can delete the work directory when the simulation is done if you do not need it.

The auto script (repeat_cryspy) may help you.
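The idea of repeating a command at fixed intervals can also be sketched in Python. Note that this is not the repeat_cryspy script itself: the function name repeat_command, the number of repeats, and the interval are assumptions chosen for illustration, and the demo runs a harmless stand-in command so that it works on any machine.

```python
import subprocess
import sys
import time


def repeat_command(command, repeats, interval):
    """Run a command several times, waiting `interval` seconds between runs."""
    return_codes = []
    for i in range(repeats):
        completed = subprocess.run(command, check=False)
        return_codes.append(completed.returncode)
        if i < repeats - 1:
            time.sleep(interval)
    return return_codes


# Stand-in command for the demo; in practice you would pass ["cryspy"]
# and a much longer interval that matches the run time of your jobs.
codes = repeat_command([sys.executable, "-c", "pass"], repeats=2, interval=0.1)
print(codes)
```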

Check results

Move to data directory. There should be a few more files.

$ cd data
$ ls
cryspy_rslt  cryspy_rslt_energy_asc  init_POSCARS  opt_POSCARS  pkl_data/
  • cryspy_rslt: Result file.
  • cryspy_rslt_energy_asc: Result file sorted in energy ascending order.
  • init_POSCARS: Initial structure file in POSCAR format.
  • opt_POSCARS: Optimized structure file in POSCAR format.
  • pkl_data/: Directory to save pickled data.

The results are written to text files, cryspy_rslt and cryspy_rslt_energy_asc (and also saved in pickle data in pkl_data directory).

Each result is appended to the cryspy_rslt file in the order in which the calculations finish.

cat cryspy_rslt
   Spg_num Spg_sym  Spg_num_opt Spg_sym_opt  E_eV_atom  Magmom      Opt
0      139  I4/mmm          139      I4/mmm  -3.000850     NaN     done
1       98  I4_122           12        C2/m  -3.978441     NaN  not_yet
2       16    P222           16        P222  -3.348616     NaN  not_yet
3       36  Cmc2_1           36      Cmc2_1  -3.520306     NaN  not_yet
4       36  Cmc2_1            4        P2_1  -3.304168     NaN  not_yet
Info

The results in cryspy_rslt are not in ID order.

In cryspy_rslt_energy_asc file, the results are sorted in energy ascending order.

cat cryspy_rslt_energy_asc
   Spg_num Spg_sym  Spg_num_opt Spg_sym_opt  E_eV_atom  Magmom      Opt
1       98  I4_122           12        C2/m  -3.978441     NaN  not_yet
3       36  Cmc2_1           36      Cmc2_1  -3.520306     NaN  not_yet
2       16    P222           16        P222  -3.348616     NaN  not_yet
4       36  Cmc2_1            4        P2_1  -3.304168     NaN  not_yet
0      139  I4/mmm          139      I4/mmm  -3.000850     NaN     done

Spg_num and Spg_sym show space group information on initial structures. Spg_num_opt and Spg_sym_opt are those of optimized structures. The last column Opt indicates whether or not optimization reached required accuracy.
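If you prefer to process these text files in a script, they can be read without extra libraries. The sketch below parses the cryspy_rslt listing shown above and reproduces the energy ordering of cryspy_rslt_energy_asc. It assumes that columns are separated by whitespace and that no value contains a space; read_rslt is a hypothetical helper name.

```python
def read_rslt(text):
    """Parse a cryspy_rslt-style table into {structure ID: {column: value}}."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    results = {}
    for line in lines[1:]:
        fields = line.split()
        # The first field is the structure ID; the rest follow the header
        results[int(fields[0])] = dict(zip(header, fields[1:]))
    return results


rslt_text = """\
   Spg_num Spg_sym  Spg_num_opt Spg_sym_opt  E_eV_atom  Magmom      Opt
0      139  I4/mmm          139      I4/mmm  -3.000850     NaN     done
1       98  I4_122           12        C2/m  -3.978441     NaN  not_yet
2       16    P222           16        P222  -3.348616     NaN  not_yet
3       36  Cmc2_1           36      Cmc2_1  -3.520306     NaN  not_yet
4       36  Cmc2_1            4        P2_1  -3.304168     NaN  not_yet
"""

results = read_rslt(rslt_text)
ranking = sorted(results, key=lambda i: float(results[i]["E_eV_atom"]))
print(ranking)  # [1, 3, 2, 4, 0]
```

All values are kept as strings, so convert them (as done for E_eV_atom here) before doing arithmetic.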

Append structures

Of course, only 5 structures are not enough to find stable structures. You can append structures whenever you want. Here, let’s append 5 more structures.

For Si-Si mindist, the default value of 1.11 Å is used in the first structure generation (see log_cryspy), which is a little too close. Let us try to set the mindist to 2.0 Å.

Edit cryspy.in: change the value of tot_struc to 10, and add mindist_1 = 2.0.

emacs cryspy.in
cat cryspy.in
[basic]
algo = RS
calc_code = soiap
tot_struc = 10
nstage = 1
njob = 2
jobcmd = zsh
jobfile = job_cryspy

[structure]
natot = 8
atype = Si
nat = 8
mindist_1 = 2.0

[soiap]
soiap_infile = soiap.in
soiap_outfile = soiap.out
soiap_cif = initial.cif

[option]

Then run cryspy, and check log_cryspy file.

cryspy &
cat log_cryspy
...
(omit)
...

2023/03/19 00:01:47
CrySPY 1.0.0
Restart cryspy.py


Changed tot_struc from 5 to 10
Changed mindist from None to [[2.0]]

Backup data

# ---------- Append structures
# ------ mindist
Si - Si 2.0
Structure ID      5 was generated. Space group: 218 --> 221 Pm-3m
Structure ID      6 was generated. Space group:  86 --> 129 P4/nmm
Structure ID      7 was generated. Space group: 129 --> 129 P4/nmm
Structure ID      8 was generated. Space group: 191 --> 191 P6/mmm
Structure ID      9 was generated. Space group:  31 -->  31 Pmn2_1

Remember that CrySPY goes into structure generation mode whenever you change the value of tot_struc. In this mode, CrySPY does not do any other action such as collecting data, submitting jobs, and so on.

Note

CrySPY goes into structure generation mode whenever you change the value of tot_struc.
From version 1.0.0, CrySPY automatically backs up when adding structures. See features/backup.

Repeat cryspy & several times until all appended structures are done. The auto script (repeat_cryspy) may help you.

Analysis and visualization

Download the data

It is assumed here that you analyze and visualize CrySPY data on your local PC. If you use CrySPY on supercomputers or workstations, download the data to your local PC. You can delete the work and backup directories if you do not need them because their file size could be very large.

jupyter notebook

Move to the data/ directory in the results you just downloaded. Then copy cryspy_analyzer_RS.ipynb from CrySPY utility.

$ ls
calc_in/ cryspy.in cryspy.stat  data/  err_cryspy  log_cryspy
$ cd data
$ ls
cryspy_rslt  cryspy_rslt_energy_asc  init_POSCARS  opt_CIFS.cif  opt_POSCARS  pkl_data/
cp /path/to/CrySPY_utility/cryspy_analyzer_RS.ipynb .

Run jupyter. (VScode, jupyter lab, jupyter notebook, and so on.) You can get the following figure by simply running the steps in order.

Figure: RS for Si8

Load external data

You need only cryspy.in.

$ ls
cryspy.in

Then, run CrySPY.

cryspy &

At the first run, CrySPY goes into structure generation mode as usual. CrySPY stops after generating 5 structures.

If it worked properly, log_cryspy would look like this.

2022/07/14 19:41:41
CrySPY 1.0.0
Start cryspy.py

Read input file, cryspy.in
Write input data in cryspy.out
Save input data in cryspy.stat

# --------- Generate initial structures
# ------ mindist
Si - Si 1.11
Structure ID      0 was generated. Space group:  88 --> 141 I4_1/amd
Structure ID      1 was generated. Space group: 101 --> 101 P4_2cm
Structure ID      2 was generated. Space group: 204 --> 229 Im-3m
Structure ID      3 was generated. Space group: 199 --> 199 I2_13
Structure ID      4 was generated. Space group:  12 -->  12 C2/m

Unlike normal use, a directory named ext was created. Only the stat_job file exists in ext/.

$ cat ext/stat_job
out

If you run cryspy when “out” is written in the stat_job file, queueing structure files (cif format) are exported in ext/queue.

cryspy &
$ ls ext/queue
0.cif  1.cif  2.cif  3.cif  4.cif

The number in the file name is the structure ID. The first line of stat_job was automatically changed.

$ cat ext/stat_job
submitted

Perform structure optimization and energy evaluation in an external program using the output cif files. Once that calculation is done, prepare the optimized structure and energy data in the pickle data format, ext_opt_struc_data.pkl and ext_energy_data.pkl.

The data format of ext_opt_struc_data.pkl is the same as init_struc_data.pkl and opt_struc_data.pkl, see Data format/Initial and optimized structure data.

The data format of ext_energy_data.pkl is similar to ext_opt_struc_data.pkl. Just change the value from the structure data into the energy. An example of the energy data (dict type) is shown below.

  • key: structure ID
  • value: energy
{0: -0.7139331910805997,
 1: -0.5643404689832622,
 2: -0.5832404287259171,
 3: -0.535037327286169,
 4: -0.6316663459586607}
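A dictionary like this can be written with Python's standard pickle module. The sketch below saves the example energies as ext_energy_data.pkl and reads them back as a check. The function name save_energy_data is hypothetical, and the demo writes into a temporary directory rather than into your CrySPY working directory.

```python
import pickle
import tempfile
from pathlib import Path

# key: structure ID, value: energy (the example values shown above)
energies = {
    0: -0.7139331910805997,
    1: -0.5643404689832622,
    2: -0.5832404287259171,
    3: -0.535037327286169,
    4: -0.6316663459586607,
}


def save_energy_data(energy_dict, directory):
    """Pickle the energy dictionary as ext_energy_data.pkl in `directory`."""
    path = Path(directory) / "ext_energy_data.pkl"
    with open(path, "wb") as f:
        pickle.dump(energy_dict, f)
    return path


# Demo: write to a temporary directory and load the file again
with tempfile.TemporaryDirectory() as tmp:
    pkl_path = save_energy_data(energies, tmp)
    with open(pkl_path, "rb") as f:
        loaded = pickle.load(f)

print(loaded == energies)  # True
```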

The ext/calc_data directory should be automatically generated, so put the two pickle files here.

$ ls ext/calc_data
ext_energy_data.pkl  ext_opt_struc_data.pkl

When ready, replace the first line of the stat_job file with “done” and run CrySPY.

$ emacs ext/stat_job
$ cat ext/stat_job
done
cryspy &

CrySPY collects the result data.

Evolutionary Algorithm (EA)

EA

Bayesian Optimization (BO)

BO

LAQA

May 15th, 2023

Info

First, see Tutorial > Random Search (RS) for basic usage of CrySPY.
Here, we assume CrySPY 1.1.0 or later.

The example files used here can be downloaded from CrySPY Utility > Examples > qe_Si16_LAQA. In this tutorial, only 50 initial structures are generated, but originally, LAQA is designed to select candidates from many more structures.

Input

cryspy.in

Here is an example of cryspy.in.

[basic]
algo = LAQA
calc_code = QE
tot_struc = 50
nstage = 1
njob = 10
jobcmd = qsub
jobfile = job_cryspy

[structure]
natot = 16
atype = Si
nat = 16
mindist_1 = 1.5

[QE]
qe_infile = pwscf.in
qe_outfile = pwscf.out
kppvol =  80

[LAQA]
nselect_laqa = 4

[option]
  • nstage must be 1 in LAQA
  • You have to write nselect_laqa in [LAQA] section. nselect_laqa is the number of candidates you select at one time.

If you want to change the weights of the LAQA score, edit wf and ws as below. If omitted, the default values are used (0.1 and 10.0, respectively). See Searching algorithms > LAQA for the score.

[LAQA]
nselect_laqa = 4
wf = 0.1
ws = 10.0
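To get a feeling for how wf and ws enter the score, the formula given in Searching algorithms > LAQA, L = -E + w_F F^2 / (2 ΔF) + w_S S, can be evaluated directly. The sketch below is a plain transcription of that formula; the function name laqa_score and the input numbers are made up for illustration and are not CrySPY output.

```python
def laqa_score(energy, force, delta_force, stress, wf=0.1, ws=10.0):
    """LAQA score L = -E + wf * F**2 / (2 * dF) + ws * S."""
    return -energy + wf * force**2 / (2.0 * delta_force) + ws * stress


# Made-up input values, only to show the effect of the weights
default_score = laqa_score(energy=-5.0, force=0.2, delta_force=0.05, stress=0.01)
heavier_force = laqa_score(energy=-5.0, force=0.2, delta_force=0.05, stress=0.01, wf=1.0)

print(default_score, heavier_force)
```

With these numbers, raising wf from 0.1 to 1.0 increases the force contribution tenfold, so structures that still have large forces gain priority.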

calc_in/pwscf.in_1

&control
    calculation = 'vc-relax'
    pseudo_dir = '/usr/local/gbrv/all_pbe_UPF_v1.5/'
    outdir='./outdir/'
    nstep = 10
/

&system
    ibrav = 0
    nat = 16
    ntyp = 1
    ecutwfc = 40
    ecutrho = 200
    occupations = 'smearing'
    degauss = 0.01
/

&electrons
/

&ions
/

&cell
/

ATOMIC_SPECIES
  Si -1.0 si_pbe_v1.uspp.F.UPF
  • nstep controls how many steps of structure optimization can proceed in one selection. (NSW for VASP)

calc_in/job_cryspy

#!/bin/sh
#$ -cwd
#$ -V -S /bin/bash
####$ -V -S /bin/zsh
#$ -N Si_CrySPY_ID
#$ -pe smp 20
####$ -q ibis1.q
####$ -q ibis2.q

mpirun -np $NSLOTS pw.x -nk 4 < pwscf.in > pwscf.out

if [ -e "CRASH" ]; then
    sed -i -e '3 s/^.*$/skip/' stat_job
    exit 1
fi

sed -i -e '3 s/^.*$/done/' stat_job
  • The job file is the same as usual.

Run

Tip

An automatic script is also available. See the bottom of this page.

Just type cryspy for the 1st run.

cryspy &

Check log_cryspy. 50 random structures are generated.

2023/05/13 13:02:07
CrySPY 1.1.0
Start cryspy.py
Number of MPI processes: 1

Read input file, cryspy.in
Save input data in cryspy.stat

# --------- Generate initial structures
# ------ mindist
Si - Si 1.5
Structure ID      0 was generated. Space group: 165 --> 165 P-3c1
Structure ID      1 was generated. Space group:  66 -->  66 Cccm
Structure ID      2 was generated. Space group: 146 --> 146 R3
Structure ID      3 was generated. Space group:  82 -->  82 I-4
Structure ID      4 was generated. Space group: 162 --> 162 P-31m
...
...
...
Structure ID     47 was generated. Space group:  90 -->  90 P42_12
Structure ID     48 was generated. Space group: 214 --> 214 I4_132
Structure ID     49 was generated. Space group:  23 -->  23 I222

Elapsed time for structure generation: 0:00:10.929030


# ---------- Initialize LAQA
# ---------- Selection 0
selected_id: 50 IDs

In LAQA, structure optimization jobs for all structures are submitted once at the beginning. Note that only 10 optimization steps are performed here since we set nstep = 10. Repeat the cryspy command until all of these jobs (10 steps each) are completed. If necessary, you can also submit all jobs at once by increasing the value of njob.

After all the initial optimizations are done, “LAQA is ready” is displayed at the end of log_cryspy.

2023/05/13 13:23:31
CrySPY 1.1.0
Restart cryspy.py
Number of MPI processes: 1



# ---------- job status
ID     41: Stage 1 Done!

LAQA is ready

Next cryspy run will make the first selection.

2023/05/13 13:23:33
CrySPY 1.1.0
Restart cryspy.py
Number of MPI processes: 1



# ---------- job status

Backup data

# ---------- Selection 1
selected_id: 37 8 10 48

Here, only the number set in nselect_laqa will be selected. Type cryspy to submit the jobs (next 10 steps).

cryspy &
2023/05/13 13:23:36
CrySPY 1.1.0
Restart cryspy.py
Number of MPI processes: 1



# ---------- job status
ID     37: submit job, Stage 1
ID      8: submit job, Stage 1
ID     10: submit job, Stage 1
ID     48: submit job, Stage 1

Then, by repeating this over and over again, the optimization of the structure selected according to the score advances by 10 steps each time. Proceed until several structures are completed, and finish (stop) when you like.
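The selection step can be pictured as ranking the unfinished structures by score and taking the first nselect_laqa of them. The sketch below assumes that candidates with the highest score are chosen; the scores are made-up numbers and select_candidates is a hypothetical helper, so this illustrates the idea rather than CrySPY's implementation.

```python
def select_candidates(scores, nselect):
    """Return the IDs of the `nselect` structures with the highest scores."""
    ranked = sorted(scores, key=lambda struc_id: scores[struc_id], reverse=True)
    return ranked[:nselect]


# Made-up scores for six structures (ID: score)
scores = {0: 5.1, 1: 4.8, 2: 5.6, 3: 5.3, 4: 4.9, 5: 5.2}

print(select_candidates(scores, nselect=4))  # [2, 3, 5, 0]
```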

Status

If you want to check the LAQA score during the simulation, you can look at the status file:

  • ./data/LAQA_status

Other files for LAQA will be output:

  • ./data/LAQA_bias
  • ./data/LAQA_energy
  • ./data/LAQA_score
  • ./data/LAQA_selected_id
  • ./data/LAQA_step

Analysis and visualization

It is assumed here that you analyze and visualize CrySPY data on your local PC. If you use CrySPY on supercomputers or workstations, download the data to your local PC. You can delete the work and backup directories if you do not need them because their file size could be very large. You may gzip the pkl data to decrease the file size.

jupyter notebook

Move to the data/ directory in results you just downloaded. Then copy cryspy_analyzer_LAQA.ipynb from CrySPY utility.

You can obtain the graph and animation with the notebook. In the gif below, all of the optimizations were completed. This is just for animation. (When all of the optimizations are completed, the computational cost is the same as random search.)

Figure: fig_LAQA

This graph shows the energy as a function of the optimization step. The red lines indicate the three structures with the lowest energy. The most stable one reached the diamond structure. The structures that eventually become stable were selected at an early stage.

Info

If algo = LAQA, the following are automatically set in the [option] section.

  • force_step_flag = True
  • stress_step_flag = True

Force and stress data are collected step by step. Energy and structure data are NOT. They are collected for each selection. In other words, in this case, energy and structure data are saved once every 10 steps. If you want to collect energy and structure data step by step, manually set up as follows:

[option]
energy_step_flag = True
struc_step_flag = True

Auto script

You may find it tedious to run cryspy over and over again. The auto script could help you.

repeat_cryspy

Molecular crystal structure prediction

Info

First, see Tutorial > Random Search (RS) for basic usage of CrySPY.

In this section, we give a tutorial on the molecular structure generation part only. Since version 0.9.0, CrySPY has been able to generate random molecular crystal structures using PyXtal.

You need to use a molecule pre-defined in PyXtal’s database (see https://pyxtal.readthedocs.io/en/latest/Usage.html?highlight=benzene#pyxtal-molecule-pyxtal-molecule) or create molecule files that define molecular structures.

Pre-defined molecule

PyXtal currently supports C60, H2O, CH4, NH3, benzene, naphthalene, anthracene, tetracene, pentacene, coumarin, resorcinol, benzamide, aspirin, ddt, lindane, glycine, glucose, and ROY.

Let us generate molecular crystal structures that consist of 2 benzenes.

Move to your working directory, and copy the input example files.

Take a look at cryspy.in.

$ cat cryspy.in
[basic]
algo = RS
calc_code = QE
tot_struc = 6
nstage = 2
njob = 2
jobcmd = qsub
jobfile = job_cryspy

[structure]
struc_mode = mol
natot = 24
atype = H C
nat = 12 12
mol_file = benzene
nmol = 2

[QE]
qe_infile = pwscf.in
qe_outfile = pwscf.out
kppvol = 40  60

[option]

In generating molecular crystal structures, you have to set struc_mode = mol in the [structure] section. Molecule file(s) and the number of molecule(s) are specified as:

  • mol_file = benzene
  • nmol = 2

Run CrySPY and see the initial structures (./data/init_POSCARS).

User-defined molecule

Move to your working directory, and copy input example files for 2 formula units of Li3PS4.

  • version 1.0.0 or later
    • Copy from CrySPY utility
  • version 0.10.3 or earlier
    • cp -r ~/CrySPY_root/CrySPY-0.9.0/example/QE_Li3PS4_2fu_RS_mol .
$ cd QE_Li3PS4_2fu_RS_mol
$ ls
Li.xyz  PS4.xyz  calc_in/  cryspy.in

Molecule files of Li and PS4 are included. Supported formats in PyXtal are .xyz, .gjf, .g03, .g09, .com, .inp, .out, and pymatgen’s JSON serialized molecules.

$ cat Li.xyz
1
New structure
 Li  0.000  0.000  0.000
$ cat PS4.xyz
5
New structure
 P    0.000000    0.000000    0.000000
 S    1.200000    1.200000   -1.200000
 S    1.200000   -1.200000    1.200000
 S   -1.200000    1.200000    1.200000
 S   -1.200000   -1.200000   -1.200000

Check cryspy.in.

$ cat cryspy.in
[basic]
algo = RS
calc_code = QE
tot_struc = 4
nstage = 2
njob = 1
jobcmd = qsub
jobfile = job_cryspy

[structure]
struc_mode = mol
natot = 16
atype = Li P S
nat = 6 2 8
mol_file = ./Li.xyz  ./PS4.xyz
nmol = 6 2

[QE]
qe_infile = pwscf.in
qe_outfile = pwscf.out
kppvol = 40  60

[option]

A single atom (Li atom in this case) is treated as a molecule in the molecular crystal structure generation mode. In this example, a random molecular structure is composed of six Li molecules (atoms) and two PS4 molecules specified as:

  • mol_file = ./Li.xyz ./PS4.xyz
  • nmol = 6 2

In mol_file, set relative path of molecule files from cryspy.in. Here the molecule files are placed in the same directory.
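The values of natot, atype, and nat have to agree with the molecule files and nmol. A short check like the one below can confirm this before running CrySPY. It is a sketch that reads only the element symbols from simple .xyz text (atom count, comment line, then one atom per line); the helper names xyz_species and total_composition are hypothetical.

```python
from collections import Counter


def xyz_species(xyz_text):
    """Count the elements in a simple .xyz file given as text."""
    lines = xyz_text.strip().splitlines()
    natoms = int(lines[0])
    atoms = [line.split()[0] for line in lines[2:2 + natoms]]
    if len(atoms) != natoms:
        raise ValueError("number of atom lines does not match the header")
    return Counter(atoms)


def total_composition(molecules, nmol):
    """Multiply each molecule's composition by its count and sum them up."""
    total = Counter()
    for xyz_text, count in zip(molecules, nmol):
        for element, n in xyz_species(xyz_text).items():
            total[element] += n * count
    return total


# Contents of Li.xyz and PS4.xyz as shown above
li_xyz = "1\nNew structure\n Li  0.000  0.000  0.000\n"
ps4_xyz = """5
New structure
 P    0.000000    0.000000    0.000000
 S    1.200000    1.200000   -1.200000
 S    1.200000   -1.200000    1.200000
 S   -1.200000    1.200000    1.200000
 S   -1.200000   -1.200000   -1.200000
"""

total = total_composition([li_xyz, ps4_xyz], nmol=[6, 2])
print(dict(total), sum(total.values()))
```

For this example the result is 6 Li, 2 P, and 8 S atoms, 16 in total, which matches nat = 6 2 8 and natot = 16 in cryspy.in.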

Run CrySPY and see the initial structures (./data/init_POSCARS).

timeout_mol

Molecular crystal structure generation can be time consuming because PyXtal calculates the molecule directions according to a specified space group. Sometimes molecular crystal structure generation gets stuck, so we set a time limit on the generation of a single structure. The time limit (timeout_mol) is set to 120 seconds by default. If the limit is insufficient, increase it as follows (see the last line):

struc_mode = mol
natot = 16
atype = Li P S
nat = 6 2 8
mol_file = ./Li.xyz  ./PS4.xyz
nmol = 6 2
timeout_mol = 300.0

Volume of unit cell

You can control the volume of unit cells by changing the value(s) of scaling factor, vol_factor, in cryspy.in. By default, vol_factor is set to 1.0. It is also possible to specify a range of factors. Set minimum and maximum values as follows:

struc_mode = mol
natot = 16
atype = Li P S
nat = 6 2 8
mol_file = ./Li.xyz  ./PS4.xyz
nmol = 6 2
timeout_mol = 300.0
vol_factor = 0.8 1.5

Random structure generation with MPI

Oct. 21 2023, update

Info

First, see Tutorial > Random Search (RS) for basic usage of CrySPY.

Info

Requirements:

  • CrySPY 1.2.3 or later
  • mpi4py
  • MPI library (Open MPI, Intel MPI, MPICH, etc.)
Warning

1.1.0 <= CrySPY <= 1.2.2 has a bug. When you use bash (zsh) to run a job with MPI (e.g., jobcmd = zsh, jobfile = job_cryspy), the MPI job does not run. There is no problem when you use a job scheduler (qsub, sbatch). This has already been fixed in version 1.2.3.

mpi4py

Install mpi4py if it is not already installed.

pip install mpi4py

Input

cryspy.in is the same as normal usage and does not need to be changed. Here we try structure generation with MPI using the following settings:

[basic]
algo = RS
calc_code = soiap
tot_struc = 100
nstage = 1
njob = 2
jobcmd = zsh
jobfile = job_cryspy

[structure]
natot = 8
atype = Si
nat = 8

[soiap]
soiap_infile = soiap.in
soiap_outfile = soiap.out
soiap_cif = initial.cif

[option]

All except tot_struc, natot, atype, and nat are irrelevant for structure generation and can be ignored here.

Run

If you want to generate structures with 4 MPI processes, just use mpiexec -n (with the -p option):

mpiexec -n 4 cryspy -p

In 1.1.0 <= CrySPY <= 1.2.2, use (without the -p option):

mpiexec -n 4 cryspy

If you submit the job with a job scheduler system, make the job file. Here is an example:

#!/bin/sh
#$ -cwd
#$ -V -S /bin/bash
#$ -N n_nproc
#$ -pe smp 4


mpirun -np $NSLOTS ~/.local/bin/cryspy

Please edit the location of the executable script cryspy.

Result

CrySPY simply divides the task (number of structures) by the number of processes:

  • Rank 0: IDs 0 – 24
  • Rank 1: IDs 25 – 49
  • Rank 2: IDs 50 – 74
  • Rank 3: IDs 75 – 99
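This block-wise division can be written down in a few lines. The sketch below reproduces the ID ranges listed above for 100 structures and 4 processes. How a remainder is distributed when the number of structures is not divisible by the number of processes is an assumption of this sketch (the first ranks get one extra structure) and may differ from what CrySPY does; ids_for_rank is a hypothetical helper name.

```python
def ids_for_rank(tot_struc, nproc, rank):
    """Return the contiguous range of structure IDs assigned to one MPI rank."""
    base, extra = divmod(tot_struc, nproc)
    # Assumption: the first `extra` ranks take one additional structure each
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return range(start, stop)


for rank in range(4):
    ids = ids_for_rank(100, 4, rank)
    print(f"Rank {rank}: IDs {ids[0]} - {ids[-1]}")
```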

CrySPY outputs the log in the order they are generated as follows:

2023/04/24 22:47:51
CrySPY 1.1.0
Start cryspy.py
Number of MPI processes: 4

Read input file, cryspy.in
Save input data in cryspy.stat

# --------- Generate initial structures
# ------ mindist
Si - Si 1.11
Structure ID     25 was generated. Space group: 138 --> 123 P4/mmm
Structure ID     75 was generated. Space group:  99 -->  99 P4mm
Structure ID      0 was generated. Space group: 127 --> 123 P4/mmm
Structure ID      1 was generated. Space group:  61 -->  61 Pbca
Structure ID     50 was generated. Space group:  38 -->  38 Amm2
Structure ID     51 was generated. Space group: 134 --> 123 P4/mmm
Structure ID     26 was generated. Space group: 111 --> 123 P4/mmm
Structure ID      2 was generated. Space group:   9 -->   9 Cc
Structure ID      3 was generated. Space group:  80 -->  80 I4_1
Structure ID      4 was generated. Space group: 107 --> 107 I4mm
Structure ID      5 was generated. Space group:  75 -->  75 P4
Structure ID     76 was generated. Space group: 108 --> 108 I4cm
Structure ID     77 was generated. Space group: 100 --> 100 P4bm
Structure ID     27 was generated. Space group: 207 --> 221 Pm-3m

However, the order in init_POSCARS is by structure ID since CrySPY outputs after all structures have been generated.

ID_0
1.0
   2.9636956737951818    0.0000000000000002    0.0000000000000002
   0.0000000000000000    2.9636956737951818    0.0000000000000002
   0.0000000000000000    0.0000000000000000    6.2634106638053080
Si
8
direct
  -0.1602734164607877   -0.1602734164607877   -0.0000000000000000 Si
   0.1602734164607877    0.1602734164607877    0.5000000000000000 Si
   0.6602734164607877    0.3397265835392123    0.7500000000000000 Si
   0.3397265835392122    0.6602734164607877    0.2500000000000000 Si
   0.4469739273741755    0.4469739273741755   -0.0000000000000000 Si
   0.5530260726258245    0.5530260726258244    0.5000000000000000 Si
   0.0530260726258245    0.9469739273741754    0.7500000000000000 Si
   0.9469739273741754    0.0530260726258245    0.2500000000000000 Si
ID_1
1.0
   7.2751506682509657    0.0000000000000004    0.0000000000000004
   0.0000000000000000    7.2751506682509657    0.0000000000000004
   0.0000000000000000    0.0000000000000000    5.1777634169924873
Si
8
direct
  -0.3845341807505553   -0.3845341807505553    0.4999999999999999 Si
   0.3845341807505553    0.3845341807505553    0.5000000000000000 Si
   0.3845341807505553   -0.3845341807505553    0.0000000000000000 Si
  -0.3845341807505553    0.3845341807505553   -0.0000000000000000 Si
   0.0000000000000000    0.5000000000000000    0.2500000000000000 Si
   0.5000000000000000    0.0000000000000000    0.7500000000000000 Si
   0.0000000000000000    0.5000000000000000    0.7500000000000000 Si
   0.5000000000000000    0.0000000000000000    0.2500000000000000 Si
ID_2
1.0
  -4.3660398676292269   -4.3660398676292269    0.0000000000000000
  -4.3660398676292269   -0.0000000000000003   -4.3660398676292269
   0.0000000000000000   -4.3660398676292269   -4.3660398676292269
Si
8
direct
   0.8700001548800920    0.8700001548800920    0.1299998451199080 Si
   0.1299998451199080    0.1299998451199080    0.8700001548800920 Si
   0.8700001548800920    0.1299998451199080    0.8700001548800920 Si
   0.1299998451199080    0.8700001548800920    0.1299998451199080 Si
   0.1299998451199080    0.8700001548800920    0.8700001548800920 Si
   0.8700001548800920    0.1299998451199080    0.1299998451199080 Si
   0.7500000000000000    0.7500000000000000    0.7500000000000000 Si
   0.2500000000000000    0.2500000000000000    0.2500000000000000 Si
Note

Only the random structure generation part is parallelized with MPI, so there is no point in using MPI for anything else.

Subsections of Searching algorithms

Random search (RS)

under construction

Evolutionary algorithm (EA)

under construction

Bayesian optimization (BO)

under construction

LAQA

Score $ L $

$$ L = -E + w_F \frac{F^2}{2\Delta F} + w_S S. $$
| Symbol | Note |
| --- | --- |
| $ E $ | Energy (eV/atom) |
| $ w_F $ | Weight of the force term. Default: $ w_F = 0.1 $ |
| $ F $ | Averaged norm of the atomic force (eV/Å) |
| $ \Delta F $ | Absolute difference of $ F $ from the previous step. $ \Delta F = 1 $ for the first step. $ \Delta F = 10^{-6} $ if $ \Delta F \le 10^{-6} $. |
| $ w_S $ | Weight of the stress term. Default: $ w_S = 10.0 $ |
| $ S $ | Average of the absolute values of the components of the stress tensor (eV/Å^3). |
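The score can be evaluated directly from the definitions above. The following is a minimal sketch, not CrySPY's own code: the function name `laqa_score` and its argument names are hypothetical, and `prev_force=None` is used here to mark the first step.

```python
def laqa_score(energy, force, prev_force=None, stress=0.0, w_f=0.1, w_s=10.0):
    """Return L = -E + w_F * F**2 / (2 * dF) + w_S * S.

    energy: E in eV/atom, force: F in eV/Å, stress: S in eV/Å^3.
    prev_force: F of the previous step, or None for the first step.
    """
    if prev_force is None:
        delta_f = 1.0                                   # first step: dF = 1
    else:
        delta_f = max(abs(force - prev_force), 1e-6)    # dF is floored at 1e-6
    return -energy + w_f * force**2 / (2.0 * delta_f) + w_s * stress


# First step of a hypothetical structure with E = -4.0 eV/atom and F = 0.5 eV/Å
print(laqa_score(-4.0, 0.5))
```

A lower energy and a force that is still changing slowly both raise the score, so such structures are favored in the next selection.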

Reference

Subsections of Structure generation

struc_mode = crystal

under construction

struc_mode = mol

under construction

struc_mode = mol_bs

CrySPY uses PyXtal in the normal molecular crystal structure generation mode (struc_mode = mol). The molecules are arranged to fit a point group at a selected Wyckoff position in the space group so that the symmetry is kept. (Generation sometimes takes a long time.)

In mol_bs mode (bs means break symmetry), dummy atoms are placed at Wyckoff positions as in ordinary crystals; the dummy atoms are then replaced by molecules, which are rotated randomly without considering symmetry. Structure generation is relatively fast in this mode.

under construction

Subsections of Features

Logging

2023 July 10

CrySPY 1.2.0 adopts the logging library of Python. CrySPY logs are written to both the screen and files (log_cryspy and err_cryspy).

  • log –> screen and log_cryspy
  • error and warning –> screen and err_cryspy

Here is an example:

[2023-07-10 18:40:54,389][cryspy_init][INFO] 


Start CrySPY 1.2.0


[2023-07-10 18:40:54,389][cryspy_init][INFO] # ---------- Read input file, cryspy.in
[2023-07-10 18:40:54,390][read_input][INFO] Save input data in cryspy.stat
[2023-07-10 18:40:54,391][cryspy_init][INFO] # ---------- Initial structure generation
[2023-07-10 18:40:54,391][cryspy_init][INFO] Number of MPI processes: 1
[2023-07-10 18:40:54,391][gen_init_struc][INFO] # ------ mindist
[2023-07-10 18:40:54,395][struc_util][INFO] Cu - Cu: 1.32
[2023-07-10 18:40:54,395][gen_init_struc][INFO] # ------ generate structures
[2023-07-10 18:40:54,481][gen_pyxtal][INFO] Structure ID      0 was generated. Space group:   1 -->   1 P1
[2023-07-10 18:40:54,493][gen_pyxtal][INFO] Structure ID      1 was generated. Space group:  28 -->  28 Pma2
[2023-07-10 18:40:54,498][gen_pyxtal][INFO] Structure ID      2 was generated. Space group:  29 -->  29 Pca2_1
[2023-07-10 18:40:54,704][gen_pyxtal][INFO] Structure ID      3 was generated. Space group: 137 --> 137 P4_2/nmc
[2023-07-10 18:40:54,725][gen_pyxtal][INFO] Structure ID      4 was generated. Space group: 212 --> 214 I4_132
[2023-07-10 18:40:54,800][cryspy_init][INFO] Elapsed time for structure generation: 0:00:00.408367

If you want to run cryspy as a background job, or if you use the auto script (repeat_cryspy), and do NOT want output on the screen, run cryspy with the -n option as follows:

cryspy -n
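The routing described in this section can be reproduced with the standard logging module. The sketch below is only an illustration of that setup, not CrySPY's implementation: the helper `build_logger`, the logger name, and the `to_screen` switch (standing in for the effect of the -n option) are assumptions, and the format string is inferred from the example log above.

```python
import logging
import sys


def build_logger(log_path="log_cryspy", err_path="err_cryspy", to_screen=True):
    """Send INFO records to log_path and WARNING/ERROR records to err_path."""
    fmt = logging.Formatter("[%(asctime)s][%(module)s][%(levelname)s] %(message)s")
    logger = logging.getLogger("cryspy_sketch")     # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()                         # avoid duplicate handlers on re-use
    logger.propagate = False

    log_file = logging.FileHandler(log_path)
    log_file.setLevel(logging.INFO)
    # keep warnings and errors out of the normal log file
    log_file.addFilter(lambda record: record.levelno < logging.WARNING)

    err_file = logging.FileHandler(err_path)
    err_file.setLevel(logging.WARNING)

    handlers = [log_file, err_file]
    if to_screen:                                   # False mimics running without screen output
        handlers.append(logging.StreamHandler(sys.stdout))
    for handler in handlers:
        handler.setFormatter(fmt)
        logger.addHandler(handler)
    return logger
```

Calling `build_logger().info("...")` then writes a line such as `[2023-07-10 18:40:54,389][module][INFO] ...` to the screen and to log_cryspy, while `warning(...)` goes to the screen and err_cryspy.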

Backup

CrySPY has a simple backup function. The following files are backed up:

  • cryspy.in
  • cryspy.stat
  • log_cryspy
  • err_cryspy
  • calc_in/*
  • data/*
  • ext/*

work/* are NOT included.

  • (v1.1.0 or later) The above files are copied to a directory named by date and time in the “backup” directory. Previous backups are NOT automatically deleted.
  • (v1.0.0) Only one generation is backed up, and the previous backup is deleted.
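The v1.1.0 behavior amounts to copying a fixed list of files into a time-stamped directory. The following sketch shows the idea under stated assumptions: the function `backup` is hypothetical, and the directory name format (YYYYMMDD_HHMMSS) is assumed to match the one shown for the trash directory in the Clean section.

```python
import shutil
from datetime import datetime
from pathlib import Path

# Files and directories listed in this section; work/ is deliberately left out.
BACKUP_TARGETS = ["cryspy.in", "cryspy.stat", "log_cryspy", "err_cryspy",
                  "calc_in", "data", "ext"]


def backup(root=".", now=None):
    """Copy the backup targets into backup/<date_time>/ and return that path."""
    root = Path(root)
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")   # assumed name format
    dest = root / "backup" / stamp
    dest.mkdir(parents=True)
    for name in BACKUP_TARGETS:
        src = root / name
        if src.is_dir():
            shutil.copytree(src, dest / name)       # directories are copied recursively
        elif src.is_file():
            shutil.copy2(src, dest / name)          # missing targets are skipped
    return dest
```

Because each call creates a new directory, earlier backups are left untouched, which matches the note that previous backups are not deleted automatically.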

Auto backup

Automatic backups are made at the following times:

  • before going to the next selection (BO, LAQA) or the next generation (EA)
  • when appending structures

Manual backup

To back up manually, run cryspy with the -b or --backup option:

cryspy -b

This command only performs backups, unlike the normal execution.

Clean

CrySPY has a simple clean (just move files) function. It is useful when you want to start over from the beginning. The following files are cleaned up:

  • cryspy.stat
  • log_cryspy
  • err_cryspy
  • lock_cryspy
  • data/*
  • work/*
  • ext/*
  • tmp_calc_FP/*
  • tmp_gen_struc/*

To clean up, run cryspy with the -c or --clean option:

$ ls
calc_in  cryspy.in  cryspy.stat  data  err_cryspy  log_cryspy
$ cryspy -c
Are you sure you want to clean the data? 'yes' or 'no' [y/n]: y
$ ls
calc_in  cryspy.in  trash
$ ls trash
20230318_100728

Files other than calc_in/* and cryspy.in are moved to trash and grouped into a directory named by date and time. If you do not need them, you can delete them manually.

Restriction on interatomic distances

2024 April 23, updated

You can restrict the interatomic distances in structure generation. Here is an example of the [structure] section in the input file that limits the minimum interatomic distances for an A-B binary system.

[structure]
natot = 8
atype = A B
nat = 4 4
mindist_1 = 2.0 1.8
mindist_2 = 1.8 1.5

This means that the minimum interatomic distances of A-A, A-B, and B-B are limited to 2.0, 1.8, and 1.5 Å, respectively. Structures with interatomic distances shorter than these values are automatically eliminated.

For ternary systems, you will need mindist_1, mindist_2, and mindist_3. The mindist matrix must be symmetric.
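The rows mindist_1, mindist_2, ... together form an n x n matrix for n atom types. The sketch below shows how such rows map to pair limits and how the symmetry requirement can be checked before a run; the helper names `mindist_matrix` and `pair_limits` are hypothetical and not part of CrySPY.

```python
def mindist_matrix(rows):
    """Parse mindist_1, mindist_2, ... strings into a checked square matrix."""
    matrix = [[float(x) for x in row.split()] for row in rows]
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        raise ValueError("mindist must be an n x n matrix for n atom types")
    for i in range(n):
        for j in range(i + 1, n):
            if matrix[i][j] != matrix[j][i]:
                raise ValueError(f"mindist is not symmetric at ({i + 1}, {j + 1})")
    return matrix


def pair_limits(atype, matrix):
    """Map each unordered pair of atom types to its minimum distance in Å."""
    return {(atype[i], atype[j]): matrix[i][j]
            for i in range(len(atype)) for j in range(i, len(atype))}


m = mindist_matrix(["2.0 1.8", "1.8 1.5"])
print(pair_limits(["A", "B"], m))
# {('A', 'A'): 2.0, ('A', 'B'): 1.8, ('B', 'B'): 1.5}
```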

Example: Na8Cl8

Without mindist

cryspy.in

[basic]
algo = RS
calc_code = VASP
tot_struc = 5
nstage = 2
njob = 2
jobcmd = qsub
jobfile = job_cryspy

[structure]
natot = 16
atype = Na Cl
nat = 8 8

[VASP]
kppvol = 40 80

[option]

log_cryspy

[2024-04-23 13:46:28,598][cryspy_init][INFO] 


Start CrySPY 1.2.3


[2024-04-23 13:46:28,598][cryspy_init][INFO] # ---------- Read input file, cryspy.in
[2024-04-23 13:46:28,598][read_input][INFO] Save input data in cryspy.stat
[2024-04-23 13:46:28,599][gen_init_struc][INFO] # ------ mindist
[2024-04-23 13:46:28,601][struc_util][INFO] Na - Na: 1.66
[2024-04-23 13:46:28,602][struc_util][INFO] Na - Cl: 1.3399999999999999
[2024-04-23 13:46:28,602][struc_util][INFO] Cl - Cl: 1.02
...

[Figure: fig_mindist]

With the default settings of PyXtal, atoms can sometimes be too close to each other, as shown in the figure above, so it is recommended to set the mindist parameter. Doing so also makes the subsequent DFT calculations easier.

With mindist

cryspy.in

[basic]
algo = RS
calc_code = VASP
tot_struc = 5
nstage = 2
njob = 2
jobcmd = qsub
jobfile = job_cryspy

[structure]
natot = 16
atype = Na Cl
nat = 8 8
mindist_1 = 2.5 1.5
mindist_2 = 1.5 2.5

[VASP]
kppvol = 40 80

[option]

log_cryspy

[2024-04-23 14:06:21,955][cryspy_init][INFO] 


Start CrySPY 1.2.3


[2024-04-23 14:06:21,955][cryspy_init][INFO] # ---------- Read input file, cryspy.in
[2024-04-23 14:06:21,956][read_input][INFO] Save input data in cryspy.stat
[2024-04-23 14:06:21,956][gen_init_struc][INFO] # ------ mindist
[2024-04-23 14:06:21,956][struc_util][INFO] Na - Na: 2.5
[2024-04-23 14:06:21,956][struc_util][INFO] Na - Cl: 1.5
[2024-04-23 14:06:21,956][struc_util][INFO] Cl - Cl: 2.5

For ionic crystals, it is advisable to choose mindist so that ions of the same species are kept farther apart than cation-anion pairs, as in the setting above (2.5 Å for Na-Na and Cl-Cl, 1.5 Å for Na-Cl).

CrySPY_ID in job files

In the CrySPY job file, the string “CrySPY_ID” is automatically replaced with the structure ID. When you use a job scheduler such as PBS or SLURM, it is useful to include the structure ID in the job name. For example, in a PBS system, #PBS -N Si_CrySPY_ID is replaced with #PBS -N Si_10 for structure ID 10. Note that a job name starting with a number results in an error, so add a prefix such as Si_.

#!/bin/sh
#$ -cwd
#$ -V -S /bin/bash
####$ -V -S /bin/zsh
#$ -N Si8_CrySPY_ID
#$ -pe smp 8
####$ -q ibis1.q
####$ -q ibis2.q

mpirun -np $NSLOTS pw.x -nk 4 -nb 2 < pwscf.in > pwscf.out


if [ -e "CRASH" ]; then
    exit 1
fi

sed -i -e '3 s/^.*$/done/' stat_job
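The substitution itself is a plain string replacement. As a rough illustration (the function name `fill_job_file` is hypothetical and this is not CrySPY's code), the job file above would be processed like this for structure ID 10:

```python
def fill_job_file(template_text, structure_id):
    """Return the job-file text with every 'CrySPY_ID' replaced by the ID."""
    return template_text.replace("CrySPY_ID", str(structure_id))


template = "#!/bin/sh\n#$ -cwd\n#$ -N Si8_CrySPY_ID\n"
print(fill_job_file(template, 10))
```

The job name line becomes `#$ -N Si8_10`; without the `Si8_` prefix the name would start with a digit, which is the error case mentioned above.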

Structure generation with MPI parallelization

Oct. 21 2023, update

Random structure generation using MPI has been available since version 1.1.0 (CrySPY 1.2.3 or later is recommended). For MPI parallelization, install mpi4py in your Python environment. An MPI library such as Open MPI, Intel MPI, or MPICH must also be installed on your machine.

Info

Requirements:

  • CrySPY 1.2.3 or later (MPI generation exists since 1.1.0, but see the warning below)
  • mpi4py
  • MPI library (Open MPI, Intel MPI, MPICH, etc.)
Warning

CrySPY versions 1.1.0 through 1.2.2 have a bug: when you use bash (or zsh) to run a job with MPI (e.g., jobcmd = zsh, jobfile = job_cryspy), the MPI job does not run. There is no problem when you use a job scheduler (qsub, sbatch). The bug was fixed in version 1.2.3.

The figure below shows the relationship between elapsed time and the number of processes for 1000 structures of Si8 with the following setting:

[basic]
algo = RS
calc_code = soiap
tot_struc = 1000
nstage = 1
njob = 2
jobcmd = zsh
jobfile = job_cryspy

[structure]
natot = 8
atype = Si
nat = 8
mindist_1 = 2.2

Structure generation takes a long time here because of the slightly strict setting mindist_1 = 2.2. Structure generation was performed 10 times for each number of processes.

[Figure: fig_MPI]

Run

mpiexec -n 4 cryspy -p

Enthalpy

2023/10/18

Info

Requirements:

  • CrySPY 1.2.2 or later
  • VASP or QE

When performing CSP at high pressure, enthalpy can be collected instead of total energy. This is not yet compatible with codes other than VASP and QE.

E_eV_atom in cryspy_rslt and cryspy_rslt_energy_asc then holds the enthalpy (eV/atom). Here is an example of CSP results for Sr4O4 under a pressure of 40 GPa. The CsCl-type structure (ID 5) is more stable than the NaCl-type structure (ID 6).

   Spg_num Spg_sym  Spg_num_opt Spg_sym_opt  E_eV_atom  Magmom      Opt
5       26  Pmc2_1          221       Pm-3m  -2.276790     NaN     done
6      225   Fm-3m          225       Fm-3m  -2.244800     NaN     done
1      101  P4_2cm          107        I4mm  -2.181115     NaN     done
4      123  P4/mmm          123      P4/mmm  -2.034509     NaN  not_yet
3       20  C222_1           63        Cmcm  -0.686541     NaN     done
2       75      P4           75          P4  -0.008713     NaN  not_yet
9       51    Pmma           47        Pmmm   0.096430     NaN     done
8       65    Cmmm          123      P4/mmm   1.099657     NaN     done
0      187   P-6m2          187       P-6m2   1.292124     NaN     done
7       53    Pmna           53        Pmna   5.153504     NaN  not_yet
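For orientation, the enthalpy is H = E + PV. VASP and QE report it themselves, so CrySPY only reads the value; the sketch below merely illustrates the relation and the unit conversion. The function name and the numbers in the example are hypothetical. Note that 40 GPa corresponds to 400 kbar, the unit used by PSTRESS in VASP and press in QE.

```python
EV_PER_A3_IN_GPA = 160.21766208     # 1 eV/Å^3 expressed in GPa


def enthalpy_per_atom(energy_ev, volume_a3, pressure_gpa, natoms):
    """Return H = E + P*V in eV/atom for a cell with total energy E (eV) and volume V (Å^3)."""
    pv_ev = pressure_gpa / EV_PER_A3_IN_GPA * volume_a3    # P*V converted to eV
    return (energy_ev + pv_ev) / natoms


# Hypothetical 8-atom cell: E = -40 eV, V = 100 Å^3, P = 40 GPa (= 400 kbar)
print(enthalpy_per_atom(-40.0, 100.0, 40.0, 8))
```

At this pressure the PV term is about 25 eV for the hypothetical cell, comparable to the total energy, which is why the ranking by enthalpy can differ from the ranking by energy.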

VASP

CrySPY reads the energy (enthalpy) from the OSZICAR file. The value automatically becomes the enthalpy when PSTRESS is set in INCAR_x as follows:

PSTRESS = 400

You do not have to do anything in cryspy.in. energy_step_flag is also supported for enthalpy.

Example:

QE

Add pv_term = True in the QE section of cryspy.in to use enthalpy:

[QE]
qe_infile = pwscf.in
qe_outfile = pwscf.out
kppvol =  40  80
pv_term = True

Don’t forget to write press in the QE input:

 &cell
    press = 400
 /
Warning

In QE, energy_step_flag is not supported yet for enthalpy.

Example:

Subsections of Input file

File format

CrySPY uses the configparser module to read the input file, cryspy.in. cryspy.in consists of sections, each led by a [section] header and followed by name = value or name : value entries. Section names and values are case sensitive, but names are not. Lines beginning with # or ; are ignored and may be used for comments. Accepted bool values are 1, yes, true, and on, which are interpreted as True, and 0, no, false, and off, which are interpreted as False. These bool strings are checked in a case-insensitive manner. Some values are given as space-separated lists.

Info

See the configparser documentation for details.

Note

section name: case sensitive
name: case insensitive
value: case sensitive except for bool
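These rules are those of Python's standard configparser, so they can be tried out directly. The snippet below parses a small cryspy.in-like text; it shows only how configparser behaves and is not CrySPY's own input reader. The variable names are arbitrary.

```python
import configparser

text = """
# lines starting with # or ; are comments
[basic]
algo = RS
TOT_STRUC = 5
jobcmd : qsub

[structure]
atype = Na Cl
nat = 8 8

[option]
; bool strings are case-insensitive
load_struc_flag = Yes
"""

config = configparser.ConfigParser()
config.read_string(text)

tot_struc = config.getint("basic", "tot_struc")       # names are case-insensitive
jobcmd = config.get("basic", "jobcmd")                # "name : value" is accepted
atype = config.get("structure", "atype").split()      # space-separated values
nat = [int(x) for x in config.get("structure", "nat").split()]
load_flag = config.getboolean("option", "load_struc_flag")

print(tot_struc, jobcmd, atype, nat, load_flag)
print(config.has_section("Basic"))                    # section names are case sensitive
```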

[basic] section

| Name | Value | Default | Description |
| --- | --- | --- | --- |
| algo | RS, EA, BO, LAQA | | Algorithm. |
| calc_code | VASP, QE, OMX, soiap, LAMMPS | | Calculation code for structure optimization. |
| tot_struc | int | | The total number of structures. |
| nstage | int | | The number of stages. |
| njob | int | | The number of jobs running at the same time. |
| jobcmd | str | | Command to submit jobs such as qsub and sbatch. |
| jobfile | str | | File name of the job file. |

[structure] section

| Name | Value | Default | Description |
| --- | --- | --- | --- |
| struc_mode | crystal, mol, mol_bs | crystal | Structure generation mode. |
| natot | int | | The total number of atoms. |
| atype | atomic symbol [atomic symbol …] | | Atom type. e.g. atype = Na Cl. |
| nat | int [int …] | | The number of atoms in each atom type. e.g. nat = 8 8. |
| mol_file | str [str …], None | None | Path of molecule files or molecule names. |
| nmol | int [int …], None | None | The number of molecules. |
| timeout_mol | float | 120.0 | Time out for molecular structure generation. |
| rot_mol | random, random_mol, random_wyckoff | random_wyckoff | Mode for rotation of molecules. |
| nrot | int | 20 | Maximum number of trials to rotate molecules in mol_bs. |
| vol_factor | float float | 1.0 1.0 | Minimum and maximum values of volume factor. |
| vol_mu | float, None | None | Mean of volume if you want to specify the volume of cells. |
| vol_sigma | float, None | None | Standard deviation of volume if you want to specify the volume of cells. |
| mindist (mindist_?) | float [float …], None | None | Constraint on minimum interatomic distance [Å]. |
| mindist_factor | float | 1.0 | Scaling factor for mindist. |
| mindist_mol_bs (mindist_mol_bs_?) | float [float …], None | None | Constraint on minimum intermolecular distance [Å]. |
| mindist_mol_bs_factor | float | 1.0 | Scaling factor for mindist_mol_bs. |
| symprec | float | 0.01 | Precision for symmetry finding. |
| spgnum | all, space group number, 0 | all | Constraint on space group. If all, 1–230. If 0, random structure without space group information (no symmetry). |
| use_find_wy | bool | False | Structure generation with find_wy. |
| fwpath | str, None | None | Path of find_wy. |
| minlen | float, None | None | Only used with find_wy or spgnum = 0. Minimum length of lattice vector [Å]. |
| maxlen | float, None | None | Only used with find_wy or spgnum = 0. Maximum length of lattice vector [Å]. |
| dangle | float, None | None | Only used with find_wy or spgnum = 0. Delta angle for alpha, beta, and gamma in degree unit. |
| maxcnt | int | 50 | Only used with find_wy or spgnum = 0. Maximum number of trials to determine atom positions. |
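In the examples throughout this document, natot equals the sum of nat, and atype and nat have the same number of entries (e.g. natot = 16, atype = Na Cl, nat = 8 8). A quick pre-flight check of this consistency could look as follows; the function `check_composition` is a hypothetical helper, and whether CrySPY performs the same check internally is an assumption.

```python
def check_composition(natot, atype, nat):
    """Raise ValueError if natot, atype, and nat disagree; otherwise return a formula string."""
    if len(atype) != len(nat):
        raise ValueError("atype and nat must have the same number of entries")
    if sum(nat) != natot:
        raise ValueError(f"natot = {natot}, but nat sums to {sum(nat)}")
    return "".join(f"{symbol}{count}" for symbol, count in zip(atype, nat))


print(check_composition(16, ["Na", "Cl"], [8, 8]))   # Na8Cl8
```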

mindist

[VASP] section

2024 April 22

[VASP] section is required only if you use VASP (calc_code = VASP)

| Name | Value | Default | Description |
| --- | --- | --- | --- |
| kppvol | int [int …] | | Grid density per Å**(-3) of reciprocal cell in each stage. |
| force_gamma | bool | False | If true, force gamma-centered mesh. |

kppvol and force gamma

[QE] section

[QE] section is required only if you use QE (calc_code = QE)

| Name | Value | Default | Description |
| --- | --- | --- | --- |
| kppvol | int [int …] | | Grid density per Å**(-3) of reciprocal cell in each stage. |
| qe_infile | str | | File name of QE input file. |
| qe_outfile | str | | File name of QE output file. |
| pv_term | bool | False | If true, read enthalpy instead of total energy. |

kppvol

pv_term

[OMX] section

[OMX] section is required only if you use OpenMX (calc_code = OMX)

| Name | Value | Default | Description |
| --- | --- | --- | --- |
| kppvol | int [int …] | | Grid density per Å**(-3) of reciprocal cell in each stage. |
| OMX_infile | str | | File name of OpenMX input file. |
| OMX_outfile | str | | File name of OpenMX output file. |
| ValenceElectrons | str float float [str float float …] | | The number of initial charges for up and down spin states. |

kppvol

ValenceElectrons

e.g. in NaCl: ValenceElectrons = Na 4.5 4.5 Cl 3.5 3.5.

[soiap] section

[soiap] section is required only if you use soiap (calc_code = soiap)

| Name | Value | Default | Description |
| --- | --- | --- | --- |
| soiap_infile | str | | File name of soiap input file. |
| soiap_outfile | str | | File name of soiap output file. |
| soiap_cif | str | | File name of soiap CIF-formatted initial structure. |

[LAMMPS] section

[LAMMPS] section is required only if you use LAMMPS (calc_code = LAMMPS)

| Name | Value | Default | Description |
| --- | --- | --- | --- |
| lammps_infile | str | | File name of LAMMPS input file. |
| lammps_outfile | str | | File name of LAMMPS output file. |
| lammps_potential | str [str …], None | None | Potential. |
| lammps_data | str | | File name of LAMMPS data file. |

[ASE] section

[ASE] section is required only if you use ASE (calc_code = ASE)

| Name | Value | Default | Description |
| --- | --- | --- | --- |
| ase_python | str | | File name of ASE input file. |

[EA] section

[EA] section is required only if you use EA (algo = EA)

| Name | Value | Default | Description |
| --- | --- | --- | --- |
| n_pop | int | | Population from second generation. |
| n_crsov | int | | The number of structures generated by crossover. |
| n_perm | int | | The number of structures generated by permutation. |
| n_strain | int | | The number of structures generated by strain. |
| n_rand | int | | The number of structures generated randomly. |
| n_elite | int | | The number of elite structures. |
| fit_reverse | bool | False | If False, minimal search. |
| n_fittest | int | None | The number of structures which can survive. |
| slct_func | TNM, RLT | | Select function. |
| t_size | int | 3 | Only used with slct_func = TNM. Tournament size. |
| a_rlt | float | 10.0 | Only used with slct_func = RLT. Parameter for linear scaling. |
| b_rlt | float | 1.0 | Only used with slct_func = RLT. Parameter for linear scaling. |
| crs_lat | equal, random | equal | How to mix lattice vectors. |
| nat_diff_tole | int | 4 | Tolerance for difference in the number of atoms in crossover. |
| ntimes | int | 1 | The number of times in permutation. |
| sigma_st | float | 0.5 | Standard deviation for strain. |
| maxcnt_ea | int | 50 | Maximum number of trials in EA. |
| maxgen_ea | int | 0 | Maximum generation. |
| emax_ea | float | None | Upper limit of energy in selecting parents. |
| emin_ea | float | None | Lower limit of energy in selecting parents. |

[BO] section

[BO] section is required only if you use BO (algo = BO)

| Name | Value | Default | Description |
| --- | --- | --- | --- |
| nselect_bo | int | | The number of structures to be selected at once. |
| score | TS, EI, PI | | Acquisition function. |
| num_rand_basis | int | | If 0, Gaussian process. The number of basis function. |
| cdev | float | 0.001 | Cutoff of deviation for standardization. |
| dscrpt | FP | | Descriptor for structures. |
| fppath | str, None | None | Only used with dscrpt = FP. Path of cal_fingerprint. |
| fp_rmin | float | 0.5 | Only used with dscrpt = FP. Minimum cutoff of r in fingerprint. |
| fp_rmax | float | 5.0 | Only used with dscrpt = FP. Maximum cutoff of r in fingerprint. |
| fp_npoints | int | 20 | Only used with dscrpt = FP. Number of discretized points for each pair in fingerprint. |
| fp_sigma | float | 1.0 | Only used with dscrpt = FP. Sigma parameter [Å] in Gaussian smearing function. |
| max_select_bo | int | 0 | Maximum number of selection. |
| manual_select_bo | int [int …] | [] | Structure IDs to be selected manually. |
| emax_bo | float | None | Upper limit of energy in BO. |
| emin_bo | float | None | Lower limit of energy in BO. |

[LAQA] section

[LAQA] section is required only if you use LAQA (algo = LAQA)

| Name | Value | Default | Description |
| --- | --- | --- | --- |
| nselect_laqa | int | | The number of structures to be selected at once. |
| wf | float | 0.1 | Weight of the force term. |
| ws | float | 10.0 | Weight of the stress term. |
Info

If algo = LAQA, the following are automatically set in the [option] section.

  • force_step_flag = True
  • stress_step_flag = True

Force and stress data are collected step by step. Energy and structure data are NOT. They are collected for each selection. In other words, in this case, energy and structure data are saved once every 10 steps. If you want to collect energy and structure data step by step, manually set up as follows:

[option]
energy_step_flag = True
struc_step_flag = True

[option] section

| Name | Value | Default | Description |
| --- | --- | --- | --- |
| stop_chkpt | int | 0 | CrySPY stops at a specified check point. |
| load_struc_flag | bool | False | If True, load initial structures from ./data/pkl_data/init_struc_data.pkl. |
| stop_next_struc | bool | False | If True, CrySPY does not submit jobs for next structures, but jobs for next stage are submitted. |
| recalc | int [int …] | (empty list) | Specify structure IDs if you want to recalculate or continue optimization. |
| append_struc_ea | bool | False | If True, append structures by EA. |
| energy_step_flag | bool | False | If True, save energy_step_data in ./data/pkl_data/energy_step_data.pkl. |
| struc_step_flag | bool | False | If True, save struc_step_data in ./data/pkl_data/struc_step_data.pkl. |
| force_step_flag | bool | False | If True, save force_step_data in ./data/pkl_data/force_step_data.pkl. |
| stress_step_flag | bool | False | If True, save stress_step_data in ./data/pkl_data/stress_step_data.pkl. |

Kpoint

2024 April 22

CrySPY automatically generates the k-point setting using the pymatgen.io.vasp.Kpoints.automatic_density_by_vol function from pymatgen. An example in cryspy.in with nstage = 2 is as follows:

[VASP]
kppvol = 40 120
  • stage 1: kppvol = 40
  • stage 2: kppvol = 120

kppvol means a grid density per Å ${}^{-3} $ of reciprocal cell.
VASP: gamma centered meshes are used for hexagonal cells and face-centered cells; otherwise, Monkhorst-Pack grids are employed.
QE and OMX: only a k-mesh is provided, no offset.

What is the appropriate value for kppvol?

Here are the guidelines. We use VESTA for visualizing crystal structures.

Primitive cell of diamond Si

[Figure: fig_prim_diamond]

a = b = c = 3.836 Å

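| kppvol | k-mesh |
| --- | --- |
| 0 | [1, 1, 1] |
| 20 | [4, 4, 4] |
| 40 | [6, 6, 6] |
| 60 | [7, 7, 7] |
| 80 | [7, 7, 7] |
| 100 | [8, 8, 8] |
| 120 | [9, 9, 9] |
| 140 | [9, 9, 9] |
| 160 | [9, 9, 9] |
| 180 | [10, 10, 10] |
| 200 | [10, 10, 10] |
| 400 | [13, 13, 13] |
| 600 | [15, 15, 15] |
| 800 | [17, 17, 17] |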

Conventional cell of diamond Si

[Figure: fig_conv_diamond]

a = b = c = 5.431 Å

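| kppvol | k-mesh |
| --- | --- |
| 0 | [1, 1, 1] |
| 20 | [3, 3, 3] |
| 40 | [3, 3, 3] |
| 60 | [4, 4, 4] |
| 80 | [4, 4, 4] |
| 100 | [5, 5, 5] |
| 120 | [5, 5, 5] |
| 140 | [6, 6, 6] |
| 160 | [6, 6, 6] |
| 180 | [6, 6, 6] |
| 200 | [6, 6, 6] |
| 400 | [8, 8, 8] |
| 600 | [9, 9, 9] |
| 800 | [10, 10, 10] |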

Nd2Fe14B

[Figure: fig_Nd2Fe12B]

a = b = 8.804 Å
c = 12.205 Å

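| kppvol | k-mesh |
| --- | --- |
| 0 | [1, 1, 1] |
| 20 | [1, 1, 1] |
| 40 | [2, 2, 1] |
| 60 | [2, 2, 2] |
| 80 | [3, 3, 2] |
| 100 | [3, 3, 2] |
| 120 | [3, 3, 2] |
| 140 | [3, 3, 2] |
| 160 | [3, 3, 2] |
| 180 | [4, 4, 2] |
| 200 | [4, 4, 3] |
| 400 | [5, 5, 3] |
| 600 | [6, 6, 4] |
| 800 | [6, 6, 4] |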
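To estimate a mesh for your own cell without running CrySPY, the arithmetic behind the tables above can be sketched in a few lines. This is an independent sketch and not pymatgen's code: it assumes that the target number of k-points is kppvol times the reciprocal-cell volume (including the factor (2π)^3) and that the number of divisions along each axis is rounded down, with a minimum of 1. The function `kmesh` is hypothetical, and the 60° angles used for the primitive cell are an assumption (fcc primitive cell). Under these assumptions it gives [3, 3, 3] for the conventional cell and [6, 6, 6] for the primitive cell at kppvol = 40, and [4, 4, 3] for Nd2Fe14B at kppvol = 200.

```python
import math


def kmesh(kppvol, a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Estimate the k-mesh for cell lengths in Å and angles in degrees (floor-based rule)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    volume = a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)
    recip_volume = (2 * math.pi) ** 3 / volume     # reciprocal-cell volume in Å^-3
    ngrid = kppvol * recip_volume                  # target number of k-points
    mult = (ngrid * a * b * c) ** (1 / 3)
    # at least one division per axis, rounded down
    return [math.floor(max(mult / length, 1)) for length in (a, b, c)]


print(kmesh(40, 5.431, 5.431, 5.431))                  # conventional diamond Si
print(kmesh(40, 3.836, 3.836, 3.836, 60, 60, 60))      # primitive cell, angles assumed
print(kmesh(200, 8.804, 8.804, 12.205))                # Nd2Fe14B
```

Because the divisions scale with the inverse of the cell length, large cells reach a given kppvol with far fewer k-points per axis, which is why the same kppvol can be used across structures of different size.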

Subsections of Data format

Subsections of Common data

Initial and optimized structure data

Initial and optimized structure data are saved in init_struc_data.pkl and opt_struc_data.pkl, respectively. The pymatgen library is required to analyze these data files.

Data format

  • type: dict
    • key: structure ID
    • value: structure data
  • string form
    • {0: Structure Summary …,
      1: Structure Summary …,
      …}
  • structure data format

How to access

import pickle
with open('init_struc_data.pkl', 'rb') as f:
   init_struc_data = pickle.load(f)
with open('opt_struc_data.pkl', 'rb') as f:
   opt_struc_data = pickle.load(f)

# init_struc_data[ID]
# opt_struc_data[ID]

# ---------- initial structure of ID 0
cid = 0      # ID
init_struc_data[cid]    # to show initial structure of ID 0
Structure Summary
Lattice
    abc : 5.727301 5.727301 4.405757
 angles : 90.0 90.0 90.0
 volume : 144.5175386563631
      A : 5.727301 0.0 0.0
      B : 0.0 5.727301 0.0
      C : 0.0 0.0 4.405757
PeriodicSite: Si (0.2506, 5.4767, 1.1014) [0.0438, 0.9562, 0.2500]
PeriodicSite: Si (2.6130, 3.1143, 1.1014) [0.4562, 0.5438, 0.2500]
PeriodicSite: Si (3.1143, 0.2506, 1.1014) [0.5438, 0.0438, 0.2500]
PeriodicSite: Si (5.4767, 2.6130, 1.1014) [0.9562, 0.4562, 0.2500]
PeriodicSite: Si (5.4767, 0.2506, 3.3043) [0.9562, 0.0438, 0.7500]
PeriodicSite: Si (3.1143, 2.6130, 3.3043) [0.5438, 0.4562, 0.7500]
PeriodicSite: Si (2.6130, 5.4767, 3.3043) [0.4562, 0.9562, 0.7500]
PeriodicSite: Si (0.2506, 3.1143, 3.3043) [0.0438, 0.5438, 0.7500]

Result data

Common result data such as space groups and energies are saved in rslt_data.pkl. The pandas library is required to analyze this data file.

Data format

  • type: pandas.core.frame.DataFrame
    • row label: structure ID
  • string form
    • see below

How to access

import pickle
with open('rslt_data.pkl', 'rb') as f:
   rslt_data = pickle.load(f)


# ---------- sort by Energy
# top 5
rslt_data.sort_values(by=['E_eV_atom']).head(5)
   Spg_num Spg_sym  Spg_num_opt Spg_sym_opt  E_eV_atom  Magmom      Opt
1       98  I4_122           12        C2/m  -3.978441     NaN  not_yet
3       36  Cmc2_1           36      Cmc2_1  -3.520306     NaN  not_yet
2       16    P222           16        P222  -3.348616     NaN  not_yet
4       36  Cmc2_1            4        P2_1  -3.304168     NaN  not_yet
0      139  I4/mmm          139      I4/mmm  -3.000850     NaN     done

Random Search (RS)

    Evolutionary algorithm (EA)

      Bayesian Optimization (BO)

        LAQA

          Subsections of Optional data

          Energy step data

          Energy step data is saved in energy_step_data.pkl if you set energy_step_flag = True in the [option] section of cryspy.in. The NumPy library is required to analyze this data file.

          Warning

          energy_step_flag = True is currently available only with VASP, QE, and soiap.

          Info

          In soiap, energy_step_data is collected only if loopa == 1. This is because the other data (struc, force, and stress) are output only when loopa == 1. See https://github.com/nbsato/soiap/blob/master/doc/instructions.md

          Data format

          • type: dict
            • key: structure ID
            • value: list of energy step data in each stage
          • string form
            • {0: [array([-3.4439912 , -3.55040935, -3.66697038, ..]), array([-4.0613393 , -4.05445631, -4.06159641, …]), …],
              1: [array([-2.68209823, -2.69012487, -2.68364907, ..]), array([-2.79140967, -2.79183827, -2.79206508, …]), …],
              …}
          • unit of energy
            • eV/atom

          How to access

          import pickle
          with open('energy_step_data.pkl', 'rb') as f:
              energy_step_data = pickle.load(f)
          
          # energy_step_data[ID][stage][step]
          # energy_step_data[ID][0] <-- stage 1
          # energy_step_data[ID][1] <-- stage 2
          #
          # in LAQA
          # energy_step_data[ID][selection][step]
          # energy_step_data[ID][0] <-- 1st selection
          # energy_step_data[ID][1] <-- 2nd selection
          
          # ---------- energy step data of ID 3, stage 1
          cid = 3      # ID
          stage = 1    # stage
          energy_step_data[cid][stage-1][:10]    # show only 10 energies in jupyter
          
          array([-3.4439912 , -3.55040935, -3.66697038, -3.77192063, -3.84320717,
                 -3.80679245, -3.84633935, -3.87374706, -3.89123193, -3.90422926])
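Since the data is a dict of per-stage sequences, a full relaxation trace is obtained by concatenating the stages (or selections in LAQA). The helpers below are a small sketch with hypothetical names; they work on NumPy arrays as stored in the pickle, and equally on plain lists, which are used here so that the example runs without the data file.

```python
def energy_trace(energy_step_data, cid):
    """All step energies (eV/atom) of one ID, concatenated across stages or selections."""
    return [float(e) for stage in energy_step_data[cid] for e in stage]


def final_energy(energy_step_data, cid):
    """Energy (eV/atom) of the last recorded step of the last non-empty stage."""
    stages = [stage for stage in energy_step_data[cid] if len(stage) > 0]
    return float(stages[-1][-1])


# Toy data in the same layout, with values truncated from the string form above
toy = {0: [[-3.4439912, -3.55040935], [-4.0613393, -4.05445631]]}
print(energy_trace(toy, 0))
print(final_energy(toy, 0))
```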
          

          Structure step data

          Structure step data is saved in struc_step_data.pkl if you set struc_step_flag = True in the [option] section of cryspy.in. The pymatgen library is required to analyze this data file.

          Warning

          struc_step_flag = True is currently available only with VASP, QE, and soiap.

          Info

          struc_step_data includes initial structures. For example, struc_step_data[cid][0][0] is the initial structure of ID = cid.

          Data format

          • type: dict
            • key: structure ID
            • value: list of structure step data in each stage
          • string form
            • {0: [[Structure Summary …, Structure Summary, …], […], …],
              1: [[Structure Summary …, Structure Summary, …], […], …],
              …}
          • structure data format

          How to access

          import pickle
          with open('struc_step_data.pkl', 'rb') as f:
              struc_step_data = pickle.load(f)
          
          # struc_step_data[ID][stage][step]
          # struc_step_data[ID][0] <-- stage 1
          # struc_step_data[ID][1] <-- stage 2
          #
          #
          # in LAQA
          # struc_step_data[ID][selection][step]
          # struc_step_data[ID][0] <-- 1st selection
          # struc_step_data[ID][1] <-- 2nd selection
          
          # ---------- structure step data of ID 0, stage 1, step 0
          cid = 0      # ID
          stage = 1    # stage
          step = 0     # step index (start from 0)
          struc_step_data[cid][stage-1][step]    # to show initial structure of ID 0 at stage 1 in jupyter
          
          Structure Summary
          Lattice
              abc : 5.727301 5.727301 4.405757
           angles : 90.0 90.0 90.0
           volume : 144.5175386563631
                A : 5.727301 0.0 0.0
                B : 0.0 5.727301 0.0
                C : 0.0 0.0 4.405757
          PeriodicSite: Si (0.2506, 5.4767, 1.1014) [0.0438, 0.9562, 0.2500]
          PeriodicSite: Si (2.6130, 3.1143, 1.1014) [0.4562, 0.5438, 0.2500]
          PeriodicSite: Si (3.1143, 0.2506, 1.1014) [0.5438, 0.0438, 0.2500]
          PeriodicSite: Si (5.4767, 2.6130, 1.1014) [0.9562, 0.4562, 0.2500]
          PeriodicSite: Si (5.4767, 0.2506, 3.3043) [0.9562, 0.0438, 0.7500]
          PeriodicSite: Si (3.1143, 2.6130, 3.3043) [0.5438, 0.4562, 0.7500]
          PeriodicSite: Si (2.6130, 5.4767, 3.3043) [0.4562, 0.9562, 0.7500]
          PeriodicSite: Si (0.2506, 3.1143, 3.3043) [0.0438, 0.5438, 0.7500]
          

          Force step data

          Force step data is saved in force_step_data.pkl if you set force_step_flag = True in the [option] section of cryspy.in. The NumPy library is required to analyze this data file.

          Warning

          force_step_flag = True is currently available only with VASP, QE, and soiap.

          Data format

          • type: dict
            • key: structure ID
            • value: list of force step data in each stage
          • string form
            • {0: [array([[ 0.26314927, -0.26314927, -0. ], […], …[…]]), array([[…], …, […]]), …],
              1: [array([[ 0. , 0. , 0. ], […], …[…]]), array([[…], …, […]]), …],
              …}
          • unit of force
            • eV/Å

          How to access

          import pickle
          with open('force_step_data.pkl', 'rb') as f:
              force_step_data = pickle.load(f)
          
          # force_step_data[ID][stage][step][atom]
          # force_step_data[ID][0] <-- stage 1
          # force_step_data[ID][1] <-- stage 2
          #
          # in LAQA
          # force_step_data[ID][selection][step][atom]
          # force_step_data[ID][0] <-- 1st selection
          # force_step_data[ID][1] <-- 2nd selection
          
          # ---------- force step data of ID 0, stage 1
          cid = 0      # ID
          stage = 1    # stage
          force_step_data[cid][stage-1][:3]    # to show only 3 steps in jupyter 
          
          [array([[ 0.26314927, -0.26314927, -0.        ],
                  [-0.26314927,  0.26314927, -0.        ],
                  [ 0.26314927,  0.26314927,  0.        ],
                  [-0.26314927, -0.26314927, -0.        ],
                  [-0.26314927,  0.26314927, -0.        ],
                  [ 0.26314927, -0.26314927,  0.        ],
                  [-0.26314927, -0.26314927, -0.        ],
                  [ 0.26314927,  0.26314927,  0.        ]]),
           array([[-0.12103692,  0.12103692,  0.        ],
                  [ 0.12103692, -0.12103692, -0.        ],
                  [-0.12103692, -0.12103692, -0.        ],
                  [ 0.12103692,  0.12103692,  0.        ],
                  [ 0.12103692, -0.12103692, -0.        ],
                  [-0.12103692,  0.12103692,  0.        ],
                  [ 0.12103692,  0.12103692,  0.        ],
                  [-0.12103692, -0.12103692, -0.        ]]),
           array([[-0.29801618,  0.29801618,  0.        ],
                  [ 0.29801618, -0.29801618, -0.        ],
                  [-0.29801618, -0.29801618, -0.        ],
                  [ 0.29801618,  0.29801618,  0.        ],
                  [ 0.29801618, -0.29801618, -0.        ],
                  [-0.29801618,  0.29801618,  0.        ],
                  [ 0.29801618,  0.29801618,  0.        ],
                  [-0.29801618, -0.29801618, -0.        ]])]
          
          step = 0     # step index (start from 0)
          atom = 2     # atom index (start from 0)
          force_step_data[cid][stage-1][step][atom]
          
          array([0.26314927, 0.26314927, 0.        ])
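A common use of this data is to follow how the largest atomic force decays during a relaxation. The following is a minimal sketch, assuming the layout described above (`force_step_data[ID][stage][step]` is an array of per-atom force vectors in eV/Å). The helper name `max_force_per_step` and the small synthetic dictionary are hypothetical; with real data, replace the dictionary with the object loaded from `force_step_data.pkl`.

```python
import math

# Synthetic stand-in for the unpickled data so the sketch runs on its own.
# Real entries are NumPy arrays; they iterate row by row in the same way.
force_step_data = {
    0: [  # structure ID 0
        [  # stage 1
            [[0.3, -0.4, 0.0], [-0.3, 0.4, 0.0]],      # step 0
            [[0.06, -0.08, 0.0], [-0.06, 0.08, 0.0]],  # step 1
        ],
    ],
}


def max_force_per_step(steps):
    """Return the largest per-atom force norm (eV/Å) for each step."""
    return [
        max(math.sqrt(fx * fx + fy * fy + fz * fz) for fx, fy, fz in forces)
        for forces in steps
    ]


cid = 0      # ID
stage = 1    # stage
fmax = max_force_per_step(force_step_data[cid][stage - 1])
print(fmax)
```

A decreasing sequence suggests the relaxation is converging; note that the real data shown above is not monotonic between steps.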
          

          Stress step data

          Stress step data is saved in stress_step_data.pkl if you set stress_step_flag = True in the [option] section of cryspy.in. The NumPy library is required to analyze this data file.

          Warning

          stress_step_flag = True is currently available only with VASP, QE, and soiap.

          Data format

          • type: dict
            • key: structure ID
            • value: list of stress step data in each stage
          • string form
            • {0: [array([[-0.16770062, 0. , 0. ], […], […]]), array([[…], […], […]]), …],
              1: [array([[ 0.39260083, -0. , -0. ], […], […]]), array([[…], […], […]]), …],
              …}
          • unit of stress
            • eV/(Å**3)

          How to access

          import pickle
          with open('stress_step_data.pkl', 'rb') as f:
              stress_step_data = pickle.load(f)
          
          # stress_step_data[ID][stage][step]  <-- 3x3 stress tensor
          # stress_step_data[ID][0] <-- stage 1
          # stress_step_data[ID][1] <-- stage 2
          #
          # in LAQA
          # stress_step_data[ID][selection][step]  <-- 3x3 stress tensor
          # stress_step_data[ID][0] <-- 1st selection
          # stress_step_data[ID][1] <-- 2nd selection
          
          # ---------- stress step data of ID 0, stage 1
          cid = 0      # ID
          stage = 1    # stage
          stress_step_data[cid][stage-1][:3]    # to show only 3 steps in jupyter 
          
          [array([[-0.16770062,  0.        ,  0.        ],
                  [ 0.        , -0.16770062, -0.        ],
                  [ 0.        ,  0.        ,  0.21823358]]),
           array([[-0.16020785, -0.        , -0.        ],
                  [-0.        , -0.16020785,  0.        ],
                  [-0.        ,  0.        ,  0.18646321]]),
           array([[-0.13572003, -0.        ,  0.        ],
                  [-0.        , -0.13572003,  0.        ],
                  [-0.        ,  0.        ,  0.15953926]])]
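Since the stress is stored in eV/Å**3, it is often convenient to convert it to GPa (1 eV/Å**3 = 160.2176634 GPa). The sketch below converts one third of the trace of each 3x3 tensor; the function name `mean_normal_stress_gpa` is hypothetical, and the sign convention of the stored stress is not stated here, so the sign of the result should be interpreted with care.

```python
EV_PER_A3_TO_GPA = 160.2176634  # 1 eV/Å**3 expressed in GPa


def mean_normal_stress_gpa(stress):
    """One third of the trace of a 3x3 stress tensor, converted to GPa."""
    trace = stress[0][0] + stress[1][1] + stress[2][2]
    return trace / 3.0 * EV_PER_A3_TO_GPA


# The first two steps from the output above, written as nested lists.
# Real entries are NumPy arrays and can be indexed in the same way.
steps = [
    [[-0.16770062, 0.0, 0.0], [0.0, -0.16770062, 0.0], [0.0, 0.0, 0.21823358]],
    [[-0.16020785, 0.0, 0.0], [0.0, -0.16020785, 0.0], [0.0, 0.0, 0.18646321]],
]

mean_stress = [mean_normal_stress_gpa(s) for s in steps]
for i, value in enumerate(mean_stress):
    print(f"step {i}: {value:.3f} GPa")
```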
          

          Subsections of CrySPY Utility

          Subsections of Examples

          ase_chgnet_Sr4Co4O12

          Download

          ase_chgnet_RS_Sr4Co4O12.tar.gz

          cryspy.in

          [basic]
          algo = RS
          calc_code = ASE
          tot_struc = 10
          nstage = 1
          njob = 2
          jobcmd = bash
          jobfile = job_cryspy
          
          [structure]
          natot = 20
          atype = Sr Co O
          nat = 4  4  12
          mindist_1 = 2.2  2.0  1.8
          mindist_2 = 2.0  2.2  1.5
          mindist_3 = 1.8  1.5  2.0
          
          [ASE]
          ase_python = chgnet_in.py
          
          [option]
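Because cryspy.in is an INI-style file, simple consistency checks can be scripted before starting a run. The following sketch uses Python's standard `configparser` (this is not CrySPY's own input reader) and checks three properties that the examples in this section satisfy: `nat` has one entry per species in `atype`, `natot` equals the sum of `nat`, and the `mindist_*` rows form a symmetric matrix. The function name `check_structure_section` is hypothetical.

```python
import configparser

CRYSPY_IN = """
[basic]
algo = RS
calc_code = ASE
tot_struc = 10
nstage = 1
njob = 2
jobcmd = bash
jobfile = job_cryspy

[structure]
natot = 20
atype = Sr Co O
nat = 4  4  12
mindist_1 = 2.2  2.0  1.8
mindist_2 = 2.0  2.2  1.5
mindist_3 = 1.8  1.5  2.0

[ASE]
ase_python = chgnet_in.py

[option]
"""


def check_structure_section(text):
    """Return a list of problems found in the [structure] section."""
    cfg = configparser.ConfigParser()
    cfg.read_string(text)
    s = cfg["structure"]
    atype = s["atype"].split()
    nat = [int(x) for x in s["nat"].split()]
    problems = []
    if len(nat) != len(atype):
        problems.append("nat and atype have different lengths")
    if sum(nat) != int(s["natot"]):
        problems.append("natot is not the sum of nat")
    # mindist_i rows are optional; check them only when all are present.
    keys = [f"mindist_{i + 1}" for i in range(len(atype))]
    if all(k in s for k in keys):
        rows = [[float(x) for x in s[k].split()] for k in keys]
        n = len(atype)
        if any(len(r) != n for r in rows):
            problems.append("mindist rows must have one value per species")
        elif any(rows[i][j] != rows[j][i] for i in range(n) for j in range(i)):
            problems.append("mindist matrix is not symmetric")
    return problems


ok = check_structure_section(CRYSPY_IN)
bad = check_structure_section(CRYSPY_IN.replace("natot = 20", "natot = 19"))
print(ok)
print(bad)
```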
          

          calc_in/

          chgnet_in.py_1

          from chgnet.model import StructOptimizer
          from pymatgen.core import Structure
          
          
          # ---------- input structure
          # CrySPY outputs 'POSCAR' as an input file in work/xxxxxx directory
          structure = Structure.from_file('POSCAR')
          
          # ---------- relax
          relaxer = StructOptimizer()
          result = relaxer.relax(atoms=structure, fmax=0.01, steps=2000)
          
          # ---------- opt. structure and energy
          # [rule in ASE interface]
          # output file for energy: 'log.tote' in eV/cell
          #                         CrySPY reads the last line of 'log.tote'
          # output file for structure: 'CONTCAR' in vasp format
          # ------ energy
          traj = result['trajectory']
          e = traj.compute_energy()   # eV/cell
          with open('log.tote', mode='w') as f:
              f.write(str(e))
          # ------ struc
          opt_struc = result["final_structure"]
          opt_struc.to(fmt='poscar', filename='CONTCAR')
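The rule stated in the comments above (the energy is taken from the last line of 'log.tote') can be checked locally before submitting many jobs. The following is a small sketch of such a check, not CrySPY's actual reader; the function name `read_last_energy` and the two energies in the demo file are hypothetical. Unlike a strict last-line read, it ignores trailing blank lines.

```python
import os
import tempfile


def read_last_energy(path="log.tote"):
    """Return the energy on the last non-empty line of an energy log (eV/cell)."""
    with open(path) as f:
        lines = [line.strip() for line in f if line.strip()]
    if not lines:
        raise ValueError(f"{path} contains no energy")
    return float(lines[-1])


# Demo with a throwaway file holding two hypothetical energies.
with tempfile.TemporaryDirectory() as tmp:
    log = os.path.join(tmp, "log.tote")
    with open(log, "w") as f:
        f.write("-101.25\n-102.5\n")
    energy = read_last_energy(log)

print(energy)
```

If the file cannot be parsed as a float, the script that wrote it probably failed, which is the situation the error branch in job_cryspy is meant to catch.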
          

          job_cryspy

          #!/bin/sh
          
          # ---------- ASE
          python3 chgnet_in.py
          
          # ---------- for error
          if [ ! -f "log.tote" ]; then
              sed -i -e '3 s/^.*$/skip/' stat_job
              exit 1
          fi
          
          # ---------- CrySPY
          sed -i -e '3 s/^.*$/done/' stat_job
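The `sed` commands in this job file overwrite the third line of stat_job with `skip` or `done`. For readers less familiar with `sed`, the same operation can be sketched in Python as follows. The function name `set_job_status` is hypothetical, and the file contents in the demo are placeholders, not the real stat_job format. One deliberate difference: `sed` silently does nothing when the file has fewer than three lines, whereas this sketch raises an error.

```python
import os
import tempfile


def set_job_status(path, status):
    """Replace the third line of a status file with the given status."""
    with open(path) as f:
        lines = f.read().splitlines()
    if len(lines) < 3:
        raise ValueError(f"{path} has fewer than 3 lines")
    lines[2] = status
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")


# Demo on a throwaway file with placeholder contents.
with tempfile.TemporaryDirectory() as tmp:
    stat = os.path.join(tmp, "stat_job")
    with open(stat, "w") as f:
        f.write("placeholder line 1\nplaceholder line 2\nplaceholder line 3\n")
    set_job_status(stat, "done")
    with open(stat) as f:
        result = f.read().splitlines()

print(result)
```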
          

          ase_Cu8_RS

          Download

          ase_Cu8_RS.tar.gz

          cryspy.in

          [basic]
          algo = RS
          calc_code = ASE
          tot_struc = 5
          nstage = 1
          njob = 2
          jobcmd = zsh
          jobfile = job_cryspy
          
          [structure]
          natot = 8
          atype = Cu
          nat = 8
          
          [ASE]
          ase_python = ase_in.py
          
          [option]
          

          calc_in/

          ase_in.py_1

          from ase.constraints import ExpCellFilter, StrainFilter
          from ase.calculators.emt import EMT
          from ase.calculators.lj import LennardJones
          from ase.optimize.sciopt import SciPyFminCG
          from ase.optimize import BFGS
          from ase.spacegroup.symmetrize import FixSymmetry
          import numpy as np
          from ase.io import read, write
          
          # ---------- input structure
          # CrySPY outputs 'POSCAR' as an input file in work/xxxxxx directory
          atoms = read('POSCAR', format='vasp')
          
          # ---------- setting and run
          atoms.calc = EMT()
          atoms.set_constraint([FixSymmetry(atoms)])
          atoms = ExpCellFilter(atoms, hydrostatic_strain=False)
          opt = BFGS(atoms)
          #opt=SciPyFminCG(atoms)
          opt.run()
          
          # ---------- opt. structure and energy
          # [rule in ASE interface]
          # output file for energy: 'log.tote' in eV/cell
          #                         CrySPY reads the last line of 'log.tote'
          # output file for structure: 'CONTCAR' in vasp format
          e = atoms.atoms.get_total_energy()
          with open('log.tote', mode='w') as f:
              f.write(str(e))
          
          write('CONTCAR', atoms.atoms, format='vasp')
          

          job_cryspy

          #!/bin/sh
          
          # ---------- ASE
          python3 ase_in.py
          
          # ---------- CrySPY
          sed -i -e '3 s/^.*$/done/' stat_job
          

          soiap_Si8_RS

          Download

          soiap_Si8_RS.tar.gz

          cryspy.in

          [basic]
          algo = RS
          calc_code = soiap
          tot_struc = 5
          nstage = 1
          njob = 2
          jobcmd = zsh
          jobfile = job_cryspy
          
          [structure]
          natot = 8
          atype = Si
          nat = 8
          
          [soiap]
          soiap_infile = soiap.in
          soiap_outfile = soiap.out
          soiap_cif = initial.cif
          
          [option]
          

          calc_in/

          soiap.in_1

          crystal initial.cif ! CIF file for the initial structure
          symmetry 1 ! 0: not symmetrize displacements of the atoms or 1: symmetrize
          
          md_mode_cell 3 ! cell-relaxation method
                         ! 0: FIRE, 2: quenched MD, or 3: RFC5
          number_max_relax_cell 1000 ! max. number of the cell relaxation
          number_max_relax 1 ! max. number of the atom relaxation
          max_displacement 0.1 ! max. displacement of atoms in Bohr
          
          external_stress_v 0.0 0.0 0.0 ! external pressure in GPa
          
          th_force 5d-5 ! convergence threshold for the force in Hartree a.u.
          th_stress 5d-7 ! convergence threshold for the stress in Hartree a.u.
          
          force_field 1 ! force field
                        ! 1: Stillinger-Weber for Si, 2: Tsuneyuki potential for SiO2,
                        ! 3: ZRL for Si-O-N-H, 4: ADP for Nd-Fe-B, 5: Jmatgen, or
                        ! 6: Lennard-Jones
          

          job_cryspy

          #!/bin/sh
          
          # ---------- soiap
          EXEPATH=/path/to/soiap
          $EXEPATH/soiap soiap.in > soiap.out 2>&1
          
          # ---------- CrySPY
          sed -i -e '3 s/^.*$/done/' stat_job
          

          soiap_Si8_RS_mindist

          Download

          soiap_Si8_RS_mindist.tar.gz

          cryspy.in

          [basic]
          algo = RS
          calc_code = soiap
          tot_struc = 5
          nstage = 1
          njob = 2
          jobcmd = zsh
          jobfile = job_cryspy
          
          [structure]
          natot = 8
          atype = Si
          nat = 8
          mindist_1 = 2.0
          
          [soiap]
          soiap_infile = soiap.in
          soiap_outfile = soiap.out
          soiap_cif = initial.cif
          
          [option]
          

          calc_in/

          soiap.in_1

          crystal initial.cif ! CIF file for the initial structure
          symmetry 1 ! 0: not symmetrize displacements of the atoms or 1: symmetrize
          
          md_mode_cell 3 ! cell-relaxation method
                         ! 0: FIRE, 2: quenched MD, or 3: RFC5
          number_max_relax_cell 1000 ! max. number of the cell relaxation
          number_max_relax 1 ! max. number of the atom relaxation
          max_displacement 0.1 ! max. displacement of atoms in Bohr
          
          external_stress_v 0.0 0.0 0.0 ! external pressure in GPa
          
          th_force 5d-5 ! convergence threshold for the force in Hartree a.u.
          th_stress 5d-7 ! convergence threshold for the stress in Hartree a.u.
          
          force_field 1 ! force field
                        ! 1: Stillinger-Weber for Si, 2: Tsuneyuki potential for SiO2,
                        ! 3: ZRL for Si-O-N-H, 4: ADP for Nd-Fe-B, 5: Jmatgen, or
                        ! 6: Lennard-Jones
          

          job_cryspy

          #!/bin/sh
          
          # ---------- soiap
          EXEPATH=/path/to/soiap
          $EXEPATH/soiap soiap.in > soiap.out 2>&1
          
          # ---------- CrySPY
          sed -i -e '3 s/^.*$/done/' stat_job
          

          vasp_Na8Cl8_RS

          Download

          vasp_Na8Cl8_RS.tar.gz

          cryspy.in

          [basic]
          algo = RS
          calc_code = VASP
          tot_struc = 5
          nstage = 2
          njob = 2
          jobcmd = qsub
          jobfile = job_cryspy
          
          [structure]
          natot = 16
          atype = Na Cl
          nat = 8 8
          mindist_1 = 2.5 1.5
          mindist_2 = 1.5 2.5
          
          [VASP]
          kppvol = 40 80
          
          [option]
          

          calc_in/

          INCAR_1

          SYSTEM = NaCl
          !!!LREAL = Auto
          Algo = Fast
          NSW = 40
          
          LWAVE = .FALSE.
          !LCHARG = .FALSE.
          
          ISPIN =  1
          
          ISMEAR = 0
          SIGMA = 0.1
          
          IBRION = 2
          ISIF = 2
          
          EDIFF = 1e-5
          EDIFFG = -0.01
          

          INCAR_2

          SYSTEM = NaCl
          !!LREAL = Auto
          Algo = Fast
          NSW = 200
          
          ENCUT = 341
          
          !!LWAVE = .FALSE.
          !!LCHARG = .FALSE.
          
          
          ISPIN =  1
          
          ISMEAR = 0
          SIGMA = 0.1
          
          IBRION = 2
          ISIF = 3
          
          EDIFF = 1e-5
          EDIFFG = -0.01
          

          job_cryspy

          #!/bin/sh
          #$ -cwd
          #$ -V -S /bin/bash
          ####$ -V -S /bin/zsh
          #$ -N Na8Cl8_CrySPY_ID
          #$ -pe smp 20
          ####$ -q ibis1.q
          ####$ -q ibis2.q
          ####$ -q ibis3.q
          ####$ -q ibis4.q
          
          # ---------- vasp
          VASPROOT=/usr/local/vasp/vasp.6.4.2/bin
          mpirun -np $NSLOTS $VASPROOT/vasp_std
          
          # ---------- CrySPY
          sed -i -e '3 s/^.*$/done/' stat_job
          

          vasp_Sr4O4_RS_pv_term

          Download

          vasp_Sr4O4_RS_pv_term.tar.gz

          cryspy.in

          [basic]
          algo = RS
          calc_code = VASP
          tot_struc = 10
          nstage = 2
          njob = 4
          jobcmd = qsub
          jobfile = job_cryspy
          
          [structure]
          natot = 8
          atype = Sr O
          nat = 4 4
          
          [VASP]
          kppvol = 40 80
          
          [option]
          

          calc_in/

          INCAR_1

          SYSTEM = SrO
          Algo = Fast
          NSW = 60
          
          LWAVE = .FALSE.
          !LCHARG = .FALSE.
          
          ISPIN =  1
          
          ISMEAR = 0
          SIGMA = 0.1
          
          IBRION = 2
          ISIF = 2
          
          EDIFF = 1e-5
          EDIFFG = -0.01
          
          PSTRESS = 400
          

          INCAR_2

          SYSTEM = SrO
          LREAL = Auto
          Algo = Fast
          NSW = 200
          
          ENCUT = 520
          
          !!LWAVE = .FALSE.
          !!LCHARG = .FALSE.
          
          
          ISPIN =  1
          
          ISMEAR = 0
          SIGMA = 0.1
          
          IBRION = 2
          ISIF = 3
          
          EDIFF = 1e-5
          EDIFFG = -0.01
          
          PSTRESS = 400
          

          job_cryspy

          #!/bin/sh
          #$ -cwd
          #$ -V -S /bin/bash
          ####$ -V -S /bin/zsh
          #$ -N SrO_CrySPY_ID
          #$ -pe smp 20
          ####$ -q ibis1.q
          ####$ -q ibis2.q
          ####$ -q ibis3.q
          ####$ -q ibis4.q
          
          # ---------- vasp
          VASPROOT=/usr/local/vasp/vasp.6.4.2/bin
          mpirun -np $NSLOTS $VASPROOT/vasp_std
          
          # ---------- CrySPY
          sed -i -e '3 s/^.*$/done/' stat_job
          

          qe_Si8_RS

          Download

          qe_Si8_RS.tar.gz

          cryspy.in

          [basic]
          algo = RS
          calc_code = QE
          tot_struc = 5
          nstage = 2
          njob = 2
          jobcmd = qsub
          jobfile = job_cryspy
          
          [structure]
          natot = 8
          atype = Si
          nat = 8
          
          [QE]
          qe_infile = pwscf.in
          qe_outfile = pwscf.out
          kppvol =  40  80
          
          [option]
          

          calc_in/

          pwscf.in_1

           &control
              title = 'Si8'
              calculation = 'relax'
              nstep = 100
              restart_mode = 'from_scratch',
              pseudo_dir = '/usr/local/pslibrary.1.0.0/pbe/PSEUDOPOTENTIALS/'
              outdir='./out.d/'
           /
          
           &system
              ibrav = 0
              nat = 8
              ntyp = 1
              ecutwfc = 44.0
              occupations = 'smearing'
              degauss = 0.01
           /
          
           &electrons
           /
          
           &ions
           /
          
           &cell
           /
          
          ATOMIC_SPECIES
            Si  28.086  Si.pbe-n-kjpaw_psl.1.0.0.UPF
          

          pwscf.in_2

           &control
              title = 'Si8'
              calculation = 'vc-relax'
              nstep = 200
              restart_mode = 'from_scratch',
              pseudo_dir = '/usr/local/pslibrary.1.0.0/pbe/PSEUDOPOTENTIALS/'
              outdir='./out.d/'
           /
          
           &system
              ibrav = 0
              nat = 8
              ntyp = 1
              ecutwfc = 44.0
              occupations = 'smearing'
              degauss = 0.01
           /
          
           &electrons
           /
          
           &ions
           /
          
           &cell
           /
          
          ATOMIC_SPECIES
            Si  28.086  Si.pbe-n-kjpaw_psl.1.0.0.UPF
          

          job_cryspy

          #!/bin/sh
          #$ -cwd
          #$ -V -S /bin/bash
          ####$ -V -S /bin/zsh
          #$ -N Si8_CrySPY_ID
          #$ -pe smp 20
          ####$ -q ibis1.q
          ####$ -q ibis2.q
          
          mpirun -np $NSLOTS /path/to/pw.x < pwscf.in > pwscf.out
          
          
          if [ -e "CRASH" ]; then
              sed -i -e '3 s/^.*$/skip/' stat_job
              exit 1
          fi
          
          sed -i -e '3 s/^.*$/done/' stat_job
          

          qe_benzene_2_RS_mol

          Download

          qe_benzene_2_RS_mol.tar.gz

          cryspy.in

          [basic]
          algo = RS
          calc_code = QE
          tot_struc = 6
          nstage = 2
          njob = 2
          jobcmd = qsub
          jobfile = job_cryspy
          
          [structure]
          struc_mode = mol
          natot = 24
          atype = H C
          nat = 12 12
          mol_file = benzene
          nmol = 2
          
          [QE]
          qe_infile = pwscf.in
          qe_outfile = pwscf.out
          kppvol = 40  60
          
          [option]
          

          calc_in/

          pwscf.in_1

           &control
              title = '2 benzene'
              calculation = 'relax'
              nstep = 30
              restart_mode = 'from_scratch',
              pseudo_dir = '/usr/local/pslibrary.1.0.0/pbe/PSEUDOPOTENTIALS/'
              outdir='./outdir/'
           /
          
           &system
              ibrav = 0
              nat = 24
              ntyp = 2
              ecutwfc = 35.00
              ecutrho = 300.00
              occupations = 'smearing'
              degauss = 0.01
           /
          
           &electrons
           /
          
           &ions
           /
          
           &cell
           /
          
          ATOMIC_SPECIES
             H  1.008  H.pbe-kjpaw_psl.1.0.0.UPF
             C  12.01  C.pbe-n-kjpaw_psl.1.0.0.UPF
          

          pwscf.in_2

           &control
              title = '2 benzene'
              calculation = 'vc-relax'
              nstep = 200
              restart_mode = 'from_scratch',
              pseudo_dir = '/usr/local/pslibrary.1.0.0/pbe/PSEUDOPOTENTIALS/'
              outdir='./outdir/'
           /
          
           &system
              ibrav = 0
              nat = 24
              ntyp = 2
              ecutwfc = 46.00
              ecutrho = 326.00
              occupations = 'smearing'
              degauss = 0.01
           /
          
           &electrons
           /
          
           &ions
           /
          
           &cell
           /
          
          ATOMIC_SPECIES
             H  1.008  H.pbe-kjpaw_psl.1.0.0.UPF
             C  12.01  C.pbe-n-kjpaw_psl.1.0.0.UPF
          

          job_cryspy

          #!/bin/sh
          #$ -cwd
          #$ -V -S /bin/bash
          ####$ -V -S /bin/zsh
          #$ -N bz_CrySPY_ID
          #$ -pe smp 20
          
          # ---------- qe run
          mpirun -np $NSLOTS /path/to/pw.x  < pwscf.in > pwscf.out
          
          # ---------- qe if crash
          if [ -e "CRASH" ]; then
              sed -i -e '3 s/^.*$/skip/' stat_job
              exit 1
          fi
          
          # ---------- cryspy
          sed -i -e '3 s/^.*$/done/' stat_job
          

          qe_Sr4O4_RS_pv_term

          Download

          qe_Sr4O4_RS_pv_term.tar.gz

          cryspy.in

          [basic]
          algo = RS
          calc_code = QE
          tot_struc = 10
          nstage = 2
          njob = 4
          jobcmd = qsub
          jobfile = job_cryspy
          
          [structure]
          natot = 8
          atype = Sr O
          nat = 4 4
          
          [QE]
          qe_infile = pwscf.in
          qe_outfile = pwscf.out
          kppvol =  40  80
          pv_term = True
          
          [option]
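To get a feeling for the size of the PV contribution at the pressure used in this example, the following sketch evaluates P*V in eV. It assumes that pv_term = True means the PV term is included in the energy used for ranking (that is, the enthalpy H = E + PV); this reading is an assumption, not a statement of CrySPY's implementation. Quantum ESPRESSO's `press` is given in kbar, 1 GPa = 10 kbar, and 1 eV/Å**3 = 160.2176634 GPa. The cell volume of 100 Å**3 and the function name `pv_term_ev` are hypothetical.

```python
EV_PER_A3_TO_GPA = 160.2176634  # 1 eV/Å**3 expressed in GPa


def pv_term_ev(pressure_kbar, volume_a3):
    """Return P*V in eV for a pressure in kbar and a cell volume in Å**3."""
    pressure_gpa = pressure_kbar / 10.0           # kbar -> GPa
    return pressure_gpa * volume_a3 / EV_PER_A3_TO_GPA


# press = 400 (kbar) as in this example; the volume is a made-up value.
pv = pv_term_ev(400.0, 100.0)
print(f"PV = {pv:.3f} eV/cell")
```

At 400 kbar the PV term is tens of eV per cell for a volume of this size, so energies with and without it are not directly comparable.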
          

          calc_in/

          pwscf.in_1

           &control
              title = 'SrO'
              calculation = 'relax'
              nstep = 100
              restart_mode = 'from_scratch',
              pseudo_dir = '/usr/local/gbrv/all_pbe_UPF_v1.5/'
              outdir='./out.d/'
           /
          
           &system
              ibrav = 0
              nat = 8
              ntyp = 2
              ecutwfc = 40
              ecutrho = 200
              occupations = 'smearing'
              degauss = 0.01
           /
          
           &electrons
           /
          
           &ions
           /
          
           &cell
              press = 400
           /
          
          ATOMIC_SPECIES
            Sr  -1.0  sr_pbe_v1.uspp.F.UPF
             O  -1.0  o_pbe_v1.2.uspp.F.UPF
          

          pwscf.in_2

           &control
              title = 'SrO'
              calculation = 'vc-relax'
              nstep = 200
              restart_mode = 'from_scratch',
              pseudo_dir = '/usr/local/gbrv/all_pbe_UPF_v1.5/'
              outdir='./out.d/'
           /
          
           &system
              ibrav = 0
              nat = 8
              ntyp = 2
              ecutwfc = 40
              ecutrho = 200
              occupations = 'smearing'
              degauss = 0.01
           /
          
           &electrons
           /
          
           &ions
           /
          
           &cell
              press = 400
           /
          
          ATOMIC_SPECIES
            Sr  -1.0  sr_pbe_v1.uspp.F.UPF
             O  -1.0  o_pbe_v1.2.uspp.F.UPF
          

          job_cryspy

          #!/bin/sh
          #$ -cwd
          #$ -V -S /bin/bash
          ####$ -V -S /bin/zsh
          #$ -N SrO_CrySPY_ID
          #$ -pe smp 20
          ####$ -q ibis1.q
          ####$ -q ibis2.q
          
          # ---------- qe run
          mpirun -np $NSLOTS /path/to/pw.x  < pwscf.in > pwscf.out
          
          # ---------- qe if crash
          if [ -e "CRASH" ]; then
              sed -i -e '3 s/^.*$/skip/' stat_job
              exit 1
          fi
          
          # ---------- cryspy
          sed -i -e '3 s/^.*$/done/' stat_job
          

          qe_Si16_LAQA

          Download

          qe_Si16_LAQA.tar.gz

          cryspy.in

          [basic]
          algo = LAQA
          calc_code = QE
          tot_struc = 50
          nstage = 1
          njob = 10
          jobcmd = qsub
          jobfile = job_cryspy
          
          [structure]
          natot = 16
          atype = Si
          nat = 16
          mindist_1 = 1.5
          
          [QE]
          qe_infile = pwscf.in
          qe_outfile = pwscf.out
          kppvol =  80
          
          [LAQA]
          nselect_laqa = 4
          
          [option]
          

          calc_in/

          pwscf.in_1

          &control
              calculation = 'vc-relax'
              pseudo_dir = '/usr/local/gbrv/all_pbe_UPF_v1.5/'
              outdir='./outdir/'
              nstep = 10
          /
          
          &system
              ibrav = 0
              nat = 16
              ntyp = 1
              ecutwfc = 40
              ecutrho = 200
              occupations = 'smearing'
              degauss = 0.01
          /
          
          &electrons
          /
          
          &ions
          /
          
          &cell
          /
          
          ATOMIC_SPECIES
            Si -1.0 si_pbe_v1.uspp.F.UPF
          

          job_cryspy

          #!/bin/sh
          #$ -cwd
          #$ -V -S /bin/bash
          ####$ -V -S /bin/zsh
          #$ -N Si_CrySPY_ID
          #$ -pe smp 20
          ####$ -q ibis1.q
          ####$ -q ibis2.q
          
          # ---------- qe run
          mpirun -np $NSLOTS /path/to/pw.x  < pwscf.in > pwscf.out
          
          # ---------- qe if crash
          if [ -e "CRASH" ]; then
              sed -i -e '3 s/^.*$/skip/' stat_job
              exit 1
          fi
          
          # ---------- cryspy
          sed -i -e '3 s/^.*$/done/' stat_job
          

          Subsections of Scripts

          extract_struc.py

          2024 April 16 update

          Script to extract structures from init_struc_data.pkl or opt_struc_data.pkl. This script can print structure information and output cif files.

          Specify structure ID(s) with the -i option. The -t option extracts the top k structures (the k most stable ones), and the -a option outputs all structures (note that this can produce many cif files). The -s option generates symmetrized cif files, and --tolerance sets the tolerance used for symmetrization. The -p option prints structure information instead of writing cif files. Gzipped input files (e.g., opt_struc_data.pkl.gz) can also be read.

          Update History

          • 2024 April 16: --tolerance option, gzip
          • 2023 July 21: --print option

          Usage

          python3 extract_struc.py -h
          

          or if you put the script in your PATH, you can omit python3

          extract_struc.py -h
          
          usage: extract_struc.py [-h] [-p] [-a] [-i [INDEX ...]] [-t TOP] [-r] [-s] [--tolerance TOLERANCE] infile
          
          positional arguments:
            infile                input file
          
          options:
            -h, --help            show this help message and exit
            -p, --print           just print, e.g., extract_struc.py opt_struc_data.pkl -i 7 10 12 -ps
            -a, --all_id          all structures, e.g., extract_struc.py opt_struc_data.pkl -as
            -i [INDEX ...], --index [INDEX ...]
                                  structure ID, e.g., extract_struc.py opt_struc_data.pkl -i 7 10 12 -s
            -t TOP, --top TOP     top k structures, e.g. (k = 3), extract_struc.py opt_struc_data.pkl -t 3 -s
            -r, --rank            add rank in file names, e.g., extract_struc.py opt_struc_data.pkl -t 3 -rs
            -s, --symmetrized     symmetrized structure, e.g., extract_struc.py opt_struc_data.pkl -i 7 10 12 -s
            --tolerance TOLERANCE
                                  tolerance for symmetrization (default 0.01), e.g., extract_struc.py opt_struc_data.pkl -i 0 1 -s --tolerance 0.01
          

          Examples

          Print

          The -p option can be used in combination with any option except for -s option.

          extract_struc.py -p opt_struc_data.pkl -i 0 1
          
          ID 0
          Full Formula (Na8 Cl8)
          Reduced Formula: NaCl
          abc   :   6.823618   6.823618   7.566454
          angles:  90.000000  90.000000  96.650518
          pbc   :       True       True       True
          Sites (16)
            #  SP           a         b         c
          ---  ----  --------  --------  --------
            0  Na    0         0         1
            1  Na    0         0         0.5
            2  Na    0.704707  0.295293  0.75
            3  Na    0.295293  0.704707  0.25
            4  Na    0.5       0         1
            5  Na    0.5       0         0.5
            6  Na    0         0.5       0.5
            7  Na    0         0.5       0
            8  Cl    0.5       0.5       0
            9  Cl    0.5       0.5       0.5
           10  Cl    0.484753  0.515247  0.75
           11  Cl    0.515247  0.484753  0.25
           12  Cl    0.828247  0.171753  0.851096
           13  Cl    0.171753  0.828247  0.351096
           14  Cl    0.828247  0.171753  0.648904
           15  Cl    0.171753  0.828247  0.148904
          
          ID 1
          Full Formula (Na8 Cl8)
          Reduced Formula: NaCl
          abc   :   8.145021   8.145021   4.324235
          angles:  90.000000  90.000000 120.000000
          pbc   :       True       True       True
          Sites (16)
            #  SP            a          b         c
          ---  ----  ---------  ---------  --------
            0  Na     0.666667   0.333333  0.736206
            1  Na     0.666667   0.333333  0.263794
            2  Na     0.913147   0.086853  0.5
            3  Na     0.913147   0.826295  0.5
            4  Na     0.173705   0.086853  0.5
            5  Na     0.77711    0.22289   0
            6  Na     0.77711    0.55422   0
            7  Na     0.44578    0.22289   0
            8  Cl     0.027675   0.423376  0.5
            9  Cl    -0.423376  -0.395701  0.5
           10  Cl     0.395701  -0.027675  0.5
           11  Cl    -0.423376  -0.027675  0.5
           12  Cl     0.395701   0.423376  0.5
           13  Cl     0.027675  -0.395701  0.5
           14  Cl     0.333333   0.666667  0.5
           15  Cl     0          0         0
          

          Structure ID

          extract_struc.py opt_struc_data.pkl -i 7 10 12
          

          7.cif, 10.cif, and 12.cif are output.

          For symmetrized cif,

          extract_struc.py opt_struc_data.pkl -i 7 10 12 -s
          

          2024 April 16
          With the tolerance parameter (default 0.01)

          extract_struc.py opt_struc_data.pkl -i 7 10 12 -s --tolerance 0.01
          

          Top k structures

          Info

          rslt_data.pkl is required in the same directory as the input.

          Let us suppose

          • ./data/pkl_data/opt_struc_data.pkl
          • ./data/pkl_data/rslt_data.pkl

          and that the cryspy_rslt_energy_asc file is as follows:

              Spg_num     Spg_sym  Spg_num_opt Spg_sym_opt    E_eV_atom  Magmom      Opt
          9       110      I4_1cd          110      I4_1cd -1284.708037     NaN  not_yet
          16        4        P2_1            4        P2_1 -1284.693651     NaN     done
          97       92    P4_12_12           91      P4_122 -1284.692494     NaN     done
          8        57        Pbcm           57        Pbcm -1284.668504     NaN     done
          81       19  P2_12_12_1           19  P2_12_12_1 -1284.635684     NaN     done
          ...
          

          Top k(=3) structures can be extracted with:

          extract_struc.py ./data/pkl_data/opt_struc_data.pkl -t 3
          

          In this example, rslt_data.pkl must be in ./data/pkl_data/. 9.cif, 16.cif, and 97.cif are output.

          The rank can be included in cif file names with -r option:

          extract_struc.py ./data/pkl_data/opt_struc_data.pkl -t 3 -r
          

          1_9.cif, 2_16.cif, and 3_97.cif are output.

          For symmetrized cif:

          extract_struc.py ./data/pkl_data/opt_struc_data.pkl -t 3 -rs
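The selection and file naming described in this subsection can be sketched in a few lines. This is an illustration of the rule (lowest energy first, optional rank prefix), not the script's actual implementation; the function name `top_k_names` is hypothetical, and the energies are copied from the cryspy_rslt_energy_asc listing above.

```python
# Structure ID -> energy (eV/atom), taken from the listing above.
energies = {
    9: -1284.708037,
    16: -1284.693651,
    97: -1284.692494,
    8: -1284.668504,
    81: -1284.635684,
}


def top_k_names(energy_by_id, k, with_rank=False):
    """Return cif file names for the k lowest-energy structures."""
    ranked = sorted(energy_by_id, key=energy_by_id.get)[:k]
    return [
        f"{rank}_{cid}.cif" if with_rank else f"{cid}.cif"
        for rank, cid in enumerate(ranked, start=1)
    ]


plain = top_k_names(energies, 3)
with_rank = top_k_names(energies, 3, with_rank=True)
print(plain)
print(with_rank)
```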
          

          All the structures

          Because one cif file is written per structure, create a dedicated directory first.

          mkdir init_cifs
          cd init_cifs
          extract_struc.py /path/to/init_struc_data.pkl -a
          

          For symmetrized cif,

          extract_struc.py /path/to/init_struc_data.pkl -as
          

          Gzipped files

          2024 April 16
          Gzipped files (end with .gz) can be read:

          extract_struc.py opt_struc_data.pkl.gz -i 0 1 -s
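If you load the data in your own analysis scripts, both plain and gzipped pickle files can be handled with one small helper. The following is a sketch using only the standard library; the function name `load_pkl` is hypothetical, and the demo writes a tiny placeholder dictionary rather than real structure data.

```python
import gzip
import os
import pickle
import tempfile


def load_pkl(path):
    """Load a pickle file, transparently handling a .gz suffix."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rb") as f:
        return pickle.load(f)


# Demo: write the same placeholder data both ways and read it back.
data = {0: "structure 0", 1: "structure 1"}
with tempfile.TemporaryDirectory() as tmp:
    plain_path = os.path.join(tmp, "demo.pkl")
    gz_path = os.path.join(tmp, "demo.pkl.gz")
    with open(plain_path, "wb") as f:
        pickle.dump(data, f)
    with gzip.open(gz_path, "wb") as f:
        pickle.dump(data, f)
    from_plain = load_pkl(plain_path)
    from_gz = load_pkl(gz_path)

print(from_plain == from_gz)
```

Loading real CrySPY structure data this way additionally requires the libraries whose objects are stored in the pickle to be installed.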
          

          pos2pkl.py

          2023 July 23 update

          Script to convert structure data into init_struc_data.pkl. The default input format is init_POSCARS. Single-structure files such as POSCAR and cif files can optionally be converted as well. The output is init_struc_data.pkl. Structures can also be appended to an existing init_struc_data.pkl. Structure IDs in the input are ignored and newly assigned. If the number of atoms differs between structures, an error is raised.
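The ID handling described above can be pictured as follows. This is a sketch of the idea only, not the script's implementation: it assumes the data is a dictionary keyed by integer structure ID (as in the other pkl files in this document) and that new IDs continue after the largest existing one. The function name `append_structures` is hypothetical, and strings stand in for structure objects.

```python
def append_structures(existing, new_structures):
    """Return a copy of existing with new structures under fresh IDs."""
    merged = dict(existing)
    next_id = max(merged, default=-1) + 1   # start at 0 for an empty dict
    for struc in new_structures:
        merged[next_id] = struc             # IDs of the inputs are ignored
        next_id += 1
    return merged


# Placeholder strings stand in for structure objects.
existing = {0: "struc A", 1: "struc B"}
merged = append_structures(existing, ["struc C", "struc D"])
print(merged)
```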

          init_struc_data.pkl can be loaded at the start of the simulation in CrySPY.

          You can remove and sort species with the -f option. Note that without this option, pymatgen sorts the species in order of electronegativity!

          Usage

          usage: pos2pkl.py [-h] [-s [SINGLE ...]] [-f [FILTER ...]] [-p] [infile ...]
          
          positional arguments:
            infile                input file: init_POSCARS
          
          options:
            -h, --help            show this help message and exit
            -s [SINGLE ...], --single [SINGLE ...]
                                  input file: single structure file (POSCAR, cif)
            -f [FILTER ...], --filter [FILTER ...]
                                  filter (sort): remove species and sort
            -p, --permit_diff_comp
                                  flag for permitting different composition
          

          Examples

          init_POSCARS -> init_struc_data.pkl

          This converts an init_POSCARS file generated by CrySPY into init_struc_data.pkl, for example on another machine such as a supercomputer. Multiple input files can be converted at once.

          python3 pos2pkl.py init_POSCARS
          

          If pos2pkl.py is in your PATH, you can omit python3.

          pos2pkl.py init_POSCARS
          
          Composition: Na8 Cl8
          
          Converted. The number of structures: 4
          Save init_struc_data.pkl
          

          Multiple inputs:

          python3 pos2pkl.py init_POSCARS init_POSCARS2 init_POSCARS3
          
          Composition: Na8 Cl8
          
          Converted. The number of structures: 12
          Save init_struc_data.pkl
          

          If init_struc_data.pkl already exists in the current directory and you want to append to it:

          python3 pos2pkl.py init_POSCARS
          
          init_struc_data.pkl already exists.
          Append to init_struc_data.pkl? [y/n]: y
          
          Load init_struc_data
          Composition: Na8 Cl8
          The number of structures: 12
          
          Converted. The number of structures: 16
          Save init_struc_data.pkl
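
The bookkeeping in this append example (12 existing structures plus 4 new ones, with IDs assigned anew) can be pictured as follows. This is a sketch under stated assumptions, not the code of pos2pkl.py: structures are represented by plain lists of element symbols, IDs are assumed to run consecutively from 0, and append_structures is a hypothetical name.

```python
def append_structures(existing, new_structures):
    """Append structures to an ID-keyed dict, assigning fresh consecutive IDs."""
    merged = dict(existing)
    # Reference atom count taken from an existing structure, if there is one.
    natoms = len(next(iter(merged.values()))) if merged else None
    for struc in new_structures:
        if natoms is None:
            natoms = len(struc)
        elif len(struc) != natoms:
            raise ValueError(f"number of atoms differs: {len(struc)} != {natoms}")
        # Any ID carried by the input is ignored; the next free integer is used.
        merged[len(merged)] = struc
    return merged


nacl = ["Na"] * 8 + ["Cl"] * 8  # toy stand-in for a Na8 Cl8 structure
existing = {i: list(nacl) for i in range(12)}
merged = append_structures(existing, [list(nacl) for _ in range(4)])
print(len(merged))
```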
          

          POSCAR or cif -> init_struc_data.pkl

          Single-structure files such as POSCAR and cif can also be converted. The -s/--single option is required.

          python3 pos2pkl.py -s POSCAR test.cif
          
          Composition: Na8 Cl8
          
          Converted. The number of structures: 2
          Save init_struc_data.pkl
          

          init_POSCARS, POSCAR -> init_struc_data.pkl

          python3 pos2pkl.py init_POSCARS -s POSCAR
          
          Composition: Na8 Cl8
          
          Converted. The number of structures: 5
          Save init_struc_data.pkl
          
          Warning

          The following is wrong. Because -s accepts any number of file names, init_POSCARS is also treated as a single-structure file.

          python3 pos2pkl.py -s POSCAR init_POSCARS
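
The reason can be reproduced with a small argparse sketch. The two argument declarations below are inferred from the usage line shown earlier and are an assumption; the real script may declare them differently.

```python
import argparse

# Declarations inferred from the usage line; the real script may differ.
parser = argparse.ArgumentParser(prog="pos2pkl.py")
parser.add_argument("infile", nargs="*", help="input file: init_POSCARS")
parser.add_argument("-s", "--single", nargs="*",
                    help="input file: single structure file (POSCAR, cif)")

# Correct order: the positional file comes before -s.
ok = parser.parse_args(["init_POSCARS", "-s", "POSCAR"])
print(ok.infile, ok.single)

# Wrong order: -s greedily takes every file name that follows it,
# so nothing is left for the positional infile.
wrong = parser.parse_args(["-s", "POSCAR", "init_POSCARS"])
print(wrong.single)
```

Putting the positional input files before -s avoids the problem.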
          

          Filter (remove and sort)

          Here we consider a cif file with the composition Sr8 Co8 O20 X4, which includes 4 dummy atoms (X4). The -f/--filter option can be used to remove and sort species. Specify the species in the same way as atype in cryspy.in.

          python3 pos2pkl.py -s Sr8Co8O20X4.cif -f Sr Co O
          
          Removed species: {'X0+'}
          Composition: Sr8 Co8 O20
          
          Converted. The number of structures: 1
          Save init_struc_data.pkl
          

          With extract_struc.py you can check how the structure was registered in init_struc_data.pkl.

          python3 extract_struc.py init_struc_data.pkl -pa
          
          ID 0
          Full Formula (Sr8 Co8 O20)
          Reduced Formula: Sr2Co2O5
          ...
          

          The -f option also lets you reorder species. Species that are not listed, here Sr and the dummy atoms, are removed.

          python3 pos2pkl.py -s Sr8Co8O20X4.cif -f O Co 
          
          Removed species: {'Sr', 'X0+'}
          Composition: O20 Co8
          
          Converted. The number of structures: 1
          Save init_struc_data.pkl
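
The effect of the filter in the two runs above can be mimicked with a plain dictionary. This is a toy sketch and not the pymatgen-based code in pos2pkl.py; filter_species and formula are hypothetical helpers, and the key "X0+" simply copies the label printed in the log.

```python
def filter_species(composition, keep):
    """Drop species not in `keep` and order the rest as listed in `keep`."""
    removed = set(composition) - set(keep)
    filtered = {sp: composition[sp] for sp in keep if sp in composition}
    return filtered, removed


def formula(composition):
    """Format a composition dict in the 'Sr8 Co8 O20' style."""
    return " ".join(f"{sp}{n}" for sp, n in composition.items())


# Toy composition of the example cif file, including 4 dummy atoms.
sr8co8o20x4 = {"Sr": 8, "Co": 8, "O": 20, "X0+": 4}

kept, removed = filter_species(sr8co8o20x4, ["Sr", "Co", "O"])
print(formula(kept), removed)

kept2, removed2 = filter_species(sr8co8o20x4, ["O", "Co"])
print(formula(kept2), removed2)
```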
          

          kpt_check.py

          kpt_check.py checks the k-point mesh obtained for a given kppvol. The script supports POSCAR, CONTCAR, and init_struc_data.pkl. The pymatgen library is required.

          After generating initial structures, you can use it to find a suitable value of kppvol.

          Usage

          python3 kpt_check.py -h
          

          or, if the script is in your PATH, you can omit python3:

          kpt_check.py -h
          
          usage: kpt_check.py [-h] [-w] [-n NSTRUC] infile kppvol
          
          positional arguments:
            infile                input file: POSCAR, CONTCAR, or init_struc_data.pkl
            kppvol                kppvol
          
          options:
            -h, --help            show this help message and exit
            -w, --write           write KPOINTS
            -n NSTRUC, --nstruc NSTRUC
                                  number of structure to check
          

          Example

          POSCAR with a given kppvol

          kpt_check.py POSCAR 100
          
          a = 10.689217
          b = 10.689217
          c = 10.730846
              Lattice vector
          10.689217 0.000000 0.000000
          0.000000 10.689217 0.000000
          0.000000 0.000000 10.730846
          
          kppvol:  100
          k-points:  [2, 2, 2]
          

          Write KPOINTS file

          You can generate a KPOINTS file using the -w option.

          kpt_check.py -w POSCAR 100
          
          $ cat KPOINTS
          pymatgen 4.7.6+ generated KPOINTS with grid density = 607 / atom
          0
          Monkhorst
          2 2 2
          

          Check k-point meshes for init_struc_data.pkl

          When checking k-point meshes for init_struc_data.pkl, the first five structures are checked by default. You can change the number of structures with the -n option.

          kpt_check.py -n 3 init_struc_data.pkl 100
          
          # ---------- 0th structure
          a = 8.0343076893
          b = 8.03430768936
          c = 9.1723323373
              Lattice vector
          8.034308 0.000000 0.000000
          -4.017154 6.957915 0.000000
          0.000000 0.000000 9.172332
          
          kppvol:  100
          k-points:  [3, 3, 3]
          
          
          # ---------- 1th structure
          a = 9.8451944096
          b = 9.84519440959
          c = 6.8764313585
              Lattice vector
          9.845194 0.000000 0.000000
          -4.922597 8.526188 0.000000
          0.000000 0.000000 6.876431
          
          kppvol:  100
          k-points:  [3, 3, 4]
          
          
          # ---------- 2th structure
          a = 7.5760383679
          b = 7.57603836797
          c = 6.6507478296
              Lattice vector
          7.576038 0.000000 0.000000
          -3.788019 6.561042 0.000000
          0.000000 0.000000 6.650748
          
          kppvol:  100
          k-points:  [4, 4, 4]
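
The meshes printed in these examples are consistent with the volume-based density rule of pymatgen (Kpoints.automatic_density_by_vol). The following standard-library sketch reproduces the numbers. It assumes that kpt_check.py delegates to that pymatgen routine, which the KPOINTS header above suggests but this page does not state; kmesh_from_kppvol is a hypothetical name.

```python
import math


def kmesh_from_kppvol(lattice, kppvol):
    """Estimate the k-point mesh for a 3x3 lattice (rows are vectors, in angstrom)."""
    a1, a2, a3 = lattice
    # Cell volume from the scalar triple product a1 . (a2 x a3).
    cross = (
        a2[1] * a3[2] - a2[2] * a3[1],
        a2[2] * a3[0] - a2[0] * a3[2],
        a2[0] * a3[1] - a2[1] * a3[0],
    )
    volume = abs(sum(x * y for x, y in zip(a1, cross)))
    # Reciprocal-cell volume, including the factor (2*pi)^3.
    recip_volume = (2.0 * math.pi) ** 3 / volume
    # Number of k-points requested for this cell.
    ngrid = kppvol * recip_volume
    lengths = [math.sqrt(sum(x * x for x in v)) for v in lattice]
    mult = (ngrid * lengths[0] * lengths[1] * lengths[2]) ** (1.0 / 3.0)
    # Divisions are inversely proportional to the lattice lengths, at least 1.
    return [int(math.floor(max(mult / length, 1))) for length in lengths]


poscar_lattice = [
    [10.689217, 0.0, 0.0],
    [0.0, 10.689217, 0.0],
    [0.0, 0.0, 10.730846],
]
print(kmesh_from_kppvol(poscar_lattice, 100))
```

Because the number of divisions is rounded down, a larger cell gives a coarser mesh for the same kppvol, as in the POSCAR example compared with the smaller hexagonal cells.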
          

          repeat_cryspy

          Running cryspy by hand over and over again can be tedious. The following script automates this: it runs cryspy once every 5 minutes and exits when the search is finished.

          #!/bin/bash
          
          # Exit immediately if a command fails.
          set -e
          
          while :
          do
              cryspy -n
              # The last line of log_cryspy tells us what to do next.
              LOG_LASTLINE=$(tail -n 1 log_cryspy)
              if  [ "$LOG_LASTLINE" = "Done all structures!" ]
              then
                  exit 0
              # ---------- for EA
              elif [ "${LOG_LASTLINE:0:17}" = "Reached maxgen_ea" ]
              then
                  exit 0
              elif [ "$LOG_LASTLINE" = "EA is ready" ]
              then
                  cryspy -n    # EA
                  LOG_LASTLINE=$(tail -n 1 log_cryspy)
                  if [ "${LOG_LASTLINE:0:17}" = "Reached maxgen_ea" ]
                  then
                      exit 0
                  fi
                  cryspy -n    # submit jobs
              # ---------- for BO
              elif [ "${LOG_LASTLINE:0:21}" = "Reached max_select_bo" ]
              then
                  exit 0
              elif [ "$LOG_LASTLINE" = "BO is ready" ]
              then
                  cryspy -n    # selection
                  LOG_LASTLINE=$(tail -n 1 log_cryspy)
                  if [ "${LOG_LASTLINE:0:21}" = "Reached max_select_bo" ]
                  then
                      exit 0
                  fi
                  cryspy -n    # submit jobs
              # ---------- for LAQA
              elif [ "$LOG_LASTLINE" = "LAQA is ready" ]
              then
                  cryspy -n    # selection
                  cryspy -n    # submit jobs
              fi
              # Wait 5 minutes before the next run.
              sleep 5m
          done