Tutorial
Beginners are encouraged to start with a random search. You can find example files in cryspy_utility.
ASE is an easy starting point for beginners because it is installed automatically when you install CrySPY (csp-cryspy).
Follow any one of the examples and then go to “Running CrySPY” section.
Only if calc_code == ext
.
2023 July 10
ASE provides interfaces to different codes. ASE also includes Pure Python EMT calculator, which is suitable for testing CrySPY because of its fast and easy structure optimization.
In this tutorial, we use CrySPY on your local PC (Mac or Linux). The target system is Cu, 8 atoms.
Here, we assume the following conditions:
job_cryspy
ase_in.py
Move to your working directory, and copy the example files by one of the following methods.
cd ase_Cu8_RS
tree
.
├── calc_in
│ ├── ase_in.py_1
│ └── job_cryspy
└── cryspy.in
cryspy.in
is the input file of CrySPY.
[basic]
algo = RS
calc_code = ASE
tot_struc = 5
nstage = 1
njob = 2
jobcmd = zsh
jobfile = job_cryspy
[structure]
natot = 8
atype = Cu
nat = 8
[ASE]
ase_python = ase_in.py
[option]
In [basic]
section, jobcmd = zsh
can be changed to jobcmd = sh
or jobcmd = bash
in accordance with your environment.
CrySPY runs zsh job_cryspy
as a background job internally.
[ASE]
section is required when you use ASE.
You can name the following files whatever you want:
jobfile
: job_cryspy
ase_python
: ase_in.py
The other input variables are discussed later.
The job file and input files for ASE are prepared in this directory.
The name of the job file must match the value of jobfile
in cryspy.in
.
The example of job file (here, job_cryspy
) is shown below.
#!/bin/sh
# ---------- ASE
python3 ase_in.py
# ---------- CrySPY
sed -i -e '3 s/^.*$/done/' stat_job
You can choose the name of the input file (here, ase_in.py
),
but it must match the value of ase_python
in cryspy.in
.
You must add sed -i -e '3 s/^.*$/done/' stat_job
at the end of the file in CrySPY.
sed -i -e '3 s/^.*$/done/' stat_job
is required at the end of the job file.
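Since the ASE input is itself a Python script, it may help to see what the sed command does in Python terms. The following is a minimal sketch, not part of CrySPY: the helper name mark_job_done is hypothetical, and it assumes that stat_job is in the current working directory and has at least three lines, as the sed command implies.

```python
from pathlib import Path


def mark_job_done(stat_job="stat_job", status="done"):
    """Overwrite the third line of stat_job with `status`.

    Mirrors: sed -i -e '3 s/^.*$/done/' stat_job
    """
    path = Path(stat_job)
    lines = path.read_text().splitlines()
    if len(lines) < 3:
        # sed would silently change nothing here; this sketch raises instead
        raise ValueError(f"{path} has fewer than 3 lines")
    lines[2] = status  # line 3 in sed's 1-based numbering
    path.write_text("\n".join(lines) + "\n")
```

The job file in this tutorial keeps using sed; the sketch only illustrates that the third line of stat_job is the one CrySPY checks for the status string.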
In the job file of CrySPY, the string CrySPY_ID
is automatically replaced with the structure ID.
When you use a job scheduler such as PBS and SLURM, it is useful to set the structure ID to the job name.
For example, in the PBS system, #PBS -N Si_CrySPY_ID
in ID 10 is replaced with #PBS -N Si_10
.
Note that starting with a number will result in an error.
You should add a prefix like Si_
.
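The replacement described above amounts to a plain string substitution. The sketch below only illustrates that behavior; the function name fill_structure_id is hypothetical and this is not CrySPY's own code.

```python
def fill_structure_id(job_text, structure_id):
    """Return job_text with every 'CrySPY_ID' replaced by the structure ID."""
    return job_text.replace("CrySPY_ID", str(structure_id))


print(fill_structure_id("#PBS -N Si_CrySPY_ID", 10))  # -> #PBS -N Si_10
```

Without the Si_ prefix the job name would become 10, which is the case the note above warns about.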
Input files based on the number of stages (nstage
in cryspy.in
) are required.
Name the input file(s) with a suffix _x
.
Here x
means the stage number.
We are using nstage = 1
in this ASE tutorial, so we need only ase_in.py_1
.
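As a quick check before running CrySPY, the expected file names can be derived from nstage. This is a minimal sketch under the naming rule stated above (base name plus _x, with x starting at 1); the helper names are hypothetical and not part of CrySPY.

```python
from pathlib import Path


def stage_input_names(base, nstage):
    """File names required for each stage, e.g. ase_in.py_1 for nstage = 1."""
    return [f"{base}_{stage}" for stage in range(1, nstage + 1)]


def missing_stage_inputs(calc_in, base, nstage):
    """Names from stage_input_names() that do not exist in the calc_in directory."""
    return [name for name in stage_input_names(base, nstage)
            if not (Path(calc_in) / name).is_file()]


print(stage_input_names("ase_in.py", 1))  # -> ['ase_in.py_1']
```

Calling missing_stage_inputs("calc_in", "ase_in.py", 1) in the example directory should return an empty list if the files are in place.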
ase_in.py_1
is listed below.
Refer to the ASE documentation for details.
from ase.constraints import ExpCellFilter, StrainFilter
from ase.calculators.emt import EMT
from ase.calculators.lj import LennardJones
from ase.optimize.sciopt import SciPyFminCG
from ase.optimize import BFGS
from ase.spacegroup.symmetrize import FixSymmetry
import numpy as np
from ase.io import read, write
# ---------- input structure
# CrySPY outputs 'POSCAR' as an input file in work/xxxxxx directory
atoms = read('POSCAR', format='vasp')
# ---------- setting and run
atoms.calc = EMT()
atoms.set_constraint([FixSymmetry(atoms)])
atoms = ExpCellFilter(atoms, hydrostatic_strain=False)
opt = BFGS(atoms)
#opt=SciPyFminCG(atoms)
opt.run()
# ---------- opt. structure and energy
# [rule in ASE interface]
# output file for energy: 'log.tote' in eV/cell
# CrySPY reads the last line of 'log.tote'
# output file for structure: 'CONTCAR' in vasp format
e = atoms.atoms.get_total_energy()
with open('log.tote', mode='w') as f:
    f.write(str(e))
write('CONTCAR', atoms.atoms, format='vasp')
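The comments in the script above state that the energy goes to log.tote in eV/cell and that CrySPY reads the last line. To check your own script's output by hand, something like the following sketch can be used. The function name read_last_energy is hypothetical, and dividing by natot to obtain eV/atom is an assumption made for illustration, not a description of CrySPY internals.

```python
def read_last_energy(path="log.tote", natot=None):
    """Return the energy on the last non-empty line of log.tote.

    The value is in eV/cell; if natot is given, it is divided by natot
    (assumed conversion to eV/atom).
    """
    with open(path) as f:
        lines = [line.strip() for line in f if line.strip()]
    energy_cell = float(lines[-1])
    if natot is None:
        return energy_cell
    return energy_cell / natot
```

For the Cu8 example, read_last_energy("log.tote", natot=8) run inside a work/xxxxxx directory would give a per-atom value to compare against the log.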
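Unlike the inputs for VASP and QE, the ASE input is a Python script and is therefore more flexible. CrySPY imposes only two rules:
Write the energy (in eV/cell) to the log.tote file. CrySPY reads the last line of it.
Write the optimized structure to the CONTCAR file in VASP format.
Go to Running CrySPY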
soiap is Structure Optimization with InterAtomic Potential. It is suitable for testing CrySPY because of its fast structure optimization. See instructions to install soiap.
In this tutorial, we use CrySPY on your local PC (Mac or Linux). The target system is Si, 8 atoms.
Here, we assume the following conditions:
~/CrySPY_root/CrySPY-0.9.0/cryspy.py
job_cryspy
~/local/soiap-0.3.0/src/soiap
soiap.in
soiap.out
initial.cif
Move to your working directory, and copy input example files by one of the following methods.
cp -r ~/CrySPY_root/CrySPY-0.9.0/example/v0.9.0/soiap_RS_Si8 .
cd soiap_RS_Si8
tree
.
├── calc_in
│ ├── job_cryspy
│ └── soiap.in_1
└── cryspy.in
cryspy.in
is the input file of CrySPY.
[basic]
algo = RS
calc_code = soiap
tot_struc = 5
nstage = 1
njob = 2
jobcmd = zsh
jobfile = job_cryspy
[structure]
natot = 8
atype = Si
nat = 8
[soiap]
soiap_infile = soiap.in
soiap_outfile = soiap.out
soiap_cif = initial.cif
[option]
In [basic]
section, jobcmd = zsh
can be changed to jobcmd = sh
or jobcmd = bash
in accordance with your environment.
CrySPY runs zsh job_cryspy
as a background job internally.
[soiap]
section is required when you use soiap.
You can name the following files whatever you want:
jobfile
soiap_infile
soiap_outfile
soiap_cif
The other input variables are discussed later.
The job file and input files for soiap are prepared in this directory.
The name of the job file must match the value of jobfile
in cryspy.in
.
The example of job file (here, job_cryspy
) is shown below.
#!/bin/sh
# ---------- soiap
EXEPATH=/path/to/soiap
$EXEPATH/soiap soiap.in 2>&1 > soiap.out
# ---------- CrySPY
sed -i -e '3 s/^.*$/done/' stat_job
Change /path/to/soiap
to the appropriate path for your environment.
You can specify the input (soiap.in
) and output (soiap.out
) file names,
but they must match the values of soiap_infile
and soiap_outfile
in cryspy.in
.
The job file is written in the same way as the one you usually use except for the last line.
You must add sed -i -e '3 s/^.*$/done/' stat_job
at the end of the file in CrySPY.
sed -i -e '3 s/^.*$/done/' stat_job
is required at the end of the job file.
In the job file of CrySPY, the string “CrySPY_ID” is automatically replaced with the structure ID.
When you use a job scheduler such as PBS and SLURM, it is useful to set the structure ID to the job name.
For example, in the PBS system, #PBS -N Si_CrySPY_ID
in ID 10 is replaced with #PBS -N Si_10
.
Note that starting with a number will result in an error.
You should add a prefix like Si_
.
Input files based on the number of stages (nstage
in cryspy.in
) are required.
Name the input file(s) with a suffix _x
.
Here x
means the stage number.
We are using nstage = 1
, so we need only soiap.in_1
.
soiap.in_1
is listed below.
crystal initial.cif ! CIF file for the initial structure
symmetry 1 ! 0: not symmetrize displacements of the atoms or 1: symmetrize
md_mode_cell 3 ! cell-relaxation method
! 0: FIRE, 2: quenched MD, or 3: RFC5
number_max_relax_cell 100 ! max. number of the cell relaxation
number_max_relax 1 ! max. number of the atom relaxation
max_displacement 0.1 ! max. displacement of atoms in Bohr
external_stress_v 0.0 0.0 0.0 ! external pressure in GPa
th_force 5d-5 ! convergence threshold for the force in Hartree a.u.
th_stress 5d-7 ! convergence threshold for the stress in Hartree a.u.
force_field 1 ! force field
! 1: Stillinger-Weber for Si, 2: Tsuneyuki potential for SiO2,
! 3: ZRL for Si-O-N-H, 4: ADP for Nd-Fe-B, 5: Jmatgen, or
! 6: Lennard-Jones
The input structure file is specified at the first line.
Use the same name as the value of soiap_cif
in cryspy.in
.
Go to Running CrySPY
2024 April 24
In this tutorial, we use CrySPY on a PC cluster with a job scheduler system such as PBS. Here we employ VASP. The target system is Na8Cl8, 16 atoms.
Here, we assume the following conditions:
qsub
job_cryspy
Move to your working directory, and copy the example files by one of the following methods.
cd vasp_Na8Cl8_RS
tree
.
├── calc_in
│ ├── INCAR_1
│ ├── INCAR_2
│ ├── POTCAR
│ ├── POTCAR_is_dummy
│ └── job_cryspy
└── cryspy.in
cryspy.in
is the input file of CrySPY.
[basic]
algo = RS
calc_code = VASP
tot_struc = 5
nstage = 2
njob = 2
jobcmd = qsub
jobfile = job_cryspy
[structure]
natot = 16
atype = Na Cl
nat = 8 8
mindist_1 = 2.5 1.5
mindist_2 = 1.5 2.5
[VASP]
kppvol = 40 80
[option]
In [basic]
section, jobcmd = qsub
can be changed in accordance with your environment.
CrySPY runs qsub job_cryspy
as a background job internally in this setting.
You can name the following file whatever you want:
jobfile
We adopt a stage-based system for structure optimization calculations.
Here, we use nstage = 2
.
For example, users can configure the following settings.
In the first stage, only the ionic positions are relaxed, fixing the cell shape, with low k-point grid density.
Next, the ionic positions and cell shape are fully relaxed with high accuracy in the second stage.
[VASP]
section is required when you use VASP.
You have to specify k-point grid density (Å^-3) for each stage in kppvol
.
See Input file > Kpoint for details of kppvol
The other input variables are discussed later.
The job file and input files for VASP are prepared in this directory.
The name of the job file must match the value of jobfile
in cryspy.in
.
The example of job file (here, job_cryspy
) is shown below.
#!/bin/sh
#$ -cwd
#$ -V -S /bin/bash
####$ -V -S /bin/zsh
#$ -N Na8Cl8_CrySPY_ID
#$ -pe smp 20
####$ -q ibis1.q
####$ -q ibis2.q
####$ -q ibis3.q
####$ -q ibis4.q
# ---------- vasp
VASPROOT=/usr/local/vasp/vasp.6.4.2/bin
mpirun -np $NSLOTS $VASPROOT/vasp_std
# ---------- CrySPY
sed -i -e '3 s/^.*$/done/' stat_job
Change VASPROOT
to the appropriate path suitable for your environment.
The job file is written in the same way as the one you usually use except for the last line.
You must add sed -i -e '3 s/^.*$/done/' stat_job
at the end of the file in CrySPY.
sed -i -e '3 s/^.*$/done/' stat_job
is required at the end of the job file.
In the job file of CrySPY, the string “CrySPY_ID” is automatically replaced with the structure ID.
When you use a job scheduler such as PBS and SLURM, it is useful to set the structure ID to the job name.
For example, in the PBS system, #PBS -N Si_CrySPY_ID
in ID 10 is replaced with #PBS -N Si_10
.
Note that starting with a number will result in an error.
You should add a prefix like Si_
.
Input files based on the number of stages (nstage
in cryspy.in
) are required.
Name the input file(s) with a suffix _x
.
Here x
means the stage number.
We are using nstage = 2
, so we need INCAR_1
and INCAR_2
.
Here, INCAR_1
is set to fix the cell and relax only the ionic positions, while INCAR_2
is configured to fully relax both the cell and ionic positions.
INCAR_1
SYSTEM = NaCl
!!!LREAL = Auto
Algo = Fast
NSW = 40
LWAVE = .FALSE.
!LCHARG = .FALSE.
ISPIN = 1
ISMEAR = 0
SIGMA = 0.1
IBRION = 2
ISIF = 2
EDIFF = 1e-5
EDIFFG = -0.01
INCAR_2
SYSTEM = NaCl
!!LREAL = Auto
Algo = Fast
NSW = 200
ENCUT = 341
!!LWAVE = .FALSE.
!!LCHARG = .FALSE.
ISPIN = 1
ISMEAR = 0
SIGMA = 0.1
IBRION = 2
ISIF = 3
EDIFF = 1e-5
EDIFFG = -0.01
CrySPY automatically generates POSCAR
and KPOINTS
files.
You have to prepare POTCAR
file yourself.
The POTCAR
included in this example is empty because we cannot distribute it.
Go to Running CrySPY
2024 April 24, updated
In this tutorial, we use CrySPY on a machine with a job scheduler system such as PBS. Here we employ Quantum ESPRESSO (QE). The target system is Si, 8 atoms.
Here, we assume the following conditions:
qsub
job_cryspy
/usr/local/qe-6.5/bin/pw.x
pwscf.in
pwscf.out
Move to your working directory, and copy input example files by one of the following methods.
cp -r ~/CrySPY_root/CrySPY-0.9.0/example/v0.9.0/QE_Si8_RS .
cd QE_Si8_RS
tree
.
├── calc_in
│ ├── job_cryspy
│ ├── pwscf.in_1
│ └── pwscf.in_2
└── cryspy.in
cryspy.in
is the input file of CrySPY.
[basic]
algo = RS
calc_code = QE
tot_struc = 5
nstage = 2
njob = 2
jobcmd = qsub
jobfile = job_cryspy
[structure]
natot = 8
atype = Si
nat = 8
[QE]
qe_infile = pwscf.in
qe_outfile = pwscf.out
kppvol = 40 80
[option]
In [basic]
section, jobcmd = qsub
can be changed in accordance with your environment.
CrySPY runs qsub job_cryspy
as a background job internally in this setting.
We adopt a stage-based system for structure optimization calculations.
Here, we use nstage = 2
.
For example, users can configure the following settings.
In the first stage, only the ionic positions are relaxed, fixing the cell shape, with low k-point grid density.
Next, the ionic positions and cell shape are fully relaxed with high accuracy in the second stage.
[QE]
section is required when you use QE.
You have to specify k-point grid density (Å^-3) for each stage in kppvol
.
See Input file > Kpoint for details of kppvol
You can name the following files whatever you want:
jobfile
qe_infile
qe_outfile
The other input variables are discussed later.
The job file and input files for QE are prepared in this directory.
The name of the job file must match the value of jobfile
in cryspy.in
.
The example of job file (here, job_cryspy
) is shown below.
#!/bin/sh
#$ -cwd
#$ -V -S /bin/bash
####$ -V -S /bin/zsh
#$ -N Si8_CrySPY_ID
#$ -pe smp 20
####$ -q ibis1.q
####$ -q ibis2.q
mpirun -np $NSLOTS /path/to/pw.x < pwscf.in > pwscf.out
if [ -e "CRASH" ]; then
sed -i -e '3 s/^.*$/skip/' stat_job
exit 1
fi
sed -i -e '3 s/^.*$/done/' stat_job
Change /path/to/pw.x
to the appropriate path suitable for your environment.
You can specify the input (pwscf.in
) and output (pwscf.out
) file names,
but they must match the values of qe_infile
and qe_outfile
in cryspy.in
.
The job file is written in the same way as the one you usually use except for the last line.
You must add sed -i -e '3 s/^.*$/done/' stat_job
at the end of the file in CrySPY.
sed -i -e '3 s/^.*$/done/' stat_job
is required at the end of the job file.
In the job file of CrySPY, the string “CrySPY_ID” is automatically replaced with the structure ID.
When you use a job scheduler such as PBS and SLURM, it is useful to set the structure ID to the job name.
For example, in the PBS system, #PBS -N Si_CrySPY_ID
in ID 10 is replaced with #PBS -N Si_10
.
Note that starting with a number will result in an error.
You should add a prefix like Si_
.
Input files based on the number of stages (nstage
in cryspy.in
) are required.
Name the input file(s) with a suffix _x
.
Here x
means the stage number.
We are using nstage = 2
, so we need pwscf.in_1
and pwscf.in_2
.
Here, pwscf.in_1
is set to fix the cell and relax only the ionic positions, while pwscf.in_2
is configured to fully relax both the cell and ionic positions.
pwscf.in_1
&control
title = 'Si8'
calculation = 'relax'
nstep = 100
restart_mode = 'from_scratch',
pseudo_dir = '/usr/local/pslibrary.1.0.0/pbe/PSEUDOPOTENTIALS/'
outdir='./out.d/'
/
&system
ibrav = 0
nat = 8
ntyp = 1
ecutwfc = 44.0
occupations = 'smearing'
degauss = 0.01
/
&electrons
/
&ions
/
&cell
/
ATOMIC_SPECIES
Si 28.086 Si.pbe-n-kjpaw_psl.1.0.0.UPF
pwscf.in_2
&control
title = 'Si8'
calculation = 'vc-relax'
nstep = 200
restart_mode = 'from_scratch',
pseudo_dir = '/usr/local/pslibrary.1.0.0/pbe/PSEUDOPOTENTIALS/'
outdir='./out.d/'
/
&system
ibrav = 0
nat = 8
ntyp = 1
ecutwfc = 44.0
occupations = 'smearing'
degauss = 0.01
/
&electrons
/
&ions
/
&cell
/
ATOMIC_SPECIES
Si 28.086 Si.pbe-n-kjpaw_psl.1.0.0.UPF
Change pseudo_dir
to the appropriate directory for your environment.
Inputs for structure data and k-point such as ATOMIC_POSITIONS
and K_POINTS
are automatically appended by CrySPY with pymatgen.
Users do not have to prepare them in pwscf.in_x
.
Go to Running CrySPY
Coming soon.
Coming soon.
Available from CrySPY 0.11.0.
If you use an external program not supported by CrySPY, the optimized energy and structure data can be loaded semi-manually in CrySPY.
You have to prepare two files, ext_opt_struc_data.pkl
and ext_energy_data.pkl
.
Here, we assume the following conditions:
~/CrySPY_root/CrySPY-0.11.0/cryspy.py
(calc_in
directory is not required.)
Move to your working directory, and copy input example files.
cp -r ~/CrySPY_root/CrySPY-0.9.0/example/ext_Si8_RS .
cd ext_Si8_RS
tree
.
└── cryspy.in
cryspy.in
is the input file of CrySPY.
[basic]
algo = RS
calc_code = ext
tot_struc = 5
[structure]
natot = 8
atype = Si
nat = 8
[option]
If calc_code == ext
, nstage
, njob
, jobcmd
, and jobfile
are ignored.
This mode is different from the normal use of CrySPY. Go to Load external data.
See Input file for details.
Let’s take a look at cryspy.in
again.
This may be slightly different depending on calc_code
you chose.
[basic]
algo = RS
calc_code = soiap
tot_struc = 5
nstage = 1
njob = 2
jobcmd = zsh
jobfile = job_cryspy
[structure]
natot = 8
atype = Si
nat = 8
[soiap]
soiap_infile = soiap.in
soiap_outfile = soiap.out
soiap_cif = initial.cif
[option]
algo: Algorithm. Set RS for Random Search.
calc_code: Structure optimizer. Choose from VASP, QE, OMX, soiap, LAMMPS.
tot_struc: The total number of structures. In this case, 5 random structures are generated at the first run.
nstage: The number of stages. It's up to you.
njob: The number of jobs running at the same time. In this example, CrySPY sets 2 slots for structure optimization; in other words, it optimizes 2 structures at a time.
jobcmd: Command for jobs. Use bash, zsh, qsub, and so on.
jobfile: File name of the job file.
natot: The total number of atoms. e.g. for Na8Cl8: natot = 16.
atype: Atom type. e.g. for Na8Cl8: atype = Na Cl.
nat: The number of each atom. e.g. for Na8Cl8: nat = 8 8.
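The relation between natot, atype, and nat can be checked before running. The sketch below assumes that cryspy.in can be read by Python's standard configparser, which fits the INI-like layout shown above; the function name check_structure_section is hypothetical and the checks are only the two implied by the Na8Cl8 example (one nat entry per atom type, and nat summing to natot).

```python
import configparser


def check_structure_section(text):
    """Return a list of consistency problems found in the [structure] section."""
    cfg = configparser.ConfigParser()
    cfg.read_string(text)
    structure = cfg["structure"]
    natot = structure.getint("natot")
    atype = structure["atype"].split()
    nat = [int(n) for n in structure["nat"].split()]
    problems = []
    if len(atype) != len(nat):
        problems.append("atype and nat have different numbers of entries")
    if sum(nat) != natot:
        problems.append(f"sum(nat) = {sum(nat)} does not equal natot = {natot}")
    return problems


example = """
[structure]
natot = 16
atype = Na Cl
nat = 8 8
"""
print(check_structure_section(example))  # -> []
```

To check a real file, pass its contents, for example check_structure_section(open("cryspy.in").read()).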
For version 1.0.0 or later, skip this page. The executable script is automatically installed.
Here, we assume the following condition:
~/CrySPY_root/CrySPY-0.9.0/cryspy.py
Let’s make a convenient shell script to avoid typing long commands over and over again.
Here, we create the script, cryspy
(any file name will do).
$ emacs cryspy
$ chmod 744 cryspy
$ cat cryspy
#!/bin/sh
python3 -u ~/CrySPY_root/CrySPY-0.9.0/cryspy.py 1>> log 2>> err
-u
option (unbuffered option) can be omitted.
You can put this script in your $PATH, or just use like bash ./cryspy
.
2023 July 10, update
Make sure you have the following in your working directory.
$ ls
calc_in/ cryspy.in
Then, run CrySPY!
cryspy
If you use old version (0.10.3 or earlier):
bash ./cryspy
At the first run, CrySPY goes into structure generation mode. CrySPY stops after generating 5 structures.
If it worked properly, the following output appears on the screen:
[2023-07-10 18:40:54,389][cryspy_init][INFO]
Start CrySPY 1.2.0
[2023-07-10 18:40:54,389][cryspy_init][INFO] # ---------- Read input file, cryspy.in
[2023-07-10 18:40:54,390][read_input][INFO] Save input data in cryspy.stat
[2023-07-10 18:40:54,391][cryspy_init][INFO] # ---------- Initial structure generation
[2023-07-10 18:40:54,391][cryspy_init][INFO] Number of MPI processes: 1
[2023-07-10 18:40:54,391][gen_init_struc][INFO] # ------ mindist
[2023-07-10 18:40:54,395][struc_util][INFO] Cu - Cu: 1.32
[2023-07-10 18:40:54,395][gen_init_struc][INFO] # ------ generate structures
[2023-07-10 18:40:54,481][gen_pyxtal][INFO] Structure ID 0 was generated. Space group: 1 --> 1 P1
[2023-07-10 18:40:54,493][gen_pyxtal][INFO] Structure ID 1 was generated. Space group: 28 --> 28 Pma2
[2023-07-10 18:40:54,498][gen_pyxtal][INFO] Structure ID 2 was generated. Space group: 29 --> 29 Pca2_1
[2023-07-10 18:40:54,704][gen_pyxtal][INFO] Structure ID 3 was generated. Space group: 137 --> 137 P4_2/nmc
[2023-07-10 18:40:54,725][gen_pyxtal][INFO] Structure ID 4 was generated. Space group: 212 --> 214 I4_132
[2023-07-10 18:40:54,800][cryspy_init][INFO] Elapsed time for structure generation: 0:00:00.408367
cryspy 4.35s user 1.04s system 145% cpu 3.697 total
Several output files are also generated.
cryspy.out: Short log (only version 0.10.3 or earlier).
cryspy.stat: Status file.
data/init_POSCARS: Initial structure file in POSCAR format. You can open this file using VESTA.
data/pkl_data: Directory to save pickled data.
log_cryspy: Log.
err_cryspy: Errors and warnings.
Let’s take a look at the cryspy.stat
file.
...
(omit)
...
[status]
id_queueing = 0 1 2 3 4
Structure IDs 0 to 4 are queueing because we have just generated the structures and have not submitted any jobs yet.
Check the initial structures. If the distances between atoms are too short, you should set mindist
in cryspy.in
.
2023 July 10, update
CrySPY continues the simulation if you have cryspy.stat
file.
Continue if you have cryspy.stat
Start from the beginning if you don’t have cryspy.stat
Run CrySPY again.
cryspy
Check the screen or log_cryspy
file.
[2023-07-10 18:52:51,859][cryspy_restart][INFO]
Restart CrySPY 1.2.0
[2023-07-10 18:52:51,869][ctrl_job][INFO] # ---------- job status
[2023-07-10 18:52:51,904][ctrl_job][INFO] ID 0: submit job, Stage 1
[2023-07-10 18:52:51,931][ctrl_job][INFO] ID 1: submit job, Stage 1
And also cryspy.stat
file.
...
(omit)
...
[status]
id_queueing = 2 3 4
id 0 = Stage 1
id 1 = Stage 1
CrySPY submitted two jobs for structure ID 0 and 1 as you set njob = 2
in cryspy.in
.
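The [status] section shown above can also be inspected programmatically. The sketch below assumes that cryspy.stat is an INI-style file readable by Python's configparser, as the excerpt suggests; the omitted parts of the file are not reproduced here, and the function name read_status is hypothetical.

```python
import configparser


def read_status(text):
    """Return (queueing IDs, {running ID: stage string}) from the [status] section."""
    cfg = configparser.ConfigParser()
    cfg.read_string(text)
    status = cfg["status"]
    queueing = [int(i) for i in status.get("id_queueing", "").split()]
    # Keys such as "id 0" hold the current stage of a submitted structure
    running = {int(key.split()[1]): value
               for key, value in status.items() if key.startswith("id ")}
    return queueing, running


excerpt = """
[status]
id_queueing = 2 3 4
id 0 = Stage 1
id 1 = Stage 1
"""
print(read_status(excerpt))  # -> ([2, 3, 4], {0: 'Stage 1', 1: 'Stage 1'})
```

With njob = 2, at most two entries are expected in the second part of the result.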
Calculations are performed in the work
directory.
These directory names correspond to their structure ID.
tree -d work
work
├── 000000
├── 000001
└── fin
When the two jobs are done, run CrySPY again.
cryspy
[2023-07-10 18:55:01,053][cryspy_restart][INFO]
Restart CrySPY 1.2.0
[2023-07-10 18:55:01,058][ctrl_job][INFO] # ---------- job status
[2023-07-10 18:55:01,058][ctrl_job][INFO] ID 0: Stage 1 Done!
[2023-07-10 18:55:01,093][ctrl_job][INFO] collect results: E = -0.00696997755502915 eV/atom
[2023-07-10 18:55:01,132][ctrl_job][INFO] ID 1: Stage 1 Done!
[2023-07-10 18:55:01,133][ctrl_job][INFO] collect results: E = 0.4934076667166454 eV/atom
[2023-07-10 18:55:01,144][cryspy][INFO]
recheck 1
[2023-07-10 18:55:01,145][ctrl_job][INFO] # ---------- job status
[2023-07-10 18:55:01,153][ctrl_job][INFO] ID 2: submit job, Stage 1
[2023-07-10 18:55:01,161][ctrl_job][INFO] ID 3: submit job, Stage 1
If you set nstage = 2
(or more), new jobs on stage 2 for ID 0 and 1 are submitted.
If you set nstage = 1
, CrySPY collects calculation data of ID 0 and 1, then submits next ID’s jobs.
Directories of the finished structure are moved to the fin
directory.
Repeat cryspy
several times until all 5 structures are done.
You can delete the work
directory when the simulation is done if you do not need it.
The auto script (repeat_cryspy) may help you.
Move to data
directory. There should be a few more files.
$ cd data
$ ls
cryspy_rslt cryspy_rslt_energy_asc init_POSCARS opt_POSCARS pkl_data/
cryspy_rslt: Result file.
cryspy_rslt_energy_asc: Result file sorted in ascending order of energy.
init_POSCARS: Initial structure file in POSCAR format.
opt_POSCARS: Optimized structure file in POSCAR format.
pkl_data/: Directory to save pickled data.
The results are written to the text files cryspy_rslt
and cryspy_rslt_energy_asc
(and are also saved as pickle data in the pkl_data
directory).
Each result is appended to the cryspy_rslt
file in the order in which the calculations finish.
cat cryspy_rslt
Spg_num Spg_sym Spg_num_opt Spg_sym_opt E_eV_atom Magmom Opt
0 139 I4/mmm 139 I4/mmm -3.000850 NaN done
1 98 I4_122 12 C2/m -3.978441 NaN not_yet
2 16 P222 16 P222 -3.348616 NaN not_yet
3 36 Cmc2_1 36 Cmc2_1 -3.520306 NaN not_yet
4 36 Cmc2_1 4 P2_1 -3.304168 NaN not_yet
Not ID order in cryspy_rslt
In cryspy_rslt_energy_asc
file, the results are sorted in energy ascending order.
cat cryspy_rslt_energy_asc
Spg_num Spg_sym Spg_num_opt Spg_sym_opt E_eV_atom Magmom Opt
1 98 I4_122 12 C2/m -3.978441 NaN not_yet
3 36 Cmc2_1 36 Cmc2_1 -3.520306 NaN not_yet
2 16 P222 16 P222 -3.348616 NaN not_yet
4 36 Cmc2_1 4 P2_1 -3.304168 NaN not_yet
0 139 I4/mmm 139 I4/mmm -3.000850 NaN done
Spg_num
and Spg_sym
show space group information on initial structures.
Spg_num_opt
and Spg_sym_opt
are those of optimized structures.
The last column Opt
indicates whether or not optimization reached required accuracy.
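If you want to process the text results without the Jupyter notebook, the file can be parsed with a few lines of standard Python. The sketch below assumes the whitespace-separated layout shown above (a header line with the column names, then rows that start with the structure ID) and that space group symbols contain no spaces; the function names parse_rslt and lowest_energy are hypothetical.

```python
def parse_rslt(text):
    """Parse the contents of cryspy_rslt into a list of dicts, one per structure."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        fields = line.split()
        # The first field is the structure ID; the rest follow the header
        row = dict(zip(header, fields[1:]))
        row["ID"] = int(fields[0])
        row["E_eV_atom"] = float(row["E_eV_atom"])
        rows.append(row)
    return rows


def lowest_energy(rows, n=3):
    """Return the n rows with the lowest energy per atom."""
    return sorted(rows, key=lambda row: row["E_eV_atom"])[:n]
```

For example, lowest_energy(parse_rslt(open("cryspy_rslt").read()), n=1) returns the row of the most stable structure found so far. Rows whose energy is NaN are not handled specially in this sketch.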
Of course, 5 structures are not enough to find stable structures. You can append structures whenever you want. Here, let's append 5 more structures.
For Si-Si mindist
, the default value of 1.11 Å is used in the first structure generation (see log_cryspy
), which is a little too close.
Let us try to set the mindist to 2.0 Å.
For mindist
, see also Features > Restriction on interatomic distances.
Edit cryspy.in
: change the value of tot_struc
to 10
and add mindist_1 = 2.0
emacs cryspy.in
cat cryspy.in
[basic]
algo = RS
calc_code = soiap
tot_struc = 10
nstage = 1
njob = 2
jobcmd = zsh
jobfile = job_cryspy
[structure]
natot = 8
atype = Si
nat = 8
mindist_1 = 2.0
[soiap]
soiap_infile = soiap.in
soiap_outfile = soiap.out
soiap_cif = initial.cif
[option]
Then run cryspy, and check log_cryspy
file.
cryspy &
cat log_cryspy
...
(omit)
...
2023/03/19 00:01:47
CrySPY 1.0.0
Restart cryspy.py
Changed tot_struc from 5 to 10
Changed mindist from None to [[2.0]]
Backup data
# ---------- Append structures
# ------ mindist
Si - Si 2.0
Structure ID 5 was generated. Space group: 218 --> 221 Pm-3m
Structure ID 6 was generated. Space group: 86 --> 129 P4/nmm
Structure ID 7 was generated. Space group: 129 --> 129 P4/nmm
Structure ID 8 was generated. Space group: 191 --> 191 P6/mmm
Structure ID 9 was generated. Space group: 31 --> 31 Pmn2_1
Remember that CrySPY goes into structure generation mode whenever you change the value of tot_struc
.
In this mode, CrySPY does not do any other action such as collecting data, submitting jobs, and so on.
From version 1.0.0, CrySPY automatically backs up when adding structures.
See features/backup.
Repeat cryspy &
several times until all appended structures are done.
The auto script (repeat_cryspy) may help you.
It is assumed here that you analyze and visualize CrySPY data on your local PC.
If you use CrySPY on supercomputers or workstations, download the data to your local PC.
You can delete the work
and backup
directories if you do not need them because their size could be very large.
Move to the data/
directory in the results you just downloaded.
Then copy cryspy_analyzer_RS.ipynb
from CrySPY utility.
$ ls
calc_in/ cryspy.in cryspy.stat data/ err_cryspy log_cryspy
$ cd data
$ ls
cryspy_rslt cryspy_rslt_energy_asc init_POSCARS opt_CIFS.cif opt_POSCARS pkl_data/
cp /path/to/CrySPY_utility/cryspy_analyzer_RS.ipynb .
Run jupyter. (VScode, jupyter lab, jupyter notebook, and so on.) You can get the following figure by simply running the steps in order.
You need only cryspy.in
.
$ ls
cryspy.in
Then, run CrySPY.
cryspy &
At the first run, CrySPY goes into structure generation mode as usual. CrySPY stops after generating 5 structures.
If it worked properly, log_cryspy
would look like this.
2022/07/14 19:41:41
CrySPY 1.0.0
Start cryspy.py
Read input file, cryspy.in
Write input data in cryspy.out
Save input data in cryspy.stat
# --------- Generate initial structures
# ------ mindist
Si - Si 1.11
Structure ID 0 was generated. Space group: 88 --> 141 I4_1/amd
Structure ID 1 was generated. Space group: 101 --> 101 P4_2cm
Structure ID 2 was generated. Space group: 204 --> 229 Im-3m
Structure ID 3 was generated. Space group: 199 --> 199 I2_13
Structure ID 4 was generated. Space group: 12 --> 12 C2/m
Unlike normal use, a directory named ext
was created.
Only the stat_job
file exists in ext/
.
$ cat ext/stat_job
out
If you run cryspy when “out” is written in the stat_job
file, queueing structure files (cif format) are exported in ext/queue
.
cryspy &
$ ls ext/queue
0.cif 1.cif 2.cif 3.cif 4.cif
The number in the file name is structure ID.
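Because each file name carries the structure ID, the exported queue can be listed from Python before handing the files to an external program. This is a minimal sketch assuming the ext/queue layout shown above (files named ID.cif); the function name queued_ids is hypothetical.

```python
from pathlib import Path


def queued_ids(queue_dir="ext/queue"):
    """Return the structure IDs of the cif files in queue_dir, in ascending order."""
    return sorted(int(path.stem) for path in Path(queue_dir).glob("*.cif"))
```

For the listing above, queued_ids() would return the IDs 0 to 4, which are also the keys expected in the two pickle files prepared later.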
The first line of stat_job
was automatically changed.
$ cat ext/stat_job
submitted
Perform structure optimization and energy evaluation in an external program using the output cif files.
Once that calculation is done, prepare the optimized structure and energy data in the pickle data format, ext_opt_struc_data.pkl
and ext_energy_data.pkl
.
The data format of ext_opt_struc_data.pkl
is the same as init_struc_data.pkl
and opt_struc_data.pkl
, see Data format/Initial and optimized structure data.
The data format of ext_energy_data.pkl
is similar to ext_opt_struc_data.pkl
. Just change the value from the structure data into the energy.
An example of the energy data (dict type) is shown below.
{0: -0.7139331910805997,
1: -0.5643404689832622,
2: -0.5832404287259171,
3: -0.535037327286169,
4: -0.6316663459586607}
The ext/calc_data
directory should be automatically generated, so put the two pickle files here.
$ ls ext/calc_data
ext_energy_data.pkl ext_opt_struc_data.pkl
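The energy file can be written with Python's standard pickle module. The sketch below is an assumption-laden illustration: it relies only on what is stated above (a dict mapping structure ID to energy, saved as ext_energy_data.pkl in ext/calc_data); the helper name save_energy_data is hypothetical, and the energy unit expected by CrySPY is not stated in this section, so check it before use.

```python
import pickle
from pathlib import Path


def save_energy_data(energies, out_dir="ext/calc_data"):
    """Pickle a {structure ID: energy} dict as ext_energy_data.pkl in out_dir."""
    path = Path(out_dir) / "ext_energy_data.pkl"
    with open(path, "wb") as f:
        pickle.dump(dict(energies), f)
    return path
```

For example, save_energy_data({0: -0.7139331910805997, 1: -0.5643404689832622}) writes the first two entries of the dict shown above. The structure file ext_opt_struc_data.pkl must be prepared separately in the format described in Data format/Initial and optimized structure data.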
When ready, replace the first line of the stat_job
file with “done” and run CrySPY.
$ emacs ext/stat_job
$ cat ext/stat_job
done
cryspy &
CrySPY collects the result data.
EA
BO
May 15th, 2023
First, see Tutorial > Random Search (RS) for basic usage of CrySPY.
Here, we assume CrySPY 1.1.0 or later.
The example files used here can be downloaded from CrySPY Utility > Examples > qe_Si16_LAQA. In this tutorial, only 50 initial structures are generated, but originally, LAQA is designed to select candidates from many more structures.
Here is an example of cryspy.in
.
[basic]
algo = LAQA
calc_code = QE
tot_struc = 50
nstage = 1
njob = 10
jobcmd = qsub
jobfile = job_cryspy
[structure]
natot = 16
atype = Si
nat = 16
mindist_1 = 1.5
[QE]
qe_infile = pwscf.in
qe_outfile = pwscf.out
kppvol = 80
[LAQA]
nselect_laqa = 4
[option]
nstage must be 1 in LAQA.
Set nselect_laqa in the [LAQA] section. nselect_laqa is the number of candidates you select at one time.
If you want to change the weights of the LAQA score, edit wf
and ws
as below.
If omitted, the default values are used (0.1 and 10.0, respectively).
See Searching algorithms > LAQA for the score.
[LAQA]
nselect_laqa = 4
wf = 0.1
ws = 10.0
&control
calculation = 'vc-relax'
pseudo_dir = '/usr/local/gbrv/all_pbe_UPF_v1.5/'
outdir='./outdir/'
nstep = 10
/
&system
ibrav = 0
nat = 16
ntyp = 1
ecutwfc = 40
ecutrho = 200
occupations = 'smearing'
degauss = 0.01
/
&electrons
/
&ions
/
&cell
/
ATOMIC_SPECIES
Si -1.0 si_pbe_v1.uspp.F.UPF
nstep
controls how many steps of structure optimization can proceed in one selection (NSW
for VASP).
#!/bin/sh
#$ -cwd
#$ -V -S /bin/bash
####$ -V -S /bin/zsh
#$ -N Si_CrySPY_ID
#$ -pe smp 20
####$ -q ibis1.q
####$ -q ibis2.q
mpirun -np $NSLOTS pw.x -nk 4 < pwscf.in > pwscf.out
if [ -e "CRASH" ]; then
sed -i -e '3 s/^.*$/skip/' stat_job
exit 1
fi
sed -i -e '3 s/^.*$/done/' stat_job
An automatic script is also available. See the bottom of this page.
Just type cryspy
for the 1st run.
cryspy &
Check log_cryspy
.
50 random structures are generated.
2023/05/13 13:02:07
CrySPY 1.1.0
Start cryspy.py
Number of MPI processes: 1
Read input file, cryspy.in
Save input data in cryspy.stat
# --------- Generate initial structures
# ------ mindist
Si - Si 1.5
Structure ID 0 was generated. Space group: 165 --> 165 P-3c1
Structure ID 1 was generated. Space group: 66 --> 66 Cccm
Structure ID 2 was generated. Space group: 146 --> 146 R3
Structure ID 3 was generated. Space group: 82 --> 82 I-4
Structure ID 4 was generated. Space group: 162 --> 162 P-31m
...
...
...
Structure ID 47 was generated. Space group: 90 --> 90 P42_12
Structure ID 48 was generated. Space group: 214 --> 214 I4_132
Structure ID 49 was generated. Space group: 23 --> 23 I222
Elapsed time for structure generation: 0:00:10.929030
# ---------- Initialize LAQA
# ---------- Selection 0
selected_id: 50 IDs
In LAQA, jobs of structure optimization for all structures are submitted once at the beginning.
Note that only 10 optimization steps are performed here since we set nstep = 10
.
Repeat cryspy
command until all of these (10 steps) are completed.
If necessary, you can also submit all jobs at once by increasing the value of njob
.
After all the initial optimizations, LAQA is ready
is displayed at the end of log_cryspy
.
2023/05/13 13:23:31
CrySPY 1.1.0
Restart cryspy.py
Number of MPI processes: 1
# ---------- job status
ID 41: Stage 1 Done!
LAQA is ready
Next cryspy run will make the first selection.
2023/05/13 13:23:33
CrySPY 1.1.0
Restart cryspy.py
Number of MPI processes: 1
# ---------- job status
Backup data
# ---------- Selection 1
selected_id: 37 8 10 48
Here, only as many structures as specified by nselect_laqa
are selected.
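The idea of selecting a fixed number of structures by score can be sketched in Python. This is not CrySPY's implementation: the score values are made up for illustration, the function name select_ids is hypothetical, and it is assumed that structures with higher scores are preferred.

```python
def select_ids(scores, nselect):
    """Return the nselect structure IDs with the highest scores, best first."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:nselect]


# Hypothetical scores, for illustration only (not taken from a real run)
scores = {37: 0.91, 8: 0.85, 10: 0.80, 48: 0.77, 3: 0.40, 21: 0.12}
print(select_ids(scores, nselect=4))  # -> [37, 8, 10, 48]
```

With nselect=4, four IDs are returned, just as four IDs appear in the selected_id line of the log above.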
Type cryspy
to submit the jobs (next 10 steps).
cryspy &
2023/05/13 13:23:36
CrySPY 1.1.0
Restart cryspy.py
Number of MPI processes: 1
# ---------- job status
ID 37: submit job, Stage 1
ID 8: submit job, Stage 1
ID 10: submit job, Stage 1
ID 48: submit job, Stage 1
By repeating this cycle, the optimizations of the structures selected according to the score advance by 10 steps each time. Continue until several structures are fully optimized, and stop whenever you like.
If you want to check the LAQA score during the simulation, you can look at the status file:
Other files for LAQA will be output:
It is assumed here that you analyze and visualize CrySPY data on your local PC.
If you use CrySPY on supercomputers or workstations, download the data to your local PC.
You can delete the work
and backup
directories if you do not need them, because they can be very large.
You may gzip the pkl data to decrease the file size.
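A minimal Python sketch of compressing a pkl file with the standard gzip module, and reading it back, is shown below. The helper names gzip_file and load_pickle are hypothetical. Whether CrySPY can read compressed files directly is not covered here, so decompress them before restarting a run if in doubt.

```python
import gzip
import pickle
import shutil
from pathlib import Path


def gzip_file(path, keep_original=False):
    """Compress path to path + '.gz' and return the new path."""
    src = Path(path)
    dst = src.with_name(src.name + ".gz")
    with open(src, "rb") as f_in, gzip.open(dst, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)
    if not keep_original:
        src.unlink()  # remove the uncompressed file to save space
    return dst


def load_pickle(path):
    """Load a pickle file, handling an optional .gz suffix."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rb") as f:
        return pickle.load(f)
```

On the command line, running gzip on the pkl files achieves the same compression.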
Move to the data/ directory in the results you just downloaded.
Then copy cryspy_analyzer_LAQA.ipynb
from CrySPY utility.
You can obtain the graph and animation with the notebook. In the gif below, all of the optimizations were completed. This is just for animation. (When all of the optimizations are completed, the computational cost is the same as random search.)
This graph shows the energy as a function of optimization step. The red lines indicate the three structures with the lowest energy. The most stable one reached the diamond structure. The structures that eventually became stable were selected at an early stage.
If algo = LAQA, the following are automatically set in the [option] section.
Force and stress data are collected step by step. Energy and structure data are NOT. They are collected for each selection. In other words, in this case, energy and structure data are saved once every 10 steps. If you want to collect energy and structure data step by step, manually set up as follows:
[option]
energy_step_flag = True
struc_step_flag = True
You may find it tedious to run cryspy over and over again. The auto script could help you.
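The idea of automating repeated runs can be sketched in Python as below. This is not the auto script shipped with CrySPY. The function run_repeatedly, its parameters, and the fixed waiting interval are assumptions made for illustration.

```python
import subprocess
import time


def run_repeatedly(n_runs, interval=60.0, cmd=("cryspy",),
                   runner=subprocess.run, sleep=time.sleep):
    """Run cmd n_runs times, waiting interval seconds between runs."""
    for i in range(n_runs):
        runner(list(cmd), check=True)  # wait for this cryspy run to return
        if i < n_runs - 1:
            sleep(interval)            # give submitted jobs time to progress
```

For example, run_repeatedly(100, interval=300) would invoke cryspy every five minutes, 100 times. Choose the interval according to how long your jobs take.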
First, see Tutorial > Random Search (RS) for basic usage of CrySPY.
In this section, we give a tutorial on the molecular structure generation part only. Since version 0.9.0, CrySPY has been able to generate random molecular crystal structures using PyXtal.
You need to use a molecule pre-defined in PyXtal’s database (see https://pyxtal.readthedocs.io/en/latest/Usage.html?highlight=benzene#pyxtal-molecule-pyxtal-molecule) or create molecule files that define molecular structures.
PyXtal currently supports C60
, H2O
, CH4
, NH3
, benzene
, naphthalene
, anthracene
, tetracene
, pentacene
, coumarin
, resorcinol
, benzamide
, aspirin
, ddt
, lindane
, glycine
, glucose
, and ROY
.
Let us generate molecular crystal structures that consist of two benzene molecules.
Move to your working directory, and copy input example files by one of the following methods.
cp -r ~/CrySPY_root/CrySPY-0.9.0/example/QE_benzene_2_RS_mol .
Take a look at cryspy.in
.
$ cat cryspy.in
[basic]
algo = RS
calc_code = QE
tot_struc = 6
nstage = 2
njob = 2
jobcmd = qsub
jobfile = job_cryspy
[structure]
struc_mode = mol
natot = 24
atype = H C
nat = 12 12
mol_file = benzene
nmol = 2
[QE]
qe_infile = pwscf.in
qe_outfile = pwscf.out
kppvol = 40 60
[option]
To generate molecular crystal structures, you must set struc_mode = mol
in the [structure]
section.
Molecule file(s) and the number of molecule(s) are specified as:
mol_file = benzene
nmol = 2
Run CrySPY and see the initial structures (./data/init_POSCARS
).
Move to your working directory, and copy input example files for 2 formula units of Li3PS4.
cp -r ~/CrySPY_root/CrySPY-0.9.0/example/QE_Li3PS4_2fu_RS_mol .
$ cd QE_Li3PS4_2fu_RS_mol
$ ls
Li.xyz PS4.xyz calc_in/ cryspy.in
Molecule files of Li and PS4 are included. Supported formats in PyXtal are .xyz
, .gjf
, .g03
, .g09
, .com
, .inp
, .out
, and pymatgen’s JSON
serialized molecules.
$ cat Li.xyz
1
New structure
Li 0.000 0.000 0.000
$ cat PS4.xyz
5
New structure
P 0.000000 0.000000 0.000000
S 1.200000 1.200000 -1.200000
S 1.200000 -1.200000 1.200000
S -1.200000 1.200000 1.200000
S -1.200000 -1.200000 -1.200000
Check cryspy.in
.
$ cat cryspy.in
[basic]
algo = RS
calc_code = QE
tot_struc = 4
nstage = 2
njob = 1
jobcmd = qsub
jobfile = job_cryspy
[structure]
struc_mode = mol
natot = 16
atype = Li P S
nat = 6 2 8
mol_file = ./Li.xyz ./PS4.xyz
nmol = 6 2
[QE]
qe_infile = pwscf.in
qe_outfile = pwscf.out
kppvol = 40 60
[option]
A single atom (Li atom in this case) is treated as a molecule in the molecular crystal structure generation mode. In this example, a random molecular structure is composed of six Li molecules (atoms) and two PS4 molecules specified as:
mol_file = ./Li.xyz ./PS4.xyz
nmol = 6 2
In mol_file
, set the paths of the molecule files relative to cryspy.in
.
Here the molecule files are placed in the same directory.
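The values of natot and nat must be consistent with mol_file and nmol. As a quick check, the following Python sketch counts the atoms in the two XYZ files shown above and multiplies them by nmol. The helper count_atoms_xyz is hypothetical and assumes the plain XYZ layout used in this example (atom count, comment line, then one atom per line).

```python
from collections import Counter


def count_atoms_xyz(xyz_text):
    """Count element symbols in the text of an XYZ file."""
    lines = xyz_text.strip().splitlines()
    n_atoms = int(lines[0])  # first line: number of atoms
    symbols = [line.split()[0] for line in lines[2:2 + n_atoms]]
    return Counter(symbols)


li_xyz = """1
New structure
Li 0.000 0.000 0.000
"""
ps4_xyz = """5
New structure
P 0.0 0.0 0.0
S 1.2 1.2 -1.2
S 1.2 -1.2 1.2
S -1.2 1.2 1.2
S -1.2 -1.2 -1.2
"""

nmol = [6, 2]  # same order as mol_file = ./Li.xyz ./PS4.xyz
total = Counter()
for text, n in zip([li_xyz, ps4_xyz], nmol):
    for symbol, count in count_atoms_xyz(text).items():
        total[symbol] += n * count

print(dict(total))          # -> {'Li': 6, 'P': 2, 'S': 8}
print(sum(total.values()))  # -> 16
```

The result matches nat = 6 2 8 and natot = 16 in cryspy.in.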
Run CrySPY and see the initial structures (./data/init_POSCARS
).
Molecular crystal structure generation can be time-consuming because PyXtal calculates the molecular orientations according to a specified space group.
Sometimes molecular crystal structure generation gets stuck.
CrySPY therefore sets a time limit on the generation of each single structure.
The time limit (timeout_mol
) is set to 120 seconds by default.
If the limit is insufficient, increase it as follows (see the last line):
struc_mode = mol
natot = 16
atype = Li P S
nat = 6 2 8
mol_file = ./Li.xyz ./PS4.xyz
nmol = 6 2
timeout_mol = 300.0
You can control the volume of unit cells by changing the value(s) of the scaling factor, vol_factor
, in cryspy.in
.
By default, vol_factor
is set to 1.0
.
It is also possible to specify a range of factors.
Set minimum and maximum values as follows:
struc_mode = mol
natot = 16
atype = Li P S
nat = 6 2 8
mol_file = ./Li.xyz ./PS4.xyz
nmol = 6 2
timeout_mol = 300.0
vol_factor = 0.8 1.5
Oct. 21 2023, update
First, see Tutorial > Random Search (RS) for basic usage of CrySPY.
Requirements:
CrySPY versions 1.1.0 through 1.2.2 have a bug.
When you use bash (zsh) to run a job with MPI (e.g., jobcmd = zsh
, jobfile = job_cryspy
),
the MPI job does not run. There is no problem when you use a job scheduler (qsub, sbatch).
It has already been fixed in version 1.2.3.
Install mpi4py if it is not already installed.
pip install mpi4py
cryspy.in
is the same as normal usage and does not need to be changed.
Here we try structure generation with MPI using the following settings:
[basic]
algo = RS
calc_code = soiap
tot_struc = 100
nstage = 1
njob = 2
jobcmd = zsh
jobfile = job_cryspy
[structure]
natot = 8
atype = Si
nat = 8
[soiap]
soiap_infile = soiap.in
soiap_outfile = soiap.out
soiap_cif = initial.cif
[option]
All settings except tot_struc
, natot
, atype
, and nat
are irrelevant for structure generation and can be ignored here.
If you want to generate structures with 4 MPI processes, just use mpiexec -n
(with the -p option):
mpiexec -n 4 cryspy -p
In CrySPY versions 1.1.0 through 1.2.2, omit the -p option:
mpiexec -n 4 cryspy
If you submit the job with a job scheduler system, prepare a job file. Here is an example:
#!/bin/sh
#$ -cwd
#$ -V -S /bin/bash
#$ -N n_nproc
#$ -pe smp 4
mpirun -np $NSLOTS ~/.local/bin/cryspy
Please edit the location of the executable script cryspy
.
CrySPY simply divides the task (number of structures) by the number of processes. With tot_struc = 100 and 4 processes, each process generates 25 structures.
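As a rough illustration of this division, the Python sketch below splits structure IDs into contiguous blocks, one per process. Contiguous blocks are an inference from the log in this section, where IDs 0, 25, 50, and 75 appear first. The function split_ids and its handling of a remainder are assumptions, not CrySPY's actual code.

```python
def split_ids(tot_struc, nproc):
    """Split IDs 0..tot_struc-1 into nproc contiguous blocks."""
    base, extra = divmod(tot_struc, nproc)
    blocks, start = [], 0
    for rank in range(nproc):
        # Assumption: the first `extra` ranks take one more structure
        size = base + (1 if rank < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


for rank, block in enumerate(split_ids(100, 4)):
    print(rank, block[0], block[-1])
# 0 0 24
# 1 25 49
# 2 50 74
# 3 75 99
```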
CrySPY outputs the log in the order the structures are generated, as follows:
2023/04/24 22:47:51
CrySPY 1.1.0
Start cryspy.py
Number of MPI processes: 4
Read input file, cryspy.in
Save input data in cryspy.stat
# --------- Generate initial structures
# ------ mindist
Si - Si 1.11
Structure ID 25 was generated. Space group: 138 --> 123 P4/mmm
Structure ID 75 was generated. Space group: 99 --> 99 P4mm
Structure ID 0 was generated. Space group: 127 --> 123 P4/mmm
Structure ID 1 was generated. Space group: 61 --> 61 Pbca
Structure ID 50 was generated. Space group: 38 --> 38 Amm2
Structure ID 51 was generated. Space group: 134 --> 123 P4/mmm
Structure ID 26 was generated. Space group: 111 --> 123 P4/mmm
Structure ID 2 was generated. Space group: 9 --> 9 Cc
Structure ID 3 was generated. Space group: 80 --> 80 I4_1
Structure ID 4 was generated. Space group: 107 --> 107 I4mm
Structure ID 5 was generated. Space group: 75 --> 75 P4
Structure ID 76 was generated. Space group: 108 --> 108 I4cm
Structure ID 77 was generated. Space group: 100 --> 100 P4bm
Structure ID 27 was generated. Space group: 207 --> 221 Pm-3m
However, the order in init_POSCARS
is by structure ID, since CrySPY writes the file after all structures have been generated.
ID_0
1.0
2.9636956737951818 0.0000000000000002 0.0000000000000002
0.0000000000000000 2.9636956737951818 0.0000000000000002
0.0000000000000000 0.0000000000000000 6.2634106638053080
Si
8
direct
-0.1602734164607877 -0.1602734164607877 -0.0000000000000000 Si
0.1602734164607877 0.1602734164607877 0.5000000000000000 Si
0.6602734164607877 0.3397265835392123 0.7500000000000000 Si
0.3397265835392122 0.6602734164607877 0.2500000000000000 Si
0.4469739273741755 0.4469739273741755 -0.0000000000000000 Si
0.5530260726258245 0.5530260726258244 0.5000000000000000 Si
0.0530260726258245 0.9469739273741754 0.7500000000000000 Si
0.9469739273741754 0.0530260726258245 0.2500000000000000 Si
ID_1
1.0
7.2751506682509657 0.0000000000000004 0.0000000000000004
0.0000000000000000 7.2751506682509657 0.0000000000000004
0.0000000000000000 0.0000000000000000 5.1777634169924873
Si
8
direct
-0.3845341807505553 -0.3845341807505553 0.4999999999999999 Si
0.3845341807505553 0.3845341807505553 0.5000000000000000 Si
0.3845341807505553 -0.3845341807505553 0.0000000000000000 Si
-0.3845341807505553 0.3845341807505553 -0.0000000000000000 Si
0.0000000000000000 0.5000000000000000 0.2500000000000000 Si
0.5000000000000000 0.0000000000000000 0.7500000000000000 Si
0.0000000000000000 0.5000000000000000 0.7500000000000000 Si
0.5000000000000000 0.0000000000000000 0.2500000000000000 Si
ID_2
1.0
-4.3660398676292269 -4.3660398676292269 0.0000000000000000
-4.3660398676292269 -0.0000000000000003 -4.3660398676292269
0.0000000000000000 -4.3660398676292269 -4.3660398676292269
Si
8
direct
0.8700001548800920 0.8700001548800920 0.1299998451199080 Si
0.1299998451199080 0.1299998451199080 0.8700001548800920 Si
0.8700001548800920 0.1299998451199080 0.8700001548800920 Si
0.1299998451199080 0.8700001548800920 0.1299998451199080 Si
0.1299998451199080 0.8700001548800920 0.8700001548800920 Si
0.8700001548800920 0.1299998451199080 0.1299998451199080 Si
0.7500000000000000 0.7500000000000000 0.7500000000000000 Si
0.2500000000000000 0.2500000000000000 0.2500000000000000 Si
MPI is useful only for the random structure generation part; the other parts of CrySPY are not parallelized.