Subsections of Scripts
extract_struc.py
2024 April 16 update
Script to extract structures from init_struc_data.pkl or opt_struc_data.pkl.
This script can print structure information and output cif files.
You can specify structure ID(s) using the -i option.
The top k structures (the k most stable structures) can be extracted using the -t option.
The -a option outputs all the structures (note that many cif files will be written).
Symmetrized cif files can be generated with the -s option.
When outputting a symmetrized cif file, you can also specify a tolerance with --tolerance.
Structure information is printed with -p.
If you use the -p option, cif files are not output.
You can also read a gzipped file (e.g., opt_struc_data.pkl.gz).
Update History
- 2024 April 16: --tolerance option, gzip
- 2023 July 21: --print option
Usage
python3 extract_struc.py -h
or if you put the script in your PATH, you can omit python3
extract_struc.py -h
usage: extract_struc.py [-h] [-p] [-a] [-i [INDEX ...]] [-t TOP] [-r] [-s] [--tolerance TOLERANCE] infile
positional arguments:
infile input file
options:
-h, --help show this help message and exit
-p, --print just print, e.g., extract_struc.py opt_struc_data.pkl -i 7 10 12 -ps
-a, --all_id all structures, e.g., extract_struc.py opt_struc_data.pkl -as
-i [INDEX ...], --index [INDEX ...]
structure ID, e.g., extract_struc.py opt_struc_data.pkl -i 7 10 12 -s
-t TOP, --top TOP top k structures, e.g. (k = 3), extract_struc.py opt_struc_data.pkl -t 3 -s
-r, --rank add rank in file names, e.g., extract_struc.py opt_struc_data.pkl -t 3 -rs
-s, --symmetrized symmetrized structure, e.g., extract_struc.py opt_struc_data.pkl -i 7 10 12 -s
--tolerance TOLERANCE
tolerance for symmetrization (default 0.01), e.g., extract_struc.py opt_struc_data.pkl -i 0 1 -s --tolerance 0.01
Examples
The -p option can be used in combination with any option except for the -s option.
extract_struc.py -p opt_struc_data.pkl -i 0 1
ID 0
Full Formula (Na8 Cl8)
Reduced Formula: NaCl
abc : 6.823618 6.823618 7.566454
angles: 90.000000 90.000000 96.650518
pbc : True True True
Sites (16)
# SP a b c
--- ---- -------- -------- --------
0 Na 0 0 1
1 Na 0 0 0.5
2 Na 0.704707 0.295293 0.75
3 Na 0.295293 0.704707 0.25
4 Na 0.5 0 1
5 Na 0.5 0 0.5
6 Na 0 0.5 0.5
7 Na 0 0.5 0
8 Cl 0.5 0.5 0
9 Cl 0.5 0.5 0.5
10 Cl 0.484753 0.515247 0.75
11 Cl 0.515247 0.484753 0.25
12 Cl 0.828247 0.171753 0.851096
13 Cl 0.171753 0.828247 0.351096
14 Cl 0.828247 0.171753 0.648904
15 Cl 0.171753 0.828247 0.148904
ID 1
Full Formula (Na8 Cl8)
Reduced Formula: NaCl
abc : 8.145021 8.145021 4.324235
angles: 90.000000 90.000000 120.000000
pbc : True True True
Sites (16)
# SP a b c
--- ---- --------- --------- --------
0 Na 0.666667 0.333333 0.736206
1 Na 0.666667 0.333333 0.263794
2 Na 0.913147 0.086853 0.5
3 Na 0.913147 0.826295 0.5
4 Na 0.173705 0.086853 0.5
5 Na 0.77711 0.22289 0
6 Na 0.77711 0.55422 0
7 Na 0.44578 0.22289 0
8 Cl 0.027675 0.423376 0.5
9 Cl -0.423376 -0.395701 0.5
10 Cl 0.395701 -0.027675 0.5
11 Cl -0.423376 -0.027675 0.5
12 Cl 0.395701 0.423376 0.5
13 Cl 0.027675 -0.395701 0.5
14 Cl 0.333333 0.666667 0.5
15 Cl 0 0 0
Structure ID
extract_struc.py opt_struc_data.pkl -i 7 10 12
7.cif, 10.cif, and 12.cif are output.
For symmetrized cif,
extract_struc.py opt_struc_data.pkl -i 7 10 12 -s
2024 April 16
With the tolerance parameter (default 0.01)
extract_struc.py opt_struc_data.pkl -i 7 10 12 -s --tolerance 0.01
Top k structures
rslt_data.pkl is required in the same directory as the input file.
Let us suppose the following files exist:
- ./data/pkl_data/opt_struc_data.pkl
- ./data/pkl_data/rslt_data.pkl
and that the cryspy_rslt_energy_asc file is as follows:
    Spg_num     Spg_sym  Spg_num_opt Spg_sym_opt     E_eV_atom  Magmom      Opt
9       110      I4_1cd          110      I4_1cd  -1284.708037     NaN  not_yet
16        4        P2_1            4        P2_1  -1284.693651     NaN     done
97       92    P4_12_12           91      P4_122  -1284.692494     NaN     done
8        57        Pbcm           57        Pbcm  -1284.668504     NaN     done
81       19  P2_12_12_1           19  P2_12_12_1  -1284.635684     NaN     done
...
The top k (= 3) structures can be extracted with:
extract_struc.py ./data/pkl_data/opt_struc_data.pkl -t 3
In this example, rslt_data.pkl must be in ./data/pkl_data/.
9.cif, 16.cif, and 97.cif are output.
The rank can be included in the cif file names with the -r option:
extract_struc.py ./data/pkl_data/opt_struc_data.pkl -t 3 -r
1_9.cif, 2_16.cif, and 3_97.cif are output.
For symmetrized cif:
extract_struc.py ./data/pkl_data/opt_struc_data.pkl -t 3 -rs
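The selection and naming rule used by -t and -r can be sketched in a few lines of Python. This is only an illustration and not the code inside extract_struc.py: the function name top_k_cif_names and the plain {ID: energy} dictionary are assumptions made for the example, whereas the real script takes the energies from rslt_data.pkl.

```python
def top_k_cif_names(energies, k, with_rank=False):
    """Return cif file names for the k lowest-energy structure IDs.

    energies: hypothetical {structure ID: energy per atom (eV)} mapping.
    """
    # Sort IDs by ascending energy (most stable first) and keep k of them.
    ranked_ids = sorted(energies, key=energies.get)[:k]
    if with_rank:
        # Prefix the 1-based rank, mimicking the -r option.
        return [f"{rank}_{cid}.cif" for rank, cid in enumerate(ranked_ids, start=1)]
    return [f"{cid}.cif" for cid in ranked_ids]


# Energies copied from the cryspy_rslt_energy_asc example.
energies = {9: -1284.708037, 16: -1284.693651, 97: -1284.692494,
            8: -1284.668504, 81: -1284.635684}
print(top_k_cif_names(energies, 3))                  # ['9.cif', '16.cif', '97.cif']
print(top_k_cif_names(energies, 3, with_rank=True))  # ['1_9.cif', '2_16.cif', '3_97.cif']
```

Note that the ranking depends only on the energy, so structure 9 is included even though its Opt column reads not_yet.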
All the structures
Create a dedicated directory first, because one cif file is written for each structure.
mkdir init_cifs
cd init_cifs
extract_struc.py /path/to/opt_struc_data.pkl -a
For symmetrized cif,
extract_struc.py /path/to/init_struc_data.pkl -as
Gzipped files
2024 April 16
Gzipped files (ending with .gz) can be read:
extract_struc.py opt_struc_data.pkl.gz -i 0 1 -s
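If you want to read such files in your own Python code, one possible approach is to choose the opener from the file extension. The sketch below is an assumption about how this could be done, not the script's actual implementation, and the helper name load_pkl is hypothetical. Unpickling real CrySPY data additionally requires pymatgen to be importable, and pickle files should only be loaded from trusted sources.

```python
import gzip
import pickle


def load_pkl(path):
    """Load a pickle file, using gzip when the name ends with .gz."""
    # gzip.open and open share the same interface in binary mode.
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rb") as f:
        return pickle.load(f)
```

For example, load_pkl("opt_struc_data.pkl.gz") and load_pkl("opt_struc_data.pkl") would both return the stored object.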
pos2pkl.py
2023 July 23 update
Script to convert structure data into init_struc_data.pkl. The default input format is init_POSCARS. Single-structure data such as POSCAR and cif files can optionally be converted. The output is init_struc_data.pkl. Structure data can be appended to an existing init_struc_data.pkl. The structure IDs in the input are not taken into account; IDs are newly assigned. If the number of atoms differs, an error is raised.
init_struc_data.pkl can be loaded at the start of a simulation in CrySPY.
You can remove and sort species with the -f option. Note that without this option, pymatgen will sort the species in electronegativity order!
Usage
usage: pos2pkl.py [-h] [-s [SINGLE ...]] [-f [FILTER ...]] [-p] [infile ...]
positional arguments:
infile input file: init_POSCARS
options:
-h, --help show this help message and exit
-s [SINGLE ...], --single [SINGLE ...]
input file: single structure file (POSCAR, cif)
-f [FILTER ...], --filter [FILTER ...]
filter (sort): remove species and sort
-p, --permit_diff_comp
flag for permitting different composition
Examples
init_POSCARS -> init_struc_data.pkl
This can be used to convert init_POSCARS generated by CrySPY to init_struc_data.pkl on another machine such as a supercomputer. Multiple input files can be converted.
python3 pos2pkl.py init_POSCARS
If you put pos2pkl.py in your PATH, you can omit python3.
pos2pkl.py init_POSCARS
Composition: Na8 Cl8
Converted. The number of structures: 4
Save init_struc_data.pkl
Multiple inputs:
python3 pos2pkl.py init_POSCARS init_POSCARS2 init_POSCARS3
Composition: Na8 Cl8
Converted. The number of structures: 12
Save init_struc_data.pkl
If init_struc_data.pkl already exists in the current directory and you want to append to it:
python3 pos2pkl.py init_POSCARS
init_struc_data.pkl already exists.
Append to init_struc_data.pkl? [y/n]: y
Load init_struc_data
Composition: Na8 Cl8
The number of structures: 12
Converted. The number of structures: 16
Save init_struc_data.pkl
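The append behavior (incoming IDs ignored, new IDs assigned) can be pictured with the following sketch. It assumes that init_struc_data.pkl holds a dictionary keyed by consecutive integer IDs starting at 0; the function name append_structures is hypothetical, and strings stand in for the structure objects.

```python
def append_structures(existing, new_structures):
    """Return a new {ID: structure} dict with new_structures appended.

    IDs of the incoming structures are ignored; each one receives the next
    free integer ID (assumption: IDs are consecutive integers from 0).
    """
    merged = dict(existing)
    next_id = max(merged, default=-1) + 1
    for struc in new_structures:
        merged[next_id] = struc
        next_id += 1
    return merged


# 12 structures already stored, 4 more read from init_POSCARS.
existing = {i: f"struc_{i}" for i in range(12)}
merged = append_structures(existing, ["a", "b", "c", "d"])
print(len(merged))  # 16
```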
POSCAR or cif -> init_struc_data.pkl
Single-structure data such as POSCAR and cif files can also be converted. The -s/--single option is required.
python3 pos2pkl.py -s POSCAR test.cif
Composition: Na8 Cl8
Converted. The number of structures: 2
Save init_struc_data.pkl
init_POSCARS, POSCAR -> init_struc_data.pkl
python3 pos2pkl.py init_POSCARS -s POSCAR
Composition: Na8 Cl8
Converted. The number of structures: 5
Save init_struc_data.pkl
The following is wrong: because init_POSCARS comes after -s, it is also treated as a single-structure file.
python3 pos2pkl.py -s POSCAR init_POSCARS
Filter (remove and sort)
Here we consider a cif file with the composition Sr8 Co8 O20 X4, including 4 dummy atoms (X4).
The -f/--filter option can be used to remove and sort species.
Specify the species in the same way as atype in cryspy.in.
python3 pos2pkl.py -s Sr8Co8O20X4.cif -f Sr Co O
Removed species: {'X0+'}
Composition: Sr8 Co8 O20
Converted. The number of structures: 1
Save init_struc_data.pkl
With extract_struc.py, you can see how the structure was registered in init_struc_data.pkl.
python3 extract_struc.py init_struc_data.pkl -pa
ID 0
Full Formula (Sr8 Co8 O20)
Reduced Formula: Sr2Co2O5
...
The -f option also lets you change the order of the species; species that are not listed are removed.
python3 pos2pkl.py -s Sr8Co8O20X4.cif -f O Co
Removed species: {'Sr', 'X0+'}
Composition: O20 Co8
Converted. The number of structures: 1
Save init_struc_data.pkl
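The remove-and-sort behavior can be illustrated on a plain list of element symbols. This is a sketch of the idea only: the function name filter_species is hypothetical, the dummy species is written simply as "X", and the real script operates on pymatgen structures rather than on lists of strings.

```python
def filter_species(symbols, keep_order):
    """Drop species not in keep_order and sort the rest in that order.

    symbols: list of element symbols, one per site (hypothetical input).
    Returns (filtered_symbols, removed_species).
    """
    removed = {s for s in symbols if s not in keep_order}
    kept = [s for s in symbols if s in keep_order]
    # list.sort is stable, so sites of one species keep their relative order.
    kept.sort(key=keep_order.index)
    return kept, removed


# Sr8 Co8 O20 X4 filtered with "-f O Co".
symbols = ["Sr"] * 8 + ["Co"] * 8 + ["O"] * 20 + ["X"] * 4
kept, removed = filter_species(symbols, ["O", "Co"])
print(kept.count("O"), kept.count("Co"), sorted(removed))  # 20 8 ['Sr', 'X']
```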
kpt_check.py
kpt_check.py checks the k-point mesh obtained for a given kppvol.
This script supports POSCAR, CONTCAR, and init_struc_data.pkl.
The pymatgen library is required.
After generating initial structures, you can use it to judge what value of kppvol is appropriate.
Usage
python3 kpt_check.py -h
or if you put the script in your PATH, you can omit python3
kpt_check.py -h
usage: kpt_check.py [-h] [-w] [-n NSTRUC] infile kppvol
positional arguments:
infile input file: POSCAR, CONTCAR, or init_struc_data.pkl
kppvol kppvol
options:
-h, --help show this help message and exit
-w, --write write KPOINTS
-n NSTRUC, --nstruc NSTRUC
number of structure to check
Example
POSCAR with a given kppvol
kpt_check.py POSCAR 100
a = 10.689217
b = 10.689217
c = 10.730846
Lattice vector
10.689217 0.000000 0.000000
0.000000 10.689217 0.000000
0.000000 0.000000 10.730846
kppvol: 100
k-points: [2, 2, 2]
Write KPOINTS file
You can generate a KPOINTS file using the -w option.
kpt_check.py -w POSCAR 100
$ cat KPOINTS
pymatgen 4.7.6+ generated KPOINTS with grid density = 607 / atom
0
Monkhorst
2 2 2
Check k-point meshes for init_struc_data.pkl
When checking k-point meshes for init_struc_data.pkl, the first five structures are checked by default.
You can change the number of structures using the -n option.
kpt_check.py -n 3 init_struc_data.pkl 100
# ---------- 0th structure
a = 8.0343076893
b = 8.03430768936
c = 9.1723323373
Lattice vector
8.034308 0.000000 0.000000
-4.017154 6.957915 0.000000
0.000000 0.000000 9.172332
kppvol: 100
k-points: [3, 3, 3]
# ---------- 1th structure
a = 9.8451944096
b = 9.84519440959
c = 6.8764313585
Lattice vector
9.845194 0.000000 0.000000
-4.922597 8.526188 0.000000
0.000000 0.000000 6.876431
kppvol: 100
k-points: [3, 3, 4]
# ---------- 2th structure
a = 7.5760383679
b = 7.57603836797
c = 6.6507478296
Lattice vector
7.576038 0.000000 0.000000
-3.788019 6.561042 0.000000
0.000000 0.000000 6.650748
kppvol: 100
k-points: [4, 4, 4]
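The meshes printed in these examples are consistent with the simple rule sketched below, which is modeled on pymatgen's volume-based automatic k-point density. Whether kpt_check.py uses exactly this rule is an assumption, pymatgen applies a further small adjustment that is omitted here, and the function name kmesh_from_kppvol is hypothetical. The sketch uses only the standard library so that it can be run without pymatgen.

```python
import math


def kmesh_from_kppvol(lattice, kppvol):
    """Estimate a k-point mesh from lattice vectors (rows, in Angstrom)."""
    a, b, c = lattice
    # Cell volume: |a . (b x c)|
    cross = (b[1] * c[2] - b[2] * c[1],
             b[2] * c[0] - b[0] * c[2],
             b[0] * c[1] - b[1] * c[0])
    volume = abs(sum(ai * xi for ai, xi in zip(a, cross)))
    # Reciprocal-cell volume, including the (2*pi)^3 factor.
    recip_volume = (2 * math.pi) ** 3 / volume
    # Target number of grid points for this cell.
    ngrid = kppvol * recip_volume
    lengths = [math.sqrt(sum(x * x for x in v)) for v in lattice]
    mult = (ngrid * lengths[0] * lengths[1] * lengths[2]) ** (1 / 3)
    # Shorter lattice vectors get more divisions; never fewer than one.
    return [math.floor(max(mult / length, 1)) for length in lengths]


# Lattice of the "1th structure" in the example output.
lattice = [[9.845194, 0.0, 0.0],
           [-4.922597, 8.526188, 0.0],
           [0.0, 0.0, 6.876431]]
print(kmesh_from_kppvol(lattice, 100))  # [3, 3, 4]
```

Because the number of divisions is rounded down, increasing kppvol changes the mesh only in steps, which is why it helps to check a few structures before fixing the value.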
repeat_cryspy
You may find it tedious to run cryspy over and over again. This automation script could help you.
It runs cryspy every 5 minutes and exits when the last line of log_cryspy reports that the search has finished.
#!/bin/bash
set -e
while :
do
    cryspy -n
    LOG_LASTLINE=$(tail -n 1 log_cryspy)
    if [ "$LOG_LASTLINE" = "Done all structures!" ]
    then
        exit 0
    # ---------- for EA
    elif [ "${LOG_LASTLINE:0:17}" = "Reached maxgen_ea" ]
    then
        exit 0
    elif [ "$LOG_LASTLINE" = "EA is ready" ]
    then
        cryspy -n    # EA
        LOG_LASTLINE=$(tail -n 1 log_cryspy)
        if [ "${LOG_LASTLINE:0:17}" = "Reached maxgen_ea" ]
        then
            exit 0
        fi
        cryspy -n    # submit jobs
    # ---------- for BO
    elif [ "${LOG_LASTLINE:0:21}" = "Reached max_select_bo" ]
    then
        exit 0
    elif [ "$LOG_LASTLINE" = "BO is ready" ]
    then
        cryspy -n    # selection
        LOG_LASTLINE=$(tail -n 1 log_cryspy)
        if [ "${LOG_LASTLINE:0:21}" = "Reached max_select_bo" ]
        then
            exit 0
        fi
        cryspy -n    # submit jobs
    # ---------- for LAQA
    elif [ "$LOG_LASTLINE" = "LAQA is ready" ]
    then
        cryspy -n    # selection
        cryspy -n    # submit jobs
    fi
    sleep 300    # seconds
done
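The branching in the script can be summarized as a small decision function, shown here in Python for readability. This is a simplified restatement and not a replacement for the script: the function name cryspy_runs_needed is hypothetical, and it leaves out the script's second log check between the two extra cryspy calls in the EA and BO branches.

```python
def cryspy_runs_needed(last_line):
    """Map the last line of log_cryspy to an action.

    Returns None when the loop should stop, otherwise the number of extra
    `cryspy -n` calls to make before sleeping.
    """
    if last_line == "Done all structures!":
        return None
    # The script compares only the beginning of these two messages.
    if last_line.startswith(("Reached maxgen_ea", "Reached max_select_bo")):
        return None
    if last_line in ("EA is ready", "BO is ready", "LAQA is ready"):
        return 2  # one call for generation/selection, one to submit jobs
    return 0


print(cryspy_runs_needed("EA is ready"))           # 2
print(cryspy_runs_needed("Done all structures!"))  # None
```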
print_pkl.py
2024 May 31
When you want to quickly check the pickled files under data/pkl_data/, print_pkl.py is convenient.
Usage
python3 print_pkl.py xxxx.pkl
or if you put the script in your PATH, you can omit python3
print_pkl.py xxxx.pkl
Example
print_pkl.py init_struc_data.pkl
Number of structures: 10
dict_keys([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
print_pkl.py input_data.pkl
[basic]
algo = RS
calc_code = ASE
tot_struc = 10
nstage = 1
njob = 5
jobcmd = zsh
jobfile = job_cryspy
[structure]
struc_mode = crystal
natot = 8
atype = ('Cu', 'Au')
nat = (4, 4)
mindist_factor = 1.0
vol_factor = 1.1
symprec = 0.01
spgnum = all
use_find_wy = False
[option]
stop_chkpt = 0
load_struc_flag = False
stop_next_struc = False
append_struc_ea = False
energy_step_flag = False
struc_step_flag = False
force_step_flag = False
stress_step_flag = False
[ASE]
kpt_flag = False
force_gamma = False
ase_python = ase_in.py
print_pkl.py elite_struc.pkl
Number of structures: 2
dict_keys([3, 6])
print_pkl.py elite_fitness.pkl
{3: -325.79973412221455, 6: -324.8381948581405}