Subsections of Scripts

extract_struc.py

2024 April 16 update

Script to extract structures from init_struc_data.pkl or opt_struc_data.pkl. This script can print structure information and output cif files.

Specify structure IDs with the -i option. The -t option extracts the top k structures (the k most stable ones). The -a option outputs all structures; note that this writes one cif file per structure, so many files may be created. The -s option generates symmetrized cif files, and --tolerance sets the tolerance used for symmetrization. The -p option prints structure information; when -p is given, no cif files are written. Gzipped input files (e.g., opt_struc_data.pkl.gz) can also be read.

Update History

  • 2024 April 16: --tolerance option, gzip
  • 2023 July 21: --print option

Usage

python3 extract_struc.py -h

Or, if the script is in your PATH, you can omit python3:

extract_struc.py -h
usage: extract_struc.py [-h] [-p] [-a] [-i [INDEX ...]] [-t TOP] [-r] [-s] [--tolerance TOLERANCE] infile

positional arguments:
  infile                input file

options:
  -h, --help            show this help message and exit
  -p, --print           just print, e.g., extract_struc.py opt_struc_data.pkl -i 7 10 12 -ps
  -a, --all_id          all structures, e.g., extract_struc.py opt_struc_data.pkl -as
  -i [INDEX ...], --index [INDEX ...]
                        structure ID, e.g., extract_struc.py opt_struc_data.pkl -i 7 10 12 -s
  -t TOP, --top TOP     top k structures, e.g. (k = 3), extract_struc.py opt_struc_data.pkl -t 3 -s
  -r, --rank            add rank in file names, e.g., extract_struc.py opt_struc_data.pkl -t 3 -rs
  -s, --symmetrized     symmetrized structure, e.g., extract_struc.py opt_struc_data.pkl -i 7 10 12 -s
  --tolerance TOLERANCE
                        tolerance for symmetrization (default 0.01), e.g., extract_struc.py opt_struc_data.pkl -i 0 1 -s --tolerance 0.01

Examples

Print

The -p option can be combined with any option except -s.

extract_struc.py -p opt_struc_data.pkl -i 0 1
ID 0
Full Formula (Na8 Cl8)
Reduced Formula: NaCl
abc   :   6.823618   6.823618   7.566454
angles:  90.000000  90.000000  96.650518
pbc   :       True       True       True
Sites (16)
  #  SP           a         b         c
---  ----  --------  --------  --------
  0  Na    0         0         1
  1  Na    0         0         0.5
  2  Na    0.704707  0.295293  0.75
  3  Na    0.295293  0.704707  0.25
  4  Na    0.5       0         1
  5  Na    0.5       0         0.5
  6  Na    0         0.5       0.5
  7  Na    0         0.5       0
  8  Cl    0.5       0.5       0
  9  Cl    0.5       0.5       0.5
 10  Cl    0.484753  0.515247  0.75
 11  Cl    0.515247  0.484753  0.25
 12  Cl    0.828247  0.171753  0.851096
 13  Cl    0.171753  0.828247  0.351096
 14  Cl    0.828247  0.171753  0.648904
 15  Cl    0.171753  0.828247  0.148904

ID 1
Full Formula (Na8 Cl8)
Reduced Formula: NaCl
abc   :   8.145021   8.145021   4.324235
angles:  90.000000  90.000000 120.000000
pbc   :       True       True       True
Sites (16)
  #  SP            a          b         c
---  ----  ---------  ---------  --------
  0  Na     0.666667   0.333333  0.736206
  1  Na     0.666667   0.333333  0.263794
  2  Na     0.913147   0.086853  0.5
  3  Na     0.913147   0.826295  0.5
  4  Na     0.173705   0.086853  0.5
  5  Na     0.77711    0.22289   0
  6  Na     0.77711    0.55422   0
  7  Na     0.44578    0.22289   0
  8  Cl     0.027675   0.423376  0.5
  9  Cl    -0.423376  -0.395701  0.5
 10  Cl     0.395701  -0.027675  0.5
 11  Cl    -0.423376  -0.027675  0.5
 12  Cl     0.395701   0.423376  0.5
 13  Cl     0.027675  -0.395701  0.5
 14  Cl     0.333333   0.666667  0.5
 15  Cl     0          0         0

Structure ID

extract_struc.py opt_struc_data.pkl -i 7 10 12

7.cif, 10.cif, and 12.cif are output.

For symmetrized cif files:

extract_struc.py opt_struc_data.pkl -i 7 10 12 -s

2024 April 16
To set the symmetrization tolerance (default 0.01):

extract_struc.py opt_struc_data.pkl -i 7 10 12 -s --tolerance 0.01

Top k structures

Info

rslt_data.pkl is required in the same directory as the input.

Let us suppose

  • ./data/pkl_data/opt_struc_data.pkl
  • ./data/pkl_data/rslt_data.pkl

and cryspy_rslt_energy_asc file is as follows:

    Spg_num     Spg_sym  Spg_num_opt Spg_sym_opt    E_eV_atom  Magmom      Opt
9       110      I4_1cd          110      I4_1cd -1284.708037     NaN  not_yet
16        4        P2_1            4        P2_1 -1284.693651     NaN     done
97       92    P4_12_12           91      P4_122 -1284.692494     NaN     done
8        57        Pbcm           57        Pbcm -1284.668504     NaN     done
81       19  P2_12_12_1           19  P2_12_12_1 -1284.635684     NaN     done
...

Top k(=3) structures can be extracted with:

extract_struc.py ./data/pkl_data/opt_struc_data.pkl -t 3

In this example, rslt_data.pkl must be in ./data/pkl_data/. The files 9.cif, 16.cif, and 97.cif are output.

The rank can be included in cif file names with -r option:

extract_struc.py ./data/pkl_data/opt_struc_data.pkl -t 3 -r

1_9.cif, 2_16.cif, and 3_97.cif are output.

For symmetrized cif:

extract_struc.py ./data/pkl_data/opt_struc_data.pkl -t 3 -rs
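
The selection and naming shown above can be sketched in a few lines of Python. This is only an illustration, not the script's actual code: top_k_filenames is a hypothetical helper, and the energies are assumed to be available as a plain {ID: energy} dict (the real values come from rslt_data.pkl). Entries with missing (NaN) energies are not handled here.

```python
def top_k_filenames(energies, k, with_rank=False):
    """Return cif file names for the k lowest-energy structure IDs."""
    # Sort IDs by energy, lowest (most stable) first, and keep the first k
    ranked = sorted(energies, key=energies.get)[:k]
    if with_rank:
        # Prefix each name with its 1-based rank, as the -r option does
        return [f"{rank}_{sid}.cif" for rank, sid in enumerate(ranked, start=1)]
    return [f"{sid}.cif" for sid in ranked]


# E_eV_atom values copied from the cryspy_rslt_energy_asc listing above
energies = {9: -1284.708037, 16: -1284.693651, 97: -1284.692494,
            8: -1284.668504, 81: -1284.635684}
print(top_k_filenames(energies, 3))                  # names without rank
print(top_k_filenames(energies, 3, with_rank=True))  # names with rank prefix
```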

All the structures

Because one cif file is written per structure, it is advisable to create a dedicated directory first.

mkdir init_cifs
cd init_cifs
extract_struc.py /path/to/opt_struc_data.pkl -a

For symmetrized cif files:

extract_struc.py /path/to/init_struc_data.pkl -as

Gzipped files

2024 April 16
Gzipped files (end with .gz) can be read:

extract_struc.py opt_struc_data.pkl.gz -i 0 1 -s
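
If you want to read such files in your own Python code, a gzip-aware loader might look like the sketch below. load_pkl is a hypothetical helper, not part of the script. The pickled structures appear to be pymatgen objects, so pymatgen presumably has to be importable when loading real CrySPY files. Only unpickle files from sources you trust.

```python
import gzip
import pickle
from pathlib import Path


def load_pkl(path):
    """Load a pickle file, transparently handling a trailing .gz suffix."""
    path = Path(path)
    # gzip.open returns a binary file object that pickle can read directly
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rb") as f:
        return pickle.load(f)
```

For example, load_pkl("opt_struc_data.pkl.gz") and load_pkl("opt_struc_data.pkl") would return the same object if both files hold the same data.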

pos2pkl.py

2023 July 23 update

Script to convert structure data into init_struc_data.pkl. The default input format is init_POSCARS. Single-structure files such as POSCAR and cif files can optionally be converted as well. The output is init_struc_data.pkl, and structures can be appended to an existing init_struc_data.pkl. Structure IDs in the input are ignored and new IDs are assigned. If the number of atoms differs between structures, an error is raised.

init_struc_data.pkl can be loaded at the start of the simulation in CrySPY.

You can remove and sort species with -f option. Note that without this option, pymatgen will sort the species in electronegativity order!

Usage

usage: pos2pkl.py [-h] [-s [SINGLE ...]] [-f [FILTER ...]] [-p] [infile ...]

positional arguments:
  infile                input file: init_POSCARS

options:
  -h, --help            show this help message and exit
  -s [SINGLE ...], --single [SINGLE ...]
                        input file: single structure file (POSCAR, cif)
  -f [FILTER ...], --filter [FILTER ...]
                        filter (sort): remove species and sort
  -p, --permit_diff_comp
                        flag for permitting different composition

Examples

init_POSCARS -> init_struc_data.pkl

This is useful for converting an init_POSCARS file generated by CrySPY into init_struc_data.pkl on another machine, such as a supercomputer. Multiple input files can be converted at once.

python3 pos2pkl.py init_POSCARS

If you put the pos2pkl.py in your PATH, you can omit python3.

pos2pkl.py init_POSCARS
Composition: Na8 Cl8

Converted. The number of structures: 4
Save init_struc_data.pkl

Multiple inputs:

python3 pos2pkl.py init_POSCARS init_POSCARS2 init_POSCARS3
Composition: Na8 Cl8

Converted. The number of structures: 12
Save init_struc_data.pkl

If init_struc_data.pkl already exists in the current directory and you want to append to it:

python3 pos2pkl.py init_POSCARS
init_struc_data.pkl already exists.
Append to init_struc_data.pkl? [y/n]: y

Load init_struc_data
Composition: Na8 Cl8
The number of structures: 12

Converted. The number of structures: 16
Save init_struc_data.pkl
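
The append step above (12 existing structures plus 4 new ones giving 16) can be pictured as follows. This is a sketch under assumptions: append_structures is a hypothetical helper, the data is treated as a plain {ID: structure} dict, and new IDs are assumed to continue from the largest existing ID. The actual script may number them differently.

```python
def append_structures(existing, new_structures):
    """Add new_structures to an {ID: structure} dict, assigning fresh IDs."""
    merged = dict(existing)  # copy so the caller's dict is not modified
    # Continue numbering after the largest existing ID (start at 0 if empty)
    next_id = max(merged, default=-1) + 1
    for struc in new_structures:
        merged[next_id] = struc
        next_id += 1
    return merged


# Strings stand in for structure objects in this illustration
existing = {i: f"old_{i}" for i in range(12)}
merged = append_structures(existing, ["new_a", "new_b", "new_c", "new_d"])
print(len(merged))
```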

POSCAR or cif -> init_struc_data.pkl

Single-structure files such as POSCAR and cif files can also be converted. The -s/--single option is required.

python3 pos2pkl.py -s POSCAR test.cif
Composition: Na8 Cl8

Converted. The number of structures: 2
Save init_struc_data.pkl

init_POSCARS, POSCAR -> init_struc_data.pkl

python3 pos2pkl.py init_POSCARS -s POSCAR
Composition: Na8 Cl8

Converted. The number of structures: 5
Save init_struc_data.pkl

Warning

The following is wrong. Because -s accepts multiple files, init_POSCARS is also treated as a single-structure file here.

python3 pos2pkl.py -s POSCAR init_POSCARS

Filter (remove and sort)

Here we consider a cif file with the composition Sr8 Co8 O20 X4, which includes four dummy atoms (X4). The -f/--filter option removes the species that are not listed and sorts the remaining ones. Specify the species as in atype in cryspy.in.

python3 pos2pkl.py -s Sr8Co8O20X4.cif -f Sr Co O
Removed species: {'X0+'}
Composition: Sr8 Co8 O20

Converted. The number of structures: 1
Save init_struc_data.pkl

With extract_struc.py you can see how it was registered in init_struc_data.pkl.

python3 extract_struc.py init_struc_data.pkl -pa
ID 0
Full Formula (Sr8 Co8 O20)
Reduced Formula: Sr2Co2O5
...

The -f option also lets you reorder species. In the following example Sr is removed as well, because it is not listed.

python3 pos2pkl.py -s Sr8Co8O20X4.cif -f O Co
Removed species: {'Sr', 'X0+'}
Composition: O20 Co8

Converted. The number of structures: 1
Save init_struc_data.pkl
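
The remove-and-sort behaviour can be illustrated without pymatgen. In the sketch below, filter_species is a hypothetical helper and each site is simplified to a (species, coordinates) tuple; it is not the script's implementation, only the same idea on plain data.

```python
def filter_species(sites, keep_order):
    """Drop sites whose species is not in keep_order, then sort by keep_order."""
    removed = {sp for sp, _ in sites if sp not in keep_order}
    kept = [(sp, xyz) for sp, xyz in sites if sp in keep_order]
    # list.sort is stable, so sites of the same species keep their relative order
    kept.sort(key=lambda site: keep_order.index(site[0]))
    return kept, removed


# Toy sites with made-up coordinates; "X" plays the role of the dummy species
sites = [("Sr", (0.0, 0.0, 0.0)), ("Co", (0.5, 0.5, 0.5)),
         ("O", (0.5, 0.0, 0.0)), ("X", (0.25, 0.25, 0.25)),
         ("O", (0.0, 0.5, 0.0))]
kept, removed = filter_species(sites, ["O", "Co"])
print([sp for sp, _ in kept], removed)
```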

kpt_check.py

kpt_check.py checks the k-point mesh obtained for a given kppvol. It supports POSCAR, CONTCAR, and init_struc_data.pkl, and requires the pymatgen library.

After generating the initial structures, you can use it to judge what value of kppvol is appropriate.

Usage

python3 kpt_check.py -h

Or, if the script is in your PATH, you can omit python3:

kpt_check.py -h
usage: kpt_check.py [-h] [-w] [-n NSTRUC] infile kppvol

positional arguments:
  infile                input file: POSCAR, CONTCAR, or init_struc_data.pkl
  kppvol                kppvol

options:
  -h, --help            show this help message and exit
  -w, --write           write KPOINTS
  -n NSTRUC, --nstruc NSTRUC
                        number of structure to check

Example

POSCAR with a given kppvol

kpt_check.py POSCAR 100
a = 10.689217
b = 10.689217
c = 10.730846
    Lattice vector
10.689217 0.000000 0.000000
0.000000 10.689217 0.000000
0.000000 0.000000 10.730846

kppvol:  100
k-points:  [2, 2, 2]

Write KPOINTS file

You can generate a KPOINTS file using -w option.

kpt_check.py -w POSCAR 100
$ cat KPOINTS
pymatgen 4.7.6+ generated KPOINTS with grid density = 607 / atom
0
Monkhorst
2 2 2

Check k-point meshes for init_struc_data.pkl

When checking k-point meshes for init_struc_data.pkl, the first five structures are checked by default. You can change the number of structures with the -n option.

kpt_check.py -n 3 init_struc_data.pkl 100
# ---------- 0th structure
a = 8.0343076893
b = 8.03430768936
c = 9.1723323373
    Lattice vector
8.034308 0.000000 0.000000
-4.017154 6.957915 0.000000
0.000000 0.000000 9.172332

kppvol:  100
k-points:  [3, 3, 3]


# ---------- 1th structure
a = 9.8451944096
b = 9.84519440959
c = 6.8764313585
    Lattice vector
9.845194 0.000000 0.000000
-4.922597 8.526188 0.000000
0.000000 0.000000 6.876431

kppvol:  100
k-points:  [3, 3, 4]


# ---------- 2th structure
a = 7.5760383679
b = 7.57603836797
c = 6.6507478296
    Lattice vector
7.576038 0.000000 0.000000
-3.788019 6.561042 0.000000
0.000000 0.000000 6.650748

kppvol:  100
k-points:  [4, 4, 4]
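
To see where such meshes come from, the sketch below estimates the divisions from the cell lengths, angles, and kppvol. It assumes the mesh is derived in the way pymatgen's Kpoints.automatic_density_by_vol does (a target grid size proportional to the reciprocal-cell volume, with divisions inversely proportional to the cell lengths). That is an assumption about the script, and pymatgen may apply further refinements that are omitted here. kmesh_from_kppvol is a hypothetical name.

```python
import math


def kmesh_from_kppvol(a, b, c, alpha, beta, gamma, kppvol):
    """Estimate k-point divisions for a cell (lengths in angstrom, angles in degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    # Cell volume from lengths and angles
    volume = a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)
    # Reciprocal-cell volume, including the (2*pi)^3 factor
    rec_volume = (2 * math.pi) ** 3 / volume
    # Target number of k-points in the whole grid
    ngrid = kppvol * rec_volume
    mult = (ngrid * a * b * c) ** (1 / 3)
    # Divisions are inversely proportional to each length, and at least 1
    return [int(math.floor(max(mult / length, 1))) for length in (a, b, c)]


# Cells taken from the POSCAR example and the 0th structure above
print(kmesh_from_kppvol(10.689217, 10.689217, 10.730846, 90, 90, 90, 100))
print(kmesh_from_kppvol(8.034308, 8.034308, 9.172332, 90, 90, 120, 100))
```

For the cells listed in this section, the sketch gives the same meshes as the printed output, which suggests that a larger kppvol or a shorter cell axis increases the corresponding division.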

repeat_cryspy

Running cryspy by hand over and over is tedious. The following script automates it: it runs cryspy, checks the last line of log_cryspy, exits when the search has finished, and otherwise sleeps for 5 minutes before repeating.

#!/bin/bash

set -e

while :
do
    cryspy -n
    LOG_LASTLINE=`tail -n 1 log_cryspy`
    if  [ "$LOG_LASTLINE" = "Done all structures!" ]
    then
        exit 0
    # ---------- for EA
    elif [ "${LOG_LASTLINE:0:17}" = "Reached maxgen_ea" ]
    then
        exit 0
    elif [ "$LOG_LASTLINE" = "EA is ready" ]
    then
        cryspy -n    # EA
        LOG_LASTLINE=`tail -n 1 log_cryspy`
        if [ "${LOG_LASTLINE:0:17}" = "Reached maxgen_ea" ]
        then
            exit 0
        fi
        cryspy -n    # submit jobs
    # ---------- for BO
    elif [ "${LOG_LASTLINE:0:21}" = "Reached max_select_bo" ]
    then
        exit 0
    elif [ "$LOG_LASTLINE" = "BO is ready" ]
    then
        cryspy -n    # selection
        LOG_LASTLINE=`tail -n 1 log_cryspy`
        if [ "${LOG_LASTLINE:0:21}" = "Reached max_select_bo" ]
        then
            exit 0
        fi
        cryspy -n    # submit jobs
    # ---------- for LAQA
    elif [ "$LOG_LASTLINE" = "LAQA is ready" ]
    then
        cryspy -n    # selection
        cryspy -n    # submit jobs
    fi
    sleep 300    # seconds
done
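
The decision logic of the loop can also be written as a small pure function, which is easier to test than the shell script. The Python sketch below is a simplified restatement, not a replacement: next_action and its return values are hypothetical names, and the extra check that the shell script performs after the first additional EA or BO run is not reproduced.

```python
def next_action(last_line):
    """Map the last line of log_cryspy to what the loop should do next."""
    if last_line == "Done all structures!":
        return "stop"
    # Prefix match, like ${LOG_LASTLINE:0:17} in the shell script
    if last_line.startswith(("Reached maxgen_ea", "Reached max_select_bo")):
        return "stop"
    # A selection or generation step is pending: run cryspy again before sleeping
    if last_line in ("EA is ready", "BO is ready", "LAQA is ready"):
        return "run_again"
    # Jobs are presumably still running: wait for the next cycle
    return "wait"
```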

print_pkl.py

2024 May 31

When you want to quickly check the pickled files under data/pkl_data/, using print_pkl.py is convenient.

Usage

python3 print_pkl.py xxxx.pkl

Or, if the script is in your PATH, you can omit python3:

print_pkl.py xxxx.pkl

Example

print_pkl.py init_struc_data.pkl
Number of structures: 10
dict_keys([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

print_pkl.py input_data.pkl
[basic]
algo = RS
calc_code = ASE
tot_struc = 10
nstage = 1
njob = 5
jobcmd = zsh
jobfile = job_cryspy

[structure]
struc_mode = crystal
natot = 8
atype = ('Cu', 'Au')
nat = (4, 4)
mindist_factor = 1.0
vol_factor = 1.1
symprec = 0.01
spgnum = all
use_find_wy = False

[option]
stop_chkpt = 0
load_struc_flag = False
stop_next_struc = False
append_struc_ea = False
energy_step_flag = False
struc_step_flag = False
force_step_flag = False
stress_step_flag = False

[ASE]
kpt_flag = False
force_gamma = False
ase_python = ase_in.py

print_pkl.py elite_struc.pkl
Number of structures: 2
dict_keys([3, 6])

print_pkl.py elite_fitness.pkl
{3: -325.79973412221455, 6: -324.8381948581405}