Variable-composition evolutionary algorithm (EA-vc)

2025 July 7, updated

Overview

Since CrySPY 1.4.0, a variable-composition EA (EA-vc) has been available as an extension of the fixed-composition EA. See the Interface page for the supported interfaces. The overall flow is similar to that of the fixed-composition EA, but EA-vc evaluates fitness and generates offspring differently in order to handle varying compositions. This page describes only the parts that differ from the original EA.

Since version 1.4.1, structures can also be generated under the charge neutrality condition.

Procedure

  1. Initialize population
  2. Evaluate fitness
  3. Natural selection
  4. Select parents
  5. Create next generation
  6. Repeat from step 2: Evaluate fitness

Initialize population

In the first generation, a set of random structures is generated according to the number specified by n_pop. tot_struc is not used in EA or EA-vc. In EA-vc, the number of atoms for each atom type is randomly determined within a user-defined range. The minimum (ll_nat) and maximum (ul_nat) number of atoms per type can be specified in cryspy.in as shown below.

[structure]
atype = Cu Au
ll_nat = 0 0
ul_nat = 8 8
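
As a rough illustration of how a composition could be drawn within these limits, here is a minimal Python sketch. The function name random_nat is hypothetical, and both the uniform sampling and the rejection of an empty composition are assumptions made for the example, not a description of CrySPY's internal code.

```python
import random


def random_nat(ll_nat, ul_nat, rng=random):
    """Draw one atom count per atom type within [ll_nat, ul_nat] (inclusive)."""
    if len(ll_nat) != len(ul_nat):
        raise ValueError("ll_nat and ul_nat must have the same length")
    if sum(ul_nat) == 0:
        raise ValueError("ul_nat must allow at least one atom")
    while True:
        # Assumption: each count is sampled uniformly and independently.
        nat = [rng.randint(lo, hi) for lo, hi in zip(ll_nat, ul_nat)]
        # Assumption: a cell with zero atoms is not a valid structure, so redraw.
        if sum(nat) > 0:
            return nat


atype = ["Cu", "Au"]
nat = random_nat([0, 0], [8, 8])
print(dict(zip(atype, nat)))
```

With ll_nat = 0 0, a draw may contain only one of the two elements, which is how pure-element structures can enter the search.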

Evaluate fitness

The convex hull computed from formation energies is used to evaluate the phase stability of different compositions, since directly comparing total energies of structures with different numbers of atoms is not meaningful. Information on formation energy, the convex hull, and phase diagrams can be found online. For example, see Materials Project Documentation. In EA-vc, the fitness is defined as the energy above hull (also referred to as hull distance).

fig_EA-vc_phase_diagram_binary.svg

Formation energy

Formation energy is calculated based on the reference energies (in eV/atom) of stable pure elements, which are specified as end_point in cryspy.in. For example, in the case of the Cu–Au binary system, the end_point should contain the per-atom energies (in eV/atom) of fcc-Cu and fcc-Au, in that order. Note that even if a structure with the same composition as end_point is found during the structure search and has a total energy lower than the corresponding end_point value, the formation energy is still currently calculated based on the original end_point values defined in cryspy.in.
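
The calculation described above can be sketched as follows. This is a minimal example, assuming the usual definition in which the composition-weighted end_point energies are subtracted from the total energy and the result is divided by the number of atoms. The function name and all numerical values are made up for illustration and are not real reference energies.

```python
def formation_energy_per_atom(e_total, nat, end_point):
    """Formation energy in eV/atom.

    e_total   : total energy of the cell in eV
    nat       : number of atoms of each type, same order as atype
    end_point : reference energies in eV/atom, same order as atype
    """
    n_atoms = sum(nat)
    if n_atoms == 0:
        raise ValueError("the structure must contain at least one atom")
    # Energy of the same atoms taken from the pure reference phases.
    e_ref = sum(n * e for n, e in zip(nat, end_point))
    return (e_total - e_ref) / n_atoms


# Hypothetical numbers for a Cu2Au2 cell (not real DFT values).
end_point = [-3.7, -3.2]   # fcc-Cu, fcc-Au in eV/atom
print(formation_energy_per_atom(-14.0, [2, 2], end_point))
```

A negative value means the structure is lower in energy than the corresponding mixture of the pure reference phases.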

Convex hull

The energy difference between a given structure’s formation energy and the convex hull is called the energy above hull, also known as the hull distance. This value indicates how much higher the formation energy of a structure is compared to the most stable combination of phases at the same composition. Structures with a hull distance of zero are on the convex hull and are thus thermodynamically stable.

Unlike in the fixed-composition EA, EA-vc filters structures based on their per-atom energy when computing the convex hull, using the condition:

$$ \mathrm{emin\_ea} \le E \le \mathrm{emax\_ea} $$

Note that this filtering is based only on the total energy per atom, not on the formation energy.
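
A minimal sketch of such an energy window is shown below. The function name is hypothetical, and treating an unset bound (None) as "no limit" is an assumption made for the example.

```python
def within_energy_window(e_per_atom, emin_ea=None, emax_ea=None):
    """Return True if the total energy per atom passes the emin_ea/emax_ea filter."""
    # Assumption: a bound left as None is simply not applied.
    if emin_ea is not None and e_per_atom < emin_ea:
        return False
    if emax_ea is not None and e_per_atom > emax_ea:
        return False
    return True


# Hypothetical total energies per atom in eV/atom.
energies = [-3.9, -3.5, -1.0, -12.0]
kept = [e for e in energies if within_energy_window(e, emin_ea=-10.0, emax_ea=-2.0)]
print(kept)
```

Only the structures that pass this check would then be handed to the convex hull construction.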

To compute the convex hull, CrySPY uses the PhaseDiagram class provided by the pymatgen library. Unlike in the case of formation energy, if a structure with the same composition as a pure element has a total energy lower than the corresponding end_point value, that structure is used as the reference for computing the convex hull and hull distance.
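
To make the hull distance concrete, the following dependency-free sketch computes it for a binary system. It does not use pymatgen's PhaseDiagram and is not CrySPY's code. It assumes each structure is given as a pair (x, e_form), where x is the atomic fraction of the second element and e_form is the formation energy in eV/atom. All function names and numbers are hypothetical.

```python
def cross(o, a, b):
    """z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def lower_hull(points):
    """Lower convex hull of (x, e) points (Andrew's monotone chain)."""
    hull = []
    for p in sorted(points):
        # Drop the last vertex while it lies on or above the line hull[-2] -> p.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull


def hull_energy(hull, x):
    """Linearly interpolate the hull energy at composition x."""
    for (x1, e1), (x2, e2) in zip(hull, hull[1:]):
        if x1 <= x <= x2:
            return e1 + (x - x1) / (x2 - x1) * (e2 - e1)
    raise ValueError("x must lie between 0 and 1")


def hull_distances(entries):
    """Energy above hull (eV/atom) for each (x, e_form) entry."""
    # The end points always take part with a formation energy of 0. A pure-element
    # structure with a lower energy replaces them as the hull reference.
    best = {0.0: 0.0, 1.0: 0.0}
    for x, e in entries:
        best[x] = min(e, best.get(x, e))
    hull = lower_hull(best.items())
    return [e - hull_energy(hull, x) for x, e in entries]


# Hypothetical formation energies for a binary system.
entries = [(0.25, -0.02), (0.5, -0.06), (0.75, -0.01), (0.5, -0.02)]
for (x, e), d in zip(entries, hull_distances(entries)):
    print(f"x = {x:.2f}  E_form = {e:+.3f}  hull distance = {d:.3f}")
```

In this made-up data set only the lower structure at x = 0.5 lies on the hull. For ternary and higher systems the same idea applies in more dimensions, which is where a general tool such as PhaseDiagram becomes necessary.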

Natural selection

As shown in the figure below, EA-vc can produce multiple stable structures (i.e., structures with a hull distance of 0). In such cases, multiple individuals share the top rank in terms of hull distance. If the number of elite structures specified by n_elite is smaller than the number of equally ranked individuals, the selection becomes non-deterministic. Currently, CrySPY randomly selects n_elite individuals from those with a hull distance less than 0.001 eV/atom. If the number of individuals with a hull distance less than 0.001 eV/atom is smaller than n_elite, elite structures are selected in the usual way, based on fitness ranking. Duplicate structures are also removed during elite selection, using the StructureMatcher class provided by the pymatgen library.

fig_EA-vc_elite.svg
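
The selection rule described above can be sketched as follows. This is a simplified illustration: the function name and the data layout (a dict from structure ID to hull distance) are assumptions, and the duplicate removal with StructureMatcher is left out.

```python
import random

HULL_TOL = 0.001  # eV/atom, the threshold quoted in the text


def select_elites(hull_dist, n_elite, rng=random):
    """Pick elite structure IDs from a dict {structure ID: hull distance}."""
    stable = [sid for sid, d in hull_dist.items() if d < HULL_TOL]
    if len(stable) > n_elite:
        # More (near-)stable structures than elite slots: choose at random.
        return rng.sample(stable, n_elite)
    # Otherwise fall back to the ordinary ranking by hull distance.
    ranked = sorted(hull_dist, key=hull_dist.get)
    return ranked[:n_elite]


# Hypothetical hull distances in eV/atom.
hull_dist = {0: 0.0, 1: 0.0005, 2: 0.0, 3: 0.02, 4: 0.1}
print(select_elites(hull_dist, n_elite=2))  # two of the IDs 0, 1, 2, chosen at random
print(select_elites(hull_dist, n_elite=4))  # ranking-based choice
```

Passing a seeded random.Random instance as rng makes the random branch reproducible.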

Elite individuals are selected based on the best structures from previous generations. However, because hull distance can vary from one generation to the next, the values for elite individuals are recalculated using the current convex hull before natural selection is applied.

As described in the Convex hull section, emin_ea and emax_ea are not used for natural selection in EA-vc.

Select parents

The method for selecting parents is the same as in the fixed-composition EA.

Create next generation

Evolutionary operations

The crossover (vc) operation is slightly different from that in the fixed-composition EA, while permutation and strain are the same. EA-vc introduces several new operations to enable compositional variation.

Population size

The total number of structures generated by crossover, permutation, strain, addition, elimination, substitution, and random generation must equal n_pop.

  • n_pop = n_crsov + n_perm + n_strain + n_add + n_elim + n_subs + n_rand
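
A quick way to check this constraint before starting a run is sketched below. The helper is hypothetical and only mirrors the equation above. It is not how CrySPY validates its input.

```python
def check_pop_size(n_pop, n_crsov, n_perm, n_strain, n_add, n_elim, n_subs, n_rand):
    """Raise ValueError if the per-operation counts do not add up to n_pop."""
    total = n_crsov + n_perm + n_strain + n_add + n_elim + n_subs + n_rand
    if total != n_pop:
        raise ValueError(f"sum of operations is {total}, but n_pop is {n_pop}")


# Hypothetical settings: 6 + 2 + 2 + 3 + 3 + 2 + 2 = 20
check_pop_size(20, n_crsov=6, n_perm=2, n_strain=2, n_add=3, n_elim=3, n_subs=2, n_rand=2)
```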

Subsections of Variable-composition evolutionary algorithm (EA-vc)

Crossover (vc)

The variable-composition crossover is almost the same as the fixed-composition version, but it differs in that the adjustment of the number of atoms is minimized.

In step 6 of the fixed-composition crossover, the difference in the number of atoms is calculated directly for each atom type. In crossover (vc), by contrast, the difference is calculated relative to the allowed range defined by ll_nat and ul_nat. For example:

ll_nat = [4, 4, 4]
ul_nat = [8, 8, 8]
offspring_nat = [2, 6, 12]
nat_diff = [-2, 0, 4]

If this difference in the number of atoms (nat_diff in the example above) exceeds the allowed tolerance (nat_diff_tole), the operation is retried. Otherwise, the number of atoms is adjusted to fall within the range defined by ll_nat and ul_nat.
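
The example above can be reproduced with a few lines of Python. The function names are hypothetical, and the way the tolerance is applied (per atom type, on the absolute value) is an assumption made for this sketch.

```python
def calc_nat_diff(offspring_nat, ll_nat, ul_nat):
    """Signed distance of each atom count from the allowed range [ll_nat, ul_nat]."""
    nat_diff = []
    for n, lo, hi in zip(offspring_nat, ll_nat, ul_nat):
        if n < lo:
            nat_diff.append(n - lo)   # too few atoms: negative value
        elif n > hi:
            nat_diff.append(n - hi)   # too many atoms: positive value
        else:
            nat_diff.append(0)        # already inside the range
    return nat_diff


def needs_retry(nat_diff, nat_diff_tole):
    """Assumption: retry if any atom type is off by more than nat_diff_tole."""
    return any(abs(d) > nat_diff_tole for d in nat_diff)


nat_diff = calc_nat_diff([2, 6, 12], ll_nat=[4, 4, 4], ul_nat=[8, 8, 8])
print(nat_diff)                                 # [-2, 0, 4]
print(needs_retry(nat_diff, nat_diff_tole=4))   # False
```

Because counts inside the range contribute zero, an offspring is only adjusted when it actually violates ll_nat or ul_nat, which keeps the adjustment minimal.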

Addition

2025 July 7, updated

An atom type whose current count is below the upper limit specified by ul_nat is randomly selected, and one atom of that type is added at a random position.

Since version 1.4.1, multiple atoms can be added in a single operation. The number of atoms to add is randomly selected from the natural numbers up to add_max. The default value is add_max = 3.

  • Add one atom and check whether it satisfies the minimum interatomic distance specified by mindist.
  • If the distance condition is not satisfied, the atom is placed again at a different random position. This process is repeated up to maxcnt_ea times.
  • (since version 1.4.1) Repeat until the randomly determined number of atoms (up to add_max) have been added.
  • If no valid offspring is obtained, the volume is expanded by 10%, and the same procedure is retried up to maxcnt_ea times.
  • If that also fails, the volume is expanded up to 20% and the structure generation is attempted again. If it still fails, the parent is replaced.
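
The steps above can be sketched roughly as follows. This is a heavily simplified illustration and not CrySPY's implementation. It assumes a cubic cell with edge length a, a single scalar mindist for all pairs, and volume expansions of 10% and 20% measured from the original volume. The choice of atom type against ul_nat and the replacement of the parent are left out, and all function names are hypothetical.

```python
import math
import random


def min_image_dist(f1, f2, a):
    """Distance between two fractional positions in a periodic cubic cell of edge a."""
    d2 = 0.0
    for u, v in zip(f1, f2):
        d = (u - v) % 1.0
        d = min(d, 1.0 - d)          # nearest periodic image along this axis
        d2 += (d * a) ** 2
    return math.sqrt(d2)


def try_add_atoms(frac_coords, a, n_add, mindist, maxcnt_ea, rng):
    """Return the coordinates with n_add extra atoms, or None on failure."""
    coords = list(frac_coords)
    for _ in range(n_add):
        for _ in range(maxcnt_ea):
            trial = (rng.random(), rng.random(), rng.random())
            if all(min_image_dist(trial, c, a) >= mindist for c in coords):
                coords.append(trial)
                break
        else:
            return None              # no valid position found in maxcnt_ea tries
    return coords


def addition(frac_coords, a, mindist, add_max=3, maxcnt_ea=50, rng=random):
    """Add 1..add_max atoms, expanding the cell volume if placement fails."""
    n_add = rng.randint(1, add_max)
    for vol_scale in (1.0, 1.1, 1.2):        # original volume, +10%, +20%
        a_scaled = a * vol_scale ** (1.0 / 3.0)
        child = try_add_atoms(frac_coords, a_scaled, n_add, mindist, maxcnt_ea, rng)
        if child is not None:
            return child, a_scaled
    return None                              # caller would switch to another parent


parent = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.5)]
print(addition(parent, a=4.0, mindist=1.0, rng=random.Random(0)))
```

Keeping the atoms in fractional coordinates means the volume expansion only changes the cell edge, so the existing atoms move apart without any extra bookkeeping.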

fig_EA-vc_addition.svg

Elimination

2025 July 7, updated

An atom type whose current count is above the lower limit specified by ll_nat is randomly selected, and one atom of that type is removed.

Since version 1.4.1, multiple atoms can be removed in a single operation. The number of atoms to remove is randomly selected from the natural numbers up to elim_max. The default value is elim_max = 3.
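
A minimal sketch of this operation is given below. It works on a list of element symbols and returns the indices of the atoms that are kept. The function name, this data layout, and the rule of stopping early once no atom type can lose another atom are assumptions made for the example.

```python
import random


def elimination(species, atype, ll_nat, elim_max=3, rng=random):
    """Return the indices of the atoms kept after removing up to elim_max atoms."""
    keep = list(range(len(species)))
    n_elim = rng.randint(1, elim_max)
    for _ in range(n_elim):
        counts = {el: 0 for el in atype}
        for i in keep:
            counts[species[i]] += 1
        # Only atom types still above their lower limit may lose an atom.
        removable = [el for el, lo in zip(atype, ll_nat) if counts[el] > lo]
        if not removable or len(keep) <= 1:
            break
        el = rng.choice(removable)
        victim = rng.choice([i for i in keep if species[i] == el])
        keep.remove(victim)
    return keep


species = ["Cu"] * 4 + ["Au"] * 2
keep = elimination(species, ["Cu", "Au"], ll_nat=[2, 2], rng=random.Random(1))
print([species[i] for i in keep])
```

Returning indices instead of a new species list lets the caller drop the matching coordinates with the same index list.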

fig_EA-vc_elimination.svg

Substitution

2025 July 7, updated

Substitution is an operation in which two different atom types are randomly selected and an atom of one type is replaced with an atom of the other type.

Since version 1.4.1, multiple atoms can be substituted in a single operation. The number of atoms to substitute is randomly selected from the natural numbers up to subs_max. The default value is subs_max = 3.

  • The number of atoms after the substitution is restricted so that it does not fall below the minimum (ll_nat) and does not exceed the maximum (ul_nat) number of atoms.
  • Finally, the minimum interatomic distance specified by mindist is checked, and if there are no issues, the structure is accepted as an offspring.
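
The composition bookkeeping for this operation can be sketched as follows. The sketch assumes that a substitution changes the element at one site from a source type to a target type, that each site is substituted at most once per operation, and that the mindist check is done afterwards by the caller. The function name and data layout are hypothetical.

```python
import random


def substitution(species, atype, ll_nat, ul_nat, subs_max=3, rng=random):
    """Return a new species list with up to subs_max atoms changed to another type."""
    child = list(species)
    lo = dict(zip(atype, ll_nat))
    hi = dict(zip(atype, ul_nat))
    changed = set()                      # sites already substituted in this call
    n_subs = rng.randint(1, subs_max)
    for _ in range(n_subs):
        moves = []
        for src in atype:
            sites = [i for i, s in enumerate(child) if s == src and i not in changed]
            # The source type must stay at or above ll_nat after losing one atom.
            if not sites or child.count(src) <= lo[src]:
                continue
            for dst in atype:
                # The target type must stay at or below ul_nat after gaining one atom.
                if dst != src and child.count(dst) < hi[dst]:
                    moves.append((sites, dst))
        if not moves:
            break
        sites, dst = rng.choice(moves)
        idx = rng.choice(sites)
        child[idx] = dst
        changed.add(idx)
    return child


parent = ["Cu"] * 4 + ["Au"] * 4
print(substitution(parent, ["Cu", "Au"], [2, 2], [6, 6], rng=random.Random(2)))
```

The total number of atoms is unchanged by this operation. Only the composition shifts, and always within the limits set by ll_nat and ul_nat.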

fig_EA-vc_substitution.svg