Getting Started

This guide is for developers who want to understand the metalquicha codebase, even without prior quantum chemistry or Fortran experience.

(this file was partially generated with an LLM but carefully checked by me, Jorge)

What is Metalquicha?

Metalquicha (Huastec word for “sunflower”) is a quantum chemistry code that specializes in fragmented calculations. Think of it as a calculator for molecular energies, but smart enough to break large molecules into smaller pieces to make calculations faster.

What Does It Do?

  • Input: Molecular structure (atoms and their positions) + calculation settings

  • Process: Breaks molecules into fragments, calculates energies for each piece

  • Output: Total energy (or other properties) of the system

Why Fragment?

Calculating the energy of a 1000-atom molecule is computationally expensive. But if you break it into 10-atom fragments, calculate each separately, and combine the results cleverly (using Many-Body Expansion), you can get accurate results much faster.

Quick Start

Build the Code

# Clone and build
git clone git@github.com:JorgeG94/metalquicha.git
cd metalquicha
mkdir build && cd build
cmake ..
make -j

# The executable will be at: build/mqc

Run Your First Calculation

First, you need to use the python script mqc_prep.py to generate the mqc format, from the json files. The reason behind not using the json directly is because the json libraries in Fortran can be delicate with different Fortran compilers, therefore we use a very simple format so that we can use plain old Fortran.

To generate an input file, simply run:

python mqc_prep.py validation/inputs/h3o.json
python mqc_prep.py validation/inputs/prism.json

Then run the calculations:

# Unfragmented calculation (simple hydronium ion)
./build/mqc validation/inputs/h3o.mqc

# Fragmented calculation (water prism)
./build/mqc validation/inputs/prism.mqc

The output shows energies and computation details. To run a parallel calculation, simply use mpirun:

mpirun -np nproc ./build/mqc validation/inputs/prism.mqc

Directory Structure

metalquicha/
├── app/                          # Entry point
│   └── main.f90                  # Program starts here
├── src/                          # Source code
│   ├── core/                     # Core data structures
│   │   ├── mqc_geometry.f90      # Molecular structures
│   │   ├── mqc_elements.f90      # Periodic table data
│   │   └── mqc_result_types.f90  # Energy/gradient results
│   ├── io/                       # Input/output
│   │   ├── mqc_config_parser.f90 # Parse .mqc files
│   │   ├── mqc_xyz_reader.f90    # Read .xyz geometries
│   │   └── mqc_json.f90          # JSON output
│   ├── fragmentation/            # Fragment generation
│   │   ├── mqc_combinatorics.f90      # Math for combinations
│   │   ├── mqc_fragment_lookup.f90    # Fast fragment indexing
│   │   ├── mqc_gmbe_utils.f90         # GMBE overlapping fragments
│   │   ├── mqc_physical_fragment.f90  # Fragment geometry builder
│   │   ├── mqc_mbe.f90                # Many-Body Expansion logic
│   │   └── mqc_mbe_io.f90             # Fragment I/O
│   ├── methods/                  # Quantum chemistry engines
│   │   ├── mqc_method_xtb.f90    # XTB method (GFN1-xTB)
│   │   └── mqc_method_base.f90   # Abstract method interface
│   ├── parallel/                 # MPI parallelization
│   │   └── mqc_mpi_tags.f90      # MPI message tags
│   ├── cli/                      # Command line parsing
│   └── mqc_driver.f90            # Main calculation orchestrator
├── test/                         # Unit tests
├── validation/                   # Physics validation tests
├── mqc_prep.py                   # Convert JSON → .mqc input
└── run_validation.py             # Run all validation tests

What Lives Where?

  • Core types (src/core/) - Define what a molecule, atom, or energy is

  • I/O (src/io/) - Reading input files, writing results

  • Fragmentation (src/fragmentation/) - Breaking molecules into pieces

  • Methods (src/methods/) - The actual quantum chemistry calculations

  • Driver (src/mqc_driver.f90) - Ties everything together

Key Concepts

Fragments

A fragment is a subset of atoms from the full molecule. For example, a protein with 1000 atoms might be split into:

  • Fragment 1: atoms 1-10

  • Fragment 2: atoms 11-20

  • … and so on

Many-Body Expansion (MBE)

Instead of calculating the whole molecule, we use:

E_total ≈ E(monomer1) + E(monomer2) + ...
          + E(dimer12) - E(monomer1) - E(monomer2)
          + ...

Key idea: Higher-order interactions (trimers, tetramers) contribute less, so we can stop at dimers or trimers and get good accuracy.

Generalized MBE (GMBE)

Regular MBE assumes fragments don’t overlap. GMBE handles overlapping fragments using the Principle of Inclusion-Exclusion (PIE):

E_total = Σ E(fragments) - Σ E(intersections) + Σ E(triple_intersections) - ...

This is implemented in mqc_gmbe_utils.f90 using a depth-first search (DFS) algorithm.

Hydrogen Capping

When you break a chemical bond (e.g., C-C), the fragment has a “dangling bond.” We add a hydrogen atom at the break point to satisfy valency. This is called hydrogen capping.

See mqc_physical_fragment.f90:add_hydrogen_caps()

System Geometry

The system_geometry_t type stores:

  • All atom coordinates

  • Fragment definitions (which atoms belong to which fragment)

  • Fragment sizes

  • Bonds (and which are broken)

See src/fragmentation/mqc_physical_fragment.f90

Code Flow: How a Calculation Works

Here’s what happens when you run ./mqc input.mqc:

Step 1: Entry Point (app/main.f90)

program main
  ! 1. Initialize MPI for parallel computation
  call pic_mpi_init()

  ! 2. Parse command line argument (input file)
  call get_command_argument(1, input_file)

  ! 3. Read .mqc input file
  call read_mqc_file(input_file, mqc_config, error)

  ! 4. Convert config to internal representation
  call config_to_system_geometry(mqc_config, sys_geom, error)

  ! 5. Run the calculation
  call run_calculation(world_comm, node_comm, config, sys_geom, bonds)
end program

Step 2: Driver (src/mqc_driver.f90)

The driver decides: fragmented or unfragmented?

subroutine run_calculation(...)
  if (sys_geom%fragmented) then
    ! Run MBE or GMBE
    call run_fragmented_calculation(...)
  else
    ! Run single calculation on whole molecule
    call run_unfragmented_calculation(...)
  end if
end subroutine

Step 3: Fragment Generation

For fragmented calculations:

  1. Generate combinations (mqc_combinatorics.f90)

    • Use binomial coefficients to determine number of fragments

    • Generate all k-mers (monomers, dimers, trimers, etc.)

  2. Build fragment geometries (mqc_physical_fragment.f90)

    • Extract atom coordinates for each fragment

    • Add hydrogen caps if bonds are broken

    • Store in system_geometry_t

  3. For GMBE: Find intersections (mqc_gmbe_utils.f90)

    • Use DFS to enumerate all overlapping cliques

    • Compute PIE coefficients (+1 for odd cliques, -1 for even)

    • Deduplicate intersection atom sets

Step 4: Energy Evaluation

For each fragment:

  1. Select method (mqc_method_xtb.f90 currently)

    • Initialize XTB calculator

    • Convert geometry to XTB format

  2. Run calculation

    call method%calculate_energy(fragment_geometry, result)
    
  3. Store result in calculation_result_t type

Step 5: Combine Results

MBE (mqc_mbe.f90:compute_mbe_energy):

total_energy = sum(monomers)
             + sum(dimers) - 2*sum(monomers)
             + sum(trimers) - 3*sum(dimers) + 3*sum(monomers)
             ...

GMBE uses PIE coefficients:

total_energy = sum(coefficient(i) * energy(i))

Step 6: Output

Results are written to:

  • Console: Human-readable summary

  • JSON file (output_*.json): Structured data for validation

Understanding the Modules

Core Modules (Start Here!)

mqc_geometry.f90

Defines molecular_geometry_t - stores atoms, positions, charges.

Key functions:

  • geometry_from_arrays() - Create geometry from atom data

  • get_atom_symbol() - Get element symbol for an atom

mqc_physical_fragment.f90

Defines system_geometry_t - the master structure for everything.

Key data:

  • total_atoms - Total number of atoms

  • xyz(:,:) - Atom coordinates (3, total_atoms)

  • atomic_numbers(:) - Element for each atom

  • fragment_atoms(:,:) - Which atoms are in which fragment

  • fragment_sizes(:) - Size of each fragment

  • fragmented - Boolean: is this a fragmented calculation?

Key functions:

  • build_fragment_from_indices() - Extract fragment geometry

  • add_hydrogen_caps() - Add H atoms at broken bonds

Fragmentation Modules (The Core Algorithm!)

mqc_combinatorics.f90

Pure math - no chemistry here!

Key functions:

  • binomial(n, r) - C(n,r) = n choose r

  • get_nfrags(n_monomers, max_level) - How many fragments total?

  • generate_fragment_list() - Generate all k-mer combinations

  • next_combination() - Iterator for combinations

Example:

! For 4 monomers, level 2 (dimers):
! Generates: [1,2], [1,3], [1,4], [2,3], [2,4], [3,4]

mqc_fragment_lookup.f90

Hash table for O(1) fragment lookup.

Why needed? With 100 monomers and level 3, you have C(100,3) = 161,700 trimers. Finding “which index is trimer [5,23,87]?” needs to be fast!

Key type: fragment_lookup_t

  • Uses FNV-1a hash function

  • Chaining for collision resolution

  • Prime-sized table for good distribution

mqc_gmbe_utils.f90

The most complex module - implements overlapping fragment logic.

Key functions:

  • find_fragment_intersection() - Find shared atoms between 2 fragments

  • generate_intersections() - Generate all k-way intersections

  • gmbe_enumerate_pie_terms() - DFS to find all cliques and compute PIE coefficients

Algorithm (simplified):

For each primary fragment i:
  Start DFS with clique = {i}
  For each candidate j > i that overlaps with current intersection:
    Add j to clique
    Compute new intersection
    If intersection non-empty:
      Coefficient = (+1 if |clique| odd, -1 if even)
      Recurse with expanded clique

mqc_mbe.f90

Energy combination logic for standard MBE.

Key function: compute_mbe_energy()

  • Loops through levels (monomers, dimers, trimers…)

  • Applies inclusion-exclusion formula

  • Returns total energy

I/O Modules

mqc_config_parser.f90

Parses .mqc input files (section-based format).

Structure:

type :: mqc_config_t
  character(len=:), allocatable :: method
  integer :: nlevel                    ! Max MBE level
  logical :: allow_overlapping_fragments
  type(geometry_t) :: geometry
  integer, allocatable :: fragment_indices(:,:)
  ! ... many more fields
end type

mqc_xyz_reader.f90

Reads .xyz geometry files (standard format):

3
Water molecule
O   0.0   0.0   0.0
H   0.0   0.0   0.96
H   0.93  0.0  -0.24

Method Modules

mqc_method_base.f90

Abstract interface - defines what a “quantum chemistry method” must provide.

type, abstract :: method_base_t
contains
  procedure(calculate_energy_interface), deferred :: calculate_energy
  procedure(calculate_gradient_interface), deferred :: calculate_gradient
end type

mqc_method_xtb.f90

Concrete implementation using the tblite library (GFN1-xTB method).

Key function: calculate_energy() converts geometry → calls tblite → returns energy

Where to Start Reading

For Understanding the Big Picture

  1. app/main.f90 (100 lines) - See the entry point

  2. src/mqc_driver.f90 (500 lines) - See how calculations are orchestrated

  3. src/fragmentation/mqc_physical_fragment.f90 (400 lines) - Understand fragments

For Understanding Fragmentation

  1. mqc_combinatorics.f90 - Pure math, easiest to understand

  2. mqc_fragment_lookup.f90 - Data structure for indexing

  3. mqc_mbe.f90 - See how energies are combined

  4. mqc_gmbe_utils.f90 - Most complex, save for last

For Understanding I/O

  1. mqc_config_parser.f90 - How we parse .mqc files

  2. mqc_xyz_reader.f90 - How we read geometries

For Understanding Quantum Chemistry Methods

  1. mqc_method_base.f90 - Interface definition

  2. mqc_method_xtb.f90 - Concrete implementation

Running Examples

Example 1: Unfragmented Calculation

# Simple hydronium ion
./build/mqc validation/inputs/h3o.mqc

What happens:

  • Reads 4 atoms (H₃O⁺)

  • No fragmentation

  • Calls XTB on the whole molecule

  • Prints energy: -5.773131 Ha

Example 2: Fragmented Water Prism

./build/mqc validation/inputs/prism.mqc

What happens:

  • Reads 18 atoms (6 water molecules)

  • Fragments into 6 monomers

  • Generates 6 monomers + 15 dimers = 21 fragments

  • Calculates energy for each fragment in parallel

  • Combines using MBE(2) formula

  • Prints total energy: -34.673668 Ha

Example 3: GMBE with Overlapping Fragments

./build/mqc validation/inputs/overlapping_gly3.mqc

What happens:

  • Reads glycine tripeptide with overlapping fragments

  • Uses GMBE(1) with PIE enumeration

  • Finds intersections between overlapping fragments

  • Applies inclusion-exclusion principle

  • Prints total energy

Prepare Your Own Input

# Create a JSON input
cat > my_input.json << EOF
{
  "schema": {"name": "mqc-frag", "version": "1.0"},
  "molecules": [{
    "xyz": "water.xyz",
    "molecular_charge": 0,
    "molecular_multiplicity": 1,
    "fragments": [[0,1,2]]
  }],
  "model": {"method": "XTB-GFN1"},
  "driver": "Energy"
}
EOF

# Convert to .mqc format
python3 mqc_prep.py my_input.json

# Run calculation
./build/mqc my_input.mqc

Adding New Features

Example: Add a New Quantum Chemistry Method

  1. Create new method file: src/methods/mqc_method_mynew.f90

module mqc_method_mynew
  use mqc_method_base, only: method_base_t
  implicit none

  type, extends(method_base_t) :: mynew_method_t
  contains
    procedure :: calculate_energy => mynew_calculate_energy
    procedure :: calculate_gradient => mynew_calculate_gradient
  end type

contains

  subroutine mynew_calculate_energy(this, geom, result)
    class(mynew_method_t), intent(inout) :: this
    type(molecular_geometry_t), intent(in) :: geom
    type(calculation_result_t), intent(out) :: result

    ! Your implementation here
    result%energy = 0.0_dp  ! Replace with actual calculation
  end subroutine

end module
  1. Register in driver: Edit src/mqc_driver.f90

use mqc_method_mynew, only: mynew_method_t

! In select_method():
case ('MYNEW')
  allocate(mynew_method_t :: method)
  1. Add to CMakeLists.txt: src/methods/CMakeLists.txt

target_sources(
  ${main_lib}
  PRIVATE mqc_method_mynew.f90
          ...
)
  1. Test: Create validation test in validation/inputs/test_mynew.mqc

Example: Add New Fragment Selection Algorithm

Want to select fragments based on distance cutoffs instead of combinatorial generation?

  1. Add to mqc_physical_fragment.f90:

subroutine generate_distance_based_fragments(sys_geom, cutoff, fragments)
  type(system_geometry_t), intent(in) :: sys_geom
  real(dp), intent(in) :: cutoff
  integer, allocatable, intent(out) :: fragments(:,:)

  ! Your algorithm:
  ! - Loop through all monomer pairs
  ! - Compute distance between centers of mass
  ! - If distance < cutoff, include dimer
end subroutine
  1. Call from driver in the fragmentation setup section

  2. Add config option in mqc_config_parser.f90 to enable it

Testing

Unit Tests

Located in test/ directory. Each module has a corresponding test.

# Run all tests
cd build
ctest

# Run specific test
./test/test_mqc_frag_utils

Validation Tests

Located in validation/. These test physics correctness.

# Run all validation tests
python3 run_validation.py

# Run specific test
./build/mqc validation/inputs/prism.mqc

What validation tests check:

  • Energy matches expected value (within tolerance)

  • JSON output is well-formed

  • All fragments were evaluated

  • PIE coefficients are correct (for GMBE)

Adding a New Test

  1. Create input file: validation/inputs/my_test.mqc

  2. Add to validation_tests.json:

{
  "name": "My test case",
  "input": "validation/inputs/my_test.mqc",
  "expected_energy": -123.456789,
  "type": "fragmented"
}
  1. Run: python3 run_validation.py

The script will:

  • Run the calculation

  • Parse JSON output

  • Compare energy to expected value

  • Report PASS/FAIL

Common Patterns in the Code

Error Handling

type(error_t) :: error

call some_function(..., error)
if (error%has_error()) then
  call logger%error(error%get_message())
  return
end if

Logging

use pic_logger, only: logger => global_logger

call logger%info("Starting calculation...")
call logger%debug("Fragment "//to_char(i)//" has "//to_char(n_atoms)//" atoms")
call logger%error("Invalid input: "//error_msg)

Memory Management

Fortran 2003+ has automatic deallocation, but we explicitly clean up:

type(system_geometry_t) :: sys_geom

! ... use sys_geom ...

call sys_geom%destroy()  ! Explicitly deallocate

MPI Parallelization

! Each MPI rank gets a subset of fragments
do i = world_comm%rank() + 1, n_fragments, world_comm%size()
  call calculate_fragment_energy(fragments(i), results(i))
end do

! Gather results from all ranks
call world_comm%allgatherv(local_results, all_results)

Glossary

  • MBE: Many-Body Expansion - method to approximate total energy using fragments

  • GMBE: Generalized MBE - handles overlapping fragments

  • PIE: Principle of Inclusion-Exclusion - combinatorial method for counting overlaps

  • Monomer: 1-body fragment (single unit)

  • Dimer: 2-body fragment (pair of units)

  • Trimer: 3-body fragment (triple of units)

  • k-mer: k-body fragment (k units)

  • XTB: Extended Tight Binding - fast semi-empirical quantum chemistry method

  • GFN1: Geometry, Frequency, Non-covalency version 1 (XTB parameter set)

  • Hartree (Ha): Unit of energy in atomic units (1 Ha ≈ 27.2 eV)

Additional Resources

Getting Help

  • Check existing validation tests in validation/inputs/ for examples

  • Read module documentation - each module has detailed comments

  • Run with verbose logging: Set log_level = Verbose in your .mqc file

  • Check JSON output: Contains detailed fragment energies and coefficients

Next Steps

  1. Build and run: Get the code compiling and run validation tests

  2. Read app/main.f90: Understand the entry point

  3. Trace a simple calculation: Follow the code path for h3o.mqc (unfragmented)

  4. Trace a fragmented calculation: Follow the code path for prism.mqc (MBE)

  5. Understand GMBE: Read mqc_gmbe_utils.f90 (most complex part)

  6. Add a feature: Pick something small to modify and experiment

Happy coding! 🌻