
Ramble Meter
This post is very close to being completely finished. Not that rambly at all.

Massive investments are being made in the European quantum computing landscape. The question is which frameworks enable orchestration of calculations so that only the most suitable problem formulation is deployed on the most suitable piece of hardware.

Problem statement

As a researcher and innovator in the quantum life-science area, I want to be able to develop or test an algorithm locally on my laptop and iteratively expand on it in terms of parameters, noise models used, systems analysed, and so on. I want to define a grid of parameters, something like:

hyperparam_grid = [
    {
        'optimizer_type': 'COBYLA',
        'optimizer_steps': 50,
        'tol': 1e-3
    },
    {
        'optimizer_type': 'SPSA',
        'optimizer_steps': 50,
        'learning_rate': 0.1
    },
    ...
]

over which I want to find the optimal combination with respect to evaluation criteria. From this, a test deployment on an actual quantum computer backend would be made. I would then like to collect details on the calculation and build a “profile” of calculations over different European compute backends.
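
As a sketch of what that local iteration could look like, where run_vqe is a hypothetical placeholder for the actual simulation and the ground state energy serves as the evaluation criterion:

# Hypothetical sketch: score each combination locally and keep the best one
best_params, best_energy = None, float("inf")
for params in hyperparam_grid:
    energy = run_vqe(**params)  # placeholder: returns a ground state energy estimate
    if energy < best_energy:
        best_params, best_energy = params, energy
print(f"Best combination: {best_params} (E = {best_energy:.6f})")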

Moreover, ideally I would like to be able to create workflows where I can use whatever code I like for calculating integrals and the other pieces of data required for my calculations, and where parts of these workflows can be executed in parallel over interdependent processes. In pseudo-pseudo code:

Define:

  ansatz_types = ['TwoLocal', 'EfficientSU2']
  optimizers = ['COBYLA', 'SPSA']
  hyperparams_list = [{'optimizer_steps': 10, 'optimizer_params': {'tol': 1e-3}}, ... ]
  noise_models = ['depolarizing', 'bit_flip']
  

For each ansatz_type:

  Create and append ansatz_spec (function: prepare_ansatz, dependencies: ["assemble_hamiltonian"])

  For each optimizer:

    vqe_dependencies = ["prepare_ansatz_<ansatz_type>"]
    Create nodename_prefix = "run_vqe_<ansatz_type>_<optimizer>"

    For each hyperparams:

      Create and append vqe_spec (function: run_vqe_simulation, dependencies: vqe_dependencies, optimizer: optimizer, hyperparams: hyperparams)
      noise_dependencies = [vqe_spec.node_name]
      Create noise_nodename_prefix = "apply_noise_<vqe_spec.node_name>"

      For each noise_model:

        Create and append noise_spec (function: apply_noise_model, dependencies: noise_dependencies, noise_model: noise_model, ansatz_type: ansatz_type)

Which yields a workflow of dependencies for each wavefunction ansatz:

  "assemble_hamiltonian" -> "prepare_ansatz" -> "run_vqe_simulation" -> "apply_noise_model"

Where the Hamiltonian is assembled for the particular system and calculation in mind, preferably in a manner where the one- and two-electron integrals can be calculated using a code of choice.

Solution

ColonyOS is a meta-operating system that simplifies the execution of workloads across diverse and distributed computing environments, including cloud, edge, HPC, and IoT. This makes it well suited to managing the complex, resource-intensive tasks associated with distributed quantum computing. It is open source software (MIT License) and is available on GitHub for inspection and download, and there are some great tutorial notebooks to get started. Below is a brief overview of the core conceptual features of the system.

Distributed Microservice Architectures

ColonyOS is built on a microservices model, where small, independent executors handle specific tasks. This approach supports the distributed nature of quantum computing, where quantum tasks (e.g., operations on quantum processors) may need to be executed across geographically separated quantum and classical computing resources. Executors are deployed independently and scaled horizontally, ensuring efficient parallel processing and fault tolerance.

Workflow Orchestration

The platform allows users to define complex, multi-step workflows across distributed executors. This is crucial for quantum computing workflows, which often involve iterative processes like quantum circuit execution, optimization steps (e.g., in VQE algorithms), and hybrid quantum-classical computations. ColonyOS manages the dependencies and execution order, ensuring smooth operation across different systems.

Scalability and Fault Tolerance

Quantum computing systems require robust fault tolerance due to the probabilistic nature of quantum states and the potential for node failures in distributed setups. ColonyOS supports automatic re-assignment of tasks to healthy executors if one fails, making it highly resilient to issues that might disrupt quantum computation across distributed nodes.

Platform-Agnostic Integration

ColonyOS’s ability to operate across various platforms, including cloud and HPC environments, aligns with the hybrid quantum-classical infrastructure often required for quantum computing. This allows ColonyOS to orchestrate tasks that need to run on both classical supercomputers and quantum processors.

In summary, ColonyOS's distributed architecture, task orchestration capabilities, scalability, and fault tolerance make it a strong choice for managing the complexities of distributed quantum computing, where tasks need to be efficiently coordinated across a range of quantum and classical computing environments.

Implementation

This post describes the use of ColonyOS as an orchestrator for quantum computation tasks. The focus is on solving a simpler problem (a water ground state energy calculation) to illustrate the potential of the orchestrator and what it could mean for the distributed aspects of quantum computing.

What Is Measured to Evaluate the Calculations

Besides runtimes and ground state energy values generated from the calculations, estimating how noise affects quantum circuits is important. To evaluate the performance of quantum circuits under noisy conditions, there are numerous metrics that can be used. In the present work, the implementation covers Shannon Entropy and Jensen-Shannon Divergence (JSD) as a starting point. The sections below elaborate on this, using Qiskit as a reference.

Noise Modeling in Qiskit

Qiskit provides a range of tools for simulating quantum circuits under various noise models. In this context, two common noise models are employed, described below.

Depolarizing Noise Model

The depolarizing noise model represents a scenario where the quantum state loses its coherence and becomes a completely mixed state with a certain probability. Mathematically, for a single-qubit state \(\rho\), the depolarizing channel \(\mathcal{E}_{\text{dep}}\) is defined as:

$$ \mathcal{E}_{\text{dep}}(\rho) = (1 - p) \rho + p \frac{I}{2} $$

where \(p\) is the depolarizing probability (error rate) and \(I\) is the identity matrix representing the maximally mixed state.

For multi-qubit systems, the depolarizing channel generalizes by applying the noise independently to each qubit or collectively to the entire system, depending on the model specifics.

In Qiskit, depolarizing noise is added to quantum gates like single-qubit rotations (u3) and two-qubit gates (cx) using the depolarizing_error function:

from qiskit_aer import noise  # older Qiskit versions: from qiskit.providers.aer import noise

depolarizing_error_1q = noise.depolarizing_error(p, 1)
depolarizing_error_2q = noise.depolarizing_error(p, 2)
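
These errors are then attached to a NoiseModel and bound to the relevant gates. A minimal sketch, assuming the errors defined above and illustrative gate names:

# Sketch: attach the errors above to a noise model (gate names are illustrative)
noise_model = noise.NoiseModel()
noise_model.add_all_qubit_quantum_error(depolarizing_error_1q, ['u3'])
noise_model.add_all_qubit_quantum_error(depolarizing_error_2q, ['cx'])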

In the depolarizing noise model, every gate operation is followed by the application of the depolarizing channel \(\mathcal{E}_{\text{dep}}\) with a specified probability. This simulates the randomization of the qubit state due to interactions with the environment.

For example, after applying a gate \(U\), the state \(\rho'\) becomes:

$$ \rho' = \mathcal{E}_{\text{dep}}(U \rho U^\dagger) = (1 - p) U \rho U^\dagger + p \frac{I}{2} $$

Bit-Flip Noise Model

The bit-flip noise model simulates the error where a qubit flips its state from \(|0\rangle\) to \(|1\rangle\) or vice versa, akin to a classical bit flip. The bit-flip channel \(\mathcal{E}_{\text{bf}}\) for a single qubit is defined as:

$$ \mathcal{E}_{\text{bf}}(\rho) = (1 - p) \rho + p X \rho X^\dagger , $$

where \(X\) is the Pauli-X operator, and \(p\) is the probability of a bit-flip error.

In Qiskit, bit-flip errors are introduced using the pauli_error function:

bit_flip_error_1q = noise.pauli_error([('X', p), ('I', 1 - p)])
bit_flip_error_2q = noise.pauli_error([('XX', p), ('II', 1 - p)])

For this model, every gate operation is followed by the application of the bit-flip channel \(\mathcal{E}_{\text{bf}}\) with a specified probability. This simulates random flips of the qubit state due to interactions with the environment.

For example, after applying a gate \(U\), the state \(\rho'\) becomes:

$$ \rho' = \mathcal{E}_{\text{bf}}(U \rho U^\dagger) = (1 - p) U \rho U^\dagger + p X U \rho U^\dagger X^\dagger $$
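
To obtain the distributions compared in the next section, the same circuit can be run on an ideal simulator and on one configured with a noise model. A minimal sketch using qiskit-aer, where the Bell circuit is only a stand-in for the VQE circuits used in the actual workflow:

from qiskit import QuantumCircuit, transpile
from qiskit_aer import AerSimulator

# Stand-in circuit: a Bell state with measurement
qc = QuantumCircuit(2, 2)
qc.h(0)
qc.cx(0, 1)
qc.measure([0, 1], [0, 1])

ideal_sim = AerSimulator()
noisy_sim = AerSimulator(noise_model=noise_model)  # noise_model from the sketch above

shots = 4096
noiseless_counts = ideal_sim.run(transpile(qc, ideal_sim), shots=shots).result().get_counts()
noisy_counts = noisy_sim.run(transpile(qc, noisy_sim), shots=shots).result().get_counts()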

Measuring the Impact of Noise

To quantify how noise affects the quantum circuit, we analyze the output probability distributions of the circuit under noiseless and noisy conditions using the following metrics:

Shannon Entropy

Shannon Entropy measures the uncertainty or randomness in a probability distribution. For a discrete random variable with possible outcomes \(\{x_i\}\) and corresponding probabilities \(\{p_i\}\), the Shannon entropy \(H\) is defined as:

$$ H(X) = -\sum_{i} p_i \log_2 p_i $$

where:

\(H(X)\): The Shannon entropy of the random variable \(X\).
\(p_i\): The probability of the \(i\)-th outcome.
\(\log_2 p_i\): The base-2 logarithm of \(p_i\), reflecting the amount of information in each outcome.

In the context of quantum circuits, the entropy of the output distribution indicates how spread out the measurement outcomes are:

Low Entropy: The distribution is concentrated on specific outcomes, implying less uncertainty.
High Entropy: The distribution is more uniform, indicating higher uncertainty and randomness, often due to noise.

By calculating the entropy of both the noiseless (\(H_{\text{noiseless}}\)) and noisy (\(H_{\text{noisy}}\)) output distributions, we can assess the increase in uncertainty introduced by noise.
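
Computed from Qiskit counts dictionaries (such as the ones from the simulation sketch earlier), this can look like the following sketch:

import numpy as np

def shannon_entropy(counts):
    # Shannon entropy (base 2) of a measurement counts dictionary
    shots = sum(counts.values())
    probs = np.array([c / shots for c in counts.values()])
    probs = probs[probs > 0]  # by convention, 0 * log2(0) = 0
    return float(-np.sum(probs * np.log2(probs)))

delta_h = shannon_entropy(noisy_counts) - shannon_entropy(noiseless_counts)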

Jensen-Shannon Divergence (JSD)

The Jensen-Shannon Divergence measures the similarity between two probability distributions. It is a symmetrized and smoothed version of the Kullback-Leibler divergence and is bounded between 0 and 1 when using log base 2.

For two probability distributions \(P = \{p_i\}\) and \(Q = \{q_i\}\), the JSD is defined as:

$$ \text{JSD}(P \parallel Q) = \frac{1}{2} D_{\text{KL}}(P \parallel M) + \frac{1}{2} D_{\text{KL}}(Q \parallel M) $$

where:

\(M = \frac{1}{2}(P + Q)\) is the average distribution,
\(D_{\text{KL}}(P \parallel M)\) is the Kullback-Leibler divergence from \(P\) to \(M\):

$$ D_{\text{KL}}(P \parallel M) = \sum_{i} p_i \log_2 \left( \frac{p_i}{m_i} \right) $$

The JSD effectively measures how much the noisy distribution deviates from the noiseless distribution:

JSD = 0: The distributions are identical.
Higher JSD Values: Indicate greater divergence between the distributions due to noise.
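
A direct implementation of this definition over two counts dictionaries could look like the sketch below:

import numpy as np

def jensen_shannon_divergence(counts_p, counts_q):
    # JSD (base 2) between two counts dictionaries over the union of outcomes
    outcomes = sorted(set(counts_p) | set(counts_q))
    p = np.array([counts_p.get(o, 0) for o in outcomes], dtype=float)
    q = np.array([counts_q.get(o, 0) for o in outcomes], dtype=float)
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0  # terms with a_i = 0 contribute nothing
        return float(np.sum(a[mask] * np.log2(a[mask] / b[mask])))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)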

Applying the Metrics to Evaluate Noise Impact

By computing the Shannon entropy and JSD for the output distributions, we gain quantitative insights into the noise's effect:

Entropy Difference (\(\Delta H\)):

$$ \Delta H = H_{\text{noisy}} - H_{\text{noiseless}} $$

Interpretation: A positive \(\Delta H\) suggests that noise has increased the uncertainty in the output distribution.

Entropy Ratio:

$$ \text{Entropy Ratio} = \frac{H_{\text{noisy}}}{H_{\text{noiseless}}} $$

Interpretation: A ratio above 1 indicates that noise has increased the randomness of the output distribution relative to the ideal circuit.
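
Putting the pieces together, one report per circuit run might look like this sketch, reusing the two helper functions above:

def noise_impact_metrics(noiseless_counts, noisy_counts):
    # Collect the noise metrics for one noiseless/noisy pair of distributions
    h_ideal = shannon_entropy(noiseless_counts)
    h_noisy = shannon_entropy(noisy_counts)
    return {
        "entropy_noiseless": h_ideal,
        "entropy_noisy": h_noisy,
        "entropy_difference": h_noisy - h_ideal,
        "entropy_ratio": h_noisy / h_ideal if h_ideal > 0 else float("inf"),
        "jsd": jensen_shannon_divergence(noiseless_counts, noisy_counts),
    }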

Practical Implications of the Metrics

Algorithm Performance: Lower entropy difference and JSD values imply that the quantum circuit's performance is closer to the ideal case, with noise having a minimal effect.
Noise Model Assessment: By comparing metrics across different noise models (depolarizing vs. bit-flip), we can evaluate which types of noise have more detrimental effects on the circuit.
Optimization Strategies: Understanding how specific noise types impact the circuit guides the development of error mitigation techniques and circuit optimization.

Ranking and the Calculation Graph

The ColonyOS workflow calculation serializes Qiskit objects as well as metrics and metadata from each part of the workflow into a sqlite database. This database is then exposed on localhost by a simple Flask API. That Flask API is connected to a React frontend that exposes two key views of the results data, detailed below. Both views show the same data and allow ranking across a set of metrics, but do so in different ways.
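
The API layer itself can be kept very thin. As a minimal sketch of what such an endpoint could look like (the database path and the metrics table schema are hypothetical, not the exact ones used in the implementation):

import sqlite3

from flask import Flask, jsonify

app = Flask(__name__)
DB_PATH = "workflow_results.db"  # hypothetical path to the workflow's sqlite output

@app.route("/api/metrics")
def metrics():
    # One row per noise simulation, joined with its VQE data (hypothetical schema)
    con = sqlite3.connect(DB_PATH)
    con.row_factory = sqlite3.Row
    rows = con.execute("SELECT * FROM metrics").fetchall()
    con.close()
    return jsonify([dict(row) for row in rows])

if __name__ == "__main__":
    app.run(port=5000)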

The Metrics table

The metrics table is a simple (in development) table that displays each noise simulation computation together with data from its related VQE simulation.

The Workflow Graph

The workflow graph shows how each step in the workflow is connected and which steps depend on the information of others.

Here, the legend tells which part of the calculation workflow the nodes correspond to, and a node information panel displays metrics for the selected node. It allows one to compute a ranking across nodes (similar to the metrics table) and rescales and labels the nodes as a function of rank, as can be seen below.

While this is interesting and useful for smaller calculations like this, one can imagine what such a database could offer if the graph provided easily searchable sets of data. Graph structures also readily fit into graph learning algorithms, opening up the possibility of predicting calculation graph results from the inputs, to get a first-order approximation of what may be good parameters to explore for a given system.

Code Examples

If you have read this far but still feel confused about ColonyOS, my slide material (really a daylight crime: great material written by my colleague Johan Kristiansson, with only minor changes by me) might help:

qas-cos.pdf

As mentioned earlier, we use ColonyOS features to create and schedule workflows of processes. This can be done in one go using the Python interface pycolonies:


# Assumes the pycolonies client objects (colonies, colonyname, prvkey) are set up
# earlier in the script, and that log_step is the logging helper used throughout.
# The imports below reflect recent pycolonies versions and may vary.
import time

from pycolonies import Conditions, FuncSpec, Workflow

def build_workflow():
    step_name = "build_workflow"
    start_time = time.time()
    log_step(step_name, "started", start_time=start_time)

    try:
        vqe_simulation_nodes = []

        # Step 1: Calculate One-Electron Integrals
        one_electron_spec = generate_one_electron_integrals_spec("calculate_one_electron_integrals")
        vqe_simulation_nodes.append(one_electron_spec)

        # Step 2: Calculate Two-Electron Integrals
        two_electron_spec = generate_two_electron_integrals_spec("calculate_two_electron_integrals")
        vqe_simulation_nodes.append(two_electron_spec)

        # Step 3: Assemble Hamiltonian
        assemble_spec = generate_assemble_hamiltonian_spec(
            "assemble_hamiltonian",
            dependencies=["calculate_one_electron_integrals", "calculate_two_electron_integrals"],
            dependency_ids = [
                one_electron_spec.kwargs.get("one_electron_uuid"),
                two_electron_spec.kwargs.get("two_electron_uuid")
            ]
        )

        vqe_simulation_nodes.append(assemble_spec)

        # Prepare Ansatz and VQE Simulation
        ansatz_types = ['TwoLocal', 'EfficientSU2']
        optimizers = ['COBYLA', 'SPSA']
        hyperparams_list = [
            {'optimizer_steps': 10, 'optimizer_params': {'tol': 1e-3}},
            {'optimizer_steps': 20, 'optimizer_params': {'tol': 1e-4}},
            {'optimizer_steps': 30, 'optimizer_params': {'tol': 1e-5}},
        ]

        #Define noise models
        noise_models = ['depolarizing', 'bit_flip']

        for ansatz_type in ansatz_types:

            ansatz_spec = generate_ansatz_spec(
                ansatz_type,
                f"prepare_ansatz_{ansatz_type}",
                dependencies=["assemble_hamiltonian"],
                dependency_ids = [
                    assemble_spec.kwargs.get("hamiltonian_uuid"),
                ]
            )
            vqe_simulation_nodes.append(ansatz_spec)

            for optimizer in optimizers:
                # Each optimizer for a given ansatz has its own dependency on the ansatz
                vqe_dependencies = [f"prepare_ansatz_{ansatz_type}"]
                nodename_prefix = f"run_vqe_{ansatz_type}_{optimizer}"

                # Generate VQE execution specs for different hyperparameters
                vqe_hyperparam_specs = generate_vqe_hyperparam_specs(
                    nodename_prefix,
                    dependencies=vqe_dependencies,
                    dependency_ids = [
                        assemble_spec.kwargs.get("hamiltonian_uuid"),
                        ansatz_spec.kwargs.get("ansatz_uuid"),
                    ],
                    optimizer_type=optimizer,
                    hyperparams_list=hyperparams_list
                )

                vqe_simulation_nodes.extend(vqe_hyperparam_specs)

                # For each VQE result, generate noise model specs
                for vqe_spec in vqe_hyperparam_specs:
                    vqe_nodename = vqe_spec.nodename
                    noise_dependencies = [vqe_nodename]
                    noise_nodename_prefix = f"apply_noise_{vqe_nodename}"
                    noise_specs = generate_noise_model_specs(
                        noise_nodename_prefix,
                        dependencies=noise_dependencies,
                        noise_models_list=noise_models,
                        ansatz_type=ansatz_type,
                        dependency_ids = [
                            vqe_spec.kwargs.get("vqe_result_uuid"),
                            ansatz_spec.kwargs.get("ansatz_uuid"),
                        ],
                    )
                    vqe_simulation_nodes.extend(noise_specs)

        # Submit the workflow
        workflow = Workflow(colonyname=colonyname)
        workflow.functionspecs.extend(vqe_simulation_nodes)
        workflow_graph = colonies.submit_workflow(workflow, prvkey)
        print(f"Workflow {workflow_graph.processgraphid} submitted")

This workflow generates a graph much like the one described earlier. Each function spec takes a node identifier and generates data for instantiating a FuncSpec object, like for instance the one for the one-electron integrals:


import uuid  # standard library; used to tag results with unique identifiers

def generate_one_electron_integrals_spec(nodename):
    one_electron_uuid = str(uuid.uuid4())
    return FuncSpec(
        funcname="calculate_one_electron_integrals",
        nodename=nodename,
        kwargs={"one_electron_uuid": one_electron_uuid},
        conditions=Conditions(
            colonyname=colonyname,
            executortype="quantum-executor",
        )
    )

The executor used in this example, in turn, imports the required functions and calls them with appropriate arguments:


import os
import socket
import uuid

from pycolonies import colonies_client
from pycolonies.crypto import Crypto  # in some pycolonies versions: from pycolonies import Crypto

from quantum_workflow.hamiltonian import (
    calculate_one_electron_integrals,
    calculate_two_electron_integrals,
    assemble_hamiltonian
)
from quantum_workflow.ansatz import prepare_ansatz
from quantum_workflow.vqe_simulation import run_vqe_simulation
from quantum_workflow.noise_model import apply_noise_model

class QuantumExecutor:
    def __init__(self):
        colonies, colonyname, colony_prvkey, _, _ = colonies_client()
        self.colonies = colonies
        self.colonyname = colonyname
        self.colony_prvkey = colony_prvkey
        self.executorname = f"quantum-executor-{socket.gethostname()}-{uuid.uuid4()}"
        self.executortype = "quantum-executor"
        self.mem = "1Gi"
        self.cpu = "1000m"
        # self.gpu = {"count": 1}

        # Generate private key for the executor
        crypto = Crypto()
        self.executor_prvkey = crypto.prvkey()
        self.executorid = crypto.id(self.executor_prvkey)

        self.register()

    def register(self):
        executor = {
            "executorname": self.executorname,
            "executorid": self.executorid,
            "colonyname": self.colonyname,
            "executortype": self.executortype
        }

        try:
            # Register and approve the executor
            executor = self.colonies.add_executor(executor, self.colony_prvkey)
            self.colonies.approve_executor(self.colonyname, self.executorname, self.colony_prvkey)

            # Register the functions with the executor
            functions = [
                "calculate_one_electron_integrals",
                "calculate_two_electron_integrals",
                "assemble_hamiltonian",
                "prepare_ansatz",
                "run_vqe_simulation",
                "apply_noise_model"
            ]

            for func in functions:
                self.colonies.add_function(self.colonyname, self.executorname, func, self.executor_prvkey)

        except Exception as err:
            print(err)
            os._exit(0)

        print("Executor", self.executorname, "registered")

    def start(self):
        while True:
            try:
                # Assign the process to the executor
                process = self.colonies.assign(self.colonyname, 10, self.executor_prvkey)
                print("Process", process.processid, "is assigned to executor")

                # Add a log entry
                self.colonies.add_log(process.processid, f"Running {process.spec.funcname}...\n", self.executor_prvkey)

                # Dynamically call the function based on the assigned process
                funcname = process.spec.funcname
                kwargs = process.spec.kwargs

                if funcname == "calculate_one_electron_integrals":
                    result = self.execute_calculate_one_electron_integrals(kwargs)
                elif funcname == "calculate_two_electron_integrals":
                    result = self.execute_calculate_two_electron_integrals(kwargs)
                # etc. for the remaining registered functions; the process is then
                # closed with the result, or failed if an exception occurred

    def execute_calculate_one_electron_integrals(self, kwargs):
        one_electron_uuid = kwargs.get("one_electron_uuid")
        return calculate_one_electron_integrals(one_electron_uuid)
    
    # etc. for the remaining execute_* wrapper methods

calculate_one_electron_integrals and the other functions called by the executor are what contain the actual implementation, here based on PySCF and Qiskit:


import time

from pyscf import gto, scf, ao2mo
# etc. (the log_step helper and the ExtendedIntegralResult wrapper used below are elided)

def calculate_one_electron_integrals(uuid):
    step_name = "calculate_one_electron_integrals"
    start_time = time.time()
    log_step(step_name, "started", start_time=start_time)
    try:
        # Set up the molecule
        mol = gto.Mole()
        mol.atom = '''O 0 0 0; H 0 1 0; H 0 0 1'''
        mol.basis = 'sto3g'
        mol.unit = 'Angstrom'
        mol.build()

        # Logging the molecular details
        log_step(step_name, "molecule_built", additional_info={"molecule": mol.atom_coords().tolist(), "basis": mol.basis})

        # Perform Hartree-Fock calculation
        mf = scf.RHF(mol)
        mf_energy = mf.kernel()

        # Get one-electron integrals (core Hamiltonian)
        hcore = mf.get_hcore()

        end_time = time.time()

        metrics = {
            "step": "calculate_one_electron_integrals",
            "start_time": start_time,
            "end_time": end_time,
            "run_time": end_time - start_time,
        }

        one_electron_result = ExtendedIntegralResult(id = uuid, result = hcore, model_type = "integral", calc_metadata={"uuid": uuid, "description": "Core Hamiltonian"}, metrics = metrics)

        one_electron_result.save_to_db()

        log_step(step_name, "completed", additional_info={"uuid": uuid, "hcore_shape": hcore.shape}, start_time=start_time)
    except Exception as e:
        log_step(step_name, "error", additional_info={"error": str(e)}, start_time=start_time)
        raise e
    return uuid
    

Clearly, for simplicity I'm using PySCF to do my integral calculations here, but I could just as well use any other language and integral library. This is a benefit of a loosely coupled system like this: any node can be implemented in any way, as long as the format used to exchange data is preserved.

The workflow output is stored in a local sqlite database, and subsequent nodes in the workflow can access the data in the database upon successful completion of their dependencies (and only then). If the database transactions are atomic and a particular node in the workflow has succeeded, we can trust this to be a safe operation that will yield clear errors if it fails.
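
As a minimal sketch of that pattern, assuming a simple results table keyed by the UUIDs passed between nodes (the actual save_to_db implementation may differ):

import pickle
import sqlite3

def save_result(db_path, result_uuid, payload):
    # The connection context manager wraps the insert in a transaction:
    # it commits on success and rolls back if an exception is raised.
    with sqlite3.connect(db_path) as con:
        con.execute(
            "INSERT INTO results (id, payload) VALUES (?, ?)",
            (result_uuid, pickle.dumps(payload)),
        )

def load_result(db_path, result_uuid):
    con = sqlite3.connect(db_path)
    row = con.execute(
        "SELECT payload FROM results WHERE id = ?", (result_uuid,)
    ).fetchone()
    con.close()
    if row is None:
        # Dependency has not completed (or has failed): fail loudly
        raise KeyError(f"No result stored for {result_uuid}")
    return pickle.loads(row[0])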

Summary

The ambition of this post was to show work in progress on using a state-of-the-art orchestrator for distributed computing in the quantum computation use case, and on using graph data analytics to view and analyse computation results. Future work remains to be published.
