Albantakis, Prentner & Durham (2023) — Computing the integrated information of a quantum mechanism

Quick metadata

Key	albantakis2023computing
Type	Journal article
Journal	Entropy
Year	2023
Volume / Issue	25 / 3
Pages	449
Publisher	MDPI
Status	Review in progress
Topics	IIT, QIIT, quantum foundations, causality, information measures

Citation

Albantakis, Larissa; Prentner, Robert; Durham, Ian. (2023). Computing the integrated information of a quantum mechanism. Entropy 25(3):449.

BibTeX

@article{albantakis2023computing,
  title={Computing the integrated information of a quantum mechanism},
  author={Albantakis, Larissa and Prentner, Robert and Durham, Ian},
  journal={Entropy},
  volume={25},
  number={3},
  pages={449},
  year={2023},
  publisher={MDPI}
}

Albantakis et al. (2023) — QM re-formulation and failure modes

This note rewrites the paper’s core definitions in standard quantum information language, and tries to explain where the reasoning in the paper is physically ambiguous.

1. Translation into standard QM language

Let the total system be described by a finite-dimensional composite Hilbert space $$ \mathcal{H}_Q \;=\; \mathcal{H}_M \otimes \mathcal{H}_{M^0}, $$ where $M$ is the chosen subsystem (referred to in the paper as the “mechanism”, a term that suggests a “set of actions” more than a “set of states”) and $M^0 := Q\setminus M$ is its complement. Let the system update $T$ be a completely positive, trace-preserving (CPTP) map (often unitary in their examples) $$ T:\mathcal{B}(\mathcal{H}_Q) \to \mathcal{B}(\mathcal{H}_Q). $$

For a chosen subsystem $Z \subseteq Q$, write $Z^0 := Q\setminus Z$. The paper calls $Z$ a “purview”. It is simply the target subsystem whose output state is examined.

1.1 Their “effect repertoire” is a channel with a fixed environment state

They define (single-node case, or single qubit if we discard the node terminology) the “effect repertoire” as $$ \pi_e(Z_i \mid m) \;=\; \rho_{Z_i \mid m,\, t+1} \;=\; \operatorname{tr}_{Z_i^0}\!\Bigl(T\bigl(\rho_M \otimes \rho^{\mathrm{mm}}_{M^0}\bigr)\Bigr), $$ where $\rho^{\mathrm{mm}}_{M^0} = I_{M^0}/\dim(\mathcal{H}_{M^0})$ is maximally mixed.

In standard QIT terms: this is simply the output of the effective channel $$ \mathcal{E}_{M\to Z}(X) \;:=\; \operatorname{tr}_{Z^0}\!\Bigl(T\bigl(X \otimes \rho^{\mathrm{mm}}_{M^0}\bigr)\Bigr), \qquad \pi_e(Z\mid m) = \mathcal{E}_{M\to Z}(\rho_M). $$

So “mechanism/purview/effect repertoire” = “pick subsystems + apply a derived channel + take a reduced state”.

This is mostly renaming as far as I can tell.

The nontrivial part is the choice $\rho_{M^0}\mapsto \rho^{\mathrm{mm}}_{M^0}$, i.e. replacing the outside world with maximally mixed “noise”.
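A minimal NumPy sketch of the effective channel above, assuming qubit subsystems, a unitary update, and qubit 0 as the leftmost tensor factor. The helper names (`partial_trace`, `embed`, `effect_repertoire`) and the CNOT example are my own illustration and are not taken from the paper.

```python
import numpy as np

def partial_trace(rho, keep, n):
    """Reduced state of an n-qubit density matrix on the qubits in `keep`."""
    t = rho.reshape([2] * (2 * n))
    m = n  # number of qubits still present in the tensor
    for q in sorted(set(range(n)) - set(keep), reverse=True):
        t = np.trace(t, axis1=q, axis2=q + m)  # contract row/column axes of qubit q
        m -= 1
    return t.reshape(2 ** m, 2 ** m)

def embed(rho_M, M, n):
    """rho_M on qubits M (ascending order), maximally mixed state on all others."""
    rest = [q for q in range(n) if q not in M]
    rho = np.kron(rho_M, np.eye(2 ** len(rest)) / 2 ** len(rest))
    perm = np.argsort(list(M) + rest).tolist()  # current position of each qubit
    t = rho.reshape([2] * (2 * n))
    t = t.transpose(perm + [p + n for p in perm])  # restore canonical qubit order
    return t.reshape(2 ** n, 2 ** n)

def effect_repertoire(U, rho_M, M, Z, n):
    """pi_e(Z | m) = tr_{Z^0}( U (rho_M x I/d) U^dagger )."""
    rho_Q = U @ embed(rho_M, M, n) @ U.conj().T
    return partial_trace(rho_Q, Z, n)

# CNOT with control = qubit 0, target = qubit 1.
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)
one = np.diag([0, 1]).astype(complex)       # |1><1|
plus = np.full((2, 2), 0.5, dtype=complex)  # |+><+|

r_control = effect_repertoire(CNOT, one, [0], [0], 2)     # stays |1><1|
r_target = effect_repertoire(CNOT, one, [0], [1], 2)      # I/2: the target's own input was noise
r_dephased = effect_repertoire(CNOT, plus, [0], [0], 2)   # I/2: coherence of |+> is lost
```

The last case shows the effect of the noise convention most clearly: a pure input on the control comes out maximally mixed, only because the target was replaced by $I/2$.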

1.2 Their multi-node “discount extraneous correlations” step is an entanglement-cluster factorization

For $|Z|>1$ they first form $$ \rho_{Z\mid m,\,t+1} := \operatorname{tr}_{Z^0}\!\Bigl(T\bigl(\rho_M \otimes \rho^{\mathrm{mm}}_{M^0}\bigr)\Bigr) $$ and then define a “maximal separability partition” $P^*(\rho_{Z\mid m,t+1})$ into subsets $Z^{(1)},\dots,Z^{(r^*)}$ that are internally entangled but not mutually entangled, and set $$ \pi_e(Z\mid m) \;:=\; \bigotimes_{i=1}^{r^*}\rho_{Z^{(i)}\mid m,\,t+1}. $$ This is their Definition 4 / Eq. (23).

This is equivalent to: “factor the chosen subset of states into entanglement clusters, then take a product state across clusters”, which removes correlations they label “extraneous classical correlations”.

Finding multipartite mixed-state entanglement structure is nontrivial.
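To make the clustering idea concrete, here is a sketch for the easy case only: pure states of a few qubits. It uses the fact that a subset of qubits is unentangled with the rest of a pure state exactly when its reduced state is pure. This is a swapped-in technique, not the paper's $P^*$ procedure, and it says nothing about the mixed-state case, which is the hard one. The function names are hypothetical.

```python
import itertools
import numpy as np

def partial_trace(rho, keep, n):
    """Reduced state of an n-qubit density matrix on the qubits in `keep`."""
    t = rho.reshape([2] * (2 * n))
    m = n
    for q in sorted(set(range(n)) - set(keep), reverse=True):
        t = np.trace(t, axis1=q, axis2=q + m)
        m -= 1
    return t.reshape(2 ** m, 2 ** m)

def smallest_pure_subset(rho, first, others, n, tol):
    """Smallest subset containing `first` whose reduced state has purity 1."""
    for size in range(len(others) + 1):
        for extra in itertools.combinations(others, size):
            S = [first, *extra]
            r = partial_trace(rho, S, n)
            if abs(np.trace(r @ r).real - 1.0) < tol:
                return S
    raise ValueError("input state is not pure")

def pure_state_clusters(psi, n, tol=1e-9):
    """Finest partition of the qubits into mutually unentangled clusters (pure states only)."""
    rho = np.outer(psi, np.conj(psi))
    remaining, clusters = list(range(n)), []
    while remaining:
        S = smallest_pure_subset(rho, remaining[0], remaining[1:], n, tol)
        clusters.append(S)
        remaining = [q for q in remaining if q not in S]
    return clusters

bell = np.array([1, 0, 0, 1]) / np.sqrt(2)
bell_times_zero = np.kron(bell, [1, 0])   # Bell pair on qubits 0,1 and |0> on qubit 2
ghz = np.zeros(8)
ghz[[0, 7]] = 1 / np.sqrt(2)

print(pure_state_clusters(bell_times_zero, 3))  # [[0, 1], [2]]
print(pure_state_clusters(ghz, 3))              # [[0, 1, 2]]
```

The search is exponential in the number of qubits even in this easy case, and the purity test has no direct analogue for mixed states.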

2. Their information measure is a max-eigenvalue score against a maximally mixed baseline

They define quantum intrinsic difference (QID) as $$ \mathrm{QID}(\rho\Vert\sigma) \;:=\; \max_i \; p_i \left(\log p_i - \sum_j P_{ij}\log q_j\right), $$ where $\rho=\sum_i p_i|i\rangle\langle i|$, $\sigma=\sum_j q_j|j\rangle\langle j|$, and $P_{ij}=\langle i|j\rangle\langle j|i\rangle$.
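A direct NumPy transcription of this formula, as a sketch. Assumptions on my side: logarithms are base 2, the convention $0\log 0 = 0$ is used, and the value is $+\infty$ when an eigenvector of $\rho$ with nonzero weight overlaps the kernel of $\sigma$. The function name `qid` is illustrative. Note that $\sum_j P_{ij}\log q_j = \langle i|\log\sigma|i\rangle$, so the result does not depend on how eigenvectors of $\sigma$ are chosen inside a degenerate eigenspace. It can, however, depend on that choice for degenerate eigenvalues of $\rho$.

```python
import numpy as np

def qid(rho, sigma, tol=1e-12):
    """QID(rho || sigma) in bits, and the eigenvector of rho attaining the maximum."""
    p, V = np.linalg.eigh(rho)        # columns of V are the eigenvectors |i>
    q, W = np.linalg.eigh(sigma)      # columns of W are the eigenvectors |j>
    P = np.abs(V.conj().T @ W) ** 2   # P[i, j] = |<i|j>|^2
    best_val, best_vec = -np.inf, None
    for i in range(len(p)):
        if p[i] <= tol:
            val = 0.0                 # convention: 0 * log 0 = 0
        else:
            hit = P[i] > tol          # eigenvectors of sigma that |i> overlaps with
            if np.any(q[hit] <= tol):
                val = np.inf          # |i> has weight outside the support of sigma
            else:
                val = p[i] * (np.log2(p[i]) - np.sum(P[i, hit] * np.log2(q[hit])))
        if val > best_val:
            best_val, best_vec = val, V[:, i]
    return best_val, best_vec

rho = np.diag([0.9, 0.1])
value, state = qid(rho, np.eye(2) / 2)
print(value)   # 0.9 * log2(1.8), roughly 0.763
```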

They then set the “unconstrained” baseline to maximally mixed: $$ \sigma = \pi_e(Z) = \rho_Z^{\mathrm{mm}} = I_Z/d,\qquad d=\dim(\mathcal{H}_Z). $$

With $\sigma=I_Z/d$ one gets $q_j=1/d$ and $\sum_j P_{ij}\log q_j = \log(1/d)=-\log d$, hence $$ \mathrm{QID}\bigl(\rho\Vert I_Z/d\bigr) = \max_i \; p_i(\log p_i + \log d). $$

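So, for their chosen baseline, QID depends only on the eigenvalues of $\rho$. The maximum is attained at the largest eigenvalue $p_{\max}$: the function $p\mapsto p\log(pd)$ is negative for $p<1/d$ and nonnegative and increasing for $p\ge 1/d$, and $p_{\max}\ge 1/d$ always holds.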
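A quick numerical check of this reduction on random density matrices. The seed, the dimensions, and the way the random states are generated are arbitrary choices of mine, not anything from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, arbitrary

def random_density_matrix(d):
    """A random full-rank density matrix of dimension d."""
    a = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = a @ a.conj().T
    return rho / np.trace(rho).real

def qid_terms_vs_maximally_mixed(rho):
    """Per-eigenvalue terms p_i (log2 p_i + log2 d), and the eigenvalues p_i."""
    p = np.clip(np.linalg.eigvalsh(rho), 1e-300, None)  # guard against log(0)
    return p * (np.log2(p) + np.log2(len(p))), p

for d in (2, 3, 4, 8):
    terms, p = qid_terms_vs_maximally_mixed(random_density_matrix(d))
    assert np.argmax(terms) == np.argmax(p)                        # top eigenvalue wins
    assert np.isclose(terms.max(), p.max() * np.log2(p.max() * d))
```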

That is why they conclude the “intrinsic effect state” is $$ z_e^0(m,Z)=\arg\max_i\;p_i(\cdots) \quad\Rightarrow\quad z_e^0 \text{ is the eigenvector of } \pi_e(Z\mid m) \text{ with largest eigenvalue.} $$

They explicitly note: if $\pi_e(Z\mid m)$ is mixed, then $z_e^0 \neq \pi_e(Z\mid m)$.

This selection of a single eigenstate as the “intrinsic effect state” is, as far as I can tell, the main novelty of the paper.

3. What is “integrated information” here in plain QM terms?

The paper’s “integrated information” is built from a comparison between:

a full effective channel from the chosen subsystem $M$ to the chosen subsystem $Z$, and
a cut (partitioned) version of that channel where cross-part influences are disabled by injecting maximally mixed noise.

The difference between the full and cut outputs is then measured using their quantum intrinsic difference (QID), and finally optimized over partitions and purviews.

3.1 Full (uncut) effective channel $\mathcal{E}_{M\to Z}$

Assume a global update map (channel) on the total system $Q$, $$ T:\mathcal{B}(\mathcal{H}_Q)\to\mathcal{B}(\mathcal{H}_Q), $$ and a factorization $\mathcal{H}_Q=\mathcal{H}_M\otimes\mathcal{H}_{M^0}$.

Define the maximally mixed state on the complement of $M$: $$ \rho^{\mathrm{mm}}_{M^0} := \frac{I_{M^0}}{\dim(\mathcal{H}_{M^0})}. $$

Then the “effect repertoire” is exactly the output of the induced (effective) channel $$ \mathcal{E}_{M\to Z}(X) := \operatorname{tr}_{Z^0}\Bigl(T\bigl(X\otimes\rho^{\mathrm{mm}}_{M^0}\bigr)\Bigr), \qquad \pi_e(Z\mid m) = \mathcal{E}_{M\to Z}(\rho_M). $$

Interpretation: this is a standard reduced-state construction *except* that the outside world $M^0$ is forcibly set to maximally mixed as an intervention convention.

3.2 What a “partition” $\theta$ is

A partition $\theta$ is a rule that splits both $M$ and $Z$ into matched parts: $$ \theta = \{(M^{(i)}\to Z^{(i)})\}_{i=1}^k, $$ where the parts are disjoint and cover the sets: $$ M = \bigsqcup_{i=1}^k M^{(i)},\qquad Z = \bigsqcup_{i=1}^k Z^{(i)}. $$

For example, if $M=\{A,B\}$ and $Z=\{A,B\}$ then a common bipartition is $$ \theta = (A\to A)\cup(B\to B). $$

3.3 What “cutting connections” means (operationally)

“Cut” does not mean a physical wire is severed. It means:

*when computing the output on each $Z^{(i)}$, all inputs not belonging to $M^{(i)}$ are replaced by maximally mixed noise.*

So cross-part influences $M^{(j)}\to Z^{(i)}$ for $j\neq i$ are disabled by construction.

In QIT terms: you construct a family of cut channels where some inputs are fixed to $I/d$ instead of being allowed to carry state-dependent information.

3.4 Cut channel and cut repertoire

For each part $i$, define a local induced channel $$ \mathcal{E}^{(i)}_{M^{(i)}\to Z^{(i)}}(X) := \operatorname{tr}_{(Z^{(i)})^0}\Bigl( T\bigl(X\otimes \rho^{\mathrm{mm}}_{(M^{(i)})^0}\bigr) \Bigr), $$ where $(M^{(i)})^0:=Q\setminus M^{(i)}$ and $\rho^{\mathrm{mm}}_{(M^{(i)})^0}=I/\dim(\mathcal{H}_{(M^{(i)})^0})$. This is “feed only $M^{(i)}$ as an input; everything else is noise”.

Then the partitioned (cut) effect repertoire is assembled as a product: $$ \pi_e^{\theta}(Z\mid m) \;:=\; \bigotimes_{i=1}^k \mathcal{E}^{(i)}_{M^{(i)}\to Z^{(i)}}\bigl(\rho_{M^{(i)}}\bigr). $$

This is the formal meaning of “the parts act independently”: each $Z^{(i)}$ receives information only from its paired $M^{(i)}$.
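The cut construction can be sketched in NumPy as follows. Assumptions: qubits, a unitary update, nonempty parts, and parts $Z^{(i)}$ listed in ascending qubit order so that the Kronecker product comes out in canonical order. The entanglement-cluster step is left out. All function names are illustrative.

```python
import numpy as np

def partial_trace(rho, keep, n):
    """Reduced state of an n-qubit density matrix on the qubits in `keep`."""
    t = rho.reshape([2] * (2 * n))
    m = n
    for q in sorted(set(range(n)) - set(keep), reverse=True):
        t = np.trace(t, axis1=q, axis2=q + m)
        m -= 1
    return t.reshape(2 ** m, 2 ** m)

def embed(rho_M, M, n):
    """rho_M on qubits M (ascending order), maximally mixed state on all others."""
    rest = [q for q in range(n) if q not in M]
    rho = np.kron(rho_M, np.eye(2 ** len(rest)) / 2 ** len(rest))
    perm = np.argsort(list(M) + rest).tolist()
    t = rho.reshape([2] * (2 * n)).transpose(perm + [p + n for p in perm])
    return t.reshape(2 ** n, 2 ** n)

def cut_repertoire(U, rho_M, M, theta, n):
    """Product over parts (M_i -> Z_i) of tr_{Z_i^0}( U (rho_{M_i} x I/d) U^dagger )."""
    out = np.eye(1, dtype=complex)
    for M_i, Z_i in theta:
        # marginal of rho_M on the part M_i (positions are relative to M)
        rho_Mi = partial_trace(rho_M, [M.index(q) for q in M_i], len(M))
        rho_Q = U @ embed(rho_Mi, M_i, n) @ U.conj().T
        out = np.kron(out, partial_trace(rho_Q, Z_i, n))
    return out

CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)   # control = qubit 0, target = qubit 1
rho_M = np.diag([0, 0, 1, 0]).astype(complex)    # |10><10|
theta = [([0], [0]), ([1], [1])]                 # (A -> A) and (B -> B)

full = partial_trace(CNOT @ rho_M @ CNOT.conj().T, [0, 1], 2)  # |11><11|
cut = cut_repertoire(CNOT, rho_M, [0, 1], theta, 2)            # |1><1| x I/2
```

In this example the cut removes exactly the control-to-target influence: the target no longer knows whether to flip, so its output is $I/2$.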

Note: in the multi-node case the paper may additionally factor $Z$ into entanglement clusters before forming products; that is an extra step intended to avoid destroying entanglement-generated correlations, and it is where mixed-state entanglement detection becomes a weak link.

3.5 The score $\varphi(m,Z,\theta)$

Let $Q$ be factorized and let $$ T:\mathcal B(\mathcal H_Q)\to\mathcal B(\mathcal H_Q) $$ be the global channel. For a chosen input subsystem $M$ prepared in state $\rho_M$, define the intervention-induced effective channel $$ \mathcal E_{M\to Z}(X):=\operatorname{tr}_{Z^0}\!\bigl(T(X\otimes \tau_{M^0})\bigr), \qquad \tau_{M^0}:=\frac{I_{M^0}}{d_{M^0}}, $$ and the corresponding output marginal on the chosen target subsystem $Z$: $$ \rho_Z:=\mathcal E_{M\to Z}(\rho_M). $$

For a partition $\theta\in\Theta(M,Z)$, construct the paper’s cut / partitioned output $\sigma_Z^{(\theta)}$ by (i) injecting maximally mixed noise across the cut and (ii) taking a tensor product across the partition blocks (optionally after the paper’s $P^*$ “entanglement cluster” factorization step for $|Z|>1$). Denote the resulting state by $$ \sigma_Z^{(\theta)}:=\pi^{\theta}_e(Z\mid m)\quad\text{(paper’s notation)}. $$

Now diagonalize $$ \rho_Z=\sum_i p_i|i\rangle\langle i|,\qquad \sigma_Z^{(\theta)}=\sum_j q_j|j\rangle\langle j|, \qquad P_{ij}:=|\langle i|j\rangle|^2. $$

Intrinsic (selected) effect state. The paper first selects a particular eigenstate $|i^*\rangle$ of $\rho_Z$ via an “intrinsic information vs baseline” optimization. With their maximally mixed baseline on $Z$, this selection reduces to choosing the largest-eigenvalue eigenvector of $\rho_Z$ (or the corresponding eigenspace if degenerate). Call that chosen eigenstate $$ z'_e(m,Z):=|i^*\rangle. $$

Integrated effect information for a partition. The paper then evaluates its QID expression at that chosen eigenstate (not by maximizing over $i$ again): $$ \phi_e(m,Z,\theta) :=\phi(m,z'_e,\theta) = p_{i^*}\left(\log p_{i^*}-\sum_j P_{i^*j}\log q_j\right). $$ Equivalently, define the $P_{i^*j}$-weighted geometric mean $$ \bar q_{i^*}:=\exp\!\left(\sum_j P_{i^*j}\log q_j\right), $$ so that $$ \phi_e(m,Z,\theta)=p_{i^*}\log\!\frac{p_{i^*}}{\bar q_{i^*}}. $$
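The two-step rule (select the top eigenvector of the full output, then score it against the cut output) can be sketched as below. Assumptions of mine: base-2 logarithms, ties in the top eigenvalue broken arbitrarily by `argmax`, and a value of $+\infty$ when the selected eigenvector overlaps the kernel of the cut state. The function name `phi_e` mirrors the notation above but is otherwise hypothetical.

```python
import numpy as np

def phi_e(rho_Z, sigma_Z, tol=1e-12):
    """p_{i*} (log2 p_{i*} - sum_j P_{i*j} log2 q_j) at the top eigenvector of rho_Z."""
    p, V = np.linalg.eigh(rho_Z)
    i_star = int(np.argmax(p))               # ties broken arbitrarily (assumption)
    z = V[:, i_star]                         # selected intrinsic effect state
    q, W = np.linalg.eigh(sigma_Z)
    overlap = np.abs(W.conj().T @ z) ** 2    # P_{i* j} = |<i*|j>|^2
    hit = overlap > tol
    if np.any(q[hit] <= tol):
        return np.inf, z                     # z lies partly outside the support of sigma_Z
    log_q_bar = np.sum(overlap[hit] * np.log2(q[hit]))  # log2 of the weighted geometric mean
    return p[i_star] * (np.log2(p[i_star]) - log_q_bar), z

# Full output |11><11| versus cut output |1><1| x I/2 (two qubits).
rho_full = np.diag([0.0, 0.0, 0.0, 1.0])
sigma_cut = np.diag([0.0, 0.0, 0.5, 0.5])
value, z = phi_e(rho_full, sigma_cut)
print(value)   # 1.0 (one bit)
```

The example states are what a CNOT acting on $|10\rangle$ produces under the partition $(A\to A)\cup(B\to B)$: the selected state $|11\rangle$ has probability 1 in the full output and 1/2 in the cut output, hence one bit.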

3.6 Optimization: MIP and “best” purview

For fixed $M,\rho_M$ and candidate target $Z$, the paper chooses the minimum information partition (MIP) using a normalized criterion: $$ \theta'(m,Z) = \arg\min_{\theta\in\Theta(M,Z)} \frac{\phi_e(m,Z,\theta)}{\max_{T'_S}\phi_e(m,Z,\theta)}. $$ Here $\max_{T'_S}$ ranges over alternative systems/channels of the same dimensions; this normalization is part of the selection of $\theta$.

Then the (reported) integrated effect information for that $Z$ is $$ \phi_e(m,Z):=\phi_e(m,Z,\theta'(m,Z)). $$

Finally, the best target subsystem (or in paper terms: “maximally irreducible effect purview” - not sure why it is irreducible) is chosen by $$ Z^*_e(m)=\arg\max_{Z\subseteq Q}\phi_e(m,Z), \qquad \phi_e(m)=\max_{Z\subseteq Q}\phi_e(m,Z). $$

3.7 Mathematical framework

We assume a finite-dimensional composite system with a chosen tensor decomposition $$ \mathcal{H}_Q \cong \bigotimes_{i=1}^n \mathcal{H}_i, $$ and we pick subsets of indices to define subsystems: $$ M \subseteq \{1,\dots,n\},\qquad Z \subseteq \{1,\dots,n\}. $$ Write the complements as $M^0 := Q\setminus M$ and $Z^0 := Q\setminus Z$, so that $$ \mathcal{H}_Q \cong \mathcal{H}_M \otimes \mathcal{H}_{M^0} \cong \mathcal{H}_Z \otimes \mathcal{H}_{Z^0}. $$

Let the global update be a quantum channel (CPTP map) $$ T:\mathcal{B}(\mathcal{H}_Q)\to \mathcal{B}(\mathcal{H}_Q), $$ often $T(\rho)=U\rho U^\dagger$ for a unitary $U$.

Step 1: Build an embedding of a local state into the global system

The input for the mechanism is its reduced density matrix $\rho_M\in\mathcal{D}(\mathcal{H}_M)$.

Define the maximally mixed state on the complement $$ \rho^{\mathrm{mm}}_{M^0} := \frac{I_{M^0}}{d_{M^0}}, \qquad d_{M^0}:=\dim(\mathcal{H}_{M^0}). $$

Define the embedding map $$ \iota_M(\rho_M) := \rho_M \otimes \rho^{\mathrm{mm}}_{M^0} \in \mathcal{D}(\mathcal{H}_Q). $$

This is the first non-physical convention in the pipeline: it replaces whatever the actual state of $M^0$ is by $\rho^{\mathrm{mm}}_{M^0}$.

Step 2: Evolve globally and reduce to the purview

Evolve the embedded state: $$ \rho'_{Q} := T(\iota_M(\rho_M)). $$

Reduce to the purview: $$ \rho'_{Z} := \operatorname{tr}_{Z^0}(\rho'_Q). $$

This defines the induced channel from $M$ to $Z$: $$ \mathcal{E}_{M\to Z}(X) := \operatorname{tr}_{Z^0}\!\left(T\left(X\otimes \rho^{\mathrm{mm}}_{M^0}\right)\right), \qquad \rho'_Z=\mathcal{E}_{M\to Z}(\rho_M). $$

So far, everything is standard QIT manipulation (tensoring, channel application, partial trace) except for the choice of $\rho^{\mathrm{mm}}_{M^0}$.

Step 3: Define a partition $\theta$ and the corresponding cut construction

A partition $\theta$ is a collection of paired parts $$ \theta=\{(M^{(i)}\to Z^{(i)})\}_{i=1}^k, $$ with disjoint unions $$ M=\bigsqcup_{i=1}^k M^{(i)},\qquad Z=\bigsqcup_{i=1}^k Z^{(i)}. $$

For each part $i$, define an embedding that treats everything outside $M^{(i)}$ as maximally mixed: $$ \iota_{M^{(i)}}(X) := X \otimes \rho^{\mathrm{mm}}_{(M^{(i)})^0}, \qquad (M^{(i)})^0 := Q\setminus M^{(i)}. $$

Define a local induced channel for that part: $$ \mathcal{E}^{(i)}_{M^{(i)}\to Z^{(i)}}(X) := \operatorname{tr}_{(Z^{(i)})^0}\!\left( T\left(\iota_{M^{(i)}}(X)\right) \right), \qquad (Z^{(i)})^0 := Q\setminus Z^{(i)}. $$

Then define the cut (partitioned) output on the whole purview as the tensor product: $$ \mathcal{E}^{(\theta)}_{M\to Z}(\rho_M) := \bigotimes_{i=1}^k \mathcal{E}^{(i)}_{M^{(i)}\to Z^{(i)}}(\rho_{M^{(i)}}), $$ and correspondingly $$ \rho^{(\theta)}_Z := \mathcal{E}^{(\theta)}_{M\to Z}(\rho_M). $$

Interpretation: the cut construction forces $Z^{(i)}$ to depend only on $M^{(i)}$ because all other inputs are replaced by maximally mixed noise, and then it forces the joint $Z$ state to factor across parts by taking a tensor product of the part outputs.

(For multi-node purviews, the paper may add an entanglement-clustering step before taking the product; the above is the clean “channel surgery” skeleton.)

Step 4: Define the distance between full and cut outputs using QID

Let $$ \rho_Z := \mathcal{E}_{M\to Z}(\rho_M), \qquad \rho^{(\theta)}_Z := \mathcal{E}^{(\theta)}_{M\to Z}(\rho_M). $$

Their divergence is QID. A practical way to present it is: diagonalize both states $$ \rho_Z=\sum_i p_i |i\rangle\langle i|, \qquad \rho^{(\theta)}_Z=\sum_j q_j |j\rangle\langle j|, $$ and define overlaps $$ P_{ij}:=|\langle i|j\rangle|^2. $$

Then $$ \mathrm{QID}(\rho_Z\Vert \rho^{(\theta)}_Z) = \max_i\, p_i\left(\log p_i - \sum_j P_{ij}\log q_j\right). $$

They also define an associated maximizing eigenstate (or eigenspace if degenerate) by selecting the index that achieves the maximum: $$ i^\star \in \arg\max_i\, p_i\left(\log p_i - \sum_j P_{ij}\log q_j\right), \qquad z_e^0 := |i^\star\rangle. $$

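The partition score is then $$ \varphi(m,Z,\theta) := \mathrm{QID}(\rho_Z\Vert \rho^{(\theta)}_Z). $$ Note that Section 3.5 describes a slightly different rule: there the same expression is evaluated at the eigenstate selected against the maximally mixed baseline, rather than maximized over $i$ again.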

Step 5: Optimize over partitions and purviews

For a fixed purview $Z$, define the minimizing partition (minimum-information partition): $$ \theta^\star(m,Z) \in \arg\min_\theta \varphi(m,Z,\theta), $$ and the corresponding value $$ \varphi(m,Z) := \min_\theta \varphi(m,Z,\theta). $$

Then select the purview that maximizes this value: $$ Z^\star(m) \in \arg\max_{Z\subseteq Q} \varphi(m,Z), \qquad \varphi_e(m) := \max_{Z\subseteq Q}\min_\theta \varphi(m,Z,\theta). $$

This completes the effect-side pipeline: full induced channel output versus cut-channel output, scored by QID, minimized over cuts, then maximized over purviews.
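The order of the two optimizations (minimum over partitions first, then maximum over purviews) is easy to get backwards, so here is a small sketch. The score table is invented purely for illustration, the purview and partition labels are placeholders, and the normalization from Section 3.6 is omitted.

```python
def select_purview(scores):
    """Return (Z*, theta*, phi) given scores[Z][theta] = phi(m, Z, theta)."""
    best = None
    for Z, by_partition in scores.items():
        # minimum-information partition for this purview
        theta_min, phi_min = min(by_partition.items(), key=lambda kv: kv[1])
        # keep the purview whose weakest cut is still the strongest
        if best is None or phi_min > best[2]:
            best = (Z, theta_min, phi_min)
    return best

# Hypothetical scores, not computed from any real system.
scores = {
    "A":  {"theta_1": 0.30},
    "B":  {"theta_1": 0.10},
    "AB": {"theta_1": 0.20, "theta_2": 0.50},
}
print(select_purview(scores))  # ('A', 'theta_1', 0.3)
```

Note that purview "AB" loses even though one of its partitions scores 0.50: its value is set by its weakest cut, 0.20.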

The key point is that the construction depends on the intervention convention $X\mapsto X\otimes I/d$ used to define both the full induced channel and the cut channels.

4. Where this is “just QM” vs where it stops being physics

4.1 What is genuinely standard (QM/QIT)

partial trace and reduced states,
CPTP maps / unitary channels,
maximally mixed state as a reference,
relative entropy and related divergences as distinguishability measures.

4.2 What is not standard QM (it is an IIT intervention convention)

Key point: the replacement $\rho_{M^0}\mapsto \rho^{\mathrm{mm}}_{M^0}$ is not “neglecting the environment” in the open-systems sense. It is defining a counterfactual: “what would happen if everything outside $M$ were randomized”.

This is a choice of *causal attribution rule*.

A physically evolved reduced state of $Z$ would be $$ \rho^{\mathrm{phys}}_{Z,t+1} = \operatorname{tr}_{Z^0}\!\bigl(T(\rho_{Q,t})\bigr), $$ with the actual joint state $\rho_{Q,t}$ (including correlations/entanglement). Their repertoire is instead $$ \rho^{\mathrm{IIT}}_{Z,t+1} = \operatorname{tr}_{Z^0}\!\Bigl(T\bigl(\rho_M \otimes \rho^{\mathrm{mm}}_{M^0}\bigr)\Bigr), $$ which is guaranteed to equal $\rho^{\mathrm{phys}}_{Z,t+1}$ when the actual joint state factorizes as $\rho_{Q,t}=\rho_M\otimes \rho^{\mathrm{mm}}_{M^0}$, or when the dynamics makes $Z$ independent of $M^0$; in general the two differ.
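A two-qubit toy case makes the difference visible. The choice of a SWAP update and of the states $|0\rangle$ and $|1\rangle$ is mine, picked so that one purview depends on $M^0$ and the other does not.

```python
import numpy as np

def reduce_two_qubits(rho, keep):
    """Reduced state of a 2-qubit density matrix on qubit `keep` (0 or 1)."""
    t = rho.reshape(2, 2, 2, 2)  # axes: (q0, q1, q0', q1')
    if keep == 0:
        return np.trace(t, axis1=1, axis2=3)
    return np.trace(t, axis1=0, axis2=2)

SWAP = np.array([[1, 0, 0, 0],
                 [0, 0, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1]], dtype=complex)
zero = np.diag([1, 0]).astype(complex)
one = np.diag([0, 1]).astype(complex)

rho_actual = np.kron(zero, one)                    # M = qubit 0 in |0>, M^0 = qubit 1 in |1>
rho_counterfactual = np.kron(zero, np.eye(2) / 2)  # M^0 replaced by I/2

agree = {}
for Z in (0, 1):
    phys = reduce_two_qubits(SWAP @ rho_actual @ SWAP.conj().T, Z)
    iit = reduce_two_qubits(SWAP @ rho_counterfactual @ SWAP.conj().T, Z)
    agree[Z] = bool(np.allclose(phys, iit))

print(agree)  # {0: False, 1: True}
```

For $Z$ = qubit 0 the output is whatever was in $M^0$, so the physical state is $|1\rangle\langle 1|$ while the repertoire is $I/2$. For $Z$ = qubit 1 the output depends only on $M$, and the two agree.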

So: their pipeline is not deriving a physical prediction; it is defining a counterfactual dependence score.

4.3 Energy conservation / Hamiltonian structure is not enforced

Even if $T(\cdot)=U(\cdot)U^\dagger$ is unitary, nothing in the construction requires $U=e^{-iHt}$ for a specified Hamiltonian, nor does it enforce conservation constraints under the intervention $\rho_{M^0}\mapsto I/d$.

If you interpret $M^0$ as a physical environment with a physical state (e.g. thermal), replacing it by the maximally mixed state corresponds to an “infinite temperature” reference and can inject or remove energy relative to that environment model. The framework avoids this by treating the replacement as a formal intervention, not a physical process. Still, how the measure connects to the evolution of physical processes is, I think, what the paper is trying to evaluate.
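As a small illustration of the energy point, take a single environment qubit with Hamiltonian $H=\tfrac{\omega}{2}\sigma_z$ in a Gibbs state. The Hamiltonian, the parameter values, and the units ($\hbar=k_B=1$) are assumptions made for this example only; the paper specifies none of them.

```python
import numpy as np

def thermal_state(H_diag, beta):
    """Gibbs state exp(-beta H)/Z for a Hamiltonian that is diagonal with entries H_diag."""
    w = np.exp(-beta * (H_diag - H_diag.min()))  # shift by the ground energy for stability
    return np.diag(w / w.sum())

omega, beta = 1.0, 2.0                        # illustrative values
H_diag = np.array([-omega / 2, omega / 2])    # eigenvalues of (omega/2) sigma_z, ground state first
H = np.diag(H_diag)

E_thermal = np.trace(H @ thermal_state(H_diag, beta)).real  # equals -(omega/2) tanh(beta omega/2)
E_mixed = np.trace(H @ (np.eye(2) / 2)).real                # 0: the infinite-temperature value
print(E_thermal, E_mixed, E_mixed - E_thermal)
```

At any finite positive temperature the difference `E_mixed - E_thermal` is positive, so in this model the replacement by $I/2$ adds energy to the environment.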

4.4 The “entanglement-cluster product” step is a major weak link for mixed states

They require a decomposition of mixed states to define $P^*(\rho)$ and acknowledge that identifying multipartite mixed-state entanglement structure is hard and not fully solved in general. So for nontrivial mixed states, the factorization step can be:

non-unique,
computationally hard,
sensitive to the chosen criterion of separability.

4.5 The measure collapses “effect content” to a single eigenvector

With the maximally mixed baseline, QID reduces to a function of eigenvalues and the “intrinsic effect” is the top-eigenvalue eigenvector.

That means:

the framework’s “content” is often not the full density matrix $\rho$,
phase-sensitive/coherence structure matters only insofar as it affects the spectrum (or the later partitioning step),
for mixed repertoires, “what the mechanism specifies” becomes “the most probable eigenstate”, not the full state.

This is a deliberate design choice (“specificity”), but it is not forced by QM.

4.6 Built-in causal asymmetry even under unitary dynamics

They explicitly note an asymmetry: the product structure is imposed on parts of $\rho_M$ for causes, not on $\rho_{Z|m}$, yielding a time-asymmetry in “causes vs effects” even when the underlying dynamics is unitary/reversible.

5. Steelman vs critique (short)

Steelman

As a *defined* counterfactual causal attribution scheme, the framework:

avoids common-cause “spurious correlation” effects by construction (their COPY-XOR/CNOT motivation),
tries to preserve entanglement-generated correlations by clustering entanglement before taking products,
can differentiate internal structure of multipartite entanglement classes (e.g. GHZ vs W under $T=I$) in a compositional way.

Critique

If read as physics (rather than as a convention):

the central “disconnect via maximally mixed noise” is not physically derived and ignores real correlations/entanglement with $M^0$,
conservation laws and Hamiltonian constraints are not part of the axioms,
mixed-state entanglement factorization is a weak/ill-defined step in general,
the “intrinsic effect” reduces to a top-eigenvector selection, discarding much of the state’s structure in mixed cases.

6. Minimal “one-line” summary

Mathematically, the paper computes a max-eigenvalue-weighted distinguishability between the output of a derived channel $$ \mathcal{E}_{M\to Z}(X)=\operatorname{tr}_{Z^0}(T(X\otimes I/d)) $$ and the output of its cut/partitioned variants, after optionally factorizing $Z$ into entanglement clusters.

Everything else is terminology.