Differences

This shows you the differences between two versions of the page.

--- bib:albantakis2023computing [2026/01/28 07:59] – [3. What is “integrated information” here in plain QM terms?] kymki
+++ bib:albantakis2023computing [2026/05/21 06:29] (current) – [Steelman] kymki
@@ Line 17: / Line 17: @@
 ===== BibTeX =====
-<code bibtex>
+<code javascript>
 @article{albantakis2023computing,
   title={Computing the integrated information of a quantum mechanism},
@@ Line 29: / Line 29: @@
 }
 </code>
+====== Albantakis et al. (2023) — QM re-formulation and failure modes ======
-===== One-paragraph summary (plain language) =====
+I'm reading this with minimal background in IIT, and most of its concepts are unknown to me. The reading tries to extract which QIT tools the authors lean on, in order to judge whether what they propose makes "physical sense".
-The paper proposes a method (QIIT/QIIT-like) to compute “integrated information” for quantum systems by defining cause/effect “repertoires” using density matrices and comparing them to a baseline “chance” state. It adapts classical IIT machinery (mechanism/purview partitions, counterfactual noise injection, integration via partitioning) into a quantum formalism.
-===== Review =====
-==== What it claims to do (in operational terms) ====
-  * Defines quantum cause/effect repertoires for a chosen “mechanism” and “purview”.
-  * Defines an “intrinsic difference” measure (QID) between a constrained repertoire and a baseline repertoire.
-  * Defines “integration” by how much this measure drops under partitions (analogous to classical IIT).
-==== What is genuinely new (vs renaming) ====
-  * A concrete proposal for handling entanglement when factorizing repertoires (nontrivial compared to classical product-factor approaches).
-  * A specific divergence-like measure (QID) intended to encode “intrinsic” rather than channel-designer information.
-==== What looks like repackaging / relabeling ====
-  * “Mechanism” ≈ chosen subsystem.
-  * “Purview” ≈ another chosen subsystem.
-  * “Repertoire” ≈ a (conditional / counterfactual) reduced density matrix.
-  * The conceptual novelty is not in the quantum objects, but in the *interpretation* (intrinsic/self-specifying) and the *intervention rule*.
-==== Core conceptual friction points (physics-first critique) ====
-  - **Counterfactual noise injection:** The method “disconnects” everything outside the mechanism by replacing it with maximally mixed noise. This is not a physical open-system approximation; it is a *chosen intervention rule*. Any “intrinsic” claims depend on this convention.
-  - **Factorization dependence:** Results depend on the choice of subsystem decomposition (“units”). In quantum theory, factorization is not always unique or physically privileged; if the computed structure changes under refactorization, it’s hard to call it intrinsic.
-  - **Unitary bias / measurement gap:** The clean formalism largely lives in unitary evolution; measurement/non-unitary updates create ambiguity for “cause” directionality and can become interpretation-dependent.
-  - **Mixed-state ambiguity:** A density matrix can represent ignorance (epistemic) or a reduced state from entanglement (ontic-but-subsystem). The framework’s language often slides between these readings.
-==== Where the prose risks misleading the reader ====
-  * Phrases like “the system knows” or “specifies information about itself” read like ontology, but the actual operations are: choose a partition, apply an intervention/noise rule, compute a state, compute a divergence, pick a maximizing element.
-  * The mathematical pipeline can be valid as a *defined metric*, but the paper’s language can make it sound like a derived physical necessity.
-==== Strongest charitable reading ====
-The framework is a proposed *measure of “how concentrated and partition-resistant” a mechanism’s counterfactually-defined influence is* under a particular intervention scheme. It is a formal extension of IIT-style attribution to density matrices.
-==== Strongest skeptical reading ====
-It is a rebranding of subsystem/channel calculations with a heavy interpretive layer. “Intrinsic” properties are not shown to be invariant under factorization, interpretation of measurement, or physically constrained interventions—so the ontological talk outruns what the formalism guarantees.
-===== Notes / excerpts =====
-  * (Add your own quotes here as you read.)
-  * (Add page/section pointers you want to revisit.)
-===== Open questions to test the framework =====
-  - If you compute the quantity under two physically equivalent descriptions (different tensor factorizations / dilations), do you get the same “intrinsic” structure?
-  - If you replace “maximally mixed noise” with a physically motivated environment state (thermal, constrained by energy), how stable are the results?
-  - Does QID overemphasize top-eigenvalue behavior in ways that wash out phase-sensitive/coherence structure you’d expect to matter?
-====== Albantakis et al. (2023) — QM re-formulation and failure modes ======
-This note rewrites the paper’s core definitions in standard quantum information language, and isolates where the framework is (i) pure relabeling, (ii) a specific intervention convention, and (iii) where it becomes physically ambiguous.
+This note rewrites the paper’s core definitions in standard quantum information language, and tries to explain conclusions from that standpoint. I may well be missing subtleties, since I am not well read in IIT.
 ===== 1. Translation into standard QM language =====
@@ Line 178: / Line 134: @@
 ===== 3. What is “integrated information” here in plain QM terms? =====
-Given a partition $\theta$ that “cuts” ?? connections from parts of $M$ to parts of $Z$, they construct a partitioned repertoire $\pi_e^\theta(Z\mid m)$ (also using maximally mixed noise for missing inputs), and evaluate
+The paper’s “integrated information” is built from a comparison between:
+  - a **full** effective channel from the chosen subsystem $M$ to the chosen subsystem $Z$, and
+  - a **cut** (partitioned) version of that channel where cross-part influences are disabled by injecting maximally mixed noise.
+The difference between the full and cut outputs is then measured using their quantum intrinsic difference (QID), and finally optimized over partitions and purviews.
+----
+==== 3.1 Full (uncut) effective channel $\mathcal{E}_{M\to Z}$ ====
+Assume a global update map (channel) on the total system $Q$,
+$$
+T:\mathcal{B}(\mathcal{H}_Q)\to\mathcal{B}(\mathcal{H}_Q),
+$$
+and a factorization $\mathcal{H}_Q=\mathcal{H}_M\otimes\mathcal{H}_{M^0}$.
+Define the maximally mixed state on the complement of $M$:
+$$
+\rho^{\mathrm{mm}}_{M^0} := \frac{I_{M^0}}{\dim(\mathcal{H}_{M^0})}.
+$$
+Then the “effect repertoire” is exactly the output of the induced (effective) channel
+$$
+\mathcal{E}_{M\to Z}(X)
+:=
+\operatorname{tr}_{Z^0}\Bigl(T\bigl(X\otimes\rho^{\mathrm{mm}}_{M^0}\bigr)\Bigr),
+\qquad
+\pi_e(Z\mid m) = \mathcal{E}_{M\to Z}(\rho_M).
+$$
+Interpretation: this is a standard reduced-state construction *except* that the outside world $M^0$ is forcibly set to maximally mixed as an intervention convention.
+----
+==== 3.2 What a “partition” $\theta$ is ====
+A partition $\theta$ is a rule that splits both $M$ and $Z$ into matched parts:
+$$
+\theta = \{(M^{(i)}\to Z^{(i)})\}_{i=1}^k,
+$$
+where the parts are disjoint and cover the sets:
+$$
+M = \bigsqcup_{i=1}^k M^{(i)},\qquad
+Z = \bigsqcup_{i=1}^k Z^{(i)}.
+$$
+For example, if $M=\{A,B\}$ and $Z=\{A,B\}$ then a common bipartition is
+$$
+\theta = (A\to A)\cup(B\to B).
+$$
+----
+==== 3.3 What “cutting connections” means (operationally) ====
+“Cut” does not mean a physical wire is severed. It means:
+  *when computing the output on each $Z^{(i)}$, all inputs not belonging to $M^{(i)}$ are replaced by maximally mixed noise.*
+So cross-part influences $M^{(j)}\to Z^{(i)}$ for $j\neq i$ are disabled by construction.
+In QIT terms: you construct a family of **cut channels** where some inputs are fixed to $I/d$ instead of being allowed to carry state-dependent information.
+----
+==== 3.4 Cut channel and cut repertoire ====
+For each part $i$, define a local induced channel
+$$
+\mathcal{E}^{(i)}_{M^{(i)}\to Z^{(i)}}(X)
+:=
+\operatorname{tr}_{(Z^{(i)})^0}\Bigl(
+T\bigl(X\otimes \rho^{\mathrm{mm}}_{(M^{(i)})^0}\bigr)
+\Bigr),
+$$
+where $(M^{(i)})^0:=Q\setminus M^{(i)}$ and $\rho^{\mathrm{mm}}_{(M^{(i)})^0}=I/\dim(\mathcal{H}_{(M^{(i)})^0})$.
+This is “feed only $M^{(i)}$ as an input; everything else is noise”.
+Then the partitioned (cut) effect repertoire is assembled as a product:
+$$
+\pi_e^{\theta}(Z\mid m)
+\;:=\;
+\bigotimes_{i=1}^k
+\mathcal{E}^{(i)}_{M^{(i)}\to Z^{(i)}}\bigl(\rho_{M^{(i)}}\bigr).
+$$
+This is the formal meaning of “the parts act independently”: each $Z^{(i)}$ receives information only from its paired $M^{(i)}$.
+**Note:** in the multi-node case the paper may additionally factor $Z$ into entanglement clusters before forming products; that is an extra step intended to avoid destroying entanglement-generated correlations, and it is where mixed-state entanglement detection becomes a weak link.
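+**Worked example (my own illustration, not from the paper).** Take two qubits $A,B$, let $T$ be conjugation by a CNOT with control $A$ and target $B$, set $M=Z=\{A,B\}$ with input $\rho_M=|1\rangle\langle 1|_A\otimes|0\rangle\langle 0|_B$, and use the bipartition $\theta=(A\to A)\cup(B\to B)$. Ignoring any entanglement-cluster step, the definitions above would give:
+$$
+\mathcal{E}^{(A)}_{A\to A}\bigl(|1\rangle\langle 1|\bigr)
+=\operatorname{tr}_B\Bigl(T\bigl(|1\rangle\langle 1|_A\otimes\tfrac{I_B}{2}\bigr)\Bigr)
+=|1\rangle\langle 1|_A,
+$$
+$$
+\mathcal{E}^{(B)}_{B\to B}\bigl(|0\rangle\langle 0|\bigr)
+=\operatorname{tr}_A\Bigl(T\bigl(\tfrac{I_A}{2}\otimes|0\rangle\langle 0|_B\bigr)\Bigr)
+=\operatorname{tr}_A\Bigl(\tfrac12|00\rangle\langle 00|+\tfrac12|11\rangle\langle 11|\Bigr)
+=\tfrac{I_B}{2},
+$$
+so that
+$$
+\pi_e^{\theta}(Z\mid m)=|1\rangle\langle 1|_A\otimes\tfrac{I_B}{2},
+\qquad\text{whereas}\qquad
+\pi_e(Z\mid m)=|11\rangle\langle 11|.
+$$
+In this toy case the cut leaves $A$ untouched and erases exactly what $B$ would have learned from $A$.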
+----
+==== 3.5 The score $\varphi(m,Z,\theta)$ ====
+Let $Q$ be factorized and let
+$$
+T:\mathcal B(\mathcal H_Q)\to\mathcal B(\mathcal H_Q)
+$$
+be the global channel. For a chosen **input subsystem** $M$ prepared in state $\rho_M$, define the **intervention-induced effective channel**
+$$
+\mathcal E_{M\to Z}(X):=\operatorname{tr}_{Z^0}\!\bigl(T(X\otimes \tau_{M^0})\bigr),
+\qquad \tau_{M^0}:=\frac{I_{M^0}}{d_{M^0}},
+$$
+and the corresponding **output marginal** on the chosen **target subsystem** $Z$:
+$$
+\rho_Z:=\mathcal E_{M\to Z}(\rho_M).
+$$
+For a partition $\theta\in\Theta(M,Z)$, construct the paper’s **cut / partitioned output** $\sigma_Z^{(\theta)}$ by (i) injecting maximally mixed noise across the cut and (ii) taking a tensor product across the partition blocks (optionally after the paper’s $P^*$ “entanglement cluster” factorization step for $|Z|>1$). Denote the resulting state by
+$$
+\sigma_Z^{(\theta)}:=\pi^{\theta}_e(Z\mid m)\quad\text{(paper’s notation)}.
+$$
+Now diagonalize
+$$
+\rho_Z=\sum_i p_i|i\rangle\langle i|,\qquad
+\sigma_Z^{(\theta)}=\sum_j q_j|j\rangle\langle j|,
+\qquad
+P_{ij}:=|\langle i|j\rangle|^2.
+$$
+**Intrinsic (selected) effect state.** The paper first selects a particular eigenstate $|i^*\rangle$ of $\rho_Z$ via an “intrinsic information vs baseline” optimization. With their maximally mixed baseline on $Z$, this selection reduces to choosing the **largest-eigenvalue eigenvector** of $\rho_Z$ (or the corresponding eigenspace if degenerate). Call that chosen eigenstate
+$$
+z'_e(m,Z):=|i^*\rangle.
+$$
+**Integrated effect information for a partition.** The paper then evaluates its QID expression **at that chosen eigenstate** (not by maximizing over $i$ again):
+$$
+\phi_e(m,Z,\theta)
+:=\phi(m,z'_e,\theta)
+=
+p_{i^*}\left(\log p_{i^*}-\sum_j P_{i^*j}\log q_j\right).
+$$
+Equivalently, define the geometric mean
+$$
+\bar q_{i^*}:=\exp\!\left(\sum_j P_{i^*j}\log q_j\right),
+$$
+so that
+$$
+\phi_e(m,Z,\theta)=p_{i^*}\log\!\frac{p_{i^*}}{\bar q_{i^*}}.
+$$
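+**Sanity check (my own illustration, not from the paper).** Suppose $\rho_Z=|11\rangle\langle 11|$ on two qubits and $\sigma_Z^{(\theta)}=|1\rangle\langle 1|\otimes\tfrac{I}{2}$. Then $p_{i^*}=1$ with $|i^*\rangle=|11\rangle$. In the computational basis $\sigma_Z^{(\theta)}$ has eigenvalues $q=(0,0,\tfrac12,\tfrac12)$ on $|00\rangle,|01\rangle,|10\rangle,|11\rangle$, and $P_{i^*j}$ is $1$ for $j=|11\rangle$ and $0$ otherwise. With the convention $0\cdot\log 0=0$:
+$$
+\bar q_{i^*}=\exp\!\left(1\cdot\log\tfrac12\right)=\tfrac12,
+\qquad
+\phi_e(m,Z,\theta)=1\cdot\log\frac{1}{1/2}=\log 2,
+$$
+i.e. one bit if the logarithm is taken base 2.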
+==== 3.6 Optimization: MIP and “best” purview ====
+For fixed $M,\rho_M$ and candidate target $Z$, the paper chooses the minimum information partition (MIP) using a normalized criterion:
+$$
+\theta'(m,Z)
+=
+\arg\min_{\theta\in\Theta(M,Z)}
+\frac{\phi_e(m,Z,\theta)}{\max_{T'_S}\phi_e(m,Z,\theta)}.
+$$
+Here $\max_{T'_S}$ ranges over alternative systems/channels of the same dimensions; this normalization is part of the selection of $\theta$.
+Then the (reported) integrated effect information for that $Z$ is
+$$
+\phi_e(m,Z):=\phi_e(m,Z,\theta'(m,Z)).
+$$
+Finally, the best target subsystem (or in paper terms: “maximally irreducible effect purview” - not sure why it is irreducible) is chosen by
+$$
+Z^*_e(m)=\arg\max_{Z\subseteq Q}\phi_e(m,Z),
+\qquad
+\phi_e(m)=\max_{Z\subseteq Q}\phi_e(m,Z).
+$$
+==== 3.7 Mathematics Framework ====
+We assume a finite-dimensional composite system with a chosen tensor decomposition
+$$
+\mathcal{H}_Q \cong \bigotimes_{i=1}^n \mathcal{H}_i,
+$$
+and we pick subsets of indices to define subsystems:
+$$
+M \subseteq \{1,\dots,n\},\qquad Z \subseteq \{1,\dots,n\}.
+$$
+Write the complements as $M^0 := Q\setminus M$ and $Z^0 := Q\setminus Z$, so that
+$$
+\mathcal{H}_Q \cong \mathcal{H}_M \otimes \mathcal{H}_{M^0}
+\cong \mathcal{H}_Z \otimes \mathcal{H}_{Z^0}.
+$$
+Let the global update be a quantum channel (CPTP map)
+$$
+T:\mathcal{B}(\mathcal{H}_Q)\to \mathcal{B}(\mathcal{H}_Q),
+$$
+often $T(\rho)=U\rho U^\dagger$ for a unitary $U$.
+-----
+=== Step 1: Build an embedding of a local state into the global system ===
+The input for the mechanism is its reduced density matrix $\rho_M\in\mathcal{D}(\mathcal{H}_M)$.
+Define the maximally mixed state on the complement
+$$
+\rho^{\mathrm{mm}}_{M^0} := \frac{I_{M^0}}{d_{M^0}},
+\qquad d_{M^0}:=\dim(\mathcal{H}_{M^0}).
+$$
+Define the embedding map
+$$
+\iota_M(\rho_M) := \rho_M \otimes \rho^{\mathrm{mm}}_{M^0} \in \mathcal{D}(\mathcal{H}_Q).
+$$
+This is the first non-physical convention in the pipeline: it replaces whatever the actual state of $M^0$ is by $\rho^{\mathrm{mm}}_{M^0}$.
+-----
+=== Step 2: Evolve globally and reduce to the purview ===
+Evolve the embedded state:
+$$
+\rho'_{Q} := T(\iota_M(\rho_M)).
+$$
+Reduce to the purview:
+$$
+\rho'_{Z} := \operatorname{tr}_{Z^0}(\rho'_Q).
+$$
+This defines the induced channel from $M$ to $Z$:
+$$
+\mathcal{E}_{M\to Z}(X)
+:=
+\operatorname{tr}_{Z^0}\!\left(T\left(X\otimes \rho^{\mathrm{mm}}_{M^0}\right)\right),
+\qquad
+\rho'_Z=\mathcal{E}_{M\to Z}(\rho_M).
+$$
+So far, everything is standard QIT manipulation (tensoring, channel application, partial trace) except for the choice of $\rho^{\mathrm{mm}}_{M^0}$.
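+Steps 1 and 2 can be sketched numerically. This is a minimal NumPy illustration under my own assumptions, not code from the paper: the helper names ''embed'', ''partial_trace'' and ''induced_output'' are hypothetical, $T$ is taken to be unitary, and the CNOT (control = subsystem 0, target = subsystem 1) is only a toy choice.

```python
import numpy as np

def embed(rho_M, M, dims):
    """iota_M: put rho_M on the subsystems listed in M, and I/d on all others."""
    n = len(dims)
    rest = [k for k in range(n) if k not in M]
    d_rest = int(np.prod([dims[k] for k in rest]))
    full = np.kron(rho_M, np.eye(d_rest) / d_rest)   # tensor order: M, then rest
    order = list(M) + rest
    shape = [dims[k] for k in order]
    perm = [int(i) for i in np.argsort(order)]        # back to natural order 0..n-1
    t = full.reshape(shape + shape).transpose(perm + [i + n for i in perm])
    D = int(np.prod(dims))
    return t.reshape(D, D)

def partial_trace(rho, keep, dims):
    """Trace out every subsystem that is not listed in `keep` (sorted indices)."""
    n = len(dims)
    t = rho.reshape(list(dims) + list(dims))
    # Remove the highest index first so the remaining axis numbers stay valid.
    for ax in sorted(set(range(n)) - set(keep), reverse=True):
        t = np.trace(t, axis1=ax, axis2=ax + t.ndim // 2)
    d = int(np.prod([dims[k] for k in keep]))
    return t.reshape(d, d)

def induced_output(U, rho_M, M, Z, dims):
    """E_{M->Z}(rho_M) = tr_{Z^0}( U (rho_M (x) I/d) U^dagger ), unitary T assumed."""
    rho_Q = U @ embed(rho_M, M, dims) @ U.conj().T
    return partial_trace(rho_Q, sorted(Z), dims)

# Toy system: two qubits, CNOT with control 0 and target 1.
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)
proj0 = np.diag([1.0, 0.0])   # |0><0|
proj1 = np.diag([0.0, 1.0])   # |1><1|

# Mechanism = control only: the target's own input is replaced by noise.
out_B_from_A = induced_output(CNOT, proj1, [0], [1], [2, 2])
# Mechanism = both qubits in |10>: the target output is determined.
out_B_from_AB = induced_output(CNOT, np.kron(proj1, proj0), [0, 1], [1], [2, 2])
```

+In this toy case ''out_B_from_A'' comes out maximally mixed while ''out_B_from_AB'' is $|1\rangle\langle 1|$, which shows how strongly the result depends on what is declared "outside" and overwritten with $I/d$.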
+-----
+=== Step 3: Define a partition $\theta$ and the corresponding cut construction ===
+A partition $\theta$ is a collection of paired parts
+$$
+\theta=\{(M^{(i)}\to Z^{(i)})\}_{i=1}^k,
+$$
+with disjoint unions
+$$
+M=\bigsqcup_{i=1}^k M^{(i)},\qquad
+Z=\bigsqcup_{i=1}^k Z^{(i)}.
+$$
+For each part $i$, define an embedding that treats everything outside $M^{(i)}$ as maximally mixed:
+$$
+\iota_{M^{(i)}}(X) := X \otimes \rho^{\mathrm{mm}}_{(M^{(i)})^0},
+\qquad (M^{(i)})^0 := Q\setminus M^{(i)}.
+$$
+Define a local induced channel for that part:
+$$
+\mathcal{E}^{(i)}_{M^{(i)}\to Z^{(i)}}(X)
+:=
+\operatorname{tr}_{(Z^{(i)})^0}\!\left(
+T\left(\iota_{M^{(i)}}(X)\right)
+\right),
+\qquad (Z^{(i)})^0 := Q\setminus Z^{(i)}.
+$$
+Then define the cut (partitioned) output on the whole purview as the tensor product:
+$$
+\mathcal{E}^{(\theta)}_{M\to Z}(\rho_M)
+:=
+\bigotimes_{i=1}^k
+\mathcal{E}^{(i)}_{M^{(i)}\to Z^{(i)}}(\rho_{M^{(i)}}),
+$$
+and correspondingly
+$$
+\rho^{(\theta)}_Z := \mathcal{E}^{(\theta)}_{M\to Z}(\rho_M).
+$$
+Interpretation: the cut construction forces $Z^{(i)}$ to depend only on $M^{(i)}$ because all other inputs are replaced by maximally mixed noise, and then it forces the joint $Z$ state to factor across parts by taking a tensor product of the part outputs.
+(For multi-node purviews, the paper may add an entanglement-clustering step before taking the product; the above is the clean “channel surgery” skeleton.)
+-----
+=== Step 4: Define the distance between full and cut outputs using QID ===
+Let
+$$
+\rho_Z := \mathcal{E}_{M\to Z}(\rho_M),
+\qquad
+\rho^{(\theta)}_Z := \mathcal{E}^{(\theta)}_{M\to Z}(\rho_M).
+$$
+The divergence between the two is measured by QID. A practical way to present it is:
+diagonalize both states
+$$
+\rho_Z=\sum_i p_i |i\rangle\langle i|,
+\qquad
+\rho^{(\theta)}_Z=\sum_j q_j |j\rangle\langle j|,
+$$
+and define overlaps
+$$
+P_{ij}:=|\langle i|j\rangle|^2.
+$$
+Then
+$$
+\mathrm{QID}(\rho_Z\Vert \rho^{(\theta)}_Z)
+=
+\max_i\, p_i\left(\log p_i - \sum_j P_{ij}\log q_j\right).
+$$
+They also define an associated maximizing eigenstate (or eigenspace if degenerate) by selecting the index that achieves the maximum:
+$$
+i^\star \in \arg\max_i\, p_i\left(\log p_i - \sum_j P_{ij}\log q_j\right),
+\qquad
+z_e^0 := |i^\star\rangle.
+$$
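+The QID expression above is easy to evaluate numerically. The following is a hedged NumPy sketch, not the paper's implementation: the function names ''qid_terms'' and ''qid'' are hypothetical, base-2 logarithms are my assumption, and the ''eps'' clipping is only a numerical convenience standing in for the convention $0\cdot\log 0=0$.

```python
import numpy as np

def qid_terms(rho, sigma, eps=1e-12):
    """Eigenvalues p_i of rho and the terms p_i * (log2 p_i - sum_j P_ij log2 q_j)."""
    p, V = np.linalg.eigh(rho)        # columns of V are eigenvectors |i>
    q, W = np.linalg.eigh(sigma)      # columns of W are eigenvectors |j>
    P = np.abs(V.conj().T @ W) ** 2   # overlaps P_ij = |<i|j>|^2
    log_p = np.log2(np.clip(p, eps, None))
    log_q = np.log2(np.clip(q, eps, None))
    # Eigenstates with p_i = 0 contribute nothing.
    terms = np.where(p > eps, p * (log_p - P @ log_q), 0.0)
    return p, terms

def qid(rho, sigma):
    """Maximum over eigenstates of rho, plus the index that attains it."""
    p, terms = qid_terms(rho, sigma)
    i_star = int(np.argmax(terms))
    return float(terms[i_star]), i_star

# Toy inputs: full output |11><11| versus cut output |1><1| (x) I/2.
rho = np.diag([0.0, 0.0, 0.0, 1.0])
sigma = np.diag([0.0, 0.0, 0.5, 0.5])
value, i_star = qid(rho, sigma)

# Variant: evaluate at the largest-eigenvalue eigenstate of rho instead of maximizing.
p, terms = qid_terms(rho, sigma)
value_at_top = float(terms[int(np.argmax(p))])
```

+For these toy inputs both variants give one bit. They can differ in general, and when $\rho_Z$ has degenerate eigenvalues the terms depend on which eigenbasis ''eigh'' happens to return.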
+The partition score is then
 $$
 \varphi(m,Z,\theta)
-\;\equiv\;
+:=
-\mathrm{QID}\bigl(\pi_e(Z\mid m)\Vert \pi_e^\theta(Z\mid m)\bigr)
+\mathrm{QID}(\rho_Z\Vert \rho^{(\theta)}_Z).
-\quad\text{but only evaluated on the maximizing eigenstate } z_e^0.
+$$
+-----
+=== Step 5: Optimize over partitions and purviews ===
+For a fixed purview $Z$, define the minimizing partition (minimum-information partition):
+$$
+\theta^\star(m,Z) \in \arg\min_\theta \varphi(m,Z,\theta),
+$$
+and the corresponding value
+$$
+\varphi(m,Z) := \min_\theta \varphi(m,Z,\theta).
+$$
+Then select the purview that maximizes this value:
+$$
+Z^\star(m) \in \arg\max_{Z\subseteq Q} \varphi(m,Z),
+\qquad
+\varphi_e(m) := \max_{Z\subseteq Q}\min_\theta \varphi(m,Z,\theta).
 $$
-(Their Eq. (31) is this in components.)
-Then they minimize over partitions to get a “minimum information partition” (MIP) and maximize over subsets $Z$.
+This completes the effect-side pipeline: full induced channel output versus cut-channel output, scored by QID, minimized over cuts, then maximized over purviews.
-In standard language:
+The key point is that the construction depends on the intervention convention $X\mapsto X\otimes I/d$ used to define both the full induced channel and the cut channels.
-  * define a family of **cut channels** $\mathcal{E}_{M\to Z}^{(\theta)}$ by replacing some inputs with maximally mixed noise,
-  * compare the output state of the original channel vs cut channel by a **max-eigenvalue-weighted relative-entropy-like score**,
-  * take min/max over partitions/purviews.
 ===== 4. Where this is “just QM” vs where it stops being physics =====
@@ Line 254: / Line 556: @@
 ==== Steelman ====
-As a *defined* counterfactual causal attribution scheme, the framework:
+As a defined counterfactual causal attribution scheme, the framework:
   * avoids common-cause “spurious correlation” effects by construction (their COPY-XOR/CNOT motivation),
   * tries to preserve entanglement-generated correlations by clustering entanglement before taking products,