Mixed States: Acting Under Irreducible Uncertainty
How quantum mixed states handle uncertainty about the environment itself
Understanding Quantum Mixed States
In quantum mechanics, a pure state is a definite quantum state |ψ⟩. But when we're uncertain about which pure state the system is in, we use a mixed state—a probabilistic mixture of pure states. This is represented by a density matrix ρ = Σᵢ pᵢ |ψᵢ⟩⟨ψᵢ|, where pᵢ are probabilities and Σᵢ pᵢ = 1.
The key insight: mixed states represent epistemic uncertainty—uncertainty about our knowledge of the system, not uncertainty inherent in the system itself. This is different from superposition, which represents the system being in multiple states simultaneously.
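A minimal numpy sketch (illustrative only, not tied to any quantum library) makes this distinction concrete: a 50/50 mixture of |0⟩ and |1⟩ gives a diagonal density matrix with no coherence, while the superposition |+⟩ is a single pure state whose density matrix has off-diagonal terms.

import numpy as np

# Basis states of a qubit
ket0 = np.array([[1.0], [0.0]])
ket1 = np.array([[0.0], [1.0]])

# Mixed state: 50% chance the system is |0>, 50% chance it is |1>
rho_mixed = 0.5 * (ket0 @ ket0.T) + 0.5 * (ket1 @ ket1.T)

# Superposition |+> = (|0> + |1>)/sqrt(2) is a single pure state
ket_plus = (ket0 + ket1) / np.sqrt(2)
rho_plus = ket_plus @ ket_plus.T

print(rho_mixed)            # diagonal: purely epistemic uncertainty, no coherence
print(rho_plus)             # off-diagonal terms: genuine superposition
print(np.trace(rho_mixed))  # 1.0, since the probabilities sum to one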
The Problem in Reinforcement Learning
In many RL problems, we're uncertain about the environment itself. Consider stochastic Frozen Lake: the ice is slippery, so when you try to move in a direction, there's a probability you'll move in a different direction. But beyond this known stochasticity, we might be uncertain about the map layout—where the holes are, where the goal is.
Classical RL often assumes we know the environment dynamics. But in reality, we might be uncertain about:
- Transition probabilities: How likely is each state transition?
- Reward structure: What rewards does each state-action pair yield?
- State space: What states are even possible?
A mixed state agent maintains a probability distribution over possible environments, making decisions that account for this irreducible uncertainty.
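As a sketch of what that distribution can look like (the map layouts below are invented for illustration), a belief over candidate Frozen Lake maps can be as simple as a list of layouts paired with probabilities:

import numpy as np

# Hypothetical 4x4 Frozen Lake layouts: 'S' = start, 'F' = frozen, 'H' = hole, 'G' = goal
candidate_maps = [
    ["SFFF", "FHFH", "FFFH", "HFFG"],   # Map A
    ["SFFF", "FFFH", "FHFH", "HFFG"],   # Map B: holes in different positions
    ["SFFF", "FHFF", "FFHH", "HFFG"],   # Map C
]

# Uniform prior: before seeing any data, every map is equally plausible
belief = np.ones(len(candidate_maps)) / len(candidate_maps)
print(dict(zip(["A", "B", "C"], belief)))   # each map gets probability 1/3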
The Classical RL Limitation
Why single-model assumptions fail under uncertainty
The Problematic Approach
Classical RL algorithms typically act on a single model of the environment: planning methods assume the dynamics P(s'|s,a) and reward function R(s,a) are known, while model-free methods like Q-learning fold everything into a single point estimate of the value function. Even when these quantities are learned from data, the agent typically commits to one estimate and acts as if it were certain.
Example: Stochastic Frozen Lake with Unknown Map - In stochastic Frozen Lake, we know the ice is slippery (stochastic transitions), but we might not know the exact map. Classical Q-learning would estimate transition probabilities from data, pick the most likely map, and act as if that map is definitely correct. This ignores uncertainty about which map we're actually in.
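A hedged sketch of that failure mode (the probability values are invented for illustration): the agent scores each candidate map, commits to the argmax, and plans as if the alternatives were impossible.

import numpy as np

candidate_maps = ["Map A", "Map B", "Map C"]        # placeholders for full layouts
map_probabilities = np.array([0.40, 0.35, 0.25])    # estimated from limited data

# Classical shortcut: commit to the single best-supported map and discard the rest
chosen = candidate_maps[int(np.argmax(map_probabilities))]
print(chosen)   # "Map A" -- the agent now plans as if B and C were impossible,
                # even though together they carry 60% of the probability mass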
Critical Issues:
- Overconfidence: the agent treats one estimated map as ground truth, even when the data only weakly favours it.
- Fragility: a policy that is safe in the chosen map can step straight into a hole in the true one.
- Discarded uncertainty: the probability mass on alternative maps never influences the decision.
Our Mixed-State QiRL Design
Explicit belief distributions over multiple worlds
The Quantum-Inspired Solution
In quantum mechanics, a mixed state is a density matrix ρ = Σᵢ pᵢ |ψᵢ⟩⟨ψᵢ| representing a probabilistic mixture of pure states. We apply this to RL by maintaining a belief distribution over possible environments.
Instead of committing to a single environment model, we maintain multiple hypotheses about the environment (different maps, different transition probabilities, etc.) and keep a probability distribution over them. The policy then acts based on this mixed state, accounting for uncertainty.
Example: Stochastic Frozen Lake with Unknown Map - We maintain beliefs about multiple possible maps: Map A (holes in positions X), Map B (holes in positions Y), etc. Each map has probability pᵢ. The agent's policy considers all maps weighted by their probabilities, making decisions that are robust to this uncertainty.
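A small sketch of that weighting (the Q-values and belief below are made up for illustration): each candidate map contributes its own action values, and the mixture averages them in proportion to the belief rather than trusting any single map.

import numpy as np

# Hypothetical Q-values for one state: rows = candidate maps, columns = actions
q_per_map = np.array([
    [0.8, 0.1, 0.3, 0.2],   # Map A: "right" looks great
    [0.1, 0.6, 0.3, 0.2],   # Map B: "right" leads toward a hole
    [0.2, 0.5, 0.4, 0.2],   # Map C
])
belief = np.array([0.5, 0.3, 0.2])

# Belief-weighted Q-values: every map contributes in proportion to its probability
q_mixed = belief @ q_per_map
print(q_mixed)                   # ~[0.47, 0.33, 0.32, 0.20]
print(int(np.argmax(q_mixed)))   # action chosen under the mixture, not under one map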
[Diagram: belief weights p₁, p₂, … spread over World 1, World 2, …]
Density Matrix Representation
The environment is represented as ρ = Σᵢ pᵢ |Eᵢ⟩⟨Eᵢ|, a mixture of possible environments Eᵢ with probabilities pᵢ.
Bayesian Belief Updates
As we observe transitions, we update our beliefs using Bayes' rule: p(Eᵢ|data) ∝ p(data|Eᵢ) p(Eᵢ). The mixture evolves but doesn't collapse prematurely.
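A minimal numeric sketch of one such update (the likelihood values are invented for illustration): after a transition that Map A predicts well and Map C predicts poorly, the posterior shifts without collapsing to a single map.

import numpy as np

prior = np.array([1/3, 1/3, 1/3])        # belief over Maps A, B, C
likelihood = np.array([0.6, 0.3, 0.1])   # p(observed transition | map)

posterior = prior * likelihood           # Bayes' rule, up to normalisation
posterior /= posterior.sum()
print(posterior)   # [0.6 0.3 0.1] -- Map A gains weight, but B and C are not discarded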
Risk-Aware Policies
Policies optimize over the mixed state, considering worst-case scenarios. This naturally leads to risk-sensitive behavior without explicit risk penalties.
The policy takes the belief distribution as input and chooses actions that are robust across the likely worlds, while still being efficient.
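One way to make this concrete (a sketch under simplified assumptions, not the exact objective used here) is to compare the belief-weighted expectation with the worst case over the maps that still carry belief:

import numpy as np

q_per_map = np.array([
    [0.8, 0.1],    # Map A: action 0 looks excellent
    [-1.0, 0.3],   # Map B: action 0 walks into a hole
])
belief = np.array([0.7, 0.3])

expected = belief @ q_per_map          # [ 0.26, 0.16] -- expectation favours action 0
worst_case = q_per_map.min(axis=0)     # [-1.0,  0.1 ] -- worst case favours action 1
print(expected, worst_case)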
Production View: Belief over Latent Modes
Implementing mixed states in QiRL
Mixed-State Implementation
Below we sketch a mixed-state agent with multiple world models and a risk-aware policy that conditions on the belief distribution.
import numpy as np

class MixedStateAgent:
    def __init__(self, world_models, policy):
        self.world_models = world_models  # list of models M₁, M₂, …
        # Start from a uniform belief over the candidate world models
        self.belief = np.ones(len(world_models)) / len(world_models)
        self.policy = policy

    def update_belief(self, observation):
        # Bayes-like update (simplified): reweight each model by how well
        # it explains the observation, then renormalise
        likelihoods = np.array([
            M.likelihood(observation) for M in self.world_models
        ])
        self.belief *= likelihoods
        self.belief /= self.belief.sum()

    def act(self, state):
        # Pass both the state and the belief to the policy
        return self.policy.act(state, belief=self.belief)


def cvar_loss(returns, alpha=0.1):
    """
    Conditional Value at Risk (CVaR) at level alpha.
    Focus on the worst alpha fraction of outcomes.
    """
    sorted_returns = np.sort(returns)
    cutoff = int(len(sorted_returns) * alpha)
    tail = sorted_returns[:cutoff] if cutoff > 0 else sorted_returns
    return -tail.mean()
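A possible usage sketch, assuming the MixedStateAgent and cvar_loss defined above are in scope; the dummy model and policy below are placeholders invented for illustration, not part of any QiRL API:

import numpy as np

class DummyModel:
    """Stand-in world model: assigns a fixed likelihood to any observation."""
    def __init__(self, likelihood_value):
        self._l = likelihood_value
    def likelihood(self, observation):
        return self._l

class DummyPolicy:
    """Stand-in policy: acts more cautiously while the belief is spread out."""
    def act(self, state, belief):
        entropy = -np.sum(belief * np.log(belief + 1e-12))
        return "explore" if entropy > 0.5 else "exploit"

agent = MixedStateAgent([DummyModel(0.6), DummyModel(0.3), DummyModel(0.1)], DummyPolicy())
agent.update_belief(observation=None)    # belief shifts toward the first model
print(agent.belief)                      # [0.6 0.3 0.1]
print(agent.act(state=None))             # "explore": uncertainty is still high
print(cvar_loss(np.array([1.0, 0.8, -2.0, 0.9, 1.1]), alpha=0.2))   # 2.0: worst 20% tail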
Key Features:
- Explicit belief state: uncertainty is part of the state, not an afterthought.
- Multiple models: each mode captures a different plausible world.
- Risk-aware training: objectives like CVaR align with safety requirements.
Summary: Mixed States for Robust Decision-Making
How quantum mixed states handle irreducible uncertainty
Key Insights
Quantum mixed states provide a natural framework for handling epistemic uncertainty in reinforcement learning. By maintaining a probability distribution over possible environments rather than committing to a single model, agents can make robust decisions that account for what they don't know.
Stochastic Frozen Lake Example: When the map is uncertain, a mixed-state agent maintains beliefs about multiple possible maps. It avoids paths that are safe in some maps but dangerous in others, naturally exhibiting risk-averse behavior without explicit risk penalties.
Explicit Uncertainty
Mixed states make uncertainty part of the state representation, not hidden in parameter estimates. The agent knows what it doesn't know.
Risk-Aware Behavior
By optimizing over the mixed state, policies naturally avoid actions that are good in some environments but catastrophic in others.
Bayesian Updates
Beliefs update naturally as data arrives, but the mixture doesn't collapse prematurely—uncertainty is preserved when warranted.
🔮 Quantum Saga Secrets
Secret 1: Scenarios as Modes
World modes can be defined with domain experts as narrative scenarios, not just fitted from data.
Secret 2: Explaining Uncertainty
Belief over modes can be turned into human-friendly statements like "there is a 20% chance of a rare condition".
The Quantum Saga in Full
Mixed states complete the quantum saga of QiRL: superposition, entanglement, interference, tunnelling, and mixed states. Together they form a vocabulary for designing, explaining, and governing reinforcement learning systems.