Abstract:
This project surveys the foundational theorems that connect beliefs, values, and actions into coherent decision-making frameworks. Savage’s theorem and the Jeffrey–Bolker theorem characterize when qualitative preferences admit an expected-utility representation in terms of probabilistic beliefs and utilities. The project will explore when and how beliefs can be distinguished from values, when this distinction collapses, and how this relates to the emergence of boundaries between agents. Interfaces are examined through epsilon-transducers, Markov decision processes (MDPs), and computational mechanics, which formalize the interaction between states, actions, and outcomes. A recurring theme is clarifying where the boundaries of agentic systems lie and how different formalisms (belief-first, value-first, reward-first) articulate this interface. The connection to alignment is explicit: distinguishing reward-level, value-level, and policy-level objectives is crucial for understanding alignment failures.
Keywords: Savage’s theorem, Jeffrey–Bolker theorem, beliefs, values, preferences, boundaries of agency, epsilon-transducers, MDPs, computational mechanics, interfaces, reward vs. value, alignment
People: Wentworth, Daniel Herrmann, Fernando Rosas, TBD