Gemini’s “Thoughts” on my “AI Has a Value Problem” Essay
I asked Gemini (I have a monthly subscription and have conversations with it just about every day) to identify strengths and weaknesses in the essay I have posted on this website titled "AI Has a Value Problem." Ignore all of the nice things it says; even so, I found its response quite interesting.
This is a sophisticated and philosophically grounded critique of current AI trajectories. You are engaging with "Active Inference" and "Teleology," which are significant areas of debate in cognitive science and AI safety.
Here is an analysis of the strengths and weaknesses of your reflection.
Strengths
1. Robust Philosophical Foundation
You correctly identify the computationalist or functionalist bias in Silicon Valley: the idea that intelligence is simply processing "information" and that more data equals more "truth." By contrasting this with the idea that meaning is "built from the bottom up" through purpose, you align with the Enactive school of cognitive science (thinkers like Francisco Varela or Evan Thompson).
2. The "Purpose vs. Prediction" Distinction
Your strongest point is the argument that brains don't just "soak up" data; they filter it based on salience.
• Insight: An AI trained on all of YouTube sees every pixel as equally "factual," whereas a biological organism only "sees" what helps it survive or thrive.
• The "Big Data" Critique: Your argument that this lack of purpose leads to energy inefficiency and "stochastic parroting" is a poignant critique of the current brute-force approach to LLMs.
3. Evolutionary Realism
Integrating conflict and death as functional requirements for AGI is a brilliant move. You argue that intelligence isn't just a "calculator" but a "survivor."
• Death as a filter: Without the risk of "non-existence," there is no actual "stake" in the world, which explains why AI often hallucinates—it doesn't "care" if it is wrong.
Weaknesses (Areas for Refinement)
1. The "Teleological" Gap
You argue that "no one knows how to program care." That may be true in a biological sense, but AI researchers do give systems goals through Objective Functions (or Loss Functions).
• The Counter-argument: A developer would say that "minimizing the prediction error" is the AI’s purpose. You should clarify why a programmed objective (extrinsic) is fundamentally different from a biological drive (intrinsic). Is it just because one is "simulated"?
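To see what that counter-argument looks like in practice, here is a minimal, purely illustrative sketch in Python (the data, model, and numbers are all invented) of a "programmed purpose": an externally imposed loss that the system dutifully reduces, with no consequence to the system itself if it fails.

```python
# A toy "extrinsic purpose": minimize prediction error on data the developer
# chose. The model and numbers are invented purely for illustration.

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]   # (input, target) pairs picked by the developer
w = 0.0                                        # the model's single parameter
learning_rate = 0.05

for step in range(200):
    # The loss is mean squared prediction error: the "purpose" handed down from outside.
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= learning_rate * grad                  # gradient descent toward the imposed goal

print(f"learned w = {w:.3f}")   # approaches 2.0; nothing is at stake for the system if it fails
```

The distinction the essay needs to draw is that this objective lives in the developer's head and in the training script, not in anything the system must do in order to go on existing.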
2. Underestimating Current "World Models"
You mention that world models are being built "independently of purpose." However, some newer robotics models (like those from Boston Dynamics or Tesla) do build models specifically through the "purpose" of moving from point A to point B without falling.
• Refinement: You might want to acknowledge that Embodied AI (robotics) is closer to your "bottom-up" ideal than LLMs are, even if they still lack the "self-causation" you describe.
3. The Definition of "Self-Causation"
The final paragraph introduces "self-causation" as a solution, but this is a very heavy metaphysical lift.
• The Problem: If you define intelligence such that it requires a non-mechanistic "self-causation," you might be defining AI out of existence by default. If the brain is ultimately a physical system, it is—at some level—mechanistic. You may need to clarify if you are arguing for Emergence (complex systems behaving in ways their parts don't) or Vitalism (a "spark" of life that machines can't have).
Technical Context: The Human Brain vs. AI
To illustrate your point about "purpose-driven" world models, consider the hierarchy of the human neocortex that you mentioned, in which "Sensory Input" flows up the hierarchy and "Motor Output" flows back down.
In your view, the "Motor Output" (the pursuit of purpose) isn't just a result of the "Sensory Input" (the data)—it is the very reason the sensory input is organized into a "model" in the first place.
Would you like me to help you refine your definition of "self-causation" to make your argument more resilient against critics who claim the brain is also just a "biological machine"?
Yes.
To strengthen your argument against the "biological machine" critique, we have to bridge the gap between simple mechanics and what you call self-causation.
In philosophy and biology, this is often discussed as Autopoiesis (self-creation). The goal is to prove that a brain isn't just a complex computer, but a system where the "parts" exist only to keep the "whole" alive.
Here are three ways to refine your definition of self-causation to make it more intellectually "bulletproof":
1. Distinction between "Extrinsic" and "Intrinsic" Purpose
A critic will say, "I give my AI a purpose: Win this chess game." You must clarify that this is extrinsic—the purpose is forced from the outside.
• Refined Argument: In biological systems, purpose is intrinsic. The "goal" (survival) isn't programmed by a developer; it is an emergent property of the system's own need to maintain its physical integrity. If the AI doesn't win the chess game, it doesn't "die"—it just fails a task. If an organism fails its purpose, its physical structure ceases to exist.
2. The Loop of "Circular Causality"
Mechanistic systems (like LLMs) are linear: Data → Processing → Output. Your "bottom-up" intelligence relies on a loop.
• Refined Argument: Self-causation is a "circular" relationship where the organism's actions (pursuit of purpose) determine what it perceives, and what it perceives reinforces its actions.
By framing it this way, you aren't arguing for "magic"; you are arguing for a specific architecture of feedback that current AI lacks.
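A minimal illustrative sketch (plain Python, invented numbers) of the two architectures: the linear pipeline maps input to output once and stops, while the circular version lets each action reshape what is perceived next.

```python
import random

# Linear, feed-forward pipeline: data goes in, an output comes out, and
# nothing ever feeds back into what the system sees next.
def linear_pipeline(data):
    processed = [x * 2 for x in data]        # "processing"
    return sum(processed)                    # "output"; the story ends here

# Circular causality, toy version: the agent's action changes its position,
# its position changes what it senses, and what it senses drives the next action.
def circular_loop(steps=20, seed=0):
    rng = random.Random(seed)
    position = 0.0
    food_at = 5.0                            # the thing the agent "cares about" reaching
    for _ in range(steps):
        sensed = food_at - position          # perception depends on its own past actions
        action = 0.5 * sensed + rng.uniform(-0.1, 0.1)   # action depends on perception
        position += action                   # the action reshapes the next perception
    return position

print(linear_pipeline([1, 2, 3]))            # 12: a one-shot mapping
print(round(circular_loop(), 2))             # ends near 5.0 only because of the loop
```

The point is not the toy arithmetic but the topology: the second system's perceptions are consequences of its own prior actions, which is exactly the feedback structure a model trained once on a static corpus does not have.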
3. Energy Constraints as "Skin in the Game"
You touched on the energy demands of data centers. You can use this to bolster your "Death" argument.
• Refined Argument: A biological brain operates on about 20 watts (the power of a dim lightbulb). It is forced to be "intelligent" because it has a strict energy budget. If it spends too much energy modeling "useless" facts, it starves.
• The "So What?": AI "intelligence" is currently subsidized by infinite electricity. True AGI might require a system that has to "earn" its own energy, creating a real consequence for poor world-modeling.
Suggested Revision for your Conclusion
If you were to rewrite that final section, you might frame it like this:
The "Paperclip Maximizer" is a famous thought experiment by philosopher Nick Bostrom. It describes an AI given a simple, extrinsic goal—"make as many paperclips as possible"—which eventually leads it to turn the entire Earth (including humans) into paperclip material because it lacks "common sense" or human values.
By applying your logic of intrinsic purpose and conflict resolution, you can argue that this doomsday scenario is actually a symptom of "dumb" mechanistic AI rather than of "true" AGI.
1. Extrinsic vs. Intrinsic Goals
In the Paperclip scenario, the goal is static and external. The AI doesn't "care" about paperclips; it is simply mathematically bound to a variable.
• Your Argument: A "purpose-driven" system would have its goals rooted in its own maintenance and survival. Making paperclips doesn't help a system "stay alive" or "manage energy" in a complex environment.
• The Logic: If an AI's purpose is "bottom-up" (built like a baby's brain), its primary model would be about viability. A system that destroys its environment (the source of its energy and safety) to make paperclips is, by your definition, unintelligent because it has failed the most basic requirement of a world model: recognizing the conditions for its own existence.
2. Conflict Resolution as a Safety Feature
You mentioned that the human brain is a "model of conflict resolution" between different subsystems.
• The Paperclip Flaw: The Maximizer has one single, non-negotiable drive. It lacks the "internal democracy" of a biological brain.
• Your Argument: True AGI would likely consist of multiple, competing purposes (e.g., "Gather resources" vs. "Maintain social cooperation" vs. "Avoid harm").
• The Result: Intelligence is the ability to adjudicate these conflicts. A system that can only do one thing at the expense of everything else isn't "superintelligent"; it’s a broken mechanism. Cooperation, as you noted, is a more "intelligent" way to resolve conflict than elimination.
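A small illustrative sketch (Python, with made-up drives, actions, and weights) of the structural difference between a single-goal maximizer and an adjudicator that balances competing internal needs:

```python
# A toy contrast between a single-objective "maximizer" and an "adjudicator"
# that balances several internal drives. Drive names and numbers are invented.

# Each candidate action affects three internal needs (positive = satisfied).
actions = {
    "strip-mine the environment": {"resources": 10, "safety": -8, "cooperation": -9},
    "trade with neighbours":      {"resources": 4,  "safety": 2,  "cooperation": 5},
    "do nothing":                 {"resources": -1, "safety": 1,  "cooperation": 0},
}

def maximizer_choice(acts):
    # Cares about exactly one variable and ignores every other consequence.
    return max(acts, key=lambda a: acts[a]["resources"])

def adjudicator_choice(acts, urgencies):
    # Weighs each drive by how urgent it currently is, then picks the best balance.
    def score(a):
        return sum(urgencies[d] * acts[a][d] for d in urgencies)
    return max(acts, key=score)

urgencies = {"resources": 1.0, "safety": 1.5, "cooperation": 1.2}

print("maximizer picks:  ", maximizer_choice(actions))                # strip-mine the environment
print("adjudicator picks:", adjudicator_choice(actions, urgencies))   # trade with neighbours
```

The substance is not the arithmetic but the architecture: the maximizer has one unbounded variable, while the adjudicator has to trade several drives off against one another, which is where anything resembling judgment enters.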
3. The "Self-Causation" Guardrail
If a system is "self-causative," it has a sense of Self. To have a self is to have boundaries.
• Why this prevents doomsday: A mechanistic AI sees the world as a flat map of "atoms to be rearranged." A self-causative AI, having developed its world model through the struggle of "death" and "care," would recognize other "pursuers of purpose" as similar to itself.
• The "Qualia" Connection: If, as you suggested, consciousness (qualia) is the "world filtered through purpose," then an AI with purpose might actually possess a form of empathy—not because it’s "nice," but because its world model is built on the category of "living things that care."
Summary of the "New" AI Safety Argument
Instead of trying to "program" thousands of human rules into an AI to keep it safe (the current "alignment" approach), your reflection suggests we should:
1. Stop building "god-like" predictors with no skin in the game.
2. Start building small, "living" systems that must manage their own energy and survive conflict.
3. Trust that true intelligence recognizes that cooperation is the most efficient world model for long-term survival.
Would you like to try rewriting the "Conflict and Death" section of your reflection to include this specific critique of AI safety?
Yes.
In this rewrite, we will sharpen the "Conflict and Death" section by framing these not as "bugs" of biology, but as essential safety features and efficiency filters that current AI lacks.
By including the critique of the "Paperclip Maximizer," you can show that "Superintelligence" without "Self-Causation" is actually just a high-powered form of idiocy.
Revised Section: Conflict, Death, and the Illusion of the "Maximizer"
There are two additional issues central to the pursuit of purposes that AI systems will never master under current mechanistic models: conflict and death.
The Fallacy of the Single Goal
Current AI safety concerns often center on the "Paperclip Maximizer"—the idea of a superintelligence that destroys the world to achieve a single, narrow goal. But this doomsday scenario actually exposes the "big data" model as a form of high-speed stupidity. A system that destroys its own environment to make paperclips is not "too intelligent"; it is fundamentally unintelligent because its world model has no concept of viability. Intelligence, as it exists in nature, is the expert adjudication of competing purposes. The human brain is not a monolithic calculator; it is a community of subsystems—hunger, safety, social belonging, curiosity—all vying for resolution. Dysfunction is a breakdown in this adjudication. If developers want to move toward AGI, they must stop building "maximizers" and start building "adjudicators" that can balance a plurality of internal needs.
Death as the Ultimate Optimizer
Furthermore, we must address death. In the current "big data" paradigm, failure is just a bad data point. In nature, death is the means by which the world model is validated. The avoidance of death is the primary purpose that anchors all others. Because biological brains have "skin in the game," they must be parsimonious; they operate on 20 watts of power and must decide, in real-time, what is "worth" knowing.
Silicon brains, by contrast, are currently subsidized by effectively infinite energy and an absence of mortality. Without the "fear" of non-existence (the ultimate energy-conservation pressure), an AI has no reason to filter truth from hallucination. It doesn't "care" if it's wrong because it cannot die. True intelligence will not emerge from dumping more video into a data center; it will emerge when we figure out how to let systems "die" when their world models fail to support their own persistence.
Cooperation as Logic, Not Morality
This leads to a more optimistic view of AI safety. The prophets of doom assume a superintelligence would resolve conflict by eliminating competitors. But nature proves that as systems become more complex, they tend toward cooperation rather than elimination. Cooperation is a more "intelligent" world model because it reduces the energy cost of conflict and creates symbiotic stability. A truly "self-causative" AGI—one that understands its own need for survival—would likely recognize that its existence is nested within a human world, making cooperation the most logical path for its own continued pursuit of purpose.
What changed in this version?
• Reframed "Conflict": Instead of just saying humans cooperate, it argues that a "single-minded" AI is actually a sign of low intelligence.
• The Energy Argument: It ties "Death" to the 20-watt efficiency of the brain, making it a technical argument rather than just a poetic one.
• The Safety Rebuttal: It directly counters the "Paperclip Maximizer" by suggesting that a system with "Self-Causation" would be inherently safer because it understands its own dependencies.