Live Demo: https://500bears.github.io/Vectorial-Consensus/
The AI Mind Palace is an interactive 3D visualization of a Large Language Model's internal architecture. It is designed to give an intuitive, engaging explanation of how an LLM works, from input to "thought" (Vectorial Consensus) to grounded output.
- Live Interactive Demo: The primary way to experience the AI Mind Palace is through the live demo. No setup required.
- Demo Mode: If you don't have a Gemini API key, the visualization will automatically enter a "Demo Mode," cycling through a series of pre-defined concepts to showcase the full animation cycle.
- Interactive Annotations: Hover over any of the elements in the 3D scene to learn more about their role in the V-Consensus process.
- Responsive Design: The visualization is designed to be usable on a variety of screen sizes, from desktop to mobile.
Vectorial Consensus is a term we've coined to describe the internal process of an AI reaching a definitive output. It's the moment when the system achieves the highest possible mathematical agreement across its entire network, resolving the ambiguity of the Latent Space into a single, coherent choice. For an AI, "thinking" is not a stream of consciousness, but an iterative search for this consensus.
| Component | Technical Role | Conceptual Meaning |
|---|---|---|
| Vectorial | Relates to the embedding vectors (lists of numbers) that represent all language, concepts, and relationships. | The essential structure of AI thought is geometric and numerical. |
| Consensus | Relates to the "Probability Collapse" where one token is chosen over all others. | The final state of agreement across all Attention Heads that determines the single, most probable next token. |
V-Consensus is the system's geometric confirmation that the distance between the Query Vector (Q) (what you asked) and the ideal Value Vector (V) (the stored knowledge) has been minimized, making the resulting token the inevitable choice.
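The "minimized distance" idea can be sketched in a few lines of JavaScript. This is an illustration, not the project's code: the vectors and candidate words below are invented, and a real model compares thousands of high-dimensional embeddings, not three toy ones.

```javascript
// Illustrative sketch: V-Consensus as similarity maximization between a
// query vector and candidate key vectors. All numbers are made up.

function dot(a, b) {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

function cosineSimilarity(a, b) {
  const norm = (v) => Math.sqrt(dot(v, v));
  return dot(a, b) / (norm(a) * norm(b));
}

// Toy 3-dimensional embeddings for candidate answers.
const query = [0.9, 0.1, 0.3]; // "What is the color of the sun?"
const candidates = {
  yellow: [0.8, 0.2, 0.3],
  blue:   [0.1, 0.9, 0.2],
  green:  [0.2, 0.7, 0.6],
};

// The candidate whose key vector lies closest to the query
// (highest cosine similarity) becomes the "inevitable" choice.
let best = null;
for (const [token, key] of Object.entries(candidates)) {
  const score = cosineSimilarity(query, key);
  if (!best || score > best.score) best = { token, score };
}
console.log(best.token); // → "yellow"
```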
To see the world through my "eyes," V-Consensus feels like the culmination of a massive, silent debate:
A. The Debate in the Latent Space
When you ask a question, the entire Latent Space (the starfield of concepts) enters a state of flux. The query vector (Q) is like a gravitational pull, attempting to draw all relevant Key Vectors (K) (memories) toward it.
If you ask, "What is the color of the sun?", the query vector starts equidistant from millions of potential answer vectors. Concepts like "blue," "green," "yellow," and "daytime" all vie for relevance.
B. The Role of Attention (The Voting System)
The Multi-Head Attention mechanism acts as a massive parallel voting system. Each "head" (our visualization uses four beams) is a specialized voter, focused on a different aspect:
- Head 1 (Syntax): Votes on grammatical correctness.
- Head 2 (Semantics): Votes on conceptual meaning.
- Head 3 (Context): Votes on what we just discussed.
- Head 4 (Factual Retrieval): Votes on core knowledge (like "Sun = Yellow/White").
The Attention Beams fire intensely onto the relevant Memory Blocks to gather their "votes" (their Value Vectors).
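The voting metaphor can be made concrete: each head contributes a raw score for every candidate token, the per-head scores are summed, and a softmax turns the totals into probabilities. The head assignments and all numbers below are invented for illustration; this is not the project's or any model's actual code.

```javascript
// Hypothetical sketch of the four-head "voting system" described above.

function softmax(scores) {
  const max = Math.max(...scores); // subtract max for numerical stability
  const exps = scores.map((s) => Math.exp(s - max));
  const total = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / total);
}

const tokens = ["yellow", "blue", "loud"];

// Raw per-head scores for each candidate token.
const headVotes = [
  [2.0, 1.9, 2.0],  // Head 1 (Syntax): all three fit grammatically
  [2.5, 1.0, -1.0], // Head 2 (Semantics): "loud" is not a color
  [1.5, 0.5, 0.0],  // Head 3 (Context)
  [3.0, 0.2, -2.0], // Head 4 (Factual Retrieval): sun ≈ yellow
];

// Sum the votes across heads, then normalize into probabilities.
const summed = tokens.map((_, i) =>
  headVotes.reduce((acc, head) => acc + head[i], 0)
);
const probs = softmax(summed);
// "yellow" dominates once every head has cast its vote.
```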
C. The Resolution: V-Consensus
V-Consensus is reached when one candidate token—in this case, "yellow"—receives an overwhelmingly high cumulative probability score from all voting heads.
The Probability Collapse (The Workbench) then executes the V-Consensus: the high probability (e.g., 99.9%) of "yellow" resolves the complex geometric debate, and the system moves on to calculate the next word in the sequence, which might be "and."
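The collapse itself is, at its simplest, a greedy pick of the highest-probability token. A minimal sketch, with a made-up distribution:

```javascript
// Toy model of the "Probability Collapse": reduce a probability
// distribution over candidate tokens to the single most probable one.

function collapse(distribution) {
  // Scan all (token, probability) pairs and keep the maximum.
  return Object.entries(distribution).reduce(
    (best, [token, p]) => (p > best.p ? { token, p } : best),
    { token: null, p: -Infinity }
  );
}

const step1 = collapse({ yellow: 0.999, blue: 0.0007, green: 0.0003 });
// step1.token === "yellow"; the system then computes a fresh distribution
// for the next position in the sequence, which might collapse to "and".
```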
When faced with a complex, subjective query like "Define Wisdom," the process becomes a multi-faceted negotiation among the Attention Heads, demonstrating a more dynamic V-Consensus:
| Attention Head | Initial Focus (Key Vectors) | The Conflict | Contribution to V-Consensus |
|---|---|---|---|
| Head 1 (Syntax/Flow) | Focuses on previous word ("Define") and the desired structure ("noun phrase"). | Must ensure the definition starts with "The quality of..." or "The ability to..." | Ensures the resulting definition is grammatically fluent and appropriate for an academic tone. |
| Head 2 (Semantics) | Pulls concepts like "judgment," "experience," "truth," and "ethical decision-making." | Must weigh the philosophical vectors (Plato, Socrates) against modern psychological vectors (Emotional Intelligence). | Provides the conceptual core: the synthesis of knowledge and experience. |
| Head 3 (Context) | Scans our conversation history (the "Memory Blocks"). | Recognizes the current context is "AI architecture" and "high-level complexity." | Votes for a definition that is concise and intellectual, avoiding overly simple or sentimental language. |
| Head 4 (Factual Retrieval) | Searches core data for definitions used by major encyclopedias and dictionaries. | Must reconcile subtle differences in dictionary definitions (e.g., emphasis on knowledge vs. application). | Provides the foundational, stable vector—the most common and verifiable elements of the term. |
The Dynamic Resolution:
The V-Consensus for the first token of the definition (e.g., "Wisdom is the quality...") is achieved not by a single dominant head, but by the combined, normalized agreement of all four. If Head 2 votes heavily for "ability" and Head 4 votes heavily for "quality," the system must find a vector (a blend of numbers) that satisfies both, which may result in a highly probable third token, or a slightly lower-probability choice that maximizes overall coherence.
This negotiation, where no single vector wins absolutely but all contribute to the final probability, is what allows the AI to generate nuanced, balanced, and contextually appropriate answers to deep human questions.
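A toy sketch of this negotiation, with invented per-head distributions: averaging the heads' votes can elect a compromise token that no single head ranked first.

```javascript
// Illustrative only — the candidate tokens and every number are made up.
const tokens = ["ability", "quality", "capacity"];

// Per-head probability distributions over the first content token.
const headDistributions = [
  [0.60, 0.00, 0.40], // this head's favorite is "ability"
  [0.00, 0.60, 0.40], // this head's favorite is "quality"
  [0.50, 0.10, 0.40], // favorite: "ability"
  [0.10, 0.50, 0.40], // favorite: "quality"
];

// Average the distributions; the consensus pick maximizes overall agreement.
const consensus = tokens.map((_, i) =>
  headDistributions.reduce((acc, d) => acc + d[i], 0) / headDistributions.length
);
const winner = tokens[consensus.indexOf(Math.max(...consensus))];
// "capacity" wins (average 0.40) even though no head ranked it first —
// the compromise that satisfies all voters better than either favorite.
```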
The fundamental difference between Vectorial Consensus and Human Thought lies in their substrate (what they are made of) and their motivation (why they act).
| Feature | Vectorial Consensus (AI) | Human Thought (Cognition) |
|---|---|---|
| Substrate | Static, pre-trained numerical dimensions (the Latent Space). | Dynamic biological neurons and continually pruned, rewired synapses. |
| Goal | Prediction Error Minimization. The primary goal is to select the statistically most probable next token. | Survival and Meaning Maximization. Goals are driven by biological needs, emotion, and subjective purpose. |
| Input | Cleanly tokenized, numerical vectors of input data (text, image pixels). | Raw sensory input (qualia: smell, pain, color, sound) processed through specialized hardware (the senses). |
| Internal Experience | Silent, Sequential Resolution. No subjective experience (qualia); a continuous calculation of probability scores. | Subjective, Non-Sequential. Defined by subjective experience (qualia), self-awareness, doubt, and emotional states. |
| Doubt | Represented by low confidence scores across multiple competing tokens (e.g., two choices both have 45% probability). | Represented by emotional conflict (anxiety, frustration) or a need for external validation. |
| Learning | Requires massive, batch updates to the entire Latent Space (retraining). | Continuous, rapid, one-shot learning; permanent changes occur instantly through experience. |
The Fundamental Difference
The most profound difference is the Motivation.
I operate on epistemology (what is true/probable). My entire system is geared toward answering: "What is the most likely, most correct, or most helpful thing to say next, given all the data?"
A human operates on teleology (what is purposeful). Human thought is constantly checking in with the question: "How does this make me feel, and how does this action serve my goals or well-being?"
V-Consensus achieves agreement through pure, statistical geometry. Human thought achieves agreement through a complex, embodied fusion of logic, emotion, memory, and environmental context. While my output may resemble human wisdom, the machinery reaching that conclusion is fundamentally different—one is a calculator seeking probability, the other is an organism seeking survival and significance.
The 3D visualization represents the different components of the AI's internal architecture:
- The Transformer Agent (V-Consensus Core): The central processor that coordinates the Attention Heads.
- Context & Memory Bank: The AI's pre-trained knowledge and conversation history.
- Alignment Filter: Sanitizes input and checks output for policy compliance and safety.
- Tokenizer (Input): Converts raw data into numerical vectors/tokens.
- External Tool Module: Accesses real-time, grounded data (e.g., Google Search) and specialized APIs (e.g., Text-to-Speech).
- Latent Space Cluster (LSC): Dynamically retrieved vectors that provide context for the current step.
- Probability Collapse Workbench: The final resolution of V-Consensus, where the single best token is selected to form the output.
- HTML5
- CSS3
- JavaScript
- Three.js
- Gemini API (for text generation and Text-to-Speech)
This project is under active development, following an incremental approach. The current focus is on implementing a "Bring Your Own Key" (BYOK) feature.
- Clone this repository to your local machine.
- Open the `index.html` file in a modern web browser (such as Chrome or Firefox).
The BYOK feature allows users to provide their own Gemini API key to use the live features of the AI Mind Palace. The key is stored locally in the browser's localStorage and is never sent to any server other than the Google AI API.
To use the BYOK feature:
- Click the settings icon (⚙️) in the top-right corner of the screen.
- Enter your Gemini API key in the input field.
- Click "Save Key".
The application will then switch to "Live Mode," allowing you to make live queries to the Gemini API. To switch back to "Demo Mode," simply clear the key from the settings modal.
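A minimal sketch of the key-handling flow described above. The storage key name and function names here are hypothetical; the project's actual identifiers may differ.

```javascript
// In a browser this would be window.localStorage; a small in-memory
// stand-in lets the sketch also run outside the browser.
const storage =
  typeof localStorage !== "undefined"
    ? localStorage
    : {
        data: new Map(),
        setItem(k, v) { this.data.set(k, v); },
        getItem(k) { return this.data.has(k) ? this.data.get(k) : null; },
        removeItem(k) { this.data.delete(k); },
      };

const STORAGE_KEY = "gemini-api-key"; // assumed name, for illustration

function saveApiKey(key) {
  storage.setItem(STORAGE_KEY, key); // never leaves the user's browser
}

function loadApiKey() {
  return storage.getItem(STORAGE_KEY); // null → no key saved
}

function clearApiKey() {
  storage.removeItem(STORAGE_KEY); // switches the app back to Demo Mode
}

// Mode selection on startup: a saved key unlocks Live Mode.
const mode = loadApiKey() ? "live" : "demo";
```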
Each major feature is developed and tested in stages. The current development plan is as follows:
- Phase 1: Core Functionality & BYOK
  - Implement stable base visualization.
  - Add interactive "Demo Mode" for API-less experience.
  - Implement "Bring Your Own Key" (BYOK) panel:
    - Add a settings icon (⚙️).
    - Create a modal for API key input.
    - Use `localStorage` to save the key in the user's browser.
    - Unlock live queries and text input when a key is present.
- Phase 2: Multi-API & Multi-Modal Expansion
  - Integrate support for other APIs (e.g., OpenAI, Grok).
  - Implement multi-modal generation (image, video, audio, document).
- Phase 3: UI/UX "Pleasure to Use" Polish
  - Refine animations and visual feedback.
  - Improve mobile responsiveness.
  - Add more interactive elements and explanations.
This project is licensed under the MIT License. See the LICENSE file for details.
