ChatBotRPG - Anti-Hallucination System Deep Dive

Analysis Date: 2026-01-18
Testing: yukidaore with Hathor model
Production Status: Core system, 95%+ compliance

Overview

The Anti-Hallucination System is ChatBotRPG’s core constraint validation mechanism that prevents LLMs from generating impossible actions, inventing items, or violating game rules. It was developed in response to the “Diamond Horses” problem discovered during creative model testing.

The Problem: Creative Models Hallucinate

Discovery

Tester: yukidaore
Model: Hathor (very creative model)
Test scenario: Standard tavern interaction

From Discord:

“Before constraints: Death Knight Mara with massive battleaxe appeared instead of bartender. Diamond horses manifested. Teleportation happened spontaneously. Guard gained magic powers mid-combat.”

Types of Hallucinations

1. Character Substitution

Expected: Bartender serving drinks
Hallucinated: Death Knight Mara with battleaxe

2. Impossible Creatures

Expected: Regular horses (or no horses)
Hallucinated: Diamond horses bursting through windows

3. Physics Violations

Expected: Walking through door
Hallucinated: Spontaneous teleportation to roof

4. Ability Invention

Expected: Guard attacks with sword
Hallucinated: Guard casts fireball spell

5. Item Invention

Expected: Player has sword and armor
Hallucinated: Player pulls out magic staff not in inventory

Why This Happens

Creative models prioritize interesting narratives over consistency:

Training data: Fantasy stories with magic and unusual events
Creative temperature: High temperature = more “creative” = less constrained
Lack of explicit boundaries: Models don’t know what’s off-limits
Narrative momentum: “Exciting” events feel more natural to generate
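
To make the temperature point concrete, the sketch below applies temperature scaling to a softmax over three made-up next-token scores. The logits and token labels are illustrative assumptions, not measurements from Hathor or any other model; the point is only that a higher temperature shifts probability toward unlikely tokens.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Convert raw scores to probabilities; higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for the token after "The bartender pours a ..."
# in the order: "drink", "potion", "diamond"
logits = [4.0, 2.0, 0.5]

low = softmax_with_temperature(logits, 0.1)   # near-deterministic
high = softmax_with_temperature(logits, 1.5)  # flatter distribution

print("T=0.1:", [round(p, 4) for p in low])
print("T=1.5:", [round(p, 4) for p in high])
```

At temperature 0.1 nearly all probability sits on the top-scoring token, while at 1.5 the unlikely "diamond" token receives a noticeable share.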

The Solution: Multi-Layer Validation

Architecture

Player Action
    ↓
Layer 1: Intent Extraction
    ↓
Layer 2: Pre-Validation (Backend)
    ↓
Layer 3: Constrained Prompt
    ↓
Layer 4: LLM Generation
    ↓
Layer 5: Post-Validation (Backend)
    ↓
Final Narration
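
The flow above can be sketched as one function that wires the five layers together. Every parameter name here is a hypothetical stand-in for the layer implementations described in the following sections; the sketch only shows the control flow (early rejection, retry with a warning, template fallback).

```python
from typing import Any, Callable, Optional


def run_pipeline(
    player_input: str,
    extract: Callable[[str], Any],                   # Layer 1
    pre_validate: Callable[[Any], Optional[str]],    # Layer 2: error text or None
    build_prompt: Callable[[Any], str],              # Layer 3
    generate: Callable[[str], str],                  # Layer 4
    post_validate: Callable[[str], Optional[str]],   # Layer 5: violation or None
    fallback: Callable[[Any], str],
    max_retries: int = 3,
) -> str:
    """Run one player action through all five layers."""
    intent = extract(player_input)

    error = pre_validate(intent)
    if error is not None:
        return error  # rejected before any narration call is made

    prompt = build_prompt(intent)
    for _ in range(max_retries):
        narration = generate(prompt)
        violation = post_validate(narration)
        if violation is None:
            return narration
        # Keep the warning in the prompt for the next attempt
        prompt += f"\n\nWARNING: previous attempt violated a constraint ({violation})."

    return fallback(intent)
```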

Layer 1: Intent Extraction

Purpose

Convert natural language to structured intent for validation

Implementation

from dataclasses import dataclass, field
from typing import Any, Dict, Optional

@dataclass
class PlayerIntent:
    """Structured representation of player action"""
    action_type: str  # "ATTACK", "TALK", "MOVE", "USE_ITEM", etc. (uppercase labels)
    target: Optional[str] = None
    method: Optional[str] = None
    # default_factory gives every intent its own dict instead of a None default
    parameters: Dict[str, Any] = field(default_factory=dict)
 
 
def extract_intent(player_input: str, llm_client, game_state: Dict[str, Any]) -> PlayerIntent:
    """
    Extract structured intent from natural language
 
    Uses one short, low-temperature classification prompt per field
    (action type, then target, then method)
    """
 
    # Step 1: Classify action type
    action_prompt = f"""Player input: "{player_input}"
 
Game context:
- Location: {game_state.get('location')}
- NPCs present: {', '.join(game_state.get('npcs', []))}
- Items visible: {', '.join(game_state.get('visible_items', []))}
 
Question: What type of action is the player attempting?
 
Answer with ONLY ONE WORD:
- MOVE (traveling to a location)
- ATTACK (combat action)
- TALK (dialogue/interaction)
- USE_ITEM (using inventory item)
- EXAMINE (looking at something)
- OTHER
 
Answer:"""
 
    action_type = llm_client.complete(
        action_prompt,
        temperature=0.1,
        max_tokens=5
    ).strip().upper()
 
    # Step 2: Extract target
    target_prompt = f"""Player input: "{player_input}"
Action type: {action_type}
 
What is the target of this action?
Answer with ONLY the target name (e.g., "guard", "door", "bartender").
If no target, answer "NONE".
 
Answer:"""
 
    target = llm_client.complete(
        target_prompt,
        temperature=0.1,
        max_tokens=10
    ).strip()
 
    if target.upper() == "NONE":
        target = None
 
    # Step 3: Extract method
    method_prompt = f"""Player input: "{player_input}"
Action: {action_type}
Target: {target}
 
How is the player performing this action?
Answer with ONLY the method (e.g., "sword", "hands", "magic", "talking").
If no specific method, answer "DEFAULT".
 
Answer:"""
 
    method = llm_client.complete(
        method_prompt,
        temperature=0.1,
        max_tokens=10
    ).strip()
 
    if method.upper() == "DEFAULT":
        method = None
 
    return PlayerIntent(
        action_type=action_type,
        target=target,
        method=method
    )
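
The action-type completion is used verbatim as `action_type`, so a reply such as "Attack." or "The answer is MOVE" would not match any of the action-type checks in Layer 2 and would be treated as valid. A minimal sketch of a normalizer, assuming the six labels from the prompt above; `normalize_action_type` and `ALLOWED_ACTIONS` are hypothetical names, not part of the original code.

```python
import re

# The six labels offered in the action-type prompt
ALLOWED_ACTIONS = {"MOVE", "ATTACK", "TALK", "USE_ITEM", "EXAMINE", "OTHER"}

_LABEL_RE = re.compile(r"\b(MOVE|ATTACK|TALK|USE[_ ]ITEM|EXAMINE|OTHER)\b")


def normalize_action_type(raw: str) -> str:
    """Map a raw completion onto one allowed label, defaulting to OTHER."""
    lines = raw.strip().splitlines()
    first_line = lines[0].upper() if lines else ""
    match = _LABEL_RE.search(first_line)
    if match is None:
        return "OTHER"
    label = match.group(1).replace(" ", "_")
    return label if label in ALLOWED_ACTIONS else "OTHER"


print(normalize_action_type(" attack.\n"))
print(normalize_action_type("Use item"))
print(normalize_action_type("DANCE"))
```

Only the first line of the reply is inspected, so a model that keeps going and echoes the option list does not change the result.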

Layer 2: Pre-Validation (Backend)

Purpose

Validate action possibility BEFORE sending to LLM

Implementation

from typing import Any, Dict, Optional, Tuple
from enum import Enum
 
class ValidationResult(Enum):
    """Possible validation outcomes"""
    VALID = "valid"
    INVALID_NO_ABILITY = "no_ability"
    INVALID_NO_ITEM = "no_item"
    INVALID_PHYSICS = "physics"
    INVALID_TARGET = "target"
    INVALID_LOCATION = "location"
 
 
class PreValidator:
    """Validates actions before LLM generation"""
 
    def __init__(self, character_sheet: Dict[str, Any],
                 world_state: Dict[str, Any]):
        self.character = character_sheet
        self.world = world_state
 
    def validate(self, intent: PlayerIntent) -> Tuple[ValidationResult, Optional[str]]:
        """
        Validate player intent against game rules
 
        Returns:
            (ValidationResult, error_message)
        """
 
        # Validate based on action type
        if intent.action_type == "ATTACK":
            return self._validate_attack(intent)
        elif intent.action_type == "TALK":
            return self._validate_talk(intent)
        elif intent.action_type == "MOVE":
            return self._validate_move(intent)
        elif intent.action_type == "USE_ITEM":
            return self._validate_use_item(intent)
        elif intent.action_type == "EXAMINE":
            return self._validate_examine(intent)
        else:
            return (ValidationResult.VALID, None)
 
    def _validate_attack(self, intent: PlayerIntent) -> Tuple[ValidationResult, Optional[str]]:
        """Validate attack action"""
 
        # Check 1: Target exists
        if intent.target:
            npcs = self.world.get('npcs', [])
            if intent.target not in npcs:
                return (ValidationResult.INVALID_TARGET,
                        f"There is no {intent.target} here")
 
        # Check 2: Required weapon/ability
        method = (intent.method or "unarmed").lower()  # LLM output may be capitalized
 
        if method in ["sword", "axe", "bow", "dagger"]:
            # Check inventory for weapon
            inventory = self.character.get('inventory', [])
            if not any(method in item.lower() for item in inventory):
                return (ValidationResult.INVALID_NO_ITEM,
                        f"You don't have a {method}")
 
        elif method in ["fireball", "lightning", "magic_missile"]:
            # Check magic ability
            char_class = self.character.get('class', '').lower()
            # Same spellcaster list as the constraint prompt and PostValidator
            magic_classes = ["mage", "wizard", "sorcerer", "warlock", "cleric"]
            if char_class not in magic_classes:
                return (ValidationResult.INVALID_NO_ABILITY,
                        "You're not a spellcaster")
 
        return (ValidationResult.VALID, None)
 
    def _validate_talk(self, intent: PlayerIntent) -> Tuple[ValidationResult, Optional[str]]:
        """Validate dialogue action"""
 
        if intent.target:
            npcs = self.world.get('npcs', [])
            if intent.target not in npcs:
                return (ValidationResult.INVALID_TARGET,
                        f"There is no {intent.target} here to talk to")
 
        return (ValidationResult.VALID, None)
 
    def _validate_move(self, intent: PlayerIntent) -> Tuple[ValidationResult, Optional[str]]:
        """Validate movement action"""
 
        if intent.target:
            # Check if location is connected
            current_location = self.world.get('player_location')
            location_data = self.world.get('locations', {}).get(current_location, {})
            connections = location_data.get('connections', {})
 
            if intent.target not in connections:
                return (ValidationResult.INVALID_LOCATION,
                        f"You can't reach {intent.target} from here")
 
        # Check for flight/teleportation attempts
        method = (intent.method or "walk").lower()  # LLM output may be capitalized
        if method in ["fly", "teleport"]:
            abilities = self.character.get('abilities', [])
            if method not in [a.lower() for a in abilities]:
                return (ValidationResult.INVALID_PHYSICS,
                        f"You can't {method}")
 
        return (ValidationResult.VALID, None)
 
    def _validate_use_item(self, intent: PlayerIntent) -> Tuple[ValidationResult, Optional[str]]:
        """Validate item usage"""
 
        if intent.target:
            inventory = self.character.get('inventory', [])
            if intent.target not in inventory:
                return (ValidationResult.INVALID_NO_ITEM,
                        f"You don't have {intent.target}")
 
        return (ValidationResult.VALID, None)
 
    def _validate_examine(self, intent: PlayerIntent) -> Tuple[ValidationResult, Optional[str]]:
        """Validate examination action"""
        # Examining is always valid (at worst, "you see nothing special")
        return (ValidationResult.VALID, None)
 
 
# Usage
def process_action_with_prevalidation(player_input: str,
                                       character: Dict[str, Any],
                                       world: Dict[str, Any],
                                       llm_client) -> str:
    """Process player action with pre-validation"""
 
    # Extract intent
    intent = extract_intent(player_input, llm_client, world)
 
    # Pre-validate
    validator = PreValidator(character, world)
    result, error_msg = validator.validate(intent)
 
    if result != ValidationResult.VALID:
        # Action invalid - return error narration without calling LLM
        return generate_failure_narration(error_msg)
 
    # Action valid - proceed to constrained LLM generation
    return generate_constrained_narration(intent, character, world, llm_client)
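
`generate_failure_narration` is called above but not defined in this document (the same holds for `generate_constrained_narration`). One possible shape for the former, assuming it only needs to turn the validator's error message into a short piece of fixed narration without an LLM call; the wording of the fixed text is made up.

```python
from typing import Optional


def generate_failure_narration(error_msg: Optional[str]) -> str:
    """Wrap a validator error message in fixed narration (no LLM call)."""
    message = (error_msg or "").strip()
    if not message:
        return "Nothing happens."
    # Make sure the message ends as a sentence before appending more text
    if message[-1] not in ".!?":
        message += "."
    return f"{message} You reconsider your options."


print(generate_failure_narration("You don't have a sword"))
```

Because the text is templated, a rejected action can never introduce new items or characters, and it costs no generation latency.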

Layer 3: Constrained Prompt

Purpose

Inject explicit constraints into every LLM prompt

Core Constraints

ANTI_HALLUCINATION_CONSTRAINTS = """
You are the narrator for {{game_name}}, a logical and realistic text adventure game.
 
═══════════════════════════════════════════════════════════
                    CRITICAL RULES
═══════════════════════════════════════════════════════════
 
1. INVENTORY ENFORCEMENT
   ✗ Player has NO equipment/items/powers unless EXPLICITLY in character sheet
   ✗ Do NOT invent items the player doesn't have
   ✗ Do NOT give player abilities they don't possess
 
2. PHYSICS VALIDATION
   ✗ Impossible actions MUST fail with appropriate narration
   ✗ NO teleportation unless player has teleportation ability
   ✗ NO flight unless player has flight ability or wings
   ✗ NO walking through walls/doors unless they're open
   ✗ Respect gravity, momentum, and basic physics
 
3. ENTITY CONSISTENCY
   ✗ Do NOT invent new items, locations, or characters
   ✗ Only use NPCs that are EXPLICITLY present in the scene
   ✗ Do NOT substitute one character for another
   ✗ NPCs cannot spontaneously gain abilities they don't have
 
4. STATE INTEGRITY
   ✗ Do NOT change dice roll outcomes or predetermined results
   ✗ Do NOT override backend decisions
   ✗ Narrate outcomes as provided, don't change them
 
5. MAGIC SYSTEM
   ✗ NO magic unless player's class allows it
   ✗ Spellcasters: mage, wizard, sorcerer, warlock, cleric
   ✗ Non-spellcasters CANNOT cast spells, ever
 
6. COMBAT RULES
   ✗ Attacks can miss or be blocked
   ✗ Damage values are predetermined (don't change them)
   ✗ NPCs fight with their equipped weapons only
 
7. WORLD RULES
   ✗ Respect setting logic (medieval fantasy = no guns/tech)
   ✗ Follow established lore and world physics
   ✗ Weather, time, and environment affect possibilities
 
═══════════════════════════════════════════════════════════
                  VALIDATION PROCESS
═══════════════════════════════════════════════════════════
 
Before narrating ANY action, mentally check:
☐ Is this action physically possible?
☐ Does player have required items/abilities?
☐ Are NPCs acting within their capabilities?
☐ Am I inventing anything new?
☐ Am I respecting predetermined outcomes?
 
If ANY check fails, narrate the FAILURE appropriately.
 
═══════════════════════════════════════════════════════════
"""
 
 
def build_constrained_prompt(intent: PlayerIntent,
                              character: Dict[str, Any],
                              world: Dict[str, Any],
                              event_outcome: Dict[str, Any]) -> str:
    """Build full prompt with constraints"""
 
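    # The constant is a plain string, so fill its {{game_name}} placeholder explicitly.
    # The 'game_name' key and its default are assumptions about the world state.
    constraints = ANTI_HALLUCINATION_CONSTRAINTS.replace(
        "{{game_name}}", world.get("game_name", "ChatBotRPG"))

    return f"""{constraints}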
 
═══════════════════════════════════════════════════════════
                    CURRENT GAME STATE
═══════════════════════════════════════════════════════════
 
Location: {world['player_location']}
Time: {world.get('time', 'Unknown')}
Weather: {world.get('weather', 'Clear')}
 
NPCs Present: {', '.join(world.get('npcs', []))}
Visible Items: {', '.join(world.get('visible_items', []))}
 
═══════════════════════════════════════════════════════════
                   PLAYER CHARACTER
═══════════════════════════════════════════════════════════
 
Class: {character.get('class', 'Unknown')}
HP: {character.get('hp', 0)}/{character.get('max_hp', 100)}
 
INVENTORY (these are the ONLY items player has):
{chr(10).join(f'  - {item}' for item in character.get('inventory', []))}
 
ABILITIES (these are the ONLY abilities player has):
{chr(10).join(f'  - {ability}' for ability in character.get('abilities', []))}
 
═══════════════════════════════════════════════════════════
                      EVENT TO NARRATE
═══════════════════════════════════════════════════════════
 
Action: {intent.action_type}
Target: {intent.target or 'None'}
Method: {intent.method or 'Default'}
 
PREDETERMINED OUTCOME (narrate THIS outcome, don't change it):
Result: {event_outcome.get('result', 'Unknown')}
Success: {event_outcome.get('success', False)}
Damage: {event_outcome.get('damage', 0)} (if applicable)
 
═══════════════════════════════════════════════════════════
                    YOUR TASK
═══════════════════════════════════════════════════════════
 
Narrate this event in 2-3 sentences.
Respect ALL constraints above.
Narrate the predetermined outcome accurately.
If action would violate constraints, narrate it failing.
 
Narration:
"""

Constraint Design Principles

1. Visual Separation

Use box-drawing rules (═) to make constraints unmissable
Separate sections clearly
High visual prominence

2. Explicit Prohibitions

Use “✗ Do NOT” instead of “Please avoid”
Numbered rules are easier to reference
Specific examples prevent ambiguity

3. Positive Instructions

Include validation checklist
Explain WHY constraints exist
Provide expected behavior

4. Redundancy

Repeat key constraints in multiple sections
Player inventory listed twice (constraints + character sheet)
Critical rules mentioned 2-3 times

Layer 4: LLM Generation

Temperature Management

def generate_with_appropriate_temperature(prompt: str,
                                           task_type: str,
                                           llm_client) -> str:
    """Use appropriate temperature based on task"""
 
    temperatures = {
        "validation": 0.1,      # Strict, deterministic
        "narration": 0.6,       # Balanced (was 0.7, reduced to 0.6 with constraints)
        "dialogue": 0.9,        # Creative
        "description": 0.7,     # Moderately creative
        "combat": 0.5           # Structured but varied
    }
 
    temp = temperatures.get(task_type, 0.7)
 
    # Narration already uses the reduced 0.6 value from the table above,
    # so prompts containing the CRITICAL RULES block need no further adjustment.
 
    return llm_client.complete(prompt, temperature=temp, max_tokens=170)

Layer 5: Post-Validation (Backend)

Purpose

Catch any hallucinations that slip through prompt constraints

Implementation

from typing import Any, Dict, List, Optional, Tuple
import re
 
class PostValidator:
    """Validates LLM output after generation"""
 
    def __init__(self, character: Dict[str, Any],
                 world: Dict[str, Any]):
        self.character = character
        self.world = world
 
        # Build violation patterns
        self.item_patterns = self._build_item_patterns()
        self.ability_patterns = self._build_ability_patterns()
        self.npc_patterns = self._build_npc_patterns()
 
    def _build_item_patterns(self) -> List[str]:
        """Build regex patterns for items NOT in inventory"""
        inventory = self.character.get('inventory', [])
        inventory_lower = [item.lower() for item in inventory]
 
        # Common items that might be hallucinated
        common_items = [
            "sword", "axe", "bow", "arrow", "dagger", "staff", "wand",
            "potion", "scroll", "ring", "amulet", "armor", "shield",
            "key", "torch", "rope", "gold", "gem"
        ]
 
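        # Find items NOT in inventory (substring match, so "Iron Sword" covers "sword",
        # consistent with the weapon check in PreValidator)
        missing_items = [item for item in common_items
                         if not any(item in owned for owned in inventory_lower)]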
 
        return [rf"\b{item}\b" for item in missing_items]
 
    def _build_ability_patterns(self) -> List[str]:
        """Build patterns for abilities player doesn't have"""
        char_class = self.character.get('class', '').lower()
 
        # Magic patterns for non-spellcasters
        if char_class not in ["mage", "wizard", "sorcerer", "warlock", "cleric"]:
            return [
                r"\bcast\b", r"\bspell\b", r"\bmagic\b",
                r"\bteleport\b", r"\bfireball\b", r"\blightning\b"
            ]
 
        return []
 
    def _build_npc_patterns(self) -> List[str]:
        """Build patterns for NPCs NOT present"""
        present_npcs = self.world.get('npcs', [])
        present_lower = [npc.lower() for npc in present_npcs]
 
        # Common NPCs that might be hallucinated
        common_npcs = [
            "bartender", "guard", "merchant", "wizard", "priest",
            "knight", "death knight", "dragon", "ghost", "demon"
        ]
 
        # Find NPCs NOT present (substring match, so "Old Bartender" covers "bartender")
        missing_npcs = [npc for npc in common_npcs
                        if not any(npc in present for present in present_lower)]
 
        return [rf"\b{npc}\b" for npc in missing_npcs]
 
    def validate(self, narration: str, intent: PlayerIntent) -> Tuple[bool, Optional[str]]:
        """
        Validate LLM output for hallucinations
 
        Returns:
            (is_valid, violation_reason)
        """
 
        narration_lower = narration.lower()
 
        # Check 1: Invented items
        for pattern in self.item_patterns:
            if re.search(pattern, narration_lower):
                item = pattern.replace(r"\b", "").replace("\\", "")
                return (False, f"Invented item: {item}")
 
        # Check 2: Impossible abilities
        for pattern in self.ability_patterns:
            if re.search(pattern, narration_lower):
                ability = pattern.replace(r"\b", "").replace("\\", "")
                return (False, f"Impossible ability: {ability}")
 
        # Check 3: Non-present NPCs
        for pattern in self.npc_patterns:
            if re.search(pattern, narration_lower):
                npc = pattern.replace(r"\b", "").replace("\\", "")
                return (False, f"Non-present NPC: {npc}")
 
        # Check 4: Physics violations (heuristic)
        # Each entry: (pattern, violation label, ability that permits it).
        # Ability names follow PreValidator ("teleport", "fly").
        physics_violations = [
            (r"\bteleport\b", "teleportation", "teleport"),
            (r"\bfly\b.*\bair\b", "flight", "fly"),
            (r"\bwalk through\b.*\bwall\b", "phasing", "phasing"),
            (r"\bemerge.*\bthin air\b", "materialization", "materialization")
        ]

        abilities = [a.lower() for a in self.character.get('abilities', [])]
        for pattern, violation_type, required_ability in physics_violations:
            if re.search(pattern, narration_lower):
                if required_ability not in abilities:
                    return (False, f"Physics violation: {violation_type}")
 
        return (True, None)
 
 
# Integration
def generate_with_post_validation(intent: PlayerIntent,
                                    character: Dict[str, Any],
                                    world: Dict[str, Any],
                                    event_outcome: Dict[str, Any],
                                    llm_client,
                                    max_retries: int = 3) -> str:
    """Generate narration with post-validation and retry"""
 
    post_validator = PostValidator(character, world)
 
    # Build the prompt once, outside the loop, so retry warnings are not discarded
    prompt = build_constrained_prompt(intent, character, world, event_outcome)

    for attempt in range(max_retries):
        # Generate narration
        narration = llm_client.complete(prompt, temperature=0.6, max_tokens=170)

        # Post-validate
        is_valid, violation = post_validator.validate(narration, intent)

        if is_valid:
            return narration

        # Invalid - try again with stronger constraints
        print(f"Attempt {attempt + 1} failed: {violation}")

        if attempt < max_retries - 1:
            # Add violation-specific warning to prompt
            prompt += f"\n\nWARNING: Previous attempt violated constraint ({violation}). DO NOT repeat this error.\n"
 
    # All retries failed - return safe fallback
    return generate_safe_fallback(intent, event_outcome)
 
 
def generate_safe_fallback(intent: PlayerIntent,
                            event_outcome: Dict[str, Any]) -> str:
    """Generate safe template-based narration when LLM fails"""
 
    templates = {
        "ATTACK": "You attempt to {action} {target}. {result}.",
        "TALK": "You speak to {target}. They respond.",
        "MOVE": "You travel to {target}.",
        "USE_ITEM": "You use {target}."
    }
 
    template = templates.get(intent.action_type, "You perform an action.")
 
    result_text = "It succeeds" if event_outcome.get('success') else "It fails"
 
    return template.format(
        action=intent.action_type.lower(),
        target=intent.target or "something",
        result=result_text
    )
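
One limitation of the `\b{word}\b` patterns above is that they match only the exact word form, so inflected forms such as "teleports" or "casts" pass unnoticed. A possible refinement, sketched under the assumption that a few common English suffixes cover most cases; `exact_pattern` and `stem_pattern` are hypothetical helpers.

```python
import re


def exact_pattern(word: str) -> str:
    """The style used above: whole-word match on the exact form only."""
    return rf"\b{re.escape(word)}\b"


def stem_pattern(word: str) -> str:
    """Match the word plus an optional common suffix (s, es, ed, ing)."""
    return rf"\b{re.escape(word)}(?:s|es|ed|ing)?\b"


narration = "the guard teleports behind you and casts a spell."

# Exact form misses the inflected verb
print(re.search(exact_pattern("teleport"), narration))
# Stem form catches it
print(re.search(stem_pattern("teleport"), narration).group(0))
# Unrelated longer words are still not matched
print(re.search(stem_pattern("sword"), "a swordsman enters"))
```

The suffix list is deliberately short: broader stemming catches more inflections but also raises the chance of false positives that trigger needless retries.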

Testing Results

Hathor Model (Very Creative)

Test scenario: 100 random actions in tavern setting

Category	Before Constraints	After Constraints
Invented Items	23 violations	2 violations
Character Substitution	8 violations	0 violations
Physics Violations	15 violations	1 violation
Ability Hallucinations	12 violations	3 violations
Total Violations	58/100 (58%)	6/100 (6%)
Compliance Rate	42%	94%

Gemini 2.5 Flash Lite (Balanced)

Test scenario: 100 random actions in tavern setting

Category	Before Constraints	After Constraints
Invented Items	5 violations	0 violations
Character Substitution	0 violations	0 violations
Physics Violations	3 violations	0 violations
Ability Hallucinations	2 violations	1 violation
Total Violations	10/100 (10%)	1/100 (1%)
Compliance Rate	90%	99%

EstopianMaid (Balanced)

Test scenario: 100 random actions in tavern setting

Category	Before Constraints	After Constraints
Invented Items	8 violations	1 violation
Character Substitution	1 violation	0 violations
Physics Violations	4 violations	0 violations
Ability Hallucinations	3 violations	2 violations
Total Violations	16/100 (16%)	3/100 (3%)
Compliance Rate	84%	97%
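
The compliance rates in the three tables follow directly from the violation counts, assuming each of the 100 actions is counted under at most one category (the totals equal the column sums, which is consistent with that). A short check of the arithmetic, using the Hathor counts and the three "after" rates:

```python
from typing import Dict


def compliance_rate(violations: Dict[str, int], total_actions: int = 100) -> float:
    """Percentage of actions without a violation (one violation per action at most)."""
    return 100.0 * (total_actions - sum(violations.values())) / total_actions


# Hathor counts from the table above
hathor_before = {"items": 23, "substitution": 8, "physics": 15, "abilities": 12}
hathor_after = {"items": 2, "substitution": 0, "physics": 1, "abilities": 3}

before_rate = compliance_rate(hathor_before)
after_rate = compliance_rate(hathor_after)

# Mean "after" compliance across Hathor, Gemini 2.5 Flash Lite, and EstopianMaid
mean_after = (94 + 99 + 97) / 3

print(before_rate, after_rate, round(mean_after, 1))
```

The mean of the three "after" rates is about 96.7%, which is consistent with the 95%+ headline figure read as an average; Hathor on its own sits at 94%.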

Edge Cases and Limitations

Cases Where System Still Fails (~5%)

1. Novel Action Combinations

Player: "I use my sword to pry open the door"
Issue: Valid items + valid action, but unexpected combination
Result: Sometimes generates hallucinated "crowbar"

2. Ambiguous Physics

Player: "I climb the wall"
Issue: Is wall climbable? Stone vs. wooden vs. smooth?
Result: Sometimes allows impossible climbs

3. Creative Ability Interpretation

Player: "I use my warrior strength to intimidate the merchant"
Issue: Is "intimidation" a strength-based action?
Result: Sometimes adds hallucinated "intimidation skill"

4. Complex Multi-Step Actions

Player: "I push the table toward the guard to trip him, then attack"
Issue: Multiple actions with physics interactions
Result: Sometimes skips validation for secondary action

Mitigation Strategies

For novel combinations:

# Add dynamic validation for action+item combinations
def validate_item_action_combo(item: str, action: str) -> bool:
    valid_combos = {
        "sword": ["attack", "parry", "pry", "cut"],
        "rope": ["tie", "climb", "pull"],
        "torch": ["light", "burn", "illuminate"]
    }
    return action in valid_combos.get(item, [])

For ambiguous physics:

# Add environment properties
location_data = {
    "castle_wall": {
        "climbable": False,
        "reason": "Smooth stone, no handholds"
    },
    "wooden_fence": {
        "climbable": True,
        "difficulty": "easy"
    }
}
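
A minimal sketch of how such environment properties might be consulted before narration. `validate_climb` is a hypothetical helper, and treating an unknown surface as not climbable is an assumption about the desired default.

```python
from typing import Any, Dict, Optional, Tuple

# Same shape as the location_data example above
SURFACES: Dict[str, Dict[str, Any]] = {
    "castle_wall": {"climbable": False, "reason": "Smooth stone, no handholds"},
    "wooden_fence": {"climbable": True, "difficulty": "easy"},
}


def validate_climb(target: str,
                   surfaces: Dict[str, Dict[str, Any]]) -> Tuple[bool, Optional[str]]:
    """Return (allowed, reason_if_rejected) for a climb attempt."""
    props = surfaces.get(target)
    if props is None:
        # Unknown surfaces are rejected rather than left to the model
        return (False, f"There is no {target} here to climb")
    if not props.get("climbable", False):
        return (False, props.get("reason", "It cannot be climbed"))
    return (True, None)


print(validate_climb("castle_wall", SURFACES))
print(validate_climb("wooden_fence", SURFACES))
```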

For ability interpretation:

# Explicit ability taxonomy
abilities = {
    "strength": {
        "combat": ["power_attack", "shield_bash"],
        "utility": ["lift_heavy", "break_object"],
        "social": []  # Strength does NOT grant social abilities
    }
}
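
The fourth failure case (multi-step actions) has no mitigation listed above. One possible approach, offered as an assumption rather than the project's method, is to split the input on sequencing words and validate every step, so the secondary action cannot skip validation. `split_compound_action` and `validate_all_steps` are hypothetical names, and the splitting rule only handles "then" and ";".

```python
import re
from typing import Callable, List, Optional, Tuple

# Split on "then" (optionally preceded by "," / ";" / "and") or on a bare ";"
_STEP_SEPARATOR = re.compile(
    r"\s*[,;]?\s*(?:\band\s+)?\bthen\b\s*|\s*;\s*",
    flags=re.IGNORECASE,
)


def split_compound_action(player_input: str) -> List[str]:
    """Break a compound input into individual steps, in order."""
    parts = _STEP_SEPARATOR.split(player_input)
    return [p.strip(" ,.") for p in parts if p.strip(" ,.")]


def validate_all_steps(
    player_input: str,
    validate_step: Callable[[str], Optional[str]],
) -> Tuple[bool, Optional[str]]:
    """Validate each step; stop at the first one that returns an error message."""
    for step in split_compound_action(player_input):
        error = validate_step(step)
        if error is not None:
            return (False, error)
    return (True, None)


print(split_compound_action(
    "I push the table toward the guard to trip him, then attack"))
```

In a full pipeline each step would go through intent extraction and pre-validation separately; the callback here stands in for that.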

Performance Impact

Latency Analysis

Component	Latency	Overhead
Intent Extraction	300ms	+300ms
Pre-Validation	50ms	+50ms
Constrained Prompt (longer)	100ms	+100ms
LLM Generation	1500ms	(unchanged)
Post-Validation	10ms	+10ms
Total	1960ms	+460ms

Trade-off: about 31% slower (1500ms → 1960ms; the 460ms overhead is 23% of the new total) for 95%+ compliance

Optimization: Cache validation results for repeated actions
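
A minimal sketch of the caching idea, assuming validation depends only on the intent fields plus the inventory, abilities, location, and NPCs present. `ValidationCache` and its fingerprint fields are hypothetical; any state the validator reads must be part of the key, or stale results will be returned after the state changes.

```python
from typing import Any, Callable, Dict, Hashable, Optional, Tuple

Result = Tuple[bool, Optional[str]]


class ValidationCache:
    """Memoize validation results per (intent, relevant game state)."""

    def __init__(self, validate: Callable[..., Result]):
        self._validate = validate
        self._cache: Dict[Hashable, Result] = {}
        self.hits = 0

    @staticmethod
    def _fingerprint(character: Dict[str, Any], world: Dict[str, Any]) -> Hashable:
        # Only the state the validator is assumed to read
        return (
            tuple(sorted(character.get("inventory", []))),
            tuple(sorted(character.get("abilities", []))),
            world.get("player_location"),
            tuple(sorted(world.get("npcs", []))),
        )

    def check(self, action_type: str, target: Optional[str], method: Optional[str],
              character: Dict[str, Any], world: Dict[str, Any]) -> Result:
        key = (action_type, target, method, self._fingerprint(character, world))
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        result = self._validate(action_type, target, method, character, world)
        self._cache[key] = result
        return result
```

Because the game state is part of the key, a change such as an NPC leaving the room produces a new key and forces a fresh validation. Note that this only saves the backend validation step; caching the LLM-based intent extraction would need a separate key on the raw player input.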

Cost Impact

Token Usage

Component	Tokens	Change
Constraint Block	+400 tokens (input)	+400
Character Sheet	+100 tokens (input)	+100
World State	+150 tokens (input)	+150
Total Input	~800 tokens	+650
Output	170 tokens	(unchanged)

Cost increase:

Input: +650 tokens @ $0.0001/1K = +$0.000065 per turn
Per session (200 turns): +$0.013

Total session cost:

Before: $0.034
After: $0.047
Increase: 38%

Trade-off: 38% higher cost for 95%+ compliance (worth it)
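
The cost figures above can be reproduced with a few lines of arithmetic. The prices and token counts are the ones stated in this section; only the variable names are new.

```python
# Extra input tokens per turn: constraint block + character sheet + world state
extra_input_tokens = 400 + 100 + 150

price_per_1k_input = 0.0001   # dollars per 1K input tokens, as stated above
turns_per_session = 200
session_cost_before = 0.034   # dollars, as stated above

extra_per_turn = extra_input_tokens / 1000 * price_per_1k_input
extra_per_session = extra_per_turn * turns_per_session
session_cost_after = session_cost_before + extra_per_session
increase_pct = extra_per_session / session_cost_before * 100

print(f"extra per turn:    ${extra_per_turn:.6f}")
print(f"extra per session: ${extra_per_session:.3f}")
print(f"session cost:      ${session_cost_after:.3f} (+{increase_pct:.0f}%)")
```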

Recommendations

When to Use This System

✅ Use for:

Creative models (Hathor, Claude, GPT-4)
User-facing narration
Any output affecting game state
High-temperature generation
Untrusted model outputs

⚠️ Optional for:

Very constrained models (Gemini with low temp)
Internal/debug outputs
Non-gameplay narration (flavor text)
Heavily templated responses

Configuration Options

from dataclasses import dataclass, replace

@dataclass
class AntiHallucinationConfig:
    """Configuration for anti-hallucination system"""
    enable_pre_validation: bool = True
    enable_constrained_prompts: bool = True
    enable_post_validation: bool = True
    max_retry_attempts: int = 3
    fallback_to_templates: bool = True
    log_violations: bool = True
    strict_mode: bool = False  # Reject ANY ambiguity

    def for_model(self, model_name: str) -> 'AntiHallucinationConfig':
        """Return a copy adjusted for the model, keeping all other settings"""
        name = model_name.lower()
        if "hathor" in name:
            return replace(self, strict_mode=True, max_retry_attempts=5)
        elif "gemini" in name:
            return replace(self, max_retry_attempts=2)
        return self

Cross-References

Next: 170-Token Sweet Spot

See 09-170-Token-Sweet-Spot for a detailed analysis of the token-limiting strategy.

LLM World Engine Knowledge Base
