Prompt: Format Enforcement (Structured Output Control)

Metadata

  • Category: Constraint
  • Technique: Explicit format instruction with bracketed output
  • Model Tested: GPT-3.5, GPT-4, various local models
  • Contributor: User-appl2613 (monkeyrithms)
  • Status: Proven (production in ChatBot RPG)
  • Implementation: ChatBot RPG

Purpose

Force the LLM to output in a specific, parseable format (typically bracketed, comma-separated values) that can be reliably extracted and processed by code. This technique turns the LLM from a creative writer into a structured data extractor, enabling deterministic parsing and game logic.

Core Principle: Tell the LLM you’re expecting structured output, not creative output.

Template

Read the following text. Answer with only {{format_description}}.

Rules:
- Output format: {{format_example}}
- {{item_format_rule}}
- {{multiplicity_rule}}
- No additional text, commentary, or explanation
- If {{nothing_found_condition}}, output: {{empty_format}}

Text to analyze:
{{input_text}}

Output:

Variables

Variable | Description | Example
--- | --- | ---
{{format_description}} | What to output | "the singular form of consumed objects in brackets"
{{format_example}} | Example of correct format | [WATER],[BERRY JUICE]
{{item_format_rule}} | How to format individual items | "Write in singular form, uppercase"
{{multiplicity_rule}} | How to handle duplicates | "If multiple of the same item, write it once only"
{{nothing_found_condition}} | When to return empty | "no objects are found", "character performs no action"
{{empty_format}} | What to return if nothing found | "[NONE]", "[]", "NONE"
{{input_text}} | The text to extract from | Player message, chat log, narration
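
As a concrete illustration, here is a minimal sketch of filling the template in Python with plain str.format substitution. The name CONSUMPTION_PROMPT and the sample message are illustrative, not part of the technique:

player_message = 'I drink a soda to quench my thirst.'

CONSUMPTION_PROMPT = """Read the following text. Answer with only {format_description}.

Rules:
- Output format: {format_example}
- {item_format_rule}
- {multiplicity_rule}
- No additional text, commentary, or explanation
- If {nothing_found_condition}, output: {empty_format}

Text to analyze:
{input_text}

Output:"""

prompt = CONSUMPTION_PROMPT.format(
    format_description="the singular form of consumed objects in brackets",
    format_example="[ITEM1],[ITEM2]",
    item_format_rule="Write each item in singular form, uppercase",
    multiplicity_rule="If multiple of the same item, write it once only",
    nothing_found_condition="no consumption occurs",
    empty_format="[NONE]",
    input_text=player_message,
)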

Usage Examples

Example 1: Item Consumption Detection

Prompt:

Read the following text. Answer with only the singular form of the object or objects that the character is consuming in brackets.

Rules:
- Output format: [ITEM1],[ITEM2],[ITEM3]
- Write each item in singular form, uppercase
- If multiple of the same item, write it once only (e.g., "two apples" = [APPLE])
- No additional text, commentary, or explanation
- If no consumption occurs, output: [NONE]

Text to analyze:
"I drink a soda to quench my thirst, then eat two apples."

Output:

Expected Output:

[SODA],[APPLE]

Code Integration:

import re

def extract_consumed_items(player_message):
    prompt = format_consumption_prompt(player_message)
    raw_output = llm.generate(prompt, temperature=0.2)

    # Parse bracketed items, e.g. "[SODA],[APPLE]" -> ["SODA", "APPLE"]
    items = re.findall(r'\[([^\]]+)\]', raw_output)

    # Drop the explicit empty marker so it is never treated as an item
    items = [item for item in items if item.upper() != 'NONE']

    # Convert to lowercase, check inventory
    for item in items:
        item_name = item.lower()
        if player_inventory.get(item_name, 0) > 0:
            player_inventory[item_name] -= 1
            apply_item_effect(item_name)
        else:
            notify_player(f"You don't have {item_name}")

    return items

Example 2: Location/Keyword Detection

Prompt:

Read the following text. Answer with only location names or places of interest mentioned, in brackets.

Rules:
- Output format: [LOCATION1],[LOCATION2]
- Write each location name exactly as mentioned, capitalized
- Include only explicitly named locations (not "there" or "here")
- No additional text, commentary, or explanation
- If no locations mentioned, output: [NONE]

Text to analyze:
"I leave the house and head to the office. On the way, I stop by the park."

Output:

Expected Output:

[HOUSE],[OFFICE],[PARK]

Code Integration:

import re

def update_location_keywords(chat_message):
    prompt = format_location_prompt(chat_message)
    raw_output = llm.generate(prompt, temperature=0.1)

    # Parse bracketed locations, skipping the explicit empty marker
    locations = [loc for loc in re.findall(r'\[([^\]]+)\]', raw_output)
                 if loc.upper() != 'NONE']

    # Add to setting keywords
    for location in locations:
        if location.lower() not in excluded_words:
            setting.add_keyword(location.lower())
            trigger_narrator_for_location(location)

    return locations

Example 3: Action Classification

Prompt:

Read the following text. Classify the action type with a single word in brackets.

Rules:
- Output format: [ACTION_TYPE]
- Valid types: [ATTACK],[DEFEND],[MOVE],[INTERACT],[SPEAK],[WAIT]
- Choose only ONE action type (the primary action)
- No additional text, commentary, or explanation
- If unclear or no action, output: [WAIT]

Text to analyze:
"I draw my sword and charge at the goblin!"

Output:

Expected Output:

[ATTACK]
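
Code Integration (a sketch in the same style as the earlier examples; format_action_prompt and the ambient llm object are hypothetical helpers):

import re

VALID_ACTIONS = {'ATTACK', 'DEFEND', 'MOVE', 'INTERACT', 'SPEAK', 'WAIT'}

def classify_action(player_message):
    prompt = format_action_prompt(player_message)
    raw_output = llm.generate(prompt, temperature=0.1)

    # Take the first bracketed token; fall back to WAIT on anything invalid,
    # mirroring the prompt's own "if unclear" rule
    match = re.search(r'\[([^\]]+)\]', raw_output)
    action = match.group(1).upper() if match else 'WAIT'
    return action if action in VALID_ACTIONS else 'WAIT'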

Example 4: Yes/No Decision

Prompt:

Read the following text. Determine if the character is interacting with the object "{{object_name}}".

Rules:
- Output format: [YES] or [NO]
- Answer [YES] only if explicit interaction with {{object_name}}
- Mere mention without interaction = [NO]
- No additional text, commentary, or explanation

Text to analyze:
{{player_message}}

Output:

Expected Output:

[YES]

or

[NO]

Code Integration:

def check_object_interaction(player_message, object_name):
    prompt = format_interaction_prompt(player_message, object_name)
    raw_output = llm.generate(prompt, temperature=0.1)
 
    if '[YES]' in raw_output.upper():
        trigger_object_event(object_name)
        return True
    return False

Format Markers: Why Brackets?

User-appl2613’s Explanation:

“You probably don’t need something like brackets but it just tells the LLM that you’re expecting a structured output vs. a creative one.”

Alternative Markers:

  • [ITEM] - Brackets (most common)
  • {ITEM} - Braces
  • <ITEM> - Angle brackets (risky: the model may treat them as HTML/XML tags)
  • |ITEM| - Pipes
  • ITEM:: - Colons (less reliable)
  • JSON format - {"item": "value"} (good for complex structures)

Why Brackets Work Best:

  1. Visual distinction: Clearly not natural prose
  2. Easy parsing: Simple regex: \[([^\]]+)\] (demonstrated in the sketch after this list)
  3. Convention: LLMs trained on code/data use brackets
  4. Unambiguous: Unlikely to appear naturally in text
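
A quick self-contained demonstration of that regex, using only the standard library. Note it tolerates extra prose around the brackets, which is the most common formatting failure:

import re

raw_output = "Sure! Here are the items: [SODA],[APPLE]"

# Extracts only the bracketed payloads, ignoring surrounding chatter
items = re.findall(r'\[([^\]]+)\]', raw_output)
print(items)  # ['SODA', 'APPLE']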

Effectiveness Notes

What Works Well

  • Deterministic parsing: Reliable extraction even from inconsistent LLMs
  • Model-agnostic: Works across GPT, Claude, local models
  • Low temperature effective: Temperature 0.1-0.3 gives consistent formatting
  • Clear instruction: Reduces hallucination by constraining creativity
  • Error handling: Can detect malformed output and retry
  • Production-proven: Used in ChatBot RPG for inventory, locations, actions

Known Limitations

  • Hallucination risk: Small models may invent items not in text
  • Context sensitivity: format adherence depends on supplied context; sparse prompts drift from the format more often
  • Ambiguity issues: LLM may interpret differently than intended
  • Model drift: Each model requires prompt tuning
  • Parsing brittleness: Extra text around brackets breaks simple parsers
  • Cost: Every format enforcement = an LLM call

Hallucination Warning

User-appl2613 (February 2024):

“this approach has a major weakness: hallucination, especially in small models. The more sensory-deprived the model is, the more it will hallucinate, because it has to (like our brains will in a sensory-depravation tank). Small models love to hallucinate”

Example of Hallucination:

Player input: "I eat my apple."

Expected: [APPLE]

Small model output: "[Post office] John eats his apple and then heads to the post office to pick up his package."

Mitigation:

  • Use lower temperature (0.1-0.2)
  • Provide more context (recent events, location, inventory); see the sketch after this list
  • Add explicit “only output what’s in the text” constraint
  • Use validation: check if extracted items make sense given context
  • Retry with rephrased prompt if output is malformed
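
A sketch of the "provide more context" mitigation, assuming simple string assembly; the location and inventory arguments stand in for whatever game state you track:

def format_consumption_prompt_with_context(player_message, location, inventory):
    # Grounding the model in known state gives it less room to invent
    context = (
        f"Current location: {location}\n"
        f"Inventory: {', '.join(inventory) or 'empty'}\n"
    )
    return (
        "Read the following text. Answer with only the singular form of "
        "consumed objects in brackets.\n\n"
        f"{context}\n"
        "Rules:\n"
        "- Output format: [ITEM1],[ITEM2]\n"
        "- Only output objects that appear in the text below\n"
        "- If no consumption occurs, output: [NONE]\n\n"
        f"Text to analyze:\n\"{player_message}\"\n\nOutput:"
    )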

Model Considerations

  • GPT-3.5/GPT-4: Very reliable, rarely breaks format
  • Local 7B-13B models: Need stricter instructions, more examples
  • Creative models (Hathor, etc.): May resist structured output, need firm constraints
  • Instruction-tuned models: Best for this task

Temperature Settings

Recommended Settings for Format Enforcement:

  • Temperature: 0.1-0.3 (lower = more consistent format)
  • Top-p: 0.85-0.9
  • Repetition Penalty: 1.0-1.1 (higher values can suppress legitimately repeated format markers)
  • Max Tokens: 50-100 (short output expected)

Note: Format enforcement is one of the few cases where temperature near 0 is beneficial.
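
Expressed as a reusable settings dict for the generic llm.generate interface used throughout this page (the parameter names are assumptions; adapt them to your backend):

FORMAT_ENFORCEMENT_PARAMS = {
    "temperature": 0.2,         # low: consistency over creativity
    "top_p": 0.9,
    "repetition_penalty": 1.0,  # don't punish repeated brackets/commas
    "max_tokens": 100,          # short, structured output expected
}

raw_output = llm.generate(prompt, **FORMAT_ENFORCEMENT_PARAMS)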

Variations

1. With Few-Shot Examples

Show correct format multiple times:

Read the text and extract consumed items in brackets.

Example 1:
Text: "I drink water and eat bread."
Output: [WATER],[BREAD]

Example 2:
Text: "She consumes two healing potions."
Output: [HEALING POTION]

Example 3:
Text: "He walks to the door."
Output: [NONE]

Now analyze:
Text: {{input_text}}
Output:
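
The shots can be assembled programmatically, which makes it easy to tune how many a given model needs (a sketch; FEW_SHOT_EXAMPLES is illustrative data, not part of the technique):

FEW_SHOT_EXAMPLES = [
    ('I drink water and eat bread.', '[WATER],[BREAD]'),
    ('She consumes two healing potions.', '[HEALING POTION]'),
    ('He walks to the door.', '[NONE]'),
]

def build_few_shot_prompt(input_text, examples=FEW_SHOT_EXAMPLES):
    parts = ["Read the text and extract consumed items in brackets.\n"]
    for i, (text, output) in enumerate(examples, 1):
        parts.append(f'Example {i}:\nText: "{text}"\nOutput: {output}\n')
    parts.append(f'Now analyze:\nText: {input_text}\nOutput:')
    return '\n'.join(parts)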

2. JSON Format (Complex Data)

For multi-field extraction:

Read the text and extract information as JSON only.

Format:
{
  "action": "verb describing action",
  "target": "object or character affected",
  "location": "where it occurs",
  "items_used": ["item1", "item2"]
}

Text: {{input_text}}

JSON:
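
Parsing this output needs a guard, because models sometimes wrap the JSON in prose. A sketch that isolates the outermost braces before decoding:

import json
import re

def parse_json_output(raw_output):
    # Grab the outermost brace pair in case the model added commentary
    match = re.search(r'\{.*\}', raw_output, re.DOTALL)
    if not match:
        return None
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return None  # caller can retry or fall back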

3. Multi-Stage Format

Extract multiple things in one pass:

Read the text and extract:

ITEMS: [item1],[item2]
LOCATIONS: [loc1],[loc2]
NPCS: [npc1],[npc2]
ACTION: [action_type]

Text: {{input_text}}

Output:
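
One regex per labeled line turns this into a dict in a single pass (a sketch; the field names match the prompt above):

import re

def parse_multi_stage(raw_output):
    result = {}
    for field in ('ITEMS', 'LOCATIONS', 'NPCS', 'ACTION'):
        # Find the line for this field, then collect its bracketed values
        line = re.search(rf'{field}:\s*(.+)', raw_output)
        values = re.findall(r'\[([^\]]+)\]', line.group(1)) if line else []
        result[field.lower()] = values
    return result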

4. Confidence Scoring

Include confidence with format:

Extract consumed items with confidence (0-10).

Format: [ITEM:confidence]

Example: [WATER:10],[MAYBE_BREAD:4]

Text: {{input_text}}

Output:
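
Parsing the [ITEM:confidence] pairs, with a threshold so low-confidence guesses can be dropped (a sketch; the threshold value is arbitrary):

import re

def parse_scored_items(raw_output, min_confidence=5):
    items = []
    for name, score in re.findall(r'\[([^\]:]+):(\d+)\]', raw_output):
        if int(score) >= min_confidence:
            items.append(name.strip())
    return items

# parse_scored_items("[WATER:10],[MAYBE_BREAD:4]") -> ['WATER']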

Validation and Error Handling

import re

class ParseError(Exception):
    """Raised when the LLM output contains no parseable format."""
    pass

def robust_format_extraction(text, max_retries=3):
    """
    Extract formatted data with retries and validation.
    """
    prompt = format_prompt(text)
    for attempt in range(max_retries):
        output = llm.generate(prompt, temperature=0.1)

        # Try to parse
        try:
            items = parse_bracketed_items(output)

            # Validate
            if validate_items(items, text):
                return items
            # Validation failed: carry the stricter prompt into the next attempt
            prompt = add_strictness(prompt, attempt)
        except ParseError:
            # Malformed output, retry
            continue

    # All retries failed, fall back to deterministic method
    return fallback_parser(text)

def parse_bracketed_items(output):
    """Extract items from [ITEM] format."""
    matches = re.findall(r'\[([^\]]+)\]', output)
    if not matches:
        raise ParseError("No bracketed items found")
    return [m.strip() for m in matches]

def validate_items(items, original_text):
    """Check if extracted items make sense."""
    if items == ['NONE']:
        return True

    # Check if items appear in original text
    text_lower = original_text.lower()
    for item in items:
        if item.lower() not in text_lower:
            # Item hallucinated, reject
            return False

    return True

Integration with Game Logic

import re

class FormatEnforcer:
    """Helper class for format-enforced LLM queries"""

    def __init__(self, llm, prompts):
        self.llm = llm
        self.prompts = prompts  # maps format_type -> template with {input_text}

    def extract_items(self, text, format_type='consumption'):
        prompt = self.prompts[format_type].format(input_text=text)
        output = self.llm.generate(prompt, temperature=0.2)
        return self.parse(output)

    def extract_yes_no(self, text, question):
        prompt = (
            f"{question}\n"
            "Answer only [YES] or [NO].\n"
            f"Text: {text}\n"
            "Output:"
        )
        output = self.llm.generate(prompt, temperature=0.1)
        return '[YES]' in output.upper()

    def extract_classification(self, text, categories):
        cat_list = ','.join(categories)
        prompt = (
            f"Classify the following text into one category: {cat_list}\n"
            "Output format: [CATEGORY]\n"
            f"Text: {text}\n"
            "Output:"
        )
        output = self.llm.generate(prompt, temperature=0.2)
        match = re.search(r'\[([^\]]+)\]', output)
        return match.group(1) if match else None

    def parse(self, output):
        return re.findall(r'\[([^\]]+)\]', output)

Testing Checklist

  • Test with clear, unambiguous input
  • Test with ambiguous input
  • Test with no relevant content (should return [NONE])
  • Test with multiple items
  • Test with duplicate items
  • Test for hallucination (items not in input)
  • Test parsing robustness (extra text before/after)
  • Test across different models
  • Test at different temperatures
  • Test with malformed input
  • Measure retry rate
  • Validate accuracy vs. ground truth (a harness sketch covering several of these checks follows)
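
A small harness covering several checklist items at once; the labeled cases and the format_prompt helper are hypothetical, but the compliance and hallucination metrics mirror the ones reported below:

import re

TEST_CASES = [
    ('I drink a soda to quench my thirst.', ['SODA']),
    ('I eat two apples.', ['APPLE']),    # duplicates collapse to one
    ('I walk to the door.', ['NONE']),   # no consumption occurs
]

def run_format_tests(llm, format_prompt):
    passed = hallucinated = 0
    for text, expected in TEST_CASES:
        output = llm.generate(format_prompt(text), temperature=0.2)
        items = re.findall(r'\[([^\]]+)\]', output)
        if items == expected:
            passed += 1
        # Hallucination check: an extracted item absent from the input text
        elif any(i.lower() not in text.lower() and i != 'NONE' for i in items):
            hallucinated += 1
    total = len(TEST_CASES)
    print(f"compliance: {passed}/{total}, hallucinated: {hallucinated}/{total}")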

Performance Metrics

From ChatBot RPG Production Use:

  • Format compliance: ~95% with GPT-3.5 at temp 0.2
  • Hallucination rate: ~5-10% with small local models, ~1% with GPT-3.5
  • Parse success rate: ~98% with bracketed format
  • Retry rate: ~2% need second attempt
  • Latency: 200-500ms per extraction (model-dependent)

Source

Original Discussion: Discord transcript lines 6550-6599 (February 10, 2024)

Context: User-appl2613 explaining ChatBot RPG's item consumption and location detection system

Key Quotes:

“You probably don’t need something like brackets but it just tells the LLM that you’re expecting a structured output vs. a creative one.”

“Then I’ll parse the results into: ‘soda’ and it checks if ‘soda’ is in the inventory. If it is, it subtracts 1 soda from the inventory and cures the player’s thirst.”

Warning on Hallucination:

“this approach has a major weakness: hallucination, especially in small models. Small models love to hallucinate… so it can take a fair bit of fanangling to get it to play nice”

Advanced: Function Calling vs Format Enforcement

Modern Alternative: OpenAI/Claude function calling

import json

# Instead of a format enforcement prompt, declare the schema
# (legacy OpenAI-style function calling; newer APIs expose the
# same idea through "tools"):
functions = [{
    "name": "extract_consumed_items",
    "parameters": {
        "type": "object",
        "properties": {
            "items": {
                "type": "array",
                "items": {"type": "string"}
            }
        }
    }
}]

response = llm.chat_completion(
    messages=[{"role": "user", "content": text}],
    functions=functions,
    function_call={"name": "extract_consumed_items"}
)

# function_call.arguments arrives as a JSON string, not a dict
items = json.loads(response.function_call.arguments)["items"]

Pros of Function Calling:

  • More reliable structure
  • Native support
  • Better for complex schemas

Pros of Format Enforcement:

  • Works with any model
  • Simpler implementation
  • More control over exact format
  • No special API features needed

Recommendation: Use function calling if available, format enforcement otherwise.
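
That recommendation can be encoded as a simple dispatch, falling back to the bracket technique when the backend has no native structured output. This is a sketch: supports_function_calling is a hypothetical capability flag, and both extract_via_* helpers stand in for the approaches described above:

def extract_items_any_backend(llm, text):
    if getattr(llm, 'supports_function_calling', False):
        # Native structured output: schema-constrained, most reliable
        return extract_via_function_call(llm, text)
    # Fallback: bracket-based format enforcement works with any model
    return extract_via_format_enforcement(llm, text)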