Prompt: Format Enforcement (Structured Output Control)
Metadata
- Category: Constraint
- Technique: Explicit format instruction with bracketed output
- Model Tested: GPT-3.5, GPT-4, various local models
- Contributor: User-appl2613 (monkeyrithms)
- Status: Proven (production in ChatBot RPG)
- Implementation: ChatBot RPG
Purpose
Force the LLM to output in a specific, parseable format (typically bracketed, comma-separated values) that can be reliably extracted and processed by code. This technique turns the LLM from a creative writer into a structured data extractor, enabling deterministic parsing and game logic.
Core Principle: Tell the LLM you’re expecting structured output, not creative output.
Template
Read the following text. Answer with only {{format_description}}.
Rules:
- Output format: {{format_example}}
- {{item_format_rule}}
- {{multiplicity_rule}}
- No additional text, commentary, or explanation
- If {{nothing_found_condition}}, output: {{empty_format}}
Text to analyze:
{{input_text}}
Output:
Variables
| Variable | Description | Example |
|---|---|---|
| {{format_description}} | What to output | "the singular form of consumed objects in brackets" |
| {{format_example}} | Example of correct format | "[WATER],[BERRY JUICE]" |
| {{item_format_rule}} | How to format individual items | "Write in singular form, uppercase" |
| {{multiplicity_rule}} | How to handle duplicates | "If multiple of the same item, write it once only" |
| {{nothing_found_condition}} | When to return empty | "no objects are found", "character performs no action" |
| {{empty_format}} | What to return if nothing found | "[NONE]", "[]", "NONE" |
| {{input_text}} | The text to extract from | Player message, chat log, narration |
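To make the template concrete, here is a minimal sketch of rendering it in Python. The `FORMAT_TEMPLATE` and `build_format_prompt` names are illustrative, not part of ChatBot RPG, and the doc's `{{var}}` placeholders become single-brace `{var}` fields for `str.format`:

```python
# Illustrative sketch: rendering the template above with str.format().
# FORMAT_TEMPLATE and build_format_prompt are hypothetical names.
FORMAT_TEMPLATE = """Read the following text. Answer with only {format_description}.
Rules:
- Output format: {format_example}
- {item_format_rule}
- {multiplicity_rule}
- No additional text, commentary, or explanation
- If {nothing_found_condition}, output: {empty_format}
Text to analyze:
{input_text}
Output:"""

def build_format_prompt(input_text):
    return FORMAT_TEMPLATE.format(
        format_description="the singular form of consumed objects in brackets",
        format_example="[ITEM1],[ITEM2]",
        item_format_rule="Write each item in singular form, uppercase",
        multiplicity_rule="If multiple of the same item, write it once only",
        nothing_found_condition="no consumption occurs",
        empty_format="[NONE]",
        input_text=input_text,
    )
```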
Usage Examples
Example 1: Item Consumption Detection
Prompt:
Read the following text. Answer with only the singular form of the object or objects that the character is consuming in brackets.
Rules:
- Output format: [ITEM1],[ITEM2],[ITEM3]
- Write each item in singular form, uppercase
- If multiple of the same item, write it once only (e.g., "two apples" = [APPLE])
- No additional text, commentary, or explanation
- If no consumption occurs, output: [NONE]
Text to analyze:
"I drink a soda to quench my thirst, then eat two apples."
Output:
Expected Output:
[SODA],[APPLE]
Code Integration:
import re

def extract_consumed_items(player_message):
    prompt = format_consumption_prompt(player_message)
    raw_output = llm.generate(prompt, temperature=0.2)
    # Parse bracketed items
    items = re.findall(r'\[([^\]]+)\]', raw_output)
    # Convert to lowercase, check inventory
    for item in items:
        item_name = item.lower()
        if item_name in player_inventory:
            player_inventory[item_name] -= 1
            apply_item_effect(item_name)
        else:
            notify_player(f"You don't have {item_name}")
    return items
Example 2: Location/Keyword Detection
Prompt:
Read the following text. Answer with only location names or places of interest mentioned, in brackets.
Rules:
- Output format: [LOCATION1],[LOCATION2]
- Write each location name exactly as mentioned, capitalized
- Include only explicitly named locations (not "there" or "here")
- No additional text, commentary, or explanation
- If no locations mentioned, output: [NONE]
Text to analyze:
"I leave the house and head to the office. On the way, I stop by the park."
Output:
Expected Output:
[HOUSE],[OFFICE],[PARK]
Code Integration:
import re

def update_location_keywords(chat_message):
    prompt = format_location_prompt(chat_message)
    raw_output = llm.generate(prompt, temperature=0.1)
    # Parse locations
    locations = re.findall(r'\[([^\]]+)\]', raw_output)
    # Add to setting keywords
    for location in locations:
        if location.lower() not in excluded_words:
            setting.add_keyword(location.lower())
            trigger_narrator_for_location(location)
    return locations
Example 3: Action Classification
Prompt:
Read the following text. Classify the action type with a single word in brackets.
Rules:
- Output format: [ACTION_TYPE]
- Valid types: [ATTACK],[DEFEND],[MOVE],[INTERACT],[SPEAK],[WAIT]
- Choose only ONE action type (the primary action)
- No additional text, commentary, or explanation
- If unclear or no action, output: [WAIT]
Text to analyze:
"I draw my sword and charge at the goblin!"
Output:
Expected Output:
[ATTACK]
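Code Integration (a sketch in the style of the other examples; `format_action_prompt` and `llm` are assumed helpers, as above):

```python
import re

VALID_ACTIONS = {"ATTACK", "DEFEND", "MOVE", "INTERACT", "SPEAK", "WAIT"}

def classify_action(player_message):
    prompt = format_action_prompt(player_message)  # assumed: builds the prompt above
    raw_output = llm.generate(prompt, temperature=0.1)
    match = re.search(r'\[([^\]]+)\]', raw_output)
    action = match.group(1).upper() if match else "WAIT"
    # Fall back to WAIT if the model invents a category outside the valid set
    return action if action in VALID_ACTIONS else "WAIT"
```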
Example 4: Yes/No Decision
Prompt:
Read the following text. Determine if the character is interacting with the object "{{object_name}}".
Rules:
- Output format: [YES] or [NO]
- Answer [YES] only if explicit interaction with {{object_name}}
- Mere mention without interaction = [NO]
- No additional text, commentary, or explanation
Text to analyze:
{{player_message}}
Output:
Expected Output:
[YES]
or
[NO]
Code Integration:
def check_object_interaction(player_message, object_name):
    prompt = format_interaction_prompt(player_message, object_name)
    raw_output = llm.generate(prompt, temperature=0.1)
    if '[YES]' in raw_output.upper():
        trigger_object_event(object_name)
        return True
    return False
Format Markers: Why Brackets?
User-appl2613’s Explanation:
“You probably don’t need something like brackets but it just tells the LLM that you’re expecting a structured output vs. a creative one.”
Alternative Markers:
- [ITEM] - Brackets (most common)
- {ITEM} - Braces
- <ITEM> - Angle brackets (be careful, may trigger HTML thinking)
- |ITEM| - Pipes
- ITEM:: - Colons (less reliable)
- JSON format - {"item": "value"} (good for complex structures)
Why Brackets Work Best:
- Visual distinction: Clearly not natural prose
- Easy parsing: Simple regex: \[([^\]]+)\] (demo below)
- Convention: LLMs trained on code/data use brackets
- Unambiguous: Unlikely to appear naturally in text
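The extraction regex is also robust to the chatty preamble smaller models sometimes add, as a quick demo shows:

```python
import re

# Even with stray prose around the brackets, extraction still works
raw_output = "Sure! Here are the items: [SODA],[APPLE]"
print(re.findall(r'\[([^\]]+)\]', raw_output))  # ['SODA', 'APPLE']
```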
Effectiveness Notes
What Works Well
- Deterministic parsing: Reliable extraction even from inconsistent LLMs
- Model-agnostic: Works across GPT, Claude, local models
- Low temperature effective: Temperature 0.1-0.3 gives consistent formatting
- Clear instruction: Reduces hallucination by constraining creativity
- Error handling: Can detect malformed output and retry
- Production-proven: Used in ChatBot RPG for inventory, locations, actions
Known Limitations
- Hallucination risk: Small models may invent items not in text
- Context sensitivity: More context = more likely to follow format
- Ambiguity issues: LLM may interpret differently than intended
- Model drift: Each model requires prompt tuning
- Parsing brittleness: Extra text around brackets breaks simple parsers
- Cost: Every format enforcement = an LLM call
Hallucination Warning
User-appl2613 (February 2024):
“this approach has a major weakness: hallucination, especially in small models. The more sensory-deprived the model is, the more it will hallucinate, because it has to (like our brains will in a sensory-depravation tank). Small models love to hallucinate”
Example of Hallucination:
Player input: "I eat my apple."
Expected: [APPLE]
Small model output: "[Post office] John eats his apple and then heads to the post office to pick up his package."
Mitigation:
- Use lower temperature (0.1-0.2)
- Provide more context (recent events, location, inventory)
- Add explicit “only output what’s in the text” constraint
- Use validation: check if extracted items make sense given context
- Retry with a rephrased or stricter prompt if output is malformed (see the sketch below)
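One possible shape for that retry escalation (used as `add_strictness` in the validation code later in this document; this body is an illustrative guess, not the production version):

```python
def add_strictness(prompt, attempt):
    """Append progressively firmer constraints on each retry (illustrative sketch)."""
    reminders = [
        "\nIMPORTANT: Output ONLY bracketed items. No other text.",
        "\nIMPORTANT: Only include items that literally appear in the text above.",
    ]
    # attempt 0 adds one reminder, later attempts add both
    return prompt + "".join(reminders[:attempt + 1])
```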
Model Considerations
- GPT-3.5/GPT-4: Very reliable, rarely breaks format
- Local 7B-13B models: Need stricter instructions, more examples
- Creative models (Hathor, etc.): May resist structured output, need firm constraints
- Instruction-tuned models: Best for this task
Temperature Settings
Recommended Settings for Format Enforcement:
- Temperature: 0.1-0.3 (lower = more consistent format)
- Top-p: 0.85-0.9
- Repetition Penalty: 1.0-1.1 (don’t want to suppress correct repeated format)
- Max Tokens: 50-100 (short output expected)
Note: Format enforcement is one of the few cases where temperature near 0 is beneficial.
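As a sketch, a wrapper applying these settings might look like the following. Parameter names vary by backend: `repetition_penalty` appears in most local-model APIs, while OpenAI's API uses `frequency_penalty` instead.

```python
def generate_formatted(llm, prompt):
    """Call the backend with the settings recommended above (sketch;
    parameter names follow common local-model APIs and may differ per backend)."""
    return llm.generate(
        prompt,
        temperature=0.2,          # low temperature for consistent formatting
        top_p=0.9,
        repetition_penalty=1.05,  # mild; don't suppress the repeated bracket pattern
        max_tokens=100,           # short output expected
    )
```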
Related Prompts
- Anti-Hallucination - Preventing invented data
- Length Limiting - Controlling output size
- Chain of Thought - Step-by-step extraction
- Few-Shot Examples - Teaching format by example
Variations
1. With Few-Shot Examples
Show correct format multiple times:
Read the text and extract consumed items in brackets.
Example 1:
Text: "I drink water and eat bread."
Output: [WATER],[BREAD]
Example 2:
Text: "She consumes two healing potions."
Output: [HEALING POTION]
Example 3:
Text: "He walks to the door."
Output: [NONE]
Now analyze:
Text: {{input_text}}
Output:
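A sketch of assembling that few-shot prompt from (text, output) pairs, so examples can be added without editing a string literal (all names here are illustrative):

```python
# Illustrative sketch: building the few-shot prompt programmatically
FEW_SHOT_EXAMPLES = [
    ("I drink water and eat bread.", "[WATER],[BREAD]"),
    ("She consumes two healing potions.", "[HEALING POTION]"),
    ("He walks to the door.", "[NONE]"),
]

def build_few_shot_prompt(input_text):
    lines = ["Read the text and extract consumed items in brackets.", ""]
    for i, (text, output) in enumerate(FEW_SHOT_EXAMPLES, 1):
        lines += [f"Example {i}:", f'Text: "{text}"', f"Output: {output}", ""]
    lines += ["Now analyze:", f"Text: {input_text}", "Output:"]
    return "\n".join(lines)
```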
2. JSON Format (Complex Data)
For multi-field extraction:
Read the text and extract information as JSON only.
Format:
{
"action": "verb describing action",
"target": "object or character affected",
"location": "where it occurs",
"items_used": ["item1", "item2"]
}
Text: {{input_text}}
JSON:
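Parsing the JSON variation needs slightly more care than the bracket format, since models often wrap JSON in markdown fences or prose; a tolerant sketch:

```python
import json
import re

def parse_json_output(raw_output):
    """Parse the JSON variation, tolerating fences and stray prose (sketch).
    Returns None on failure so the caller can retry."""
    # Grab the outermost {...} block in case the model adds surrounding text
    match = re.search(r'\{.*\}', raw_output, re.DOTALL)
    if not match:
        return None
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
```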
3. Multi-Stage Format
Extract multiple things in one pass:
Read the text and extract:
ITEMS: [item1],[item2]
LOCATIONS: [loc1],[loc2]
NPCS: [npc1],[npc2]
ACTION: [action_type]
Text: {{input_text}}
Output:
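A sketch of splitting the multi-stage output back into per-label lists (assumes each label sits on its own line, as in the format above):

```python
import re

def parse_multi_stage(raw_output):
    """Split the labeled multi-stage output into one list per label (sketch)."""
    result = {}
    for label in ("ITEMS", "LOCATIONS", "NPCS", "ACTION"):
        line = re.search(rf'{label}:\s*(.*)', raw_output)
        result[label.lower()] = (
            re.findall(r'\[([^\]]+)\]', line.group(1)) if line else []
        )
    return result
```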
4. Confidence Scoring
Include confidence with format:
Extract consumed items with confidence (0-10).
Format: [ITEM:confidence]
Example: [WATER:10],[MAYBE_BREAD:4]
Text: {{input_text}}
Output:
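A sketch of parsing the [ITEM:confidence] pairs and filtering on a threshold:

```python
import re

def parse_confident_items(raw_output, min_confidence=5):
    """Keep only items at or above the confidence threshold (sketch)."""
    items = []
    for name, score in re.findall(r'\[([^:\]]+):(\d+)\]', raw_output):
        if int(score) >= min_confidence:
            items.append(name.strip())
    return items

# parse_confident_items("[WATER:10],[MAYBE_BREAD:4]") -> ['WATER']
```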
Validation and Error Handling
import re

class ParseError(Exception):
    """Raised when the LLM output contains no parseable format."""

def robust_format_extraction(text, max_retries=3):
    """
    Extract formatted data with retries and validation.
    """
    prompt = format_prompt(text)
    for attempt in range(max_retries):
        output = llm.generate(prompt, temperature=0.1)
        # Try to parse
        try:
            items = parse_bracketed_items(output)
            # Validate
            if validate_items(items, text):
                return items
            # Retry with a stricter prompt (carried into the next iteration)
            prompt = add_strictness(prompt, attempt)
        except ParseError:
            # Malformed output, retry
            continue
    # All retries failed, fall back to deterministic method
    return fallback_parser(text)

def parse_bracketed_items(output):
    """Extract items from [ITEM] format"""
    matches = re.findall(r'\[([^\]]+)\]', output)
    if not matches:
        raise ParseError("No bracketed items found")
    return [m.strip() for m in matches]

def validate_items(items, original_text):
    """Check if extracted items make sense"""
    if items == ['NONE']:
        return True
    # Check if items appear in original text
    text_lower = original_text.lower()
    for item in items:
        if item.lower() not in text_lower:
            # Item hallucinated, reject
            return False
    return True
Integration with Game Logic
import re

class FormatEnforcer:
    """Helper class for format-enforced LLM queries"""
    def __init__(self, llm, prompts=None):
        self.llm = llm
        # Map of format_type -> prompt template with an {input_text} slot
        self.prompts = prompts or {}

    def extract_items(self, text, format_type='consumption'):
        prompt = self.prompts[format_type].format(input_text=text)
        output = self.llm.generate(prompt, temperature=0.2)
        return self.parse(output)

    def extract_yes_no(self, text, question):
        prompt = f"""
{question}
Answer only [YES] or [NO].
Text: {text}
Output:
"""
        output = self.llm.generate(prompt, temperature=0.1)
        return '[YES]' in output.upper()

    def extract_classification(self, text, categories):
        cat_list = ','.join(categories)
        prompt = f"""
Classify the following text into one category: {cat_list}
Output format: [CATEGORY]
Text: {text}
Output:
"""
        output = self.llm.generate(prompt, temperature=0.2)
        match = re.search(r'\[([^\]]+)\]', output)
        return match.group(1) if match else None

    def parse(self, output):
        return re.findall(r'\[([^\]]+)\]', output)
Testing Checklist
- Test with clear, unambiguous input
- Test with ambiguous input
- Test with no relevant content (should return [NONE])
- Test with multiple items
- Test with duplicate items
- Test for hallucination (items not in input)
- Test parsing robustness (extra text before/after)
- Test across different models
- Test at different temperatures
- Test with malformed input
- Measure retry rate
- Validate accuracy vs. ground truth
Performance Metrics
From ChatBot RPG Production Use:
- Format compliance: ~95% with GPT-3.5 at temp 0.2
- Hallucination rate: ~5-10% with small local models, ~1% with GPT-3.5
- Parse success rate: ~98% with bracketed format
- Retry rate: ~2% need second attempt
- Latency: 200-500ms per extraction (model-dependent)
Source
Original Discussion: Discord transcript lines 6550-6599 (February 10, 2024)
Context: User-appl2613 explaining ChatBot RPG's item consumption and location detection system
Key Quotes:
“You probably don’t need something like brackets but it just tells the LLM that you’re expecting a structured output vs. a creative one.”
“Then I’ll parse the results into: ‘soda’ and it checks if ‘soda’ is in the inventory. If it is, it subtracts 1 soda from the inventory and cures the player’s thirst.”
Warning on Hallucination:
“this approach has a major weakness: hallucination, especially in small models. Small models love to hallucinate… so it can take a fair bit of fanangling to get it to play nice”
Related Documentation
- 02-Prompt-Engineering - Broader prompting context
- User-appl2613 - Creator’s other contributions
- 01-Architecture-and-Design - Where parsing fits in system design
Advanced: Function Calling vs Format Enforcement
Modern Alternative: OpenAI/Claude function calling
import json

# Instead of format enforcement prompt:
functions = [{
    "name": "extract_consumed_items",
    "parameters": {
        "type": "object",
        "properties": {
            "items": {
                "type": "array",
                "items": {"type": "string"}
            }
        }
    }
}]

response = llm.chat_completion(
    messages=[{"role": "user", "content": text}],
    functions=functions,
    function_call={"name": "extract_consumed_items"}
)
# In OpenAI's API, function_call.arguments arrives as a JSON string
items = json.loads(response.function_call.arguments)['items']
Pros of Function Calling:
- More reliable structure
- Native support
- Better for complex schemas
Pros of Format Enforcement:
- Works with any model
- Simpler implementation
- More control over exact format
- No special API features needed
Recommendation: Use function calling if available, format enforcement otherwise.
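That recommendation can be wired up as a simple dispatch; both helpers here are hypothetical stand-ins for the two approaches documented above:

```python
def extract_items(llm_client, text):
    """Prefer native structured output when the backend supports it,
    otherwise fall back to bracket-format enforcement (illustrative sketch)."""
    if getattr(llm_client, "supports_function_calling", False):  # assumed flag
        return extract_via_function_call(llm_client, text)   # hypothetical helper
    return extract_via_bracket_prompt(llm_client, text)      # hypothetical helper
```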