Language and Phrasing
The linguistic patterns that produce reliable agents — plain English, active voice, concrete-over-abstract, numbered lists, and the multilingual decisions you have to make explicitly.
Why The Way You Phrase Matters
Two instruction sets can describe the exact same workflow and produce wildly different agent behavior, just because of how they are worded. This page is about the linguistic patterns that produce reliable agents — and the phrasings that quietly cause problems.
This is the page to read when you have already written your instructions but the agent is doing something subtly off and you cannot figure out why. The fix is often in the language, not the logic.
Use Plain English, Not Marketing Voice
Models follow plain, declarative English better than they follow polished marketing copy. Compare:
Polished but vague:
"Leverage your customer service expertise to deliver world-class support experiences with empathy and care."
Plain and operational:
"Read each ticket carefully. Acknowledge the customer's frustration in one sentence. Then answer their question directly using a help centre article if one exists."
The first sentence sounds nicer. The second tells the model what to actually do. Reserve marketing voice for marketing pages — agent instructions should read like an internal SOP, not a brand statement.
Active Voice Over Passive
"The ticket should be reviewed" is weaker than "Review the ticket." Active voice with imperative verbs makes the action explicit and the agent the subject performing it.
Apply this everywhere:
- "The reply should mention the order ID" → "Mention the order ID in the reply."
- "Confirmation must be received before posting" → "Wait for human confirmation before posting."
- "It is recommended that the agent escalates" → "Escalate this kind of ticket."
Be Concrete, Not Abstract
Models handle concrete instructions much better than abstract principles. Replace vague concepts with specific behaviors.
- "Be concise" → "Keep replies to 2-3 paragraphs unless the question requires more depth."
- "Be friendly" → "Address the customer by their first name. Use contractions (you're, we'll). Avoid corporate jargon."
- "Handle escalations appropriately" → "If the customer mentions a dispute, refund, or legal threat, add an internal note tagging `billing-review` and stop."
- "Be careful with sensitive information" → "Never include the customer's full credit card number, password, or full address in any reply or note."
Numbered Lists Beat Prose For Steps
If the agent should follow a sequence, use a numbered list. The model treats numbers as explicit ordering, while prose suggests-but-does-not-enforce sequencing.
Prose (weak):
"You should read the ticket first, then look up the customer, and finally draft a reply that addresses their issue and includes any relevant order details."
Numbered (strong):
"For every ticket:
1. Read the ticket using `zendesk_get_ticket`.
2. Look up the customer with `zendesk_search_users`.
3. Look up the order with `shopify_search_orders`.
4. Draft a reply that addresses the issue and includes order details."
The numbered version is more reliable. Models almost always execute step 1 before step 2 when the steps are numbered. With prose, they sometimes skip ahead.
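If you assemble the system prompt programmatically, the numbering can be generated rather than hand-maintained. A minimal sketch (the step text and tool names are the ones from the example above; the helper name is our own):

```python
# Steps for the ticket workflow, in the order they must run.
STEPS = [
    "Read the ticket using `zendesk_get_ticket`.",
    "Look up the customer with `zendesk_search_users`.",
    "Look up the order with `shopify_search_orders`.",
    "Draft a reply that addresses the issue and includes order details.",
]

def numbered_steps(steps):
    """Render steps as an explicitly numbered list for the system prompt."""
    return "\n".join(f"{i}. {step}" for i, step in enumerate(steps, start=1))

prompt_section = "For every ticket:\n" + numbered_steps(STEPS)
```

Generating the numbers this way also means inserting a step later cannot leave the list misnumbered.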
Negative Rules Need To Be Explicit
"Do not promise refunds" is much stronger than "be careful about refunds." Vague guidance becomes wishful thinking; explicit negation becomes a hard constraint.
The strongest pattern: state the rule and the consequence:
"Never quote pricing. Refer customers to acme.com/pricing instead. Quoting an outdated price would create a contract dispute, so this rule is absolute."
The "consequence" sentence is doing real work. Models reason about why a rule exists and apply it more carefully when they understand the cost of breaking it.
Examples Help, But Pair Them With Rules
Showing the agent two example replies is helpful, but examples alone do not generalise. The model may pattern-match the surface details of the examples without internalising the underlying rule.
The reliable pattern is rule + example, not example alone:
"Rule: Open every reply with a one-sentence acknowledgment of the customer's situation before answering the question.
Example: If the customer wrote 'My order is late and I'm frustrated,' open with 'I understand how frustrating it is when an order doesn't arrive on time.' Then proceed with the lookup and answer."
The example illustrates the rule. The rule is what the model generalises from.
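One way to keep rules and examples paired is to store them together, so an example can never drift into the prompt without its rule. A sketch, assuming you build the instruction set from structured data (the class name is our own):

```python
from dataclasses import dataclass

@dataclass
class Guideline:
    rule: str         # the generalisable instruction
    example: str = "" # one concrete illustration, optional

    def render(self) -> str:
        """Render as 'Rule: ...' followed by 'Example: ...' when present."""
        text = f"Rule: {self.rule}"
        if self.example:
            text += f"\nExample: {self.example}"
        return text
```

The rule field is mandatory and the example optional, which encodes the point above: examples illustrate, rules generalise.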
Code Identifiers Stay In Code Format
Tool names, field names, status values, IDs — anything that matches verbatim against an external system — should be in code format (backticks in Markdown, or visually distinguished from prose). This signals to the model that the string is meaningful and should not be paraphrased.
"Update the priority to `urgent`" is more reliable than "update the priority to urgent." The model treats the first as a literal value to pass; the second it sometimes paraphrases.
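If instructions are assembled from templates, this formatting can be enforced mechanically. A sketch that wraps known identifiers in backticks before the prompt is sent (the identifier list here is hypothetical; populate it with your own tool names and status values):

```python
import re

# Verbatim strings that must survive into the external system unchanged.
IDENTIFIERS = ["urgent", "zendesk_get_ticket", "billing-review"]

def backtick_identifiers(text, identifiers=IDENTIFIERS):
    """Wrap each known identifier in backticks unless it already is."""
    for ident in identifiers:
        # Match the bare identifier only: not already inside backticks,
        # and not as a fragment of a longer word.
        pattern = rf"(?<!`)\b{re.escape(ident)}\b(?!`)"
        text = re.sub(pattern, f"`{ident}`", text)
    return text
```

Running this as a final pass over the instruction text catches the cases a human editor misses.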
Ambiguity Is The Enemy
Read your instructions and look for words that could mean two things. Common offenders:
- "Quickly" — how quickly? Within an hour? Within the same conversation? After how many tool calls?
- "Recent" — last 24 hours? Last week? Since the last status change?
- "Important customers" — defined how? Tier? Lifetime value? Specific tag?
- "Standard reply" — what does standard mean here? Show an example or define it.
- "Appropriate tone" — appropriate by whose standard?
Replace each ambiguous word with a concrete definition. If you cannot define it concretely, the model cannot apply it consistently.
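This check lends itself to a simple lint pass. A sketch that flags the offenders from the list above so a human can replace them with concrete definitions (the word list is a starting point, not exhaustive):

```python
import re

# Words that usually need a concrete definition before an agent can apply them.
AMBIGUOUS = ["quickly", "recent", "important", "standard", "appropriate"]

def lint_ambiguity(instructions):
    """Return the ambiguous words found in the instruction text."""
    found = []
    for word in AMBIGUOUS:
        if re.search(rf"\b{word}\b", instructions, re.IGNORECASE):
            found.append(word)
    return found
```

An empty result does not prove the instructions are unambiguous, but a non-empty one reliably points at a sentence worth rewriting.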
Multilingual Considerations
If your customer base is multilingual, decide explicitly how the agent should handle non-English tickets. Three workable patterns:
Pattern A: Match the customer's language. "Reply in the same language the customer wrote in. If you are not confident in your fluency in the customer's language, escalate instead."
Pattern B: English only, escalate the rest. "Only reply in English. If the customer wrote in any other language, add an internal note tagging the multilingual team and stop."
Pattern C: Specific languages whitelisted. "Reply in English, Spanish, French, or German based on the customer's language. For any other language, escalate."
The wrong pattern is leaving it implicit. An agent given no guidance will sometimes attempt languages it should not, with results that range from awkward to embarrassing.
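Pattern C is the easiest to make explicit in code, since the allowlist is just a set. A sketch, assuming you already have the ticket's detected language as an ISO code (how you detect it, via a library or the model's own classification, is up to you):

```python
# Pattern C: reply only in allowlisted languages, escalate everything else.
SUPPORTED = {"en", "es", "fr", "de"}

def routing_decision(ticket_language: str) -> str:
    """Decide whether the agent replies or escalates for this language."""
    if ticket_language in SUPPORTED:
        return f"reply in {ticket_language}"
    return "escalate to multilingual team"
```

The point is that the branch exists at all: the decision is made once, explicitly, instead of being improvised per ticket.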
Match The Model's Reading Level
Write instructions at the same level of formality and complexity you would use briefing a competent human colleague. Models do not benefit from being talked down to ("simple instructions for the AI") or talked up to ("sophisticated multi-stage cognitive workflow"). Plain professional English works best across all current models.
One exception: when writing for mini models, slightly more structure helps. Numbered steps, explicit branches, and shorter sentences reduce the cognitive load and improve compliance. Mini models are not less smart, but they have less slack for ambiguity.
Test The Phrasing By Reading It Aloud
The fastest way to catch bad phrasing is to read your instructions aloud, slowly, as if you were the agent receiving them. Anywhere you stumble, mumble, or hesitate is a place the model will too. Anywhere you find yourself adding mental clarification ("well, what they mean here is...") is a place the model will improvise to fill the gap.
Edit until you can read the instructions aloud and they sound like clear, specific, actionable guidance. That phrasing is what produces reliable agents.
One More Pass: Remove Instructions That Restate Defaults
Once you have a draft, look for instructions that are telling the model to do things it would do anyway. "Be helpful." "Try to be accurate." "Use proper grammar." All of these are wasted tokens — the model defaults to all three.
Keep only the instructions that meaningfully shape behavior away from the model's default. The shorter and more pointed your instruction set, the more weight each instruction carries.
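This pass can also be partially automated. A sketch that drops instruction lines matching a list of known default restatements (the list here contains only the three examples above; extend it with whatever filler recurs in your drafts):

```python
# Phrases that restate model defaults and carry no weight.
DEFAULT_RESTATEMENTS = {"be helpful", "try to be accurate", "use proper grammar"}

def strip_defaults(instructions):
    """Drop instruction lines that only restate a default behaviour."""
    kept = []
    for line in instructions:
        if line.strip().rstrip(".").lower() in DEFAULT_RESTATEMENTS:
            continue
        kept.append(line)
    return kept
```

Only exact restatements are caught; the judgment call of whether a longer instruction meaningfully shapes behavior still belongs to the editor.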