Studies — AI Analysis Across Your Records

Run an AI analysis across thousands of records — Zendesk tickets, support history, and more. Define the columns you want, get a structured results grid, export to CSV, or push back as a knowledge source.

What Studies are

Studies — also called AI Analysis in the sidebar — let you run an AI analysis across a list of records and get structured results back. Instead of opening one ticket at a time and asking an agent for an answer, you point a Study at a set of records, define the columns you want filled in, and Macha analyses every record in parallel and writes the answers back into the grid.

The end result is a spreadsheet-shaped grid of insights: one row per record, one column per field you defined. You can browse it, filter it, export it to CSV, or push it back into Macha as a knowledge source for your agents to search.

Studies are the right tool when you have a question that needs an answer across many records — auditing the last 5,000 tickets for refund mentions, classifying support volume by root cause, finding every ticket where the customer raised a billing concern. Doing that by hand is hours of work; doing it through chat would mean opening a conversation per record. A Study runs the same extraction over every record at once.

How a Study is shaped

Every Study has four parts:

An input source — where the records come from. Today that's Zendesk tickets matched by a search query and an optional date range. More record sources are coming.
Input fields — the parts of each record you want the AI to see. Picking only what's necessary keeps the prompt small and costs predictable.
A schema — the columns you want filled in. Each column has a type (boolean, single select, multi select, number, short text, long text), a label, and optional guidance for the AI on how to fill it.
A model — which LLM does the extraction. The model you pick sets the credit cost per record.

The model reads the input fields you selected, follows your instructions and per-column guidance, and returns a value for each column. One record in, one row out.
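As a rough mental model, the four parts can be written down as a small data structure. This is only a sketch: the class names, field names, and the "mini" model identifier are hypothetical and are not Macha's actual configuration format.

```python
from dataclasses import dataclass, field
from typing import List, Literal, Optional

# The six answer types described in this article (identifiers are assumed).
ColumnType = Literal[
    "boolean", "single_select", "multi_select", "number", "short_text", "long_text"
]


@dataclass
class Column:
    label: str
    type: ColumnType
    guidance: str = ""                                  # optional per-column instructions
    options: List[str] = field(default_factory=list)    # only used by select types


@dataclass
class InputSource:
    kind: str                        # e.g. "zendesk_tickets" (hypothetical identifier)
    query: str = ""
    date_from: Optional[str] = None  # ISO date, e.g. "2024-01-01"
    date_to: Optional[str] = None


@dataclass
class Study:
    name: str
    source: InputSource
    input_fields: List[str]
    schema: List[Column]
    model: str


study = Study(
    name="Refund audit",
    source=InputSource("zendesk_tickets", query="tags:refund",
                       date_from="2024-01-01", date_to="2024-03-31"),
    input_fields=["subject", "description"],
    schema=[
        Column("Refund requested", "boolean",
               guidance="True only if the customer explicitly asks for a refund."),
        Column("Root cause", "single_select",
               options=["billing", "shipping", "product", "other"]),
    ],
    model="mini",  # placeholder model name
)
```

Each record then produces one row with one value per entry in `schema`.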

Creating a Study

Studies live in the sidebar under their own section. To create one:

Open the Studies page and click New Study.
Give it a name and an optional description.
Pick the input source — for example, Zendesk Tickets.
If the source needs a connector, pick the connected instance. When an account has several instances connected (e.g. two Zendesk accounts), each one shows up separately.
Configure the input — add a search query and/or a date range to scope the records you want.
Pick the input fields you want the AI to read.
Define the schema — add a column for each piece of information you want extracted.
Choose a model and save.

You don't have to run the Study right away. It's a saved configuration you can re-run any time — useful for monthly audits or recurring classification.

The input source: Zendesk Tickets

The first input source available is Zendesk Tickets. It pulls tickets that match a Zendesk search query, optionally bounded by a date range.

Configuring the query

Search query — any Zendesk search expression (the same syntax you'd use in the Zendesk search bar). Examples: status:open priority:urgent, tags:refund, brand_id:1234 form:billing. Leave it empty to match all tickets in the date range.
From / To — restrict to tickets created in a date range. Useful for "last quarter" or "since the new policy went live" audits.
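To see how a search query and a date range could combine into one search expression, here is a minimal sketch. It assumes the range is applied with Zendesk's `created>=` and `created<=` search operators and that results are limited with `type:ticket`; how Macha actually composes the final query is not documented here, and `build_search_query` is a hypothetical helper.

```python
from datetime import date
from typing import Optional


def build_search_query(
    query: str = "",
    date_from: Optional[date] = None,
    date_to: Optional[date] = None,
) -> str:
    """Combine a free-form search expression with an optional created-date range."""
    parts = ["type:ticket"]  # assumption: only tickets are wanted
    if query.strip():
        parts.append(query.strip())
    if date_from is not None:
        parts.append(f"created>={date_from.isoformat()}")
    if date_to is not None:
        parts.append(f"created<={date_to.isoformat()}")
    return " ".join(parts)


last_quarter = build_search_query("tags:refund", date(2024, 1, 1), date(2024, 3, 31))
everything_in_range = build_search_query("", date(2024, 1, 1), date(2024, 3, 31))
```

An empty query with a date range simply matches every ticket created in that window, which mirrors the "leave it empty" behaviour described above.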

Tip

The estimate Macha shows before you run a Study is calculated from the exact same query that the run will use — so the number you see is what you'll process. If it's higher than you expected, tighten the query before running.

Ticket fields you can include

You choose which fields the AI sees for each ticket. The more you include, the more context the model has — but also the larger the prompt and the slower the run. Available fields:

Subject — the ticket subject line
Description — the customer's first message
Status, Priority, Type, Tags — ticket metadata
Created at, Updated at — timestamps
Requester ID, Assignee ID, Group ID — identifiers
Custom fields — every populated custom field with its human-readable label and resolved value. Flagged as expensive because it requires fetching the field schema.
Full comment thread — the complete public + internal conversation. Flagged as expensive because it adds an extra API call per ticket and significantly grows the prompt.

For most extractions, Subject and Description are enough. Add the comment thread only when the answer genuinely needs the full conversation.

Defining the schema

The schema is the heart of a Study. Each entry is one column in the final results grid. You can have as many columns as you need.

Column types

Each column has an answer type that defines the shape of the value the model returns. The Study editor shows a short hint under the Answer-type selector explaining what each type returns — useful when you're picking the right shape for the question.

Boolean — yes/no. Best for "is this a refund request?" or "did the customer mention a competitor?"
Single select — one value from a list you define. Best for categorical classification: billing / shipping / product / other.
Multi select — zero or more values from a list. Best for tags: which issues are mentioned.
Number — a numeric value. Best for counts, scores, or amounts the model can read from the record.
Short text — a brief free-text answer. Best for things like "what is the customer's main concern in one sentence?"
Long text — multi-line free text. Best for summaries, suggested replies, or extracted quotes.

For single select and multi select, you add the allowed values as tags — type each option and hit Enter, or paste a comma-separated list and Macha splits it for you. Click the × on a tag to remove it. The model is constrained to pick from this fixed set, so spelling matters: pick the labels you want to see in the final grid.
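The option handling described above can be sketched as follows. Both helpers are hypothetical illustrations: dropping duplicates and empty entries when splitting a pasted list is an assumption, and the exact-match check shows why the spelling of each label matters.

```python
from typing import List


def parse_options(raw: str) -> List[str]:
    """Split a pasted comma-separated list into trimmed, unique option tags."""
    seen = set()
    options = []
    for part in raw.split(","):
        tag = part.strip()
        if tag and tag not in seen:   # skip blanks and repeats (assumed behaviour)
            seen.add(tag)
            options.append(tag)
    return options


def is_valid_select(value, options: List[str], multi: bool = False) -> bool:
    """Check that a returned value uses only the allowed labels, matched exactly."""
    if multi:
        return isinstance(value, list) and all(v in options for v in value)
    return value in options


options = parse_options("billing, shipping ,product,,other")
```

With these options, `is_valid_select("Billing", options)` is false because matching is case-sensitive in this sketch, while an empty list is a valid multi-select answer ("zero or more values").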

Writing good guidance

Each column has an optional guidance field. This is where you teach the model what you actually want. The guidance is included in the extraction prompt for that column, so be specific:

For a single-select column "Root cause", spell out what each option means and how to disambiguate edge cases.
For a boolean "Refund requested", clarify the difference between asking about a refund and actually requesting one.
For a short-text "Customer's main concern", say how long it should be ("one sentence") and what voice it should use ("describe the issue, not the customer's emotion").

Treat guidance like instructions you'd give a new analyst. The clearer the rules, the more consistent the output.
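For a concrete picture, here are three guidance strings in the spirit of the examples above, plus a hypothetical helper that renders them into a per-column instruction list. The prompt layout is purely an assumption for illustration; Macha's real extraction prompt is not shown in this article.

```python
from typing import Dict, List


def render_column_instructions(columns: List[Dict]) -> str:
    """Render each column's label, type, options, and guidance as plain text."""
    lines = []
    for col in columns:
        line = f"- {col['label']} ({col['type']})"
        if col.get("options"):
            line += f" [options: {', '.join(col['options'])}]"
        lines.append(line)
        if col.get("guidance"):
            lines.append(f"  Guidance: {col['guidance']}")
    return "\n".join(lines)


columns = [
    {
        "label": "Root cause",
        "type": "single_select",
        "options": ["billing", "shipping", "product", "other"],
        "guidance": "Use 'billing' for charge or invoice problems. "
                    "Use 'other' only when no other option fits.",
    },
    {
        "label": "Refund requested",
        "type": "boolean",
        "guidance": "True only when the customer explicitly asks for money back, "
                    "not when they only ask about the refund policy.",
    },
    {
        "label": "Customer's main concern",
        "type": "short_text",
        "guidance": "One sentence. Describe the issue, not the customer's emotion.",
    },
]

prompt_section = render_column_instructions(columns)
```

Writing guidance at this level of detail is what makes answers comparable across thousands of rows.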

Tip

When the answers come back inconsistent, the fix is almost always in the guidance, not the model. Tighten the column definition, add examples in the guidance, and run a test on a small batch before re-running the full set.

Choosing a model

Studies use the same model lineup as agent chat. The model you pick drives the credit cost per record — which compounds across the whole run, so it matters a lot more here than in a one-off chat. The Study editor shows a credits / record badge next to the Model selector and again at the top of every run page, so the cost is visible at every step. A few rules of thumb:

For simple classification, boolean extraction, or short field lookups, a mini model is plenty.
For nuanced multi-column extractions, long-form summaries, or judgement calls (e.g. "did this agent handle the ticket well?"), step up to a more capable model.
When in doubt, run a test on 10–20 records first with the model you're considering. Look at the output. If it's good enough, keep the model. If not, step up — the cost difference at small scale is tiny, and you'd rather know now than after burning credits on 5,000 records.

Estimate and the pre-run gate

Before any run starts, Macha shows you a Review screen with:

The number of records the input is going to match
The credit cost per record (based on the model)
The total estimated credits for the run
Your remaining credits

You can't accidentally run a 10,000-ticket study without seeing the cost first. If the estimate is uncomfortable, go back, narrow the query, or switch to a cheaper model.
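The arithmetic behind the Review screen is simple multiplication. The sketch below uses made-up numbers (2 credits per record, 8,000 credits remaining); the real per-record cost depends on the model and is shown in the editor.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    records: int             # records matched by the input query
    credits_per_record: int  # set by the chosen model (value here is illustrative)
    remaining_credits: int

    @property
    def total_credits(self) -> int:
        return self.records * self.credits_per_record

    @property
    def affordable(self) -> bool:
        return self.total_credits <= self.remaining_credits


full_run = Estimate(records=5_000, credits_per_record=2, remaining_credits=8_000)
small_test = Estimate(records=20, credits_per_record=2, remaining_credits=8_000)
```

Under these assumed numbers the full run needs 10,000 credits and does not fit the balance, while a 20-record test costs 40. That gap is why testing a model on a small sample first is cheap.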

Clicking Run study opens a second confirmation modal that re-shows the record count, the estimated credits, and the model — you have to explicitly confirm before any work actually starts. Test runs ("Test on 3") skip this second step since they're small by definition.

If you're short on credits at this stage, you can buy a top-up pack directly from the Review screen — top-up credits never expire and they're consumed automatically after your monthly allowance runs out.

Test runs

Before committing to a full run, do a test. A test run takes a small sample (typically 10–50 records) and runs the same extraction on it. Use it to:

Sanity-check that the schema produces the answers you actually want.
Tune the per-column guidance.
Make sure the model is capable enough for the task before scaling up.

Test runs are marked as previews in the UI so they don't clutter your real history. They still cost credits (you're paying for real model calls), but only for the sample size.

Running, progress, and cancelling

When you start a real run, Macha kicks off a background worker that streams records from the source and processes them with bounded concurrency: multiple records run in parallel, but never so many that the run would overwhelm the source API or your model quota.

The run page shows progress live: how many records processed, how many succeeded, how many errored, and how many credits have been spent so far.

You can cancel a run at any time. Cancellation is graceful — the worker finishes the records it has in flight, then stops. You only pay for what was actually processed; nothing is charged for records that never ran.
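Bounded concurrency with graceful cancellation is a common pattern, and a minimal version can be sketched with `asyncio`. This is not Macha's worker code; `run_records`, the concurrency limit of 5, and the `extract` callback are assumptions for illustration.

```python
import asyncio


async def run_records(records, extract, max_concurrency=5, cancel_event=None):
    """Process records in parallel under a concurrency cap.

    Once cancel_event is set, no new records are started, but records
    already in flight are allowed to finish.
    """
    cancel_event = cancel_event or asyncio.Event()
    slots = asyncio.Semaphore(max_concurrency)
    results, tasks = [], []

    async def process(record):
        try:
            results.append((record, await extract(record), None))
        except Exception as exc:      # an errored record does not stop the run
            results.append((record, None, exc))
        finally:
            slots.release()

    for record in records:
        await slots.acquire()         # waits while max_concurrency records are in flight
        if cancel_event.is_set():
            slots.release()
            break                     # graceful stop: start nothing new
        tasks.append(asyncio.create_task(process(record)))

    await asyncio.gather(*tasks)      # let in-flight records finish
    return results
```

Every entry in the returned list corresponds to a record that actually started, which matches the billing rule above: records that never ran produce no result and no charge.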

How a run stops

A run can finish in one of several ways:

Completed — every record in the input was processed.
Cancelled — you stopped the run from the UI.
Out of credits — your balance hit zero mid-run. The worker stops gracefully and reports the stop reason on the run.
Cap reached — the run hit the platform's hard ceiling. Studies are capped at 20,000 records per run. To go bigger, split into multiple runs with date-range filters.
Failed — a fatal error (e.g. the connector disconnected mid-run). Whatever was processed before the failure is preserved.
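The stop conditions can be pictured as a check made before each new record starts. In this sketch the enum values, the `check_stop` helper, and the order of precedence between reasons are all assumptions; only the five outcomes and the 20,000-record cap come from the list above.

```python
from enum import Enum
from typing import Optional


class StopReason(Enum):
    COMPLETED = "completed"
    CANCELLED = "cancelled"
    OUT_OF_CREDITS = "out_of_credits"
    CAP_REACHED = "cap_reached"
    FAILED = "failed"


RECORD_CAP = 20_000  # per-run ceiling stated in this article


def check_stop(
    processed: int,
    balance: int,
    cancel_requested: bool,
    fatal_error: Optional[Exception] = None,
) -> Optional[StopReason]:
    """Return why the run should stop before the next record, or None to continue.

    COMPLETED is not returned here: the caller reports it when the input
    source has no more records.
    """
    if fatal_error is not None:
        return StopReason.FAILED
    if cancel_requested:
        return StopReason.CANCELLED
    if balance <= 0:
        return StopReason.OUT_OF_CREDITS
    if processed >= RECORD_CAP:
        return StopReason.CAP_REACHED
    return None
```

In every case the rows processed before the stop remain available on the run.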

Reviewing results

When a run completes, its results page shows a table — one row per record, one column per schema field, plus a status column. Each row links back to the source record (e.g. the Zendesk ticket URL) so you can verify the extraction against the original.

The results table works like a spreadsheet: click any column header to sort (ascending → descending → off), and use the search box in the toolbar to filter loaded rows by free text across every column. Click a row to open the full record in a side drawer; the J and K keys move to the next and previous row, and Esc closes the drawer.

Useful things you can do from the results page:

Sort and search — click any header to sort, type in the search box to filter the loaded rows.
Group by a field — pick a column and Macha computes bucket counts you can click to filter the table.
Open a row — see the full record in the side drawer.
View report — open the run's analytics view (see below).
Use as knowledge source — push the results back into Macha as searchable knowledge (see further below).
Export to CSV — download the whole result set as a spreadsheet.

Frozen snapshots

Every run keeps a copy of the input config, schema, and model it ran with — frozen at the moment the run started. Editing the parent Study later doesn't rewrite history. Past runs always reflect the configuration they ran under, so audit trails stay clean.
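One way to picture a frozen snapshot is a deep copy of the Study's configuration taken when the run starts. The sketch below is a hypothetical illustration of that idea, not a description of how Macha stores runs.

```python
import copy
from dataclasses import dataclass


@dataclass(frozen=True)
class RunSnapshot:
    input_config: dict
    schema: tuple
    model: str


def start_run(study: dict) -> RunSnapshot:
    """Copy the Study's configuration so later edits cannot affect this run."""
    return RunSnapshot(
        input_config=copy.deepcopy(study["input_config"]),
        schema=tuple(copy.deepcopy(study["schema"])),
        model=study["model"],
    )


study = {
    "input_config": {"query": "tags:refund"},
    "schema": [{"label": "Refund requested", "type": "boolean"}],
    "model": "mini",  # placeholder model name
}
snapshot = start_run(study)

# Editing the parent Study afterwards leaves the snapshot untouched.
study["input_config"]["query"] = "tags:billing"
study["schema"].append({"label": "Root cause", "type": "single_select"})
study["model"] = "large"
```

After the edits, `snapshot` still describes the original query, the single original column, and the original model.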

The Report view

The results table is great for reading individual rows. The Report view is for reading the run as a whole — how many tickets fell into each category, what the distribution looks like, where the outliers are. From a completed run, click View report in the toolbar to open it.

The report has three layers:

Top-of-page stat callouts

Four big-number cards summarise the run at a glance:

Records processed — the total record count.
Credits spent — with the average credits per record shown beneath.
Errors — turns red if anything failed.
Reportable columns — how many of your schema columns produce charts. Number columns are also called out separately because they get histograms.

Per-column aggregation cards

Every column whose answer type is Boolean (yes/no), Single select, Multi select, or Number gets its own card. Short-text and long-text columns don't get a card because they're too free-form to chart meaningfully.

Boolean, Single select, and Multi select columns render as horizontal bar charts. Each bar shows the bucket label, the count, and the percentage of the answered set. Click any bar to drill into the rows that fall into that bucket.
Number columns render as histograms. The card shows summary stats at the top — min, mean, median, max — and a bin chart below. If a number column has 10 or fewer unique values, each value becomes its own bin (good for integer counts like "comments per ticket"). With more unique values, Macha bins automatically into ~10 equal-width ranges.
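The aggregation rules above can be sketched in a few lines. Two assumptions are labeled here: the "answered set" is taken to mean rows with a non-empty value, and the equal-width bins place the maximum value in the last bin. Function names are hypothetical.

```python
import statistics
from collections import Counter


def bucket_counts(values):
    """Count per bucket, with each percentage taken over answered (non-None) rows."""
    answered = [v for v in values if v is not None]
    counts = Counter(answered)
    return [(label, n, round(100 * n / len(answered), 1))
            for label, n in counts.most_common()]


def summarize(values):
    """Summary stats shown at the top of a number card."""
    return {
        "min": min(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "max": max(values),
    }


def histogram(values, max_bins=10):
    """Return (start, end, count) bins following the 10-unique-values rule."""
    unique = sorted(set(values))
    if len(unique) <= max_bins:
        counts = Counter(values)
        return [(v, v, counts[v]) for v in unique]   # one bin per distinct value
    lo, hi = min(values), max(values)
    width = (hi - lo) / max_bins
    counts = [0] * max_bins
    for v in values:
        idx = min(int((v - lo) / width), max_bins - 1)  # max value goes in the last bin
        counts[idx] += 1
    return [(lo + i * width, lo + (i + 1) * width, counts[i]) for i in range(max_bins)]


comments_per_ticket = [1, 2, 2, 3]
small = histogram(comments_per_ticket)     # 3 unique values: one bin each
wide = histogram(list(range(100)))         # 100 unique values: 10 equal-width bins
```

For a multi-select column, each selected value would be counted once per row, so the bar counts can add up to more than the number of rows.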

Drilldown modal

Clicking a bar opens a modal listing the records that match that bucket. The modal uses the same results table you see on the main run page — same columns, same record drawer when you click a row — so navigating from a chart into the underlying rows feels continuous. Use the link at the bottom of the modal to jump to the full results table if you need to see every match.

Coming soon

The report page also includes an Ask Sidekick bar at the top, currently shown as a "Coming soon" preview. Once Study-scoped query tools land, you'll be able to ask the Sidekick questions about a specific run ("how many tickets mention billing this month?", "what's the most common root cause for tickets that got escalated?") and it'll answer using the run's data directly. The bar is intentionally disabled until then so it doesn't return misleading answers.

Plan availability

The Report view is a Professional and Enterprise feature. Trial and Starter plans see an upgrade card with a link to billing instead of the dashboard. The underlying Study itself can run on any plan that has Studies enabled, but the analytics view is gated.

Pushing results into a Knowledge Source

Studies don't just produce a static spreadsheet — they can feed your agents. From a completed run's results page, you can export the results as a new knowledge source. Each row becomes a document, composed from the columns you choose, and indexed via the same embeddings pipeline that powers the rest of Macha's knowledge.

This is how Studies become operational instead of one-off analyses. Run a Study to extract structured insights from your last 5,000 tickets, push the results to a knowledge source, and now your support agent can search those insights when handling new tickets.

You choose which fields end up in each document and which one is used as the title. From there it behaves like any other knowledge source — assign it to an agent, set scope, and the agent's search_knowledge and get_document tools can pull from it.
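As an illustration of "each row becomes a document", the sketch below composes a title and body from chosen columns. The `row_to_document` helper and the `title`/`content` keys are hypothetical; the actual document format Macha indexes is not specified here.

```python
from typing import Dict, List


def row_to_document(row: Dict, title_field: str, body_fields: List[str]) -> Dict[str, str]:
    """Build one knowledge document from a result row, skipping empty values."""
    lines = []
    for name in body_fields:
        value = row.get(name)
        if value is None or value == "" or value == []:
            continue                              # leave out unanswered columns
        if isinstance(value, list):               # multi-select answers
            value = ", ".join(str(v) for v in value)
        lines.append(f"{name}: {value}")
    return {"title": str(row.get(title_field, "")), "content": "\n".join(lines)}


row = {
    "Customer's main concern": "Charged twice for one order",
    "Root cause": "billing",
    "Issues mentioned": ["duplicate charge", "refund"],
    "Refund requested": False,
    "Suggested reply": "",
}
doc = row_to_document(
    row,
    title_field="Customer's main concern",
    body_fields=["Root cause", "Issues mentioned", "Refund requested", "Suggested reply"],
)
```

Choosing a descriptive title column matters because it is what an agent sees first when a search returns the document.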

Credits and pricing

Studies use the same credit system as the rest of Macha. The cost per record is the credit cost of the model you picked — so a Study on a mini model costs less per record than the same Study on a top-tier model.

Credits are deducted per successful record, not up front.
Errored records don't cost anything.
Cancelled records (ones that never ran) don't cost anything.
Monthly plan credits are used first; once those are spent, top-up credits kick in automatically. Top-up credits never expire.
Enterprise plans bypass per-record credit checks.
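The deduction order can be sketched as a small balance object that spends the monthly allowance before top-up credits. The class and its numbers are hypothetical, and the sketch leaves out the Enterprise bypass.

```python
from dataclasses import dataclass


@dataclass
class CreditBalance:
    monthly: int   # plan credits for the current month
    top_up: int    # purchased credits that do not expire

    def charge(self, credits: int) -> bool:
        """Charge for one successful record; return False if the balance is too low."""
        if credits > self.monthly + self.top_up:
            return False
        from_monthly = min(self.monthly, credits)
        self.monthly -= from_monthly             # monthly allowance is used first
        self.top_up -= credits - from_monthly    # the remainder comes from top-ups
        return True


balance = CreditBalance(monthly=5, top_up=10)
first = balance.charge(8)    # takes 5 from monthly and 3 from top-up
second = balance.charge(8)   # only 7 top-up credits remain, so this is refused
```

In a run, `charge` would be called only after a record succeeds, so errored and never-started records leave the balance unchanged.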

Plan availability

Trial — Studies are not available on trial.
Starter — not available; upgrade to Professional to use Studies.
Professional — full access.
Enterprise — full access, with credit checks bypassed.

Best practices

Start narrow, then widen

The first time you build a schema, run it against a small date range or a tight query first. Look at the results. Iterate on the schema and guidance. Once you're happy, re-run against the full set.

One question per column

A column is meant to answer one question. If a column tries to do two things at once ("category and severity"), split it into two columns. Single-purpose columns are easier for the model to answer consistently and easier for you to filter on later.

Use single-select over short-text when you can

Single-select forces the model to pick from your taxonomy, which makes the column easy to group, sort, and chart. Short-text leaves it free-form, which is useful for genuine free-text answers but bad for anything you want to count.

Don't pull comments unless you need them

The full comment thread roughly doubles or triples the prompt size and adds an API call per ticket. If the answer is in the subject and description, leave comments off.

Run smaller batches across longer windows

If you want to study "the whole year" but it's 60,000 tickets, run four quarterly Studies instead of one giant one. You stay under the 20,000-record cap, you can review intermediate results, and you can iterate the schema between runs if something looks off.
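Splitting a year into quarterly date ranges is easy to script. With 60,000 tickets spread evenly, each quarter holds about 60,000 / 4 = 15,000 records, which is under the 20,000 cap; real volume is rarely even, so check each quarter's estimate before running. `quarterly_windows` is a hypothetical helper.

```python
from datetime import date, timedelta
from typing import List, Tuple


def quarterly_windows(year: int) -> List[Tuple[date, date]]:
    """Return inclusive (from, to) date pairs for the four quarters of a year."""
    windows = []
    for q in range(4):
        start = date(year, 3 * q + 1, 1)
        next_start = date(year + 1, 1, 1) if q == 3 else date(year, 3 * q + 4, 1)
        windows.append((start, next_start - timedelta(days=1)))
    return windows


windows = quarterly_windows(2024)
```

Each pair maps directly onto the From / To fields of the input source.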

Push high-value runs to a knowledge source

If a Study's results would help your agents (e.g. a classified backlog of past resolutions, or a clean list of known issues), don't just export to CSV. Push it to a knowledge source so your agents can search it during real conversations.

Tip

A Study is just a saved configuration — it costs nothing until you run it. Iterate on the schema freely; only test runs and real runs use credits.
