How to Index a Website or Webpage for Your AI Agent
A lot of the knowledge an agent needs already lives on your website — your docs, your FAQs, your product pages. Macha lets you turn that into searchable knowledge by indexing it, and you've got two options depending on how much of the site you need.
Two options: whole site or single page
From the Sources page, you can point Macha at the web two ways:
Add Website: crawls the site. Macha follows the links it finds, discovers pages as it goes, and indexes them, up to a system limit of the first 200 pages it finds. This is what you use when you want the agent to have access to your whole website: all your documentation, every page.
Add Webpage: indexes a single URL. Use this when you only need one specific page in the agent's knowledge, not the whole site.
The choice is just about scope: the whole site, or one page.
What happens when you add a website
Hit Add Source on a website and Macha gets to work crawling. As it goes, it lists the pages it discovers, so you can watch it find your content in real time. In the example from the video, it discovers 160 pages across the site and indexes them. Once the crawl finishes, the whole site is searchable knowledge for any agent you link it to.
Remember the 200-page cap: if your site is larger, Macha indexes the first 200 pages it finds. For a big site, that means it's worth pointing the crawl at the section that matters most (e.g., your docs subdomain) rather than the entire marketing site, so the 200 pages are the right ones.
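To see why the starting point matters, here is a minimal sketch of a capped crawl over a toy, in-memory link graph. It is not Macha's implementation: the breadth-first order, the `crawl` function, and the example paths are all assumptions made for illustration. Only the 200-page limit comes from this article.

```python
from collections import deque

PAGE_LIMIT = 200  # the cap described in this article


def crawl(start, links, limit=PAGE_LIMIT):
    """Breadth-first discovery over a link graph, stopping at `limit` pages.

    `links` maps a page to the pages it links to (a hypothetical toy site).
    Returns the pages in the order they were discovered.
    """
    discovered = [start]
    seen = {start}
    queue = deque([start])
    while queue and len(discovered) < limit:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen and len(discovered) < limit:
                seen.add(target)
                discovered.append(target)
                queue.append(target)
    return discovered


# Toy site: the home page links to 300 blog posts before it links to the docs.
site = {
    "/": [f"/blog/{i}" for i in range(300)] + ["/docs"],
    "/docs": [f"/docs/{i}" for i in range(150)],
}

from_home = crawl("/", site)
from_docs = crawl("/docs", site)

print(len(from_home), sum(p.startswith("/docs") for p in from_home))  # 200 0
print(len(from_docs), sum(p.startswith("/docs") for p in from_docs))  # 151 151
```

In this toy example, starting at the home page spends the whole budget on blog posts and never reaches the docs, while starting at the docs section captures all of it with room to spare.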
When to use which
| You want the agent to know… | Use |
|---|---|
| Your entire site / all your docs | Add Website (crawl, up to 200 pages) |
| One specific page | Add Webpage (single URL) |
| A site bigger than 200 pages | Add Website pointed at the most important section |
A note on freshness
Like uploaded files, an indexed website is a snapshot — Macha indexes the pages as they are when you add them. If your site changes a lot, you'll want to re-index periodically so the agent's knowledge stays current. (For content that changes constantly, consider reading it live instead — see knowledge source vs. tool.)
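If you want a rough signal for when a re-index is due, one option is to compare content fingerprints on your own side. The sketch below is an assumption-heavy illustration, not a Macha feature: `fingerprint` and `stale_pages` are hypothetical helpers, and it assumes you have recorded a hash of each page's text at indexing time and can fetch the current text yourself.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Return a SHA-256 hex digest of the page text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def stale_pages(snapshot: dict[str, str], current: dict[str, str]) -> list[str]:
    """List URLs that are new or whose text changed since the snapshot.

    `snapshot` maps URL -> fingerprint recorded when the site was indexed.
    `current` maps URL -> page text as it is today.
    """
    return sorted(
        url for url, text in current.items()
        if snapshot.get(url) != fingerprint(text)
    )


# Hypothetical pages: one edited, one unchanged, one added since indexing.
snapshot = {
    "/docs/setup": fingerprint("Install v1"),
    "/docs/faq": fingerprint("FAQ"),
}
current = {
    "/docs/setup": "Install v2",
    "/docs/faq": "FAQ",
    "/docs/new": "New page",
}

print(stale_pages(snapshot, current))  # ['/docs/new', '/docs/setup']
```

A non-empty result is a hint that the indexed snapshot has drifted from the site and a re-index is worth doing.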
After indexing: link it to an agent
Indexing adds the site to your sources; an agent only uses sources it's linked to. Point your agent at the source — the whole thing or specific pages — and it'll reference your site content when answering. (See connecting a knowledge base to an agent.)
Frequently asked questions
What's the difference between Add Website and Add Webpage? Add Website crawls and indexes the whole site (up to 200 pages); Add Webpage indexes a single URL.
How many pages can it index? Up to the first 200 pages it discovers on a site.
What if my site is bigger than 200 pages? Point the crawl at the most important section (like your docs) so the 200 pages are the right ones.
Does it stay up to date? It's a snapshot — re-index periodically if your site changes, or read truly dynamic content live.
Can I see what it indexed? Yes — during the crawl, Macha lists the pages it discovers.
The bottom line
To give an agent your website as knowledge, crawl the whole thing with Add Website (up to 200 pages) or index a single page with Add Webpage. Point the crawl at your most important section if your site is large, link the source to your agent, and re-index when things change, so your agent answers from an up-to-date snapshot of your site content.