    Why Your SPA Is Invisible to AI Search (And How to Fix It)

    BrandingLab Editorial

    Open a terminal. Pick any URL on your marketing site. Run this:

    curl -A "GPTBot" https://yoursite.com/your-best-article
    

    Read what comes back.

    If your site is built as a single-page application — React, Vue, Angular, Svelte without server rendering — there is a strong chance the response is a near-empty HTML shell. A <div id="root"></div>, a script tag, and almost nothing else. No headline. No body copy. No schema. No author. No publish date. Nothing for an answer engine to read, summarize, or cite.

    This is the AEO visibility problem in a single command. It is the most consequential SEO issue most marketing teams have never tested for, and it is silently excluding well-resourced brands from the conversations happening inside ChatGPT, Perplexity, Claude, and Google AI Overviews.

    The good news: it is solvable. The harder news: the solution is architectural, not editorial. No amount of better copy, more backlinks, or richer schema will fix a page that the crawler cannot read in the first place.

    This article explains exactly why this happens, how to diagnose it on your own site in five minutes, and the three viable paths to fix it — with the trade-offs each one carries.

    Why this matters now

    For two decades, search optimization meant ranking on a list of blue links. The unit of success was a click. The crawler that mattered was Googlebot.

    That world is fragmenting fast. A growing portion of high-intent research now happens inside conversational interfaces that synthesize an answer rather than return a list. ChatGPT processes hundreds of millions of queries a week. Perplexity is being adopted as a default research tool by analyst teams, consultants, and procurement leads. Google AI Overviews now sit above the traditional results for a substantial share of commercial queries. Claude is increasingly used inside enterprises for vendor research and technical evaluation.

    The unit of success in these interfaces is not a click. It is a citation. When a buyer asks an answer engine "who are the best AEO consultancies in Europe?" or "what is the difference between SEO and AEO?", the engine composes a response from sources it has indexed, and it names some of them. Being named is the new ranking. Not being named is the new page two.

    The crawlers behind these systems are not Googlebot. They are GPTBot (OpenAI), ClaudeBot and Claude-Web (Anthropic), PerplexityBot, Google-Extended, and a dozen smaller ones. They behave differently from the crawler your SEO tooling is designed around. Critically, most of them do not execute JavaScript reliably, do not wait for client-side hydration, and do not return for a second pass.

    If your site only renders its content after JavaScript runs, these crawlers see your blank shell — and they move on. Your authority, your insight, your case studies, your schema markup all exist only inside a black box they never opened.

    The technical root cause

    To understand the fix, you have to understand exactly what is happening on the wire.

    A traditional server-rendered website works like this: the browser asks for a URL, the server runs application logic, queries a database, composes the full HTML for that page, and sends it back. Everything visible to a human is also visible in the raw HTML response. The browser then enhances that HTML with CSS and JavaScript, but the content is already there.

    A single-page application works differently. The server returns the same minimal HTML file for nearly every URL — a shell containing a script tag and an empty <div>. The browser downloads a JavaScript bundle, executes it, fetches data from APIs, and constructs the page in memory. Content appears only after this process completes, sometimes seconds after the initial response.

    A human with a modern browser does not notice this. They see a brief loading state, then the finished page.

    A crawler is not a human. A crawler issues an HTTP request, receives the response, and parses what it gets. Whether it then executes the JavaScript bundle to discover what the page actually contains is a choice — one made by the crawler operator, weighed against compute cost, latency, and the number of pages they need to index.

    Here is how the major crawlers currently behave with respect to JavaScript execution. Treat this as a directional snapshot rather than a contract — these systems change quietly and frequently:

    Crawler                                    | Operator                     | JavaScript execution                 | Practical implication for SPAs
    Googlebot                                  | Google Search                | Yes (deferred, second pass)          | Eventually sees content, often with delay
    Bingbot                                    | Microsoft / Bing             | Yes (limited)                        | Mostly sees content
    GPTBot                                     | OpenAI / ChatGPT             | No reliable execution                | Sees the blank shell
    ClaudeBot                                  | Anthropic                    | No reliable execution                | Sees the blank shell
    PerplexityBot                              | Perplexity                   | Limited                              | Often sees the blank shell
    Google-Extended                            | Google AI / Gemini training  | Inherits Googlebot behavior, opt-in  | Eventually sees content
    LinkedInBot                                | LinkedIn previews            | No                                   | Blank shell; broken social previews
    Slackbot, Twitterbot, FacebookExternalHit  | Social previews              | No                                   | Blank shell; broken social previews

    The pattern is clear. The traditional search crawlers have invested heavily in JavaScript execution because their commercial model demands it. The newer AI crawlers and the social preview bots have not. They prioritize breadth and freshness over completeness on individual pages, and they assume that any site that genuinely wants to be indexed will return meaningful HTML on the first request.

    This is not an oversight. It is a design choice. And it is unlikely to change soon, because executing JavaScript at the scale these crawlers operate at would multiply their infrastructure costs by an order of magnitude.

    The takeaway is uncomfortable but precise: if your content is not in the HTML that comes back from the first HTTP request, you should assume AI crawlers cannot see it.

    How to diagnose your own site

    Before deciding on a fix, confirm you actually have the problem. Three tests, five minutes total, no tools required beyond a browser and a terminal.

    Test 1: The curl test

    This is the definitive test. It shows you exactly what a crawler sees, with no JavaScript involved.

    curl -s -A "Mozilla/5.0 (compatible; GPTBot/1.0)" \
      https://yoursite.com/your-most-important-page \
      | head -100
    

    A healthy result contains your headline, body copy, structured data, and meta tags in readable HTML. A broken result contains a <title> (often generic), a few meta tags, an empty <div id="root"> or similar, and one or more <script> tags. If you see the latter on a content page, you have the problem.

    Repeat the test on three different page types: a marketing page, an article or blog post, and a product or case study page. Inconsistent results across page types are common and informative.
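
    If you want to run the same check across several pages in one go, a short script does it. The sketch below assumes Node 18 or newer (for the built-in fetch) and a TypeScript runner such as tsx; the URLs and marker phrases are placeholders for your own pages and a sentence you know appears in each page's visible copy.

    // check-raw-html.ts - a minimal sketch, assuming Node 18+ and a TS runner such as tsx.
    // Replace the placeholder URLs and markers with your own pages and copy.
    const pages: Array<{ url: string; marker: string }> = [
      { url: "https://yoursite.com/", marker: "Your homepage headline" },
      { url: "https://yoursite.com/blog/example-article", marker: "A sentence from the article body" },
      { url: "https://yoursite.com/work/example-case-study", marker: "A line from the case study" },
    ];

    async function check(url: string, marker: string): Promise<void> {
      // Identify as an AI crawler so any user-agent-based routing is exercised too.
      const res = await fetch(url, {
        headers: { "User-Agent": "Mozilla/5.0 (compatible; GPTBot/1.0)" },
      });
      const html = await res.text();
      // If the marker is missing from the raw HTML, crawlers that skip JavaScript never see it.
      console.log(`${html.includes(marker) ? "OK     " : "MISSING"}  ${url}`);
    }

    async function main(): Promise<void> {
      for (const { url, marker } of pages) {
        await check(url, marker);
      }
    }

    main();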

    Test 2: The view-source test

    In your browser, navigate to the same page. Right-click and select View Page Source — not "Inspect Element". The two are different in a way that matters here.

    "View Page Source" shows you the raw HTML the server sent. "Inspect Element" shows you the live DOM, which includes everything JavaScript has constructed. Crawlers without JavaScript execution see the former. Make sure that is what you are looking at.

    If View Page Source on your article page does not contain the article body in readable text, your site is failing the AEO basics.

    Test 3: The schema validator test

    Paste your URL into Google's Rich Results Test (search.google.com/test/rich-results). Choose the option to view the rendered HTML, then switch to the "Page resources" or "More info" tab and look at the raw response.

    If your JSON-LD schema appears only after rendering, it is invisible to crawlers that do not render. This is the most common false sense of security in AEO work — teams add comprehensive schema, see it appear in their browser, and assume the job is done. The validator will tell you the truth.

    The three viable fixes

    There are three architectural approaches that actually solve this problem. There are also several that look like solutions but are not — those are addressed at the end of this section.

    Path 1: Server-Side Rendering via framework migration

    The most thorough fix is to move the site onto a framework that renders HTML on the server by default. In the React ecosystem, this means Next.js. In Vue, Nuxt. In Svelte, SvelteKit. In a multi-framework setup, Astro.

    These frameworks invert the SPA model. Each request is handled by a server (or serverless function) that runs your components, fetches data, and returns fully populated HTML. The browser then hydrates that HTML into an interactive application. Crawlers receive the same HTML a human does on the first request.
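
    For illustration, here is roughly what an article route looks like after such a migration, sketched with the Next.js App Router. The file path and the getArticle helper are hypothetical stand-ins for your own routing and CMS; the point is that the headline, body, and metadata are composed on the server and arrive in the initial HTML.

    // app/articles/[slug]/page.tsx - a minimal sketch, not a drop-in implementation.
    import type { Metadata } from "next";
    import { getArticle } from "@/lib/cms"; // hypothetical data-fetching helper

    type Props = { params: { slug: string } };

    export async function generateMetadata({ params }: Props): Promise<Metadata> {
      const article = await getArticle(params.slug);
      return { title: article.title, description: article.summary };
    }

    export default async function ArticlePage({ params }: Props) {
      // Runs on the server: the returned markup is part of the HTTP response itself.
      const article = await getArticle(params.slug);
      return (
        <article>
          <h1>{article.title}</h1>
          <p>{article.summary}</p>
          {/* The body ships in the server response, so crawlers see it on the first request. */}
          <div dangerouslySetInnerHTML={{ __html: article.bodyHtml }} />
        </article>
      );
    }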

    When this is the right choice: You publish content regularly — articles, case studies, product pages, documentation. You have an editorial workflow where new content needs to be discoverable within minutes, not days. You expect AI search to be a meaningful traffic and citation channel for years to come.

    True cost: A migration is real work. Plan four to six weeks for a content-heavy marketing site, longer if you have a custom admin or complex integrations. Existing components mostly port over, but routing, data fetching, metadata handling, and authentication all change shape. Hosting moves to a platform that supports the framework natively — Vercel for Next.js, Netlify for many others, or self-hosted Node infrastructure.

    What you keep: Your design system, your component library, your CMS or database, your business logic, your domain. The visible product changes very little.

    What you gain beyond AEO: Faster initial page loads, better Core Web Vitals, simpler social previews, working metadata for messaging app previews, and a foundation that scales as content needs grow.

    What you lose: The simplicity of static SPA hosting. You now have a server-side runtime to maintain, monitor, and pay for.

    Path 2: Edge prerendering proxy

    A second path keeps the existing SPA in place and adds a layer in front of it that detects crawlers and serves them a pre-rendered version of each page.

    The standard implementation uses a CDN edge worker (Cloudflare Workers, Fastly Compute, AWS Lambda@Edge) combined with a prerender service (Prerender.io, Rendertron, or self-hosted Puppeteer infrastructure). The edge worker inspects the user agent of each incoming request. Human browsers pass through to the SPA unchanged. Recognized crawlers are routed to the prerender service, which fetches the page in a real headless browser, waits for JavaScript to finish, and returns the fully rendered HTML.
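
    As a sketch of the routing half of that setup, here is what the edge worker can look like on Cloudflare Workers. The crawler list is illustrative and has to be kept current, and PRERENDER_ENDPOINT is a placeholder for whichever prerender service you run.

    // worker.ts - a minimal sketch of user-agent routing at the edge, assuming Cloudflare Workers.
    const CRAWLER_PATTERN =
      /GPTBot|ClaudeBot|Claude-Web|PerplexityBot|Google-Extended|LinkedInBot|Slackbot|Twitterbot|facebookexternalhit/i;

    // Placeholder: your prerender service (Prerender.io, Rendertron, or self-hosted Puppeteer).
    const PRERENDER_ENDPOINT = "https://prerender.example.com/render?url=";

    export default {
      async fetch(request: Request): Promise<Response> {
        const userAgent = request.headers.get("User-Agent") ?? "";

        // Humans and unrecognized agents pass through to the SPA unchanged.
        if (!CRAWLER_PATTERN.test(userAgent)) {
          return fetch(request);
        }

        // Recognized crawlers are routed to the prerendered HTML for the requested URL.
        const prerendered = await fetch(PRERENDER_ENDPOINT + encodeURIComponent(request.url));

        // If the prerender service fails, fall back to the normal response
        // rather than serving the crawler an error page.
        return prerendered.ok ? prerendered : fetch(request);
      },
    };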

    When this is the right choice: You have an existing SPA you cannot or will not rebuild in the near term. You have budget for ongoing infrastructure. You need a fix in weeks, not months.

    Operational reality nobody warns you about: This setup has more moving parts than it appears. You are now responsible for a CDN configuration, an edge worker, a prerender service contract, a cache invalidation strategy, and a list of crawler user agents to keep current. When prerendered HTML goes stale, you have to invalidate it. When you publish new content, the prerender cache needs to know. When a crawler arrives with an unfamiliar user agent, your worker needs a sensible fallback. None of this is hard, but all of it is permanent overhead.

    Cost reality: Prerender.io plans start around $90/month and scale with traffic and page count. Self-hosting a Puppeteer-based prerender service is technically free but operationally expensive — you are running and maintaining headless Chrome at scale. CDN edge worker costs are usually negligible. Budget realistically: $100 to $500 per month for a mid-size content site.

    What you gain: Crawlers see fully rendered HTML without changing your application code. Implementation is two to three weeks for a competent infrastructure engineer.

    What you should be honest with yourself about: This is a workaround. It is a perfectly defensible workaround, but it adds permanent complexity to your stack to compensate for an architectural mismatch. Sophisticated technical buyers who inspect your setup will recognize it as such.

    Path 3: Build-time static generation

    The third path generates static HTML files for every route at build time, then serves those files directly. Tools include vite-plugin-ssg, react-snap, and the static export modes of frameworks like Astro.

    The build process renders each route in a headless browser, captures the resulting HTML, and writes it to disk. The output is a folder of fully rendered HTML files that any static host can serve. Crawlers receive complete HTML on the first request.

    When this is the right choice: Your site is genuinely mostly static — a marketing site for a product company, a documentation site, a portfolio. Content changes are infrequent and intentional. You want zero ongoing infrastructure cost beyond static hosting.

    The trap that disqualifies this for content publishers: Build-time static generation means every published page is a snapshot taken at build time. When you publish a new article in your CMS, that article does not exist on the public site until you trigger a new build and deploy. For a static marketing site updated quarterly, this is fine. For a publication that ships content weekly, daily, or in response to news cycles, it is unacceptable. You either spend constant engineering effort wiring up build webhooks that frequently fail, or you accept that new content takes hours or days to appear.

    Hidden complexity for dynamic routes: Routes that depend on parameters — /articles/[slug], /work/[case-study] — require enumerating every possible value at build time. This means querying your database during the build, which means your build process needs database credentials, which means your CI environment needs to be configured for it. None of this is impossible, but it is more work than the "just generate static files" pitch suggests.
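
    As a sketch of what that enumeration looks like, here is the shape of Astro's getStaticPaths for a dynamic article route. The fetchAllArticles helper is hypothetical; whatever it queries has to be reachable from the build environment.

    // src/pages/articles/[slug].astro (frontmatter only) - a minimal sketch, assuming Astro.
    import { fetchAllArticles } from "../../lib/cms"; // hypothetical build-time data helper

    export async function getStaticPaths() {
      const articles = await fetchAllArticles(); // runs during the build, not at request time

      // Every slug returned here becomes a static HTML file in the build output.
      // Anything not enumerated simply does not exist on the published site.
      return articles.map((article) => ({
        params: { slug: article.slug },
        props: { article },
      }));
    }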

    What you gain: Best-in-class performance, lowest possible hosting cost, no runtime infrastructure to manage, full crawler visibility.

    What you lose: Real-time content publishing without an engineering loop in the middle.

    What not to do

    Three approaches surface frequently in conversations about this problem and are worth flagging as dead ends.

    A homemade prerender script that runs in CI. Teams sometimes propose writing their own headless Chrome script that captures every route, commits the HTML to the repo, and serves those files. This works for a week, breaks the first time the script encounters an authentication redirect, a slow API response, or a hydration error, and then sits broken because nobody owns it.

    "We will fix it later when AEO matters more." AEO already matters. The teams winning citations in AI answer engines today made these architectural decisions twelve to eighteen months ago. The compounding nature of crawler trust and content indexing means that catching up requires more effort, not less, the longer you wait.

    Adding more schema markup to a blank shell. This is addressed in detail below, but the short version: if the surrounding HTML is invisible to the crawler, the schema inside it is also invisible.

    A decision framework

    The right path depends on a single question: how often does your content change?

    Are you publishing new content weekly or more often?
    ├── Yes → Path 1: SSR migration (Next.js, Nuxt, etc.)
    │         Reason: only SSR keeps new content crawlable in real time.
    │
    └── No → Are you actively planning to publish more in the next 12 months?
        ├── Yes → Path 1: SSR migration
        │         Reason: invest now, not after the content team is blocked.
        │
        └── No → Do you have engineering capacity for a rebuild?
            ├── No, but have budget → Path 2: Edge prerendering proxy
            │                          Reason: fastest path to crawlable HTML.
            │
            └── No to both → Path 3: Build-time static generation
                             Reason: cheap, simple, suitable for static sites.
    

    Two more honest checks worth running:

    • If the AEO channel is strategic to your business, default to Path 1. SSR is the only path that does not impose ongoing operational cost or content-velocity penalties. Every other path is a workaround. Workarounds are appropriate when they buy time for a real solution, not as the destination.
    • If you sell AEO services to clients, your own site cannot be on Path 2 or 3. Your prospects will run the curl test. They are running it on every vendor they evaluate. The credibility cost of being caught with an invisible site is higher than the cost of any migration.

    What schema markup cannot save you from

    A common reaction to AEO concerns is to add structured data — JSON-LD schemas for Article, FAQPage, BreadcrumbList, Organization, Product. This is good practice. It is also insufficient on its own.

    Schema is metadata. It tells a crawler what the surrounding content means. It does not replace the surrounding content.

    If your BreadcrumbList schema is injected into the page by JavaScript after hydration, a crawler that does not execute JavaScript never sees it. The same applies to Article schema, Person schema, FAQPage schema, and every other type. Schema in a blank shell is invisible. Schema in a server-rendered page is amplified.

    The order of operations matters. Get the HTML right first. Then layer schema on top to make that HTML easier for machines to interpret. Doing schema first is like putting captions on a film nobody can see.

    This is also why client-side metadata libraries like React Helmet — useful as they are for managing the document head in an SPA — do not solve the AEO problem on their own. They modify the head of the live DOM after JavaScript runs. Crawlers without JavaScript see the original <head> from the HTML shell, with whatever defaults were baked in at build time.

    If you want the test: open View Page Source on a page where you added schema. Search the raw HTML for the schema string. If it is there, crawlers can see it. If you have to use the browser's Inspect tool to find it in the live DOM, only crawlers that execute JavaScript will ever read it.
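
    For completeness, this is roughly what "schema in a server-rendered page" looks like in code: a small component, sketched here for a server-rendered React setup such as Next.js, that serializes the JSON-LD while the page renders on the server so it ships in the initial HTML. The field names are placeholders.

    // article-schema.tsx - a minimal sketch; the fields are placeholders.
    type ArticleSchemaProps = { title: string; datePublished: string; authorName: string };

    export function ArticleSchema({ title, datePublished, authorName }: ArticleSchemaProps) {
      const schema = {
        "@context": "https://schema.org",
        "@type": "Article",
        headline: title,
        datePublished,
        author: { "@type": "Person", name: authorName },
      };

      return (
        <script
          type="application/ld+json"
          // Serialized during server rendering, so the JSON-LD is present in View Page Source.
          dangerouslySetInnerHTML={{ __html: JSON.stringify(schema) }}
        />
      );
    }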

    Next: the tactical companion. For the exact JSON-LD patterns we ship on every Vite + React SPA — static site-wide schema in index.html, per-route schema via react-helmet-async, and the validator workflow — see JSON-LD for React SPAs: A Practical Schema Markup Guide.

    Closing

    AEO is being sold to marketing teams as a content discipline. It is partly that. It is also, more fundamentally, an architectural discipline.

    The teams winning citations in AI answer engines are not winning because they wrote better articles or added more schema. They are winning because their content is in the HTML that comes back on the first HTTP request, on every page, every time, with no JavaScript required. That single property makes them crawlable, indexable, and citable by every system that has emerged in the last two years and every system that will emerge in the next five.

    The teams losing visibility are mostly not aware they are losing. Their analytics show traffic, their browsers show beautiful pages, their schema validators show green check marks. The thing they are not measuring is what happens when an AI crawler asks for the page. That is the test that decides whether their brand appears in the answers their buyers are reading.

    Five minutes with curl will tell you which side of that line your site sits on. Where you go from there is a decision worth making with eyes open — about content velocity, engineering capacity, and how seriously you take AEO as a channel.

    If you want help running that diagnostic across your most important pages, choosing between the three paths, or executing the migration, that is the work BrandingLab does. Start with the curl test. The rest follows from what you find.

    Want the SSR fix and the AEO programme run together? That's the standard shape of BrandingLab's AEO programme — rebuild the foundations, then compound citations on top.

    Key Takeaways

    • AI crawlers like GPTBot and ClaudeBot do not reliably execute JavaScript, so SPAs return a blank shell
    • A single-page app served as a blank shell is invisible to most answer engines
    • Schema markup only helps if the surrounding HTML is server-rendered
    • Three fixes exist: SSR migration, edge prerendering, or build-time static generation
    • The right choice depends on publishing cadence, not site size
