OAuth Discovery Metadata Missing or Broken

If agents or third-party clients need OAuth to access your service, they should be able to discover the correct authorization metadata without guessing endpoints by hand.

Start here

Before You Fix It: What This Check Means

OAuth Discovery Metadata Missing or Broken tells you whether AI crawlers, agents, and third-party clients can discover how to authorize against your service without hand-configured endpoints. Scavo probes `/.well-known/oauth-authorization-server`, `/.well-known/openid-configuration`, and `/.well-known/oauth-protected-resource`. It also checks whether the scanned URL advertises a `resource_metadata` hint in a `WWW-Authenticate` header.

Why this matters in practice: unclear machine-facing signals can reduce retrieval quality and citation consistency.

How to use this result: treat it as directional evidence, not final truth. Answer-engine retrieval behavior can shift over time even when your technical setup is stable. First, confirm the issue in live output: verify bot-facing responses and policy files on the final URL. Then ship one controlled change: if the resource is protected, publish `/.well-known/oauth-protected-resource` with an `authorization_servers` field pointing at at least one valid issuer. Finally, re-scan the same URL to confirm the result improves.

TL;DR: Protected resources should advertise where authorization lives. If your site exposes protected APIs or agent endpoints, discovery metadata is the cleanest way to make login flows interoperable.

Treat "OAuth Discovery Metadata Missing or Broken" as a practical fix cycle: isolate the break, patch the smallest safe thing, and confirm quickly. The payoff is more reliable AI crawl and citation signals over time.

This check only becomes important if your site exposes protected resources that an external client or agent might need to access. For ordinary public marketing pages, an info result is normal.

Where it does matter, the goal is to remove guesswork. A client should be able to discover the authorization server, the protected resource metadata, and the right OAuth/OpenID configuration using the well-known endpoints defined by the standards.

What Scavo checks (plain English)

Scavo probes `/.well-known/oauth-authorization-server`, `/.well-known/openid-configuration`, and `/.well-known/oauth-protected-resource`. It also checks whether the scanned URL advertises a `resource_metadata` hint in a `WWW-Authenticate` header.

A pass means Scavo found valid public JSON discovery metadata. A warning means discovery was advertised or returned 200, but the document was invalid or incomplete. An info result means no discovery metadata was detected.

  • Scan key: ai_oauth_discovery
  • Category: AI_VISIBILITY

How Scavo scores this check

  • Pass: valid OAuth authorization server metadata, OpenID configuration, or protected resource metadata was found.
  • Warning: a discovery endpoint or `resource_metadata` hint exists, but the response is broken, non-JSON, or otherwise unusable.
  • Info: no OAuth discovery metadata was detected.
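The outcomes above can be sketched as a classifier for a single well-known response. This is an illustration of the rules as described, not Scavo's actual implementation; the function name and exact conditions are assumptions, and the real scan aggregates across several endpoints:

```python
import json

def classify_discovery_response(status: int, content_type: str, body: str) -> str:
    """Approximate pass/warning/info for one well-known endpoint response."""
    if status != 200:
        return "info"  # nothing advertised at this endpoint
    if "json" not in content_type.lower():
        return "warning"  # endpoint exists but serves HTML or a login page
    try:
        doc = json.loads(body)
    except ValueError:
        return "warning"  # advertised as JSON but does not parse
    if not isinstance(doc, dict) or not doc:
        return "warning"  # parses, but is empty or not a metadata object
    return "pass"
```

A 404 therefore lands on info (nothing advertised), while a 200 that serves a branded login page lands on warning (advertised but unusable).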

Why fixing this matters

Without discovery metadata, every client has to be preconfigured by hand or pushed into unsafe browser-cookie workarounds. That does not scale well once you want agents, integrations, or external tooling to access protected resources on behalf of a user.

Discovery also reduces support burden. A well-formed metadata document makes the integration surface more predictable and easier to verify during onboarding or incident response.

Common reasons this check flags

  • The endpoint returns HTML or a branded login page instead of JSON.
  • Only one of the expected well-known documents exists, while the rest are missing or stale.
  • A WWW-Authenticate header points to a resource_metadata URL that does not resolve cleanly.
  • Protected resource metadata exists, but it does not identify the authorization server correctly.

If you are not technical

  1. Only prioritize this if your service actually exposes protected API or agent endpoints. A public brochure site does not need it.
  2. Ask engineering to name the protected resource, the authorization server, and the canonical public URLs for each discovery document.
  3. Require one demo of the end-to-end login flow after the metadata is published. Discovery files that look correct but do not work in practice are not enough.

Technical handoff message

Copy and share this with your developer.

Scavo flagged OAuth Resource Discovery (ai_oauth_discovery). Please publish or fix the live well-known metadata documents for the protected resource and authorization server, confirm they return valid JSON, and re-run the scan once the production OAuth discovery flow is stable.

If you are technical

  1. If the resource is protected, publish `/.well-known/oauth-protected-resource` and include the `authorization_servers` field pointing at at least one valid issuer.
  2. Publish OAuth authorization server metadata at `/.well-known/oauth-authorization-server` and/or OpenID configuration at `/.well-known/openid-configuration` for the issuer you support.
  3. On 401 Unauthorized responses for protected resources, include a `WWW-Authenticate` challenge with `resource_metadata="..."` when appropriate.
  4. Return real JSON documents from the well-known URLs, not HTML fallback pages or redirect loops.
  5. Validate the metadata against your actual token, scope, and client-registration setup before treating the rollout as complete.
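To make steps 1 and 3 concrete, here is a sketch of the two artifacts they produce. The hosts are hypothetical placeholders, and the field names follow RFC 9728 (OAuth 2.0 Protected Resource Metadata); adapt both to your real resource and issuer:

```python
import json

# Hypothetical hosts; substitute your real resource and issuer URLs.
RESOURCE = "https://api.example.com"
ISSUER = "https://auth.example.com"

# Step 1: document to serve (as JSON, not HTML) at
# /.well-known/oauth-protected-resource on the resource origin.
protected_resource_metadata = {
    "resource": RESOURCE,
    "authorization_servers": [ISSUER],
    "bearer_methods_supported": ["header"],
}

# Step 3: challenge header to send alongside a 401 Unauthorized,
# pointing clients at the document above.
www_authenticate = (
    f'Bearer resource_metadata="{RESOURCE}/.well-known/oauth-protected-resource"'
)

print(json.dumps(protected_resource_metadata, indent=2))
print(www_authenticate)
```

With both in place, a client that hits a 401 can follow the hint to the metadata, learn the issuer, and then fetch that issuer's own discovery document from step 2.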

How to verify

  • Fetch each well-known URL directly and confirm it returns valid JSON from production.
  • If the resource is protected, trigger a 401 response and inspect the WWW-Authenticate header for the resource_metadata hint.
  • Run one real OAuth flow with a test client or agent after publishing the metadata.
  • Re-run Scavo and confirm the warning is gone or the result remains intentional info.
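For the first verification step, a small helper that builds the three well-known URLs from any scanned URL can save typos. It assumes the documents live at the site origin; issuers with a path component have different insertion rules (see RFC 8414), which this sketch does not cover:

```python
from urllib.parse import urlsplit

WELL_KNOWN_PATHS = (
    "/.well-known/oauth-protected-resource",
    "/.well-known/oauth-authorization-server",
    "/.well-known/openid-configuration",
)

def discovery_urls(scanned_url: str) -> list[str]:
    """Return the well-known URLs to fetch for the origin of scanned_url."""
    parts = urlsplit(scanned_url)
    origin = f"{parts.scheme}://{parts.netloc}"
    return [origin + path for path in WELL_KNOWN_PATHS]
```

Fetch each returned URL from production and confirm the body is JSON, not an HTML fallback or a redirect loop.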

What this scan cannot confirm

  • Scavo does not complete the full OAuth flow or validate token semantics end-to-end.
  • Scavo cannot decide whether your product should expose protected resources to third-party agents. It only checks whether the published discovery surface is technically usable.

Owner checklist

  • [ ] Name one owner for this check and note where it is controlled (app, CDN, server, or CMS).
  • [ ] Add a release gate for this signal so regressions are caught before production.
  • [ ] After deploys that touch this area, run a follow-up scan and confirm the result is still healthy.
  • [ ] Re-check AI crawler and citation signals after robots, schema, or author metadata changes.

FAQ

What does Scavo actually validate for OAuth Discovery Metadata Missing or Broken?

Scavo checks live production responses using the same logic shown in your dashboard and weekly report.

Will AI visibility changes show immediately after we ship fixes?

Usually not instantly. Crawlers and answer engines refresh on different schedules, so confirm technical signals first, then monitor citations and mentions over time.

What is the fastest way to confirm the fix worked?

Run one on-demand scan after deployment, open this check in the report, and confirm it moved to pass or expected info. Then verify at source (headers, HTML, or network traces) so the fix is reproducible.

How do we keep this from regressing?

Keep one owner, keep config in version control, and watch at least one weekly report cycle. If this regresses, compare the release diff and edge configuration first.

Need stack-specific help? Send support your stack + check key and we will map the fix.

More checks in this area

ai_bot_access_parity

AI Crawlers Blocked More Restrictively Than Search Engines

ClaudeBot saw the highest growth in block rates — increasing 32.67% year-over-year (EngageCoders, 2024). If you block AI crawlers while allowing Googlebot, you're letting Google use your content in its AI products (Gemini, AI Overviews) while excluding others. Consider whether this asymmetry aligns with your content strategy, or whether parity across all bots better serves your interests.

ai_chunkability

Content Not Structured for AI Processing

44.2% of AI citations come from the first 30% of content (Profound), so front-loading key facts matters. AI models work better with structured, chunked content — clear headers, concise paragraphs, fact boxes, and attributed claims. Walls of unstructured text force AI to guess at relevance, reducing your chances of being cited or recommended in AI-generated responses.

ai_citation_readiness

Content Not Structured for AI Citation

44.2% of all LLM citations come from the first 30% of text, with content depth and readability being the most important factors for citation (Profound). AI-driven referral traffic increased more than tenfold from July 2024 to February 2025, with 87.4% coming from ChatGPT (Adobe). To be cited, your content needs clear, fact-based claims with attribution — not just narrative prose.
