Use answer synthesis for citation-backed responses in Azure AI Search (preview)

Note

Some agentic retrieval features are generally available in the 2026-04-01 REST API. However, this feature remains in preview and requires a preview REST API. Preview features are provided without a service-level agreement and aren't recommended for production workloads. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.

Important

These features and functionality are part of the 2026-05-01-preview REST API. The 2026-05-01-preview is licensed to you as part of your Azure subscription and is subject to the terms applicable to "Previews" in the Microsoft Product Terms, the Microsoft Products and Services Data Protection Addendum ("DPA"), and the Supplemental Terms of Use for Microsoft Azure Previews.

The 2026-05-01-preview supports connections to other Microsoft services and third-party services. Use of these services is subject to their respective terms and might result in data processing or storage outside of the Azure compliance boundary, as well as data flowing into the Azure compliance boundary.

It's your responsibility to manage whether your data will flow outside of your organization's compliance and geographic boundaries and any related implications, and that appropriate permissions, boundaries, and approvals are provisioned.

You're responsible for carefully reviewing and testing applications you build in the context of your specific use cases and making all appropriate decisions and customizations. This includes implementing your own responsible AI mitigations, such as metaprompts, content filters, or other safety systems, and ensuring your applications meet appropriate quality, reliability, security, and trustworthiness standards. For more information, see the Azure AI Search Transparency Note.

By default, a knowledge base in Azure AI Search performs data extraction, which returns raw grounding chunks from your knowledge sources. Data extraction is useful for retrieving specific information but lacks the context and reasoning necessary for complex queries.

You can instead enable answer synthesis (preview), which uses the LLM specified in your knowledge base to answer queries in natural language. Each answer includes citations to the retrieved sources and follows any instructions you provide, such as using bulleted lists.

You can enable answer synthesis in two ways:

  • On the knowledge base (becomes the default for all queries)
  • On individual retrieval requests (overrides the default)

Important

  • The minimal retrieval reasoning effort disables LLM processing, so it's incompatible with answer synthesis in both knowledge base definitions and retrieval requests. For more information, see Set the retrieval reasoning effort.

  • Answer synthesis incurs pay-as-you-go charges from Azure OpenAI, which are based on the number of input and output tokens. Charges appear under the LLM assigned to the knowledge base. For more information, see Availability and pricing of agentic retrieval.
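The precedence and compatibility rules above can be sketched as a small client-side check. This is an illustrative helper, not part of any SDK. The function name `effective_output_mode`, the `"minimal"` string for the reasoning effort, and the `"extractiveData"` label for the default data extraction mode are assumptions made for the sketch.

```python
from typing import Optional

ANSWER_SYNTHESIS = "answerSynthesis"
# Assumed label for the default data extraction mode; check the REST reference
# for the exact value that your API version expects.
DEFAULT_MODE = "extractiveData"


def effective_output_mode(
    kb_default: Optional[str],
    request_override: Optional[str],
    reasoning_effort: Optional[str] = None,
) -> str:
    """Return the output mode a query would use, or raise if the combination is invalid."""
    # A value on the retrieval request takes precedence over the knowledge base default.
    mode = request_override or kb_default or DEFAULT_MODE
    # Minimal reasoning effort disables LLM processing, so synthesis cannot run.
    if mode == ANSWER_SYNTHESIS and reasoning_effort == "minimal":
        raise ValueError("answerSynthesis is incompatible with minimal reasoning effort")
    return mode
```

A check like this might run before a request is sent, so that an incompatible combination fails locally instead of on the service.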

Prerequisites

  • An Azure AI Search service with a knowledge base that specifies an LLM.

  • Permissions to update knowledge bases. Configure keyless authentication with the Search Service Contributor role assigned to your user account (recommended) or use an API key.

  • For outbound calls to the LLM, the search service must have a managed identity with Cognitive Services User permissions on the Microsoft Foundry resource.

  • The 2026-05-01-preview REST API or an equivalent Azure SDK preview package: .NET | Java | JavaScript | Python

Enable answer synthesis in a knowledge base

This section explains how to enable answer synthesis in an existing knowledge base. Although you can use this configuration for new knowledge bases, knowledge base creation is beyond the scope of this article.

To enable answer synthesis in a knowledge base:

  1. Use the 2026-05-01-preview of Knowledge Base - Create or Update (REST API) to formulate the request.

  2. In the body of the request, set outputMode to answerSynthesis.

  3. (Optional) Use answerInstructions to customize the answer output. The following example instructs the knowledge base to "Use concise bulleted lists".

@search-url = <YOUR SEARCH SERVICE URL>
@api-key = <YOUR API KEY>
@knowledge-base-name = <YOUR KNOWLEDGE BASE NAME>

### Enable answer synthesis in a knowledge base
PUT {{search-url}}/knowledgebases/{{knowledge-base-name}}?api-version=2026-05-01-preview
Content-Type: application/json
api-key: {{api-key}}

{
    "name": "{{knowledge-base-name}}",
    "knowledgeSources": [ ... // OMITTED FOR BREVITY ],
    "models": [ ... // OMITTED FOR BREVITY ],
    "outputMode": "answerSynthesis",
    "answerInstructions": "Use concise bulleted lists"
}
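The request body above elides knowledgeSources and models. One way to avoid retyping them is to start from the knowledge base's existing definition and change only the two answer synthesis properties. The following is a minimal sketch, assuming you already hold the current definition as a Python dictionary. The helper name `with_answer_synthesis` and the placeholder definition are hypothetical.

```python
import copy
import json
from typing import Optional


def with_answer_synthesis(definition: dict, instructions: Optional[str] = None) -> dict:
    """Return a copy of a knowledge base definition with answer synthesis enabled."""
    updated = copy.deepcopy(definition)  # leave the caller's dictionary untouched
    updated["outputMode"] = "answerSynthesis"
    if instructions is not None:
        updated["answerInstructions"] = instructions  # optional property
    return updated


# Placeholder definition; real knowledgeSources and models entries go here.
existing = {"name": "my-knowledge-base", "knowledgeSources": [], "models": []}
body = with_answer_synthesis(existing, "Use concise bulleted lists")
print(json.dumps(body, indent=4))
```

The resulting dictionary can be serialized as the body of the PUT request shown above.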

Note

This example assumes that you're using key-based authentication for local proof-of-concept testing. We recommend role-based access control for production workloads. For more information, see Connect to Azure AI Search using roles.

The equivalent SDK shape sets output_mode (Python) or OutputMode (C#) to answerSynthesis on the KnowledgeBase definition and passes the same KnowledgeBase instance to create_or_update_knowledge_base / CreateOrUpdateKnowledgeBaseAsync:

from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import (
    AzureOpenAIVectorizerParameters,
    KnowledgeBase,
    KnowledgeBaseAzureOpenAIModel,
    KnowledgeSourceReference,
)

aoai_params = AzureOpenAIVectorizerParameters(
    resource_url=aoai_endpoint,
    deployment_name=aoai_gpt_deployment,
    model_name=aoai_gpt_model,
)

knowledge_base = KnowledgeBase(
    name=knowledge_base_name,
    models=[KnowledgeBaseAzureOpenAIModel(azure_open_ai_parameters=aoai_params)],
    knowledge_sources=[KnowledgeSourceReference(name=knowledge_source_name)],
    output_mode="answerSynthesis",
    answer_instructions="Use concise bulleted lists",
)

index_client = SearchIndexClient(endpoint=search_endpoint, credential=credential)
index_client.create_or_update_knowledge_base(knowledge_base)

var aoaiParams = new AzureOpenAIVectorizerParameters
{
    ResourceUri = new Uri(aoaiEndpoint),
    DeploymentName = aoaiGptDeployment,
    ModelName = aoaiGptModel,
};

var knowledgeBase = new KnowledgeBase(
    name: knowledgeBaseName,
    knowledgeSources: new[] { new KnowledgeSourceReference(knowledgeSourceName) })
{
    Models = { new KnowledgeBaseAzureOpenAIModel(aoaiParams) },
    OutputMode = "answerSynthesis",
    AnswerInstructions = "Use concise bulleted lists",
};

await indexClient.CreateOrUpdateKnowledgeBaseAsync(knowledgeBase);

Enable answer synthesis in a retrieve request

For per-query control over the response format, you can enable answer synthesis at query time. This approach overrides the default output mode specified in the knowledge base.

To enable answer synthesis in a retrieve request:

  1. Use the 2026-05-01-preview of Knowledge Retrieval - Retrieve (REST API) to formulate the request.

  2. In the body of the request, set outputMode to answerSynthesis.

@search-url = <YOUR SEARCH SERVICE URL>
@api-key = <YOUR API KEY>
@knowledge-base-name = <YOUR KNOWLEDGE BASE NAME>

### Enable answer synthesis in a retrieve request
POST {{search-url}}/knowledgebases/{{knowledge-base-name}}/retrieve?api-version=2026-05-01-preview
Content-Type: application/json
api-key: {{api-key}}

{
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What is healthcare?"
                }
            ]
        }
    ],
    "outputMode": "answerSynthesis"
}
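The nested message structure in this request body is verbose to write by hand. As a sketch, a small helper can wrap a plain-text question in that structure. The function name `build_retrieve_body` is hypothetical, and the helper only builds the dictionary; it doesn't send the request.

```python
import json


def build_retrieve_body(question: str, output_mode: str = "answerSynthesis") -> dict:
    """Wrap a plain-text question in the message structure shown above."""
    return {
        "messages": [
            {
                "role": "user",
                "content": [{"type": "text", "text": question}],
            }
        ],
        # Set per request, this value overrides the knowledge base default.
        "outputMode": output_mode,
    }


body = build_retrieve_body("What is healthcare?")
print(json.dumps(body, indent=4))
```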

Note

This example assumes that you're using key-based authentication for local proof-of-concept testing. We recommend role-based access control for production workloads. For more information, see Connect to Azure AI Search using roles.

The equivalent SDK shape constructs a KnowledgeBaseRetrievalRequest with output_mode="answerSynthesis" (Python) or OutputMode = "answerSynthesis" (C#):

from azure.search.documents.knowledgebases import KnowledgeBaseRetrievalClient
from azure.search.documents.knowledgebases.models import (
    KnowledgeBaseMessage,
    KnowledgeBaseMessageTextContent,
    KnowledgeBaseRetrievalRequest,
)

agent_client = KnowledgeBaseRetrievalClient(
    endpoint=search_endpoint,
    credential=credential,
    knowledge_base_name=knowledge_base_name,
)

request = KnowledgeBaseRetrievalRequest(
    messages=[
        KnowledgeBaseMessage(
            role="user",
            content=[KnowledgeBaseMessageTextContent(text="What is healthcare?")],
        )
    ],
    output_mode="answerSynthesis",
)

result = agent_client.retrieve(retrieval_request=request)

var client = new KnowledgeBaseRetrievalClient(
    endpoint: new Uri(searchEndpoint),
    knowledgeBaseName: knowledgeBaseName,
    credential: credential);

var request = new KnowledgeBaseRetrievalRequest();
request.Messages.Add(
    new KnowledgeBaseMessage(
        content: new[]
        {
            new KnowledgeBaseMessageTextContent("What is healthcare?")
        }) { Role = "user" });
request.OutputMode = "answerSynthesis";

var result = await client.RetrieveAsync(request);

Get a synthesized answer

When answer synthesis is enabled, Knowledge Retrieval - Retrieve (REST API) returns a natural-language answer that follows any answer instructions specified in the knowledge base. Citations to your knowledge sources are formatted as [ref_id:<number>].

For example, if your instructions are "Use concise bulleted lists" and your query is "What is healthcare?", the response might look like this:

{
  "response": [
    {
      "content": [
        {
          "type": "text",
          "text": "- Healthcare encompasses various services provided to patients and the general population ... // TRIMMED FOR BREVITY"
        }
      ]
    }
  ]
}

The full text output is as follows:

"- Healthcare encompasses various services provided to patients and the general population, including primary health services, hospital care, dental care, mental health services, and alternative health services [ref_id:1].\n- It involves the delivery of safe, effective, patient-centered care through different modalities, such as in-person encounters, shared medical appointments, and group education sessions [ref_id:0].\n- Behavioral health is a significant aspect of healthcare, focusing on the connection between behavior and overall health, including mental health and substance use [ref_id:2].\n- The healthcare system aims to ensure quality of care, access to providers, and accountability for positive outcomes while managing costs effectively [ref_id:2].\n- The global health system is evolving to address complex health needs, emphasizing the importance of cross-sectoral collaboration and addressing social determinants of health [ref_id:4]."
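To work with the answer programmatically, you might pull the text out of the response body and collect the citation markers it contains. The following is a minimal sketch that assumes the response has already been parsed into a Python dictionary with the shape shown earlier. The helper names `answer_text` and `cited_ref_ids` are hypothetical, and the sample payload is a shortened stand-in, not real service output.

```python
import re

# Matches citation markers such as [ref_id:2] and captures the number.
CITATION_PATTERN = re.compile(r"\[ref_id:(\d+)\]")


def answer_text(response_body: dict) -> str:
    """Join every text content item found under the top-level "response" array."""
    parts = []
    for message in response_body.get("response", []):
        for item in message.get("content", []):
            if item.get("type") == "text":
                parts.append(item.get("text", ""))
    return "\n".join(parts)


def cited_ref_ids(text: str) -> list:
    """Return the distinct reference numbers cited in the text, in first-seen order."""
    seen = []
    for match in CITATION_PATTERN.finditer(text):
        ref_id = int(match.group(1))
        if ref_id not in seen:
            seen.append(ref_id)
    return seen


# Shortened stand-in payload that mirrors the response shape shown earlier.
sample = {
    "response": [
        {
            "content": [
                {
                    "type": "text",
                    "text": "- First point [ref_id:1].\n- Second point [ref_id:0].\n- Third point [ref_id:1].",
                }
            ]
        }
    ]
}

text = answer_text(sample)
print(cited_ref_ids(text))  # [1, 0]
```

The returned numbers could then be matched against whatever reference information your knowledge base is configured to return.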

Depending on your knowledge base's configuration, the response might include other information, such as activity logs and reference arrays. For more information, see Create a knowledge base.