After you complete the prerequisite steps, complete the steps in this article to deploy the Agentic Retrieval extension.
To try Agentic Retrieval without the need for local hardware, see Quickstart: Install Agentic Retrieval.
Important
Agentic Retrieval in Foundry Local is currently in PREVIEW. See the Supplemental Terms of Use for Microsoft Azure Previews for legal terms that apply to Azure features that are in beta, preview, or otherwise not yet released into general availability.
Prerequisites
Before you begin, complete the deployment prerequisites for Agentic Retrieval.
Deploy the extension
Deploy Agentic Retrieval by using either the Azure portal or Azure CLI.
In the Azure portal, go to the Azure Kubernetes cluster on Azure Local.
Select Settings > Extensions > + Add, and then select Agentic Retrieval from the list.
Select Create.
On the Basics tab, provide the following information:
| Field | Value |
|---|---|
| Subscription | Select the subscription that contains your Azure Kubernetes Service (AKS) cluster on Azure Local. |
| Resource group | Select the resource group that contains your AKS Arc cluster. |
| Deployment name | Provide a name for the deployment. |
| Region | Select the region to deploy Agentic Retrieval. |
| Cluster | Select the cluster that you want to deploy Agentic Retrieval to. |

Select Next.
On the Configurations tab, provide the following information:
| Section | Field | Value |
|---|---|---|
| Capabilities | | Select one or both components to include in the deployment. |
| | Agentic Retrieval Engine | Select this option to install the agentic retrieval engine. |
| | Knowledge sources layer | Select this option to install the knowledge sources layer. |
| Deployment mode | Deployment mode | Select GPU or CPU based on your available hardware. This setting applies to the Knowledge Sources layer. |
| SharePoint Server | Enable SharePoint data source | Optional. If you want to connect to SharePoint by using Key Vault authentication, select this option. |
| | Key Vault name | Required only when SharePoint ingestion is selected. Enter the Azure Key Vault name. |
| | KV cert secret name | Required only when SharePoint ingestion is selected. Enter the Key Vault secret name that stores the certificate. |
| | KV cert password secret name | Required only when SharePoint ingestion is selected. Enter the Key Vault secret name that stores the certificate password. |
| | Workload identity client ID | Required only when SharePoint ingestion is selected. Enter the workload identity client ID (GUID). |
| NFS kerberos authentication | Enable kerberos authentication | Optional. If you want to connect to an NFS server by using Kerberos authentication, select this option. |
| | Kerberos SPN | Required only when Kerberos is selected. Enter the SPN in the format service/host@REALM (for example, nfs/edgerag-svc@CONTOSO.COM). |
| Inference model | Language model source | Select Foundry Local or Bring your own. |
| | Application ID | Required only when Foundry Local is selected. |
| | Language model name | Required. Enter your deployed language model name. |
| | LLM endpoint | Required. Enter your OpenAI-compatible endpoint URL. For example: https://<Foundry_Resource_Name>.openai.azure.com/openai/deployments/<model_name>/chat/completions?api-version=<API_VERSION>. For Foundry Local on Azure Local, use your cluster-internal endpoint. |
| | Max token (K) | Required. Enter a value from 4K to 2048K. |

Select Next.
On the Access tab, provide the following information:
| Section | Field | Value |
|---|---|---|
| SSL settings | SSL CNAME | Enter the domain name for your system. The domain name should match the redirect URI used during app registration and must not include the https:// prefix. For example, arcrag.contoso.com. |
| | Kubernetes SSL secret name | Enter the name of the Kubernetes secret to store the SSL certificate. By default, Agentic Retrieval uses a self-signed SSL certificate in this secret. After installation, you can replace it with a signed certificate. |
| Entra ID | Entra application ID | Enter the application ID from the enterprise application you registered for authentication. |
| | Entra tenant ID | Enter the tenant ID from the enterprise application you registered for authentication. |

Select Review + create.
Review and validate the parameters you provided.
Select Create to complete the Agentic Retrieval deployment.
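Several fields on the Configurations and Access tabs have format rules (the Kerberos SPN pattern, the Max token range, and the scheme-free SSL CNAME). The following is a minimal sketch of how those rules could be checked in a script before you enter the values. The function names are hypothetical and are not part of any Azure tooling, and the portal may apply different or stricter validation.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight checks; the rules are taken from the field descriptions above.

# Kerberos SPN: service/host@REALM, for example nfs/edgerag-svc@CONTOSO.COM.
valid_spn() {
  printf '%s\n' "$1" | grep -Eq '^[^/@[:space:]]+/[^/@[:space:]]+@[^/@[:space:]]+$'
}

# Max token (K): a whole number from 4 to 2048. A trailing K is tolerated.
valid_max_token_k() {
  local value
  value=${1%[Kk]}
  case $value in
    ''|*[!0-9]*) return 1 ;;   # empty or contains a non-digit
  esac
  [ "$value" -ge 4 ] && [ "$value" -le 2048 ]
}

# SSL CNAME: a bare domain name with no scheme, path, or spaces.
valid_cname() {
  case $1 in
    ''|*://*|*/*|*' '*) return 1 ;;
  esac
  return 0
}
```

For example, `valid_cname "https://arcrag.contoso.com"` fails because of the scheme prefix, while `valid_cname "arcrag.contoso.com"` succeeds.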
When the deployment is complete, under Extensions, validate that the extension types microsoft.arc.rag and microsoft.extensiondiagnostics are listed.
The Agentic Retrieval extension deployment typically takes about 30 minutes but can take longer depending on your connectivity.
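The extension check can also be done from the command line. The sketch below assumes that the cluster is registered as an Arc-connected cluster (`connectedClusters`) and that the Azure CLI `k8s-extension` extension is installed. The helper name `check_extension_types` is hypothetical, and the cluster and resource group names are placeholders.

```shell
#!/usr/bin/env bash
# Reads extension types (one per line) on stdin and reports whether both
# extension types named above are present. Returns non-zero if any is missing.
check_extension_types() {
  local types required missing=0
  types=$(cat)
  for required in microsoft.arc.rag microsoft.extensiondiagnostics; do
    if printf '%s\n' "$types" | grep -Fxiq "$required"; then
      echo "found: $required"
    else
      echo "missing: $required"
      missing=1
    fi
  done
  return "$missing"
}

# Usage (assumed cluster type; replace the placeholders with your own values):
# az k8s-extension list \
#   --cluster-name <cluster-name> \
#   --resource-group <resource-group> \
#   --cluster-type connectedClusters \
#   --query "[].extensionType" --output tsv | check_extension_types
```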
Verify deployment by mode
After deployment, verify that the pods running in the arc-rag namespace match the components you deployed (both layers, the agentic layer only, or the knowledge layer only):
| Mode | Expected pods |
|---|---|
| Combined | All Knowledge Layer pods (ingestionapi, inferencingflow, vectordb-api-server, embedding models, docling, milvus, postgres) + all Agentic Layer pods (agent-manager, agents-runtime, knowledge-sources, indexed-sources-mcp-server) |
| Agentic | Agentic Layer pods only (agent-manager, agents-runtime, knowledge-sources, indexed-sources-mcp-server, postgres) |
| Knowledge | Knowledge Layer pods only (ingestionapi, inferencingflow, vectordb-api-server, embedding models, docling, milvus, postgres) |
Run the following command to check:
kubectl get pods -n arc-rag
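Comparing the pod list against the table by eye is error-prone, so a small filter can help. The following sketch reads the `kubectl get pods` output and prints each expected component that has no pod in the Running state. It assumes that each pod name contains the component name from the table, which may not hold in your cluster. The function name `missing_components` is hypothetical, and "embedding models" is skipped because the table gives no pod name for it.

```shell
#!/usr/bin/env bash
# Usage: kubectl get pods -n arc-rag | missing_components <combined|agentic|knowledge>
# Prints one line per expected component without a Running pod.
# Returns 0 if nothing is missing, 1 if something is missing, 2 for an unknown mode.
missing_components() {
  local mode=$1 pods expected name missing=0
  local knowledge="ingestionapi inferencingflow vectordb-api-server docling milvus postgres"
  local agentic="agent-manager agents-runtime knowledge-sources indexed-sources-mcp-server"
  case $mode in
    combined)  expected="$knowledge $agentic" ;;
    agentic)   expected="$agentic postgres" ;;
    knowledge) expected="$knowledge" ;;
    *) echo "unknown mode: $mode" >&2; return 2 ;;
  esac
  pods=$(cat)
  for name in $expected; do
    # Column 1 is the pod name and column 3 is STATUS in the default kubectl output.
    if ! printf '%s\n' "$pods" |
        awk -v n="$name" 'index($1, n) && $3 == "Running" { found = 1 } END { exit !found }'; then
      echo "$name"
      missing=1
    fi
  done
  return "$missing"
}
```

An empty result with exit status 0 means every listed component has at least one Running pod. It does not confirm that all containers in those pods are ready.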
Verify end-to-end connectivity
After deployment, verify that the extension can communicate with Foundry Local:
Check Foundry model availability (port-forward is required because endpoint.enabled=false):

    kubectl port-forward svc/gpt-oss-20b 5000:5000 -n foundry-local-operator
    # In another terminal:
    curl http://localhost:5000/v1/models

Test a chat completion via port-forward:

    curl -X POST http://localhost:5000/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{
        "model": "gpt-oss-20b",
        "messages": [
          {"role": "user", "content": "Hello, what is 2+2?"}
        ],
        "temperature": 0.7,
        "max_tokens": 100
      }'

Test the extension's inference endpoint. The URL is quoted so that the shell doesn't interpret the ? character:

    curl -X POST "http://localhost:3001/edgeai/chat/completions?api-version=2024-10-01-preview" \
      -H "Content-Type: application/json" \
      -H "x-user-role: dev" \
      -d '{
        "messages": [{"role": "user", "content": "Test question"}],
        "data_sources": [{"type": "milvus", "parameters": {"endpoint": "", "index_name": "edgeragapp"}}]
      }'
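To turn the model availability check into a pass/fail result, the response from /v1/models can be searched for the model ID. This is a rough sketch that assumes the endpoint returns an OpenAI-style list of the form {"data": [{"id": "<model>", ...}]}. The function name `model_listed` is hypothetical, and the check is a plain text match, not a JSON parser.

```shell
#!/usr/bin/env bash
# Succeeds when the JSON read from stdin contains an "id" entry equal to the model name.
# Whitespace is stripped first so that "id": "x" and "id":"x" both match.
model_listed() {
  local model=$1
  tr -d '[:space:]' | grep -Fq "\"id\":\"$model\""
}

# Usage, with the port-forward from the first check still running:
# curl -s http://localhost:5000/v1/models | model_listed gpt-oss-20b && echo "model available"
```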
Configure post-deployment authentication
After deploying the Agentic Retrieval extension, complete the authentication task that matches your language model source:
- If you installed the Foundry Local on Azure Local Azure Arc extension and deployed the Agentic Retrieval extension with foundryClientId (Foundry Local), complete Configure Foundry Local inference authentication for Agentic Retrieval.
- If you deployed the extension for BYOM (useFoundryLocal=false), complete Configure BYOM endpoint authentication for Agentic Retrieval.
Related content
- Deployment parameter reference and troubleshooting
- Configure endpoint authentication for Agentic Retrieval
- Custom certificate authority in Azure Kubernetes Service (AKS)
- Knowledge layer configuration
- Add data source for the chat solution in Agentic Retrieval
- Compare Azure Government and global Azure