Azure Speech in Foundry Tools lets your agent convert speech to text and generate speech audio from text. You connect the tool by adding a remote Model Context Protocol (MCP) server to your agent in Foundry Agent Service.
Important
The Speech MCP tool doesn't support Network-secured Microsoft Foundry. For more information, see Connect to Model Context Protocol servers.
Prerequisites
- An Azure subscription. Create one for free.
- A Microsoft Foundry resource created in a supported region. Your Foundry resource includes speech capabilities and is used by the Speech MCP server.
- Your Foundry resource must support MCP tools in Foundry Agent Service. MCP tools require the Agent Service Enterprise tier. For the list of regions and models that support MCP tools, see Tool support by region and model.
Note
If you receive the error "Invalid tool value(s): mcp. Use the Enterprise offerings to access these tool(s)" when you connect the tool, your resource doesn't support MCP tools. Create a new Foundry resource that supports the Enterprise tier in a supported region.
Usage support
This article shows how to connect the tool in Foundry portal.
If you want to work with code, see Connect to Model Context Protocol servers for SDK examples in Python, C#, and JavaScript.
Security and privacy
Treat your Speech resource key and storage shared access signature (SAS) URLs as secrets:
- Don't paste keys or SAS URLs into agent prompts, chat transcripts, screenshots, or source control.
- Use the shortest practical SAS expiry time.
- Scope SAS URLs to the minimum required resource (for example, a single container).
- Rotate keys periodically as a security best practice, or immediately if you suspect they're exposed.
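One way to follow the first guideline in scripts is to mask the signature before a SAS URL is ever logged or printed. The following is a minimal sketch, not part of the Speech tool: the helper name `redact_sas_url` and the `SENSITIVE_PARAMS` set are illustrative assumptions, and it relies only on the fact that a SAS URL carries its signature in the `sig` query parameter.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Query parameters whose values should never reach logs.
# "sig" is the SAS signature; extend this set as needed (assumption).
SENSITIVE_PARAMS = {"sig"}


def redact_sas_url(url: str) -> str:
    """Return the URL with sensitive query values replaced by 'REDACTED'."""
    parts = urlsplit(url)
    query = [
        (key, "REDACTED" if key.lower() in SENSITIVE_PARAMS else value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical SAS URL used only for illustration.
example = (
    "https://example.blob.core.windows.net/speech-audio"
    "?sp=rw&se=2030-01-01T00%3A00%3A00Z&sig=abc123secret"
)
safe_for_logs = redact_sas_url(example)
```

The redacted URL still shows the permissions and expiry, which is usually enough for debugging, but it can no longer be used to access the container.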
Set up storage
You need an Azure Storage account to store input audio files for speech-to-text processing and receive output audio files from text-to-speech processing.
- Create an Azure Storage account if you don't already have one.
- Assign the Storage Blob Data Contributor role to your user account on the storage account so you can generate SAS URLs:
- In the Azure portal, go to your storage account.
- Select Access control (IAM) > Add > Add role assignment.
- Select the Storage Blob Data Contributor role, assign it to your user account, and then select Review + assign.
- Create one or more blob containers to store the input and output audio files. In your storage account, select Containers > + Container, enter a name (for example, speech-audio), and then select Create.
Create an agent
- Go to Microsoft Foundry.
- In the upper-right menu, select Build.
- In the left pane, select Agents, and then select Create agent.
- Enter a name and description.
- Select a model that supports tool calling, such as gpt-4o or later. Models without tool-calling capability can't invoke the Speech MCP server.
- Select Create.
Your agent appears in the Agents list. Select it to open the agent playground.
Connect the Azure Speech tool to your agent
In your agent, open the agent playground.
Under Tools, select Add > Add a new tool.
In Select a tool, select the Catalog tab.
Search for Azure Speech MCP Server, select it, and then select Create.
On the setup page, fill in the following fields:
| Section | Field | Value |
|---|---|---|
| Parameters | foundry-resource-name | The name of the Foundry resource you created in the prerequisites. |
| Authorization | Bearer (API key) | Either KEY1 or KEY2 from your Foundry resource's Keys and Endpoint page in the Azure portal. |
| Authorization | X-Blob-Container-Url | A SAS URL for your storage container with read and write permissions. The service stores audio output files in this container. To generate the container SAS URL: in the Azure portal, go to your storage container, select Shared access tokens, set Permissions to Read and Write, set the shortest practical expiry time, select Generate SAS token and URL, and then copy the Blob SAS URL. |
Select Connect to add the remote Speech MCP server as a tool for your agent.
After connecting, the Speech tool appears in your agent's Tools list with a connected status. If it doesn't, see Troubleshooting.
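Before you paste a container SAS URL into the X-Blob-Container-Url field, it can help to confirm that it actually grants read and write. The following sketch assumes the standard SAS query layout, where the `sp` parameter lists permission letters such as `r` and `w`. The helper names are hypothetical, and the check only inspects the URL text; it doesn't contact Azure Storage.

```python
from urllib.parse import parse_qs, urlsplit


def sas_permissions(url: str) -> set:
    """Return the permission letters found in the SAS 'sp' parameter."""
    params = parse_qs(urlsplit(url).query)
    return set(params.get("sp", [""])[0])


def has_permissions(url: str, required: str) -> bool:
    """True if every letter in 'required' appears in the SAS permissions."""
    return set(required) <= sas_permissions(url)


# Hypothetical URLs used only for illustration.
container_url = "https://example.blob.core.windows.net/speech-audio?sp=rw&sig=x"
read_only_url = "https://example.blob.core.windows.net/speech-audio?sp=r&sig=x"

container_ok = has_permissions(container_url, "rw")   # read and write present
read_only_ok = has_permissions(read_only_url, "rw")   # write is missing
```

A URL that fails this check is a likely cause of broken output audio links, because the service can't write the synthesized file to the container.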
Test the Azure Speech tool
In the agent playground chat, enter "What can you do?"
The agent lists its available capabilities, including speech-to-text and text-to-speech. This confirms that the remote Speech MCP server is connected.
Test speech-to-text
The Speech tool can convert an audio file to text. The audio file can be stored in Azure Blob Storage and accessed with a SAS URL, or it can be any publicly accessible URL to an audio file.
Note
Supported audio formats include WAV, MP3, OGG, FLAC, and other common formats. For best results with speech recognition, use WAV files with 16 kHz sample rate and 16-bit depth.
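If you want to confirm that a WAV file matches the recommended 16 kHz, 16-bit settings before you upload it, a local check along these lines may help. This is a sketch that uses Python's standard `wave` module; the function names are illustrative, and `make_silent_wav` exists only to produce sample data for the demonstration.

```python
import io
import wave


def wav_matches_recommendation(source, rate=16000, sample_width_bytes=2) -> bool:
    """Check a WAV file (path or binary file object) for 16 kHz, 16-bit audio."""
    with wave.open(source, "rb") as wav:
        return (
            wav.getframerate() == rate
            and wav.getsampwidth() == sample_width_bytes
        )


def make_silent_wav(rate, sample_width_bytes=2, frames=160) -> io.BytesIO:
    """Build a short mono WAV of silence in memory (demo data only)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(sample_width_bytes)
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * frames * sample_width_bytes)
    buffer.seek(0)
    return buffer


recommended = wav_matches_recommendation(make_silent_wav(16000))
not_recommended = wav_matches_recommendation(make_silent_wav(44100))
```

For a real file, pass its path instead of an in-memory buffer, for example `wav_matches_recommendation("meeting.wav")`, where the file name is a placeholder.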
Upload your audio file to your Azure blob storage container.
Generate a SAS URL for the file using one of the following methods:
Azure portal:
- Select your uploaded audio file.
- In Properties, select Generate SAS.
- Under Permissions, select Read.
- Set the shortest practical expiry time, and then select Generate SAS token and URL.
Azure CLI:
    az storage blob generate-sas \
        --account-name <your-storage-account> \
        --container-name <your-container> \
        --name <your-audio-file> \
        --permissions r \
        --expiry <expiry-datetime-utc> \
        --auth-mode login \
        --as-user \
        --full-uri -o tsv

Copy the SAS URL. Then use it in one of the following example prompts in the agent chat window:
- Recognize this English audio file located in <blob SAS URL>
- Recognize the audio file located in <blob SAS URL> with these phrase hints: "Azure, OpenAI, Cognitive Services, Lucy" to improve accuracy.
- Convert this audio file located in <blob SAS URL> into text and summarize it for me.
- Recognize this French audio file located in <blob SAS URL> with detailed output format.
- Recognize this Hindi audio file located in <blob SAS URL> and remove profanity.
View the output text in the chat window.
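The `--expiry` value in the Azure CLI command is a UTC timestamp. The sketch below computes a short-lived value, assuming the `Y-m-d'T'H:M'Z'` format that the Azure CLI help describes for this option. The function name and the 30-minute default are illustrative choices, not recommendations from the service.

```python
from datetime import datetime, timedelta, timezone


def sas_expiry(minutes: int = 30, now: datetime = None) -> str:
    """Return a UTC expiry timestamp 'minutes' from now, e.g. 2025-01-01T12:30Z."""
    current = now or datetime.now(timezone.utc)
    expiry = current.astimezone(timezone.utc) + timedelta(minutes=minutes)
    return expiry.strftime("%Y-%m-%dT%H:%MZ")


# Fixed reference time so the example is reproducible.
reference = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
expiry_value = sas_expiry(30, now=reference)
```

Pass the resulting string as the `<expiry-datetime-utc>` placeholder. A window just long enough to run the prompt keeps the exposure small if the URL leaks.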
Test text-to-speech
Start a new chat in the agent playground, and use one of the following example prompts. Replace the placeholder with your own text:
- Convert text to speech: <your text to speak>
- Synthesize speech from "<your text to speak>"
- Generate speech audio from text "<your text to speak>"
- Convert text to speech with Chinese language: <your text to speak>
- Synthesize speech with voice en-US-JennyNeural from text <your text to speak>
The output audio is saved as a WAV file in your blob container. An audio link is displayed in the chat window. Select it to listen to the output.
Troubleshooting
| Issue | Likely cause | Resolution |
|---|---|---|
| You can't find Azure Speech MCP Server in the tool catalog. | The tool isn't available for your tenant, region, or resource tier. | Confirm your Foundry resource is created in a supported region and supports MCP tools (Agent Service Enterprise tier). |
| Connect fails with "Invalid tool value(s): mcp. Use the Enterprise offerings to access these tool(s)." | Your Foundry resource doesn't support MCP tools. | Create a new Foundry resource with the Enterprise Agent Service tier in a supported region. |
| Connect fails with authorization errors. | The API key is incorrect or expired. | Recopy KEY1 or KEY2 from your resource's Keys and Endpoint page. Rotate keys if needed. |
| Speech output audio link doesn't work. | The container SAS URL is invalid, expired, or missing permissions. | Regenerate the container SAS URL with read and write permissions and a valid expiry time. |
| Speech-to-text can't access the audio file. | The file SAS URL is invalid or expired. | Regenerate the file SAS URL with read permission and retry the prompt. |
| Agent doesn't list speech capabilities after you connect the tool. | The selected model doesn't support tool calling. | Select a tool-capable model (such as gpt-4o or later) in the agent configuration. |
| Audio upload fails with a permission error. | Your account lacks the Storage Blob Data Contributor role on the container. | Assign the Storage Blob Data Contributor role to your user account on the storage account. |
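Several rows in the table come down to an expired SAS URL. As a quick local diagnostic, you can read the expiry from the URL's `se` (signed expiry) query parameter. The following is a sketch under that assumption: the function names are hypothetical, and a URL that relies on a stored access policy may not carry `se` at all, in which case the check raises an error rather than guessing.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit


def sas_expiry_time(url: str):
    """Return the SAS expiry as an aware UTC datetime, or None if 'se' is absent."""
    values = parse_qs(urlsplit(url).query).get("se")
    if not values:
        return None
    text = values[0]
    if text.endswith("Z"):
        # Older Python versions don't parse a trailing 'Z' in fromisoformat.
        text = text[:-1] + "+00:00"
    expiry = datetime.fromisoformat(text)
    if expiry.tzinfo is None:
        # Date-only values carry no offset; SAS times are UTC.
        expiry = expiry.replace(tzinfo=timezone.utc)
    return expiry


def sas_is_expired(url: str, now: datetime = None) -> bool:
    """True if the SAS expiry time has passed."""
    expiry = sas_expiry_time(url)
    if expiry is None:
        raise ValueError("URL has no 'se' (signed expiry) parameter")
    return (now or datetime.now(timezone.utc)) >= expiry


# Hypothetical file SAS URL used only for illustration.
file_url = (
    "https://example.blob.core.windows.net/speech-audio/a.wav"
    "?sp=r&se=2025-01-01T12%3A30%3A00Z&sig=x"
)
before = sas_is_expired(file_url, now=datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc))
after = sas_is_expired(file_url, now=datetime(2025, 1, 1, 13, 0, tzinfo=timezone.utc))
```

If the URL is expired, regenerate it as described in the table. If it's still valid, check the permissions and the blob path next.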
Clean up resources
If you created resources only to test this tool, remove them to avoid ongoing costs:
- Delete the agent: In the Foundry portal, go to Agents, select your test agent, and then select Delete.
- Delete the storage container (optional): In the Azure portal, go to your storage container and select Delete.
- Revoke SAS URLs: SAS URLs expire automatically based on the expiry time you set. To revoke all SAS tokens immediately, rotate your storage account keys on the Access keys page.
Next steps
- Learn more about MCP connections: Connect to Model Context Protocol servers.
- Review tool usage guidance: Tool best practices.