Hello, the documentation states:
"Sending a request to the Speech service's REST API requires either a subscription key or an access token. In general, it's easiest to send the subscription key directly. The Speech service then obtains the access token for you. "
I don't believe this is true. I cannot get the https://westus2.tts.speech.microsoft.com/cognitiveservices/v1 endpoint to return a 200 when I pass a subscription key in the "Ocp-Apim-Subscription-Key" header.
Could you post your current code? I got it working in C# so I might be of help.
Sure; here it is ... the value of "key" is the value from the Azure portal, and the value of "ssml" is the XML from the Text-to-Speech sample on this page. I've tried every combination of authorization header and extra header that I can think of - everything returns 401.
It's probably worth mentioning that when I use this key against the /issuetoken endpoint, it works perfectly.
Thanks!
using (var httpClient = new HttpClient())
{
    var request = new HttpRequestMessage(HttpMethod.Post, "https://westus2.tts.speech.microsoft.com/cognitiveservices/v1");
    //request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", key);
    //request.Headers.Authorization = new AuthenticationHeaderValue("Ocp-Apim-Subscription-Key", key);
    //request.Headers.Authorization = new AuthenticationHeaderValue(key);
    request.Headers.Add("Ocp-Apim-Subscription-Key", key);
    request.Headers.Add("X-Microsoft-OutputFormat", "riff-24khz-16bit-mono-pcm");
    var ssmlContent = new StringContent(ssml, Encoding.UTF8);
    ssmlContent.Headers.ContentType = new MediaTypeHeaderValue("application/ssml+xml");
    request.Content = ssmlContent;
    var response = await httpClient.SendAsync(request);
    using (var stream = await response.Content.ReadAsStreamAsync())
    using (var fileStream = new FileStream("output.wav", FileMode.OpenOrCreate, FileAccess.Write))
    {
        await stream.CopyToAsync(fileStream);
    }
}
@overslacked Thanks for the feedback! I have assigned the issue to the content author to investigate further and update the document as appropriate.
@erhopf Hi, I think this is a doc bug. Can you please give more details and update the document as necessary? Thanks a lot!
@YutongTie-MSFT - No problem. Working with the Speech team to identify documentation improvements.
Dear all, any idea why the code below only returns 400 Bad Request? Getting the access token works fine, but the request to Speech to Text does not. Thanks.
import http.client, urllib.request, urllib.parse, urllib.error
import json
from azure.storage.blob import BlockBlobService
headers = {
    'Content-Type': 'application/x-www-form-urlencoded',
    'Ocp-Apim-Subscription-Key': '0d...a6',
}
def get_access_token():
    try:
        conn = http.client.HTTPSConnection('southeastasia.api.cognitive.microsoft.com')
        conn.request("POST", "/sts/v1.0/issueToken", None, headers)
        response = conn.getresponse()
        access_token = response.read()
        conn.close()
        return access_token
    except Exception as e:
        print(e)
acc_tok = get_access_token().decode("utf-8")
account_name = 'mlstudiohahastorage'
account_key = '42...TA=='
container_name = 'speechtt'
service = BlockBlobService(account_name, account_key)
spech = service.get_blob_to_bytes('speechtt','jp1mono.wav')
recorequestheader = {
    'Ocp-Apim-Subscription-Key': '0d...a6',
    # 'Authorization': 'Bearer ' + acc_tok,
    # 'Accept': 'application/json;text/xml',
    # 'Transfer-Encoding': 'chunked',  # optional
    # 'Content-type': 'audio/wav; codec=audio/pcm; samplerate=16000',
}
try:
    conn = http.client.HTTPSConnection('southeastasia.stt.speech.microsoft.com')
    # => both endpoints return Bad Request
    conn = http.client.HTTPSConnection('southeastasia.api.cognitive.microsoft.com')
    conn.request("POST", "speech/recognition/conversation/cognitiveservices/v1?language=ja-JP&format=simple", spech.content, recorequestheader)
    response = conn.getresponse()
    print(str(response.status) + ' ' + response.reason)
    res = response.read()
    print(res)
    conn.close()
except Exception as e:
    print(e)
    print(type(e))
@overslacked - Can you try using one of the other TTS endpoints in the documentation and see if it resolves your issue: https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/rest-apis#text-to-speech.
@erhopf @AlexanderTodorovic Can you please also have a look at Alexander's issue? I don't want to open a duplicate post. Thanks a lot!
@erhopf - I added a loop to test all endpoints, and ran it with each different header combination shown in the initial post. All invocations returned 401 Unauthorized.
I have since moved on to acquiring an access token and using that, which works perfectly well using the same key.
Perhaps @niels9001 could share an example of working C# code?
@overslacked - It's working when you pass the access token in as the Authorization Bearer, but not when you use the subscription key? Are you still setting the "Ocp-Apim-Subscription-Key"?
public class Authentication
{
public static readonly string AccessUri = "URL + /issueToken";
private string apiKey;
public string accessToken;
private Timer accessTokenRenewer;
//Access token expires every 10 minutes. Renew it every 9 minutes only.
private const int RefreshTokenDuration = 9;
private HttpClient client;
public Authentication(string apiKey)
{
this.apiKey = apiKey;
client = new HttpClient();
var getAccessTokenTask = HttpClientPost(AccessUri, this.apiKey);
getAccessTokenTask.Wait();
this.accessToken = getAccessTokenTask.Result;
// renew the token after the specified number of minutes
accessTokenRenewer = new Timer(new TimerCallback(OnTokenExpiredCallback), this, TimeSpan.FromMinutes(RefreshTokenDuration), TimeSpan.FromMilliseconds(-1));
}
public string GetAccessToken()
{
return this.accessToken;
}
private async Task RenewAccessToken()
{
string newAccessToken = await HttpClientPost(AccessUri, this.apiKey);
//swap the new token with old one
//Note: the swap is thread unsafe
this.accessToken = newAccessToken;
Console.WriteLine(string.Format("Renewed token for user: {0} is: {1}",
this.apiKey,
this.accessToken));
}
private void OnTokenExpiredCallback(object stateInfo)
{
try
{
RenewAccessToken();
}
catch (Exception ex)
{
Console.WriteLine(string.Format("Failed renewing access token. Details: {0}", ex.Message));
}
finally
{
try
{
accessTokenRenewer.Change(TimeSpan.FromMinutes(RefreshTokenDuration), TimeSpan.FromMilliseconds(-1));
}
catch (Exception ex)
{
Console.WriteLine(string.Format("Failed to reschedule the timer to renew access token. Details: {0}", ex.Message));
}
}
}
private async Task<string> HttpClientPost(string accessUri, string apiKey)
{
HttpRequestMessage request = new HttpRequestMessage(HttpMethod.Post, accessUri);
request.Headers.Add("Ocp-Apim-Subscription-Key", apiKey);
HttpResponseMessage httpMsg = await client.SendAsync(request).ConfigureAwait(false);
Debug.WriteLine($"Authentication Response status code: [{httpMsg.StatusCode}]");
return await httpMsg.Content.ReadAsStringAsync();
}
}
public class VoiceManager
{
private async static Task<Stream> GenerateSpeech(string SSML)
{
Console.WriteLine("Starting Authentication");
string accessToken = "";
Authentication auth = new Authentication("API KEY");
try
{
accessToken = auth.GetAccessToken();
Console.WriteLine("Token: {0}\n", accessToken);
}
catch (Exception ex)
{
Console.WriteLine("Failed authentication.");
Console.WriteLine(ex.ToString());
Console.WriteLine(ex.Message);
}
Console.WriteLine("Starting TTSSample request code execution.");
CookieContainer cookieContainer = new CookieContainer();
HttpClientHandler handler = new HttpClientHandler() { CookieContainer = cookieContainer };
HttpClient client = new HttpClient(handler);
client.DefaultRequestHeaders.TryAddWithoutValidation("Content-Type", "application/ssml+xml");
client.DefaultRequestHeaders.TryAddWithoutValidation("X-Microsoft-OutputFormat", "riff-16khz-16bit-mono-pcm");
client.DefaultRequestHeaders.TryAddWithoutValidation("Authorization", "Bearer " + accessToken);
client.DefaultRequestHeaders.TryAddWithoutValidation("X-Search-AppId", "07D3234E49CE426DAA29772419F436CA");
client.DefaultRequestHeaders.TryAddWithoutValidation("X-Search-ClientID", "1ECFAE91408841A480F00935DC390960");
client.DefaultRequestHeaders.TryAddWithoutValidation("User-Agent", "TTSClient");
HttpRequestMessage request = new HttpRequestMessage(HttpMethod.Post, "URL ")
{
Content = new StringContent(SSML)
};
HttpResponseMessage httpTask = await client.SendAsync(request, HttpCompletionOption.ResponseHeadersRead, CancellationToken.None);
try
{
if (httpTask.Content != null && httpTask.IsSuccessStatusCode)
{
Stream httpStream = await httpTask.Content.ReadAsStreamAsync().ConfigureAwait(false);
return httpStream;
}
else
{
Debug.WriteLine("ERROR: " + httpTask.StatusCode);
return null;
}
}
catch (Exception e)
{
Debug.WriteLine("ERROR: " + e.GetBaseException());
return null;
}
finally
{
request.Dispose();
client.Dispose();
handler.Dispose();
}
}
}
Please find my UWP code above. I can confirm that this works for the normal TTS service, and also for the custom voice model!
Feel free to clean it up and put it into the official docs!
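For readers following along in Python, the renewal idea in the C# sample above (the comment there says the token expires after 10 minutes, so it is renewed at 9) can be sketched as a small lazy cache. This is only an illustration, not code from the thread: `TokenCache` and `fetch_token` are hypothetical names, and `fetch_token` stands in for whatever function POSTs the key to /issueToken.

```python
import time
from typing import Callable, Optional


class TokenCache:
    """Caches an access token and refetches it once it is older than max_age_seconds."""

    def __init__(self,
                 fetch_token: Callable[[], str],
                 max_age_seconds: float = 9 * 60,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch_token = fetch_token    # hypothetical callable that calls /issueToken
        self._max_age = max_age_seconds    # 9 minutes, mirroring RefreshTokenDuration above
        self._clock = clock                # injectable so the cache can be tested without waiting
        self._token: Optional[str] = None
        self._fetched_at = 0.0

    def get(self) -> str:
        """Returns the cached token, fetching a new one if none exists or it is too old."""
        now = self._clock()
        if self._token is None or now - self._fetched_at >= self._max_age:
            self._token = self._fetch_token()
            self._fetched_at = now
        return self._token
```

Unlike the timer in the C# class, this version refreshes only when `get()` is called, which avoids a background thread at the cost of an occasional slower call.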
@erhopf - When I call the /issueToken endpoint, specifying the key in "Ocp-Apim-Subscription-Key", I am able to acquire an access token. I am then able to call the TTS endpoint using that access token in the authentication header. This is a normal authentication workflow that I am already familiar with.
However, the documentation states that acquiring an access token is not necessary from the /issueToken endpoint. It states:
"Sending a request to the Speech service's REST API requires either a subscription key or an access token. In general, it's easiest to send the subscription key directly. The Speech service then obtains the access token for you. "
I don't believe that part is working as described. I am not able to call the TTS endpoint using only the key. I have tried the following headers, and all return 401:
Authorization: Ocp-Apim-Subscription-Key [key]
Authorization: Bearer [key]
Ocp-Apim-Subscription-Key [key]
Bearer [key]
Authorization [key]
@niels9001 It looks like your code is doing the two-step process, and not the single API call that is the subject of this issue.
@overslacked - Thanks for confirming. I figured that this was the case after reviewing @niels9001's code. I'm working with engineering to confirm that this is only a doc bug and not a bug in the service.
I've changed to the westus endpoint and I've also been trying with a Bearer token, but I'm still receiving 401.
I've tried the code below with the west us endpoint and Bearer token. I can generate the Bearer token successfully. But now the api call never returns.
...
spech = service.get_blob_to_bytes('speechtt','jp1mono.wav')
recorequestheader = {
'Authorization': 'Bearer ' + acc_tok,
'Content-type': 'audio/wav; codec=audio/pcm; samplerate=16000',
}
conn = http.client.HTTPSConnection('westus.stt.speech.microsoft.com')
conn.request("POST", "speech/recognition/conversation/cognitiveservices/v1?language=jp-JP&format=detailed", spech.content, recorequestheader)
response = conn.getresponse()
print(response.msg)
print(str(response.status) + ' ' + response.reason)
res = response.read()
conn.close()
@erhopf and everyone else, thank you for your help and time, it is appreciated!
@overslacked + @AlexanderTodorovic - Hey guys, thanks for your patience. I've confirmed that the TTS endpoint requires the access token to be passed in the Authorization header. This is a bug, which we will be addressing in our documentation.
@AlexanderTodorovic, I'm still looking into Speech to Text. As soon as I have an answer I'll comment in this thread.
@erhopf Thanks for addressing this so quickly. Do you have a temporary workaround for customers?
@YutongTie-MSFT - Yup. The workaround is to send an access token as an Authorization header for all TTS requests. "Ocp-Apim-Subscription-Key" is not supported. Instructions to obtain an access token are available here: https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/rest-apis#authentication.
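A minimal Python sketch of that two-step workaround, for anyone landing here later. The TTS URL and the /sts/v1.0/issueToken path come from earlier posts in this thread; pairing the token host with the westus2 region is an assumption, and the function names are made up for illustration.

```python
import urllib.request

REGION = "westus2"  # assumption: the region of the TTS endpoint quoted in this thread
# Assumed host pattern, by analogy with the southeastasia token host used earlier in the thread.
TOKEN_URL = "https://%s.api.cognitive.microsoft.com/sts/v1.0/issueToken" % REGION
TTS_URL = "https://%s.tts.speech.microsoft.com/cognitiveservices/v1" % REGION


def build_token_request(subscription_key: str) -> urllib.request.Request:
    """Step 1: exchange the subscription key for an access token."""
    return urllib.request.Request(
        TOKEN_URL,
        data=b"",  # empty body; the key travels in the header
        headers={"Ocp-Apim-Subscription-Key": subscription_key},
        method="POST",
    )


def build_tts_request(access_token: str, ssml: str) -> urllib.request.Request:
    """Step 2: call TTS with the token, not the key, in the Authorization header."""
    return urllib.request.Request(
        TTS_URL,
        data=ssml.encode("utf-8"),
        headers={
            "Authorization": "Bearer " + access_token,
            "Content-Type": "application/ssml+xml",
            "X-Microsoft-OutputFormat": "riff-24khz-16bit-mono-pcm",
        },
        method="POST",
    )


def synthesize(subscription_key: str, ssml: str) -> bytes:
    """Runs both steps over the network and returns the audio bytes."""
    with urllib.request.urlopen(build_token_request(subscription_key)) as resp:
        token = resp.read().decode("utf-8")
    with urllib.request.urlopen(build_tts_request(token, ssml)) as resp:
        return resp.read()
```

Calling `synthesize(key, ssml)` needs a valid key and network access; the two `build_*` helpers can be inspected offline to check which headers would be sent.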
@AlexanderTodorovic - Speech to text supports passing either the subscription key or the access token as a header, so you don't need to send both. Looking at your code, it looks like you're missing a leading / in the request path on the line pasted below. Please keep in mind that I haven't had a chance to run your code sample, but I have confirmed that the API is returning 200s:
conn.request("POST", "speech/recognition/conversation/cognitiveservices/v1?language=ja-JP&format=simple", spech.content, recorequestheader)
It should be:
conn.request("POST", "/speech/recognition/conversation/cognitiveservices/v1?language=ja-JP&format=simple", spech.content, recorequestheader)
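One way to avoid this class of mistake is to build the request target and headers in small helpers, so the leading slash and the choice of a single credential are enforced in one place. This is a rough sketch rather than official sample code; `build_stt_target` and `build_stt_headers` are hypothetical names, and the Content-type value is copied from the code earlier in this thread.

```python
from typing import Dict, Optional
from urllib.parse import urlencode

# The path passed to http.client must start with "/"; it is sent verbatim in the request line.
STT_PATH = "/speech/recognition/conversation/cognitiveservices/v1"


def build_stt_target(language: str, result_format: str = "simple") -> str:
    """Returns the path plus query string for a Speech to Text request."""
    return STT_PATH + "?" + urlencode({"language": language, "format": result_format})


def build_stt_headers(subscription_key: Optional[str] = None,
                      access_token: Optional[str] = None) -> Dict[str, str]:
    """Builds headers with exactly one credential: the key or a bearer token."""
    if (subscription_key is None) == (access_token is None):
        raise ValueError("pass exactly one of subscription_key or access_token")
    headers = {"Content-type": "audio/wav; codec=audio/pcm; samplerate=16000"}
    if access_token is not None:
        headers["Authorization"] = "Bearer " + access_token
    else:
        headers["Ocp-Apim-Subscription-Key"] = subscription_key
    return headers
```

With these, the call from the thread would read `conn.request("POST", build_stt_target("ja-JP"), spech.content, build_stt_headers(subscription_key=key))`, where `key` is your own subscription key.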
All is fine now. Thank you very much!
@YutongTie-MSFT - Looks like our developers are unblocked and we have a backlog item in place to correct confusing/incorrect documentation related to passing sub key/access tokens with the TTS endpoint.
@erhopf Thanks a lot!
@overslacked We will now proceed to close this thread. If there are further questions regarding this matter, please respond here and tag @YutongTie-MSFT, and we will gladly continue the discussion.
Providing an update to everyone that commented on this issue. We have started patching the docs based on your input and you'll see the first updates to auth information later today. Again, special thanks go out to @AlexanderTodorovic, @overslacked, and @niels9001.