For example, is the max file size 100 MB?
Is the max duration 1 hour?
I am using Batch Transcription: https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/batch-transcription
I am not using any SDK; I just get an auth token over HTTP and create the transcription task with the REST API.
What are the limits? Is the max file size 100 MB, or something else? Is the max duration 1 hour, or something else?
@1c7, thank you for reaching out. We are looking into this and would get back to you soon on this thread.
@1c7 Can you please add more detail about the input audio files that you are trying to transcribe?
A standard subscription (S0) for the Speech service is required to use batch transcription. Please see the link below for details on the limits.
https://azure.microsoft.com/en-us/pricing/details/cognitive-services/speech-services/
Please see the FAQ below.
https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/faq-stt
The limits above and in the doc are the default limits. We do work closely with large customers and can change the quota when necessary.
The current API_2.0 only handles single files. A version 3 of the batch API is going to become available; this API version will allow you to supply a container as the input.
@ram-msft
Hi, the input audio file comes from the user, so it could be any format, size, or duration.
I am trying to find out what the limits are,
so my program can say: "Sorry, the file is too big" or "Sorry, the duration is too long".
The FAQ https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/faq-stt
doesn't state a file size limit or a duration limit.
This is a request frequency limit.
I don't need to train a model; I just use the default one.
The current API_2.0 only handles single files. A version 3 of the batch API is going to become available; this API version will allow you to supply a container as the input.
That's great, but it still doesn't answer what the limit is for a single file.
There are no hard-coded limits in the batch transcription service itself regarding file size or audio duration.
A batch transcription request must be finished within 48 hours (currently) once it has started processing. This includes downloading the audio blob, transcribing, and uploading the result data. We are transcribing at up to double real-time speed. All these parameters are internal and can change; service usage and available space (memory / disk) might also be relevant.
I would recommend staying in a manageable range of several hours of audio. I would split longer files to parallelize both the upload and the processing of the audio. Splitting 20 hours of audio into 10 segments of 2 hours might get you the transcription results in a couple of hours; as one big file, you would have to wait at least 10 hours or so.
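The arithmetic behind that recommendation can be sketched in a few lines of Python. This is only a rough planning aid, not part of the service: the function names are hypothetical, and the defaults (a 48-hour deadline, a best-case speed of 2x real time, a 50% safety margin) are assumptions taken from the internal, changeable figures quoted above. It ignores download and upload time entirely.

```python
import math

# Assumed figures, taken from the reply above; both are internal and may change.
DEADLINE_HOURS = 48.0    # a request must finish within 48 hours (currently)
BEST_CASE_SPEED = 2.0    # "up to double real-time speed" is a best case


def estimate_processing_hours(audio_hours, speed_factor=BEST_CASE_SPEED):
    """Rough transcription time for one file; ignores download/upload time."""
    if speed_factor <= 0:
        raise ValueError("speed_factor must be positive")
    return audio_hours / speed_factor


def plan_segments(total_hours, segment_hours=2.0):
    """Return (start, end) offsets in hours for consecutive segments."""
    if segment_hours <= 0:
        raise ValueError("segment_hours must be positive")
    count = math.ceil(total_hours / segment_hours)
    return [
        (i * segment_hours, min((i + 1) * segment_hours, total_hours))
        for i in range(count)
    ]


def fits_deadline(audio_hours, speed_factor=BEST_CASE_SPEED, safety=0.5):
    """True if the estimate uses at most `safety` of the deadline (hypothetical margin)."""
    return estimate_processing_hours(audio_hours, speed_factor) <= DEADLINE_HOURS * safety


if __name__ == "__main__":
    segments = plan_segments(20, 2)
    longest = max(end - start for start, end in segments)
    print(len(segments), "segments")
    print("single file:", estimate_processing_hours(20), "hours at best")
    print("in parallel:", estimate_processing_hours(longest), "hours at best")
```

A client could call `fits_deadline` before submitting a user's file and show its own "file too long" message when it returns `False`. Since the service publishes no hard limit, any threshold chosen this way is the application's own policy.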
@wolfma61 Thank you :)