AssemblyAI - Speech to Text API

By AssemblyAI

4.6 out of 5 stars

AssemblyAI - Speech to Text API Pros and Cons: Top Advantages and Disadvantages

Quick AI Summary Based on G2 Reviews

Generated from real user reviews

Users rave about the high accuracy of AssemblyAI's speech to text model, enhancing transcription quality and intelligence features. (29 mentions)
Users praise the exceptional transcription accuracy of AssemblyAI, finding it the best in the market for speech to text. (20 mentions)
Users highlight the ease of use of AssemblyAI, benefiting from fast and accurate transcriptions effortlessly. (19 mentions)
Users find the documentation easy to use, enabling quick setup and seamless integration with applications. (14 mentions)
Users appreciate the easy setup of AssemblyAI, allowing them to get started quickly and efficiently. (12 mentions)
Users express concern over high pricing for transcription, especially for long videos, and over currency conversion issues. (7 mentions)
Users note that improvement is needed in startup time, punctuation accuracy, and support for multiple languages. (5 mentions)
Users experience issues with inaccuracy in transcription, especially with heavy accents, fast speech, and similar voices. (5 mentions)
Users find limited language support frustrating, wishing for broader international features and improved transcription capabilities. (4 mentions)
Users experience slow performance, with delayed startup, inconsistent response times, and lengthy processing speeds impacting usability. (4 mentions)
Users find the user interface challenging, particularly for non-tech users and managing multiple accounts effectively. (4 mentions)
Users experience inaccurate accent recognition that leads to mis-transcriptions and challenges with heavy accents. (3 mentions)
Users face integration challenges with AssemblyAI, especially when working with complex systems and large audio files. (2 mentions)
Users express frustration with the slow response from customer support, often delaying resolution of issues and bugs. (2 mentions)

Top Pros or Advantages of AssemblyAI - Speech to Text API

1. Accuracy
Users rave about the high accuracy of AssemblyAI's speech to text model, enhancing transcription quality and intelligence features.
See 29 mentions

See Related User Reviews

Verified User
U

Verified User

Small-Business (50 or fewer emp.)

4.5/5

"Accurate and consistent ASR"

What do you like about AssemblyAI - Speech to Text API?

AssemblyAI produces reliable ASR results at a great price.

Verified User
U

Verified User

Mid-Market (51-1000 emp.)

5.0/5

"Great Accuracy, but Diarization Can Be Inconsistent with Similar Voices"

What do you like about AssemblyAI - Speech to Text API?

High accuracy. The main reason we switched to AssemblyAI was that it supports diarization out of the box. This feature gives one-stop ease of use.

2. Transcription Accuracy
Users praise the exceptional transcription accuracy of AssemblyAI, finding it the best in the market for speech to text.
See 20 mentions

See Related User Reviews

Sharhad B.
SB

Sharhad B.

Mid-Market (51-1000 emp.)

4.5/5

"Using AssemblyAI for video transcribing"

What do you like about AssemblyAI - Speech to Text API?

It's easy to use. It is cheap. We can transcribe 500 audio in a day for under a dollar. It is reliable. The transcriptions are very accurate. It is fas

Ryan J.
RJ

Ryan J.

Mid-Market (51-1000 emp.)

5.0/5

"Best Speech to Text technology in the market!"

What do you like about AssemblyAI - Speech to Text API?

AssemblyAI really is focused on Product Development as their core customer inside of an organization. Their APIs are well defined and are always maki

3. Ease of Use
Users highlight the ease of use of AssemblyAI, benefiting from fast and accurate transcriptions effortlessly.
See 19 mentions

See Related User Reviews

Sharhad B.
SB

Sharhad B.

Mid-Market (51-1000 emp.)

4.5/5

"Using AssemblyAI for video transcribing"

What do you like about AssemblyAI - Speech to Text API?

It's easy to use. It is cheap. We can transcribe 500 audio in a day for under a dollar. It is reliable. The transcriptions are very accurate. It is fas

Verified User
U

Verified User

Mid-Market (51-1000 emp.)

5.0/5

"Great Accuracy, but Diarization Can Be Inconsistent with Similar Voices"

What do you like about AssemblyAI - Speech to Text API?

High accuracy. The main reason we switched to AssemblyAI was that it supports diarization out of the box. This feature gives one-stop ease of use.

4. Documentation
Users find the documentation easy to use, enabling quick setup and seamless integration with applications.
See 14 mentions

See Related User Reviews

Dave G.
DG

Dave G.

Small-Business (50 or fewer emp.)

5.0/5

"Great Trial period | Easy API to Work with | Accurate transcription"

What do you like about AssemblyAI - Speech to Text API?

- Easy to configure due to good documentation - I am not a developer but figured it out - Integrated into N8N for my automation - Nano model is ver

Verified User
U

Verified User

Small-Business (50 or fewer emp.)

4.5/5

"A cutting edge speech to text service"

What do you like about AssemblyAI - Speech to Text API?

There were 2 things we were looking for when evaluating speech to text APIs: 1) Quality of the speech to text (the most important thing) 2) Speed of

5. Easy Setup
Users appreciate the easy setup of AssemblyAI, allowing them to get started quickly and efficiently.
See 12 mentions

See Related User Reviews

Verified User
U

Verified User

Small-Business (50 or fewer emp.)

4.5/5

"Best in class speech transcription"

What do you like about AssemblyAI - Speech to Text API?

AssemblyAI's transcription service is easy to set up and use, while being extremely stable for months.

Verified User
U

Verified User

Mid-Market (51-1000 emp.)

4.5/5

"Easy to use speech to text solution"

What do you like about AssemblyAI - Speech to Text API?

Their documentation was very easy to work with. I was able to hit the ground running within minutes.

Top Cons or Disadvantages of AssemblyAI - Speech to Text API

1. Pricing Issues
Users express concern over high pricing for transcription, especially for long videos, and over currency conversion issues.
See 7 mentions

See Related User Reviews

DC

Danilo C.

Small-Business (50 or fewer emp.)

3.5/5

"Good quality and speed with a high price."

What do you dislike about AssemblyAI - Speech to Text API?

Challenges with low-quality audio (common in Brazil) and pricing in dollars.

Timur M.
TM

Timur M.

Small-Business (50 or fewer emp.)

4.0/5

"a great solution to build into your product"

What do you dislike about AssemblyAI - Speech to Text API?

I wish the price was even lower, we have so many more videos to process. Also it is not quite clear how formatting into paragraphs works, according to

2. Improvement Needed
Users note that improvement is needed in startup time, punctuation accuracy, and support for multiple languages.
See 5 mentions

See Related User Reviews

Alen O.
AO

Alen O.

Small-Business (50 or fewer emp.)

4.5/5

"Great but diarization can be better"

What do you dislike about AssemblyAI - Speech to Text API?

Diarization needs improvement, and we need streaming available in the EU for all European languages.

Павел .
П

Павел .

Small-Business (50 or fewer emp.)

5.0/5

"Affordable and Easy-to-Integrate Transcription Service"

What do you dislike about AssemblyAI - Speech to Text API?

There are some aspects I'd like to see improved. The API response contains too many unnecessary fields that I don't need, which increases loading time

3. Inaccuracy
Users experience issues with inaccuracy in transcription, especially with heavy accents, fast speech, and similar voices.
See 5 mentions

See Related User Reviews

Verified User
U

Verified User

Mid-Market (51-1000 emp.)

5.0/5

"Great Accuracy, but Diarization Can Be Inconsistent with Similar Voices"

What do you dislike about AssemblyAI - Speech to Text API?

The diarization can either work really well or really poorly. When the 2 voices are similar to each other it really struggles - when the voices are di

Verified User
A

Verified User

Small-Business (50 or fewer emp.)

5.0/5

"Very good"

What do you dislike about AssemblyAI - Speech to Text API?

Sometimes, French accuracy has some issues. Same for the speakers detection.

4. Limited Language Support
Users find limited language support frustrating, wishing for broader international features and improved transcription capabilities.
See 4 mentions

See Related User Reviews

Fábio G.
FG

Fábio G.

Small-Business (50 or fewer emp.)

4.5/5

"CallCenter transcribe calls"

What do you dislike about AssemblyAI - Speech to Text API?

Some features are not available for my language (pt_BR)

Павел .
П

Павел .

Small-Business (50 or fewer emp.)

5.0/5

"Affordable and Easy-to-Integrate Transcription Service"

What do you dislike about AssemblyAI - Speech to Text API?

There are some aspects I'd like to see improved. The API response contains too many unnecessary fields that I don't need, which increases loading time

5. Slow Performance
Users experience slow performance, with delayed startup, inconsistent response times, and lengthy processing speeds impacting usability.
See 4 mentions

See Related User Reviews

Павел .
П

Павел .

Small-Business (50 or fewer emp.)

5.0/5

"Affordable and Easy-to-Integrate Transcription Service"

What do you dislike about AssemblyAI - Speech to Text API?

There are some aspects I'd like to see improved. The API response contains too many unnecessary fields that I don't need, which increases loading time

Verified User
A

Verified User

Mid-Market (51-1000 emp.)

5.0/5

"Works well"

What do you dislike about AssemblyAI - Speech to Text API?

Takes a moment to start up and sometimes punctuation could be better, it often breaks sentences up into smaller bits. Setting up websockets for stre

6. User Interface Issues
Users find the user interface challenging, particularly for non-tech users and managing multiple accounts effectively.
See 4 mentions

See Related User Reviews

Verified User
U

Verified User

Small-Business (50 or fewer emp.)

5.0/5

"Great transcription for Spanish, quicker than other providers"

What do you dislike about AssemblyAI - Speech to Text API?

I think the worst part about Assembly has been that the API itself is a bit complicated to work with, since with recordings you've got to make them in

Rodrigo F.
RF

Rodrigo F.

Small-Business (50 or fewer emp.)

5.0/5

"Best Speech-to-Text Service Overall"

What do you dislike about AssemblyAI - Speech to Text API?

It's not that I don't like it, but I think there is a high barrier for non-techs to access the service. I know that they have a playground, but it's still sc

7. Accent Recognition
Users experience inaccurate accent recognition that leads to mis-transcriptions and challenges with heavy accents.
See 3 mentions

See Related User Reviews

Sandeep K.
SK

Sandeep K.

Enterprise (> 1000 emp.)

4.5/5

"Reliable and developer-friendly speech to text API"

What do you dislike about AssemblyAI - Speech to Text API?

Sometimes automatic language detection fails, often detecting the wrong language.

Verified User
A

Verified User

Small-Business (50 or fewer emp.)

5.0/5

"Powerful and Accurate Speech-to-Text API"

What do you dislike about AssemblyAI - Speech to Text API?

Occasionally the API struggles with heavy accents or extremely fast speech, leading to minor mis-transcriptions that require manual correction

8. Integration Issues
Users face integration challenges with AssemblyAI, especially when working with complex systems and large audio files.
See 2 mentions

See Related User Reviews

Rodrigo F.
RF

Rodrigo F.

Small-Business (50 or fewer emp.)

5.0/5

"Best Speech-to-Text Service Overall"

What do you dislike about AssemblyAI - Speech to Text API?

It's not that I don't like it, but I think there is a high barrier for non-techs to access the service. I know that they have a playground, but it's still sc

Giorgio S.
GS

Giorgio S.

Small-Business (50 or fewer emp.)

5.0/5

"Best-in-Class Speech-to-Text Solution"

What do you dislike about AssemblyAI - Speech to Text API?

Integration with complex database systems like VertexDB can be challenging and requires additional development effort. The response latency can someti

9. Poor Customer Support
Users express frustration with the slow response from customer support, often delaying resolution of issues and bugs.
See 2 mentions

See Related User Reviews

Verified User
U

Verified User

Mid-Market (51-1000 emp.)

4.0/5

"I use the API for speaker recognition."

What do you dislike about AssemblyAI - Speech to Text API?

the response to my questions to the customer support took too long

Verified User
U

Verified User

Small-Business (50 or fewer emp.)

4.5/5

"Accurate and consistent ASR"

What do you dislike about AssemblyAI - Speech to Text API?

Most of the time, AssemblyAI provides an API with excellent uptime and few errors. There are occasional bugs that crop up, and it can take a while to

AssemblyAI - Speech to Text API Reviews (100)

4.6
100 reviews
G2 reviews are authentic and verified.
Vladyslav H.
VH
CMO
Small-Business (50 or fewer emp.)
"Excellent support. Low cost."
What do you like best about AssemblyAI - Speech to Text API?

Excellent documentation and responsive support that will help you resolve any issues with using the API.

Multiple language support and automatic detection. The ability to upload files directly to their server, which makes it faster than saving them to third-party services.

You pay for usage instead of a subscription, which is very nice. Review collected by and hosted on G2.com.

What do you dislike about AssemblyAI - Speech to Text API?

During my time using the service, I haven't found much that I dislike. My main issue is that I would like to see support for video files from services such as YouTube directly via a link. Currently, I have to use third-party services to download and process videos from YouTube before sending them to AssemblyAI. Review collected by and hosted on G2.com.

Response from Devon Malloy of AssemblyAI - Speech to Text API

Thank you for this wonderful review, it's great to hear that AssemblyAI is powering your mobile and web applications successfully!

Your feedback about direct YouTube URL support is super valuable—we've passed your note on to our product team to explore. If you'd like to stay updated on new features or have additional suggestions, please don't hesitate to reach out to our support team at [support.assemblyai.com].

Павел .
П
Xamarin Developer
Small-Business (50 or fewer emp.)
"Affordable and Easy-to-Integrate Transcription Service"
What do you like best about AssemblyAI - Speech to Text API?

I'm impressed with AssemblyAI's transcription service due to its reasonable pricing. For transcribing 243 hours of audio, I paid only $68. In comparison, Google's Chirp_2 model cost $47 for just 35 hours, which would have totaled $326 for the same 243 hours.

Additional benefits include the ability to separate text by different speakers (English only) and automatic language detection. The API is straightforward to use and was easy to integrate into both Flutter and .NET Core Web applications.

Overall, I'm satisfied with the service and plan to continue using it. Review collected by and hosted on G2.com.
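
The cost comparison in this review can be checked with a few lines of arithmetic. The sketch below uses only the figures the reviewer quotes and assumes both services bill linearly per hour of audio, which the review does not confirm; the function and variable names are illustrative.

```python
# Figures quoted in the review above.
ASSEMBLYAI_TOTAL_USD = 68.0   # amount paid for 243 hours of audio
ASSEMBLYAI_HOURS = 243.0
CHIRP_TOTAL_USD = 47.0        # amount paid for 35 hours of audio
CHIRP_HOURS = 35.0


def hourly_rate(total_usd: float, hours: float) -> float:
    """Average price per hour of audio, assuming linear billing."""
    return total_usd / hours


def projected_cost(total_usd: float, hours: float, target_hours: float) -> float:
    """Scale an observed spend to a different number of audio hours."""
    return hourly_rate(total_usd, hours) * target_hours


assemblyai_rate = hourly_rate(ASSEMBLYAI_TOTAL_USD, ASSEMBLYAI_HOURS)
chirp_projected = projected_cost(CHIRP_TOTAL_USD, CHIRP_HOURS, ASSEMBLYAI_HOURS)

print(f"AssemblyAI rate: ${assemblyai_rate:.2f} per hour")
print(f"Chirp_2 projected for 243 hours: ${chirp_projected:.0f}")
```

Under that assumption, the projection reproduces the reviewer's $326 figure and puts the AssemblyAI spend at roughly $0.28 per audio hour.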

What do you dislike about AssemblyAI - Speech to Text API?

There are some aspects I'd like to see improved. The API response contains too many unnecessary fields that I don't need, which increases loading times. I would also appreciate faster speech-to-text processing speeds and an increase in the maximum duration limit beyond the current 10-hour restriction. Additionally, the slam-1 model only works with English text, and I would like to see this model become internationalized to support multiple languages. Review collected by and hosted on G2.com.
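
One way a client might cope with the oversized response described here is to keep only the fields it needs before passing the result on. This is a minimal sketch: the helper `pick_fields` and the sample payload are hypothetical, and the field names are assumptions rather than a description of the actual API response.

```python
from typing import Any, Iterable


def pick_fields(payload: dict[str, Any], keep: Iterable[str]) -> dict[str, Any]:
    """Return a copy of payload containing only the top-level keys in keep."""
    wanted = set(keep)
    return {key: value for key, value in payload.items() if key in wanted}


# Hypothetical response shape, for illustration only.
sample_response = {
    "id": "abc123",
    "status": "completed",
    "text": "hello world",
    "words": [{"text": "hello"}, {"text": "world"}],
    "extra_metadata": {"unused": True},
}

slim = pick_fields(sample_response, ("id", "text"))
print(slim)
```

Trimming on the client does not shrink what the service sends over the network; it only reduces what your own backend stores or forwards, for example to a mobile front end.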

Rodrigo F.
RF
Consultant
Small-Business (50 or fewer emp.)
"Best Speech-to-Text Service Overall"
What do you like best about AssemblyAI - Speech to Text API?

AssemblyAI is seriously impressive. Before I found it, I tried out Google Cloud, Whisper, and some open-source tools for diarization. I even gave Read.ai a shot, but honestly, none of them gave me the results I was looking for.

Then I saw someone mention AssemblyAI on Reddit, and I decided to give it a try. I’m so glad I did—their transcription and diarization are on another level. I barely ever need to edit the transcripts, which is rare with these kinds of tools.

The pricing is super reasonable for what you get, and the API is really flexible. I’ve been able to build my own workflows to transcribe meetings, interviews, and videos without any hassle. I use it pretty much every day for transcribing meetings I record on my computer, and I save everything in Markdown format.

If you’re looking for a solid, reliable transcription service that just works, I can’t recommend AssemblyAI enough. Review collected by and hosted on G2.com.

What do you dislike about AssemblyAI - Speech to Text API?

It's not that I don't like it, but I think there is a high barrier for non-techs to access the service. I know that they have a playground, but it's still scary for people who want to use the service but see the. Some friends who see my workflow want to mimic it but stop when they see the API interface. The docs are very well detailed, but there are still barriers to adoption for certain customer segments.

Another thing I would like is to store the cluster of voices that are recorded; I would like the model to automatically name them. I think this would be too complicated, and there are probably privacy concerns involved. But it would be a quality-of-life improvement. I guess this is a niche need rather than something the customer base would be interested in. Review collected by and hosted on G2.com.

Max M.
MM
CTO
Small-Business (50 or fewer emp.)
"Developer-Friendly and Accurate Transcripts"
What do you like best about AssemblyAI - Speech to Text API?

Beyond accurate transcripts, AssemblyAI made it easy to determine each call’s outcome, flag unqualified leads, and capture the exact reason a lead wasn’t qualified. Those structured insights rolled up into useful reports and metrics that our team could act on immediately. The whole process felt simple, reliable, and developer-friendly. Review collected by and hosted on G2.com.

What do you dislike about AssemblyAI - Speech to Text API?

Using the default analysis was not that great, but once I figured out how to use LeMUR I got exactly what I needed. Review collected by and hosted on G2.com.

Rohan P.
RP
"Quick Switch to Efficient, User-Friendly API"
What do you like best about AssemblyAI - Speech to Text API?

I appreciate that AssemblyAI offers quick and accurate transcriptions, essential for maintaining compliance within our industry. The diarization feature is beneficial, providing clear speaker differentiation, which aids in compliance documentation. The user-friendly documentation made the setup process straightforward, which coupled with the appealing business insights and aesthetics of the platform, makes it enjoyable to use. The capability to seamlessly integrate with existing systems, like handling S3 links for file locations, significantly streamlines our workflow. Review collected by and hosted on G2.com.

What do you dislike about AssemblyAI - Speech to Text API?

I find it problematic that the diarization feature does not differentiate between real human dialogue and automated call menus. It would be very useful if there were an option to ignore these automated voices or classify them separately, as they often appear as additional speakers in the transcription, which complicates the process for us. This issue requires us to manually remove irrelevant portions, which wastes time and effort. Review collected by and hosted on G2.com.
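
Until such an option exists, the manual cleanup described here could in principle be scripted on the client side. The sketch below is one possible heuristic, not a feature of the service: it assumes diarized output is available as a list of utterances with `speaker` and `text` keys, and the phrase list and function names are hypothetical.

```python
# Phrases that commonly appear in automated call menus (illustrative list).
MENU_PHRASES = ("press 1", "press one", "your call is important")


def find_menu_speakers(utterances, phrases=MENU_PHRASES):
    """Return the speaker labels whose utterances contain a menu phrase."""
    flagged = set()
    for utterance in utterances:
        text = utterance["text"].lower()
        if any(phrase in text for phrase in phrases):
            flagged.add(utterance["speaker"])
    return flagged


def drop_speakers(utterances, excluded):
    """Keep only utterances whose speaker label is not in excluded."""
    return [u for u in utterances if u["speaker"] not in excluded]


# Hypothetical diarized transcript, for illustration only.
utterances = [
    {"speaker": "A", "text": "Thank you for calling. Press 1 for billing."},
    {"speaker": "B", "text": "Hi, I have a question about my invoice."},
    {"speaker": "C", "text": "Sure, I can help with that."},
]

menu_speakers = find_menu_speakers(utterances)
human_only = drop_speakers(utterances, menu_speakers)
print(menu_speakers, [u["speaker"] for u in human_only])
```

A phrase match is a blunt instrument: a human agent who happens to repeat a menu phrase would be dropped as well, so flagged speakers are worth reviewing before anything is discarded.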

Response from Devon Malloy of AssemblyAI - Speech to Text API

Thank you so much for your thoughtful review, Rohan! We're glad to be helping with your reporting automation needs.

Your feedback about differentiating automated call menus from human speakers is super valuable—I've passed that insight along to our product team. If you have any additional context or details you'd like to share about your use-case, feel free to reach out to support@assemblyai.com to help us prioritize effectively.

Devon

Verified User in Internet
UI
Small-Business (50 or fewer emp.)
"Accurate transcription, reasonably easy to integrate"
What do you like best about AssemblyAI - Speech to Text API?

Assembly's accuracy is strong and on par with or better than many competitors in the space, especially after the launch of Slam-1. LLM Gateway is convenient for transcript summarization. Developer experience is largely strong with some exceptions. New feature releases continue demonstrating value. Review collected by and hosted on G2.com.

What do you dislike about AssemblyAI - Speech to Text API?

Additional language support for Slam-1. Clearer documentation for more complex/specific workflows (Zoom + multichannel). Existing docs only explain how to implement in Python and support is having trouble helping us diagnose our issue. Out-of-the-box speaker diarization and speaker labeling could be more accurate. Review collected by and hosted on G2.com.

Timur M.
TM
Developer
Small-Business (50 or fewer emp.)
"a great solution to build into your product"
What do you like best about AssemblyAI - Speech to Text API?

We recently started using the AssemblyAI API to transcribe videos from our educational channels. The API works quickly and reliably. So far we have never encountered any limitations of the platform, although our videos are quite large. The quality of recognition is very high, the price is about the same as with OpenAI analogs, but there is no limit of 25 minutes per video fragment. Review collected by and hosted on G2.com.

What do you dislike about AssemblyAI - Speech to Text API?

I wish the price was even lower, we have so many more videos to process. Also, it is not quite clear how formatting into paragraphs works: through the API we get the text without paragraphs, although in the version available for free through the interface, the recognized text is already formatted. Review collected by and hosted on G2.com.

Andrea R.
AR
Manager
Small-Business (50 or fewer emp.)
"High-quality speech recognition with robust diarization and smart API design"
What do you like best about AssemblyAI - Speech to Text API?

AssemblyAI impresses with its high transcription quality, even when dealing with messy or low-quality audio inputs. The diarization capabilities are particularly strong—accurately distinguishing between speakers in less-than-perfect recordings. The API suite is fast, well-documented, and returns a rich, detailed output format that makes post-processing straightforward and powerful. I also found the Word Boost feature especially helpful: being able to prioritize tricky or uncommon words significantly improves recognition accuracy in niche use cases. Overall, it’s a developer-friendly platform that balances precision with flexibility. Review collected by and hosted on G2.com.

What do you dislike about AssemblyAI - Speech to Text API?

Honestly, there’s little to complain about. The pricing model is reasonable for the level of quality and features provided, and I haven’t encountered any significant drawbacks in my usage. Review collected by and hosted on G2.com.

NH
Head of technology and marketing
Small-Business (50 or fewer emp.)
"Much more affordable and accessible than other options"
What do you like best about AssemblyAI - Speech to Text API?

One of the best things about AssemblyAI is how much more affordable and accessible it is compared to many other options on the market. The pricing is straightforward and budget-friendly, which makes it an excellent choice for both small developers and larger teams. Despite the lower cost, the transcription accuracy and feature set remain top-notch. The API is easy to implement, and the documentation is clear and helpful. It’s reliable, fast, and packed with features like speaker diarization and topic detection that are usually reserved for much more expensive platforms. Review collected by and hosted on G2.com.

What do you dislike about AssemblyAI - Speech to Text API?

Currently there are some features not available to the European users but I believe these are in development. Review collected by and hosted on G2.com.

Response from Madison Boyd of AssemblyAI - Speech to Text API

Thank you for your feedback! We are continuously working to expand our features to all users, including those in Europe. We appreciate your patience as we work on further development.

Verified User in Financial Services
UF
Small-Business (50 or fewer emp.)
"Great transcription for Spanish, quicker than other providers"
What do you like best about AssemblyAI - Speech to Text API?

It's really great for Spanish specifically and user diarization. Also, it's quick compared to the Speechmatics API, which is really slow, so kudos on that, and it's been really cost-effective. I must have transcribed 800-1000 calls with the free credits, so that's really great. Overall super solid though. Review collected by and hosted on G2.com.

What do you dislike about AssemblyAI - Speech to Text API?

I think the worst part about Assembly has been that the API itself is a bit complicated to work with, since with recordings you've got to make them into links first and then send the links and transcript IDs to a separate endpoint. I can still work with it and have done lots of things, but it would be easier if it was a single API if I'm working with recordings that did this in the background. Review collected by and hosted on G2.com.
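
The single-call experience this reviewer asks for can be approximated with a small client-side wrapper. The sketch below follows the flow as the review describes it (turn the recording into a link, submit the link, then look up the result by transcript ID). The three callables `upload`, `submit`, and `fetch` are hypothetical stand-ins for whatever HTTP calls your integration makes, and the `status` values are assumptions.

```python
import time
from typing import Any, Callable


def transcribe_recording(
    audio: bytes,
    upload: Callable[[bytes], str],          # recording bytes -> audio link
    submit: Callable[[str], str],            # audio link -> transcript ID
    fetch: Callable[[str], dict[str, Any]],  # transcript ID -> current result
    poll_seconds: float = 3.0,
    max_polls: int = 100,
) -> dict[str, Any]:
    """Run the upload, submit, and poll steps behind one function call."""
    audio_url = upload(audio)
    transcript_id = submit(audio_url)
    for _ in range(max_polls):
        result = fetch(transcript_id)
        # Assumed terminal states; adjust to what your integration returns.
        if result.get("status") in ("completed", "error"):
            return result
        time.sleep(poll_seconds)
    raise TimeoutError(f"transcript {transcript_id} not finished after {max_polls} polls")
```

Passing the three steps in as functions keeps the wrapper independent of any particular HTTP library and makes it easy to exercise with fakes before pointing it at a real account.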
