Qubole

3.9
(108)

Qubole delivers a Self-Service Platform for Big Data Analytics built on Amazon, Microsoft and Google Clouds

Qubole Reviews

Showing 108 Qubole reviews
Aldo O.
Validated Reviewer
Verified Current User
Review Source

"Surprisingly easy to use"

What do you like best?

I especially like how easy it is to trigger commands, not only from the Qubole console but also using the REST API to trigger actions from our custom scripts. Also, being able to switch between technologies such as Hive, Presto and Spark with a simple dropdown is great. Finally, I like the capability of switching between environments to sandbox changes.

What do you dislike?

Notebooks are sometimes unstable and we get weird exceptions that are solved by waiting for a little while and retrying. Also, the displayed tables are not automatically refreshed.

Qubole lacks good videos explaining how to configure the interpreters and add custom artifacts; I had a hard time finding that information on external forums.

Recommendations to others considering the product:

More documentation on the REST APIs would make them easier to use.

What problems are you solving with the product? What benefits have you realized?

Running ad hoc queries on our data lakes and increasing or decreasing the size of our cluster on demand. We also connect our scheduler to this platform to submit jobs on a regular basis. Whenever we want to take a look at something in particular, we review the logs and find the job ID to dig into the details. The most important benefit we've experienced is the reduced cost of using our cloud provider, since Qubole expands and shrinks the clusters accordingly, and it is extremely easy to use.

User
Validated Reviewer
Verified Current User
Review Source

"Getting better"

What do you like best?

Hive queries work great. I don't recall ever having an issue. If something goes wrong in Hive, there is usually a clear path to resolution. New features for Zeppelin notebooks are steadily arriving and making Qubole's version of Zeppelin stand out from others.

What do you dislike?

Zeppelin notebooks can be troublesome. Issues are hard to replicate, which makes the notebooks more frustrating to use. However, as time goes on, the issues are fewer and I am learning workarounds. A lot of the notebook issues are Zeppelin-specific and do not have to do with Qubole, who are working hard to improve that experience.

Recommendations to others considering the product:

You will be happy with the Hive query capabilities. Make sure you test your specific use cases for notebook use. With careful cluster setup and management, and recognizing the inherent limitations in Zeppelin itself, you will be happy.

What problems are you solving with the product? What benefits have you realized?

Machine learning to model and predict various business outcomes. Data cleaning.

Data ETL. Exploratory data analysis. Scheduled jobs to populate dashboards.

Alexander E.
Validated Reviewer
Verified Current User
Review Source

"True SaaS experience in Big Data Analytics"

What do you like best?

The SaaS nature of the product, not an IaaS like EMR. Permanent history of all commands, scripts, and queries. Ability to start and shut down a cluster on demand without calling an additional API, just by submitting the actual query. Ability to get logs back to the client through the REST API.

What do you dislike?

Some UI sections look redundant from a functionality point of view; for example, Explorer overlaps with Analysis in many aspects. I think it would be better to unify them. Some weird glitches are annoying, for example when the Spark logs disappear.

The big disadvantage is the proprietary extensions of open source products made by Qubole.

Recommendations to others considering the product:

Evaluate the product against a real use case.

What problems are you solving with the product? What benefits have you realized?

Clickstream event processing, search event processing, funnel reports, corporate analytics, etc.

Valeriy B.
Validated Reviewer
Review Source

"Valeriy Borysyuk's Qubole review"

What do you like best?

Qubole provides an easily accessible history of all queries with results. I can share a Query ID, with the query itself and its results, with someone else on my team or across the company. This really is the killer feature. Also, clusters are started, scaled and shut down on demand. Qubole support is also excellent. They help to resolve all problems, simple and complex ones. Qubole has also developed a lot of features, like the Hive-JDBC-Storage-Handler for accessing other JDBC data sources from Hive, and many others.

What do you dislike?

Impossible to run Hive from a shell script. Impossible to execute a simple lightweight shell command without MapReduce being started. When using the GUI, I need to log in again every few minutes, which is annoying. I'd prefer my sessions not to expire during my working day. Cost views are heavy, and it seems a query incurs cost even when it is stuck waiting for resources and not running at all.

Recommendations to others considering the product:

You can access a lot of tools for data processing, such as Presto, Hive, RDBMS, etc., using the same API and GUI. Qubole also supports Airflow clusters.

What problems are you solving with the product? What benefits have you realized?

Running a data warehouse with different tools through the same GUI and API. Easy development because of the rich unified API.

Matt E.
Validated Reviewer
Review Source

"A solid managed platform for Spark"

What do you like best?

Qubole simplifies the operational side of running Spark clusters and jobs, both scheduled and ad-hoc. It allows researchers and data engineers to focus on business logic instead of platform maintenance and tuning.

What do you dislike?

Once you have a hammer, every problem starts to look like a nail. Sometimes the overhead of running a Spark job dominates the actual data processing work to be done. See the blog post "Don't use Hadoop when your data isn't that big" and think about whether you might achieve faster turnaround without reaching for Spark. Initial exploratory work, especially, can go faster if you just run everything from a fast laptop without any distributed processing.

Recommendations to others considering the product:

It's a solid choice if you want some of the most popular Big Data tools and don't want to spend time maintaining them yourself.

What problems are you solving with the product? What benefits have you realized?

We're performing content and user classification, content NLP, and user activity analytics with Qubole. We've been able to standardize around Spark and Qubole for batch jobs so that there's a common reference framework used by everyone on the team. We also don't need to devote time to maintaining the cluster and can focus on business logic.

Endy L.
Validated Reviewer
Review Source

"Fantastic"

What do you like best?

Easy to set up, use, and maintain. A complete end-to-end Big Data (Lake) platform, set up in less than an hour. A no-brainer for any company that wants a platform for experienced users, or even for beginners to start learning and advancing their knowledge. Proven at my previous company: within 6 months, we had more than 10 Data Engineers with great technical expertise in Spark, Hadoop/Hive, and Airflow.

What do you dislike?

Initially, the inconsistency of product releases, which could cause unexpected errors because of bugs. But over time this has improved significantly.

Recommendations to others considering the product:

A no-brainer for companies that are looking for a cloud-native Big Data platform. Simple to set up, with a great framework and environment for new users to learn Hadoop, Spark, Airflow, and big data in general (including Python, data science, and ML). Cost optimization, fantastic autoscaling capabilities, and fantastic Presto performance for analytics.

What problems are you solving with the product? What benefits have you realized?

Data Lake, Data Accessibility, Data Automation, Data Science Workload with Spark and Python/R with Zeppelin, AutoScaling capability for Hadoop and Spark Workload, Data Preparation and Integration.

Ahmad Z.
Validated Reviewer
Review Source

"Convenient but quirky"

What do you like best?

I do not have to worry about any installation or setup process. I just select the config I need and I can start working on Spark or Presto. There are some configurations that must be done (connection to S3, keys, IAM roles and so on); however, they pale in comparison to running an installation from scratch.

Scaling out or scaling up is also made simple by just choosing a few things from the cluster config options.

Built-in solutions like Scheduler help with scheduling and automating a few jobs, and when things get too large, you can use Airflow on Qubole.

What do you dislike?

Sometimes there are things that work on Zeppelin or the underlying technology in general but fail due to issues with Qubole's value-added product (an example is the connection between Spark and Redshift). When these things are highlighted to the support team, they get addressed, but the resolution time varies.

Recommendations to others considering the product:

It is very convenient, but it has some issues. The support team is quite active and they address major problems immediately (if a problem is not fixed, a root cause analysis is provided along with a fix estimate).

What problems are you solving with the product? What benefits have you realized?

Running queries on Hive metastores with data stored in S3 using Presto

Running data aggregation jobs on data stored in S3 using Spark

Running queries that join S3 data with Redshift data on Presto

Ad hoc queries using Spark that search through S3 files to answer business questions

User in Computer Software
Validated Reviewer
Verified Current User
Review Source

"Great tool to query big data in S3"

What do you like best?

- Simplicity of queries.

- Similarity to SQL makes the learning curve faster.

- Ability to download results in CSV format.

- Qubole keeps a history of run queries for a long period of time, which helped me a lot in finding previous queries and saved time when writing new ones.

What do you dislike?

- Slower compared to competitors like Athena.

- If the result set is large, the results are stored in S3 as partitioned files. This makes it difficult to combine all the results and view them together.
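The partitioned-result complaint above can be worked around once the part files are copied to a local directory. The following is only a minimal sketch under stated assumptions: the `part-*` file naming, the `combine_csv_parts` helper, and its parameters are hypothetical and are not part of Qubole or any S3 tooling.

```python
import csv
from pathlib import Path


def combine_csv_parts(part_dir, output_path, pattern="part-*", has_header=False):
    """Concatenate CSV part files found in part_dir into one CSV file.

    Files are processed in sorted name order. The "part-*" pattern is an
    assumed naming scheme; adjust it to match the actual file names.
    Returns the number of rows written to output_path.
    """
    # Collect the part files before creating the output file.
    parts = sorted(p for p in Path(part_dir).glob(pattern) if p.is_file())
    rows_written = 0
    with open(output_path, "w", newline="") as out:
        writer = csv.writer(out)
        for index, part in enumerate(parts):
            with open(part, newline="") as src:
                for line_no, row in enumerate(csv.reader(src)):
                    # When every part repeats a header row, keep only the first one.
                    if has_header and index > 0 and line_no == 0:
                        continue
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

For example, `combine_csv_parts("results/", "combined.csv")` would merge every matching file under a local `results/` directory (a placeholder path) into one file that can be opened in a spreadsheet or loaded elsewhere.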

Recommendations to others considering the product:

- I would recommend Qubole to other companies that are looking to query data in S3, Azure or other cloud storage technologies in an efficient and timely manner.

What problems are you solving with the product? What benefits have you realized?

- We are a medium-scale company dealing with petabytes of data, and we really need to query this data for reporting, analytics and machine learning purposes.

- As we work with AWS technology, we store all the data in S3, and we wanted a tool which can query data present in S3.

- Qubole solves exactly this problem by letting us store data in S3 in partitioned format and then query it efficiently.

- Qubole makes it easy to analyze big data using simple SQL-like queries.

- Qubole also gives the option of selecting clusters according to price and time needs.

- Qubole is also cost efficient when running big queries. Being able to store data in partitioned format saved us a lot of time and money.

User
Validated Reviewer
Verified Current User
Review Source

"Qubole puts our data at the fingertips of all colleagues"

What do you like best?

With Qubole, our non-engineering colleagues are no longer required to talk to engineers in order to get immediate insights out of our data lake. This saves time for both sides, such that everyone can focus on what they do best.

What do you dislike?

I think Qubole's UI could be a little friendlier when it comes to displaying query execution progress. It's easy for me to parse as an engineer, but some non-technical colleagues find it difficult to determine if a query is on track to complete or not.

Recommendations to others considering the product:

Be sure to have clear guidance regarding creation of tables. It's too easy to end up with a big pile of tables that lack consistent ownership. This leads to confusion on the part of users, as well as operators who cannot tell which tables are safe to delete, or when maintenance can be performed on underlying data. In addition, keeping this information in a centralized location provides more opportunities for tighter access control at the underlying store level.

What problems are you solving with the product? What benefits have you realized?

The most important one is the ability of product managers to quickly gain insights into patterns hidden in the data in order to deliver a more impactful product to our customers. This results in less rework by engineers later on. Everybody wins!

User
Validated Reviewer
Verified Current User
Review Source

"Easy-to-Use Querying Software"

What do you like best?

I like the simplicity of the Analyze interface as well as the built-in scheduling capabilities. The functionality to search through old queries, both saved and unsaved, is powerful and intuitive. Additionally, I like the ability to easily manage multiple clusters for managing workloads of varying size. Moreover, the available cluster UI makes it simpler to investigate failed queries.

What do you dislike?

I'd like to be able to set default values for schema so that I don't need to specify schema every time if I am predominantly referencing just one schema. I also find the log process updates annoying. While it is helpful to see specific log details, I get annoyed scrolling through a list of % completes. I feel that a status bar would better use the screen real estate and make it a bit simpler to read through the logs.

What problems are you solving with the product? What benefits have you realized?

We are trying to make large, log-level data sources available to cross-functional teams to run both automated and ad hoc analytics to meet their varied needs.

User in Computer Software
Validated Reviewer
Verified Current User
Review Source

"Quick to start with bigdata using Qubole"

What do you like best?

Autoscaling and the way Qubole uses AWS spot instances minimize compute cost for customers. Our team has experienced close to a 30% reduction in cost compared to our previous cluster without Qubole.

The amount of time it takes to onboard a new process onto Qubole is very small, and we can have many clusters, each designed for its own purpose.

Qubole also supports notebooks backed by Spark clusters, which is very handy for developers quickly iterating on ideas.

Qubole support is really good. They respond to tickets in a timely manner, and we always have someone helping us out via Slack or email on time-critical projects.

What do you dislike?

Qubole sometimes has a lot of bugs, and a lot of its features are not well documented. You have to engage support to get those things figured out.

Personally, I'm not very impressed with the UI of the Qubole query editor; most of the time it doesn't show results in a well-formatted manner.

Recommendations to others considering the product:

You will save money if you are currently using a public cloud like AWS.

What problems are you solving with the product? What benefits have you realized?

Ability to query big data sets for ad hoc analysis, generate reports, and process data to generate ML models. We use Presto clusters for near-real-time queries which mostly support our backend; the performance of these clusters is really amazing.

Administrator
Validated Reviewer
Verified Current User
Review Source

"2 years back it was serving our needs... now the platform has to catch up for productivity."

What do you like best?

The support system

Spark and Airflow deployments

The team that works with us on customer support

What do you dislike?

Lack of innovation on ML and AI

Lack of serverless deployments

The technology is 2 years outdated and not growing

Need solution experts to help with partner solutions in Qubole systems

Recommendations to others considering the product:

Please improve the ML and AI offering.

Don't just say Spark can do it.

Have more blogs and technical sessions on how to do it.

Learn from products like SageMaker, Cloud ML and others on how to solve the ML and AI piece for an organization.

Build automatic suites for anomaly detection, unsupervised learning, etc.

Share code and engineering expertise through blogs and explain how to leverage the Qubole product.

Make serverless implementations like Presto and Glue.

Build deep learning applications for clients and share the experience with others, functionally and technically.

Have Redis for storing hot data.

What problems are you solving with the product? What benefits have you realized?

Scheduling.

Big data workloads with Spark.

Used Presto for some time, but now there are serverless options from Amazon which have replaced all of our Spark workloads.

Stanislav M.
Validated Reviewer
Verified Current User
Review Source

"I like qubole as this is easy to run queries, but it is a bit complex to use and find best use cases"

What do you like best?

I really like that tables are easily accessible and I can quickly switch between different tables and see the content. I can also see what must be included in a query.

What do you dislike?

I don't like that I can open a table's content only once while on the same page. Sometimes I close the table content and then need to open it again, but I can't do that without refreshing the whole page.

Recommendations to others considering the product:

It would be amazing to have this product be easier to use and to have better error descriptions.

What problems are you solving with the product? What benefits have you realized?

We are using Qubole to access the huge set of data we operate on, since we can't put everything in our UI.

Pradeep J.
Validated Reviewer
Verified Current User
Review Source

"Scaling Machine Learning Platform"

What do you like best?

- Excellent customer support and the relative ease in getting started.

- Continued Qubole rep support in spite of internal organizational reorgs.

What do you dislike?

Does not offer much in addition to the Spark UI for monitoring, alerting and debugging. Features like profiling or suggesting improvements to job parameters will go a long way.

Recommendations to others considering the product:

- Excellent customer support, the ability to raise issues and receive support is phenomenal

- Good breakdown of costs

- Stable as a company

- Not a big sales push, the product speaks for itself

What problems are you solving with the product? What benefits have you realized?

Data Pipeline and Machine Learning Scoring/Feature Extraction

Data Transformations

Real Time Streaming Analytics

Jacker M.
Validated Reviewer
Verified Current User
Review Source

"Good experience"

What do you like best?

The way tables and contents are organized and also the easy way of running queries, seeing errors and getting results.

What do you dislike?

Large results cannot be exported using the tool. I know that we can use a different URL to export them, but we need to get an API key and query ID and use a different tool to do that. It would be better if this worked seamlessly.

Recommendations to others considering the product:

I don't have any.

What problems are you solving with the product? What benefits have you realized?

I use Qubole to get raw logs for events related to entities we have in our platform.

Connor E.
Validated Reviewer
Verified Current User
Review Source

"A useful query program"

What do you like best?

Overall, Qubole works pretty well for my needs. I enjoy the overall layout, it's rather easy to navigate and to browse old queries.

What do you dislike?

My queries take a long time to run. I'm not sure if this is specific to the queries I'm running or the tables I'm pulling from, but it does take longer for me to run a Qubole query than with some other similar software.

What problems are you solving with the product? What benefits have you realized?

I use it to better understand our data and to make reports that get shared internally.

Industry Analyst / Tech Writer
Validated Reviewer
Verified Current User
Review Source

"Qubole makes my life easier"

What do you like best?

I like the features within the Qubole program such as templates and the scheduler. I use the templates regularly, which makes things very easy when I need to regularly run specific queries.

What do you dislike?

I don't know if this is a bit optimistic or whether it is possible, but maybe there could be more help with troubleshooting queries. Also, query run times can vary.

What problems are you solving with the product? What benefits have you realized?

Working in analysis, Qubole has enabled me to analyse our clients' websites more granularly. For example, we have been able to segment their website using Qubole queries to create our custom pixels and analyse specifically on these segments. This can provide the more in-depth analysis which the client is always looking for.

User in Internet
Validated Reviewer
Review Source

"Terrible for query iterations, good for historical record"

What do you like best?

You can find past queries and results pretty easily if they are within a recent timeframe. This is good for having a record of your work and being able to see what you ran if you are not good about keeping your own documentation of your projects. It is also nice that Qubole, when paired with a data lake, allows for the use of partitions, so you can store lots of historical data and query a smaller timeframe without it taking forever. This feature is not available in the other engine that I use, so I tend to choose Qubole when looking at events or other large data sources in my work.

What do you dislike?

You cannot exit a query without running it, or your changes will be deleted. This means you need to run a new query and clog up your history for every single iteration you want to make, which makes it difficult to find anything within your query history. Search is terrible: results will not show up for queries run a long time ago, so you won't know the results you received on the original run, which may be different if you run it now.

What problems are you solving with the product? What benefits have you realized?

Working as a data analyst, I find it a necessity for analysis. I appreciate the use of partitions, which are possible with Qubole, and I will only use this engine when in need of partitions. I use Qubole to query large tables with event data or other large data which I would not have access to elsewhere because of impossibly long run times without the use of a partition.

Russell L.
Validated Reviewer
Verified Current User
Review Source

"Managing your cluster problems"

What do you like best?

I like the fact that we can spin up different clusters for different purposes and all are managed through qubole. We don't have to deal with the headache of keeping our cluster up and running.

What do you dislike?

I don't like certain features of the Python SDK, and the notebooks are not as reliable as hoped.

What problems are you solving with the product? What benefits have you realized?

ETL, ML, and visualizations.

I have learned that there are many use cases that can be solved with the scheduler tools and S3 locations alone.

User
Validated Reviewer
Review Source

"Good but can get way better and smooth"

What do you like best?

The interface is amazing. With respect to queries and handling data, it looks really easy. But at the same time, there is no visualization tool which can help us make dashboards for the queries. The tool looks complete since we can schedule queries as well as work on large queries smoothly. The scheduler and cron jobs are the features my team and I use the most.

What do you dislike?

There is no info available about the resource limits that apply to a particular query. A lot of the time a query runs for hours and then fails in the end due to a limit on the amount of resources. There is no visualization for the data. Data is visible only in tabular format, and no dashboard is available to track data from queries.

Recommendations to others considering the product:

It is one of the easiest places for analysts to run and manage queries. Despite not having enough knowledge about the scheduler and cron jobs and how they work in the background, we are able to run large-scale queries to get the best possible insights.

One thing that bothers us is that after fetching data from Qubole, we have to upload it to a dashboard management tool.

Out of 10, I would grade it 9 with respect to ease of use.

It can improve on the dashboard front, I believe.

What problems are you solving with the product? What benefits have you realized?

Analyzing large-scale data and deriving meaningful insights from it. We run, schedule and manage queries on it, and use it to run hundreds of queries per day.

User in Marketing and Advertising
Validated Reviewer
Review Source

"Strong product with good, but not quite great shortcuts and time-saving features"

What do you like best?

I really like the ease of use of the platform. Several SQL dialects are available within the platform, but I typically find that Presto SQL is sufficient to meet most of my needs and is easy to use for anyone with any prior SQL knowledge.

What do you dislike?

There are some shortcuts that other database management software companies offer that are not available in Qubole. For instance, you cannot set a default schema, so the schema must be explicitly specified for every table in a query. Additionally, when using the 'Explore' tab to manually review data in a table, you cannot set limits for the number of rows you'd like to see. Simple dimensional tables that have over ~50 rows won't display entirely due to the software's default setting of returning only 50 rows or so.

What problems are you solving with the product? What benefits have you realized?

We are using Qubole to manage data from various external sources in a data lake. My role then frequently queries the data lake to develop standard reporting and analytics for internal teams.

User
Validated Reviewer
Review Source

"Decent but a lot of hassle to work with"

What do you like best?

It's much more complete compared to other products, but at the same time queries or jobs sometimes fail because of bugs on Qubole's end, which are handled very late or are time-consuming to resolve.

What do you dislike?

Need to talk to the executive many times while handling issues. Often the answer is that this will be fixed in the next release, but that is far in the future.

Recommendations to others considering the product:

Customers often need a feature or a fix as soon as possible. It can't wait for the next release, especially when it's in the production environment. It would be great if requests were handled quickly and without hassle.

What problems are you solving with the product? What benefits have you realized?

Our Hadoop cluster runs on Qubole. It is easy to manage resources: you just need to add or subtract from the node count. There is no option to remove specific nodes, but it's still very easy to manage and complete, as it includes a scheduler, cron jobs, etc.

Agency
Validated Reviewer
Review Source

"My Qubole Experience"

What do you like best?

- I like the ability to quickly change between a variety of languages to manipulate data.

- The scheduler's ability to use a workflow and switch between languages in the same process.

- The history features are extremely useful. I especially like that I can see historical results in addition to my old queries.

What do you dislike?

- When running a query, the lack of a status bar is really annoying, especially as the logs tend to grow indefinitely.

- It is difficult to have Qubole connect to Tableau effectively. The connection is slow and often fails. This has improved with more recent updates.

What problems are you solving with the product? What benefits have you realized?

- We use Qubole to analyze programmatic media spend and identify potentially useful insights for programmatic traders.

- The results of our data sets are stored and updated using the Qubole scheduler. We then use these data sources to power Tableau visualizations.

Tanmay S.
Validated Reviewer
Verified Current User
Review Source

"Powerful and Full of Features"

What do you like best?

The history of commands that you have run is available for you to go back to. This matters a lot when you are trying to test and build something quickly.

What do you dislike?

The pricing is high. To cut down on cost, you have to buy less powerful nodes for distributed computing, which sometimes leads to memory issues when working on large data sets.

Recommendations to others considering the product:

Use it for brute-force computing. There are no BI options.

What problems are you solving with the product? What benefits have you realized?

I wanted a platform which gives me the benefit of distributed computing on demand without having to set up the environment myself. Qubole does a really good job in this regard.

User in Internet
Validated Reviewer
Verified Current User
Review Source

"Qubole notebooks are great, scheduling system could be better"

What do you like best?

The best part is Qubole notebooks. Notebooks can be attached to clusters, clusters can be loaded with different libraries, they are easy to use, and PySpark and Spark functionality almost always works. I have used notebooks in my school project and found them much better than SageMaker and Zeppelin.

What do you dislike?

The scheduling system is not reliable. As a company we used to have production jobs scheduled on the Qubole scheduler, and most would fail for some reason or other. This caused us to host our own Airflow instance.

What problems are you solving with the product? What benefits have you realized?

Production ETL jobs, and machine learning projects at my school. Ease of use with notebook functionality is pretty good.

* We monitor all Qubole reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. Validated reviews require the user to submit a screenshot of the product containing their user ID, in order to verify a user is an actual user of the product.