The most helpful thing is that it eases the work you do. The main upside of using HBase is its user-friendliness. This technology is gaining a lot of importance these days.
Another helpful feature is the variety of commands it provides. Apart from that, you will discover a lot of upsides once you actually start working with this great and helpful technology. Since HBase is a NoSQL database, users dealing with massive data or big data will find it really helpful to work with. This technology is a great asset in this field and is certainly going to be a leader in the coming days. When you are dealing with data of huge velocity and volume, this technology turns out to be a saviour. I highly recommend using HBase instead of conventional technologies. Review collected by and hosted on G2.com.
The least helpful thing about HBase, I think, is the lack of some common features that are available in similar technologies on the market. These downsides could be improved. For a technology to become popular, the least helpful parts need to be removed. Technically, I found that when only one HMaster is used, there is a possibility of failure. Joins have to be handled in the MapReduce layer. Whereas a traditional RDBMS can index multiple columns, HBase indexes only the row key.
The speed at which querying is possible in HBase with large datasets. Working alongside a Hadoop-based environment with huge clusters, HBase really made the database querying part a lot easier.
The Java API. Although it is updated quite regularly, I wish it were a little easier to use. I needed to write an external Java program for a sanity job on an HBase cluster, and it took me a month to write clean, consistent and reliable code that would work as expected every time. I am pretty sure it would have been much easier in Scala, but the requirement was for Java. Hence, I think the API might need a little improvement.
The ease of use of Apache HBase is a good thing, but what stood out for us was the performance.
REST API calls are also a great asset.
The ease of integrating with Apache Solr and Helix for visualizations is amazing.
Aggregate functions could be more optimized or run faster. Real-time aggregation can be improved. OLAP queries are really slow on HBase and can be improved.
Handling very large datasets comfortably and behaving like a DBMS on Hadoop, although not a relational one. Also, we have parameters to tune for consistency/availability, so it can be tuned depending on the business requirement.
The complications in getting started with HBase: installing HBase on top of Hadoop is tricky, and querying HBase differs a lot from other NoSQL databases like Cassandra, so there is a learning phase of a few days. Learning how to use HBase for storing tables and querying them is not at all intuitive for someone from a SQL background.
The features I really liked in HBase are:
1) Strong consistency: writes and reads are always consistent, as compared to eventually consistent databases like Cassandra.
2) Proven scalability to dozens of petabytes.
3) Auto-sharding.
4) Scaling with commodity hardware.
5) Cost-effective from gigabytes to petabytes.
HBase is really tough to query. We may have to integrate HBase with a SQL layer like Apache Phoenix, where we can write queries against the data in HBase. It's really good to have Apache Phoenix on top of HBase.
Another drawback of HBase is that we cannot have more than one index on a table; only the row key acts as a primary key. So performance is slow when we want to search on more than one field, or on a field other than the row key. We can overcome this problem by writing MapReduce code or by integrating with Apache Solr or Apache Phoenix.
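To make the row-key-only indexing described above concrete, here is a minimal Python sketch. It does not use HBase at all: it is a toy in-memory stand-in, and every name and value in it (`rows`, `scan_prefix`, `build_index`, the sample users) is a hypothetical chosen for illustration. It only shows why a lookup by row key is cheap, why a lookup on another field means reading every row, and what a hand-maintained secondary index looks like in principle.

```python
from bisect import bisect_left

# Toy stand-in for a table whose rows are kept sorted by row key.
# All names and data are hypothetical; this is not the HBase API.
rows = sorted(
    [
        ("user#001", {"city": "Pune", "plan": "free"}),
        ("user#002", {"city": "Oslo", "plan": "pro"}),
        ("user#003", {"city": "Pune", "plan": "pro"}),
        ("zone#001", {"city": "Lima", "plan": "free"}),
    ],
    key=lambda row: row[0],
)
keys = [key for key, _ in rows]


def scan_prefix(prefix):
    """Row-key lookup: binary search to the first match, then read forward."""
    i = bisect_left(keys, prefix)
    matches = []
    while i < len(keys) and keys[i].startswith(prefix):
        matches.append(keys[i])
        i += 1
    return matches


def scan_by_field(field, value):
    """Non-key lookup: every row has to be inspected."""
    return [key for key, cols in rows if cols.get(field) == value]


def build_index(field):
    """Hand-maintained secondary index: field value -> list of row keys."""
    index = {}
    for key, cols in rows:
        index.setdefault(cols[field], []).append(key)
    return index


city_index = build_index("city")
print(scan_prefix("user#"))
print(scan_by_field("city", "Pune"))
print(city_index["Pune"])
```

In a real deployment the secondary index would itself have to be stored and kept in sync with the main table, which is the kind of work a layer such as Apache Phoenix or Solr takes over.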
It is an open-source platform for learners, with Bigtable-style capabilities, fault tolerance, the ability to group records in the billions, and cross-platform support, because of its lineage with Hadoop and HDFS. HBase runs on top of HDFS and is well-suited for fast read and write operations on large datasets, with high throughput and low input/output latency.
The memory compression it provides is super awesome. It is a distributed and scalable platform. It can work with structured as well as unstructured data.
Wide-column. HBase stores data in a table-like format with the ability to store billions of rows with millions of columns. Columns can be grouped together in column families, which allows the physical distribution of row values on different cluster nodes.
Consistent. HBase is architected to have strongly consistent reads and writes, as opposed to other NoSQL databases, like Cassandra, that are eventually consistent. Once a write has been performed, all read requests for that data will return the same value.
Failover. HBase tables are replicated for failover.
HBase was designed to scale; data that is accessed together is stored together. Grouping the data by row key is central to running on a cluster. In HBase, the data is automatically distributed across a cluster. Sharding distributes different data across multiple servers, and each server is the source for a subset of the data. Data that is accessed together is kept together, which makes scaling faster.
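The idea that data is grouped by row key and spread across servers can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the split points, the server names and the helper functions `region_for` and `server_for` are all hypothetical, and a real cluster chooses and moves its splits automatically.

```python
from bisect import bisect_right

# Hypothetical split points: region 0 holds keys below "g", region 1 holds
# keys from "g" up to (not including) "p", region 2 holds keys from "p" on.
split_points = ["g", "p"]
servers = ["server-a", "server-b", "server-c"]  # one server per region here


def region_for(row_key):
    """Return the index of the key range that contains row_key."""
    return bisect_right(split_points, row_key)


def server_for(row_key):
    """Return the (hypothetical) server that owns row_key's region."""
    return servers[region_for(row_key)]


for key in ["apple", "order#1001", "order#1002", "zebra"]:
    print(key, "->", server_for(key))
```

Because the ranges are contiguous in key order, row keys that sort next to each other (the two `order#` keys here) fall in the same range, which is why a sensible row key design keeps data that is read together on the same server.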
It is one of the best options, but it lacks SQL scripting. With a relational database, you normalize your schema, which eliminates redundant data and makes storage efficient. Indexes and queries with joins are used to bring the data back together again. Indexes slow down data ingestion with lots of nonsequential disk I/O, and joins cause bottlenecks on reads with lots of data. The relational model does not scale horizontally across a cluster. HDFS is written in Java on top of the Linux file system and is a write-once storage layer. Updates to closed files are conducted via an append process. The batch updates of HDFS are a major limitation. There is no support for continuous updates to a file. Moreover, HDFS relies on the underlying Linux file system to store the HDFS content.
The NameNode, the part of the master node that identifies the location of each file block, has scalability and reliability issues. NameNodes are hard to configure, and because they are replicated so as not to become single points of failure, configuration gets even harder.
1. Can serve multiple queries and handle them very effectively.
2. Can store large datasets on top of HDFS file storage and can aggregate and analyze billions of rows.
3. Thanks to its architecture, we can use it when creating our ML models.
It should also have a web interface, like other services have.
No support for transactions.
No SQL queries.
I use HBase integrated with REST APIs and the Java client. It is also very helpful for storing very large amounts of secure client data and querying it very effectively. We can scale the database when required.
In my opinion, the transaction support in HBase is not helpful, and there is no built-in authentication.
Good for random read and write operations, can be integrated with Apache Phoenix, and the architecture is simple. The column family concept is a good one for making read operations faster.
HBase is not easy to integrate with MapReduce.
Setup is a bit challenging.
There are many columnar databases, like Cassandra, which are better than HBase.
It is a column-oriented, multidimensional database and is versatile. It has no fixed structure like a normal database. It has MapReduce and Hive/Pig integration for operational needs. It is dynamic and can be modified at runtime.
We cannot have cross operations like joins in HBase. We can implement them using MapReduce, but that takes a lot of time. It is tough to query. Inserting data from a relational database into HBase is also a time-consuming process.
1. HBase can handle and store large datasets on top of HDFS file storage. Moreover, it aggregates and analyzes billions of rows present in HBase tables.
2. Database breakdown: compared with a traditional database, reading and processing data in HBase takes only a small amount of time.
3. Scalability is supported in both linear and modular form.
4. There is no concept of a fixed column schema in HBase.
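The point about there being no fixed column schema can be pictured with a small Python sketch. It is a toy model rather than HBase itself, and the family names (`info`, `stats`), the `put` helper and the sample rows are hypothetical. The assumption it illustrates is that only the column families are declared up front, while each row is free to carry its own set of columns inside them.

```python
# Hypothetical column families, fixed when the toy table is created.
FAMILIES = {"info", "stats"}

# row key -> {"family:qualifier": value}; rows may hold different columns.
table = {}


def put(row_key, family, qualifier, value):
    """Store one cell; the family must exist, the qualifier can be anything."""
    if family not in FAMILIES:
        raise ValueError(f"unknown column family: {family}")
    table.setdefault(row_key, {})[f"{family}:{qualifier}"] = value


put("user#001", "info", "name", "Asha")
put("user#001", "stats", "logins", 12)
put("user#002", "info", "email", "b@example.com")  # a column user#001 lacks

for row_key in sorted(table):
    print(row_key, table[row_key])
```

Each row stores only the cells it actually has, so adding a new column is just a matter of writing a value under a new qualifier.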
1. Single point of failure
2. There is no support for transactions
3. No handling of JOINs in the database
4. Sorted only on key
5. There are no permissions or built-in authentication
The best thing about HBase for me: I badly needed storage, and I also needed to retrieve and use that data. HBase is a good option for that, as I use it to store data and its Java API client lets me use that data from multiple applications.
Data is arranged only by key, so any transaction takes time. Because sorting is key-based, it does not work very efficiently when digging through huge volumes of data.
The database can be shared. Operations such as data reading and processing take a small amount of time compared with traditional relational models.
In HBase, there is no support for transactions.
It delivers faster retrieval of data compared with other big data components, can handle semi-structured data as well, and works very well with APIs.
The challenge we faced with HBase is that we need to truncate the table and cannot overwrite the data as we can in Hive. This holds even when Hive is linked to HBase data storage.
HBase is highly recommended, especially with respect to performance, setup and maintenance.
When working with huge datasets, it's a total win against various databases. It provides lots of parameters to tune the performance based on conditions.
HBase has its own difficulties: the syntax is not so user-friendly and takes time to adapt to. There is also the restriction to a single index. Aggregations slow down the performance.
Easy to understand and easy to implement in many big data frameworks such as Hadoop. It is very fast compared with a relational database, and in combination with Hadoop it was awesome. I used HBase as a learning experience.
There are slightly fewer tutorials for beginners to understand HBase.
I really love the block cache support and Bloom filters for real-time query processing. I do love HBase's failover support and load-sharing feature. Additionally, compared with traditional databases, data reading and processing requires only a short period of time. There is no specification of a fixed column schema in HBase, since it is schema-free. Consequently, it defines only column families.
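For readers who have not met Bloom filters, the general technique is sketched below in Python. This is a generic, textbook-style Bloom filter and not HBase's own implementation; the class name, the sizes and the use of SHA-256 are assumptions made for the example. The property that matters for query processing is that a negative answer is always correct, so a store can skip reading a file when the filter says the key is definitely not there.

```python
import hashlib


class BloomFilter:
    """Generic Bloom filter sketch (not HBase's implementation)."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)  # one byte per bit, for clarity

    def _positions(self, key):
        # Derive num_hashes positions by salting the key with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))


bf = BloomFilter()
for row_key in ["user#001", "user#002", "user#003"]:
    bf.add(row_key)

print(bf.might_contain("user#002"))  # an added key is never reported absent
```

A `True` answer can occasionally be a false positive, so it still has to be confirmed by an actual read; the filter only saves the reads that would certainly have found nothing.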
Integrating HBase with MapReduce jobs results in unpredictable latencies. There is a possibility of failure when only one HMaster is used.
With Hadoop being very slow for real-time reporting, we connected Tableau with HBase for real-time reporting.
It is not possible to implement cross data operations like joins etc.
Handles large volumes of data, is fault-tolerant and license-free, and is very flexible on schema design, with no fixed schema. It supports scaling out in coordination with the Hadoop file system, even on commodity hardware.
Sorting only on key
No handling of joins in the database
Unpredictable latencies
No support for SQL structure
No built-in authentication
Memory issues on the cluster
HBase provides functionality to upload data in bulk or update data on Hadoop, which Hive does not provide, as HBase is a NoSQL database.
The commands and, to some extent, the structure of row keys and column families.
Variable Schema: columns can be added and removed dynamically.
Integration with Java client, Thrift and REST APIs.
Auto scaling, sharding and failover
Great integration with the ecosystem
No transactions
Need to have Phoenix or some query layer on top of it, to take full advantage
No joins
Cannot store blobs
Resource intensive
HBase's storage and processing speed, and the way it works on a particular record as a cell.
It can also be easily connected with streaming data from Kafka.
The commands are not easy to remember, which I can understand since it is a NoSQL database.