Explain the storage mechanisms of HBase. Differentiate HBase from RDBMS.

 STORAGE MECHANISM IN HBASE

HBase is a column-oriented database whose tables are sorted by row key. The table schema defines only column families; the columns inside each family are stored as key-value pairs. A table can have multiple column families, and each column family can hold any number of columns. The values of a column family are stored together on disk, in sorted order. Each cell value in the table carries a timestamp.

In a nutshell, in HBase:

  • A table is a collection of rows.
  • A row is a collection of column families.
  • A column family is a collection of columns.
  • A column is a collection of key-value pairs.
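
The hierarchy above can be sketched as nested maps. This is a minimal in-memory toy, not the real HBase client API: the class name TinyHBaseTable, its methods, and the sample row keys, families, and values are all hypothetical and chosen only for illustration. It shows that only column families are fixed at table creation, that each cell keeps timestamped versions, and that rows are read back in row-key order.

```python
class TinyHBaseTable:
    """Toy in-memory model of an HBase table (hypothetical, not the HBase API)."""

    def __init__(self, column_families):
        # Only the column families are fixed when the table is created.
        self.column_families = set(column_families)
        # Layout: row key -> column family -> column -> {timestamp: value}
        self.rows = {}

    def put(self, row_key, family, qualifier, value, timestamp):
        """Store one timestamped cell value; the column need not be pre-declared."""
        if family not in self.column_families:
            raise KeyError(f"unknown column family: {family}")
        versions = (
            self.rows.setdefault(row_key, {})
            .setdefault(family, {})
            .setdefault(qualifier, {})
        )
        versions[timestamp] = value

    def get(self, row_key, family, qualifier):
        """Return the value with the newest timestamp, or None if absent."""
        versions = self.rows.get(row_key, {}).get(family, {}).get(qualifier)
        if not versions:
            return None
        return versions[max(versions)]

    def scan(self):
        """Yield (row_key, row) pairs in sorted row-key order."""
        for row_key in sorted(self.rows):
            yield row_key, self.rows[row_key]


# Sample data below is made up for the example.
table = TinyHBaseTable(["personal", "professional"])
table.put("row2", "personal", "name", "Ravi", timestamp=1)
table.put("row1", "personal", "name", "Raju", timestamp=1)
table.put("row1", "personal", "city", "Hyderabad", timestamp=1)
table.put("row1", "personal", "city", "Delhi", timestamp=2)
table.put("row1", "professional", "designation", "Manager", timestamp=1)

print(table.get("row1", "personal", "city"))   # Delhi (newest timestamp wins)
print([key for key, _ in table.scan()])        # ['row1', 'row2']
```

Note that "row2" was written first but is scanned second, because the sketch orders rows by key rather than by insertion time.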


An example schema of a table in HBase is provided below.



DIFFERENCE BETWEEN HBASE AND RDBMS

HBase

  • HBase is schema-less: it has no fixed column schema and defines only column families.
  • It is built for wide tables and is horizontally scalable.
  • HBase has no multi-row transactions; only operations on a single row are atomic.
  • Its data is denormalized.
  • It is good for semi-structured as well as structured data.


RDBMS

  • An RDBMS is governed by its schema, which describes the whole structure of tables.
  • It is built for thin (narrow), comparatively small tables and is hard to scale horizontally.
  • RDBMS is transactional.
  • Its data is normalized.
  • It is good for structured data.
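
The schema contrast in the two lists can be made concrete with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for an RDBMS and a plain dictionary as a toy stand-in for HBase; the table name employee, the helper put, and all sample values are hypothetical, and the dictionary does not model the real HBase API.

```python
import sqlite3

# RDBMS side: the schema declares every column up front.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO employee (id, name) VALUES (1, 'Raju')")

try:
    # 'city' was never declared in the schema, so the write is rejected.
    conn.execute("INSERT INTO employee (id, city) VALUES (2, 'Delhi')")
    rdbms_accepted_new_column = True
except sqlite3.OperationalError:
    rdbms_accepted_new_column = False

# HBase-style side (toy model): only the column family is declared.
column_families = {"personal"}
hbase_like_rows = {}


def put(row_key, family, qualifier, value):
    """Write a cell; any column name is allowed inside a declared family."""
    if family not in column_families:
        raise KeyError(f"unknown column family: {family}")
    hbase_like_rows.setdefault(row_key, {}).setdefault(family, {})[qualifier] = value


put("row1", "personal", "name", "Raju")
put("row2", "personal", "city", "Delhi")  # new column, no schema change needed

print(rdbms_accepted_new_column)   # False
print(hbase_like_rows["row2"])     # {'personal': {'city': 'Delhi'}}
```

In the relational case, adding the city column would first require altering the table definition; in the HBase-style case, a row can carry a column that no other row has.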
