Comparing Hadoop Data Storage (HDFS, HBase, Hive and Pig)

The Hadoop ecosystem contains components such as HDFS, HBase, Hive and Pig that are used for data storage and data access. Sometimes these components replace existing data storage, and sometimes they extend it. Each component is designed to address specific problems and has specific applications; however, they can also be used together to solve different problems.

This talk will introduce the audience to the various components of the Hadoop ecosystem, such as HDFS, HBase, Hive, Pig, ZooKeeper, Chukwa and Mahout, along with the role of each component and its application areas.

The talk will focus mainly on HDFS, HBase, Hive and Pig. It will cover the design features, examples, application areas and limitations of these components, compare them, and give a fair idea of how they can be used.

In addition, it will cover a few case studies of Hadoop usage at companies such as Facebook, Yahoo and LinkedIn, to give an idea of how these components are currently being used in the field.


Rakesh Jadhav is a Software Specialist at SAS. He has 7 years of experience and works on Java and Hadoop technologies. He holds a B.E. in Information Technology (Pune University) and a Post Graduate Diploma in Business Management (Pune University). He has previously spoken on Big Data and Big Data Analytics, and on the Hadoop ecosystem.
