Installation of Apache Hadoop 3.2.0

Apache Hadoop 3.2.0 has been released with many enhancements over the previous stable release. This article explains the step-by-step installation of Apache […]

Manual procedure to add a new DataNode to an existing basic data lake, built on HDFS (Hadoop Distributed File System) on a multi-node cluster, without Apache Ambari or Cloudera Manager

The aim of this article is to highlight the essential steps for adding a new DataNode to an existing multi-node Hadoop cluster. Midsize or […]

Transfer structured data from Oracle to Hadoop storage system

Using Apache Sqoop, we can transfer structured data from a relational database management system (RDBMS) to the Hadoop Distributed File System (HDFS). Because of the distributed storage mechanism in HDFS, […]

Data Lake

Basic concept of a Data Lake

The infographic represents the basic concept of a Data Lake, where we can use the ELT approach (extraction, loading, and then transformation) instead of the traditional ETL (extraction, transformation, and then loading) process. […]