Data Analytics in Sports

How Data Analytics is transforming Sports Performance – Captivating

The latest buzzword in the business world is Data Analytics, a science in which data is used to build models that support better decisions. Business […]

Data Governance

Data governance and security mechanism in distributed data storage system

We are well aware that traditional data storage mechanisms cannot hold the massive volume of data generated at lightning speed for further use, even though we perform vertical […]

A Credible approach of Big Data processing and subsequent analysis on telecom data to minimize crime, combat terrorism, unsocial activities, etc.

Telecom providers hold a treasure trove of captive data: customer data, call detail records (CDRs), call center interactions, tower logs, and so on. They are metaphorically “sitting on a gold mine”. […]


Deleting Solr log files/folder from Standby NameNode could be the disaster when Primary NameNode is active in the HDP (Hortonworks Data Platform) Hadoop Cluster

Most of us know that Apache Ambari is used for managing, provisioning, and monitoring the components of a Hortonworks Hadoop cluster. We also know that Apache Ranger can be used […]


Fault tolerance enhancement on Apache Hadoop 3.0.0-alpha2 by supporting more than 2 NameNodes.

The NameNode is the most critical resource in a core Hadoop cluster. Once very large files are loaded into the Hadoop Distributed File System (HDFS), they are broken into block-sized chunks as […]

Streaming by Apache Flink

Basic Understanding of Stateful data Streaming supported by Apache Flink

Big Data processing platforms are maturing so that they can efficiently process streaming data, which is becoming a major focal point for taking business […]

Apache Flink

Apache Flink – A 4G Data Processing Engine

Analyzing streaming data in large-scale systems is increasingly becoming a focal point for making accurate business decisions, owing to the mushrooming of digital data sources around the globe […]


Steering number of mapper (MapReduce) in sqoop for parallelism of data ingestion into Hadoop Distributed File System (HDFS)

To import data from most data sources, such as an RDBMS, Sqoop internally uses mappers. Before delegating the work to the mappers, Sqoop performs a few initial operations in sequence once […]

Transfer structured data from Oracle to Hadoop storage system

Using Apache Sqoop, we can transfer structured data from a relational database management system (RDBMS) to the Hadoop Distributed File System (HDFS). Because of the distributed storage mechanism in HDFS, […]

Data Lake

Data Ingestion phase for migrating enterprise data into Hadoop Data Lake

Big Data solutions help extract valuable information for making accurate strategic business decisions. The exponential growth of digitalization, social media, telecommunications, etc. is fueling enormous data generation […]