Hadoop Training course content

What is Hadoop?

Apache Hadoop is an open-source software framework used for distributed storage and processing of massive data sets using the MapReduce programming model. It runs on computer clusters built from commodity hardware. All the modules in Hadoop are designed with the basic assumption that hardware failures are common occurrences and should be handled automatically by the framework.
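As a rough illustration of the MapReduce programming model mentioned above, here is a minimal single-process sketch in plain Python. It does not use Hadoop itself: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are invented for this example, and a real Hadoop job would run the map and reduce steps on different nodes of the cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["Hadoop stores data", "Hadoop processes data"]))
```

The three steps are kept as separate functions because that separation is what lets a framework run many map and reduce calls in parallel on different machines.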

The core of Apache Hadoop consists of a storage part, known as the Hadoop Distributed File System (HDFS), and a processing part, which is the MapReduce programming model. Hadoop splits files into large blocks and distributes them across the nodes in a cluster. It then transfers packaged code to the nodes to process the data in parallel. This approach takes advantage of data locality, where nodes manipulate the data they have access to. This allows the data set to be processed faster and more efficiently than it would be in a more conventional supercomputer architecture that relies on a parallel file system where computation and data are distributed via high-speed networking.
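The splitting and data-locality ideas described above can be sketched in a few lines of plain Python. This is a toy model, not how HDFS is implemented: the round-robin placement, the replication factor of 2, and the names `split_into_blocks`, `place_blocks`, and `schedule_tasks` are all assumptions made for illustration (real HDFS placement is rack-aware and its blocks are far larger).

```python
def split_into_blocks(data, block_size):
    """Cut a byte string into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_blocks(num_blocks, nodes, replication=2):
    """Toy placement: store block i on `replication` consecutive nodes.

    Assumes replication <= len(nodes) so that the replicas land on
    distinct nodes.
    """
    return {
        block: [nodes[(block + r) % len(nodes)] for r in range(replication)]
        for block in range(num_blocks)
    }


def schedule_tasks(placement):
    """Data-local scheduling: run each block's task on a node that already
    holds a replica, preferring the holder with the fewest tasks so far."""
    load = {}
    assignment = {}
    for block, holders in placement.items():
        node = min(holders, key=lambda n: load.get(n, 0))
        assignment[block] = node
        load[node] = load.get(node, 0) + 1
    return assignment


if __name__ == "__main__":
    blocks = split_into_blocks(b"x" * 10, block_size=4)
    placement = place_blocks(len(blocks), ["node1", "node2", "node3"])
    print(schedule_tasks(placement))
```

Because every task is assigned to a node that already stores its block, no block has to cross the network before processing starts, which is the point of moving the code to the data.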



Hadoop Course Content Details

01.Introduction To Hadoop

  • What is Enterprise Big Data?
  • What is Hadoop?
  • History of Hadoop
  • Hadoop Eco-System
  • Hadoop Framework
  • Hadoop vs RDBMS
  • Hadoop vs SAP Hana vs Teradata
  • How ETL tools work in Hadoop
  • Hadoop Requirements and supported versions
  • Case Studies: Hadoop and Hive at Yahoo, Facebook, etc.

02.Hadoop Distributed File Systems

  • Installation of Ubuntu 13.04 *
  • Basic Unix Commands *
  • Hadoop Commands
  • HDFS & Job Tracker Access URLs & ports.
  • HDFS design
  • Hadoop file systems
  • Master and Slave node architecture
  • Filesystem API – Java
  • Serialization in Hadoop – Reading and writing data from/to Hadoop URL

03.Administering Hadoop

  • Cluster specification
  • Hadoop cluster setup and installation
  • Standalone
  • Pseudo-distributed mode
  • Fully distributed mode
  • fs, fsck, distcp, archive, ...
  • dfsadmin, balancer, jobtracker, tasktracker, namenode, ...
  • Step-by-step multi-node installation
  • Hadoop Configuration
  • Namenode and datanode directory structure
  • User commands
  • Administration commands
  • Monitoring
  • Benchmarking a Hadoop cluster

04.MapReduce

  • Map/Reduce Overview and Architecture
  • Developing Map/Reduce Jobs
  • MapReduce Data Types
  • Custom DataTypes/Writables
  • Input File Formats
  • Text Input File Format
  • Zip File Input Format
  • LZO Compression & LZO Input Format
  • XML Input Format
  • JSON Input Format
  • Packaging, Launching, Debugging jobs
  • Hash Partitioner
  • Custom Partitioner
  • Capacity Scheduler
  • Fair Scheduler
  • Output Formats
  • Job Configuration
  • Job Submission
  • MapReduce workflows
  • Practicing MapReduce Programs
  • Combiner
  • Partitioner
  • Search
  • Sorting
  • Secondary Sorting
  • Distributed Cache
  • Chain Mapping/Reducing
  • Scheduling
  • One Example for Each Concept*
  • Practical execution of examples locally, on HDFS, and using Eclipse Plugins*
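To make the "Hash Partitioner" topic in the list above more concrete: Hadoop's default partitioner takes the key's hash code, clears the sign bit, and takes the remainder modulo the number of reduce tasks. The sketch below imitates that rule in plain Python. The helper names (`java_string_hash`, `hash_partition`, `partition_pairs`) are invented for this example, and the hash only matches Java's `String.hashCode()` for characters in the Basic Multilingual Plane.

```python
def java_string_hash(text):
    """Java-style string hash (h = 31 * h + char), kept to 32 bits."""
    h = 0
    for ch in text:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h


def hash_partition(key, num_reducers):
    """Pick a reducer: clear the sign bit, then take the remainder."""
    return (java_string_hash(key) & 0x7FFFFFFF) % num_reducers


def partition_pairs(pairs, num_reducers):
    """Route each (key, value) pair to the bucket of its reducer."""
    buckets = [[] for _ in range(num_reducers)]
    for key, value in pairs:
        buckets[hash_partition(key, num_reducers)].append((key, value))
    return buckets


if __name__ == "__main__":
    pairs = [("hadoop", 1), ("hive", 1), ("hadoop", 1), ("pig", 1)]
    for index, bucket in enumerate(partition_pairs(pairs, 3)):
        print(index, bucket)
```

The property that matters is that equal keys always land in the same bucket, so a single reducer sees every value for a given key. A custom partitioner replaces only the `hash_partition` rule, for example to route keys by a prefix.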




05.HIVE

  • Hive concepts
  • Hive installation
  • Hive configuration, hive services & metastore
  • Hive datatypes – primitive and complex types
  • Hive operators
  • Hive built-in functions
  • Hive Tables
  • Creating tables
  • External Table
  • Internal Table
  • Partitions and buckets
  • Browsing tables and partitions
  • Storage formats
  • Loading data
  • Joins
  • Aggregations and sorting
  • Insert into local file
  • Altering, dropping tables
  • Importing data




06.PIG

  • Why Pig?
  • Pig and Pig Latin
  • Pig installation
  • Pig Latin commands
  • Pig Latin relational operators
  • Pig Latin diagnostic operators
  • Data types and Expressions
  • Built-in functions
  • Data processing in Pig
  • Load and store
  • Filtering the data
  • Grouping the data
  • Joining the data
  • Sorting the data

07.Sqoop

  • Sqoop installation
  • Sqoop commands
  • Sqoop connectors
  • Importing the data from MySQL
  • Exporting the data
  • Creating hive tables by importing data

08.HBase

  • HBase Introduction
  • HBase Installation
  • HBase Architecture
  • ZooKeeper
  • Keys & Column families
  • Integration with MapReduce
  • Integration with Hive

09.Other Miscellaneous Topics

  • Hue
  • Impala
  • Hadoop Streaming
  • Storm – Real Time Hadoop
  • Eclipse Plugins
  • Cloudera Hadoop Installation
  • Cloudera Administration
  • HIHO ecosystem
  • Flume ecosystem
  • Reporting Tools Introduction

Online and classroom training course content

venkat: 9059868766
email:[email protected]
Address: PlotNo 126/c,2nd floor,Street Number 4, Addagutta Society, Jal Vayu Vihar, Kukatpally, Hyderabad, Telangana 500085

Updated: May 13, 2017 — 5:31 am
