Apache Spark Training Course Content


Apache Spark Training Course Content Details

What is Apache Spark

Apache Spark is an open-source cluster-computing framework. Originally developed at the University of California, Berkeley's AMPLab, the Spark codebase was later donated to the Apache Software Foundation, which has maintained it since. Spark provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.
Apache Spark provides programmers with an application programming interface centered on a data structure called the resilient distributed dataset (RDD), a read-only multiset of data items distributed over a cluster of machines and maintained in a fault-tolerant way. It was developed in response to limitations in the MapReduce cluster-computing paradigm, which forces a particular linear dataflow structure on distributed programs: MapReduce programs read input data from disk, map a function across the data, reduce the results of the map, and store the reduction results on disk. Spark's RDDs function as a working set for distributed programs that offers a (deliberately) restricted form of distributed shared memory.
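As a rough illustration of the contrast described above, the following Scala sketch builds a word-count RDD, asks Spark to keep it in memory, and then runs two computations on it instead of writing intermediate results to disk between steps. It assumes the spark-core library is on the classpath and runs in local mode. The object name, method name and sample sentences are invented for illustration and are not taken from the course material.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example object; names and sample data are illustrative only.
object RddWordCountSketch {

  // Counts words with RDD transformations and returns two results
  // computed from the same cached RDD: the per-word counts and the
  // number of distinct words.
  def countWords(sc: SparkContext, lines: Seq[String]): (Map[String, Int], Long) = {
    val counts = sc.parallelize(lines) // distribute the input as an RDD
      .flatMap(_.split("\\s+"))        // split each line into words
      .filter(_.nonEmpty)              // drop empty strings from blank lines
      .map(word => (word, 1))          // "map" step: pair each word with 1
      .reduceByKey(_ + _)              // "reduce" step: sum the 1s per word
      .cache()                         // mark the RDD to be kept in memory

    val distinctWords = counts.count() // first action: computes the RDD
    val asMap = counts.collect().toMap // second action: can reuse cached data
    (asMap, distinctWords)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("RddWordCountSketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val (counts, distinct) =
        countWords(sc, Seq("spark keeps data in memory", "spark reuses data"))
      counts.toSeq.sortBy(_._1).foreach { case (word, n) => println(s"$word: $n") }
      println(s"distinct words: $distinct")
    } finally {
      sc.stop() // always release the local Spark context
    }
  }
}
```

The transformations (flatMap, filter, map, reduceByKey) are lazy; nothing runs until an action such as count or collect is called.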




Apache Spark Course Content

01.SCALA (Object Oriented and Functional Programming)

  • Getting started With Scala.
  • Scala Background, Scala Vs Java and Basics.
  • Interactive Scala – REPL, data types, variables, expressions, simple functions.
  • Running the program with Scala Compiler.
  • Explore the type lattice and use type inference
  • Define Methods and Pattern Matching.

02.Scala Environment Set up.

  • Scala set up on Windows.
  • Scala set up on UNIX.

03.Functional Programming.

  • What is Functional Programming.
  • Differences between OOP and FP.

04.Collections (Very Important for Spark)

  • Iterating, mapping, filtering and counting
  • Regular expressions and matching with them.
  • Maps, Sets, groupBy, Option, flatten, flatMap
  • Word count, IO operations, file access, flatMap
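
To give a flavour of the collection operations listed above, here is a minimal sketch that uses only the Scala standard library. The sample sentences, the object name and the parseSetting helper are invented for illustration. Spark's RDD API offers methods with the same names (map, flatMap, filter), which is one reason this module matters for the later Spark sessions.

```scala
// Hypothetical example; the sample sentences and value names are illustrative only.
object CollectionsSketch {
  val lines: List[String] = List(
    "spark makes clusters simple",
    "scala makes spark simple"
  )

  // flatMap: split every line into words and flatten the result into one list.
  val words: List[String] = lines.flatMap(_.split("\\s+"))

  // groupBy + map: a word count on a local collection.
  val wordCounts: Map[String, Int] =
    words.groupBy(identity).map { case (word, occurrences) => word -> occurrences.size }

  // filter and count.
  val longWords: List[String] = words.filter(_.length > 5).distinct
  val sparkMentions: Int = words.count(_ == "spark")

  // find returns an Option: Some(value) when a match exists, otherwise None.
  val firstWithZ: Option[String] = words.find(_.contains("z"))

  // flatten: drop the Nones from a list of Options.
  val parsed: List[Int] = List(Some(1), None, Some(3)).flatten

  // Regular expression used as an extractor in a pattern match.
  private val KeyValue = """(\w+)=(\d+)""".r

  def parseSetting(s: String): Option[(String, Int)] = s match {
    case KeyValue(key, value) => Some(key -> value.toInt)
    case _                    => None
  }

  def main(args: Array[String]): Unit = {
    println(wordCounts)
    println(longWords)
    println(parseSetting("cores=4"))
  }
}
```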

05.Object Oriented Programming.

  • Classes and Properties.
  • Objects, Packaging and Imports.
  • Traits.
  • Objects, classes, inheritance, Lists with multiple related types, apply

06.Integrations

  • What is SBT?
  • Integration of Scala in Eclipse IDE.
  • Integration of SBT with Eclipse.




07.SPARK CORE.

  • Batch versus real-time data processing
  • Introduction to Spark, Spark versus Hadoop
  • Architecture of Spark.
  • Coding Spark jobs in Scala
  • Exploring the Spark shell -> Creating Spark Context.
  • RDD Programming
  • Operations on RDD.
  • Transformations
  • Actions
  • Loading Data and Saving Data.
  • Key Value Pair RDD.
  • Broadcast variables.

08.Persistence.

  • Configuring and running the Spark cluster.
  • Exploring a Multi-Node Spark Cluster.
  • Cluster management
  • Submitting Spark jobs and running in the cluster mode.
  • Developing Spark applications in Eclipse
  • Tuning and Debugging Spark.

09.CASSANDRA (NOSQL DATABASE)

  • Learning Cassandra
  • Getting started with architecture
  • Installing Cassandra.
  • Communicating with Cassandra.
  • Creating a database.
  • Create a table
  • Inserting Data
  • Modelling Data.
  • Creating an Application with Web.
  • Updating and Deleting Data.

10.SPARK INTEGRATION WITH NOSQL (CASSANDRA) and AMAZON EC2

  • Introduction to Spark and Cassandra Connectors.
  • Spark With Cassandra -> Set up.
  • Creating a Spark Context to connect to Cassandra.
  • Creating a Spark RDD on the Cassandra database.
  • Performing Transformations and Actions on the Cassandra RDD.
  • Running a Spark Application in Eclipse to access data in Cassandra.
  • Introduction to Amazon Web Services.
  • Building a 4-Node Spark Cluster in Amazon Web Services.
  • Deploying in Production with Mesos and YARN.

11.SPARK STREAMING

  • Introduction to Spark Streaming.
  • Architecture of Spark Streaming
  • Processing Distributed Log Files in Real Time
  • Discretized streams (DStreams) and RDDs.
  • Applying Transformations and Actions on Streaming Data
  • Integration with Flume and Kafka.
  • Integration with Cassandra
  • Monitoring streaming jobs.

12.SPARK SQL

  • Introduction to Apache Spark SQL
  • The SQL context
  • Importing and saving data
  • Processing Text, JSON and Parquet files
  • DataFrames
  • User-defined functions
  • Using Hive

13.SPARK MLLIB.

  • Introduction to Machine Learning
  • Types of Machine Learning.
  • Introduction to Apache Spark MLlib Algorithms.
  • Machine Learning Data Types and working with MLlib.
  • Regression and Classification Algorithms.
  • Decision Trees in depth.
  • Classification with SVM, Naive Bayes
  • Clustering with K-Means
  • Building the Spark server
  • Local Hive Metastore server





Contact Us for Apache Spark Online and Classroom Training

venkat: 9059868766
email:[email protected]
Address: Plot No 126/c, 2nd floor, Street Number 4, Addagutta Society, Jal Vayu Vihar, Kukatpally, Hyderabad, Telangana 500085


Updated: May 12, 2017 — 11:41 am

1 Comment

  1. Hi,

    I am Ramesh. I want to learn Spark, so I need some information about the course.
    Please send the following details.

    Course Duration :

    Course Fee :

    Batch starting Date :

    Course material (soft or hard copy) :

    Batch timings (morning or evening) :
