
Course Outline

Big Data Overview:

  • Defining Big Data
  • The reasons behind the growing popularity of Big Data
  • Real-world Big Data Case Studies
  • Key Characteristics of Big Data
  • Solutions for managing Big Data

Hadoop & Its Components:

  • An introduction to Hadoop and its core components.
  • Hadoop Architecture and the types of data it can handle or process.
  • A brief history of Hadoop, the companies utilizing it, and the motivations behind its adoption.
  • A detailed explanation of the Hadoop Framework and its components.
  • Understanding HDFS and the processes for reading from and writing to the Hadoop Distributed File System.
  • Instructions for setting up a Hadoop Cluster in various modes: Stand-alone, Pseudo-distributed, and Multi-node.

(This part covers setting up a Hadoop cluster on VirtualBox, KVM, or VMware, including network configuration, starting the Hadoop daemons, and testing the cluster.)

  • Understanding the MapReduce Framework and its operational mechanics.
  • Executing MapReduce jobs on a Hadoop cluster.
  • Understanding replication, mirroring, and rack awareness in Hadoop clusters.
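
As a rough illustration of the MapReduce model named above, the following Python sketch simulates the map, shuffle, and reduce phases of a word count inside a single process. It is not Hadoop code and does not run on a cluster; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def map_phase(line):
    # Map step: emit a (word, 1) pair for every word in the input line.
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    # Shuffle step: group all emitted values by their key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    # Reduce step: combine the values of one key into a single count.
    return key, sum(values)


def word_count(lines):
    # Run the three phases in sequence on an in-memory list of lines.
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big Data Hadoop"]))
```

On a real cluster the map and reduce steps run in parallel on different nodes and the framework performs the shuffle; this sketch only shows how data flows between the phases.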

Hadoop Cluster Planning:

  • Strategies for planning your Hadoop cluster.
  • Aligning hardware and software requirements for effective cluster planning.
  • Analyzing workloads to plan a cluster that prevents failures and ensures optimal performance.

What is MapR and Why Choose MapR:

  • An overview of MapR and its architecture.
  • Understanding and working with the MapR Control System, MapR Volumes, snapshots, and Mirrors.
  • Planning a cluster specifically within the context of MapR.
  • Comparing MapR with other distributions and Apache Hadoop.
  • MapR installation and cluster deployment procedures.

Cluster Setup & Administration:

  • Managing services, nodes, snapshots, mirrored volumes, and remote clusters.
  • Understanding and managing Nodes.
  • Understanding Hadoop components and installing them alongside MapR services.
  • Accessing data on the cluster, including via NFS, while managing services and nodes.
  • Managing data through volumes.
  • Handling users and groups.
  • Assigning roles to nodes, and commissioning and decommissioning nodes.
  • Cluster administration and performance monitoring.
  • Configuring and analyzing metrics.
  • Administering MapR security.
  • Understanding and working with M7 Native storage for MapR tables.
  • Cluster configuration and tuning for optimum performance.

Cluster Upgrade and Integration with Other Setups:

  • Upgrading the MapR software version and understanding different types of upgrades.
  • Configuring the MapR cluster to access an HDFS cluster.
  • Setting up a MapR cluster on Amazon Elastic MapReduce.

All of the topics above include demonstrations and practice sessions, so that learners gain hands-on experience with the technology.

Requirements

  • Fundamental knowledge of the Linux File System
  • Basic Java proficiency
  • Familiarity with Apache Hadoop (recommended)

Duration: 28 hours
