Course: Hadoop, developing applications for Big Data

Practical course - 4d - 28h00 - Ref. APH
Price: 2520 € excl. tax





This hands-on course will teach you how to develop applications that enable you to process distributed data in batch mode. You'll collect, store and process data in heterogeneous formats with Apache Hadoop, to set up processing chains integrated with your information system.



Practical course in person or remote class
Available in English on request



Teaching objectives
At the end of the training, the participant will be able to:
Build a MapReduce-based program
Integrate Hadoop HBase into an enterprise workflow
Work with Apache Hive and Pig on top of the Hadoop Distributed File System (HDFS)
Use a task graph with Hadoop

Intended audience
Designers, developers.

Prerequisites
Good experience in Java development. Knowledge of web architecture is a plus.

Practical details
Hands-on work
Application development for Big Data.
Teaching methods
Lectures 30%, practical work 70%.

Course schedule

1
Big data

  • Defining the scope of big data.
  • The role of the Hadoop project.
  • Basic concepts of big data projects.
  • Introduction to cloud computing.
  • The difference between private and public cloud computing.
  • Big data architectures based on the Hadoop project.
Demonstration
Use of Hadoop and GoogleApp.

2
Collecting data and applying MapReduce

  • Analysis of company data flows.
  • Structured and unstructured data.
  • The principles of semantic analysis of enterprise data.
  • MapReduce-based task graph.
  • Data consistency granularity.
  • Transferring data from a persistence system to Hadoop.
  • Transferring data from a cloud to Hadoop.
Hands-on work
Managing the collection of customer information using MapReduce. Configuring the YARN implementation. Developing a MapReduce-based task.
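
As an illustration of the kind of task developed in this module, the map, shuffle and reduce phases can be sketched in plain Java. This is a minimal sketch only: it does not use the Hadoop API, and the class and method names (WordCountSketch, map, shuffle, reduce, run) are hypothetical names chosen for the example, not part of the course material.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration of the MapReduce phases (assumption: no Hadoop
// dependency; on a cluster the framework distributes these steps).
public class WordCountSketch {

    // Map phase: emit a (word, 1) pair for every word in one input line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String token : line.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                pairs.add(new SimpleEntry<>(token, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group every emitted value under its key.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum the values collected for one key.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Chains the three phases over a small in-memory input.
    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        for (String line : lines) {
            emitted.addAll(map(line));
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : shuffle(emitted).entrySet()) {
            counts.put(entry.getKey(), reduce(entry.getKey(), entry.getValue()));
        }
        return counts;
    }

    public static void main(String[] args) {
        // Prints {big=2, data=2, hadoop=2, stores=1, with=1}
        System.out.println(run(List.of("big data with Hadoop", "Hadoop stores big data")));
    }
}
```

In a real Hadoop job the same logic would be split between a mapper class and a reducer class, with the framework handling the shuffle between them.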

3
Data storage with HBase

  • Several types of XML database.
  • Usage patterns and their application to the cloud.
  • Using the Hadoop database (HBase) within a workflow.
  • Using Hive/Pig projects.
  • Using the HCatalog project.
  • HBase Java API.
Hands-on work
Managing modifications to a supplier data catalog.

4
Data storage on HDFS

  • Usage patterns and their application to the cloud.
  • Architecture and installation of an HDFS system, journal, NameNode, DataNode.
  • Operations, commands and command management.
  • Java HDFS API.
  • Data analysis with Apache Pig.
  • The Pig Latin language. Using Apache Pig with Java.
  • Querying with Apache Hive.
  • Data replication. Data sharing on HDFS architecture.
Hands-on work
Administering a shared client repository on Hadoop. Using the visualization console.

5
Spring Data Hadoop

  • Introduction to Spring and Spring Data.
  • The Hadoop namespace for Spring.
  • Using Spring to simplify Hadoop configuration.
  • Distributed cache configuration.
  • Job definition and dependencies between jobs.
  • Integration of tools (Pig, Hive, etc.).
Hands-on work
Redesign of supplier data catalog management using Spring Data.


Dates and locations

REMOTE CLASS
2026: 26 May, 6 Oct.

PARIS LA DÉFENSE
2026: 26 May, 6 Oct.