Course: Apache Hop, orchestrating data flows

Manage your ETL processes visually

Practical course - 3 days - 21 hours - Ref. HOA
Price: €2,010 excl. tax

Data is vital to business. Apache Hop is a powerful, customizable open source platform for managing ETL (Extract, Transform, Load) processes. It lets you automate the collection, transformation and organization of data from heterogeneous sources, then load it into a specified target. Save time by mastering pipelines and workflows through an accessible visual interface.


INTER
IN-HOUSE
CUSTOM

Practical course, in person or as a remote class
Available in English on request


Teaching objectives
At the end of the training, the participant will be able to:
Understand and explain the Hop environment and how it works
Extract and transform data
Distinguish and order different data sources
Combine, classify and compare different types of data
Automate tasks and analyze errors in order to respond to them

Intended audience
Anyone who loads and manipulates data flows feeding a BI database.

Prerequisites
Good knowledge of SQL.

Practical details
Exercise
Application of theory to concrete cases, group discussions, hands-on practice.
Teaching methods
Active teaching.

Course schedule

1
Introducing Apache Hop

  • Why Hop?
  • History and overview
  • Tool installation and configuration
Hands-on work
Install and configure Apache Hop.

2
Generate an initial data extraction

  • Workflows and pipelines (scheduling)
  • From data extraction to data loading
  • Understanding and managing data flows
  • Pipeline and workflow execution
Hands-on work
Design a pipeline and create a workflow.

3
Access source and target data

  • The concept of metadata
  • Configure access to data sources
  • Supported sources/targets
  • Links between sources (joins)
  • Loading data with Insert/Update
Hands-on work
Configure access to data sources by identifying supported targets/sources and joins.

4
Handling data

  • Sort your flow in ascending or descending order
  • Splitting your flow
  • Filter data according to several criteria (to lighten the flow)
  • Extract information from a field (character string)
  • Replace one data item with another
  • Using operators and calculations on a flow
  • Using the Cartesian product
  • Linking information from heterogeneous data
  • Compare data streams
Hands-on work
Handle flows by sorting, splitting and filtering data. Compare data streams.

5
Enrich your data flow

  • Log generation
  • Create and retrieve variables (dates, numeric, alphanumeric)
  • Using the result of a flow
  • Properties of a flow and its scheduler
Hands-on work
Create and retrieve variables, use flow results and enrich your data flow.

6
Loops

  • The issues
  • Loops with parameters and the "Copy rows to result" component
  • Loops with the "Copy rows to result" and "Get rows from result" components
Hands-on work
Understand and manipulate loop components.

7
Operations

  • Managing errors
  • Generate logs
  • Understand errors and launch alerts (debugging)
  • Parallelization (simultaneous execution of multiple data streams)
  • Import/export developments
  • Task automation/transformation
  • Documentation (implementation of standards in the event of errors or rework)
Hands-on work
Run streams simultaneously, manage errors and automate tasks.


Dates and locations


REMOTE CLASS
2026 : 16 Mar., 8 June, 14 Sep., 30 Nov.

PARIS LA DÉFENSE
2026 : 1 June, 7 Sep., 23 Nov.