StreamThoughts

Kafka Connect

Integrate your data as a real-time event stream

Price: €650 excl. tax, per attendee

For more information about this training course, please feel free to contact:
training@streamthoughts.io

Description

During this one-day workshop you will have the opportunity to learn in detail how the Apache Kafka Connect framework works. We will present and discuss its uses, its architecture concepts, and the Java APIs for developing your own data integration connectors.

Course Objectives

This course enables participants to acquire the following skills:

  • Understand how to use, configure and deploy Kafka Connect.
  • Integrate data as a real-time event stream.
  • Create custom data integration connectors.
  • Understand best practices for developing and deploying Kafka connectors.

    Pedagogy

    50% theory, 50% practice

    Who Should Attend?

    This workshop is designed for Developers, Architects and Data Engineers who need to build real-time data integration pipelines with Apache Kafka.

    Course Duration

    1 Day

    Course Prerequisites

    Participants must be familiar with Java development. Participants should also be familiar with the basic concepts of Apache Kafka.

    Course Content

    Module 1: Introduction

    • Motivations
    • What is Kafka Connect?
    • What is it used for?
    • The ecosystem, Confluent Connect Hub
    • Advantages & Disadvantages

    Module 2: Concepts and Architectures

    • Types of connector: Source & Sink
    • Kafka Connect Cluster: Workers & Tasks
    • Message formats: Converters
    • Data transformation: Single Message Transforms (SMTs)
    • Plugins
    • Delivery guarantees
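    To give a taste of how converters and SMTs fit together, here is a minimal configuration sketch. The converter and transform classes are the stock ones shipped with Apache Kafka; the topic, field and value names are purely illustrative:

```properties
# Worker-level: how record keys and values are (de)serialized to/from Kafka
key.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
value.converter.schemas.enable=false

# Connector-level: a Single Message Transform applied to every record
transforms=AddSource
transforms.AddSource.type=org.apache.kafka.connect.transforms.InsertField$Value
transforms.AddSource.static.field=source_system
transforms.AddSource.static.value=crm
```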

    Module 3: Managing and deploying connectors

    • Deployment Models: Standalone & Distributed
    • The REST API
    • Configuring connectors
    • Installing new plugins
    • Strategies for deploying Kafka Connect: dedicated vs. shared clusters
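    In distributed mode, connectors are managed through the REST API by submitting a JSON configuration. As a hedged sketch using the stock FileStreamSource connector (the file path and topic name are illustrative), a request body POSTed to `http://localhost:8083/connectors` could look like this:

```json
{
  "name": "local-file-source",
  "config": {
    "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
    "tasks.max": "1",
    "file": "/tmp/input.txt",
    "topic": "file-events"
  }
}
```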

    Module 4: Data Integration

    • Integrating data from the filesystem with SpoolDir and FilePulse
    • Capturing Database Changes: Data Sourcing vs Change Data Capture
    • Introduction to Kafka Connect JDBC
    • Introduction to Debezium

    Module 5: Developing connectors

    • The main Java interfaces
    • The model and data schemas
    • Managing Source and Sink Offsets
    • Developing Transformers
    • Developing REST extensions
    • Best practices for development

    Module 6: Handling Errors

    • Dead Letter Queues
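    As an illustration of the dead letter queue pattern, a sink connector can be told to tolerate record-level failures and route the failing records to a DLQ topic. The property names below are the standard Kafka Connect `errors.*` configs; the topic name is illustrative:

```properties
# Keep the connector running on record-level failures
errors.tolerance=all
# Route failed records to a dead letter queue topic
errors.deadletterqueue.topic.name=dlq-orders
errors.deadletterqueue.topic.replication.factor=3
# Add headers describing the failure context to each DLQ record
errors.deadletterqueue.context.headers.enable=true
errors.log.enable=true
```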

    Module 7: Security

    • Authentication
    • ACLs
    • Externalizing Configurations
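    Externalizing configurations lets you keep secrets out of connector configs submitted over the REST API. As a sketch using the `FileConfigProvider` shipped with Apache Kafka (the file path and key are illustrative):

```properties
# Worker config: register a config provider named "file"
config.providers=file
config.providers.file.class=org.apache.kafka.common.config.provider.FileConfigProvider
```

    A connector config can then reference a secret indirectly, e.g. `connection.password=${file:/etc/kafka/secrets.properties:db.password}`, and the worker resolves it at runtime without exposing the value.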

    Module 8: Monitoring and Tools
    Instructor

    Florian has worked in consulting for more than 8 years and is co-founder and CEO of StreamThoughts. Over the course of his career, he has worked on various projects involving the implementation of data integration and data processing platforms across the Hadoop and Spark ecosystems. Passionate about distributed systems, he specializes in event-streaming technologies such as Apache Kafka and Apache Pulsar. Today, he helps companies transition to event-streaming-oriented architectures. Florian is a Confluent Certified Administrator & Developer for Apache Kafka. He was named "Confluent Community Catalyst" two years in a row (2019 and 2020) for his contributions to the Apache Kafka Streams project and his involvement in the open-source community. He is one of the organizers of the Paris Apache Kafka Meetup.