StreamThoughts
coming soon

Kafka Connect

Integrate your data as a real-time event stream

Price: €650 excl. VAT, per attendee

For more information about this training course, please feel free to contact:
training@streamthoughts.io

Description

During this one-day workshop, you will learn in detail how the Apache Kafka Connect framework works. We will present and discuss its use cases, architectural concepts, and the Java APIs for developing your own data integration connectors.

Course Objectives

This course enables participants to acquire the following skills:

  • Understand how to use, configure and deploy Kafka Connect.
  • Integrate data as a real-time event stream.
  • Create custom data integration connectors.
  • Understand best practices for developing and deploying Kafka connectors.

Pedagogy

50% theory, 50% practice

Who Should Attend?

This workshop is designed for Developers, Architects and Data Engineers who need to build real-time data integration pipelines with Apache Kafka.

Course Duration

1 Day

Course Prerequisites

Participants must be familiar with Java development. Participants should also be familiar with the basic concepts of Apache Kafka.

Course Content

Module 1: Introduction

  • Motivations
  • What is Kafka Connect?
  • What is it used for?
  • The ecosystem and Confluent Hub
  • Advantages & Disadvantages

Module 2: Concepts and Architectures

  • Types of connector: Source & Sink
  • Kafka Connect Cluster: Workers & Tasks
  • Message formats: Converters
  • Data transformation: Single Message Transforms (SMTs)
  • Plugins
  • Delivery guarantees

Module 3: Managing and deploying connectors

  • Deployment Models: Standalone & Distributed
  • The REST API
  • Configuring connectors
  • Installing new plugins
  • Strategies for deploying Kafka Connect: Dedicated vs Shared

Module 4: Data Integration

  • Integrating data from the filesystem with SpoolDir and FilePulse
  • Capturing Database Changes: Data Sourcing vs Change Data Capture
  • Introduction to Kafka Connect JDBC
  • Introduction to Debezium

Module 5: Developing connectors

  • The main Java interfaces
  • The model and data schemas
  • Managing Source and Sink Offsets
  • Developing Transformations (SMTs)
  • Developing REST extensions
  • Best practices for development

Module 6: Handling Errors

  • Dead Letter Queues

Module 7: Security

  • Authentication
  • ACLs
  • Externalizing Configurations

Module 8: Monitoring and Tools

Instructor

Florian has worked in consulting for more than 8 years and is co-founder and CEO of StreamThoughts. Over the course of his career, he has worked on a variety of projects involving the implementation of data integration and processing platforms across the Hadoop and Spark ecosystems. Passionate about distributed systems, he specializes in event-streaming technologies such as Apache Kafka and Apache Pulsar. Today, he helps companies transition to event-streaming architectures. Florian is a Confluent Certified Administrator & Developer for Apache Kafka. He was named a “Confluent Community Catalyst” for two consecutive years (2019 and 2020) for his contributions to the Apache Kafka Streams project and his involvement in the open-source community. He is one of the organizers of the Paris Apache Kafka Meetup.