StreamThoughts

Kafka Connect File Pulse

Connect File Pulse is a multi-purpose, open-source Kafka Connect source plugin for ingesting and transforming data from files before sending it to an Apache Kafka cluster.


Ingest, transform and integrate

Continuously ingest data from files in any format. Transform and filter it into structured records with a collection of extensible filters. Connect File Pulse makes this easy: it is a distributed, fault-tolerant and scalable solution built on the Kafka Connect framework.

Analyze and structure your data before sharing it in real time.

Readers
Data is frequently exported, shared and integrated from legacy systems, which often scale poorly, through files in a wide variety of formats. Connect File Pulse lets you integrate any type of data into a centralized Apache Kafka platform and distribute it across your organization.

Filters
Define complex pipelines to transform and structure your data before integration into Kafka:

  • Parse data in JSON format.
  • Structure and aggregate log data with grok patterns.
  • Filter and anonymize personal data.
  • Enrich data with descriptive metadata.

Standardize your data by combining the plug-and-play filters that ship with the connector.
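To make the idea of combining filters concrete, here is a minimal conceptual sketch in Python. It is not File Pulse's actual filter API: the function names (`parse_json`, `drop_if`, `anonymize`, `enrich`, `run_pipeline`) and the record layout are hypothetical and only illustrate how small, single-purpose filters can be chained over each record read from a file.

```python
import hashlib
import json
from typing import Callable, Iterable, Iterator, List, Optional

Record = dict
# A filter takes a record and returns a new record, or None to drop it.
Filter = Callable[[Record], Optional[Record]]


def parse_json(record: Record) -> Record:
    """Replace the raw 'message' string with the fields of the JSON object it holds."""
    parsed = json.loads(record["message"])
    rest = {k: v for k, v in record.items() if k != "message"}
    return {**rest, **parsed}


def drop_if(predicate: Callable[[Record], bool]) -> Filter:
    """Build a filter that drops every record matching the predicate."""
    def _apply(record: Record) -> Optional[Record]:
        return None if predicate(record) else record
    return _apply


def anonymize(field: str) -> Filter:
    """Build a filter that replaces a field's value with its SHA-256 hex digest."""
    def _apply(record: Record) -> Record:
        if field not in record:
            return record
        digest = hashlib.sha256(str(record[field]).encode("utf-8")).hexdigest()
        return {**record, field: digest}
    return _apply


def enrich(**metadata) -> Filter:
    """Build a filter that adds fixed metadata fields to every record."""
    def _apply(record: Record) -> Record:
        return {**record, **metadata}
    return _apply


def run_pipeline(records: Iterable[Record], filters: List[Filter]) -> Iterator[Record]:
    """Apply the filters in order; a record is emitted only if no filter dropped it."""
    for record in records:
        for apply_filter in filters:
            record = apply_filter(record)
            if record is None:
                break
        else:
            yield record


# Two raw lines as a reader might produce them (sample data, made up for this sketch).
lines = [
    {"message": '{"user": "alice", "level": "INFO"}'},
    {"message": '{"user": "bob", "level": "DEBUG"}'},
]
pipeline = [
    parse_json,
    drop_if(lambda r: r["level"] == "DEBUG"),
    anonymize("user"),
    enrich(source_file="app.log"),
]
result = list(run_pipeline(lines, pipeline))
```

Here the DEBUG line is dropped, the remaining record has its `user` field hashed, and a `source_file` field is attached. In the connector itself, the equivalent chain is declared in the connector configuration rather than written as code.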

Stop losing your data and integrate it continuously.

Connect File Pulse is a Kafka Connect plugin, which makes it a distributed, elastic and fault-tolerant solution. Need to integrate a large number of files quickly? No worries: increase the number of tasks allocated to data ingestion with a single call to the Kafka Connect REST API.
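As an illustration of that call, the sketch below builds a request for the standard Kafka Connect endpoint `PUT /connectors/{name}/config`, raising the `tasks.max` setting. The worker URL `http://localhost:8083`, the connector name `filepulse-source` and the helper `build_scale_request` are assumptions made for this example. Because this endpoint replaces the whole configuration, the current settings (obtainable with `GET /connectors/{name}/config`) are merged with the new value rather than sent alone.

```python
import json
import urllib.request


def build_scale_request(connect_url, connector_name, current_config, tasks_max):
    """Build (but do not send) a PUT request that sets tasks.max for a connector."""
    # Keep every existing setting and override only the task count.
    new_config = {**current_config, "tasks.max": str(tasks_max)}
    url = f"{connect_url.rstrip('/')}/connectors/{connector_name}/config"
    return urllib.request.Request(
        url,
        data=json.dumps(new_config).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Placeholder configuration: a real connector has more settings, omitted here.
current_config = {"tasks.max": "1"}
request = build_scale_request(
    "http://localhost:8083", "filepulse-source", current_config, tasks_max=4
)

# Against a running Connect worker, the request would be sent with:
# urllib.request.urlopen(request)
```

The number of tasks that actually run also depends on how the connector splits its work, so `tasks.max` is an upper bound rather than a guarantee.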

The progress of each file is persisted in Kafka. If the connector fails, or is stopped and restarted during maintenance, it resumes ingesting your files where it left off.
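The resume behavior can be pictured with a small sketch. This is not the connector's implementation: a plain dictionary stands in for the progress stored in Kafka, and the `ingest` function and its `max_lines` parameter are hypothetical, used here only to simulate an interruption.

```python
import os
import tempfile


def ingest(path, offsets, emit, max_lines=None):
    """Read lines from the last recorded byte offset, recording progress after each one."""
    position = offsets.get(path, 0)
    count = 0
    with open(path, "rb") as f:
        f.seek(position)
        while max_lines is None or count < max_lines:
            line = f.readline()
            if not line:
                break
            emit(line.decode("utf-8").rstrip("\n"))
            # Persist the position only after the line has been handed off.
            offsets[path] = f.tell()
            count += 1
    return count


# Sample file created just for this sketch.
with tempfile.NamedTemporaryFile("wb", suffix=".log", delete=False) as tmp:
    tmp.write(b"first\nsecond\nthird\n")

offsets, seen = {}, []
ingest(tmp.name, offsets, seen.append, max_lines=2)  # simulated stop after two lines
ingest(tmp.name, offsets, seen.append)               # restart: continues from the stored offset
final_offset = offsets[tmp.name]
os.remove(tmp.name)
```

After the second call, `seen` holds each of the three lines exactly once, because the restart begins at the stored offset instead of at the start of the file.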

Enrich and adapt the connector to your needs.

From the very beginning, we aimed to offer a solution that could be adapted to the characteristics of each project. Connect File Pulse is built on an extensible architecture that lets you configure everything. Can't find the right filter or reader for your data or data structure? You can develop your own filters and readers using a relatively simple API.