
Data Ingestion using Kinesis Firehose and Kinesis Producer Library (KPL)


Last updated 5 years ago


Overview

Amazon Kinesis makes it easy to collect, process, and analyze real-time, streaming data so you can get timely insights and react quickly to new information. Amazon Kinesis offers key capabilities to cost-effectively process streaming data at any scale, along with the flexibility to choose the tools that best suit the requirements of your application.

Kinesis Data Firehose is a fully managed service that delivers data to target locations including Amazon S3, Redshift, and the Amazon Elasticsearch Service. The Kinesis Producer Library (KPL) simplifies producer application development, allowing developers to achieve high write throughput to a Kinesis Data Stream. In this example, the KPL is used to write data to a Kinesis Data Stream from the producer application. Kinesis Firehose then reads this stream, batches incoming records into files, and delivers them to S3 based on the file buffer size/time limit defined in the Firehose configuration.
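The producer side can be sketched in Python with boto3 (the KPL itself is a Java library; boto3's `put_records` is used here as a simpler stand-in, and the stream name and event shape are hypothetical):

```python
import json

STREAM_NAME = "example-ingest-stream"  # hypothetical stream name


def build_records(events, partition_key_field="user_id"):
    """Shape a batch of events into the Records structure expected by
    kinesis.put_records(). Records sharing a partition key land on the
    same shard, which preserves their relative ordering."""
    return [
        {
            "Data": json.dumps(event).encode("utf-8"),
            "PartitionKey": str(event[partition_key_field]),
        }
        for event in events
    ]


def send_batch(events):
    """Write a batch of events to the Kinesis Data Stream."""
    import boto3  # AWS SDK for Python; imported here so build_records stays testable offline

    kinesis = boto3.client("kinesis")
    response = kinesis.put_records(
        StreamName=STREAM_NAME, Records=build_records(events)
    )
    # put_records is not all-or-nothing: inspect FailedRecordCount and
    # retry the individual records that were throttled or rejected.
    return response["FailedRecordCount"]
```

Compared with this sketch, the KPL adds record aggregation, automatic batching, and retries, which is how it achieves higher per-shard write throughput.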

Architecture Component Walkthrough

  1. Your application uses the Kinesis Producer Library (KPL) to collect records and write them to a Kinesis Data Stream.
  2. If your application resides in a VPC, you can use an Internet Gateway or a Kinesis VPC Endpoint to access the Kinesis service.
  3. Kinesis Firehose is configured to read data from the Kinesis Data Stream.
  4. Firehose batches records based on count or target file size, and then compresses and encrypts files before delivering them to S3.

Have suggestions? Join our Slack channel to share feedback.
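The Firehose side of the walkthrough can be sketched as the parameters one would pass to boto3's `create_delivery_stream` (a sketch only; the delivery stream name and the stream, bucket, and role ARNs are hypothetical placeholders you would replace with your own):

```python
def firehose_delivery_params(stream_arn, bucket_arn, role_arn,
                             buffer_mb=64, buffer_seconds=300):
    """Build parameters for firehose.create_delivery_stream(): read from
    an existing Kinesis Data Stream and deliver batched, compressed
    files to S3. Firehose flushes a file when EITHER the buffer size or
    the buffer interval is reached, whichever comes first."""
    return {
        "DeliveryStreamName": "example-s3-delivery",  # hypothetical name
        "DeliveryStreamType": "KinesisStreamAsSource",
        "KinesisStreamSourceConfiguration": {
            "KinesisStreamARN": stream_arn,
            "RoleARN": role_arn,  # role allowed to read the stream
        },
        "ExtendedS3DestinationConfiguration": {
            "BucketARN": bucket_arn,
            "RoleARN": role_arn,  # role allowed to write to the bucket
            "BufferingHints": {
                "SizeInMBs": buffer_mb,               # target file size
                "IntervalInSeconds": buffer_seconds,  # max wait before flush
            },
            "CompressionFormat": "GZIP",  # compress files before delivery
        },
    }
```

Tuning the buffering hints is the main trade-off in this architecture: a larger size or interval produces fewer, larger S3 objects (cheaper to query later), while smaller values reduce end-to-end latency into the data lake.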