# Ingestion Architectures for Data Lakes on AWS

### Overview

One of the core values of a data lake is that it serves as the collection point and repository for all of an organization's data assets, in their native formats. This enables quick ingestion, eliminates data duplication and data sprawl, and centralizes governance and management. After data assets are collected, they need to be transformed into normalized formats that a variety of data analytics and processing tools can use. During this phase, customers typically standardize on a scheme for data compression, data encryption, and the layout of information at the prefix level in Amazon S3.
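As an illustration of what a prefix-level layout convention might look like, the sketch below builds object keys with Hive-style `key=value` date partitions. The zone names, the dataset name, and the `build_object_key` helper are all hypothetical; the text above does not prescribe a particular scheme, so treat this as one possible convention rather than a recommended standard.

```python
from datetime import datetime, timezone

# Hypothetical zone names, chosen for illustration only.
ZONES = ("raw", "curated")


def build_object_key(zone: str, dataset: str, event_time: datetime, filename: str) -> str:
    """Return an S3 object key that uses Hive-style date partitions."""
    if zone not in ZONES:
        raise ValueError(f"unknown zone: {zone!r}")
    # Convert to UTC so partitions do not depend on the producer's time zone.
    # (A naive datetime is interpreted as local time by astimezone.)
    t = event_time.astimezone(timezone.utc)
    return (
        f"{zone}/{dataset}/"
        f"year={t:%Y}/month={t:%m}/day={t:%d}/"
        f"{filename}"
    )


key = build_object_key(
    "raw",
    "clickstream",
    datetime(2020, 5, 17, 23, 30, tzinfo=timezone.utc),
    "part-0000.json.gz",
)
print(key)
```

Keeping the partition columns in the key itself means that every writer lands data in a predictable location, and tools that understand Hive-style partitioning can skip prefixes that a query does not need.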

The key to ‘democratizing’ data, and making it available to the widest number of users of varying skill sets and responsibilities, is to transform data assets into a format that allows for efficient ad hoc SQL queries. As discussed earlier, when a data lake is built on AWS, we recommend transforming log-based data assets into columnar formats. AWS provides multiple services to achieve this quickly and efficiently.
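To make the idea of a columnar layout concrete, here is a minimal sketch in plain Python that pivots row-oriented log records into one list per column. The `to_columns` helper and the sample records are hypothetical. Real columnar formats such as Apache Parquet add encoding, compression, and metadata on top of this idea, so the sketch shows only the layout, not a file format.

```python
from typing import Any


def to_columns(rows: list[dict[str, Any]]) -> dict[str, list[Any]]:
    """Pivot row-oriented records into a column-oriented layout."""
    columns: dict[str, list[Any]] = {}
    for i, row in enumerate(rows):
        for name in row:
            # Backfill None for earlier rows that lacked this column.
            columns.setdefault(name, [None] * i)
        for name, values in columns.items():
            # Rows missing a known column contribute None.
            values.append(row.get(name))
    return columns


# Hypothetical web-server log records.
log_rows = [
    {"status": 200, "path": "/", "bytes": 512},
    {"status": 404, "path": "/missing", "bytes": 0},
    {"status": 200, "path": "/about", "bytes": 2048},
]

cols = to_columns(log_rows)
# An aggregate over one field reads a single contiguous list
# instead of visiting every field of every record.
total_bytes = sum(cols["bytes"])
print(total_bytes)
```

An ad hoc query that aggregates one field, such as total bytes served, touches only that column's values. This is the main reason columnar layouts suit analytic SQL workloads over log data.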

In this section, we share some of the common architectural patterns for ingestion that we see in many of our customers' data lakes.

### Reference Architectures for Ingesting Data into a Data Lake

1. [***Ingest event and log data using Amazon Kinesis Data Firehose***](https://aws-reference-architectures.gitbook.io/datalake/data-ingestion/kinesis-firehose-and-kpl)
2. [***Ingest database changes using AWS Database Migration Service***](https://aws-reference-architectures.gitbook.io/datalake/data-ingestion/dms-and-lambda)
3. [***Ingest data from JDBC sources using AWS Glue***](https://aws-reference-architectures.gitbook.io/datalake/data-ingestion/aws-glue)
4. [***Ingest data files using AWS DataSync***](https://aws-reference-architectures.gitbook.io/datalake/data-ingestion/aws-datasync-on-prem-nfs)

## Have suggestions? Join our [Slack channel](https://join.slack.com/t/cat-cwp4274/shared_invite/zt-e2ztjpgw-Bugw46iXsLbZ~V54AljWsA) to share feedback.
