# Data lake Storage Architecture FAQs

## What  features to look for while selecting a cloud datalake storage platform?

Selecting a data storage solution is  always driven by the data retrieval pattern, scalability, performance cost and  durability characteristics. In case of a datalake,  characteristics like data-retrieval pattern, performance are  unclear at the beginning. So, it is recommended to select a solution that is secure, durable, distributed and  decoupled from data processing compute infrastructure. [Amazon S3](https://aws.amazon.com/s3/) provides you all the above characteristics with seem-less integration with other AWS and open source data analytic services. Amazon S3 provides secure APIs for programmatic access, so it is easy to build new integrations where required.

## Why Hadoop HDFS or data warehouse storage are not great choices for datalake?&#x20;

There are 3 primary reasons why solutions like HDFS storage and data warehouse(DW) storage systems are not suitable for datalakes.

1. **Scalability:** Datalakes are supposed to store all data of an organization. As the data volume grows. HDFS or data warehouse storage systems needs to be scaled from time to time.&#x20;
2. **Open data format support and storage-compute couple:** HDFS supports open data format, however it comes coupled with compute services. Similarly, DW systems store data in proprietary format that is not accessible to external computes for analytics.

## Have suggestions? Join our [Slack channel](https://join.slack.com/t/cat-cwp4274/shared_invite/zt-e2ztjpgw-Bugw46iXsLbZ~V54AljWsA) to  share feedback.


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://aws-reference-architectures.gitbook.io/datalake/storage-foundation/datalake-storage-architecture.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
