aws-reference-architectures/datalake
  • Overview of a Data Lake on AWS
  • Amazon S3: A Storage Foundation for Datalakes on AWS
    • Data lake Storage Architecture FAQs
  • Data Catalog Architecture
    • Schema Management Within a Data Catalog
  • Data Security and Access Control Architecture
    • Data Security and Access Control Using IAM
    • Fine-grained Access Control With AWS LakeFormation
  • Ingestion Architectures for Data lakes on AWS
    • Data Ingestion using Kinesis Firehose and Kinesis Producer Library (KPL)
    • Data Ingestion using Database Migration Service(DMS) and Lambda
    • Data Ingestion using Amazon Glue
    • Data Ingestion From On-Premise NFS using Amazon DataSync
  • Data Curation Architectures
    • Overwrite Table Partitions Using PySpark
  • Data Consumption Architectures
    • Query S3 Data lake using Athena and Glue Catalog
    • Query Data lake using Redshift Spectrum and Glue Catalog
    • Query Data lake using EMR and External Hive Metastore in VPC
    • Query Data lake using EMR and Glue Catalog
  • Code of Conduct
  • Contributing Guidelines
Powered by GitBook