# Data Ingestion From On-Premises NFS Using AWS DataSync

## Overview

[AWS DataSync](https://aws.amazon.com/datasync/) is a fully managed data transfer service that simplifies, automates, and accelerates moving and replicating data between on-premises storage systems and AWS storage services over the internet or AWS Direct Connect. In a data lake environment, AWS DataSync can automatically and securely sync files from on-premises storage servers, such as NFS servers, to an S3-based data lake.

In this architecture, we walk you through how to use AWS DataSync and the DataSync agent to migrate data to a data lake in Amazon S3.

![Data ingestion from on-premises NFS using AWS DataSync](https://2553439727-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LXQF3JgpYb-IDUgkC6e%2F-LXUCd_m6SPY3a3y3Qyr%2F-LXUCf7TaTVVhBi7feRt%2Faws-datasync-from-nfs-on-prem.png?generation=1548859391799516\&alt=media)

## Architecture Component Walkthrough

1. You create a Network File System (NFS) file server inside your data center.
2. You [install an AWS DataSync agent](https://docs.aws.amazon.com/datasync/latest/userguide/create-agent-cli.html) as a virtual machine on a VMware ESXi [hypervisor](https://en.wikipedia.org/wiki/Hypervisor). The agent needs read access to the NFS server.
3. You configure AWS DataSync with the source and destination [locations](https://docs.aws.amazon.com/datasync/latest/userguide/create-locations-cli.html) required to perform the synchronization.
4. You [create](https://docs.aws.amazon.com/datasync/latest/userguide/create-task-cli.html) and then [start](https://docs.aws.amazon.com/datasync/latest/userguide/start-task-execution.html) an AWS DataSync task to synchronize files from NFS to S3.
5. You use an [AWS Glue crawler](https://docs.aws.amazon.com/glue/latest/dg/add-crawler.html) to catalog the S3 location that receives files via AWS DataSync.
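
Steps 3 to 5 above can be sketched with the AWS SDK for Python (boto3). This is a minimal illustration rather than a reference implementation. It assumes the agent from step 2 is already activated, that an IAM role allowing DataSync to write to the bucket exists, and that a Glue crawler pointing at the destination prefix has been created. The function names, the task name `nfs-to-datalake`, and the default `/raw` prefix are hypothetical placeholders.

```python
from typing import Any


def sync_nfs_to_s3(
    datasync: Any,
    agent_arn: str,
    nfs_host: str,
    nfs_path: str,
    bucket_arn: str,
    bucket_role_arn: str,
    s3_prefix: str = "/raw",  # hypothetical destination prefix
) -> str:
    """Create both locations, create a task, start it, and return the execution ARN."""
    # Step 3a: source location, the NFS export read through the on-premises agent.
    source = datasync.create_location_nfs(
        ServerHostname=nfs_host,
        Subdirectory=nfs_path,
        OnPremConfig={"AgentArns": [agent_arn]},
    )
    # Step 3b: destination location, a prefix in the data lake bucket.
    destination = datasync.create_location_s3(
        S3BucketArn=bucket_arn,
        Subdirectory=s3_prefix,
        S3Config={"BucketAccessRoleArn": bucket_role_arn},
    )
    # Step 4: create the task that links source to destination, then start it.
    task = datasync.create_task(
        SourceLocationArn=source["LocationArn"],
        DestinationLocationArn=destination["LocationArn"],
        Name="nfs-to-datalake",  # hypothetical task name
    )
    execution = datasync.start_task_execution(TaskArn=task["TaskArn"])
    return execution["TaskExecutionArn"]


def catalog_destination(glue: Any, crawler_name: str) -> None:
    """Step 5: start an existing Glue crawler over the destination prefix.

    Run this only after the DataSync task execution has finished,
    otherwise the crawler may catalog a partial copy.
    """
    glue.start_crawler(Name=crawler_name)
```

To run this against a real account, pass `boto3.client("datasync")` and `boto3.client("glue")` as the `datasync` and `glue` arguments. Taking the clients as parameters keeps the sketch easy to exercise with stand-in objects before pointing it at production resources.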

## References

* [Getting started with AWS DataSync](https://docs.aws.amazon.com/datasync/latest/userguide/getting-started.html)
* [How AWS DataSync works](https://docs.aws.amazon.com/datasync/latest/userguide/how-datasync-works.html)

## Have suggestions? Join our [Slack channel](https://join.slack.com/t/cat-cwp4274/shared_invite/zt-e2ztjpgw-Bugw46iXsLbZ~V54AljWsA) to share feedback.
