AWS DataSync for Windows Shared Drive
4 min read · Jan 20, 2025
AWS DataSync is a managed service that simplifies, automates, and accelerates data transfer between on-premises storage systems (such as Windows shared drives) and AWS. This section gives a step-by-step guide to setting up DataSync to read from a Windows shared drive and feed the data into ETL pipelines.
Why Use AWS DataSync for ETL?
- High-Performance Data Transfer: DataSync uses a purpose-built transfer protocol and parallel transfers for large-scale data movement, and it verifies data integrity as part of the transfer.
- File Detection: Each task run compares the source with the destination and copies only new or updated files to AWS storage.
- Simplified Management: Managed service with built-in scheduling, monitoring, and fault tolerance.
- Integration with AWS Services: Sync data directly to Amazon S3, Amazon EFS, or Amazon FSx for subsequent processing in ETL pipelines.
Step-by-Step Setup for AWS DataSync
1. Prerequisites
- An AWS account with permissions to create and run DataSync resources and to write to the destination storage (for example, an S3 bucket).
- On-premises access to the Windows shared drive.
- An AWS DataSync agent deployed as a virtual machine (for example, on VMware, Hyper-V, or KVM) or as an Amazon EC2 instance, with network access to the shared drive.
- SMB credentials (user name, password, and domain if applicable) with read access to the shared drive.
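Before deploying the agent, it can help to confirm that the network prerequisite actually holds. The snippet below is a minimal sketch, not part of DataSync itself: it checks whether the file server accepts TCP connections on port 445, the standard SMB port. The function name `can_reach_smb` is a hypothetical helper, and the check assumes you run it from a machine on the same network segment as the planned agent.

```python
import socket

SMB_PORT = 445  # standard TCP port for SMB


def can_reach_smb(host: str, port: int = SMB_PORT, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout.

    This only proves the port is open. It does not validate SMB credentials
    or share permissions.
    """
    try:
        # create_connection resolves the host name and opens a TCP socket;
        # the context manager closes the socket again immediately.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

For example, `can_reach_smb("fileserver.example.local")` (a placeholder host name) returns `False` if a firewall blocks port 445 between that machine and the file server. A `True` result still leaves the credentials and share permissions to be verified when the SMB location is created.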