Episode Description
We take a deep dive into Amazon S3 Files, AWS's exciting new managed file system backed by S3!
We kick things off by exploring why S3 isn't a traditional file system, covering the lack of true directories and atomic renames, immutable objects, and POSIX access control differences. We then walk through the existing solutions people have used to bridge that gap, like S3FS FUSE, Mountpoint for Amazon S3, FSx for Lustre, and Storage Gateway.
From there, we get into the heart of the episode: how S3 Files works, how to set it up, and how it uses EFS under the hood as a caching layer. We share our own real-world benchmarking results comparing S3 Files against various EFS configurations across Lambda and Fargate, and we discuss a real customer project where we put S3 Files to the test.
We also cover the important caveats, such as eventual consistency, the 60-second write-back delay, and the lack of cross-account bucket support, along with the cost model, so you can make an informed decision.
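To make the "no true directories, no atomic renames" point a little more concrete, here is a minimal sketch in Python. It is a toy in-memory model of a flat key namespace, not the S3 API and not how S3 Files is implemented. The names `FlatObjectStore`, `list_prefix`, and `rename_prefix` are hypothetical and exist only for illustration.

```python
# Toy in-memory model of a flat object store. This is NOT the S3 API.
# All names (FlatObjectStore, list_prefix, rename_prefix) are hypothetical.

class FlatObjectStore:
    def __init__(self):
        # Full key -> bytes. There is no directory structure, only keys.
        self.objects = {}

    def put(self, key, data):
        # Objects are replaced whole; there is no in-place partial update.
        self.objects[key] = data

    def list_prefix(self, prefix):
        # A "directory listing" is a scan over keys that share a prefix.
        return sorted(k for k in self.objects if k.startswith(prefix))

    def rename_prefix(self, old, new):
        # A "directory rename" is one copy plus one delete per object, so it
        # is not a single atomic operation: a reader could observe a mix of
        # old and new keys partway through the loop.
        steps = 0
        for key in self.list_prefix(old):
            self.objects[new + key[len(old):]] = self.objects[key]  # copy
            del self.objects[key]  # delete
            steps += 2
        return steps


store = FlatObjectStore()
store.put("logs/2024/a.txt", b"a")
store.put("logs/2024/b.txt", b"b")
store.put("logs/2025/c.txt", b"c")

steps = store.rename_prefix("logs/2024/", "archive/2024/")
print(steps, store.list_prefix("archive/"))
```

In this model, renaming a "directory" with two objects takes four separate operations, and the cost grows with the number of objects under the prefix. A POSIX file system would handle the same rename as one metadata change.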
Resources mentioned
- Episode 124: S3 Performance
- Episode 95: Mounting S3 as a Filesystem
- Amazon S3 FAQs: S3 Files
- fourTheorem S3 Files demo code on GitHub
- Amazon documentation: Understanding how synchronization works
Sponsor
Thanks to fourTheorem for powering AWS Bites. We help teams build cloud systems that are simple, scalable, and cost-effective. Visit fourtheorem.com.
Chapters
- 00:00 Introduction: Why S3 is amazing but not a file system, and what S3 Files promises to solve
- 01:47 Why S3 is not a file system: no true directories, immutable objects, no atomic renames, expensive listings, and POSIX differences
- 05:23 Existing solutions for mounting S3 as a file system: S3FS FUSE, Python fsspec, Hadoop S3A, Mountpoint for Amazon S3, FSx for Lustre, File Cache, and Storage Gateway
- 07:16 How S3 Files works: NFS-based access, EFS caching layer, streaming from S3, and supported compute services like EC2, ECS, EKS, and Lambda
- 09:49 Setting up S3 Files: buckets, file system resources, import and expiration rules, mount targets, access points, VPC requirements, and NFS port configuration
- 13:42 S3 Files performance numbers from AWS documentation: throughput, IOPS, latency figures, and why real-world benchmarking is recommended
- 15:39 Benchmarking S3 Files vs EFS configurations on Lambda and Fargate: small and large file reads and writes, memory/CPU impact, and key findings
- 19:48 Downsides and limitations: NFS only, no hard links, no atomic renames, eventual consistency, the 60-second write-back delay, and large-scale rename performance warnings
- 23:05 Real-world project experience: a SaaS multi-tenant architecture, cross-account bucket limitation discovered, and how the team worked around it
- 27:52 Cost breakdown: EFS-equivalent cache pricing, S3 storage costs, reads from cache vs. S3 directly, and how S3 access tiers still apply
- 29:50 Final recap and take: when S3 Files shines, when to be cautious, mixed access pattern warnings, and an invitation to share your own experiences
- 33:42 Closing
Send us your AWS questions
Do you have any AWS questions you would like us to address? Leave a comment here or connect with us on X/Twitter, Bluesky, or LinkedIn.