AWS Bites

S1 E154

154. S3 Files

May 22
34 mins

Episode Description

We take a deep dive into Amazon S3 Files, AWS's exciting new managed file system backed by S3!

We kick things off by exploring why S3 isn't a traditional file system, covering everything from the lack of true directories and atomic renames to immutable objects and differences in POSIX access control. We then walk through the existing solutions people have used to bridge that gap, like S3FS FUSE, Mountpoint for Amazon S3, FSx for Lustre, and Storage Gateway.

From there, we get into the heart of the episode: how S3 Files works, how to set it up, and how it uses EFS under the hood as a caching layer. We share our own real-world benchmarking results comparing S3 Files against various EFS configurations across Lambda and Fargate, and we discuss a real customer project where we put S3 Files to the test.

We also cover important caveats, such as eventual consistency, the 60-second write-back delay, and the lack of cross-account bucket support, as well as the cost model, so you can make an informed decision.

Sponsor

Thanks to fourTheorem for powering AWS Bites. We help teams build cloud systems that are simple, scalable, and cost-effective. Visit fourtheorem.com.

Chapters

  • 00:00 Introduction: Why S3 is amazing but not a file system, and what S3 Files promises to solve
  • 01:47 Why S3 is not a file system: no true directories, immutable objects, no atomic renames, expensive listings, and POSIX differences
  • 05:23 Existing solutions for mounting S3 as a file system: S3FS FUSE, Python fsspec, Hadoop S3A, Mountpoint, FSx for Lustre, File Cache, and Storage Gateway
  • 07:16 How S3 Files works: NFS-based access, EFS caching layer, streaming from S3, and supported compute services like EC2, ECS, EKS, and Lambda
  • 09:49 Setting up S3 Files: buckets, file system resources, import and expiration rules, mount targets, access points, VPC requirements, and NFS port configuration
  • 13:42 S3 Files performance numbers from AWS documentation: throughput, IOPS, latency figures, and why real-world benchmarking is recommended
  • 15:39 Benchmarking S3 Files vs EFS configurations on Lambda and Fargate: small and large file reads and writes, memory/CPU impact, and key findings
  • 19:48 Downsides and limitations: NFS only, no hard links, no atomic renames, eventual consistency, the 60-second write-back delay, and large-scale rename performance warnings
  • 23:05 Real-world project experience: a SaaS multi-tenant architecture, cross-account bucket limitation discovered, and how the team worked around it
  • 27:52 Cost breakdown: EFS-equivalent cache pricing, S3 storage costs, reads from cache vs. S3 directly, and how S3 access tiers still apply
  • 29:50 Final recap and take: when S3 Files shines, when to be cautious, mixed access pattern warnings, and an invitation to share your own experiences
  • 33:42 Closing

Send us your AWS questions

Do you have any AWS questions you would like us to address? Leave a comment here or connect with us on X/Twitter, Bluesky, or LinkedIn.
