HDFS (the Hadoop Distributed File System) is a Java-based distributed file system that provides scalable, reliable data storage and is designed to span large clusters of commodity servers. HDFS has demonstrated production scalability of up to 200 PB of storage and a single cluster of 4,500 servers, supporting close to a billion files and blocks.
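To make the "files and blocks" figure more concrete, here is a minimal sketch of how a file's size translates into HDFS blocks and raw cluster storage. It assumes the stock Hadoop 2+ defaults of a 128 MiB block size (`dfs.blocksize`) and a replication factor of 3 (`dfs.replication`); a given cluster may be configured differently, and the helper names `block_count` and `raw_bytes` are illustrative, not part of any Hadoop API.

```python
# Assumed defaults (Hadoop 2+); check hdfs-site.xml for the real values.
BLOCK_SIZE = 128 * 1024**2   # dfs.blocksize: 128 MiB
REPLICATION = 3              # dfs.replication: 3 copies of each block


def block_count(file_size: int, block_size: int = BLOCK_SIZE) -> int:
    """Return how many blocks a file of `file_size` bytes is split into."""
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    # Integer ceiling division; an empty file occupies zero blocks.
    return -(-file_size // block_size)


def raw_bytes(file_size: int, replication: int = REPLICATION) -> int:
    """Return the raw bytes stored across the cluster for one file.

    The final block is not padded to the full block size, so raw usage
    is simply the file size multiplied by the replication factor.
    """
    if file_size < 0 or replication < 1:
        raise ValueError("file_size must be >= 0 and replication must be >= 1")
    return file_size * replication


if __name__ == "__main__":
    one_gib = 1024**3
    print(block_count(one_gib), "blocks,", raw_bytes(one_gib), "raw bytes")
```

Under these assumptions, many small files produce one block each, which is why file and block counts matter for cluster sizing alongside total petabytes.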
Load data to Blob Storage for data lake or file-sharing use cases. Extract files from Blob Storage and use them in pipelines.
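As a rough illustration of the load and extract steps, the sketch below assumes that "Blob Storage" means Azure Blob Storage and that a `ContainerClient` from the `azure-storage-blob` Python SDK is available. The helper names `load_file` and `extract_file`, the container name, and the file paths are hypothetical; an actual connector may work quite differently.

```python
from pathlib import Path


def load_file(container_client, local_path: str, blob_name: str) -> None:
    """Upload a local file to the container, replacing any existing blob."""
    with open(local_path, "rb") as fh:
        # ContainerClient.upload_blob(name, data, overwrite=...) in azure-storage-blob
        container_client.upload_blob(name=blob_name, data=fh, overwrite=True)


def extract_file(container_client, blob_name: str, local_path: str) -> Path:
    """Download a blob to a local file so a pipeline step can read it."""
    # download_blob returns a stream downloader; readall() loads the blob
    # fully into memory, so this suits small to medium files only.
    data = container_client.download_blob(blob_name).readall()
    target = Path(local_path)
    target.write_bytes(data)
    return target


# Example wiring (assumption: connection string and container are placeholders):
#
#   from azure.storage.blob import BlobServiceClient
#   service = BlobServiceClient.from_connection_string("<connection-string>")
#   container = service.get_container_client("my-container")
#   load_file(container, "report.csv", "incoming/report.csv")
#   extract_file(container, "incoming/report.csv", "downloaded.csv")
```

The helpers take the client as a parameter so the same code can be exercised against a stub in tests without network access or credentials.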