

Section: Scientific Foundations

Towards scalable, BLOB-based distributed file systems

Recent research [50] points to an ongoing shift in storage architectures from a block-based interface to an object-based interface. The goal is to enable scalable, self-managed storage networks by moving low-level functionalities, such as space management, to the storage devices or storage servers, which are then accessed through a standard object interface. This shift has a direct impact on the design of today's distributed file systems: an object-based file system stores data as objects rather than as unstructured data blocks. According to [50], this shift may eliminate nearly 90% of the management workload, which has been the major obstacle limiting the scalability and performance of file systems.
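To make the difference between the two interfaces concrete, the following is a minimal in-memory sketch, not taken from [50]. The class names `ToyBlockDevice` and `ToyObjectDevice` and their methods are hypothetical. With the block interface, the caller must decide which block numbers to use; with the object interface, the free list and the object-to-block index live inside the device, so the caller only names objects.

```python
class ToyBlockDevice:
    """Block interface (hypothetical): reads and writes numbered fixed-size blocks."""

    def __init__(self, num_blocks, block_size):
        self.block_size = block_size
        self.blocks = [bytes(block_size)] * num_blocks

    def write_block(self, number, data):
        # Pad short writes with zero bytes so every block has the same size.
        self.blocks[number] = data.ljust(self.block_size, b"\0")

    def read_block(self, number):
        return self.blocks[number]


class ToyObjectDevice:
    """Object interface (hypothetical): the device manages its own space."""

    def __init__(self, num_blocks, block_size):
        self._dev = ToyBlockDevice(num_blocks, block_size)
        self._free = list(range(num_blocks))  # free-space management on the device
        self._index = {}  # object ID -> (block numbers, length in bytes)

    def put(self, object_id, data):
        size = self._dev.block_size
        count = -(-len(data) // size)  # ceiling division
        old_numbers, _ = self._index.get(object_id, ([], 0))
        if count > len(self._free) + len(old_numbers):
            raise OSError("device full")
        self.delete(object_id)  # overwrite: release the old blocks first
        numbers = [self._free.pop() for _ in range(count)]
        for i, number in enumerate(numbers):
            self._dev.write_block(number, data[i * size:(i + 1) * size])
        self._index[object_id] = (numbers, len(data))

    def get(self, object_id):
        numbers, length = self._index[object_id]
        raw = b"".join(self._dev.read_block(n) for n in numbers)
        return raw[:length]  # drop the padding of the last block

    def delete(self, object_id):
        numbers, _ = self._index.pop(object_id, ([], 0))
        self._free.extend(numbers)
```

A client of `ToyObjectDevice` calls `put("a", data)` and `get("a")` without ever seeing block numbers, which is the kind of bookkeeping the object-based approach moves away from the file system.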

Two approaches exploit this idea. In the first approach, the data objects are stored and manipulated directly by a new type of storage device called an object-based storage device (OSD). This approach requires an evolution of the hardware, so that high-level object operations can be delegated to the storage device. Examples of parallel/distributed file systems following this approach are Lustre [66] and Ceph [69]. More recently, research efforts [48] have explored the feasibility and possible benefits of integrating OSDs into parallel file systems such as PVFS [45].

The second approach does not rely on the presence of OSDs, but still tries to benefit from an object-based design to improve performance and scalability: files are structured as a set of objects that are stored on storage servers. Google File System [51] and HDFS (Hadoop File System) [33] illustrate this approach.
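The idea of a file structured as a set of objects spread over storage servers can be sketched as follows. This is an illustrative toy under stated assumptions, not the actual placement logic of Google File System or HDFS: the names `StorageServer`, `write_file` and `read_file` are hypothetical, objects have a fixed size, and placement is plain round-robin with no replication.

```python
from dataclasses import dataclass, field


@dataclass
class StorageServer:
    """Hypothetical storage server keeping objects in memory, keyed by object ID."""
    name: str
    objects: dict = field(default_factory=dict)

    def put(self, object_id, data):
        self.objects[object_id] = data

    def get(self, object_id):
        return self.objects[object_id]


def write_file(path, data, servers, object_size):
    """Split data into fixed-size objects and place them round-robin on servers.

    Returns the file layout: an ordered list of (server index, object ID).
    """
    layout = []
    for index, offset in enumerate(range(0, len(data), object_size)):
        object_id = f"{path}#{index}"  # assumed naming scheme, for illustration only
        server_index = index % len(servers)
        servers[server_index].put(object_id, data[offset:offset + object_size])
        layout.append((server_index, object_id))
    return layout


def read_file(layout, servers):
    """Reassemble a file by fetching its objects in layout order."""
    return b"".join(servers[i].get(object_id) for i, object_id in layout)
```

In this sketch the layout plays the role of file metadata: once a client holds it, each object can be fetched from its server independently, which is what allows reads and writes to proceed in parallel across servers.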