TL;DR: Lustre Unveiled (2): Design Principles and Key Features

Lustre Unveiled: Evolution, Design, Advancements, and Current Trends provides a comprehensive tour of Lustre, covering its history and evolution, detailed architecture and design elements, comparison with other prominent storage technologies, a case study of Lustre on a real-world supercomputer, and the future development of Lustre.

In this post I share my digest of the article’s section on design principles and key features.


Design Principles and Key Features

Design Principles

Lustre design has been motivated by several design principles.

Scalability is the ability of the filesystem to grow to meet ever-increasing demands in:

  • the number of application clients
  • data storage capacity
  • number of filesystem entries
  • number of storage servers

Performance is the ability of the filesystem to deliver the aggregate capabilities of its constituent resources to a wide variety of application clients:

  • High bandwidth for large streaming I/O
  • Low latency and high throughput for small I/O operations and metadata accesses
  • Large quantities of concurrent I/O and metadata operations

Resilience is the ability of the filesystem to continue operating correctly in the face of transient or persistent fault conditions occurring in its constituent resources:

  • Reliability: how unlikely it is that underlying components in the system will fail and cause observable incorrect behavior
  • Availability: the percentage of time the filesystem is able to service its clients despite fault conditions
  • Serviceability: a measure of how quickly the system can be serviced or repaired to restore it to an operational status
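
Availability and serviceability are linked by a standard textbook relation that the article summarized here does not state: steady-state availability = MTBF / (MTBF + MTTR), where MTBF is the mean time between failures and MTTR is the mean time to repair. The sketch below only illustrates that arithmetic. The function name and the input numbers are made up for the example and are not measurements of any Lustre system.

```python
def steady_state_availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Fraction of time a repairable system is up (textbook formula, not from the article)."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("mtbf_hours must be positive and mttr_hours non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical numbers: one failure every 999 hours, one hour to repair.
availability = steady_state_availability(mtbf_hours=999.0, mttr_hours=1.0)
downtime_hours_per_year = (1.0 - availability) * 24 * 365

# Halving the repair time (better serviceability) raises availability.
faster_repair = steady_state_availability(mtbf_hours=999.0, mttr_hours=0.5)
```

Under these assumed inputs, shortening the repair time improves availability without changing how often components fail, which is why serviceability is listed alongside reliability.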

Usability is an abstract principle that focuses on delivering a “good” experience for people using or managing the system (i.e., users and administrators).

  • Usability for users
  • Manageability for administrators
  • Flexibility to alter behavior based on input/configurations

Compatibility is another general principle that combines aspects of Compliance, Portability, and Interoperability.

  • Compliance with well-established standards or practices
  • Portability to utilize diverse component resources
  • Interoperability to integrate with external systems, services and resources

Data and Metadata Separation

  • Scalability
  • Performance
  • Resilience

Lustre has two distinct types of storage services: the Metadata Server (MDS) and the Object Storage Server (OSS).

The MDS handles metadata operations such as file creation, deletion, and directory lookup, and runs on nodes with more CPU and RAM backed by high-IOPS storage; the OSS manages file I/O operations, typically on high-bandwidth storage.

This separation prevents bottlenecks, as each node and storage type is optimized for specific tasks, leading to improved system performance.

As Lustre’s workload grows, integrating more MDS and OSS nodes is straightforward, allowing the filesystem to scale almost linearly with additional servers.

If an MDS or OSS fails, another server can take over management of its Metadata Targets (MDTs) or Object Storage Targets (OSTs), keeping the system operational.

Object-Based Design

  • Scalability
  • Parallel Performance

Lustre’s object-based design allows each file’s data to be stored on one or more OSTs, and abstracts the underlying storage technology and its management from the client.

Each OSS manages its own objects and low-level storage devices directly and independently. This avoids central points of contention during file I/O operations and enables parallel I/O on the objects within a single file.

Data Layout

  • Parallel Performance
  • Usability
  • Scalability

A file’s layout refers to the pattern by which the file’s data is mapped to one or more OSTs. The layout is composed of one or more components, each of which maps a range of file offsets to a corresponding range of OST object offsets, also called stripes.

A component may be striped across multiple OST objects in a RAID-0 manner to increase bandwidth and/or balance space usage.

This object distribution is key to enabling parallel I/O and increasing overall system bandwidth.
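
The RAID-0 style mapping described above can be sketched as simple offset arithmetic. This is a minimal illustration, not Lustre’s actual code: the class name `StripedLayout`, its fields, and the 1 MiB stripe size and stripe count of 4 are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class StripedLayout:
    """Hypothetical RAID-0 style layout for one component (illustrative names)."""
    stripe_size: int   # bytes written to one object before moving to the next
    stripe_count: int  # number of OST objects the component is striped over

    def locate(self, file_offset: int) -> Tuple[int, int]:
        """Return (object_index, object_offset) for a byte offset in the file."""
        if file_offset < 0:
            raise ValueError("file_offset must be non-negative")
        stripe_number = file_offset // self.stripe_size
        # Consecutive stripes go to consecutive objects, round-robin.
        object_index = stripe_number % self.stripe_count
        # Each full round over all objects advances every object by one stripe.
        round_number = stripe_number // self.stripe_count
        object_offset = round_number * self.stripe_size + file_offset % self.stripe_size
        return object_index, object_offset


# Assumed example values: 1 MiB stripes over 4 objects.
MIB = 1 << 20
layout = StripedLayout(stripe_size=MIB, stripe_count=4)
first = layout.locate(0)               # start of the file
second = layout.locate(MIB)            # second stripe lands on the next object
wrapped = layout.locate(5 * MIB + 10)  # second round, second object
```

Because neighboring stripes land on different objects, a large sequential read or write touches several OSTs at once, which is where the added bandwidth comes from.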

Distributed Namespace

  • Scalability
  • Performance

The namespace is the logical structure of directories and files and their attributes. Lustre’s use of a distributed namespace is fundamental to its high scalability.

The namespace can be spread across multiple MDTs, with each MDT storing a subset of it. This distributes the metadata workload and reduces the load on any single server.
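
One way to picture how a namespace subset ends up on each MDT is to hash entry names to an MDT index. The sketch below is an assumption made for illustration: the digest above does not say how Lustre assigns entries to MDTs, and the function `pick_mdt`, the use of CRC32, and the file names are all hypothetical.

```python
import zlib
from collections import Counter


def pick_mdt(entry_name: str, mdt_count: int) -> int:
    """Map a directory entry name to an MDT index (illustrative policy only)."""
    if mdt_count < 1:
        raise ValueError("mdt_count must be at least 1")
    # CRC32 is used because it is deterministic across runs, unlike hash(str).
    return zlib.crc32(entry_name.encode("utf-8")) % mdt_count


# Hypothetical workload: 1000 entries spread over 4 MDTs.
names = [f"file{i:04d}.dat" for i in range(1000)]
entries_per_mdt = Counter(pick_mdt(name, 4) for name in names)
```

Each metadata operation on an entry then goes only to the MDT that owns it, so adding MDTs spreads the same number of entries over more servers.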

Network Abstraction Layer

  • Performance
  • Compatibility
  • Usability

Lustre’s networking layer, LNet, acts as an abstraction layer that decouples Lustre’s core filesystem operations from the underlying network hardware, allowing it to operate over common network types like InfiniBand, Ethernet, and Slingshot.

LNet’s architecture is designed to be highly modular, enabling it to adapt to new network technologies as they emerge. This flexibility is crucial for Lustre’s deployment in diverse HPC environments.

The design of LNet also includes support for advanced network features such as RDMA.

Failover Mechanism

  • Resilience

Lustre’s failover mechanism is designed to ensure availability and reliability in the event of hardware and software failures.

In the event of a server failure, Lustre’s architecture allows for a seamless transition of service responsibilities to a pre-configured standby or secondary server. This failover process is designed to be transparent to applications running on the clients, minimizing downtime and maintaining access to data.
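
From an application’s point of view, transparent failover means a request that cannot reach the primary server is retried against the standby without the caller noticing. The sketch below models only that idea. It is not Lustre’s recovery protocol, and `ServerDown`, `call_with_failover`, and the server names are hypothetical.

```python
from typing import Callable, Optional, Sequence


class ServerDown(Exception):
    """Raised by the (simulated) transport when a server does not respond."""


def call_with_failover(servers: Sequence[str], request: Callable[[str], str]) -> str:
    """Try each configured server in order and return the first successful reply."""
    last_error: Optional[ServerDown] = None
    for server in servers:
        try:
            return request(server)
        except ServerDown as err:
            last_error = err  # remember the failure and move to the next server
    raise ServerDown(f"all servers failed: {list(servers)}") from last_error


def fake_request(server: str) -> str:
    """Simulated request in which the primary server is down."""
    if server == "oss-primary":
        raise ServerDown(server)
    return f"served by {server}"


result = call_with_failover(["oss-primary", "oss-standby"], fake_request)
```

The caller sees only the reply in `result`; which server produced it is hidden inside the retry loop.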

Client Caching

  • Performance
  • Scalability

In Lustre, the client-side cache allows frequently accessed data and metadata to be stored in the client’s RAM.

POSIX Compliance

  • Usability
  • Compatibility

Open Source and Community-Driven Development

  • Usability
  • Compatibility

Adaptability to Various Domains

  • Usability
  • Compatibility

Reference

Anjus George, Andreas Dilger, Michael J. Brim, Richard Mohr, Amir Shehata, Jong Youl Choi, Ahmad Maroof Karimi, Jesse Hanley, James Simmons, Dominic Manno, Veronica Melesse Vergara, Sarp Oral, and Christopher Zimmer. 2025. Lustre Unveiled: Evolution, Design, Advancements, and Current Trends. ACM Trans. Storage 21, 3, Article 21 (June 2025), 109 pages. https://doi.org/10.1145/3736583