Write amplification of flattened object

Write amplification of regular object

When DAOS runs in MD-on-SSD mode, it writes metadata twice. First, during I/O handling, it packs all metadata changes into a contiguous buffer and writes that buffer to the WAL. Then the checkpointing service periodically flushes the in-place metadata changes from DRAM to the MD-blob. The write amplification of the second step is reduced when multiple in-place metadata changes land in the same set of dirty pages.
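
The effect of dirty-page coalescing can be illustrated with a small model. This is only a sketch under stated assumptions, not DAOS code: the 4 KiB page size, the (offset, length) representation of a metadata change, and all function names are hypothetical, and WAL headers are ignored.

```python
PAGE_SIZE = 4096  # assumed page size in bytes, for illustration only


def wal_bytes(updates):
    """Bytes appended to the WAL: every change is packed and logged once."""
    return sum(length for _offset, length in updates)


def dirty_pages(updates, page_size=PAGE_SIZE):
    """Set of page indices touched by the in-place changes."""
    pages = set()
    for offset, length in updates:
        if length <= 0:
            continue
        first = offset // page_size
        last = (offset + length - 1) // page_size
        pages.update(range(first, last + 1))
    return pages


def checkpoint_bytes(updates, page_size=PAGE_SIZE):
    """Bytes flushed to the MD-blob, assuming each dirty page is written once."""
    return len(dirty_pages(updates, page_size)) * page_size


# Four 64-byte changes within one page versus spread over four pages.
clustered = [(0, 64), (128, 64), (256, 64), (512, 64)]
scattered = [(0, 64), (4096, 64), (8192, 64), (12288, 64)]

for name, updates in (("clustered", clustered), ("scattered", scattered)):
    print(name, wal_bytes(updates), checkpoint_bytes(updates))
```

In this model both workloads log 256 bytes to the WAL, but the clustered one flushes a single page at checkpoint time while the scattered one flushes four.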

Write amplification of flattened object

Once WORM objects and the flattening service are added to DAOS, the metadata of a WORM object will be written three times, because the flattening service serializes the metadata and writes it again to the DT-blob. This may impact overall performance and SSD endurance.

Because writes to the WAL and the DT-blob are mandatory for a WORM object, the only step that can be optimized or eliminated is the in-place write performed by the checkpointing service. However, DTX and MVCC depend on the VOS tree structures being up to date, so changing this step would require a significant amount of effort.

Client side flattening (optional)

The DAOS server has to run all the MVCC and conditional checks even when they are not strictly necessary, because a client can misbehave. However, if the application can guarantee that all writes are new (a new object, a new key, or a new value extent), then the DAOS client can submit the I/O request in flattened format instead of the regular RPC format. In this case, the server can write the flattened RPC directly to the WAL and skip the checkpointing phase.

The main challenge of this approach is that if the application submits writes against the same object in multiple RPCs, there will be multiple flattened buffers for that object. This conflicts with the initial goal of the design, which is that the entire object can be loaded into DRAM with one SSD read. The flattening service should therefore be able to find and merge the flattened buffers of the same object before flushing them to the DT-blob. It also means that the WAL can only be reclaimed after the flattening service has written the merged buffers to the DT-blob, which may require significant changes to the checkpointing and flattening services.
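
The merge step and the WAL reclaim constraint can be sketched as follows. This is a simplified model, not the DAOS implementation: the record layout (lsn, object_id, entries), the function names, and the assumption that the WAL is reclaimed as a prefix in LSN order are all hypothetical.

```python
from collections import defaultdict


def merge_flattened(records):
    """Group flattened buffers by object and merge them in LSN order.

    `records` is a list of (lsn, object_id, entries) tuples, where `entries`
    maps a key to its value. The application guarantees every write is new,
    so a key written twice for the same object is treated as an error.
    """
    merged = defaultdict(dict)
    for _lsn, oid, entries in sorted(records, key=lambda r: r[0]):
        overlap = merged[oid].keys() & entries.keys()
        if overlap:
            raise ValueError(f"object {oid}: keys written twice: {sorted(overlap)}")
        merged[oid].update(entries)
    return dict(merged)


def reclaimable_lsn(records, flushed_oids):
    """Highest LSN up to which the WAL can be reclaimed, or None.

    A record can be dropped only after the merged buffer of its object has
    been written to the DT-blob; the WAL is assumed to be reclaimed as a
    prefix, so the first unflushed record stops the scan.
    """
    result = None
    for lsn, oid, _entries in sorted(records, key=lambda r: r[0]):
        if oid not in flushed_oids:
            break
        result = lsn
    return result


# Hypothetical WAL content: two RPCs for obj-A interleaved with one for obj-B.
records = [
    (1, "obj-A", {"k1": b"a"}),
    (2, "obj-B", {"k1": b"b"}),
    (3, "obj-A", {"k2": b"c"}),
]
merged = merge_flattened(records)
```

In this example, flushing only the merged buffer of obj-A allows the WAL to be reclaimed up to LSN 1, because the record at LSN 2 belongs to obj-B, which has not been flushed yet.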

Given the complexity and constraints of client side flattening, it can be an optional feature of this project.