Terminology:

  • MD-blob: metadata blob; all of its contents are copied to DRAM.

  • DT-blob: data blob; its contents are placed in DRAM only temporarily, while serving I/O.

Background

DAOS has a Python tool (daos_storage_estimator.py) that estimates internal metadata consumption based on the DFS data model. The numbers below are its results for 1 million 4K files.

  • 1.0 GB metadata

    • 196.28 MB (object)

    • 307.20 MB (dkey)

    • 329.00 MB (akey)

    • 192.00 MB (array value)

  • 4.0 GB user data

Internal metadata is about 25% of user data for 4K files, and this estimate still leaves out a few things:

  • VEA and DTX space consumption are not considered

  • PMDK/DAV has its own internal metadata
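The 25% figure can be checked directly from the estimator breakdown above. The short sketch below only re-adds the four listed categories; it is not part of daos_storage_estimator.py, the variable names are hypothetical, and it assumes binary units (1 MB = 1024 * 1024 bytes, 4K = 4096 bytes). The uncounted items (VEA, DTX, PMDK/DAV metadata) are not modeled.

```python
# Per-category metadata estimates in MB for 1 million 4K files,
# copied from the estimator results listed above.
METADATA_MB = {
    "object": 196.28,
    "dkey": 307.20,
    "akey": 329.00,
    "array value": 192.00,
}

NUM_FILES = 1_000_000
FILE_SIZE_BYTES = 4 * 1024   # assumption: 4K means 4096 bytes
MB = 1024 * 1024             # assumption: binary megabytes

total_md_mb = sum(METADATA_MB.values())
user_data_mb = NUM_FILES * FILE_SIZE_BYTES / MB

# Metadata as a fraction of user data, and metadata bytes per file.
md_ratio = total_md_mb / user_data_mb
md_bytes_per_file = total_md_mb * MB / NUM_FILES

print(f"total metadata: {total_md_mb:.2f} MB")
print(f"metadata / user data: {md_ratio:.1%}")
print(f"metadata per file: {md_bytes_per_file:.0f} bytes")
```

Under these assumptions the per-file cost comes out slightly above 1K bytes, which is consistent with the rounded "1K bytes per 4K file" used in the capacity estimate.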

Suppose a DAOS storage server has 1TB of DRAM and reserves 20% of it for the OS, DMA/RDMA buffers, VOS object cache, VEA index, DTX tables…; that leaves 800GB for the MD-blobs of all pools. Based on the estimate above, each 4K file consumes about 1K bytes of internal metadata, so this storage server can store at most 800 million 4K files, which is 3.2TB of user data. Given that a storage server can have 100TB or more of SSD capacity for user data, a DAOS server (MD-on-SSD phase-I) can use only a tiny portion of its storage space if the application's dataset consists only of small files.
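The capacity arithmetic in the paragraph above can be written as a small sketch. The function name and parameters are hypothetical and do not correspond to any DAOS API; the sketch uses round decimal units (1K = 1000 bytes, 1TB = 10**12 bytes) so that it matches the rounded numbers in the text.

```python
def max_small_files(dram_bytes: int, reserved_percent: int,
                    md_bytes_per_file: int) -> int:
    """Upper bound on file count when all metadata must fit in DRAM.

    Hypothetical helper: the DRAM left after the reservation is treated
    as the budget for MD-blobs, divided by the per-file metadata cost.
    """
    md_budget = dram_bytes * (100 - reserved_percent) // 100
    return md_budget // md_bytes_per_file


TB = 10**12                    # assumption: decimal terabytes
DRAM_BYTES = 1 * TB            # 1TB DRAM, from the text
RESERVED_PERCENT = 20          # share kept for OS, buffers, caches
MD_BYTES_PER_FILE = 1000       # about 1K of metadata per 4K file
FILE_SIZE_BYTES = 4000         # 4K file, rounded as in the text
SSD_BYTES = 100 * TB           # 100TB of SSD, from the text

files = max_small_files(DRAM_BYTES, RESERVED_PERCENT, MD_BYTES_PER_FILE)
user_data_bytes = files * FILE_SIZE_BYTES
ssd_utilization = user_data_bytes / SSD_BYTES

print(f"max 4K files: {files:,}")
print(f"user data: {user_data_bytes / TB:.1f} TB")
print(f"SSD utilization: {ssd_utilization:.1%}")
```

With these inputs the DRAM budget, not the SSD capacity, is the limiting factor: only a few percent of a 100TB SSD tier would be usable for a dataset made up entirely of 4K files.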

There are a few ways to improve this:

Reduce memory consumption

Object flattening and eviction

...

Reducing memory consumption always benefits the storage engine, and it is more important than ever now that PMEM is removed from the engine: the current metadata server stack was designed for systems with multiple terabytes of persistent memory, which is obviously more than DRAM capacity. Decreasing internal metadata reduces the chance of cache misses and object eviction, and mitigates the impact of removing PMEM.