Jira: DAOS-1141413559

Terminology

  • MD-blob: metadata blob; all of its contents will be copied to DRAM.

  • DT-blob: data blob; its contents can be placed in DRAM only temporarily, while serving I/O.

  • Hierarchical object: the current DAOS object format, in which keys and values are indexed by a tree.

  • Flatten: serialize object keys and values into a self-describing contiguous buffer.
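To make the "Flatten" term concrete, the sketch below serializes a small nested key/value structure into one contiguous, self-describing buffer and reads it back. It is only an illustration: the record layout, the `MAGIC` tag, and the `flatten`/`unflatten` names are assumptions made for this example and do not describe the actual DAOS on-disk format.

```python
import struct

# Hypothetical layout, not the DAOS format:
#   header : 4-byte magic, 4-byte record count
#   record : 2-byte dkey length, 2-byte akey length, 4-byte value length,
#            followed by the dkey, akey and value bytes
MAGIC = b"FLAT"
HEADER = "<4sI"
RECORD = "<HHI"


def flatten(obj):
    """Serialize {dkey: {akey: value}} (all bytes) into one contiguous buffer."""
    records = [(d, a, v)
               for d, akeys in sorted(obj.items())
               for a, v in sorted(akeys.items())]
    buf = bytearray(struct.pack(HEADER, MAGIC, len(records)))
    for d, a, v in records:
        buf += struct.pack(RECORD, len(d), len(a), len(v))
        buf += d + a + v
    return bytes(buf)


def unflatten(buf):
    """Rebuild the nested structure using only the lengths stored in the buffer."""
    magic, count = struct.unpack_from(HEADER, buf, 0)
    if magic != MAGIC:
        raise ValueError("not a flattened object")
    off = struct.calcsize(HEADER)
    obj = {}
    for _ in range(count):
        dlen, alen, vlen = struct.unpack_from(RECORD, buf, off)
        off += struct.calcsize(RECORD)
        d = buf[off:off + dlen]
        off += dlen
        a = buf[off:off + alen]
        off += alen
        v = buf[off:off + vlen]
        off += vlen
        obj.setdefault(d, {})[a] = v
    return obj


example = {b"dkey0": {b"akey0": b"hello", b"akey1": b"world"}}
flat = flatten(example)
```

Because every record carries its own lengths, the buffer can be parsed without consulting any external index, which is what "self-describing" means in this sketch.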

Background

DAOS has a Python tool (daos_storage_estimator.py) that estimates internal metadata consumption based on the DFS data model. The numbers below are its results for 1 million 4K files.

...

If a DAOS storage server has 1TB of DRAM and reserves 128GB of it for the OS, DMA/RDMA buffers, the VOS object cache, the VEA index, DTX tables and so on, then roughly 900GB remains for the MD-blobs of all pools. Based on the estimates above, each 4K file consumes about 1K bytes of internal metadata, so this storage server can store at most 900 million 4K files, which is 3.6TB of user data. Given that a storage server can have 100TB or more of SSD capacity for user data, a DAOS server (MD-on-SSD phase-I) can use only a tiny portion of that storage space if the application's dataset consists only of small files.
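The arithmetic behind this limit can be reproduced with a few lines of Python. This is a back-of-the-envelope sketch, not output from daos_storage_estimator.py: it takes the rounded figures from the paragraph above (900GB for MD-blobs, 1K of metadata per file, 100TB of SSD) and assumes decimal units (1K = 1000 bytes), which is what the rounded figures in the text imply. All variable and function names are made up for the example.

```python
# Decimal units are an assumption chosen to match the rounded figures in the text.
KB = 10**3
GB = 10**9
TB = 10**12

md_blob_dram = 900 * GB      # DRAM left for MD-blobs after reservations
md_per_file = 1 * KB         # estimated internal metadata per 4K file
file_size = 4 * KB           # user data per file
ssd_capacity = 100 * TB      # SSD capacity available for user data


def max_files(dram_bytes, md_bytes_per_file):
    """Number of files whose metadata fits entirely in DRAM."""
    return dram_bytes // md_bytes_per_file


n_files = max_files(md_blob_dram, md_per_file)
user_data = n_files * file_size
ssd_fraction_used = user_data / ssd_capacity

print(f"files: {n_files:,}")
print(f"user data: {user_data / TB:.1f} TB")
print(f"SSD utilization: {ssd_fraction_used:.1%}")
```

Under these assumptions the server holds 900 million files and 3.6TB of user data, which is under 4% of a 100TB SSD tier.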

Overview

In this design, DAOS will not dynamically load or evict mapped metadata pages of the MD-blob. Instead, DAOS will try to manage each object together with its metadata. For example, it can migrate a significant amount of internal metadata from the MD-blob to the DT-blob; after that, the migrated metadata can be evicted from DRAM. During I/O handling, evicted metadata can be brought back into DRAM from the DT-blob, and it can be evicted again when the system is under memory pressure.
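The migrate, evict, and reload cycle described above can be pictured with a toy model. The sketch below is purely illustrative: the class name, the per-object granularity, the entry-count budget, and the least-recently-used eviction order are all assumptions made for this example, and the dictionaries stand in for DRAM and the DT-blob rather than for any real DAOS structure.

```python
from collections import OrderedDict


class ToyMetadataCache:
    """Toy model: object metadata is evictable only after migration to the DT-blob."""

    def __init__(self, dram_budget):
        self.dram_budget = dram_budget   # max metadata entries kept in DRAM (assumed unit)
        self.resident = OrderedDict()    # oid -> metadata in DRAM, oldest access first
        self.dt_blob = {}                # oid -> metadata migrated to the DT-blob

    def insert(self, oid, metadata):
        """New metadata starts out resident in DRAM."""
        self.resident[oid] = metadata

    def migrate(self, oid):
        """Copy metadata to the DT-blob so the DRAM copy may be dropped later."""
        self.dt_blob[oid] = self.resident[oid]

    def evict_under_pressure(self):
        """Drop migrated entries, least recently used first, until within budget."""
        for oid in list(self.resident):
            if len(self.resident) <= self.dram_budget:
                break
            if oid in self.dt_blob:
                del self.resident[oid]

    def lookup(self, oid):
        """I/O path: bring evicted metadata back from the DT-blob if needed."""
        if oid not in self.resident:
            self.resident[oid] = self.dt_blob[oid]
        self.resident.move_to_end(oid)
        return self.resident[oid]


cache = ToyMetadataCache(dram_budget=2)
for oid in ("a", "b", "c"):
    cache.insert(oid, {"oid": oid})
cache.migrate("a")
cache.migrate("b")
cache.evict_under_pressure()      # "a" is the oldest migrated entry, so it is dropped
after_first_evict = list(cache.resident)
reloaded = cache.lookup("a")      # brought back from the DT-blob
cache.evict_under_pressure()      # now "b" is dropped; "c" was never migrated
after_second_evict = list(cache.resident)
```

Note that entry "c" stays in DRAM throughout because it was never migrated; in this model only metadata that already has a copy in the DT-blob can be evicted.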

...