Object flattening
Most AI/ML jobs are perceived to be read-intensive, dominated by many small reads, while some ML jobs also perform small writes. This kind of I/O behavior requires the storage system to provide superior random/small read performance. Research shows that 99% of the read and write calls in Biology, Computer Science, Materials, and Chemistry workloads are less than 10MB, and over 90% are less than 1MB.

The data format of VOS is designed for generic requirements, so it depends on scalable data structures like the B+Tree and EV-Tree. However, for read-intensive AI/ML workloads, where most read calls are small, keeping a scalable index over the data does not help performance much. It actually introduces significant metadata overhead, which consumes DRAM now that PMEM has been removed from the stack.
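To make the overhead concrete, the following back-of-envelope sketch compares per-record index metadata against the payload it describes. The sizes used here are illustrative assumptions, not actual VOS or B+Tree/EV-Tree constants:

```python
# Hypothetical sizes (assumptions for illustration, not VOS constants):
RECORD_SIZE = 4096      # a typical small read/write payload, in bytes
INDEX_OVERHEAD = 128    # per-record tree metadata kept resident in DRAM

def dram_overhead_ratio(record_size: int, index_overhead: int) -> float:
    """Fraction of memory spent on index metadata relative to user data."""
    return index_overhead / record_size

# With these assumed sizes, every small record carries ~3% extra
# DRAM-resident metadata just for indexing:
print(f"{dram_overhead_ratio(RECORD_SIZE, INDEX_OVERHEAD):.1%}")  # 3.1%
```

With millions of small objects, this per-record cost adds up, which is the overhead that object flattening aims to eliminate.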

To reduce the metadata overhead of indexing the user data of small objects/files, a technology called object flattening is proposed in this document.
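The core idea can be sketched as follows: instead of maintaining a scalable tree index per record, a small, mostly-read object is serialized into one contiguous buffer with a compact lookup table, so a small read becomes a single offset lookup rather than a tree traversal. This is a minimal illustrative sketch of the concept, not the actual VOS on-disk format; all names and the record layout here are assumptions:

```python
import struct

def flatten(records: dict) -> tuple:
    """Pack {key: bytes} records into one contiguous buffer.

    Each entry is laid out as (key_len, val_len, key, val); a small
    offset table replaces the per-record tree index.
    """
    buf = bytearray()
    offsets = {}
    for key, val in records.items():
        k = key.encode()
        offsets[key] = len(buf)
        buf += struct.pack("<II", len(k), len(val)) + k + val
    return bytes(buf), offsets

def read_flat(buf: bytes, offsets: dict, key: str) -> bytes:
    """Serve a small read with one table lookup and one buffer slice."""
    off = offsets[key]
    klen, vlen = struct.unpack_from("<II", buf, off)
    start = off + 8 + klen
    return buf[start:start + vlen]

blob, table = flatten({"a": b"hello", "b": b"world"})
print(read_flat(blob, table, "b"))  # b'world'
```

The trade-off is that the flattened buffer is cheap to read but expensive to update in place, which is why flattening suits read-intensive, rarely-modified objects.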

 
