UMEM Abstraction Layer

Stack layering

The UMEM abstraction layer was designed to provide a Unified MEMory interface for accessing both DRAM and PMEM, so that DAOS can use the same implementation of common data structures, such as the B+tree, on both DRAM and PMEM.

To support metadata on SSD, the UMEM interface should be extended so that it can describe the new storage type exported by BIO (DRAM + blob); a new ad-hoc allocator can then be built for this storage type.

In summary, UMEM should provide a set of APIs that includes features such as:

  • Transactional API

  • Arbitrary alignment space allocation
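As a rough illustration of what allocation with arbitrary alignment involves, the following sketch rounds an offset up to any non-zero alignment before carving out a block. It is a simple bump allocator written for this example, not the DAOS ad-hoc allocator; `struct bump_arena`, `align_up`, `bump_alloc`, and `BUMP_ALLOC_FAIL` are all hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bump allocator over a flat region; NOT the DAOS allocator. */
struct bump_arena {
	uint64_t ba_base;	/* start offset of the managed region */
	uint64_t ba_size;	/* number of bytes in the region */
	uint64_t ba_used;	/* bytes consumed so far, including padding */
};

#define BUMP_ALLOC_FAIL	UINT64_MAX

/* Round 'off' up to the next multiple of 'align' (any non-zero value). */
static uint64_t
align_up(uint64_t off, uint64_t align)
{
	uint64_t rem;

	assert(align != 0);
	rem = off % align;
	return rem == 0 ? off : off + (align - rem);
}

/* Return the offset of an aligned block, or BUMP_ALLOC_FAIL if it does not fit. */
static uint64_t
bump_alloc(struct bump_arena *arena, uint64_t size, uint64_t align)
{
	uint64_t start = align_up(arena->ba_base + arena->ba_used, align);
	uint64_t end   = arena->ba_base + arena->ba_size;

	if (start > end || size > end - start)
		return BUMP_ALLOC_FAIL;

	/* Padding inserted for alignment is counted as used space. */
	arena->ba_used = start + size - arena->ba_base;
	return start;
}
```

Using the modulo operation rather than a bit mask lets the alignment be any non-zero value, not only a power of two, at the cost of a division per allocation.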

BIO can manage multiple storage types, including both SPDK+NVMe and PMEM, for the server I/O stack, so it sits on top of the UMEM layer in the stack. However, it also describes and exports the underlying blob and the backend of the ad-hoc allocator, so BIO and UMEM depend on each other recursively. To break this recursive dependency, UMEM should provide a set of data structures and callbacks that describe the generic properties and behaviors of the storage. The upper-level stack initializes these properties and callbacks and then passes them into UMEM. After this, the ad-hoc allocator under UMEM can manage the space of the storage exported by BIO.

UMEM Store

umem_store is an abstraction of a storage device (an SPDK blob); the umem allocator manages the specified space of that device for metadata.

    struct umem_store {
        /**
         * Base address of the umem storage, umem allocator can manage arbitrary partition
         * of the device/blob.
         */
        daos_addr_t            stor_addr;
        /** size of the storage partition managed by umem allocator */
        daos_size_t            stor_size;
        /** private data passed between layers */
        void                  *stor_priv;
        /**
         * callbacks provided by upper level stack, umem allocator can use them to
         * operate the storage device.
         */
        struct umem_store_ops *stor_ops;
    };
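Because stor_addr and stor_size delimit the partition that the allocator manages, any address handed to the storage callbacks should fall inside that range. The sketch below shows one way such a check could look. The helper `store_contains` is hypothetical, and the two typedefs are stand-ins that assume both DAOS types are 64-bit unsigned integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the DAOS typedefs, ASSUMED here to be 64-bit unsigned integers. */
typedef uint64_t daos_addr_t;
typedef uint64_t daos_size_t;

struct umem_store_ops;	/* function table, opaque for this example */

struct umem_store {
	daos_addr_t		 stor_addr;
	daos_size_t		 stor_size;
	void			*stor_priv;
	struct umem_store_ops	*stor_ops;
};

/* Hypothetical helper: true if [addr, addr + size) lies inside the partition. */
static bool
store_contains(const struct umem_store *store, daos_addr_t addr, daos_size_t size)
{
	daos_addr_t off;

	if (addr < store->stor_addr)
		return false;

	off = addr - store->stor_addr;
	if (off > store->stor_size)
		return false;

	/* Compare against the remaining space so that addr + size cannot overflow. */
	return size <= store->stor_size - off;
}
```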

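umem_store currently provides only the simplest functions to access storage, collected in the umem_store_ops function table. The UMEM layer uses the first two functions (so_read and so_write) to load a checkpoint from, and store a checkpoint to, the storage device, and the last two functions (so_wal_reserv and so_wal_submit) to submit the Write Ahead Log (WAL) to the storage device.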

    /** Describing a storage region for I/O */
    struct umem_store_region {
        daos_addr_t sr_addr;
        daos_size_t sr_size;
    };

    /** Arbitrary number of storage regions */
    struct umem_store_iod {
        int                       io_nr;
        /* embedded one for convenience */
        struct umem_store_region  io_region;
        struct umem_store_region *io_regions;
    };

    /** Function table for UMEM to access storage */
    struct umem_store_ops {
        int (*so_read)(struct umem_store *store, struct umem_store_iod *iod,
                       d_sg_list_t *sgl);
        int (*so_write)(struct umem_store *store, struct umem_store_iod *iod,
                        d_sg_list_t *sgl);
        int (*so_wal_reserv)(struct umem_store *store, struct umem_wal_id *id);
        int (*so_wal_submit)(struct umem_store *store, struct umem_wal_id id,
                             struct umem_wal_rec *rec);
        /* TODO: replay callbacks */
    };
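To make the decoupling concrete, here is a minimal sketch of how an upper layer might fill in the function table and hand it to UMEM. It backs the store with a plain byte array instead of an SPDK blob. Several things are assumptions made for this example only: the typedefs and the reduced `d_iov_t`/`d_sg_list_t` are stand-ins for the real DAOS types, the WAL callbacks are left out of the table, `mem_xfer`, `mem_read`, `mem_write`, and `mem_store_ops` are hypothetical names, each region is assumed to pair with exactly one scatter/gather entry of the same length, and `-1` stands in for a real DAOS error code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for DAOS types; simplified to the fields this sketch needs. */
typedef uint64_t daos_addr_t;
typedef uint64_t daos_size_t;
typedef struct { void *iov_buf; size_t iov_len; } d_iov_t;
typedef struct { uint32_t sg_nr; d_iov_t *sg_iovs; } d_sg_list_t;

struct umem_store;

struct umem_store_region {
	daos_addr_t	sr_addr;
	daos_size_t	sr_size;
};

struct umem_store_iod {
	int				 io_nr;
	struct umem_store_region	 io_region;
	struct umem_store_region	*io_regions;
};

/* Reduced function table: the WAL callbacks are omitted in this sketch. */
struct umem_store_ops {
	int (*so_read)(struct umem_store *store, struct umem_store_iod *iod, d_sg_list_t *sgl);
	int (*so_write)(struct umem_store *store, struct umem_store_iod *iod, d_sg_list_t *sgl);
};

struct umem_store {
	daos_addr_t		 stor_addr;
	daos_size_t		 stor_size;
	void			*stor_priv;
	struct umem_store_ops	*stor_ops;
};

/* Hypothetical backend: stor_priv points at a byte array of stor_size bytes. */
static int
mem_xfer(struct umem_store *store, struct umem_store_iod *iod, d_sg_list_t *sgl, int is_write)
{
	uint8_t	*base = store->stor_priv;
	int	 i;

	/* ASSUMPTION: one scatter/gather entry per region. */
	if (iod->io_nr < 0 || sgl->sg_nr != (uint32_t)iod->io_nr)
		return -1;

	for (i = 0; i < iod->io_nr; i++) {
		struct umem_store_region *reg = &iod->io_regions[i];
		d_iov_t			 *iov = &sgl->sg_iovs[i];
		daos_addr_t		  off;

		if (reg->sr_addr < store->stor_addr || iov->iov_len != reg->sr_size)
			return -1;
		/* Translate the device address into an offset within the partition. */
		off = reg->sr_addr - store->stor_addr;
		if (off > store->stor_size || reg->sr_size > store->stor_size - off)
			return -1;

		if (is_write)
			memcpy(base + off, iov->iov_buf, reg->sr_size);
		else
			memcpy(iov->iov_buf, base + off, reg->sr_size);
	}
	return 0;
}

static int
mem_read(struct umem_store *store, struct umem_store_iod *iod, d_sg_list_t *sgl)
{
	return mem_xfer(store, iod, sgl, 0);
}

static int
mem_write(struct umem_store *store, struct umem_store_iod *iod, d_sg_list_t *sgl)
{
	return mem_xfer(store, iod, sgl, 1);
}

/* The upper layer fills this table and points stor_ops at it. */
static struct umem_store_ops mem_store_ops = {
	.so_read  = mem_read,
	.so_write = mem_write,
};
```

A caller with a single region can point io_regions at the embedded io_region, set io_nr to 1, and invoke store->stor_ops->so_write() or so_read(). The code that issues the I/O only ever sees the function table, so the same caller would work unchanged against a blob-backed implementation supplied by a different upper layer.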