Today each VOS instance has a single associated “data blob” that stores the bulk values. To support metadata on SSD, two more blobs will be introduced: a “meta blob” for storing the VOS index and small values, and a “WAL blob” for storing the write-ahead log (WAL). Depending on the configuration scheme, the meta blob and WAL blob can reside on the same SSD or on separate SSDs, and they can also share an SSD with the data blob.

...

  1. The low 32 bits represent the block offset within the WAL. With a 4 KB block size, this supports a WAL of up to 16 TB (2^32 blocks × 4 KB).

  2. The high 32 bits represent a sequence number, which is incremented by 1 each time the log wraps.
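The two fields above can be packed into and extracted from a single 64-bit value with a few shifts and masks. The following is a minimal sketch, not the actual implementation: the helper names (`wal_id_make`, `wal_id_seq`, `wal_id_off`, `wal_id_next`) and the `WAL_BLK_SZ` constant are hypothetical, and it assumes the low 32 bits count blocks, which is what the 16 TB figure implies.

```c
#include <stdint.h>

#define WAL_BLK_SZ      4096ULL                         /* assumed 4 KB WAL block size */
#define WAL_ID_OFF_BITS 32
#define WAL_ID_OFF_MASK ((1ULL << WAL_ID_OFF_BITS) - 1)

/* Hypothetical helper: pack sequence number (high 32 bits) and block offset (low 32 bits) */
static inline uint64_t
wal_id_make(uint32_t seq, uint32_t blk_off)
{
	return ((uint64_t)seq << WAL_ID_OFF_BITS) | blk_off;
}

/* Hypothetical helper: sequence number, i.e. how many times the log has wrapped */
static inline uint32_t
wal_id_seq(uint64_t id)
{
	return (uint32_t)(id >> WAL_ID_OFF_BITS);
}

/* Hypothetical helper: block offset within the WAL */
static inline uint32_t
wal_id_off(uint64_t id)
{
	return (uint32_t)(id & WAL_ID_OFF_MASK);
}

/*
 * Hypothetical helper: advance an ID by nr_blks blocks in a WAL of tot_blks
 * blocks. Assumes nr_blks <= tot_blks. When the offset passes the end of the
 * log it wraps to the start and the sequence number is incremented by 1.
 */
static inline uint64_t
wal_id_next(uint64_t id, uint32_t nr_blks, uint32_t tot_blks)
{
	uint64_t off = (uint64_t)wal_id_off(id) + nr_blks;
	uint32_t seq = wal_id_seq(id);

	if (off >= tot_blks) {
		off -= tot_blks;
		seq++;
	}
	return wal_id_make(seq, (uint32_t)off);
}
```

Because the sequence number occupies the high bits, a plain integer comparison of two such values orders them correctly across a wrap: an ID produced after the log wraps compares greater than any ID produced before it.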

Each transaction starts with a “WAL transaction header” entry, followed by multiple “WAL transaction operation” entries. The last entry of a transaction is a “WAL transaction csum” entry, which contains the checksum of all entries and is used for the data integrity check during recovery.

...

#define WAL_HDR_FL_CSUM 0x1 /* The tail csum entry is in current block */

struct wal_trans_head {
  uint32_t  wth_magic;
  uint16_t  wth_len;    /* Transaction data length within current block, in bytes */
  uint16_t  wth_flags;  /* Transaction header flags */
  uint64_t  wth_id;     /* Transaction ID */
};

enum wal_trans_op_type {
  /* Memory copy data to given meta blob offset */
  WAL_OP_MEMCPY = 0,
  /* Memory move data of given meta blob offset */
  WAL_OP_MEMMOVE,
  /* Zeroing data from given meta blob offset */
  WAL_OP_ZEROING,
  /* Checksum of given data on data blob */
  WAL_OP_CSUM,
  WAL_OP_MAX,
};

struct wal_trans_entry {
  uint64_t  wte_off;    /* Offset within meta or data blob, in bytes */
  uint32_t  wte_type;   /* Operation type */
  uint32_t  wte_len;    /* Data length in bytes */
  uint8_t   wte_data[0];
};

struct wal_trans_csum {
  uint32_t  wtc_len;    /* Checksum length in bytes */
  uint8_t   wtc_csum[0];
};
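To illustrate how variable-length operation entries could be laid out back to back, here is a minimal sketch that appends `WAL_OP_MEMCPY` entries to a byte buffer and walks them again. It rests on assumptions not stated above: `wal_put_memcpy` and `wal_count_entries` are hypothetical helpers, `struct wal_entry_hdr` is a stand-in for the fixed-size part of `wal_trans_entry`, and every entry is assumed to carry exactly `wte_len` payload bytes in `wte_data` (plausible for `WAL_OP_MEMCPY`, not confirmed for the other operation types). The header and csum entries are left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { WAL_OP_MEMCPY = 0 };	/* first value of wal_trans_op_type */

/* Stand-in for the fixed-size part of wal_trans_entry (no trailing wte_data) */
struct wal_entry_hdr {
	uint64_t wte_off;	/* Offset within meta blob, in bytes */
	uint32_t wte_type;	/* Operation type */
	uint32_t wte_len;	/* Data length in bytes */
};

/*
 * Hypothetical helper: append one MEMCPY entry at buf[pos].
 * Returns the position just past the entry, or 0 if it does not fit in cap.
 * memcpy() is used for the header so unaligned positions are safe.
 */
static size_t
wal_put_memcpy(uint8_t *buf, size_t cap, size_t pos, uint64_t meta_off,
	       const void *data, uint32_t len)
{
	struct wal_entry_hdr hdr = {
		.wte_off = meta_off, .wte_type = WAL_OP_MEMCPY, .wte_len = len,
	};

	if (pos > cap || cap - pos < sizeof(hdr) || cap - pos - sizeof(hdr) < len)
		return 0;

	memcpy(buf + pos, &hdr, sizeof(hdr));
	memcpy(buf + pos + sizeof(hdr), data, len);
	return pos + sizeof(hdr) + len;
}

/*
 * Hypothetical helper: walk the entries in buf[start, end).
 * Returns the number of entries, or -1 if an entry is truncated.
 */
static int
wal_count_entries(const uint8_t *buf, size_t start, size_t end)
{
	size_t pos = start;
	int    nr = 0;

	while (pos < end) {
		struct wal_entry_hdr hdr;

		if (end - pos < sizeof(hdr))
			return -1;
		memcpy(&hdr, buf + pos, sizeof(hdr));
		pos += sizeof(hdr);

		if (end - pos < hdr.wte_len)
			return -1;
		pos += hdr.wte_len;	/* skip the payload */
		nr++;
	}
	return nr;
}
```

A recovery-style reader would use the same walk to find where each entry ends; detecting a truncated entry, as `wal_count_entries` does, is one simple sanity check that can run before the checksum is verified.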

...