
  1. Number of in-flight ULTs (calculated as the per-target memory limit / 16k).

  2. Number of RPCs in the per-pool waiting queue and the global number of RPCs in waiting queues.
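
As a rough illustration of the first limit, the sketch below derives the in-flight ULT cap from a per-target memory limit. Reading "16k" as 16 KiB of memory per ULT is an assumption, and `ult_inflight_limit` is a hypothetical name, not an existing DAOS function.

```c
#include <stdint.h>

/* Assumption: "16k" in the text means 16 KiB of memory per ULT. */
#define ULT_MEM_BYTES (16u * 1024u)

/* Hypothetical helper: how many ULTs fit in the per-target memory limit. */
static uint64_t
ult_inflight_limit(uint64_t target_mem_limit_bytes)
{
    return target_mem_limit_bytes / ULT_MEM_BYTES;
}
```

Under that assumption, a 1 GiB per-target limit allows 65536 in-flight ULTs.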

When an RPC arrives at the server, it might be put on the waiting queue if the number of in-flight ULTs exceeds the limit. It might be rejected if the waiting queue is full, or if the RPC could not be handled in time given the current RPC processing speed and the number of RPCs already in the waiting queue.

A new error, DER_BUSY, will be returned to the client together with a hint (calculated from the number of RPCs in the waiting queue and the current RPC processing speed).
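
The admission decision described above can be sketched as follows. This is a minimal illustration, not the actual implementation: every name here (`nrs_state`, `nrs_admit`, the verdict enum) is hypothetical, and taking the hint to be the estimated queue drain time in milliseconds is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* All names below are hypothetical, for illustration only. */
enum nrs_verdict { NRS_RUN, NRS_QUEUE, NRS_REJECT };

struct nrs_state {
    uint32_t inflight;     /* ULTs currently running */
    uint32_t inflight_max; /* in-flight ULT limit */
    uint32_t queued;       /* RPCs in the waiting queue */
    uint32_t queue_max;    /* waiting queue capacity */
    uint32_t rpcs_per_sec; /* measured RPC processing speed */
};

/* Estimated time, in ms, to drain the current waiting queue. */
static uint64_t
nrs_drain_ms(const struct nrs_state *s)
{
    assert(s->rpcs_per_sec > 0);
    return ((uint64_t)s->queued * 1000) / s->rpcs_per_sec;
}

/* Decide what to do with an arriving RPC; on reject, *hint_ms is set. */
static enum nrs_verdict
nrs_admit(const struct nrs_state *s, uint64_t timeout_ms, uint64_t *hint_ms)
{
    uint64_t wait;

    *hint_ms = 0;
    /* Run at once only if there is room and nothing is queued ahead. */
    if (s->inflight < s->inflight_max && s->queued == 0)
        return NRS_RUN;

    wait = nrs_drain_ms(s);
    if (s->queued >= s->queue_max || wait >= timeout_ms) {
        *hint_ms = wait; /* caller would reply DER_BUSY with this hint */
        return NRS_REJECT;
    }
    return NRS_QUEUE;
}
```

For example, with 100 RPCs queued and a processing speed of 1000 RPCs per second, the estimated wait is 100 ms, so an RPC with a 50 ms timeout is rejected with a 100 ms hint while one with a 500 ms timeout is queued.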

The number of RPCs that may be queued takes the following factors into account:


  1. Reserved memory that DAOS might use; this may differ between setups (MD-on-SSD or PMDK).

  2. The current xstream IO ULT processing speed; it may change dynamically depending on space pressure, so each xstream might have a different limit.

  3. One xstream must not queue too many RPCs, e.g. no more than 1/2 of the total RPC queue limit, for fairness.

  4. An RPC might be rejected even when the waiting queue is not full, because the individual RPC has a smaller timeout.

  5. Latency, which differs between RPC types.
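
The fairness rule in item 3 can be expressed as a simple per-xstream check. This is only a sketch: `xs_may_queue` is a hypothetical helper, and the 1/2 share is the example value given in the list, not a fixed constant.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: an xstream may accept one more queued RPC only if
 * doing so keeps it within half of the total RPC queue limit.
 */
static bool
xs_may_queue(uint32_t xs_queued, uint32_t total_queue_limit)
{
    return xs_queued < total_queue_limit / 2;
}
```

With a total limit of 1000, an xstream holding 499 queued RPCs may take one more, while one already holding 500 may not.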

In order to avoid tail latency, a separate priority heap is introduced for inserting retried RPCs. By default, an RPC's priority is 0, and RPCs are sorted by enqueue time (which is still FIFO) if there are no RPC retries. Every time an RPC is retried, its priority is increased by some weight so that no RPC is retried forever. Whenever an RPC arrives at the server, it is assigned a sorted ID; a retried RPC keeps the same ID, and the server always picks the smallest ID from the waiting queue.
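
One possible reading of this ordering is sketched below: higher priority first, then smaller ID. The structure, the function names and the retry weight of 1 are all hypothetical, and a linear scan stands in for the heap to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NRS_RETRY_WEIGHT 1u /* hypothetical weight added per retry */

/* Hypothetical queued request: arrival-ordered ID plus retry priority. */
struct nrs_req {
    uint64_t id;   /* assigned on first arrival, kept across retries */
    uint32_t prio; /* 0 by default, raised on every retry */
};

/* Returns nonzero if a should be served before b. */
static int
nrs_req_before(const struct nrs_req *a, const struct nrs_req *b)
{
    if (a->prio != b->prio)
        return a->prio > b->prio; /* retried RPCs go first */
    return a->id < b->id;         /* otherwise FIFO by ID */
}

/* Called when the same RPC is received again after a reject. */
static void
nrs_req_retried(struct nrs_req *r)
{
    r->prio += NRS_RETRY_WEIGHT;
}

/* Index of the next request to serve (linear scan instead of a heap). */
static size_t
nrs_pick(const struct nrs_req *q, size_t n)
{
    size_t best = 0;

    assert(n > 0);
    for (size_t i = 1; i < n; i++)
        if (nrs_req_before(&q[i], &q[best]))
            best = i;
    return best;
}
```

With no retries the request with the smallest ID is picked, which is plain FIFO; once a later request has been retried, its raised priority moves it ahead of earlier ones.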

Client changes:

RPC retry will be handled on the DAOS client side. DER_BUSY is a retryable error: the client shall reschedule the RPC using the hint, resending it after a random delay between 0 and the hinted timeout.
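
A minimal sketch of the client-side delay choice, assuming the hint is a duration in milliseconds. `retry_delay_ms` is a hypothetical name, and the standard `rand()` is used purely for illustration; a real client would likely use its own random source.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: pick a random resend delay in [0, hint_ms] so that
 * rejected clients do not all retry at the same moment.
 */
static uint32_t
retry_delay_ms(uint32_t hint_ms)
{
    uint64_t delay;

    if (hint_ms == 0)
        return 0; /* no hint: resend immediately */

    delay = (uint64_t)rand() % ((uint64_t)hint_ms + 1);
    assert(delay <= hint_ms);
    return (uint32_t)delay;
}
```

Spreading retries uniformly over the hinted interval keeps a burst of rejected RPCs from arriving back at the server as another burst.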

CaRT changes:

The high 16 bits of cch_dst_tag will be used for the reply hint, and a new flag, CRT_RPC_FLAG_REJECT, will be introduced for interoperability. The server will only return a hint to the client if the REJECT flag is set on the RPC.

Otherwise DER_TIMEOUT will be returned to the client, so that old clients can work with newer servers (retrying without hints).
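
The bit layout and the fallback for old clients can be sketched as below. This assumes cch_dst_tag is a 32-bit field whose low 16 bits still carry the destination tag; the flag value and all helper names are hypothetical stand-ins, not the real CaRT definitions.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_TAG_MASK    0xFFFFu /* low 16 bits: destination tag */
#define SKETCH_HINT_SHIFT  16      /* high 16 bits: reply hint */
#define SKETCH_FLAG_REJECT 0x1u    /* stand-in for CRT_RPC_FLAG_REJECT */

enum sketch_reply { REPLY_BUSY_WITH_HINT, REPLY_TIMEOUT };

/* Store a 16-bit hint in the high half of a 32-bit tag field. */
static uint32_t
tag_set_hint(uint32_t tag, uint16_t hint)
{
    assert((tag & ~SKETCH_TAG_MASK) == 0); /* tag must fit in 16 bits */
    return (tag & SKETCH_TAG_MASK) | ((uint32_t)hint << SKETCH_HINT_SHIFT);
}

static uint16_t
tag_get_hint(uint32_t tag)
{
    return (uint16_t)(tag >> SKETCH_HINT_SHIFT);
}

static uint32_t
tag_get_dst(uint32_t tag)
{
    return tag & SKETCH_TAG_MASK;
}

/* Only clients that set the REJECT flag understand DER_BUSY plus hint. */
static enum sketch_reply
reject_reply(uint32_t rpc_flags)
{
    return (rpc_flags & SKETCH_FLAG_REJECT) ? REPLY_BUSY_WITH_HINT
                                            : REPLY_TIMEOUT;
}
```

Packing the hint into the unused half of an existing header field avoids changing the header size, which is what lets old and new peers keep exchanging RPCs.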

Protocol changes:

To support NRS, we might extend the DAOS RPC format so that requests and replies carry enough information:

struct daos_req_comm_in {
      uuid_t          req_in_pool_id;
      uuid_t          req_in_cont_id;
      uint32_t        req_in_uid;
      uint32_t        req_in_gid;
      uint32_t        req_in_projid;
      uint64_t        req_in_hint; /* for RPC reject */
      uint64_t        req_in_paddings[4];
      crt_phy_addr_t  req_in_addr;
      d_string_t      req_in_jobid;
};

struct daos_req_comm_out {
      uint64_t        req_out_hint;
      uint64_t        req_out_paddings[4];
};

This will introduce interoperability issues for the RPCs involved. To simplify this a bit, we might consider only the object and dtx modules.

v2 input/output structs will be introduced for the RPC formats of these modules. Each module will negotiate the version between client and server, then register the proper RPC format handler.

On the server side, ->dms_get_req_attr will be used to extract the attributes NRS requires from the different modules.