Server Side Change:
The server-side change will be implemented in the DAOS engine sched layer, since each xstream may carry a different workload, and its RPC processing speed may also vary with space pressure.
Two limits can prevent an RPC from being processed:
- The number of ULTs in flight.
- The number of RPCs in the per-pool waiting queue, and the global number of RPCs across all waiting queues.
When an RPC arrives at the server, it may be put on the waiting queue if the number of in-flight ULTs exceeds the limit, and it may be rejected if the waiting queue is full.
If an RPC is rejected, a new error, DER_BUSY, will be returned to the client together with a hint calculated from the number of RPCs in the waiting queue and the current RPC processing speed.
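The admission decision described above can be sketched as follows. This is only an illustration under stated assumptions: the struct, the field names and the function names are hypothetical, and the hint is assumed to be the estimated number of seconds needed to drain the queue, which the note does not specify.

```c
#include <stdint.h>

/* Hypothetical outcome of admitting one RPC; names are illustrative only. */
enum admit_result { ADMIT_RUN, ADMIT_QUEUED, ADMIT_REJECTED };

/* Hypothetical per-xstream counters and limits. */
struct xs_state {
	uint32_t inflight;     /* ULTs currently in flight */
	uint32_t inflight_max; /* in-flight ULT limit */
	uint32_t queued;       /* RPCs in the waiting queue */
	uint32_t queued_max;   /* waiting queue limit */
	uint32_t rpcs_per_sec; /* current RPC processing speed */
};

/*
 * Assumed hint: queue length divided by processing speed, rounded up.
 * A speed of 0 is treated as 1 to avoid dividing by zero.
 */
static uint32_t busy_hint(const struct xs_state *xs)
{
	uint64_t speed = xs->rpcs_per_sec ? xs->rpcs_per_sec : 1;

	return (uint32_t)(((uint64_t)xs->queued + speed - 1) / speed);
}

/* Run the RPC, queue it, or reject it (the caller would reply DER_BUSY). */
static enum admit_result admit_rpc(struct xs_state *xs, uint32_t *hint)
{
	*hint = 0;
	if (xs->inflight < xs->inflight_max) {
		xs->inflight++;
		return ADMIT_RUN;
	}
	if (xs->queued < xs->queued_max) {
		xs->queued++;
		return ADMIT_QUEUED;
	}
	*hint = busy_hint(xs);
	return ADMIT_REJECTED;
}
```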
The limit on the number of queued RPCs will take the following factors into account:
- The reserved memory that DAOS may use, which can differ between setups (MD-on-SSD or PMDK).
- The current I/O processing speed of the xstream. It may change dynamically with space pressure, so each xstream may have a different limit.
- Fairness: a single xstream must not queue too many RPCs, e.g. no more than 1/2 of the total RPC queue limit.
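One possible way to combine these three factors is to take the smallest of three bounds. The sketch below is an assumption, not the planned formula: the function name and every parameter (memory budget for queued RPCs, memory cost per RPC, maximum acceptable wait) are hypothetical inputs chosen for illustration.

```c
#include <stdint.h>

/*
 * Hypothetical per-xstream queue limit: the minimum of
 *  - how many queued RPCs fit in the memory budget,
 *  - how many RPCs the xstream can drain within max_wait_sec,
 *  - half of the total RPC queue limit (fairness cap from the note).
 */
static uint32_t xs_queue_limit(uint64_t mem_budget_bytes, uint32_t rpc_mem_bytes,
			       uint32_t rpcs_per_sec, uint32_t max_wait_sec,
			       uint32_t total_queue_limit)
{
	uint64_t by_mem = rpc_mem_bytes ? mem_budget_bytes / rpc_mem_bytes : 0;
	uint64_t by_speed = (uint64_t)rpcs_per_sec * max_wait_sec;
	uint64_t by_fair = total_queue_limit / 2;
	uint64_t limit = by_mem;

	if (by_speed < limit)
		limit = by_speed;
	if (by_fair < limit)
		limit = by_fair;
	/* limit <= total_queue_limit / 2, so it always fits in 32 bits */
	return (uint32_t)limit;
}
```

With a 1 MiB budget, 1 KiB per RPC, 100 RPCs/s, a 5 s wait and a total limit of 600, the fairness cap (300) is the binding bound; if the speed drops to 10 RPCs/s, the speed bound (50) takes over.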
An RPC may also be rejected even when the waiting queue is not full, if that individual RPC carries a smaller timeout.
To limit tail latency, a priority heap is introduced. By default an RPC's priority is 0 and RPCs are sorted by enqueue time, so the order is still FIFO as long as no RPC has been retried. Each time an RPC is retried, its priority is increased by some weight, so that no single RPC is retried forever.
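The ordering rule can be illustrated with a comparator. This is a minimal sketch that assumes a larger priority value is served first and uses a placeholder weight of 1; the struct, the names and the weight are hypothetical, and a real heap would use the same comparison inside its sift operations.

```c
#include <stdint.h>
#include <stdlib.h>

#define RETRY_WEIGHT 1 /* placeholder; the note only says "some weight" */

/* Hypothetical queue entry. */
struct queued_rpc {
	uint32_t priority;     /* 0 by default, raised on every retry */
	uint64_t enqueue_time; /* tie-breaker: earlier goes first (FIFO) */
	uint32_t id;           /* only used to identify entries in examples */
};

/* Higher priority first; equal priorities fall back to enqueue time. */
static int rpc_cmp(const void *a, const void *b)
{
	const struct queued_rpc *x = a;
	const struct queued_rpc *y = b;

	if (x->priority != y->priority)
		return x->priority > y->priority ? -1 : 1;
	if (x->enqueue_time != y->enqueue_time)
		return x->enqueue_time < y->enqueue_time ? -1 : 1;
	return 0;
}

/* Called when an RPC comes back as a retry. */
static void rpc_bump_priority(struct queued_rpc *rpc)
{
	rpc->priority += RETRY_WEIGHT;
}
```

Sorting an array with `qsort(rpcs, n, sizeof(rpcs[0]), rpc_cmp)` yields the order in which such a heap would dequeue: retried RPCs first, then the rest in arrival order.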
Client changes:
RPC retry will be handled on the DAOS client side. DER_BUSY is a retryable error: the client shall reschedule the RPC using the hint (a random delay between 0 and the hint) and then resend it.
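The randomized delay could look like the sketch below. It assumes the hint fits in 16 bits (as the CaRT section suggests) and leaves the time unit open, since the note does not define it; the function name is hypothetical and `rand()` stands in for whatever random source the client really uses.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Pick a random retry delay in [0, hint], in the same unit as the hint.
 * The modulo introduces a slight bias, which is acceptable for jitter.
 */
static uint32_t retry_delay(uint16_t hint)
{
	return (uint32_t)rand() % ((uint32_t)hint + 1u);
}
```

Spreading the retries over the whole interval keeps clients that were rejected at the same moment from all resending at once.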
CaRT change:
The high 16 bits of cch_dst_tag will be used for the reply hint, and a new flag, CRT_RPC_FLAG_REJECT, will be introduced for interoperability. The server will only return a hint to the client if the REJECT flag is set on the RPC.
Otherwise DER_TIMEOUT will be returned, so that old clients can still work with newer servers (retrying without hints).
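The encoding and the interoperability rule can be sketched as follows. The sketch assumes that cch_dst_tag is a 32-bit field whose low 16 bits keep carrying the tag. The flag bit and the error values are placeholders defined locally for illustration, not the real CRT_RPC_FLAG_REJECT, DER_BUSY or DER_TIMEOUT values, and all function names are hypothetical.

```c
#include <stdint.h>

#define SKETCH_FLAG_REJECT (1u << 0) /* placeholder for CRT_RPC_FLAG_REJECT */
#define SKETCH_DER_BUSY    (-1)      /* placeholder error values */
#define SKETCH_DER_TIMEOUT (-2)

/* Store the hint in the high 16 bits, preserving the low 16 bits. */
static uint32_t tag_set_hint(uint32_t dst_tag, uint16_t hint)
{
	return (dst_tag & 0xFFFFu) | ((uint32_t)hint << 16);
}

/* Extract the hint from the high 16 bits. */
static uint16_t tag_get_hint(uint32_t dst_tag)
{
	return (uint16_t)(dst_tag >> 16);
}

/*
 * Server side of a rejection: clients that set the REJECT flag get the
 * busy error plus a hint; older clients get a timeout and no hint.
 */
static int reject_reply(uint32_t rpc_flags, uint32_t *dst_tag, uint16_t hint)
{
	if (rpc_flags & SKETCH_FLAG_REJECT) {
		*dst_tag = tag_set_hint(*dst_tag, hint);
		return SKETCH_DER_BUSY;
	}
	return SKETCH_DER_TIMEOUT;
}
```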