...

When an RPC arrives at the server, it may be placed on a waiting queue if the number of in-flight
RPCs exceeds the limit, and it may be rejected if the waiting queue is full.
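The admission decision described above can be sketched roughly as follows. This is only an illustration: the class name `RpcAdmission`, the parameters `inflight_limit` and `queue_limit`, and the string results are hypothetical and not part of the DAOS code base.

```python
from collections import deque


class RpcAdmission:
    """Toy admission control: run, queue, or reject an incoming RPC."""

    def __init__(self, inflight_limit, queue_limit):
        self.inflight_limit = inflight_limit  # assumed max RPCs being processed
        self.queue_limit = queue_limit        # assumed max RPCs waiting
        self.inflight = 0
        self.waiting = deque()

    def arrive(self, rpc):
        # Process immediately while the in-flight limit is not reached.
        if self.inflight < self.inflight_limit:
            self.inflight += 1
            return "run"
        # Otherwise park the RPC on the waiting queue if there is room.
        if len(self.waiting) < self.queue_limit:
            self.waiting.append(rpc)
            return "queued"
        # The waiting queue is full: reject the RPC.
        return "rejected"

    def complete(self):
        """One in-flight RPC finished; start the oldest waiting RPC, if any."""
        self.inflight -= 1
        if self.waiting:
            self.inflight += 1
            return self.waiting.popleft()
        return None
```

With `inflight_limit=1` and `queue_limit=1`, the first RPC runs, the second is queued, and the third is rejected.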

When an RPC is rejected, a new error, DER_BUSY, is returned to the client together with a hint (calculated from the length of the waiting queue and the current RPC processing speed). The number of RPCs that may be queued takes the following factors into account:

  1. Reserved memory that DAOS might use; this can differ between setups (MD-on-SSD or PMDK).

  2. Current xstream I/O processing speed. It may change dynamically depending on space pressure, so each xstream may have a different limit. A single xstream must not queue too many RPCs, e.g. no more than 1/2 of the total RPC queue limit, for fairness.

  3. An RPC might also be rejected even when the waiting queue is not full, because of the shorter timeout of an individual RPC.
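As a rough sketch of how the hint and the per-xstream limit might fit together: the function names, the use of seconds as the unit, and the rule "reject when the estimated wait is at least the RPC timeout" are assumptions made for illustration, not something this page specifies. The memory factor (item 1) is left out.

```python
def xstream_queue_cap(total_queue_limit):
    """Per-xstream cap: at most half of the total queue limit (item 2)."""
    return total_queue_limit // 2


def estimate_wait(queued, rpcs_per_sec):
    """Estimated seconds before a newly queued RPC would be served."""
    if rpcs_per_sec <= 0:
        return float("inf")  # no progress at all, wait is unbounded
    return queued / rpcs_per_sec


def admit(queued_on_xstream, total_queue_limit, rpcs_per_sec, rpc_timeout):
    """Return (accepted, hint_seconds); the hint is 0.0 when accepted."""
    wait = estimate_wait(queued_on_xstream, rpcs_per_sec)
    # Fairness: one xstream may not hold more than its share of the queue.
    if queued_on_xstream >= xstream_queue_cap(total_queue_limit):
        return False, wait
    # Assumed reading of item 3: the RPC would time out before being served.
    if wait >= rpc_timeout:
        return False, wait
    return True, 0.0
```

Here the hint is simply the estimated wait, i.e. queue length divided by processing speed; a real implementation may weight or clamp it differently.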

To avoid tail latency, a separate priority heap is introduced for retried RPCs. By default, an RPC's priority is 0, and RPCs are sorted by enqueue time (which is still FIFO) when there are no retries. Each time an RPC is retried, its priority is increased by some weight so that no single RPC is retried forever. Whenever an RPC arrives at the server, it is assigned a sorted ID; a retried RPC keeps the same ID, and the server always picks the smallest ID from the waiting queue.
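The "smallest ID first" behaviour can be illustrated with a small heap. This is a minimal sketch, assuming IDs are handed out from a monotonically increasing counter and that a retried RPC presents the ID it was given on its first arrival; the class and method names are hypothetical.

```python
import heapq
import itertools


class WaitingQueue:
    """Min-heap of waiting RPCs keyed by their arrival ID."""

    def __init__(self):
        self._heap = []
        self._next_id = itertools.count(1)  # assumed ID source
        self._seq = itertools.count()       # tie-breaker for equal IDs

    def new_id(self):
        """Assign an ID to an RPC seen for the first time."""
        return next(self._next_id)

    def enqueue(self, rpc_id, rpc):
        """Queue a new or retried RPC under its (possibly reused) ID."""
        heapq.heappush(self._heap, (rpc_id, next(self._seq), rpc))

    def pick(self):
        """Return the waiting RPC with the smallest ID."""
        rpc_id, _, rpc = heapq.heappop(self._heap)
        return rpc_id, rpc
```

If RPC A (ID 1) was rejected earlier and is resent after B (ID 2) and C (ID 3) were queued, A is still picked first because it kept ID 1. Without retries, IDs follow arrival order, so the queue behaves as FIFO.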

Client changes:

RPC retry is handled on the DAOS client side. DER_BUSY is a retryable error: the client shall reschedule the RPC using the hint (after a random delay between 0 and the hint) and then resend it.
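A minimal sketch of the client-side retry loop follows. The `send` callback, the `(status, hint)` return shape, the hint being in seconds, and the `max_attempts` cap are all assumptions for illustration; the page itself does not define a retry limit.

```python
import random
import time

DER_BUSY = "DER_BUSY"  # placeholder for the real error code


def retry_delay(hint, rng=random):
    """Pick a random delay in [0, hint] to spread out resends."""
    return rng.uniform(0.0, hint)


def send_with_retry(send, max_attempts=5, sleep=time.sleep, rng=random):
    """Call send() until it stops returning DER_BUSY or attempts run out.

    send() is assumed to return a (status, hint) pair.
    """
    for _ in range(max_attempts):
        status, hint = send()
        if status != DER_BUSY:
            return status
        # Server is busy: wait a random fraction of the hint, then resend.
        sleep(retry_delay(hint, rng))
    return DER_BUSY
```

Randomizing the delay within the hint keeps clients that were rejected at the same moment from all resending at once.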

...