DAOS Tools

Proposal for DAOS tools consolidation.


Tools | dmg | daos | dcp
Target | Administrator | Users | Users
API | Control plane API (Go) | Data plane API (C) | Data plane API (C)
Authentication | Certificate | daos_agent | daos_agent
Lustre Equivalent | lctl/mkfs/mount/IML (partially) | lfs | pcp

Functionality

dmg:
  • Storage provisioning
  • Burn-in
  • Firmware update
  • Data plane mgmt & monitoring
  • Configure/monitor scrubbing
  • Pool mgmt
  • Telemetry

daos:
  • Pool query
  • Container mgmt
  • Unified namespace mgmt
  • Container user attributes
  • Snapshots
  • Object debugging
  • POSIX container configuration

dcp:
  • Parallel copy of POSIX containers
  • HDF5-level copy
  • Container parking


Syntax: dmg [resource] [action] [args]
        daos [resource] [action] [args]


Proposal: High Level Characteristics

Proposal: Characteristics: Resource Names Summary

Specifying DAOS system name (formerly known as server group):

  • --sys-name=SYSNAME ; or --sys=SYSNAME (example: --sys=daos_server)

Specifying storage server ranks (e.g., for pool create/add-storage/del-storage, and system drain/reintegrate/kill/exclude):

  • --ranks=SRVRANKLIST (example: --ranks=0,1,2)

Specifying added or removed pool service replica ranks (for pool add-svc/del-svc):

  • --ranks=SRVRANKLIST (example: --ranks=0,1,2)

Specifying number of pool service replicas (for pool create):

  • --nsvc=NUM

Specifying pool service replica ranks (legacy: currently required, but eventually will not be needed):

  • --svc=SRVRANKLIST (example: --svc=1,2,3)

Specifying a fault domain / entire rack of servers (e.g., for pool create/add-storage/del-storage and system drain/reintegrate/kill/exclude):

  • --fdomains=FDRANKLIST (often a single item, but keeping a list for flexibility)
  • --fd=FDRANKLIST (shorter option name for convenience)

Specifying targets (e.g., for system drain/reintegrate):

  • List of rank:target pairs (--targets=SRVRANK:TGTRANK LIST ; or --tgt=SRVRANK:TGTRANK LIST)
    • Server 0 targets 0 and 1 (0:0,0:1)
    • Server 1 targets 2 and 4 (1:2,1:4)
    • Server 2 targets 0 and 1 (2:0,2:1)
    • Whole list all together: --tgt=0:0,0:1,1:2,1:4,2:0,2:1
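
A minimal sketch of how a tool might parse the rank:target pair list described above. The function names parse_targets and group_by_rank are hypothetical, and the strict rejection of malformed items is an assumption; the proposal does not define error handling.

```python
def parse_targets(value):
    """Parse 'SRVRANK:TGTRANK,...' into a list of (rank, target) tuples."""
    pairs = []
    for item in value.split(","):
        rank_str, sep, tgt_str = item.strip().partition(":")
        # Assumption: both halves must be non-negative decimal integers.
        if not sep or not rank_str.isdecimal() or not tgt_str.isdecimal():
            raise ValueError(f"expected SRVRANK:TGTRANK, got {item!r}")
        pairs.append((int(rank_str), int(tgt_str)))
    return pairs


def group_by_rank(pairs):
    """Group targets under their server rank, preserving input order."""
    grouped = {}
    for rank, tgt in pairs:
        grouped.setdefault(rank, []).append(tgt)
    return grouped


pairs = parse_targets("0:0,0:1,1:2,1:4,2:0,2:1")
by_rank = group_by_rank(pairs)
```

With the example list from this section, by_rank maps server 0 to targets 0 and 1, server 1 to targets 2 and 4, and server 2 to targets 0 and 1.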

Specifying container snapshots:

  • named snapshot: --snap=NAME
  • snapshot identified by a single epoch number: --epc=NUM
  • snapshots that fall within a specified range of epoch numbers: --epcrange=M-N
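
A minimal sketch of validating the three snapshot selectors for commands that take one (such as destroy-snap or rollback). The names parse_epoch_range and select_snapshot are hypothetical. Requiring exactly one selector and requiring M <= N are assumptions made for illustration; whether the range is inclusive of its end points is still an open question in this proposal.

```python
def parse_epoch_range(value):
    """Parse 'M-N' into a (begin, end) tuple of epoch numbers."""
    begin_str, sep, end_str = value.partition("-")
    if not sep or not begin_str.isdecimal() or not end_str.isdecimal():
        raise ValueError(f"expected M-N, got {value!r}")
    begin, end = int(begin_str), int(end_str)
    if begin > end:
        raise ValueError(f"range begin {begin} is greater than end {end}")
    return begin, end


def select_snapshot(snap=None, epc=None, epcrange=None):
    """Return (kind, value) for whichever selector was given."""
    given = [(kind, val)
             for kind, val in (("snap", snap), ("epc", epc), ("epcrange", epcrange))
             if val is not None]
    # Assumption: the selectors are mutually exclusive.
    if len(given) != 1:
        raise ValueError("specify exactly one of --snap, --epc, --epcrange")
    kind, val = given[0]
    if kind == "epc":
        return kind, int(val)
    if kind == "epcrange":
        return kind, parse_epoch_range(val)
    return kind, val
```

For example, select_snapshot(epcrange="5-9") yields the kind "epcrange" with the tuple (5, 9), while passing both snap and epc raises an error.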


Proposal: Highlighted Operations and Tool Command Lines

Proposal: Highlighted Operations: System

  • List all pools in a DAOS system
    • dmg system list-pools

Proposal: Highlighted Operations: Pool

  • Create (dmg pool create)
    • by server rank list
      • Specify only SCM storage
        • dmg pool create --sys=SYSNAME --uid=UID --gid=GID --mode=MODE --nsvc=NREP --ranks=SRVRANKLIST --scm-size=SIZE
      • Specify SCM + NVMe storage
        • dmg pool create --sys=SYSNAME --uid=UID --gid=GID --mode=MODE --nsvc=NREP --ranks=SRVRANKLIST --scm-size=SIZE --nvme-size=SIZE
    • by fault domain (rack) rank list (using --fd shorthand for --fdomains)
      • dmg pool create --sys=SYSNAME --uid=UID --gid=GID --mode=MODE --nsvc=NREP --fd=FDRANKLIST --scm-size=SIZE --nvme-size=SIZE
  • Add pool service replicas (dmg pool add-svc)
    • Usage: dmg pool add-svc --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST --ranks=MORESRVRANKSADDLIST
  • Remove pool service replicas (dmg pool del-svc)
    • Usage: dmg pool del-svc --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST --ranks=OLDSRVRANKSDELLIST
  • Destroy pool in a DAOS system (dmg pool destroy)
    • Usage: dmg pool destroy --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST [--force]
  • List all containers in a pool (list-containers, or the shorter equivalent list-cont; listed under both dmg pool and daos pool in the tables below, shown here with daos)
    • Usage: daos pool list-containers --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST
    • Usage: daos pool list-cont --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST
  • Add storage (dmg pool add-storage, aka extend)
    • Rack (all servers in the rack, and all targets in all of the servers; using --fd shorthand for --fdomains)
      • dmg pool add-storage --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST --fd=FDRANKLIST
    • Servers (all targets on the servers)
      • dmg pool add-storage --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST --ranks=SRVRANKLIST
  • Remove storage (dmg pool del-storage, aka exclude)
    • Rack (all servers in the rack, and all targets in all of the servers; using --fd shorthand for --fdomains)
      • dmg pool del-storage --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST --fd=FDRANKLIST
    • Servers (all targets on the servers)
      • dmg pool del-storage --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST --ranks=SRVRANKLIST

Proposal: Highlighted Operations: Container

Notes:

  • The command for all container operations is daos container (shown in the examples below). As a convenience, the shorter equivalent daos cont may be used.
  • The daos container list-objects command (shown in the examples below) has a shorter equivalent, daos container list-obj.
  • The resource for object commands is daos object. As a convenience, the shorter equivalent daos obj may be used.
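
A minimal sketch of how the "[tool] [resource] [action] [args]" syntax and the shorthand names in this proposal could be normalized before dispatch. The function name normalize and the alias tables are hypothetical. Treating the longer spelling as the canonical one (for example --sys-name rather than --sys) is an assumption for illustration only; the proposal lists both spellings without ranking them.

```python
# Shorthand -> long form, taken from the names listed in this proposal.
RESOURCE_ALIASES = {"cont": "container", "obj": "object"}
ACTION_ALIASES = {"list-cont": "list-containers", "list-obj": "list-objects"}
OPTION_ALIASES = {"--fd": "--fdomains", "--tgt": "--targets", "--sys": "--sys-name"}


def normalize(argv):
    """Split [tool, resource, action, *args] and expand shorthand names."""
    tool, resource, action, *args = argv
    resource = RESOURCE_ALIASES.get(resource, resource)
    action = ACTION_ALIASES.get(action, action)
    options = {}
    for arg in args:
        name, sep, value = arg.partition("=")
        name = OPTION_ALIASES.get(name, name)
        # Options without '=' (such as --force) are stored as flags.
        options[name] = value if sep else True
    return tool, resource, action, options


result = normalize(["daos", "cont", "list-obj", "--pool=UUID", "--sys=daos_server"])
```

Here result names the resource as "container" and the action as "list-objects", with the options keyed by their long spellings.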


Container Create - by UUID and/or unified namespace path

  • Create a container in a pool (daos container create)
    • User-specified container UUID
      • Usage: daos container create --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST --cont=UUID
    • No container UUID specified (implementation generates a random UUID as a convenience)
      • Usage: daos container create --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST
    • User-specified container UUID and user-specified unified namespace path to link the container to
      • Usage: daos container create --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST --cont=UUID --path=/path/to/create_and_link --type=POSIX|HDF5 --oclass=tiny|small|large|R2|R2S|repl_max --chunk_size=BYTES
        • path is a directory for type=POSIX, and is a file for type=HDF5
        • oclass is DAOS object class
        • chunk_size is the chunk_size in bytes to use with files created in the container.
    • No container UUID specified, and user-specified unified namespace path to link the container to (implementation will generate a random UUID)
      • Usage: daos container create --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST --path=/path/to/create_and_link --type=POSIX|HDF5 --oclass=tiny|small|large|R2|R2S|repl_max --chunk_size=BYTES

Container "Lookup" (All Other Commands) - by UUID or unified namespace path

There are two variants of these commands: (1) the user provides the pool and container UUIDs; (2) the user provides only the unified namespace path to which the container is linked. In the second variant, the implementation resolves the pool and container UUIDs from the extended filesystem attributes of the specified entity in the path, so the user provides neither the pool UUID nor the container UUID.

  • Destroy a container in a pool (daos container destroy)
    • Destroy by container UUID
      • Usage: daos container destroy --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST --cont=UUID
    • Destroy by path that the container is linked to
      • Usage: daos container destroy --sys=SYSNAME --svc=SRVRANKLIST --path=/path/to/destroy_cont_and_unlink
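
A minimal sketch of the option check implied by the two lookup variants: either both UUIDs, or a path alone. The function name resolve_container_ref and the dictionary-based options are hypothetical. The actual path-to-UUID resolution via extended attributes is not shown, since the proposal does not specify it.

```python
def resolve_container_ref(options):
    """Classify a lookup request as UUID-based or path-based.

    `options` maps option names (without leading dashes) to values.
    """
    has_path = "path" in options
    has_pool = "pool" in options
    has_cont = "cont" in options
    if has_path:
        # Variant 2: the path alone identifies the pool and the container.
        if has_pool or has_cont:
            raise ValueError("--path cannot be combined with --pool or --cont")
        return ("path", options["path"])
    if has_pool and has_cont:
        # Variant 1: the user supplies both UUIDs.
        return ("uuid", options["pool"], options["cont"])
    raise ValueError("specify either --pool and --cont, or --path")
```

This check applies to lookup commands only; container create accepts --pool together with --path, as shown in the create usage lines above.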

The remaining container commands use the --cont=UUID form (the --path= option is available, but is not shown):

  • List all objects in a container (daos container list-objects)
    • Usage: daos container list-objects --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST --cont=UUID
  • Create a snapshot on container based on the latest committed epoch
    • Unnamed
      • Usage: daos container create-snap --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST --cont=UUID
    • Named
      • Usage: daos container create-snap --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST --cont=UUID --snap=mysnapname
  • List all snapshots in a container
    • Usage: daos container list-snaps --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST --cont=UUID
  • Destroy container snapshot(s)
    • Single epoch snapshot
      • Usage: daos container destroy-snap --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST --cont=UUID --epc=B
    • Multiple snapshots within an epoch range
      • Usage: daos container destroy-snap --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST --cont=UUID --epcrange=B-D
  • Rollback container to specified snapshot
    • Rollback to a named snapshot
      • Usage: daos container rollback --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST --cont=UUID --snap=mysnapname
    • Rollback to a snapshot at an epoch number
      • Usage: daos container rollback --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST --cont=UUID --epc=A


Proposal: daosctl test considerations


  • Change --server-group to --sys=
  • Change --size to --scm-size?
  • Change --replicas=NUM_METADATA_REPLICAS to --nsvc=
  • Change --servers=SRVRANKLIST to --svc= (for pool replica ranks)
  • Change exclude-target --targets= to take a list of pairs (instead of current approach that makes pairs from 2 lists: --rank=ra,rb,rc and --targets=ta,tb,tc)
  • Change --server= to --rank=
  • Change --rank= to --ranks= (or svc= ???) for kill-leader - (what will we do for dmg kill? Probably --ranks=. Choose same)
  • Change --server=SERVER-LIST to  --ranks= (or svc= ???) for kill-server (what will we do for dmg kill? Choose same)
  • Change -c-uuid to --cont=CUUID
  • Change -p-uuid to --pool=PUUID
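
A minimal sketch of the exclude-target change listed above: turning the current two parallel lists (--rank=ra,rb,rc and --targets=ta,tb,tc) into the proposed single list of rank:target pairs. The function name pairs_from_legacy_lists is hypothetical, and rejecting lists of unequal length is an assumption.

```python
def pairs_from_legacy_lists(rank_list, target_list):
    """Combine 'ra,rb,rc' and 'ta,tb,tc' into 'ra:ta,rb:tb,rc:tc'."""
    ranks = rank_list.split(",")
    targets = target_list.split(",")
    # Assumption: the legacy lists must line up one-to-one.
    if len(ranks) != len(targets):
        raise ValueError("rank and target lists must have the same length")
    return ",".join(f"{rank}:{tgt}" for rank, tgt in zip(ranks, targets))


new_value = pairs_from_legacy_lists("0,0,1", "0,1,2")
```

The resulting string can be passed as the value of the proposed --targets= (or --tgt=) option.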


Proposal: Commands, Resources, Operations and Arguments


Columns: Tool | Component | Component Args | Operation | Operation Args | Description, Notes / Issues | API | Implemented?

dmg storage

  • scan
    • Description: discover all storage available on the nodes, applying filters from the YAML file
    • Implemented: Y
  • query
    • Description: report status & stats about storage
  • query smd
    • Operation args: --devices, --pools
    • Description: --devices queries the SMD device table; --pools queries the SMD pool table
    • Implemented: Y
  • query nvme-health
    • Operation args: --hostlist="HOST:PORT"
    • Description: query raw SPDK NVMe device health stats. Returns all stats for all NVMe SSDs on all hosts in the hostlist.
    • Implemented: Y
  • query blobstore-health
    • Operation args: --devuuid="DEVICE_UUID", --tgtid="VOS_TGT_ID"
    • Description: query BIO in-memory health data. Returns all BIO device health data, I/O error stats, and checksum error stats for the given device UUID or VOS target ID.
    • Implemented: Y
  • query device-state
    • Operation args: --devuuid="DEVICE_UUID"
    • Description: query the current state of the given device UUID as stored in SMD (i.e., NORMAL or FAULTY)
    • Implemented: Y
  • set-faulty
    • Operation args: --devuuid="DEVICE_UUID"
    • Description: allow the admin/user to manually set the state of a given device to FAULTY (this triggers the faulty device reaction callbacks)
    • Implemented: Y
  • prep
    • Description: device-specific configuration that may require a reboot, e.g., setting up AEP DIMMs in interleaved mode
    • Implemented: Y
  • burnin
    • Description: run fio against the storage devices to verify that they operate well and to validate performance
  • format
    • Description: reset content of NVMe SSDs, format SCM with ext4, mount SCM, and start the DAOS service (io_server)
    • Implemented: Y
  • fwupdate
    • Description: firmware update

dmg network

  • scan *
    • Description: list discovered network interfaces
    • Note (*): suggestion: report which interfaces and OFI providers would be used with the discovered interfaces
    • Implemented: Y
  • query *
    • Description: report status & stats about network interfaces
    • Note (*): suggestion: perform a local test to indicate in advance whether an OFI runtime error is going to occur with the interface, for example as seen with daos_server: "na_ofi_getinfo(): fi_getinfo failed, rc=-61(No data available)". In that case, the error came from a VM build that didn't have PSM2 devel.

dmg system

Component args (all operations): --sys=SYSNAME

  • query
    • Description: report service status on all or a subset of the servers
    • Implemented: Y
  • list-pools
    • Description: list all pools created (do we want an alias for this as "pool list"?)
    • Implemented: Y
  • list-ranks
    • Description: list all DAOS system server ranks in the specified system ("query" currently lists all system ranks, do we need list-ranks?)
  • stat *
    • Description: report various stats about the service
    • Note (*): alternative command name: get-statistics
  • log *
    • Description: report service logs
    • Note (*): alternative: get-log
  • debug *
    • Description: change debug mask
    • Note (*): alternative: set-debug
  • drain
    • Operation args: --fdomains=FDRANKLIST, --fd=FDRANKLIST *, --ranks=SRVRANKLIST, --targets=SRVRANK:TGTRANK LIST, --tgt=SRVRANK:TGTRANK LIST
    • Description: drain a list of racks, list of servers, or list of targets in preparation for maintenance
    • Note (*): use --fd= as a convenience (shorter than --fdomains)
    • Note: this one really does require the ability to specify at the target or SSD level. Use case: one of the SSDs in a server is about to fail; hot swap it after a drain and before a reintegrate.
  • reintegrate
    • Operation args: --fdomains=FDRANKLIST, --fd=FDRANKLIST, --ranks=SRVRANKLIST, --targets=SRVRANK:TGTRANK LIST, --tgt=SRVRANK:TGTRANK LIST
    • Description: reintegrate a drained component
  • stop
    • Description: full shutdown of the DAOS service
    • Implemented: Y
  • start
    • Description: restart service after full shutdown
    • Implemented: Y
  • kill
    • Operation args: --fdomains=FDRANKLIST, --fd=FDRANKLIST, --ranks=SRVRANKLIST
    • Description: abrupt shutdown of a particular server (really: set of servers, or whole fault domains/racks of servers)
  • exclude *
    • Operation args: --fdomains=FDRANKLIST, --fd=FDRANKLIST, --ranks=SRVRANKLIST
    • Description: remove node from DAOS system (really: set of servers, or whole fault domains/racks of servers)
    • Note (*): alternative command names: del-nodes, del-servers?


dmg scrub

  • start
    • Description: start background checksum scrubbing process (or resume after prior stop)
  • stop
    • Description: stop scrubbing process
  • query
    • Description: report status of background checksum scrubbing process (e.g., number of corruptions found, percentage of storage scanned so far)


dmg pool

Component args (all operations except create): --pool=UUID, --sys=SYSNAME, --svc=SRVRANKLIST *

Note (*): applies to all pool commands. Given a pool UUID and DAOS system name, the implementation is eventually expected to look up the existing pool service replica SRVRANKLIST (i.e., removing the need for --svc=).

  • query
    • Description: report pool status
    • API: daos_pool_query()
    • Implemented: Y
  • stat *
    • Description: get pool statistics
    • Note (*): alternative command name: get-statistics
    • API: ?
  • get-prop *
    • Description: get pool properties
    • Note (*): alternative: prop (but I like having the commands be "verbs")
    • API: ?
  • set-prop
    • Description: TBD
    • Implemented: Y
  • get-acl *, overwrite-acl, update-acl, delete-acl
    • Description: get/set/delete pool access control
    • API: ?
    • Implemented: Y
  • get-attr, set-attr, del-attr, list-attrs
    • Operation args: --attr=ATTRNAME (get, del), --value=VALUESTR (set); no arguments for list-attrs
    • Description: get / set user attributes
    • API: daos_pool_attr_get(), daos_pool_attr_set(), daos_pool_attr_list()
  • list-containers, list-cont
    • Description: list all containers in the pool
  • create
    • Component args: N/A
    • Operation args: --user=USERNAME@, --group=GROUPNAME@, --mode=MODE, --nsvc=NREP, --sys=SYSNAME, --ranks=SRVRANKLIST *, --scm-size=SIZE, --nvme-size=SIZE, --fdomains=FDRANKLIST, --fd=FDRANKLIST, --acl-file=FILE
    • Note (*): change existing dmg --target= to --ranks=
    • API: daos_pool_create()
    • Implemented: Y
  • destroy
    • Operation args: --force
    • API: daos_pool_destroy()
    • Implemented: Y
  • add-storage *
    • Operation args: --fdomains=FDRANKLIST, --fd=FDRANKLIST, --ranks=SRVRANKLIST
    • Description: add a storage fault domain (rack) or list of servers to an existing pool
    • Note (*): formerly named "extend"
    • API: daos_pool_extend()
  • del-storage *
    • Operation args: --fdomains=FDRANKLIST, --fd=FDRANKLIST, --ranks=SRVRANKLIST
    • Description: remove a fault domain (rack) or list of servers from a pool
    • Note (*): formerly named "exclude"
    • API: daos_pool_exclude()
  • add-svc
    • Operation args: --ranks=MORESRVRANKLIST
    • Description: add a pool service replica. Use --svc= to specify the current list of metadata service server ranks, and --ranks= to specify the new ranks to add to the set.
    • API: ?
  • del-svc
    • Operation args: --ranks=OLDSRVRANKLIST
    • Description: remove a pool service replica
    • API: ?
  • rebuild
    • Description: manage rebuild for a pool
    • API: ?
  • rebalance
    • Description: trigger rebalance after add-storage (extend) by racks/servers
    • API: ?
  • resize
    • Operation args: --scm-size=SIZE, --nvme-size=SIZE
    • Description: extend the size of a pool's existing targets
  • evict
    • Description: evict all active pool connections
    • API: daos_pool_evict()
  • lurk *
    • Description: dump activity on the pool
    • Note (*): alternative: get-log
    • API: ?
Columns: Tool | Component | Operation | Arguments | Description and Notes / Issues | API

daos pool *

Component args (all operations): --pool=UUID, --sys=SYSNAME, --svc=SRVRANKLIST

Note (*): daos pool is mostly "read-only", versus "dmg pool" used by the administrator. So the set-prop command is not available here.

  • query, stat, get-prop
    • Description (query): report pool status (rebuild/rebalancing status, ...)
    • Description (stat): report various stats about the pool (size, usage, number of containers, ...), same as dmg pool stat
    • Description (get-prop): show pool properties
    • Implemented: Y (query, get-prop). Statistics support for "stat" is missing, but it may stay an admin/dmg thing?
  • get-attr, set-attr, del-attr, list-attrs
    • Operation args: --attr=ATTRNAME (get, del), --value=VALUESTR (set); no arguments for list-attrs
    • Implemented: Y
  • list-containers, list-cont
    • Description: list all containers in the specified pool
    • Implemented: Y

daos container (or daos cont)

Component args (all operations except create):
  • Pool related (same as daos pool): --pool=UUID, --sys=SYSNAME, --svc=SRVRANKLIST
  • Container (choose one): --cont=UUID OR --path=FILESYSDIR

  • query *
    • Description: show container status. Query by container UUID with --cont, or query by unified namespace path (directory or file) with --path=FSENTITY (like current duns resolve_path). Note: do not specify --pool= when querying by path.
    • Note (*): alternative: get-status
    • API: daos_cont_query()
    • Implemented: Y
  • stat
    • Description: show various container statistics
    • Note: alternative command name: get-statistics
    • API: ?
    • Implemented: statistics/metrics support for "stat" is missing
  • get-attr, set-attr, del-attr, list-attrs
    • Operation args: --attr=ATTRNAME (get, del), --value=VALUESTR (set); no arguments for list-attrs
    • Description: set/retrieve user attributes
  • get-prop
    • Note: is there such a thing as getting container properties (like pool properties)?
    • API: ?
    • Implemented: Y
  • set-prop
    • Description: TBD
    • Implemented: Y (currently, only the "label" property is supported)
  • list-objects, list-obj
    • Description: enumerate all objects in the specified container
    • Implemented: Y
  • create
    • Component args: pool related, same as above; container related: (--cont=UUID) OR --path=FSENTITY, with --type=CONTTYPE, --oclass=OBJCLASS, --chunk_size=NBYTES
    • Operation args: the implementation generates the CUUID if it is not specified; --path, --type, --oclass, and --chunk_size are also optional
    • Description: create a container with specific properties (including type, object class, and chunk_size if provided) and link it with the path (if provided; similar to duns link_path, which creates a POSIX container with DFS-specific parameters)
    • CONTTYPE: posix, hdf5
    • OBJCLASS: tiny, small, large, R2S, R2, repl_max
    • API: daos_cont_create()
    • Implemented: Y
  • destroy
    • Component args: same as query above
    • Operation args: --force *
    • Description: destroy a container based on UUID or path (unlink the path as well if provided)
    • Note (*): current dcont destroy does not have the --force option
    • API: daos_cont_destroy()
    • Implemented: Y
  • create-snap, list-snaps, destroy-snap
    • Component args: same as query above
    • Operation args: --snap=NAME (create), --epc=NUM (destroy), --epcrange=RANGE (destroy)
    • Description: take, list, and destroy container snapshots
      • create-snap: optionally name the snapshot on creation. The snapshot is created based on the most recent committed epoch.
      • list-snaps: list all snapshots in the container
      • destroy-snap: destroy a single snapshot by epoch number, or all snapshots between two epoch numbers (inclusive of the begin/end epoch numbers?)
    • Implemented: Y
  • rollback
    • Component args: same as query above
    • Operation args: --snap=NAME, --epc=NUM
    • Description: revert a container back to a previous snapshot specified by name or epoch number
  • verify *
    • Description: validate content of a POSIX container
    • Note (*): TBD. This was in a separate "daos fs" command section that has been merged into "daos cont".


daos object (or daos obj)

Component args (all operations):
  • Pool and container related (same as daos pool and daos cont): --pool=UUID, --sys=SYSNAME, --svc=SRVRANKLIST, --cont=UUID
  • Object: --oid=OID

  • query *, get-layout
    • Operation args: TBD: Epoch?
    • Description: show the layout of a particular DAOS object, including all the targets where it is distributed
    • Note (*): get-layout (or get layout) instead of query
    • API: daos_obj_open()? daos_obj_fetch()?
    • Implemented: Y
  • list-keys
    • API: daos_obj_list_dkey()? daos_obj_list_akey()
  • dump
    • Description: dump content of an object

Proposal: TODO

  1. Container create use cases, unified namespace related additions
    1. container create by pool UUID + path (container UUID generated)
    2. query by path only
    3. Determine if create is to be done in "daos cont create" with more options, or have a uns or fs resource (e.g., daos uns).
  2. Container type specification (optional: unknown if not specified)
    1. --type=fs (or --type=posix)
    2. --type=block (future: e.g., spdk, virtio for cloud use cases)
    3. --type=hdf5
  3. Object class and chunk sizes.

Proposal: Opens

  1. How to expose the mapping of SSDs to VOS targets on a given server node
    1. Use cases
      • SSD fails:
        • system will detect, DAOS will rebuild pool excluding affected target (using SSD to target mapping and using DAOS API to exclude by target).
      • Admin gets predictive alert that SSD wear is high, could fail soon.
        • Admin needs a way to query the hardware topology and associate it with the affected targets, then invoke dmg exclude (I think we decided on dmg system drain with specified targets instead)
    2. How:
      • Pool map, topology portion - does this contain topology details at this low level - or does it only go to the server/node level and stop there?
      • System map?
  2. How to number (or instead name?) the fault domains in the system (with a numeric rank just like servers?)
    1. How: system map?
  3. Server ranks: should we support ranges of consecutive ranks?
    1. Example: ranks 0-1023
      1. --ranks=0-1023
    2. Example: ranks 0-511 and 1000-1023:
      1. --ranks=0-511,1000-1023
  4. daosctl related
    1. OK to include "test" commands from daosctl into the official product dmg admin and daos user tools? Options:
      1. include test commands in the official tools
        1. Print in help messages
        2. Do not print in help messages, but support in the code
      2. Do not include test commands in official tools (maintain daosctl as a developer-only utility)
        1. keep the "test-" commands in daosctl, but remove the ones that are supported by "dmg" and "daos" tools (e.g., pool create, container create, ...)
    2. Is daosctl connect-pool needed?
  5. (more daosctl related) Should we have daos pool kill-leader command?
    1. daosctl kill-leader (kills one of the metadata service server ranks for a specific pool)
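
For open item 3 (ranges of consecutive server ranks), a minimal sketch of how a --ranks value mixing single ranks and ranges could be expanded, if ranges were adopted. The function name expand_ranks is hypothetical, and treating ranges as inclusive at both ends is an assumption based on the "ranks 0-1023" example.

```python
def expand_ranks(value):
    """Expand '0-511,1000-1023' or '0,1,2' into a list of integer ranks."""
    ranks = []
    for item in value.split(","):
        first, sep, last = item.strip().partition("-")
        if not first.isdecimal() or (sep and not last.isdecimal()):
            raise ValueError(f"expected RANK or FIRST-LAST, got {item!r}")
        low = int(first)
        high = int(last) if sep else low
        if low > high:
            raise ValueError(f"range {item!r} is reversed")
        # Assumption: both end points of a range are included.
        ranks.extend(range(low, high + 1))
    return ranks


ranks = expand_ranks("0-511,1000-1023")
```

A plain comma-separated list such as 0,1,2 passes through unchanged, so existing --ranks values would keep working under this scheme.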