DAOS Tools
Proposal for DAOS tools consolidation.
Tools | dmg | daos | dcp |
---|---|---|---|
Target | Administrator | Users | Users |
API | Control plane API (Go) | Data plane API (C) | Data plane API (C) |
Authentication | Certificate | daos_agent | daos_agent |
Lustre Equivalent | lctl/mkfs/mount/IML(partially) | lfs | pcp |
Functionality | | | |
Syntax: dmg [resource] [action] [args]
daos [resource] [action] [args]
Proposal: High Level Characteristics
Proposal: Characteristics: Resource Names Summary
Specifying DAOS system name (formerly known as server group):
- --sys-name=SYSNAME ; or --sys=SYSNAME (example: --sys=daos_server)
Specifying storage server ranks (e.g., for pool create/add-storage/del-storage, and system drain/reintegrate/kill/exclude):
- --ranks=SRVRANKLIST (example: --ranks=0,1,2)
Specifying added or removed pool service replica ranks (for pool add-svc/del-svc):
- --ranks=SRVRANKLIST (example: --ranks=0,1,2)
Specifying number of pool service replicas (for pool create):
- --nsvc=NUM
Specifying pool service replica ranks (legacy: currently required, but eventually will not be needed):
- --svc=SRVRANKLIST (example: --svc=1,2,3)
Specifying a fault domain / entire rack of servers (e.g., for pool create/add-storage/del-storage and system drain/reintegrate/kill/exclude):
- --fdomains=FDRANKLIST (often a single item, but keeping a list for flexibility)
- --fd=FDRANKLIST (shorter option name for convenience)
Specifying targets (e.g., for system drain/reintegrate):
- List of Rank:Target pairs (--targets=SRVRANK:TGTRANK LIST ; or --tgt=SRVRANK:TGTRANK LIST)
- Server 0 targets 0 and 1 (0:0,0:1)
- Server 1 targets 2 and 4 (1:2,1:4)
- Server 2 targets 0 and 1 (2:0,2:1)
- (whole list all together) --tgt=0:0,0:1,1:2,1:4,2:0,2:1
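As an illustration of the Rank:Target pair format above, here is a minimal Python sketch of how a tool might parse a --tgt value. The function names parse_targets and group_by_server are hypothetical and not part of any DAOS tool; the sketch assumes that ranks and target indices are non-negative integers.

```python
def parse_targets(spec):
    """Parse 'SRVRANK:TGTRANK,...' into a list of (server_rank, target) tuples."""
    pairs = []
    for item in spec.split(","):
        rank_text, sep, target_text = item.strip().partition(":")
        # Assumption: both halves of a pair are non-negative integers.
        if not sep or not rank_text.isdigit() or not target_text.isdigit():
            raise ValueError(f"expected SRVRANK:TGTRANK, got {item!r}")
        pairs.append((int(rank_text), int(target_text)))
    return pairs


def group_by_server(pairs):
    """Group the target indices by server rank, preserving input order."""
    grouped = {}
    for rank, target in pairs:
        grouped.setdefault(rank, []).append(target)
    return grouped


if __name__ == "__main__":
    pairs = parse_targets("0:0,0:1,1:2,1:4,2:0,2:1")
    print(group_by_server(pairs))
```

Grouping by server rank mirrors the per-server breakdown in the list above: server 0 maps to targets 0 and 1, server 1 to targets 2 and 4, and server 2 to targets 0 and 1.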
Specifying container snapshots:
- named snapshot: --snap=NAME
- snapshot identified by a single epoch number: --epc=NUM
- snapshots that sit within a specified range of epoch numbers: --epcrange=M-N
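The three ways of identifying snapshots could be normalized into a single selector, as in the Python sketch below. The function name parse_snapshot_selector and the rule that exactly one of the three options may be given are assumptions made for illustration. Whether the range end points are inclusive is not decided here, so the sketch only records the two numbers.

```python
def parse_snapshot_selector(snap=None, epc=None, epcrange=None):
    """Return a (kind, value) tuple for exactly one of --snap, --epc, --epcrange."""
    options = (("snap", snap), ("epc", epc), ("epcrange", epcrange))
    given = [name for name, value in options if value is not None]
    # Assumption: the three options are mutually exclusive.
    if len(given) != 1:
        raise ValueError("specify exactly one of --snap, --epc, --epcrange")
    if snap is not None:
        return ("name", snap)
    if epc is not None:
        return ("epoch", int(epc))
    low_text, sep, high_text = epcrange.partition("-")
    if not sep:
        raise ValueError(f"expected M-N, got {epcrange!r}")
    low, high = int(low_text), int(high_text)
    if low > high:
        raise ValueError(f"range start {low} is greater than range end {high}")
    return ("range", (low, high))


if __name__ == "__main__":
    print(parse_snapshot_selector(snap="mysnapname"))
    print(parse_snapshot_selector(epcrange="3-9"))
```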
Proposal: Highlighted Operations and Tool Command Lines
Proposal: Highlighted Operations: System
- List all pools in a DAOS system
- dmg system list-pools
Proposal: Highlighted Operations: Pool
- Create (dmg pool create)
- by server rank list
- Specify only SCM storage
- dmg pool create --sys=SYSNAME --uid=UID --gid=GID --mode=MODE --nsvc=NREP --ranks=SRVRANKLIST --scm-size=SIZE
- Specify SCM + NVMe storage
- dmg pool create --sys=SYSNAME --uid=UID --gid=GID --mode=MODE --nsvc=NREP --ranks=SRVRANKLIST --scm-size=SIZE --nvme-size=SIZE
- by fault domain (rack) rank list (using --fd shorthand for --fdomains)
- dmg pool create --sys=SYSNAME --uid=UID --gid=GID --mode=MODE --nsvc=NREP --fd=FDRANKLIST --scm-size=SIZE --nvme-size=SIZE
- Add pool service replicas (dmg pool add-svc)
- Usage: dmg pool add-svc --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST --ranks=MORESRVRANKSADDLIST
- Remove pool service replicas (dmg pool del-svc)
- Usage: dmg pool del-svc --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST --ranks=OLDSRVRANKSDELLIST
- Destroy pool in a DAOS system (dmg pool destroy)
- Usage: dmg pool destroy --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST [--force]
- List all containers in a pool (dmg pool list-containers ; or shorter command equivalent dmg pool list-cont)
- Usage: daos pool list-containers --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST
- Usage: daos pool list-cont --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST
- Add storage (dmg pool add-storage aka extend)
- Rack (all servers in the rack, and all targets in all of the servers; using --fd shorthand for --fdomains)
- dmg pool add-storage --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST --fd=FDRANKLIST
- Servers (all targets on the servers)
- dmg pool add-storage --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST --ranks=SRVRANKLIST
- Remove storage (dmg pool del-storage aka exclude)
- Rack (all servers in the rack, and all targets in all of the servers; using --fd shorthand for --fdomains)
- dmg pool del-storage --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST --fd=FDRANKLIST
- Servers (all targets on the servers)
- dmg pool del-storage --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST --ranks=SRVRANKLIST
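For the dmg pool create variants listed earlier in this section, the following Python sketch assembles the proposed command line as an argument list. It is only an illustration of the proposed syntax: the function name pool_create_argv is hypothetical, and the rule that exactly one of --ranks or --fd is given is an assumption. Building the command as a list keeps each option a separate argument, so --scm-size and --nvme-size cannot run together.

```python
def pool_create_argv(sys_name, uid, gid, mode, nsvc, scm_size,
                     nvme_size=None, ranks=None, fdomains=None):
    """Build the proposed 'dmg pool create' command line as an argv list."""
    # Assumption: storage is selected either by server ranks or by fault domains.
    if (ranks is None) == (fdomains is None):
        raise ValueError("specify exactly one of ranks or fdomains")
    argv = [
        "dmg", "pool", "create",
        f"--sys={sys_name}",
        f"--uid={uid}",
        f"--gid={gid}",
        f"--mode={mode}",
        f"--nsvc={nsvc}",
    ]
    if ranks is not None:
        argv.append("--ranks=" + ",".join(str(r) for r in ranks))
    else:
        argv.append("--fd=" + ",".join(str(d) for d in fdomains))
    argv.append(f"--scm-size={scm_size}")
    # NVMe size is optional: omit it for an SCM-only pool.
    if nvme_size is not None:
        argv.append(f"--nvme-size={nvme_size}")
    return argv


if __name__ == "__main__":
    # Example values are placeholders, not recommended settings.
    print(" ".join(pool_create_argv("daos_server", 1000, 1000, "0731", 3,
                                    "SIZE", nvme_size="SIZE", ranks=[0, 1, 2])))
```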
Proposal: Highlighted Operations: Container
Notes:
- The command for all container operations is daos container (shown in the examples below). As a convenience, the shorter equivalent daos cont may be used.
- The daos container list-objects command (shown in the examples below) has a shorter equivalent: daos container list-obj.
- The resource for object commands is daos object. As a convenience, the shorter equivalent daos obj may be used.
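A small Python sketch of how the short forms in these notes could be mapped to the full names, following the [resource] [action] [args] syntax. The alias tables contain only the short forms named in this document; the function name split_command is hypothetical.

```python
# Short forms named in this proposal, mapped to their full names.
RESOURCE_ALIASES = {"cont": "container", "obj": "object"}
ACTION_ALIASES = {"list-cont": "list-containers", "list-obj": "list-objects"}


def split_command(argv):
    """Split 'tool resource action args...' and expand the short aliases."""
    if len(argv) < 3:
        raise ValueError("expected: TOOL RESOURCE ACTION [ARGS...]")
    tool, resource, action, *args = argv
    resource = RESOURCE_ALIASES.get(resource, resource)
    action = ACTION_ALIASES.get(action, action)
    return tool, resource, action, args


if __name__ == "__main__":
    print(split_command(["daos", "cont", "list-obj", "--pool=UUID", "--cont=UUID"]))
```

Normalizing the aliases once, before dispatch, means the rest of a tool only needs to handle the full resource and action names.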
Container Create - by UUID and/or unified namespace path
- Create a container in a pool (daos container create)
- User-specified container UUID
- Usage: daos container create --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST --cont=UUID
- No container UUID specified (implementation generates a random UUID as a convenience)
- Usage: daos container create --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST
- User-specified container UUID and user-specified unified namespace path to link the container to
- Usage: daos container create --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST --cont=UUID --path=/path/to/create_and_link --type=POSIX|HDF5 --oclass=tiny|small|large|R2|R2S|repl_max --chunk_size=BYTES
- path is a directory for type=POSIX, and is a file for type=HDF5
- oclass is DAOS object class
- chunk_size is the chunk_size in bytes to use with files created in the container.
- No container UUID specified, and user-specified unified namespace path to link the container to (implementation will generate a random UUID)
- Usage: daos container create --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST --path=/path/to/create_and_link --type=POSIX|HDF5 --oclass=tiny|small|large|R2|R2S|repl_max --chunk_size=BYTES
Container "Lookup" (All Other Commands) - by UUID or unified namespace path
There are two variants of these commands: (1) the user provides the pool and container UUIDs; (2) the user provides only the unified namespace path to which the container is linked. In the second variant, the implementation resolves the pool and container UUIDs by reading the extended filesystem attributes of the entity at the specified path (i.e., the user provides neither the pool UUID nor the container UUID).
- Destroy a container in a pool (daos container destroy)
- Destroy by container UUID
- Usage: daos container destroy --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST --cont=UUID
- Destroy by path that the container is linked to
- Usage: daos container destroy --sys=SYSNAME --svc=SRVRANKLIST --path=/path/to/destroy_cont_and_unlink
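The choice between the two lookup variants (by UUIDs or by path) can be sketched as a simple argument check. This Python sketch only decides which variant applies; it does not read extended attributes or resolve any UUIDs. The function name container_lookup_mode is hypothetical, and rejecting --path combined with --pool or --cont is an assumption based on the rule that the user provides no UUIDs in the path variant.

```python
def container_lookup_mode(pool=None, cont=None, path=None):
    """Return 'path' or 'uuid' depending on which options were provided."""
    if path is not None:
        # Path variant: the pool and container UUIDs are resolved from the path.
        if pool is not None or cont is not None:
            raise ValueError("--path cannot be combined with --pool or --cont")
        return "path"
    # UUID variant: both the pool and the container UUID are required.
    if pool is None or cont is None:
        raise ValueError("specify --pool and --cont, or only --path")
    return "uuid"


if __name__ == "__main__":
    print(container_lookup_mode(path="/path/to/destroy_cont_and_unlink"))
    print(container_lookup_mode(pool="UUID", cont="UUID"))
```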
The remaining container commands use the --cont=UUID form (the --path= option is available, but is not shown)
- List all objects in a container (daos container list-objects)
- Usage: daos container list-objects --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST --cont=UUID
- Create a snapshot on container based on the latest committed epoch
- Unnamed
- Usage: daos container create-snap --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST --cont=UUID
- Named
- Usage: daos container create-snap --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST --cont=UUID --snap=mysnapname
- List all snapshots in a container
- Usage: daos container list-snaps --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST --cont=UUID
- Destroy container snapshot(s)
- Single epoch snapshot
- Usage: daos container destroy-snap --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST --cont=UUID --epc=B
- Multiple snapshots within an epoch range
- Usage: daos container destroy-snap --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST --cont=UUID --epcrange=B-D
- Rollback container to specified snapshot
- Rollback to a named snapshot
- Usage: daos container rollback --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST --cont=UUID --snap=mysnapname
- Rollback to a snapshot at an epoch number
- Usage: daos container rollback --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST --cont=UUID --epc=A
Proposal: daosctl test considerations
- Change --server-group to --sys=
- Change --size to --scm-size?
- Change --replicas=NUM_METADATA_REPLICAS to --nsvc=
- Change --servers=SRVRANKLIST to --svc= (for pool replica ranks)
- Change exclude-target --targets= to take a list of pairs (instead of current approach that makes pairs from 2 lists: --rank=ra,rb,rc and --targets=ta,tb,tc)
- Change --server= to --rank=
- Change --rank= to --ranks= (or svc= ???) for kill-leader - (what will we do for dmg kill? Probably --ranks=. Choose same)
- Change --server=SERVER-LIST to --ranks= (or svc= ???) for kill-server (what will we do for dmg kill? Choose same)
- Change -c-uuid to --cont=CUUID
- Change -p-uuid to --pool=PUUID
Proposal: Commands, Resources, Operations and Arguments
Tool | Component | Component Args | Operation | Operation Args | Description, Notes / Issues | API | Implemented? |
---|---|---|---|---|---|---|---|
dmg | storage | | scan | | discover all storage available on the nodes, applying filters from the yaml file | | Y |
dmg | storage | | query | | report status & stats about storage | | |
dmg | storage | | query smd | --devices --pools | query SMD device table; query SMD pool table | | Y |
dmg | storage | | query nvme-health | --hostlist="HOST:PORT" | query raw SPDK NVMe device health stats. Returns all stats for all NVMe SSDs on all hosts in hostlist. | | Y |
dmg | storage | | query blobstore-health | --devuuid="DEVICE_UUID" --tgtid="VOS_TGT_ID" | query BIO in-memory health data. Returns all BIO device health data and I/O error & checksum error stats for the given device UUID or VOS target ID. | | Y |
dmg | storage | | query device-state | --devuuid="DEVICE_UUID" | query the current device state of the given device UUID stored in SMD (i.e., NORMAL or FAULTY) | | Y |
dmg | storage | | set-faulty | --devuuid="DEVICE_UUID" | allow admin/user to manually set the device state of a given device to FAULTY (will trigger faulty device reaction callbacks) | | Y |
dmg | storage | | prep | | device-specific configuration that may require a reboot, e.g., setting up AEP DIMMs in interleaved mode | | Y |
dmg | storage | | burnin | | run fio against storage devices to verify that they operate well and to validate performance | | |
dmg | storage | | format | | reset content of NVMe SSDs, format SCM with ext4, mount SCM and start the DAOS service (io_server) | | Y |
dmg | storage | | fwupdate | | firmware update | | |
dmg | network | | scan * | | list discovered network interfaces * suggestion: report which interfaces and OFI providers would be used with the discovered interfaces | | Y |
dmg | network | | query * | | report status & stats about network interfaces * suggestion: perform a local test to indicate in advance whether an OFI runtime error is going to occur with the interface, for example as seen with daos_server: "na_ofi_getinfo(): fi_getinfo failed, rc=-61(No data available)" (this example came from a VM build that did not have PSM2 devel) | | |
dmg | system | --sys=SYSNAME | query | | report service status on all or a subset of the servers | | Y |
dmg | system | | list-pools | | list all pools created (do we want an alias for this as "pool list"?) | | Y |
dmg | system | Same as above | list-ranks | | list all DAOS system server ranks in the specified system ("query" currently lists all system ranks; do we need list-ranks?) | | |
dmg | system | Same as above | stat * | | report various stats about the service * alternative command name: get-statistics | | |
dmg | system | Same as above | log * | | report service logs * alternative: get-log | | |
dmg | system | Same as above | debug * | | change debug mask * alternative: set-debug | | |
dmg | system | Same as above | drain | --fdomains=FDRANKLIST --fd=FDRANKLIST * --ranks=SRVRANKLIST --targets=SRVRANK:TGTRANK LIST --tgt=SRVRANK:TGTRANK LIST | drain a list of racks, list of servers, or list of targets in preparation for maintenance * use --fd= as a convenience (shorter than --fdomains). This one really does require the ability to specify at target or SSD level. Use case: one of the SSDs in a server is about to fail; hot swap it after a drain and before a reintegrate. | | |
dmg | system | Same as above | reintegrate | --fdomains=FDRANKLIST --fd=FDRANKLIST --ranks=SRVRANKLIST --targets=SRVRANK:TGTRANK LIST --tgt=SRVRANK:TGTRANK LIST | reintegrate a drained component | | |
dmg | system | Same as above | stop | | full shutdown of the DAOS service | | Y |
dmg | system | Same as above | start | | restart service after full shutdown | | Y |
dmg | system | Same as above | kill | --fdomains=FDRANKLIST --fd=FDRANKLIST --ranks=SRVRANKLIST | abrupt shutdown of a particular server (really: set of servers, or whole fault domains/racks of servers) | | |
dmg | system | Same as above | exclude * | --fdomains=FDRANKLIST --fd=FDRANKLIST --ranks=SRVRANKLIST | Remove node from DAOS system (really: set of servers, or whole fault domains/racks of servers) * alternative command names: del-nodes, del-servers? | | |
dmg | scrub | | start | | Start background checksum scrubbing process (or resume after prior stop) | | |
dmg | scrub | | stop | | Stop scrubbing process | | |
dmg | scrub | | query | | Report status of background checksum scrubbing process (e.g., number of corruptions found, percentage of storage scanned so far) | | |
dmg | pool | --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST * | query | | Report pool status * (applies to all pool commands) given a pool UUID and DAOS system name, the implementation is eventually expected to look up the existing pool service replica SRVRANKLIST (i.e., remove the need for --svc=) | daos_pool_query() | Y |
dmg | pool | Same as above | stat * | | Get pool statistics * alternative command name: get-statistics | ? | |
dmg | pool | Same as above | get-prop * | | Get pool properties * alternative: prop (but I like having the commands be "verbs") | ? | |
dmg | pool | | set-prop | TBD | | | Y |
dmg | pool | Same as above | get-acl * overwrite-acl update-acl delete-acl | | Get/set/delete pool access control | ? | Y |
dmg | pool | Same as above | get-attr set-attr del-attr list-attrs | --attr=ATTRNAME (get,del) --value=VALUESTR (set) no arguments for list-attrs | Get / set user attributes | daos_pool_attr_get() daos_pool_attr_set() daos_pool_attr_list() | |
dmg | pool | Same as above | list-containers list-cont | | List all containers in the pool | | |
dmg | pool | N/A | create | --user=USERNAME@, --group=GROUPNAME@, --mode=MODE --nsvc=NREP --sys=SYSNAME --ranks=SRVRANKLIST * --scm-size=SIZE --nvme-size=SIZE --fdomains=FDRANKLIST --fd=FDRANKLIST --acl-file=FILE | * change existing dmg --target= to --ranks= | daos_pool_create() | Y |
dmg | pool | Same as above | destroy | --force | | daos_pool_destroy() | Y |
dmg | pool | Same as above | add-storage * | --fdomains=FDRANKLIST --fd=FDRANKLIST --ranks=SRVRANKLIST | Add a storage fault domain (rack) or list of servers to an existing pool * formerly named "extend" | daos_pool_extend() | |
dmg | pool | Same as above | del-storage * | --fdomains=FDRANKLIST --fd=FDRANKLIST --ranks=SRVRANKLIST | Remove a fault domain (rack) or list of servers from a pool * formerly named "exclude" | daos_pool_exclude() | |
dmg | pool | Same as above | add-svc | --ranks=MORESRVRANKLIST | Add a pool service replica. --svc= specifies the current list of metadata service server ranks; --ranks= specifies the new ranks to add to the set. | ? | |
dmg | pool | Same as above | del-svc | --ranks=OLDSRVRANKLIST | Remove a pool service replica | ? | |
dmg | pool | Same as above | rebuild | | Manage rebuild for a pool | ? | |
dmg | pool | Same as above | rebalance | | Trigger rebalance after add-storage (extend) by racks/servers | ? | |
dmg | pool | Same as above | resize | --scm-size=SIZE --nvme-size=SIZE | Extend the size of a pool's existing targets | | |
dmg | pool | Same as above | evict | | Evict all active pool connections | daos_pool_evict() | |
dmg | pool | Same as above | lurk * | | Dump activity on the pool * alternative: get-log | ? | |
Tool | Component | Component Args | Operation | Arguments | Description and Notes / Issues | API | Implemented? |
---|---|---|---|---|---|---|---|
daos | pool * | --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST | query, stat, get-prop | | report pool status (rebuild/rebalancing status, ...); report various stats about the pool (size, usage, number of containers, ...), same as dmg pool stat; show pool properties. Note: daos pool is mostly "read-only" versus "dmg pool" used by the administrator, so the set-prop command is not available here. | | Y (query, get-prop). Missing statistics support for "stat", but it may stay an admin/dmg thing? |
daos | pool | Same as above | get-attr set-attr del-attr list-attrs | --attr=ATTRNAME (get,del) --value=VALUESTR (set) no arguments for list-attrs | | | Y |
daos | pool | Same as above | list-containers list-cont | | List all containers in the specified pool | | Y |
daos | container cont | Pool related (same as daos pool): --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST Container (choose 1): --cont=UUID OR --path=FILESYSDIR | query * | | show container status; query by container UUID with --cont or query by unified namespace (directory or file) --path=FSENTITY (like current duns resolve_path). Note: do not specify --pool= when querying by path. * alternative: get-status | daos_cont_query() | Y |
daos | container cont | Same as above | stat | | show various container statistics * alternative command name: get-statistics | ? | Missing statistics/metrics support for "stat". |
daos | container cont | Same as above | get-attr set-attr del-attr list-attrs | --attr=ATTRNAME (get,del) --value=VALUESTR (set) no arguments for list-attrs | set/retrieve user attributes | | Y |
daos | container cont | Same as above | get-prop | | * is there such a thing as getting container properties (like pool properties)? | ? | Y |
daos | container cont | | set-prop | TBD | | | Y (currently, only the "label" property is supported) |
daos | container cont | Same as above | list-objects list-obj | | Enumerate all objects in the specified container | | Y |
daos | container cont | Pool related: same as above Container related: (--cont=UUID) OR --path=FSENTITY --type=CONTTYPE --oclass=OBJCLASS --chunk_size=NBYTES | create | (implementation generates CUUID if not specified) Also optional are --path/type/oclass/chunk_size | create a container with specific properties (including type, object class, and chunk_size if provided) and link it with the path (if provided; similar to duns link_path, create a POSIX container with DFS-specific parameters). CONTTYPE: posix, hdf5. OBJCLASS: tiny, small, large, R2S, R2, repl_max | daos_cont_create() | Y |
daos | container cont | Same as query above | destroy | --force * | destroy a container based on UUID or path (unlink path as well if provided) * current dcont destroy does not have the --force option | daos_cont_destroy() | Y |
daos | container cont | Same as query above | create-snap, list-snaps, destroy-snap | --snap=NAME (create) --epc=NUM (destroy) --epcrange=RANGE (destroy) | Take, list, destroy container snapshots. List all snapshots in the container. Destroy a single snapshot by epoch number, or all snapshots between two epoch numbers (inclusive of the begin/end epoch numbers?). | | Y |
daos | container cont | Same as query above | rollback | --snap=NAME --epc=NUM | Revert a container back to a previous snapshot specified by name or epoch number. | | |
daos | container cont | | verify * | | Validate content of a POSIX container * TBD: this was in a separate "daos fs" command section that has been merged into "daos cont" | | |
daos | object obj | Pool, cont related (same as daos pool and daos cont): --pool=UUID --sys=SYSNAME --svc=SRVRANKLIST --cont=UUID Object: --oid=OID | query * | TBD: Epoch? | Show the layout of a particular DAOS object including all the targets where it is distributed | daos_obj_open()? daos_obj_fetch()? | Y |
daos | object obj | Same as above | list-keys | | | daos_obj_list_dkey()? daos_obj_list_akey() | |
daos | object obj | Same as above | dump | | Dump content of an object | | |
Proposal: TODO
- Container create use cases, unified namespace related additions
- container create by pool UUID + path (container UUID generated)
- query by path only
- Determine if create is to be done in "daos cont create" with more options, or have a uns or fs resource (e.g., daos uns).
- Container type specification (optional: unknown if not specified)
- --type=fs (or --type=posix)
- --type=block (future: e.g., spdk, virtio for cloud use cases)
- --type=hdf5
- Object class and chunk sizes.
Proposal: Opens
- How to expose the mapping of SSDs to VOS targets on a given server node
- Use cases
- SSD fails:
- The system will detect the failure; DAOS will rebuild the pool, excluding the affected target (using the SSD-to-target mapping and the DAOS API to exclude by target).
- Admin gets predictive alert that SSD wear is high, could fail soon.
- Admin needs a way to query hw topology and associate it with the affected targets, then invoke dmg exclude (or rather dmg system drain specifying targets, which I think is what we decided on instead)
- How:
- Pool map, topology portion - does this contain topology details at this low level - or does it only go to the server/node level and stop there?
- System map?
- How to number (or instead name?) the fault domains in the system (with a numeric rank just like servers?)
- How: system map?
- Server ranks: should we support ranges of consecutive ranks?
- Example: ranks 0-1023
- --ranks=0-1023
- Example: ranks 0-511 and 1000-1023:
- --ranks=0-511,1000-1023
- daosctl related
- OK to include "test" commands from daosctl into the official product dmg admin and daos user tools? Options
- include test commands in the official tools
- Print in help messages
- Do not print in help messages, but support in the code
- Do not include test commands in official tools (maintain daosctl as a developer-only utility)
- keep the "test-" commands in daosctl, but remove the ones that are supported by "dmg" and "daos" tools (e.g., pool create, container create, ...)
- Is daosctl connect-pool needed?
- (more daosctl related) Should we have daos pool kill-leader command?
- daosctl kill-leader (kills one of the metadata service server ranks for a specific pool)
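For the open question above about supporting ranges of consecutive server ranks, the following Python sketch shows one way a --ranks value such as 0-511,1000-1023 could be expanded into individual ranks. The function name expand_ranks is hypothetical, and treating both ends of a range as inclusive is an assumption based on the "ranks 0-1023" examples.

```python
def expand_ranks(spec):
    """Expand a rank list such as '0-511,1000-1023' into a list of ints."""
    ranks = []
    for item in spec.split(","):
        low_text, sep, high_text = item.strip().partition("-")
        low = int(low_text)
        # A single rank such as '7' is treated as the range 7-7.
        high = int(high_text) if sep else low
        if high < low:
            raise ValueError(f"range end before range start in {item!r}")
        # Assumption: both ends of the range are inclusive.
        ranks.extend(range(low, high + 1))
    return ranks


if __name__ == "__main__":
    print(expand_ranks("0,1,2"))
    print(len(expand_ranks("0-511,1000-1023")))
```

A plain comma-separated list such as 0,1,2 passes through unchanged, so the range form would be a backward-compatible extension of the existing --ranks syntax.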