NOTE: THESE ARE NOT TO BE APPLIED TO 2.0 TESTING; USE THE QUICKSTARTS IN THE 2.0 ON-LINE DOCUMENTATION.


...

This document gives a general tour of the DAOS management commands (dmg) for daos_admin users and the DAOS tools (daos) for daos_client users. It covers creating, listing, querying and destroying pools and containers on a DAOS server, and lists frequent errors that new users might see with the dmg and daos tools, together with workarounds. It also includes example runs of data transfer between DAOS file systems: setting up a DAOS dfuse mount point and running traffic with fio over dfuse and with mpirun mdtest. The basic dmg and daos examples run on a 2-host DAOS server with a 1-host client; the DAOS rebuild examples, driven by fio over dfuse and mpirun mdtest, run on a 4-host DAOS server.

...

Set environment variables for the list of server nodes, the client node and the admin node.

Code Block
languagebash
# Example of 2 hosts server
# For 1 host server, export SERVER_NODES=node-1
export SERVER_NODES=node-1,node-2
# Example to use admin and client on the same node
export ADMIN_NODE=node-3
export CLIENT_NODE=node-3
export ALL_NODES=$SERVER_NODES,$CLIENT_NODE
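
The comma-separated lists defined above can be split or counted when a later step has to loop over hosts. The following is a minimal sketch; `list_nodes` and `count_nodes` are hypothetical helper names, not part of DAOS.

```shell
# Sketch only: list_nodes and count_nodes are hypothetical helpers.

# Print each entry of a comma-separated node list on its own line.
list_nodes() {
    printf '%s\n' "$1" | tr ',' '\n'
}

# Count the non-empty entries in a comma-separated node list.
count_nodes() {
    list_nodes "$1" | grep -c .
}

SERVER_NODES=node-1,node-2
CLIENT_NODE=node-3
ALL_NODES=$SERVER_NODES,$CLIENT_NODE

count_nodes "$SERVER_NODES"   # number of server hosts
list_nodes "$ALL_NODES"       # one hostname per line
```

A loop such as `for n in $(list_nodes "$ALL_NODES"); do ...; done` can then visit every host in turn.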

...

dmg system query

Code Block
languagebash
# system query output for a 2 hosts DAOS server
$ dmg system query
Rank  State  
----  -----  
[0-1] Joined  
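
A script that waits for the system to come up may want to test this output instead of reading it by eye. The sketch below assumes the two-column "Rank State" layout shown above, with two header lines; `all_ranks_joined` is a hypothetical helper name.

```shell
# Hypothetical helper: reads `dmg system query` output on stdin and
# succeeds only if every listed rank group reports the state "Joined".
# Assumes two header lines followed by "<ranks> <state>" rows.
all_ranks_joined() {
    awk 'NR > 2 && NF >= 2 && $2 != "Joined" { bad = 1; print "not joined: " $1 }
         END { exit bad }'
}

# Sample text copied from the 2-host output above. On a live system
# this would be: dmg system query | all_ranks_joined
printf '%s\n' 'Rank  State' '----  -----' '[0-1] Joined' | all_ranks_joined \
    && echo "all ranks joined"
```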

...

dmg storage query usage

Code Block
languagebash
# storage query usage output for a 2 hosts DAOS server
$ dmg storage query usage
Hosts   SCM-Total SCM-Free SCM-Used NVMe-Total NVMe-Free NVMe-Used 
-----   --------- -------- -------- ---------- --------- --------- 
boro-35 17 GB     17 GB    0 %      0 B        0 B       N/A       
boro-8  17 GB     17 GB    0 %      0 B        0 B       N/A        
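
To pick out a single column from this table, the output can be filtered with awk. This is a sketch under the assumption that the column layout matches the sample above (two header lines, sizes printed as a value followed by a unit); `scm_free_per_host` is a hypothetical helper name.

```shell
# Hypothetical helper: reads `dmg storage query usage` output on stdin
# and prints "<host> <SCM-Free value> <unit>" for every host row.
# Fields: $1 host, $2-$3 SCM-Total, $4-$5 SCM-Free.
scm_free_per_host() {
    awk 'NR > 2 && NF >= 5 { print $1, $4, $5 }'
}

# Sample rows copied from the output above. On a live system this
# would be: dmg storage query usage | scm_free_per_host
printf '%s\n' \
    'Hosts   SCM-Total SCM-Free SCM-Used NVMe-Total NVMe-Free NVMe-Used' \
    '-----   --------- -------- -------- ---------- --------- ---------' \
    'boro-35 17 GB     17 GB    0 %      0 B        0 B       N/A' \
    'boro-8  17 GB     17 GB    0 %      0 B        0 B       N/A' |
    scm_free_per_host
```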

dmg pool create help

Code Block
languagebash
$ dmg pool create --help
Usage:
  dmg [OPTIONS] pool create [create-OPTIONS]

Application Options:
      --allow-proxy    Allow proxy configuration via environment
  -l, --host-list=     comma separated list of addresses <ipv4addr/hostname>
  -i, --insecure       have dmg attempt to connect without certificates
  -d, --debug          enable debug output
  -j, --json           Enable JSON output
  -J, --json-logging   Enable JSON-formatted log output
  -o, --config-path=   Client config file path

Help Options:
  -h, --help           Show this help message

[create command options]
      -g, --group=     DAOS pool to be owned by given group, format name@domain
      -u, --user=      DAOS pool to be owned by given user, format name@domain
      -p, --name=      Unique name for pool (set as label)
      -a, --acl-file=  Access Control List file path for DAOS pool
      -z, --size=      Total size of DAOS pool (auto)
      -t, --scm-ratio= Percentage of SCM:NVMe for pool storage (auto) (default: 6)
      -k, --nranks=    Number of ranks to use (auto)
      -v, --nsvc=      Number of pool service replicas
      -s, --scm-size=  Per-server SCM allocation for DAOS pool (manual)
      -n, --nvme-size= Per-server NVMe allocation for DAOS pool (manual)
      -r, --ranks=     Storage server unique identifiers (ranks) for DAOS pool
      -S, --sys=       DAOS system that pool is to be a part of (default: daos_server)

dmg pool create

Code Block
languagebash
# Create a 10GB pool
$ dmg pool create --size=10G
Creating DAOS pool with automatic storage allocation: 10 GB NVMe + 6.00% SCM
Pool created with 100.00% SCM/NVMe ratio
-----------------------------------------
  UUID          : 0a6003c6-23a7-4cb5-8895-c004ca2b75f5
  Service Ranks : 0                                   
  Storage Ranks : [0-1]                               
  Total Size    : 10 GB                               
  SCM           : 10 GB (5.0 GB / rank)               
  NVMe          : 0 B (0 B / rank)                  

$ dmg storage query usage
Hosts   SCM-Total SCM-Free SCM-Used NVMe-Total NVMe-Free NVMe-Used 
-----   --------- -------- -------- ---------- --------- --------- 
boro-35 17 GB     12 GB    29 %     0 B        0 B       N/A       
boro-8  17 GB     11 GB    36 %     0 B        0 B       N/A
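
Later commands in this tour refer to the pool through `$DAOS_POOL`. One way to set it is to capture the UUID from the `dmg pool create` output. The sketch below assumes the "UUID : value" line format shown above; `pool_uuid_from_create` is a hypothetical helper name. The `--json` option listed in the help text may be a more robust source, but its output format is not shown here.

```shell
# Hypothetical helper: reads `dmg pool create` output on stdin and
# prints the value of the first "UUID : <value>" line.
pool_uuid_from_create() {
    awk '$1 == "UUID" && $2 == ":" { print $3; exit }'
}

# Sample text copied from the output above. On a live system this
# would be: DAOS_POOL=$(dmg pool create --size=10G | pool_uuid_from_create)
DAOS_POOL=$(pool_uuid_from_create <<'EOF'
Pool created with 100.00% SCM/NVMe ratio
-----------------------------------------
  UUID          : 0a6003c6-23a7-4cb5-8895-c004ca2b75f5
  Service Ranks : 0
  Storage Ranks : [0-1]
EOF
)
export DAOS_POOL
echo "$DAOS_POOL"
```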

...

Code Block
languagebash
$ daos cont query  --pool=$DAOS_POOL --cont=$DAOS_CONT
Pool UUID:      528f4710-7eb8-4850-b6aa-09e4b3c8f532
Container UUID: bc4fe707-7470-4b7d-83bf-face75cc98fc
Number of snapshots: 0
Latest Persistent Snapshot: 0
Highest Aggregated Epoch: 172477977191481344
Container redundancy factor: 1

daos container snapshot help/create/list/destroy

Code Block
languagebash
$ daos help cont
daos command (v1.2), libdaos 1.2.0
container (cont) commands:
          create           create a container
          clone            clone a container
          destroy          destroy a container
          list-objects     list all objects in container
          list-obj
          query            query a container
          get-prop         get all container's properties
          set-prop         set container's properties
          get-acl          get a container's ACL
          overwrite-acl    replace a container's ACL
          update-acl       add/modify entries in a container's ACL
          delete-acl       delete an entry from a container's ACL
          set-owner        change the user and/or group that own a container
          stat             get container statistics
          check            check objects consistency in container
          list-attrs       list container user-defined attributes
          del-attr         delete container user-defined attribute
          get-attr         get container user-defined attribute
          set-attr         set container user-defined attribute
          create-snap      create container snapshot (optional name)
                           at most recent committed epoch
          list-snaps       list container snapshots taken
          destroy-snap     destroy container snapshots
                           by name, epoch or range
          rollback         roll back container to specified snapshot

use 'daos help cont|container COMMAND' for command specific options

$ daos help cont create-snap
daos command (v1.2), libdaos 1.2.0
container options (snapshot and rollback-related):
        --snap=NAME        container snapshot (create/destroy-snap, rollback)
        --epc=EPOCHNUM     container epoch (destroy-snap, rollback)
        --epcrange=B-E     container epoch range (destroy-snap)
container options (query, and all commands except create):
          <pool options>   with --cont use: (--pool, --sys-name)
          <pool options>   with --path use: (--sys-name)
        --cont=UUID        (mandatory, or use --path)
        --path=PATHSTR

daos container snapshot create/list/destroy

Code Block
languagebash
$ daos cont create-snap --pool=$DAOS_POOL --cont=$DAOS_CONT
snapshot/epoch 172646116775952384 has been created

$ daos container list-snaps --pool=$DAOS_POOL --cont=$DAOS_CONT
Container's snapshots :
172478166024060928 
172646116775952384 

$ daos container destroy-snap --pool=$DAOS_POOL --cont=$DAOS_CONT --epc=172646116775952384

$ daos container list-snaps --pool=$DAOS_POOL --cont=$DAOS_CONT
Container's snapshots :
172478166024060928 
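
When a snapshot epoch has to be passed to `--epc`, it can be taken from the `list-snaps` output. The sketch below assumes one epoch per line, as shown above, and that larger epochs are more recent, which the example suggests but does not state. `latest_snapshot_epoch` is a hypothetical helper name. It uses `sort -n` instead of awk arithmetic because 18-digit epochs exceed the range in which awk's floating-point numbers are exact.

```shell
# Hypothetical helper: reads `daos container list-snaps` output on stdin
# and prints the numerically largest epoch.
latest_snapshot_epoch() {
    tr -d ' ' | grep -E '^[0-9]+$' | sort -n | tail -n 1
}

# Sample text copied from the output above. On a live system this would be:
#   daos container list-snaps --pool=$DAOS_POOL --cont=$DAOS_CONT | latest_snapshot_epoch
printf '%s\n' "Container's snapshots :" '172478166024060928 ' '172646116775952384 ' |
    latest_snapshot_epoch
```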

Frequent errors users might see, and workarounds

use dmg command without daos_admin privilege

...

Code Block
languagebash
# Error message or timeout after dmg system query
$ dmg system query 
ERROR: dmg: Unable to load Certificate Data: could not load cert: stat /etc/daos/certs/admin.crt: no such file or directory
# or Node-hang after dmg system query command issued 

# Workaround
# 1. Make sure the admin-host /etc/daos/daos_control.yml is correctly configured. 
#    including:
#      hostlist: <daos_server_lists>
#      port: <port_num>
#      transport_config:
#        allow_insecure: <true/false>
#        ca_cert: /etc/daos/certs/daosCA.crt
#        cert: /etc/daos/certs/admin.crt
#        key: /etc/daos/certs/admin.key
#
# 2. Make sure the admin-host allow_insecure setting matches the servers' setting.
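
The "no such file or directory" error above points at a certificate path that does not exist on the admin host. The sketch below checks the paths named in a control file. It assumes the flat `key: value` layout shown in the workaround, and `missing_cert_files` is a hypothetical helper name. When `allow_insecure` is true on both sides, the certificates may not be needed at all.

```shell
# Hypothetical helper: prints "missing: <path>" for every ca_cert, cert
# or key entry in the given YAML file whose path does not exist.
missing_cert_files() {
    awk '$1 == "ca_cert:" || $1 == "cert:" || $1 == "key:" { print $2 }' "$1" |
    while read -r path; do
        [ -e "$path" ] || echo "missing: $path"
    done
}

# Demonstration on a throwaway file. On an admin host the argument
# would be /etc/daos/daos_control.yml.
demo_dir=$(mktemp -d)
touch "$demo_dir/daosCA.crt"
cat > "$demo_dir/daos_control.yml" <<EOF
transport_config:
  allow_insecure: false
  ca_cert: $demo_dir/daosCA.crt
  cert: $demo_dir/admin.crt
EOF
missing_cert_files "$demo_dir/daos_control.yml"   # reports admin.crt only
rm -rf "$demo_dir"
```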

...

Code Block
languagebash
$ dmg pool create --size=50G
Creating DAOS pool with automatic storage allocation: 50 GB NVMe + 6.00% SCM
ERROR: dmg: pool create failed: DER_NOSPACE(-1007): No space on storage target

# Workaround: run dmg storage query usage to find the currently available storage
$ dmg storage query usage
Hosts  SCM-Total SCM-Free SCM-Used NVMe-Total NVMe-Free NVMe-Used 
-----  --------- -------- -------- ---------- --------- --------- 
boro-8 17 GB     6.0 GB   65 %     0 B        0 B       N/A       

$ dmg pool create --size=2G
Creating DAOS pool with automatic storage allocation: 2.0 GB NVMe + 6.00% SCM
Pool created with 100.00% SCM/NVMe ratio
-----------------------------------------
  UUID          : b5ce2954-3f3e-4519-be04-ea298d776132
  Service Ranks : 0                                   
  Storage Ranks : 0                                   
  Total Size    : 2.0 GB                              
  SCM           : 2.0 GB (2.0 GB / rank)              
  NVMe          : 0 B (0 B / rank)                    

$ dmg storage query usage
Hosts  SCM-Total SCM-Free SCM-Used NVMe-Total NVMe-Free NVMe-Used 
-----  --------- -------- -------- ---------- --------- --------- 
boro-8 17 GB     2.9 GB   83 %     0 B        0 B       N/A       
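
A rough pre-check before `dmg pool create` can avoid the DER_NOSPACE round trip. The sketch below compares the nominal per-rank share of the requested size with the smallest SCM-Free value. It assumes the column layout shown above, decimal units, and an SCM-only pool split evenly across ranks; `min_scm_free_gb` and `pool_may_fit` are hypothetical helper names. It is a necessary condition only: in the output above a 2 GB pool reduced SCM-Free from 6.0 GB to 2.9 GB, so real consumption exceeds the nominal size.

```shell
# Hypothetical helpers; layout and units are assumptions based on the
# `dmg storage query usage` output shown above.

# Print the smallest SCM-Free value, in GB, from usage output on stdin.
min_scm_free_gb() {
    LC_ALL=C awk '
    function to_gb(v, u) {
        if (u == "TB") return v * 1000
        if (u == "GB") return v + 0
        if (u == "MB") return v / 1000
        if (u == "kB" || u == "KB") return v / 1000000
        return v / 1000000000            # plain bytes ("B")
    }
    NR > 2 && NF >= 5 {
        free = to_gb($4, $5)
        if (!seen || free < min) { min = free; seen = 1 }
    }
    END { if (seen) printf "%.1f\n", min }'
}

# pool_may_fit SIZE_GB NRANKS, with usage output on stdin. Succeeds when
# the nominal per-rank share does not exceed the smallest SCM-Free value.
pool_may_fit() {
    min_free=$(min_scm_free_gb)
    LC_ALL=C awk -v size="$1" -v n="$2" -v free="$min_free" \
        'BEGIN { exit !(size / n <= free) }'
}

usage='Hosts  SCM-Total SCM-Free SCM-Used NVMe-Total NVMe-Free NVMe-Used
-----  --------- -------- -------- ---------- --------- ---------
boro-8 17 GB     6.0 GB   65 %     0 B        0 B       N/A'

printf '%s\n' "$usage" | pool_may_fit 50 1 || echo "50 GB: will not fit"
printf '%s\n' "$usage" | pool_may_fit 2 1 && echo "2 GB: may fit"
```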

dmg pool destroy timeout

Code Block
languagebash
# dmg pool destroy times out or fails because the pool has active container(s)
# Workaround: use the pool destroy --force option

$ dmg pool destroy --pool=$DAOS_POOL --force
Pool-destroy command succeeded

...

Run with dfuse fio

required rpm

Code Block
languagebash
$ sudo yum install -y fio
# or
$ sudo yum install -y daos-tests

...

unmount

Code Block
languagebash
$ /usr/bin/fusermount -u /tmp/daos_test1/

$ /usr/bin/df -h -t fuse.daos
df: no file systems processed

...

Run with mpirun mdtest

required rpms

Code Block
languagebash
$ sudo yum install -y mpich
$ sudo yum install -y mdtest
$ sudo yum install -y Lmod
$ module load mpi/mpich-x86_64
$ /usr/bin/touch /tmp/daos_test1/testfile

...

Code Block
languagebash
# Bring up a 4-host DAOS server with an appropriate daos_server.yml and
# access point; refer to DAOS Set-Up.
# Run the following after the DAOS servers, admin and client are installed and started.

$ dmg storage format
Format Summary:
  Hosts             SCM Devices NVMe Devices 
  -----             ----------- ------------ 
  boro-[8,35,52-53] 1           0            

$ dmg pool list
Pool UUID                            Svc Replicas 
---------                            ------------ 
733bee7b-c2af-499e-99dd-313b1ef092a9 [1-3]        

$ daos cont create --pool=$DAOS_POOL --type=POSIX --oclass=RP_3G1 --properties=rf:2
Successfully created container 2649aa0f-3ad7-4943-abf5-4343205a637b 

$ daos pool list-cont --pool=$DAOS_POOL
2649aa0f-3ad7-4943-abf5-4343205a637b

$ dmg pool query --pool=$DAOS_POOL 
Pool 733bee7b-c2af-499e-99dd-313b1ef092a9, ntarget=32, disabled=0, leader=2, version=1 
Pool space info: 
- Target(VOS) count:32 
- SCM: 
  Total size: 5.0 GB 
  Free: 5.0 GB, min:156 MB, max:156 MB, mean:156 MB 
- NVMe: 
  Total size: 0 B 
  Free: 0 B, min:0 B, max:0 B, mean:0 B 
Rebuild idle, 0 objs, 0 recs

$ df -h -t fuse.daos
df: no file systems processed

$ mkdir /tmp/daos_test1

$ dfuse --mountpoint=/tmp/daos_test1 --pool=$DAOS_POOL --cont=$DAOS_CONT

$ df -h -t fuse.daos
Filesystem      Size  Used Avail Use% Mounted on
dfuse            19G  1.1M   19G   1% /tmp/daos_test1

$ fio --name=random-write --ioengine=pvsync --rw=randwrite --bs=4k --size=128M --nrfiles=4 --directory=/tmp/daos_test1 --numjobs=8 --iodepth=16 --runtime=60 --time_based --direct=1 --buffered=0 --randrepeat=0 --norandommap --refill_buffers --group_reporting
random-write: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=pvsync, iodepth=16
...
fio-3.7
Starting 8 processes
random-write: Laying out IO files (4 files / total 128MiB)
random-write: Laying out IO files (4 files / total 128MiB)
random-write: Laying out IO files (4 files / total 128MiB)
random-write: Laying out IO files (4 files / total 128MiB)
random-write: Laying out IO files (4 files / total 128MiB)
random-write: Laying out IO files (4 files / total 128MiB)
random-write: Laying out IO files (4 files / total 128MiB)
random-write: Laying out IO files (4 files / total 128MiB)
Jobs: 8 (f=32): [w(8)][100.0%][r=0KiB/s,w=96.1MiB/s][r=0,w=24.6k IOPS][eta 00m:00s]
random-write: (groupid=0, jobs=8): err= 0: pid=27879: Sat Apr 17 01:12:57 2021
  write: IOPS=24.4k, BW=95.3MiB/s (99.9MB/s)(5716MiB/60001msec)
    clat (usec): min=220, max=6687, avg=326.19, stdev=55.29
     lat (usec): min=220, max=6687, avg=326.28, stdev=55.29
    clat percentiles (usec):
     |  1.00th=[  260],  5.00th=[  273], 10.00th=[  285], 20.00th=[  293],
     | 30.00th=[  306], 40.00th=[  314], 50.00th=[  322], 60.00th=[  330],
     | 70.00th=[  338], 80.00th=[  355], 90.00th=[  375], 95.00th=[  396],
     | 99.00th=[  445], 99.50th=[  465], 99.90th=[  523], 99.95th=[  562],
     | 99.99th=[ 1827]
   bw (  KiB/s): min=10976, max=12496, per=12.50%, avg=12191.82, stdev=157.87, samples=952
   iops        : min= 2744, max= 3124, avg=3047.92, stdev=39.47, samples=952
  lat (usec)   : 250=0.23%, 500=99.61%, 750=0.15%, 1000=0.01%
  lat (msec)   : 2=0.01%, 4=0.01%, 10=0.01%
  cpu          : usr=0.81%, sys=1.69%, ctx=1463535, majf=0, minf=308
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,1463226,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
  WRITE: bw=95.3MiB/s (99.9MB/s), 95.3MiB/s-95.3MiB/s (99.9MB/s-99.9MB/s), io=5716MiB (5993MB), run=60001-60001msec
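
The headline figures in the fio summary can be cross-checked from the raw totals in the same output: 1463226 writes were issued (`issued rwts`), the run lasted 60001 msec, and the block size was 4096 bytes. The sketch below redoes that arithmetic; `fio_rates` is a hypothetical helper name.

```shell
# Hypothetical helper: fio_rates TOTAL_IOS RUNTIME_MSEC BLOCK_BYTES
# Derives IOPS and bandwidth from totals, in the units fio reports.
fio_rates() {
    LC_ALL=C awk -v ios="$1" -v ms="$2" -v bs="$3" 'BEGIN {
        iops = ios / (ms / 1000)        # operations per second
        bytes_per_s = iops * bs         # bytes per second
        printf "IOPS=%.1fk BW=%.1fMiB/s (%.1fMB/s)\n",
            iops / 1000, bytes_per_s / 1048576, bytes_per_s / 1000000
    }'
}

fio_rates 1463226 60001 4096   # prints: IOPS=24.4k BW=95.3MiB/s (99.9MB/s)
```

The result agrees with the `write: IOPS=24.4k, BW=95.3MiB/s (99.9MB/s)` line reported by fio.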

...