10-2-18
- Stephen Willson
- Jelon Anderson
Tip of master, commit 32aeb8b47ae0bdd69da80d3b51026865226295b1
All tests run with ofi+psm2, ib0.
daos_test: Run with 8 servers (boro-[4-11]) and 2 clients (boro-12,16). Servers were killed and /mnt/daos was cleaned between the runs listed below.
Tests that require a pool created via dmg used a 4GB pool, with boro-12 as the client.
mpich tests used boro-4 as server, boro-12 as client, with a 1GB pool.
Test Results
daos_test
Separate runs with cleanup in between:
- -mpcCAeioRdOr - PASS
daosperf
1K Records
CREDITS=1
[sdwillso@boro-4 daos_m]$ orterun --mca mtl ^psm2,ofi -np 1 -quiet --hostfile ~/scripts/host.cli.1 --ompi-server file:~/scripts/uri.txt -x DD_SUBSYS= -x DD_MASK= -x D_LOG_FILE=/tmp/daos_perf.log daos_perf -T daos -P 2G -d 1 -a 200 -r 1000 -s 1K -C 1 -t -z
ModuleCmd_Load.c(213):ERROR:105: Unable to locate a modulefile for 'openmpi-x86_64'
Test :
    DAOS (full stack)
Parameters :
    pool size : 2048 MB
    credits : 1 (sync I/O for -ve)
    obj_per_cont : 1 x 1 (procs)
    dkey_per_obj : 1
    akey_per_dkey : 200
    recx_per_akey : 1000
    value type : single
    value size : 1024
    zero copy : yes
    overwrite : yes
    verify fetch : no
    VOS file : <NULL>
87426454: rank 1 became pool service leader 0
Started...
update successfully completed:
    duration : 111.352682 sec
    bandwith : 1.754 MB/sec
    rate : 1796.10 IO/sec
    latency : 556.763 us (nonsense if credits > 1)
Duration across processes:
    MAX duration : 111.352682 sec
    MIN duration : 111.352682 sec
    Average duration : 111.352682 sec
87426454: rank 1 no longer pool service leader 0
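The rate, bandwidth, and latency in the CREDITS=1 run follow from the printed parameters and duration. A minimal sketch of that arithmetic, assuming daos_perf issues one I/O per record (akey_per_dkey × recx_per_akey updates on a single object and dkey) and reports MB as 2^20 bytes. Both assumptions are inferred from the numbers in the output, not taken from the daos_perf source.

```python
# Values copied from the daos_perf output for the 1K, CREDITS=1 run.
akey_per_dkey = 200
recx_per_akey = 1000
value_size = 1024          # bytes
duration_s = 111.352682

# Assumption: one I/O per record, with 1 object and 1 dkey.
total_ios = akey_per_dkey * recx_per_akey

rate = total_ios / duration_s                 # IO/sec
bandwidth = rate * value_size / 2**20         # MB/sec, assuming MB = 2**20 bytes
latency_us = duration_s / total_ios * 1e6     # meaningful only when credits = 1

print(f"rate      : {rate:.2f} IO/sec")
print(f"bandwidth : {bandwidth:.3f} MB/sec")
print(f"latency   : {latency_us:.3f} us")
```

With a single credit the I/Os are serialized, so latency is simply duration divided by I/O count; that is why the tool flags the latency figure as meaningless for credits > 1.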
CREDITS=8
CART-496
4K Records
CREDITS=1
CART-496
IOR, 50GB pool, data verification enabled
[sdwillso@boro-4 daos_m]$ orterun -x FI_PSM2_DISCONNECT=1 -N 1 --hostfile ~/hostlists/daos_client_hostlist --mca mtl ^psm2,ofi --ompi-server file:~/scripts/uri.txt ior -v -W -i 1 -a DAOS -w -o `uuidgen` -b 5g -t 1m -- -p 9c8b96e6-964d-4c07-bfbc-aac53ec066e9 -v 1 -r 1m -s 1m -c 1024 -a 16 -o LARGE -e 1
ModuleCmd_Load.c(213):ERROR:105: Unable to locate a modulefile for 'openmpi-x86_64'
ior WARNING: assuming POSIX-based backend for DAOS statfs call.
ior WARNING: assuming POSIX-based backend for DAOS mkdir call.
ior WARNING: assuming POSIX-based backend for DAOS rmdir call.
ior WARNING: assuming POSIX-based backend for DAOS access call.
ior WARNING: assuming POSIX-based backend for DAOS stat call.
ior WARNING: assuming POSIX-based backend for DAOS statfs call.
ior WARNING: assuming POSIX-based backend for DAOS mkdir call.
ior WARNING: assuming POSIX-based backend for DAOS rmdir call.
ior WARNING: assuming POSIX-based backend for DAOS access call.
ior WARNING: assuming POSIX-based backend for DAOS stat call.
IOR-3.1.0: MPI Coordinated Test of Parallel I/O
Began : Tue Oct 2 22:31:12 2018
Command line : ior -v -W -i 1 -a DAOS -w -o 1c36ddfa-51c6-43db-96a0-934696fbc0f9 -b 5g -t 1m -- -p 9c8b96e6-964d-4c07-bfbc-aac53ec066e9 -v 1 -r 1m -s 1m -c 1024 -a 16 -o LARGE -e 1
Machine : Linux boro-12.boro.hpdd.intel.com
Start time skew across all tasks: 14690266.15 sec
TestID : 0
StartTime : Tue Oct 2 22:31:12 2018
Path : /home/sdwillso/daos_m
FS : 3.8 TiB Used FS: 14.0% Inodes: 250.0 Mi Used Inodes: 3.0%
Participating tasks: 2
[0] WARNING: USING daosStripeMax CAUSES READS TO RETURN INVALID DATA
Options:
    api : DAOS
    apiVersion : DAOS
    test filename : 1c36ddfa-51c6-43db-96a0-934696fbc0f9
    access : single-shared-file
    type : independent
    segments : 1
    ordering in a file : sequential
    ordering inter file : no tasks offsets
    tasks : 2
    clients per node : 1
    repetitions : 1
    xfersize : 1 MiB
    blocksize : 5 GiB
    aggregate filesize : 10 GiB
Results:
access    bw(MiB/s)  block(KiB)  xfer(KiB)  open(s)   wr/rd(s)  close(s)  total(s)  iter
------    ---------  ----------  ---------  --------  --------  --------  --------  ----
Commencing write performance test: Tue Oct 2 22:31:14 2018
write     4479       5242880     1024.00    0.044826  2.21      0.027054  2.29      0
Verifying contents of the file(s) just written.
Tue Oct 2 22:31:16 2018
remove    -          -           -          -         -         -         0.000068  0
Max Write: 4479.36 MiB/sec (4696.95 MB/sec)
Summary of all tests:
Operation Max(MiB) Min(MiB) Mean(MiB) StdDev Max(OPs) Min(OPs) Mean(OPs) StdDev Mean(s) Test# #Tasks tPN reps fPP reord reordoff reordrand seed segcnt blksiz xsize aggs(MiB) API RefNum
write 4479.36 4479.36 4479.36 0.00 4479.36 4479.36 4479.36 0.00 2.28604 0 2 1 1 0 0 1 0 0 1 5368709120 1048576 10240.0 DAOS 0
Finished : Tue Oct 2 22:31:25 2018
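The IOR write bandwidth can be cross-checked against the summary row. A small sketch, using only values printed in the output (2 tasks, 5 GiB block per task, Mean(s) of 2.28604) and assuming the headline figure is aggregate size divided by mean time; the variable names are illustrative.

```python
MIB = 2**20

# Values copied from the IOR options and summary table.
tasks = 2
block_size = 5 * 2**30     # -b 5g, bytes written per task
mean_time_s = 2.28604      # Mean(s) column

aggregate_mib = tasks * block_size / MIB      # matches aggs(MiB) = 10240.0
write_mib_s = aggregate_mib / mean_time_s     # binary megabytes per second
write_mb_s = write_mib_s * MIB / 1e6          # decimal megabytes per second

print(f"aggregate : {aggregate_mib:.1f} MiB")
print(f"write     : {write_mib_s:.2f} MiB/sec ({write_mb_s:.2f} MB/sec)")
```

The two figures on the "Max Write" line differ only by the MiB to MB conversion factor of 1.048576.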
daos_bench
kv-idx-update
Time: 614.043987 seconds (1628.547826 ops per second)
DAOS-1243
[sdwillso@boro-4 daos_m]$ orterun -np 1 --mca mtl ^psm2,ofi --hostfile ~/hostlists/daos_client_hostlist --ompi-server file:~/scripts/uri.txt daosbench --test=kv-idx-update --testid=1 --svc=1 --dpool=32b9e029-a757-45c1-a1fc-cf8b346fc18f --container=`uuidgen` --object-class=tiny --aios=32 --indexes=1000000
ModuleCmd_Load.c(213):ERROR:105: Unable to locate a modulefile for 'openmpi-x86_64'
================================
DAOSBENCH (KV)
Started at Tue Oct 2 22:44:24 2018
=================================
===============================
Test Setup
---------------
Test: kv-idx-update
DAOS pool :32b9e029-a757-45c1-a1fc-cf8b346fc18f
DAOS container :6f4c6523-5fc0-42e4-a9b7-8ead4463d9ea
Value buffer size: 64
Number of processes: 1
Number of indexes/process: 1000000
Number of asynchronous I/O: 32
===============================
kv-idx-update
Time: 614.043987 seconds (1628.547826 ops per second)
daosbench:0:src/tests/daosbench.c:765: Unknown error 2001: Container destroy failed
kv-dkey-update
Time: 0.137489 seconds (727.331513 ops per second)
[sdwillso@boro-4 daos_m]$ orterun -np 1 --mca mtl ^psm2,ofi --hostfile ~/hostlists/daos_client_hostlist --ompi-server file:~/scripts/uri.txt daosbench --test=kv-dkey-update --testid=1 --svc=1 --dpool=fa67ca50-bbe9-40be-ae42-9a384fa9fd37 --container=`uuidgen` --object-class=tiny --aios=32 --indexes=1000000
ModuleCmd_Load.c(213):ERROR:105: Unable to locate a modulefile for 'openmpi-x86_64'
================================
DAOSBENCH (KV)
Started at Tue Oct 2 22:56:49 2018
=================================
===============================
Test Setup
---------------
Test: kv-dkey-update
DAOS pool :fa67ca50-bbe9-40be-ae42-9a384fa9fd37
DAOS container :333d6f06-b3f8-4589-b841-57a76d9fea47
Value buffer size: 64
Number of processes: 1
Number of keys/process: 100
Number of asynchronous I/O: 32
===============================
kv-dkey-update
Time: 0.137489 seconds (727.331513 ops per second)
Ended at Tue Oct 2 22:56:51 2018
kv-akey-update
Time: 0.070309 seconds (1422.285973 ops per second)
[sdwillso@boro-4 daos_m]$ orterun -np 1 --mca mtl ^psm2,ofi --hostfile ~/hostlists/daos_client_hostlist --ompi-server file:~/scripts/uri.txt daosbench --test=kv-akey-update --testid=1 --svc=1 --dpool=9327cb08-5121-4586-b58d-22160e2d11d1 --container=`uuidgen` --object-class=tiny --aios=32 --indexes=1000000
ModuleCmd_Load.c(213):ERROR:105: Unable to locate a modulefile for 'openmpi-x86_64'
================================
DAOSBENCH (KV)
Started at Tue Oct 2 22:58:09 2018
=================================
===============================
Test Setup
---------------
Test: kv-akey-update
DAOS pool :9327cb08-5121-4586-b58d-22160e2d11d1
DAOS container :c0459b84-5fa0-46fa-a372-5c4cab702b80
Value buffer size: 64
Number of processes: 1
Number of keys/process: 100
Number of asynchronous I/O: 32
===============================
kv-akey-update
Time: 0.070309 seconds (1422.285973 ops per second)
Ended at Tue Oct 2 22:58:11 2018
kv-dkey-fetch
Time: 0.071141 seconds (1405.663142 ops per second)
[sdwillso@boro-4 daos_m]$ orterun -np 1 --mca mtl ^psm2,ofi --hostfile ~/hostlists/daos_client_hostlist --ompi-server file:~/scripts/uri.txt daosbench --test=kv-dkey-fetch --testid=1 --svc=1 --dpool=9c67989d-19e8-4b2c-a4c3-6c6ec6df8196 --container=`uuidgen` --object-class=tiny --aios=32 --indexes=1000000
ModuleCmd_Load.c(213):ERROR:105: Unable to locate a modulefile for 'openmpi-x86_64'
================================
DAOSBENCH (KV)
Started at Tue Oct 2 22:59:12 2018
=================================
===============================
Test Setup
---------------
Test: kv-dkey-fetch
DAOS pool :9c67989d-19e8-4b2c-a4c3-6c6ec6df8196
DAOS container :4fd98e33-ef0d-4d6c-a8a1-5b17fcd47aef
Value buffer size: 64
Number of processes: 1
Number of keys/process: 100
Number of asynchronous I/O: 32
===============================
kv-dkey-fetch
Time: 0.071141 seconds (1405.663142 ops per second)
Ended at Tue Oct 2 22:59:13 2018
kv-akey-fetch
Time: 0.042142 seconds (2372.908955 ops per second)
[sdwillso@boro-4 daos_m]$ orterun -np 1 --mca mtl ^psm2,ofi --hostfile ~/hostlists/daos_client_hostlist --ompi-server file:~/scripts/uri.txt daosbench --test=kv-akey-fetch --testid=1 --svc=1 --dpool=1d714278-cef5-4cf8-b9ad-273a47dc2b3f --container=`uuidgen` --object-class=tiny --aios=32 --indexes=1000000
ModuleCmd_Load.c(213):ERROR:105: Unable to locate a modulefile for 'openmpi-x86_64'
================================
DAOSBENCH (KV)
Started at Tue Oct 2 23:00:16 2018
=================================
===============================
Test Setup
---------------
Test: kv-akey-fetch
DAOS pool :1d714278-cef5-4cf8-b9ad-273a47dc2b3f
DAOS container :91781881-2477-47d5-8bd7-2c1f22f127c4
Value buffer size: 64
Number of processes: 1
Number of keys/process: 100
Number of asynchronous I/O: 32
===============================
kv-akey-fetch
Time: 0.042142 seconds (2372.908955 ops per second)
Ended at Tue Oct 2 23:00:17 2018
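The daosbench "ops per second" figures can be reproduced from the elapsed time and the per-process count printed in each Test Setup block. Note that only kv-idx-update reports 1000000 indexes; the dkey/akey tests report 100 keys even though --indexes=1000000 was passed. The sketch below assumes each index or key counts as one operation, which is inferred from the printed numbers rather than from the daosbench source.

```python
# (operation count from "Test Setup", elapsed seconds, reported ops/sec)
runs = {
    "kv-idx-update":  (1_000_000, 614.043987, 1628.547826),
    "kv-dkey-update": (100, 0.137489, 727.331513),
    "kv-akey-update": (100, 0.070309, 1422.285973),
    "kv-dkey-fetch":  (100, 0.071141, 1405.663142),
    "kv-akey-fetch":  (100, 0.042142, 2372.908955),
}


def ops_per_second(count: int, seconds: float) -> float:
    """Throughput for one process, assuming one operation per index/key."""
    return count / seconds


for name, (count, seconds, reported) in runs.items():
    computed = ops_per_second(count, seconds)
    # Elapsed time is printed to 6 decimals, so allow a small relative error.
    matches = abs(computed - reported) / reported < 1e-4
    print(f"{name:15s} computed={computed:12.3f} reported={reported:12.3f} match={matches}")
```

Because the 100-key runs finish in well under a second, their throughput figures are far more sensitive to startup noise than the 614-second kv-idx-update run.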
CaRT Self-Test
Small IO
[sdwillso@boro-4 daos_m]$ orterun -np 1 -ompi-server file:~/scripts/uri.txt self_test --group-name daos_server --endpoint 0:0 --message-sizes 0 --max-inflight-rpcs 16 --repetitions 100000
Adding endpoints:
  ranks: 0 (# ranks = 1)
  tags: 0 (# tags = 1)
Warning: No --master-endpoint specified; using this command line application as the master endpoint
Self Test Parameters:
  Group name to test against: daos_server
  # endpoints: 1
  Message sizes: [(0-EMPTY 0-EMPTY)]
  Buffer addresses end with: <Default>
  Repetitions per size: 100000
  Max inflight RPCs: 16
host boro-4.boro.hpdd.intel.com finished self_test duration 19.502825 S.
##################################################
Results for message size (0-EMPTY 0-EMPTY) (max_inflight_rpcs = 16):
Master Endpoint 0:0
-------------------
  RPC Bandwidth (MB/sec): 0.00
  RPC Throughput (RPCs/sec): 5127
  RPC Latencies (us):
    Min : 1126
    25th %: 3051
    Median : 3080
    75th %: 3112
    Max : 17075
    Average: 3087
    Std Dev: 100.32
  RPC Failures: 0
  Endpoint results (rank:tag - Median Latency (us)):
    0:0 - 3080
Large IO Bulk PUT
[sdwillso@boro-4 daos_m]$ orterun -np 1 -ompi-server file:~/scripts/uri.txt self_test --group-name daos_server --endpoint 0:0 --message-sizes "0 b1048576" --max-inflight-rpcs 16 --repetitions 1000
Adding endpoints:
  ranks: 0 (# ranks = 1)
  tags: 0 (# tags = 1)
Warning: No --master-endpoint specified; using this command line application as the master endpoint
Self Test Parameters:
  Group name to test against: daos_server
  # endpoints: 1
  Message sizes: [(0-EMPTY 1048576-BULK_PUT)]
  Buffer addresses end with: <Default>
  Repetitions per size: 1000
  Max inflight RPCs: 16
host boro-4.boro.hpdd.intel.com finished self_test duration 0.338606 S.
##################################################
Results for message size (0-EMPTY 1048576-BULK_PUT) (max_inflight_rpcs = 16):
Master Endpoint 0:0
-------------------
  RPC Bandwidth (MB/sec): 2953.28
  RPC Throughput (RPCs/sec): 2953
  RPC Latencies (us):
    Min : 2272
    25th %: 5332
    Median : 5361
    75th %: 5401
    Max : 6320
    Average: 5352
    Std Dev: 255.75
  RPC Failures: 0
  Endpoint results (rank:tag - Median Latency (us)):
    0:0 - 5361
Large IO Bulk GET
[sdwillso@boro-4 daos_m]$ orterun -np 1 -ompi-server file:~/scripts/uri.txt self_test --group-name daos_server --endpoint 0:0 --message-sizes "b1048576 0" --max-inflight-rpcs 16 --repetitions 1000
Adding endpoints:
  ranks: 0 (# ranks = 1)
  tags: 0 (# tags = 1)
Warning: No --master-endpoint specified; using this command line application as the master endpoint
Self Test Parameters:
  Group name to test against: daos_server
  # endpoints: 1
  Message sizes: [(1048576-BULK_GET 1048576-BULK_PUT)]
  Buffer addresses end with: <Default>
  Repetitions per size: 1000
  Max inflight RPCs: 16
host boro-4.boro.hpdd.intel.com finished self_test duration 0.548675 S.
##################################################
Results for message size (1048576-BULK_GET 1048576-BULK_PUT) (max_inflight_rpcs = 16):
Master Endpoint 0:0
-------------------
  RPC Bandwidth (MB/sec): 3645.14
  RPC Throughput (RPCs/sec): 1823
  RPC Latencies (us):
    Min : 3590
    25th %: 8701
    Median : 8770
    75th %: 8829
    Max : 12066
    Average: 8705
    Std Dev: 598.18
  RPC Failures: 0
  Endpoint results (rank:tag - Median Latency (us)):
    0:0 - 8770
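The self_test throughput and bandwidth figures are consistent with repetitions divided by duration, multiplied by the bytes moved per RPC. A rough sketch under two assumptions inferred from the printed results rather than from the CaRT source: bandwidth counts the send and reply sizes together, and MB means 2^20 bytes.

```python
MIB = 2**20

# (label, repetitions, duration in s, send bytes, reply bytes,
#  reported RPCs/sec, reported MB/sec), all copied from the three runs.
runs = [
    ("small IO", 100_000, 19.502825, 0,   0,   5127, 0.00),
    ("bulk PUT", 1_000,   0.338606,  0,   MIB, 2953, 2953.28),
    ("bulk GET", 1_000,   0.548675,  MIB, MIB, 1823, 3645.14),
]


def throughput(reps: int, duration_s: float) -> float:
    """RPCs completed per second over the whole run."""
    return reps / duration_s


def bandwidth(reps: int, duration_s: float, send: int, reply: int) -> float:
    """MB/sec, assuming both directions count and MB = 2**20 bytes."""
    return throughput(reps, duration_s) * (send + reply) / MIB


for label, reps, dur, send, reply, rep_tput, rep_bw in runs:
    print(f"{label}: {throughput(reps, dur):8.1f} RPCs/sec (reported {rep_tput}), "
          f"{bandwidth(reps, dur, send, reply):8.2f} MB/sec (reported {rep_bw})")
```

Under these assumptions the third run moves 2 MiB per RPC, which is why its bandwidth is higher than the PUT-only run even though its RPC rate is lower.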