8-20-18
- Stephen Willson (Unlicensed)
- Jelon Anderson (Deactivated)
Owned by Stephen Willson (Unlicensed)
Tip of master, commit cf89d9f3cc2ec5e5fbc8c33ea9f51e83ac23429a
All tests run with ofi+psm2, ib0.
daos_test: Run with 8 servers (boro-[4-11]) and 2 clients (boro-[12-13]). Servers were killed and /mnt/daos was cleaned between the runs listed below.
Tests that require a pool created via dmg used a 4GB pool, with boro-12 as the client.
mpich tests used boro-4 as server, boro-12 as client, with a 1GB pool.
Test Results
daos_test
Separate runs with cleanup in between:
- -mpcCAeoRd - PASS
- -i - FAIL, still rebuilding on IO27 after 10 minutes
- Appears to be DAOS-1207
- -r - FAIL
- Looks to be the same as -i above: still rebuilding after 10 minutes
- -O - PASS
daosperf
1K Records
CREDITS=1
[sdwillso@boro-4 daos_m]$ orterun -x FI_PSM2_DISCONNECT=1 --mca mtl ^psm2,ofi -np 1 -quiet --hostfile ~/scripts/host.cli.1 --ompi-server file:~/scripts/uri.txt -x DD_SUBSYS= -x DD_MASK= -x D_LOG_FILE=/tmp/daos_perf.log daos_perf -T daos -P 2G -d 1 -a 200 -r 1000 -s 1K -C 1 -t -z
Test :
    DAOS (full stack)
Parameters :
    pool size : 2048 MB
    credits : 1 (sync I/O for -ve)
    obj_per_cont : 1 x 1 (procs)
    dkey_per_obj : 1
    akey_per_dkey : 200
    recx_per_akey : 1000
    value type : single
    value size : 1024
    zero copy : yes
    overwrite : yes
    verify fetch : no
    VOS file : <NULL>
e77e006e: rank 1 became pool service leader 0
Started...
update successfully completed:
    duration : 5.539411 sec
    bandwith : 35.259 MB/sec
    rate : 36104.92 IO/sec
    latency : 27.697 us (nonsense if credits > 1)
Duration across processes:
    MAX duration : 5.539411 sec
    MIN duration : 5.539411 sec
    Average duration : 5.539411 sec
e77e006e: rank 1 no longer pool service leader 0
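The rate, bandwidth, and latency reported by daos_perf above can be cross-checked against the run parameters. The following is a minimal sketch, assuming the tool counts one I/O per record (akeys times records per akey) and that its "MB/sec" means 2**20 bytes per second; the function and variable names are illustrative and not taken from daos_perf.

```python
# Re-derive the daos_perf summary figures from the run parameters.
# Assumptions (not confirmed against the tool's source): one I/O per record,
# and "MB/sec" meaning 2**20 bytes per second.

AKEYS_PER_DKEY = 200    # -a 200
RECX_PER_AKEY = 1000    # -r 1000
VALUE_SIZE = 1024       # -s 1K, in bytes
DURATION_S = 5.539411   # "duration" line from the output


def derive(akeys, recx, value_size, duration_s):
    """Return (rate in IO/sec, bandwidth in MiB/sec, latency in us)."""
    total_ios = akeys * recx
    rate = total_ios / duration_s
    bandwidth = rate * value_size / 2**20
    latency_us = 1e6 / rate  # only meaningful with credits = 1
    return rate, bandwidth, latency_us


rate, bandwidth, latency_us = derive(
    AKEYS_PER_DKEY, RECX_PER_AKEY, VALUE_SIZE, DURATION_S
)
print(f"rate      : {rate:.2f} IO/sec")
print(f"bandwidth : {bandwidth:.3f} MB/sec")
print(f"latency   : {latency_us:.3f} us")
```

Under these assumptions the derived values agree with the reported 36104.92 IO/sec, 35.259 MB/sec, and 27.697 us to the printed precision.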
CREDITS=8
- Hit a segfault: CART-496 (segfault in psm2 while running daos_perf), status OPEN
4K Records
CREDITS=1
- Hit a segfault: CART-496 (segfault in psm2 while running daos_perf), status OPEN
IOR, 10GB pool, data verification enabled
[sdwillso@boro-4 daos_m]$ orterun -np 1 -x FI_PSM2_DISCONNECT=1 --hostfile ~/hostlists/daos_client_hostlist --mca mtl ^psm2,ofi --ompi-server file:~/scripts/uri.txt ior -v -W -i 5 -a DAOS -w -o `uuidgen` -b 5g -t 1m -O daospool=cc15805e-1178-41f5-8542-bccdcc7aadce,daosrecordsize=1m,daosstripesize=1m,daosstripecount=1024,daosaios=16,daosobjectclass=LARGE,daosPoolSvc=1,daosepoch=1
IOR-3.0.1: MPI Coordinated Test of Parallel I/O
Began: Mon Aug 20 18:42:00 2018
Command line used: ior -v -W -i 5 -a DAOS -w -o 178063f8-d7aa-4d34-8dd9-4c9218e56101 -b 5g -t 1m -O daospool=cc15805e-1178-41f5-8542-bccdcc7aadce,daosrecordsize=1m,daosstripesize=1m,daosstripecount=1024,daosaios=16,daosobjectclass=LARGE,daosPoolSvc=1,daosepoch=1
Machine: Linux boro-12.boro.hpdd.intel.com
Start time skew across all tasks: 0.00 sec
Test 0 started: Mon Aug 20 18:42:00 2018
Path: /home/sdwillso/daos_m
FS: 3.8 TiB Used FS: 12.6% Inodes: 250.0 Mi Used Inodes: 2.5%
Participating tasks: 1
[0] WARNING: USING daosStripeMax CAUSES READS TO RETURN INVALID DATA
Summary:
    api = DAOS
    test filename = 178063f8-d7aa-4d34-8dd9-4c9218e56101
    access = single-shared-file, independent
    pattern = segmented (1 segment)
    ordering in a file = sequential offsets
    ordering inter file= no tasks offsets
    clients = 1 (1 per node)
    repetitions = 5
    xfersize = 1 MiB
    blocksize = 5 GiB
    aggregate filesize = 5 GiB
access bw(MiB/s) block(KiB) xfer(KiB) open(s) wr/rd(s) close(s) total(s) iter
------ --------- ---------- --------- -------- -------- -------- -------- ----
Commencing write performance test: Mon Aug 20 18:42:01 2018
write 5413 5242880 1024.00 0.006197 0.936376 0.003321 0.945910 0
Verifying contents of the file(s) just written.
Mon Aug 20 18:42:02 2018
remove - - - - - - 0.001939 0
Commencing write performance test: Mon Aug 20 18:42:05 2018
write 5554 5242880 1024.00 0.001194 0.918304 0.002265 0.921780 1
Verifying contents of the file(s) just written.
Mon Aug 20 18:42:06 2018
remove - - - - - - 0.000892 1
Commencing write performance test: Mon Aug 20 18:42:10 2018
write 5516 5242880 1024.00 0.001179 0.924887 0.002051 0.928131 2
Verifying contents of the file(s) just written.
Mon Aug 20 18:42:11 2018
remove - - - - - - 0.000912 2
Commencing write performance test: Mon Aug 20 18:42:14 2018
write 5558 5242880 1024.00 0.001197 0.917913 0.002022 0.921148 3
Verifying contents of the file(s) just written.
Mon Aug 20 18:42:15 2018
remove - - - - - - 0.000862 3
Commencing write performance test: Mon Aug 20 18:42:18 2018
write 5578 5242880 1024.00 0.001147 0.914605 0.002092 0.917854 4
Verifying contents of the file(s) just written.
Mon Aug 20 18:42:19 2018
remove - - - - - - 0.000922 4
Max Write: 5578.23 MiB/sec (5849.19 MB/sec)
Summary of all tests:
Operation Max(MiB) Min(MiB) Mean(MiB) StdDev Mean(s) Test# #Tasks tPN reps fPP reord reordoff reordrand seed segcnt blksiz xsize aggsize API RefNum
write 5578.23 5412.78 5524.04 59.11 0.92696 0 1 1 5 0 0 1 0 0 1 5368709120 1048576 5368709120 DAOS 0
Finished: Mon Aug 20 18:42:25 2018
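The IOR summary row above can be reproduced from the per-iteration total(s) column. This is a minimal sketch, assuming that for this single-task, single-segment run the write bandwidth is simply the 5 GiB block size divided by each iteration's total time; the names used are illustrative.

```python
# Recompute the IOR write summary from the per-iteration "total(s)" values.
# Assumption: bandwidth = aggregate file size / total time per iteration.

BLOCK_MIB = 5 * 1024  # -b 5g, expressed in MiB
TOTAL_S = [0.945910, 0.921780, 0.928131, 0.921148, 0.917854]  # iters 0..4


def write_bandwidths(block_mib, totals_s):
    """Return the bandwidth in MiB/sec for each iteration."""
    return [block_mib / t for t in totals_s]


bw = write_bandwidths(BLOCK_MIB, TOTAL_S)
max_mib = max(bw)
min_mib = min(bw)
mean_mib = sum(bw) / len(bw)
mean_s = sum(TOTAL_S) / len(TOTAL_S)
max_mb = max_mib * 2**20 / 1e6  # MiB/sec converted to decimal MB/sec

print(f"Max(MiB)  : {max_mib:.2f}")
print(f"Min(MiB)  : {min_mib:.2f}")
print(f"Mean(MiB) : {mean_mib:.2f}")
print(f"Mean(s)   : {mean_s:.5f}")
print(f"Max Write : {max_mb:.2f} MB/sec")
```

Because the total(s) values are themselves rounded to six decimals, the last printed digit can differ slightly from what IOR reports.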
daos_bench
kv-idx-update
- At end of this test with multiple servers, container destroy fails
- DAOS-1243
Time: 104.697483 seconds (9551.328003 ops per second)
[sdwillso@boro-4 daos_m]$ orterun -np 1 -x FI_PSM2_DISCONNECT=1 --mca mtl ^psm2,ofi --hostfile ~/hostlists/daos_client_hostlist --ompi-server file:~/scripts/uri.txt daosbench --test=kv-idx-update --testid=1 --svc=1 --dpool=96f11eff-7475-4158-90ae-b55fae60cfaf --container=`uuidgen` --object-class=tiny --aios=32 --indexes=1000000
================================
DAOSBENCH (KV)
Started at Mon Aug 20 18:50:25 2018
=================================
===============================
Test Setup
---------------
Test: kv-idx-update
DAOS pool :96f11eff-7475-4158-90ae-b55fae60cfaf
DAOS container :fbe891b8-d983-4289-a111-12c0682ae9db
Value buffer size: 64
Number of processes: 1
Number of indexes/process: 1000000
Number of asynchronous I/O: 32
===============================
kv-idx-update
Time: 104.697483 seconds (9551.328003 ops per second)
kv-dkey-update
Time: 0.006400 seconds (15624.158021 ops per second)
[sdwillso@boro-4 daos_m]$ orterun -np 1 -x FI_PSM2_DISCONNECT=1 --mca mtl ^psm2,ofi --hostfile ~/hostlists/daos_client_hostlist --ompi-server file:~/scripts/uri.txt daosbench --test=kv-dkey-update --testid=1 --svc=1 --dpool=3031f327-4f8e-4489-8de6-b2994b8aff5a --container=`uuidgen` --object-class=tiny --aios=32 --indexes=1000000
================================
DAOSBENCH (KV)
Started at Mon Aug 20 18:59:08 2018
=================================
===============================
Test Setup
---------------
Test: kv-dkey-update
DAOS pool :3031f327-4f8e-4489-8de6-b2994b8aff5a
DAOS container :99ad75bd-f914-4449-becf-b5e64f278d79
Value buffer size: 64
Number of processes: 1
Number of keys/process: 100
Number of asynchronous I/O: 32
===============================
kv-dkey-update
Time: 0.006400 seconds (15624.158021 ops per second)
Ended at Mon Aug 20 18:59:09 2018
kv-akey-update
Time: 0.004211 seconds (23746.177339 ops per second)
[sdwillso@boro-4 daos_m]$ orterun -np 1 -x FI_PSM2_DISCONNECT=1 --mca mtl ^psm2,ofi --hostfile ~/hostlists/daos_client_hostlist --ompi-server file:~/scripts/uri.txt daosbench --test=kv-akey-update --testid=1 --svc=1 --dpool=d09aa21b-7f78-4ede-b437-cc7b68f13037 --container=`uuidgen` --object-class=tiny --aios=32 --indexes=1000000
================================
DAOSBENCH (KV)
Started at Mon Aug 20 19:00:59 2018
=================================
===============================
Test Setup
---------------
Test: kv-akey-update
DAOS pool :d09aa21b-7f78-4ede-b437-cc7b68f13037
DAOS container :5c00f77c-0eb7-4672-90fa-d37724a66c5a
Value buffer size: 64
Number of processes: 1
Number of keys/process: 100
Number of asynchronous I/O: 32
===============================
kv-akey-update
Time: 0.004211 seconds (23746.177339 ops per second)
Ended at Mon Aug 20 19:01:00 2018
kv-dkey-fetch
Time: 0.000601 seconds (166508.774672 ops per second)
[sdwillso@boro-4 daos_m]$ orterun -np 1 -x FI_PSM2_DISCONNECT=1 --mca mtl ^psm2,ofi --hostfile ~/hostlists/daos_client_hostlist --ompi-server file:~/scripts/uri.txt daosbench --test=kv-dkey-fetch --testid=1 --svc=1 --dpool=ffeb7914-5b21-4865-80fd-b3dc0ad6bc1c --container=`uuidgen` --object-class=tiny --aios=32 --indexes=1000000
================================
DAOSBENCH (KV)
Started at Mon Aug 20 19:02:29 2018
=================================
===============================
Test Setup
---------------
Test: kv-dkey-fetch
DAOS pool :ffeb7914-5b21-4865-80fd-b3dc0ad6bc1c
DAOS container :396c6185-c711-4892-9c32-530212979b9d
Value buffer size: 64
Number of processes: 1
Number of keys/process: 100
Number of asynchronous I/O: 32
===============================
kv-dkey-fetch
Time: 0.000601 seconds (166508.774672 ops per second)
Ended at Mon Aug 20 19:02:31 2018
kv-akey-fetch
Time: 0.001685 seconds (59362.699540 ops per second)
[sdwillso@boro-4 daos_m]$ orterun -np 1 -x FI_PSM2_DISCONNECT=1 --mca mtl ^psm2,ofi --hostfile ~/hostlists/daos_client_hostlist --ompi-server file:~/scripts/uri.txt daosbench --test=kv-akey-fetch --testid=1 --svc=1 --dpool=ba2b92f5-2ac9-42e4-8070-6b6b13531b80 --container=`uuidgen` --object-class=tiny --aios=32 --indexes=1000000
================================
DAOSBENCH (KV)
Started at Mon Aug 20 19:04:00 2018
=================================
===============================
Test Setup
---------------
Test: kv-akey-fetch
DAOS pool :ba2b92f5-2ac9-42e4-8070-6b6b13531b80
DAOS container :cbeeb6b2-4ec2-4aa4-82ff-9b8859710ae8
Value buffer size: 64
Number of processes: 1
Number of keys/process: 100
Number of asynchronous I/O: 32
===============================
kv-akey-fetch
Time: 0.001685 seconds (59362.699540 ops per second)
Ended at Mon Aug 20 19:04:01 2018
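The ops-per-second figures in the five daosbench runs above can be sanity-checked from the reported times. This is a rough sketch, assuming the operation count equals the "Number of indexes/process" or "Number of keys/process" line in each run's Test Setup; that mapping is an assumption, not something confirmed from the daosbench source. Note that four of the runs report 100 keys per process even though --indexes=1000000 was passed, so their timings cover only a few milliseconds.

```python
# Compare ops/sec recomputed from (operation count, elapsed time) with the
# value daosbench printed. Operation counts are assumed from "Test Setup".

RUNS = {
    # test name: (assumed op count, reported seconds, reported ops/sec)
    "kv-idx-update": (1_000_000, 104.697483, 9551.328003),
    "kv-dkey-update": (100, 0.006400, 15624.158021),
    "kv-akey-update": (100, 0.004211, 23746.177339),
    "kv-dkey-fetch": (100, 0.000601, 166508.774672),
    "kv-akey-fetch": (100, 0.001685, 59362.699540),
}


def relative_error(ops, seconds, reported):
    """Relative difference between ops/seconds and the reported rate."""
    return abs(ops / seconds - reported) / reported


errors = {name: relative_error(*vals) for name, vals in RUNS.items()}
for name, err in errors.items():
    print(f"{name:15s} relative error {err:.2e}")
```

Small discrepancies are expected for the short runs, since the printed time is rounded to microseconds.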
CaRT Self-Test
Small IO
[sdwillso@boro-4 ~]$ orterun -x FI_PSM2_DISCONNECT=1 -np 1 -ompi-server file:~/scripts/uri.txt --hostfile ~/hostlists/daos_single_server_2 self_test --group-name daos_server --endpoint 0:0 --message-sizes 0 --max-inflight-rpcs 16 --repetitions 100000
Adding endpoints:
  ranks: 0 (# ranks = 1)
  tags: 0 (# tags = 1)
Warning: No --master-endpoint specified; using this command line application as the master endpoint
Self Test Parameters:
  Group name to test against: daos_server
  # endpoints: 1
  Message sizes: [(0-EMPTY 0-EMPTY)]
  Buffer addresses end with: <Default>
  Repetitions per size: 100000
  Max inflight RPCs: 16
host boro-6.boro.hpdd.intel.com finished self_test duration 0.309352 S.
##################################################
Results for message size (0-EMPTY 0-EMPTY) (max_inflight_rpcs = 16):
Master Endpoint 0:0
-------------------
  RPC Bandwidth (MB/sec): 0.00
  RPC Throughput (RPCs/sec): 323257
  RPC Latencies (us):
    Min : 19
    25th %: 44
    Median : 44
    75th %: 50
    Max : 1433
    Average: 49
    Std Dev: 13.66
  RPC Failures: 0
  Endpoint results (rank:tag - Median Latency (us)):
    0:0 - 44
Large IO Bulk PUT
[sdwillso@boro-4 ~]$ orterun -np 1 --hostfile ~/hostlists/daos_single_server_2 -x FI_PSM2_DISCONNECT=1 -ompi-server file:~/scripts/uri.txt self_test --group-name daos_server --endpoint 0:0 --message-sizes "0 b1048576" --max-inflight-rpcs 16 --repetitions 1000
Adding endpoints:
  ranks: 0 (# ranks = 1)
  tags: 0 (# tags = 1)
Warning: No --master-endpoint specified; using this command line application as the master endpoint
Self Test Parameters:
  Group name to test against: daos_server
  # endpoints: 1
  Message sizes: [(0-EMPTY 1048576-BULK_PUT)]
  Buffer addresses end with: <Default>
  Repetitions per size: 1000
  Max inflight RPCs: 16
host boro-6.boro.hpdd.intel.com finished self_test duration 0.085034 S.
##################################################
Results for message size (0-EMPTY 1048576-BULK_PUT) (max_inflight_rpcs = 16):
Master Endpoint 0:0
-------------------
  RPC Bandwidth (MB/sec): 11759.96
  RPC Throughput (RPCs/sec): 11760
  RPC Latencies (us):
    Min : 643
    25th %: 1341
    Median : 1354
    75th %: 1369
    Max : 1582
    Average: 1349
    Std Dev: 66.47
  RPC Failures: 0
  Endpoint results (rank:tag - Median Latency (us)):
    0:0 - 1354
Large IO Bulk GET
[sdwillso@boro-4 ~]$ orterun -np 1 --hostfile ~/hostlists/daos_single_server_2 -x FI_PSM2_DISCONNECT=1 -ompi-server file:~/scripts/uri.txt self_test --group-name daos_server --endpoint 0:0 --message-sizes "b1048576 0" --max-inflight-rpcs 16 --repetitions 1000
Adding endpoints:
  ranks: 0 (# ranks = 1)
  tags: 0 (# tags = 1)
Warning: No --master-endpoint specified; using this command line application as the master endpoint
Self Test Parameters:
  Group name to test against: daos_server
  # endpoints: 1
  Message sizes: [(1048576-BULK_GET 0-EMPTY)]
  Buffer addresses end with: <Default>
  Repetitions per size: 1000
  Max inflight RPCs: 16
host boro-6.boro.hpdd.intel.com finished self_test duration 0.125142 S.
##################################################
Results for message size (1048576-BULK_GET 0-EMPTY) (max_inflight_rpcs = 16):
Master Endpoint 0:0
-------------------
  RPC Bandwidth (MB/sec): 7990.89
  RPC Throughput (RPCs/sec): 7991
  RPC Latencies (us):
    Min : 305
    25th %: 990
    Median : 2334
    75th %: 2729
    Max : 3287
    Average: 1991
    Std Dev: 895.36
  RPC Failures: 0
  Endpoint results (rank:tag - Median Latency (us)):
    0:0 - 2334
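The three self_test results above are internally consistent, which can be checked with a little arithmetic. This is a minimal sketch, assuming throughput is repetitions divided by the reported duration, that "MB/sec" counts 2**20-byte units (the bandwidth and throughput figures are nearly identical for 1048576-byte messages), and that average latency can be estimated with Little's law as in-flight RPCs divided by throughput. The function and dictionary names are illustrative.

```python
# Cross-check self_test throughput, bandwidth, and average latency.
# Assumptions: throughput = reps / duration, MB = 2**20 bytes, and
# average latency ~ max in-flight RPCs / throughput (Little's law).

MAX_INFLIGHT = 16  # --max-inflight-rpcs 16

RUNS = {
    # name: (repetitions, duration in s, bulk bytes, reported avg latency us)
    "small": (100000, 0.309352, 0, 49),
    "bulk_put": (1000, 0.085034, 1048576, 1349),
    "bulk_get": (1000, 0.125142, 1048576, 1991),
}


def summarize(reps, duration_s, bulk_bytes, inflight=MAX_INFLIGHT):
    """Return (RPCs/sec, MB/sec, estimated average latency in us)."""
    throughput = reps / duration_s
    bandwidth = throughput * bulk_bytes / 2**20
    est_latency_us = inflight / throughput * 1e6
    return throughput, bandwidth, est_latency_us


summary = {name: summarize(*vals[:3]) for name, vals in RUNS.items()}
for name, (tput, bw, lat) in summary.items():
    print(f"{name:9s} {tput:10.0f} RPCs/sec {bw:10.2f} MB/sec ~{lat:7.1f} us")
```

The Little's law estimate lands within a few percent of the reported averages, which suggests the client kept close to 16 RPCs in flight for the whole run.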
mpich tests
Results: