9-18-18
Owned by Stephen Willson (Unlicensed)
Tip of master, commit 506021ea4ea6b25169d0706c69a845cc17e9ba8c
All tests were run with ofi+psm2 on ib0.
daos_test: Run with 8 servers (boro-[4-11]) and 2 clients (boro-[12-13]). Servers were killed and /mnt/daos was cleaned between the runs listed below.
Tests that require a pool created via dmg used a 4GB pool, with boro-12 as the client.
mpich tests used boro-4 as server, boro-12 as client, with a 1GB pool.
Test Results
daos_test
Separate runs with cleanup in between:
- -mpcCAeoRd - PASS
- -i - FAIL, still rebuilding on IO27 after 10 minutes
- DAOS-1289 - daos_test -i subtest 27 rebuild hangs OPEN
- -r - FAIL, same as -i: still rebuilding after 10 minutes
- -O - PASS
daos_perf
1K Records
CREDITS=1
[sdwillso@boro-4 daos_m]$ orterun --mca mtl ^psm2,ofi -np 1 -quiet --hostfile ~/scripts/host.cli.1 --ompi-server file:~/scripts/uri.txt -x DD_SUBSYS= -x DD_MASK= -x D_LOG_FILE=/tmp/daos_perf.log daos_perf -T daos -P 2G -d 1 -a 200 -r 1000 -s 1K -C 1 -t -z
Test :
    DAOS (full stack)
Parameters :
    pool size : 2048 MB
    credits : 1 (sync I/O for -ve)
    obj_per_cont : 1 x 1 (procs)
    dkey_per_obj : 1
    akey_per_dkey : 200
    recx_per_akey : 1000
    value type : single
    value size : 1024
    zero copy : yes
    overwrite : yes
    verify fetch : no
    VOS file : <NULL>
349b5062: rank 1 became pool service leader 0
Started...
update successfully completed:
    duration : 106.440557 sec
    bandwith : 1.835 MB/sec
    rate : 1878.98 IO/sec
    latency : 532.203 us (nonsense if credits > 1)
Duration across processes:
    MAX duration : 106.440557 sec
    MIN duration : 106.440557 sec
    Average duration : 106.440557 sec
349b5062: rank 1 no longer pool service leader 0
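The summary figures in the daos_perf output can be cross-checked against the run parameters. The Python sketch below is not part of daos_perf; the function name perf_summary is made up for illustration. It assumes that the total I/O count is objects x dkeys x akeys x records (1 x 1 x 200 x 1000) and that the reported "MB/sec" counts 1 MB as 1024 x 1024 bytes. Under those assumptions the derived values agree with the reported rate, bandwidth, and latency.

```python
# Values copied from the daos_perf output above.
AKEYS_PER_DKEY = 200
RECX_PER_AKEY = 1000
VALUE_SIZE_BYTES = 1024
DURATION_SEC = 106.440557


def perf_summary(n_ios, io_bytes, seconds):
    """Return (IO/sec, MiB/sec, mean latency in us) for a credits=1 (sync) run."""
    rate = n_ios / seconds
    mib_per_sec = n_ios * io_bytes / (1024 * 1024) / seconds
    latency_us = seconds / n_ios * 1e6
    return rate, mib_per_sec, latency_us


# Assumption: one object and one dkey, so total I/Os = akeys x records.
n_ios = 1 * 1 * AKEYS_PER_DKEY * RECX_PER_AKEY
rate, bandwidth, latency = perf_summary(n_ios, VALUE_SIZE_BYTES, DURATION_SEC)
print(f"rate {rate:.2f} IO/sec, bandwidth {bandwidth:.3f} MiB/sec, "
      f"latency {latency:.3f} us")
```

The latency figure is simply duration divided by I/O count, which is why the tool flags it as meaningless when more than one credit is in flight.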
CREDITS=8
- CART-496 - segfault in psm2 while running daos_perf CLOSED
- The bug is fixed in a patch that is not yet merged to master.
4K Records
CREDITS=1
- CART-496 - segfault in psm2 while running daos_perf CLOSED
- The bug is fixed in a patch that is not yet merged to master.
IOR, 10GB pool, data verification enabled
[sdwillso@boro-4 daos_m]$ orterun -x FI_PSM2_DISCONNECT=1 -N 1 --hostfile ~/hostlists/daos_client_hostlist --mca mtl ^psm2,ofi --ompi-server file:~/scripts/uri.txt ior -v -W -i 1 -a DAOS -w -o `uuidgen` -b 5g -t 1m -- -p 7edf08f3-bea0-4a0d-84da-042e8fa4cb4f -v 1 -r 1m -s 1m -c 1024 -a 16 -o LARGE
ior WARNING: assuming POSIX-based backend for DAOS statfs call.
ior WARNING: assuming POSIX-based backend for DAOS mkdir call.
ior WARNING: assuming POSIX-based backend for DAOS rmdir call.
ior WARNING: assuming POSIX-based backend for DAOS access call.
ior WARNING: assuming POSIX-based backend for DAOS stat call.
ior WARNING: assuming POSIX-based backend for DAOS statfs call.
ior WARNING: assuming POSIX-based backend for DAOS mkdir call.
ior WARNING: assuming POSIX-based backend for DAOS rmdir call.
ior WARNING: assuming POSIX-based backend for DAOS access call.
ior WARNING: assuming POSIX-based backend for DAOS stat call.
IOR-3.1.0: MPI Coordinated Test of Parallel I/O
Began : Tue Sep 18 21:29:58 2018
Command line : ior -v -W -i 1 -a DAOS -w -o 97d7d821-afb8-4867-8261-54101a8c4e54 -b 5g -t 1m -- -p 7edf08f3-bea0-4a0d-84da-042e8fa4cb4f -v 1 -r 1m -s 1m -c 1024 -a 16 -o LARGE
Machine : Linux boro-12.boro.hpdd.intel.com
Start time skew across all tasks: 13470298.67 sec
TestID : 0
StartTime : Tue Sep 18 21:29:58 2018
Path : /home/sdwillso/daos_m
FS : 3.8 TiB Used FS: 14.1% Inodes: 250.0 Mi Used Inodes: 2.9%
Participating tasks: 2
[0] WARNING: USING daosStripeMax CAUSES READS TO RETURN INVALID DATA
Options:
    api : DAOS
    apiVersion : DAOS
    test filename : 97d7d821-afb8-4867-8261-54101a8c4e54
    access : single-shared-file
    type : independent
    segments : 1
    ordering in a file : sequential
    ordering inter file : no tasks offsets
    tasks : 2
    clients per node : 1
    repetitions : 1
    xfersize : 1 MiB
    blocksize : 5 GiB
    aggregate filesize : 10 GiB
Results:
access    bw(MiB/s)  block(KiB)  xfer(KiB)  open(s)   wr/rd(s)  close(s)  total(s)  iter
------    ---------  ----------  ---------  --------  --------  --------  --------  ----
Commencing write performance test: Tue Sep 18 21:29:59 2018
write     4993       5242880     1024.00    0.061752  1.96      0.026341  2.05      0
Verifying contents of the file(s) just written.
Tue Sep 18 21:30:01 2018
remove    -          -           -          -         -         -         0.000066  0
Max Write: 4992.56 MiB/sec (5235.08 MB/sec)
Summary of all tests:
Operation Max(MiB) Min(MiB) Mean(MiB) StdDev Max(OPs) Min(OPs) Mean(OPs) StdDev Mean(s) Test# #Tasks tPN reps fPP reord reordoff reordrand seed segcnt blksiz xsize aggs(MiB) API RefNum
write 4992.56 4992.56 4992.56 0.00 4992.56 4992.56 4992.56 0.00 2.05105 0 2 1 1 0 0 1 0 0 1 5368709120 1048576 10240.0 DAOS 0
Finished : Tue Sep 18 21:30:10 2018
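The "Max Write" line in the IOR output can be re-derived from the summary row. The Python sketch below is illustrative only and not part of IOR; ior_bandwidth is a hypothetical helper. It assumes the aggregate size is tasks x blksiz (2 x 5 GiB) and that the bandwidth is computed over the Mean(s) value of 2.05105 seconds.

```python
MIB = 1024 * 1024


def ior_bandwidth(aggregate_bytes, total_seconds):
    """Return (MiB/sec, MB/sec) for one IOR iteration."""
    bytes_per_sec = aggregate_bytes / total_seconds
    return bytes_per_sec / MIB, bytes_per_sec / 1e6


tasks = 2
block_size_bytes = 5368709120  # blksiz column of the summary row (5 GiB per task)
mean_seconds = 2.05105         # Mean(s) column of the summary row

mib_s, mb_s = ior_bandwidth(tasks * block_size_bytes, mean_seconds)
print(f"Max Write: {mib_s:.2f} MiB/sec ({mb_s:.2f} MB/sec)")
```

The two figures in the IOR line differ only in the unit: MiB uses 1048576 bytes and MB uses 1000000 bytes.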
daosbench
kv-idx-update
kv-dkey-update
Time: 0.110023 seconds (908.901733 ops per second)
[sdwillso@boro-4 daos_m]$ orterun -np 1 --mca mtl ^psm2,ofi --hostfile ~/hostlists/daos_client_hostlist --ompi-server file:~/scripts/uri.txt daosbench --test=kv-dkey-update --testid=1 --svc=1 --dpool=458b9874-d084-41f9-aaa0-78029731d5fc --container=`uuidgen` --object-class=tiny --aios=32 --indexes=1000000
================================
DAOSBENCH (KV)
Started at Tue Sep 18 22:02:31 2018
=================================
===============================
Test Setup
---------------
Test: kv-dkey-update
DAOS pool :458b9874-d084-41f9-aaa0-78029731d5fc
DAOS container :312ab399-644a-406b-a192-a689a7209bd8
Value buffer size: 64
Number of processes: 1
Number of keys/process: 100
Number of asynchronous I/O: 32
===============================
kv-dkey-update
Time: 0.110023 seconds (908.901733 ops per second)
Ended at Tue Sep 18 22:02:32 2018
kv-akey-update
Time: 0.126121 seconds (1585.773965 ops per second)
[sdwillso@boro-4 daos_m]$ orterun -N 1 --mca mtl ^psm2,ofi --hostfile ~/hostlists/daos_client_hostlist --ompi-server file:~/scripts/uri.txt daosbench --test=kv-akey-update --testid=1 --svc=1 --dpool=e59fc03a-c0a2-4286-b4f2-eca7b5c9a350 --container=`uuidgen` --object-class=tiny --aios=32 --indexes=1000000
================================
DAOSBENCH (KV)
Started at Tue Sep 18 22:06:41 2018
=================================
===============================
Test Setup
---------------
Test: kv-akey-update
DAOS pool :e59fc03a-c0a2-4286-b4f2-eca7b5c9a350
DAOS container :6a6e7b03-5032-4b95-a2a9-3b2c0c03d60d
Value buffer size: 64
Number of processes: 2
Number of keys/process: 100
Number of asynchronous I/O: 32
===============================
kv-akey-update
Time: 0.126121 seconds (1585.773965 ops per second)
Ended at Tue Sep 18 22:06:42 2018
kv-dkey-fetch
Time: 0.188908 seconds (1058.717960 ops per second)
[sdwillso@boro-4 daos_m]$ orterun -N 1 --mca mtl ^psm2,ofi --hostfile ~/hostlists/daos_client_hostlist --ompi-server file:~/scripts/uri.txt daosbench --test=kv-dkey-fetch --testid=1 --svc=1 --dpool=c4b0bbed-40e6-48d1-8f61-2d7d66114f50 --container=`uuidgen` --object-class=tiny --aios=32 --indexes=1000000
================================
DAOSBENCH (KV)
Started at Tue Sep 18 22:08:21 2018
=================================
===============================
Test Setup
---------------
Test: kv-dkey-fetch
DAOS pool :c4b0bbed-40e6-48d1-8f61-2d7d66114f50
DAOS container :286125a7-97c6-459e-988f-7cef535a5941
Value buffer size: 64
Number of processes: 2
Number of keys/process: 100
Number of asynchronous I/O: 32
===============================
kv-dkey-fetch
Time: 0.188908 seconds (1058.717960 ops per second)
Ended at Tue Sep 18 22:08:23 2018
kv-akey-fetch
Time: 0.084518 seconds (2366.364978 ops per second)
[sdwillso@boro-4 daos_m]$ orterun -N 1 --mca mtl ^psm2,ofi --hostfile ~/hostlists/daos_client_hostlist --ompi-server file:~/scripts/uri.txt daosbench --test=kv-akey-fetch --testid=1 --svc=1 --dpool=15c5477e-bf75-4302-a609-5fbde1c25785 --container=`uuidgen` --object-class=tiny --aios=32 --indexes=1000000
================================
DAOSBENCH (KV)
Started at Tue Sep 18 22:09:52 2018
=================================
===============================
Test Setup
---------------
Test: kv-akey-fetch
DAOS pool :15c5477e-bf75-4302-a609-5fbde1c25785
DAOS container :96a91201-9337-43c6-aacd-46153974e6b8
Value buffer size: 64
Number of processes: 2
Number of keys/process: 100
Number of asynchronous I/O: 32
===============================
kv-akey-fetch
Time: 0.084518 seconds (2366.364978 ops per second)
Ended at Tue Sep 18 22:09:53 2018
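The ops-per-second figures in the four daosbench runs appear to follow from the "Test Setup" values: number of processes times keys per process, divided by the elapsed time. The Python sketch below checks that relationship; it is an illustration rather than daosbench code, ops_per_second is a hypothetical helper, and the formula itself is an assumption inferred from the numbers. Note that each run reports 100 keys per process and finishes in well under a second.

```python
def ops_per_second(processes, keys_per_process, seconds):
    """Assumed formula: total keys touched divided by elapsed time."""
    return processes * keys_per_process / seconds


# (test, processes, keys/process, elapsed seconds, reported ops/sec),
# all copied from the daosbench output above.
RUNS = [
    ("kv-dkey-update", 1, 100, 0.110023, 908.901733),
    ("kv-akey-update", 2, 100, 0.126121, 1585.773965),
    ("kv-dkey-fetch", 2, 100, 0.188908, 1058.717960),
    ("kv-akey-fetch", 2, 100, 0.084518, 2366.364978),
]

for name, procs, keys, secs, reported in RUNS:
    derived = ops_per_second(procs, keys, secs)
    print(f"{name}: derived {derived:.2f} ops/sec, reported {reported:.2f} ops/sec")
```

Small differences in the last digits are expected because the elapsed time is printed with only six decimal places.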
CaRT Self-Test
Small IO
[sdwillso@boro-4 daos_m]$ orterun -np 1 -ompi-server file:~/scripts/uri.txt self_test --group-name daos_server --endpoint 0:0 --message-sizes 0 --max-inflight-rpcs 16 --repetitions 100000
Adding endpoints:
    ranks: 0 (# ranks = 1)
    tags: 0 (# tags = 1)
Warning: No --master-endpoint specified; using this command line application as the master endpoint
Self Test Parameters:
    Group name to test against: daos_server
    # endpoints: 1
    Message sizes: [(0-EMPTY 0-EMPTY)]
    Buffer addresses end with: <Default>
    Repetitions per size: 100000
    Max inflight RPCs: 16
host boro-4.boro.hpdd.intel.com finished self_test duration 19.076417 S.
##################################################
Results for message size (0-EMPTY 0-EMPTY) (max_inflight_rpcs = 16):
Master Endpoint 0:0
-------------------
    RPC Bandwidth (MB/sec): 0.00
    RPC Throughput (RPCs/sec): 5242
    RPC Latencies (us):
        Min : 1035
        25th %: 2980
        Median : 3008
        75th %: 3034
        Max : 23485
        Average: 3022
        Std Dev: 343.21
    RPC Failures: 0
    Endpoint results (rank:tag - Median Latency (us)):
        0:0 - 3008
Large IO Bulk PUT
[sdwillso@boro-4 daos_m]$ orterun -np 1 -ompi-server file:~/scripts/uri.txt self_test --group-name daos_server --endpoint 0:0 --message-sizes "0 b1048576" --max-inflight-rpcs 16 --repetitions 1000
Adding endpoints:
    ranks: 0 (# ranks = 1)
    tags: 0 (# tags = 1)
Warning: No --master-endpoint specified; using this command line application as the master endpoint
Self Test Parameters:
    Group name to test against: daos_server
    # endpoints: 1
    Message sizes: [(0-EMPTY 1048576-BULK_PUT)]
    Buffer addresses end with: <Default>
    Repetitions per size: 1000
    Max inflight RPCs: 16
host boro-4.boro.hpdd.intel.com finished self_test duration 0.340959 S.
##################################################
Results for message size (0-EMPTY 1048576-BULK_PUT) (max_inflight_rpcs = 16):
Master Endpoint 0:0
-------------------
    RPC Bandwidth (MB/sec): 2932.91
    RPC Throughput (RPCs/sec): 2933
    RPC Latencies (us):
        Min : 2305
        25th %: 5373
        Median : 5404
        75th %: 5430
        Max : 6371
        Average: 5387
        Std Dev: 253.15
    RPC Failures: 0
    Endpoint results (rank:tag - Median Latency (us)):
        0:0 - 5403
Large IO Bulk GET
[sdwillso@boro-4 daos_m]$ orterun -np 1 -ompi-server file:~/scripts/uri.txt self_test --group-name daos_server --endpoint 0:0 --message-sizes "b1048576 0" --max-inflight-rpcs 16 --repetitions 1000
Adding endpoints:
    ranks: 0 (# ranks = 1)
    tags: 0 (# tags = 1)
Warning: No --master-endpoint specified; using this command line application as the master endpoint
Self Test Parameters:
    Group name to test against: daos_server
    # endpoints: 1
    Message sizes: [(1048576-BULK_GET 0-EMPTY)]
    Buffer addresses end with: <Default>
    Repetitions per size: 1000
    Max inflight RPCs: 16
host boro-4.boro.hpdd.intel.com finished self_test duration 0.332212 S.
##################################################
Results for message size (1048576-BULK_GET 0-EMPTY) (max_inflight_rpcs = 16):
Master Endpoint 0:0
-------------------
    RPC Bandwidth (MB/sec): 3010.12
    RPC Throughput (RPCs/sec): 3010
    RPC Latencies (us):
        Min : 2206
        25th %: 5234
        Median : 5260
        75th %: 5298
        Max : 6019
        Average: 5248
        Std Dev: 247.42
    RPC Failures: 0
    Endpoint results (rank:tag - Median Latency (us)):
        0:0 - 5259
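The self_test throughput and bandwidth figures are consistent with a simple calculation: repetitions divided by the reported duration, multiplied by the message size for bandwidth. The Python sketch below is an illustration, not self_test code; the helper names are hypothetical, and it assumes the reported "MB/sec" counts 1 MB as 1048576 bytes, which is why bandwidth and throughput are numerically almost equal for 1 MiB bulk transfers.

```python
def rpc_throughput(repetitions, duration_sec):
    """RPCs per second over the whole run."""
    return repetitions / duration_sec


def rpc_bandwidth(throughput, message_bytes):
    """Bandwidth in MiB/sec (assumed unit of the reported MB/sec)."""
    return throughput * message_bytes / (1024 * 1024)


# (label, repetitions, duration in seconds, bulk message size in bytes),
# copied from the three self_test runs above.
SELF_TEST_RUNS = [
    ("small IO", 100000, 19.076417, 0),
    ("bulk PUT", 1000, 0.340959, 1048576),
    ("bulk GET", 1000, 0.332212, 1048576),
]

for label, reps, secs, size in SELF_TEST_RUNS:
    tput = rpc_throughput(reps, secs)
    bw = rpc_bandwidth(tput, size)
    print(f"{label}: {tput:.0f} RPCs/sec, {bw:.2f} MiB/sec")
```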
mpich tests
Results: Fails at the first test. CART-496 - segfault in psm2 while running daos_perf CLOSED