Tip of master, commit 506021ea4ea6b25169d0706c69a845cc17e9ba8c

All tests were run with ofi+psm2 over ib0.

daos_test: Run with 8 servers (boro-[4-11]) and 2 clients (boro-[12-13]). Servers were killed and /mnt/daos was cleaned between the runs listed below.

Tests that require a pool created via dmg used a 4GB pool, with boro-12 as the client.

mpich tests used boro-4 as server, boro-12 as client, with a 1GB pool.

Test Results

daos_test

Separate runs with cleanup in between:

  • -mpcCAeoRd - PASS
  • -i - FAIL, still rebuilding on IO27 after 10 minutes
  • -r - FAIL, same as -i: still rebuilding after 10 minutes
  • -O - PASS

daosperf

1K Records

CREDITS=1

[sdwillso@boro-4 daos_m]$ orterun --mca mtl ^psm2,ofi -np 1 -quiet --hostfile ~/scripts/host.cli.1 --ompi-server file:~/scripts/uri.txt -x DD_SUBSYS= -x DD_MASK= -x D_LOG_FILE=/tmp/daos_perf.log daos_perf -T daos -P 2G -d 1 -a 200 -r 1000 -s 1K -C 1 -t -z
Test :
	DAOS (full stack)
Parameters :
	pool size     : 2048 MB
	credits       : 1 (sync I/O for -ve)
	obj_per_cont  : 1 x 1 (procs)
	dkey_per_obj  : 1
	akey_per_dkey : 200
	recx_per_akey : 1000
	value type    : single
	value size    : 1024
	zero copy     : yes
	overwrite     : yes
	verify fetch  : no
	VOS file      : <NULL>
349b5062: rank 1 became pool service leader 0
Started...
update successfully completed:
	duration : 106.440557 sec
	bandwith : 1.835      MB/sec
	rate     : 1878.98    IO/sec
	latency  : 532.203    us (nonsense if credits > 1)
Duration across processes:
	MAX duration : 106.440557 sec
	MIN duration : 106.440557 sec
	Average duration : 106.440557 sec
349b5062: rank 1 no longer pool service leader 0
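The summary figures in this run can be recomputed from the parameters printed at the top of the log. The sketch below is a sanity check only: it assumes the number of updates equals dkey_per_obj x akey_per_dkey x recx_per_akey and that "MB" in the daos_perf output means 2^20 bytes. Both assumptions are inferred from the numbers matching, not taken from the daos_perf source. The variable names are illustrative.

```python
# Values copied from the daos_perf log above (1K records, CREDITS=1).
dkeys_per_obj = 1
akeys_per_dkey = 200
recx_per_akey = 1000
value_size = 1024          # bytes per record
duration = 106.440557      # seconds

# Assumption: one update per (dkey, akey, recx) combination.
total_ios = dkeys_per_obj * akeys_per_dkey * recx_per_akey

rate = total_ios / duration                        # IO/sec
bandwidth_mib = rate * value_size / (1024 * 1024)  # MiB/sec (assumed unit)
latency_us = duration / total_ios * 1e6            # meaningful with credits = 1 only

print(f"rate      : {rate:.2f} IO/sec")
print(f"bandwidth : {bandwidth_mib:.3f} MiB/sec")
print(f"latency   : {latency_us:.3f} us")
```

With a single credit the I/O is synchronous, so the average latency is just the duration divided by the number of updates, which is why the log flags the latency as meaningless for credits > 1.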

CREDITS=8

  • CART-496: segfault in psm2 while running daos_perf (CLOSED)
  • The bug is fixed in a patch that is not yet merged to master.

4K Records

CREDITS=1

  • CART-496: segfault in psm2 while running daos_perf (CLOSED)
  • The bug is fixed in a patch that is not yet merged to master.

IOR, 10GB pool, data verification enabled

[sdwillso@boro-4 daos_m]$ orterun -x FI_PSM2_DISCONNECT=1 -N 1 --hostfile ~/hostlists/daos_client_hostlist --mca mtl ^psm2,ofi  --ompi-server file:~/scripts/uri.txt ior -v -W -i 1 -a DAOS -w -o `uuidgen` -b 5g -t 1m -- -p 7edf08f3-bea0-4a0d-84da-042e8fa4cb4f -v 1 -r 1m -s 1m -c 1024 -a 16 -o LARGE
ior WARNING: assuming POSIX-based backend for DAOS statfs call.
ior WARNING: assuming POSIX-based backend for DAOS mkdir call.
ior WARNING: assuming POSIX-based backend for DAOS rmdir call.
ior WARNING: assuming POSIX-based backend for DAOS access call.
ior WARNING: assuming POSIX-based backend for DAOS stat call.
ior WARNING: assuming POSIX-based backend for DAOS statfs call.
ior WARNING: assuming POSIX-based backend for DAOS mkdir call.
ior WARNING: assuming POSIX-based backend for DAOS rmdir call.
ior WARNING: assuming POSIX-based backend for DAOS access call.
ior WARNING: assuming POSIX-based backend for DAOS stat call.
IOR-3.1.0: MPI Coordinated Test of Parallel I/O
Began               : Tue Sep 18 21:29:58 2018
Command line        : ior -v -W -i 1 -a DAOS -w -o 97d7d821-afb8-4867-8261-54101a8c4e54 -b 5g -t 1m -- -p 7edf08f3-bea0-4a0d-84da-042e8fa4cb4f -v 1 -r 1m -s 1m -c 1024 -a 16 -o LARGE
Machine             : Linux boro-12.boro.hpdd.intel.com
Start time skew across all tasks: 13470298.67 sec
TestID              : 0
StartTime           : Tue Sep 18 21:29:58 2018
Path                : /home/sdwillso/daos_m
FS                  : 3.8 TiB   Used FS: 14.1%   Inodes: 250.0 Mi   Used Inodes: 2.9%
Participating tasks: 2
[0] WARNING: USING daosStripeMax CAUSES READS TO RETURN INVALID DATA

Options: 
api                 : DAOS
apiVersion          : DAOS
test filename       : 97d7d821-afb8-4867-8261-54101a8c4e54
access              : single-shared-file
type                : independent
segments            : 1
ordering in a file  : sequential
ordering inter file : no tasks offsets
tasks               : 2
clients per node    : 1
repetitions         : 1
xfersize            : 1 MiB
blocksize           : 5 GiB
aggregate filesize  : 10 GiB

Results: 

access    bw(MiB/s)  block(KiB) xfer(KiB)  open(s)    wr/rd(s)   close(s)   total(s)   iter
------    ---------  ---------- ---------  --------   --------   --------   --------   ----
Commencing write performance test: Tue Sep 18 21:29:59 2018
write     4993       5242880    1024.00    0.061752   1.96       0.026341   2.05       0   
Verifying contents of the file(s) just written.
Tue Sep 18 21:30:01 2018

remove    -          -          -          -          -          -          0.000066   0   
Max Write: 4992.56 MiB/sec (5235.08 MB/sec)

Summary of all tests:
Operation   Max(MiB)   Min(MiB)  Mean(MiB)     StdDev   Max(OPs)   Min(OPs)  Mean(OPs)     StdDev    Mean(s) Test# #Tasks tPN reps fPP reord reordoff reordrand seed segcnt   blksiz    xsize aggs(MiB)   API RefNum
write        4992.56    4992.56    4992.56       0.00    4992.56    4992.56    4992.56       0.00    2.05105     0      2   1    1   0     0        1         0    0      1 5368709120  1048576   10240.0 DAOS      0
Finished            : Tue Sep 18 21:30:10 2018
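The two bandwidth figures on the "Max Write" line follow from the aggregate file size and the total time in the results table. A minimal sketch of that arithmetic, using only values from the log above (the variable names are illustrative):

```python
# Values copied from the IOR summary above.
aggregate_mib = 10240.0    # aggs(MiB): 2 tasks x 5 GiB block size
total_seconds = 2.05105    # Mean(s) for the write phase

bw_mib = aggregate_mib / total_seconds       # MiB/sec (2**20 bytes)
bw_mb = bw_mib * (1024 * 1024) / 1_000_000   # MB/sec (10**6 bytes)

print(f"Max Write: {bw_mib:.2f} MiB/sec ({bw_mb:.2f} MB/sec)")
```

The recomputed values can differ from the log in the last digit because the total time printed in the log is itself rounded.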

daos_bench

kv-idx-update

kv-dkey-update

Time: 0.110023 seconds (908.901733 ops per second)
[sdwillso@boro-4 daos_m]$ orterun -np 1 --mca mtl ^psm2,ofi  --hostfile ~/hostlists/daos_client_hostlist --ompi-server file:~/scripts/uri.txt daosbench --test=kv-dkey-update --testid=1 --svc=1 --dpool=458b9874-d084-41f9-aaa0-78029731d5fc --container=`uuidgen` --object-class=tiny --aios=32 --indexes=1000000
================================
DAOSBENCH (KV)
Started at
Tue Sep 18 22:02:31 2018
=================================
===============================
Test Setup
---------------
Test: kv-dkey-update
DAOS pool :458b9874-d084-41f9-aaa0-78029731d5fc
DAOS container :312ab399-644a-406b-a192-a689a7209bd8
Value buffer size: 64
Number of processes: 1
Number of keys/process: 100
Number of asynchronous I/O: 32
===============================
kv-dkey-update
Time: 0.110023 seconds (908.901733 ops per second)

Ended at Tue Sep 18 22:02:32 2018

kv-akey-update

Time: 0.126121 seconds (1585.773965 ops per second)
[sdwillso@boro-4 daos_m]$ orterun -N 1 --mca mtl ^psm2,ofi  --hostfile ~/hostlists/daos_client_hostlist --ompi-server file:~/scripts/uri.txt daosbench --test=kv-akey-update --testid=1 --svc=1 --dpool=e59fc03a-c0a2-4286-b4f2-eca7b5c9a350 --container=`uuidgen` --object-class=tiny --aios=32 --indexes=1000000
================================
DAOSBENCH (KV)
Started at
Tue Sep 18 22:06:41 2018
=================================
===============================
Test Setup
---------------
Test: kv-akey-update
DAOS pool :e59fc03a-c0a2-4286-b4f2-eca7b5c9a350
DAOS container :6a6e7b03-5032-4b95-a2a9-3b2c0c03d60d
Value buffer size: 64
Number of processes: 2
Number of keys/process: 100
Number of asynchronous I/O: 32
===============================
kv-akey-update
Time: 0.126121 seconds (1585.773965 ops per second)

Ended at Tue Sep 18 22:06:42 2018

kv-dkey-fetch

Time: 0.188908 seconds (1058.717960 ops per second)
[sdwillso@boro-4 daos_m]$ orterun -N 1 --mca mtl ^psm2,ofi  --hostfile ~/hostlists/daos_client_hostlist --ompi-server file:~/scripts/uri.txt daosbench --test=kv-dkey-fetch --testid=1 --svc=1 --dpool=c4b0bbed-40e6-48d1-8f61-2d7d66114f50 --container=`uuidgen` --object-class=tiny --aios=32 --indexes=1000000
================================
DAOSBENCH (KV)
Started at
Tue Sep 18 22:08:21 2018
=================================
===============================
Test Setup
---------------
Test: kv-dkey-fetch
DAOS pool :c4b0bbed-40e6-48d1-8f61-2d7d66114f50
DAOS container :286125a7-97c6-459e-988f-7cef535a5941
Value buffer size: 64
Number of processes: 2
Number of keys/process: 100
Number of asynchronous I/O: 32
===============================
kv-dkey-fetch
Time: 0.188908 seconds (1058.717960 ops per second)

Ended at Tue Sep 18 22:08:23 2018

kv-akey-fetch

Time: 0.084518 seconds (2366.364978 ops per second)
[sdwillso@boro-4 daos_m]$ orterun -N 1 --mca mtl ^psm2,ofi  --hostfile ~/hostlists/daos_client_hostlist --ompi-server file:~/scripts/uri.txt daosbench --test=kv-akey-fetch --testid=1 --svc=1 --dpool=15c5477e-bf75-4302-a609-5fbde1c25785 --container=`uuidgen` --object-class=tiny --aios=32 --indexes=1000000
================================
DAOSBENCH (KV)
Started at
Tue Sep 18 22:09:52 2018
=================================
===============================
Test Setup
---------------
Test: kv-akey-fetch
DAOS pool :15c5477e-bf75-4302-a609-5fbde1c25785
DAOS container :96a91201-9337-43c6-aacd-46153974e6b8
Value buffer size: 64
Number of processes: 2
Number of keys/process: 100
Number of asynchronous I/O: 32
===============================
kv-akey-fetch
Time: 0.084518 seconds (2366.364978 ops per second)

Ended at Tue Sep 18 22:09:53 2018
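The ops-per-second figures for the four daosbench runs are consistent with (number of processes x keys per process) / elapsed time, using the "Number of keys/process: 100" value printed in each Test Setup block rather than the --indexes=1000000 value passed on the command line. This relationship is inferred from the logged numbers, not from the daosbench source, so treat the sketch below as a cross-check under that assumption.

```python
# (test, processes, keys per process, elapsed seconds, reported ops/sec),
# all copied from the daosbench logs above.
runs = [
    ("kv-dkey-update", 1, 100, 0.110023, 908.901733),
    ("kv-akey-update", 2, 100, 0.126121, 1585.773965),
    ("kv-dkey-fetch",  2, 100, 0.188908, 1058.717960),
    ("kv-akey-fetch",  2, 100, 0.084518, 2366.364978),
]


def ops_per_second(processes, keys_per_process, seconds):
    """Assumed formula: total operations divided by elapsed time."""
    return processes * keys_per_process / seconds


recomputed = {name: ops_per_second(p, k, t) for name, p, k, t, _ in runs}

for name, _, _, _, reported in runs:
    print(f"{name:15s} recomputed={recomputed[name]:10.2f} reported={reported:10.2f}")
```

Note that kv-dkey-update ran with one process (-np 1) while the other three ran with two, so its rate is not directly comparable with the others.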

CaRT Self-Test

Small IO

[sdwillso@boro-4 daos_m]$ orterun -np 1 -ompi-server file:~/scripts/uri.txt self_test --group-name daos_server --endpoint 0:0 --message-sizes 0 --max-inflight-rpcs 16 --repetitions 100000
Adding endpoints:
  ranks: 0 (# ranks = 1)
  tags: 0 (# tags = 1)
Warning: No --master-endpoint specified; using this command line application as the master endpoint
Self Test Parameters:
  Group name to test against: daos_server
  # endpoints:                1
  Message sizes:              [(0-EMPTY 0-EMPTY)]
  Buffer addresses end with:  <Default>
  Repetitions per size:       100000
  Max inflight RPCs:          16

host boro-4.boro.hpdd.intel.com finished self_test duration 19.076417 S.
##################################################
Results for message size (0-EMPTY 0-EMPTY) (max_inflight_rpcs = 16):

Master Endpoint 0:0
-------------------
	RPC Bandwidth (MB/sec): 0.00
	RPC Throughput (RPCs/sec): 5242
	RPC Latencies (us):
		Min    : 1035
		25th  %: 2980
		Median : 3008
		75th  %: 3034
		Max    : 23485
		Average: 3022
		Std Dev: 343.21
	RPC Failures: 0

	Endpoint results (rank:tag - Median Latency (us)):
		0:0 - 3008

Large IO Bulk PUT

[sdwillso@boro-4 daos_m]$ orterun -np 1 -ompi-server file:~/scripts/uri.txt self_test --group-name daos_server --endpoint 0:0 --message-sizes "0 b1048576" --max-inflight-rpcs 16 --repetitions 1000
Adding endpoints:
  ranks: 0 (# ranks = 1)
  tags: 0 (# tags = 1)
Warning: No --master-endpoint specified; using this command line application as the master endpoint
Self Test Parameters:
  Group name to test against: daos_server
  # endpoints:                1
  Message sizes:              [(0-EMPTY 1048576-BULK_PUT)]
  Buffer addresses end with:  <Default>
  Repetitions per size:       1000
  Max inflight RPCs:          16

host boro-4.boro.hpdd.intel.com finished self_test duration 0.340959 S.
##################################################
Results for message size (0-EMPTY 1048576-BULK_PUT) (max_inflight_rpcs = 16):

Master Endpoint 0:0
-------------------
	RPC Bandwidth (MB/sec): 2932.91
	RPC Throughput (RPCs/sec): 2933
	RPC Latencies (us):
		Min    : 2305
		25th  %: 5373
		Median : 5404
		75th  %: 5430
		Max    : 6371
		Average: 5387
		Std Dev: 253.15
	RPC Failures: 0

	Endpoint results (rank:tag - Median Latency (us)):
		0:0 - 5403

Large IO Bulk GET

[sdwillso@boro-4 daos_m]$ orterun -np 1 -ompi-server file:~/scripts/uri.txt self_test --group-name daos_server --endpoint 0:0 --message-sizes "b1048576 0" --max-inflight-rpcs 16 --repetitions 1000
Adding endpoints:
  ranks: 0 (# ranks = 1)
  tags: 0 (# tags = 1)
Warning: No --master-endpoint specified; using this command line application as the master endpoint
Self Test Parameters:
  Group name to test against: daos_server
  # endpoints:                1
  Message sizes:              [(1048576-BULK_GET 0-EMPTY)]
  Buffer addresses end with:  <Default>
  Repetitions per size:       1000
  Max inflight RPCs:          16

host boro-4.boro.hpdd.intel.com finished self_test duration 0.332212 S.
##################################################
Results for message size (1048576-BULK_GET 0-EMPTY) (max_inflight_rpcs = 16):

Master Endpoint 0:0
-------------------
	RPC Bandwidth (MB/sec): 3010.12
	RPC Throughput (RPCs/sec): 3010
	RPC Latencies (us):
		Min    : 2206
		25th  %: 5234
		Median : 5260
		75th  %: 5298
		Max    : 6019
		Average: 5248
		Std Dev: 247.42
	RPC Failures: 0

	Endpoint results (rank:tag - Median Latency (us)):
		0:0 - 5259
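The throughput and bandwidth figures in the three self_test runs can be recomputed from the repetition count and the reported duration. The sketch below also applies Little's law (latency ≈ RPCs in flight / throughput) as a rough consistency check on the average latency. It assumes the 16-RPC pipeline stays full for the whole run and that "MB" in the self_test output means 2^20 bytes; both are assumptions, and the dictionary keys are illustrative names.

```python
MIB = 1024 * 1024

# Values copied from the three self_test logs above.
runs = {
    "small_io": {"reps": 100000, "seconds": 19.076417, "bulk_bytes": 0,
                 "inflight": 16, "avg_latency_us": 3022},
    "bulk_put": {"reps": 1000, "seconds": 0.340959, "bulk_bytes": 1048576,
                 "inflight": 16, "avg_latency_us": 5387},
    "bulk_get": {"reps": 1000, "seconds": 0.332212, "bulk_bytes": 1048576,
                 "inflight": 16, "avg_latency_us": 5248},
}


def summarize(run):
    """Recompute throughput, bandwidth and a Little's-law latency estimate."""
    throughput = run["reps"] / run["seconds"]              # RPCs/sec
    bandwidth = throughput * run["bulk_bytes"] / MIB       # MiB/sec (assumed unit)
    est_latency_us = run["inflight"] / throughput * 1e6    # assumes a full pipeline
    return throughput, bandwidth, est_latency_us


results = {name: summarize(run) for name, run in runs.items()}

for name, (throughput, bandwidth, est_latency_us) in results.items():
    print(f"{name}: {throughput:.0f} RPCs/sec, {bandwidth:.2f} MiB/sec, "
          f"~{est_latency_us:.0f} us estimated vs "
          f"{runs[name]['avg_latency_us']} us reported")
```

Because each bulk RPC moves exactly 1 MiB, the bandwidth and throughput values are numerically equal in the two bulk runs. The latency estimates come out slightly above the reported averages, which is expected if the pipeline is not completely full at the start and end of each run.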

mpich tests

Results: Fails at the first test. CART-496: segfault in psm2 while running daos_perf (CLOSED)