...

Tip of master, commit 98cd53e0a273885324261d5f38c23a80be84f09e

After running tip of master, reran a few tests with OFI updated to 99e333426b64d7d227fd604731235ffc14862662 to pull in some psm2 fixes.

All tests run with ofi+psm2, ib0.

...

  • -mpcCAeoRd - PASS
  • -i - FAIL, still rebuilding on IO27 after 10 minutes
    • DAOS-1289
  • -r - FAIL
    • looks to be the same as -i above, still rebuilding after 10 minutes
  • -O - PASS

...

  • hitting segfault (CART-496)

4K Records

CREDITS=1

  • hitting segfault (CART-496)

IOR, 10GB pool, data verification enabled

...

  • At end of this test with multiple servers, container destroy fails
    • DAOS-1243
Time: 105.668696 seconds (9463.540644 ops per second)

...

[sdwillso@boro-4 ~]$ orterun -x FI_PSM2_DISCONNECT=1 -np 1 --mca mtl ^psm2,ofi  --hostfile ~/hostlists/daos_client_hostlist --ompi-server file:~/scripts/uri.txt daosbench --test=kv-akey-fetch --testid=1 --svc=1 --dpool=7e97c5fb-196e-4851-9c88-069349d8ce6e --container=`uuidgen` --object-class=tiny --aios=32 --indexes=1000000
================================
DAOSBENCH (KV)
Started at
Mon Aug 27 18:35:29 2018
=================================
===============================
Test Setup
---------------
Test: kv-akey-fetch
DAOS pool :7e97c5fb-196e-4851-9c88-069349d8ce6e
DAOS container :932946ad-9bf1-4309-b14b-96ca699ee72c
Value buffer size: 64
Number of processes: 1
Number of keys/process: 100
Number of asynchronous I/O: 32
===============================
kv-akey-fetch
Time: 0.001576 seconds (63464.794813 ops per second)

Ended at Mon Aug 27 18:35:30 2018
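
As a sanity check on the daosbench summary line, the operation count can be recovered by multiplying the elapsed time by the reported rate. The sketch below is illustrative only: the helper name `ops_from_summary` and the regular expression are assumptions made for this example, not part of daosbench.

```python
import re

# Matches daosbench summary lines such as
# "Time: 0.001576 seconds (63464.794813 ops per second)".
SUMMARY_RE = re.compile(r"Time:\s*([0-9.]+) seconds \(([0-9.]+) ops per second\)")


def ops_from_summary(line):
    """Return the approximate operation count implied by a summary line.

    Hypothetical helper: elapsed seconds times ops/sec, rounded to the
    nearest integer because the printed time is itself rounded.
    """
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError("not a daosbench summary line: %r" % line)
    seconds = float(match.group(1))
    rate = float(match.group(2))
    return round(seconds * rate)


# Summary line from the kv-akey-fetch run above.
print(ops_from_summary("Time: 0.001576 seconds (63464.794813 ops per second)"))
```

For the kv-akey-fetch run above this comes out to about 100 operations, which agrees with the reported "Number of keys/process: 100".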

CaRT Self-Test

Small IO

...

[sdwillso@boro-4 ~]$ orterun -np 1 -ompi-server file:~/scripts/uri.txt self_test --group-name daos_server --endpoint 0:0 --message-sizes 0 --max-inflight-rpcs 16 --repetitions 100000
Adding endpoints:
  ranks: 0 (# ranks = 1)
  tags: 0 (# tags = 1)
Warning: No --master-endpoint specified; using this command line application as the master endpoint
Self Test Parameters:
  Group name to test against: daos_server
  # endpoints:                1
  Message sizes:              [(0-EMPTY 0-EMPTY)]
  Buffer addresses end with:  <Default>
  Repetitions per size:       100000
  Max inflight RPCs:          16

host boro-4.boro.hpdd.intel.com finished self_test duration 0.500411 S.
##################################################
Results for message size (0-EMPTY 0-EMPTY) (max_inflight_rpcs = 16):

Master Endpoint 0:0
-------------------
	RPC Bandwidth (MB/sec): 0.00
	RPC Throughput (RPCs/sec): 199836
	RPC Latencies (us):
		Min    : 34
		25th  %: 77
		Median : 77
		75th  %: 80
		Max    : 1563
		Average: 79
		Std Dev: 15.72
	RPC Failures: 0

	Endpoint results (rank:tag - Median Latency (us)):
		0:0 - 77

Large IO Bulk PUT

[sdwillso@boro-4 ~]$ orterun -np 1 -ompi-server file:~/scripts/uri.txt self_test --group-name daos_server --endpoint 0:0 --message-sizes "0 b1048576" --max-inflight-rpcs 16 --repetitions 1000
Adding endpoints:
  ranks: 0 (# ranks = 1)
  tags: 0 (# tags = 1)
Warning: No --master-endpoint specified; using this command line application as the master endpoint
Self Test Parameters:
  Group name to test against: daos_server
  # endpoints:                1
  Message sizes:              [(0-EMPTY 1048576-BULK_PUT)]
  Buffer addresses end with:  <Default>
  Repetitions per size:       1000
  Max inflight RPCs:          16

host boro-4.boro.hpdd.intel.com finished self_test duration 0.145925 S.
##################################################
Results for message size (0-EMPTY 1048576-BULK_PUT) (max_inflight_rpcs = 16):

Master Endpoint 0:0
-------------------
	RPC Bandwidth (MB/sec): 6852.84
	RPC Throughput (RPCs/sec): 6853
	RPC Latencies (us):
		Min    : 1372
		25th  %: 2292
		Median : 2313
		75th  %: 2336
		Max    : 4448
		Average: 2324
		Std Dev: 178.51
	RPC Failures: 0

	Endpoint results (rank:tag - Median Latency (us)):
		0:0 - 2313
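
The reported bandwidth and throughput for the Bulk PUT run can be cross-checked against each other, since every RPC carries one 1048576-byte bulk transfer. The sketch below assumes self_test reports "MB" as 2**20 bytes; that unit is an assumption, not something the output states, and the function name is hypothetical.

```python
MIB = 1 << 20  # assumption: "MB" in the self_test output means 2**20 bytes


def expected_bandwidth_mb(rpcs_per_sec, bulk_bytes):
    """Bandwidth implied by an RPC rate and a fixed bulk payload size."""
    return rpcs_per_sec * bulk_bytes / MIB


# Values copied from the Bulk PUT results above.
reported_mb = 6852.84
implied_mb = expected_bandwidth_mb(6853, 1048576)

# The two figures should differ only by rounding of the printed RPC rate.
print(implied_mb, abs(implied_mb - reported_mb) < 1.0)
```

Under that unit assumption the two figures agree to within rounding. With 10**6-byte megabytes they would not, which suggests the binary unit is the one in use.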

Large IO Bulk GET

[sdwillso@boro-4 ~]$ orterun -np 1 -ompi-server file:~/scripts/uri.txt self_test --group-name daos_server --endpoint 0:0 --message-sizes "b1048576 0" --max-inflight-rpcs 16 --repetitions 1000
Adding endpoints:
  ranks: 0 (# ranks = 1)
  tags: 0 (# tags = 1)
Warning: No --master-endpoint specified; using this command line application as the master endpoint
Self Test Parameters:
  Group name to test against: daos_server
  # endpoints:                1
  Message sizes:              [(1048576-BULK_GET 0-EMPTY)]
  Buffer addresses end with:  <Default>
  Repetitions per size:       1000
  Max inflight RPCs:          16

host boro-4.boro.hpdd.intel.com finished self_test duration 0.125163 S.
##################################################
Results for message size (1048576-BULK_GET 0-EMPTY) (max_inflight_rpcs = 16):

Master Endpoint 0:0
-------------------
	RPC Bandwidth (MB/sec): 7989.59
	RPC Throughput (RPCs/sec): 7990
	RPC Latencies (us):
		Min    : 518
		25th  %: 1961
		Median : 1977
		75th  %: 2004
		Max    : 3477
		Average: 1991
		Std Dev: 232.06
	RPC Failures: 0

	Endpoint results (rank:tag - Median Latency (us)):
		0:0 - 1977
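
The throughput and average latency figures from the three self_test runs can be checked for internal consistency with Little's law (throughput = requests in flight / average latency). This is a rough sketch that assumes the client keeps all 16 RPCs outstanding for the whole run, which the output does not confirm; the function name is hypothetical.

```python
def littles_law_rate(inflight, avg_latency_us):
    """Steady-state RPCs/sec if `inflight` requests are always outstanding."""
    return inflight / (avg_latency_us * 1e-6)


# (label, average latency in us, reported RPCs/sec), copied from the runs above.
runs = [
    ("small IO", 79, 199836),
    ("bulk PUT", 2324, 6853),
    ("bulk GET", 1991, 7990),
]

for label, avg_us, reported in runs:
    predicted = littles_law_rate(16, avg_us)
    error = abs(predicted - reported) / reported
    print("%-8s predicted %.0f RPCs/sec, reported %d (%.1f%% apart)"
          % (label, predicted, reported, 100 * error))
```

All three predictions land within about 2% of the reported rates, so the latency and throughput numbers are mutually consistent under the full-pipeline assumption.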

mpich tests

Results: With current master, the first test hangs until it segfaults. After updating to the OFI commit mentioned at the beginning of this page, the run then hit DAOS-1290.