Tip of master, commit 98cd53e0a273885324261d5f38c23a80be84f09e

After running tip of master, reran a few tests with OFI updated to commit 99e333426b64d7d227fd604731235ffc14862662 to pull in some psm2 fixes.

All tests were run with ofi+psm2 over ib0.

daos_test: Run with 8 servers (boro-[4-11]) and 2 clients (boro-[12-13]). Servers were killed and /mnt/daos was cleaned between the runs listed below.

Tests that require a pool created via dmg used a 4GB pool, with boro-12 as the client.

mpich tests used boro-4 as server, boro-12 as client, with a 1GB pool.

Test Results

daos_test

Separate runs with cleanup in between:

  • -mpcCAeoRd - PASS
  • -i - FAIL, still rebuilding on IO27 after 10 minutes
  • -r - FAIL
    • Looks to be the same failure as -i above: still rebuilding after 10 minutes
  • -O - PASS

daos_perf

1K Records

CREDITS=1

[sdwillso@boro-4 ~]$ orterun -x FI_PSM2_DISCONNECT=1 --mca mtl ^psm2,ofi -np 1 -quiet --hostfile ~/scripts/host.cli.1 --ompi-server file:~/scripts/uri.txt -x DD_SUBSYS= -x DD_MASK= -x D_LOG_FILE=/tmp/daos_perf.log daos_perf -T daos -P 2G -d 1 -a 200 -r 1000 -s 1K -C 1 -t -z
Test :
	DAOS (full stack)
Parameters :
	pool size     : 2048 MB
	credits       : 1 (sync I/O for -ve)
	obj_per_cont  : 1 x 1 (procs)
	dkey_per_obj  : 1
	akey_per_dkey : 200
	recx_per_akey : 1000
	value type    : single
	value size    : 1024
	zero copy     : yes
	overwrite     : yes
	verify fetch  : no
	VOS file      : <NULL>
fe9c1051: rank 1 became pool service leader 0
Started...
update successfully completed:
	duration : 5.522145   sec
	bandwith : 35.369     MB/sec
	rate     : 36217.81   IO/sec
	latency  : 27.611     us (nonsense if credits > 1)
Duration across processes:
	MAX duration : 5.522145   sec
	MIN duration : 5.522145   sec
	Average duration : 5.522145   sec
fe9c1051: rank 1 no longer pool service leader 0
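
The reported rate, bandwidth, and latency are consistent with the test parameters and the measured duration. The following is a minimal sketch of that arithmetic, assuming one I/O per record (dkeys x akeys x recx) and that the "MB" in the daos_perf output means 2^20 bytes. The variable names are illustrative and are not taken from daos_perf.

```python
# Figures copied from the daos_perf output above.
dkeys, akeys, recx = 1, 200, 1000
value_size = 1024      # bytes per record (-s 1K)
duration = 5.522145    # seconds

ios = dkeys * akeys * recx                        # assumption: one I/O per record
rate = ios / duration                             # IO/sec
bandwidth = ios * value_size / duration / 2**20   # MB/sec, assuming MB = 2**20 bytes
latency_us = duration / ios * 1e6                 # only meaningful with credits = 1

print(f"rate      : {rate:.2f} IO/sec")
print(f"bandwidth : {bandwidth:.3f} MB/sec")
print(f"latency   : {latency_us:.3f} us")
```

With a single credit the I/Os are issued one at a time, so dividing the duration by the I/O count gives a per-operation latency. With more credits in flight that division no longer does, which is what the "nonsense if credits > 1" note in the output refers to.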

CREDITS=8

  • hitting segfault

4K Records

CREDITS=1

  • hitting segfault

IOR, 10GB pool, data verification enabled

[sdwillso@boro-4 ~]$ orterun -np 1 --hostfile ~/hostlists/daos_client_hostlist --mca mtl ^psm2,ofi  --ompi-server file:~/scripts/uri.txt ior -v -W -i 5 -a DAOS -w -o `uuidgen` -b 5g -t 1m -O daospool=3c6381d2-b094-476e-b7ae-b1f4f7f908fe,daosrecordsize=1m,daosstripesize=1m,daosstripecount=1024,daosaios=16,daosobjectclass=LARGE,daosPoolSvc=1,daosepoch=1
IOR-3.0.1: MPI Coordinated Test of Parallel I/O

Began: Mon Aug 27 18:00:54 2018
Command line used: ior -v -W -i 5 -a DAOS -w -o 0d89ddc3-ae02-4df5-a071-2423bb3a32b9 -b 5g -t 1m -O daospool=3c6381d2-b094-476e-b7ae-b1f4f7f908fe,daosrecordsize=1m,daosstripesize=1m,daosstripecount=1024,daosaios=16,daosobjectclass=LARGE,daosPoolSvc=1,daosepoch=1
Machine: Linux boro-12.boro.hpdd.intel.com
Start time skew across all tasks: 0.00 sec

Test 0 started: Mon Aug 27 18:00:54 2018
Path: /home/sdwillso
FS: 3.8 TiB   Used FS: 13.4%   Inodes: 250.0 Mi   Used Inodes: 2.7%
Participating tasks: 1
[0] WARNING: USING daosStripeMax CAUSES READS TO RETURN INVALID DATA
Summary:
	api                = DAOS
	test filename      = 0d89ddc3-ae02-4df5-a071-2423bb3a32b9
	access             = single-shared-file, independent
	pattern            = segmented (1 segment)
	ordering in a file = sequential offsets
	ordering inter file= no tasks offsets
	clients            = 1 (1 per node)
	repetitions        = 5
	xfersize           = 1 MiB
	blocksize          = 5 GiB
	aggregate filesize = 5 GiB

access    bw(MiB/s)  block(KiB) xfer(KiB)  open(s)    wr/rd(s)   close(s)   total(s)   iter
------    ---------  ---------- ---------  --------   --------   --------   --------   ----
Commencing write performance test: Mon Aug 27 18:00:55 2018
write     2553.80    5242880    1024.00    0.044146   1.93       0.026871   2.00       0   
Verifying contents of the file(s) just written.
Mon Aug 27 18:00:57 2018

remove    -          -          -          -          -          -          0.044487   0   
Commencing write performance test: Mon Aug 27 18:01:03 2018
write     2592.04    5242880    1024.00    0.035451   1.92       0.024715   1.98       1   
Verifying contents of the file(s) just written.
Mon Aug 27 18:01:05 2018

remove    -          -          -          -          -          -          0.045178   1   
Commencing write performance test: Mon Aug 27 18:01:12 2018
write     2593.84    5242880    1024.00    0.035656   1.91       0.025228   1.97       2   
Verifying contents of the file(s) just written.
Mon Aug 27 18:01:14 2018

remove    -          -          -          -          -          -          0.043000   2   
Commencing write performance test: Mon Aug 27 18:01:20 2018
write     2595.09    5242880    1024.00    0.036727   1.91       0.024770   1.97       3   
Verifying contents of the file(s) just written.
Mon Aug 27 18:01:22 2018

remove    -          -          -          -          -          -          0.044975   3   
Commencing write performance test: Mon Aug 27 18:01:28 2018
write     2590.57    5242880    1024.00    0.035836   1.92       0.025285   1.98       4   
Verifying contents of the file(s) just written.
Mon Aug 27 18:01:30 2018

remove    -          -          -          -          -          -          0.043984   4   

Max Write: 2595.09 MiB/sec (2721.15 MB/sec)

Summary of all tests:
Operation   Max(MiB)   Min(MiB)  Mean(MiB)     StdDev    Mean(s) Test# #Tasks tPN reps fPP reord reordoff reordrand seed segcnt blksiz xsize aggsize API RefNum
write        2595.09    2553.80    2585.07      15.71    1.98068 0 1 1 5 0 0 1 0 0 1 5368709120 1048576 5368709120 DAOS 0

Finished: Mon Aug 27 18:01:39 2018
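
The summary row can be cross-checked against the five per-iteration write figures. The sketch below is an illustrative check, not part of IOR. It assumes the reported StdDev is a population standard deviation (which is what reproduces 15.71 here) and that the MB/sec figure uses 10^6 bytes per MB.

```python
import statistics

# Per-iteration write bandwidth (MiB/s) copied from the IOR output above.
write_bw = [2553.80, 2592.04, 2593.84, 2595.09, 2590.57]

max_bw = max(write_bw)
min_bw = min(write_bw)
mean_bw = statistics.fmean(write_bw)
stddev_bw = statistics.pstdev(write_bw)   # population std dev (assumption)
max_bw_mb = max_bw * 2**20 / 1e6          # MiB/s -> MB/s, assuming MB = 10**6 bytes

print(f"max {max_bw:.2f}  min {min_bw:.2f}  mean {mean_bw:.2f}  stddev {stddev_bw:.2f}")
print(f"max write: {max_bw:.2f} MiB/sec ({max_bw_mb:.2f} MB/sec)")
```

The first iteration is the slowest of the five and accounts for most of the spread. The remaining four lie within about 5 MiB/s of each other.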

daosbench

kv-idx-update

  • At the end of this test with multiple servers, container destroy fails.
Time: 105.668696 seconds (9463.540644 ops per second)
[sdwillso@boro-4 ~]$ orterun -x FI_PSM2_DISCONNECT=1 -np 1 --mca mtl ^psm2,ofi  --hostfile ~/hostlists/daos_client_hostlist --ompi-server file:~/scripts/uri.txt daosbench --test=kv-idx-update --testid=1 --svc=1 --dpool=099fde5e-e164-4e0f-b1be-bc7130a652b9 --container=`uuidgen` --object-class=tiny --aios=32 --indexes=1000000
================================
DAOSBENCH (KV)
Started at
Mon Aug 27 18:25:30 2018
=================================
===============================
Test Setup
---------------
Test: kv-idx-update
DAOS pool :099fde5e-e164-4e0f-b1be-bc7130a652b9
DAOS container :d17f64ae-2c6c-48a5-a3b5-0733424c0384
Value buffer size: 64
Number of processes: 1
Number of indexes/process: 1000000
Number of asynchronous I/O: 32
===============================
kv-idx-update
Time: 105.668696 seconds (9463.540644 ops per second)
daosbench:0:src/tests/daosbench.c:765: Unknown error 2001: Container destroy failed

kv-dkey-update

Time: 0.008301 seconds (12047.253463 ops per second)
[sdwillso@boro-4 ~]$ orterun -x FI_PSM2_DISCONNECT=1 -np 1 --mca mtl ^psm2,ofi  --hostfile ~/hostlists/daos_client_hostlist --ompi-server file:~/scripts/uri.txt daosbench --test=kv-dkey-update --testid=1 --svc=1 --dpool=15d0097a-c03b-4272-8d60-b4e7cd11544e --container=`uuidgen` --object-class=tiny --aios=32 --indexes=1000000
================================
DAOSBENCH (KV)
Started at
Mon Aug 27 18:29:54 2018
=================================
===============================
Test Setup
---------------
Test: kv-dkey-update
DAOS pool :15d0097a-c03b-4272-8d60-b4e7cd11544e
DAOS container :44c3bcea-1056-418f-a900-8eb899af26af
Value buffer size: 64
Number of processes: 1
Number of keys/process: 100
Number of asynchronous I/O: 32
===============================
kv-dkey-update
Time: 0.008301 seconds (12047.253463 ops per second)

Ended at Mon Aug 27 18:29:55 2018

kv-akey-update

Time: 0.004175 seconds (23950.921016 ops per second)
[sdwillso@boro-4 ~]$ orterun -x FI_PSM2_DISCONNECT=1 -np 1 --mca mtl ^psm2,ofi  --hostfile ~/hostlists/daos_client_hostlist --ompi-server file:~/scripts/uri.txt daosbench --test=kv-akey-update --testid=1 --svc=1 --dpool=7185fa9d-f5ad-4454-90e2-3ce5187f3bd3 --container=`uuidgen` --object-class=tiny --aios=32 --indexes=1000000
================================
DAOSBENCH (KV)
Started at
Mon Aug 27 18:31:54 2018
=================================
===============================
Test Setup
---------------
Test: kv-akey-update
DAOS pool :7185fa9d-f5ad-4454-90e2-3ce5187f3bd3
DAOS container :9da1b295-f055-4b0b-ac63-9702daa30715
Value buffer size: 64
Number of processes: 1
Number of keys/process: 100
Number of asynchronous I/O: 32
===============================
kv-akey-update
Time: 0.004175 seconds (23950.921016 ops per second)

Ended at Mon Aug 27 18:31:55 2018

kv-dkey-fetch

Time: 0.000553 seconds (180706.206748 ops per second)
[sdwillso@boro-4 ~]$ orterun -x FI_PSM2_DISCONNECT=1 -np 1 --mca mtl ^psm2,ofi  --hostfile ~/hostlists/daos_client_hostlist --ompi-server file:~/scripts/uri.txt daosbench --test=kv-dkey-fetch --testid=1 --svc=1 --dpool=64a6e942-4bb2-4b50-a95e-58086a17bfab --container=`uuidgen` --object-class=tiny --aios=32 --indexes=1000000
================================
DAOSBENCH (KV)
Started at
Mon Aug 27 18:33:46 2018
=================================
===============================
Test Setup
---------------
Test: kv-dkey-fetch
DAOS pool :64a6e942-4bb2-4b50-a95e-58086a17bfab
DAOS container :b511c4d7-aab0-4700-970f-eadc840806d1
Value buffer size: 64
Number of processes: 1
Number of keys/process: 100
Number of asynchronous I/O: 32
===============================
kv-dkey-fetch
Time: 0.000553 seconds (180706.206748 ops per second)

Ended at Mon Aug 27 18:33:47 2018

kv-akey-fetch

Time: 0.001576 seconds (63464.794813 ops per second)
[sdwillso@boro-4 ~]$ orterun -x FI_PSM2_DISCONNECT=1 -np 1 --mca mtl ^psm2,ofi  --hostfile ~/hostlists/daos_client_hostlist --ompi-server file:~/scripts/uri.txt daosbench --test=kv-akey-fetch --testid=1 --svc=1 --dpool=7e97c5fb-196e-4851-9c88-069349d8ce6e --container=`uuidgen` --object-class=tiny --aios=32 --indexes=1000000
================================
DAOSBENCH (KV)
Started at
Mon Aug 27 18:35:29 2018
=================================
===============================
Test Setup
---------------
Test: kv-akey-fetch
DAOS pool :7e97c5fb-196e-4851-9c88-069349d8ce6e
DAOS container :932946ad-9bf1-4309-b14b-96ca699ee72c
Value buffer size: 64
Number of processes: 1
Number of keys/process: 100
Number of asynchronous I/O: 32
===============================
kv-akey-fetch
Time: 0.001576 seconds (63464.794813 ops per second)

Ended at Mon Aug 27 18:35:30 2018
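
The ops-per-second figures above appear to be the operation count divided by the elapsed time. The sketch below recomputes them, assuming the count is the "Number of indexes/process" or "Number of keys/process" value printed in each Test Setup block, with one process. Because the printed times are rounded to microseconds, the short runs only agree with the reported rates to within about 0.1%.

```python
# (test name, operation count, elapsed seconds, reported ops/sec), copied from the output above.
runs = [
    ("kv-idx-update",  1_000_000, 105.668696,   9463.540644),
    ("kv-dkey-update",       100,   0.008301,  12047.253463),
    ("kv-akey-update",       100,   0.004175,  23950.921016),
    ("kv-dkey-fetch",        100,   0.000553, 180706.206748),
    ("kv-akey-fetch",        100,   0.001576,  63464.794813),
]

def ops_per_second(count, seconds):
    """Operations completed per second of elapsed time."""
    return count / seconds

rel_errors = {}
for name, count, seconds, reported in runs:
    computed = ops_per_second(count, seconds)
    rel_errors[name] = abs(computed - reported) / reported
    print(f"{name:15s} {computed:14.2f} ops/s (reported {reported:.2f})")
```

Note that the four dkey/akey tests report 100 keys per process even though --indexes=1000000 was passed, so each of those runs lasts under 10 ms and its rate rests on a very small sample.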

CaRT Self-Test

Small IO

[sdwillso@boro-4 ~]$ orterun -np 1 -ompi-server file:~/scripts/uri.txt self_test --group-name daos_server --endpoint 0:0 --message-sizes 0 --max-inflight-rpcs 16 --repetitions 100000
Adding endpoints:
  ranks: 0 (# ranks = 1)
  tags: 0 (# tags = 1)
Warning: No --master-endpoint specified; using this command line application as the master endpoint
Self Test Parameters:
  Group name to test against: daos_server
  # endpoints:                1
  Message sizes:              [(0-EMPTY 0-EMPTY)]
  Buffer addresses end with:  <Default>
  Repetitions per size:       100000
  Max inflight RPCs:          16

host boro-4.boro.hpdd.intel.com finished self_test duration 0.500411 S.
##################################################
Results for message size (0-EMPTY 0-EMPTY) (max_inflight_rpcs = 16):

Master Endpoint 0:0
-------------------
	RPC Bandwidth (MB/sec): 0.00
	RPC Throughput (RPCs/sec): 199836
	RPC Latencies (us):
		Min    : 34
		25th  %: 77
		Median : 77
		75th  %: 80
		Max    : 1563
		Average: 79
		Std Dev: 15.72
	RPC Failures: 0

	Endpoint results (rank:tag - Median Latency (us)):
		0:0 - 77
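
As a sanity check on the small-IO numbers, the throughput matches the repetition count divided by the reported duration, and it is close to what the in-flight limit and average latency would predict. This is a rough sketch under two assumptions: that the reported self_test duration covers all 100000 repetitions, and that the 16-RPC window stays full for the whole run. The variable names are illustrative.

```python
# Figures copied from the self_test output above.
repetitions = 100000
duration_s = 0.500411
max_inflight = 16
avg_latency_us = 79

# Throughput from total work over total time.
throughput = repetitions / duration_s

# Estimate from concurrency / latency (Little's law), assuming a full window.
estimate = max_inflight / (avg_latency_us * 1e-6)

print(f"throughput : {throughput:.0f} RPCs/sec")
print(f"estimate   : {estimate:.0f} RPCs/sec")
```

The two values differ by a little over 1%, which suggests the run kept close to 16 RPCs in flight throughout.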

Large IO Bulk PUT

Large IO Bulk GET

mpich tests

Results: With current master, the first test hangs until it segfaults. After updating to the OFI commit mentioned at the beginning of this page, hit an issue whose Jira ticket reference did not render on this page.
