Tip of master, commit cf89d9f3cc2ec5e5fbc8c33ea9f51e83ac23429a

All tests run with ofi+psm2, ib0.

daos_test: Run with 8 servers (boro-[4-11]) and 2 clients (boro-[12-13]). Servers were killed and /mnt/daos was cleaned between each of the runs listed below.

Tests that require a pool created via dmg used a 4GB pool, with boro-12 as the client.

mpich tests used boro-4 as server, boro-12 as client, with a 1GB pool.

Test Results

daos_test

Separate runs with cleanup in between:

  • -mpcCAeoRd - PASS
  • -i - FAIL, still rebuilding on IO27 after 10 minutes
    • Appears to match an existing Jira ticket (the ticket reference did not render on this page)
  • -r - FAIL
    • Appears to be the same failure as -i above: still rebuilding after 10 minutes
  • -O - PASS

daosperf

1K Records

CREDITS=1

[sdwillso@boro-4 daos_m]$ orterun -x FI_PSM2_DISCONNECT=1 --mca mtl ^psm2,ofi -np 1 -quiet --hostfile ~/scripts/host.cli.1 --ompi-server file:~/scripts/uri.txt -x DD_SUBSYS= -x DD_MASK= -x D_LOG_FILE=/tmp/daos_perf.log daos_perf -T daos -P 2G -d 1 -a 200 -r 1000 -s 1K -C 1 -t -z
Test :
	DAOS (full stack)
Parameters :
	pool size     : 2048 MB
	credits       : 1 (sync I/O for -ve)
	obj_per_cont  : 1 x 1 (procs)
	dkey_per_obj  : 1
	akey_per_dkey : 200
	recx_per_akey : 1000
	value type    : single
	value size    : 1024
	zero copy     : yes
	overwrite     : yes
	verify fetch  : no
	VOS file      : <NULL>
e77e006e: rank 1 became pool service leader 0
Started...
update successfully completed:
	duration : 5.539411   sec
	bandwith : 35.259     MB/sec
	rate     : 36104.92   IO/sec
	latency  : 27.697     us (nonsense if credits > 1)
Duration across processes:
	MAX duration : 5.539411   sec
	MIN duration : 5.539411   sec
	Average duration : 5.539411   sec
e77e006e: rank 1 no longer pool service leader 0
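
The reported rate, bandwidth, and latency can be cross-checked against the run parameters. The sketch below is not part of daos_perf; it assumes one update per (akey, recx) pair on a single object and dkey, and that "MB" in the output means 1024 * 1024 bytes. Under those assumptions the derived values agree with the figures above to within rounding.

```python
# Values copied from the daos_perf output above.
akeys_per_dkey = 200
recx_per_akey = 1000
value_size_bytes = 1024
duration_sec = 5.539411

# Assumption: one update per (akey, recx) pair, 1 object x 1 dkey.
total_ops = akeys_per_dkey * recx_per_akey

# Rate is operations per second over the whole run.
rate_io_per_sec = total_ops / duration_sec

# Assumption: daos_perf reports MB as 1024 * 1024 bytes.
bandwidth_mb_per_sec = rate_io_per_sec * value_size_bytes / (1024 * 1024)

# With credits = 1 the I/O is synchronous, so per-op latency is duration / ops.
latency_us = duration_sec / total_ops * 1e6

print(f"rate      : {rate_io_per_sec:.2f} IO/sec")
print(f"bandwidth : {bandwidth_mb_per_sec:.3f} MB/sec")
print(f"latency   : {latency_us:.3f} us")
```

The latency figure is only meaningful here because a single credit keeps one I/O in flight at a time, which is what the "nonsense if credits > 1" note in the output refers to.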

CREDITS=8

  • Hits a segfault: CART-496 (segfault in psm2 while running daos_perf), status OPEN

4K Records

CREDITS=1

  • Hits a segfault: CART-496 (segfault in psm2 while running daos_perf), status OPEN

IOR, 10GB pool, data verification enabled

[sdwillso@boro-4 daos_m]$ orterun -np 1 -x FI_PSM2_DISCONNECT=1 --hostfile ~/hostlists/daos_client_hostlist --mca mtl ^psm2,ofi  --ompi-server file:~/scripts/uri.txt ior -v -W -i 5 -a DAOS -w -o `uuidgen` -b 5g -t 1m -O daospool=cc15805e-1178-41f5-8542-bccdcc7aadce,daosrecordsize=1m,daosstripesize=1m,daosstripecount=1024,daosaios=16,daosobjectclass=LARGE,daosPoolSvc=1,daosepoch=1
IOR-3.0.1: MPI Coordinated Test of Parallel I/O

Began: Mon Aug 20 18:42:00 2018
Command line used: ior -v -W -i 5 -a DAOS -w -o 178063f8-d7aa-4d34-8dd9-4c9218e56101 -b 5g -t 1m -O daospool=cc15805e-1178-41f5-8542-bccdcc7aadce,daosrecordsize=1m,daosstripesize=1m,daosstripecount=1024,daosaios=16,daosobjectclass=LARGE,daosPoolSvc=1,daosepoch=1
Machine: Linux boro-12.boro.hpdd.intel.com
Start time skew across all tasks: 0.00 sec

Test 0 started: Mon Aug 20 18:42:00 2018
Path: /home/sdwillso/daos_m
FS: 3.8 TiB   Used FS: 12.6%   Inodes: 250.0 Mi   Used Inodes: 2.5%
Participating tasks: 1
[0] WARNING: USING daosStripeMax CAUSES READS TO RETURN INVALID DATA
Summary:
	api                = DAOS
	test filename      = 178063f8-d7aa-4d34-8dd9-4c9218e56101
	access             = single-shared-file, independent
	pattern            = segmented (1 segment)
	ordering in a file = sequential offsets
	ordering inter file= no tasks offsets
	clients            = 1 (1 per node)
	repetitions        = 5
	xfersize           = 1 MiB
	blocksize          = 5 GiB
	aggregate filesize = 5 GiB

access    bw(MiB/s)  block(KiB) xfer(KiB)  open(s)    wr/rd(s)   close(s)   total(s)   iter
------    ---------  ---------- ---------  --------   --------   --------   --------   ----
Commencing write performance test: Mon Aug 20 18:42:01 2018
write     5413       5242880    1024.00    0.006197   0.936376   0.003321   0.945910   0   
Verifying contents of the file(s) just written.
Mon Aug 20 18:42:02 2018

remove    -          -          -          -          -          -          0.001939   0   
Commencing write performance test: Mon Aug 20 18:42:05 2018
write     5554       5242880    1024.00    0.001194   0.918304   0.002265   0.921780   1   
Verifying contents of the file(s) just written.
Mon Aug 20 18:42:06 2018

remove    -          -          -          -          -          -          0.000892   1   
Commencing write performance test: Mon Aug 20 18:42:10 2018
write     5516       5242880    1024.00    0.001179   0.924887   0.002051   0.928131   2   
Verifying contents of the file(s) just written.
Mon Aug 20 18:42:11 2018

remove    -          -          -          -          -          -          0.000912   2   
Commencing write performance test: Mon Aug 20 18:42:14 2018
write     5558       5242880    1024.00    0.001197   0.917913   0.002022   0.921148   3   
Verifying contents of the file(s) just written.
Mon Aug 20 18:42:15 2018

remove    -          -          -          -          -          -          0.000862   3   
Commencing write performance test: Mon Aug 20 18:42:18 2018
write     5578       5242880    1024.00    0.001147   0.914605   0.002092   0.917854   4   
Verifying contents of the file(s) just written.
Mon Aug 20 18:42:19 2018

remove    -          -          -          -          -          -          0.000922   4   

Max Write: 5578.23 MiB/sec (5849.19 MB/sec)

Summary of all tests:
Operation   Max(MiB)   Min(MiB)  Mean(MiB)     StdDev    Mean(s) Test# #Tasks tPN reps fPP reord reordoff reordrand seed segcnt blksiz xsize aggsize API RefNum
write        5578.23    5412.78    5524.04      59.11    0.92696 0 1 1 5 0 0 1 0 0 1 5368709120 1048576 5368709120 DAOS 0

Finished: Mon Aug 20 18:42:25 2018
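
As a sanity check on the IOR summary, the per-iteration bandwidth can be recomputed from the total(s) column. This is a sketch, not IOR's own code; it assumes bandwidth is the aggregate file size (5 GiB) divided by total time, and that the StdDev column is a population standard deviation. Both assumptions reproduce the reported Max, Min, Mean, and StdDev to within rounding.

```python
import statistics

# Aggregate file size in MiB: -b 5g with a single task.
aggregate_mib = 5 * 1024

# total(s) column for iterations 0..4, copied from the IOR output above.
total_sec = [0.945910, 0.921780, 0.928131, 0.921148, 0.917854]

# Assumption: bw(MiB/s) = aggregate size / total time per iteration.
bw_mib = [aggregate_mib / t for t in total_sec]

max_bw = max(bw_mib)
min_bw = min(bw_mib)
mean_bw = statistics.mean(bw_mib)
# Assumption: the summary uses the population (not sample) standard deviation.
stddev_bw = statistics.pstdev(bw_mib)
mean_sec = statistics.mean(total_sec)

# Convert MiB/s to MB/s (1 MiB = 1.048576 MB).
max_bw_mb = max_bw * 1.048576

print(f"max={max_bw:.2f} min={min_bw:.2f} mean={mean_bw:.2f} MiB/s")
print(f"stddev={stddev_bw:.2f} MiB/s mean_time={mean_sec:.5f} s")
print(f"max={max_bw_mb:.2f} MB/s")
```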

daos_bench

kv-idx-update

  • At the end of this test with multiple servers, container destroy fails
    • A Jira ticket was linked here, but the reference did not render on this page
Time: 104.697483 seconds (9551.328003 ops per second)
[sdwillso@boro-4 daos_m]$ orterun -np 1 -x FI_PSM2_DISCONNECT=1 --mca mtl ^psm2,ofi  --hostfile ~/hostlists/daos_client_hostlist --ompi-server file:~/scripts/uri.txt daosbench --test=kv-idx-update --testid=1 --svc=1 --dpool=96f11eff-7475-4158-90ae-b55fae60cfaf --container=`uuidgen` --object-class=tiny --aios=32 --indexes=1000000
================================
DAOSBENCH (KV)
Started at
Mon Aug 20 18:50:25 2018
=================================
===============================
Test Setup
---------------
Test: kv-idx-update
DAOS pool :96f11eff-7475-4158-90ae-b55fae60cfaf
DAOS container :fbe891b8-d983-4289-a111-12c0682ae9db
Value buffer size: 64
Number of processes: 1
Number of indexes/process: 1000000
Number of asynchronous I/O: 32
===============================
kv-idx-update
Time: 104.697483 seconds (9551.328003 ops per second)

kv-dkey-update

Time: 0.006400 seconds (15624.158021 ops per second)
[sdwillso@boro-4 daos_m]$ orterun -np 1 -x FI_PSM2_DISCONNECT=1 --mca mtl ^psm2,ofi  --hostfile ~/hostlists/daos_client_hostlist --ompi-server file:~/scripts/uri.txt daosbench --test=kv-dkey-update --testid=1 --svc=1 --dpool=3031f327-4f8e-4489-8de6-b2994b8aff5a --container=`uuidgen` --object-class=tiny --aios=32 --indexes=1000000
================================
DAOSBENCH (KV)
Started at
Mon Aug 20 18:59:08 2018
=================================
===============================
Test Setup
---------------
Test: kv-dkey-update
DAOS pool :3031f327-4f8e-4489-8de6-b2994b8aff5a
DAOS container :99ad75bd-f914-4449-becf-b5e64f278d79
Value buffer size: 64
Number of processes: 1
Number of keys/process: 100
Number of asynchronous I/O: 32
===============================
kv-dkey-update
Time: 0.006400 seconds (15624.158021 ops per second)

Ended at Mon Aug 20 18:59:09 2018

kv-akey-update

Time: 0.004211 seconds (23746.177339 ops per second)
[sdwillso@boro-4 daos_m]$ orterun -np 1 -x FI_PSM2_DISCONNECT=1 --mca mtl ^psm2,ofi  --hostfile ~/hostlists/daos_client_hostlist --ompi-server file:~/scripts/uri.txt daosbench --test=kv-akey-update --testid=1 --svc=1 --dpool=d09aa21b-7f78-4ede-b437-cc7b68f13037 --container=`uuidgen` --object-class=tiny --aios=32 --indexes=1000000
================================
DAOSBENCH (KV)
Started at
Mon Aug 20 19:00:59 2018
=================================
===============================
Test Setup
---------------
Test: kv-akey-update
DAOS pool :d09aa21b-7f78-4ede-b437-cc7b68f13037
DAOS container :5c00f77c-0eb7-4672-90fa-d37724a66c5a
Value buffer size: 64
Number of processes: 1
Number of keys/process: 100
Number of asynchronous I/O: 32
===============================
kv-akey-update
Time: 0.004211 seconds (23746.177339 ops per second)

Ended at Mon Aug 20 19:01:00 2018

kv-dkey-fetch

Time: 0.000601 seconds (166508.774672 ops per second)
[sdwillso@boro-4 daos_m]$ orterun -np 1 -x FI_PSM2_DISCONNECT=1 --mca mtl ^psm2,ofi  --hostfile ~/hostlists/daos_client_hostlist --ompi-server file:~/scripts/uri.txt daosbench --test=kv-dkey-fetch --testid=1 --svc=1 --dpool=ffeb7914-5b21-4865-80fd-b3dc0ad6bc1c --container=`uuidgen` --object-class=tiny --aios=32 --indexes=1000000
================================
DAOSBENCH (KV)
Started at
Mon Aug 20 19:02:29 2018
=================================
===============================
Test Setup
---------------
Test: kv-dkey-fetch
DAOS pool :ffeb7914-5b21-4865-80fd-b3dc0ad6bc1c
DAOS container :396c6185-c711-4892-9c32-530212979b9d
Value buffer size: 64
Number of processes: 1
Number of keys/process: 100
Number of asynchronous I/O: 32
===============================
kv-dkey-fetch
Time: 0.000601 seconds (166508.774672 ops per second)

Ended at Mon Aug 20 19:02:31 2018

kv-akey-fetch

Time: 0.001685 seconds (59362.699540 ops per second)
[sdwillso@boro-4 daos_m]$ orterun -np 1 -x FI_PSM2_DISCONNECT=1 --mca mtl ^psm2,ofi  --hostfile ~/hostlists/daos_client_hostlist --ompi-server file:~/scripts/uri.txt daosbench --test=kv-akey-fetch --testid=1 --svc=1 --dpool=ba2b92f5-2ac9-42e4-8070-6b6b13531b80 --container=`uuidgen` --object-class=tiny --aios=32 --indexes=1000000
================================
DAOSBENCH (KV)
Started at
Mon Aug 20 19:04:00 2018
=================================
===============================
Test Setup
---------------
Test: kv-akey-fetch
DAOS pool :ba2b92f5-2ac9-42e4-8070-6b6b13531b80
DAOS container :cbeeb6b2-4ec2-4aa4-82ff-9b8859710ae8
Value buffer size: 64
Number of processes: 1
Number of keys/process: 100
Number of asynchronous I/O: 32
===============================
kv-akey-fetch
Time: 0.001685 seconds (59362.699540 ops per second)

Ended at Mon Aug 20 19:04:01 2018
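
The ops-per-second figures above can be checked against the operation counts shown in each "Test Setup" block. The sketch below assumes the rate is simply operations divided by elapsed time; it is not taken from the daosbench source. Note that the four dkey/akey tests report "Number of keys/process: 100" even though --indexes=1000000 was passed, so they are based on 100 operations each, and their sub-10 ms timings are short enough that rounding in the printed time is visible in the result.

```python
# (test name, operations issued, elapsed seconds, reported ops/sec),
# all copied from the daosbench output above.
runs = [
    ("kv-idx-update", 1_000_000, 104.697483, 9551.328003),
    ("kv-dkey-update", 100, 0.006400, 15624.158021),
    ("kv-akey-update", 100, 0.004211, 23746.177339),
    ("kv-dkey-fetch", 100, 0.000601, 166508.774672),
    ("kv-akey-fetch", 100, 0.001685, 59362.699540),
]


def ops_per_second(ops: int, seconds: float) -> float:
    """Assumed definition of the reported rate: operations / elapsed time."""
    return ops / seconds


# Relative difference between the derived and the reported rate per test.
drifts = {}
for name, ops, seconds, reported in runs:
    derived = ops_per_second(ops, seconds)
    drifts[name] = abs(derived - reported) / reported
    print(f"{name:15s} derived={derived:12.2f} reported={reported:12.2f} "
          f"drift={drifts[name]:.3%}")
```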

CaRT Self-Test

Small IO

[sdwillso@boro-4 ~]$ orterun -x FI_PSM2_DISCONNECT=1 -np 1 -ompi-server file:~/scripts/uri.txt --hostfile ~/hostlists/daos_single_server_2 self_test --group-name daos_server --endpoint 0:0 --message-sizes 0 --max-inflight-rpcs 16 --repetitions 100000
Adding endpoints:
  ranks: 0 (# ranks = 1)
  tags: 0 (# tags = 1)
Warning: No --master-endpoint specified; using this command line application as the master endpoint
Self Test Parameters:
  Group name to test against: daos_server
  # endpoints:                1
  Message sizes:              [(0-EMPTY 0-EMPTY)]
  Buffer addresses end with:  <Default>
  Repetitions per size:       100000
  Max inflight RPCs:          16

host boro-6.boro.hpdd.intel.com finished self_test duration 0.309352 S.
##################################################
Results for message size (0-EMPTY 0-EMPTY) (max_inflight_rpcs = 16):

Master Endpoint 0:0
-------------------
	RPC Bandwidth (MB/sec): 0.00
	RPC Throughput (RPCs/sec): 323257
	RPC Latencies (us):
		Min    : 19
		25th  %: 44
		Median : 44
		75th  %: 50
		Max    : 1433
		Average: 49
		Std Dev: 13.66
	RPC Failures: 0

	Endpoint results (rank:tag - Median Latency (us)):
		0:0 - 44

Large IO Bulk PUT

[sdwillso@boro-4 ~]$ orterun -np 1 --hostfile ~/hostlists/daos_single_server_2 -x FI_PSM2_DISCONNECT=1 -ompi-server file:~/scripts/uri.txt self_test --group-name daos_server --endpoint 0:0 --message-sizes "0 b1048576" --max-inflight-rpcs 16 --repetitions 1000
Adding endpoints:
  ranks: 0 (# ranks = 1)
  tags: 0 (# tags = 1)
Warning: No --master-endpoint specified; using this command line application as the master endpoint
Self Test Parameters:
  Group name to test against: daos_server
  # endpoints:                1
  Message sizes:              [(0-EMPTY 1048576-BULK_PUT)]
  Buffer addresses end with:  <Default>
  Repetitions per size:       1000
  Max inflight RPCs:          16

host boro-6.boro.hpdd.intel.com finished self_test duration 0.085034 S.
##################################################
Results for message size (0-EMPTY 1048576-BULK_PUT) (max_inflight_rpcs = 16):

Master Endpoint 0:0
-------------------
	RPC Bandwidth (MB/sec): 11759.96
	RPC Throughput (RPCs/sec): 11760
	RPC Latencies (us):
		Min    : 643
		25th  %: 1341
		Median : 1354
		75th  %: 1369
		Max    : 1582
		Average: 1349
		Std Dev: 66.47
	RPC Failures: 0

	Endpoint results (rank:tag - Median Latency (us)):
		0:0 - 1354

Large IO Bulk GET

[sdwillso@boro-4 ~]$ orterun -np 1 --hostfile ~/hostlists/daos_single_server_2 -x FI_PSM2_DISCONNECT=1 -ompi-server file:~/scripts/uri.txt self_test --group-name daos_server --endpoint 0:0 --message-sizes "b1048576 0" --max-inflight-rpcs 16 --repetitions 1000
Adding endpoints:
  ranks: 0 (# ranks = 1)
  tags: 0 (# tags = 1)
Warning: No --master-endpoint specified; using this command line application as the master endpoint
Self Test Parameters:
  Group name to test against: daos_server
  # endpoints:                1
  Message sizes:              [(1048576-BULK_GET 0-EMPTY)]
  Buffer addresses end with:  <Default>
  Repetitions per size:       1000
  Max inflight RPCs:          16

host boro-6.boro.hpdd.intel.com finished self_test duration 0.125142 S.
##################################################
Results for message size (1048576-BULK_GET 0-EMPTY) (max_inflight_rpcs = 16):

Master Endpoint 0:0
-------------------
	RPC Bandwidth (MB/sec): 7990.89
	RPC Throughput (RPCs/sec): 7991
	RPC Latencies (us):
		Min    : 305
		25th  %: 990
		Median : 2334
		75th  %: 2729
		Max    : 3287
		Average: 1991
		Std Dev: 895.36
	RPC Failures: 0

	Endpoint results (rank:tag - Median Latency (us)):
		0:0 - 2334
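
The three self_test runs can be cross-checked in two ways. The sketch below assumes throughput is repetitions divided by the reported self_test duration, and that bandwidth is throughput times the bulk size expressed in units of 1024 * 1024 bytes; neither definition is taken from the CaRT source. It also applies Little's law (in-flight requests divided by average latency) as an independent estimate of throughput, which lands within about 1% of the reported figure for each run.

```python
MIB = 1024 * 1024
MAX_INFLIGHT = 16  # --max-inflight-rpcs in all three runs

# (label, repetitions, duration in s, bulk bytes per RPC,
#  average latency in us, reported RPCs/sec), copied from the output above.
runs = [
    ("small IO", 100_000, 0.309352, 0, 49, 323257),
    ("bulk PUT", 1_000, 0.085034, 1_048_576, 1349, 11760),
    ("bulk GET", 1_000, 0.125142, 1_048_576, 1991, 7991),
]


def summarize(reps, duration_s, bulk_bytes, avg_latency_us):
    """Return (throughput, bandwidth, Little's-law throughput estimate)."""
    throughput = reps / duration_s                        # RPCs/sec
    bandwidth = throughput * bulk_bytes / MIB             # assumed MB = 2**20 bytes
    littles = MAX_INFLIGHT / (avg_latency_us * 1e-6)      # L = lambda * W
    return throughput, bandwidth, littles


results = {}
for label, reps, duration_s, bulk_bytes, avg_us, reported in runs:
    results[label] = summarize(reps, duration_s, bulk_bytes, avg_us)
    throughput, bandwidth, littles = results[label]
    print(f"{label}: {throughput:.0f} RPCs/sec, {bandwidth:.2f} MB/sec, "
          f"Little's law estimate {littles:.0f} RPCs/sec (reported {reported})")
```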

mpich tests

Results: 
