...
The mpich tests used boro-4 as the server and boro-12 as the client, with a 1 GB pool.
Tests used 8 xstreams per server this time, because there is a bug with the 36 xstreams I normally run with.
Test Results
daos_test
Separate runs with cleanup in between:
- -mpcCAeoRd - PASS
- -r - FAIL
- Appears to be DAOS-1556
- -i - FAIL
- DAOS-1685
daosperf
1K Records
CREDITS=1
CREDITS=8
...
Records
CREDITS=1
...
[sdwillso@boro-4 daos_m~]$ orterun -x FI_PSM2_DISCONNECT=1 --mca mtl ^psm2,ofi -np 1 -quiet --hostfile ~/scripts/host.cli.1 --ompi-server file:~/scripts/uri.txt dmg create --size=40G
3675153f: rank 1 became pool service leader 0
3675153f-7d2c-48d9-9b19-0da8cebb2b18 1
[sdwillso@boro-4 daos_m]$ orterun -x FI_PSM2_DISCONNECT=1 -N 1 --hostfile ~/hostlists/daos_client_hostlist --mca mtl ^psm2,ofi --ompi-server file:~/scripts/uri.txt ior -v -W -i 5 -a DAOS -w -o `uuidgen` -b 5g -t 1m -- -p 3675153f-7d2c-48d9-9b19-0da8cebb2b18 -v 1 -r 1m -s 1m -c 1024 -a 16 -o LARGE
ior WARNING: assuming POSIX-based backend for DAOS statfs call.
ior WARNING: assuming POSIX-based backend for DAOS mkdir call.
ior WARNING: assuming POSIX-based backend for DAOS rmdir call.
ior WARNING: assuming POSIX-based backend for DAOS access call.
ior WARNING: assuming POSIX-based backend for DAOS stat call.
ior WARNING: assuming POSIX-based backend for DAOS statfs call.
ior WARNING: assuming POSIX-based backend for DAOS mkdir call.
ior WARNING: assuming POSIX-based backend for DAOS rmdir call.
ior WARNING: assuming POSIX-based backend for DAOS access call.
ior WARNING: assuming POSIX-based backend for DAOS stat call.
IOR-3.1.0: MPI Coordinated Test of Parallel I/O
Began : Wed Oct 31 22:37:44 2018
Command line : ior -v -W -i 5 -a DAOS -w -o f76eabf5-dddd-44d8-9e80-859e18b14f3e -b 5g -t 1m -- -p 3675153f-7d2c-48d9-9b19-0da8cebb2b18 -v 1 -r 1m -s 1m -c 1024 -a 16 -o LARGE
Machine : Linux boro-12.boro.hpdd.intel.com
Start time skew across all tasks: 2208081.84 sec
TestID : 0
StartTime : Wed Oct 31 22:37:44 2018
Path : /home/sdwillso/daos_m
FS : 3.8 TiB Used FS: 14.2% Inodes: 250.0 Mi Used Inodes: 3.1%
Participating tasks: 2
[0] WARNING: USING daosStripeMax CAUSES READS TO RETURN INVALID DATA
Options:
api : DAOS
apiVersion : DAOS
test filename : f76eabf5-dddd-44d8-9e80-859e18b14f3e
access : single-shared-file
type : independent
segments : 1
ordering in a file : sequential
ordering inter file : no tasks offsets
tasks : 2
clients per node : 1
repetitions : 5
xfersize : 1 MiB
blocksize : 5 GiB
aggregate filesize : 10 GiB
Results:
access bw(MiB/s) block(KiB) xfer(KiB) open(s) wr/rd(s) close(s) total(s) iter
------ --------- ---------- --------- -------- -------- -------- -------- ----
Commencing write performance test: Wed Oct 31 22:37:45 2018
write 4542 5242880 1024.00 0.025885 2.21 0.020801 2.25 0
Verifying contents of the file(s) just written.
Wed Oct 31 22:37:47 2018
remove - - - - - - 0.029135 0
Commencing write performance test: Wed Oct 31 22:37:54 2018
write 4637 5242880 1024.00 0.023124 2.16 0.020863 2.21 1
Verifying contents of the file(s) just written.
Wed Oct 31 22:37:56 2018
remove - - - - - - 0.028687 1
Commencing write performance test: Wed Oct 31 22:38:02 2018
write 4614 5242880 1024.00 0.023486 2.18 0.020511 2.22 2
Verifying contents of the file(s) just written.
Wed Oct 31 22:38:04 2018
remove - - - - - - 0.029018 2
Commencing write performance test: Wed Oct 31 22:38:12 2018
write 4620 5242880 1024.00 0.023798 2.17 0.021097 2.22 3
Verifying contents of the file(s) just written.
Wed Oct 31 22:38:14 2018
remove - - - - - - 0.029168 3
Commencing write performance test: Wed Oct 31 22:38:21 2018
write 4608 5242880 1024.00 0.024092 2.18 0.020979 2.22 4
Verifying contents of the file(s) just written.
Wed Oct 31 22:38:23 2018
remove - - - - - - 0.028924 4
Max Write: 4636.61 MiB/sec (4861.84 MB/sec)
Summary of all tests:
Operation Max(MiB) Min(MiB) Mean(MiB) StdDev Max(OPs) Min(OPs) Mean(OPs) StdDev Mean(s) Test# #Tasks tPN reps fPP reord reordoff reordrand seed segcnt blksiz xsize aggs(MiB) API RefNum
write 4636.61 4541.56 4603.98 32.64 4636.61 4541.56 4603.98 32.64 2.22428 0 2 1 5 0 0 1 0 0 1 5368709120 1048576 10240.0 DAOS 0
Finished : Wed Oct 31 22:38:33 2018
daos_bench
kv-idx-update
kv-dkey-update
kv-akey-update
kv-dkey-fetch
kv-akey-fetch
CaRT Self-Test
Small IO
Large IO Bulk PUT
...
-x DD_SUBSYS= -x DD_MASK= -x D_LOG_FILE=/tmp/daos_perf.log daos_perf -T daos -P 4G -d 1 -a 200 -r 1000 -s 1K -C 1 -t -z
ModuleCmd_Load.c(213):ERROR:105: Unable to locate a modulefile for 'openmpi-x86_64'
Test :
DAOS (full stack)
Parameters :
pool size : 4096 MB
credits : 1 (sync I/O for -ve)
obj_per_cont : 1 x 1 (procs)
dkey_per_obj : 1
akey_per_dkey : 200
recx_per_akey : 1000
value type : single
value size : 1024
zero copy : yes
overwrite : yes
verify fetch : no
VOS file : <NULL>
3e9214af: rank 1 became pool service leader 0
Started...
update successfully completed:
duration : 4.446848 sec
bandwith : 43.922 MB/sec
rate : 44975.68 IO/sec
latency : 22.234 us (nonsense if credits > 1)
Duration across processes:
MAX duration : 4.446848 sec
MIN duration : 4.446848 sec
Average duration : 4.446848 sec
3e9214af: rank 1 no longer pool service leader 0
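The reported rate, bandwidth, and latency can be cross-checked from the run parameters. The following is a minimal Python sketch, not part of daos_perf. It assumes that the total update count is objects x dkeys x akeys x records and that the tool's "MB/sec" means 2^20 bytes per second. Both assumptions are inferred from the numbers above, not taken from the tool's source, and the function name is hypothetical.

```python
def daos_perf_metrics(objects, dkeys, akeys, recx, value_size, duration_s):
    """Recompute derived figures for a synchronous (credits=1) update run."""
    total_ios = objects * dkeys * akeys * recx      # assumed total update count
    rate = total_ios / duration_s                   # IO/sec
    bandwidth = rate * value_size / 2**20           # assumed 2^20-byte "MB"/sec
    latency_us = duration_s / total_ios * 1e6       # only meaningful for credits=1
    return total_ios, rate, bandwidth, latency_us


# Parameters and duration copied from the 1K-record, CREDITS=1 run above.
total, rate, bw, lat = daos_perf_metrics(1, 1, 200, 1000, 1024, 4.446848)
print(f"{total} updates, {rate:.2f} IO/sec, {bw:.3f} MB/sec, {lat:.3f} us")
```

With one credit the I/O is synchronous, so latency is simply the inverse of the rate, which is why the tool flags the latency figure as meaningless when credits > 1.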
CREDITS=8
CART-496
4K Records
CREDITS=1
CART-496
IOR, 40GB pool, data verification enabled
[sdwillso@boro-4 daos_m]$ orterun --mca mtl ^psm2,ofi -np 1 --ompi-server file:~/scripts/uri.txt dmg create --size=40G
3675153f: rank 1 became pool service leader 0
3675153f-7d2c-48d9-9b19-0da8cebb2b18 1
[sdwillso@boro-4 daos_m]$ orterun -x FI_PSM2_DISCONNECT=1 -N 1 --hostfile ~/hostlists/daos_client_hostlist --mca mtl ^psm2,ofi --ompi-server file:~/scripts/uri.txt ior -v -W -i 5 -a DAOS -w -o `uuidgen` -b 5g -t 1m -- -p 3675153f-7d2c-48d9-9b19-0da8cebb2b18 -v 1 -r 1m -s 1m -c 1024 -a 16 -o LARGE
ior WARNING: assuming POSIX-based backend for DAOS statfs call.
ior WARNING: assuming POSIX-based backend for DAOS mkdir call.
ior WARNING: assuming POSIX-based backend for DAOS rmdir call.
ior WARNING: assuming POSIX-based backend for DAOS access call.
ior WARNING: assuming POSIX-based backend for DAOS stat call.
ior WARNING: assuming POSIX-based backend for DAOS statfs call.
ior WARNING: assuming POSIX-based backend for DAOS mkdir call.
ior WARNING: assuming POSIX-based backend for DAOS rmdir call.
ior WARNING: assuming POSIX-based backend for DAOS access call.
ior WARNING: assuming POSIX-based backend for DAOS stat call.
IOR-3.1.0: MPI Coordinated Test of Parallel I/O
Began : Wed Oct 31 22:37:44 2018
Command line : ior -v -W -i 5 -a DAOS -w -o f76eabf5-dddd-44d8-9e80-859e18b14f3e -b 5g -t 1m -- -p 3675153f-7d2c-48d9-9b19-0da8cebb2b18 -v 1 -r 1m -s 1m -c 1024 -a 16 -o LARGE
Machine : Linux boro-12.boro.hpdd.intel.com
Start time skew across all tasks: 2208081.84 sec
TestID : 0
StartTime : Wed Oct 31 22:37:44 2018
Path : /home/sdwillso/daos_m
FS : 3.8 TiB Used FS: 14.2% Inodes: 250.0 Mi Used Inodes: 3.1%
Participating tasks: 2
[0] WARNING: USING daosStripeMax CAUSES READS TO RETURN INVALID DATA
Options:
api : DAOS
apiVersion : DAOS
test filename : f76eabf5-dddd-44d8-9e80-859e18b14f3e
access : single-shared-file
type : independent
segments : 1
ordering in a file : sequential
ordering inter file : no tasks offsets
tasks : 2
clients per node : 1
repetitions : 5
xfersize : 1 MiB
blocksize : 5 GiB
aggregate filesize : 10 GiB
Results:
access bw(MiB/s) block(KiB) xfer(KiB) open(s) wr/rd(s) close(s) total(s) iter
------ --------- ---------- --------- -------- -------- -------- -------- ----
Commencing write performance test: Wed Oct 31 22:37:45 2018
write 4542 5242880 1024.00 0.025885 2.21 0.020801 2.25 0
Verifying contents of the file(s) just written.
Wed Oct 31 22:37:47 2018
remove - - - - - - 0.029135 0
Commencing write performance test: Wed Oct 31 22:37:54 2018
write 4637 5242880 1024.00 0.023124 2.16 0.020863 2.21 1
Verifying contents of the file(s) just written.
Wed Oct 31 22:37:56 2018
remove - - - - - - 0.028687 1
Commencing write performance test: Wed Oct 31 22:38:02 2018
write 4614 5242880 1024.00 0.023486 2.18 0.020511 2.22 2
Verifying contents of the file(s) just written.
Wed Oct 31 22:38:04 2018
remove - - - - - - 0.029018 2
Commencing write performance test: Wed Oct 31 22:38:12 2018
write 4620 5242880 1024.00 0.023798 2.17 0.021097 2.22 3
Verifying contents of the file(s) just written.
Wed Oct 31 22:38:14 2018
remove - - - - - - 0.029168 3
Commencing write performance test: Wed Oct 31 22:38:21 2018
write 4608 5242880 1024.00 0.024092 2.18 0.020979 2.22 4
Verifying contents of the file(s) just written.
Wed Oct 31 22:38:23 2018
remove - - - - - - 0.028924 4
Max Write: 4636.61 MiB/sec (4861.84 MB/sec)
Summary of all tests:
Operation Max(MiB) Min(MiB) Mean(MiB) StdDev Max(OPs) Min(OPs) Mean(OPs) StdDev Mean(s) Test# #Tasks tPN reps fPP reord reordoff reordrand seed segcnt blksiz xsize aggs(MiB) API RefNum
write 4636.61 4541.56 4603.98 32.64 4636.61 4541.56 4603.98 32.64 2.22428 0 2 1 5 0 0 1 0 0 1 5368709120 1048576 10240.0 DAOS 0
Finished : Wed Oct 31 22:38:33 2018
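As a sanity check on the IOR summary, the sketch below converts the maximum write bandwidth from MiB/sec to MB/sec and recomputes the mean and standard deviation from the five per-iteration values. It is illustrative Python, not IOR code. The per-iteration inputs are the rounded figures from the results table, so the mean and deviation only approximate the summary line, and using the population (not sample) standard deviation is an assumption made because it lands closer to the reported 32.64.

```python
import statistics

MIB = 2**20   # bytes in one MiB
MB = 10**6    # bytes in one MB


def mib_to_mb(mib_per_s):
    """Convert a bandwidth in MiB/sec to MB/sec."""
    return mib_per_s * MIB / MB


# Per-iteration write bandwidth (MiB/s), as rounded in the results table above.
write_bw = [4542, 4637, 4614, 4620, 4608]

print(f"max : {mib_to_mb(4636.61):.2f} MB/sec")
print(f"mean: {statistics.mean(write_bw):.1f} MiB/s")
print(f"sd  : {statistics.pstdev(write_bw):.2f} MiB/s (population)")
```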
daos_bench
kv-idx-update
Time: 399.307237 seconds (2504.337278 ops per second)
[sdwillso@boro-4 ~]$ orterun -np 1 -x FI_PSM2_DISCONNECT=1 --mca mtl ^psm2,ofi --hostfile ~/hostlists/daos_client_hostlist --ompi-server file:~/scripts/uri.txt daosbench --test=kv-idx-update --testid=1 --svc=0 --dpool=30099e37-040e-42a9-8eb2-c8cbda0e6148 --container=`uuidgen` --object-class=tiny --aios=32 --indexes=1000000
================================
DAOSBENCH (KV)
Started at
Thu Nov 1 19:25:42 2018
=================================
===============================
Test Setup
---------------
Test: kv-idx-update
DAOS pool :30099e37-040e-42a9-8eb2-c8cbda0e6148
DAOS container :ce9ffdb8-054e-4f42-a9c4-8188d38b0426
Value buffer size: 64
Number of processes: 1
Number of indexes/process: 1000000
Number of asynchronous I/O: 32
===============================
kv-idx-update
Time: 399.307237 seconds (2504.337278 ops per second)
Ended at Thu Nov 1 19:32:26 2018
kv-dkey-update
Time: 0.088867 seconds (1125.278100 ops per second)
[sdwillso@boro-4 ~]$ orterun -np 1 -x FI_PSM2_DISCONNECT=1 --mca mtl ^psm2,ofi --hostfile ~/hostlists/daos_client_hostlist --ompi-server file:~/scripts/uri.txt daosbench --test=kv-dkey-update --testid=1 --svc=1 --dpool=fd8ab9b3-1f49-45ad-973c-616cb453e1a9 --container=`uuidgen` --object-class=tiny --aios=32 --indexes=1000000
================================
DAOSBENCH (KV)
Started at
Thu Nov 1 19:35:30 2018
=================================
===============================
Test Setup
---------------
Test: kv-dkey-update
DAOS pool :fd8ab9b3-1f49-45ad-973c-616cb453e1a9
DAOS container :d3e1acdf-c922-45fe-8461-fd8e927d2604
Value buffer size: 64
Number of processes: 1
Number of keys/process: 100
Number of asynchronous I/O: 32
===============================
kv-dkey-update
Time: 0.088867 seconds (1125.278100 ops per second)
Ended at Thu Nov 1 19:35:31 2018
kv-akey-update
Time: 0.068169 seconds (1466.935135 ops per second)
[sdwillso@boro-4 ~]$ orterun -np 1 -x FI_PSM2_DISCONNECT=1 --mca mtl ^psm2,ofi --hostfile ~/hostlists/daos_client_hostlist --ompi-server file:~/scripts/uri.txt daosbench --test=kv-akey-update --testid=1 --svc=1 --dpool=4dad8cbc-236a-4364-a7b1-dd59bb212642 --container=`uuidgen` --object-class=tiny --aios=32 --indexes=1000000
================================
DAOSBENCH (KV)
Started at
Thu Nov 1 19:37:12 2018
=================================
===============================
Test Setup
---------------
Test: kv-akey-update
DAOS pool :4dad8cbc-236a-4364-a7b1-dd59bb212642
DAOS container :cac9cd5b-b74c-4826-8317-25e17085e50d
Value buffer size: 64
Number of processes: 1
Number of keys/process: 100
Number of asynchronous I/O: 32
===============================
kv-akey-update
Time: 0.068169 seconds (1466.935135 ops per second)
Ended at Thu Nov 1 19:37:12 2018
kv-dkey-fetch
Time: 0.049400 seconds (2024.283674 ops per second)
[sdwillso@boro-4 ~]$ orterun -np 1 -x FI_PSM2_DISCONNECT=1 --mca mtl ^psm2,ofi --hostfile ~/hostlists/daos_client_hostlist --ompi-server file:~/scripts/uri.txt daosbench --test=kv-dkey-fetch --testid=1 --svc=1 --dpool=5ca6999f-15a8-41f4-86ac-9494fd891876 --container=`uuidgen` --object-class=tiny --aios=32 --indexes=1000000
================================
DAOSBENCH (KV)
Started at
Thu Nov 1 19:38:33 2018
=================================
===============================
Test Setup
---------------
Test: kv-dkey-fetch
DAOS pool :5ca6999f-15a8-41f4-86ac-9494fd891876
DAOS container :b4bfc4e0-87d0-4ec7-bcd9-03e46235abf8
Value buffer size: 64
Number of processes: 1
Number of keys/process: 100
Number of asynchronous I/O: 32
===============================
kv-dkey-fetch
Time: 0.049400 seconds (2024.283674 ops per second)
Ended at Thu Nov 1 19:38:33 2018
kv-akey-fetch
Time: 0.038302 seconds (2610.806612 ops per second)
[sdwillso@boro-4 ~]$ orterun -np 1 -x FI_PSM2_DISCONNECT=1 --mca mtl ^psm2,ofi --hostfile ~/hostlists/daos_client_hostlist --ompi-server file:~/scripts/uri.txt daosbench --test=kv-akey-fetch --testid=1 --svc=1 --dpool=a87c1051-bef3-4929-ba42-d3a742e532d1 --container=`uuidgen` --object-class=tiny --aios=32 --indexes=1000000
================================
DAOSBENCH (KV)
Started at
Thu Nov 1 19:39:53 2018
=================================
===============================
Test Setup
---------------
Test: kv-akey-fetch
DAOS pool :a87c1051-bef3-4929-ba42-d3a742e532d1
DAOS container :41bd8e41-5444-4323-bd72-d820c9ea7429
Value buffer size: 64
Number of processes: 1
Number of keys/process: 100
Number of asynchronous I/O: 32
===============================
kv-akey-fetch
Time: 0.038302 seconds (2610.806612 ops per second)
Ended at Thu Nov 1 19:39:53 2018
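The ops-per-second figures for the five daosbench runs are consistent with operation count divided by elapsed time. The sketch below is illustrative Python, not daosbench code. Its operation counts come from the "Number of indexes/process" and "Number of keys/process" lines in each test setup. Note that the four dkey/akey runs report 100 keys per process even though `--indexes=1000000` was passed, so their rates are based on very short runs (under 0.1 s).

```python
# (test name, operations, elapsed seconds, ops/sec as reported above)
runs = [
    ("kv-idx-update", 1_000_000, 399.307237, 2504.337278),
    ("kv-dkey-update", 100, 0.088867, 1125.278100),
    ("kv-akey-update", 100, 0.068169, 1466.935135),
    ("kv-dkey-fetch", 100, 0.049400, 2024.283674),
    ("kv-akey-fetch", 100, 0.038302, 2610.806612),
]


def ops_per_second(ops, seconds):
    """Throughput as operations divided by wall-clock time."""
    return ops / seconds


for name, ops, seconds, reported in runs:
    computed = ops_per_second(ops, seconds)
    print(f"{name:15s} computed {computed:10.2f}  reported {reported:10.2f}")
```

Small differences between the computed and reported values come from the elapsed time being printed to six decimal places.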
CaRT Self-Test
Small IO
[sdwillso@boro-4 mpich]$ orterun -np 1 -ompi-server file:~/scripts/uri.txt self_test --group-name daos_server --endpoint 0:0 --message-sizes 0 --max-inflight-rpcs 16 --repetitions 100000
Adding endpoints:
ranks: 0 (# ranks = 1)
tags: 0 (# tags = 1)
Warning: No --master-endpoint specified; using this command line application as the master endpoint
Self Test Parameters:
Group name to test against: daos_server
# endpoints: 1
Message sizes: [(0-EMPTY 0-EMPTY)]
Buffer addresses end with: <Default>
Repetitions per size: 100000
Max inflight RPCs: 16
host boro-4.boro.hpdd.intel.com finished self_test duration 0.339317 S.
##################################################
Results for message size (0-EMPTY 0-EMPTY) (max_inflight_rpcs = 16):
Master Endpoint 0:0
-------------------
RPC Bandwidth (MB/sec): 0.00
RPC Throughput (RPCs/sec): 294710
RPC Latencies (us):
Min : 31
25th %: 51
Median : 52
75th %: 52
Max : 968
Average: 53
Std Dev: 8.99
RPC Failures: 0
Endpoint results (rank:tag - Median Latency (us)):
0:0 - 52
Large IO Bulk PUT
[sdwillso@boro-4 mpich]$ orterun -np 1 -ompi-server file:~/scripts/uri.txt self_test --group-name daos_server --endpoint 0:0 --message-sizes "0 b1048576" --max-inflight-rpcs 16 --repetitions 1000
Adding endpoints:
ranks: 0 (# ranks = 1)
tags: 0 (# tags = 1)
Warning: No --master-endpoint specified; using this command line application as the master endpoint
Self Test Parameters:
Group name to test against: daos_server
# endpoints: 1
Message sizes: [(0-EMPTY 1048576-BULK_PUT)]
Buffer addresses end with: <Default>
Repetitions per size: 1000
Max inflight RPCs: 16
host boro-4.boro.hpdd.intel.com finished self_test duration 0.133766 S.
##################################################
Results for message size (0-EMPTY 1048576-BULK_PUT) (max_inflight_rpcs = 16):
Master Endpoint 0:0
-------------------
RPC Bandwidth (MB/sec): 7475.75
RPC Throughput (RPCs/sec): 7476
RPC Latencies (us):
Min : 1013
25th %: 2077
Median : 2096
75th %: 2124
Max : 4216
Average: 2130
Std Dev: 284.34
RPC Failures: 0
Endpoint results (rank:tag - Median Latency (us)):
0:0 - 2096
Large IO Bulk GET
[sdwillso@boro-4 mpich]$ orterun -np 1 -ompi-server file:~/scripts/uri.txt self_test --group-name daos_server --endpoint 0:0 --message-sizes "b1048576 0" --max-inflight-rpcs 16 --repetitions 1000
Adding endpoints:
ranks: 0 (# ranks = 1)
tags: 0 (# tags = 1)
Warning: No --master-endpoint specified; using this command line application as the master endpoint
Self Test Parameters:
Group name to test against: daos_server
# endpoints: 1
Message sizes: [(1048576-BULK_GET 0-EMPTY)]
Buffer addresses end with: <Default>
Repetitions per size: 1000
Max inflight RPCs: 16
host boro-4.boro.hpdd.intel.com finished self_test duration 0.116480 S.
##################################################
Results for message size (1048576-BULK_GET 0-EMPTY) (max_inflight_rpcs = 16):
Master Endpoint 0:0
-------------------
RPC Bandwidth (MB/sec): 8585.14
RPC Throughput (RPCs/sec): 8585
RPC Latencies (us):
Min : 361
25th %: 1827
Median : 1833
75th %: 1901
Max : 3450
Average: 1853
Std Dev: 258.15
RPC Failures: 0
Endpoint results (rank:tag - Median Latency (us)):
0:0 - 1833
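The three self_test results can be cross-checked in two ways: throughput should equal repetitions divided by duration, and with 16 RPCs in flight it should also be close to 16 divided by the average latency (Little's law). The sketch below is illustrative Python, not part of CaRT. Treating the reported "MB/sec" as 2^20-byte units is an assumption inferred from the bulk numbers, and the helper names are hypothetical.

```python
INFLIGHT = 16  # --max-inflight-rpcs used in all three runs

# (label, repetitions, duration s, bulk bytes, reported RPCs/sec, average latency us)
runs = [
    ("small IO", 100_000, 0.339317, 0, 294710, 53),
    ("bulk PUT", 1_000, 0.133766, 1048576, 7476, 2130),
    ("bulk GET", 1_000, 0.116480, 1048576, 8585, 1853),
]


def throughput(reps, duration_s):
    """RPCs per second from repetition count and total duration."""
    return reps / duration_s


def bandwidth(reps, duration_s, size_bytes):
    """Bulk bandwidth, assuming 2^20-byte units for the reported MB/sec."""
    return throughput(reps, duration_s) * size_bytes / 2**20


def littles_law_rate(inflight, avg_latency_us):
    """Expected RPCs per second if `inflight` requests are always outstanding."""
    return inflight / (avg_latency_us * 1e-6)


for label, reps, dur, size, reported, avg_us in runs:
    print(f"{label}: {throughput(reps, dur):.0f} RPCs/sec "
          f"(reported {reported}), {bandwidth(reps, dur, size):.2f} MB/sec, "
          f"Little's law estimate {littles_law_rate(INFLIGHT, avg_us):.0f}")
```

The Little's law estimate runs slightly above the measured throughput in all three cases, which is what one would expect if the pipeline is not completely full for the entire run.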
mpich tests