
...

  • -mpcCAeoRd - PASS
  • -i - FAIL, still rebuilding on IO27 after 10 minutes
    • Jira: DAOS-1289
  • -r - same as -i, still rebuilding after 10 minutes
  • -O - PASS

...

[sdwillso@boro-4 ior]$ orterun --mca mtl ^psm2,ofi -N 1 -quiet --hostfile ~/scripts/host.cli.1 --ompi-server file:~/scripts/uri.txt -x DD_SUBSYS= -x DD_MASK= -x D_LOG_FILE=/tmp/daos_perf.log daos_perf -T daos -P 2G -d 1 -a 200 -r 1000 -s 1K -C 1 -t -z
Test :
	DAOS (full stack)
Parameters :
	pool size     : 2048 MB
	credits       : 1 (sync I/O for -ve)
	obj_per_cont  : 1 x 2 (procs)
	dkey_per_obj  : 1
	akey_per_dkey : 200
	recx_per_akey : 1000
	value type    : single
	value size    : 1024
	zero copy     : yes
	overwrite     : yes
	verify fetch  : no
	VOS file      : <NULL>
d3bda290: rank 1 became pool service leader 0
Started...
update successfully completed:
	duration : 96.823832  sec
	bandwith : 4.034      MB/sec
	rate     : 4131.21    IO/sec
	latency  : 242.060    us (nonsense if credits > 1)
Duration across processes:
	MAX duration : 96.823226  sec
	MIN duration : 90.260579  sec
	Average duration : 93.541903  sec
d3bda290: rank 1 no longer pool service leader 0
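
As a sanity check, the rate, bandwidth, and latency reported above can be re-derived from the printed parameters and duration. The sketch below is not taken from daos_perf; `perf_summary` is a hypothetical helper. It assumes the total update count is procs x objects x dkeys x akeys x recx, that "MB" means 2^20 bytes, and that latency is the duration divided by the update count. Under those assumptions the results agree with the printed figures.

```python
# Hypothetical helper, not part of daos_perf: re-derive the summary figures
# from the run parameters, under the assumptions stated in the lead-in.

def perf_summary(procs, objs, dkeys, akeys, recxs, value_size, duration_s):
    """Return update count, IO rate, bandwidth and per-update latency."""
    updates = procs * objs * dkeys * akeys * recxs
    total_bytes = updates * value_size
    return {
        "updates": updates,
        "rate_io_per_s": updates / duration_s,
        # Assumption: the tool's "MB" is 2**20 bytes.
        "bandwidth_mb_per_s": total_bytes / (1 << 20) / duration_s,
        # Assumption: latency is simply duration / updates (credits = 1).
        "latency_us": duration_s / updates * 1e6,
    }


# Values copied from the run above: 2 procs, 1 object, 1 dkey,
# 200 akeys, 1000 recx, 1 KiB values, 96.823832 s duration.
summary = perf_summary(procs=2, objs=1, dkeys=1, akeys=200, recxs=1000,
                       value_size=1024, duration_s=96.823832)

for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

The latency line only makes sense for synchronous I/O, which matches the tool's own note that the figure is "nonsense if credits > 1".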

CREDITS=8

4K Records

CREDITS=1

IOR, 10GB pool, data verification enabled

Hitting error after updating to latest IOR: "Can't modify committed epoch"

Still debugging. Stdout here:

...


...

  • Jira: CART-496
  • The bug is fixed in a patch that is not yet merged to master.

4K Records

CREDITS=1

  • Jira: CART-496
  • The bug is fixed in a patch that is not yet merged to master.

IOR, 10GB pool, data verification enabled

Ran with 1 iteration.

[sdwillso@boro-4 ~]$ orterun -x FI_PSM2_DISCONNECT=1 -N 1 --hostfile ~/hostlists/daos_client_hostlist --mca mtl ^psm2,ofi  --ompi-server file:~/scripts/uri.txt ior -v -W -i 1 -a DAOS -w -o `uuidgen` -b 5g -t 1m -- -p c6a51f83-2334-464b-9c2c-704a75ba4bfe -v 1 -r 1m -s 1m -c 1024 -a 16 -o LARGE -e 1
ior WARNING: assuming POSIX-based backend for DAOS statfs call.
ior WARNING: assuming POSIX-based backend for DAOS mkdir call.
ior WARNING: assuming POSIX-based backend for DAOS rmdir call.
ior WARNING: assuming POSIX-based backend for DAOS access call.
ior WARNING: assuming POSIX-based backend for DAOS stat call.
ior WARNING: assuming POSIX-based backend for DAOS statfs call.
ior WARNING: assuming POSIX-based backend for DAOS mkdir call.
ior WARNING: assuming POSIX-based backend for DAOS rmdir call.
ior WARNING: assuming POSIX-based backend for DAOS access call.
ior WARNING: assuming POSIX-based backend for DAOS stat call.
IOR-3.1.0: MPI Coordinated Test of Parallel I/O
Began               : Tue Sep 11 17:50:44 2018
Command line        : ior -v -W -i 1 -a DAOS -w -o acafbb4b-431b-4c33-8a79-e851bcd78fd8 -b 5g -t 1m -- -p c6a51f83-2334-464b-9c2c-704a75ba4bfe -v 1 -r 1m -s 1m -c 1024 -a 16 -o LARGE -e 1
Machine             : Linux boro-12.boro.hpdd.intel.com
Start time skew across all tasks: 10554382.37 sec
TestID              : 0
StartTime           : Tue Sep 11 17:50:44 2018
Path                : /home/sdwillso/ior
FS                  : 3.8 TiB   Used FS: 13.8%   Inodes: 250.0 Mi   Used Inodes: 2.8%
Participating tasks: 2
[0] WARNING: USING daosStripeMax CAUSES READS TO RETURN INVALID DATA

Options: 
api                 : DAOS
apiVersion          : DAOS
test filename       : acafbb4b-431b-4c33-8a79-e851bcd78fd8
access              : single-shared-file
type                : independent
segments            : 1
ordering in a file  : sequential
ordering inter file : no tasks offsets
tasks               : 2
clients per node    : 1
repetitions         : 1
xfersize            : 1 MiB
blocksize           : 5 GiB
aggregate filesize  : 10 GiB

Results: 

access    bw(MiB/s)  block(KiB) xfer(KiB)  open(s)    wr/rd(s)   close(s)   total(s)   iter
------    ---------  ---------- ---------  --------   --------   --------   --------   ----
Commencing write performance test: Tue Sep 11 17:50:45 2018
write     4991       5242880    1024.00    0.043013   1.98       0.025697   2.05       0   
Verifying contents of the file(s) just written.
Tue Sep 11 17:50:47 2018

remove    -          -          -          -          -          -          0.000064   0   
Max Write: 4991.13 MiB/sec (5233.58 MB/sec)
Can't modify committed epoch

Can't modify committed epoch

--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD
with errorcode -1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
[boro-4.boro.hpdd.intel.com:16408] 1 more process has sent help message help-mpi-api.txt / mpi-abort
[boro-4.boro.hpdd.intel.com:16408] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages

Summary of all tests:
Operation   Max(MiB)   Min(MiB)  Mean(MiB)     StdDev   Max(OPs)   Min(OPs)  Mean(OPs)     StdDev    Mean(s) Test# #Tasks tPN reps fPP reord reordoff reordrand seed segcnt   blksiz    xsize aggs(MiB)   API RefNum
write        4991.13    4991.13    4991.13       0.00    4991.13    4991.13    4991.13       0.00    2.05164     0      2   1    1   0     0        1         0    0      1 5368709120  1048576   10240.0 DAOS      0
Finished            : Tue Sep 11 17:50:57 2018
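
The write bandwidth in the summary can be cross-checked against the aggregate file size and the mean time. This is a minimal sketch, not IOR code; `ior_bandwidth` is a hypothetical helper. It assumes bandwidth is the aggregate size divided by the mean time, with MiB = 2^20 bytes and MB = 10^6 bytes.

```python
# Hypothetical helper, not part of IOR: recompute write bandwidth from the
# aggregate file size and the mean time reported in the summary row.

MIB = 1 << 20   # bytes per MiB
GIB = 1 << 30   # bytes per GiB


def ior_bandwidth(aggregate_bytes, seconds):
    """Return (MiB/s, MB/s) for the given size and elapsed time."""
    bytes_per_s = aggregate_bytes / seconds
    return bytes_per_s / MIB, bytes_per_s / 1e6


# Values copied from the run above: 2 tasks x 5 GiB blocks, Mean(s) = 2.05164.
mib_s, mb_s = ior_bandwidth(2 * 5 * GIB, 2.05164)
print(f"{mib_s:.2f} MiB/sec ({mb_s:.2f} MB/sec)")
```

Because the mean time is printed to only five decimals, the recomputed values can differ from the reported ones in the last digit.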

daos_bench

kv-idx-update

kv-dkey-update

...

Results: Hangs at the first test due to a known issue, CART-496.