This page tracks the initial performance numbers for the erasure code (EC) comparison on the Wolf and Frontera clusters.
Wolf Cluster:
- 4-18 servers
  - DCPM only (no SSD)
  - 16 targets per server
  - OFI+psm2
  - 2 daos_io_server per physical node
- 10 clients
  - OFI+psm2
- IOR (FPP)
  - 200 ranks
  - Flags: -v -w -r -i 2 -k
Frontera Cluster:
- 4-18 servers
  - tmpfs only (no SSD)
  - 16 targets per server
  - ofi+verbs;ofi_rxm
  - 1 daos_io_server per physical node
- 10-36 clients
  - ofi+verbs;ofi_rxm
- IOR (FPP)
  - 200-720 ranks
  - Flags: -w -W -r -R -g -G 27 -C -Q 1 -F -i 2 -s 1
- Aggregation disabled for all tests
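As a rough illustration of how a results row relates to an IOR run, the sketch below derives the ranks per client node and assembles an argument list from the flag sets listed above. It assumes that BlockSize and XferSize correspond to IOR's -b and -t options. The helper names are hypothetical, and the DAOS backend, pool, container, object class, and chunk size options are left out because this page does not list them.

```python
import shlex

# Flag sets copied from the cluster descriptions on this page.
WOLF_FLAGS = "-v -w -r -i 2 -k"
FRONTERA_FLAGS = "-w -W -r -R -g -G 27 -C -Q 1 -F -i 2 -s 1"


def ranks_per_client(total_ranks, clients):
    """Return how many IOR ranks run on each client node."""
    if total_ranks % clients:
        raise ValueError("ranks do not divide evenly across clients")
    return total_ranks // clients


def ior_args(base_flags, block_size, xfer_size):
    """Build an IOR argument list (hypothetical helper).

    Assumption: BlockSize maps to -b and XferSize maps to -t.
    Backend-specific options are intentionally omitted.
    """
    return ["ior", *shlex.split(base_flags), "-b", block_size, "-t", xfer_size]


if __name__ == "__main__":
    # 18-server Frontera row: 36 clients, 720 ranks, 32M blocks, 16M transfers.
    print(ranks_per_client(720, 36))
    print(" ".join(ior_args(FRONTERA_FLAGS, "32M", "16M")))
```

Most rows work out to 20 ranks per client; the 8-client, 256-rank Frontera rows use 32.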
1 | Servers | Clients | IOR Processes | Aggregation Disabled | ChunkSize | BlockSize | XferSize | Stripe | Access | Object Class | Write (MB/s) | Read (MB/s) | Date | Request | Notes |
2 | 4 | 10 | 200 | | 1M | 32M | 1M | ------ | FPP | RP_1G2 | 20628.51 | 41909.55 | | | |
3 | 4 | 10 | 200 | | 1M | 32M | 1M | ------ | FPP | RP_3G2 | 5079.58 | 43342.75 | | | |
4 | 4 | 10 | 200 | | 32M | 32M | 2M | FULL | FPP | EC_2P1G1 | 14021.35 | 40296.71 | | | Need to export FI_PSM2_CONN_TIMEOUT=30 on the client side for most FPP runs to work |
5 | 4 | 8 | 256 | YES | 32M | 32M | 2M | FULL | FPP | EC_2P1G1 | 27912.64 | 46351.81 | | Frontera | IOR_Console.txt |
6 | 4 | 10 | 200 | YES | 32M | 32M | 2M | FULL | FPP | EC_2P1G1 | 28046.75 | 43065.39 | | Frontera | IOR_Console.txt |
7 | 4 | 8 | 256 | No | 32M | 32M | 2M | FULL | FPP | EC_2P1G1 | 25060.42 | 37088.85 | 5/12/2021 | Frontera | |
8 | |||||||||||||||
9 | 6 | 10 | 200 | | 1M | 32M | 1M | ------ | FPP | RP_3G4 | 7188.65 | 60362.96 | | | |
10 | 6 | 10 | 200 | | 32M | 32M | 4M | FULL | FPP | EC_4P1G1 | 25080.65 | 61916.86 | | | |
11 | 6 | 12 | 240 | YES | 32M | 32M | 4M | FULL | FPP | EC_4P1G1 | 47124.62 | 64640.02 | | Frontera | IOR_Console.txt |
12 | |||||||||||||||
13 | 10 | 10 | 200 | | 1M | 32M | 8M | ------ | FPP | RP_1G8 | 41363.64 | 60083.43 | | | |
14 | 10 | 10 | 200 | | 32M | 32M | 8M | FULL | FPP | EC_8P2G1 | 35246.91 | 78044.88 | | | |
15 | 10 | 20 | 400 | YES | 32M | 32M | 8M | FULL | FPP | EC_8P2G1 | 70238.98 | 101927.74 | | Frontera | IOR_Console.txt |
16 | 10 | 10 | 200 | YES | 32M | 32M | 8M | FULL | FPP | DAOS_OC_EC_K8P2_L64K | 13249.46 | 61664.51 | | | |
17 | 10 | 20 | 400 | YES | 32M | 32M | 8M | FULL | FPP | DAOS_OC_EC_K8P2_L64K | 68384.40 | 93717.37 | | Frontera | IOR_Console.txt |
18 | |||||||||||||||
19 | 18 | 10 | 200 | | 1M | 32M | 1M | ------ | FPP | RP_1G16 | 78331.87 | 117012.32 | 11/3/2020 | | Latest master d73374cb6cef61b830bd030a2b5d85791342d2d0, IOR_Console.txt |
20 | 18 | 36 | 720 | YES | 1M | 32M | 1M | ------ | FPP | RP_1G16 | 149058.91 | 171154.21 | | Frontera | IOR_Console.txt |
21 | 18 | 10 | 200 | | 32M | 32M | 16M | FULL | FPP | EC_16P2G1 | 17129.21 | 123063.79 | 11/3/2020 | | Latest master d73374cb6cef61b830bd030a2b5d85791342d2d0 |
22 | 18 | 36 | 720 | YES | 32M | 32M | 16M | FULL | FPP | EC_16P2G1 | 107520.52 | 179240.30 | | Frontera | IOR_Console.txt |
23 | 18 | 10 | 200 | YES | 32M | 32M | 16M | FULL | FPP | DAOS_OC_EC_K16P2_L64K | 14874.25 | 83421.25 | 10/19/2020 | | |
24 | 18 | 36 | 720 | YES | 32M | 32M | 16M | FULL | FPP | DAOS_OC_EC_K16P2_L64K | 97084.61 | 147155.30 | | Frontera | IOR_Console.txt |
25 | 18 | 10 | 200 | YES | 32M | 32M | 16M | FULL | FPP | DAOS_OC_EC_K16P2_L128K | 16220.05 | 123331.59 | 10/20/2020 | | |
26 | 18 | 36 | 720 | YES | 32M | 32M | 16M | FULL | FPP | DAOS_OC_EC_K16P2_L128K | 108867.57 | 163780.23 | | Frontera | IOR_Console.txt |
27 | 18 | 10 | 200 | YES | 32M | 128M | 16M | FULL | FPP | DAOS_OC_EC_K16P2_L256K | | | | | New issue opened to be sure it is not something on the DAOS or CART side: DAOS-5895 - IOR EC object type DAOS_OC_EC_K16P2_L256K IOR with FPP is crashing the server (Open) |
28 | 18 | 36 | 720 | YES | 32M | 128M | 16M | FULL | FPP | DAOS_OC_EC_K16P2_L256K | 146009.29 | 191612.73 | | Frontera | IOR_Console.txt |
29 | 18 | 10 | 200 | YES | 32M | 128M | 16M | FULL | FPP | DAOS_OC_EC_K16P2_L512K | 40705.93 | 127393.64 | 10/21/2020 | | With higher 128M BlockSize, IOR_log.txt |
30 | 18 | 36 | 720 | YES | 32M | 128M | 16M | FULL | FPP | DAOS_OC_EC_K16P2_L512K | 160802.90 | 193756.27 | | Frontera | IOR_Console.txt |
31 | 18 | 36 | 720 | YES | 32M | 256M | 16M | FULL | FPP | DAOS_OC_EC_K16P2_L512K | 149744.54 | 195394.01 | | Frontera | IOR_Console.txt, with dedup:memcmp |
32 | 18 | 36 | 720 | YES | 32M | 512M | 16M | FULL | FPP | DAOS_OC_EC_K16P2_L512K | 175131.70 | 183141.63 | | Frontera | IOR_Console.txt, with dedup:memcmp |
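When comparing replicated and erasure-coded rows, it can help to normalize the write numbers by how much raw data each object class stores. The sketch below is a minimal illustration, not part of the test setup. It assumes that EC_kPp (or EC_KkPp) means k data cells plus p parity cells and that RP_n means n replicas, and it ignores the G (group) and L (cell size) suffixes. The function names are hypothetical; the three sample rows are the 4-server, 10-client results from the table above.

```python
import re


def redundancy_factor(object_class):
    """Raw bytes stored per byte of user data, inferred from the class name.

    Assumption: EC_<k>P<p> / EC_K<k>P<p> is k data + p parity cells,
    and RP_<n>G<x> is n full replicas.
    """
    ec = re.search(r"EC_K?(\d+)P(\d+)", object_class)
    if ec:
        data, parity = int(ec.group(1)), int(ec.group(2))
        return (data + parity) / data
    rp = re.search(r"RP_(\d+)G", object_class)
    if rp:
        return float(rp.group(1))
    raise ValueError(f"unrecognized object class: {object_class}")


# (servers, object class, write MB/s) copied from the table.
ROWS = [
    (4, "RP_1G2", 20628.51),
    (4, "RP_3G2", 5079.58),
    (4, "EC_2P1G1", 14021.35),
]


def summarize(rows):
    """Add per-server and redundancy-adjusted write rates to each row."""
    summary = []
    for servers, object_class, write_mb_s in rows:
        factor = redundancy_factor(object_class)
        summary.append({
            "object_class": object_class,
            "factor": factor,
            "write_per_server": write_mb_s / servers,
            "raw_write": write_mb_s * factor,
        })
    return summary


if __name__ == "__main__":
    for entry in summarize(ROWS):
        print(entry)
```

The adjusted figure is only a rough guide: it accounts for the extra bytes written, not for parity computation or network fan-out.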
Defect Status:
- DAOS-5888 - EC Write performance EC_16P2G1 dropped to 50% on latest master compare to few weeks older 70b49b97ca40d596a0c98f28684378b159fdd66a
  - Not seen on Frontera because of its non-PSM2 fabric
- DAOS-5895 - IOR EC object type DAOS_OC_EC_K16P2_L256K IOR with FPP is crashing the server
  - Did not observe any crash on Frontera
- DAOS-5777 - EC IOR test is failing for file-per-process which has higher Chunk size
  - Not tried on Frontera, but will likely go away (unconfirmed)
- New defect on Frontera (18 servers / 80 clients [1600 tasks]): reads get stuck, with no server or client crash. A few more things need to be tried to collect more debug info
  - object ERR src/object/cli_shard.c:552 dc_rw_cb() RPC 1 failed: -1032
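To gather more debug information for the stuck-read case, one option is to tally failing RPC callbacks by function and return code across the client logs. The sketch below assumes log lines end with the layout shown above (facility, level, source file and line, function, message, return code). Real log lines may carry additional prefixes, and the helper names are hypothetical.

```python
import re
from collections import Counter

# Pattern modeled on the single log line quoted on this page (an assumption,
# not a documented log format).
LOG_RE = re.compile(
    r"(?P<facility>\S+)\s+(?P<level>[A-Z]+)\s+"
    r"(?P<source>\S+):(?P<lineno>\d+)\s+"
    r"(?P<func>\w+)\(\)\s+(?P<message>.*?):\s*(?P<rc>-?\d+)\s*$"
)


def parse_line(line):
    """Return the parsed fields of one log line, or None if it does not match."""
    match = LOG_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["lineno"] = int(fields["lineno"])
    fields["rc"] = int(fields["rc"])
    return fields


def count_failures(lines):
    """Count matching lines by (function, return code)."""
    counts = Counter()
    for line in lines:
        fields = parse_line(line)
        if fields is not None:
            counts[(fields["func"], fields["rc"])] += 1
    return counts


if __name__ == "__main__":
    sample = "object ERR src/object/cli_shard.c:552 dc_rw_cb() RPC 1 failed: -1032"
    print(parse_line(sample))
    print(count_failures([sample, "unrelated line", sample]))
```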