Erasure Code Performance

This page tracks the initial performance numbers for erasure code:

Erasure_Code_Performance (Wolf vs Frontera)


Seq | Servers | Clients | IOR Process | Aggregation Disabled | ChunkSize | BlockSize | XferSize | Strip | access | Object Class | Write (MB/Sec) | Read (MB/Sec) | Date | Request | Notes
1410200
1M32M1M ------SSF RP_1G23430.76843.38


2410200
1M32M1M ------SSF RP_3G23084.8518206.26


3410200
32M32M2MFULLSSFEC_2P1G13448.77087.37


4410200
32M32M2MFULLSSFEC_2P2G13445.286842.25


5410200
32M32M1MPartialSSFEC_2P1G13410.766430.81


6410200
32M32M1MPartialSSFEC_2P2G11729.426586.25


7410200
1M32M1M ------FPP RP_1G220628.5141909.55


8410200
1M32M1M ------FPPRP_3G25079.5843342.75


9410200
32M32M2MFULLFPPEC_2P1G114021.3540296.71

Note: export FI_PSM2_CONN_TIMEOUT=30 must be set on the client side for most FPP runs to work.
10410200
32M32M2MFULLFPPEC_2P2G19933.6238209.18


11410200
32M32M1MPartialFPPEC_2P1G1669.4828914.79


12410200
32M32M1MPartialFPPEC_2P2G16423.2727437.02


13
14610200
1M32M1M ------SSFRP_3G46094.6934479.7


15610200
32M32M4MFULLSSFEC_4P1G16824.5414190.82


16610200
32M32M4MFULLSSFEC_4P2G16429.2713979.85


17610200
32M32M1MFULLSSFDAOS_OC_EC_K4P2_L32K 4172.459294.78


18610200
32M32M128KFULLSSFDAOS_OC_EC_K4P2_L32K 1403.542600.15


19610200
32M32M1MFULLSSFDAOS_OC_EC_K4P2_L64K 4013.99781.82


20610200
32M32M128KFULLSSFDAOS_OC_EC_K4P2_L64K 1244.962415.02


21610200
1M32M1M ------FPPRP_3G47188.6560362.96


22610200
32M32M4MFULLFPPEC_4P1G125080.6561916.86


23610200
32M32M4MFULLFPPEC_4P2G120152.5261087.66


24610200YES32M32M1MFULLFPPDAOS_OC_EC_K4P2_L32K 7326.1823294.16


25610200YES32M32M1MFULLFPPDAOS_OC_EC_K4P2_L64K 12545.2527339.02


26610200
8M8M1MFULLFPPDAOS_OC_EC_K4P2_L32K 11291.9937411.33


27
281010200
1M32M8M ------SSF RP_1G813845.628329.19


291010200
32M32M8MFULLSSFEC_8P2G112262.0726527.38


301010200
32M32M1MPartialSSFEC_8P2G11698.54526.24


311010200
32M32M1MFULLSSFDAOS_OC_EC_K8P2_L32K 7305.943880.9


321010200
32M32M256KFULLSSFDAOS_OC_EC_K8P2_L32K 2438.233189.18


331010200
32M32M8MFULLSSFDAOS_OC_EC_K8P2_L32K 5283.4918123.56


341010200
32M32M1MFULLSSFDAOS_OC_EC_K8P2_L64K 5869.313040.1


351010200
32M32M256KFULLSSFDAOS_OC_EC_K8P2_L64K 1012.43520.82


361010200
32M32M8MFULLSSFDAOS_OC_EC_K8P2_L64K 10032.9621768.8


371010200
1M32M8M ------FPP RP_1G841363.6460083.43


381010200
32M32M8MFULLFPPEC_8P2G135246.9178044.88


391010200
16M16M1MPartialFPPEC_8P2G112679.7475909.24


401010200YES32M32M1MFULLFPPDAOS_OC_EC_K8P2_L32K 7706.123577.2


411010200YES32M32M8MFULLFPPDAOS_OC_EC_K8P2_L32K 16883.8768451.2


421010200YES32M32M1MFULLFPPDAOS_OC_EC_K8P2_L64K 15362.6729499.67


431010200YES32M32M8MFULLFPPDAOS_OC_EC_K8P2_L64K 13249.4661664.51


441010200
8M8M1MFULLFPPDAOS_OC_EC_K8P2_L32K 23407.663203.37


45
461810200
1M32M1M ------SSF RP_1G1627322.2754265.8310/19/2020
Reran on latest master 89f02db3700005079d5e76411f336e82ac50b35a
471810200
1M32M1M ------SSF RP_1G1626523.5251431.1010/20/2020 Requested by Di on 10/19/2020: run with debug log
Reran on master 89f02db3700005079d5e76411f336e82ac50b35a with debug log; logs and IOR console output copied to Wolf: /scratch/samirrav/Defect_logs/ior-2020-10-21_00-16-24-RP_1G16/
481810200
64K32M64k ------SSF RP_1G1614044.8923006


491810200
64K32M1M ------SSF RP_1G1614180.7125624.46


501810200
32M32M16MFULLSSFEC_16P2G125379.1725194.6910/19/2020
Reran on latest master 89f02db3700005079d5e76411f336e82ac50b35a
511810200
32M32M16MFULLSSFEC_16P2G124766.2646143.710/19/2020
With xuezhao’s patches https://github.com/daos-stack/daos/pull/3648 and https://github.com/daos-stack/daos/pull/3690
521810200
32M32M16MFULLSSFEC_16P2G125332.9951189.5310/20/2020 Requested by Di on 10/19/2020: run with debug log

With xuezhao’s patches https://github.com/daos-stack/daos/pull/3648 and https://github.com/daos-stack/daos/pull/3690

Logs and IOR console output copied to Wolf: /scratch/samirrav/Defect_logs/ior-EC_16P2G1-2020-10-21_00-48-17/

531810200
32M32M16MFULLSSFEC_16P2G124982.0329195.24

For debugging: EC_16P2 fetch performance is not scaling. Logs on Wolf: /scratch/samirrav/Defect_logs/RP_1G16_For_Debug/
541810200
32M32M1MPartialSSFEC_16P2G14820.762444.39


551810200
32M32M1MFULLSSFDAOS_OC_EC_K16P2_L32K 6374.052390.3110/19/2020
Reran all on latest master 89f02db3700005079d5e76411f336e82ac50b35a
561810200
32M32M1MFULLSSFDAOS_OC_EC_K16P2_L64K 6242.842153.6310/19/2020
571810200
32M32M16MFULLSSFDAOS_OC_EC_K16P2_L64K 18174.6439528.0310/19/2020
581810200
32M32M16MFULLSSFDAOS_OC_EC_K16P2_L128K 22452.4140896.4410/19/2020
89f02db3700005079d5e76411f336e82ac50b35a
591810200
32M32M16MFULLSSFDAOS_OC_EC_K16P2_L128K 23092.0640052.0110/20/2020 Requested by Di on 10/19/2020: run with debug log

89f02db3700005079d5e76411f336e82ac50b35a

Ran with debug log; logs and IOR console output copied to Wolf: /scratch/samirrav/Defect_logs/ior-DAOS_OC_EC_K16P2_L128K-2020-10-21_00-33-53

601810200
32M32M16MFULLSSFDAOS_OC_EC_K16P2_L256K 21340.5625030.9910/19/2020
89f02db3700005079d5e76411f336e82ac50b35a
611810200
32M128M16MFULLSSFDAOS_OC_EC_K16P2_L256K 23737.3851349.8810/21/2020
With a larger block size; see IOR_log.txt
621810200
32M32M16MFULLSSFDAOS_OC_EC_K16P2_L512K 24840.6926299.410/19/2020
89f02db3700005079d5e76411f336e82ac50b35a
631810200
32M128M16MFULLSSFDAOS_OC_EC_K16P2_L512K 22949.5852070.0410/21/2020
With a larger block size; see IOR_log.txt
641810200
32M32M16MFULLSSFDAOS_OC_EC_K16P2_L512K 23946.7133672.4910/20/2020 Requested by Di on 10/19/2020: run with debug log

89f02db3700005079d5e76411f336e82ac50b35a

Ran with debug log; logs and IOR console output copied to Wolf: /scratch/samirrav/Defect_logs/ior-DAOS_OC_EC_K16P2_L512K-2020-10-21_00-37-08/

651810200
32M32M1MFULLSSFDAOS_OC_EC_K16P2_L64K 9502.526734.47

With a patch from Di that disables (comments out) obj_auxi->flags |= ORF_DTX_SYNC;
661810200
1M32M1M ------FPP RP_1G1674121.3798367.3910/19/2020
Reran all on latest master 89f02db3700005079d5e76411f336e82ac50b35a
671810200
1M32M1M ------FPP RP_1G1678331.87117012.3211/3/2020
Latest master d73374cb6cef61b830bd030a2b5d85791342d2d0 IOR_Console.txt
681810200YES1M32M16M ------FPP RP_1G16223.6668781.28


691810200
64K32M64k ------FPP RP_1G1650569.4184687.21


701810200
64K32M1M ------FPP RP_1G1662811.6689913.44


711810200
32M32M16MFULLFPPEC_16P2G116746.8173468.0910/19/2020

Reran on latest master 89f02db3700005079d5e76411f336e82ac50b35a

DAOS-5888

721810200
32M32M16MFULLFPPEC_16P2G116329.480395.8410/19/2020
With xuezhao’s patches https://github.com/daos-stack/daos/pull/3648 and https://github.com/daos-stack/daos/pull/3690
731810200
32M32M16MFULLFPPEC_16P2G117129.21123063.7911/3/2020

Latest master d73374cb6cef61b830bd030a2b5d85791342d2d0

Write is the same as with RP_1G16, so no defect will be opened.

741810200
32M32M16MFULLFPPEC_16P2G1



With echo mode (add the env DAOS_IO_BYPASS=target to your server yml file). Error: dfs ERR src/client/dfs/dfs.c:968 open_sb() SB does not exist
751810200
8M8M1MPartialFPPEC_16P2G121759.3382566.56


761810200YES32M32M1MFULLFPPDAOS_OC_EC_K16P2_L32K 19547.0622717.0310/19/2020
Reran all on latest master 89f02db3700005079d5e76411f336e82ac50b35a
771810200YES32M32M1MFULLFPPDAOS_OC_EC_K16P2_L64K 18283.1422995.810/19/2020
781810200YES32M32M16MFULLFPPDAOS_OC_EC_K16P2_L64K 14874.2583421.2510/19/2020
791810200YES32M32M16MFULLFPPDAOS_OC_EC_K16P2_L128K 16220.05123331.5910/20/2020
IOR_Log.txt
801810200YES32M32M16MFULLFPPDAOS_OC_EC_K16P2_L256K 




811810200YES32M128M16MFULLFPPDAOS_OC_EC_K16P2_L256K 



Opened a new issue, DAOS-5895, just to be sure it is not something on the DAOS or CART side.

821810200YES32M32M16MFULLFPPDAOS_OC_EC_K16P2_L512K 16520.7173037.2110/19/2020
On latest master 89f02db3700005079d5e76411f336e82ac50b35a
831810200YES32M128M16MFULLFPPDAOS_OC_EC_K16P2_L512K 40705.93127393.6410/21/2020
With a larger 128M block size; see IOR_log.txt
84 These are SX object runs to verify the network and system bandwidth
85410200
32M32M2M
SSFSX20194.8740302.62


86410200
32M32M2M
FPPSX21357.0742227.33


87610200
32M32M4M
SSFSX25087.0148330.26


88610200
32M32M4M
FPPSX28762.5956743.31


891010200
32M32M8M
SSFSX39641.4976674.21


901810400
32M32M2M
SSFSX71998.83116292.9


911810200
1M32M1M
FPPSX79356.44122079.510/2/2020
70b49b97ca40d596a0c98f28684378b159fdd66a as of 10/2/2020
921810200
1M32M1M
FPPSX74514.32127545.8910/19/2020
70b49b97ca40d596a0c98f28684378b159fdd66a as of 10/19/2020
931810200
1M32M1M
FPPSX77373.74130906.8710/19/2020
89f02db3700005079d5e76411f336e82ac50b35a as of 10/19/2020
941810200
1M32M1M
FPPSX80206.02110426.1211/3/2020
d73374cb6cef61b830bd030a2b5d85791342d2d0 on 11/3/2020 IOR_Console.txt
951610200
1M32M1M
FPPSX
43882.12
137707.65
1/5/2021
d7548abeaa0d1a94d6ed67373a894fed04e80a1c with verbs provider IOR_Console.txt
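
The XferSize and Strip columns, together with the cell-size suffix in class names such as DAOS_OC_EC_K16P2_L128K, relate each transfer to the EC stripe. The sketch below is one rough way to reason about that relationship. It assumes a full stripe is k data cells, and it assumes a 1 MiB cell for class names without an _L suffix (such as EC_16P2G1); neither assumption is stated on this page. The function names are hypothetical, and the FULL/Partial labels in the table may follow a different convention for some rows.

```python
import re

# ASSUMPTION: classes named like EC_16P2G1 use a 1 MiB cell. Not stated on this page.
ASSUMED_DEFAULT_CELL = 1 << 20

UNITS = {"K": 1 << 10, "M": 1 << 20}


def parse_size(text):
    """Convert a size string such as '64K', '64k' or '16M' to bytes."""
    return int(text[:-1]) * UNITS[text[-1].upper()]


def parse_ec_class(name):
    """Return (data_cells, parity_cells, cell_bytes) for an EC class name."""
    m = re.fullmatch(r"DAOS_OC_EC_K(\d+)P(\d+)_L(\d+[KM])", name)
    if m:
        return int(m.group(1)), int(m.group(2)), parse_size(m.group(3))
    m = re.fullmatch(r"EC_(\d+)P(\d+)G\d+", name)
    if m:
        return int(m.group(1)), int(m.group(2)), ASSUMED_DEFAULT_CELL
    raise ValueError(f"not an EC object class name: {name}")


def stripe_info(name, xfer):
    """Describe how one transfer of size `xfer` maps onto stripes of class `name`."""
    k, p, cell = parse_ec_class(name)
    stripe = k * cell                      # data bytes in one full stripe
    xfer_bytes = parse_size(xfer)
    return {
        "stripe_bytes": stripe,
        "whole_stripes": xfer_bytes // stripe,
        # True when the transfer is a whole number (at least one) of stripes
        "aligned": xfer_bytes >= stripe and xfer_bytes % stripe == 0,
        "efficiency": k / (k + p),         # fraction of stored bytes that is user data
    }


if __name__ == "__main__":
    for oclass, xfer in [("EC_2P1G1", "2M"), ("EC_2P1G1", "1M"),
                         ("DAOS_OC_EC_K16P2_L128K", "16M")]:
        print(oclass, xfer, stripe_info(oclass, xfer))
```

Under these assumptions a 16M transfer on DAOS_OC_EC_K16P2_L128K covers several whole stripes, while a 1M transfer on EC_2P1G1 covers only part of one.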

Target Comparison on d73374cb6cef61b830bd030a2b5d85791342d2d0: 

11010200
1M32M8M ------FPP RP_1G8
43421.06
96042.05

11/3/2020

With targets: 16, nr_xs_helpers: 0 IOR_Console.txt 

21010200
1M32M8M ------FPP RP_1G833252.0779346.13
11/3/2020 With targets: 8, nr_xs_helpers: 8 IOR_Console.txt
31010200
1M32M8M ------FPP RP_1G840803.2462832.65
11/4/2020 With targets: 16, nr_xs_helpers: 16 IOR_Console.txt
41010200YES1M32M8M ------FPP RP_1G838090.3254018.09
11/4/2020

With targets: 16, nr_xs_helpers: 16 IOR_Console.txt

+ Reverting commit of "86330a6fc7dff4f78d8a625975d8a8eae900f2cb"

51010200
32M32M8MFULLFPPEC_8P2G123595.6391343.78
11/3/2020

With targets: 16, nr_xs_helpers: 0 IOR_Console.txt

61010200
32M32M8MFULLFPPEC_8P2G126437.2684216.77
11/3/2020 With targets: 8, nr_xs_helpers: 8 IOR_Console.txt
71010200
32M32M8MFULLFPPEC_8P2G118385.8455319.19
11/4/2020 With targets: 16, nr_xs_helpers: 16 IOR_Console.txt
81010200YES32M32M8MFULLFPPEC_8P2G121954.2163754.15
11/4/2020

With targets: 16, nr_xs_helpers: 16 IOR_Console.txt

+ Reverting commit of "86330a6fc7dff4f78d8a625975d8a8eae900f2cb"

91810200
32M32M16MFULLFPPEC_16P2G117129.21123063.79
11/3/2020

With targets: 16, nr_xs_helpers: 0 IOR_Console.txt

101810200
32M32M16MFULLFPPEC_16P2G114765.76107413.18
11/3/2020 With targets: 8, nr_xs_helpers: 8 IOR_Console.txt
111810200
32M32M16MFULLFPPEC_16P2G125576.86105462.85
11/4/2020 With targets: 16, nr_xs_helpers: 16 IOR_Console.txt
121810200YES32M32M16MFULLFPPEC_16P2G1


11/4/2020

With targets: 16, nr_xs_helpers: 16

+ Reverting commit of "86330a6fc7dff4f78d8a625975d8a8eae900f2cb"

Hitting the same issue as DAOS-5895

Summary:

  • Server-side fan-out has a big performance impact
  • Write is forwarded by EC group leader
    • 16+2 write performance is not scaling well
  • Read does not have the same issue
    • Client has the same RPC fan-out as server, why?
  • Fragmented RDMA
    • Performance of 64K cell size is not good enough
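
To make the fan-out point in the summary concrete, here is a deliberately simplified model, not a description of the actual DAOS code path. It assumes the EC group leader holds one shard of a k+p group and forwards a full-stripe write to every other shard. The function names are hypothetical.

```python
def leader_forward_count(k, p):
    """Shards the group leader forwards to, assuming it holds one of the k+p shards."""
    if k < 1 or p < 0:
        raise ValueError("k must be >= 1 and p must be >= 0")
    return k + p - 1


def redundancy_overhead(k, p):
    """Bytes stored per byte of user data for a k+p erasure code."""
    if k < 1 or p < 0:
        raise ValueError("k must be >= 1 and p must be >= 0")
    return (k + p) / k


if __name__ == "__main__":
    # Data/parity combinations that appear in the tables above.
    for k, p in [(2, 1), (4, 2), (8, 2), (16, 2)]:
        print(f"{k}+{p}: leader forwards to {leader_forward_count(k, p)} shards, "
              f"stores {redundancy_overhead(k, p):.3f}x the user data")
```

In this model the number of forwarded requests per full-stripe write grows with k+p, which is in line with the observation that 16+2 writes scale worse than the smaller layouts.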