...
Code Block

    # single server instance per config file for now
    servers:
    -
      targets: 16                # Confirm the number of targets
      first_core: 0              # offset of the first core for service xstreams
      nr_xs_helpers: 1           # count of offload/helper xstreams per target
      fabric_iface: ib0          # map to OFI_INTERFACE=ib0
      fabric_iface_port: 31416   # map to OFI_PORT=31416
      log_mask: ERR              # map to D_LOG_MASK=ERR
      log_file: /tmp/daos_server.log  # map to D_LOG_FILE=/tmp/server.log

      # Environment variable values should be supplied without encapsulating quotes.
      env_vars:                  # influence DAOS IO Server behaviour by setting env variables
      - CRT_TIMEOUT=120
      - CRT_CREDIT_EP_CTX=0
      - PSM2_MULTI_EP=1
      - CRT_CTX_SHARE_ADDR=1
      - PMEMOBJ_CONF=prefault.at_open=1;prefault.at_create=1;  # Do we need this?
      - PMEM_IS_PMEM_FORCE=1     # Do we need this?

      # Storage definitions

      # When scm_class is set to ram, tmpfs will be used to emulate SCM.
      # The size of ram is specified by scm_size in GB units.
      scm_mount: /dev/shm        # map to -s /mnt/daos
      scm_class: ram
      scm_size: 90

      # When scm_class is set to dcpm, scm_list is the list of device paths for
      # AppDirect pmem namespaces (currently only one per server supported).
      # scm_class: dcpm
      # scm_list: [/dev/pmem0]

      # If using NVMe SSD (will write /mnt/daos/daos_nvme.conf and start I/O
      # service with -n <path>)
      # bdev_class: nvme
      # bdev_list: ["0000:81:00.0"]  # generate regular nvme.conf

      # If emulating NVMe SSD with malloc devices
      # bdev_class: malloc  # map to VOS_BDEV_CLASS=MALLOC
      # bdev_size: 4        # malloc size of each device in GB.
      # bdev_number: 1      # generate nvme.conf as follows:
      #     [Malloc]
      #     NumberOfLuns 1
      #     LunSizeInMB 4000

      # If emulating NVMe SSD over kernel block device
      # bdev_class: kdev    # map to VOS_BDEV_CLASS=AIO
      # bdev_list: [/dev/sdc]  # generate nvme.conf as follows:
      #     [AIO]
      #     AIO /dev/sdc AIO2

      # If emulating NVMe SSD with backend file
      # bdev_class: file    # map to VOS_BDEV_CLASS=AIO
      # bdev_size: 16       # file size in GB. Create file if does not exist.
      # bdev_list: [/tmp/daos-bdev]  # generate nvme.conf as follows:
      #     [AIO]
      #     AIO /tmp/aiofile AIO1 4096
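
The "map to" comments in the server configuration above pair some YAML keys with environment variables. The following is a minimal sketch of that pairing, for illustration only: `YAML_TO_ENV` and `server_env` are hypothetical names, and this is not a description of how daos_server itself processes the file.

```python
# Key -> environment variable pairs taken from the "map to" comments
# in the configuration above. Hypothetical helper, for illustration only.
YAML_TO_ENV = {
    "fabric_iface": "OFI_INTERFACE",
    "fabric_iface_port": "OFI_PORT",
    "log_mask": "D_LOG_MASK",
    "log_file": "D_LOG_FILE",
}


def server_env(settings):
    """Build the environment implied by one `servers:` entry."""
    env = {}
    for key, var in YAML_TO_ENV.items():
        if key in settings:
            env[var] = str(settings[key])
    # env_vars entries are NAME=VALUE strings. Split on the first "=" only,
    # because values such as PMEMOBJ_CONF contain "=" themselves.
    for item in settings.get("env_vars", []):
        name, _, value = item.partition("=")
        env[name] = value
    return env


example = {
    "fabric_iface": "ib0",
    "fabric_iface_port": 31416,
    "log_mask": "ERR",
    "log_file": "/tmp/daos_server.log",
    "env_vars": [
        "CRT_TIMEOUT=120",
        "PMEMOBJ_CONF=prefault.at_open=1;prefault.at_create=1;",
    ],
}

for name, value in sorted(server_env(example).items()):
    print(f"{name}={value}")
```

Splitting on the first "=" only keeps the full `PMEMOBJ_CONF` value intact.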
Server environment variables (if any are set):
Client Configuration:
Configuration:
Environment variables (if any are set):
Code Block

    export CRT_PHY_ADDR_STR="ofi+psm2"
    export OFI_INTERFACE=ib0
    export FI_PSM2_NAME_SERVER=1
    export PSM2_MULTI_EP=1
    export FI_SOCKETS_MAX_CONN_RETRY=1
    export CRT_CTX_SHARE_ADDR=1
    export CRT_TIMEOUT=120
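
When a test is launched from a Python harness rather than a shell, the same client settings can be passed to the child process explicitly. This is a minimal sketch under that assumption: `CLIENT_ENV`, `client_env`, and `run_with_client_env` are hypothetical names, and the values are copied from the export list above.

```python
import os
import subprocess
import sys

# Client-side settings copied from the export list above.
CLIENT_ENV = {
    "CRT_PHY_ADDR_STR": "ofi+psm2",
    "OFI_INTERFACE": "ib0",
    "FI_PSM2_NAME_SERVER": "1",
    "PSM2_MULTI_EP": "1",
    "FI_SOCKETS_MAX_CONN_RETRY": "1",
    "CRT_CTX_SHARE_ADDR": "1",
    "CRT_TIMEOUT": "120",
}


def client_env(base=None):
    """Return a copy of the base environment with the client settings applied."""
    env = dict(os.environ if base is None else base)
    env.update(CLIENT_ENV)
    return env


def run_with_client_env(argv):
    """Run argv with the client settings and return the CompletedProcess."""
    return subprocess.run(argv, env=client_env(), check=True,
                          capture_output=True, text=True)


if __name__ == "__main__":
    # Demonstration only: a child Python process echoes one variable back.
    result = run_with_client_env(
        [sys.executable, "-c", "import os; print(os.environ['CRT_TIMEOUT'])"])
    print(result.stdout.strip())
```

Copying the environment instead of mutating `os.environ` keeps the harness process itself unchanged between tests.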
Other important information for running test:
Item or Notes | Description |
---|---|
Defect CART-777 | With verbs and Open MPI, orterun sometimes hangs. To work around the issue, use "--mca btl tcp,self --mca oob tcp" when running IOR or other commands |
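
A minimal sketch of how the CART-777 workaround could be applied consistently when building launch commands. `orterun_command` is a hypothetical helper; the MCA options are the ones quoted in the note above, and the IOR arguments in the example (`-w -W -r -R`, with `-i 3` assumed to correspond to "iter: 3") are for illustration only.

```python
import shlex

# MCA options quoted in the CART-777 note above.
TCP_WORKAROUND = ["--mca", "btl", "tcp,self", "--mca", "oob", "tcp"]


def orterun_command(np, app_argv, workaround=True):
    """Build an orterun argument list, optionally adding the TCP workaround."""
    argv = ["orterun"]
    if workaround:
        argv += TCP_WORKAROUND
    argv += ["-np", str(np)]
    argv += list(app_argv)
    return argv


# Example: 16 client processes running IOR with the flags used in this plan.
cmd = orterun_command(16, ["ior", "-w", "-W", "-r", "-R", "-i", "3"])
print(shlex.join(cmd))
```

Building the command as a list avoids shell quoting problems; `shlex.join` is used only to print it.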
Test Description:
Testing Area | Test | Test Priority (1 - HIGH, 2 - LOW) | Number of Servers | Number of Clients | Input Parameter | Expected Result | Observed Result | Defect | Notes | Expected SUs (1 node * 1 hour = 1 SU) | |
---|---|---|---|---|---|---|---|---|---|---|---|
Server YAML config options | Verify the test cases from the sections below with specific server config options in the YAML file | 1 | target = [16] nr_xs_helpers = [1] CRT_CTX_SHARE_ADDR=[0, 1] | No server crash; performance increases linearly | No individual test is needed; the tests below can be run with this configuration | ||||||
Performance | No replica. Run IOR and collect BW. Run IOR with small sizes and collect IOPS | 1 | 1, 8, 32, 128, 128 | 1, 16, 96, 256, 740 | protocol: daos Transfer Size: 256B, 4K, 128K, 512K, 1M (Do non-standard sizes also need to be covered?) Block Size: 64M (depends on the number of processes, since the file size grows with it) FPP and SSF | A single server reached ~12 GB/s read/write, so bandwidth should scale linearly. With 128 servers it should be close to 1.5 TB/s? | 1406 nodes, taking ~30 min | 703 | |||
2-way replica. Run IOR and collect BW. Run IOR with small sizes and collect IOPS | 1 | 8, 32, 128 | 16, 96, 740 | Same as above | 1020 nodes for ~30 min | 510 | |||||
3-way replica. Run IOR and collect BW. Run IOR with small sizes and collect IOPS | 1 | 8, 32, 128 | 16, 96, 740 | Same as above | 1020 nodes for ~30 min | 510 | |||||
4-way replica. Run IOR and collect BW. Run IOR with small sizes and collect IOPS | 1 | 8, 32, 128 | 16, 96, 740 | Same as above | 1020 nodes for ~30 min | 510 | |||||
Do any erasure-coding object classes need to be run? Maybe with a medium size? EC_2P1G1, EC_2P2G1, EC_8P2G1 | 1? | 32 | 96 | Same as above? | 128 nodes for ~60 min | 120 | |||||
Metadata Test (using MDTest) | 1 | 1, 8, 32, 128, 128 | 1, 16, 96, 256, 740 | How many tasks per client: 1, 4, or only 8? Which class type should be tested? -n = 1000 (every process will create/stat/read/remove) -z = 0 and 20 (depth of the hierarchical directory structure) | Result with 1 server, 1 client is available from https://jira.hpdd.intel.com/secure/attachment/31383/sbatch_run.txt | 1406 nodes, taking ~15 min | 350 | ||||
CART self_test | 1 | 2, 32, 126 | 1, 1, 1 | orterun --timeout 3600 --mca mtl ^psm2,ofi -x FI_PSM2_DISCONNECT=1 -np 1 -ompi-server <urifile> self_test --group-name daos_server --endpoint 0-<NO_OF_SERVER>:0 --master-endpoint 0-<NO_OF_SERVER>:0 --message-sizes 'b1048576',' b1048576 0','0 b1048576',' b1048576 i2048',' i2048 b1048576',' i2048',' i2048 0','0 i2048','0' --max-inflight-rpcs 1 --repetitions 100 | Did not get all the numbers for 126 servers | CART-791 | 166 nodes for ~5 min | 14 | |||
POSIX (Fuse) | 2? | 32 | 96 | Run IOR in POSIX mode. Are we at the point where full performance can be expected? | 128 nodes for ~60 min | 128 | |||||
DFS | 2 | Not sure whether we want to cover DFS, since DAOS is already covered with IOR in the test cases above | |||||||||
HDF5? | 2? | 32 | 96 | Any specific test we want to run? | |||||||
FIO? | Do we want to test this? | ||||||||||
Functionality and Scale testing | Run all daos_test | 2 | 128 | 740 | 868 nodes for ~60 min | ||||||
Single server/Max clients (IOR) | 1 | 1 | Client processes: 1, 64, 128, 512, 1024, 2032 | Create pool, Query pool. Run IOR (Specific size?) Transfer size: 256B, 1M Block size: 16M for 256B TS, otherwise 64M Flags: -w -W -r -R iter: 3 | Pool create should work fine. IOR will run with ~2000 tasks, so it should succeed. Query pool info after the IOR run and compare the pool size to the file size. Assuming 16 client processes per node. (Need to verify that this works; 8 client processes per node works.) | (Total nodes available at present: 127. 16 client processes per node) | 128 nodes for ~30 min | 64 | ||
1 | 1 | Client processes: 1, 128, 1024, 4096, 8192, 13872 | Create pool, Query pool. Run IOR (Specific size?) Transfer size: 256B, 1M Block size: 16M for 256B TS, otherwise 64M Flags: -w -W -r -R iter: 3 | Pool create should work fine. IOR will run with ~13000 tasks, so it should succeed. Query pool info after the IOR run and compare the pool size to the file size. Assuming 16 client processes per node. (Need to verify that this works; 8 client processes per node works.) | (Total nodes available at present: 867. 16 client processes per node) | 868 nodes for ~30 min | 434 | ||||
Max servers/single client (IOR) | 1 | 1, 8, 16, 32, 64, 127 | Client processes: 16 | Create pool, Query pool. Run IOR with DAOS and POSIX API Transfer size: 256B, 1M Block size: 16M for 256B TS, otherwise 64M Flags: -w -W -r -R iter: 3 | Pool create should work fine. IOR will run with 16 client processes per node (need to verify that this works; 8 client processes per node works). Query pool info after the IOR run and compare the pool size to the file size. | (Total nodes available at present: 127. 16 client processes per node) | 128 nodes for ~30 min | 64 | |||
1 | 1, 16, 64, 256, 512, 867 | Client processes: 16 | Create pool, Query pool. Run IOR with DAOS and POSIX API Transfer size: 256B, 1M Block size: 16M for 256B TS, otherwise 64M Flags: -w -W -r -R iter: 3 | Pool create should work fine. IOR will run with 16 client processes per node (need to verify that this works; 8 client processes per node works). Query pool info after the IOR run and compare the pool size to the file size. | (Total nodes available at present: 867. 16 client processes per node) | 868 nodes for ~30 min | 434 | ||||
Large number of Pools (~1000) | 128 (Does this server count seem OK?) | 740 | Create a large number of pools (~90 MB each) and write a small amount of data with IOR. Restart all the servers. Query all the pools. Read the IOR data from each pool with verification. What other operations are needed after pool creation? | Measure server restart time with this many pools. Pool query should report correct sizes after the IOR write. IOR read should work with data validation after all servers restart | 868 nodes for ~60 min | 868 | |||||
dmg utility testing, for example: pool query | dmg pool create, dmg pool query, dmg pool destroy. Anything more to cover? Some of these tools will be covered in other test cases | ||||||||||
Negative Scenarios with Scalability | Server failure and rebuild data | 128 | 740 | 1 | Create multiple pools. Write IOR data with 2-, 3-, and 4-way replication and with multiple groups. Kill servers one by one, up to a maximum of 64 (half the requested size)? After each server kill, read the IOR data and verify the content. Multiple servers can be killed at once (2/4/8); object data will be lost if all copies are lost. Maybe we can verify that the remaining system is functional | Rebuild should happen for all objects, and data should not be corrupted after server failure | 868 nodes for ~2 hours | 868 | |||
daos_run_io_conf | 128 | 740 | 2 | This excludes ranks and adds them back, in a loop for a given count. We can have a maximum of 16 targets and include all ranks. The test excludes a rank at random and adds it back. Pool query is also part of this test, to verify the usage | Not yet tried on TACC. It works locally, but a few issues caught during local testing still need to be resolved (DAOS-3510) | 868 nodes for ~30 min | 434 | ||||
Reliability and Data Integrity (Soak testing) | Current Soak testing | 868 nodes for 2 hours | 1736 | ||||||||
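
The node and SU figures in the table can be cross-checked with a few lines of arithmetic. This is a sketch under one assumption that the table does not state explicitly: server and client counts are paired by position (for example 128 servers with 740 clients), and every configuration in a row runs for the stated duration. With that reading, the Performance and replica rows reproduce the table's 1406/703 and 1020/510 figures. `nodes_used` and `service_units` are hypothetical helper names.

```python
def nodes_used(servers, clients):
    """Total nodes across configurations, pairing servers and clients by position."""
    return sum(s + c for s, c in zip(servers, clients))


def service_units(nodes, minutes):
    """1 SU = 1 node for 1 hour, as defined in the table header."""
    return nodes * minutes / 60


# Performance row (no replica): five server/client configurations.
perf_nodes = nodes_used([1, 8, 32, 128, 128], [1, 16, 96, 256, 740])
print(perf_nodes, service_units(perf_nodes, 30))

# Replica rows: three configurations each.
replica_nodes = nodes_used([8, 32, 128], [16, 96, 740])
print(replica_nodes, service_units(replica_nodes, 30))

# Maximum client process counts at 16 processes per client node.
print(16 * 127, 16 * 867)
```

The last line gives the largest client process counts for 127 and 867 client nodes at 16 processes per node.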
...