Introduction - An overview of Metadata on SSD is available on the DAOS Community Confluence page "Metadata on SSDs" (atlassian.net).
This test plan covers the additional tests needed to verify the extra work an engine performs on restart: loading the blob from the SSD and re-applying any changes from the WAL.
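To make the restart work concrete, here is a toy model of "load the persisted state, then re-apply newer WAL entries". It is a conceptual sketch only, not DAOS code: `ToyStore` and its methods are hypothetical names invented for illustration. One possible reading of the test matrix below is that the "unsynchronized" cases are restarts where the WAL still holds entries newer than the last checkpoint, and the "synchronized" cases are restarts where replay finds nothing to do.

```python
from dataclasses import dataclass, field


@dataclass
class ToyStore:
    """Hypothetical stand-in for an engine's metadata state (not a DAOS type)."""
    checkpoint: dict = field(default_factory=dict)  # state persisted on the SSD
    checkpoint_id: int = 0                          # last WAL entry folded into the checkpoint
    wal: list = field(default_factory=list)         # (entry_id, key, value) tuples

    def write(self, key, value):
        """Log the update to the WAL; the checkpoint is not touched yet."""
        entry_id = self.wal[-1][0] + 1 if self.wal else self.checkpoint_id + 1
        self.wal.append((entry_id, key, value))

    def do_checkpoint(self):
        """Fold every logged update into the checkpoint and truncate the WAL."""
        for entry_id, key, value in self.wal:
            self.checkpoint[key] = value
            self.checkpoint_id = entry_id
        self.wal.clear()

    def restart(self):
        """Rebuild state: load the checkpoint, then replay newer WAL entries."""
        state = dict(self.checkpoint)
        replayed = 0
        for entry_id, key, value in self.wal:
            if entry_id > self.checkpoint_id:
                state[key] = value
                replayed += 1
        return state, replayed


# Unsynchronized case: one update is checkpointed, one lives only in the WAL.
store = ToyStore()
store.write("a", 1)
store.do_checkpoint()
store.write("b", 2)
state, replayed = store.restart()
print(state, replayed)
```

In this model, a test that checks data after restart passes only if both the checkpointed update and the WAL-only update are visible, which is the property the restart tests in this plan verify against a real system.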
HW Requirements - Identify the hardware that will be required to execute the tests covered by the plan.
SW Components - DAOS, Python, Avocado, IOR, Dfuse, POSIX environment
Opens/Limitations - List any unresolved issues or limitations of the testing.
Test Description | JIRA | Priority | Test Steps | Resource | Release |
---|---|---|---|---|---|
Verify data access after engine restart w/ WAL replay + w/ checkpointing (unsynchronized WAL & VOS) | DAOS-13009 | P1 | 1. Start 2 DAOS servers with 1 engine per server. 2. Create a single pool and container. 3. Run ior w/ DFS to populate the container with data. 4. After ior has completed, shut down every engine cleanly (dmg system stop). 5. Restart each engine (dmg system start). 6. Verify the previously written data matches with an ior read. | | 2.4 |
Verify POSIX data access after engine restart (check modification timestamp?) | DAOS-13010 | P1 | 1. Start 2 DAOS servers with 1 engine per server. 2. Create a single pool and a POSIX container. 3. Start dfuse. 4. Write data to the dfuse mount point and then read it back. 5. After the read has completed, unmount dfuse. 6. Shut down every engine cleanly (dmg system stop). 7. Restart each engine (dmg system start). 8. Remount dfuse. 9. Verify the previously written data exists. 10. Verify more data can be written. | | 2.4 |
Verify device roles in dmg storage query output | DAOS-13011 | P1 | 1. Start 1 DAOS server with 1 engine per server. 2. Get a list of device information (dmg storage query list-devices). 3. Verify each device’s role entry matches the expected value based upon the server storage configuration. | | 2.4 |
Verify data access after engine restart w/o WAL replay + w/ checkpointing (synchronized WAL & VOS) | DAOS-13012 | P2 | 1. Start 2 DAOS servers with 1 engine per server. 2. Create a single pool and container. 3. Run ior w/ DFS to populate the container with data. 4. After ior has completed, shut down every engine cleanly (dmg system stop). 5. Restart each engine (dmg system start). 6. Verify the previously written data matches with an ior read. | DAOS-13016 | 2.4 |
Verify data access after engine restart w/ WAL replay + w/o checkpointing (unsynchronized WAL & VOS) | DAOS-13013 | P2 | 1. Start 2 DAOS servers with 1 engine per server. 2. Create a single pool and container. 3. Run ior w/ DFS to populate the container with a small amount of data. 4. After ior has completed, shut down every engine cleanly (dmg system stop). 5. Restart each engine (dmg system start). 6. Verify the previously written data matches with an ior read. | DAOS-13017 | 2.4 |
Verify snapshots after engine restart | DAOS-13014 | P2 | 1. Start 2 DAOS servers with 1 engine per server. 2. Create a single pool and a container in the pool. 3. Run ior w/ DFS to populate the container with persistent data, then create a snapshot (daos container create-snap). Repeat until three snapshots have been created. 4. Verify all three snapshots exist (daos container list-snaps). 5. Remove the second snapshot (daos container destroy-snap). 6. Verify that two snapshots exist (daos container list-snaps). 7. Shut down every engine cleanly (dmg system stop --force). 8. Restart each engine (dmg system start). 9. Verify all engines have joined (dmg system query). 10. Verify that two snapshots exist (daos container list-snaps). 11. Remove the two snapshots (daos container destroy-snap). 12. Verify that no snapshots exist (daos container list-snaps). | | 2.4 |
Verify pool & container attributes after engine restart | DAOS-13015 | P2 | 1. Start 3 DAOS servers with 1 engine on each server. 2. Create multiple pools and containers. 3. List the current pool and container attributes. 4. Modify at least one different attribute on each pool and container. 5. Shut down every engine cleanly (dmg system stop). 6. Restart each engine (dmg system start). 7. Verify each modified pool and container attribute is still set. | | 2.4 |
Verify the metrics that track md_on_ssd activity | DAOS-11626 | P2 | test_wal_commit_metrics: 1. Start 2 DAOS servers with 1 engine on each server. 2. Verify the engine_dmabuff_wal_* metrics are 0. 3. Create a pool. 4. Verify the engine_dmabuff_wal_sz metric is greater than 0. 5. Verify the engine_dmabuff_wal_waiters metric is 0. test_wal_reply_metrics: 1. Start 2 DAOS servers with 1 engine on each server. 2. Verify the engine_pool_vos_rehydration_replay_* metrics are 0. 3. Create a pool. 4. Verify the engine_pool_vos_rehydration_replay_count metric is 1. | | 2.6 |
Run a subset of PR tests with MD on SSD in master PRs | DAOS-13530 | P1 | <No test steps> | | 2.6 |
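Most of the tests above share the same stop, start, and query cycle. The following is a minimal sketch of how that cycle could be scripted, assuming `dmg` is on the PATH and already configured for the system under test. The function names `restart_commands` and `run_restart_cycle` are hypothetical helpers, not part of DAOS or its Avocado test framework; only the `dmg system stop`, `dmg system start`, `dmg system query`, and `--force` spellings come from the test steps.

```python
import subprocess


def restart_commands(force=False):
    """Return the dmg command lines for one stop/start/query cycle."""
    stop = ["dmg", "system", "stop"]
    if force:
        stop.append("--force")
    return [stop, ["dmg", "system", "start"], ["dmg", "system", "query"]]


def run_restart_cycle(runner=None, force=False):
    """Run each command in order and raise on the first non-zero exit code.

    runner is any callable that takes an argv list and returns an exit code.
    When omitted, the commands are executed for real with subprocess.
    """
    if runner is None:
        def runner(argv):
            return subprocess.run(argv, check=False).returncode
    for argv in restart_commands(force):
        rc = runner(argv)
        if rc != 0:
            raise RuntimeError(f"{' '.join(argv)} exited with {rc}")
    return True


# Dry run: print the commands instead of executing them.
run_restart_cycle(runner=lambda argv: print(" ".join(argv)) or 0)
```

Passing the runner in as a parameter keeps the sequence checkable without a live DAOS system; a real test would still need to verify the data, snapshots, or attributes after the cycle completes.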
Reviewed and Approved By | |
---|---|
Test Engineer | Phil Henderson |
Feature Developer | |
Date | Initial review on Feb 9, 2023 |
Feature Design Document |
---|
*Link to design document for feature under test |
*Functional testing - Full functionality tests that verify all corner cases and can be automated in CI. To be run in either the daily or weekly build, where it makes sense.
*Negative testing - Ensures the application can gracefully handle invalid input or unexpected user behavior. To be run in either the daily or weekly build, where it makes sense.
*Stress testing - Endurance tests that check resource limitations. To be run on the weekly build or manually.