Various test runs (e.g. the weekly test run, or master test runs after a merge) are not directly associated with user activity and therefore have no clear responsible party when something goes wrong. With no
clear owner, problems that occur on these test runs but not elsewhere (e.g. on a PR run with a clear owner) can go undetected and uninvestigated for too long. To combat this problem, a member
of the test team is assigned as owner on a rotating basis. Responsibilities are as follows:
- Review test run output on a daily basis. Any failed run requires follow-up.
- Investigate test failures. If a test failure has an existing ticket, make sure it is properly identified as a "CI Issue". If there is no existing ticket, create one using the template below.
- Bring failures that do not receive prompt action to the attention of management and/or the DAOS triage team.
Use the following template when creating a new ticket for a test failing in CI:
Project: <CaRT | "CORAL - CI" | DAOS>
Issue type: Bug
Labels: flaky
Fill in Bug Exposure, Bug Quality, Bug Type accordingly.
Summary: Provide a summary of the issue
Description: Please include the following in the description.
Failed Stage: <Build | Unit Test | Test>
Failed Build/Test:
For a build failure, list the build stage that failed, e.g. "Build RPM on CentOS 7", "Build RPM on Leap 15", etc.
For a test failure, list the test folder, source (and variant when possible), e.g. daos_test/daos_core_test.py - DAOS degraded-mode tests
Branch: <master | name_of_branch>
Commit: <commit hash>
Include the stack trace and error message.
Attach debug logs.
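
To keep tickets consistent across owners, the template above could be filled in by a small helper. The following is only a sketch, not an existing tool: the `CiFailure` class, the `render_ticket` function, and all field names are hypothetical, and the output is plain text to paste into the ticket form rather than a call to any ticketing API. The allowed project and stage values are taken from the template.

```python
from dataclasses import dataclass

# Allowed values copied from the ticket template above.
VALID_PROJECTS = ("CaRT", "CORAL - CI", "DAOS")
VALID_STAGES = ("Build", "Unit Test", "Test")


@dataclass
class CiFailure:
    """Hypothetical container for the fields the template asks for."""
    project: str
    summary: str
    failed_stage: str
    failed_item: str   # build stage name, or test folder/source/variant
    branch: str
    commit: str
    error_text: str    # stack trace and error message


def render_ticket(failure: CiFailure) -> str:
    """Return the ticket text for a CI failure, following the template."""
    if failure.project not in VALID_PROJECTS:
        raise ValueError(f"unknown project: {failure.project!r}")
    if failure.failed_stage not in VALID_STAGES:
        raise ValueError(f"unknown stage: {failure.failed_stage!r}")
    lines = [
        f"Project: {failure.project}",
        "Issue type: Bug",
        "Labels: flaky",
        f"Summary: {failure.summary}",
        "Description:",
        f"Failed Stage: {failure.failed_stage}",
        f"Failed Build/Test: {failure.failed_item}",
        f"Branch: {failure.branch}",
        f"Commit: {failure.commit}",
        "",
        failure.error_text,
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative values only; the commit is left as a placeholder.
    example = CiFailure(
        project="DAOS",
        summary="RPM build fails on master",
        failed_stage="Build",
        failed_item="Build RPM on CentOS 7",
        branch="master",
        commit="<commit hash>",
        error_text="<stack trace and error message>",
    )
    print(render_ticket(example))
```

Bug Exposure, Bug Quality, and Bug Type are left out of the sketch because their values depend on the individual failure and must still be filled in by hand, and debug logs still need to be attached separately.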