We have an automated test infrastructure for building and running DAOS tests on Frontera.
Please see below for instructions on building and running tests.
Citizenship
Before doing any work on Frontera, you should read and understand Citizenship on Frontera.
You should also be aware of the limited credits (SUs) for running jobs. After logging in, you can check your balance:
```shell
/usr/local/etc/taccinfo
--------------------- Project balances for user dbohninx ----------------------
| Name           Avail SUs     Expires    |
| STAR-Intel     #####         YYYY-MM-DD |
```
Initial Setup
All of these setup instructions should be run on a login node (e.g. login3.frontera).
Add your local binary directory to your PATH
Add the following line to ~/.bashrc. There should be an if block labeled "SECTION 2" where you should put it:

```shell
export PATH=$HOME/.local/bin:$PATH
```
Setup directories and clone the test scripts
```shell
mkdir -p ${WORK}/{BUILDS,TESTS,RESULTS,WEEKLY_RESULTS,TOOLS}
cd ${WORK}/TESTS
git clone https://github.com/daos-stack/daos_scaled_testing
```
Install python dependencies for post-job scripts
These packages are required by some of the .py scripts that post-process results.
```shell
cd ${WORK}/TESTS/daos_scaled_testing
python3 -m pip install --upgrade pip
python3 -m pip install --user -r python3_requirements.txt
```
Build MPI packages - Optional
By default, the system-installed MVAPICH2 is used and recommended. If you want to use MPICH or OpenMPI instead, they must be built from source.
Since we only build with a single core on login nodes (remember Citizenship on Frontera), this may take a while to complete.
```shell
cd ${WORK}/TESTS/daos_scaled_testing/frontera
./build_and_install_tools.sh
```
This script is not well maintained and may need adjustment.
Build DAOS
Edit run_build.sh:
```shell
vim run_build.sh
```
Configure these lines:
```shell
BUILD_DIR="${WORK}/BUILDS/"
DAOS_BRANCH="master"
```
Optionally, you can build a specific branch, commit, or cherry-pick.
When executed on a login node, run_build.sh will only use a single process, so it is recommended to build on a development node instead. The script builds DAOS in <BUILD_DIR>/<date>/daos and also builds the latest hpc/ior.
```shell
idev   # wait for the development session to start
./run_build.sh
```
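Since run_testlist.py later points DAOS_DIR at ${WORK}/BUILDS/latest/daos while builds land in dated directories, a "latest" symlink must track the newest build. A minimal sketch of such a helper is below; this is an illustration only, and the actual layout maintained by run_build.sh may differ:

```python
import os

def update_latest_symlink(build_dir):
    """Point <build_dir>/latest at the most recent dated build directory.

    Hypothetical helper: assumes subdirectories are named by ISO date
    (e.g. 2024-02-01), so lexicographic sort equals chronological sort.
    """
    dates = sorted(
        d for d in os.listdir(build_dir)
        if os.path.isdir(os.path.join(build_dir, d)) and d != 'latest'
    )
    if not dates:
        return None
    link = os.path.join(build_dir, 'latest')
    if os.path.islink(link):
        os.remove(link)
    os.symlink(dates[-1], link)  # relative symlink inside build_dir
    return dates[-1]
```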
Running Tests
The test script is executed on a login node; it uses Slurm to reserve nodes and run the jobs.
Configure run_testlist.py:
```shell
cd ${WORK}/TESTS/daos_scaled_testing/frontera
vim run_testlist.py
...
# Configure these lines
env['JOBNAME']  = "<sbatch_jobname>"
env['DAOS_DIR'] = abspath(expandvars("${WORK}/BUILDS/latest/daos"))  # Path to daos
env['RES_DIR']  = abspath(expandvars("${WORK}/RESULTS"))             # Path to test results
```
Configure DAOS Configs
The DAOS configs are daos_scaled_testing/frontera/daos_{server,agent,control}.yml.

The client / test runner environment is defined in daos_scaled_testing/frontera/env_daos.
Test Variants
Test configs are in the daos_scaled_testing/frontera/tests directory. Each config is a Python file that defines a list of test variants to run. For example, in tests/sanity/ior.py:
```python
'''
IOR sanity tests.
Defines a list 'tests' containing dictionary items of tests.
'''

# Default environment variables used by each test
env_vars = {
    'pool_size': '85G',
    'chunk_size': '1M',
    'segments': '1',
    'xfer_size': '1M',
    'block_size': '150G',
    'sw_time': '5',
    'iterations': '1',
    'ppc': 32
}

# List of tests
tests = [
    {
        'test_group': 'IOR',
        'test_name': 'ior_sanity',
        'oclass': 'SX',
        'scale': [
            # (num_servers, num_clients, timeout_minutes)
            (1, 1, 1),
        ],
        'env_vars': dict(env_vars),
        'enabled': True
    },
]
```
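Each 'scale' tuple yields one job at that server/client count. As a rough sketch of how a runner could consume such a config (this is illustrative only; the real run_testlist.py may expand tests differently):

```python
def expand_variants(tests):
    """Expand each enabled test into one variant per 'scale' tuple.

    Hypothetical expansion logic, based on the config structure shown above.
    """
    variants = []
    for test in tests:
        if not test.get('enabled'):
            continue  # only tests with 'enabled': True are run
        for num_servers, num_clients, timeout in test['scale']:
            variants.append({
                'test_name': test['test_name'],
                'oclass': test['oclass'],
                'num_servers': num_servers,
                'num_clients': num_clients,
                'timeout_minutes': timeout,
                'env_vars': dict(test['env_vars']),  # copy per variant
            })
    return variants
```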
These parameters can be configured as desired. Only tests with 'enabled': True will run. To execute some tests:
```shell
$ ./run_testlist.py tests/sanity/ior.py
Importing tests from tests/sanity/ior.py
001. Running ior_sanity SX, 1 servers, 1 clients, 1048576 ec_ell_size ... Submitted batch job 3728480
```
You can then monitor the status of your jobs in the queue:
```shell
showq -u
```
Get test results
When all the sbatch jobs complete, you should see the results at <RES_DIR>/<date>. You can extract the IOR and MdTest results into CSV format by running:
```shell
cd ${WORK}/TESTS/daos_scaled_testing/frontera

# Get all ior and mdtest results
./get_results.py <RES_DIR>

# Get all ior and mdtest results, and email the result CSVs
./get_results.py <RES_DIR> --email first.last@intel.com
```
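If you want to combine several extracted CSVs into one file for later analysis, a small helper like the one below could do it. The file-naming pattern is an assumption for illustration; check what get_results.py actually writes:

```python
import csv
import glob
import os

def merge_csvs(result_dir, pattern, out_path):
    """Concatenate CSV files matching pattern under result_dir into one CSV.

    Assumes all inputs share the same header row (written once).
    The pattern argument (e.g. 'ior_*.csv') is hypothetical.
    """
    paths = sorted(glob.glob(os.path.join(result_dir, pattern)))
    header = None
    with open(out_path, 'w', newline='') as out:
        writer = csv.writer(out)
        for path in paths:
            with open(path, newline='') as f:
                rows = list(csv.reader(f))
            if not rows:
                continue
            if header is None:
                header = rows[0]
                writer.writerow(header)
            writer.writerows(rows[1:])  # skip each file's header
    return len(paths)
```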
Storing and retrieving tests in the database
This is currently a work in progress. See https://github.com/daos-stack/daos_scaled_testing/blob/master/database/README.md for general usage.
Running the Validation Suite
TODO