We have an automated test infrastructure for building and running DAOS tests on Frontera.
Please see below for instructions on building and running tests.
Before doing any work on Frontera, you should read and understand Citizenship on Frontera.
You should also be aware of the limited credits for running jobs. After logging in, you can check your balance by running:
```
$ /usr/local/etc/taccinfo
---------------------- Project balances for user dbohninx ----------------------
| Name            Avail SUs    Expires    |
| STAR-Intel      #####        YYYY-MM-DD |
```
All of these setup instructions should be run on a login node (e.g. login3.frontera).
Add the following line to ~/.bashrc. There should be an if block labeled "SECTION 2" where you should put it:

```
export PATH=$HOME/.local/bin:$PATH
```
```
mkdir -p ${WORK}/{BUILDS,TESTS,RESULTS,WEEKLY_RESULTS,TOOLS}
cd ${WORK}/TESTS
git clone https://github.com/daos-stack/daos_scaled_testing
```
These packages are required by some of the .py scripts that post-process results:

```
cd ${WORK}/TESTS/daos_scaled_testing
python3 -m pip install --upgrade pip
python3 -m pip install --user -r python3_requirements.txt
```
By default, the system-installed MVAPICH2 is used and recommended. If you want to use MPICH or OpenMPI instead, they must be built from source.
Since we only build with a single core on login nodes (remember Citizenship on Frontera), this may take a while to complete.
```
cd ${WORK}/TESTS/daos_scaled_testing/frontera
./build_and_install_tools.sh
```
This script is not well maintained and may need adjustment.
Edit run_build.sh:
```
vim run_build.sh
```
Configure these lines:
```
BUILD_DIR="${WORK}/BUILDS/"
DAOS_BRANCH="master"
```
Optionally, you can choose to build a specific branch, commit, or cherry-pick.
When executed on a login node, run_build.sh will only use a single process, so it is recommended to build on a development node instead. The script builds DAOS in <BUILD_DIR>/<date>/daos and builds the latest hpc/ior:

```
idev  # wait for the development node allocation
./run_build.sh
```
The test script should be executed on a login node; it uses Slurm to reserve nodes and run jobs.
```
cd ${WORK}/TESTS/daos_scaled_testing/frontera
vim run_testlist.py
...
# Configure these lines
env['JOBNAME'] = "<sbatch_jobname>"
env['DAOS_DIR'] = abspath(expandvars("${WORK}/BUILDS/latest/daos"))  # Path to daos
env['RES_DIR'] = abspath(expandvars("${WORK}/RESULTS"))  # Path to test results
```
The DAOS configs are daos_scaled_testing/frontera/daos_{server,agent,control}.yml.
The client / test runner environment is defined in daos_scaled_testing/frontera/env_daos.
Test configs are in the daos_scaled_testing/frontera/tests directory. Each config is a Python file that defines a list of test variants to run. For example, in tests/sanity/ior.py:
```
'''
IOR sanity tests.
Defines a list 'tests' containing dictionary items of tests.
'''

# Default environment variables used by each test
env_vars = {
    'pool_size': '85G',
    'chunk_size': '1M',
    'segments': '1',
    'xfer_size': '1M',
    'block_size': '150G',
    'sw_time': '5',
    'iterations': '1',
    'ppc': 32
}

# List of tests
tests = [
    {
        'test_group': 'IOR',
        'test_name': 'ior_sanity',
        'oclass': 'SX',
        'scale': [
            # (num_servers, num_clients, timeout_minutes)
            (1, 1, 1),
        ],
        'env_vars': dict(env_vars),
        'enabled': True
    },
]
```
These parameters can be configured as desired. Only tests with 'enabled': True will run. To execute some tests:

```
$ ./run_testlist.py tests/sanity/ior.py
Importing tests from tests/sanity/ior.py
001. Running ior_sanity SX, 1 servers, 1 clients, 1048576 ec_ell_size ...
Submitted batch job 3728480
```
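To illustrate how a runner might interpret such a config file, here is a hypothetical sketch (not the actual run_testlist.py implementation): it skips tests without 'enabled': True and expands each scale tuple into one job variant.

```python
# Hypothetical sketch of expanding a test config like tests/sanity/ior.py
# into individual job variants. Not the actual run_testlist.py logic.

env_vars = {'pool_size': '85G', 'ppc': 32}  # trimmed defaults for brevity

tests = [
    {
        'test_group': 'IOR',
        'test_name': 'ior_sanity',
        'oclass': 'SX',
        'scale': [
            # (num_servers, num_clients, timeout_minutes)
            (1, 1, 1),
            (2, 4, 10),
        ],
        'env_vars': dict(env_vars),
        'enabled': True
    },
    {
        'test_group': 'IOR',
        'test_name': 'ior_disabled',
        'oclass': 'SX',
        'scale': [(1, 1, 1)],
        'env_vars': dict(env_vars),
        'enabled': False  # skipped by the runner
    },
]

def expand_variants(tests):
    '''Yield one (name, servers, clients, timeout, env) tuple per variant.'''
    for test in tests:
        if not test.get('enabled'):
            continue  # only tests with 'enabled': True run
        for servers, clients, timeout in test['scale']:
            yield (test['test_name'], servers, clients, timeout,
                   test['env_vars'])

for name, servers, clients, timeout, env in expand_variants(tests):
    print(f"{name}: {servers} servers, {clients} clients, {timeout} min")
```

Each yielded variant corresponds to one sbatch job in the real workflow.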
You can then monitor the status of your jobs in the queue:
```
showq -u
```
When all the sbatch jobs complete, you should see the results at <RES_DIR>/<date>. You can extract the IOR and MdTest results into CSV format by running:
```
cd ${WORK}/TESTS/daos_scaled_testing/frontera

# Get all ior and mdtest results
./get_results.py <RES_DIR>

# Get all ior and mdtest results, and email the result CSVs
./get_results.py <RES_DIR> --email first.last@intel.com
```
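As a rough illustration of the kind of extraction get_results.py performs, the sketch below pulls bandwidth summary lines out of a log and writes CSV rows. The sample log lines and the regex are assumptions for illustration only; the real script handles full IOR and MdTest output.

```python
# Hypothetical sketch of turning IOR-style summary lines into CSV.
# Not the actual get_results.py logic; the sample log format below
# is illustrative, not guaranteed to match real IOR output.
import csv
import io
import re

sample_log = '''\
Max Write: 1234.56 MiB/sec
Max Read:  2345.67 MiB/sec
'''

def ior_log_to_csv(log_text):
    '''Return CSV text with one (operation, mib_per_sec) row per summary line.'''
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(['operation', 'mib_per_sec'])
    for op, value in re.findall(r'Max (Write|Read):\s+([\d.]+) MiB/sec',
                                log_text):
        writer.writerow([op.lower(), value])
    return out.getvalue()

print(ior_log_to_csv(sample_log))
```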
This is currently a work in progress. See https://github.com/daos-stack/daos_scaled_testing/blob/master/database/README.md for general usage.
TODO