Use the local workspace for running functional tests
- 1 Build DAOS from master
- 2 Install ftest Python Dependencies
- 3 Ensure sudo access
- 4 Perform some basic avocado/DAOS directory/mount cleanups
- 5 Install DAOS client RPMS
- 6 Setup environment variables
- 7 System Setup Not Handled by Source Builds
- 8 Run the daos_admin scripts
- 9 Update systemd files to use the local daos_server and daos_agent
- 10 Running launch.py
- 11 Known Issues
Build DAOS from master
cd $HOME
git clone https://github.com/daos-stack/daos.git --recurse-submodules
cd daos
MPI_PKG=any scons-3 BUILD_TYPE=dev PREFIX=$HOME/daos/install install --build-deps=yes --config=force -j 12
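After the build finishes, it can help to confirm that the install tree holds the binaries the later steps rely on. The following is a minimal sketch, assuming the binaries land under <prefix>/bin; the function name check_daos_install is hypothetical and not part of DAOS.

```shell
# Hypothetical helper (not part of DAOS): report which expected binaries
# are missing from an install prefix. Assumes binaries live in <prefix>/bin.
check_daos_install() {
    cdi_prefix="$1"
    cdi_missing=0
    for cdi_bin in daos_server daos_agent; do
        if [ ! -x "$cdi_prefix/bin/$cdi_bin" ]; then
            echo "missing: $cdi_prefix/bin/$cdi_bin"
            cdi_missing=1
        fi
    done
    # Non-zero return means at least one binary was not found.
    return "$cdi_missing"
}
```

For the build command above, this would be called as: check_daos_install "$HOME/daos/install"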
Install ftest Python Dependencies
It is recommended to use a Python virtual environment. Create one with:
python3 -m venv venv_ftest
Use it:
source venv_ftest/bin/activate
And install the dependencies with:
python3 -m pip install --upgrade pip
python3 -m pip install -r "$DAOS_PREFIX"/lib/daos/TESTING/ftest/requirements-ftest.txt
Additionally, install pydaos with:
python3 -m pip install "$DAOS_PREFIX"/lib/daos/python
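Since the pip commands above install into whichever Python environment is active, a quick guard can catch a forgotten activation. This is a minimal sketch; require_venv is a hypothetical name, and it relies only on the VIRTUAL_ENV variable that the venv activate script exports.

```shell
# Hypothetical helper: succeed only when a Python virtual environment is
# active. The venv "activate" script exports VIRTUAL_ENV.
require_venv() {
    if [ -z "${VIRTUAL_ENV:-}" ]; then
        echo "no virtual environment active; run: source venv_ftest/bin/activate" >&2
        return 1
    fi
    return 0
}
```

Run require_venv before the pip install commands and stop if it fails.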
Ensure sudo access
You will need sudo access on all nodes you plan to use. For example:
cat >> /etc/sudoers.d/<idsid> <<EOF
<idsid> ALL=(ALL) NOPASSWD: ALL
EOF
chmod 660 /etc/sudoers.d/<idsid>
Perform some basic avocado/DAOS directory/mount cleanups
clush -w $NODES 'sudo killall -9 avocado orterun mpirun orted daos_server daos_io_server daos_agent'
clush -w $NODES 'sudo rm -rf /var/run/daos_server/*'
You can also remove /var/run/daos_server and /var/run/daos_agent and recreate them.
Optionally, you can mount tmpfs:
sudo mount -t tmpfs -o size=128G tmpfs /mnt/daos
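To confirm that the tmpfs mount took effect, the mount table can be inspected. The sketch below is one way to do that, assuming a Linux /proc/mounts style table; mount_fstype is a hypothetical helper, and the table path is a parameter only so the function can be tried on a sample file.

```shell
# Hypothetical helper: print the filesystem type of a mount point by reading
# a mounts table in /proc/mounts format (device mountpoint fstype options ...).
# If the mount point appears more than once, the last entry wins.
mount_fstype() {
    mf_point="$1"
    mf_table="${2:-/proc/mounts}"
    awk -v mp="$mf_point" \
        '$2 == mp { fstype = $3 } END { if (fstype != "") print fstype }' \
        "$mf_table"
}
```

For example, mount_fstype /mnt/daos should print tmpfs once the mount above is in place, and print nothing if /mnt/daos is not a mount point.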
Install DAOS client RPMS
You can either build mpich and ior by following these procedures:
MPI-IO DAOS Driver (Setup Guide)
https://daosio.atlassian.net/wiki/spaces/DAOS/pages/2183562861/HowTo+run+IOR
Or you can install the DAOS client RPMs:
sudo yum install -y daos-client
sudo yum install -y ior-hpc # Installing ior-hpc on its own did not work for me (package conflicts)
sudo yum install -y mpich
Get the package versions installed with:
rpm -qa | grep daos
For example:
daos-client-1.3.104-4.6879.g75a46e58.el7.x86_64
daos-server-1.3.104-4.6879.g75a46e58.el7.x86_64
daos-1.3.104-4.6879.g75a46e58.el7.x86_64
Remove just the client packages from client nodes, but not ior and mpich:
clush -w $CLIENT_NODES sudo rpm -e --nodeps daos-client-1.3.104-4.6879.g75a46e58.el7.x86_64
clush -w $CLIENT_NODES sudo rpm -e --nodeps daos-1.3.104-4.6879.g75a46e58.el7.x86_64
Remove just the server packages from server nodes:
clush -w $SERVER_NODES sudo rpm -e --nodeps daos-1.3.104-4.6879.g75a46e58.el7.x86_64
clush -w $SERVER_NODES sudo rpm -e --nodeps daos-server-1.3.104-4.6879.g75a46e58.el7.x86_64
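The removal commands above hardcode one package version. As a sketch of how the names could be derived from the rpm -qa listing instead, the hypothetical filter below selects the packages for a role; it assumes package names follow the daos-<version>, daos-client-<version>, daos-server-<version> pattern shown in the example output.

```shell
# Hypothetical helper: read `rpm -qa` style package names on stdin and print
# the DAOS packages to remove for a role ("client" or "server").
# The base "daos" package is selected for both roles, as in the steps above.
select_daos_pkgs() {
    case "$1" in
        client|server)
            # Match "daos-<digit>..." and "daos-<role>-<digit>..." only.
            grep -E "^daos-($1-)?[0-9]"
            ;;
        *)
            echo "usage: select_daos_pkgs client|server" >&2
            return 2
            ;;
    esac
}
```

For example, rpm -qa | select_daos_pkgs client prints the names to pass to rpm -e --nodeps on the client nodes.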
Setup environment variables
Use these scripts to set up the environment, including SL_PREFIX, etc.:
<daos>/utils/setup_local.sh
<daos>/.build_vars.sh
Or
<daos>/install/lib/daos/.build_vars.sh
Make a copy and source the vars:
cp .build_vars.sh <daos>/utils/.build_vars-Linux.sh
source setup_local.sh .build_vars-Linux.sh
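After sourcing the scripts, it is worth checking that the expected variables are actually set. The sketch below is a hypothetical helper, not part of DAOS; it takes variable names as arguments and reports any that are unset or empty.

```shell
# Hypothetical helper: report any of the named environment variables that
# are unset or empty. Returns non-zero if at least one is missing.
# Pass only trusted variable names, since eval is used for the lookup.
require_vars() {
    rv_missing=0
    for rv_name in "$@"; do
        eval "rv_value=\${$rv_name:-}"
        if [ -z "$rv_value" ]; then
            echo "not set: $rv_name" >&2
            rv_missing=1
        fi
    done
    return "$rv_missing"
}
```

For example: require_vars SL_PREFIX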
System Setup Not Handled by Source Builds
On each client:
sudo useradd daos_agent -M
sudo mkdir -p /etc/daos/certs
And on each server:
sudo useradd daos_server -M
sudo mkdir -p /etc/daos/certs/clients
Run the daos_admin scripts
cp ~/<daos>/install/lib/daos/.build_vars.sh ~/<daos>/utils/sl/.build_vars-Linux.sh
cd ~/<daos>/utils/sl
source setup_local.sh .build_vars-Linux.sh
cd ~/<daos>/utils/
sudo -E ./setup_daos_admin.sh
chmod -x ~/<daos>/install/bin/daos_admin
Update systemd files to use the local daos_server and daos_agent
Copy daos_server.service and daos_agent.service from <daos>/utils/systemd/ to /usr/lib/systemd/system/
Update ExecStart with the absolute path to your binaries:
sed -i "s|ExecStart=.*|ExecStart=$(which daos_server) start -o /etc/daos/daos_server.yml|g" /usr/lib/systemd/system/daos_server.service
sed -i "s|ExecStart=.*|ExecStart=$(which daos_agent) start -o /etc/daos/daos_agent.yml|g" /usr/lib/systemd/system/daos_agent.service
Reload the systemd files:
sudo systemctl daemon-reload
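Because the ExecStart substitution edits the unit files in place, it may be useful to preview the result first. The sketch below applies the same substitution without writing to the file; preview_execstart is a hypothetical helper, and it assumes the paths contain no | or & characters, which would interfere with the sed expression.

```shell
# Hypothetical helper: print a unit file with its ExecStart line rewritten,
# without modifying the file, so the result can be reviewed first.
preview_execstart() {
    pe_unit="$1"     # path to a .service file
    pe_binary="$2"   # absolute path to daos_server or daos_agent
    pe_config="$3"   # path to the matching .yml config
    sed "s|ExecStart=.*|ExecStart=$pe_binary start -o $pe_config|" "$pe_unit"
}
```

For example: preview_execstart /usr/lib/systemd/system/daos_server.service "$(which daos_server)" /etc/daos/daos_server.yml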
Running launch.py
You can now run launch.py from the DAOS install directory: <daos>/install/lib/daos/TESTING/ftest.
For example, to run a basic test that confirms setup:
python3 ./launch.py -ts $SERVER_NODES -tc $CLIENT_NODES test_setup
See Using launch.py and Test Tags for details.
For sample scripts which you can tailor to your needs, see the /home/rpadma2/local_testing/local_testing.tar file (wolf node).
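If $SERVER_NODES or $CLIENT_NODES is empty, the launch.py command line above silently changes shape. A small wrapper can guard against that. This is a sketch only: launch_cmd is a hypothetical name, it uses just the -ts and -tc options shown above, and it prints the command rather than running it.

```shell
# Hypothetical wrapper: print the launch.py command line for one or more
# tags, refusing to proceed when either node list is empty.
launch_cmd() {
    if [ "$#" -lt 3 ] || [ -z "$1" ] || [ -z "$2" ]; then
        echo "usage: launch_cmd <server_nodes> <client_nodes> <tag>..." >&2
        return 2
    fi
    lc_servers="$1"
    lc_clients="$2"
    shift 2
    # Remaining arguments are passed through as test tags.
    echo "python3 ./launch.py -ts $lc_servers -tc $lc_clients $*"
}
```

For example: launch_cmd "$SERVER_NODES" "$CLIENT_NODES" test_setup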
Known Issues
openmpi/mellanox drivers on IO500
Create the module file /etc/modulefiles/mpi/mlnx_openmpi-x86_64 and load it with module load mlnx_openmpi-x86_64:
#%Module 1.0
conflict mpi
prepend-path PATH /usr/mpi/gcc/openmpi-4.1.0rc5/bin
prepend-path LD_LIBRARY_PATH /usr/mpi/gcc/openmpi-4.1.0rc5/lib64
prepend-path PKG_CONFIG_PATH /usr/mpi/gcc/openmpi-4.1.0rc5/lib64/pkgconfig
prepend-path PYTHONPATH /usr/lib64/python2.7/site-packages/openmpi
prepend-path MANPATH /usr/mpi/gcc/openmpi-4.1.0rc5/share/man
setenv MPI_BIN /usr/mpi/gcc/openmpi-4.1.0rc5/bin
setenv MPI_SYSCONFIG /usr/mpi/gcc/openmpi-4.1.0rc5/etc
setenv MPI_FORTRAN_MOD_DIR /usr/mpi/gcc/openmpi-4.1.0rc5/lib64
setenv MPI_INCLUDE /usr/mpi/gcc/openmpi-4.1.0rc5/include
setenv MPI_LIB /usr/mpi/gcc/openmpi-4.1.0rc5/lib64
setenv MPI_MAN /usr/mpi/gcc/openmpi-4.1.0rc5/share/man
setenv MPI_PYTHON_SITEARCH /usr/lib64/python2.7/site-packages/openmpi
setenv MPI_PYTHON2_SITEARCH /usr/lib64/python2.7/site-packages/openmpi
setenv MPI_COMPILER openmpi-x86_64
setenv MPI_SUFFIX _openmpi
setenv MPI_HOME /usr/mpi/gcc/openmpi-4.1.0rc5
Building mpich and ior (not using RPMs)
mpich and ior must be included in $PATH, $LD_LIBRARY_PATH, and $CPATH.
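One way to add a locally built mpich or ior to these variables is sketched below. prepend_prefix is a hypothetical helper, and it assumes the usual bin/, lib/, and include/ layout under the install prefix; adjust the subdirectories if your build uses lib64 or another layout.

```shell
# Hypothetical helper: prepend one install prefix to PATH, LD_LIBRARY_PATH,
# and CPATH. Assumes bin/, lib/, and include/ directories under the prefix.
prepend_prefix() {
    pp_prefix="$1"
    # ${VAR:+:$VAR} appends the old value only if it was non-empty,
    # which avoids a dangling ":" in a previously unset variable.
    PATH="$pp_prefix/bin${PATH:+:$PATH}"
    LD_LIBRARY_PATH="$pp_prefix/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    CPATH="$pp_prefix/include${CPATH:+:$CPATH}"
    export PATH LD_LIBRARY_PATH CPATH
}
```

Call it once per install prefix, for example once for the mpich prefix and once for the ior prefix.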