Use the local workspace for running functional tests

Build DAOS from master

cd $HOME
git clone https://github.com/daos-stack/daos.git --recurse-submodules
cd daos
MPI_PKG=any scons-3 BUILD_TYPE=dev PREFIX=$HOME/daos/install install --build-deps=yes --config=force -j 12

Install ftest Python Dependencies

Using a Python virtual environment is recommended. Create one with:

python3 -m venv venv_ftest

Use it:

source venv_ftest/bin/activate

And install the dependencies with:

python3 -m pip install --upgrade pip
python3 -m pip install -r "$DAOS_PREFIX"/lib/daos/TESTING/ftest/requirements-ftest.txt

Additionally, install pydaos with:

python3 -m pip install "$DAOS_PREFIX"/lib/daos/python
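Before running the pip commands above, it can help to confirm that the virtual environment is actually active, so the packages do not land in the system Python. The following is a minimal sketch; the helper name in_venv is hypothetical and is not part of DAOS or ftest.

```shell
# Succeeds only when a virtual environment has been activated in this shell.
# The venv "activate" script exports VIRTUAL_ENV; "deactivate" unsets it.
in_venv() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

# Example guard (commented out so the snippet has no side effects):
# in_venv || { echo "activate venv_ftest first" >&2; exit 1; }
```

This only checks the environment variable; it does not verify that the python3 on $PATH belongs to venv_ftest.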

Ensure sudo access

You will need sudo access on all nodes you plan to use. For example:

cat >> /etc/sudoers.d/<idsid> <<EOF
<idsid> ALL=(ALL) NOPASSWD: ALL
EOF
chmod 660 /etc/sudoers.d/<idsid>

Perform some basic avocado/DAOS directory/mount cleanups

clush -w $NODES 'sudo killall -9 avocado orterun mpirun orted daos_server daos_io_server daos_agent'
clush -w $NODES 'sudo rm -rf /var/run/daos_server/*'

You can also remove /var/run/daos_server and /var/run/daos_agent entirely and recreate them.

Optionally, you can mount tmpfs:

sudo mount -t tmpfs -o size=128G tmpfs /mnt/daos
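Running the mount command twice stacks a second tmpfs on the same directory, so a check before mounting may be useful. The sketch below reads a /proc/mounts-style table; the helper name is_mounted is hypothetical, and the optional second argument exists only so the check can be exercised against a sample file.

```shell
# Succeeds if the given directory appears as a mount point (second
# whitespace-separated field) in a /proc/mounts-style table.
# $1: mount point to look for
# $2: table to read (defaults to /proc/mounts)
is_mounted() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Example use (commented out so the snippet has no side effects):
# is_mounted /mnt/daos || sudo mount -t tmpfs -o size=128G tmpfs /mnt/daos
```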

Install DAOS client RPMs

You can either build mpich and ior yourself (see "Building mpich and ior (not using RPMs)" under Known Issues).

Or you can install the DAOS client RPMs:

sudo yum install -y daos-client
sudo yum install -y ior-hpc # Installing ior-hpc on its own did not work in the author's testing (package conflicts)
sudo yum install -y mpich

Get the package versions installed with:

rpm -qa | grep daos

For example:

  • daos-client-1.3.104-4.6879.g75a46e58.el7.x86_64

  • daos-server-1.3.104-4.6879.g75a46e58.el7.x86_64

  • daos-1.3.104-4.6879.g75a46e58.el7.x86_64

Remove just the client packages from client nodes, but not ior and mpich:

clush -w $CLIENT_NODES sudo rpm -e --nodeps daos-client-1.3.104-4.6879.g75a46e58.el7.x86_64
clush -w $CLIENT_NODES sudo rpm -e --nodeps daos-1.3.104-4.6879.g75a46e58.el7.x86_64

Remove just the server packages from server nodes:

clush -w $SERVER_NODES sudo rpm -e --nodeps daos-1.3.104-4.6879.g75a46e58.el7.x86_64
clush -w $SERVER_NODES sudo rpm -e --nodeps daos-server-1.3.104-4.6879.g75a46e58.el7.x86_64
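The version strings in the removal commands change with every build, so it may be easier to derive the full package names from the rpm -qa output instead of typing them. The sketch below filters such a listing by exact base name, so that asking for daos does not also pick up daos-client or daos-server; the helper name select_pkgs is hypothetical.

```shell
# Read package names (one per line, as printed by `rpm -qa`) on stdin and
# print those whose base name is exactly one of the arguments.
# A match is "<name>-<version>...", where the version starts with a digit.
select_pkgs() (
    while IFS= read -r pkg; do
        for name in "$@"; do
            case "$pkg" in
                "$name"-[0-9]*) printf '%s\n' "$pkg" ;;
            esac
        done
    done
)

# Example use (commented out so the snippet has no side effects):
# rpm -qa | select_pkgs daos daos-client
```

The printed names can then be passed to the rpm -e --nodeps commands above, assuming the same package versions are installed on every node in the group.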

Set up environment variables

Use these scripts to set up the environment, including SL_PREFIX and related variables.

  • <daos>/utils/setup_local.sh

  • <daos>/.build_vars.sh

    • Or <daos>/install/lib/daos/.build_vars.sh

Make a copy and source the vars:

cp .build_vars.sh <daos>/utils/.build_vars-Linux.sh
source setup_local.sh .build_vars-Linux.sh
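After sourcing the scripts, a quick check that the expected variables are populated can catch a wrong path early. This is a minimal sketch; the helper name missing_vars is hypothetical, and SL_PREFIX is used in the example only because it is the variable named above.

```shell
# Print the name of every listed variable that is unset or empty.
# Exit status is 0 only if all listed variables have a value.
# Arguments must be valid shell variable names, since they are passed to eval.
missing_vars() (
    rc=0
    for name in "$@"; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            printf '%s\n' "$name"
            rc=1
        fi
    done
    exit "$rc"
)

# Example use (commented out so the snippet has no side effects):
# missing_vars SL_PREFIX || echo "environment is not set up" >&2
```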

System Setup Not Handled by Source Builds

On each client:

sudo useradd daos_agent -M
sudo mkdir -p /etc/daos/certs

And on each server:

sudo useradd daos_server -M
sudo mkdir -p /etc/daos/certs/clients

Run the daos_admin scripts

cp ~/<daos>/install/lib/daos/.build_vars.sh ~/<daos>/utils/sl/.build_vars-Linux.sh
cd ~/<daos>/utils/sl
source setup_local.sh .build_vars-Linux.sh
cd ~/<daos>/utils/
sudo -E ./setup_daos_admin.sh
chmod -x ~/<daos>/install/bin/daos_admin

Update systemd files to use the local daos_server and daos_agent

  • Copy daos_server.service, daos_agent.service from <daos>/utils/systemd/ to /usr/lib/systemd/system/

  • Update ExecStart with the absolute path to your binaries:

    sed -i "s|ExecStart=.*|ExecStart=$(which daos_server) start -o /etc/daos/daos_server.yml|g" /usr/lib/systemd/system/daos_server.service
    sed -i "s|ExecStart=.*|ExecStart=$(which daos_agent) start -o /etc/daos/daos_agent.yml|g" /usr/lib/systemd/system/daos_agent.service

  • Reload the systemd files:

    sudo systemctl daemon-reload
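The ExecStart substitution can be tried on a copy of a unit file before editing the ones under /usr/lib/systemd/system/. The sketch below wraps the same sed expression in a function; the name set_exec_start is hypothetical, and the unit file contents in the example are a made-up sample, not the shipped daos_server.service.

```shell
# Rewrite the ExecStart= line of a unit file to the given command line.
# $1: path to the unit file
# $2: new command line; it must not contain '|', '&' or '\', because it is
#     placed directly into the replacement part of a sed expression
set_exec_start() (
    unit_file=$1
    cmd=$2
    tmp=$(mktemp) || exit 1
    # '|' is the sed delimiter, so '/' in paths needs no escaping.
    if sed "s|^ExecStart=.*|ExecStart=$cmd|" "$unit_file" > "$tmp"; then
        cat "$tmp" > "$unit_file"
        rc=$?
    else
        rc=1
    fi
    rm -f "$tmp"
    exit "$rc"
)

# Example use on a scratch copy (commented out so the snippet has no side effects):
# cp /usr/lib/systemd/system/daos_server.service /tmp/daos_server.service
# set_exec_start /tmp/daos_server.service "$(which daos_server) start -o /etc/daos/daos_server.yml"
```

Writing through a temporary file instead of sed -i keeps the original file's ownership and permissions and leaves it untouched if sed fails.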

Running launch.py

You can now run launch.py from the DAOS install directory: <daos>/install/lib/daos/TESTING/ftest.

For example, to run a basic test that confirms setup:

python3 ./launch.py -ts $SERVER_NODES -tc $CLIENT_NODES test_setup
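If $SERVER_NODES or $CLIENT_NODES is empty, the command above passes -ts or -tc without a value, which is easy to miss. The sketch below builds the same command line only when both node lists and at least one tag are present; the function name launch_cmd and the node names in the example are hypothetical, and only the -ts and -tc options shown above are used.

```shell
# Print the launch.py command line for the given node lists and tags.
# Refuses (exit status 2) if a node list is empty or no tag is given.
launch_cmd() (
    if [ "$#" -lt 3 ] || [ -z "$1" ] || [ -z "$2" ]; then
        echo "usage: launch_cmd <server_nodes> <client_nodes> <tag>..." >&2
        exit 2
    fi
    servers=$1
    clients=$2
    shift 2
    printf 'python3 ./launch.py -ts %s -tc %s %s\n' "$servers" "$clients" "$*"
)

# Example use (commented out so the snippet has no side effects):
# launch_cmd "$SERVER_NODES" "$CLIENT_NODES" test_setup
```

The function only prints the command, so it can be reviewed before it is run from the ftest directory.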

See Using launch.py and Test Tags for details.

For sample scripts that you can tailor to your needs, see this tar file (wolf node):

/home/rpadma2/local_testing/local_testing.tar

Known Issues

openmpi/mellanox drivers on IO500

Create the module file /etc/modulefiles/mpi/mlnx_openmpi-x86_64 and load it with module load mlnx_openmpi-x86_64:

#%Module 1.0
conflict mpi
prepend-path PATH /usr/mpi/gcc/openmpi-4.1.0rc5/bin
prepend-path LD_LIBRARY_PATH /usr/mpi/gcc/openmpi-4.1.0rc5/lib64
prepend-path PKG_CONFIG_PATH /usr/mpi/gcc/openmpi-4.1.0rc5/lib64/pkgconfig
prepend-path PYTHONPATH /usr/lib64/python2.7/site-packages/openmpi
prepend-path MANPATH /usr/mpi/gcc/openmpi-4.1.0rc5/share/man
setenv MPI_BIN /usr/mpi/gcc/openmpi-4.1.0rc5/bin
setenv MPI_SYSCONFIG /usr/mpi/gcc/openmpi-4.1.0rc5/etc
setenv MPI_FORTRAN_MOD_DIR /usr/mpi/gcc/openmpi-4.1.0rc5/lib64
setenv MPI_INCLUDE /usr/mpi/gcc/openmpi-4.1.0rc5/include
setenv MPI_LIB /usr/mpi/gcc/openmpi-4.1.0rc5/lib64
setenv MPI_MAN /usr/mpi/gcc/openmpi-4.1.0rc5/share/man
setenv MPI_PYTHON_SITEARCH /usr/lib64/python2.7/site-packages/openmpi
setenv MPI_PYTHON2_SITEARCH /usr/lib64/python2.7/site-packages/openmpi
setenv MPI_COMPILER openmpi-x86_64
setenv MPI_SUFFIX _openmpi
setenv MPI_HOME /usr/mpi/gcc/openmpi-4.1.0rc5

Building mpich and ior (not using RPMs)

  • mpich and ior must be included in $PATH, $LD_LIBRARY_PATH, and $CPATH
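One way to add the mpich and ior directories to these variables without creating duplicate entries on repeated sourcing is sketched below. The helper name prepend_path and the MPICH_DIR and IOR_DIR prefixes in the example are hypothetical; substitute the locations of your own builds.

```shell
# Prepend a directory to a colon-separated list unless it is already present.
# Prints the resulting list; the caller assigns it back to the variable.
# $1: directory to add
# $2: current value of the list (may be empty)
prepend_path() (
    dir=$1
    list=$2
    case ":$list:" in
        *":$dir:"*) printf '%s\n' "$list" ;;            # already present: unchanged
        *) printf '%s\n' "${dir}${list:+:$list}" ;;     # add in front, no trailing ':'
    esac
)

# Example use (commented out; MPICH_DIR and IOR_DIR are assumed install prefixes):
# PATH=$(prepend_path "$MPICH_DIR/bin" "$PATH"); export PATH
# LD_LIBRARY_PATH=$(prepend_path "$MPICH_DIR/lib" "${LD_LIBRARY_PATH:-}"); export LD_LIBRARY_PATH
# CPATH=$(prepend_path "$MPICH_DIR/include" "${CPATH:-}"); export CPATH
# PATH=$(prepend_path "$IOR_DIR/bin" "$PATH"); export PATH
```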