SSD Benchmarking for WAL
In order to evaluate the performance impact of phase 1 on DAOS, a set of low-level benchmarks using spdk_nvme_perf can be used to (try to) emulate the WAL traffic. This wiki page summarizes the command line options as well as the results.
SSD Identification
To know what type of SSD we are dealing with, the spdk_nvme_identify command can be used:
# spdk_nvme_identify -V -r 'trtype:PCIe traddr:e20005:04:00.0'
EAL: No free 2048 kB hugepages reported on node 1
EAL: No available 1048576 kB hugepages reported
TELEMETRY: No legacy callbacks, legacy socket not created
=====================================================
NVMe Controller at e20005:04:00.0 [144d:a824]
=====================================================
Controller Capabilities/Features
================================
Vendor ID: 144d
Subsystem Vendor ID: 144d
Serial Number: S4YPNE0N800120
Model Number: SAMSUNG MZWLJ3T8HBLS-00007
Firmware Version: EPK98B5Q
[...]
It dumps a lot of information, but what we are mostly interested in here is the model number, firmware version and health information. For the latter, we expect the values of a relatively new SSD: no critical warnings, “Available Spare” at 100% and “Life Percentage Used” at 0%.
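Since the full dump is long, a small filter can help when checking several SSDs. The following is only a sketch: identify_summary is a hypothetical helper name, and it assumes the fields appear in the output under the labels quoted above. Critical warnings are not covered by the filter and still have to be checked in the full output.

```shell
# Hypothetical helper: keep only the identity and wear-related lines from
# spdk_nvme_identify output read on stdin. The labels are the ones quoted
# in the text above; other "Available Spare ..." lines will match as well.
identify_summary() {
    grep -E 'Model Number|Firmware Version|Available Spare|Life Percentage Used'
}

# Usage (needs root and an SPDK-bound device, so it is left commented out):
# spdk_nvme_identify -V -r 'trtype:PCIe traddr:e20005:04:00.0' | identify_summary
```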
Single Thread qd = 1
The intent here is to measure the extra latency that the WAL commit might add to a single update operation. To do so, we first benchmark with a queue depth of 1 from a single thread.
# spdk_nvme_perf -V -r 'trtype:PCIe traddr:e20005:04:00.0' -q 1 -o 4096 -w write -c 0x1 -t 60
EAL: No free 2048 kB hugepages reported on node 1
EAL: No available 1048576 kB hugepages reported
TELEMETRY: No legacy callbacks, legacy socket not created
Initializing NVMe Controllers
Attached to NVMe Controller at e20005:04:00.0 [144d:a824]
Associating PCIE (e20005:04:00.0) NSID 1 with lcore 0
Initialization complete. Launching workers.
========================================================
Latency(us)
Device Information : IOPS MiB/s Average min max
PCIE (e20005:04:00.0) NSID 1 from core 0: 102351.15 399.81 9.76 8.55 49644.32
========================================================
Total : 102351.15 399.81 9.76 8.55 49644.32
The results of interest here are the average latency and the IOPS. With this SSD (PM1733), the average latency is 9.76 us at about 102K IOPS.
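To pull these two numbers out of a run without reading the whole output, the "Total" line can be parsed. This is a minimal sketch, assuming the column order shown above (IOPS, MiB/s, Average, min, max); perf_total is a hypothetical helper name.

```shell
# Hypothetical helper: print "IOPS average_latency_us" from the "Total" line
# of spdk_nvme_perf output read on stdin. Fields are counted from the end of
# the line, so the spacing around ':' does not matter.
perf_total() {
    awk '/^Total/ { print $(NF-4), $(NF-2) }'
}

# Example with the Total line from the run above:
echo 'Total : 102351.15 399.81 9.76 8.55 49644.32' | perf_total
```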
Single Thread qd > 1
In practice, we are going to commit multiple operations in parallel to the WAL, so the latency with a queue depth > 1 is of interest. To see how the latency evolves, we measure it for qd = 2^n with n in 1..8:
# for i in `seq 8`; do echo $i; spdk_nvme_perf -V -r 'trtype:PCIe traddr:e20005:04:00.0' -q $((2**$i)) -o 4096 -w write -c 0x1 -t 60; done
[...]
Latency(us)
Device Information : IOPS MiB/s Average min max
PCIE (e20005:04:00.0) NSID 1 from core 0: 195578.43 763.98 10.22 8.52 65762.35
========================================================
Total : 195578.43 763.98 10.22 8.52 65762.35
[...]
Latency(us)
Device Information : IOPS MiB/s Average min max
PCIE (e20005:04:00.0) NSID 1 from core 0: 332707.70 1299.64 12.01 8.51 4215.47
========================================================
Total : 332707.70 1299.64 12.01 8.51 4215.47
[...]
Latency(us)
Device Information : IOPS MiB/s Average min max
PCIE (e20005:04:00.0) NSID 1 from core 0: 494580.12 1931.95 16.16 8.56 1273.60
========================================================
Total : 494580.12 1931.95 16.16 8.56 1273.60
[...]
Latency(us)
Device Information : IOPS MiB/s Average min max
PCIE (e20005:04:00.0) NSID 1 from core 0: 604705.96 2362.13 26.44 8.79 907.45
========================================================
Total : 604705.96 2362.13 26.44 8.79 907.45
[...]
Latency(us)
Device Information : IOPS MiB/s Average min max
PCIE (e20005:04:00.0) NSID 1 from core 0: 620312.92 2423.10 51.56 9.27 786.01
========================================================
Total : 620312.92 2423.10 51.56 9.27 786.01
[...]
Latency(us)
Device Information : IOPS MiB/s Average min max
PCIE (e20005:04:00.0) NSID 1 from core 0: 630753.78 2463.88 101.45 9.46 4472.04
========================================================
Total : 630753.78 2463.88 101.45 9.46 4472.04
[...]
Latency(us)
Device Information : IOPS MiB/s Average min max
PCIE (e20005:04:00.0) NSID 1 from core 0: 632426.65 2470.42 202.37 9.31 3191.75
========================================================
Total : 632426.65 2470.42 202.37 9.31 3191.75
[...]
Latency(us)
Device Information : IOPS MiB/s Average min max
PCIE (e20005:04:00.0) NSID 1 from core 0: 633949.05 2476.36 403.80 9.82 5846.45
========================================================
Total : 633949.05 2476.36 403.80 9.82 5846.45
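A quick sanity check on these numbers is Little's law: the number of I/Os in flight equals IOPS multiplied by the average latency, so it should come out close to the configured queue depth. The sketch below assumes the eight result blocks above are in loop order (qd = 2, 4, ..., 256); inflight is a hypothetical helper name.

```shell
# Little's law: in-flight I/Os = IOPS * average latency (in seconds).
# Input rows are "qd IOPS avg_latency_us".
inflight() {
    awk '{ printf "qd=%d inflight=%.2f\n", $1, $2 * $3 / 1e6 }'
}

# Figures copied from the runs above, assumed to be in loop order.
inflight <<'EOF'
2 195578.43 10.22
4 332707.70 12.01
8 494580.12 16.16
16 604705.96 26.44
32 620312.92 51.56
64 630753.78 101.45
128 632426.65 202.37
256 633949.05 403.80
EOF
```

The same figures show that IOPS level off a little above 600K from qd = 16 onward, while the average latency roughly doubles with each further doubling of the queue depth.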
Multi Thread
In practice, there will be multiple targets per SSD, so it is important to measure the IOPS and latency with multiple concurrent threads (i.e. 2/4/8).
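This page does not give the command for the multi-thread case. One possible approach, shown here only as a sketch, is to widen the -c core mask used in the single-thread runs so that 2, 4 or 8 cores submit I/O. The choice of the first N cores, the queue depth of 1 and the core_mask helper name are assumptions; NUMA placement of the cores relative to the SSD may also matter.

```shell
# Hypothetical helper: hex core mask selecting the first N cores (4 -> 0xf).
core_mask() {
    printf '0x%x\n' $(( (1 << $1) - 1 ))
}

# Print (rather than run) one command per thread count; remove the echo to
# execute them on a host with SPDK set up.
for n in 2 4 8; do
    echo spdk_nvme_perf -V -r "'trtype:PCIe traddr:e20005:04:00.0'" \
        -q 1 -o 4096 -w write -c "$(core_mask "$n")" -t 60
done
```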
Results
To provide results, the following spreadsheet should be filled in with the output of this command, which collects all the results:
Data should be entered in the format of the example spreadsheet below: