Benchmarking
There are many factors that go into HPC performance. Aside from the obvious cpu performance network and storage are equally important in a distributed context.
Storage
fio is a powerful tool for benchmarking filesystems. Measuring maximum performance especially on extremely high performance filesystems can be tricky to measure and will require research on how to effectively use the tool. Often times measuring maximum performance on high performance distributed filesystems will require multiple nodes and threads for reading/writing. However it should provide a good ballpark of performance.
Substitute <directory>
with the filesystem that you want to
test. df -h
can be a great way to see where each drive is
mounted. fio
will need the ability to read/write in the given
directory.
IOPs (input/output operations per second)
Maximum Write Throughput
fio --ioengine=sync --direct=0 \
--fsync_on_close=1 --randrepeat=0 --nrfiles=1 --name=seqwrite --rw=write \
--bs=1m --size=20G --end_fsync=1 --fallocate=none --overwrite=0 --numjobs=1 \
--directory=<directory> --loops=10
Maximum Write IOPs
fio --ioengine=sync --direct=0 \
--fsync_on_close=1 --randrepeat=0 --nrfiles=1 --name=randwrite --rw=randwrite \
--bs=4K --size=1G --end_fsync=1 --fallocate=none --overwrite=0 --numjobs=80 \
--sync=1 --directory=<directory> --loops=10
Maximum Read Throughput
fio --ioengine=sync --direct=0 \
--fsync_on_close=1 --randrepeat=0 --nrfiles=1 --name=seqread --rw=read \
--bs=1m --size=240G --end_fsync=1 --fallocate=none --overwrite=0 --numjobs=1 \
--directory=<directory> --invalidate=1 --loops=10
Maximum Read IOPs
fio --ioengine=sync --direct=0 \
--fsync_on_close=1 --randrepeat=0 --nrfiles=1 --name=randread --rw=randread \
--bs=4K --size=1G --end_fsync=1 --fallocate=none --overwrite=0 --numjobs=20 \
--sync=1 --invalidate=1 --directory=<directory> --loops=10
Network
To test network latency and bandwidth there needs to be a source and
destination that you wish to test. It will expose a given port by
default 5201
with iperf.
Bandwidth
Start a server on a given <dest>
iperf3 -s
No on the <src>
machine run
iperf3 -c <ip address>
This will measure the bandwidth of the link between the nodes from
<src>
to <dest>
. This means that if you are using a provider where
your Internet have very different upload vs. download speeds you will
see very different results in the direction. Add a -R
flag to the
client to test the other direction.
Latency
ping is a great way to watch the
latency between <src>
and <dest>
.
From the src machine run
ping -4 <dest> -c 10
Keep in mind that ping is the bi-directional (round trip) time. So dividing by 2 is roughly the latency.