How to judge scoring vs storage HW
by Mark Nelson
Hi Folks,
I'm one of the Ceph developers but used to work in the HPC world in a
previous life. Recently I saw that we were listed on the SC19 IO-500 10
node challenge list but had ranked pretty low. I figured that it might
be fun to play around for a couple of days and see if I could get our
score up a bit.
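For anyone skimming who hasn't looked at how the list boils a run down
to a single number: my rough understanding is that the overall score is
the geometric mean of a bandwidth score (from the ior phases, in GiB/s)
and a metadata score (from the mdtest/find phases, in kIOPS), roughly
like the sketch below. The phase values here are made up purely for
illustration, not our results.

from math import prod

def geomean(values):
    """Geometric mean of a list of positive numbers."""
    return prod(values) ** (1.0 / len(values))

# Illustrative phase results only, not our actual numbers.
bw_phases_gib_s = {            # ior phases, GiB/s
    "ior-easy-write": 40.0,
    "ior-hard-write": 1.5,
    "ior-easy-read": 60.0,
    "ior-hard-read": 5.0,
}
md_phases_kiops = {            # mdtest/find phases, kIOPS
    "mdtest-easy-write": 150.0,
    "mdtest-hard-write": 30.0,
    "find": 900.0,
    "mdtest-easy-stat": 400.0,
    "mdtest-hard-stat": 300.0,
    "mdtest-easy-delete": 100.0,
    "mdtest-hard-delete": 40.0,
}

bw_score = geomean(list(bw_phases_gib_s.values()))   # GiB/s
md_score = geomean(list(md_phases_kiops.values()))   # kIOPS
total = (bw_score * md_score) ** 0.5                 # headline score

print(f"BW {bw_score:.2f} GiB/s, MD {md_score:.2f} kIOPS, "
      f"score {total:.2f}")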
Let me first say that it's great having mdtest and ior packaged up like
this. The hard test cases have already identified a couple of
performance issues we should take care of (unaligned reads/writes and
CephFS dynamic subtree partitioning), both of which are also dragging
our score down. Very useful! I was so happy with the effort that I
ended up writing a new libcephfs aiori backend for ior/mdtest. The PR
just merged, but here it is for anyone interested:
https://github.com/hpc/ior/pull/217
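For reference, here's roughly how I've been driving it; this is a
minimal sketch assuming the new aiori module registers under the name
CEPHFS and is selected through ior/mdtest's usual -a option. The
hostfile, process counts, transfer sizes, and paths are placeholders,
not our actual run configuration.

import subprocess

# Placeholder MPI launch prefix: 80 processes across the 10 nodes.
MPIRUN = ["mpirun", "--hostfile", "hosts.txt", "-np", "80"]

def run(cmd):
    """Launch one benchmark invocation and fail loudly on error."""
    print(" ".join(cmd))
    subprocess.run(cmd, check=True)

# ior-easy-style write/read through the libcephfs backend.
run(MPIRUN + ["ior", "-a", "CEPHFS",
              "-w", "-r", "-t", "2m", "-b", "8g",
              "-o", "/ior-easy/testfile"])

# mdtest-easy-style metadata workload through the same backend.
run(MPIRUN + ["mdtest", "-a", "CEPHFS",
              "-n", "10000", "-d", "/mdtest-easy"])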
Our test cluster has 10 nodes with 8 NVMe drives each, and we are
co-locating the metadata servers and client processes on the same nodes
during testing. So far, with 2x replication, we've managed to hit
scores in the 55-60 range, which looks like it would have put us in
10th place on the SC19 list. (Note that for that result we are
pre-creating the mdtest easy directories for static round-robin MDS
pinning, though we have a feature coming soon for ephemeral pinning
via a single parent-directory xattr.)
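In case it helps to picture that pre-creation step, here's a minimal
sketch of the idea. The mount point, directory names, and rank/process
counts are placeholders; ceph.dir.pin is the standard CephFS static
subtree pinning xattr.

import os

MOUNT = "/mnt/cephfs"     # client mount point (placeholder)
NUM_MDS_RANKS = 10        # active MDS ranks (placeholder)
NUM_CLIENT_DIRS = 80      # one mdtest-easy dir per client process (placeholder)

base = os.path.join(MOUNT, "mdtest-easy")
os.makedirs(base, exist_ok=True)

for i in range(NUM_CLIENT_DIRS):
    d = os.path.join(base, f"dir.{i}")
    os.makedirs(d, exist_ok=True)
    # Pin each per-process directory to an MDS rank, round-robin.
    os.setxattr(d, "ceph.dir.pin", str(i % NUM_MDS_RANKS).encode())

# The upcoming ephemeral pinning feature should collapse this loop into
# a single xattr set on the parent directory.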
Anyway, I really have no idea how that score actually compares to the
other systems listed. Is there any easy way to compare the hardware
and software configuration used for the storage cluster behind each
entry?
For example, in our case we're using 2x replication and 10 nodes
total, with pretty beefy Xeon CPUs, 8x P4610 NVMe drives, and 4x 25GbE
networking. Total storage capacity before replication is ~640TB.
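Just to spell out what those numbers work out to (per-drive capacity
is inferred from the ~640TB total rather than a spec sheet, and I'm
assuming the 4x 25GbE is per node):

nodes = 10
drives_per_node = 8
tb_per_drive = 8.0            # ~8TB-class P4610s, inferred from the total
replication = 2

raw_tb = nodes * drives_per_node * tb_per_drive
usable_tb = raw_tb / replication

nics_per_node = 4
gbit_per_nic = 25
cluster_net_gbit = nodes * nics_per_node * gbit_per_nic

print(f"raw ~{raw_tb:.0f} TB, usable at {replication}x ~{usable_tb:.0f} TB")
print(f"aggregate NIC bandwidth ~{cluster_net_gbit} Gbit/s "
      f"(~{cluster_net_gbit / 8:.0f} GB/s)")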
Thanks,
Mark