How to judge scoring vs storage HW
by Mark Nelson
Hi Folks,
I'm one of the Ceph developers but used to work in the HPC world in a
previous life. Recently I saw that we were listed on the SC19 IO-500 10
node challenge list but had ranked pretty low. I figured that it might
be fun to play around for a couple of days and see if I could get our
score up a bit.
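For anyone skimming who hasn't looked at how the list boils a run down
to a single number: my rough understanding is that the overall score is
the geometric mean of a bandwidth score (from the ior phases, in GiB/s)
and a metadata score (from the mdtest/find phases, in kIOPS), roughly
like the sketch below. The phase values here are made up purely for
illustration, not our results.

from math import prod

def geomean(values):
    """Geometric mean of a list of positive numbers."""
    return prod(values) ** (1.0 / len(values))

# Illustrative phase results only, not our actual numbers.
bw_phases_gib_s = {            # ior phases, GiB/s
    "ior-easy-write": 40.0,
    "ior-hard-write": 1.5,
    "ior-easy-read": 60.0,
    "ior-hard-read": 5.0,
}
md_phases_kiops = {            # mdtest/find phases, kIOPS
    "mdtest-easy-write": 150.0,
    "mdtest-hard-write": 30.0,
    "find": 900.0,
    "mdtest-easy-stat": 400.0,
    "mdtest-hard-stat": 300.0,
    "mdtest-easy-delete": 100.0,
    "mdtest-hard-delete": 40.0,
}

bw_score = geomean(list(bw_phases_gib_s.values()))   # GiB/s
md_score = geomean(list(md_phases_kiops.values()))   # kIOPS
total = (bw_score * md_score) ** 0.5                 # headline score

print(f"BW {bw_score:.2f} GiB/s, MD {md_score:.2f} kIOPS, "
      f"score {total:.2f}")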
Let me first say that it's great having mdtest and ior packaged up like
this. The hard test cases have already identified a couple of
performance issues we should take care of (unaligned reads/writes and
CephFS dynamic subtree partitioning), both of which are also dragging
our score down. Very useful! I was so happy with the effort that I
ended up writing a new libcephfs aiori backend for ior/mdtest. The PR
just merged, but here it is for anyone interested:
https://github.com/hpc/ior/pull/217
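For reference, here's roughly how I've been driving it; this is a
minimal sketch assuming the new aiori module registers under the name
CEPHFS and is selected through ior/mdtest's usual -a option. The
hostfile, process counts, transfer sizes, and paths are placeholders,
not our actual run configuration.

import subprocess

# Placeholder MPI launch prefix: 80 processes across the 10 nodes.
MPIRUN = ["mpirun", "--hostfile", "hosts.txt", "-np", "80"]

def run(cmd):
    """Launch one benchmark invocation and fail loudly on error."""
    print(" ".join(cmd))
    subprocess.run(cmd, check=True)

# ior-easy-style write/read through the libcephfs backend.
run(MPIRUN + ["ior", "-a", "CEPHFS",
              "-w", "-r", "-t", "2m", "-b", "8g",
              "-o", "/ior-easy/testfile"])

# mdtest-easy-style metadata workload through the same backend.
run(MPIRUN + ["mdtest", "-a", "CEPHFS",
              "-n", "10000", "-d", "/mdtest-easy"])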
Our test cluster has 10 nodes with 8 NVMe drives each, and we are
co-locating the metadata servers and client processes on the same nodes
during testing. So far, with 2x replication, we've managed to hit
scores in the 55-60 range, which looks like it would have put us in
10th place on the SC19 list. (Note that for that result we are
pre-creating the mdtest easy directories for static round-robin MDS
pinning, though we have a feature coming soon for ephemeral pinning
via a single parent-directory xattr.)
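In case it helps to picture that pre-creation step, here's a minimal
sketch of the idea. The mount point, directory names, and rank/process
counts are placeholders; ceph.dir.pin is the standard CephFS static
subtree pinning xattr.

import os

MOUNT = "/mnt/cephfs"     # client mount point (placeholder)
NUM_MDS_RANKS = 10        # active MDS ranks (placeholder)
NUM_CLIENT_DIRS = 80      # one mdtest-easy dir per client process (placeholder)

base = os.path.join(MOUNT, "mdtest-easy")
os.makedirs(base, exist_ok=True)

for i in range(NUM_CLIENT_DIRS):
    d = os.path.join(base, f"dir.{i}")
    os.makedirs(d, exist_ok=True)
    # Pin each per-process directory to an MDS rank, round-robin.
    os.setxattr(d, "ceph.dir.pin", str(i % NUM_MDS_RANKS).encode())

# The upcoming ephemeral pinning feature should collapse this loop into
# a single xattr set on the parent directory.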
Anyway, I really have no idea how that score actually compares to the
other systems listed. Is there any easy way to compare the hardware
and software configuration used for the storage cluster behind each
entry?
For example, in our case we're using 2x replication and 10 nodes
total, with pretty beefy Xeon CPUs, 8x P4610 NVMe drives, and 4x 25GbE
networking. Total storage capacity before replication is ~640TB.
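Just to spell out what those numbers work out to (per-drive capacity
is inferred from the ~640TB total rather than a spec sheet, and I'm
assuming the 4x 25GbE is per node):

nodes = 10
drives_per_node = 8
tb_per_drive = 8.0            # ~8TB-class P4610s, inferred from the total
replication = 2

raw_tb = nodes * drives_per_node * tb_per_drive
usable_tb = raw_tb / replication

nics_per_node = 4
gbit_per_nic = 25
cluster_net_gbit = nodes * nics_per_node * gbit_per_nic

print(f"raw ~{raw_tb:.0f} TB, usable at {replication}x ~{usable_tb:.0f} TB")
print(f"aggregate NIC bandwidth ~{cluster_net_gbit} Gbit/s "
      f"(~{cluster_net_gbit / 8:.0f} GB/s)")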
Thanks,
Mark