Hi,
1. On the other hand, the Top500 is not called the HPC Top500.
As you say, Green500, Graph500, Top500, IO500 do make sense.
2. > Benchmarks get what benchmarks get. What about performance when
> another job is running?
The idea was to have four tests, data-easy, data-hard, metadata-easy,
and metadata-hard, and then figure out how to combine these four into
one number. Perhaps we add a fifth test that runs all four
simultaneously.
We may also define useful equations that combine these values, and
possibly supercomputer characteristics, into one value.
Based on the need, one retrieves the appropriate compound value to
sort for a "balanced system".
I don't see the benefit of running them simultaneously right now. It
is difficult, as parts of the test could be served by other storage
systems, e.g., an acceleration layer for small files. Also, how would
we balance them? Well, maybe this thought becomes important later.
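As a sketch of one plausible way to combine the four scores into a single
number (the geometric mean is one candidate; the function name, units, and
two-level structure here are assumptions, not a decided scheme):

```python
import math

def combined_score(bandwidths_gib_s, metadata_kiops):
    """Hypothetical combination: geometric mean of the data scores
    (data-easy, data-hard, in GiB/s) and of the metadata scores
    (metadata-easy, metadata-hard, in kIOPS), then the geometric
    mean of those two sub-scores."""
    def gmean(values):
        return math.exp(sum(math.log(v) for v in values) / len(values))

    bw = gmean(bandwidths_gib_s)   # data sub-score
    md = gmean(metadata_kiops)     # metadata sub-score
    return gmean([bw, md])

# Example with made-up numbers: a system that is fast on the easy
# variants but much slower on the hard ones is pulled down.
score = combined_score([10.0, 2.5], [150.0, 40.0])
```

A geometric mean has the nice property that no single test can dominate:
improving the weakest of the four helps the total as much as improving
the strongest.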
3. Storage performance degrades with age. You run Linpack on day 1 and
get a number. You run Linpack on day N and get the same number. If your
purpose is to bound user expectation, then a number from day 1 may no
longer be the correct bound on day N when the storage is fragmented.
I believe the way we do benchmarking adds to the problem. We should
indeed reserve a tiny fraction of space for debugging performance and
resilience: a test partition on which the same benchmark can be run
over and over again with comparable performance, at least during
maintenance.
Unavoidable factors are the degradation of storage hardware and the
impact of software/firmware upgrades.
4. Another challenge with IO500 is that Linpack is so easy. You run it
and you get two indisputable answers: the result, which can be verified
to be correct, and the time that it took to get the result. IO
benchmarks are much harder. Are people allowed to set up RAID0 RAM
disks and run the benchmark against them?
Sure, if ... provide to users
Yes, I agree too. They would show up as another file system in the
IO500 list. Less capacity but good performance :-)
5. IO500 is great because it enables people to look at historical
trends. For example, when do various systems start showing up? If I'm
procuring a new storage system and I see that Ceph, BeeGFS, or OrangeFS
are high up in the IO500, then I'm much more likely to consider them
instead of just Lustre and GPFS.
Also, one could compare data centers with a similar system or
application profile and see how they have improved.
FYI: Soon the list will be extended with several BeeGFS systems; they
have low TOP500 ranks, but they will show up.
6. NREL just doubled their flops on a system but didn't touch storage.
It'd be nice if IO500 could somehow capture this. Both before and after
will have the same storage performance, but the after system is worse
because it is imbalanced.
Indeed, and it will. As the HPSL covers *many* other metrics and
(will) also cover costs, comparing those systems is possible.
Right now it is already interesting to compare peak performance vs.
storage capacity vs. memory capacity.
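To make such a comparison concrete, a minimal sketch of per-compute
balance ratios (the metric names and example numbers are assumptions
for illustration, not HPSL data):

```python
def balance_ratios(peak_tflops, storage_gib_s, memory_tib):
    """Hypothetical balance metrics: how much storage bandwidth and
    memory capacity a system provides per unit of peak compute."""
    return {
        "GiB/s per TFLOPS": storage_gib_s / peak_tflops,
        "TiB memory per TFLOPS": memory_tib / peak_tflops,
    }

# Doubling flops without touching storage halves the storage ratio,
# which is exactly the imbalance described above.
before = balance_ratios(1000.0, 500.0, 100.0)
after = balance_ratios(2000.0, 500.0, 100.0)
assert after["GiB/s per TFLOPS"] == before["GiB/s per TFLOPS"] / 2
```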
7. Aren’t you all behind the schedule that you yourself, John Bent,
proposed after the SC BoF?
We are. Somehow the activity here also goes up & down.
I don't know why only a few people respond to me; if there is a reason,
tell me :-)
Julian