Hello Kevin,
The deadline is one week before ISC, June 17. However, the committee is currently
discussing whether to allow any extensions.
Please let us know if we can be of any assistance running or tuning the benchmark.
Thanks
John
On Jun 12, 2018, at 6:41 PM, Glenn K. Lockwood
<glennklockwood(a)gmail.com> wrote:
Thanks for the tip re: pfind and the stonewalling option. That definitely cuts down the
optimization space.
The request for anonymization is really just to prevent a vendor from blindly taking the
IO-500 list, making a bar graph that directly identifies me, and claiming that my storage
system is garbage. Even just anonymizing the institution name on the list would be
sufficient, because if a vendor wanted to throw up marketing material that showed how bad
my results are, they would have to go out of their way to identify me as being responsible
for that bad performance. This is not unlike how some dark sites used to submit to Top500
under vague "US Government" banners. We knew who they were, but that wasn't
what was important.
I understand that IO-500 can be gamed just like Top500 and that vendors will do what they
do. However given how new (and small) the IO-500 list is, there is a big downside risk
that any performance number posted by a large HPC center will appear, without context, on
a slide that oversimplifies the results. Unlike HPL, which relies on stateless resources,
storage performance is not constant in time, and such subtleties get lost in the simple
top-10 bar graph that would likely appear on such a slide. What I would like to avoid is
being named as the shortest bar on that graph at a venue where, for example, I cannot
explain that our performance was poor because our file system was 95% full that day.
This problem will go away as IO-500 ages. The list will get longer, and there will be
more opportunities to run it on squeaky-clean systems with vendor hand-holding like how
HPL is run today. But given the current state where I can either volunteer results that
might be misrepresented (due to ignorance, not malice) or stay out of it, I think being
able to keep my name off the official record is a good compromise.
Maybe I'm the only one who worries about these things, but it has been a major
concern of mine because at least two major storage vendors are now using IO-500 results
for marketing. What do you all think?
Glenn
> On Tue, Jun 12, 2018 at 5:06 PM John Bent <johnbent(a)gmail.com> wrote:
> Hey Glenn,
>
> Looks like Andreas answered some of your questions. I'll answer a few more.
>
> There is only one version of IO500. You can run it with or without stonewall and
both are valid results. There are a few caveats:
> - It only works right now with IOR and pfind, and you must set a stonewall limit of at
least 300 seconds.
> - It would be nice if it worked with mdtest, but that hasn't been added yet.
> - For IOR, you must use the --stoneWallingWearOut option so that stragglers are
accounted for. io500.sh doesn't support this yet, so you'd have to modify it.
> What it should do is automatically parse the output of the IOR write phases and
record the actual amount of data written by each process and then pass those values to the
corresponding read phase.
> One thing you can do is run it once with stonewall to see how much IO you can do in
five minutes and then modify io500.sh accordingly.
> Obviously it will be better when io500.sh does this for us.
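[For illustration, the parsing John describes might look roughly like the sketch below. The log line format is an assumption modeled on IOR's stonewalling wear-out summary, and the final ior invocation is a guess at how io500.sh might wire the value through, not the official script.]

```shell
#!/bin/sh
# Hypothetical excerpt from an IOR write-phase log with stonewalling
# wear-out enabled; the exact wording of this line is an assumption.
log='stonewalling pairs accessed min: 512 max: 1024 -- min data: 0.5 GiB mean data: 0.7 GiB time: 300.1s'

# Extract the maximum number of pairs (segments) any process wrote,
# so the read phase can be told to read back exactly that much.
max_pairs=$(printf '%s\n' "$log" | sed -n 's/.*max: \([0-9]*\).*/\1/p')

echo "read phase should use ${max_pairs} segments"
# A modified io500.sh might then pass this to the read phase, e.g.:
#   ior -r -s "${max_pairs}" ...
# (exact flags depend on how the script configures IOR)
```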
>
> When you say anonymous, what exactly did you have in mind? Shroud institution,
vendor, file system type, or just some of these? It's a tough balance. Obviously we
want submissions but is there value in a totally anonymous result? I guess we see the
degradation from easy to hard but is it useful information if we don't know what
filesystem it was? Here's an example of something that would be awesome: someone
submits a Lustre result with very little degradation from easy to hard, other people in
the community say, "Wow! How did they do that?", and the IO500 submission contains
enough info for the result to be reproduced. An anonymous submission wouldn't
allow this. Is there some other value it provides beyond just inflating the list? (which
might be sufficiently valuable in and of itself . . .)
>
> John
>
> On Tue, Jun 12, 2018 at 3:47 PM, Glenn Lockwood <glock(a)lbl.gov> wrote:
> Hi John & Committee
>
> I'd like to run IO-500 on our systems, but I have a few concerns that other sites
may also share. In the interests of gaining clarity and resolution before the deadline, I
figure I would ask them here rather than in private.
>
> Concern #1: The current state of the authoritative IO-500 benchmark distribution is a
little unclear to me. As I understand it, there are two versions:
>
> 1. the official version, where each benchmark must run for at least five minutes
> 2. the stonewall version, where each benchmark is allowed to stop after five minutes
>
> In addition, I've been confused by the different options of parallel find. It
looks like the most sensible one, pfind, is in the "utilities/find/old/"
directory whose name suggests it is old and I shouldn't be using it. Is this true?
>
> Concern #2: I can't help but notice that several HPC storage vendors have been
using the IO-500 results for marketing material ("IME is holistically faster than
DataWarp" and "DataWarp has the fastest peak flash performance"). It is
therefore conceivable that submitting anything but hero numbers could be used to make me,
my employer, or our vendor partners look bad. I don't want my center's results
being used to show how bad our storage solution is, especially if the numbers are only low
because I didn't tune the benchmarks optimally.
>
> As such, is it possible to submit results, hero or otherwise, anonymously? Even
though there are only a few 1.6 TB/sec file systems in production in the world, even the
pretense of anonymity would make me feel more secure in submitting sub-optimal (or
embarrassing) numbers.
>
> Thanks!
>
> Glenn
>
>
> On Tue, Jun 12, 2018 at 2:25 PM John Bent <johnbent(a)gmail.com> wrote:
> Hello IO500 friends,
>
> We are at a critical junction for IO500. Hopefully most of you joined this list
because you agree with the motivation behind IO500 and only a few of you joined to laugh
at its painful demise.
>
> To the former, the committee wants to remind you all that we will unveil the second
list at ISC18. As of now, we do have a few submissions but we fear they may be too few to
be sufficiently impressive to ensure our continued relevance. We are clearly still too
new to have achieved critical mass.
>
> We remain committed to the community's need for an IO500. Reporting only hero
numbers as was the pre-IO500 status quo hurts us all. Collecting a large historical data
set of IO performance benefits us all.
>
> Please help ensure the success of our effort by submitting results yourselves and by
encouraging and soliciting others to do so. The community stands ready to provide
assistance as necessary, although please remember that the benchmark is very easy to
run.
>
> Thanks!
>
> John (on behalf of the committee)
>
>
> _______________________________________________
> IO-500 mailing list
> IO-500(a)vi4io.org
>
https://www.vi4io.org/mailman/listinfo/io-500
>