On Jun 23, 2017, at 6:02 PM, Andreas Dilger <adilger(a)dilger.ca> wrote:
On Jun 23, 2017, at 4:57 PM, John Bent <John.Bent(a)seagategov.com> wrote:
All,
We had a great session at ISC (about 30 people I think) and made great progress in the
weeks leading up to it as well. Thanks to Satoshi we even got the two attached slides
added into the official slides being released from the Top 500 session! We had 6 people
sign up at the BOF saying that they’ll run the benchmark when it is finalized.
I know I always say that we have ‘almost’ finalized the benchmark. But we really are
getting much closer; it helped so much that Nathan combined the benchmarks and George
worked on the script.
I think we only have two open questions right now:
1. Do we do 47K random IO in IOR-hard, or 47K simple strided? My original thinking was
strided but someone pointed out that the idea is to create the bounding box and random is
harder than strided. Also random might be increasingly prevalent these days with more
analytics and machine learning and graph analytics, etc. So I propose that we do random
unless there are objections here.
May as well go random at that point.
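To make the strided-versus-random choice concrete, here is a minimal sketch of the two offset patterns for a shared file. It is an illustration only, not how IOR implements either mode: the function names are hypothetical, "47K" is assumed to mean 47 KiB, and "random" is assumed to mean each rank visits its own strided offsets in a shuffled order.

```python
import random

# Assumption: "47K" is read as 47 KiB; the final benchmark may use another size.
TRANSFER_SIZE = 47 * 1024


def strided_offsets(rank, num_ranks, num_segments, transfer_size=TRANSFER_SIZE):
    """Byte offsets for one rank in a simple strided shared-file layout.

    In each segment every rank owns one transfer-sized slot, so a rank's
    consecutive transfers are num_ranks * transfer_size bytes apart.
    """
    return [(seg * num_ranks + rank) * transfer_size for seg in range(num_segments)]


def random_offsets(rank, num_ranks, num_segments, transfer_size=TRANSFER_SIZE, seed=0):
    """Same slots as the strided layout, visited in a shuffled order.

    Hypothetical interpretation of "random": the data placement is unchanged,
    only the access order within each rank is randomized.
    """
    offsets = strided_offsets(rank, num_ranks, num_segments, transfer_size)
    random.Random(seed + rank).shuffle(offsets)  # per-rank reproducible shuffle
    return offsets


if __name__ == "__main__":
    print("strided:", strided_offsets(rank=1, num_ranks=4, num_segments=3))
    print("random: ", random_offsets(rank=1, num_ranks=4, num_segments=3))
```

Under these assumptions both patterns touch exactly the same bytes, so the difference between the two options is purely the order of access.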
2. Should we do some sort of mixed IO workload in
addition to running the 4 tests serially? I like the idea but am not sure how exactly to
do it. Do we need to merely mix IOR-hard and IOR-easy or md-hard and md-easy or both or
mix all 4 at once? Do we just launch multiple command lines in the background and hope
that the mpirun launch times are fast enough that they overlap? Do we need to modify
IOR/mdtest to split the ranks in half and do different workloads with the two halves?
Thoughts?
This would be a bit of a dog's breakfast, and very hard to specify the test
parameters. Do all of the loads need to be running for the whole duration? How does this
work if some jobs finish early? What if, for example, there was a workload scheduler in
the storage that (automatically?) segregated the IO of each workload and they actually ran
serially and didn't contend at all? Would that be considered an improvement, since
this could help real-world jobs as well?
Agree about the scheduler. But what about a modified IOR that splits the ranks in
two, with one half doing easy and the other half doing hard?
Thanks,
John
Maybe something for V2?
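For reference, the rank-splitting idea discussed above could be sketched as follows. This is a hypothetical illustration, not a proposed change to IOR or mdtest: the function names and the "lower half easy, upper half hard" rule are assumptions. In a real MPI program the same assignment would typically be the color passed to MPI_Comm_split.

```python
def workload_for_rank(rank, num_ranks):
    """Return which workload a rank would run under a half/half split.

    Assumption: the lower half of the ranks runs "easy" and the upper half
    runs "hard"; with an odd rank count the extra rank goes to "hard".
    """
    if not 0 <= rank < num_ranks:
        raise ValueError("rank must be in [0, num_ranks)")
    return "easy" if rank < num_ranks // 2 else "hard"


def split_ranks(num_ranks):
    """Group all ranks by assigned workload."""
    groups = {"easy": [], "hard": []}
    for rank in range(num_ranks):
        groups[workload_for_rank(rank, num_ranks)].append(rank)
    return groups


if __name__ == "__main__":
    print(split_ranks(8))
```

The sketch only covers the assignment; it does not answer the open questions raised in the thread, such as what happens when one half finishes before the other.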
We also made some procedural decisions. The
initial steering committee will be Jay Lofstead, Julian Kunkel, and myself. That steering
committee membership will last until IO500 is up and running and stable at which point the
community can nominate new members. All decisions will be discussed first on the mailing
list and we will try for as much consensus as possible. The VI4IO organization will host
the IO500.
Thanks very much,
John
<io500_two_slides.pdf>
Cheers, Andreas