FWIW we were able to get DAOS/Ceph running with both the C app and the
script last night thanks to a significant effort by several folks
including Julian and Mohamad. Thanks guys! Having said that, this
morning I believe we identified that the C app and script are not
running the tests quite the same way. Specifically the C app is running
mdtest with -Y to issue syncs while the script is not (at least for
multi-node configs). I believe Julian is talking to the committee to
determine which behavior is correct going forward. Between this issue
and what appears to still be an open question regarding whether or not
directory creation is included in the mdtest results/score, I'm not
confident what exactly we're going to end up with in 10 days.
I sympathize with Johann; it's going to be a very tight time crunch
getting representative results in by the June 8th deadline (especially
now that official runs take twice as long to complete). Right now I'm
still just trying to get back to the baseline results I had 2 months ago
on our development cluster, let alone configuring/tuning a real cluster
for submission.
Mark
On 5/29/20 9:48 AM, John Bent via IO-500 wrote:
Hey Johann,
We apologize for the difficulties. We will take all of your feedback
into consideration. In the meantime, can you please try one more time
using the current master branch? Julian has been working really hard
with Mark and Mohamad and thinks that it is working for Ceph and DAOS.
If you could please try one more time, we would really appreciate it.
If not, please note the preface paragraph in our rules page
<https://www.vi4io.org/io500/rules/submission>:
For ISC20, submission of test runs should use new C-Application io500
which automatically runs both the new C version and the existing bash
version (following the rules for SC-19) to ensure the consistency of
results between the two implementations. /An exception to this rule is
possible for submitters who have a legitimate reason by requesting an
exception from the committee via comittee(a)io500.org
<mailto:comittee@io500.org>. /[emphasis mine]
Thanks,
John(*)
* These statements merely reflect my own personal view; the only
mechanism for announcing official IO500 policies and decisions is the
committee(a)io500.org <mailto:committee@io500.org> email address.
On Fri, May 29, 2020 at 7:50 AM Julian Kunkel via IO-500
<io-500(a)vi4io.org <mailto:io-500@vi4io.org>> wrote:
Hi Johann,
can you use the current master branch of the app? We have it working
for Ceph and DAOS.
Best,
Julian
On Fri, May 29, 2020 at 2:41 PM Lombardi, Johann via IO-500
<io-500(a)vi4io.org <mailto:io-500@vi4io.org>> wrote:
>
> Andreas (and the rest of IO-500 committee),
>
>
>
> Thanks for the clarifications. If we had been aware of this new
C-app, we would have been *delighted* to help the committee to
test it and debug it in our environment several months ago.
Unfortunately, this wasn’t announced to this mailing list and we
clearly missed it.
>
>
>
> The current situation is that the C-app is clearly broken for
submissions that don’t use the default POSIX backend and we have
been spending a huge amount of time debugging it under a very
tight deadline (on our side, we will only get one window to run
the benchmarks on a production cluster on Monday). At this point,
there is a high risk that we won’t be able to fix the C-app on
time and will thus not be able to submit any *new* results. I
understand that we are not the only one in this situation and it
would be a pity to have fewer submissions this time because of the
new requirement. I would advocate for more flexibility and maybe
relax the rule to something like “if the C-app does not work in
your environment, please report the bug and give us a link to the
ticket”. We are happy to continue the debugging activity and make
sure the C-app works well for SC’20. Everyone will be warned at
that time and won’t have any excuse. That being said, for ISC’20,
I think that more pro-active communications on this mailing list
several months in advance would have been welcomed.
>
>
>
> Cheers,
>
> Johann
>
>
>
> From: IO-500 <io-500-bounces(a)vi4io.org
<mailto:io-500-bounces@vi4io.org>> on behalf of Andreas Dilger via
IO-500 <io-500(a)vi4io.org <mailto:io-500@vi4io.org>>
> Reply-To: Andreas Dilger <adilger(a)dilger.ca
<mailto:adilger@dilger.ca>>
> Date: Thursday 28 May 2020 at 23:10
> To: "Chaarawi, Mohamad" <mohamad.chaarawi(a)intel.com
<mailto:mohamad.chaarawi@intel.com>>
> Cc: "io-500(a)vi4io.org <mailto:io-500@vi4io.org>"
<io-500(a)vi4io.org <mailto:io-500@vi4io.org>>
> Subject: Re: [IO-500] Some rules clarifications?
>
>
>
> On May 28, 2020, at 11:44 AM, Chaarawi, Mohamad via IO-500
<io-500(a)vi4io.org <mailto:io-500@vi4io.org>> wrote:
>
>
>
> Hi John (and rest of IO-500 committee),
>
>
>
> On a different note, could you please shed some light on the
committee’s decision to require 2 “apps” to run the benchmark and
not just one?
>
>
>
> It seems this change has been announced very close to the
deadline and the C-app appears to be broken currently for
non-POSIX backends (mainly ones that require extra options that
are not IOR generic). Maybe I missed an earlier notification, and
excuse me if I did. But if not, I just feel that a requirement
like that should be made right after the conference for the next
submission deadline, and not so close to the deadline where people
may already have results to submit using the io-500-dev repo
and not the new io500-app one.
>
>
>
> Mohamad,
>
> the motivation for running two copies of the benchmark and not only one is that
> we want to move forward with using the io500 C-app vs the bash script because:
>
> - the existing io500.sh script had caused problems for some sites, because
>   it is "launching" the various ior/mdtest runs itself via mpirun, rather
>   than being the executable itself
> - parsing results from the mdtest/ior output to generate the scores was itself
>   fragile, and prone to error if the output was slightly different
>
> Running the C-app and bash script overlapping for ISC'20 allows us to compare
> the results across multiple different systems, to ensure that the two produce
> equivalent results, and to ensure that the C-app is working correctly across
> different environments. Having the bash script results available in case
> of issues or discrepancies between the two is important to ensure continuity.
>
> The C-app has been in development for some time, but has not had much feedback.
> Making it part of the ISC'20 submission for everyone ensures there is enough
> testing before we do a complete changeover to the C-app for SC'20.
>
> Cheers, Andreas(*)
>
> * These statements merely reflect my own personal view; the only mechanism for
> announcing official IO500 policies and decisions is the
> committee(a)io500.org <mailto:committee@io500.org> email address
>
>
>
>
>
> From: IO-500 <io-500-bounces(a)vi4io.org
<mailto:io-500-bounces@vi4io.org>> on behalf of John Bent via
IO-500 <io-500(a)vi4io.org <mailto:io-500@vi4io.org>>
>
> Reply-To: John Bent <johnbent(a)gmail.com
<mailto:johnbent@gmail.com>>
>
> Date: Thursday, May 28, 2020 at 12:25 PM
>
> To: Mark Nelson <mnelson(a)redhat.com <mailto:mnelson@redhat.com>>
>
> Cc: "io-500(a)vi4io.org <mailto:io-500@vi4io.org>"
<io-500(a)vi4io.org <mailto:io-500@vi4io.org>>
>
> Subject: Re: [IO-500] Some rules clarifications?
>
>
>
> Mark and all,
>
>
>
> The committee just added a rule clarifying precreation of
directories to the rules page:
https://www.vi4io.org/io500/rules/submission. The newly added rule
states:
>
>
>
> "Each of the four main phases (IOR easy and hard, and mdtest
easy and hard) has a subdirectory which can be precreated and
tuned (e.g. using tools such as lfs_setstripe or beegfs_ctl);
however, additional subdirectories within these subdirectories
cannot be precreated."
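
As a rough illustration of the rule quoted above, a pre-run check could be sketched in Python along the following lines. The directory names in PHASE_DIRS and the helper precreation_violations are hypothetical and are not part of io500; the real subdirectory names depend on the io500 version and configuration. The sketch only looks for precreated subdirectories inside the phase directories and does not cover pre-created files.

```python
import os
import tempfile

# Hypothetical names for the four phase subdirectories (assumption, not
# taken from the rules page or from io500 itself).
PHASE_DIRS = ("ior_easy", "ior_hard", "mdt_easy", "mdt_hard")


def precreation_violations(workdir, phase_dirs=PHASE_DIRS):
    """Return subdirectories that exist inside the phase directories.

    Allowed before a run: the phase directories themselves (possibly tuned).
    Not allowed: any additional subdirectory inside a phase directory.
    """
    violations = []
    for name in phase_dirs:
        phase_path = os.path.join(workdir, name)
        if not os.path.isdir(phase_path):
            continue  # precreating a phase directory is optional
        for entry in sorted(os.listdir(phase_path)):
            child = os.path.join(phase_path, entry)
            if os.path.isdir(child):
                violations.append(child)
    return violations


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        # Allowed: an empty, precreated phase directory.
        os.makedirs(os.path.join(workdir, "mdt_easy"))
        # Not allowed: a nested subdirectory inside a phase directory.
        os.makedirs(os.path.join(workdir, "mdt_hard", "extra"))
        for path in precreation_violations(workdir):
            print("not allowed:", path)
```

Tuning a precreated phase directory (for example with lfs_setstripe, beegfs_ctl, or an xattr) would happen before such a check and is not flagged by it, since only nested subdirectories are reported.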
>
>
>
> Below my signature, I am including my standard disclaimer that
my email is not necessarily an official IO500 position but note
that the rules page itself is. :)
>
>
>
> Hope this is clear; please do reply with any questions or need
for further clarification,
>
>
>
> Thanks,
>
>
>
> John(*)
>
> * These statements merely reflect my own personal view; the only
mechanism for announcing official IO500 policies and decisions is
the committee(a)io500.org email address.
>
>
>
>
>
> On Wed, May 27, 2020 at 5:14 PM John Bent <johnbent(a)gmail.com
<mailto:johnbent@gmail.com>> wrote:
>
> Hey Mark,
>
>
>
> Thanks for the interest. It will be great to get your
contributions!
>
>
>
> 1. Must be exactly 300 seconds.
>
> 2. Does not include the directories. Other historical
submissions have tuned the directories exactly as you describe.
>
> 3. Yes, 10+ metal nodes in AWS satisfies this requirement.
>
>
>
> Other committee members, and community members, please chime in
if I got anything wrong! Mark, you might note the disclaimer
below my signature which is just our committee's way of being
careful. I'll make sure to discuss this email with the rest of
the committee and will let you know if any of my answers need
official clarification.
>
>
>
> Thanks,
>
>
>
> John(*)
>
>
>
> * These statements merely reflect my own personal view; the only
mechanism for announcing official IO500 policies and decisions is
the committee(a)io500.org email address.
>
>
>
>
>
> On Wed, May 27, 2020 at 4:44 PM Mark Nelson via IO-500
<io-500(a)vi4io.org <mailto:io-500@vi4io.org>> wrote:
>
> Hi Folks,
>
> We are thinking about throwing together some cephfs io500 results for
> ISC20 and I just wanted to make sure that we are doing the right thing
> in a couple of cases. Any help would be much appreciated since we've
> never submitted results before. We might have a couple of additional
> questions later on, but for now:
>
> 1) "All create/write phases must run for at least 300 seconds; the
> stonewall flag must be set to 300 which should ensure this."
>
> Is it acceptable to set the stonewall higher than 300, or is a setting
> of exactly 300 required?
>
> 2) "The file names for the mdtest output files may not be pre-created."
>
> Does this also include the directories? We have the ability to pin
> directories to specific MDSes that helps in the easy tests. We also have
> an experimental feature that more or less does this pseudo-randomly
> behind the scenes so long as a top level xattr is set, but it would be
> convenient if we could just pre-create the mdtest directories and set
> the xattr to pin them individually in the "directory setup" phase of the
> test if allowed. Likewise, we have code that allows users to provide a
> hint if a specific directory is expected to have lots of files which can
> improve performance in the hard tests. I would like to pre-create the
> mdtest directory so that we can set the xattr informing ceph that we
> expect a lot of files to be written in that directory.
>
> 3) "Only submissions using at least 10 physical client nodes are
> eligible to win IO500 awards and at least one benchmark process must run
> on each."
>
> We are planning on running on AWS. So long as we are using 10+ metal
> nodes does that meet the requirement to have "at least 10 physical
> client nodes"?
>
> Thanks,
>
> Mark
>
>
>
> _______________________________________________
>
> IO-500 mailing list
>
> IO-500(a)vi4io.org <mailto:IO-500@vi4io.org>
>
>
https://www.vi4io.org/mailman/listinfo/io-500
>
>
>
>
>
>
> Cheers, Andreas
--
Dr. Julian Kunkel
Lecturer, Department of Computer Science
+44 (0) 118 378 8218
http://www.cs.reading.ac.uk/
https://hps.vi4io.org/
PGP Fingerprint: 1468 1A86 A908 D77E B40F 45D6 2B15 73A5 9D39 A28E