Hi John,
Thanks for the reply! In the io500_fixed.sh script, it looks like we
read only the File creation result, not the Tree creation result, when
computing the mdtest_easy_write value in the summary (specifically the
File creation Max value). To verify, I extracted the summary and
mdtest_easy_write results from the SC19 weka result archive and manually
ran the grep statement from get_mdt_iops. Is my understanding
correct? Is the Tree creation time included somewhere else?
From mdtest_easy_write.txt:
Operation                Max          Min         Mean    Std Dev
---------                ---          ---         ----    -------
File creation :  2235658.143  2235648.983  2235654.026      0.986
Tree creation :       10.012       10.012       10.012      0.000
From the summary:
[RESULT] IOPS phase 1 mdtest_easy_write 2235.660 kiops : time 436.35 seconds
grep '^ *File 'creation mdtest_easy_write.txt | head -n 1 | awk '{print $4/1000}'
2235.66
grep '^ *File 'creation mdtest_easy_write.txt | head -n 1 | awk '{print $4}'
2235658.143
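For anyone who wants to reproduce the check without the archive, here is a minimal sketch. It assumes the result file has the same column layout as the excerpt above; the helper name max_kilo and the temporary sample file are my own, not part of io500_fixed.sh.

```shell
# Build a small sample file from the values quoted above.
# The path comes from mktemp and is purely illustrative.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Operation                      Max            Min           Mean        Std Dev
---------                      ---            ---           ----        -------
   File creation     :    2235658.143    2235648.983    2235654.026          0.986
   Tree creation     :         10.012         10.012         10.012          0.000
EOF

# Hypothetical helper: print the Max column (awk field 4) of the first
# row whose label matches, divided by 1000.
# usage: max_kilo "<row label>" <file>
max_kilo() {
    grep "^ *$1" "$2" | head -n 1 | awk '{print $4/1000}'
}

max_kilo 'File creation' "$sample"   # prints 2235.66, matching the summary
max_kilo 'Tree creation' "$sample"   # the row the summary does not appear to use
```

Only the File creation row reproduces the 2235.66 figure from the summary, which is what prompted the question.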
Mark
On 5/28/20 2:01 PM, John Bent wrote:
Hey Mark,
I believe the rationale is that the process of creating directories
takes time. Some of the tests might create a lot of directories and
we want that included in the measured time. Precreating the
directories would move work from the measured benchmark phase to an
unmeasured precreate phase. Does that help?
Thanks,
John(*)
* These statements merely reflect my own personal view; the only
mechanism for announcing official IO500 policies and decisions is the
committee(a)io500.org email address.
On Thu, May 28, 2020 at 12:17 PM Mark Nelson <mnelson(a)redhat.com> wrote:
Thinking about this more, could I please ask what the rationale here
is? Ultimately we'll do the pinning one way or another (maybe not for
ISC20, we'll see). Right now we pin the easy mdtest subdirs
individually in the script. We can accomplish the same thing with a
top-level xattr and round-robin as subdirectories are created inside
ceph itself, but that just trades user control over the pinning scheme
for the convenience of setting a top-level xattr. The whole thing is
pretty arbitrary except that this isn't how we do it right now.
I'd like to understand where you guys are coming from on this one. Are
you worried about being able to game the benchmark if you can set subdir
xattrs? Wouldn't a real performance-focused user potentially want to
set subdir tunings (and not just in the ceph case) for a real-world
use case that the easy mdtest benchmark is supposed to represent?
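To make the per-directory approach concrete, here is a rough dry-run sketch of round-robin pinning from a script. It only prints the setfattr commands and does not run them. The mount point, the dir.N naming, and the MDS count are placeholders for illustration, not what our actual script uses; ceph.dir.pin is the CephFS export-pin xattr.

```shell
# Dry run: print the pinning commands instead of executing them.
num_mds=4                       # assumed number of active MDS ranks
base=/mnt/cephfs/mdtest-easy    # hypothetical mount point and directory

# Hypothetical helper: assign directories to MDS ranks round-robin.
# usage: pin_cmds <base dir> <number of subdirs> <number of MDS ranks>
pin_cmds() {
    i=0
    while [ "$i" -lt "$2" ]; do
        rank=$(( i % $3 ))      # round-robin rank for subdirectory i
        echo "setfattr -n ceph.dir.pin -v $rank $1/dir.$i"
        i=$(( i + 1 ))
    done
}

pin_cmds "$base" 6 "$num_mds"
```

The point of doing it this way is that the user chooses the mapping from directory to rank, which is exactly the control that a single top-level xattr gives up.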
Mark
On 5/28/20 12:31 PM, Mark Nelson wrote:
> Sigh. That means I'll need to have our ephemeral pinning code do it
> inside ceph rather than just pinning those directories in the script
> as I've been doing previously. Not impossible, just more work to do
> under the time crunch while also trying to debug the API issues with
> the new C version of the benchmark. This is getting rather frustrating.
>
>
> Mark
>
>
> On 5/28/20 12:24 PM, John Bent wrote:
>> Mark and all,
>>
>> The committee just added a rule clarifying precreation of directories
>> to the rules page: https://www.vi4io.org/io500/rules/submission. The
>> newly added rule states:
>>
>> "Each of the four main phases (IOR easy and hard, and mdtest easy and
>> hard) has a subdirectory which can be precreated and tuned (e.g.
>> using tools such as lfs_setstripe or beegfs_ctl); however, additional
>> subdirectories within these subdirectories cannot be precreated."
>>
>> Below my signature, I am including my standard disclaimer that my
>> email is not necessarily an official IO500 position but note that the
>> rules page itself is. :)
>>
>> Hope this is clear; please do reply with any questions or need for
>> further clarification,
>>
>> Thanks,
>>
>> John(*)
>> * These statements merely reflect my own personal view; the only
>> mechanism for announcing official IO500 policies and decisions is the
>> committee(a)io500.org email address.
>>
>>
>> On Wed, May 27, 2020 at 5:14 PM John Bent <johnbent(a)gmail.com> wrote:
>>
>> Hey Mark,
>>
>> Thanks for the interest. It will be great to get your contributions!
>>
>> 1. Must be exactly 300 seconds.
>> 2. Does not include the directories. Other historical submissions
>> have tuned the directories exactly as you describe.
>> 3. Yes, 10+ metal nodes in AWS satisfies this requirement.
>>
>> Other committee members, and community members, please chime in if
>> I got anything wrong! Mark, you might note the disclaimer below
>> my signature which is just our committee's way of being careful.
>> I'll make sure to discuss this email with the rest of the
>> committee and will let you know if any of my answers need official
>> clarification.
>>
>> Thanks,
>>
>> John(*)
>>
>> * These statements merely reflect my own personal view; the only
>> mechanism for announcing official IO500 policies and decisions is
>> the committee(a)io500.org email address.
>>
>>
>> On Wed, May 27, 2020 at 4:44 PM Mark Nelson via IO-500
>> <io-500(a)vi4io.org> wrote:
>>
>> Hi Folks,
>>
>>
>> We are thinking about throwing together some cephfs io500 results for
>> ISC20 and I just wanted to make sure that we are doing the right thing
>> in a couple of cases. Any help would be much appreciated since we've
>> never submitted results before. We might have a couple of additional
>> questions later on, but for now:
>>
>>
>> 1) "All create/write phases must run for at least 300 seconds; the
>> stonewall flag must be set to 300 which should ensure this."
>>
>> Is it acceptable to set the stonewall higher than 300, or is a setting
>> of exactly 300 required?
>>
>>
>> 2) "The file names for the mdtest output files may not be pre-created."
>>
>> Does this also include the directories? We have the ability to pin
>> directories to specific MDSes, which helps in the easy tests. We also have
>> an experimental feature that more or less does this pseudo-randomly
>> behind the scenes so long as a top-level xattr is set, but it would be
>> convenient if we could just pre-create the mdtest directories and set
>> the xattr to pin them individually in the "directory setup" phase of the
>> test if allowed. Likewise, we have code that allows users to provide a
>> hint if a specific directory is expected to have lots of files, which can
>> improve performance in the hard tests. I would like to pre-create the
>> mdtest directory so that we can set the xattr informing ceph that we
>> expect a lot of files to be written in that directory.
>>
>>
>> 3) "Only submissions using at least 10 physical client nodes are
>> eligible to win IO500 awards and at least one benchmark process must run
>> on each."
>>
>> We are planning on running on AWS. So long as we are using 10+ metal
>> nodes does that meet the requirement to have "at least 10 physical
>> client nodes"?
>>
>>
>> Thanks,
>>
>> Mark
>>
>> _______________________________________________
>> IO-500 mailing list
>> IO-500(a)vi4io.org
>> https://www.vi4io.org/mailman/listinfo/io-500
>>