I would like to remind you of an issue regarding the mdtest-hard
benchmark. It creates files in a single directory for at least five
minutes. As metadata performance improves, more files will be
created, and this will eventually hit the limit on the number of files
in a single directory.
Actually, the limit for the Lustre file system is about 8M files,
although it depends on the length of the pathname. When the metadata
performance is better than 27K IOPS, it is not possible to execute the
mdtest-hard create benchmark for five minutes, since the number of
files would exceed the limit.
This is one reason we cannot submit the Lustre result. I expect this
will become quite common as metadata performance improves.
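
The 27K IOPS threshold follows from dividing the directory limit by
the minimum run time. The sketch below only restates that arithmetic;
the constant and function names are illustrative, and the 8M figure is
the approximate limit quoted in this message, not an exact Lustre
parameter.

```python
# Assumptions (illustrative only): a directory limit of about 8M files
# and a 300 s minimum run time, both taken from the message above.
LUSTRE_DIR_LIMIT = 8_000_000
MIN_RUNTIME_S = 300


def max_sustainable_rate(dir_limit, runtime_s):
    """Highest create rate (files/s) that stays within dir_limit for runtime_s."""
    return dir_limit / runtime_s


def files_created(rate_per_s, runtime_s):
    """Number of files created at a constant rate over runtime_s seconds."""
    return int(rate_per_s * runtime_s)


if __name__ == "__main__":
    limit_rate = max_sustainable_rate(LUSTRE_DIR_LIMIT, MIN_RUNTIME_S)
    print(f"rate that exactly fills the directory: {limit_rate:.0f} files/s")
    print(f"files created at 27K IOPS: {files_created(27_000, MIN_RUNTIME_S)}")
```

At 27K creates per second, a 300 s run produces 8.1M files, which is
already above the approximate 8M limit.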
Osamu Tatebe, Ph.D.
Center for Computational Sciences, University of Tsukuba
1-1-1 Tennodai, Tsukuba, Ibaraki 3058577 Japan
The steering committee has reconvened, and I am glad to present two
exciting changes we made based on the feedback we heard from the
community: the 10 Node I/O challenge and privacy changes to the
submission process.
The purpose of these changes is to encourage submissions.
=== 10 Node challenge ===
At SC, we will announce another IO-500 winner for this separate 10
node challenge. This challenge is conducted using the regular IO-500
benchmark, but with the rule that exactly 10 client nodes must be
used to run the benchmark (one exception is find, which may use 1
node).
You may use any shared storage with, e.g., any number of servers.
We will announce the result in a separate derived list and in the full
list but not on the ranked IO-500 list at io500.org.
When submitting for the IO-500 list, you can opt in to "Participate
in the 10 node challenge only"; in that case we won't include the
results in the ranked list. Other 10 node submissions will be included
in the full list and in the ranked list.
This brings us to the following definitions.
=== Lists ===
For the lists, we have at the moment the following:
* The "ranked" IO-500 list: this is the one prominently shown at
  IO500.org. It shows only the best result for each site and system.
  It is actually a list derived from:
* The "full list", which contains every result, even if it is not the
  best for a site. You are encouraged to submit multiple results!
With the 10 node challenge, we will have another derived list from the
"full list" that only shows results submitted for exactly 10 nodes.
These changes come with extended privacy options during the submission process.
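
The relationship between the lists described above can be sketched as
two filters over the full list. This is only an illustration: the
`Submission` fields, the use of a single `score` to decide which
result is "best", and the function names are assumptions, not the
actual IO-500 data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    # Hypothetical fields, chosen for illustration only.
    site: str
    system: str
    client_nodes: int
    score: float
    ten_node_only: bool = False  # "Participate in the 10 node challenge only"


def ranked_list(full):
    """Best result per (site, system), skipping 10-node-only submissions."""
    best = {}
    for sub in full:
        if sub.ten_node_only:
            continue
        key = (sub.site, sub.system)
        if key not in best or sub.score > best[key].score:
            best[key] = sub
    return sorted(best.values(), key=lambda s: s.score, reverse=True)


def ten_node_list(full):
    """Every result from the full list that used exactly 10 client nodes."""
    ten = [sub for sub in full if sub.client_nodes == 10]
    return sorted(ten, key=lambda s: s.score, reverse=True)


if __name__ == "__main__":
    full = [
        Submission("SiteA", "sys1", 100, 50.0),
        Submission("SiteA", "sys1", 10, 20.0),
        Submission("SiteB", "sys2", 10, 30.0, ten_node_only=True),
    ]
    print([s.score for s in ranked_list(full)])
    print([s.score for s in ten_node_list(full)])
```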
=== Privacy ===
The SC release of the list will also include the name of the submitter
(or team) to give them the credit they deserve.
During the submission process, one will be able to individually opt
out of the release of certain information:
1) The name of the submitter/team
2) The file system
3) The name of the site and supercomputer
Therefore, if all of 1-3 are ticked, the results will be anonymous!
At the moment, we envision that after an embargo period the
information (2 and 3) is revealed and updated automatically in the
list.
The period is either 3 years or until the machine is decommissioned
(this will be asked during the submission process).
The name of the submitter can remain private (forever) if chosen.
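
One way to read the embargo rules above is sketched below. The class,
field names, and the handling of dates are assumptions made for
illustration; the actual submission form and list software may differ.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class PrivacyChoice:
    # Hypothetical representation of the three opt-outs and the embargo type.
    hide_submitter: bool
    hide_file_system: bool
    hide_site: bool
    submitted: date
    embargo_until_decommission: bool = False  # otherwise: 3 years
    decommissioned: Optional[date] = None


def add_years(d, years):
    """Shift a date by whole years, mapping Feb 29 to Feb 28 if needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def embargo_over(choice, today):
    if choice.embargo_until_decommission:
        return choice.decommissioned is not None and today >= choice.decommissioned
    return today >= add_years(choice.submitted, 3)


def visible_fields(choice, today):
    """Fields that may be shown on the list on the given day."""
    over = embargo_over(choice, today)
    shown = set()
    if not choice.hide_submitter:  # may stay private forever
        shown.add("submitter")
    if not choice.hide_file_system or over:
        shown.add("file_system")
    if not choice.hide_site or over:
        shown.add("site")
    return shown
```

With all three options ticked, `visible_fields` returns an empty set
until the embargo ends, and only the file system and site afterwards.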
I hope you agree that these changes and the new challenge will
encourage submissions.
We will update the text on the webpage shortly.
Comments are welcome.
The testing branch of the IO-500 scripts now supports stonewalling for
mdtest and ior, reducing the burden on users to define the appropriate
data size and number of metadata objects.
The creation phases of the benchmarks will start a wear-out phase
after (by default) 300s.
That means that after 300s, the processes exchange their current
number of iterations, and all processes must then reach this number
of iterations. Thus, it simulates block-synchronous behaviour that
has stragglers.
IMPORTANT: it may happen that certain processes are significantly
faster than others; then, of course, one has to wait longer than 300s.
Still, it reduces the burden of parameterizing the benchmark and the
risk of accidentally waiting for hours instead of minutes.
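
The effect of the wear-out phase on run time can be illustrated with a
small model. It is not the mdtest/ior implementation: it assumes each
process runs at a constant rate and that the exchanged target is the
maximum iteration count across processes, which is what the remark
about faster processes suggests. The function name is hypothetical.

```python
def stonewall_wearout(rates_per_s, stonewall_s=300.0):
    """Model a stonewalled creation phase followed by a wear-out phase.

    rates_per_s: constant iterations/s for each process (must be > 0).
    Returns (target_iterations, total_runtime_s).
    """
    # Iterations each process has finished when the stonewall timer fires.
    done = [int(rate * stonewall_s) for rate in rates_per_s]
    # Assumption: every process must catch up with the fastest one.
    target = max(done)
    # The slowest catch-up determines how long the wear-out phase lasts.
    extra_s = max((target - d) / rate for d, rate in zip(done, rates_per_s))
    return target, stonewall_s + extra_s


if __name__ == "__main__":
    # One process is twice as fast as the other.
    print(stonewall_wearout([100.0, 50.0]))
```

In this model, a process running at half the speed of the fastest one
doubles the total run time, which matches the warning above that the
run can take longer than 300s.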
The version has been tested by me on 64 nodes, and George tested it at KAUST.
If you want to give it a shot, use the branch:
The behaviour is controlled by a new variable:
For testing, you may want to set it to, e.g., 1 ;-)
Once this is effective, we can simplify the scripts further...
Thanks & Best,
Dr. Julian Kunkel
Lecturer, Department of Computer Science
+44 (0) 118 378 8218
I raised the following question during yesterday's workshop at ISC:
We operate a facility-wide centralized storage infrastructure to
which a number of different client systems (clusters) are connected.
None of the client systems alone is capable of saturating the bandwidth
of the storage system, and hence any IO500 submission using a single
system will not be representative of the performance/capabilities of
the overall infrastructure.
We are interested in a benchmark execution mode that would allow us to
assess the center-wide performance level. One possible option would be
to allow summing up concurrently executed IOR runs that have a
sufficiently large overlap. At least for the IOR easy/hard cases, that
would be a sensible number.
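
The proposal above could be sketched as follows. The message does not
define "sufficiently large overlap", so the 90% threshold, the way the
overlap is measured, and all names are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class IorRun:
    # Hypothetical record of one IOR run on one client cluster.
    cluster: str
    start_s: float
    end_s: float
    bandwidth_gib_s: float


def common_overlap_fraction(runs):
    """Time during which all runs were active, relative to the longest run."""
    latest_start = max(r.start_s for r in runs)
    earliest_end = min(r.end_s for r in runs)
    overlap = max(0.0, earliest_end - latest_start)
    longest = max(r.end_s - r.start_s for r in runs)
    return overlap / longest


def aggregate_bandwidth(runs, min_overlap=0.9):
    """Sum the bandwidths if the runs overlap enough, otherwise return None."""
    if common_overlap_fraction(runs) < min_overlap:
        return None
    return sum(r.bandwidth_gib_s for r in runs)


if __name__ == "__main__":
    runs = [
        IorRun("clusterA", 0.0, 300.0, 10.0),
        IorRun("clusterB", 10.0, 310.0, 20.0),
    ]
    print(aggregate_bandwidth(runs))
```

Summing is only meaningful while all runs are active at the same time,
which is why the sketch rejects runs that overlap too little.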
Thank you for your consideration.
Juelich Supercomputing Centre (JSC)
Institute for Advanced Simulation (IAS)
Phone: +49 2461 61-3631
Fax: +49 2461 61-6656
Forschungszentrum Juelich GmbH