Hi Peng,
IO500 is intended to be a companion to Top500, in that the two benchmarks target machines
running similar workloads. Also note that Blue Waters at NCSA/UIUC is one of the most
high-profile machines not to submit to Top500. There are certainly many others, as has been
pointed out.
If you have a particular goal in mind in looking over the list, we may be able to help
guide your search. Let us know if we can help. If there are particular benchmark goals you
have, we can probably help with that as well. The community guided the development of the
IO500 suite. There are a few very specific tests we don't have, such as a massively
parallel load of Python or Java libraries. We haven't found a generally acceptable
benchmark to represent these specific operations. Otherwise, general I/O tasks should be
well represented for both easy and hard test cases.
Best,
Jay
From: IO-500 <io-500-bounces(a)vi4io.org> on behalf of George Markomanolis via IO-500
<io-500(a)vi4io.org>
Reply-To: Georgios Markomanolis <george(a)markomanolis.com>
Date: Tuesday, September 17, 2019 at 1:52 PM
To: Peng Yu <pengyu.ut(a)gmail.com>
Cc: "io-500(a)vi4io.org" <io-500(a)vi4io.org>, Julian Kunkel
<juliankunkel(a)googlemail.com>
Subject: [EXTERNAL] Re: [IO-500] A list file systems used in top I/O system
Hi,
Top500 is based on Linpack ( https://www.top500.org/project/linpack/ ), which many people
believe doesn't correspond to the scientific applications that we use today. So I
wouldn't say it is more related to scientific computation. IO500 is about I/O; whether you
do science or not is another topic, though we hope that the people who use storage do
science. It is not embarrassingly parallel: the IO500 suite includes many benchmarks
in which you can write one file per MPI process or a shared file across MPI processes. Top500
and IO500 don't have the same purpose, except for evaluating different systems. A
parallel application may write data; IO500 evaluates how efficiently you can read and write
data, perform metadata operations, etc. You can find some talks about IO500 here:
https://www.vi4io.org/io500/news
regards,
George
On Tue, Sep 17, 2019 at 3:41 PM Peng Yu via IO-500
<io-500@vi4io.org> wrote:
I assume the top500.org systems are for applications different
from those of io500. After all, top500.org, in my opinion, is
more related to scientific computation, in which the scientific problems to solve are not
embarrassingly parallelizable. For io500, I'd expect more embarrassingly
parallelizable applications. In such cases, the storage size is more important, and the
write/read pattern should be more serial. This probably explains the big difference
between the two.
Maybe others can also share their opinions on the difference.
On Tue, Sep 17, 2019 at 1:57 PM John Bent
<johnbent@gmail.com> wrote:
Thanks Patrick. Peng, here's something else from Cray showing that Lustre was the
file system on 77% of the top 100 systems from the June 2018
top500.org list:
[image001.png: chart from the Cray blog post linked below; image not preserved]
https://www.cray.com/blog/business-cards-change-passion-open-source-lustr...
Which is not nearly the same ratio that we have seen thus far in IO500:
[image002.png: chart of file system shares in the IO500 list; image not preserved]
Which I just created using
https://www.vi4io.org/assets/io500/2019-06/data.csv ("The
list shows the best result for a given combination of
system/institution/filesystem"). Note that I did combine a few to reduce the number
of slices (e.g. s/GPFS/Spectrum Scale/).
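
For anyone who wants to reproduce a breakdown like this, here is a rough sketch of the counting step, not the script actually used for the chart. The column name "filesystem", the alias table, and the sample rows are all assumptions made for illustration; check the header of data.csv for the real column names before using it.

```python
import csv
import io
from collections import Counter

# Hypothetical alias table: merge names that refer to the same file system,
# in the spirit of s/GPFS/Spectrum Scale/.
ALIASES = {"GPFS": "Spectrum Scale"}


def filesystem_shares(csv_text, column="filesystem"):
    """Return {file system name: fraction of rows} for CSV content given as text.

    The column name is an assumption; pass the real one for the actual file.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter()
    for row in reader:
        name = row[column].strip()
        counts[ALIASES.get(name, name)] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: n / total for name, n in counts.items()}


# Made-up rows for illustration only; these are not taken from the real list.
SAMPLE = """system,filesystem
A,Lustre
B,GPFS
C,Spectrum Scale
D,BeeGFS
"""

if __name__ == "__main__":
    for name, share in sorted(filesystem_shares(SAMPLE).items()):
        print(f"{name}: {share:.0%}")
```

To run it on the real list, read the downloaded CSV into a string and pass it in along with the correct column name.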
--
Regards,
Peng
_______________________________________________
IO-500 mailing list
IO-500@vi4io.org
https://www.vi4io.org/mailman/listinfo/io-500