Hi Peng,

 

IO500 is intended to be a companion to Top500, in that both benchmarks target machines running similar workloads. Also note that Blue Waters at NCSA/UIUC is one of the highest-profile machines that does not submit to Top500. There are certainly many others, as has been pointed out.

 

If you have a particular goal in mind in looking over the list, we may be able to help guide your search. Let us know if we can help. If you have particular benchmark goals, we can probably help with that as well. The community guided the development of the IO500 suite. There are a few very specific tests we don't have, such as a massively parallel load of Python or Java libraries, because we haven't found a generally acceptable benchmark to represent these operations. Otherwise, general I/O tasks should be well represented by both the easy and the hard test cases.

 

Best,

 

Jay

 

From: IO-500 <io-500-bounces@vi4io.org> on behalf of George Markomanolis via IO-500 <io-500@vi4io.org>
Reply-To: Georgios Markomanolis <george@markomanolis.com>
Date: Tuesday, September 17, 2019 at 1:52 PM
To: Peng Yu <pengyu.ut@gmail.com>
Cc: "io-500@vi4io.org" <io-500@vi4io.org>, Julian Kunkel <juliankunkel@googlemail.com>
Subject: [EXTERNAL] Re: [IO-500] A list file systems used in top I/O system

 

Hi,

 

Top500 is based on Linpack ( https://www.top500.org/project/linpack/ ), which many people believe doesn't correspond to the scientific applications we use today. So I wouldn't say it is more related to scientific computation. IO500 is about I/O; whether you do science or not is another topic, though we hope that the people who use the storage do science. It is not specific to embarrassingly parallel workloads: the IO500 suite includes many benchmarks in which you can write one file per MPI process or a single file shared across MPI processes. Top500 and IO500 don't have the same purpose, except that both evaluate systems. A parallel application may write data, and IO500 evaluates how efficiently you can read and write data, perform metadata operations, etc. You can find some talks about IO500 here: https://www.vi4io.org/io500/news
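
To make the two access patterns concrete, here is a minimal sketch in plain Python. It is not IO500 code and does not use MPI: the "rank" is just an integer passed in, the function names and the file naming scheme are made up for illustration, and os.pwrite is only available on Unix-like systems.

```python
import os


def write_file_per_process(directory, rank, payload):
    """File-per-process pattern: every rank writes its own, separate file."""
    # Hypothetical naming scheme, one file per rank.
    path = os.path.join(directory, f"data.{rank:05d}")
    with open(path, "wb") as f:
        f.write(payload)
    return path


def write_shared_file(path, rank, payload):
    """Shared-file pattern: every rank writes its block into one common file.

    Assumes all ranks write payloads of the same length, so rank r owns the
    byte range [r * len(payload), (r + 1) * len(payload)).
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        # pwrite writes at an explicit offset without moving a file pointer.
        os.pwrite(fd, payload, rank * len(payload))
    finally:
        os.close(fd)
```

In a real parallel run each rank would call one of these functions concurrently; the first pattern stresses file creation and metadata, while the second makes all ranks contend for a single file.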

 

regards,

George

 

On Tue, Sep 17, 2019 at 3:41 PM Peng Yu via IO-500 <io-500@vi4io.org> wrote:

I assume the top500.org systems are for applications different from those of io500. After all, top500.org, in my opinion, is more related to scientific computation, in which the problems to solve are not embarrassingly parallelizable. For io500, I'd expect more embarrassingly parallelizable applications. In such cases, the storage size is more important, and the write/read pattern should be more serial. This probably explains the big difference between the two.

 

Maybe others can also share their opinions on the difference.

 

On Tue, Sep 17, 2019 at 1:57 PM John Bent <johnbent@gmail.com> wrote:

Thanks Patrick.  Peng, here's something else from Cray showing that Lustre was the file system on 77% of the top 100 systems from the June 2018 top500.org list:

 

 

 

Which is not nearly the same ratio that we have seen thus far in IO500:

 

 

Which I just created using https://www.vi4io.org/assets/io500/2019-06/data.csv ("The list shows the best result for a given combination of system/institution/filesystem").  Note that I did combine a few to reduce the number of slices (e.g. s/GPFS/Spectrum Scale/).
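
For anyone who wants to redo this kind of tally, a rough sketch in Python follows. It is not the script used for the chart above. The column name "filesystem" is an assumption, so check the header of the actual data.csv and adjust it; the only merge rule included is the GPFS to Spectrum Scale one mentioned above.

```python
import csv
import io
from collections import Counter

# Assumption: the column holding the file system name is called "filesystem".
FS_COLUMN = "filesystem"

# Names merged to reduce the number of slices. Only GPFS -> Spectrum Scale
# is taken from the thread; extend this mapping as needed.
ALIASES = {"gpfs": "Spectrum Scale"}


def normalize(name):
    """Map known aliases onto one canonical file system name."""
    cleaned = name.strip()
    return ALIASES.get(cleaned.lower(), cleaned)


def filesystem_shares(csv_text, column=FS_COLUMN):
    """Return {file system: fraction of rows} for the given CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(
        normalize(row[column]) for row in reader if row.get(column)
    )
    total = sum(counts.values())
    return {fs: n / total for fs, n in counts.items()} if total else {}
```

Given the contents of the CSV as a string, filesystem_shares returns the fraction of entries per file system, which can then be fed to any plotting tool to draw the slices.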

 

--

Regards,
Peng
