ISC BOF report
by John Bent
All,
We had a great session at ISC (about 30 people, I think) and made great progress in the weeks leading up to it as well. Thanks to Satoshi, we even got the two attached slides added to the official slide deck released from the Top500 session! Six people signed up at the BOF saying that they’ll run the benchmark when it is finalized.
I know I always say that we have ‘almost’ finalized the benchmark, but we really are getting much closer; it helped a great deal that Nathan combined the benchmarks and that George worked on the script.
I think we only have two open questions right now:
1. Do we do 47K random I/O in IOR-hard, or 47K simple strided? My original thinking was strided, but someone pointed out that the idea is to create the bounding box, and random is harder than strided. Random access may also be increasingly prevalent these days with more analytics, machine learning, graph analytics, etc. So I propose that we do random unless there are objections here (see the first sketch after this list).
2. Should we do some sort of mixed I/O workload in addition to running the 4 tests serially? I like the idea but am not sure exactly how to do it. Do we merely mix IOR-hard with IOR-easy, or md-hard with md-easy, or both, or all 4 at once? Do we just launch multiple command lines in the background and hope that the mpirun launch times are fast enough that they overlap? Or do we need to modify IOR/mdtest to split the ranks in half and run different workloads on the two halves (see the second sketch after this list)? Thoughts?
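To make the first question concrete, here is a minimal sketch of the difference between the two access patterns into a single shared file; the 47008-byte transfer size and the helper names are purely illustrative and are not the proposed IOR-hard parameters.
import random

TRANSFER_SIZE = 47008   # an unaligned "47K" transfer, illustrative only
XFERS_PER_RANK = 8      # tiny count to keep the example readable

def strided_offsets(rank, nranks):
    # Classic strided pattern: rank i issues transfer j at offset
    # (j * nranks + i) * TRANSFER_SIZE within the shared file.
    return [(j * nranks + rank) * TRANSFER_SIZE for j in range(XFERS_PER_RANK)]

def random_offsets(rank, nranks, seed=0):
    # Random pattern: the same set of transfer slots, but visited in a
    # shuffled order, which defeats sequential prefetch and write-behind.
    rng = random.Random(seed + rank)
    offsets = strided_offsets(rank, nranks)
    rng.shuffle(offsets)
    return offsets

print("strided:", strided_offsets(rank=1, nranks=4))
print("random: ", random_offsets(rank=1, nranks=4))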
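For the second question, one way to get concurrency without relying on overlapping mpirun launches is to split MPI_COMM_WORLD into two sub-communicators and give each half a different workload. The following is only a sketch of that communicator-splitting idea (shown with mpi4py for brevity, with placeholder workload functions), not a patch to IOR/mdtest.
from mpi4py import MPI

def run_data_workload(comm):
    # placeholder for the IOR-style half of a mixed workload
    print(f"data workload: rank {comm.Get_rank()} of {comm.Get_size()}")

def run_metadata_workload(comm):
    # placeholder for the mdtest-style half of a mixed workload
    print(f"metadata workload: rank {comm.Get_rank()} of {comm.Get_size()}")

world = MPI.COMM_WORLD
rank, size = world.Get_rank(), world.Get_size()

# Lower half of the ranks runs the data workload, upper half the metadata one.
color = 0 if rank < size // 2 else 1
sub = world.Split(color=color, key=rank)

(run_data_workload if color == 0 else run_metadata_workload)(sub)

world.Barrier()   # both halves finish before the mixed phase would be scored
sub.Free()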
We also made some procedural decisions. The initial steering committee will be Jay Lofstead, Julian Kunkel, and myself. That steering committee membership will last until the IO500 is up, running, and stable, at which point the community can nominate new members. All decisions will be discussed first on the mailing list, and we will try for as much consensus as possible. The VI4IO organization will host the IO500.
Thanks very much,
John
github
by John Bent
All,
Thanks to Julian for setting up a github repository for us hosted by VI4IO: https://github.com/VI4IO/io-500-dev
Thanks to George for populating it with the currently proposed benchmark that he’s been developing.
I have also added a few issues for various things we have been discussing, like the non-stonewalled stonewalling and mixed mode. These issues are not necessarily required yet; we may decide to close them without addressing them. But I wanted to document them in a proper issue tracker.
Thanks,
John
Detailed benchmark proposal
by Julian Kunkel
Somehow the mail from John did not get through, so here it is (if there is an email issue, please mail me). Thanks also to all those we had discussions with, beyond the Dagstuhl meeting...
------------------------------
*From:* John Bent <John.Bent(a)seagategov.com>
*Sent:* 16 June 2017 22:30:23 CEST
*To:* "io-500(a)vi4io.org" <io-500(a)vi4io.org>
*Subject:* Detailed benchmark proposal
All,
Sorry for the long silence on the mailing list. However, we have made some substantial progress recently as we prepare for our ISC BOF next week. For those of you at ISC, please join us from 11 to 12 on Tuesday in Substanz 1&2.
The progress that we have made recently happened because a bunch of us were attending a German workshop last month at Dagstuhl and had multiple discussions about the benchmark.
Here are the highlights from what was discussed and the progress that we made at Dagstuhl:
1. General agreement that the IOR-hard, IOR-easy, mdtest-hard, mdtest-easy approach is appropriate.
2. We should add a ‘find’ command, as this is a popular and important workload.
3. The multiple bandwidth measurements should be combined via geometric mean into one bandwidth number.
4. The multiple IOPS measurements should likewise be combined via geometric mean into one IOPS number.
5. The bandwidth and IOPS numbers should be multiplied to create one final score.
6. The ranking uses that final score, but the webpage can be sorted by other metrics.
7. The webpage should also allow filtering so that, for example, people can look at only the HDD results.
8. We should separate the write/create phases from the read/stat phases to help ensure that caching is avoided.
9. Nathan Hjelm volunteered to combine the mdtest and IOR benchmarks into one git repo and has now done so. This removes the #ifdef mess from mdtest, and both now share the nice modular IOR backend.
So the top-level summary of the benchmark in pseudo-code has become:
# write/create phase
bw1 = ior_easy -write [user supplies their own parameters maximizing data writes that can be done in 5 minutes]
md1 = md_test_easy -create [user supplies their own parameters maximizing file creates that can be done in 5 minutes]
bw2 = ior_hard -write [we supply parameters: unaligned strided into a single shared file]
md2 = md_test_hard -create [we supply parameters: creates of 3900-byte files into a single shared directory]
# read/stat phase
bw3 = ior_easy -read [cross-node read of everything that was written in bw1]
md3 = md_test_easy -stat [cross-node stat of everything that was created in md1]
bw4 = ior_hard -read
md4 = md_test_hard -stat
# find phase
md5 = [we supply parameters to find a subset of the files that were created in the tests]
# score phase
bw = geo_mean(bw1, bw2, bw3, bw4)
md = geo_mean(md1, md2, md3, md4, md5)
total = bw * md
Now we are moving on to precisely define what the parameters should look like for the hard tests and to create a standard so that people can start running it on their systems. By doing so, we will define the formal process so we can actually make this an official benchmark. Please see the attached file in which we’ve started precisely defining these parameters. Let’s start iterating on this file to get these parameters correct.
Thanks,
John
--
Dr. Julian Kunkel
Abteilung Forschung
Deutsches Klimarechenzentrum GmbH (DKRZ)
Bundesstraße 45a • D-20146 Hamburg • Germany
Phone: +49 40 460094-161
Fax: +49 40 460094-270
E-mail: kunkel(a)dkrz.de
URL: http://www.dkrz.de
Geschäftsführer: Prof. Dr. Thomas Ludwig
Sitz der Gesellschaft: Hamburg
Amtsgericht Hamburg HRB 39784
Email mistake
by John Bent
All,
As you may have noticed, I sent an email to the full list that I thought I was only sending to the committee. Sorry for the confusion; my fault for not double-checking the email headers before sending. In any event, this is an insight into our transparency and process. :)
John
detailed benchmark proposal
by John Bent
All,
Sorry for the long silence on the mailing list. However, we have made some substantial progress recently as we prepare for our ISC BOF next week. For those of you at ISC, please join us from 11 to 12 on Tuesday in Substanz 1&2.
The progress that we have made recently happened because a bunch of us were attending a German workshop last month at Dagstuhl and had multiple discussions about the benchmark.
Here are the highlights from what was discussed and the progress that we made at Dagstuhl:
1. General agreement that the IOR-hard, IOR-easy, mdtest-hard, mdtest-easy approach is appropriate.
2. We should add a ‘find’ command, as this is a popular and important workload.
3. The multiple bandwidth measurements should be combined via geometric mean into one bandwidth number.
4. The multiple IOPS measurements should likewise be combined via geometric mean into one IOPS number.
5. The bandwidth and IOPS numbers should be multiplied to create one final score.
6. The ranking uses that final score, but the webpage can be sorted by other metrics.
7. The webpage should also allow filtering so that, for example, people can look at only the HDD results.
8. We should separate the write/create phases from the read/stat phases to help ensure that caching is avoided.
9. Nathan Hjelm volunteered to combine the mdtest and IOR benchmarks into one git repo and has now done so. This removes the #ifdef mess from mdtest, and both now share the nice modular IOR backend.
So the top-level summary of the benchmark in pseudo-code has become:
# write/create phase
bw1 = ior_easy -write [user supplies their own parameters maximizing data writes that can be done in 5 minutes]
md1 = md_test_easy -create [user supplies their own parameters maximizing file creates that can be done in 5 minutes]
bw2 = ior_hard -write [we supply parameters: unaligned strided into a single shared file]
md2 = md_test_hard -create [we supply parameters: creates of 3900-byte files into a single shared directory]
# read/stat phase
bw3 = ior_easy -read [cross-node read of everything that was written in bw1]
md3 = md_test_easy -stat [cross-node stat of everything that was created in md1]
bw4 = ior_hard -read
md4 = md_test_hard -stat
# find phase
md5 = [we supply parameters to find a subset of the files that were created in the tests]
# score phase
bw = geo_mean(bw1, bw2, bw3, bw4)
md = geo_mean(md1, md2, md3, md4, md5)
total = bw * md
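The find phase is still only loosely specified; purely as an illustration of “find a subset of the files that were created in the tests”, here is a sketch that walks a benchmark directory, counts the files matching the mdtest-hard file size, and reports a files-per-second rate. The directory name and the size filter are assumptions for the example, not proposed parameters.
import os
import time

def timed_find(root, match_size=3900):
    # Walk `root`, count files whose size matches the mdtest-hard file size,
    # and report how many files per second the scan examined.
    start, scanned, matched = time.time(), 0, 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            scanned += 1
            try:
                if os.path.getsize(os.path.join(dirpath, name)) == match_size:
                    matched += 1
            except OSError:
                pass  # a file may disappear between listing and stat
    elapsed = time.time() - start
    rate = scanned / elapsed if elapsed > 0 else float("inf")
    return matched, scanned, rate

matched, scanned, rate = timed_find("io500-datadir")  # assumed directory name
print(f"matched {matched} of {scanned} files at {rate:.0f} files/s")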
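And as a concrete illustration of the score phase, here is a minimal sketch of the proposed combination; the geometric mean is as described above, while the example numbers and units (GiB/s for bandwidth, kIOPS for metadata, with md5 being the find rate) are invented purely to show the arithmetic.
from math import prod

def geo_mean(values):
    # Geometric mean, as proposed for combining the sub-benchmark results.
    return prod(values) ** (1.0 / len(values))

# Invented example results, not measurements.
bw_results = [123.0, 4.5, 150.0, 9.8]          # bw1..bw4 in GiB/s
md_results = [310.0, 45.0, 290.0, 52.0, 80.0]  # md1..md5 in kIOPS (md5 = find)

bw = geo_mean(bw_results)
md = geo_mean(md_results)
total = bw * md
print(f"bw = {bw:.2f} GiB/s, md = {md:.2f} kIOPS, score = {total:.2f}")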
Now we are moving on to precisely define what the parameters should look like for the hard tests and to create a standard so that people can start running it on their systems. By doing so, we will define the formal process so we can actually make this an official benchmark. Please see the attached file in which we’ve started precisely defining these parameters. Let’s start iterating on this file to get these parameters correct.
Thanks,
John
Re: [IO-500] Detailed benchmark proposal
by Georgios Markomanolis
Hello everybody and Michael,
I have sent two emails to the list that will probably arrive later(??); if you have not already received them, I am sorry that you get them late.
Michael, this is exactly what I was also discussing with Julian today. It is OK to have some generic benchmarks, but, as one who has faulted Linpack for not being related to an application, I think we should also have some benchmarks related to applications.
My proposal was: let’s discuss which applications consume the largest percentage of core-hours on supercomputers across various sites and try to mimic their I/O through IOR or something else, but through IOR first, as it is a ready and well-known solution. If that is not possible, we could look at other solutions.
For example, we already tested md-real today, and I mentioned that I would also like to create fewer but larger files, because this could mimic cases where users are doing data assimilation: they run multiple models with various parameters and do not save large data, since they execute, for example, 100 instances of the same model at the same time.
About the duration of the benchmark, one personal concern is that it should not be as complicated as Linpack, nor take so long that it becomes difficult for people to find resources, etc. I assume that 3 hours is more than enough, but I agree that it depends. However, if I create a lot of data, I know that I will disturb other users on Lustre, and I do not want to do that for 3 hours.
About safe data, how would you check that? About pmem and the future, the benchmark (suite) could be updated; at the moment not many people have access to this, right? Of course, this does not mean that we should not get ready for it, but I am saying let’s get the basics ready first and then we will get involved; just my opinion.
Everybody wants a benchmark suite that will help with their next procurement, right?
I hope to have nice discussions at the BOF with all the people who will be there.
Best regards,
George
________________________________________
George Markomanolis, PhD
Computational Scientist
KAUST Supercomputing Laboratory (KSL)
King Abdullah University of Science & Technology
Al Khawarizmi Bldg. (1) Room 0123
Thuwal
Kingdom of Saudi Arabia
Mob: +966 56 325 9012
Office: +966 12 808 0393
From: IO-500 <io-500-bounces(a)vi4io.org> on behalf of Michael Kluge <michael.kluge(a)tu-dresden.de>
Date: Sunday, 18 June 2017 at 9:01 PM
To: "io-500(a)vi4io.org" <io-500(a)vi4io.org>
Subject: Re: [IO-500] Detailed benchmark proposal
Hi all,
IOR has an option to allocate a certain amount of the host’s memory. I suggest that we set this to 90-95 percent and require the total amount of data written to be twice the size of main memory. Otherwise, the 10+ PB of main memory of SUMMIT would make the list useless ;)
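A rough back-of-the-envelope sketch of that run rule; the node count, per-node memory, and the exact hog fraction are invented for the example and are not agreed parameters.
def run_rule_sizes(nodes, mem_per_node_gib, hog_fraction=0.925, write_multiple=2):
    # Memory to hog on each node (90-95% of it) and the minimum aggregate
    # amount of data to write (twice the total main memory), both in GiB.
    hog_per_node = hog_fraction * mem_per_node_gib
    min_write_total = write_multiple * nodes * mem_per_node_gib
    return hog_per_node, min_write_total

# Invented example: 100 client nodes with 256 GiB of memory each.
hog, total = run_rule_sizes(nodes=100, mem_per_node_gib=256)
print(f"hog ~{hog:.0f} GiB per node, write at least {total / 1024:.1f} TiB in aggregate")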
If I read everything correctly, the current run rules define an execution time of 5 minutes and simply count the number of bytes/IOPS/files touched during this time. I agree that most of the time our users do I/O in bursts. Is the benchmark basically only about “who can write the most data with one file per process in 5 minutes”? Why 5 minutes, and not “how long does it take to dump 80% of the main memory to some redundant permanent storage” (with fsync())?
Do we want to define some rules about how safe the data has to be? Should it be OK if this data ends up in a single burst buffer with no copy anywhere? I would recommend that results are only valid if the data survives the failure of one of the storage devices used.
For example, for the mdtest workload I could imagine a file system that has directory locking turned off and uses an SSD/NVRAM backend, and thus would just behave like the “IOR hard” workload.
Another point is that I am more a fan of application-driven benchmarks. The numbers above do not tell me anything about my applications, so why should I actually run the benchmark? Just to be “on the list”?
Application-driven benchmarks (something like SPEC CPU, but SPEC I/O) that scale with the machine (and with the machine’s main memory) could actually become a standard that industry could also use to advertise their systems.
In addition, if we as a site have I/O patterns that are close to one of the benchmarks, we could put some weight on this benchmark and adjust our tenders accordingly, and the industry partner would know how to design the storage system with respect to our special requirements, simply because they know the I/O pattern: it is a standard and they know how to deal with it.
One more thing that the current approach does not deal with at all is the fact that in the very near future applications will access persistent storage through interfaces that IOR does not cover, storing data with CPU mov (load/store) instructions. Thus, if the list is established using some combination of IOR+mdtest+POSIX, I think it has no chance of reflecting the really fast I/O subsystems that are coming, like http://pmem.io/
Sorry for the lengthy statement …
Regards, Michael
--
Dr.-Ing. Michael Kluge
Technische Universität Dresden
Center for Information Services and
High Performance Computing (ZIH)
D-01062 Dresden
Germany
Contact:
Falkenbrunnen, Room 240
Phone: (+49) 351 463-34217
Fax: (+49) 351 463-37773
e-mail: michael.kluge(a)tu-dresden.de
WWW: http://www.tu-dresden.de/zih