environmental information to collect
by John Bent
All,
A great first list. Now let's think about the next list, for ISC. We agree
that the benchmarks themselves will not change, although we might add some
optional additional tests.
The question I'd like to put before the community right now is how to
improve the usefulness of the collected numbers. I think one way to do
this is to collect additional environmental information. For example, we
currently record the number of client nodes and the number of procs per
node. This allows analysis of client scalability and enables a per-client
score ranking. We should collect more information, such as server-side
details like the number of servers and the number of devices; this will
enable a similar server-scalability analysis.
As such, I have two specific questions right now:
1. What is the best way to collect this environmental information?
2. What environmental information should we collect?
Right now, I've added the following variables into io500.sh, which we are
asking submitters to fill in. This will allow us to automate data
collection on the back end but may require submitters to do some research
when they do their run. There is a trade-off between what is useful to
collect and what puts an undue burden on the submitter.
# top level info
io500_info_system_name='xxx' # e.g. Oakforest-PACS
io500_info_institute_name='xxx' # e.g. JCAHPC
io500_info_storage_age_in_months='xxx' # age since the last refresh, not the install date
io500_info_storage_install_date='xxx' # MM/YY
io500_info_filesystem='xxx' # e.g. BeeGFS, DataWarp, GPFS, IME, Lustre
io500_info_filesystem_version='xxx'
# client side info
io500_info_num_client_nodes='xxx'
io500_info_procs_per_node='xxx'
# server side info
io500_info_num_metadata_server_nodes='xxx'
io500_info_num_data_server_nodes='xxx'
io500_info_num_data_storage_devices='xxx' # if you have 5 data servers, and each has 5 drives, then this number is 25
io500_info_num_metadata_storage_devices='xxx' # if you have 2 metadata servers, and each has 5 drives, then this number is 10
io500_info_data_storage_type='xxx' # HDD, SSD, persistent memory, etc.; feel free to list specific models
io500_info_metadata_storage_type='xxx' # HDD, SSD, persistent memory, etc.; feel free to list specific models
io500_info_storage_network='xxx' # infiniband, omnipath, ethernet, etc
io500_info_storage_interface='xxx' # SAS, SATA, NVMe, etc
# miscellaneous
io500_info_whatever='WhateverElseYouThinkRelevant'
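For illustration, a filled-in version of a few of these variables might look like the following; the values are invented examples (drawing on the sample names in the comments above), not a real submission:

```shell
# Hypothetical example values only -- not a real submission.
io500_info_system_name='Oakforest-PACS'
io500_info_institute_name='JCAHPC'
io500_info_filesystem='IME'
io500_info_num_client_nodes='128'
io500_info_procs_per_node='16'
io500_info_num_data_server_nodes='5'
io500_info_num_data_storage_devices='25'  # 5 data servers x 5 drives each
```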
One thing that might be useful is a set of scripts to collect this info
automatically. These might be filesystem-specific: for example, we could
include 'utilities/collect_XXX.sh' scripts that automatically collect
useful information for various filesystems XXX, such as BeeGFS, DataWarp,
GPFS, IME, and Lustre. Perhaps there are also ways to collect it in a
filesystem-agnostic way. Here is the environmental information we are
currently collecting:
echo "System: " `uname -n`
echo "filesystem_utilization=$(df ${io500_workdir}|tail -1)"
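A filesystem-agnostic collector along these lines might look like the sketch below; the script name and the extra fields beyond the two lines above are assumptions for illustration, not part of io500.sh:

```shell
#!/bin/bash
# Sketch of a hypothetical filesystem-agnostic collector
# (e.g. utilities/collect_generic.sh -- the name is an assumption).
# Falls back to the current directory if io500_workdir is unset.
io500_workdir=${io500_workdir:-.}

echo "System: $(uname -n)"
echo "kernel=$(uname -sr)"
# df -T (GNU coreutils) reports the filesystem type in the second column.
echo "filesystem_type=$(df -T "${io500_workdir}" | tail -1 | awk '{print $2}')"
echo "filesystem_utilization=$(df "${io500_workdir}" | tail -1)"
```

Filesystem-specific collectors could then append tool output such as server counts on top of this common baseline.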
Thanks and looking forward to hearing all of your ideas about this,
John
System descriptions for IO500 submissions
by Julian Kunkel
Dear all,
thanks again to all those who have already submitted to the IO500.
An important piece of the submission is to provide information about
the system where the test is run. A considerable amount of work has gone
into improving the database behind it; that process is not yet complete.
However, the system description part is finished, so one can already add
detailed information about the nodes and devices of the supercomputer and
the data center.
We will then link back from the IO-500 results to the test description.
Have a look for example at DKRZ's page:
https://www.vi4io.org/hpsl/2017/deu/dkrz/start
I want to encourage people to update or submit new information to that
list when they submit to IO500.
The entry page is here:
https://www.vi4io.org/hpsl/start
If you have any questions about how to perform these changes, send them to
me or the list. Editing is very easy with the provided UI: just click on
the red button. If you send me the site information for your submissions,
I can create stubs.
Regards,
Julian
--
http://wr.informatik.uni-hamburg.de/people/julian_kunkel
New improved parallel find tool
by John Bent
Hello all,
As a reminder, the "find" portion of the IO500 is the least prescribed
portion and we encourage you all to innovate heavily in this area by
writing your own custom tools. However, we do realize that many of you are
using the provided tools. Unfortunately, these provided tools have been
suboptimal.
As such, we are happy to announce that Julian has written a new parallel
find tool that is a significant improvement over the previous one. It is
written in C instead of Python, so it should be easier to get up and
running in your environment.
Additionally, we have found that the "find" phase can take an intractable
amount of time, so Julian also added stonewall functionality. If you do
use the stonewall, we will count the score but will mark it accordingly.
Instructions for using it are in io500.sh:setup_find. You will want to
pull a fresh clone of the GitHub repository and rerun
./utilities/prepare.sh to get everything set up.
Please let us know if you encounter any difficulties.
Thanks,
John Julian Jay