environmental information to collect
by John Bent
All,
A great first list. Now let's think about the next list, for ISC. We agree
that the benchmarks themselves will not change, although we might add some
optional additional tests.
The question I'd like to put before the community right now is how to
improve the usefulness of the collected numbers. I think one way to do
this is to collect additional environmental information. For example, we
currently record the number of client nodes and the number of procs per
node. This allows analysis of client scalability and enables a per-client
score ranking. We should collect more information, such as server-side
details like the number of servers and the number of devices; this will
enable a similar server-scalability analysis.
As such, I have two specific questions right now:
1. What is the best way to collect this environmental information?
2. What environmental information should we collect?
Right now, I've added the following variables into io500.sh, which we are
asking submitters to fill in. This will allow us to automate data
collection on the back end but may require submitters to do some research
when they do their run. There is a trade-off between what is useful to
collect and what puts an undue burden on the submitter.
# top level info
io500_info_system_name='xxx' # e.g. Oakforest-PACS
io500_info_institute_name='xxx' # e.g. JCAHPC
io500_info_storage_age_in_months='xxx' # age since the last refresh, not the install date
io500_info_storage_install_date='xxx' # MM/YY
io500_info_filesystem='xxx' # e.g. BeeGFS, DataWarp, GPFS, IME, Lustre
io500_info_filesystem_version='xxx'
# client side info
io500_info_num_client_nodes='xxx'
io500_info_procs_per_node='xxx'
# server side info
io500_info_num_metadata_server_nodes='xxx'
io500_info_num_data_server_nodes='xxx'
io500_info_num_data_storage_devices='xxx' # if you have 5 data servers, and each has 5 drives, then this number is 25
io500_info_num_metadata_storage_devices='xxx' # if you have 2 metadata servers, and each has 5 drives, then this number is 10
io500_info_data_storage_type='xxx' # HDD, SSD, persistent memory, etc.; feel free to list specific models
io500_info_metadata_storage_type='xxx' # HDD, SSD, persistent memory, etc.; feel free to list specific models
io500_info_storage_network='xxx' # infiniband, omnipath, ethernet, etc
io500_info_storage_interface='xxx' # SAS, SATA, NVMe, etc
# miscellaneous
io500_info_whatever='WhateverElseYouThinkRelevant'
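For illustration, a filled-in version of a few of these variables might look like the following; the values are invented examples (drawing on the sample names in the comments above), not a real submission:

```shell
# Hypothetical example values only -- not a real submission.
io500_info_system_name='Oakforest-PACS'
io500_info_institute_name='JCAHPC'
io500_info_filesystem='IME'
io500_info_num_client_nodes='128'
io500_info_procs_per_node='16'
io500_info_num_data_server_nodes='5'
io500_info_num_data_storage_devices='25'  # 5 data servers x 5 drives each
```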
One thing that might be useful is a set of scripts to collect this info
automatically. These might be filesystem-specific: for example, we could
include 'utilities/collect_XXX.sh' scripts that automatically collect
useful information for various filesystems XXX, such as BeeGFS, DataWarp,
GPFS, IME, and Lustre. Perhaps there are also ways to collect it in a
filesystem-agnostic way. Here is the environmental information we are
currently collecting:
echo "System: " `uname -n`
echo "filesystem_utilization=$(df ${io500_workdir}|tail -1)"
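A filesystem-agnostic collector along these lines might look like the sketch below; the script name and the extra fields beyond the two lines above are assumptions for illustration, not part of io500.sh:

```shell
#!/bin/bash
# Sketch of a hypothetical filesystem-agnostic collector
# (e.g. utilities/collect_generic.sh -- the name is an assumption).
# Falls back to the current directory if io500_workdir is unset.
io500_workdir=${io500_workdir:-.}

echo "System: $(uname -n)"
echo "kernel=$(uname -sr)"
# df -T (GNU coreutils) reports the filesystem type in the second column.
echo "filesystem_type=$(df -T "${io500_workdir}" | tail -1 | awk '{print $2}')"
echo "filesystem_utilization=$(df "${io500_workdir}" | tail -1)"
```

Filesystem-specific collectors could then append tool output such as server counts on top of this common baseline.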
Thanks and looking forward to hearing all of your ideas about this,
John
System descriptions for IO500 submissions
by Julian Kunkel
Dear all,
thanks again to all those who have already submitted to the IO500.
An important piece of the submission is to provide information about
the system where the test is run. A considerable amount of work has gone
into improving the database behind it; that process is not yet complete.
However, the system description part is finished, so one can already add
detailed information about the nodes and devices of the supercomputer and
the data center.
We will then link back from the IO-500 results to the test description.
Have a look for example at DKRZ's page:
https://www.vi4io.org/hpsl/2017/deu/dkrz/start
I want to encourage people to update or submit new information to that
list when they submit to IO500.
The entry page is here:
https://www.vi4io.org/hpsl/start
If you have any questions about how to perform these changes, send them to
me or the list. Editing is very easy with the provided UI: just click on
the red button. If you send me the site information for your submissions,
I can create stubs.
Regards,
Julian
--
http://wr.informatik.uni-hamburg.de/people/julian_kunkel
New improved parallel find tool
by John Bent
Hello all,
As a reminder, the "find" portion of the IO500 is the least prescribed
portion and we encourage you all to innovate heavily in this area by
writing your own custom tools. However, we do realize that many of you are
using the provided tools. Unfortunately, these provided tools have been
suboptimal.
As such, we are happy to announce that Julian has written a new parallel
find tool that is a significant improvement over the previous one. It is
written in C instead of Python, so it should be easier to get up and
running in your environment.
Additionally, we have found that the "find" phase can take an intractable
amount of time, so Julian also added stonewall functionality. If you do
use the stonewall, we will count the score but will mark it accordingly.
Instructions for using it are in io500.sh:setup_find. You will want to
pull a fresh clone of the GitHub repository and rerun
./utilities/prepare.sh to get everything set up.
Please let us know if you encounter any difficulties.
Thanks,
John Julian Jay