All,
A great first list. Now let's think about the next list for ISC. We are
in agreement that the benchmarks themselves will not change although we
might add some additional optional tests.
The question I'd like to put before the community right now is how to
improve the usefulness of the collected numbers. I think one way to do
this is to collect additional environmental information. For example, we
currently record the number of client nodes and the number of procs per
node. This allows analysis of client scalability and a per-client score
ranking. We should also collect server-side information, such as the
number of servers and the number of devices; this will allow a similar
analysis of server scalability.
As such, I have two specific questions right now:
1. What is the best way to collect this environmental information?
2. What environmental information should we collect?
Right now, I've added the following variables to io500.sh, which we are
asking submitters to fill in. This will allow us to automate the data
collection on the back-end but may require the submitter to do some
research when they do their run. There is a trade-off between what is
useful to collect and what puts an undue burden on the submitter.
# top level info
io500_info_system_name='xxx' # e.g. Oakforest-PACS
io500_info_institute_name='xxx' # e.g. JCAHPC
io500_info_storage_age_in_months='xxx' # not install date but age since last refresh
io500_info_storage_install_date='xxx' # MM/YY
io500_info_filesystem='xxx' # e.g. BeeGFS, DataWarp, GPFS, IME, Lustre
io500_info_filesystem_version='xxx'
# client side info
io500_info_num_client_nodes='xxx'
io500_info_procs_per_node='xxx'
# server side info
io500_info_num_metadata_server_nodes='xxx'
io500_info_num_data_server_nodes='xxx'
io500_info_num_data_storage_devices='xxx' # if you have 5 data servers, and each has 5 drives, then this number is 25
io500_info_num_metadata_storage_devices='xxx' # if you have 2 metadata servers, and each has 5 drives, then this number is 10
io500_info_data_storage_type='xxx' # HDD, SSD, persistent memory, etc, feel free to put specific models
io500_info_metadata_storage_type='xxx' # HDD, SSD, persistent memory, etc, feel free to put specific models
io500_info_storage_network='xxx' # infiniband, omnipath, ethernet, etc
io500_info_storage_interface='xxx' # SAS, SATA, NVMe, etc
# miscellaneous
io500_info_whatever='WhateverElseYouThinkRelevant'
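To illustrate how the back-end (or the submitter) could check these
variables automatically, here is a minimal sketch that lists every
io500_info_* variable that is empty or still holds the 'xxx' placeholder.
The function name io500_info_unfilled is hypothetical and is not part of
io500.sh; the sketch also assumes that each value fits on a single line.

```shell
# Hypothetical helper (not part of io500.sh): print the name of every
# io500_info_* variable that is empty or still holds the 'xxx' placeholder.
# The body runs in a subshell so its loop variables do not leak out.
io500_info_unfilled() (
  # 'set' prints shell variables as name=value; keep only matching names.
  for name in $(set | sed -n 's/^\(io500_info_[A-Za-z0-9_]*\)=.*/\1/p' | sort); do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name}"
    case "$value" in
      ''|xxx) echo "$name" ;;
    esac
  done
)
```

Called right after the variable block, it could be used to warn rather
than abort, so that an incomplete submission is still possible:

    missing=$(io500_info_unfilled)
    [ -z "$missing" ] || echo "WARNING: please fill in: $missing"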
One thing that might be useful is a set of scripts to automatically
collect this info. They would likely be specific to each filesystem. For
example, perhaps we could include 'utilities/collect_XXX.sh' scripts that
automatically collect useful information for the various XXX filesystems,
such as BeeGFS, DataWarp, GPFS, IME, and Lustre. Perhaps there are also
ways to collect information in a filesystem-agnostic way. Here is the
environmental information we are currently collecting:
echo "System: " `uname -n`
echo "filesystem_utilization=$(df ${io500_workdir}|tail -1)"
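As a starting point for the filesystem-agnostic idea, here is a rough
sketch that breaks the same information into separate key=value lines
using only POSIX uname and df -P. The function name io500_collect_generic
and the key names are made up for illustration, and the parsing assumes
that neither the device name nor the mount point contains spaces.

```shell
# Hypothetical helper: print filesystem-agnostic facts about a directory
# as key=value lines, using only POSIX uname and df.
io500_collect_generic() {
  dir=${1:-.}   # directory to inspect; defaults to the current directory
  echo "hostname=$(uname -n)"
  echo "os=$(uname -s)"
  echo "kernel_release=$(uname -r)"
  # df -P prints one header line, then one line per filesystem with the
  # columns: source, 1024-blocks, used, available, capacity, mount point.
  df -P "$dir" | awk 'NR == 2 {
    print "fs_source=" $1
    print "fs_size_kib=" $2
    print "fs_used_kib=" $3
    print "fs_avail_kib=" $4
    print "fs_capacity=" $5
    print "fs_mountpoint=" $6
  }'
}
```

It would be called as io500_collect_generic "${io500_workdir}". Note that
POSIX df does not report the filesystem type; GNU df can add it with -T,
but that is not portable, so the type would still have to come from the
submitter or from a filesystem-specific script.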
Thanks and looking forward to hearing all of your ideas about this,
John