I just wanted to add a note before the holidays, when everyone disappears for a week or two.
I think we have settled on the following:
1. Anything that is capable of being treated as “storage” should have separate benchmarking. This will give us some idea of how things like SCR will perform. It will also tell us the worst-case performance should data be unavailable in the fast tier or should the fast tier's capacity be exceeded, requiring data migration/retargeting. If data migration is required, some measure of the simultaneous drain/fill should also be benchmarked.
2. We need to settle on benchmarks for traditional HPC workloads, such as engineering codes and bulk synchronous simulations with distributed, but dependent data sets. We also need to determine what benchmarks we want to support for all phases of data movement/staging for other workloads, such as bio/genomics, chemistry, or data analytics workloads. Data distribution and reading performance are important. In the case of flash, erasing the data from a previous application needs to be included unless there is some guarantee that it won’t be an issue (doubtful).
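The “simultaneous drain/fill” measurement mentioned in point 1 could be sketched roughly as follows. This is only an illustrative microbenchmark skeleton, not an agreed benchmark definition: the directory names, file counts, and sizes are all hypothetical, and the two tiers are simulated by two directories on the same file system.

```python
# Hypothetical sketch: one thread "drains" (reads) objects from a simulated
# slow tier while another "fills" (writes) objects into a simulated fast tier,
# and we time both together. All parameters are illustrative assumptions.
import os, tempfile, threading, time

NUM_FILES = 16
FILE_SIZE = 64 * 1024  # 64 KiB per object

root = tempfile.mkdtemp(prefix="drainfill-")
src = os.path.join(root, "slow_tier"); os.mkdir(src)
dst = os.path.join(root, "fast_tier"); os.mkdir(dst)

# Pre-populate the "slow tier" with data to drain.
for i in range(NUM_FILES):
    with open(os.path.join(src, f"f{i}"), "wb") as f:
        f.write(os.urandom(FILE_SIZE))

drained = []  # bytes read per object, so we can sanity-check the drain side

def drain():
    for i in range(NUM_FILES):
        with open(os.path.join(src, f"f{i}"), "rb") as f:
            drained.append(len(f.read()))

def fill():
    for i in range(NUM_FILES):
        with open(os.path.join(dst, f"g{i}"), "wb") as f:
            f.write(b"y" * FILE_SIZE)

t0 = time.perf_counter()
threads = [threading.Thread(target=drain), threading.Thread(target=fill)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - t0
mib = NUM_FILES * FILE_SIZE * 2 / 2**20
print(f"moved {mib:.1f} MiB total in {elapsed:.3f} s")
```

A real version would of course place the two directories on the actual slow and fast tiers and use object counts large enough to defeat the page cache.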
Irene had some ideas about cloud workloads that may differ from those described above. Hopefully she can educate the rest of us on what we should include. Supporting Swift or S3 APIs is a simple, but likely inadequate, first step.
Sarp has tried this effort previously, but ran into serious issues. If he could share the specifics of what they ran into and/or what they developed before abandoning the project, that would be extremely valuable.
Regarding our document: I worked on the description of a metadata benchmark.
I also prototyped the behavior in the "md-real-io" benchmark and
validated that the pattern can be usefully implemented with NoSQL
solutions (MongoDB) as well as with the relational model.
Plugins are implemented for these; S3 and other interfaces will follow.
Surely there are alternatives.
I measured the pattern on a local file system and on Lustre, and in my
view it reveals the hardware performance (SSD vs. HDD, and file system
behavior) much better than mdtest does.
mdtest also has quite a lot of knobs to play with; perhaps I simply do
not know how to better expose the hardware characteristics and avoid
bulk optimizations / caching.
Some results are posted here:
Anyway, I wanted to move on with the high-level discussion, so here
are some thoughts about the access patterns / use cases. I'm open to
other use cases that make sense in big data and HPC.
Here is an excerpt of MDsmall (more in the document):
* Goal: Determine the performance for accessing small data objects
* The term data object refers to data that is created/accessed
independently of all other data objects.
* The benchmark shall determine the sustained performance of creating,
accessing, and deleting data objects while preventing caching.
* It shall simulate interactive usage, as interactive use of a system
by users often leads to such small accesses, since certain data
artifacts, such as source code, are small. For example, some users
compile a program consisting of many source code pieces on a file
system; they may run “ls”, “stat”, “cat”, or editors.
* Alternatively, it simulates a producer-consumer / stream processing system.
* N processes independently work on data objects; they behave like a
producer-consumer system / stream processing engine. Each process is a
consumer reading/deleting a data object and a producer creating a new
data object based on the previous data. A process consumes data from a
few (fixed) other processes and produces data for a few other processes.
* The objects are considered to be distributed across multiple logical
data sets (such as directories, buckets, databases, …), each data set
is considered to be a queue for the producer/consumers. The exact
mapping from logical to physical object does not matter, but all
processes shall be able to access all objects at any time.
* The producer/consumer queues can be considered to lead to the
following “communication” pattern (receive == read/delete, send ==
create):
* Assume D=1: a process receives data from the left process and sends
data to the right. (Virtually, the process has actually processed the
data and produced some product, but we don’t care about the
computation itself.)
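The D=1 ring pattern above can be sketched as follows. This is a simplified illustration, not the md-real-io implementation: the N "processes" are simulated sequentially in one loop, the logical data sets are plain directories, and names like NUM_RANKS and ITERATIONS are assumptions made for the sketch.

```python
# Sketch of the MDsmall D=1 producer/consumer ring: each rank consumes
# (reads + deletes) one small object from its left neighbor's data set and
# produces (creates) a new object in its own data set.
import os, tempfile

NUM_RANKS = 4    # N processes, simulated sequentially here
ITERATIONS = 3   # producer/consumer rounds
OBJ_SIZE = 128   # small-object payload in bytes

root = tempfile.mkdtemp(prefix="mdsmall-")
# One logical data set (here: a directory) per rank acts as its queue.
dirs = []
for r in range(NUM_RANKS):
    d = os.path.join(root, f"rank{r}")
    os.mkdir(d)
    # Seed each queue with one initial object so every consumer can start.
    with open(os.path.join(d, "obj0"), "wb") as f:
        f.write(b"x" * OBJ_SIZE)
    dirs.append(d)

ops = 0  # metadata operations performed (read + delete + create per step)
for it in range(ITERATIONS):
    for r in range(NUM_RANKS):
        left = dirs[(r - 1) % NUM_RANKS]       # the left neighbor's queue
        src = os.path.join(left, f"obj{it}")
        with open(src, "rb") as f:             # receive == read ...
            data = f.read()
        os.unlink(src)                         # ... and delete
        dst = os.path.join(dirs[r], f"obj{it + 1}")
        with open(dst, "wb") as f:             # send == create a new object
            f.write(data)                      # (real work would transform it)
        ops += 3
```

A real benchmark would run the ranks as truly concurrent processes, time the sustained operation rate, and vary the backend (file system, MongoDB, S3, ...) behind the same logical pattern.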
As mentioned before, to foster the benchmarking effort we are
organizing a workshop on the "understanding of I/O performance
behavior" on March 23rd and 24th, 2017, at DKRZ, Hamburg. We invite
each of you to participate in this workshop and (if you are
interested) to give a talk related to this topic.
Of course, you can also participate as a regular attendee without giving a talk.
Please fill in the form to indicate your interest in the workshop and,
optionally, a preliminary talk title:
== Abstract ==
Understanding I/O performance behavior is crucial for optimizing
I/O-intensive applications as well as the infrastructure of data centers.
However, with the dawn of new technologies such as NVRAM,
burst-buffers, active storage/function shipping, and network attached
memory, the complexity of storage infrastructure increases
significantly and the boundary between memory and storage blurs.
During the procurement of new systems, data centers have to ensure
that the applications' needs are met. Therefore, they need to define
the proper requirements for storage and provide I/O benchmarks that
represent application workloads to quantify and verify I/O performance.
The main goal of the workshop is the discussion of tools to identify
(in)efficient usage of I/O resources on modern storage subsystems
from the perspective of users and data centers.
The workshop covers:
* a discussion of design alternatives of storage architectures and
their implications for user workflows;
* the telemetry and monitoring information necessary to understand
actual rather than reported I/O activity, enabling efficient
performance optimization of systems and applications;
* the development of representative benchmarks resembling the applications' needs.
The discussion of alternative storage architectures lays the
foundation for the requirements of the monitoring and benchmarking
efforts. Speakers involved in storage and file system research will
present experience in alternative storage architectures, application
workflows, monitoring tools to identify bottlenecks in I/O, and
(benchmarking) tools to quantify I/O performance. Scientists involved
in various application domains can give an introduction to their
workflows and I/O requirements.
By bringing together application developers/users and I/O experts, we
support the development of tools that identify and quantify I/O
inefficiencies, benefiting both users and data centers.
If you have any questions, contact me or Jay.