IMHO, the "10 physical nodes" requirement makes sense from the point of view
previously stated, that running 10 virtual hosts on the same physical machine
can dramatically skew the results since they can share the same memory and
do virtually no actual network traffic, bypassing the "read your neighbour"
requirement.
One of the main motivators for IO-500 is to explore filesystem scaling in real
clusters, and if we allow 10 virtual nodes, why not devolve into 10 containers
in the same OS instance, or 10 mountpoints on a single node with a local server?
I think that is bypassing the intent of the benchmark completely.
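To make the "read your neighbour" point concrete, here is a rough sketch
(Python, purely illustrative of the intent rather than the actual io500 harness
logic; the node/rank counts are made up) of why shifted reads only generate
network traffic when the clients are genuinely separate machines:

    nodes, ppn = 10, 4                  # hypothetical: 10 clients, 4 ranks each
    nranks = nodes * ppn
    remote = 0
    for rank in range(nranks):
        reader_node = rank // ppn
        writer = (rank + ppn) % nranks  # read back what a neighbouring node wrote
        if writer // ppn != reader_node:
            remote += 1
    print(remote / nranks)              # 1.0: every read crosses the network
    # With 10 VMs on one physical host, all of these "remote" reads land on
    # the same hardware and can be served from shared memory or the host
    # page cache instead of the wire.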
As for why the 10-node challenge exists, IMHO there are two motivations for this:
- show storage performance with a limited number of clients, so that users/admins
can get a realistic sense of how IO performance scales with the number of clients
(i.e. when benchmark numbers are limited by protocol and network performance),
whereas the hero numbers scale with the number of servers (i.e. when benchmark
numbers are limited by aggregate storage bandwidth and IOPS).
- let's be realistic: the list doesn't yet have 500 submissions, let alone
500 top systems, so this is a reasonable way to improve audience participation
that still provides some useful metrics.
In particular, I like the idea of comparing the 10-node result with the N-node
result to see just how much of the storage bandwidth can be driven by a small
number of clients. In theory, the 10-node case could saturate the clients' network
bandwidth (assuming aggregate storage bandwidth > 10x the per-client network
bandwidth), but in practice this is not always the case, and the gap points at
areas where CPU, protocol, or server efficiency could improve. I think the IO-500
has already driven real-world improvements (cf. Cambridge) that have improved the
lives of users. I think that 10-node results will continue to be useful to submit
even after there are more than 500 larger results available on the list.
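As a back-of-the-envelope sketch of that ceiling (the numbers are hypothetical,
not taken from any particular submission):

    per_client_GBps = 100 / 8            # one 100 Gb/s link ~= 12.5 GB/s
    ceiling_GBps = 10 * per_client_GBps  # ~125 GB/s aggregate for the 10-node run
    # A system whose servers can sustain well beyond ~125 GB/s is then
    # network-limited on 10 clients; falling short of that ceiling points at
    # client CPU, protocol, or server efficiency instead.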
As for the 1PB minimum, I think that would drive down participation, especially
in the (IMHO important) flash storage arena, since that much capacity can be
cost-prohibitive today. I think the list will naturally fill out over time, with
new, large systems coming online and submitting results during their acceptance
phase, pushing the 10-node results out of the top spots and eventually off the
IO-500 list entirely. In the meantime, I don't see a need to refuse valid results
while there is still a lot of room on the list that needs to be filled.
Cheers, Andreas
On Oct 4, 2019, at 10:46 AM, Carlile, Ken via IO-500 <io-500(a)vi4io.org> wrote:
I think the 1PB is a non-starter. Why exclude the small guys?
What confuses me is the statement that it's ok to run multiple VMs as long as the
iron count is 10. Or am I misreading that?
10 clients makes sense to me because certain places simply don't HAVE that many
clients to throw at the benchmark, and it normalizes the speeds across a standard number
of clients.
--Ken
> On Oct 4, 2019, at 12:43 PM, Dean Hildebrand via IO-500 <io-500(a)vi4io.org> wrote:
>
> Julien, Thanks for the examples.
>
> I think what you may be getting at is that the 10 client challenge is really about,
> "Given a large storage system that submits a result to the standard io500, how well
> does it do with only 10 clients?".
>
> If this is the case, and we don't want to encourage the submission of small
> non-scalable storage systems, then maybe there are other ways to achieve it such as:
> - A submission to the 10 client challenge is only valid if a submission is also made
> to the standard io500 list. Users can then look at both rankings to get an understanding
> of the system.
> - Each submission must have at least 1PB of storage capacity, which will increase by
> 10% each year.
>
> Just rough ideas, but maybe we need to clarify why an io500 list cares about 10
> clients?
> Dean
>
>
> On 10/3/19 1:39 AM, Julian Kunkel wrote:
>> Hi,
>> IMHO: A simple way of looking at this for the 10 node challenge is
>> that it really should be about 10 nodes with real interconnects, to
>> normalize results to some extent. Such runs reflect a realistic
>> configuration.
>> However, deploying 10 VMs on a single host and seeing a performance
>> gain vs. running directly on the host seems to be artificial.
>>
>> Regarding cheating: theoretically one could run 10 VMs on one big
>> node; the host could throttle the creation rates to a limit such that
>> all data fits in a big cache (say, NVDIMMs) from the perspective of
>> the host (and thus the VMs). Every read would then be served from
>> cache.
>>
>> Here is a rather artificial example (if you have more appropriate
>> numbers, use them):
>>
>> For IOR BW assume
>> * writes 5 GiB/s to NVDIMMs (throttled) => 1.5 TB * 2 write phases of
>>   space needed (5 GiB/s over the 5-minute minimum phase time) / doable.
>> * read 500 GiB/s.
>> => (5*5*500*500)^0.25 = 50 score
>> Not an issue so far.
>>
>> For MD, 10 million IOPS for create and 100 million IOPS for any
>> read/delete and find (i.e. 10000 and 100000 kIOPS, the unit the score
>> uses) would give
>> (10000*10000*100000*100000*100000*100000*100000*100000)^(1/8)
>> => 56234.13
>>
>> Total score: sqrt(56234*50) = 1676.812
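>>
>> (The same arithmetic in a few lines of Python, assuming the usual IO500
>> scoring structure of geometric means over the four bandwidth phases and
>> eight metadata phases; the variable names are only illustrative, not part
>> of the benchmark:
>>
>>     import math
>>     bw = [5, 5, 500, 500]            # GiB/s: easy/hard write and read
>>     md = [10000]*2 + [100000]*6      # kIOPS: creates vs. stat/read/delete/find
>>     def gmean(xs):
>>         return math.prod(xs) ** (1.0 / len(xs))
>>     print(math.sqrt(gmean(bw) * gmean(md)))  # ~1676.8, as above
>> )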
>>
>> Yes, it is a synthetic example, but there could be technology out there
>> that generates such numbers, or people may create an IOR backend to
>> exploit such a setup.
>> You could also use two nodes, where only 1/5th of the data needs to be
>> transferred over the network (as the IO500 does rank-shifting); that
>> would also lead to an artificially good number.
>>
>> Personally, I would be interested in such gaming results; you can
>> always submit such numbers to the full list as synthetic "upper
>> bounds".
>>
>> Best,
>> Julian
>>
>> On Wed, Oct 2, 2019 at 10:02 PM Dean Hildebrand via IO-500
>> <io-500(a)vi4io.org> wrote:
>>> As a cloud provider, this rule isn't too onerous as there is always a way
>>> to get dedicated machines through sole tenant offerings and simply using large VMs
>>> (although it is a waste of $$ to use clients that have 60+ cores just to run a single
>>> benchmark process).
>>>
>>> I'm more curious about the thinking here; can someone from the committee
>>> provide some background? This is one of those funny and rare cases where we are worried
>>> about someone with fewer resources having an advantage over someone with more resources.
>>> If a system with 1 or 2 clients can beat 10... isn't that one measure of success from
>>> an HPC point of view?
>>>
>>> Dean
>>>
>>> On 9/30/19 9:10 AM, John Bent via IO-500 wrote:
>>>
>>>> To IO500 Community,
>>>>
>>>>
>>>> The committee has received some queries about the rules concerning
>>>> virtual machines for the 10 Node Challenge. As such, the committee has added the
>>>> following rule:
>>>>
>>>>
>>>> 13. For the 10 Node Challenge, there must be exactly 10 physical nodes
>>>> for client processes and at least one benchmark process must run on each
>>>>
>>>> Virtual machines can be used but the above rule must be followed. More
>>>> than one virtual machine can be run on each physical node.
>>>>
>>>>
>>>> Although we recognize that this may disadvantage cloud architectures, we
>>>> do want to stress that this rule only applies to the 10 Node Challenge. The committee did
>>>> feel it was important to add this rule to ensure that the 10 Node Challenge sublist offers
>>>> the maximum potential for fair comparisons by ensuring equivalent client hardware
>>>> quantities. Submissions with any number/combination of virtual and physical machines can
>>>> of course always be submitted to the full list.
>>>>
>>>>
>>>>
>>>> Thank you,
>>>>
>>>>
>>>> The IO500 Committee
>>>>
>>>
>>>
>>
>>
>
_______________________________________________
IO-500 mailing list
IO-500(a)vi4io.org
https://www.vi4io.org/mailman/listinfo/io-500