Hi,
A short update: I just tried a larger mdtest run (I am still calibrating). For mdtest
easy, the create phase takes 4 minutes, and for mdtest hard it takes 4.5 minutes (I am
getting close). I create 2 million files. At this moment the find command still has not
finished, and it has been running for 20 minutes. The total execution time of the
benchmark, without IOR calibrated to 5 minutes, is almost 46 minutes and counting until
the find command finishes. All the IOR phases except the hard write took less than
2 minutes; the hard write took 7.7 minutes. The mixed workload is also not included yet.
I use 1000 compute nodes with 2 MPI processes per node. I have a time limit of 1 hour,
so I am afraid the job could be killed when it hits the limit.
Best regards,
George
__________________________________________________
George Markomanolis, PhD
Computational Scientist
KAUST Supercomputing Laboratory (KSL)
King Abdullah University of Science & Technology
Al Khawarizmi Bldg. (1) Room 0123
Thuwal
Kingdom of Saudi Arabia
Mob: +966 56 325 9012
Office: +966 12 808 0393
________________________________________
From: John Bent <John.Bent(a)seagategov.com>
Sent: Friday, June 30, 2017 3:27 AM
To: Andreas Dilger
Cc: Georgios Markomanolis; Julian Kunkel; io-500(a)vi4io.org
Subject: Re: [IO-500] ISC BOF report
On Jun 29, 2017, at 2:11 PM, Andreas Dilger <adilger(a)dilger.ca> wrote:
On Jun 29, 2017, at 5:58 PM, John Bent <John.Bent(a)seagategov.com> wrote:
>> On Jun 29, 2017, at 1:52 PM, Andreas Dilger <adilger(a)dilger.ca> wrote:
>>
>>>> On Jun 29, 2017, at 5:47 PM, John Bent <John.Bent(a)seagategov.com> wrote:
>>>>
>>>> On Jun 29, 2017, at 1:35 PM, Andreas Dilger <adilger(a)dilger.ca> wrote:
>>>>
>>>> That means the result needs to be in "files per second"
>>>
>>> Absolutely. It will be just a fifth IOPs number to combine with the other 4
>>> IOPs numbers (mdtest create easy/hard and mdtest stat easy/hard) using geo mean.
>>>
>>> Figure out the number of files created by the four produce phases: n. Then
>>> divide that by wall-clock for the find command: w. Then the 'find' IOPs is n/w.
>>>
>>> My concern is that it will be so slow that people will give up and not run it.
>>
>> I don't see why that would be true? For Lustre at least, readdir() and
>> stat() are about 2x as fast as creating files.
>
> Even if you create across 10,000 nodes and then readdir from just 1?
Hmm, good point. I was thinking about parallel stat, but it isn't orders of
magnitude slower. George mentioned the find command took 3-4 minutes, which isn't
slower than the 5-minute create phase...
I don't think George has done a 5-min create phase yet. He's just getting
the scripts working still.
Thx
John
Cheers, Andreas
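
For reference, the 'find' rate discussed in the quoted thread (n files divided by the find wall-clock time w, then combined with the four mdtest rates using a geometric mean) could be sketched roughly as below. This is only a minimal Python sketch, not the actual IO-500 scripts: the function names find_rate and geometric_mean are hypothetical, and the sample numbers are placeholders, not measured results.

```python
import math


def find_rate(files_created: int, find_wall_clock_s: float) -> float:
    """Return the 'find' rate in files per second: n / w."""
    if files_created < 0:
        raise ValueError("files_created must be non-negative")
    if find_wall_clock_s <= 0:
        raise ValueError("find_wall_clock_s must be positive")
    return files_created / find_wall_clock_s


def geometric_mean(rates) -> float:
    """Return the geometric mean of a non-empty sequence of positive rates."""
    rates = list(rates)
    if not rates:
        raise ValueError("rates must not be empty")
    if any(r <= 0 for r in rates):
        raise ValueError("all rates must be positive")
    # Averaging the logarithms avoids overflow from multiplying large rates.
    return math.exp(sum(math.log(r) for r in rates) / len(rates))


if __name__ == "__main__":
    # Illustrative only: 2 million files and 20 minutes echo figures from the
    # message above, but that find run had not finished, so this is NOT a
    # measured result.
    rate = find_rate(2_000_000, 20 * 60)
    print(f"find rate: {rate:.1f} files/s")

    # Hypothetical placeholder rates (files/s) for mdtest create easy/hard
    # and mdtest stat easy/hard; replace with real measurements.
    mdtest_rates = [8000.0, 7000.0, 16000.0, 14000.0]
    print(f"combined: {geometric_mean(mdtest_rates + [rate]):.1f} files/s")
```

Because a geometric mean multiplies the five rates together, one very slow find run pulls the combined number down more than it would in an arithmetic mean, which is part of why the find wall-clock time matters here.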