Interesting, Mark.  You do indeed show variability in your find phases:
[RESULT]                 find      512.379261 kIOPS : time 230.722 seconds
[RESULT] IOPS phase 3                      find 1470.910 kiops : time 71.48 seconds
A 3X difference in your find rate.

Which seems to be directly due to variability in your mdtest_hard rates:
[RESULT] IOPS phase 2         mdtest_hard_write 8.455 kiops : time 323.59 seconds
[RESULT]    mdtest-hard-write       48.336249 kIOPS : time 328.619 seconds
A 6X difference in your mdtest-hard rates.

Yeah, it would be nice if find were not sensitive to the relative size of the directories (and it isn't in GPFS, by the way), but in this case it seems like the larger problem is the variability in the mdtest hard write rates.  I'm curious.  Do you know what explains this?  Does it just depend on the mood of the system?  :)

Thanks,

John

On Mon, Jun 1, 2020 at 9:05 AM Mark Nelson via IO-500 <io-500@vi4io.org> wrote:
On 6/1/20 8:44 AM, Pinkesh Valdria via IO-500 wrote:

> I made some progress over the weekend troubleshooting why the find phase
> was not working, but I am not out of the woods.   I'd appreciate it if
> someone can confirm whether I am on the right path, or acknowledge that
> the issues below are known and there are workarounds for them.
>
> Thanks for your help.  It's a long email, but detailed, to ensure there
> is not a lot of back-and-forth.
>
> Here are some differences I found and the workarounds I had to use.
>
> Issue 1:
>
> The io500 C version expects a field labelled "external-extra-args" in the
> config.ini file, but the non-C version (io500.sh) looks for a field
> labelled "external-args" (without "-extra-"); see the line below:
>
> https://github.com/VI4IO/io500-app/blob/master/io500.sh#L229
>
>
> io500_find_cmd_args="$(get_ini_param find external-args)"
>
>
> less config-full.ini
>
> [find]
>
> # Set to an external script to perform the find phase
>
> external-script =
>
> # Extra arguments for external scripts
>
> external-extra-args =
>
> # Startup arguments for external scripts
>
> external-mpi-args =
>
> # Set the number of processes for pfind
>
> nproc =
>
> # Pfind queue length
>
> pfind-queue-length = 10000
>
> # Pfind Steal from next
>
> pfind-steal-next = FALSE
>
> # Parallelize the readdir by using hashing. Your system must support this!
>
> pfind-parallelize-single-dir-access-using-hashing = FALSE
>
> Temporary workaround:   I changed the io500.sh code to look for:
> io500_find_cmd_args="$(get_ini_param find external-extra-args)"
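>
> The same one-line change could also be applied with sed instead of editing
> by hand (a minimal sketch; it assumes the stock io500.sh line quoted above
> is unchanged):
>
> # swap the mismatched key name in io500.sh in place
> sed -i 's/get_ini_param find external-args/get_ini_param find external-extra-args/' io500.sh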
>
> Issue 2:   I had to set io500_find_mpi to True in the io500.sh script to
> avoid getting this error: "io500_find_mpi: unbound variable", but I don't
> know if there is a different way to set the value for the C version using
> the config.ini file.  Can someone share how to pass that value for the
> C-app version?
>
> function setup_find {
>
> io500_find_mpi="True"
>
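> A sketch of an alternative to hard-coding the value (assuming the
> "unbound variable" error comes from bash's "set -u" seeing the variable
> unset): give it a default where it is assigned, so an externally exported
> value can still override it:
>
> # keep any value already set in the environment; fall back to True
> io500_find_mpi="${io500_find_mpi:-True}"
>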
> Issue 3:   How do I validate that the parameters I am setting in the
> config.ini file are being used at runtime?   I set the following, but
> don't see them below:
>
> [find]
>
> external-script = /mnt/beeond/io500-app/bin/pfind
>
> #nproc = 30
>
> pfind-queue-length = 2000
>
> pfind-steal-next = TRUE
>
> pfind-parallelize-single-dir-access-using-hashing = FALSE
>
> The io500 C version hangs at the command below, and I don't see queue
> length, steal-next, etc.:
>
> [find]
>
> t_start = 2020-06-01 08:28:30
>
> exe =  /mnt/beeond/io500-app/bin/pfind ./out//2020.06.01-06.57.23-app
> -newer ./out//2020.06.01-06.57.23-app/timestampfile -size 3901c -name
> "*01*"
>
> nproc = 1
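>
> A low-tech way to sanity-check which values the C app actually picked up
> (a sketch; the path is the result file from the hung run shown further
> below) is to grep its generated result file for the [find] block:
>
> grep -A 4 '^\[find\]' ./results/2020.06.01-06.57.23-app/result.txt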
>
> Issue 4:   Manual workaround to make the "find" phase work by setting
> some parameters in io500.sh, but they are not passed to the C version,
> and hence that fails.
>
> In io500.sh script:
>
> function setup_find {
>
> io500_find_mpi="True"
>
> io500_find_cmd_args="-N -q 2000 -s $io500_stonewall_timer -r
> $io500_result_dir/pfind_results"
>
>
> Non-C version - success:
>
> [Starting] find
>
> [Exec] mpiexec --allow-run-as-root -mca btl self -x UCX_TLS=rc,self,sm
> -x HCOLL_ENABLE_MCAST_ALL=0 -mca coll_hcoll_enable 0 -x
> UCX_IB_TRAFFIC_CLASS=105 -x UCX_IB_GID_INDEX=3 -n 30 -npernode 10
> --hostfile /mnt/beeond/hostsfile.cn /mnt/beeond/io500-app/bin/pfind
> ./out//2020.06.01-06.57.23-scr -newer
> ./out//2020.06.01-06.57.23-scr/timestampfile -size 3901c -name "*01*"
> -N -q 2000 -s 300 -r ./results//2020.06.01-06.57.23-scr/pfind_results
>
> [Results] in ./results//2020.06.01-06.57.23-scr/find.txt.
>
> [FIND] MATCHED 28170/15966876 in 44.6593 seconds
>
> [RESULT] IOPS phase 3                      find              357.520 kiops : time  44.66 seconds
>
> The io500 C version hangs at the command below, and I don't see queue
> length, steal-next, etc.:
>
> [find]
>
> t_start = 2020-06-01 08:28:30
>
> exe =  /mnt/beeond/io500-app/bin/pfind ./out//2020.06.01-06.57.23-app
> -newer ./out//2020.06.01-06.57.23-app/timestampfile -size 3901c -name
> "*01*"
>
> nproc = 1
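>
> For the C version, the closest equivalent to those io500.sh variables
> appears to be the [find] keys in config.ini, with the values written out
> literally (a sketch only; the 300 s stonewall value and the result path
> are examples matching the ones used above, not something the C app fills
> in for you):
>
> [find]
> external-script = /mnt/beeond/io500-app/bin/pfind
> external-extra-args = -N -q 2000 -s 300 -r ./results/pfind_results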
>
> Issue 5:   As you can see in issue #4, I am passing some parameters with
> bash variables; if I do the same in the config.ini file, they are passed
> as-is without being interpreted by the bash script.  How do I pass such
> variables to the C version of io500?
>
> I already tried the below, and the variables are not interpreted when
> processed by the io500 C-app version.
>
> In config.ini file:
>
> external-extra-args =  -s \$io500_stonewall_timer -r
> \$io500_result_dir/pfind_results
>
> or
>
> external-extra-args =  -s $io500_stonewall_timer -r
> $io500_result_dir/pfind_results
>
> [find]
>
> t_start = 2020-05-31 11:43:52
>
> exe =  /mnt/beeond/io500-app/bin/pfind -s $io500_stonewall_timer -r
> $io500_result_dir/pfind_results ./out//2020.05.31-10.52.56-app -newer
> ./out//2020.05.31-10.52.56-app/timestampfile -size 3901c -name "*01*"
>
> nproc = 1
>
> [find]
>
> t_start = 2020-05-31 15:55:52
>
> exe =  /mnt/beeond/io500-app/bin/pfind -s \$io500_stonewall_timer -r
> \$io500_result_dir/pfind_results ./out//2020.05.31-15.55.38-app -newer
> ./out//2020.05.31-15.55.38-app/timestampfile -size 3901c -name "*01*"
>
> nproc = 1
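>
> One workaround sketch for this (not io500-specific; it assumes the GNU
> gettext envsubst tool is available, and config-template.ini is a
> hypothetical template file) is to expand the variables when the config
> file is generated, rather than expecting io500 to do it at run time:
>
> # config-template.ini contains, e.g.:
> #   external-extra-args = -s ${io500_stonewall_timer} -r ${io500_result_dir}/pfind_results
> export io500_stonewall_timer=300
> export io500_result_dir=./results
> envsubst < config-template.ini > config.ini
>
> The C app then sees only literal values in config.ini.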
>
> This is a small test cluster:
>
> less 2020.06.01-06.57.23-scr/result_summary.txt
>
> [RESULT] BW   phase 1            ior_easy_write                6.062 GiB/s : time 362.17 seconds
> [RESULT] IOPS phase 1         mdtest_easy_write                7.300 kiops : time 2054.93 seconds
> [RESULT] BW   phase 2            ior_hard_write                1.605 GiB/s : time 321.58 seconds
> [RESULT] IOPS phase 2         mdtest_hard_write                3.173 kiops : time 304.69 seconds
> [RESULT] IOPS phase 3                      find              357.520 kiops : time  44.66 seconds
> [RESULT] BW   phase 3             ior_easy_read                8.269 GiB/s : time 265.51 seconds
> [RESULT] IOPS phase 4          mdtest_easy_stat              144.149 kiops : time 104.06 seconds
> [RESULT] BW   phase 4             ior_hard_read                3.847 GiB/s : time 134.18 seconds
> [RESULT] IOPS phase 5          mdtest_hard_stat               82.220 kiops : time  11.76 seconds
> [RESULT] IOPS phase 6        mdtest_easy_delete               54.334 kiops : time 276.07 seconds
> [RESULT] IOPS phase 7          mdtest_hard_read               22.822 kiops : time  42.37 seconds
> [RESULT] IOPS phase 8        mdtest_hard_delete                8.042 kiops : time 123.97 seconds
>
> [SCORE] Bandwidth 4.19424 GiB/s : IOPS 31.5378 kiops : TOTAL 11.5012
>
> C-version app – partial result – which hangs at find:
>
> [root@inst-5n58i-good-crow results]# less
> 2020.06.01-06.57.23-app/result.txt  | egrep "\[|score"
>
> [ior-easy-write]
>
> score = 6.136592
>
> [mdtest-easy-write]
>
> score = 41.913878
>
> [timestamp]
>
> [ior-hard-write]
>
> score = 1.538194
>
> [mdtest-hard-write]
>
> score = 3.012750
>
> [find]
>
> [root@inst-5n58i-good-crow results]#
>

FWIW the default version of find is still working the same as it did
previously for us.  This is not really related to what you are seeing,
but what I'm still noticing is that we're getting an unreasonable find
advantage when our easy and hard create results are highly skewed (which
I'm still working on debugging since we are still sometimes seeing a lot
of variation).  This is from an 8-node development cluster:


First run (C-app in this case):

[RESULT]       ior-easy-write       59.443067 GiB/s  : time 467.383 seconds
[RESULT]    mdtest-easy-write      277.570531 kIOPS : time 369.282 seconds
[RESULT]       ior-hard-write       11.037549 GiB/s  : time 333.391 seconds
[RESULT]    mdtest-hard-write       48.336249 kIOPS : time 328.619 seconds
[RESULT]                 find      512.379261 kIOPS : time 230.722 seconds
[RESULT]        ior-easy-read       70.285275 GiB/s  : time 397.338 seconds
[RESULT]     mdtest-easy-stat      627.847802 kIOPS : time 165.087 seconds
[RESULT]        ior-hard-read       13.879120 GiB/s  : time 275.325 seconds
[RESULT]     mdtest-hard-stat      179.562335 kIOPS : time 1221.149 seconds
[RESULT]   mdtest-easy-delete      140.591170 kIOPS : time 917.852 seconds
[RESULT]     mdtest-hard-read       43.742740 kIOPS : time 362.040 seconds
[RESULT]   mdtest-hard-delete       11.804945 kIOPS : time 1341.281 seconds
[SCORE] Bandwidth 28.284600 GB/s : IOPS 124.102186 kiops : TOTAL 59.246778


Second run (Script):

[RESULT] BW   phase 1            ior_easy_write 55.296 GiB/s : time 477.21 seconds
[RESULT] BW   phase 2            ior_hard_write 10.749 GiB/s : time 331.08 seconds
[RESULT] BW   phase 3             ior_easy_read 67.308 GiB/s : time 391.89 seconds
[RESULT] BW   phase 4             ior_hard_read 14.842 GiB/s : time 239.77 seconds
[RESULT] IOPS phase 1         mdtest_easy_write 278.851 kiops : time 367.23 seconds
[RESULT] IOPS phase 2         mdtest_hard_write 8.455 kiops : time 323.59 seconds
[RESULT] IOPS phase 3                      find 1470.910 kiops : time 71.48 seconds
[RESULT] IOPS phase 4          mdtest_easy_stat 967.127 kiops : time 105.89 seconds
[RESULT] IOPS phase 5          mdtest_hard_stat 47.136 kiops : time 58.04 seconds
[RESULT] IOPS phase 6        mdtest_easy_delete 224.795 kiops : time 455.53 seconds
[RESULT] IOPS phase 7          mdtest_hard_read 46.749 kiops : time 58.52 seconds
[RESULT] IOPS phase 8        mdtest_hard_delete 28.490 kiops : time 99.08 seconds
[SCORE] Bandwidth 27.7589 GiB/s : IOPS 121.449 kiops : TOTAL 58.0628


We don't really "deserve" a score of 58 on that second run because the
only reason find was so fast was that we wrote out far less
mdtest_hard_write data.
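
Back-of-the-envelope from the two runs above (treating rate x time as an
approximate count of files created): the first run's hard phase wrote
roughly 48.3 kIOPS x 329 s, about 15.9 M files, while the second wrote only
about 8.5 kIOPS x 324 s, about 2.7 M.  The easy-write counts are nearly
identical (~102 M in both), so the find speedup comes almost entirely from
the much smaller mdtest-hard directory in the second run.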


Mark

> From: Pinkesh Valdria <pinkesh.valdria@oracle.com>
> Date: Saturday, May 30, 2020 at 3:01 AM
> To: Andreas Dilger <adilger@dilger.ca>
> Cc: <io-500@vi4io.org>
> Subject: Re: [IO-500] Io500 runs twice - is that expected starting
> 2020 ?
>
> Thanks Andreas,
>
> The C-app benchmark failed; I mean it never completed and there was no
> score at the end.  I waited for 8 hours before I hit ctrl-c.  It only has
> 4 results vs. 12 results in the first section.  Are there special logs to
> check for the C version of the benchmark?
>
> Below is output from a very small development cluster I am using until I
> figure out how to run IO500 correctly.
>
> [Leaving] datafiles in ./out//2020.05.29-17.31.34-scr
>
> [Summary] Results files in ./results//2020.05.29-17.31.34-scr
>
> [Summary] Data files in ./out//2020.05.29-17.31.34-scr
>
> [RESULT] BW   phase 1            ior_easy_write                6.188 GiB/s : time 357.44 seconds
> [RESULT] BW   phase 2            ior_hard_write                1.132 GiB/s : time 367.41 seconds
> [RESULT] BW   phase 3             ior_easy_read                8.090 GiB/s : time 273.37 seconds
> [RESULT] BW   phase 4             ior_hard_read                3.726 GiB/s : time 111.69 seconds
> [RESULT] IOPS phase 1         mdtest_easy_write                4.263 kiops : time 3518.34 seconds
> [RESULT] IOPS phase 2         mdtest_hard_write                2.953 kiops : time 303.52 seconds
> [RESULT] IOPS phase 3                      find               91.550 kiops : time 173.64 seconds
> [RESULT] IOPS phase 4          mdtest_easy_stat              137.243 kiops : time 109.30 seconds
> [RESULT] IOPS phase 5          mdtest_hard_stat               84.140 kiops : time  10.65 seconds
> [RESULT] IOPS phase 6        mdtest_easy_delete               55.311 kiops : time 271.19 seconds
> [RESULT] IOPS phase 7          mdtest_hard_read               21.778 kiops : time  41.16 seconds
> [RESULT] IOPS phase 8        mdtest_hard_delete                7.133 kiops : time 129.24 seconds
>
> [SCORE] Bandwidth 3.81212 GiB/s : IOPS 24.1149 kiops : TOTAL 9.58796
>
> The io500.sh was run
>
> Running the C version of the benchmark now
>
> IO500 version io500-isc20
>
> [RESULT] ior-easy-write        6.233210 GiB/s  : time 359.216 seconds
>
> [RESULT] mdtest-easy-write        3.750549 kIOPS : time 3999.448 seconds
>
> [RESULT] ior-hard-write        1.415903 GiB/s  : time 349.737 seconds
>
> [RESULT] mdtest-hard-write        3.006432 kIOPS : time 305.006 seconds
>
> ^C
>
> [root@inst-q7cdd-good-crow io500-app]#
>
> From: Andreas Dilger <adilger@dilger.ca>
> Date: Saturday, May 30, 2020 at 2:33 AM
> To: Pinkesh Valdria <pinkesh.valdria@oracle.com>
> Cc: <io-500@vi4io.org>
> Subject: Re: [IO-500] Io500 runs twice - is that expected starting
> 2020 ?
>
> Hi Pinkesh,
>
> The dual runs of the IO500 benchmark for this list are intentional, and
> documented in the README-ISC20.txt file in the source tree.  This is to
> allow comparison between the historical io500.sh script and the new C
> application that runs the same IOR, mdtest, and find commands.  Please
> submit both results for ISC'20.
>
> We wanted to be sure that the transition to the new C-app didn't
> introduce any errors in the results.  The need to run the benchmark
> twice will hopefully be gone for the SC'20 list.
>
> Cheers, Andreas(*)
>
> * speaking on my own behalf and not on behalf of the IO500 board
>
>
>
>
>     On May 29, 2020, at 16:02, Pinkesh Valdria via IO-500
>     <io-500@vi4io.org> wrote:
>
>     
>
>     Hello IO-500 experts,
>
>     I am trying to configure io500.  When I run it, it runs twice: the
>     first one is regular and the 2nd one is called "Running the C version
>     of the benchmark now".  Is it because I misconfigured it, or is it
>     required to run both starting in 2020?  My config*.ini file is below.
>
>     [root@inst-q7cdd-good-crow io500-app]# ./io500.sh config-test1.ini
>
>     System: inst-q7cdd-good-crow
>
>     …..
>
>     Running the IO500 Benchmark now
>
>     [Creating] directories
>
>     …..
>
>     [Summary] Results files in ./results//2020.05.29-17.31.34-scr
>
>     [Summary] Data files in ./out//2020.05.29-17.31.34-scr
>
>     [RESULT] BW   phase 1            ior_easy_write                6.188 GiB/s : time 357.44 seconds
>     [RESULT] BW   phase 2            ior_hard_write                1.132 GiB/s : time 367.41 seconds
>     [RESULT] BW   phase 3             ior_easy_read                8.090 GiB/s : time 273.37 seconds
>     [RESULT] BW   phase 4             ior_hard_read                3.726 GiB/s : time 111.69 seconds
>     [RESULT] IOPS phase 1         mdtest_easy_write                4.263 kiops : time 3518.34 seconds
>     [RESULT] IOPS phase 2         mdtest_hard_write                2.953 kiops : time 303.52 seconds
>     [RESULT] IOPS phase 3                      find               91.550 kiops : time 173.64 seconds
>     [RESULT] IOPS phase 4          mdtest_easy_stat              137.243 kiops : time 109.30 seconds
>     [RESULT] IOPS phase 5          mdtest_hard_stat               84.140 kiops : time  10.65 seconds
>     [RESULT] IOPS phase 6        mdtest_easy_delete               55.311 kiops : time 271.19 seconds
>     [RESULT] IOPS phase 7          mdtest_hard_read               21.778 kiops : time  41.16 seconds
>     [RESULT] IOPS phase 8        mdtest_hard_delete                7.133 kiops : time 129.24 seconds
>
>     [SCORE] Bandwidth 3.81212 GiB/s : IOPS 24.1149 kiops : TOTAL 9.58796
>
>     The io500.sh was run
>
>     Running the C version of the benchmark now
>
>     IO500 version io500-isc20
>
>     (currently running when I posted this question ...)
>
>     ***************************************************
>
>     config-test1.ini (END)
>
>     ***************************************************
>
>     [global]
>
>     datadir = ./out/
>
>     resultdir = ./results/
>
>     timestamp-resultdir = TRUE
>
>     # Chose parameters that are very small for all benchmarks
>
>     [debug]
>
>     stonewall-time = 300 # for testing
>
>     [ior-easy]
>
>     transferSize = 2m
>
>     blockSize = 102400m
>
>     [mdtest-easy]
>
>     API = POSIX
>
>     # Files per proc
>
>     n = 500000
>
>     [ior-hard]
>
>     API = POSIX
>
>     # Number of segments 10000000
>
>     segmentCount = 400000
>
>     [mdtest-hard]
>
>     API = POSIX
>
>     # Files per proc 1000000
>
>     n = 40000
>
>     [find]
>
>     external-script = /mnt/beeond/io500-app/bin/pfind
>
>     pfind-parallelize-single-dir-access-using-hashing = FALSE
>
>     ***************************************************
>
>     _______________________________________________
>     IO-500 mailing list
>     IO-500@vi4io.org
>     https://www.vi4io.org/mailman/listinfo/io-500
>
>
> _______________________________________________
> IO-500 mailing list
> IO-500@vi4io.org
> https://www.vi4io.org/mailman/listinfo/io-500

_______________________________________________
IO-500 mailing list
IO-500@vi4io.org
https://www.vi4io.org/mailman/listinfo/io-500