Differences

This shows you the differences between two versions of the page.

john:benchmarks [2020/08/16 04:16]
solar [Collected john --test benchmarks for OpenMP-enabled builds] added EPYC 7R32 on AWS
john:benchmarks [2024/12/11 13:17] (current)
fantomas [GPU benchmarks]
Line 9: Line 9:
 ====== Collected "john --test" benchmarks for OpenMP-enabled builds ======
 
-^ DES crypt() \\ many / one salt ^ MD5 crypt() ^ [[http://www.openwall.com/crypt/|bcrypt]] <sub>x32</sub> ^ Windows LanMan ^ CPUs & clock rate ^ threads ^ logical CPUs/ \\ physical cores ^ JtR ^ OS ^ compiler ^ make target ^ tweaks ^
 +^ DES crypt() \\ many / one salt ^ MD5 crypt() ^ [[http://www.openwall.com/crypt/|bcrypt]] <sub>x32</sub> ^ LM (Windows LanMan) ^ CPUs & clock rate ^ threads ^ logical CPUs/ \\ physical cores ^ JtR ^ OS ^ compiler ^ make target ^ tweaks ^
 | **407961K** / 62797K \\ 256/256 AVX2 | **4608K** \\ 256/256 AVX2 8x3 | **86832** \\ 32/64 X3 | 68274K ((OpenMP scaling for LM hashes is currently very poor - fewer threads would give better LM hash speeds)) \\ 256/256 AVX2 | EPYC 7R32 \\ 3.3 GHz | 96 | 96 / 48 | 1.9.0-jumbo-1'ish \\ [[https://www.openwall.com/john/cloud/|in the cloud]] | Amazon Linux 2 | gcc | autoconf | AWS c5a.24xlarge instance |
 | 322830K / **79421K** \\ 512/512 AVX512F | 3474K \\ 512/512 AVX512BW 16x3 | 35424 \\ 32/64 X3 | 110493K \\ 512/512 AVX512F | 2x Xeon Gold 6126 \\ 2.6+ GHz | 48 | 48 / 24 | 1.9.0-jumbo-1 | Linux \\ (Ubuntu 16.04.5 LTS) | gcc 5.4.0 20160609 \\ (5.4.0-6ubuntu1~16.04.12) | autoconf | none |
Line 91: Line 91:
 ====== Collected "john --test" benchmarks for MPI-enabled builds ======
 
-^ DES crypt() \\ many / one salt ^ MD5 crypt() ^ [[http://www.openwall.com/crypt/|bcrypt]] <sub>x32</sub> ^ Windows LanMan ^ CPUs & clock rate ^ processes ^ logical CPUs/ \\ physical cores ^ JtR ^ OS ^ compiler ^ make target ^ tweaks ^
 +^ DES crypt() \\ many / one salt ^ MD5 crypt() ^ [[http://www.openwall.com/crypt/|bcrypt]] <sub>x32</sub> ^ LM (Windows LanMan) ^ CPUs & clock rate ^ processes ^ logical CPUs/ \\ physical cores ^ JtR ^ OS ^ compiler ^ make target ^ tweaks ^
 | **735037K / 701243K** \\ 128/128 BS SSE2-16 | **7507K** \\ 128/128 SSE2 intrinsics 12x ((Would likely be faster with the linux-x86-64i make target)) | **200679** \\ 32/64 X2 | ~**9200M** ((Reported as 4294M since limited by a 32-bit integer)) \\ 128/128 BS SSE2-16 | 48x X7550 \\ 2.0 GHz \\ HT disabled ((Would likely be faster with HT enabled)) | 384 | 384 / 384 | 1.7.9-jumbo-6'ish bleeding-jumbo | Linux | gcc 4.7.0 | linux-x86-64-native | |
 | 586638K / 505080K \\ DES 128/128 AVX-16 | 4398K \\ MD5 128/128 AVX 12x | 133374 \\ 32/64 X3 | 3234M \\ DES 128/128 AVX-16 | 20x E5-2670v2 \\ 2.5 GHz \\ HT disabled | 128 | 128 / 128 | 1.8.0-jumbo-1 | Linux \\ (SLES11Sp3) | gcc 4.9.2 | autoconf or linux-x86-64-native? | None: 128 cores across 20 active nodes (leaving some of the 200 cores unused?) |
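
The footnote on the first MPI row says the LM figure was reported as 4294M because of a 32-bit integer limit. A minimal Python sketch of where that number comes from, assuming the reported value simply saturates at the largest unsigned 32-bit integer (the clamping behaviour and both helper names are assumptions for illustration, not taken from JtR's source):

```python
UINT32_MAX = 2**32 - 1  # largest value an unsigned 32-bit integer can hold


def reported_rate(true_cs):
    """Clamp a c/s rate to the 32-bit ceiling (assumed saturating behaviour)."""
    return min(true_cs, UINT32_MAX)


def in_millions(cs):
    """Express a c/s rate in whole millions, as in the table's 'M' figures."""
    return cs // 10**6


# The table estimates the true LM rate at roughly 9200M c/s.
print(in_millions(reported_rate(9_200_000_000)))  # prints 4294
```

Any true rate above about 4.29 billion c/s would show the same ceiling, which is why the table gives an estimate (~9200M) instead of the reported figure.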
Line 104: Line 104:
 For some CPUs (such as Core i7), the per-core clock rate varies with the number of cores in use, so directly multiplying the per-core c/s rate by the number of cores would not yield the CPU's combined c/s rate capability (the actual combined c/s rate would be less), but on the other hand if the CPU also supports SMT (Hyperthreading) then additional speedup may be obtained by running more JtR processes than the CPU's number of cores.
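
The caveat above can be illustrated with a rough Python sketch. It assumes that c/s scales linearly with clock rate, which is a simplification. The function names, the clock figures and the `smt_gain` factor are hypothetical and should be replaced with measurements from the machine in question:

```python
def naive_combined_rate(per_core_cs, cores):
    """Per-core c/s multiplied by core count: the figure the text warns about."""
    return per_core_cs * cores


def estimated_combined_rate(per_core_cs, cores, one_core_ghz, all_core_ghz, smt_gain=1.0):
    """Scale the naive product by the clock drop under full load.

    Assumes c/s is proportional to clock rate; smt_gain > 1.0 models the
    extra throughput from running more processes than physical cores.
    """
    if not 0 < all_core_ghz <= one_core_ghz:
        raise ValueError("all-core clock must be positive and not exceed the one-core clock")
    return per_core_cs * cores * (all_core_ghz / one_core_ghz) * smt_gain


# Hypothetical inputs: 6200K c/s on one core, 4 cores,
# 4.0 GHz with one core busy, 3.6 GHz with all cores busy.
naive = naive_combined_rate(6_200_000, 4)
estimate = estimated_combined_rate(6_200_000, 4, 4.0, 3.6)
print(f"naive: {naive}, clock-adjusted: {estimate:.0f}")
```

An actual multi-process or OpenMP benchmark on the target machine remains the only reliable figure; the sketch only shows why the naive product overshoots.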
  
-^ DES crypt() \\ many / one salt ^ MD5 crypt() ^ [[http://www.openwall.com/crypt/|bcrypt]] <sub>x32</sub> ^ Windows LanMan ^ CPU & clock rate ^ JtR ^ OS ^ compiler ^ make target ^ tweaks ^
-| **6200K** / **5898K** \\ 128/128 AVX-16 | 17418 \\ 32/64 X2 | 1046 \\ 32/64 X2 | **80981K** \\ 128/128 BS AVX-16 | Core i7-3770 \\ 3.4 GHz | 1.8.0 | Linux | gcc 4.7.3 | linux-x86-64-avx | |
 +^ DES crypt() \\ many / one salt ^ MD5 crypt() ^ [[http://www.openwall.com/crypt/|bcrypt]] <sub>x32</sub> ^ LM (Windows LanMan) ^ CPU & clock rate ^ JtR ^ OS ^ compiler ^ make target ^ tweaks ^
 +| **14843K** / **13548K** \\ 256/256 AVX2 | **107352** \\ 256/256 AVX2 8x3 | **1702** \\ 32/64 X3 | **114829K** \\ 256/256 AVX2 | Core i5-9500 \\ 3.00GHz | 1.9.0-jumbo-1+bleeding-250498b959 | Linux | gcc version 12.2.0 | x86_64-linux-gnu | |
 +| 12462K / 11220K \\ 256/256 AVX2 | 101736 \\ 256/256 AVX2 8x3 | 1501 \\ 32/64 X3 | 100811K \\ 256/256 AVX2 | Core i7-4790 \\ 4.0GHz turbo | 1.9.0-jumbo-1+bleeding-0835ce060 | Linux | gcc version 10.2.1 20210110 | x86_64-linux-gnu | |
 +| 11162K / 10059K \\ 256/256 AVX2 | 89052 \\ 256/256 AVX2 8x3 | 1371 \\ 32/64 X3 | 92790K \\ 256/256 AVX2 | Core i7-6600U \\ 3.4GHz turbo | 1.9.0-jumbo-1+bleeding-0835ce060 | Linux | gcc version 10.2.1 20210110 | x86_64-linux-gnu | |
 +| 6200K / 5898K \\ 128/128 AVX-16 | 17418 \\ 32/64 X2 | 1046 \\ 32/64 X2 | 80981K \\ 128/128 BS AVX-16 | Core i7-3770 \\ 3.4 GHz | 1.8.0 | Linux | gcc 4.7.3 | linux-x86-64-avx | |
 | 5802K / 5491K \\ 128/128 BS AVX-16 | 14766 \\ 32/64 X2 | 940 \\ 32/64 X2 | 71238K \\ 128/128 BS AVX-16 | Core i7-2600K \\ 3.4 GHz | 1.7.9 | Linux | gcc 4.6.1-9ubuntu3 | linux-x86-64-avx | |
 | 5731K / 4647K \\ 128/128 BS AVX-16 | 14648 \\ 32/64 X2 | 918 \\ 32/64 X2 | 26852K \\ 128/128 BS AVX-16 | Core i7-2600K \\ 3.4 GHz | 1.7.8 | Linux | gcc 4.5.2-8ubuntu4 | linux-x86-64-avx | |
Line 114: Line 117:
 | 4458K / 4275K \\ 128/128 BS SSE2-16 | 17335 \\ 32/64 X2 | 1098 \\ 32/64 X2 | 61769K \\ 128/128 BS SSE2-16 | Celeron E3200 \\ oc 4.00GHz | 1.7.9 | Linux | gcc-4.7.0_20111108 | linux-x86-64 | |
 | 4452K / 4275K \\ 128/128 BS SSE2-16 | 17521 \\ 32/64 X2 | 1106 \\ 32/64 X2 | 61240K \\ 128/128 BS SSE2-16 | Celeron E3200 \\ oc 4.00GHz | 1.7.9 | Linux | gcc-4.7.0_20111108 | linux-x86-64 | -march=nocona |
-| 4452K / 4275K \\ 128/128 BS SSE2-16 | **45328** \\ SSE2i 12x | 1122 \\ 32/64 X2 | 61470K \\ 128/128 BS SSE2-16 | Celeron E3200 \\ oc 4.00GHz | 1.7.9-jumbo5 | Linux | gcc 4.6.2 SUSE | linux-x86-64i | |
-| 4449K / 4289K \\ 128/128 BS SSE2-16 | 17734 \\ 32/64 X2 | **1131** \\ 32/64 X2 | 61684K \\ 128/128 BS SSE2-16 | Celeron E3200 \\ oc 4.00GHz | 1.7.9 | Linux | gcc 4.6.2 SUSE | linux-x86-64 | -march=nocona |
 +| 4452K / 4275K \\ 128/128 BS SSE2-16 | 45328 \\ SSE2i 12x | 1122 \\ 32/64 X2 | 61470K \\ 128/128 BS SSE2-16 | Celeron E3200 \\ oc 4.00GHz | 1.7.9-jumbo5 | Linux | gcc 4.6.2 SUSE | linux-x86-64i | |
 +| 4449K / 4289K \\ 128/128 BS SSE2-16 | 17734 \\ 32/64 X2 | 1131 \\ 32/64 X2 | 61684K \\ 128/128 BS SSE2-16 | Celeron E3200 \\ oc 4.00GHz | 1.7.9 | Linux | gcc 4.6.2 SUSE | linux-x86-64 | -march=nocona |
 | 4449K / 4283K \\ 128/128 BS SSE2-16 | 17478 \\ 32/64 X2 | 1080 \\ 32/64 X2 | 60780K \\ 128/128 BS SSE2-16 | Celeron E3200 \\ oc 4.00GHz | 1.7.9 | Linux | gcc 4.5.1 SUSE | linux-x86-64 | |
 | 4448K / 4286K \\ 128/128 BS SSE2-16 | 17083 \\ 32/64 X2 | 1058 \\ 32/64 X2 | 61171K \\ 128/128 BS SSE2-16 | Celeron E3200 \\ oc 4.00GHz | 1.7.9 | Linux | icc 12.1.0 | linux-x86-64 | -march=core2 -fast |
Line 131: Line 134:
 | 3429K / 3014K \\ 128/128 BS SSE2-16 | 15696 \\ 32/64 X2 | 924 \\ 32/64 X2 | 21567K \\ 128/128 BS SSE2-16 | Core i7 920 \\ o/c to 3.6 GHz ((Core i7 920 is also benchmarked non-overclocked, see below in the table)) | 1.7.6-jumbo-12 | Linux | gcc 4.4.5 | linux-x86-64 | |
 | 3376K / 3218K \\ 128/128 BS SSE2-16 | 16089 \\ 32/64 X2 | 1070 \\ 32/64 X2 | 43999K \\ 128/128 BS SSE2-16 | Phenom II X6 1090T \\ 3.21 GHz | 1.7.9 | Linux | gcc 4.6.2 \\ Debian Testing 4.6.2-4 | linux-x86-64 | |
 +| 3311K / 3139K \\ 128/128 SSE2 | 32460 \\ 128/128 SSE4.1 4x3 | 990 \\ 32/64 X3 | 36065K \\ 128/128 SSE2 | Q9650 @ 3.00GHz | 1.9.0-jumbo-1+bleeding-edf64e869 | Linux | gcc 10.2.1 20210110 \\ Debian GNU/Linux 11 | x86_64 | |
 | 3296K / 3213K \\ 128/128 BS SSE2-16 | 13564 \\ 32/64 X2 | 790 \\ 32/64 X2 | 47566K \\ 128/128 BS SSE2-16 | X5650 \\ 2.67 GHz | 1.7.8-fast-des-key-setup-3 | Linux | gcc 4.5.0 | linux-x86-64 | --test=20 \\ (CPU frequency scaling) |
 | 3252K / 3129K \\ 128/128 AltiVec | 7693 \\ 32/32 X2 | 571 \\ 32/32 | 39938K \\ 128/128 AltiVec | POWER7 \\ 3.7 GHz | 1.8.0 | AIX | xlc | aix-ppc32-altivec | custom Makefile |
Line 250: Line 254:
 | 709 / 669 \\ 24/32 4K | 23.3 \\ 32/32 | 1.3 \\ 32/32 | 17K \\ 32/32 | Am386DX \\ 40 MHz | 1.8.0 | FreeDOS | gcc 4.5.2 | dos-djgpp-x86-any | |
  
 +====== GPU benchmarks ======
 +
 +  * These can vary widely between JtR, driver and platform versions.
 +  * The per-hash LWS and GWS settings matter just as much.
 +  * Report the "real" c/s figures; the "virtual" ones are useless on GPUs.
 +
 +^ descrypt-opencl \\ many/one salt ^ md5crypt-opencl \\ many/one salt ^ bcrypt-opencl ^ LM-opencl ^ JtR ^ OS ^ Device Name ^ Driver Version ^ Platform Version ^
 +| 440819K / 414040K \\  LWS=32 GWS=131072 | 5160K / 4222K \\ LWS=32 GWS=114688 | 6053 \\ LWS=8 GWS=4096 | 4500M \\ LWS=128 GWS=131072 | 1.9.0-jumbo-1+bleeding-250498b959 | linux-gnu | GeForce GTX 1650 | 535.183.01 | OpenCL 3.0 CUDA \\ OpenCL C 1.2 |
 +| 38358K / 36390K \\ LWS=64 GWS=16384 | 1288K / 1259K \\ LWS=32 GWS=98304 | 846 \\ LWS=8 GWS=1024 | 877399K \\  LWS=128 GWS=65536 | 1.9.0-jumbo-1+bleeding-ce068233d | linux-gnu | GeForce GT 1030 | 460.91.03 | OpenCL 1.2 CUDA 11.2.162 |
 +| 33242K / 28356K \\ LWS=128 GWS=32768 | 284928 / 279552 \\ LWS=64 GWS=1536 | 748 \\ LWS=4 GWS=4096 | |  | Debian GNU/Linux 11 | AMD R7 M360 DRM 3.40.0 ​ | 20.3.5 | OpenCL 1.1 Mesa 20.3.5 |
 +| 6103K / 6103K \\ LWS=16 GWS=8192 | 210651 / 210651 \\ LWS=128 GWS=24576 | 397 \\ LWS=8 GWS=4096 | 278605K \\ LWS=256 GWS=65536 | 1.9.0-jumbo-1+bleeding-367d6438e6 | Debian GNU/Linux 12 | Intel(R) UHD Graphics 630 | 22.43.24595 | OpenCL 3.0 |
 +| 5454K / 5420K \\ LWS=16 GWS=8192 | 173070 / 173070 \\ LWS=256 GWS=24576 | 364  \\ LWS=8 GWS=1024 | 274252K \\  LWS=32 GWS=65536 | 1.9.0-jumbo-1+bleeding-ce068233d | Debian GNU/Linux 11 | Intel(R) HD Graphics 520 | 1.0.0 | OpenCL 3.0 |
 +| 960909 / 956377 \\ LWS=16 GWS=4096 |  | 193 \\ LWS=8 GWS=128 | 57006K \\ LWS=512 GWS=8192 | 1.9.0-jumbo-1+bleeding-ce068233d | Debian GNU/Linux 11 | Intel(R) HD Graphics 4600 (HSW GT2) | 1.3 | OpenCL 1.2 beignet 1.3 |
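
Each GPU cell above records the local and global work sizes (LWS and GWS) used for that run. In OpenCL the global size divided by the local size gives the number of work-groups launched, which makes rows easier to compare. A minimal Python sketch that pulls both values out of a cell; the `work_groups` helper and its regular expression are illustrative assumptions, not part of JtR:

```python
import re

# Matches the "LWS=<n> GWS=<n>" annotation used in the GPU table cells.
_SIZES = re.compile(r"LWS=(\d+)\s+GWS=(\d+)")


def work_groups(cell):
    """Return GWS // LWS for a table cell such as '6053 \\\\ LWS=8 GWS=4096'."""
    match = _SIZES.search(cell)
    if match is None:
        raise ValueError(f"no LWS/GWS annotation in {cell!r}")
    lws, gws = int(match.group(1)), int(match.group(2))
    if gws % lws != 0:
        raise ValueError("GWS is expected to be a multiple of LWS")
    return gws // lws


# descrypt-opencl cell from the GeForce GTX 1650 row.
print(work_groups(r"440819K / 414040K \\ LWS=32 GWS=131072"))  # prints 4096
```

A cell whose annotation is malformed raises `ValueError` instead of returning a misleading count.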
 ====== What (not) to submit ======
  
john/benchmarks.1597544190.txt · Last modified: 2020/08/16 04:16 by solar
 
Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Noncommercial-Share Alike 3.0 Unported