One of the most common questions in the computing domain is “Can I use multiple processes or cores to increase speed?” In particular, problems that are time-sensitive or would take a relatively unreasonable amount of time to solve (NP-complete) receive a lot of attention; password cracking is no different.
John the Ripper has one primary workload: generating hashes of candidate passwords. Its two secondary loads (candidate password generation and comparison of computed hashes against those being cracked) are not always insignificant, but are not nearly as computationally intensive or complex as the hash calculations themselves. Most efforts to speed processing by parallelization focus on evenly dispersing the primary computational load over multiple cores, but there are simpler approaches for smaller problem sets (discussed below).
The three primary cracking modes of operation (single
, wordlist
, and incremental
) differ mostly in the candidate passwords generated: the single
and wordlist
modes simply attempt presumably-higher-probability password candidates before they would normally be attempted in incremental
mode. Hence, a normal JtR session executes the single
and wordlist
modes before falling through to incremental
in an attempt to save time.
For example: as of this writing, an unmodified copy of john-1.7.2
compiled as linux-x86-mmx
and running against a single FreeBSD MD5
hash on a 1\ GHz Pentium\ III typically completes –single
mode in under one second and ”–wordlist=password.lst –rules
” mode in under one minute. This performance is dependent on a number of factors including:
Even so, it serves to illustrate that unless the crack attempt is to be interrupted shortly, the majority of time spent cracking will be in incremental
mode.
John the Ripper 1.7.6+ includes built-in parallelization for multi-CPU and/or multi-core systems by means of OpenMP directives. In 1.7.6, this was limited to bcrypt hashes (with JtR's own optimized code) and SHA-crypt and SunMD5 hashes on recent Linux and Solaris systems (with system-provided thread-safe crypto code). All of this requires GCC 4.2 or newer, or another OpenMP-capable C compiler (also tested with Sun Studio 12). To enable the OpenMP parallelization, the proper OMPFLAGS
line in the Makefile
needs to be uncommented.
John the Ripper 1.7.9+ adds built-in OpenMP parallelization for traditional DES-based crypt(3), BSDI-style DES-based crypt(3), FreeBSD-style MD5-based crypt(3), and for LM hashes. (For 1.7.6 through 1.7.8, similar parallelization of the DES-based hashes was available with patches.)
In John 1.7.9-jumbo-6, the following formats support OpenMP (after enabling it in Makefile as described above): BF, BSDI, CRC32, crypt, DES, Django, DragonFly3-32, DragonFly3-64, DragonFly4-32, DragonFly4-64, Drupal7, EPiServer, GOST, HDAA, IPB2, KeePass, keychain, LM, Lotus5, MD5, Mozilla, MSCash, MSCash2, MSCHAPv2, MSKrb5, MySQL (old), NETHALFLM, NETLM, NETLMv2, NETNTLM, NETNTLMv2, ODF, Office, PKZIP, pwsafe, RACF, RAR, raw-SHA224, raw-SHA256, raw-SHA384, raw-SHA512, SAPB, SAPG, sha256crypt, sha512crypt, SIP, SSH, SybaseASE, trip, VNC, WBB3, WPAPSK, XSHA, XSHA512, ZIP.
Starting with 1.7.7-jumbo-5, John also supports MPI. No additional patch is required, just uncomment the last two lines in the MPI section in Makefile so it looks like this:
## Uncomment the TWO lines below for MPI (can be used together with OMP as well)
## If you experience problems with MPI_Barrier, remove -DJOHN_MPI_BARRIER
## If you experience problems with MPI_Abort, remove -DJOHN_MPI_ABORT
CC = mpicc -DHAVE_MPI -DJOHN_MPI_BARRIER -DJOHN_MPI_ABORT
MPIOBJ = john-mpi.o
More information about MPI is in doc/README.mpi.
John the Ripper 1.7.9-jumbo-6 and newer includes OpenCL code for some formats. This is primarily intended for use on GPUs, but it may also be used on CPUs, and typically (on both Intel's and AMD's OpenCL SDK) it will make use of all CPUs in a system. With few exceptions, the OpenCL code was not optimized for use on CPUs, though, and it may be inconvenient to use because of too large numbers of candidate passwords being buffered and tested per OpenCL kernel invocation, hence making the program less interactive. Yet this is an option, especially for formats where we do not have OpenMP parallelization, but do have OpenCL code. (When OpenMP is also available, it is typically more efficient, easier to use, and does not affect program interactivity as much.)
See also: List.External:DumbForce, List.External:KnownForce
Instead of making code changes to JtR to attempt to evenly split the workload, some approaches simply use multiple instances of JtR configured to work on separate keyspaces. Although possibly not as efficient an approach, for certain use cases it is sufficient. In order to take advantage of multiple CPUs or CPU cores on a single machine it is necessary to run more than one instance of JtR per machine (to match the number of CPU cores that are to be used for the task). This generally does not require a separate directory or configuration file; multiple instances of JtR can be run from within the same directory, sharing the same john.conf
, john.pot
, and other files just fine - this is a feature.
Patch | no | Compile | no | Configuration | yes | Even split | no | Overhead | no |
One approach to splitting the keyspace by password length would be to create additional incremental
modes in john.conf
. This approach is limited to the maximum password length set at compile time, but is not difficult to set up or run. The following is an example based on Incremental:All
:
[Incremental:All5] File = $JOHN/all.chr MinLen = 0 MaxLen = 5 CharCount = 95 [Incremental:All6] File = $JOHN/all.chr MinLen = 6 MaxLen = 6 CharCount = 95 [Incremental:All7] File = $JOHN/all.chr MinLen = 7 MaxLen = 7 CharCount = 95 [Incremental:All8] File = $JOHN/all.chr MinLen = 8 MaxLen = 8 CharCount = 95
To use, one would launch four separate instances of JtR, each with its own –incremental=AllN
argument (replacing N with the maximum length). Although those instances with larger keyspaces will, in theory, take longer to complete, all of the instances will be accelerated due to not having to search other lengths. In practice, with slow enough hashes and a large enough number of different salts, none of the instances or only those for very short lengths will terminate in a reasonable amount of time, so this approach achieves a reasonable split of the workload.
It is advisable to split the lengths range by complexity to the extent possible (like it is done in the example above) rather than numerically (e.g., 0-2, 3-4, 5-6, 7-8 would be very inefficient).
Patch | no | Compile | no | Configuration | yes | Even split | yes | Overhead | yes |
The most obvious approach is to use the pre-defined Parallel
–external mode. This has been shown to be sufficiently efficient for not-too-fast hashes and smaller numbers of nodes. To use this, you must modify the List.External:Parallel
section to reflect how many nodes (total
) are running and which instance (node
) that particular configuration file represents. When running multiple instances on the same machine and in the same directory, just make multiple copies of the List.External:Parallel
section, changing its name - then refer to these different instances of the sections (with different node
numbers in them) on the command-line.
Please note that the Parallel
external mode is meant to work along with wordlist
and incremental
modes, but will not in the single crack
mode. If used in single crack
mode, it will skip candidate password and target hash combinations that would otherwise be tested.
Patch | yes | Compile | yes | Configuration | no | Even split | no | Overhead | no |
Although technically not a parallelization attempt, a patch was posted in Fall of 2007 that enables JtR to check passwords of a given Markov score. The patch has since been integrated into the jumbo patch. It is feasible to run multiple instances of this patched version against separate Markov ranges using non-overlapping start
and end
parameters for each instance. If MPI is enabled, this splitting is performed automatically.
Details are available at the Markov generator page.
Many programmatic attempts have been made to split John's various processing modes across multiple computing resources, but none have been accepted as an official implementation by the Openwall team. Varying reasons have been given, but the overall indication has been that said patches do not split the workload as elegantly or efficiently as the development team would like, they introduce unneeded dependencies on external libraries, and their code quality is often inadequate. These are all available on the Openwall FTP site. The end of this post describes a more ideal method of programmatically splitting the workload, which:
Most projects have focused on using existing parallel programming toolkits to communicate between the instances/nodes; the following is a table of those efforts:
Name | Toolkit | Release Date | Author | JtR Version | Notes |
---|---|---|---|---|---|
None | MPI* | 2001/05/20 | stealth | 1.6 | Incremental-only |
DJohn | None | 2001/12/25 | Luis Parravicini | 1.6.30 | Incremental-only, abandoned download |
None | MPI* | 2004/01/22 | Ryan Lim | 1.6.36 | Academic, incremental-only (Academic) |
CS240A | MPI* | 2004/06/07 | Pippin, Hall, Chen | 1.6 | Academic, paper notes focus on dictionary attack v. incremental |
None | MPI* | 2006/06/01 | Timothy Shelling | 1.7.2 | abandoned download |
None | None (script) | 2006/06/05 | Timothy Shelling | 1.7.2 | abandoned, download checksum addon |
cpushare | CPUShare | 2007/10/01 | Andrea Arcangeli (cpushare.com) | 1.7.2 | Requires a CPUShare account (no longer available) |
dnetj | None | 2008/03/03 | John Anderson (bindshell.net) | 1.7.2 | Allows a less homogeneous environment than MPI |
John-MPI | MPI* | 2008/04/19 | John Anderson (bindshell.net) | 1.7.2 | Continuation of Ryan Lim's patchset |
GI John | None | 2009/02/23 | Balázs Bucsay Rycon.hu | 1.7.3.1 | Grid Implemented John The Ripper (http://www.gijohn.info) |
John-MPI | MPI* | 2009/06/18 | RB | 1.7.3.1 | Stripped bindshell.net implementation |
FullMPI | MPI* | 2010/06/22 | magnum | 1.7.6 | Extended version of the above, supporting all cracking modes (included in 1.7.7 Jumbo-5) |
-omp patch | OpenMP | 2010/05/08 | Solar Designer | 1.7.5 | Integrated into JtR 1.7.6+ (so no longer needed as a patch). Currently only parallelizes bcrypt hashes. Requires recent gcc or another OpenMP-capable C compiler (also tested with Sun Studio 12). |
-omp-des patch | OpenMP | 2010/06/27 | Solar Designer | 1.7.6 | Parallelizes JtR's bitslice DES implementation, reasonably efficient for DES-based Unix crypt(3) hashes. Requires OpenMP-capable C compiler. |
Clortho | Clortho | 2012/06/24 | ccdes | 1.7.6 / any | Pre-divided distributed keyspace, designed for exhaustion of keyspace given enough compute |
The list is mostly sorted by release date.
Downloadable files for the above and some other JtR parallelization projects are found in the Openwall file archive (with some of them in a directory specific to MPI).
* PLEASE NOTE: MPI is a full parallelization suite; as such it has extensive configuration options and daemons to run before using. If using one of the MPI-based works, please familiarize yourself with the MPI version you install - seldom can one simply install them and immediately begin using 'mpirun'.
Back to John the Ripper user community resources.
/* TODO: - Clean up 'Extended Efforts' section (bring in-line w/'Simple approaches' format)
- Cover whole doc for CCC (cohesiveness, clarity, & comprehensibility) */