JtR Development

This section contains information useful for those who might wish to understand the internals of JtR, or develop new functionality.

JtR compilation process explained

If you're new to JtR as I was, you might find yourself with nice clean compiles that run 'out of the box' (thanks to the design), but no idea what's really going on in the compilation process. If you want to put the JtR code into an IDE like NetBeans to understand this software better by debugging it, you'll need to know more about that process. The same goes if you want to extend JtR with your own formats, algorithms etc.

So, here's a really brief description of the compilation process:

You choose your target architecture on the command line, eg. macosx-x86-64, as part of your call to make.
Based on this selection, the Makefile creates a link from the right architecture header (eg. x86-64.h) to arch.h. This file defines things like how long a WORD is on your machine and is used throughout the software. The Makefile also sets up CFLAGS, LDFLAGS and the assembler flags specifically for your machine.
Compilation occurs:
- Utility files are compiled (unshadow, unafs, unique, undrop, genmkvpwd, mkvcalcproba, calc_stat), with a file suffix based on your architecture, eg. .exe, .com or nothing for *nix.
- john-related objects are compiled (including the main JtR executable, john or john.exe), along with your architecture/option-specific assembler file(s), eg. x86-64.S, which contains, amongst other things, DES and other optimised functions for your machine.

Ultimately, you end up with the following executables (add .exe for the relevant operating systems). See README and the ./doc files for details:

Executable	Description	Example Use
john	the main JtR executable	see README
unique	Remove duplicates from a piped-in list of words and output a unique list	cat filewithdupes.lst | ./unique filenodupes.lst
unshadow ^*	Take a standard *nix password file and convert it to JtR password format	unshadow /etc/passwd /etc/shadow > mypasswd
unafs ^*	Convert a Kerberos AFS password file to JtR password format
undrop ^*	Eggdrop userfile converter	./undrop eggdropfile JtRpasswordfile
genmkvpwd	Generate markov level (see MARKOV in /doc)
calc_stat	Create markov stats file from existing password dictionary	./calc_stat dictionary_file.txt stats
mkvcalcproba	Generate markov statistics about cracked passwords from a stats file	./mkvcalcproba stats /tmp/passwordlist

^* Symbolic link to john executable.

It's worth examining the Makefile and checking out the actual compilation steps required for your machine.

As an example, for the macosx-x86-64 (Mac OS X x86 architecture chips, 64 bit) the following things happen:

link x86-64.h to arch.h
make the following executables: john, unshadow, unafs, unique, undrop, genmkvpwd, mkvcalcproba and calc_stat
compile JOHN_OBJS ie. JOHN_OBJS_MINIMAL and DES_bs_b.o and x86-64.o with correct compiler flags eg. -m64, -DUNDERSCORES etc.

The compilation of DES_bs_b.o is also worth understanding. DES_bs_b.c, DES_bs_s.c and DES_bs_n.c are compiled with an inline option, but not before the DES_bs_s.c and DES_bs_n.c files are created by running sed over sboxes.c and nonstd.c, replacing 'unsigned long' with 'ARCH_WORD', which is defined in arch.h. Remember that arch.h is linked to x86-64.h.
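To make the effect of that sed step concrete, here is a minimal C sketch. It assumes ARCH_WORD is defined as long, which is what I would expect arch.h to provide on a 64-bit target; the example_ names are made up for illustration and are not taken from the JtR sources.

```c
#include <limits.h>
#include <stddef.h>

/* Assumption: stands in for the definition that arch.h supplies.
 * On a different architecture the underlying type may differ. */
#define ARCH_WORD long

/* A declaration as it might look after the substitution. Before the
 * sed step the same line would have spelled the type 'unsigned long'. */
static ARCH_WORD example_and_not(ARCH_WORD a, ARCH_WORD b)
{
    return a & ~b;
}

/* Number of bits handled by one word-sized operation on this build. */
static size_t example_word_bits(void)
{
    return sizeof(ARCH_WORD) * CHAR_BIT;
}
```

The point of routing the type through arch.h is that the generated files follow whatever word size the selected architecture header declares, without anyone editing sboxes.c or nonstd.c by hand.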

Armed with this information and the right debug settings (eg. -g), you'll now find it's not too difficult to put together your own IDE-friendly Makefile for debugging and understanding this beast in more detail.

Integrating new formats

For a start, you can read formats.h, which describes the functions you will need to write and what each one is used for.

Here's some info on binary hashes and BINARY_SIZE that you may need while implementing a new format.

Let's say you've built a new decryptor format file for JtR and you want to include it in the compilation. Here's a very high level guide on the changes you'll need to make.

Note: Since 1.7.8 Jumbo-3, simple formats can be plugins. Just name it <format>_fmt_plug.c, put it in the ./src directory, make clean and rebuild. The changes in options.c and john.c described below are not needed. If a helper source file is needed too, it can be named eg. <helper>_plug.c

In john.c

Add your external structure in the list towards the top of the file eg. extern struct fmt_main fmt_XXXX;
In the function john_register_all, register your new format eg. john_register_one(&fmt_XXXX);
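If the two john.c steps above feel abstract, the following toy model may help. It is only a sketch of the idea of a registration list: struct toy_fmt and every toy_ function are hypothetical stand-ins, far simpler than the real struct fmt_main and john_register_one, so treat it as an illustration rather than a copy of john.c.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-in for struct fmt_main. */
struct toy_fmt {
    const char *label;
    struct toy_fmt *next;
};

/* Head of the list of registered formats. */
static struct toy_fmt *toy_fmt_list = NULL;

/* Append a format to the tail so registration order is preserved. */
static void toy_register_one(struct toy_fmt *format)
{
    struct toy_fmt **tail = &toy_fmt_list;

    while (*tail)
        tail = &(*tail)->next;
    format->next = NULL;
    *tail = format;
}

/* Look a format up by the label the user would type. */
static struct toy_fmt *toy_find(const char *label)
{
    struct toy_fmt *f;

    for (f = toy_fmt_list; f; f = f->next)
        if (strcmp(f->label, label) == 0)
            return f;
    return NULL;
}

/* In a real tree these would be 'extern' and live in their own files. */
static struct toy_fmt toy_fmt_DES = { "des", NULL };
static struct toy_fmt toy_fmt_XXXX = { "xxxx", NULL };

/* One line per format, which is why adding yours is a one-line change. */
static void toy_register_all(void)
{
    toy_register_one(&toy_fmt_DES);
    toy_register_one(&toy_fmt_XXXX);
}
```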

In options.c

Update the list of valid formats to include yours. Change the text literal --format=NAME and add yours at the end, eg. the end of the line goes from …/HDAA\n"; to …/HDAA/XXXX\n";

Don't forget to update your Makefile, probably just adding your format object to the JOHN_OBJS target eg. insert fmt_XXXX.o somewhere in there (preferably the format list which comes first). Lastly, if you want to offer up your contribution to the JtR world, follow the instructions to make a patch against an original version.

Optimising performance

These are the functions that should be optimised in JtR (in order of importance):

crypt_all()
set_key()
set_salt()

cmp_all() and the get_hash() functions should be made fast too, but that is usually not a problem. Note that you can settle for comparing just part of the binary (perhaps one word) in cmp_all(). This will lead to a few false positives that get sorted out in cmp_one() (or even cmp_exact()), but it might be a good saving overall.
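Here is a minimal sketch of that partial comparison, assuming a 128-bit binary stored as four 32-bit words. The ex_ names, the buffer layout and the batch size are all assumptions made for the example; a real format would use its own buffers and the signatures from the JtR headers.

```c
#include <stdint.h>
#include <string.h>

#define EX_BINARY_WORDS 4   /* assumption: 128-bit hash as four 32-bit words */
#define EX_MAX_KEYS     8   /* assumption: keys processed per batch */

/* Hashes computed for the current batch of candidate keys. */
static uint32_t ex_crypt_out[EX_MAX_KEYS][EX_BINARY_WORDS];

/* Cheap filter: does any hash in the batch share its first word with
 * the target? A hit here may still be a false positive. */
static int ex_cmp_all(const void *binary, int count)
{
    uint32_t first;
    int i;

    memcpy(&first, binary, sizeof(first));
    for (i = 0; i < count; i++)
        if (ex_crypt_out[i][0] == first)
            return 1;
    return 0;
}

/* Full comparison for a single candidate; this weeds out the false
 * positives that the one-word check lets through. */
static int ex_cmp_one(const void *binary, int index)
{
    return !memcmp(binary, ex_crypt_out[index], sizeof(ex_crypt_out[index]));
}
```

The trade-off is that the cheap check runs for every batch while the full check runs only on the rare hits, so the occasional false positive costs very little.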

If you have any means of moving stuff from set_salt() to salt() [a.k.a. get_salt()], do it. The latter runs only while hashes are being loaded rather than inside the cracking loop. We recently got a 25% boost in the hmac-md5 format, basically just by moving a couple of lines of code from set_salt() to get_salt().
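As a rough illustration of that split, the sketch below decodes a hex-encoded salt once, at load time, and leaves only a pointer assignment for the function that is called repeatedly. The salt layout and all ex_ names are assumptions for the example; this is not the hmac-md5 change mentioned above.

```c
#include <stddef.h>

#define EX_SALT_LEN 8   /* assumption: 8 salt bytes, stored as 16 hex digits */

/* Salt in its ready-to-use form. */
struct ex_salt {
    unsigned char bytes[EX_SALT_LEN];
};

/* Salt currently selected for cracking. */
static const struct ex_salt *ex_cur_salt;

/* Convert one hex digit; the input is assumed to be validated already. */
static int ex_hex_val(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return 0;
}

/* Load-time work: all the decoding happens here, once per loaded salt. */
static void *ex_get_salt(const char *ciphertext)
{
    static struct ex_salt out;
    int i;

    for (i = 0; i < EX_SALT_LEN; i++)
        out.bytes[i] = (unsigned char)((ex_hex_val(ciphertext[2 * i]) << 4) |
                                       ex_hex_val(ciphertext[2 * i + 1]));
    return &out;
}

/* Cracking-time work: nothing left to do but remember the pointer. */
static void ex_set_salt(void *salt)
{
    ex_cur_salt = salt;
}
```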

For salted formats, if you can do *any* key preparation that does not depend on salt, do it in set_key(). Note however that set_key is currently single threaded - when using OMP, it is sometimes better to put some key processing in crypt_all() but only once per key (as opposed to once per salt). See mscash1_fmt_plug.c for an example. It calls nt_hash() once (and threaded) for all keys in a batch.
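One way to get "once per key, not once per salt" inside crypt_all() is a simple changed-keys flag. The sketch below is my own toy version of that idea and does not reproduce mscash1_fmt_plug.c: ex_prepare() is a toy checksum standing in for an expensive salt-independent step, and every ex_ name is hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define EX_MAX_KEYS 4
#define EX_KEY_LEN  32

static char     ex_keys[EX_MAX_KEYS][EX_KEY_LEN + 1];
static uint32_t ex_prepared[EX_MAX_KEYS]; /* salt-independent intermediates */
static uint32_t ex_out[EX_MAX_KEYS];      /* final per-salt results */
static uint32_t ex_salt;
static int      ex_keys_changed;          /* set whenever a key is replaced */
static int      ex_prepare_calls;         /* counter, for demonstration only */

/* Stand-in for expensive key preparation (a toy checksum, not a hash). */
static uint32_t ex_prepare(const char *key)
{
    uint32_t h = 5381;

    ex_prepare_calls++;
    while (*key)
        h = h * 33 + (unsigned char)*key++;
    return h;
}

/* Store the key and note that the batch needs preparing again. */
static void ex_set_key(const char *key, int index)
{
    strncpy(ex_keys[index], key, EX_KEY_LEN);
    ex_keys[index][EX_KEY_LEN] = '\0';
    ex_keys_changed = 1;
}

static void ex_set_salt(uint32_t salt)
{
    ex_salt = salt;
}

static void ex_crypt_all(int count)
{
    int i;

    if (ex_keys_changed) {
        /* In an OpenMP build, this is the loop one would parallelise. */
        for (i = 0; i < count; i++)
            ex_prepared[i] = ex_prepare(ex_keys[i]);
        ex_keys_changed = 0;
    }
    /* Toy salt-dependent step, repeated for every salt. */
    for (i = 0; i < count; i++)
        ex_out[i] = ex_prepared[i] ^ ex_salt;
}
```

With two keys and two salts, ex_prepare() runs twice rather than four times; it runs again only after ex_set_key() has replaced a key.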

Other performance notes

Do implement a salt_hash function, if a salt is used. This speeds up loading. It's very easy, look at the existing formats.
Do implement get_hash functions. These speed up cracking (and loading too) and can make a huge difference, especially when attacking many (like, millions) hashes at once. These functions are very trivial, look at the existing formats.
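To show how little code these functions need, here is a sketch in the style I would expect from a typical format, assuming the computed hash is held as 32-bit words. The ex_ names, the three mask widths and the salt hash table size are assumptions for the example; check an existing format and the JtR headers for the real names and sizes.

```c
#include <stdint.h>
#include <string.h>

#define EX_MAX_KEYS       8
#define EX_SALT_HASH_SIZE 1024  /* assumption: table size, a power of two */

/* Hashes computed for the current batch of candidate keys. */
static uint32_t ex_crypt_out[EX_MAX_KEYS][4];

/* Each function returns the low bits of the first hash word; wider
 * masks suit larger lookup tables (more loaded hashes). */
static int ex_get_hash_0(int index) { return ex_crypt_out[index][0] & 0xF; }
static int ex_get_hash_1(int index) { return ex_crypt_out[index][0] & 0xFF; }
static int ex_get_hash_2(int index) { return ex_crypt_out[index][0] & 0xFFF; }

/* Bucket a salt by its first 32 bits so equal salts are found quickly
 * while loading. */
static int ex_salt_hash(const void *salt)
{
    uint32_t first;

    memcpy(&first, salt, sizeof(first));
    return (int)(first & (EX_SALT_HASH_SIZE - 1));
}
```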

Setting up JtR in an IDE

The default JtR distributions include a Makefile that prepares binaries of john and the utilities based on an architecture you provide as a parameter to make. While this setup gets you going fast, it doesn't allow you to configure the compilation for debugging and development. There are many possible ways to do this setup, however one that's working well for me is described below in the hope others will find it useful. I use NetBeans 6.5 on an iMac, however the steps are adaptable to any IDE you care to use.

Create a new C/C++ project in NetBeans, call it john. We'll be using conditional compilation, and extra profiles to test different runtime environments.
Create a set of logical directories to hold the different categories of files in JtR. This is immensely helpful when you immerse yourself in development, and is instructive in and of itself for understanding the design of this software:
1. Assembler Files
2. Format Files
3. Header Files
4. Key Files
5. Password Files
6. Resource Files (default project creates this)
7. Source Files (as above)
8. Utility Files

Now, allocate each of the base distribution files to one of the above categories…as follows:

Assembler Files - just choose your machine's architecture and pick the corresponding .S file. Don't add the other .S files; leaving them out keeps things simple. You can also rename your architecture .h file to arch.h for simplicity if you want. That way your Makefile doesn't have to take your command line (from make clean XXXX) to work out what to link to, which is good for IDEs where modifying Makefiles (and making your changes stick!) is difficult.

Format Files - add every *_fmt.c file here.
Header Files - the .h's
Key Files - Any test key files you use. The ones included with the package are: xxx xxx
Password Files - password.lst, and any other password files you create/get.
Source Files - any other .c files left over EXCEPT the utility standalones (next category)
Utility Files - unshadow, unafs, and unique
Character Files - defaults are all.chr, digits.chr, lanman.chr, alnum.chr and alpha.chr
Important Files - just leave the default Makefile in here. We'll be customising for our architecture later, with some special make steps.

In your IDE's pre-build step, ensure you (a) link your architecture header (eg. x86-64.h) to arch.h, and (b) run sed (per the original Makefile lines) to replace unsigned long with ARCH_WORD in a few files (this keeps the original files intact). To ensure your assembler flags (-DBSD etc.) are picked up, you can just set the properties of the .S file to compile with the normal C compiler. You may also need to run some other steps: again, check the original JtR Makefile for your architecture, and add them to the pre-build steps of your IDE's Makefile (this is called .pre-build in NetBeans).
Make sure you update your project settings with the correct compilation flags for your architecture - see the original Makefile. In particular, don't forget the .S assembler files too!
If your IDE does not allow the straightforward flagging of files you do not want compiled, you should remove the utilities .c files from your project. The reason for this is that they include a main function, and your final LD linker may complain about multiple definitions of main. These files are best.c, calc_stat.c, detect.c, genmkvpwd.c, mkvcalcproba.c, and symlink.c

You should now have a project that will produce the john executable in your IDE, hopefully in debuggable form, so you can learn more about the guts of this utility.

Checking out JtR from CVS

We can check out the JtR sources from the Owl repositories, removing the unnecessary bits afterwards. Some CVS guru may find a nicer way to do this:

cvs -d :pserver:anoncvs@anoncvs.owl.openwall.com/cvs login
CVS password: anoncvs
cvs -d :pserver:anoncvs@anoncvs.owl.openwall.com/cvs co Owl/packages/john
mv Owl/packages/john/john john-cvs && rm -rf Owl
cd john-cvs
cvs update

Importing JtR from CVS to a local git repo

This takes a while to complete (the full CVS history is imported):

mkdir john-cvs-git
cd john-cvs-git
cvs -d :pserver:anoncvs@anoncvs.owl.openwall.com/cvs login
CVS password: anoncvs
git cvsimport -v -d :pserver:anoncvs@anoncvs.owl.openwall.com/cvs Owl/packages/john/john

To update, just run the same command again.

Tracing JtR's flow

Build a debug version of JtR by changing src/Makefile:

-CFLAGS = -c -Wall -O2 -fomit-frame-pointer -I/usr/local/include $(OMPFLAGS)
+CFLAGS = -g -c -Wall -O2 -fomit-frame-pointer -I/usr/local/include $(OMPFLAGS)
-LDFLAGS = -s -L/usr/local/lib -L/usr/local/ssl/lib -lcrypto -lm $(OMPFLAGS)
+LDFLAGS = -L/usr/local/lib -L/usr/local/ssl/lib -lcrypto -lm $(OMPFLAGS)

Run JtR under Valgrind’s callgrind tool

valgrind --tool=callgrind run/john --format=nt <hash file>

Use KCachegrind for profile data visualization

Debugging

To debug john with gdb you have to make the following changes in the Makefile:

change -O2 to -O0 in CFLAGS
add -g to CFLAGS
remove -s from LDFLAGS

After that you will be able to see function names instead of addresses in gdb's backtrace:

$ gdb ./john
(gdb) run -test -format=bcrypt
(gdb) backtrace

GPU

GPU development

Parallella

Parallella development

To come ...

Add more Callgrind graphs. (Feel free to generate and add your own graphs).

The format load and run process and sequence eg. providing a format test, running format self tests, updating fmt_main with the steps that 'override' (it's C) the ones in formats.c

What goes in your default format functions like XXXX_get_key

Running JtR efficiently eg. what run profiles to do when so you maximise the hits in the most efficient manner eg. run a wordlist first, then with rules etc
