Some of these were initially suggested for Google Summer of Code 2011 (GSoC), but they're also relevant to our own Summer of Security program, and we continue to update and reuse this same page after 2011 as well. When applying to us for GSoC or otherwise, please use our application template (which also includes info on how to contact us).
Although we have a lot of ideas listed here, our mentoring capacity is limited and some of the ideas would be incompatible if worked on during the same summer. Thus, in 2012 we intend to work on a subset of these ideas only.
Openwall GNU/*/Linux (or simply Owl) is our security-hardened Linux distro for servers, currently at (and beyond) version 3.0. We have a nearly perfect userland in terms of privilege reduction and privilege separation of/in individual programs/services. Specifically, Owl 3.0 is the very first Linux distro to have no SUID programs in the default install (yet be usable). However, further work is needed.
We can reasonably accept and work with several GSoC students on the Owl tasks below. Although the separation between the task categories is not exact, here are the potential roles that students are invited to apply for:
Owl: new functionality
Owl: updates to existing functionality
Owl: port to ARM
Owl: documentation (not suitable for GSoC, but perfectly acceptable for our Summer of Security)
Each of these focuses on the corresponding one of the task categories identified below.
Functionality available out of the box needs to be expanded in multiple ways, including:
A full LAMP stack is needed in the base system: we need to add Apache, MySQL, and PHP, and do so in accordance with our project concepts, which will include some security-relevant changes.
DHCP and PPP/PPPoE/PPTP client support (add userland packages, introduce privilege separation where needed)
Assorted extra packages that are in line with typical uses, concepts, and goals of Owl
Set up and support a package repository (for easier updates), possibly with Zypper, yum, or apt
The system should be brought more up to date:
New GNU toolchain, and new upstream software versions in general (since Owl 3.0, we've already updated gcc, but not much else)
RHEL6 binary and package compatibility where this does not conflict with our other goals
OpenVZ kernels from their “rhel6” branch (Owl 3.0 uses “rhel5” branch)
(Better) support for: IPv6 (in network startup scripts and installer), UTF-8 (in many places), GPT (disk partitions beyond 2 TB), etc.
System security should be improved further:
The “rhel6” branch OpenVZ kernel that we'd update to will need to be security-hardened, in part by reviewing, extracting, cleaning up, porting, and documenting/commenting individual changes from grsecurity and PaX (some of which have originated from Openwall's patches for older kernels), and in part by implementing new security-related changes/features, some of those specific to container-based virtualization (purpose-specific restrictions to be applied on per-container basis). We expect help/consulting/mentoring from the author of PaX on portions that are PaX (some of these are difficult to understand from the code alone, especially the rationale behind things being done in a certain way), whereas the rest are not too complicated for a capable person to fully figure out on their own. References: 1, 2
We should work with upstreams - OpenVZ and Red Hat - to try and get some of these enhancements accepted
If time permits and this sub-task is not claimed by another person, the same person could also port the individual changes to mainline kernels and work with LKML - although Vasiliy already did much work on this (before getting to do it for Owl, actually)
The gcc options used to build the userland will need to be adjusted (globally, but with some per-package exceptions) to maximize the effect of ASLR and to harden the programs in other ways. (This is what some other hardened distros did while we were focusing on re-working privilege management in our userland, which they did not do. Now we need to catch up, and we'll be ahead of them overall. This should be a lot easier than the work we did so far.) References: 1, 2, 3, 4
Documentation should be improved.
We should “complete” and publish a User Guide, covering not only specifics of Owl, but its use in general.
Per-package info on Owl specifics may be added (including explanations of how and in what ways certain packages on Owl are more secure than their “equivalents” found in other distros - e.g., how our syslogd runs as non-root and verifies/logs credentials of local message senders).
More web pages may be added: a packages directory (based on RPM metadata and the per-package text files mentioned above), man pages (some of these are Owl-specific).
John the Ripper is a popular Open Source and cross-platform password cracker (password security auditing tool). Its homepage has exceeded 16 million hits.
There is a little bit of overlap between some of the JtR tasks below. A compatible subset of the tasks (with no overlap and no dependencies of any one of the tasks on another) will need to be picked and the tasks' scope adjusted for GSoC (if applicable) to match the student applications we receive.
We can work with multiple students on a subset of these tasks, leaving the rest of the tasks for further occasions (such as for next summer). Students are welcome to apply for the following roles, which directly correspond to the tasks below:
JtR: GPU for slow hashes
(was successfully worked on by Lukas Odzioba under GSoC 2011, but more work is needed: optimizations, more hash and non-hash types, multi-GPU, documentation, etc.)
JtR: GPU for fast hashes
(we currently have proof-of-concept implementations - need to deal with the bottlenecks to advance at this task)
JtR: GUI
(work was started by Shinnok and Aleksey Cherepanov)
JtR: SIMD and bitslice implementations of SHA-512, SHA-256, SHA-crypt, other SHA-2 based JtR “formats”
(bonus task if time permits: also convert more of the DES-based JtR “formats” to use JtR's existing bitslice DES implementation)
Support for more things beyond password hashes: Mac OS X FileVault and/or keychain passwords (these use PBKDF2), WEP & WPA-PSK passphrase (part of functionality of aircrack-ng available right inside JtR), more SSH private key types (we already have support for OpenSSH's due to Dhiru's GSoC 2011 work), GnuPG and PGP secret key passphrase, more archive passwords (we already have support for many ZIP and RAR archives due to Dhiru's GSoC 2011 work as well as later work by JimF and magnum), …
GPU support for “slow” hashes (does not require changes to JtR interfaces and program structure)
GPU support for “fast” hashes (requires some invasive changes to achieve good efficiency)
GUI, likely using Qt in C++: as wrapper around the command-line program (this is what's being worked on now) and/or integrated (will require/provide greater interaction)
Implementations of SHA-512 that use SSE2 to access 64-bit integer operations from an x86 CPU's 32-bit mode (when running a 32-bit OS) are in use. However, as of this writing (early 2012), full SIMD implementations of SHA-512 (or of SHA-256) that would compute multiple SHA-2 digests in parallel (e.g., two instances of SHA-512 in 128-bit SIMD vectors with 64-bit elements) appear to be only a subject of academic papers. (There's also a recent paper on applying SIMD to message scheduling while computing a single instance of SHA-256 or SHA-512.) Additionally, bitslice implementations might turn out to deliver better performance on some CPUs and/or GPUs (and even if not yet, they may be valuable for relatively quickly re-testing this on future hardware). A possible source of speedup is the bit rotate operations, of which there are many in SHA-2, and which become no-ops in a bitslice implementation (on the other hand, many intermediate values will have to be in L1 cache rather than in registers, which may slow things down to a larger degree). We've previously demonstrated how it is possible to create a bitslice implementation of MD5 (original, revised), and SHA-2 are similar in design. The task here is to create both kinds of parallel implementations (regular SIMD and bitslice) and to compare their performance on actual hardware. Once we have SIMD and/or bitslice implementations of SHA-512 and/or SHA-256, we also need to build implementations of SHA-crypt (password hashes supported in glibc 2.7+ and in use by certain modern Linux distributions and by DragonFly BSD) on top of them, for use in JtR (hashing of multiple candidate passwords being tested in parallel). Ditto for other hash and cipher types supported by JtR that build upon SHA-2 (Mac OS X 10.7 salted SHA-512 password hashes, etc.)
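To illustrate why rotates are free in a bitslice layout while additions are not, here is a minimal Python sketch. It is not JtR code: the function names are hypothetical, and it uses 32-bit words (as in SHA-256) with Python integers standing in for machine vectors. Each "bit plane" holds the same bit position of many independent instances, so a rotate is only a re-indexing of planes, whereas modular addition needs a ripple-carry adder built from bitwise operations.

```python
WIDTH = 32  # SHA-256 word size; SHA-512 would use 64

def to_bitslice(words):
    """Transpose words into WIDTH bit planes (bit k of plane i = bit i of word k)."""
    planes = [0] * WIDTH
    for k, w in enumerate(words):
        for i in range(WIDTH):
            planes[i] |= ((w >> i) & 1) << k
    return planes

def from_bitslice(planes, count):
    """Transpose bit planes back into `count` ordinary words."""
    return [sum(((planes[i] >> k) & 1) << i for i in range(WIDTH))
            for k in range(count)]

def rotr_bitslice(planes, n):
    """Rotate right by n: pure re-indexing, no bitwise work at all."""
    return [planes[(i + n) % WIDTH] for i in range(WIDTH)]

def big_sigma0_bitslice(planes):
    """SHA-256 Sigma0 = ROTR2 ^ ROTR13 ^ ROTR22, only the XORs cost anything."""
    r2, r13, r22 = (rotr_bitslice(planes, n) for n in (2, 13, 22))
    return [x ^ y ^ z for x, y, z in zip(r2, r13, r22)]

def add_bitslice(a, b):
    """Addition mod 2**WIDTH as a ripple-carry adder (the expensive part)."""
    out, carry = [], 0
    for x, y in zip(a, b):
        out.append(x ^ y ^ carry)
        carry = (x & y) | (carry & (x ^ y))
    return out

def rotr32(w, n):
    """Reference (non-bitslice) rotate for comparison."""
    return ((w >> n) | (w << (WIDTH - n))) & 0xFFFFFFFF
```

In a real implementation the re-indexing would happen at compile time (by renaming variables), and the adder's cost, together with register pressure, is what decides whether bitslicing wins on a given CPU or GPU.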
We have a number of project ideas, where the student's role would correspond to completion of an entire software, research, and/or “community” project independent from our existing larger projects (albeit closely related to our activities in general).
Here are the short “role names” for individual tasks briefly described below. Please use these short names when you apply to work on one of the tasks.
blists development
Own creative and relevant idea
blists is our web-based interface to mailing list archives. It works off pre-indexed mbox files. This approach enables it to be extremely fast and lightweight: messages are located instantly (in at most a few disk seeks) and there's no need to cache pre-generated HTML page bodies. Even though we're making use of blists already (for publishing our own, hosted, and some third-party mailing lists on the web), it needs a lot more work (yet we failed to find time for work on it lately). Some of the things to add are support for character encodings for message bodies (converting them to UTF-8), thread view, and a search feature. At least the latter will require changes to the index file format (or a separate search index). Use of existing search libraries such as Xapian and/or implementing the search functionality entirely on our own are both within consideration. Please refer to this thread (click thread-next) for current status (as of early 2012).
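The idea of serving messages from a pre-indexed mbox file can be sketched as follows. This is not the actual blists index format: the fixed-size `(offset, length)` record and the function names are assumptions made for illustration, and message bodies are assumed to have any leading "From " escaped (as in ">From ").

```python
import struct

RECORD = struct.Struct("<QQ")  # hypothetical index record: (offset, length)

def build_index(mbox_path, index_path):
    """Scan the mbox once and write one fixed-size record per message."""
    starts = []
    offset = 0
    with open(mbox_path, "rb") as mbox:
        prev_blank = True  # the first line of the file may start a message
        for line in mbox:
            if prev_blank and line.startswith(b"From "):
                starts.append(offset)
            prev_blank = line in (b"\n", b"\r\n")
            offset += len(line)
    ends = starts[1:] + [offset]
    with open(index_path, "wb") as index:
        for start, end in zip(starts, ends):
            index.write(RECORD.pack(start, end - start))
    return len(starts)

def read_message(mbox_path, index_path, number):
    """Fetch message `number` (0-based) with one seek per file."""
    with open(index_path, "rb") as index:
        index.seek(number * RECORD.size)
        start, length = RECORD.unpack(index.read(RECORD.size))
    with open(mbox_path, "rb") as mbox:
        mbox.seek(start)
        return mbox.read(length)
```

Because every record has the same size, locating message N needs no scanning. A thread view or search feature would need extra fields or a separate index, which is why the text above expects index format changes.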
Your own creative and relevant idea - please propose it to us first, then describe it again when you apply
With few exceptions (such as for changes to existing Linux kernel code, which is under GPLv2 anyway), we require any contributed code to be made available under a cut-down BSD license. The wiki page linked from here is for JtR, but we'd like to use this approach for most other projects as well. By applying to work on one of the ideas with us, you signify your acceptance of these terms and your intent to license your code contributions accordingly.
This approach permits us to combine contributed code with differently-licensed third-party code, and it does not lock us to a specific Open Source license for our releases.
Additionally, it permits us to create and sell proprietary revisions of our programs, which we're currently doing with JtR Pro. As you can see from the feature sets of free JtR vs. JtR Pro, we're not abusing this ability to artificially cripple our free software. The free JtR remains the main one, where features get implemented first, with “Pro” being branched off some free versions for those users who prefer a pre-packaged “product”. Overall, the introduction of JtR Pro has helped development of the free JtR so far, by letting us spend more time on the project (vs. doing more client-facing work on other projects). We assume that students applying for work on JtR are comfortable with this.
These ideas are kept on record here, but are not currently offered as initial tasks to new contributors for a variety of reasons (such as our mentoring capacity).
JtR: automatic rule set generation
JtR: distributed processing, including a possible sub-task:
Greater interaction with running cracking sessions (e.g., with an ircII-like ncurses interface or/and a GUI) - such as to add/remove nodes on the fly
JtR: parallel processing (on one node) - not just further work on OpenMP support (initially integrated in 1.7.6), but also other approaches (not specific to individual hash types and achieving greater efficiency than OpenMP can provide for “non-slow” hashes)
JtR: integration of contributions
New password hashing method - claimed by: Yuri Gonzaga Gonçalves da Costa under GSoC 2011
Bitslice DES
Virtual distributed vector computer
Further research on and implementation of automatic rule set generation based on previously-cracked passwords. References: 1, 2
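As a rough illustration of what rule generation from previously-cracked passwords could look like, the sketch below matches each cracked password against a wordlist and counts the mangling rule that explains it. It is a deliberately simplified assumption, not an existing JtR feature: only the no-op (`:`), capitalize (`c`), and append-character (`$X`) rule commands are modelled, and the function names are hypothetical.

```python
from collections import Counter

def derive_rule(word, password):
    """Return a rule string turning `word` into `password`, or None.

    Only optional capitalization ("c") followed by appended characters
    (one "$X" command each) is considered.
    """
    for case_cmd, base in ((":", word), ("c", word.capitalize())):
        if base and password.startswith(base):
            appends = "".join("$" + ch for ch in password[len(base):])
            if case_cmd == ":" and appends:
                return appends  # the no-op is redundant before appends
            return case_cmd + appends
    return None

def rank_rules(wordlist, cracked):
    """Count how many cracked passwords each derived rule explains."""
    words = sorted(set(wordlist), key=len, reverse=True)
    counts = Counter()
    for password in cracked:
        for word in words:  # the longest matching base word wins
            rule = derive_rule(word, password)
            if rule is not None:
                counts[rule] += 1
                break
    return counts.most_common()
```

For example, with the wordlist `["password", "monkey"]` and cracked passwords `["Monkey1", "Password1", "monkey123"]`, the rule `c$1` is ranked first because it explains two of the three passwords. A real tool would need to cover many more rule commands and escape special characters.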
JtR distributed processing, including a possible sub-task:
Greater interaction with running cracking sessions (e.g., with an ircII-like ncurses interface or/and a GUI) - such as to add/remove nodes on the fly
JtR parallel processing (on one node) - not just further work on OpenMP support (initially integrated in 1.7.6), but also other approaches (not specific to individual hash types and achieving greater efficiency than OpenMP can provide for “non-slow” hashes)
JtR: integration of more hashes/ciphers/features/optimizations from the jumbo patch (and other user-contributed patches) into the official JtR - requires code cleanups, portability enhancements and testing, clearing up potential licensing issues (in some cases), etc. - or reimplementation
Linux kernel hardening - extract security hardening changes from various patches (which the mentor will point out), forward-port them to the latest mainstream kernels, make it easy to enable/disable the hardening measures (both compile- and runtime), add documentation, properly submit to and work with LKML (make proposals and own discussions to completion: either rejection or acceptance). This is a noble but thankless job to do, so be prepared! The authors of those changes did not submit them “properly” and did not “own discussions to completion” precisely because the job is so thankless.

Get better password security features into PHP proper (the PHP interpreter)
New crypt(3) flavor using concepts of scrypt (not only iterations, but also parallelism and memory), optionally making use of AES-NI
GPU or/and FPGA accelerated password hashing on servers (to better compete with similarly accelerated or distributed offline password hash cracking), optional local parameterization on specific hardware (parameter unreadable from host OS)
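To make the scrypt-style tunables (iterations/memory and parallelism) concrete, here is a small sketch using Python's standard `hashlib.scrypt`, which is available when Python is built against a recent enough OpenSSL. The `$sketch$` string encoding and the function names are purely hypothetical; they are not a proposed or existing crypt(3) format.

```python
import base64
import hashlib
import hmac
import os

def hash_password(password, salt=None, log2_n=12, r=8, p=1):
    """Encode the cost settings next to the salt and hash, crypt(3)-style.

    n = 2**log2_n sets CPU and memory cost (about 128 * n * r bytes),
    r is the block size, and p is the parallelism factor.
    """
    salt = os.urandom(16) if salt is None else salt
    key = hashlib.scrypt(password.encode(), salt=salt,
                         n=1 << log2_n, r=r, p=p, dklen=32)
    b64 = lambda raw: base64.b64encode(raw).decode()  # base64 never emits "$"
    return "$sketch$%d$%d$%d$%s$%s" % (log2_n, r, p, b64(salt), b64(key))

def verify_password(password, encoded):
    """Recompute with the stored parameters and compare in constant time."""
    _, tag, log2_n, r, p, salt, _ = encoded.split("$")
    expected = hash_password(password, base64.b64decode(salt),
                             int(log2_n), int(r), int(p))
    return tag == "sketch" and hmac.compare_digest(expected, encoded)
```

Storing the parameters in the hash string lets a server raise the cost for new hashes while still verifying old ones.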
We have some achievements in generating more optimal DES S-box expressions for bitslice implementations. A possible task for a student would be further work on this: community distributed processing project (with “agents” working on portions of the task) to arrive at even more optimal S-box expressions, potential application to S-boxes of other ciphers, paper on the approach, code cleanups of programs used to generate the S-box expressions and code, and public release of these programs.
“Virtual distributed vector computer”: no native machine code distributed to nodes (good for security), yet near native performance should be possible for suitable tasks (such as bitslice implementations of ciphers applied to key search attacks) given efficient implementation of “agents” for their target machine architectures
This task involves research, design, implementation, testing, practical use example, and a publication
This might be partially hampered by US Patent 5946496 (expiring in 2017); authors' web page. We have not reviewed these yet (found them when searching for possible existing implementations of the idea, which we did not find).
Develop intense unit tests for all libc interfaces, and test musl and other C libraries for correctness. Ideally the tests would be resilient against missing or buggy interfaces halting the test, so that partial results could be obtained even on certain incomplete or/and highly buggy C libraries.
This was worked on under GSoC 2011, and it was a half-success (6 out of 13 test categories were implemented).