Jawa is a rainy-day project to support inspecting, modifying, and creating JVM bytecode from Python. It’s a successor to an earlier project from 2010 which was used to magically parse new versions of Minecraft and find new network packets, entities, sounds, etc. It did this by looking for patterns in the bytecode and reconstructing higher-level objects based on what it found.

These days there are popular tools like Krakatau for producing human-readable output, but that kind of tool isn’t always the best option when your own code needs to act on the results.

Creation

Jawa can construct brand new ClassFiles from scratch - let’s try the classic “Hello World!” example:

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    """
    An example showing how to create a "Hello World" class from scratch.
    """
    from jawa import ClassFile
    from jawa.assemble import assemble

    if __name__ == '__main__':
        cf = ClassFile.create('HelloWorld')

        main = cf.methods.create('main', '([Ljava/lang/String;)V', code=True)
        main.access_flags.acc_static = True
        main.code.max_locals = 1
        main.code.max_stack = 2

        main.code.assemble(assemble([
            ('getstatic', cf.constants.create_field_ref(
                'java/lang/System',
                'out',
                'Ljava/io/PrintStream;'
            )),
            ('ldc', cf.constants.create_string('Hello World!')),
            ('invokevirtual', cf.constants.create_method_ref(
                'java/io/PrintStream',
                'println',
                '(Ljava/lang/String;)V'
            )),
            ('return',)
        ]))

        with open('HelloWorld.class', 'wb') as fout:
            cf.save(fout)

Now let’s give it a try:

    » java HelloWorld
    Hello World!

Success! Just like that, we’ve assembled a class that the JVM will happily run. You can compare this to the “Hello World!” example for Jasmin, the de facto standard for JVM assembly syntax. Both examples are similarly compact. We accomplish this by using the assemble() helper, which provides support for pseudo-assembly (including named labels and branches) and generates a stream of Instruction and Operand objects. This is what it would look like without that helper:

    from jawa.bytecode import Instruction, Operand, OperandTypes

    main.code.assemble([
        Instruction.from_mnemonic('getstatic', [
            Operand(
                OperandTypes.CONSTANT_INDEX,
                cf.constants.create_field_ref(
                    'java/lang/System',
                    'out',
                    'Ljava/io/PrintStream;'
                ).index
            )
        ]),
        ...
    ])

This is extremely precise and will always result in bytecode exactly as provided (even when it’s wrong), but you would quickly go insane doing this by hand, so it’s recommended to always use the assemble() helper.
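The descriptor strings in these examples, like ([Ljava/lang/String;)V, use the JVM’s compact type notation: each [ adds an array dimension, L and ; wrap a class name, and single letters stand for the primitives and void. As a rough illustration, here’s a small stdlib-only sketch that turns a method descriptor into something readable. parse_method_descriptor() and its helpers are hypothetical names made up for this example, not part of Jawa:

```python
# Hypothetical helpers for illustration only; none of this is Jawa API.
_PRIMITIVES = {
    'B': 'byte', 'C': 'char', 'D': 'double', 'F': 'float',
    'I': 'int', 'J': 'long', 'S': 'short', 'Z': 'boolean', 'V': 'void',
}


def _read_type(descriptor, pos):
    """Read one type starting at `pos`; return (readable_name, next_pos)."""
    dimensions = 0
    while descriptor[pos] == '[':
        dimensions += 1
        pos += 1

    if descriptor[pos] == 'L':
        # Object types run from 'L' up to the terminating ';'.
        # str.index() raises ValueError if the ';' is missing.
        end = descriptor.index(';', pos)
        name = descriptor[pos + 1:end].replace('/', '.')
        pos = end + 1
    else:
        name = _PRIMITIVES[descriptor[pos]]
        pos += 1

    return name + '[]' * dimensions, pos


def parse_method_descriptor(descriptor):
    """Return (list_of_parameter_types, return_type)."""
    if not descriptor.startswith('('):
        raise ValueError('method descriptors start with "("')

    pos = 1
    params = []
    while descriptor[pos] != ')':
        type_, pos = _read_type(descriptor, pos)
        params.append(type_)

    return_type, pos = _read_type(descriptor, pos + 1)
    if pos != len(descriptor):
        raise ValueError('trailing data after the return type')
    return params, return_type


if __name__ == '__main__':
    print(parse_method_descriptor('([Ljava/lang/String;)V'))
    print(parse_method_descriptor('(Ljava/lang/String;)V'))
```

With this, parse_method_descriptor('([Ljava/lang/String;)V') gives (['java.lang.String[]'], 'void'), which lines up with main’s String[] parameter and void return. An object type that is missing its closing ; raises a ValueError.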

Modification

We can also easily modify existing classes. Let’s take our Hello World! example from the last section and turn it into a Hello Mars! example:

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    import sys

    from jawa import ClassFile
    from jawa.assemble import assemble


    def main():
        with open(sys.argv[1], 'rb') as fin:
            cf = ClassFile(fin)

            # We aren't doing HelloWorld any more, so let's fix the name of
            # our class!
            cf.this = cf.constants.create_class('HelloMars')

            # We could just modify the "hello world!" string in the constant
            # pool, but where is the fun in that? Instead, we're going to
            # disassemble the main method, find the 'ldc' that loads the string
            # constant to the stack, and change it to point to a new constant.
            main = cf.methods.find_one(name='main')

            new_main = []
            for instruction in main.code.disassemble():
                if instruction.mnemonic == 'ldc':
                    # We could build an Instruction and Operand object ourselves,
                    # or use the `assemble()` utility to do it for us.
                    new_main.extend(
                        assemble((
                            ('ldc', cf.constants.create_string('Hello Mars!')),
                        ))
                    )
                else:
                    # We only wanted to patch the 'ldc', everything else we want
                    # to keep.
                    new_main.append(instruction)

            main.code.assemble(new_main)

            with open('HelloMars.class', 'wb') as fout:
                cf.save(fout)


    if __name__ == '__main__':
        sys.exit(main())

Let’s give our newly modified class a try:

    » java HelloMars
    Hello Mars!

Success!
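One caveat when patching an ldc like this: at the bytecode level, ldc encodes its constant pool index in a single byte, so it can only reach indexes up to 255. In a tiny class like ours the new string lands well below that, but in a large class a freshly created constant may need ldc_w, which carries a two-byte index. Whether assemble() widens the instruction for you isn’t covered here, so treat the helper below as a hedged sketch; ldc_for_index() is a hypothetical name, not part of Jawa:

```python
# Hypothetical helper for illustration only; not part of Jawa.
def ldc_for_index(index):
    """Pick the mnemonic able to encode constant pool `index`.

    Only meaningful for single-slot constants (strings, ints, floats,
    classes); longs and doubles are always loaded with `ldc2_w`.
    """
    if not 0 < index <= 0xFFFF:
        raise ValueError('constant pool indexes run from 1 to 65535')
    # `ldc` has a 1-byte operand, `ldc_w` a 2-byte operand.
    return 'ldc' if index <= 0xFF else 'ldc_w'
```

Assuming the new constant exposes an .index attribute the way the field reference in the Creation section does, you could pick the mnemonic with ldc_for_index(constant.index) rather than hard-coding 'ldc'. Keep in mind that ldc_w is one byte longer than ldc, which matters in methods that contain branches.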

Inspection

As a very basic example, let’s make a grep tool. This version only checks the values of string constants and lets us filter them with a regular expression.

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    USAGE = """jvmgrep.py

    Usage:
        jvmgrep.py <pattern> <file>...
    """
    import re
    import sys

    from docopt import docopt

    from jawa import ClassFile
    from jawa.constants import ConstantString


    def main():
        args = docopt(USAGE)

        pattern_r = re.compile(args['<pattern>'])
        matcher = lambda c: pattern_r.findall(c.string.value)

        # Regex to erase excessive whitespace from results.
        whitespace_r = re.compile(r'(\s{2,}|\t|\r|\n)')

        for file_ in args['<file>']:
            with open(file_, 'rb') as fin:
                cf = ClassFile(fin)

                # Filter the constants down to just types of ConstantString which
                # also have at least one result for `pattern`.
                query = cf.constants.find(ConstantString, f=matcher)

                # Show our results ...
                for i, constant in enumerate(query, 1):
                    print('[{cf.this.name.value}][{i:03}] -> {s}'.format(
                        cf=cf,
                        i=i,
                        s=whitespace_r.sub(
                            ' ',
                            constant.string.value
                        ).strip()
                    ))


    if __name__ == '__main__':
        sys.exit(main())

Let’s use our new tool to find possible links buried inside the Minecraft JARs:

    » python jvmgrep.py "http(s?)://" minecraft/**/*.class
    [cb][001] -> http://www.amd.com/
    [cb][002] -> http://www.nvidia.com/
    [cz][001] -> http://s3.amazonaws.com/MinecraftResources/
    [dc][001] -> http://s3.amazonaws.com/MinecraftSkins/
    [gs][001] -> http://s3.amazonaws.com/MinecraftCloaks/
    [kp][001] -> https://login.minecraft.net/session?name=
    [nb][001] -> http://www.minecraft.net/game/joinserver.jsp?user=
    [xz][001] -> http://s3.amazonaws.com/MinecraftSkins/

Success! This was a pretty trivial example, but that’s just how easy it is to open a ClassFile and start poking around.
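For a sense of what has to be parsed before any of that poking around can happen, here’s a stdlib-only sketch that reads just the fixed-size header at the start of a .class file: the 0xCAFEBABE magic number, the minor and major version, and the constant pool count. read_class_header() is a hypothetical name for this example and says nothing about how Jawa does it internally:

```python
import struct

# Hypothetical helper for illustration only; not part of Jawa.
CLASS_MAGIC = 0xCAFEBABE


def read_class_header(data):
    """Decode the first 10 bytes of a class file into a dict."""
    if len(data) < 10:
        raise ValueError('too short to be a class file')

    # Big-endian: u4 magic, u2 minor, u2 major, u2 constant_pool_count.
    magic, minor, major, pool_count = struct.unpack('>IHHH', data[:10])
    if magic != CLASS_MAGIC:
        raise ValueError('bad magic number, not a class file')

    return {
        'minor': minor,
        'major': major,
        # The stored count is one more than the number of pool entries.
        'constant_pool_count': pool_count,
    }
```

You could feed it the first bytes of any class file, e.g. read_class_header(open('HelloWorld.class', 'rb').read(10)). Everything after this header (the constant pool itself, fields, methods, attributes) is variable-length, which is where a library earns its keep.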

More!

Jawa has extensive documentation - give it a try.