How would one go about making their program do GBA LZ77 compression?

MarkyJoe1990 · November 28, 2015, 10:44pm

Relating to the whole procedurally generated map thing I’m working on, it requires the maps to be LZ77 compressed. I want the compression functionality to be IN my program instead of relying on, say, Nintenlord’s Compressor, since that eliminates unnecessary steps the user must do, improving work flow.

Some things I should mention; I’ve already looked at the source code for Nintelord’s Compressor, but it’s written in C#, and I’m doing this in Python, so… I’m not quite sure how I’d imitate the code.

Any help is appreciated.

Lisandra_brave · November 29, 2015, 3:01am

https://gbatemp.net/threads/nintendo-ds-gba-compressors.313278/
This includes a GPL’d one written in C, and a cursory search of the python documentation shows that you can relatively easily implement C function calls in python.
https://docs.python.org/2/extending/extending.html

Or you could reinvent the wheel.
http://feuniverse.us/t/a-question-on-lz77-in-progress/
has some discussion on the lz77 compression scheme,

Alternatively, you could say “fuck it” to compressing it at all. You can fake lz77 compression by inserting a 00 every 9th byte, then setting up the header. It’s terrible for size, but it works. I don’t recommend this one though.

If you’ve got a utility to read lz77 compressed data, you could just run it in reverse to re-compress it.

zahlman · November 29, 2015, 4:05am

I probably already have Python code for this somewhere. If not I can get it done tonight. It’s really not hard.

It’s not actually as simple as that makes it sound; it’s still straightforward, but the compression is harder to write than the decompression (this is true in general of compression algorithms, really).

EDIT: Here, have fun. I found some old code and modernized it (written for 3.x, should be fairly self-explanatory including the test at the end, let me know otherwise)

from itertools import takewhile, cycle

MAX_MATCH_LENGTH = 18
MAX_LOOKBEHIND = 1 << 12


def length_of_match(data, x, y, limit):
    """The number of matching consecutive characters starting at `x` and `y` in `data`."""
    return len(list(takewhile(
        lambda x: x[0] == x[1],
        zip(data[x:x+limit], data[y:y+limit])
    )))


def search(data, start, position, limit):
    """Look for the best previous match in `data` for the data at `position`.
    Matches are ranked by greatest length, then by shortest distance away
    (i.e., by greatest offset to the match position)."""
    candidates = range(
        max(start, position - MAX_LOOKBEHIND),
        position
    )
    
    return max(
        (length_of_match(data, previous, position, limit), previous)
        for previous in candidates
    ) if candidates else (0, 0)


def compress_raw(data, start, size):
    """Generator yielding blocks of either a single byte to output literally
    when decompressing, or a two-byte code for compression."""
    position = start 
    while position < start + size:
        limit = min(MAX_MATCH_LENGTH, start + size - position)
        match_length, match_location = search(data, start, position, limit)

        if (match_length > 2):
            encoded_distance = position - match_location - 1
            encoded_length = match_length - 3
            assert encoded_length & 0xF == encoded_length
            assert encoded_distance & 0xFFF == encoded_distance
            yield bytes([
                ((encoded_length << 4) + ((encoded_distance >> 8) & 0xF)),
                encoded_distance & 0xFF
            ])
            position += match_length 
        else:
            yield data[position:position+1]
            position += 1
    assert position == start + size 


def compress_gen(data, start, size):
    """Generator yielding the compressed data, in multiple `bytes` objects."""
    # Header
    yield bytes([0x10, size & 0xff, (size >> 8) & 0xff, (size >> 16) & 0xff])
    output_size = 0
    flags = 0
    chunk = b''
    # Group raw items into chunks of eight preceded by control bytes.
    # In the control byte, bits are set according to whether the corresponding
    # item is a compression code or a literal byte.
    for flag, item in zip(cycle((128, 64, 32, 16, 8, 4, 2, 1)), compress_raw(data, start, size)):
        chunk += item
        if len(item) == 2: flags |= flag

        if flag == 1: # finished a chunk; output control byte and chunk
            yield bytes([flags])
            yield chunk
            output_size += 1 + len(chunk)
            flags = 0
            chunk = b''
    # Emit any leftovers.
    if chunk:
        yield bytes([flags])
        yield chunk
        output_size += 1 + len(chunk)
    # Handle padding.
    yield bytes(-output_size % 4)


def compress(data, start, size):
    return b''.join(compress_gen(data, start, size))


def decompress(data, start):
    # Determine how much decompressed data to expect.
    header = data[start:start+4]
    assert header[0] == 0x10
    size = (header[3] << 16) | (header[2] << 8) | header[1]
    result = bytearray(b'')
    position = start + 4
    # Main loop.
    flags = []
    while len(result) < size:
        if not flags:
            flag_byte = data[position]
            position += 1
            flags = [bool((flag_byte << i) & 0x80) for i in range(8)]
        # Check the next flag and handle it accordingly.
        flag, *flags = flags
        if flag:
            # Interpret a compression code.
            first, second = data[position:position+2]
            position += 2
            match_length = (first >> 4) + 3
            encoded_distance = (first & 0xF) << 8 | second
            match_location = len(result) - encoded_distance - 1
            # Doing indexing math here in order to be able to handle
            # the 'wrap-around' behaviour elegantly.
            for i in range(match_length):
                result.append(result[match_location + i])
        else:
            # Interpret a literal byte.
            result.append(data[position])
            position += 1
    assert len(result) == size
    # Position may be less than len(data) due to padding.
    # In general, we won't know how much data to read anyway.
    return result


def test(source, compressed, verify):
    """Full demo of functionality."""
    junk = b'SPAMSPAMSPAM'
    junk_size = len(junk)
    chunk_size = 169

    with open(source, 'rb') as f:
        data = f.read()
    with open(compressed, 'wb') as f:
        # Compress the file in two chunks by two methods,
        # and put some filler before them.
        f.write(junk)
        first = compress(data, 0, chunk_size)
        f.write(first)
        f.write(junk)
        for item in compress_gen(data, chunk_size, len(data) - chunk_size):
            f.write(item)
    with open(compressed, 'rb') as f:
        data = f.read()
    with open(verify, 'wb') as f:
        # Restore the original data, by skipping the filler and
        # decompressing both chunks.
        f.write(decompress(data, junk_size))
        f.write(decompress(data, 2 * junk_size + len(first)))

(Ugh, Discourse removes blank lines from code blocks, apparently. That kinda sucks. Also I have to put a blank line between the opening spoiler tag and the opening code tag, at least for the preview to work. Curse you, jattwood.)

Edit: It’s been brought to my attention that Discourse fucks up the indentation inside the code block, which is death to Python code. Uh, try this I guess.

Incidentally, this is not particularly fast for compression (by which I mean I tested it on its own source and it took like 5 whole seconds on my not-particularly-terrible machine). I have some ideas for improvements, but it’s really not high priority right now. Also I couldn’t help myself trying to design a whole new format specifically for image data that should get better compression…

CT075 · November 29, 2015, 6:28pm

you could always actually learn the algorithm as well

MarkyJoe1990 · November 29, 2015, 10:04pm

What a great feeling to be told I don’t have to work tomorrow’s shift, AND people post helpful stuff in my thread. I’m pretty excited to try this out.

I was thinking of just blindly trial and erroring with Zahlman’s code (More accurately, assuming the compression function is what I’m looking for), but I decided to take CT075’s suggestion. I’m looking at the document and comparing it to the bytes of the compressed map in my hack. Also looking at the document CT linked. Things are starting to make sense, but… yeah, I think that PDF document would’ve been way harder for me to figure out on my own without the references I had. I’m not especially good at coding.

Anyway, once I manage to make a map, I’ll show it off. It’s not gonna be anything impressive, but I want to… um… I dunno. I guess it’d be nice to see if I showed you guys some progress.

EDIT:

… So 10 at the start (Or rather, 16 in hex) is the amount of bytes it’s gonna look at in a single “shot”. 16 bytes. EA 03 is the length of the raw data (EA 03 is actually 03EA, or 1002 in decimal). The last thing… uh… 00 0F… hm… I have no bloody clue.

The rest of the data, I’m pretty sure I already know what it is. 19 and 14 are the dimensions of the map in hex, and the rest is the compressed map data.

EDIT2: Is… is that 00 0F a “bitwise &?”

CT075 · November 29, 2015, 11:58pm

header is 4 bytes: 10 EA 03 00

10 - compression type: LZ77
EA 03 00 = 0x3EA = 1002 bytes = size of data

the 0F is the first byte of the compressed data

MisakaMikoto · November 30, 2015, 1:54am

I think you can compile the code written in other languages to a DLL and use it in your own code. It is easier than imitation. Even if you need to use others’ EXE directly, you can complete the step in your code so that your users needn’t do any extra steps.

CT075 · November 30, 2015, 5:04am

it’s very annoying and more than a little painful to use a .NET dll in a python script

MarkyJoe1990 · November 30, 2015, 9:11am

Wooooo! Progress!

Once I do a couple other things, the first “beta” should be ready, but uh… yeah. Right now, it only changes the first map because I’m bad at coding.

Thank you Zahlman and CT075 for the support over skype!

Lisandra_brave · November 30, 2015, 3:34pm

But what if you want to make the code portable to any platform? If you’re relying on a platform specific binary, then you’d need to do it differently for each platform.

MarkyJoe1990 · November 30, 2015, 10:29pm

More progress!

I can now create “pre-made” structures like thrones, and stuff like that X formation of forts.