DSA Hype Thread

zahlman · October 9, 2019, 3:02am

This thread is defunct. Please go to the new thread.

So first off, big shout-out for everyone who got a gameplay hack turned in on time for FEE3. Y’all are awesome and your content is about to be shown off by some really cool LPers.

I’m still working on the promised dsa v.2.x.y and I’m really liking the progress I’m making so far - my vision is coming together, the latest iteration of an idea that’s been stuck in my head for literally over 15 years, since even before I knew anything about Fire Emblem. By the time the show is underway I am expecting to have a public release out and hopefully some kind of promotional video to include in the actual FEE3 showcase.

With that said, I thought I’d take a moment to hype up the project and tackle some “Frequently Asked” Questions. Also in this thread I’ll be posting progress updates, teasers etc.; and when public releases start I’ll be updating this OP with links.

What Is DSA?

DSA stands for Data Structure Assembler.

It’s analogous to EA (Event Assembler), but more general-purpose in intent and design. I’m making it to be a true “everything editor” from the ground up, with a more powerful language raws system (internally these files are either type description files or structgroup description files) that gives types to data and names to values, and a plugin system for things that can’t be handled in EA style.

What can I do with it and how does it work?

In short, everything you can do with the ROM. In theory.

The general look of your file should be familiar: there are “labels” that indicate the start of chunks of data, and then the data mostly is presented as a sequence of “struct” lines (for example, in event code there is one line per event opcode; but a struct could also represent for example a row of a Nightmare table). But there are a bunch more neat features on top of this, and also you don’t need to mess around with a preprocessor to do useful things (and in fact I am not including one for now).
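To make the "labels plus struct lines" idea concrete, here is a minimal Python sketch. It is not DSA syntax or DSA code: the row layout, the label names and the `assemble` helper are all made up for illustration, and the only real-world fact assumed is that GBA ROM data is addressed from 0x08000000.

```python
import struct

# Hypothetical row layout, for illustration only:
# u8 id, u8 level, u16 hp, u32 pointer (little-endian, no padding).
ROW = struct.Struct("<BBHI")

def assemble(chunks, base=0x08000000):
    """Pack labelled chunks of rows into bytes; return (data, label -> address)."""
    out = bytearray()
    labels = {}
    for label, rows in chunks:
        labels[label] = base + len(out)  # a label marks where its chunk starts
        for row in rows:
            out += ROW.pack(*row)        # one "struct line" becomes one packed row
    return bytes(out), labels

data, labels = assemble([
    ("units", [(1, 5, 20, 0), (2, 7, 24, 0)]),
    ("bosses", [(9, 20, 60, 0)]),
])
```

Each tuple plays the role of one struct line, and the label dictionary is what would let other data point at a chunk by name instead of by raw address.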

The plugin system allows DSA to extract compressed data and convert file formats (filter plugins), as well as to present the data in other ways or just toss data into separate files (interpreter plugins). You can chain filters together and dump the converted data in a file - the current version under development can already take apart FE7 portrait array data and produce PNGs.
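Chaining filters can be pictured as plain function composition over bytes. The sketch below is an assumption about the general shape only, not the DSA plugin API: `chain` and `strip_header` are hypothetical names, and `zlib` stands in for a real decompression filter.

```python
import zlib
from functools import reduce

def chain(*filters):
    """Compose bytes -> bytes filters, applied left to right."""
    return lambda data: reduce(lambda acc, f: f(acc), filters, data)

def strip_header(data, size=4):
    """Hypothetical filter: drop a fixed-size header before the payload."""
    return data[size:]

# Stand-in pipeline: remove a 4-byte header, then decompress the rest.
unpack = chain(strip_header, zlib.decompress)

payload = b"tile data " * 8
blob = b"HDR0" + zlib.compress(payload)
result = unpack(blob)
```

The output of one filter is simply the input of the next, so dumping the final result to a file is a single write at the end of the chain.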

Even more than that - DSA is intended as a fully general-purpose tool, beyond hacking or even ROMs into editing whatever binary files. If it has some concept of a pointer to somewhere else in the data, DSA is for you. You just need the right types/structgroups/filters/interpreters.

Why would I use this over other tools?

Depends on the tool.

As compared to things like Nightmare/Nightmare 2 or nmm2csv/c2ea in conjunction with EA, you would use DSA because it’s a one-stop shop that replaces all of that.

As compared to graphical tools, you would use DSA for mostly the same reasons that you would use EA, if EA could do them. Graphical tools that insert stuff directly into the ROM are nice and friendly enough, but when something goes wrong it can be hard to track down the problem yourself, and you probably don’t have a very organized system for backing up your ROM. Keeping a set of readable text files with your changes lets you rebuild from scratch whenever you need to, and it also lets you use a version control system more effectively.

As compared to EA, you would use DSA because it can do those things.

(This is the part where I bash on EA a bit. It feels kinda bad doing this because honestly NintenLord did an amazing job with the project - the “language raws” system resulted from him taking a suggestion of mine and going above and beyond, to the point where I was smacking my head and wondering why I hadn’t thought of it. But it’s been years now, and progress marches ever onward…)

In particular:

- When you disassemble from the ROM, the output automatically fills in “enumeration” names that would normally only be available via macros (i.e., only during assembly), and puts labels on important ROM addresses. So you have a head-start on modifying existing data because you can clearly see what everything is.
- The disassembly language is more powerful; it’s not based on assuming the presence of an event ID code. So you can easily use it to take apart table data that you would otherwise describe with .NMMs - and in fact, basic nmm2dsa and ea2dsa conversion tools are provided. But of course, it also lets you represent things that Nightmare doesn’t understand, like pointers and custom data types.
- The plugin system means you can go beyond the sorts of editing you’re used to doing with EA and into the “everything editor” realm, except instead of relying on #incext calling other programs you can use Python code that runs in the same process and is custom-built for purpose. (I also have in mind a system for caching per-file assembly results so you don’t have to wait around every rebuild for all your images to re-convert.)
- Oh, and have you ever read the EA license? It’s honestly kind of a mess. I’m putting a lot more thought into the DSA license. I want to avoid any potential confusion or fighting down the road.
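For the first point in the list above (enumeration names filled in at disassembly time), a rough Python sketch of the idea might look like this. The `WeaponType` enum, its values and the `render` helper are invented for illustration and say nothing about how DSA describes or stores its types.

```python
from enum import IntEnum

class WeaponType(IntEnum):
    # Hypothetical values, not taken from any real table.
    SWORD = 0
    LANCE = 1
    AXE = 2

def render(value, enum_cls):
    """Show a raw value by name when one is known, else as hex."""
    try:
        return enum_cls(value).name
    except ValueError:
        return f"0x{value:02X}"

named = render(1, WeaponType)       # a known value gets its name
unnamed = render(0x7F, WeaponType)  # an unknown value falls back to hex
```

The point is that the name lookup happens when reading the ROM, so the text you edit already says `LANCE` rather than `1`.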

How do I start using it on my existing projects?

Simple. I think/hope.

Proper installation instructions will be provided with the release, but basically you will need to pip install two packages - dsa and dsa-extras - into your Python virtualenv sandbox (or your actual Python installation, I guess, if you like to live dangerously), and then run a command that dsa-extras provides to notify dsa of the added available extras.

To migrate your EA files and other such content, the recommended approach is:

1. Build your ROM as before, with the old tools.
2. Use DSA to disassemble the built ROM.
3. Use the resulting disassembled files going forward.

If you need help with this, let me know and I’ll see what I can do for you.

What is the current version and where do I get it?

By my reckoning, the appropriate version number for the in-development version as I make this edit is (update:) 2.8.226+125, and this release will probably be numbered as somewhere in 2.10.x+y. (The +y part refers to commits/changes to the dsa-extras.)

What are these `dsa-extras`?

The stuff that makes DSA work specifically with GBAFE, basically.

Because DSA is designed as a fully general-purpose tool that I want to be able to show off to people in a professional context, and not have to dance around the topic of ROM hacking, things that are FE- or even GBA-specific have been shuffled off into a separate project. Sorry for adding the extra step. I promise I am trying to make this as smooth as possible.

Also, the dsa-extras project is separately licensed. It mostly consists of data that can be licensed under a permissive Creative Commons license (still thinking about this), and I’m willing to put the plugins under WTFPL just to not have to think about it. dsa itself is more restrictively licensed, for a variety of personal reasons.

How can I help make DSA even more awesome?

If you make or maintain a graphical tool for FE hacking, it would be amazing of you to design it to (or at least add the option) output data for DSA to assemble, rather than editing the ROM directly. Also, you can contribute content for dsa-extras (as long as you’re ok with the licensing).

Klokinator · October 9, 2019, 3:12am

No. Who would? You madman. Always click accept on terms and conditions and skip licenses. Welcome to the real world, jackass!

zahlman · October 9, 2019, 3:13am

Get Hype. Greyliwood demands it.

greyliwood

(The palette information is extracted separately and there isn’t really a way - at least, yet - to tell DSA to combine them. But on the assembly side, it will at least be able to pull multiple pieces of information out of the same .png file.)

Zeta · October 9, 2019, 3:24am

Will this be able to handle variable size structs? Game I’m working on has structs for enemies that include both their names and descriptions in the struct, which obviously makes EA useless for it. ~~been directly hex editing things so far and it’s terrible~~

Sme · October 9, 2019, 3:30am

I may be reading this incorrectly, but are you implying that you would use DSA to build a ROM, you would edit the output ROM, and then DSA would mark your changes? Or, is this just the standard EA buildfile approach of rebuilding from scratch each time you make edits? Assumedly it’s the latter, but the wording here is confusing me.

The latter two of these alone resolve two of the biggest small things (hue) that bother me and the first is the single hypest feature I could possibly imagine

What about EA assembly installers, would DSA be able to pick up on their existence in the built ROM? I doubt it would, would the aforementioned ea2dsa tool be able to convert the installers such that they work properly with DSA? What of EA installers that rely on stan’s lyn utility, would DSA (or presumably dsa-extras) have equivalent functionality inherently, or would that data need pre-processing before being formatted for DSA?
Looking to graphics, DSA has its graphics functionalities, but would graphics installed with EA’s utilities be able to be directly translated to DSA’s format? Even still, it seems that converting an EA buildfile to DSA would be more complex than you lay it out to be here.
~~I also assume DSA won’t have the #ifdef definition vs label problems that EA does~~

zahlman · October 9, 2019, 4:21am

I would need a more precise explanation of how the data is laid out, but I’m sure something could be figured out. I really haven’t designed for variable-length things that are in-line with other stuff, as opposed to like, just a string pool (sequence of null-terminated strings). I imagine the ASM that works on that data must be pretty gnarly too. x.x
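For reference, the "string pool" case mentioned here is the easy one. A minimal Python sketch of reading such a pool, assuming ASCII text and a hypothetical `read_string_pool` helper (this is not DSA code):

```python
def read_string_pool(data, encoding="ascii"):
    """Split a pool of null-terminated strings; return (offset, text) pairs."""
    entries = []
    offset = 0
    while offset < len(data):
        end = data.find(b"\x00", offset)
        if end == -1:
            raise ValueError(f"unterminated string at offset {offset}")
        entries.append((offset, data[offset:end].decode(encoding)))
        offset = end + 1  # skip the terminator and start the next string
    return entries

pool = b"Goblin\x00Orc\x00Dragon\x00"
entries = read_string_pool(pool)
```

Every entry starts right after the previous terminator, so offsets can be computed without any outside metadata. Inline variable-length fields are harder precisely because the fixed-size fields around them move with each string.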

zahlman · October 9, 2019, 4:31am

The latter, but hopefully with some kind of caching system so you don’t really have to start from scratch - you just have the option if things mess up.

Actually the version showcased last year already did this. Well, for the enumerations, not the labels.

~~I haven’t started on the within-group labels part yet.~~

This might be a bit complicated to explain x.x

So, ea2dsa is just for transforming the language raws, like if you know something about event opcodes that I don’t (mine are probably way out of date). It’s mostly just to showcase what DSA structgroups look like. (Although you could also just look at the pre-converted structgroup definitions, which also have the advantage of being tailored to use the features EA doesn’t.)

The plan for assembly stuff was to have an interpreter plugin that just shells out to an assembler (unless someone wants to make it work natively cough cough @CT075 ) so you just write the .s file and reference it in your project, and do the old-fashioned as/objcopy shenanigans.
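A plugin that shells out could be as small as the following Python sketch. Everything here is an assumption rather than the planned plugin: the `build_commands` and `assemble` names are hypothetical, and it presumes a GNU `arm-none-eabi-` toolchain on the PATH with Thumb code targeting the ARM7TDMI.

```python
import subprocess
from pathlib import Path

def build_commands(source, prefix="arm-none-eabi-"):
    """Return the as/objcopy command lines for one .s file, plus the output path."""
    src = Path(source)
    obj = src.with_suffix(".o")
    out = src.with_suffix(".bin")
    commands = [
        # Assemble to an object file.
        [prefix + "as", "-mcpu=arm7tdmi", "-mthumb", str(src), "-o", str(obj)],
        # Strip the object file down to raw machine code.
        [prefix + "objcopy", "-O", "binary", str(obj), str(out)],
    ]
    return commands, out

def assemble(source):
    """Run the toolchain and return the raw bytes to place in the ROM."""
    commands, out = build_commands(source)
    for cmd in commands:
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return out.read_bytes()

commands, out = build_commands("hack.s")
```

Keeping command construction separate from execution makes it possible to inspect or log the exact command lines without having the toolchain installed.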

DSA would be able to pick up whatever you tell it to pick up with structgroups etc., and it would be picking that up from the assembled in-ROM form which for “assembly installers” is presumably just machine code. So again that would be the “unpacking” side of the interpreter plugin.

DSA doesn’t care how the graphics got into the ROM; the plugins in question are operating directly at the level of “decompress LZ77, rearrange tiles, compress PNG” and backwards from there. All you need to know is where there’s a pointer to those graphics, whether they’re compressed and so on.
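As an illustration of the "decompress LZ77" step, here is a small Python decompressor for the GBA BIOS-style format (type 0x10) as I understand it: a 4-byte header holding the type and a 24-bit size, then flag bytes read from the high bit down, where a set bit means a 2-byte back-reference. This is a sketch of the format only, not the `lz77` filter from dsa-extras, and `lz77_decompress` is a hypothetical name.

```python
def lz77_decompress(data):
    """Decompress GBA BIOS-style LZ77 (type 0x10) data."""
    if data[0] != 0x10:
        raise ValueError("not LZ77 type 0x10 data")
    size = int.from_bytes(data[1:4], "little")  # uncompressed size
    out = bytearray()
    pos = 4
    while len(out) < size:
        flags = data[pos]
        pos += 1
        for bit in range(7, -1, -1):  # flag bits are read from the high bit down
            if len(out) >= size:
                break
            if flags & (1 << bit):
                # Back-reference: 4-bit (length - 3), 12-bit (distance - 1).
                b0, b1 = data[pos], data[pos + 1]
                pos += 2
                length = (b0 >> 4) + 3
                distance = (((b0 & 0x0F) << 8) | b1) + 1
                for _ in range(length):
                    out.append(out[-distance])  # byte-wise copy allows overlap
            else:
                out.append(data[pos])  # literal byte
                pos += 1
    return bytes(out)

# Two literals ("A", "B") followed by one back-reference (length 6, distance 2).
sample = bytes([0x10, 0x08, 0x00, 0x00, 0x20, 0x41, 0x42, 0x30, 0x01])
decoded = lz77_decompress(sample)
```

Copying byte by byte matters: a back-reference may overlap the bytes it is still producing, which is how short repeating patterns get encoded.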

As for how hard it will be, I really don’t know because it hasn’t been tried before. Obviously it depends on the complexity of the hacking, and also the picture will be much clearer once I have this released and you can see how it operates on the base ROM.

~~And yes there are no such problems. Shh bby is ok, no #ifdefs now only dreams (and labels).~~

CT075 · October 9, 2019, 4:46am

implementing an ARM7-compatible assembler in python sounds like a mistake, i don’t know what idiot would ever try that

7743 · October 9, 2019, 6:05am

I look forward to publishing your tools.

> Graphical tools that insert stuff directly into the ROM are nice and friendly enough, but when something goes wrong it can be hard to track down the problem yourself, and you probably don’t have a very organized system for backing up your ROM.

FEBuilderGBA has an automatic backup system and a diff debug tool that uses it.
With diff debug tool, most problems can be solved within 10 minutes.

If you can’t believe what I say, watch this video.
I solve the crashing bug in about 10 minutes.

Is it difficult to learn the diff debug tool?
No, it’s easy.
Just press the button.

In a source management system, the save button is equivalent to the TAG function.
And the diff debug tool is equivalent to merging between revisions.

> I also have in mind a system for caching per-file assembly results so you don’t have to wait around every rebuild for all your images to re-convert.

If you want to redo the EA, I would like to suggest that you remove the ifdef.
The factor that slows down the C compiler is ifdef.
The same result cannot be guaranteed when it is included twice because of ifdef.
As a result, the build time has increased.

#pragma once

I hope that include guard will be adopted.

zahlman · October 9, 2019, 6:23am

There is no preprocessor at all right now, so no #ifdef. There won’t be a preprocessor until there is proof of it being really needed. Hopefully I can avoid it completely.

I have not thought very much yet about how to manage multiple source files, so I do need to get that done soon. I am hoping to make it work more like module imports in a language like Python (or C# or Java). But I don’t know any details yet.

This diff debug tool looks very nice. It’s good just to know that the diffs are tracked like this.

Zeta · October 9, 2019, 2:07pm

(right click open image in new tab)
Here’s a screenshot, highlighted stuff is a single struct. It goes kind of like this:
20 41 <two words that do things I don’t understand, first five bytes are often blank> 00 up until the ‘01’, which is enemy level. Then other stats are five bytes apart. 5B attack, 5B defense, 00 magic attack, 5A magic defense. Then I stopped analyzing it because I was just interested in changing enemy stats (insert :hector: here). But as you can see there are a bunch of strings in there - enemy name, enemy description, name of animation/model/AI files. If it’d help I could just send you a copy of the file in question in PM.

Edit: There’s also some kind of metadata describing the file/structs at the start of it, would that require any extra handling? I think there’s a ‘number of entries’ in there somewhere, had to change it once for another file, but other than that I haven’t touched that part.

It’s a modern x86 game ~~so yes~~.

Edit #2: What about multiple target files? I have an entire folder of tables as separate files.

zahlman · October 9, 2019, 3:14pm

It does look like something that would benefit from special support in the engine, of a sort that I’ve thought about doing but probably won’t get to for a while yet. The main issue is deciding how flexible I want it to be and how to describe it in the structgroup/type definitions. (Also the disassembly engine, as it stands, needs to be able to compute a byte regex for the struct - which might be a problem for cases that aren’t as trivial as null-terminated strings.)
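To show what a "byte regex for the struct" can look like in the trivial null-terminated case, here is a Python sketch using the standard `re` module on bytes. The layout (2-byte id, 1-byte level, null-terminated name) and the `scan` helper are hypothetical. They are not the layout from the screenshot and not how the disassembly engine is written.

```python
import re

# Hypothetical layout: 2-byte id, 1-byte level, then a null-terminated name.
# (?s) lets "." match any byte, including 0x0A.
ENTRY = re.compile(rb"(?s)(?P<id>.{2})(?P<level>.)(?P<name>[^\x00]*)\x00")

def scan(data):
    """Match consecutive entries from the start of data; stop at the first mismatch."""
    pos = 0
    found = []
    while pos < len(data):
        m = ENTRY.match(data, pos)
        if m is None:
            break
        found.append((
            int.from_bytes(m["id"], "little"),
            m["level"][0],
            m["name"].decode("ascii"),
        ))
        pos = m.end()  # the next entry starts right after the terminator
    return found

blob = b"\x01\x00\x05Goblin\x00\x02\x00\x09Orc\x00"
found = scan(blob)
```

Fixed-size fields become `.{n}` and a terminated string becomes "anything but the terminator, then the terminator". Length-prefixed or header-counted content does not reduce to a regular expression this easily, which is the problem case described above.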

That’s not a real issue as I see it, you just tell dsa/dsd what the source and target are for each invocation. For a larger project you’d want some kind of makefile, of course.

Zeta · October 9, 2019, 3:21pm

I’ll be looking forward to it if/when you decide to support it, then.

Fair enough.

zahlman · October 9, 2019, 4:14pm

Progress tracker / Todo list

Since last year’s demo:

[+] New formats for type, structgroup and data description (i.e., disassembly) files
[+] Use path config files to specify which library files to use
[+] More detailed output logging system
[+] New core code for disassembly algorithm
[+] Support for “filter” plugins
[+] Support for quoted-string tokens (so filenames, extracted strings etc. don’t get mangled)
[+] Support for byte-array struct members, with ability to specify string encoding
[+] Support for “interpreter” plugins
[+] Ability to configure additional “system” library paths (i.e., support for dsa-extras without a tricky manual install step)
[+] Chunk size can now be determined implicitly by the first filter applied to the chunk
[+] Added to standard library: hexdump type and hex structgroup for hex-dump outputs; file interpreter to dump chunk data to separate binary files; size filter to track the size of chunks (this used to be hard-coded into the disassembly and not actually do anything, but now there’s actual filter support so it was the first thing that needed to be written)
[+] nmm2dsa tool
[+] ea2dsa tool (includes a bunch of workarounds so as not to fail completely on the outdated raws I have)
[+] Filters for handling image data: lz77, png, tileimg
[+] portrait filter and additional config data to rearrange ROM images into standard spritesheet
[-] Clean up structgroups generated by the tools; organize types and provide more useful ones
[+] Improvements to error reporting
[+] Finalize API for filter plugins
[+] Finalize spec for structgroup definitions
[-] Auto-generated labels inside chunks; system for referencing them as either addresses or indices
[ ] Ability to omit trailing type fields if they match “default” values
[ ] Ability to treat empty tokens ([]) as “default” values for struct members (only if every field has a default)
[ ] Installer spit and polish for release
[?] Support for chunks where the length is implied by the content of a header struct
[?] Audio definitions
[?] Text (Huffman) support

For future releases:
[ ] ASM support
[ ] Repeatable single-member structs (like how WORD etc. work in EA)
[ ] Inline variable-length content (the thing @Zeta is talking about above)
And probably a whole bunch more that I’m too unfocused to think of right now.

Update 1: Cleaning up structgroups has been slow going. I ran into more need for bug-fixing than I expected, and I came up with a bunch of improvements for nmm2dsa and ea2dsa that I hadn’t counted on either. Plus there will be an absolute ton of doc to write at the end.

Still feeling optimistic overall.

Update 2 (Oct 14): Within-chunk labels are correctly generated, formatted and parsed. Now to make them useful (i.e., make references to them work, and make the disassembler detect and use them). This is one of the biggest things I wanted to accomplish ~~and it makes @Sme hype~~.

I keep finding myself making small tweaks to stuff I thought was done, so even nmm2dsa is probably still not release-ready yet - which is kinda annoying, but that’s how it goes. There’s also some other minor functionality I got working that’s not on the checklist, but is too hard to explain to someone who isn’t already using/testing the code.

Also, structgroup cleanup is going pretty well.

Update 3 (Oct 16): Structgroup cleanup is basically done for FE7 NMMs. Moving on to events now. There’s a good chance I release before starting on FE8/6 defs, but rest assured they’ll be coming right down the pipeline. The current test produces about 4MB of binary data across almost 2000 tiny .png and .dat files (in addition to the main assembly which is over 1MB in itself). Between this and what I know about events, text and audio, we actually have quite a large fraction of the FE7 ROM mapped.

zahlman · October 24, 2019, 1:47am

Okay so

I had to take a few days off for mostly-unexpected family/household/life stuff, and then I fell out of my routine a bit and also the family/household/life stuff is not entirely dealt with yet.

So I’m now considerably behind the schedule I thought I was on. Still going to get as much done as I can by Hallowe’en, and hopefully keep working into November as well. But there might not be a video at all for the showcase, or only a very basic slideshow (I also don’t really have experience with video editing). I had wanted to make a showcase hack using DSA for everything, but that takes time even for silly basic stuff and also it’s looking like I won’t get as far as text support this time.

Pikmin1211 · October 24, 2019, 12:40pm

No rush, and good luck with the stuff

zahlman · December 24, 2019, 7:30am

Work has resumed. (I also did a little bit in early-mid November.)

Structgroups are cleaned up to my liking - for now, and for FE7. It’s good enough to be usable, but I’m going to need feedback and support from others to get it where I really want it.

Planning on the following before release:

Hopefully these parts won’t take too long, but holidays being what they are, this might not be released until sometime in January. I do feel like I need to include the text support though, because IMO it doesn’t feel like a functional, all-purpose tool without that.

The release will be numbered as 2.10.x+y assuming this goes according to plan (one minor version bump for finishing off the label system, and one for the two other convenience “abilities”). Currently at 2.8.226+125.
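For anyone scripting around these version strings, here is a small Python sketch of splitting one apart. The field names (in particular `patch`) and the `parse_version` helper are my own labels, not official terminology. The only things taken from the post are the `a.b.c+d` shape and that the `+d` part counts dsa-extras changes.

```python
import re
from typing import NamedTuple

class Version(NamedTuple):
    major: int
    minor: int
    patch: int
    extras: int  # the "+y" part: commits/changes to dsa-extras

def parse_version(text):
    """Parse a string shaped like '2.8.226+125' into its four numbers."""
    m = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)\+(\d+)", text)
    if m is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return Version(*map(int, m.groups()))

current = parse_version("2.8.226+125")
# Two planned minor bumps (label system, then the two convenience abilities).
planned_minor = current.minor + 2
```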

Zeta · December 24, 2019, 3:14pm

Good to hear stuff’s moving. I’ll definitely be putting it through its paces with non FE tables pretty quickly (as I said on Discord I do have files without variable length inline content).

If you’re cool with it, it’d be nice to see an example of what some definition / output files for DSA look like. Completely understandable if things are still WIP enough that you don’t want to, though.