A New Text Processor

MintX · December 9, 2021, 4:59pm

TextProcessX

Advanced text processor for buildfiles.

What’s new

New Features

TextProcessX provides automatic newline insertion, automatic [A] insertion, automatic narrowing, and overflow warnings.

TextProcessX no longer depends on calling ParseFile multiple times, which gives a significant performance boost. Text building and parsing should take less than 1 second on most PCs.

[X] is no longer required, as text no longer falls through into different entries.

Backwards Compatibility

TextProcessX is mostly backwards compatible with classic TextProcess and even Bly’s extension; however, there are a few minor changes.

[var] = something can be called anywhere, but using [A] [B] as [A] = [B] is no longer supported.

Using # as a line comment is no longer supported, since text no longer falls through. Use // instead. (This probably won’t cause an error unless the comment is in the middle of a text block.)

Leading and trailing spaces will be trimmed; use [0x20] instead if necessary.

ParseDefinitions.txt is no longer given special treatment so please add

#include "ParseDefinitions.txt"

to the start of your text_buildfile to use your definitions normally.

Additional Requirements

A version of EA Core that supports raw BASE64

Currently part of my fork but a compiled version is provided in the link below.

Command Line Arguments

TextProcess text_buildfile.txt InstallTextData.event --narrow-mapping narrow_mapping.csv --rom reference.gba --portraits PortraitInstaller.event

Breakdown:

--narrow-mapping, a csv file, required for narrow font features.

Each line represents the non-narrow version and narrow version of a character.

Valid formats:

a, 0x81
0x20, 0x79
32, 127

Sample csv files are provided in the project.
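
As a rough illustration of the three field formats listed above, here is a minimal Python sketch of how such a mapping file could be read. It is not TextProcessX's actual parser; the function names are made up for this example, and treating a bare run of digits as a decimal code (rather than a literal digit character) is an assumption.

```python
import csv
import io


def parse_code(token):
    """Turn one CSV field into a character code.

    Accepts a hex literal (0x81), a decimal number (127), or a single
    literal character (a). Assumption: digits are read as decimal codes.
    """
    token = token.strip()
    if token.lower().startswith("0x"):
        return int(token, 16)
    if token.isdigit():
        return int(token)
    if len(token) == 1:
        return ord(token)
    raise ValueError(f"unrecognised mapping field: {token!r}")


def load_narrow_mapping(text):
    """Return {regular_code: narrow_code} from the CSV text."""
    mapping = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) != 2:
            continue  # skip blank or malformed lines
        regular, narrow = (parse_code(field) for field in row)
        mapping[regular] = narrow
    return mapping


# The three lines from the post are format examples, not a consistent file:
# 0x20 and 32 are the same character, so the later line wins here.
sample = "a, 0x81\n0x20, 0x79\n32, 127\n"
narrow_map = load_narrow_mapping(sample)
```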

--rom, if specified, we will use the glyph data stored in a rom to calculate text size so we can perform auto-narrow. If not specified, we will use default FE8 glyph sizes. To use the standard narrowfont glyph sizes, at the start of text_buildfile.txt specify

#define UseDefaultNarrowFontProfile

--portraits, an Event Assembler event, if specified, we will automatically import definitions like #define {NAME}Portrait or #define {NAME}Mug.
For example, if

#define TimmyMug 0x14

[LoadTimmy] will become

[LoadPortrait][0x14][0x1]
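
To make the substitution concrete, here is a small Python sketch of the idea. It is only an approximation of the behaviour described above: the regular expressions and function names are assumptions for illustration, and the trailing [0x1] is copied from the example without knowing when the tool would emit a different value.

```python
import re

# Matches "#define NAMEPortrait VALUE" or "#define NAMEMug VALUE".
DEFINE_RE = re.compile(r"^\s*#define\s+(\w+?)(Portrait|Mug)\s+(\S+)", re.MULTILINE)


def collect_portraits(event_source):
    """Map each NAME to the value given by its Portrait/Mug definition."""
    return {m.group(1): m.group(3) for m in DEFINE_RE.finditer(event_source)}


def expand_load_codes(text, portraits):
    """Replace [LoadNAME] with [LoadPortrait][value][0x1] for known names."""
    def replace(match):
        name = match.group(1)
        if name not in portraits:
            return match.group(0)  # leave unknown codes untouched
        return f"[LoadPortrait][{portraits[name]}][0x1]"

    return re.sub(r"\[Load(\w+)\]", replace, text)


portraits = collect_portraits("#define TimmyMug 0x14\n")
expanded = expand_load_codes("[LoadTimmy]Hello!", portraits)
```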

New Formatting Options

Traditionally, TextProcess has two syntaxes for text:

## Name
# 0x1234 (Name)

Now the syntax is

#(T/X/I/N/W/D/S/#) (0x1234) (Name)

T/X/I/N/W/D/S determine which type of text this is

#T: Text, or dialogue in general, 160px serif font
#X: eXtension, dialogue with the 3-line dialogue patch installed
#I: Item Name, 56px menu font
#N: (Character/Class) Name, 46px menu font
#W: Weapon Description, 160px serif font
#D: Description, 160px serif font, 2 lines
#S: Skill Description, 160px serif font, 3 lines
#/## (Default): The text will not be altered in any way.
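
For anyone who wants to see the header syntax spelled out, here is a hedged Python sketch of a parser for it. It assumes the parentheses in the syntax line mark optional parts, that the ID is written in hex, and that names are plain identifiers; the names HEADER_RE and parse_header are invented for this example and are not taken from TextProcessX.

```python
import re

HEADER_RE = re.compile(
    r"^#(?P<kind>[TXINWDS#]?)"          # type letter, a second '#', or nothing
    r"(?:\s+(?P<id>0x[0-9A-Fa-f]+))?"   # optional text ID
    r"(?:\s+(?P<name>\w+))?"            # optional definition name
    r"\s*$"
)


def parse_header(line):
    """Return (kind, text_id, name), or None if the line is not a text header."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        return None
    kind = match.group("kind")
    if kind in ("", "#"):
        kind = "raw"  # '#' and '##' headers: text is left unaltered
    text_id = int(match.group("id"), 16) if match.group("id") else None
    return kind, text_id, match.group("name")


item_header = parse_header("#I IronSwordName")
classic_header = parse_header("# 0x1234 Name")
not_a_header = parse_header('#include "ParseDefinitions.txt"')
```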

T and X will try to add [N]s and [A]s if necessary. The program will not break existing formatting and will only add [N]s and [A]s if an overflow occurs.

I/N/W/D/S will try to narrow the text if an overflow occurs.
D/S will add [N]s if necessary
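
The automatic [N] insertion can be pictured as a greedy word wrap against the pixel limit. The Python sketch below only illustrates that idea and is not the tool's algorithm: it uses a flat, hypothetical 6px width for every character (the real tool measures glyphs from the ROM or the default FE8 sizes), and it leaves out [A] insertion and narrowing entirely.

```python
def wrap_text(text, max_width=160, char_width=lambda ch: 6):
    """Join words into lines no wider than max_width pixels, separated by [N].

    char_width is a stand-in for a real glyph width lookup.
    """
    lines = []
    current = ""
    current_width = 0
    space_width = char_width(" ")
    for word in text.split():
        word_width = sum(char_width(ch) for ch in word)
        if current and current_width + space_width + word_width > max_width:
            # The next word would overflow, so close the current line.
            lines.append(current)
            current, current_width = word, word_width
        elif current:
            current += " " + word
            current_width += space_width + word_width
        else:
            current, current_width = word, word_width
    if current:
        lines.append(current)
    return "[N]".join(lines)


short_line = wrap_text("A sword made of Iron.")
long_line = wrap_text("a" * 20 + " " + "b" * 10)
```

Text that already fits comes back unchanged, which matches the stated behaviour of only adding [N]s when an overflow occurs.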

Examples

#I IronSwordName
Iron Sword

#W IronSwordDesc
A sword made of Iron.

Windows users: Grab the .exe files
Mac users: Grab the non-exe files

Vesly · December 10, 2021, 1:28am

Would you like to set this up in the EasyBuildfile?

I would like to try this out sometime, both for my project and to confirm it works as expected, so we can encourage users to use it via Mystic’s easy buildfile.

HyperGammaSpaces · December 10, 2021, 2:24am

Hypeeeeeee

I’m curious, since I’ve been working on a repo for localization tools, if this parser supports full cp1252 charmapping natively, and if it might support other Windows codepages as well.
I’ve implemented font characters and charmapping for cp1254 (Turkish) in the repo already if you need something to test, and I plan on doing cp1251 (Cyrillic) and cp1258 (Vietnamese) eventually as well.

MintX · December 11, 2021, 4:14pm

Well, you need to provide information on how those glyphs are stored.

HyperGammaSpaces · December 14, 2021, 2:10am

Since we all use the anti-huffman patch, the text data is just stored as strings of 8-bit encoded text characters. The mapping corresponds to Windows-1252, but with unused characters stripped out such that only English, French, Italian, German, and Spanish are supported. The link to my repo in my previous post includes an installer, graphics, and mapping for the missing characters.

(One of the issues with ParseFile was that it only supported 7-bit encoding, meaning that it would fail on characters like á and you would need to specify the encoding manually as [0xE1]. Recently, Colorz made a fix to make it use 8-bit cp1252 encoding, which has saved me a massive amount of time inserting my Spanish script.)
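
To show the difference in practice, here is a short Python sketch using the standard cp1252 codec. Note that Python's codec covers all of Windows-1252, whereas the in-game table described above is a stripped-down subset, and escape_high_bytes is a name made up for this example.

```python
def escape_high_bytes(text):
    """Write every byte above 0x7F as a [0xNN] code (the manual workaround)."""
    out = []
    for byte in text.encode("cp1252"):
        out.append(chr(byte) if byte < 0x80 else f"[0x{byte:02X}]")
    return "".join(out)


# With 8-bit cp1252 encoding, á is simply the single byte 0xE1.
encoded = "á".encode("cp1252")

# With a 7-bit-only parser, the same character has to be spelled out by hand.
escaped = escape_high_bytes("más")
```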

Sources:

github.com

StanHash/DOC/blob/master/Drawing Text Notes.txt

addresses are FE8U as always

(I'll refer as String the null-terminated ascii representation of text)
(I'll refer as Text the visual representation/the tile graphics of a String)

Font Struct (size 0x18):
    +00 | word  | root output VRAM pointer
    +04 | word  | pointer to glyphs
    +08 | word  | pointer to glyph drawing routine (ex: 08004218+1, 08004268+1)
    +0C | word  | pointer to current VRAM tile offset getter (ex: 080041E8+1)
    +10 | short | base value for text tiles (containing base tile index & palette mask)
    +12 | short | current tile index (local/relative to root)
    +14 | short | palette index
    +16 | byte  | idk
        - set from byte at 02028E74, which is initialized to 1 at the start of the game and never changed again. My guess is that it denotes whether to use the Japanese (Shift-JIS, 0) or English (ascii, 1) string format, since it is checked in various routines related to strings and glyphs
        - in FE8E, this is initialized to the current language id (byte at FE8E:02028E74)
    +17 | byte  | idk

Default font struct at 02028E58
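
For tooling that reads this struct out of a memory dump, the layout above can be sketched in Python with the struct module. The field names are invented for this example (the notes only give offsets and descriptions), the first two pointer values in the sample are placeholders, and little-endian byte order is assumed because the GBA is a little-endian system.

```python
import struct
from collections import namedtuple

# 4 words, 3 shorts, 2 bytes, little-endian, no padding: 0x18 bytes in total.
FONT_FORMAT = struct.Struct("<IIIIHHHBB")

Font = namedtuple("Font", [
    "vram_root",        # +00 root output VRAM pointer
    "glyphs",           # +04 pointer to glyphs
    "draw_glyph",       # +08 pointer to glyph drawing routine
    "get_vram_offset",  # +0C pointer to current VRAM tile offset getter
    "tile_base",        # +10 base value for text tiles
    "tile_index",       # +12 current tile index
    "palette",          # +14 palette index
    "unk_16",           # +16 unknown (possibly a string format flag)
    "unk_17",           # +17 unknown
])


def parse_font(raw):
    """Decode the first 0x18 bytes of raw into a Font tuple."""
    return Font._make(FONT_FORMAT.unpack(raw[:FONT_FORMAT.size]))


# Sample bytes: the routine pointers come from the notes (address + 1);
# the first two pointers are placeholder values for illustration.
sample_bytes = FONT_FORMAT.pack(
    0x06000000, 0x08000000, 0x08004219, 0x080041E9, 0, 0, 0, 1, 0
)
font = parse_font(sample_bytes)
```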
