Happy birthday, Pokémon Emerald. At the time of writing, this very Monday, Pokémon Emerald turned 20 years old according to Wikipedia. I have never played Emerald, but I have fond memories of playing my bootleg cartridge of Ruby that we bought over in Turkey for a fraction of the price, which was probably the Japanese version but with all text replaced with the worst English imaginable. It did not distract from the experience, which was amazing, right up until the game crashed and the last save was corrupted.

When I streamed myself, I would often think of things I could do on stream. One of the things I wanted to do was to see if I could make changes to emulated games at run-time, and have some kind of integration with Twitch in it. It never came to fruition, since I always felt like I was not good at streaming content, which caused me to give up on it. But when it was revealed that my friend and Twitch streamer mscupcakes was going to stream Pokémon Emerald, I knew I could support the stream with this project. She’s currently two months in the running for the Partner Plus program, which means that if she holds onto her growth for a little while longer, she gets that shiny Partner Plus status.

So I threw an idea in the ring..

Image of a conversation with mscupcakes where I suggested the idea for having subscribers become upcoming trainers

I didn’t exactly want to commit to anything since I am very close to becoming a father of twins, but I am not one to sit back and play videogames waiting for it to happen. I’d much rather be coding something new. So I started investigating.

The investigation starts at GitHub of course, checking if anyone had done something similar to what I wanted to do, but I did not find anything other than Twitch bots that allowed chat to play the game. But searching for just “Pokémon Emerald” yielded a great result: a full decompilation project that can build the ROM from scratch. This resource is invaluable: plain-text readable code together with the ability to have statically analysable debug code! This meant that I could search for “trainer” and find all related code, like the “gTrainers” global variable, which holds every piece of trainer data in the game. I changed some text in the game, built it, ran it in an emulator and yup, there it was, the change in all its glory. Best of all, the randomizer program works with it as well.

Honestly, kudos to the developers of this repository. If you have minimal experience with WSL or Linux, setting this up is copy-pasting levels of easy. Fantastic documentation, and very helpful Discord :)

So the search started out trying to find a trainer to override. I played through the game a bit, to find the first non-rival trainer that I could battle, which was Calvin!

The first normal trainer in Emerald, Calvin.

But strangely enough, there were multiple Calvins in the game. I couldn’t remember names being re-used, so what was this?

#define TRAINER_CALVIN_1                    318
...
#define TRAINER_CALVIN_2                    328
#define TRAINER_CALVIN_3                    329
#define TRAINER_CALVIN_4                    330
#define TRAINER_CALVIN_5                    331

If you search for the first one, you’ll find it being referenced in “scripts.inc”, where this line appears:

trainerbattle_single TRAINER_CALVIN_1, Route102_Text_CalvinIntro, Route102_Text_CalvinDefeated, Route102_EventScript_CalvinRegisterMatchCallAfterBattle

This is very clearly the battle we’re looking at right now: the route is correct, and if you follow Route102_Text_CalvinIntro, you’ll find that it refers to the text you see in the screenshot above.

Turns out, the other entries are the same Calvin, except they’re used for rematches!

const struct RematchTrainer gRematchTable[REMATCH_TABLE_ENTRIES] =
{
    ...
    [REMATCH_CALVIN] = REMATCH(TRAINER_CALVIN_1, TRAINER_CALVIN_2, TRAINER_CALVIN_3, TRAINER_CALVIN_4, TRAINER_CALVIN_5, ROUTE102),
    ...
};

Every single “trainer” ID is actually one single encounter with a trainer. This is funny for rivals, since they have 2 variations for gender, and 3 variations for each starter Pokémon they can have, so there’s a bit of duplication required. Anyway, back on-topic..

If you try and search for the macro in the “scripts.inc” file, trainerbattle_single, you’ll find it calls various variations of the trainerbattle macro, which is where the core of the data is.

@ Configures the arguments for a trainer battle, then jumps to the appropriate script in scripts/trainer_battle.inc
.macro trainerbattle type:req, trainer:req, local_id:req, pointer1:req, pointer2, pointer3, pointer4
.byte 0x5c
.byte \type
.2byte \trainer
.2byte \local_id
.if \type == TRAINER_BATTLE_SINGLE
    .4byte \pointer1 @ text
    .4byte \pointer2 @ text
...

This code sets up a bunch of bytes in a row, but unfortunately does not contain any code that looks like it jumps to anything. But that first byte is curious, so I searched for that in the codebase. I found a reference that seemed related: .4byte ScrCmd_trainerbattle @ 0x5c, so I then searched for ScrCmd_trainerbattle and found the following piece of code:

bool8 ScrCmd_trainerbattle(struct ScriptContext *ctx)
{
    ctx->scriptPtr = BattleSetup_ConfigureTrainerBattle(ctx->scriptPtr);
    return FALSE;
}

Bingo! BattleSetup_ConfigureTrainerBattle here sets up the upcoming trainer battle based on the data that you pass into it. It reads the first byte (.byte \type) and uses it to load the correct parameters, which transform the sequence of bytes generated by the trainerbattle macro into usable data!

const u8 *BattleSetup_ConfigureTrainerBattle(const u8 *data)
{
    InitTrainerBattleVariables();
    sTrainerBattleMode = TrainerBattleLoadArg8(data);

    switch (sTrainerBattleMode)
    {
    ...
    case TRAINER_BATTLE_CONTINUE_SCRIPT:
        if (gApproachingTrainerId == 0)
        {
            TrainerBattleLoadArgs(sContinueScriptBattleParams, data);
            SetMapVarsToTrainer();
        }
        else
        {
            TrainerBattleLoadArgs(sTrainerBContinueScriptBattleParams, data);
        }
        return EventScript_TryDoNormalTrainerBattle;
    ...
    }
}

TrainerBattleLoadArgs copies data over to global variables based on the params given as a first argument, which is a set of parameters that look like this:

static const struct TrainerBattleParameter sTrainerBContinueScriptBattleParams[] =
{
    {&sTrainerBattleMode,           TRAINER_PARAM_LOAD_VAL_8BIT},
    {&gTrainerBattleOpponent_B,     TRAINER_PARAM_LOAD_VAL_16BIT},
    {&sTrainerObjectEventLocalId,   TRAINER_PARAM_LOAD_VAL_16BIT},
    {&sTrainerBIntroSpeech,         TRAINER_PARAM_LOAD_VAL_32BIT},
    {&sTrainerBDefeatSpeech,        TRAINER_PARAM_LOAD_VAL_32BIT},
    {&sTrainerVictorySpeech,        TRAINER_PARAM_CLEAR_VAL_32BIT},
    {&sTrainerCannotBattleSpeech,   TRAINER_PARAM_CLEAR_VAL_32BIT},
    {&sTrainerBBattleScriptRetAddr, TRAINER_PARAM_LOAD_VAL_32BIT},
    {&sTrainerBattleEndScript,      TRAINER_PARAM_LOAD_SCRIPT_RET_ADDR},
};

This means that the first parameter is filled into sTrainerBattleMode, the second into gTrainerBattleOpponent_B, etc.. And if you go back to the original macro, you’ll see that once the 0x5c command byte has been consumed, .byte \type is in the first position and .2byte \trainer is in the second. So whatever was defined as the “trainer”, which is in our case TRAINER_CALVIN_1 or 318, goes into gTrainerBattleOpponent_B! Throughout the code, gTrainerBattleOpponent_A and gTrainerBattleOpponent_B are used to index into gTrainers, a large array of trainer data, such as what sprite to use, which Pokémon and items they have, and what kind of AI they should use.
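To make the table-driven idea concrete, here is a minimal, self-contained sketch of how a parameter table can drive the decoding of a byte stream. It is a simplified model and not the decomp's actual code: ParamKind, ParamSpec, readLittleEndian and loadArgs are names made up for this illustration, every target is widened to a uint32_t for brevity, and the script return address case is left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-ins for the TRAINER_PARAM_* kinds.
enum class ParamKind { Load8, Load16, Load32, Clear32 };

// One table row: where the decoded value goes and how it is obtained.
struct ParamSpec {
    uint32_t* target;
    ParamKind kind;
};

// Reads `size` bytes starting at data[offset], least significant byte first
// (the GBA is little-endian).
uint32_t readLittleEndian(const std::vector<uint8_t>& data, size_t offset, size_t size) {
    uint32_t value = 0;
    for (size_t i = 0; i < size; i++)
        value |= static_cast<uint32_t>(data.at(offset + i)) << (8 * i);
    return value;
}

// Walks the table in order and consumes bytes from the stream as each row demands.
// Returns the number of bytes consumed.
size_t loadArgs(const std::vector<ParamSpec>& specs, const std::vector<uint8_t>& data) {
    size_t offset = 0;
    for (const ParamSpec& spec : specs) {
        switch (spec.kind) {
        case ParamKind::Load8:
            *spec.target = readLittleEndian(data, offset, 1);
            offset += 1;
            break;
        case ParamKind::Load16:
            *spec.target = readLittleEndian(data, offset, 2);
            offset += 2;
            break;
        case ParamKind::Load32:
            *spec.target = readLittleEndian(data, offset, 4);
            offset += 4;
            break;
        case ParamKind::Clear32:
            *spec.target = 0; // resets the variable without consuming any bytes
            break;
        }
    }
    return offset;
}
```

Feeding it the bytes 0x00, 0x3E, 0x01 with a table of {Load8, Load16} leaves 318 (0x013E) in the second target, which is exactly how TRAINER_CALVIN_1 ends up in the opponent variable.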

To test this theory, I changed TRAINER_CALVIN_1 in the trainerbattle_single call to TRAINER_GRUNT_AQUA_HIDEOUT_1, compiled the game and..


That did it!

Of course, the introduction text did not change. That was stored in Route102_Text_CalvinIntro. However, knowing that all of these get passed into TrainerBattleLoadArgs, I knew I could intercept the instructions in this function to instead load data that I created. This function is a switch statement into all the TRAINER_PARAMs, but the main ones that we’re interested in are the 16 bit load (for the opponent ID) and the 32 bit load (for the pointers to the intro and defeat speeches).

The two jump table cases we’re interested in.

These two jump table cases show up with debug information in IDA thanks to the pokeemerald.elf file we compiled, and match up exactly to all future compilations and to the Pokémon Emerald .gba ROM file. If we can execute our own code the moment this happens, and check whether gTrainerBattleOpponent_B, gTrainerBattleOpponent_A, sTrainerBIntroSpeech or sTrainerBDefeatSpeech is being modified, we can replace it with our own data. Since we want to keep most of the original trainer intact (so that gameplay remains the same), we just want to change the name and sprite of the current trainer.

So next up would be to somehow execute our own code when we hit these addresses. I’ve investigated the plugin/scripting systems mGBA and VBA use, but they’re not as feature-complete as I need them to be. One of them can hook into specific instructions and have a callback, but I need to change ROM memory afterwards. The other one can edit memory, but offers no way to do it when I hit the exact instruction. Luckily, the emulators are C/C++, and the build system is thoroughly documented, so building from source was as easy as running a couple of commands again. :)

I decided to go with VBA, which is C++. The code is fairly clear and I quickly discovered I had to be in thumbExecute(), a function that executes a single ARM Thumb instruction. The armNextPC variable holds the next ARM program counter, the address of the instruction. So now it’s as simple as just making an if-statement right before the instruction emulation code.

static constexpr auto gTrainers = 0x8310030;
static constexpr auto gAdjustU16 = 0x80B13C0; // ADDS data, #2
static constexpr auto gAdjustU32 = 0x80B13CE; // ADDS data, #4
if (armNextPC == gAdjustU16)
{
    // If we're in the U16 part of "TrainerBattleLoadArgs"...
}

Let’s bring back the assembly real quick:

MOVS            R0, data                ; Prepare data as an argument, R0 = data
BL              TrainerBattleLoadArg16  ; TrainerBattleLoadArg16(data)
LDR             R1, [specs]             ; R1 = specs->varPtr (the variable pointer)
STRH            R0, [R1]                ; [R1] = R0
ADDS            data, #2                ; data += 2

Sidebar real quick: Why is MOVS from the right into the left operand, and STRH from the left to the right? It took me quite a bit of time to understand this piece of code. What is the cause of this inconsistency?

Anyway, after STRH above, R0 should be TRAINER_CALVIN_1 if R1 is gTrainerBattleOpponent_A or gTrainerBattleOpponent_B. So we can add another check for gTrainerBattleOpponent_A or gTrainerBattleOpponent_B and adjust the entry in gTrainers to our liking.. right?

struct Trainer
{
    ...
    /*0x03*/ u8 trainerPic;
    /*0x04*/ u8 trainerName[TRAINER_NAME_LENGTH + 1];
    ...
    /*0x24*/ union TrainerMonPtr party;
};

The trainer structure is very straightforward: a u8 for the “trainerPic”, and the trainerName is conveniently inline. Let’s try overriding it!

if (reg[1].I == gTrainerBattleOpponent_A || reg[1].I == gTrainerBattleOpponent_B)
{
    // The 16-bit trainer ID sits in the low half of R0
    uint16_t& trainerIndex = reg[0].W.W0;

    auto trainerStructSize = 40;
    auto trainerEntryPointer = gTrainers + trainerIndex * trainerStructSize;

    // Inject new pic at 0x03
    uint8_t newPic = 1; // TRAINER_PIC_AQUA_GRUNT_M
    CPUWriteByte(trainerEntryPointer + 0x03, newPic);

    // Inject new name at 0x04, one byte per character
    const char* trainerName = "Querijn";
    size_t textLen = strlen(trainerName);
    for (size_t i = 0; i < textLen; i++)
        CPUWriteByte(trainerEntryPointer + 0x04 + i, trainerName[i]);
}
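As a quick sanity check on the address arithmetic in that snippet, here is a small sketch that computes where Calvin's entry should live. The base address, struct size and field offsets are the values quoted in this post; the constant and function names (kTrainersBase, trainerEntryAddress and so on) are made up for the illustration.

```cpp
#include <cstdint>

// Values quoted in the snippets above.
constexpr uint32_t kTrainersBase = 0x8310030;  // address of gTrainers
constexpr uint32_t kTrainerStructSize = 40;    // party at 0x24 plus a 4-byte pointer
constexpr uint32_t kPicOffset = 0x03;          // trainerPic
constexpr uint32_t kNameOffset = 0x04;         // trainerName

// Address of the gTrainers entry for a given trainer ID.
constexpr uint32_t trainerEntryAddress(uint16_t trainerId) {
    return kTrainersBase + trainerId * kTrainerStructSize;
}

// TRAINER_CALVIN_1 is 318, so its entry starts 318 * 40 = 12720 (0x31B0) bytes in.
static_assert(trainerEntryAddress(318) == 0x83131E0, "entry address");
static_assert(trainerEntryAddress(318) + kPicOffset == 0x83131E3, "pic address");
static_assert(trainerEntryAddress(318) + kNameOffset == 0x83131E4, "name address");
```

All of these addresses start with 0x08, which on the GBA means they point into cartridge ROM rather than RAM.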

Unfortunately, nothing happens yet. It seems that somehow the bytes are not written, or not written to the correct offset. That turned out to be CPUWriteByte simply refusing to write to ROM, which makes a lot of sense! So let’s just modify these functions to do so. After which we got an interesting result:


This looks sorta good: the name is not encoded correctly, but the image is correct! This means we’re in the right place. The broken encoding seemed odd, but thankfully the RH Hub Discord quickly helped me on my way: apparently Game Freak has their own character encoding standard, so we just have to convert the name from ASCII to theirs using the charmap.txt file.


void writeText(uint32_t address, const char* text, size_t maxLen)
{
    // Truncate to maxLen characters; the terminator takes one extra byte
    size_t textLen = strlen(text);
    if (textLen > maxLen)
        textLen = maxLen;

    for (size_t i = 0; i < textLen; i++)
        CPUWriteByte(address + i, convertCharacter(text[i]));
    CPUWriteByte(address + textLen, 0xFF); // End of string denoter
}

After which all looks correct!

Correct name
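For reference, here is a minimal sketch of what a conversion like convertCharacter could look like. It is not the exact function used here: it only covers the characters a username is likely to need, the byte values are the ones I understand the international Gen 3 character table to use (double-check them against charmap.txt), and falling back to ‘?’ for anything unmapped is an arbitrary choice made for this example.

```cpp
#include <cstdint>

constexpr uint8_t kQuestionMark = 0xAC; // used as the fallback below

// Converts one ASCII character to the game's own encoding.
// Letters and digits sit in contiguous runs, so an offset is enough for those.
uint8_t convertCharacter(char c) {
    if (c >= 'A' && c <= 'Z') return static_cast<uint8_t>(0xBB + (c - 'A'));
    if (c >= 'a' && c <= 'z') return static_cast<uint8_t>(0xD5 + (c - 'a'));
    if (c >= '0' && c <= '9') return static_cast<uint8_t>(0xA1 + (c - '0'));
    switch (c) {
    case ' ': return 0x00;
    case '!': return 0xAB;
    case '?': return 0xAC;
    case '.': return 0xAD;
    case '-': return 0xAE;
    default:  return kQuestionMark; // assumption: unmapped characters become '?'
    }
}
```

A full implementation would more likely generate a 256-entry lookup table from charmap.txt, but the offset trick is enough to get names on screen.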

I added the ‘-’ on purpose, to test other characters necessary for the Twitch usernames. This is great! Now we can put 10 characters of our making into the game. The process was similar for the 32-bit address loads for the intro and defeat quotes. However, we need a bit of space to put this text into, without just writing to any random address, so let’s just reserve our own little buffers in the game’s source:

EWRAM_DATA static u8 gCustomIntro[0x100] = {0};
EWRAM_DATA static u8 gCustomDefeat[0x100] = {0};

After which we can simply point to this data. Now we’re ready to somehow get our own data in. This part is not very interesting: mostly making an HTTPS request and reading some JSON data, which we need to craft and send ourselves from the Twitch part that we still need to make.

Honestly, this part was way worse. Twitch has a lot of APIs, of which 2 promised subscriber events, which each use two authentication types, and it’s poorly explained why I need each and how exactly to set this up. Luckily, I found Twurple, a library whose example code explains it and includes a Twitch bot with callbacks for the calls I need! So now I have a bot that simply listens for new subscribers!

How it works is that the bot also hosts a small express server, which just has a small REST API setup. It gives you a subscriber out of a first-in, first-out list, and if that happens to be empty, it goes over the existing subscribers from old to new. It just holds onto the data, and we still need those sprites and quotes. I host this part elsewhere, since that allows me to host the Twitch bot anywhere, and I can leave the stuff that needs to be really persistent up to my host, Netcup. :)
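The queueing behaviour described above can be sketched as follows. This is an illustration rather than the bot’s actual code (the bot itself is built on Twurple and express): the SubscriberQueue class and its method names are invented for this example, it needs C++17 for std::optional, and it assumes existing subscribers are only known from the events the bot has seen.

```cpp
#include <cstddef>
#include <deque>
#include <optional>
#include <string>
#include <vector>

class SubscriberQueue {
public:
    // Called whenever a new subscription event arrives.
    void addNewSubscriber(const std::string& name) {
        fresh_.push_back(name);
        known_.push_back(name);
    }

    // Returns the next subscriber to put in the game, or nothing if nobody
    // has subscribed yet.
    std::optional<std::string> next() {
        if (!fresh_.empty()) {
            std::string name = fresh_.front();
            fresh_.pop_front();
            return name;
        }
        if (known_.empty())
            return std::nullopt;
        // The queue is empty: cycle through existing subscribers, oldest first.
        std::string name = known_[cursor_];
        cursor_ = (cursor_ + 1) % known_.size();
        return name;
    }

private:
    std::deque<std::string> fresh_;   // new subscribers, first in, first out
    std::vector<std::string> known_;  // everyone seen so far, oldest first
    std::size_t cursor_ = 0;          // position in the fallback rotation
};
```

Keeping the rotation cursor separate from the queue means a brand-new subscriber always jumps ahead of the fallback list, while the rotation later resumes where it left off.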

So finally, we need a website that allows subscribers to set up their sprites and quotes. Luckily, the repository has definitions for all the trainer pics. Each one refers to a CompressedSpriteSheet, which in turn references the compressed image data in question, and the repository contains a PNG with that same data! This means that we can make a little script that puts all the pic references and their paths onto a website, and let a form choose one after logging in with Twitch OAuth.


Here’s the website! Not very exciting, but the end result is pretty awesome. The website holds onto these entries and has a REST API endpoint to get them by Twitch username. Which brings me back to the Twitch bot, which in turn requests these, and forwards the full information over to the emulator!

And in the end, we get the following:


That’s what I entered on the website!

It was quite the journey. Let me end it off with a bug that happened to me during development. After finishing up all the above, I was happy to send the final ROM changes to mscupcakes, so that she’s got what she needed for the final setup. She sent me back the randomised version for me to test. I test the game all the way up to Dewford, and it looks good. She announces it to the Discord and on Twitter, and I feel good. Then, after a couple of days, I run into the following issue:


Oh no.

Every single textbox outside of combat stopped working. I restarted the emulator, the bot, tried running it without the bot and with another emulator, and it kept happening. So it had to be an issue with the ROM, but that’s so weird! I hadn’t changed much about it. Why does the issue persist even after loading the save from scratch? I had to start my workday, so I couldn’t look at it, but it bugged me all day.

And then it hit me: I had changed the text speed setting to FAST, and had set the battle style to SET to speed things up. And lo and behold, once I reverted those, everything worked again! After which mscupcakes lets me know that…


So that was a funny bug. Here’s another for you (that we’re not going to fix)


This one happens because we only fetch 1 subscriber at a time, and we don’t get a new one until this one has been defeated (so if you win against mscupcakes, we get to see you again!). But that means we write the subscriber to each trainer that happens to be in combat. So if you’re lucky..

In the end I looked into whether I could do anything with the trainer classes.. They’re tied to gameplay, which means that you can’t change them if they’re a special kind of trainer, like the Elite Four, or Gym Leaders, or the Champion. I looked into overriding them if they’re not one of these three, to see if there were any other gameplay changes, and I found that there was one: the amount of money you get from winning is determined by the class name. This means that unfortunately I’ll have to leave it as-is.

I hope you enjoyed my little adventure through these codebases. Hope you’re able to catch mscupcakes’ streams, and be sure to subscribe!