Return to Level1Techs.com

[Devember2021] Implementing Falcom compression to reinsert asserts

Two years ago (GitHub tells me) I wanted to try to replace the 3D models in the JRPG game series Trails in the Sky with higher poly models. It is an old 2.5D game with 3D maps and 2D character sprites. I have managed to reverse engineer (black box) the format and the next step is to try to modify the files and try them in the game.

However the problem is that the games pack everything into archive files using custom compression and the extraction program is not open source. The only good news is that they are not encrypted. I have very little experience with compression and certainly not the skills to reverse engineer compression algorithms.
But over the last 2-3 months I have managed to create a working decompressor by digging up obscure implementations found by digging in old forum threads, by working a little on it sometimes while taking the train to work. (This includes moon programming languages and a hilarious dead end with a custom scripting language that in the end called a custom function defined in C++, but the C++ implementation just executed a hex dump of what I assume is a part of the original game executable.)

So my goal for this Devember project is to learn how these algorithms work (which I have a good grasp on already), document or at least describe them in a more understandable format than undocumented code, and to implement a compression algorithm for the two methods the game uses. (Yes, two methods, I was quite disappointed when I learned my initial implementation only worked on half the files.)
My ambition level is quite low as my main project this Christmas is to 3D print the character models from the newer 3D games and getting better at Blender (because those models where really not usable for 3D printing, but that is an entire different story). Oh, and I picked up an resin printer at black Friday, so it will be interesting to see if that is going to end up as a disaster or not.

Btw, I know that a remake of those games has been considered, so the original goal of the project might no longer be as relevant (but let’s see how the remake turns out), but I think this is a good opportunity to learn some more about compression and I’m close enough that I think I should at least try to get some 3D models into the game before calling quits.
(Actually, double checking this, I’m not actually sure if there are concreate plans for making a remake.)

All code will be pushed to my existing repository linked above.

1 Like

I don’t feel like I have much to say, but I guess it is due time to an update.

First of all, the decompression code can be found here: skytrails-replacer/main-extract-archive.cpp at master · spillerrec/skytrails-replacer · GitHub
The compression is split up into chunks, such that after decompression each chunk of date is at most 64 KiB, must likely so that the decompressor doesn’t need to worry.
Each chunk can use one of two compression methods, in practice all chunks uses the same compression method. It is not clear why some files uses one over the other, maybe it is decompression performance, maybe it fits those kind of data better, but it will be interesting to look into at some point.

I have tried to document the simpler compression method, which you can read here:

But I’m not quite happy with it, I tried to mess with tables and colors to make it clearer, but I don’t think it is good enough and I fear that if I need to start to write a lot to describe it properly, some pseudo-code with comments is probably a better approach. The issue is that the bits are somewhat aligned to byte boundaries but not always, and then some of the values specified by them are offset by 4 for some reason.
So I have decided not to put more effort into it for now.

So I started on implementing a compression algorithm for the simpler compression method, and I’m close to something that is working. Something goes wrong after the first chunk (i.e. 64 KiB) leaving it unable to decompress to parse the compressed stream correctly, and the results isn’t correct after a few KiB, but it is probably just a matter of fixing the last few bugs. (I just hope finding them will not take forever.)
I have implemented it using a fairly simple brute force approach for finding previous matching sub-strings, but since it is limited by its 64 KiB chunks it luckily isn’t unusable slow even in debug mode. So until I have everything working I’m not going to worry about performance, but I’m definitely interested in learning how this can be done more efficiently later.

Some interesting surprises came up when I tried to implement compression. First of all, I had misunderstood one of the if statements, I had implemented it correctly for the decompression but my incorrect understanding first became obvious when it didn’t compress properly. Another interesting aspect is that all the original compressed streams do not use all of the flexibility of the format. For example, the lookback/dictionary action allows you to specify a length of the copied bytes down to 4 bytes, however in practice this is always at least 7 bytes. This is in spite of it becoming effective already at 3-4 bytes, so that raises the question if using less than 7 bytes is actually legal or just a quirk of their compression algorithm.

Small update here the next day: Compression now works correctly. I wasn’t quite sure how to debug a small difference at an unknown point long into the execution of the program, so I tried to make the compressor and decompressor both print the commands they were encoding and decoding respectively to stdout with exactly the same formatting. And voila, putting those two results into a text diff tool instantly showed that the compressor tried to encode a lookback/dictionary lookup with an offset that was larger than the available bits to store that number.

I did some quick tests of the 7 vs. 4 bytes restriction, and it compresses way better if you allow it to go down to 4 bytes. For example, a file goes from 257 KiB down to 157 KiB with the 4 bytes version, while it is 202 KiB with the 7 bytes version. Very strange…

It is probably a runtime speed thing dirctly related to the era of hardware that the game/app runs on.

This is a cool niche project and makes me wish I could pull out the source code to the Baldurs Gate (in-game) GUI editor I made in GameMaker before it went out of beta (when it was still free). Lotsa fun to be had in these endeavours of reverse engineering compression & decompression

Nice to see you succeded

Cheers

Paul

To sum up the Devember project, it went well in the beginning and then it completely stalled due to Covid. My plan was to work on this while commuting to work (I have a 1 hour train ride each direction), but due to government recommendations to work from home when possible I suddenly didn’t have to commute to work anymore. So my focus ended up on the 3D printing project instead which was the project I worked on at home.
I do not consider it as a failure as I have a working compressor for one of the algorithms, but it could have been so much more. It would have been really cool to evaluate the two algorithms up against each other, learn their individual strengths, and what is causing this. There was definitely time for it… So I will continue to work on this once I start commuting again, I also need to try to change the game asserts as well and see if the game engine actually can read the files I produce.

But instead my 3D printing project went great. I got more familiar with Blender and while I didn’t get to start up my resin printer I did instead learn figure painting. I decided to give it a go and in the process I found out that one of the shops selling Warhammer stuff held weekly events, helping people get into figure painting (and spend money on painting equipment).
So I learned the basic and for a first attempt I’m satisfied:


It definitely looks rough up close, but from a distance I think it looks fine. (It is 21.5 cm in height, sorry, I forgot the banana.) The great thing about 3D printing is that I didn’t have to worry about messing it up.

The character is from one of the later games in the series, when they started using 3D character models. I extracted the character models using a program made by someone else and then manually fixed it up in Blender, which was A LOT of work. The game models where not printable as for example the coat and a lot of the hair were just zero width textures and a lot of the different part of the models simply didn’t connect with anything. Even after making it printable, it looked quite low-poly, so I had to smooth it out and add details. This is the third iteration btw, printed in 3 parts partly to make it easier to paint and partly to make the hair print nicely on a FDM printer.