
Translating Uma Musume using code injection


Project Status
Paused
Project Type
Personal
Project Duration
~1 month
Software Used
Visual Studio 2022
Languages Used
C++20

Introduction

I started playing Uma Musume: Pretty Derby™ as soon as it was released. At that time, there was no guide or wiki for the game. I quickly realized that progressing properly required a good understanding of the game's systems. Because the game is only available in Japanese, it was difficult for me to grasp the nuances of those systems. So I looked for a way to translate the game without degrading the player's experience. One option is a tool that uses OCR and displays the translation in a console or another interface, but that solution did not suit me. Instead, I turned to code injection, which lets me intercept the texts in real time while the game is running and replace them with a translation displayed directly in the game.

Reverse engineering the game

Uma Musume is developed with Unity and uses the IL2CPP backend so that the game can be offered on different platforms. IL2CPP (Intermediate Language To C++) is an alternative to Mono for compiling a Unity project, with better support for targeting multiple platforms. It converts MSIL (Microsoft Intermediate Language) code into C++ code, then compiles that code into an executable or binary file for the target platform.

IL2CPP Image

To analyze the game, I used Il2CppDumper, a tool developed by Perfare. It restores the game's DLLs along with the signatures of the functions, classes and structures they contain. The tool generates a DLL containing all this information, which can then be opened with dnSpy to analyze its contents.

However, developers sometimes protect the generated files. In that case, the original DLL cannot be restored from the game files, and the tool fails as follows:

> .\Il2CppDumper.exe .\UmaMusu\GameAssembly.dll .\UmaMusu\global-metadata.dat output/
Initializing metadata...
Metadata Version: 24.2
Initializing il2cpp file...
Il2Cpp Version: 24.2
Searching...
CodeRegistration : 0
MetadataRegistration : 0
Use custom PE loader
CodeRegistration : 0
MetadataRegistration : 0
ERROR: Can't use auto mode to process file, try manual mode.
Input CodeRegistration:

Fortunately, Perfare has developed another tool that restores the DLLs while the game is running: Zygisk-Il2CppDumper (formerly Riru-Il2CppDumper). Unlike the classic Il2CppDumper, which runs on Windows, it requires a rooted Android device with the Magisk application installed, and it generates a dump.cs file instead of a DLL. Once the game files are restored with the mobile version, the generated file can be opened in any code editor.

Here is an example of the result once the file is opened in an editor. The code below represents the Runtime class of the Mono framework.

// Namespace: Mono
[DumpInfo(TypeDefIndex = 3)]
public static class Runtime
{
	// Methods
	[DumpInfo(Version = 1, RVA = 0x2535de8, VA = 0x7ce45bdde8)]
	private static Void mono_runtime_install_handlers() { }
    
	[DumpInfo(Version = 1, RVA = 0x2535dec, VA = 0x7ce45bddec)]
	public static Void InstallSignalHandlers() { }
    
	[DumpInfo(Version = 1, RVA = 0x2535df0, VA = 0x7ce45bddf0)]
	private static Void mono_runtime_cleanup_handlers() { }
    
	[DumpInfo(Version = 1, RVA = 0x2535e24, VA = 0x7ce45bde24)]
	public static Void RemoveSignalHandlers() { }
    
	[DumpInfo(Version = 1, RVA = 0x2535e58, VA = 0x7ce45bde58)]
	public static String GetDisplayName() { }
    
	[DumpInfo(Version = 1, RVA = 0x2535e5c, VA = 0x7ce45bde5c)]
	private static String GetNativeStackTrace(Exception exception) { }
    
	[DumpInfo(Version = 1, RVA = 0x2535e60, VA = 0x7ce45bde60)]
	public static Boolean SetGCAllowSynchronousMajor(Boolean flag) { }
}

Injection with DLL Proxying

DLL proxying is a relatively simple injection method. When an application is launched, Windows looks for the DLLs to load in several directories, searched in a defined order. I can therefore place my fake DLL in a directory with a higher priority than the one containing the original DLL, so that mine is loaded instead.
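As a rough illustration of why a DLL placed next to the executable wins, the sketch below simulates a simplified search order. The function name `FindDllToLoad`, the directory list and the `exists` callback are hypothetical; the real Windows loader applies additional rules (known DLLs, already-loaded modules, redirection) that are deliberately left out.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper: walks the directories in priority order (highest
// priority first) and returns the first candidate path that exists.
std::optional<std::string> FindDllToLoad(
    const std::vector<std::string>& searchOrder,
    const std::string& dllName,
    const std::function<bool(const std::string&)>& exists)
{
    for (const auto& directory : searchOrder)
    {
        const std::string candidate = directory + "\\" + dllName;

        // The first match wins, even if a later directory also has the file.
        if (exists(candidate))
            return candidate;
    }

    return std::nullopt;
}
```

With the game folder listed before the system directory, a version.dll present in both places resolves to the copy in the game folder, which is the situation the proxy relies on.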

However, the DLL must respect some constraints:

  • If the functions that the program tries to import are not exported by the DLL, the program crashes.
  • If the implementation of the exported functions does not match the implementation of the original functions, the program loads the DLL and executes part of the code before crashing.

The solution to both problems is to export the required functions and to forward their calls to the original DLL at run time.

For the injection, I use version.dll, which the game loads from the executable's folder when a copy is present there. So I just have to name my DLL version.dll and reproduce the behavior of the original DLL, adding the code I need.

extern "C"
{
    void* GetFileVersionInfoA_Original = NULL;
    void* GetFileVersionInfoByHandle_Original = NULL;
    void* GetFileVersionInfoExA_Original = NULL;
    void* GetFileVersionInfoExW_Original = NULL;

    // ...
}

namespace
{
    class VersionProxy
    {
    public:
        VersionProxy()
        {
            std::string dll_path;
            dll_path.resize(MAX_PATH);
            dll_path.resize(GetSystemDirectoryA(dll_path.data(), MAX_PATH));

            dll_path += "\\version.dll";

            const auto original_dll = LoadLibraryA(dll_path.data());

            if (original_dll == nullptr)
                return;

            GetFileVersionInfoA_Original = GetProcAddress(original_dll, "GetFileVersionInfoA");
            GetFileVersionInfoByHandle_Original = GetProcAddress(original_dll, "GetFileVersionInfoByHandle");
            GetFileVersionInfoExA_Original = GetProcAddress(original_dll, "GetFileVersionInfoExA");
            GetFileVersionInfoExW_Original = GetProcAddress(original_dll, "GetFileVersionInfoExW");
            
            // ...
        }
    };

    VersionProxy Proxy {};
}

It is also necessary to create the version.def and version.asm files, which you can find in the project's GitHub repository. Now I can execute my code without the game crashing.

UmaMusumeInjection Image

I can now intercept all the functions I need. I use MinHook for this, but Detours (developed by Microsoft) is also a very good alternative.

bool Injector::Initialized = false;

bool Injector::Initialize()
{
    if (Initialized)
        return false;

    CreateConsole();

    if (MH_Initialize() != MH_OK)
    {
        spdlog::critical("Failed to initialize MinHook.");
        return false;
    }

    spdlog::info("Initialized MinHook.");
    Initialized = true;

    MH_CreateHook(LoadLibraryW, LoadLibraryHook, &LoadLibraryOriginal);
    MH_EnableHook(LoadLibraryW);

    spdlog::info("LoadLibraryW hooked.");

    return true;
}

I first intercept the LoadLibraryW function, which the game uses to load its DLLs. As a reminder, I want to access GameAssembly.dll, which contains the code I am interested in. My DLL is loaded first, so I just have to wait for the game to load GameAssembly.dll with LoadLibraryW.

void* LoadLibraryOriginal = nullptr;

void LoadGameAssembly()
{
    if (!Uma::Injector::Initialized)
        return;

    const auto handle = GetModuleHandleA("GameAssembly.dll");

    spdlog::info("Loading GameAssembly.dll");
    Uma::Resolver().Initialize(handle);
}

HMODULE __stdcall LoadLibraryHook(const wchar_t* path)
{
    if (wcscmp(L"cri_ware_unity.dll", path) == 0)
    {
        spdlog::info("cri_ware_unity.dll found.");
        LoadGameAssembly();
        MH_DisableHook(LoadLibraryW);
        MH_RemoveHook(LoadLibraryW);
        return LoadLibraryW(path);
    }

    return reinterpret_cast<decltype(LoadLibraryW)*>(LoadLibraryOriginal)(path);
}

I found out that I had to wait for cri_ware_unity.dll to be requested before accessing GameAssembly.dll, probably because of dependencies between the two.
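One caveat with the hook above: wcscmp only matches when the two strings are identical, so the check would miss the DLL if the game ever passed a full path or a different letter case. A minimal sketch of a more tolerant comparison follows; the helper name `IsModuleNamed` is hypothetical and not part of the project.

```cpp
#include <cstddef>
#include <cwctype>
#include <string_view>

// Hypothetical helper: true when the file name at the end of `path` equals
// `moduleName`, ignoring letter case and any leading directories.
bool IsModuleNamed(std::wstring_view path, std::wstring_view moduleName)
{
    // Keep only what follows the last path separator, if there is one.
    const auto separator = path.find_last_of(L"\\/");
    if (separator != std::wstring_view::npos)
        path.remove_prefix(separator + 1);

    if (path.size() != moduleName.size())
        return false;

    // Compare character by character after lowering both sides.
    for (std::size_t i = 0; i < path.size(); ++i)
    {
        const auto left = std::towlower(static_cast<std::wint_t>(path[i]));
        const auto right = std::towlower(static_cast<std::wint_t>(moduleName[i]));
        if (left != right)
            return false;
    }

    return true;
}
```

In the hook, a call such as `IsModuleNamed(path, L"cri_ware_unity.dll")` could stand in for the wcscmp test.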

Reflection with IL2CPP

Intercepting a function like LoadLibraryW is easy because it is a Windows API function, so my DLL can get its address directly. However, I don't have any information about GameAssembly.dll apart from what I got from Il2CppDumper. I could compute the address of each function from GameAssembly's base address in memory and the function's RVA, but this method has a major drawback: it forces me to update my library with each new version of the game, because the addresses change with each build.
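For reference, the base-plus-RVA computation is plain addition. The sketch below checks it against the Mono.Runtime entries of the dump shown earlier; the helper names are hypothetical, and those values come from the Android dump, so they would not apply as-is to the Windows GameAssembly.dll.

```cpp
#include <cstdint>

// Hypothetical helpers. Addresses are kept as 64-bit integers so the
// arithmetic is explicit; a real hook would cast the result to a pointer.
constexpr std::uint64_t AddressFromRva(std::uint64_t moduleBase, std::uint64_t rva)
{
    return moduleBase + rva;
}

constexpr std::uint64_t BaseFromAddress(std::uint64_t address, std::uint64_t rva)
{
    return address - rva;
}

// mono_runtime_install_handlers: RVA = 0x2535de8, VA = 0x7ce45bdde8.
static_assert(BaseFromAddress(0x7ce45bdde8, 0x2535de8) == 0x7ce2088000);

// InstallSignalHandlers: RVA = 0x2535dec, VA = 0x7ce45bddec, same base.
static_assert(AddressFromRva(0x7ce2088000, 0x2535dec) == 0x7ce45bddec);
```

The RVAs are fixed for one build only, which is exactly why this approach breaks on every game update.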

Another method is to browse an assembly using the functions available in the IL2Cpp library. Since I managed to get a reference to GameAssembly.dll, I can get the pointers to the IL2Cpp functions I need using GetProcAddress. All this work is done by the Resolver class of my library.

template<typename T>
static T Resolve(const HMODULE handle, const std::string& name)
{
    return reinterpret_cast<T>(GetProcAddress(handle, name.c_str()));  // NOLINT(clang-diagnostic-cast-function-type)
}

void* Resolver::IL2CppDomain;
IL2Cpp::NewIL2CppString Resolver::NewString;
IL2Cpp::GetDomain Resolver::GetDomain;
IL2Cpp::DomainAssemblyOpen Resolver::DomainAssemblyOpen;
IL2Cpp::AssemblyGetImage Resolver::AssemblyGetImage;
IL2Cpp::IL2CppClassFromName Resolver::IL2CppClassFromName;
IL2Cpp::IL2CppClassGetMethods Resolver::IL2CppClassGetMethods;
IL2Cpp::IL2CppClassGetMethodFromName Resolver::IL2CppClassGetMethodFromName;

void Resolver::Initialize(const HMODULE& handle) const
{
    NewString = Resolve<IL2Cpp::NewIL2CppString>(handle, "il2cpp_string_new");
    GetDomain = Resolve<IL2Cpp::GetDomain>(handle, "il2cpp_domain_get");
    DomainAssemblyOpen = Resolve<IL2Cpp::DomainAssemblyOpen>(handle, "il2cpp_domain_assembly_open");
    AssemblyGetImage = Resolve<IL2Cpp::AssemblyGetImage>(handle, "il2cpp_assembly_get_image");
    IL2CppClassFromName = Resolve<IL2Cpp::IL2CppClassFromName>(handle, "il2cpp_class_from_name");
    IL2CppClassGetMethods = Resolve<IL2Cpp::IL2CppClassGetMethods>(handle, "il2cpp_class_get_methods");
    IL2CppClassGetMethodFromName = Resolve<IL2Cpp::IL2CppClassGetMethodFromName>(handle, "il2cpp_class_get_method_from_name");
    
    // ...

    IL2CppDomain = GetDomain();
}
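GetProcAddress returns a null pointer when an export does not exist, and the initialization above would then store that null pointer silently. A minimal sketch of a pre-flight check is shown below. `ExportLookup` and `FindMissingExports` are hypothetical names, and the lookup is passed in as a callback so that the block does not depend on the Windows headers.

```cpp
#include <functional>
#include <string>
#include <vector>

// Stand-in for GetProcAddress bound to a module handle: takes an export
// name and returns its address, or nullptr when the export is absent.
using ExportLookup = std::function<void*(const char*)>;

// Hypothetical helper: returns the names that the lookup could not resolve,
// in the order they were requested.
std::vector<std::string> FindMissingExports(
    const ExportLookup& lookup, const std::vector<std::string>& names)
{
    std::vector<std::string> missing;

    for (const auto& name : names)
    {
        if (lookup(name.c_str()) == nullptr)
            missing.push_back(name);
    }

    return missing;
}
```

On Windows, the callback could simply wrap a call to GetProcAddress with the GameAssembly.dll handle, and the resolver could log the returned names and abort if the list is not empty.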

Now I can create some methods that will be useful to get the address of the functions I want to intercept.

IL2Cpp::MethodPointer Resolver::GetMethodPointer(const char* assemblyName, const char* namespaze, const char* className, const char* name, const int& argsCount)
{
    const auto assembly = DomainAssemblyOpen(IL2CppDomain, assemblyName);
    const auto image = AssemblyGetImage(assembly);
    const auto klass = IL2CppClassFromName(image, namespaze, className);

    return IL2CppClassGetMethodFromName(klass, name, argsCount)->MethodPointer;
}

IL2Cpp::MethodPointer Resolver::FindMethodPointer(const char* assemblyName,
    const char* namespaze, const char* className, const std::function<bool(const IL2Cpp::MethodInfo*)>& predicate)
{
    const auto assembly = DomainAssemblyOpen(IL2CppDomain, assemblyName);
    const auto image = AssemblyGetImage(assembly);
    const auto klass = IL2CppClassFromName(image, namespaze, className);
    void* iter = nullptr;

    while (const auto* method = IL2CppClassGetMethods(klass, &iter))
    {
        if (predicate(method))
        {
            spdlog::info("Matching candidate found: {0}.{1}", method->Class->Name, method->Name);
            return method->MethodPointer;
        }
    }

    return nullptr;
}

With these two methods I can get all the addresses I need. You may have noticed the presence of an IL2Cpp namespace and the structures that compose it. In order to access the information returned by the IL2Cpp functions, I need a structure equivalent to the original. It does not have to be identical but its representation in memory must be the same. For example, the MethodInfo structure contains all the information about a method.

struct MethodInfo;
struct ParameterInfo;
struct Class;
struct Type; // Forward declaration required by MethodInfo::ReturnType

typedef void(*MethodPointer)();
typedef void (*InvokerMethod)(MethodPointer, const IL2Cpp::MethodInfo*, void*, void**, void*);

struct MethodInfo
{
    MethodPointer MethodPointer;
    InvokerMethod InvokerMethod;
    const char* Name;
    Class* Class;
    const Type* ReturnType;
    const ParameterInfo* Parameters;
    uint32_t Token;
    uint16_t Flags;
    uint16_t IFlags;
    uint16_t Slot;
    uint8_t ParametersCount;
    uint8_t IsGeneric : 1;
    uint8_t IsInflated : 1;
    uint8_t WrapperType : 1;
    uint8_t IsMarshaledFromNative : 1;
};

This information can be found in projects like Il2CppDumper and Il2CppInspector.
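Since only the memory representation matters, compile-time checks can catch an accidental change in field order or size. The sketch below mirrors the MethodInfo structure shown above with opaque pointers; `MethodInfoMirror` is a hypothetical name, and the offsets assume a 64-bit target. They only verify this mirror against itself, not against the layout used by a particular Unity version.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of the MethodInfo structure from the article.
// Pointer types are opaque here because only order and size matter.
struct MethodInfoMirror
{
    void (*methodPointer)();
    void* invokerMethod;
    const char* name;
    void* klass;
    const void* returnType;
    const void* parameters;
    std::uint32_t token;
    std::uint16_t flags;
    std::uint16_t iflags;
    std::uint16_t slot;
    std::uint8_t parametersCount;
    std::uint8_t bitFlags; // The four 1-bit fields share this single byte.
};

static_assert(sizeof(void*) == 8, "this sketch assumes a 64-bit target");
static_assert(offsetof(MethodInfoMirror, name) == 16);
static_assert(offsetof(MethodInfoMirror, parameters) == 40);
static_assert(offsetof(MethodInfoMirror, token) == 48);
static_assert(offsetof(MethodInfoMirror, parametersCount) == 58);
static_assert(sizeof(MethodInfoMirror) == 64);
```

If a field is later inserted or resized by mistake, the build fails instead of the hook reading garbage at run time.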

Translating internal strings

What I call internal strings are the texts defined in the game code rather than loaded from a file or a web request. To translate them, I use the Localize class, which I discovered while analyzing the file generated by Il2CppDumper.

// Namespace: Gallop
[DumpInfo(TypeDefIndex = 13768)]
public class Localize
{
    // ...
    
	[DumpInfo(Version = 1, RVA = 0x1c2473c, VA = 0x7ce3cac73c)]
	public static String Get(TextId id) { }
    
	[DumpInfo(Version = 1, RVA = 0x1c248a8, VA = 0x7ce3cac8a8)]
	public static String Get(String id) { }
    
	[DumpInfo(Version = 1, RVA = 0x1c249dc, VA = 0x7ce3cac9dc)]
	public static Void Set(Region region, String id, String value) { }
    
	// ...
}

The method worth intercepting is Localize.Get(TextId id). The Localize.Get(String id) method is never called, so I can ignore it.

Pointer<void> Localize::OriginalLocalizeByTextId;

Pointer<IL2Cpp::String> Localize::Get(const int id)
{
    const auto& result = reinterpret_cast<decltype(Get)*>(OriginalLocalizeByTextId)(id);

    // TODO: Get translation using id.

    return result;
}

const auto LocalizeGetTextId = Uma::Resolver::FindMethodPointer(
    Uma::Assembly::UmaMusume,
    Uma::Namespace::Gallop,
    Uma::Class::Localize,
    [](const IL2Cpp::MethodInfo* method)
    {
        if (method == nullptr)
            return false;

        return std::strcmp(method->Name, "Get") == 0
            && method->Parameters != nullptr
            && method->Parameters[0].ParameterType != nullptr
            && method->Parameters[0].ParameterType->TypeEnum == IL2Cpp::TypeEnum::IL2CPP_TYPE_VALUETYPE;
    }
);

MH_CreateHook(reinterpret_cast<LPVOID>(LocalizeGetTextId), reinterpret_cast<LPVOID>(Localize::Get), &Localize::OriginalLocalizeByTextId);
MH_EnableHook(reinterpret_cast<LPVOID>(LocalizeGetTextId));

This is what I get when I launch the game.

[2022-03-03 10:56:06.574] [info] UmaMusume.Core Loaded!
# ...
[2022-03-03 10:56:09.123] [info] Successfully hooked Localize.Get(TextId)
[828] Text Intercepted: 新米
[3591] Text Intercepted: ■ご注意■
[3592] Text Intercepted: ちゅうい
[3593] Text Intercepted: このアプリは基本無料で遊べます
[3594] Text Intercepted: きほんむりょう
[3595] Text Intercepted: あそ
[3596] Text Intercepted: 一部有料でアイテムを買うこともできます
[3597] Text Intercepted: いちぶゆうりょう
# ...

I can now replace this text with the translation I want. To know which text to translate, I can use the id passed as a parameter instead of comparing the texts directly. Here is an example with the French translation of the game.
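A lookup keyed by id could be sketched as follows. The table, the function names and the sample entry are hypothetical placeholders (id 828 is the first text in the log above), and the strings are UTF-16 because .NET strings are stored that way. Converting the result back to an IL2CPP string is left out.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical translation table keyed by TextId. The single entry is an
// illustrative placeholder, not the project's actual translation data.
const std::unordered_map<int, std::u16string>& TranslationsById()
{
    static const std::unordered_map<int, std::u16string> table = {
        {828, u"Rookie"},
    };
    return table;
}

// Returns the translation registered for `id`, or `original` when the
// table has no entry, so untranslated texts stay in Japanese.
std::u16string TranslateById(int id, const std::u16string& original)
{
    const auto& table = TranslationsById();
    const auto entry = table.find(id);
    return entry != table.end() ? entry->second : original;
}
```

Falling back to the original text keeps the game usable while the translation is still incomplete.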

Translating database strings

I also found the Query class in the LibNative.Sqlite3 namespace, which indicates that the game uses SQLite to store some information.

// Namespace: LibNative.Sqlite3
[DumpInfo(TypeDefIndex = 3142)]
public class Query : IDisposable
{
	// Fields
	[DumpInfo(FieldOffset = 0x10)]
	protected Connection _conn;

    // ...

	// Methods
	[DumpInfo(Version = 1, RVA = 0x33f6dd8, VA = 0x7ce547edd8)]
	public Void .ctor(Connection conn, String sql) { }

	[DumpInfo(Version = 1, RVA = 0x33f83d4, VA = 0x7ce54803d4)]
	public String GetText(Int32 idx) { }
    
    // ...
}

While browsing the game files looking for an SQLite file, I came across the master.mdb file. Using DB Browser for SQLite, I was able to open this file, which contains over 300 SQL tables. It holds information about characters, races and other events. But the table I'm interested in is text_data, which contains a lot of the text used in the game, especially the descriptions of the characters' skills.

All that is left to do is to intercept the Query.GetText(Int32) method and retrieve the translation corresponding to our text.
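Query.GetText only receives an index, which appears to be a column index rather than a text identifier (an assumption based on how the sqlite3 column API works). The translation therefore has to be found from the returned text itself. A minimal sketch, with the hypothetical names `TranslationTable` and `TranslateByText`:

```cpp
#include <string>
#include <unordered_map>

// Hypothetical table that maps an original UTF-16 text to its translation.
using TranslationTable = std::unordered_map<std::u16string, std::u16string>;

// Returns the translation of `original`, or `original` itself when the
// table has no entry for it.
std::u16string TranslateByText(const TranslationTable& table, const std::u16string& original)
{
    const auto entry = table.find(original);
    return entry != table.end() ? entry->second : original;
}
```

Because this hook sees every text column read through Query, texts that have no translation must pass through unchanged.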

void Query::InterceptMethods()
{
    const auto GetTextAddress = Uma::Resolver::GetMethodPointer(
        Uma::Assembly::LibNative,
        Uma::Namespace::Sqlite3,
        Uma::Class::Query,
        "GetText",
        1
    );

    MH_CreateHook(reinterpret_cast<LPVOID>(GetTextAddress), reinterpret_cast<LPVOID>(GetText), &Query::OriginalGetText);
    if (MH_EnableHook(reinterpret_cast<LPVOID>(GetTextAddress)) == MH_OK)
        spdlog::info("Successfully hooked Query.GetText(Int32)");
    else
        spdlog::error("Failed to hook Query.GetText(Int32)");
}

Pointer<void> Query::OriginalGetText;

Pointer<IL2Cpp::String> Query::GetText(Pointer<void> _this, int idx)
{
    const auto str = reinterpret_cast<decltype(GetText)*>(OriginalGetText)(_this, idx);

    // ... Get Translation

    return str;
}

Here is a translation of a skill as an example (in French, again).

Things that didn't work

Obviously, I didn't get everything right the first time: developing this project required a lot of time and research. This is also my first code injection project, so I don't know all the existing injection techniques. I will quickly go over my other attempts because they were very instructive, even though they ended in failure.

Standard Injection

Injecting a DLL into a program can be relatively simple. Even if it doesn't work all the time, trying this method doesn't require much effort or time.

It only takes a few lines of code in C++ to inject the DLL you want into a running program.

#include <cstring>
#include <iostream>
#include <Windows.h>

// Based on: https://github.com/AYIDouble/Simple-DLL-Injection

int main()
{
	// The path is resolved inside the target process, so an absolute path is safer in practice
	LPCSTR DllPath = "UmaMusume.dll";
	const SIZE_T DllPathSize = std::strlen(DllPath) + 1; // Length of the path string + null terminator

	HWND hwnd = FindWindowA(NULL, "umamusume.exe"); // Find the game window by its title
	DWORD procID = 0; // Identifier of the process that owns the window
	GetWindowThreadProcessId(hwnd, &procID);
	HANDLE handle = OpenProcess(PROCESS_ALL_ACCESS, FALSE, procID); // Open the process with all access rights
	if (handle == NULL)
	{
		std::cerr << "Unable to open the target process." << std::endl;
		return 1;
	}

	// Allocate memory for the DLL path in the target process
	LPVOID pDllPath = VirtualAllocEx(handle, NULL, DllPathSize, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);

	// Write the path to the memory we just allocated in the target process
	if (pDllPath == NULL || !WriteProcessMemory(handle, pDllPath, DllPath, DllPathSize, NULL))
	{
		std::cerr << "Unable to write the DLL path into the target process." << std::endl;
		if (pDllPath != NULL)
			VirtualFreeEx(handle, pDllPath, 0, MEM_RELEASE);
		CloseHandle(handle);
		return 1;
	}

	// Create a remote thread in the target process which calls LoadLibraryA with our DLL path as an argument
	HANDLE hLoadThread = CreateRemoteThread(handle, NULL, 0,
		(LPTHREAD_START_ROUTINE)GetProcAddress(GetModuleHandleA("Kernel32.dll"), "LoadLibraryA"), pDllPath, 0, NULL);

	if (hLoadThread != NULL)
	{
		WaitForSingleObject(hLoadThread, INFINITE); // Wait for the execution of our loader thread to finish
		CloseHandle(hLoadThread);
		std::cout << "DLL injected!" << std::endl;
	}
	else
	{
		std::cerr << "Unable to create the remote thread." << std::endl;
	}

	std::cin.get();

	// With MEM_RELEASE, the size must be 0: the whole allocation is freed
	VirtualFreeEx(handle, pDllPath, 0, MEM_RELEASE);
	CloseHandle(handle);

	return 0;
}

Unfortunately, this did not work. DMM Games runs the game as a child process, which means that it has full control over the program and its memory. Thus, it is impossible for me to inject code or allocate memory for my DLL. Running my injector as an administrator does not solve the problem either.

Injection using a Kernel Driver

I will not dwell on this type of injection because it brought me more problems than solutions.

Driver injection takes advantage of the way Windows drivers work. These drivers are loaded first and have access to almost all the memory allocated on the machine. When a program is launched, the driver can check which program it is and inject the DLL into the chosen executable. Once the program is detected, the driver waits for ntdll.dll to be loaded before injecting the DLL.

Here is part of the injection logic of the driver I developed. Unfortunately, the code is too long (more than 2,000 lines) to show in full.

NTSTATUS NTAPI InjectX64NoThunk
(
	_In_ PInjectionInfo pInjectionInfo,
	_In_ InjectionArchitecture Architecture,
	_In_ HANDLE SectionHandle,
	_In_ SIZE_T SectionSize
)
{
	NT_ASSERT(pInjectionInfo->LdrLoadDllAddress);
	NT_ASSERT(Architecture == InjArchitectureX64);

	UNREFERENCED_PARAMETER(Architecture);

	NTSTATUS Status;
	PVOID SectionMemoryAddress = NULL;
	
	Status = ZwMapViewOfSection(SectionHandle, ZwCurrentProcess(), &SectionMemoryAddress, 0, PAGE_SIZE, NULL, &SectionSize, ViewUnmap, 0, PAGE_READWRITE);
	if (!NT_SUCCESS(Status))
	{
		return Status;
	}

	PUNICODE_STRING DllPath = (PUNICODE_STRING)(SectionMemoryAddress);
	PWCHAR DllPathBuffer = (PWCHAR)((PUCHAR)DllPath + sizeof(UNICODE_STRING));

	RtlCopyMemory(DllPathBuffer, InjDllPath[Architecture].Buffer, InjDllPath[Architecture].Length);
	RtlInitUnicodeString(DllPath, DllPathBuffer);
	Status = QueueApc(UserMode, (PKNORMAL_ROUTINE)(ULONG_PTR)pInjectionInfo->LdrLoadDllAddress, NULL, NULL, DllPath);

	if (!NT_SUCCESS(Status))
	{
		Log("QueueApc in InjectX64NoThunk failed..");
	}

	return Status;
}

This technique allowed me to inject my DLL into the game, but I couldn't interact with the memory, which was still protected or inaccessible. I think this mostly comes from my lack of knowledge on the subject.

Even if I had decided to go down this rabbit hole, the approach has many drawbacks. A Windows driver must be signed before all users can install it, and the required code-signing certificate is expensive. Asking a user to install a driver for code injection can cause problems with various anti-cheat software and block the launch of some games. In addition, the code must be of high quality, because an error during the execution of a driver can cause a blue screen.

What's next?

Should I continue to maintain the project with improvements, documentation and other utilities?

Some other ideas:

  • Use the DeepL API for texts that are not yet translated.
  • Change the size of the text boxes rather than the font size for texts that are too long.
  • Translate stories that are downloaded from the game servers.
  • Replace textures with translated versions.
  • Improve the FR/EN translation and support other languages.
  • Create other plugins.