Memory management

1. Introduction

Memory management is one of the hardest problems when developing a big program. The Nintendo DS is a machine with a lot of restrictions, so it’s important to understand them and to know how to use memory in the most efficient way.

This chapter won’t talk about ITCM and DTCM because we have already seen how to allocate code and data in them in a previous chapter, and because they are normally allocated statically, not dynamically.

2. Static memory usage

The first thing we need to understand is static memory usage.

ARM9 code, data (data in .bin format added with #include) and variables allocated statically are placed starting from the beginning of main RAM. Any main RAM left unused is the memory available for the heap (we will talk about it later).

You may see the following names used in some places:

  • text: Section with code.
  • data: Section with variables that have non-zero initial values.
  • bss: Section with variables that have zero as initial values.

// Placed in bss
int value = 0;
int value2;

// Placed in data
int value3 = 2;

Sections text and data make your NDS ROM bigger and they are always placed in RAM when the application boots. Section bss doesn’t make your ROM bigger, but it takes up space in RAM.

Examples of statically-allocated data and code:

#include "my_graphics.h" // Header generated by grit

const char *enemy_name = "Big boss";

typedef struct {
    int x, y;
    int health;
} Enemy;

Enemy enemies[50];

int myfunction(int a, int b)
{
    return a + b;
}

The more code, data and static variables you add, the more your fixed RAM usage will grow, and the less flexibility you will have later for things that you only need temporarily in RAM. If you add too much code and data it won’t fit in the final ROM and the linking stage of the build process will fail.

There are no “DSi-only” ROMs in BlocksDS. All ROMs are compatible with DS and DSi, so you can’t have binaries that require more than 4 MiB of static memory. In fact, the DSi is more restrictive than the DS. DS ARM9 binaries can have a size of around 3.7 MiB (3BFE00h bytes). On DSi that’s reduced to about 2.5 MiB (27C000h bytes).

There are special tags that you can add to your code and data to make them available only when the application runs on a DSi: TWL_CODE, TWL_DATA and TWL_BSS. Anything tagged this way is placed in the twl sections, which have their own limits. For example, you can do this if the code and data will only be used on DSi consoles:

TWL_DATA const char *enemy_name = "Big boss";

typedef struct {
    int x, y;
    int health;
} Enemy;

TWL_BSS Enemy enemies[50];

TWL_CODE int myfunction(int a, int b)
{
    return a + b;
}

Remember to always use isDSiMode() to check if you’re running in DSi mode! If you try to use them in DS mode you will cause memory corruption and crashes. Also, remember that the twl sections also have a limit (2.5 MiB).
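One way to keep that check in a single place is to choose which data to use based on the current mode. The following is only a sketch, written in portable C so that it builds outside of a DS toolchain. The arrays are plain static arrays here, whereas on hardware the large one would be tagged with TWL_BSS, and dsi_mode stands for the value returned by isDSiMode(). The names select_enemy_pool, ds_enemies and dsi_enemies are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    int x, y;
    int health;
} Enemy;

// Assumption: on hardware this array would be declared with TWL_BSS so that
// it only exists in DSi mode. It is a plain array here so that the sketch
// builds anywhere.
Enemy dsi_enemies[50];

// Smaller pool that is safe to use on both DS and DSi.
Enemy ds_enemies[10];

// Returns the pool to use and writes its length to *count. The caller is
// expected to pass the result of isDSiMode() as dsi_mode, so the DSi-only
// array is never touched in DS mode.
Enemy *select_enemy_pool(bool dsi_mode, size_t *count)
{
    assert(count != NULL);

    if (dsi_mode) {
        *count = sizeof(dsi_enemies) / sizeof(dsi_enemies[0]);
        return dsi_enemies;
    }

    *count = sizeof(ds_enemies) / sizeof(ds_enemies[0]);
    return ds_enemies;
}
```

The rest of the program only works with the returned pointer and count, so no other code needs to know which sections exist.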

The source of the limits is GBATEK:

DS limitations:

ARM9 size (max 3BFE00h) (3839.5KB)
ARM7 size (max 3BFE00h, or FE00h) (3839.5KB, 63.5KB)

DSi limitations:

ARM9  2004000h..227FFFFh (siz=27C000h) (for NDS mode: 2000000h and up)
ARM7  2380000h..23BFFFFh (siz=40000h)
ARM9i 2400000h..267FFFFh (siz=280000h)
ARM7i 2E80000h..2F87FFFh (siz=108000h)
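The hexadecimal sizes in the listing are easier to compare once they are converted to KiB or MiB. This is a minimal sketch in plain C that does the conversion. Only the byte values come from the GBATEK listing, the macro and function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

// Byte sizes copied from the GBATEK listing. The names are illustrative only.
#define DS_ARM9_MAX_BYTES   0x3BFE00u
#define DSI_ARM9_MAX_BYTES  0x27C000u
#define DSI_ARM9I_MAX_BYTES 0x280000u

// Compile-time checks of the conversions that come out as whole KiB.
static_assert(DSI_ARM9_MAX_BYTES / 1024u == 2544u,
              "The DSi ARM9 area is 2544 KiB");
static_assert(DSI_ARM9I_MAX_BYTES / 1024u == 2560u,
              "The DSi ARM9i area is 2560 KiB (2.5 MiB)");

// Converts a size in bytes to KiB, keeping the fractional part.
double bytes_to_kib(uint32_t bytes)
{
    return bytes / 1024.0;
}

// Converts a size in bytes to MiB, keeping the fractional part.
double bytes_to_mib(uint32_t bytes)
{
    return bytes / (1024.0 * 1024.0);
}
```

For example, bytes_to_kib(DS_ARM9_MAX_BYTES) gives 3839.5, which matches the figure quoted by GBATEK, and bytes_to_mib(DSI_ARM9I_MAX_BYTES) gives 2.5, the limit of the twl sections.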

3. Heap

The heap is the space used for dynamic memory allocations by C functions like malloc() or things like new in C++.

This space is used for things that are temporary in nature. For example, if you want to load a JPG file from the SD card and display it on the screen, the first step is to use malloc() to reserve some space in main RAM to load the file. Then, the file is read, decoded, and the result is copied to VRAM. Finally, you call free() so that the space is marked as unused again and it can be reused later. If you don’t have enough space in the heap you can’t do this.
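The first step of that sequence can be sketched with standard C file functions. This is an illustration rather than the only way to do it: load_file is a hypothetical helper, and the decoding and the copy to VRAM are left out because they depend on the image library you use.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

// Reads a whole file into a buffer allocated with malloc(). Returns NULL if
// the file can't be read, if it is empty, or if there isn't enough heap.
// On success the caller owns the buffer and must call free() on it.
void *load_file(const char *path, size_t *size_out)
{
    assert(path != NULL && size_out != NULL);

    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return NULL;

    // Find out the size of the file to know how much heap to request.
    if (fseek(f, 0, SEEK_END) != 0) {
        fclose(f);
        return NULL;
    }
    long size = ftell(f);
    if (size <= 0 || fseek(f, 0, SEEK_SET) != 0) {
        fclose(f);
        return NULL;
    }

    void *buffer = malloc((size_t)size);
    if (buffer == NULL) {
        // Not enough space in the heap.
        fclose(f);
        return NULL;
    }

    if (fread(buffer, 1, (size_t)size, f) != (size_t)size) {
        free(buffer);
        fclose(f);
        return NULL;
    }

    fclose(f);
    *size_out = (size_t)size;
    return buffer;
}
```

Once the data has been decoded and copied to its final destination, call free() on the returned buffer so that the heap space can be reused.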

The heap is as big as the free space in main RAM (more or less). That means that on DS you have 4 MiB minus the size of your ARM9 static memory usage. On a DSi you have 16 MiB minus the ARM9 static memory usage. malloc() knows if the application is running on DS or DSi and it will use as much memory as is available.

In practice, you should always keep your static memory usage to at most 2-3 MiB so that your game works on DS and there’s enough heap for other purposes. Relying heavily on static memory may be okay at first if your application is very simple, but it will become a very big problem if the application grows. It will force you to rewrite big parts of your application to use dynamic memory.
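The arithmetic behind these numbers can be written down as a small helper. This is a rough estimate only: it uses the 4 MiB and 16 MiB figures from the text and ignores the small amount of main RAM that isn’t available to the heap. The name estimate_heap_bytes is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MIB ((size_t)1024 * 1024)

// Rough upper bound of the heap size: main RAM minus the ARM9 static memory
// usage. The real value is somewhat lower.
size_t estimate_heap_bytes(bool dsi_mode, size_t static_bytes)
{
    size_t main_ram = dsi_mode ? 16 * MIB : 4 * MIB;

    // Static usage bigger than main RAM would never have linked.
    assert(static_bytes <= main_ram);

    return main_ram - static_bytes;
}
```

With 3 MiB of static memory this leaves roughly 1 MiB of heap on a DS and roughly 13 MiB on a DSi, which is why the same program can feel comfortable on one console and very tight on the other.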

You can get information about the current heap usage with the following functions:

  • getHeapStart(): It returns a pointer to the start of the heap.
  • getHeapLimit(): It returns a pointer to the end of the heap. The heap will grow up to this address.
  • getHeapEnd(): It returns a pointer to the end of the heap that is currently being used. This pointer moves up and down depending on your memory usage.
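These three pointers are enough to work out how much of the heap region is in use and how much it can still grow. The sketch below takes the pointers as arguments so that it builds as plain C. On hardware you would pass the values returned by getHeapStart(), getHeapEnd() and getHeapLimit(), and it is assumed here that they can be treated as byte pointers. HeapUsage and heap_usage are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    size_t used;      // Bytes between the start and the current end
    size_t available; // Bytes between the current end and the limit
} HeapUsage;

// Calculates how big the heap currently is and how much it can still grow.
HeapUsage heap_usage(const uint8_t *start, const uint8_t *end,
                     const uint8_t *limit)
{
    assert(start <= end && end <= limit);

    HeapUsage usage = {
        .used = (size_t)(end - start),
        .available = (size_t)(limit - end),
    };
    return usage;
}
```

Printing both values at a few points of the program (for example, after loading each level) is a simple way to notice when the heap is close to its limit.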

If you want details about what happens between getHeapStart() and getHeapLimit() you can use mallinfo2(), from the <malloc.h> header. The information returned by mallinfo2() is complicated to understand, so you should check its documentation if you’re interested in it.

4. Stack

The stack is temporary memory that you don’t normally need to worry about. Local variables used in functions are placed in the stack. The compiler generates code so that when you call a function the function reserves some space in the stack, and that space is released when the function returns.

Note: Local variables defined as static inside a function are placed in bss (or in data if they have a non-zero initial value), but that’s a special case.

// Global static variables are placed in bss or data, but they are only visible
// in this C file.
static int my_array[50];

int myfunction(int value)
{
    // Regular temporary variables are placed in the stack, allocated whenever
    // the function is called. If the function is called recursively it will be
    // allocated in the stack many times.
    int temp_array[50];

    // Placed in bss, there is only one instance of it even if the function is
    // called recursively.
    static int static_temp_array[50];

    // ...
}

However, the stack is much smaller than the heap. It is placed in DTCM, which is only 16 KB in size, and not all of it is available for the stack of the main thread. A small part of it is used by the thread scheduler. In general, you should avoid allocating big variables in the stack.
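A common way to follow this advice is to move big temporary buffers from the stack to the heap. The following sketch uses a made-up function, sum_with_heap_table, whose temporary table would need 16 KiB as a local array, which is as much as all of DTCM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define TABLE_ENTRIES 4096 // 4096 entries of 4 bytes each are 16 KiB

// Builds a temporary table and writes the sum of its entries to *sum.
// Returns false if there isn't enough heap for the table.
bool sum_with_heap_table(uint32_t *sum)
{
    assert(sum != NULL);

    // A local "uint32_t table[TABLE_ENTRIES];" would take 16 KiB of stack.
    // Allocating it in the heap keeps the stack usage of this function small.
    uint32_t *table = malloc(TABLE_ENTRIES * sizeof(uint32_t));
    if (table == NULL)
        return false;

    for (uint32_t i = 0; i < TABLE_ENTRIES; i++)
        table[i] = i;

    uint32_t total = 0;
    for (uint32_t i = 0; i < TABLE_ENTRIES; i++)
        total += table[i];

    free(table);

    *sum = total;
    return true;
}
```

The price is that the allocation can fail, so the function has to report that to the caller instead of silently overflowing the stack.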

The stack of the main thread of the application is placed in DTCM. The stack starts growing from the end of DTCM to the start. Also, DTCM is placed in a way so that if the stack overflows it will start using the end of main RAM.

This means that if you allocate too much data on the stack, it overflows into main RAM. If that RAM was already being used by something else, the overflow will corrupt that data and the application may crash.

There are some functions you need to be careful about, like the printf() family of functions. If the program crashes when you’re trying to print a debug error message (or right after printing it) this can be the issue (memory corruption issues don’t necessarily show up right when the corruption happens).

In general, you should try to keep your stack usage low so that it’s always in DTCM. DTCM is much faster than main RAM, so it will make your program faster. However, there may be cases in which you just need more stack. In that kind of situation you can use reduceHeapSize().

This is a function that you should call at the beginning of your program, before you start doing a lot of dynamic allocations and deallocations. If there is enough space at the end of the heap, reduceHeapSize() will block the requested amount of memory at the end of the heap so that malloc() and other functions can’t use it. In other words, it will become safe to use it as stack. Make sure to check the value returned by reduceHeapSize(), as it can fail if you have already allocated data at the end of the heap!