42
\$\begingroup\$

After compilation, C code and data are placed into different memory segments, namely .text, .data, .bss, stack and heap. I just want to know where each of these segments would reside in a microcontroller's memory. That is, which data goes into which type of memory, given that the memory types are RAM, NVRAM, ROM, EEPROM, Flash, etc.

I have found answers to similar questions here, but they failed to explain what would be the contents of each of the different memory types.

Any sort of help is highly appreciated. Thanks in advance!

\$\endgroup\$
5
  • 2
    \$\begingroup\$ NVRAM, ROM, EEPROM and Flash are pretty much just different names for the same thing: non-volatile memory. \$\endgroup\$
    – Lundin
    Commented Jun 2, 2016 at 7:38
  • 1
    \$\begingroup\$ Slightly tangential to the question, but code can (exceptionally) exist in pretty much any of these, particularly if you consider patch or calibration uses. Sometimes it will be moved before execution, sometimes executed in place. \$\endgroup\$ Commented Jun 2, 2016 at 8:12
  • 1
    \$\begingroup\$ @SeanHoulihane The OP is asking about microcontrollers, which almost always execute out of Flash (you did qualify your comment with exceptional). Microprocessors with MB's of external RAM running Linux for example, would copy their programs into RAM to execute them, perhaps off an SD card acting as a mountable volume. \$\endgroup\$
    – tcrosley
    Commented Jun 2, 2016 at 8:18
  • 1
    \$\begingroup\$ @tcrosley There are now microcontrollers with TCM, and sometimes microcontrollers are part of a larger SoC. I also suspect there are cases like eMMC devices where the MCU bootstraps itself to run from RAM out of its own storage (based on memory of some phone hard-bricking from a couple of years ago). I agree, it's not a direct answer, but I think it's very relevant that the typical mappings are not in any way hard rules. \$\endgroup\$ Commented Jun 2, 2016 at 9:39
  • 2
    \$\begingroup\$ there are no rules to connect one thing to another, sure the read only stuff like text and rodata would ideally want to go in flash, but the .data and the offset and size of .bss go there too (and then get copied by the bootstrap code). these terms (.text, etc) have nothing to do with microcontrollers it is a compiler/toolchain thing that applies to all targets of compilers/toolchains. at the end of the day the programmer decides where things go and informs the toolchain via a linker script usually. \$\endgroup\$
    – old_timer
    Commented Jun 2, 2016 at 13:23

3 Answers

55
\$\begingroup\$

.text

The .text segment contains the actual code, and is programmed into Flash memory for microcontrollers. There may be more than one text segment when there are multiple, non-contiguous blocks of Flash memory; e.g. a start vector and interrupt vectors located at the top of memory, and code starting at 0; or separate sections for a bootstrap and main program.

.bss and .data

Three types of data can be allocated outside a function or procedure. The first is uninitialized data (historically called .bss, which also includes zero-initialized data), and the second is initialized (non-bss) data, or .data. The name "bss" comes from "Block Started by Symbol", used in an assembler some 60 years ago. Both of these areas are located in RAM. The third type, read-only data, is covered under .rodata below.

As a program is compiled, variables will be allocated to one of these two general areas. During the linking stage, all of the data items will be collected together. All variables which need to be initialized will have a portion of the program memory set aside to hold the initial values, and just before main() is called, the variables will be initialized, typically by a module called crt0. The bss section is initialized to all zeros by the same startup code.
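The two start-up steps described above can be sketched in portable C. This is only an illustration under stated assumptions: the function name `startup_init_sections` and its parameters are hypothetical, and on a real target the addresses would come from symbols defined by the linker script rather than from ordinary arrays.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of what crt0-style start-up code does before main().
 * data_load  : initial values stored in program memory (Flash)
 * data_start : where .data lives in RAM
 * bss_start  : where .bss lives in RAM */
void startup_init_sections(const uint8_t *data_load,
                           uint8_t *data_start, size_t data_size,
                           uint8_t *bss_start, size_t bss_size)
{
    assert(data_load != NULL && data_start != NULL && bss_start != NULL);

    /* Copy the initial values of explicitly initialized variables. */
    memcpy(data_start, data_load, data_size);

    /* Clear everything that must read as zero. */
    memset(bss_start, 0, bss_size);
}
```

Calling it with a three-byte "Flash image" and two RAM buffers filled with power-on garbage leaves the first buffer equal to the image and the second buffer all zeros.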

With a few microcontrollers, there are shorter instructions that allow access to the first page (first 256 locations, sometimes called page 0) of RAM. The compiler for these processors may reserve a keyword like near to designate variables to be placed there. Similarly, there are also microcontrollers that can only reference certain areas via a pointer register (requiring extra instructions), and such variables are designated far. Finally, some processors can address a section of memory bit by bit and the compiler will have a way to specify that (such as the keyword bit).

So there might be additional segments like .nearbss and .neardata, etc., where these variables are collected.

.rodata

The third type of data external to a function or procedure is like the initialized variables, except it is read-only and cannot be modified by the program. In the C language, these variables are denoted using the const keyword. They are usually stored as part of the program flash memory. Sometimes they are identified as part of a .rodata (read-only data) segment. On microcontrollers using the Harvard architecture, the compiler must use special instructions to access these variables.

stack and heap

The stack and heap are both placed in RAM. Depending on the architecture of the processor, the stack may grow up or grow down. If it grows up, it will be placed at the bottom of RAM. If it grows down, it will be placed at the end of RAM. The heap will use the remaining RAM not allocated to variables, and grow in the opposite direction to the stack. The maximum size of the stack and heap can usually be specified as linker parameters.
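As a rough sketch of the heap growing toward a downward-growing stack, the following models the free RAM as a region with a bump pointer. The type `ram_region` and the function `heap_grow` are hypothetical names chosen for illustration; a real toolchain would take the boundaries from the linker script, and its allocator would be more elaborate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the free RAM between the end of .bss and the stack. */
typedef struct {
    uint8_t *heap_end;     /* next free byte; the heap grows upward from here */
    uint8_t *stack_limit;  /* lowest address reserved for the stack */
} ram_region;

/* sbrk-style growth: returns the start of the new block, or NULL if the
 * request would run into the space reserved for the stack. */
void *heap_grow(ram_region *r, size_t nbytes)
{
    assert(r != NULL && r->heap_end <= r->stack_limit);

    if (nbytes > (size_t)(r->stack_limit - r->heap_end)) {
        return NULL;  /* heap and stack would collide */
    }
    void *block = r->heap_end;
    r->heap_end += nbytes;
    return block;
}
```

With a 64-byte buffer standing in for RAM and the top 16 bytes reserved for the stack, requests succeed until the 48 bytes below the limit are used up, after which `heap_grow` returns NULL.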

Variables placed on the stack are any variables defined within a function or procedure without the keyword static. They were once called automatic variables (auto keyword), but that keyword is not needed. Historically, auto exists because it was part of the B language which preceded C, and there it was needed. Function parameters are also placed on the stack.

Here is a typical layout for RAM (assuming no special page 0 section):

[Image: typical RAM layout]

EEPROM, ROM, and NVRAM

Before Flash memory came along, EEPROM (electrically erasable programmable read-only memory) was used to store the program and const data (.text and .rodata segments). Now there is just a small amount (e.g. 2 KB to 8 KB) of EEPROM available, if any at all, and it is typically used for storing configuration data or other small amounts of data that need to be retained over a power-down/power-up cycle. These are not declared as variables in the program, but instead are written to using special registers in the microcontroller. EEPROM may also be implemented in a separate chip and accessed via an SPI or I²C bus.

ROM is essentially the same as Flash, except it is programmed at the factory (not programmable by the user). It is used only for very high volume devices.

NVRAM (non-volatile RAM) is an alternative to EEPROM, and is usually implemented as an external IC. Regular RAM may be considered non-volatile if it is battery-backed up; in that case no special access methods are needed.

Although data can be saved to Flash, Flash memory has a limited number of erase/program cycles (1000 to 10,000) so it's not really designed for that. It also requires blocks of memory to be erased at once, so it's inconvenient to update just a few bytes. It's intended for code and read-only variables.
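To show why updating just a few bytes is inconvenient, here is a small host-side model of a Flash page. It assumes the behaviour common to NOR Flash, where erasing sets all bits to 1 and programming can only clear bits; the page size and the `flash_*` function names are hypothetical and do not correspond to any vendor API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FLASH_PAGE_SIZE 16u  /* hypothetical; real erase blocks are much larger */

/* Erasing sets every bit in the page to 1. */
void flash_erase_page(uint8_t *page)
{
    memset(page, 0xFF, FLASH_PAGE_SIZE);
}

/* Programming can only clear bits (1 -> 0), modeled here with AND. */
void flash_program(uint8_t *page, size_t offset, uint8_t value)
{
    assert(offset < FLASH_PAGE_SIZE);
    page[offset] &= value;
}

/* Changing one byte: copy the page to RAM, erase it, program it all back. */
void flash_update_byte(uint8_t *page, size_t offset, uint8_t value)
{
    uint8_t shadow[FLASH_PAGE_SIZE];

    assert(offset < FLASH_PAGE_SIZE);
    memcpy(shadow, page, FLASH_PAGE_SIZE);
    shadow[offset] = value;

    flash_erase_page(page);  /* costs one erase cycle for the whole page */
    for (size_t i = 0; i < FLASH_PAGE_SIZE; i++) {
        flash_program(page, i, shadow[i]);
    }
}
```

In this model, programming 0x0F and then 0xF0 into the same erased byte leaves 0x00, not 0xF0; only the erase-and-rewrite path in `flash_update_byte` stores the new value correctly.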

EEPROM has much higher limits on erase/program cycles (100,000 to 1,000,000), so it is much better for this purpose. If there is EEPROM available on the microcontroller and it's large enough, it's where you want to save non-volatile data. However, on some devices you will also have to erase in blocks first (typically 4 KB) before writing.
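Because the cycle count is still finite, a common way to spare EEPROM cells is to skip writes that would not change the stored value. The sketch below models this on the host with an array; `eeprom_model` and `eeprom_update` are hypothetical names, and on a real part the cell access would go through the microcontroller's EEPROM registers or a vendor library instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side stand-in for a small EEPROM. */
typedef struct {
    uint8_t cells[32];
    unsigned long writes;  /* counts erase/program cycles actually spent */
} eeprom_model;

/* Write only when the value differs; returns true if a cycle was spent. */
bool eeprom_update(eeprom_model *e, size_t addr, uint8_t value)
{
    assert(e != NULL && addr < sizeof e->cells);

    if (e->cells[addr] == value) {
        return false;  /* unchanged: no wear */
    }
    e->cells[addr] = value;
    e->writes++;
    return true;
}
```

Saving the same configuration byte twice in a row then costs one cycle rather than two.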

If there is no EEPROM or it's too small, then an external chip is needed. A 32 KB EEPROM is only 66¢ and can be erased/written to 1,000,000 times. An NVRAM with the same number of erase/program operations is much more expensive (about 10×). NVRAMs are typically faster for reading than EEPROMs, but slower for writing. They may be written to one byte at a time, or in blocks.

A better alternative to both of these is FRAM (ferroelectric RAM), which has essentially infinite write cycles (100 trillion) and no write delays. It's about the same price as NVRAM, around $5 for 32KB.

\$\endgroup\$
9
  • 1
    \$\begingroup\$ That was some real useful piece of information. Could you please provide a reference to your explanation? Like text books or journals, in case I want to read more about this..? \$\endgroup\$
    – stenvar
    Commented Jun 2, 2016 at 8:30
  • 1
    \$\begingroup\$ @SojuTVarghese I updated my answer and included some information about FRAM also. \$\endgroup\$
    – tcrosley
    Commented Jun 2, 2016 at 9:26
  • 2
    \$\begingroup\$ @Lundin we used the same segment names (e.g. .rodata) so the answers complement each other nicely. \$\endgroup\$
    – tcrosley
    Commented Jun 2, 2016 at 9:27
  • 1
    \$\begingroup\$ All variables with static storage duration will be initialized, which includes all variables at file scope and inside functions with the static keyword. So .bss must always be initialized to zero. \$\endgroup\$
    – starblue
    Commented Jun 2, 2016 at 19:42
  • 1
    \$\begingroup\$ @starblue I remember using Microsoft's Visual C 6.0 compiler about 20 years ago, and having bugs in my program due to uninitialized variables. The debug build would initialize all non-automatic variables to 0, including global variables defined across multiple files, but the release build would only initialize variables with an explicit initialization. Very frustrating. \$\endgroup\$
    – tcrosley
    Commented Jun 2, 2016 at 19:59
31
\$\begingroup\$

Normal embedded system:

Segment     Memory   Contents

.data       RAM      Explicitly initialized variables with static storage duration
.bss        RAM      Zero-initialized variables with static storage duration
.stack      RAM      Local variables and function call parameters
.heap       RAM      Dynamically allocated variables (usually not used in embedded systems)
.rodata     ROM      const variables with static storage duration. String literals.
.text       ROM      The program. Integer constants. Initializer lists.

In addition, there are usually separate flash segments for start-up code and interrupt vectors.


Explanation:

A variable has static storage duration if it is declared as static or if it resides at file scope (sometimes sloppily called "global"). C has a rule stating that all static storage duration variables that the programmer did not initialize explicitly must be initialized to zero.

Every static storage duration variable that is initialized to zero, implicitly or explicitly, ends up in .bss, while those that are explicitly initialized to a non-zero value end up in .data.

Examples:

static int a;                // .bss
static int b = 0;            // .bss      
int c;                       // .bss
static int d = 1;            // .data
int e = 1;                   // .data

void func (void)
{
  static int x;              // .bss
  static int y = 0;          // .bss
  static int z = 1;          // .data
  static int* ptr = NULL;    // .bss
}

Please keep in mind that a very common non-standard setup for embedded systems is to have a "minimal start-up", which means that the program will skip all initialization of objects with static storage duration. Therefore it might be wise to never write programs that rely on the initialization values of such variables, but instead set them at run-time before they are used for the first time.
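A minimal sketch of that advice, using a hypothetical sensor module: the variables carry no initializers, and every value the code depends on is assigned in an explicit init function that the application must call first. All names here are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* No initializers: nothing below relies on .data being copied
 * or .bss being cleared by the start-up code. */
static uint16_t sample_count;
static uint8_t  gain;

/* Must be called before any other sensor_* function. */
void sensor_init(uint8_t initial_gain)
{
    assert(initial_gain != 0);
    sample_count = 0;
    gain = initial_gain;
}

/* Counts one sample and returns the running total. */
uint16_t sensor_record(void)
{
    return ++sample_count;
}

uint8_t sensor_gain(void)
{
    return gain;
}
```

Calling `sensor_init` again later also gives a cheap way to put the module back into a known state, for example after an error.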

Examples of the other segments:

const int a = 0;           // .rodata
const int b;               // .rodata (nonsense code but C allows it, unlike C++)
static const int c = 0;    // .rodata
static const int d = 1;    // .rodata

void func (int param)      // .stack
{
  int e;                   // .stack
  int f=0;                 // .stack
  int g=1;                 // .stack
  const int h=param;       // .stack
  static const int i=1;    // .rodata, static storage duration

  char* ptr;               // ptr goes to .stack
  ptr = malloc(1);         // pointed-at memory goes to .heap
}

Variables that can go on the stack may often end up in CPU registers during optimization. As a rule of thumb, any variable which doesn't have its address taken can be placed in a CPU register.

Note that pointers are a bit more intricate than other variables, since they allow two different kinds of const, depending on whether the pointed-at data should be read-only, or the pointer itself. It is very important to know the difference so your pointers don't end up in RAM by accident, when you wanted them to be in flash.

int* j=0;                  // .bss
const int* k=0;            // .bss, non-const pointer to const data
int* const l=0;            // .rodata, const pointer to non-const data
const int* const m=0;      // .rodata, const pointer to const data

void (*fptr1)(void);       // .bss
void (*const fptr2)(void); // .rodata
void (const* fptr3)(void); // invalid, doesn't make sense since functions can't be modified

In the case of integer constants, initializer lists, string literals etc, they may end up either in .text or .rodata depending on compiler. Likely, they end up as:

#define n 0                // .text
int o = 5;                 // 5 goes to .text (part of the instruction)
int p[] = {1,2,3};         // {1,2,3} goes to .text
char q[] = "hello";        // "hello" goes to .rodata
\$\endgroup\$
15
  • \$\begingroup\$ I do not understand, in your first example code, why 'static int b = 0;' goes into .bss and why 'static int d = 1;' goes into .data ..? In my understanding, both are static variables which have been initialized by the programmer.. then what makes the difference? @Lundin \$\endgroup\$
    – stenvar
    Commented Jun 2, 2016 at 9:01
  • 2
    \$\begingroup\$ @SojuTVarghese Because .bss data is initialized to 0 as a block; specific values like d = 1 have to be stored in flash. \$\endgroup\$
    – tcrosley
    Commented Jun 2, 2016 at 9:30
  • \$\begingroup\$ @SojuTVarghese Added some clarification. \$\endgroup\$
    – Lundin
    Commented Jun 2, 2016 at 10:47
  • \$\begingroup\$ @Lundin Also, from your last example code, does it mean that all initialized values go into .text or .rodata and their respective variables alone go into .bss or .data? If so, how are the variables and their corresponding values mapped to each other (i.e, between the .bss/.data and .text/.rodata segments) ? \$\endgroup\$
    – stenvar
    Commented Jun 2, 2016 at 11:43
  • 2
    \$\begingroup\$ @DannyS I wrote it myself? \$\endgroup\$
    – Lundin
    Commented Jun 11, 2018 at 6:26
3
\$\begingroup\$

While any data can go into any memory the programmer chooses, generally the system works best (and is intended to be used) where the use profile of the data is matched to the read/write profiles of the memory.

For instance, program code is WFRM (write few, read many), and there's a lot of it. This fits Flash nicely. ROM, on the other hand, is write once, read many.

Stack and heap are small, with lots of reads and writes. That would fit RAM best.

EEPROM would not suit either of those uses well, but it does suit the profile of small amounts of data persistent across power-ups, such as user-specific initialisation data and perhaps logging results.

\$\endgroup\$
