Stack Memory in ARM Cortex-M4

Stack memory is an essential part of how microcontrollers like the ARM Cortex-M4 manage temporary data, especially during function calls and interrupt handling. If you’re just starting out in embedded systems or working with ARM processors, understanding stack memory is a must.

What is Stack Memory?

Stack memory is a special section of the main memory (RAM) that is used to temporarily store data. This data is often short-lived and needed only during certain parts of a program, such as:

Function execution
Interrupt or exception handling
Saving and restoring CPU register values

Where is stack memory stored?

In the microcontroller's internal RAM, or in external RAM if the system has it.
The stack region is reserved when the program is built (in the linker script or startup code) and is managed automatically at run time.

How Does the Stack Work?

Stack follows a Last-In, First-Out (LIFO) rule, which means:

The last data pushed (added) onto the stack will be the first to be popped (removed).

Think of it like a stack of plates: you add one plate on top, and when you need one, you take the top one off first.

Stack Operations: PUSH and POP

The processor uses two main instructions to work with the stack:

Instruction	Action
PUSH	Saves (stores) data on the stack
POP	Restores (retrieves) data from the stack

These instructions automatically modify the Stack Pointer (SP).

What is the Stack Pointer (SP)?

The Stack Pointer (also called SP or R13) is a special CPU register that always points to the top of the stack.
When you PUSH, the stack pointer decreases (stack grows downwards).
When you POP, the stack pointer increases.

Stack Grows Down

In Cortex-M4, the stack grows from high memory to low memory.
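
The pointer movement described above can be tried out on a desktop machine. The following is a minimal host-side sketch, not Cortex-M code: `sim_stack_t`, `sim_init`, `sim_push`, and `sim_pop` are hypothetical names, and a 16-word array stands in for the stack region in RAM.

```c
#include <assert.h>
#include <stdint.h>

#define SIM_STACK_WORDS 16u

/* Hypothetical host-side model of a stack that grows downward. */
typedef struct {
    uint32_t mem[SIM_STACK_WORDS]; /* stands in for the stack region in RAM */
    uint32_t sp;                   /* word index; SIM_STACK_WORDS means empty */
} sim_stack_t;

static void sim_init(sim_stack_t *s) {
    s->sp = SIM_STACK_WORDS;       /* start just past the highest word */
}

/* PUSH: the stack pointer decreases, then the value is stored. */
static void sim_push(sim_stack_t *s, uint32_t value) {
    assert(s->sp > 0u);            /* no room left would mean overflow */
    s->sp -= 1u;
    s->mem[s->sp] = value;
}

/* POP: the value is read, then the stack pointer increases. */
static uint32_t sim_pop(sim_stack_t *s) {
    assert(s->sp < SIM_STACK_WORDS); /* nothing stored would mean underflow */
    uint32_t value = s->mem[s->sp];
    s->sp += 1u;
    return value;
}
```

Pushing two values and popping them again returns them in reverse order and leaves the simulated stack pointer where it started.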

When Is Stack Memory Used?

Let’s go through some common use cases:

1. 🧠 Temporary Storage of Register Values

When a function is called, compiler-generated code saves the registers it has to preserve (for example R4–R11 and LR) on the stack. When an interrupt occurs, the processor itself pushes R0–R3, R12, LR, PC, and xPSR so that the program state can be restored afterwards.

2. 📦 Temporary Storage of Local Variables

Local variables declared inside a function (e.g., int x = 5;) are stored in stack memory unless the compiler keeps them in registers. Their space is released automatically when the function returns; the memory itself is not cleared.

3. 🚨 Exception or Interrupt Context Saving

When an interrupt or system exception occurs:

The processor automatically stores a context on the stack.
This includes R0–R3, R12, the link register (LR), the return address (PC), and the processor status register (xPSR).
After the ISR (Interrupt Service Routine) ends, the stack is popped, and execution resumes as if nothing happened.

Stack in ARM Cortex-M4: Behind the Scenes

Main and Process Stack

Cortex-M4 has two types of stack pointers:

Stack Pointer	Usage
MSP	Main Stack Pointer – used by system/interrupts
PSP	Process Stack Pointer – used by user-level code/tasks (in RTOS)

By default, only MSP is used in bare-metal applications. In RTOS-based systems, PSP is used for thread execution.

Stack Overflow: A Word of Caution

Stack has a fixed size (set in linker script or startup code).
If too much data is pushed (e.g., recursive calls, large local variables), the stack can overflow into other memory areas and cause crashes.
This is why it’s important to allocate enough stack space and use tools to monitor usage.
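
One common way to monitor usage is "stack painting": fill the stack region with a known pattern at startup, then later count how many words still hold it. The sketch below shows the idea on a plain array so it can run on a host; `STACK_FILL_PATTERN`, `stack_paint`, and `stack_unused_words` are illustrative names, not part of any vendor library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL_PATTERN 0xDEADBEEFu /* assumed marker value */

/* Fill the whole stack region with the marker before the stack is used. */
static void stack_paint(uint32_t *region, size_t words) {
    assert(region != NULL);
    for (size_t i = 0; i < words; i++) {
        region[i] = STACK_FILL_PATTERN;
    }
}

/* Count words at the low-address end that still hold the marker.
 * The stack grows downward, so these are the words it never reached. */
static size_t stack_unused_words(const uint32_t *region, size_t words) {
    assert(region != NULL);
    size_t unused = 0;
    while (unused < words && region[unused] == STACK_FILL_PATTERN) {
        unused++;
    }
    return unused;
}
```

If the unused count ever reaches zero, the stack has used its whole region at least once and the allocation is probably too small.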

Example: What Happens During a Function Call

Here’s what typically happens when a function is called:

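Return address is placed in the link register (LR) by the call instruction (BL) and is saved on the stack if the function itself calls other functions.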
Local variables are allocated on the stack.
Registers may be saved to preserve state.

When the function returns:

Registers are restored.
Local variables are removed (stack pointer moves back).
Execution continues from the saved return address.

Summary

Feature	Description
Location	Internal or external RAM
Access Style	Last In, First Out (LIFO)
Accessed With	PUSH/POP or LDR/STR instructions
Tracked Using	Stack Pointer (SP/R13)
Stack Growth	From higher to lower memory addresses
Used For	Function calls, interrupts, local variable storage
Stack Pointers in Cortex-M4	MSP (default), PSP (used in RTOS)

What is RAM used for in Embedded Systems?

Imagine RAM like a shelf in your workspace. Different parts of it are used for different tasks:

Global Data Area:
- This section stores global variables and static local variables.
- It is available throughout the entire program.
- Think of it like a drawer where you keep important items that you’ll use again and again.
Heap:
- This area is used for dynamic memory allocation (e.g., using malloc in C).
- The size of data here is not fixed — it grows during the program’s execution when needed.
- Think of it like a box where you add things as you go.
Stack:
- This is used during function calls.
- It stores temporary data like:
  - Function parameters
  - Local variables
  - Return addresses
  - Interrupt frames
- Imagine it like a stack of books. You add a book on top when you go into a function, and remove it when the function is done.
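
The three areas can be seen side by side in a few lines of C. This is only an illustrative sketch: `boot_count`, `count_calls`, and `sum_of_squares` are made-up names, and where a local variable really ends up (stack or register) is the compiler's decision.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

uint32_t boot_count = 0;                     /* global: global data area */

/* Returns how many times it has been called so far. */
static uint32_t count_calls(void) {
    static uint32_t calls = 0;               /* static local: global data area */
    calls++;
    return calls;
}

/* Sums the squares 1*1 + 2*2 + ... + n*n using a temporary heap buffer. */
static uint32_t sum_of_squares(uint32_t n) {
    assert(n > 0u);
    uint32_t total = 0;                      /* local: stack (or a register) */
    uint32_t *buf = malloc(n * sizeof *buf); /* buffer contents: heap */
    assert(buf != NULL);
    for (uint32_t i = 0; i < n; i++) {
        buf[i] = (i + 1u) * (i + 1u);
    }
    for (uint32_t i = 0; i < n; i++) {
        total += buf[i];
    }
    free(buf);                               /* heap memory is released explicitly */
    return total;
}
```

The static counter keeps its value between calls because it does not live on the stack, while `total` and the pointer `buf` disappear as soon as `sum_of_squares` returns.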

How Does the Stack Work in ARM Cortex Mx Processors?

ARM Cortex Mx processors use a Full Descending (FD) stack model.

Let’s break that down:

Full: The memory address it points to is already in use.
Descending: The stack grows downwards in memory (from higher addresses to lower addresses).

🧠 Imagine your stack is a set of boxes on a shelf, and every time you add a new box, you place it below the last one, and it already has stuff inside.

📌 Different Types of Stack Operation Models

There are 4 common stack models based on how they grow and whether the top pointer points to an empty or full location:

Stack Type	Meaning
Full Ascending (FA)	Stack grows up, and the top points to a full slot.
Full Descending (FD)	Stack grows down, and the top points to a full slot. (Used by ARM Cortex Mx)
Empty Ascending (EA)	Stack grows up, and the top points to an empty slot.
Empty Descending (ED)	Stack grows down, and the top points to an empty slot.
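
The difference between the four models comes down to whether the pointer moves before or after the store, and in which direction. A small sketch, assuming 4-byte pushes and using the hypothetical names `stack_model_t` and `push_address`:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { STACK_FA, STACK_FD, STACK_EA, STACK_ED } stack_model_t;

/* Returns the address written by a one-word push and updates *sp. */
static uint32_t push_address(stack_model_t model, uint32_t *sp) {
    uint32_t addr;
    assert(sp != NULL);
    switch (model) {
    case STACK_FD: *sp -= 4u; addr = *sp; break; /* move down, then store */
    case STACK_FA: *sp += 4u; addr = *sp; break; /* move up, then store */
    case STACK_ED: addr = *sp; *sp -= 4u; break; /* store, then move down */
    default:       addr = *sp; *sp += 4u; break; /* STACK_EA: store, then move up */
    }
    return addr;
}
```

Starting from 0x20001000, a Full Descending push writes to 0x20000FFC and leaves SP there, which is the Cortex-M behaviour; an Empty Descending push would write to 0x20001000 and then move SP to 0x20000FFC.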

📌 Summary for ARM Cortex Mx:

Uses Full Descending (FD) stack.
Stack grows from high memory to low memory.
Top of the stack points to a valid (already used) memory location.

What is Stack Placement?

When a program runs on a microcontroller (like ARM Cortex-M), memory (RAM) is divided into different parts:

Data section (for global/static variables),
Heap (for dynamically allocated memory),
Stack (for temporary data in functions),
And the unused area.

On Cortex-M the stack always grows downward; what varies from system to system is where in RAM the stack is placed.

📌 Two Types of Stack Placement

✅ Type 1 – Stack at the Lower Memory Side

Data section is placed at the bottom.
Heap grows upwards (towards higher memory addresses).
Stack grows downwards (from high to low memory addresses).
Stack and heap grow toward each other, so they share the free space between them and collide only if one of them grows too far.

🧠 Think of it like:

Heap starts from the bottom and grows up.
Stack starts from the top and grows down.

✅ Type 2 – Stack at the Top (High Address Side)

Stack starts at the top of RAM and grows down.
Heap grows up from below the stack.
Heap and stack stay separate only as long as neither grows into the other.

🧠 This is common in embedded systems like ARM Cortex-Mx.
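
One way to notice that the heap and the stack are getting close is to compare the current end of the heap with the stack pointer. The helpers below are a sketch under assumptions: `heap_stack_gap` and `heap_can_grow` are hypothetical names, and the addresses are passed in as plain numbers so the logic can be exercised on a host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bytes still free between the top of the heap and the stack pointer.
 * Returns 0 if the two have already met or crossed. */
static uint32_t heap_stack_gap(uint32_t heap_end, uint32_t sp) {
    return (sp > heap_end) ? (sp - heap_end) : 0u;
}

/* True if the heap may grow by `request` bytes while still leaving
 * `reserve` bytes for the stack to grow into. */
static bool heap_can_grow(uint32_t heap_end, uint32_t sp,
                          uint32_t request, uint32_t reserve) {
    uint32_t gap = heap_stack_gap(heap_end, sp);
    return gap >= reserve && (gap - reserve) >= request;
}
```

A check of this kind fits naturally into whatever routine extends the heap, so an allocation can fail cleanly instead of overwriting stack data.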

Changing Stack Pointer (SP) to PSP in Thread Mode

🔹 What is SP (Stack Pointer)?

It’s a register that always points to the top of the current stack.

🔹 What are PSP and MSP?

MSP (Main Stack Pointer): Used by default after reset and in Handler mode (interrupts, exceptions).
PSP (Process Stack Pointer): Optionally used in Thread mode (normal application code) once it has been selected through the CONTROL register, for better memory separation.

🔄 Why Change SP to PSP?

To separate interrupt stack (using MSP) from application stack (using PSP).
Helps avoid accidental stack overflow between ISR and user code.

🔁 Summary of SP Switching:

Mode	Stack Pointer Used
Handler Mode (interrupts)	MSP
Thread Mode (application)	PSP (you can switch to it manually)

You can switch from MSP to PSP in thread mode using a special register (CONTROL register).
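
The selection rule can be written as a small decision function. This is a host-side sketch of the rule, not register access code: `active_stack_pointer` and the enum are hypothetical names, while the bit positions (nPRIV in bit 0, SPSEL in bit 1) are those of the CONTROL register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CONTROL_NPRIV (1u << 0) /* 0 = privileged, 1 = unprivileged Thread mode */
#define CONTROL_SPSEL (1u << 1) /* 0 = MSP, 1 = PSP (Thread mode only) */

typedef enum { USE_MSP, USE_PSP } active_sp_t;

/* Which stack pointer does SP (R13) refer to for a given mode and CONTROL value? */
static active_sp_t active_stack_pointer(bool handler_mode, uint32_t control) {
    if (handler_mode) {
        return USE_MSP;         /* Handler mode always uses MSP */
    }
    return (control & CONTROL_SPSEL) ? USE_PSP : USE_MSP;
}
```

Note that the privilege bit has no influence on the result; only the mode and SPSEL decide which stack pointer is active.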

📦 Stack Region Mapping:

   STACK_MSP_START
       ↓ (MSP stack grows down)
   --------------------- (MSP part ends, PSP part starts)
       ↓ (PSP stack grows down)
   STACK_PSP_END

MSP and PSP can even share a single 1 KB block of stack memory, split into two parts, but the two stacks must not overlap at runtime.
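
Splitting one block between the two stacks is simple address arithmetic. The sketch below assumes the block's top address and the two sizes are known; `stack_split_t` and `split_stack` are hypothetical names, and the 512/512 split used in the note afterwards is only an example.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t msp_start; /* initial MSP: highest address of the block */
    uint32_t psp_start; /* initial PSP: where the MSP part ends */
    uint32_t psp_end;   /* lowest address the PSP stack may reach */
} stack_split_t;

/* Divide a stack block into an upper MSP part and a lower PSP part. */
static stack_split_t split_stack(uint32_t block_top,
                                 uint32_t msp_bytes, uint32_t psp_bytes) {
    stack_split_t s;
    /* keep both parts 8-byte aligned, as the AAPCS expects at call boundaries */
    assert((msp_bytes % 8u) == 0u && (psp_bytes % 8u) == 0u);
    s.msp_start = block_top;
    s.psp_start = block_top - msp_bytes;
    s.psp_end   = s.psp_start - psp_bytes;
    return s;
}
```

For a 1 KB block ending at 0x20001000 and split evenly, MSP starts at 0x20001000, PSP starts at 0x20000E00, and the PSP stack must not go below 0x20000C00.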

In Simple Terms:

Stack placement can vary, but it always grows down in ARM Cortex-M.
Stack and heap usually grow toward each other so that they can share the free space between them.
ARM Cortex-M uses MSP by default but can switch to PSP in thread mode.
Separating PSP and MSP improves reliability, especially in RTOS or multitasking systems.

Code Example: Switching from MSP to PSP

This code assumes you are writing for a bare-metal ARM Cortex-M environment.

#include <stdint.h>

// Allocate memory for PSP (Process Stack Pointer)
#define PSP_STACK_SIZE  0x100   // 256 bytes
__attribute__((aligned(8))) uint8_t psp_stack[PSP_STACK_SIZE];

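void switch_to_psp(void) {
    // Step 1: Calculate the top address of the PSP stack
    uint32_t psp_top = (uint32_t)(psp_stack + PSP_STACK_SIZE);

    // Step 2: Load the PSP with this address (SP/R13 still refers to MSP here)
    __asm volatile("msr psp, %0" :: "r" (psp_top));

    // Step 3: Change CONTROL register to use PSP in thread mode
    // Set bit 1 of CONTROL register to 1 (use PSP), and bit 0 to 0 (Privileged mode)
    // Note: SP changes in the middle of this function, so this is only safe if the
    // compiler keeps nothing on the stack here; a naked function avoids the problem.
    __asm volatile(
        "mov r0, #2       \n"  // CONTROL.SPSEL = 1
        "msr control, r0  \n"
        "isb              \n"  // Instruction Synchronization Barrier
        ::: "r0", "memory"     // tell the compiler that r0 is overwritten
    );
}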

int main(void) {
    // Initially MSP is used after reset
    // Switch to PSP for thread mode
    switch_to_psp();

    while (1) {
        // Application code using PSP
    }
}

Explanation:

Step	What it Does
`psp_stack`	Allocates memory (256 bytes) for the process stack.
`psp_top`	Points to the top of the stack (stack grows downward).
`msr psp, %0`	Moves the value to the PSP register.
`mov r0, #2`	Sets bit 1 of the CONTROL register to 1 to use PSP.
`msr control, r0`	Applies the new CONTROL settings.
`isb`	Ensures the CPU uses the new stack pointer immediately.

Output (Expected Behavior):

After running switch_to_psp(), your application (thread mode) will use the PSP instead of the default MSP.
Interrupts and exceptions will still use MSP, which is safer and prevents corruption.

1. Physically Two Stack Pointers

Cortex-M processors have two separate stack pointer registers:

MSP (Main Stack Pointer)
PSP (Process Stack Pointer)

These are hardware registers, not just variables in RAM.

2. Main Stack Pointer (MSP)

Default stack pointer after reset
Used by:
- Interrupt handlers (exceptions, faults)
- System-level code
- Thread mode if PSP is not enabled
MSP is automatically loaded from the first word of the vector table on reset.

3. Process Stack Pointer (PSP)

Used only in thread mode
Useful for user-level tasks or application code
Typically used in RTOS-based applications to give each task its own stack

4. Switching Between MSP and PSP

By default, the CPU uses MSP.
You can switch to PSP in Thread mode by setting bit 1 (SPSEL) of the CONTROL register with an MSR instruction.

5. Changing or Accessing Stack Pointers

Use assembly instructions:
- MRS — Read MSP/PSP
- MSR — Write MSP/PSP

Example (in ARM assembly):

MRS R0, MSP   ; Read MSP into R0
MRS R1, PSP   ; Read PSP into R1
MSR MSP, R2   ; Write R2 to MSP

6. Changing Stack Pointer in C Code

Use naked functions (no prologue/epilogue), e.g. in GCC:

__attribute__((naked)) void switch_to_psp(uint32_t pspValue) {
    __asm volatile (
        "MSR PSP, r0       \n"  // Set PSP
        "MOV r0, #0x02     \n"
        "MSR CONTROL, r0   \n"  // Switch to PSP
        "ISB               \n"
        "BX LR             \n"
    );
}

Summary Table

Feature	MSP	PSP
Default After Reset	✅ Yes	❌ No
Used in Handler Mode	✅ Yes	❌ No
Used in Thread Mode	✅ Yes (by default)	✅ Yes (if configured)
Common Usage	System/Interrupt code	Application tasks/RTOS tasks
Switch Using	`CONTROL` register (bit 1)	`CONTROL` register (bit 1)

AAPCS

Imagine you’re building with LEGOs, and your friend is building another part of the same model separately. For your LEGO pieces to fit together perfectly, you both need to follow the same set of instructions, right? The Procedure Call Standard for the Arm Architecture (AAPCS) is like that set of instructions, but for computer programs running on ARM-based devices (like many smartphones and embedded systems).

What is AAPCS and Why Do We Need It?

In simple terms, AAPCS is a standard or a set of rules that defines how different pieces of code, called subroutines or functions, “talk” to each other.

Think about a program as a team working on a project. One team member (a function) might need another team member (another function) to do a specific task. When the first function “calls” the second function, they need a clear agreement on how to pass information, who is responsible for what, and how to hand back the results.

The AAPCS makes sure that functions written by different people, or even compiled by different tools, can work together seamlessly. Without it, it would be like one LEGO builder using round pegs and another using square holes – things just wouldn’t connect!

What Does the AAPCS Define? The Contract Between Functions

The AAPCS sets up a “contract” between the function that makes the call (the caller) and the function that gets called (the callee). This contract includes:

Obligations on the Caller:
- The caller must set things up in a specific way before the called function can start. This includes putting any necessary data (called arguments or parameters) in agreed-upon places (like specific processor “slots” called registers).
Obligations on the Callee (the Called Routine):
- If the called function needs to use certain resources that the caller was also using (like some of those processor “slots” or registers), it must save the caller’s original values before using them.
- Before it finishes, it must restore those saved values so the caller isn’t surprised by unexpected changes.
Rights of the Callee:
- The called function has permission to use certain resources and change certain parts of the program’s state to do its job.

A Peek into the Rules: Registers in ARM

One of the key areas AAPCS defines is how registers are used. Registers are small, super-fast storage locations within the ARM processor.

According to the AAPCS:

Registers R0, R1, R2, R3: These are often used to pass arguments to a function and to return a result from a function. A function can generally modify these registers without needing to save their previous values.
Register R14 (LR – Link Register): This special register holds the “return address” – it tells the function where to go back to in the caller’s code once it’s finished. A function can modify this (for example, if it calls another function itself).
PSR (Program Status Register): This holds information about the current state of the program (like if the last calculation was zero). A function can modify this.
Registers R4 to R11: These are considered “callee-saved” or “preserved” registers. If a function wants to use any of these registers, it must first save their current contents (e.g., push them onto a temporary storage area called the stack). Before the function finishes and returns to the caller, it must restore these registers to their original values. This ensures that the caller’s context is not disturbed.
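
The register roles above can be summarised in a small lookup function. This is a simplified sketch that follows the grouping used in this article; `reg_class_t` and `aapcs_reg_class` are hypothetical names, and R12 is counted with the registers a function may overwrite because the standard treats it as a scratch register.

```c
#include <assert.h>

typedef enum { REG_CALLER_SAVED, REG_CALLEE_SAVED, REG_SPECIAL } reg_class_t;

/* Classify core register R<reg> (0..15) by who must preserve it. */
static reg_class_t aapcs_reg_class(unsigned reg) {
    assert(reg <= 15u);
    if (reg <= 3u || reg == 12u) {
        return REG_CALLER_SAVED;   /* R0-R3, R12: a callee may overwrite them */
    }
    if (reg <= 11u) {
        return REG_CALLEE_SAVED;   /* R4-R11: a callee must restore them */
    }
    return REG_SPECIAL;            /* R13 (SP), R14 (LR), R15 (PC) */
}
```

A compiler applies exactly this kind of classification when it decides which registers to push in a function's prologue.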

Who Follows These Rules?

When a ‘C’ compiler (a tool that translates human-readable C code into machine code for the ARM processor) generates code, it must follow the AAPCS specification. This ensures that the compiled C functions can correctly call other functions, including those that might be part of the operating system or other libraries, which also adhere to AAPCS.

In Summary

The AAPCS is a fundamental agreement that allows different parts of an ARM program to communicate and cooperate effectively. It defines:

How functions call each other.
How data (arguments and return values) is passed.
Which registers can be used freely and which must be preserved.

By having this standard, developers can write modular code, use libraries from different sources, and be confident that everything will work together smoothly on ARM-powered devices. It’s a crucial part of the Application Binary Interface (ABI) that makes the ARM ecosystem robust and interoperable.

Stack Activities During Interrupts and Exceptions

When an interrupt or exception occurs in an ARM Cortex-M processor:

What Gets Automatically Pushed to Stack

The processor automatically saves the following registers on the current stack (usually MSP unless PSP is configured for the thread):

R0 – R3: Argument and temporary registers
R12: Intra-procedure-call scratch register
LR: Link register of the interrupted code
PC: Program counter (the address to return to after the handler)
xPSR: Program status register

This is done to preserve the CPU state before jumping to the interrupt handler.
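
The saved registers sit in memory in a fixed order, lowest address first, so they can be described with a C struct. The sketch below covers the basic eight-word frame only (no floating-point context); `exception_frame_t` and `read_frame` are hypothetical names, and the frame is copied out of a plain word array so the example runs on a host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Basic exception stack frame; the first member is at the lowest address. */
typedef struct {
    uint32_t r0, r1, r2, r3;
    uint32_t r12;
    uint32_t lr;   /* LR of the interrupted code */
    uint32_t pc;   /* address to return to */
    uint32_t xpsr;
} exception_frame_t;

/* Copy the frame that starts at the stack pointer value seen on handler entry. */
static exception_frame_t read_frame(const uint32_t *sp_at_entry) {
    exception_frame_t frame;
    assert(sp_at_entry != NULL);
    memcpy(&frame, sp_at_entry, sizeof frame);
    return frame;
}
```

Fault handlers often use a layout like this to report the address at which the fault happened, which is the stacked PC.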

Why This is Done

This mechanism ensures that when the handler completes and performs an exception return, all the saved registers can be restored by hardware, and the processor can resume execution exactly from where it was interrupted.

Implication for C Handlers

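Because the hardware saves exactly the registers that the AAPCS allows a called function to overwrite, and the compiler preserves the rest (R4–R11), ordinary C functions can safely be used as interrupt handlers without manual assembly code for saving/restoring state.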

Stack Initialization Tips

Proper stack configuration is essential in embedded systems to avoid hard faults, memory corruption, or unpredictable behavior.

1. Estimate Stack Size

Analyze the worst-case stack usage of your application: recursive functions, deep function calls, RTOS tasks, etc.
Use stack usage analysis tools or test under high load.

2. Understand the Stack Growth Model

Full Descending (FD) – Common in ARM (stack grows down, points to valid data)
Others: Full Ascending (FA), Empty Descending (ED), Empty Ascending (EA) — Know which one applies to your CPU.

3. Choose Stack Placement

Decide where to place the stack:
- At end of internal RAM (typical)
- In external memory (SDRAM) if needed
- Avoid placing near heap or globals unless separated by enough guard space.

4. Two-Stage Stack Initialization

In systems with external SDRAM:
- Start with stack in internal RAM
- Initialize SDRAM in startup code or main()
- Then switch stack pointer (MSP) to use SDRAM

5. Vector Table and MSP Initialization

The first word in the vector table must contain the initial value of the Main Stack Pointer (MSP).
On reset, the processor loads this value into MSP by itself, before the first instruction of the startup code runs.
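
What the hardware reads at reset can be mimicked with an array lookup. In this sketch the vector table is an ordinary array of words; `reset_values_t` and `read_reset_values` are hypothetical names, and the second word is the address of the reset handler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t initial_msp;   /* word 0 of the vector table */
    uint32_t reset_handler; /* word 1 of the vector table */
} reset_values_t;

/* Pick out the two words the processor fetches when it comes out of reset. */
static reset_values_t read_reset_values(const uint32_t *vector_table) {
    reset_values_t v;
    assert(vector_table != NULL);
    v.initial_msp   = vector_table[0];
    v.reset_handler = vector_table[1];
    return v;
}
```

If the first word does not point into valid RAM, the very first push in the startup code will fault, so this entry is worth checking when a board does not boot.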

6. Configure Linker Script

Use your linker script to:
- Define stack start and size
- Place symbols like _estack, _stack_size, etc.
The startup file typically places a symbol such as _estack in the first vector table entry, which is how the initial stack pointer value reaches the hardware.

7. RTOS Considerations

RTOS kernel uses:
- MSP for system-level tasks (e.g., ISRs, scheduler)
- PSP (Process Stack Pointer) for user threads
Make sure to switch to PSP in thread mode to separate kernel and user stack.
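
Before a thread can run on its own PSP stack, a kernel usually prepares that stack so it looks as if the thread had just been interrupted. The following is a rough sketch of that idea, not code from any particular RTOS: `task_stack_init` and its parameters are hypothetical, only the eight hardware-saved words are built (real kernels also reserve space for R4–R11), and addresses are passed as 32-bit numbers so the code also compiles on a 64-bit host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define XPSR_THUMB_BIT 0x01000000u /* T bit, must be set on Cortex-M */

/* Build the frame that exception return will pop and return the PSP
 * value to load for this task. `stack_top` points just past the
 * highest word of the task's stack. */
static uint32_t *task_stack_init(uint32_t *stack_top,
                                 uint32_t entry_addr, uint32_t exit_addr) {
    uint32_t *sp = stack_top;
    assert(stack_top != NULL);
    *(--sp) = XPSR_THUMB_BIT; /* xPSR */
    *(--sp) = entry_addr;     /* PC: where the task starts */
    *(--sp) = exit_addr;      /* LR: where the task would return to */
    *(--sp) = 0u;             /* R12 */
    *(--sp) = 0u;             /* R3 */
    *(--sp) = 0u;             /* R2 */
    *(--sp) = 0u;             /* R1 */
    *(--sp) = 0u;             /* R0: first argument to the task */
    return sp;
}
```

The returned pointer is eight words below the top of the task's stack, matching the size of the frame the hardware restores on exception return.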

Summary Table

Topic	Key Point
Registers auto-saved	R0–R3, R12, LR, PC, xPSR
Stack for exceptions	Handled by hardware (MSP/PSP)
Stack placement	Internal RAM or SDRAM
Vector table initialization	First entry = Initial MSP
Linker role	Defines stack boundaries
RTOS stack model	MSP = kernel, PSP = user tasks

Stack Growth

In most embedded systems (especially ARM Cortex-M), the stack grows downward — that is, toward lower memory addresses.

Push (Grow)

When a function is called or data is pushed:

SP (Stack Pointer) is decremented.
New data is stored at the new lower address.

Example:

Suppose the stack starts at 0x20001000:

Address	Value
0x20001000	← Initial SP
0x20000FFC	R0
0x20000FF8	R1
0x20000FF4	R2

Pushing 3 registers one at a time (R0, then R1, then R2) caused the SP to move from 0x20001000 to 0x20000FF4.

Stack Shrink

When a function returns or values are popped:

The SP is incremented.
This frees up the memory that was used.

Pop (Shrink)

Continuing from the example:

Address	Value
0x20001000	← SP after popping all three values

Visual Summary

High Address
+----------+ ← Stack base (initial SP)
|   Data   |
+----------+
|   Data   |
+----------+
|   Data   |
+----------+ ← Stack Top (SP)
   ↑       ← stack shrinks (pops)
   ↓       ← stack grows (pushes)
   (Empty area)
Low Address

Stack grows down (toward low addresses) when pushing.
Stack shrinks up (toward high addresses) when popping.

On ARM Cortex-M:

MSP or PSP is used depending on context (handler mode or thread mode).
Stack pointer is always aligned to word boundaries (4 bytes).
During interrupts, the CPU pushes context automatically (stack grows).
On return from interrupt, the CPU pops context (stack shrinks).

Common Stack Operations in C

Example Function Call:

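void bar(void);    // defined elsewhere

void foo(void) {
    int a = 5;     // local variable -> stored on stack (or kept in a register)
    bar();         // function call -> BL puts the return address in LR
}

What Happens in Stack:

foo saves its own return address (LR) on the stack, because calling bar overwrites LR
Space allocated for a (if it is not kept in a register)
Stack grows
On function return, stack shrinks (SP adjusted back)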

Stack Overflow / Underflow

Overflow: Stack grows beyond its allocated space → corrupts other memory
Underflow: Trying to pop from an empty stack → unpredictable behavior
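
The hardware push and pop described here do not check bounds by themselves, but the two failure cases are easy to show in a software model. In this sketch `guarded_stack_t`, `guarded_push`, and `guarded_pop` are hypothetical names, and the stack is an ordinary array with explicit base and limit pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t *base;  /* one past the highest word (initial SP) */
    uint32_t *limit; /* lowest word that may be used */
    uint32_t *sp;    /* current top of stack */
} guarded_stack_t;

typedef enum { STACK_OK, STACK_OVERFLOW, STACK_UNDERFLOW } stack_status_t;

/* Push one word, refusing to go below the limit. */
static stack_status_t guarded_push(guarded_stack_t *s, uint32_t value) {
    assert(s != NULL);
    if (s->sp == s->limit) {
        return STACK_OVERFLOW;   /* no free word left */
    }
    *(--s->sp) = value;
    return STACK_OK;
}

/* Pop one word, refusing to go above the base. */
static stack_status_t guarded_pop(guarded_stack_t *s, uint32_t *out) {
    assert(s != NULL && out != NULL);
    if (s->sp == s->base) {
        return STACK_UNDERFLOW;  /* nothing was pushed */
    }
    *out = *(s->sp++);
    return STACK_OK;
}
```

With a two-word stack, a pop on the empty stack reports underflow and a third push reports overflow instead of writing outside the array.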

Key Points

Action	Effect on SP	Direction
Push	Decrement	Down
Pop	Increment	Up
Function Call	Push return addr + locals	Stack grows
Return	Pop return addr	Stack shrinks

Raj Kumar

Mr. Raj Kumar is a highly experienced Technical Content Engineer with 7 years of dedicated expertise in the intricate field of embedded systems. At Embedded Prep, Raj is at the forefront of creating and curating high-quality technical content designed to educate and empower aspiring and seasoned professionals in the embedded domain.

Throughout his career, Raj has honed a unique skill set that bridges the gap between deep technical understanding and effective communication. His work encompasses a wide range of educational materials, including in-depth tutorials, practical guides, course modules, and insightful articles focused on embedded hardware and software solutions. He possesses a strong grasp of embedded architectures, microcontrollers, real-time operating systems (RTOS), firmware development, and various communication protocols relevant to the embedded industry.

Raj is adept at collaborating closely with subject matter experts, engineers, and instructional designers to ensure the accuracy, completeness, and pedagogical effectiveness of the content. His meticulous attention to detail and commitment to clarity are instrumental in transforming complex embedded concepts into easily digestible and engaging learning experiences. At Embedded Prep, he plays a crucial role in building a robust knowledge base that helps learners master the complexities of embedded technologies.

embeddedprep.com/
