The biggest reason C has remained the dominant language in embedded development for decades is this: pointers give you direct access to any memory address, and the C standard guarantees that struct members are laid out in memory in the order they appear in the source code. This article decodes how C syntax maps to physical memory, building the foundation for controlling hardware directly.

In Episode 2, we learned where variables are placed depending on how they’re declared — Flash, SRAM, or Stack. Now we focus on how data is arranged within that memory.


📖 Previous Article

#2: Where Variables Live — Flash, RAM, and the Stack

📘 Next Article

#4: The World of Bits — Register Operations and the BSRR Design

📍 Series Index

Full 13-Part Series: The Embedded World Beyond Pointers


✅ What You'll Be Able to Do After This Article

  • Confirm in the debugger Memory view that array elements are laid out contiguously in memory
  • Explain why structs contain padding and what alignment constraints mean
  • Identify cases where sizeof() differs from the sum of member sizes and find the cause
  • Apply the member reordering technique to reduce RAM consumption
  • Explain why volatile is necessary in relation to compiler optimization

Table of Contents

  1. Why C Is Chosen for Embedded Development — One-to-One Variable–Memory Correspondence
  2. Practice: Look Behind C with the Memory View First
  3. The Essence of Arrays: Contiguity in Memory
  4. Struct Padding and Alignment — Why Gaps Appear
  5. What volatile Means: A Warning to the Compiler (Foreshadowing)
  6. Summary — Why C Has Been Embedded’s Common Language for 50 Years

1. Why C Is Chosen for Embedded Development — One-to-One Variable–Memory Correspondence

In PC programming, a variable is an abstract “container for data.” In embedded systems, it’s a specific coordinate in physical memory.

1-1. Why C Is Ideal for Embedded Systems

C dominates embedded development because of these properties:

  • Standard-guaranteed layout: The C standard mandates that struct members are placed in memory in declaration order, making layout predictable.
  • Direct access via pointers: Any memory address can be accessed directly, enabling raw hardware register manipulation.
  • Minimal overhead: Despite being a high-level language, C maps closely to individual CPU instructions.

Understanding this physical concreteness is the first step toward becoming an engineer who controls hardware directly.
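To make "direct access via pointers" concrete, here is a minimal sketch. The helper names reg_write, reg_read, and demo_register_roundtrip are invented for illustration and are not part of any vendor library. An ordinary variable stands in for a hardware register so the code also runs on a PC.

```c
#include <assert.h>
#include <stdint.h>

/* Store a 32-bit value at the location reg points to.
 * volatile asks the compiler to actually perform the store. */
void reg_write(volatile uint32_t *reg, uint32_t value)
{
    *reg = value;
}

/* Load the 32-bit value currently held at that location. */
uint32_t reg_read(const volatile uint32_t *reg)
{
    return *reg;
}

/* Demo with a stand-in: an ordinary variable plays the register. */
uint32_t demo_register_roundtrip(void)
{
    uint32_t fake_register = 0;            /* assumption: not real hardware */
    reg_write(&fake_register, 1u << 5);    /* set bit 5 */
    assert(fake_register == 0x20u);        /* the store really happened */
    return reg_read(&fake_register);
}
```

On the target, the same helpers would receive a pointer built from a fixed address taken from the reference manual instead of &fake_register.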

1-2. Comparison with Other Languages

Language | Direct Pointer Ops | Hardware Access | Execution Speed | Embedded Suitability
C | ◎ Yes | ◎ Direct | ◎ Baseline | ◎ Ideal
C++ | ◎ Yes | ◎ Yes (no unsafe) | ◎ Equal | ○ Equal to C if you avoid virtual functions
Rust | ◎ Yes (unsafe) | ○ Yes (unsafe needed) | ◎ Equal | ○ Growing adoption, safety-focused
Python | ○ Limited | △ Difficult | × Slow | × Not suitable
Assembly | ◎ Full control | ◎ Direct | ◎ Fastest | △ Verbose

C achieves the ideal balance between “direct address manipulation” and “concise notation.” Its strength today is largely historical: decades of proven use and comprehensive microcontroller vendor support.

Why Rust Is Gaining Traction in Embedded: Rust works in no_std environments, and its ownership/borrowing system provides memory safety benefits. However, raw register access and interrupt handling often require unsafe, so low-level understanding remains essential. The learning curve is steep, but adoption is growing in safety-conscious environments.

When C++ Is Used in Embedded: C++ is used in embedded too, but typically in a “C with Classes” style, where class features are used and virtual functions and exceptions are avoided. Within that subset (templates and inline functions included), C++ performs on par with C.


2. Practice: Look Behind C with the Memory View First

This chapter runs the experiment code first so you can see memory, then subsequent chapters explain why things are laid out the way they are.

One term upfront: What is padding? Padding is “alignment filler bytes” that the compiler automatically inserts inside or after a struct. The purpose is to align data at CPU-friendly positions. In our Memory view, the 00 bytes you’ll see are examples of padding (the values themselves are meaningless).

How to proceed (3 steps)

  1. Get the same display showing as in section 2-2
  2. Observe the AA BB CC DD / 00 pattern in section 2-3
  3. Understand why in chapters 3 and beyond

You don’t need to understand everything at first. Just aim to reproduce the same display.

2-1. Preparing the Experiment Code

This experiment uses an STM32F401 series board (STM32F401RE in this article).

Make sure you’ve completed these steps before proceeding:

  1. Build the project in STM32CubeIDE
  2. Flash the microcontroller (STM32F401)
  3. Start in debug mode (F11 or Debug)

The following instructions assume you’re in debug mode with the board paused.

Add the following code to main.c to observe the memory layout:

/* USER CODE BEGIN PTD */
// Struct for observing memory layout
typedef struct {
    uint8_t  id;        // 1 byte
    // 3 bytes of "gap (padding)" should appear here
    uint32_t value;     // 4 bytes
    uint8_t  flag;      // 1 byte
    // 3 more bytes of "gap" should appear here
} __attribute__((aligned(4))) MemoryMapTest_t;
/* USER CODE END PTD */

/* USER CODE BEGIN PV */
// 1. Array: confirm that data is packed with no gaps
volatile uint8_t test_array[4] = {0xAA, 0xBB, 0xCC, 0xDD};

// 2. Struct: observe padding (gaps)
volatile MemoryMapTest_t test_struct = {0x01, 0x12345678, 0xFF};
/* USER CODE END PV */

int main(void) {
    HAL_Init();

    /* USER CODE BEGIN 2 */
    // Pointer variables to make address confirmation easier in Expressions view
    // (no cast needed; a cast to a non-volatile type would drop the volatile qualifier)
    volatile uint8_t* p_array = test_array;
    volatile MemoryMapTest_t* p_struct = &test_struct;
    (void)p_array;
    (void)p_struct;
    /* USER CODE END 2 */

    while (1) {
        HAL_Delay(1000);
    }
}

Code point: the difference between test_array and p_array

Two kinds of variables appear here, each with a different role:

Variable | Type | Role | Contents
test_array | volatile uint8_t[4] | The actual data | Holds the values {0xAA, 0xBB, 0xCC, 0xDD}
p_array | volatile uint8_t* | Records the address | Holds the start address of test_array (e.g., 0x20000000)
test_struct | volatile MemoryMapTest_t | The actual data | Holds the struct values
p_struct | volatile MemoryMapTest_t* | Records the address | Holds the address of test_struct (e.g., 0x20000004)

Why keep a pointer variable for the address?

You can check addresses without them, but having a pointer variable makes it easier to find the address in the debugger:

// ❌ This also shows the address, but...
&test_array   // Add to Expressions view → shows the address

// ✅ The pointer variable is clearer
p_array       // "value of this variable = address of test_array" — explicit

Concrete example: memory layout

Memory Address   Contents
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
0x20000000:    AA BB CC DD      ← test_array[4] stored here
0x20000004:    01 00 00 00 ...  ← test_struct stored here
...
0x2000xxxx:    00 00 00 20      ← p_array itself (its value is the address 0x20000000, stored little-endian)
0x2000yyyy:    04 00 00 20      ← p_struct itself (its value is the address 0x20000004, stored little-endian)

p_array itself lives somewhere in memory, and its value is the address of test_array.

Note on * notation (detailed in Episode 5): The * in volatile uint8_t* declares “this variable stores an address.” The details are covered in the Pointers episode, but for now, think of it as “a box that holds an address.”

What (void)p_array; means: This is a dummy statement that tells the compiler “yes, this variable is being used.” It prevents an “unused variable” warning. This is a common technique when you create a variable purely for debugger observation.
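The relationship between the data and the pointer can also be checked in code. The following is a small sketch that runs on a PC as well as on the target; the function names are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same array as in the experiment code. */
volatile uint8_t test_array[4] = {0xAA, 0xBB, 0xCC, 0xDD};

/* Returns 1 when the value stored in the pointer equals the array's start address. */
int pointer_holds_array_address(void)
{
    volatile uint8_t *p_array = test_array;   /* array name decays to &test_array[0] */
    return (uintptr_t)p_array == (uintptr_t)&test_array[0];
}

/* Reads element i by following the pointer instead of using the array name. */
uint8_t read_through_pointer(size_t i)
{
    volatile uint8_t *p_array = test_array;
    assert(i < sizeof test_array);            /* stay inside the 4 elements */
    return p_array[i];
}

/* The array is 4 bytes of data; the pointer is just one address wide. */
size_t array_size_in_bytes(void)   { return sizeof test_array; }
size_t pointer_size_in_bytes(void) { return sizeof(volatile uint8_t *); }
```

array_size_in_bytes() is always 4, while pointer_size_in_bytes() depends on the platform: 4 on a 32-bit Cortex-M, typically 8 on a 64-bit PC.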

2-2. Debugger Confirmation Steps

This process is the same as in Episode 2. If needed, refer back to #2: Where Variables Live as you go.

The minimum for beginners

  • Add p_array and p_struct and see 0x20000000 and 0x20000004 → success
  • &test_array and &test_struct can wait until you’re comfortable

  1. Start debug: F11 or the “Debug” button
  2. Open Expressions view: Window → Show View → Expressions
  3. Confirm addresses: Add the following expressions:
    • p_array: most important (shows the address of test_array as its value)
    • p_struct: most important (shows the address of test_struct as its value)
    • &test_array: reference (same address as p_array)
    • &test_struct: reference (same address as p_struct)
    • test_array: to expand and see array contents
    • test_struct: to expand and see struct contents
  4. Open Memory view: Window → Show View → Memory
  5. Enter address: Type the address shown in p_array or p_struct’s Value column into the Memory view

Common confusion here

  • What you put into the Memory view is the Value column of p_array / p_struct
  • Expanding test_array / test_struct is for “checking contents” — not for Memory view input
  • If values don’t appear, try Resume → Pause to refresh the display

In the Expressions view, you’ll see something like:

[Figure: Expressions view showing variable addresses]

How to read the display (this order works):

  1. Check p_array’s Value (0x20000000 <test_array>)
  2. Check p_struct’s Value (0x20000004 <test_struct>)
  3. Expand test_array to confirm AA BB CC DD
  4. Expand test_struct to confirm id = 1 / value = 0x12345678 / flag = 0xFF

Quick reference table:

Item to look at | What it tells you
p_array | Start address of test_array
p_struct | Start address of test_struct
test_array (expanded) | Array contents (AA BB CC DD)
test_struct (expanded) | Struct contents (id / value / flag)

Key points:

  • test_array / test_struct are “the contents themselves”
  • p_array / p_struct are “the address where those contents live”
  • In the Memory view, you use the address from p_array / p_struct to go look at the contents

2-3. Observing the Actual Byte Layout in the Memory View

Enter the value of p_array (0x20000000) into the Memory view, and you’ll see the actual data of test_array at that address:

[Figure: Memory view showing byte layout]

The Memory view shows memory contents as hexadecimal bytes. The left edge is the Address, and the data appears in 4-byte groups to the right.

What you can read from this display:

  • Address 20000000: AA BB CC DD — the actual data of test_array
  • 4 bytes later: 01 00 00 00 78 56 34 12 FF 00 00 00 — the actual data of test_struct
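  • In this build the struct begins right after the array, and its members appear in declaration order (with padding between them, examined below). The linker decides where separate variables go, so your addresses may differ.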

Key understanding

  • p_array is a variable recording where test_array lives (address 0x20000000)
  • Entering that address in the Memory view shows test_array’s actual contents (AA BB CC DD)
  • The Memory view is a tool for “seeing what’s at a given address”

Observation order (follow this if lost)

  1. Find AA BB CC DD first (the array)
  2. Confirm 01 00 00 00 ... FF 00 00 00 (the struct with padding)
  3. Last, confirm 78 56 34 12 (little-endian)

Three important observations:

① Array Contiguity

What to check: Are test_array’s 4 bytes laid out with no gaps?

Memory view display (Address 20000000):

Address      +0  +1  +2  +3
20000000:    AA  BB  CC  DD

AA, BB, CC, DD are packed together with no gaps whatsoever.

② Struct Padding

What to check: In test_struct, where are the 00-filled gaps after 01 and FF?

Memory view display (same row at Address 20000000):

The row shows 16 bytes (0x10 bytes):

Address      +0  +1  +2  +3  +4  +5  +6  +7  +8  +9  +A  +B  +C  +D  +E  +F
20000000:    AA  BB  CC  DD  01  00  00  00  78  56  34  12  FF  00  00  00
             ~~~~~~~~~~~~~~  ~~~~~~~~~~~~~~  ~~~~~~~~~~~~~~  ~~~~~~~~~~~~~~
              test_array      id+padding       value(LE)     flag+padding

In detail:

  • +0 to +3: AA BB CC DD → test_array[4]
  • +4: 01 → test_struct.id
  • +5 to +7: 00 00 00 → padding (auto-inserted gap)
  • +8 to +B: 78 56 34 12 → test_struct.value (little-endian for 0x12345678)
  • +C: FF → test_struct.flag
  • +D to +F: 00 00 00 → padding (to make the struct a multiple of 4 bytes)

③ Little-Endian Byte Order

What to check: Is 0x12345678 stored as 78 56 34 12 in reversed byte order?

Look at +8 to +B: 78 56 34 12.

This shows that the 0x12345678 we wrote is stored in memory with reversed byte order. STM32 (ARM Cortex-M) uses little-endian, which means:

  • Low byte at low address
  • High byte at high address

How 0x12345678 is stored (little-endian):

Address   Value    Meaning
+8(+0):   78   ←  Least Significant Byte (LSB)
+9(+1):   56
+A(+2):   34
+B(+3):   12   ←  Most Significant Byte (MSB)

Difference from big-endian: Some CPUs (older Motorola-based architectures, for example) use big-endian (high byte at low address). When exchanging binary data between different CPU architectures, byte order is critical.
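The byte order seen in the Memory view can be reproduced in code. The sketch below is illustrative only (all function names are invented here): byte_in_memory shows how the CPU itself laid out a value, while store_le32 and load_le32 use shifts to produce a fixed little-endian layout regardless of the CPU, which is one common way to exchange binary data safely.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns byte number index (0 = lowest address) of value as laid out in memory. */
uint8_t byte_in_memory(uint32_t value, unsigned index)
{
    uint8_t bytes[sizeof value];
    assert(index < sizeof value);
    memcpy(bytes, &value, sizeof value);   /* copy the raw object bytes */
    return bytes[index];
}

/* 1 on a little-endian CPU (Cortex-M, x86), 0 on a big-endian one. */
int cpu_is_little_endian(void)
{
    return byte_in_memory(0x12345678u, 0) == 0x78;
}

/* Store value as little-endian bytes regardless of the CPU's own byte order. */
void store_le32(uint8_t out[4], uint32_t value)
{
    out[0] = (uint8_t)(value);
    out[1] = (uint8_t)(value >> 8);
    out[2] = (uint8_t)(value >> 16);
    out[3] = (uint8_t)(value >> 24);
}

/* Rebuild the value from little-endian bytes, again independent of the CPU. */
uint32_t load_le32(const uint8_t in[4])
{
    return (uint32_t)in[0]
         | ((uint32_t)in[1] << 8)
         | ((uint32_t)in[2] << 16)
         | ((uint32_t)in[3] << 24);
}
```

Because the shift-based helpers never look at the in-memory representation, a buffer written with store_le32 on one architecture can be read with load_le32 on another.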


Section 2-3 summary:

From the Memory view, we confirmed three things:

  1. Arrays are contiguous with no gaps (AA BB CC DD)
  2. Structs have padding inserted (00 00 00 after 01, and after FF)
  3. Data is stored in little-endian (0x12345678 stored as 78 56 34 12)

✅ Chapter 2 Checklist

  • Confirmed p_array and p_struct values in the Expressions view
  • Found AA BB CC DD in the Memory view
  • Identified the 00 bytes after 01 and after FF as padding
  • Can explain why 0x12345678 appears as 78 56 34 12

3. The Essence of Arrays: Contiguity in Memory

3-1. Arrays = Contiguous Placement with No Gaps

Arrays guarantee that same-type data is laid out in memory with absolutely no gaps between elements. This is strictly defined by the C standard, and it’s the foundation that makes pointer arithmetic possible.

Just remember this: An array is “a region where the same type is packed with no gaps.” That’s why position is determined by start address + index × type size.

The test_array we observed in section 2-3 shows this in the Memory view:

Address      +0  +1  +2  +3
20000000:    AA  BB  CC  DD

3-2. Why Contiguity Matters

Q: Why does “being contiguous” matter?

A: Because you can calculate the address of any element with simple arithmetic.

If test_array[0] is at 0x20000000:

  • test_array[1] = 0x20000000 + (1 × 1 byte) = 0x20000001
  • test_array[2] = 0x20000000 + (2 × 1 byte) = 0x20000002
  • test_array[3] = 0x20000000 + (3 × 1 byte) = 0x20000003

With the formula “start address + index × type size”, any element can be accessed instantly.

Common stumbling point: The address of test_array[0] is the same as the start address of the entire array. When confused, just remember “index 0 = the start.”
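The formula “start address + index × type size” can be written out by hand and compared with what the compiler computes for &array[i]. This is a sketch with invented function names; it assumes a flat address space, which holds on Cortex-M and on typical PCs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* start address + index * element size, computed by hand */
uintptr_t element_address(uintptr_t start, size_t index, size_t element_size)
{
    assert(element_size > 0);
    return start + index * element_size;
}

/* Returns 1 if the hand calculation matches &bytes[i] for every element. */
int formula_matches_compiler(void)
{
    uint8_t bytes[4] = {0xAA, 0xBB, 0xCC, 0xDD};
    for (size_t i = 0; i < 4; i++) {
        uintptr_t by_hand = element_address((uintptr_t)bytes, i, sizeof bytes[0]);
        if (by_hand != (uintptr_t)&bytes[i]) {
            return 0;
        }
    }
    return 1;
}
```

For example, element_address(0x20000000, 3, 1) yields 0x20000003, the address listed above for test_array[3].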

3-3. Confirming Different Type Sizes

Our experiment code uses uint8_t for the array, but looking at test_struct.value from section 2-3, we can see uint32_t occupies 4-byte units.

Revisiting the Memory view:

test_struct starts at +4 of Address 20000000:
+4:    id       (1 byte:  01)
+5~+7: padding  (3 bytes: 00 00 00)
+8~+B: value    (4 bytes: 78 56 34 12)
+C:    flag     (1 byte:  FF)
+D~+F: padding  (3 bytes: 00 00 00)

The type uint32_t tells the compiler to treat the 4 bytes starting here as a single unit. On this little-endian CPU, the byte at the lowest address is the least significant one.

The type size determines how many bytes the address advances per step — that’s C’s fundamental principle.
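A quick way to see this principle is to increment pointers of different types and measure how far the address moved. The three helper functions below are hypothetical names for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static_assert(sizeof(uint32_t) == 4, "this sketch assumes 8-bit bytes");

/* Bytes the address moves when a uint8_t pointer is incremented once. */
size_t step_u8(void)
{
    uint8_t buf[2] = {0};
    uint8_t *p = buf;
    return (size_t)((unsigned char *)(p + 1) - (unsigned char *)p);
}

/* Same measurement for uint16_t. */
size_t step_u16(void)
{
    uint16_t buf[2] = {0};
    uint16_t *p = buf;
    return (size_t)((unsigned char *)(p + 1) - (unsigned char *)p);
}

/* Same measurement for uint32_t. */
size_t step_u32(void)
{
    uint32_t buf[2] = {0};
    uint32_t *p = buf;
    return (size_t)((unsigned char *)(p + 1) - (unsigned char *)p);
}
```

The same p + 1 in the source moves the address by 1, 2, or 4 bytes depending only on the pointed-to type.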

3-4. Why Contiguity Enables High-Speed Processing

This physical contiguity is what lets microcontrollers process large amounts of data quickly and correctly.

DMA (Direct Memory Access) works by issuing the command “transfer this many bytes starting from this address” — no CPU involvement needed. This mechanism fundamentally depends on arrays being contiguous.

DMA is covered in detail in Episode 10: DMA transfers data between memory regions or between memory and peripherals without involving the CPU. Array contiguity makes this high-speed transfer possible.
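The “start address plus byte count” idea can be modelled in plain C. To be clear, this is a CPU-side stand-in written for illustration, not the STM32 DMA API; BlockTransfer_t and run_block_transfer are invented names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical transfer description: where to read, where to write, how many bytes. */
typedef struct {
    const uint8_t *source;
    uint8_t       *destination;
    size_t         byte_count;
} BlockTransfer_t;

/* Software stand-in for what a DMA controller does in hardware:
 * walk both regions byte by byte, relying only on contiguity. */
void run_block_transfer(const BlockTransfer_t *t)
{
    assert(t != NULL);
    for (size_t i = 0; i < t->byte_count; i++) {
        t->destination[i] = t->source[i];
    }
}
```

Nothing in the description says where individual elements are. A start address and a count are enough precisely because the elements are guaranteed to be contiguous.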

✅ Chapter 3 Checklist

  • Can explain why arrays use contiguous placement
  • Can use the start + index × type size calculation
  • Understands why uint8_t and uint32_t advance addresses by different amounts

4. Struct Padding and Alignment — Why Gaps Appear

4-1. Structs = Grouping Different Types

Like MemoryMapTest_t in chapter 2, structs let you group different types of data as a single unit. But in memory, a hidden element called padding appears.

The 00 00 00 bytes we saw in chapter 2 are exactly this padding.

Just remember this: Structs “lay out members in declaration order,” but the compiler may insert gaps (padding) along the way to satisfy the CPU’s alignment requirements.

4-2. The “Wasted Gap” Reality in the Memory View

The memory layout of test_struct from section 2-3:

Address      +0  +1  +2  +3  +4  +5  +6  +7  +8  +9  +A  +B  +C  +D  +E  +F
20000000:    AA  BB  CC  DD  01  00  00  00  78  56  34  12  FF  00  00  00
                             ^^  ~~~~~~~~~~  ~~~~~~~~~~~~~~  ^^  ~~~~~~~~~~
                             id  padding     value (LE)    flag  padding

(test_struct starts at +4 bytes from Address 20000000)

  • id (1 byte): 01
  • Padding (3 bytes): 00 00 00 (auto-inserted gap)
  • value (4 bytes): 78 56 34 12 (little-endian for 0x12345678)
  • flag (1 byte): FF
  • Padding (3 bytes): 00 00 00 (gap to make struct a multiple of 4 bytes)

The actual struct size:

sizeof(MemoryMapTest_t) = 12 bytes
// Breakdown: 1(id) + 3(padding) + 4(value) + 1(flag) + 3(padding) = 12 bytes
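Rather than trusting the arithmetic, you can ask the compiler for the layout with offsetof and sizeof. The sketch below omits the aligned(4) attribute for brevity and assumes an ABI where uint32_t requires 4-byte alignment (true for Cortex-M with GCC and for typical PCs); padding_bytes is a name made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t  id;
    uint32_t value;
    uint8_t  flag;
} MemoryMapTest_t;

/* Compile-time checks: the build fails if the layout ever differs. */
static_assert(offsetof(MemoryMapTest_t, id)    == 0,  "id starts the struct");
static_assert(offsetof(MemoryMapTest_t, value) == 4,  "3 padding bytes precede value");
static_assert(offsetof(MemoryMapTest_t, flag)  == 8,  "flag follows value directly");
static_assert(sizeof(MemoryMapTest_t)          == 12, "3 padding bytes at the end");

/* Padding = total size minus the bytes the members actually need. */
size_t padding_bytes(void)
{
    return sizeof(MemoryMapTest_t)
         - (sizeof(uint8_t) + sizeof(uint32_t) + sizeof(uint8_t));
}
```

Here padding_bytes() evaluates to 6: three bytes after id and three after flag.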

4-3. Why Gaps Form — Alignment Constraints

Q: Why does the compiler deliberately create wasted gaps?

A: To place data at positions the CPU can read/write efficiently.

Common stumbling point: Padding looks “wasteful,” but it’s actually a necessary cost for the CPU to read/write quickly and safely.

32-bit CPUs (like Cortex-M4) have these physical constraints:

Data Type | Size | Recommended Alignment | Reason
uint8_t | 1 byte | Any address | 1-byte read/write possible anywhere
uint16_t | 2 bytes | Multiple of 2 | Fastest at 2-byte boundary
uint32_t | 4 bytes | Multiple of 4 | Fastest at 4-byte boundary

Example:

  • Placing uint32_t at an odd address like 0x20000001 may require 2 read operations (worst case).
  • At a 4-aligned address like 0x20000004, one instruction suffices.

The 24-byte stack frame from Episode 2 is also a result of alignment — management data is packed in 4-byte units following the same rules.

The compiler deliberately inserts padding to place each member at an efficient position.

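Our code uses __attribute__((aligned(4))) to state explicitly that the struct itself is 4-byte aligned, giving more stable observation results. (A struct that contains a uint32_t is normally 4-byte aligned already; the attribute just makes this visible in the source.)

What happens if alignment is violated? On STM32’s Cortex-M4, an unaligned single-word access (e.g., reading a uint32_t from an address that is not a multiple of 4) normally doesn’t crash the program, but it costs extra clock cycles. It is not always harmless, though: multi-word instructions such as LDM/STM and LDRD/STRD fault on unaligned addresses, and trapping of unaligned accesses can be enabled with the UNALIGN_TRP bit in the Configuration and Control Register. Other cores, including Cortex-M0/M0+ and some older ARM cores, raise a hardware exception (fault) on any unaligned access.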
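When data really does arrive at an arbitrary address (for example, a field inside a received byte buffer), a common portable approach is to copy it through memcpy instead of casting the pointer. This is one widely used technique, not something the original experiment code does; the function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a uint32_t from any address, aligned or not.
 * memcpy lets the compiler pick instructions that are legal on the target. */
uint32_t read_u32_any_alignment(const void *address)
{
    uint32_t value;
    assert(address != NULL);
    memcpy(&value, address, sizeof value);
    return value;
}

/* Write a uint32_t to any address, aligned or not. */
void write_u32_any_alignment(void *address, uint32_t value)
{
    assert(address != NULL);
    memcpy(address, &value, sizeof value);
}
```

With optimization enabled, compilers usually turn these small fixed-size copies into a plain load or store when the target permits it, so the safety typically costs little.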

4-4. RAM-Saving Technique — Reordering Members

Simply arranging struct members from largest to smallest can minimize padding and save RAM.

Before (12 bytes):

typedef struct {
  uint8_t  id;        // 1 byte
  // ★ 3 bytes of padding
  uint32_t value;     // 4 bytes
  uint8_t  flag;      // 1 byte
  // ★ 3 bytes of padding
} MemoryMapTest_t;  // Total: 12 bytes

After (8 bytes):

typedef struct {
  uint32_t value;     // 4 bytes (largest first)
  uint8_t  id;        // 1 byte
  uint8_t  flag;      // 1 byte
  // ★ 2 bytes of padding (to make struct a multiple of 4)
} MemoryMapTest_Optimized_t;  // Total: 8 bytes

Memory layout comparison:

// Before (12 bytes)
Address      +0  +1  +2  +3  +4  +5  +6  +7  +8  +9  +A  +B
2000xxxx:    01  00  00  00  78  56  34  12  FF  00  00  00
             id  padding      value          flag padding

// After (8 bytes) — members reordered
Address      +0  +1  +2  +3  +4  +5  +6  +7
2000xxxx:    78  56  34  12  01  FF  00  00
             value          id  flag padding

This saves 4 bytes per struct. In resource-constrained embedded systems, this kind of optimization is essential knowledge.

💡 Practical tip: With a struct used many times (e.g., an array of 1000 structs), 4 bytes per struct means 4 × 1000 = 4,000 bytes ≈ 4 KB difference. On a microcontroller with only 64 KB of RAM, that’s significant.
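The saving can be computed by the compiler itself. The sketch below declares both versions of the struct and multiplies the size difference by an element count; ram_saved_bytes is an invented helper, and the sizes assume 4-byte alignment for uint32_t as on Cortex-M.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Original order: 1 + 3 (pad) + 4 + 1 + 3 (pad) */
typedef struct {
    uint8_t  id;
    uint32_t value;
    uint8_t  flag;
} MemoryMapTest_t;

/* Largest member first: 4 + 1 + 1 + 2 (pad) */
typedef struct {
    uint32_t value;
    uint8_t  id;
    uint8_t  flag;
} MemoryMapTest_Optimized_t;

static_assert(sizeof(MemoryMapTest_t) == 12, "original layout");
static_assert(sizeof(MemoryMapTest_Optimized_t) == 8, "reordered layout");

/* RAM saved when an array of element_count structs uses the reordered type. */
size_t ram_saved_bytes(size_t element_count)
{
    return element_count
         * (sizeof(MemoryMapTest_t) - sizeof(MemoryMapTest_Optimized_t));
}
```

Keeping the static_assert lines in a real project is a cheap way to notice when a later edit to the struct reintroduces padding.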

✅ Chapter 4 Checklist

  • Can explain why 3 bytes of gap appear after id
  • Can give the breakdown of sizeof(MemoryMapTest_t) = 12
  • Understands why changing member order reduces size

5. What volatile Means: A Warning to the Compiler (Foreshadowing)

The experiment code in chapter 2 used volatile in the variable declarations as a prerequisite for observation. Let’s clarify what it means.

Just remember this: volatile means “please read from memory every time.” Use it for variables that hardware or interrupts might change.

5-1. What Is Compiler Optimization?

Normally, the compiler decides: “this variable’s value hasn’t changed in the program, so I’ll reuse the register value instead of reading memory again.”

Example: code that might be optimized

int counter = 0;
counter++;
counter++;
int result = counter;  // ← compiler might decide to just write "2" directly

This optimization is a good feature in normal programs — it speeds up execution.

5-2. The Embedded Problem: Changes from “Outside”

In embedded systems, however, values frequently change from “outside” the program (hardware or interrupts).

Example: GPIO input register (covered in detail in Episode 4)

uint32_t* gpio_input = (uint32_t*)0x40020010;  // GPIO input register
uint32_t value1 = *gpio_input;  // First read: button not pressed → pin bit is 0
// ← button gets pressed here
uint32_t value2 = *gpio_input;  // Second read: button pressed → pin bit is 1

If the compiler optimizes this, it might decide: “reading the same address twice, I’ll just reuse the first read.” In that case value2 would still hold the old value even though the button was pressed.

5-3. The Role of volatile: Suppressing Optimization

The role of volatile: it tells the compiler “this variable might change at any moment without warning, so read it from memory every single time.”

volatile uint32_t* gpio_input = (volatile uint32_t*)0x40020010;
uint32_t value1 = *gpio_input;  // always reads from memory
uint32_t value2 = *gpio_input;  // reads again (not optimized away)

Common stumbling point: volatile does not guarantee thread safety, atomicity, or mutual exclusion. Its role is purely “prevent the read/write from being optimized away.”
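A typical place where volatile is needed is a flag shared between an interrupt handler and the main loop. The following sketch simulates that pattern on a PC by calling the handler by hand. example_irq_handler is a hypothetical name, not a real vector from the STM32 startup file.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Written by the interrupt handler, read by the main loop. */
static volatile bool     data_ready  = false;
static volatile uint32_t event_count = 0;

/* Hypothetical handler: on real hardware the CPU would call this on an interrupt. */
void example_irq_handler(void)
{
    event_count = event_count + 1;
    data_ready  = true;
}

/* One pass of the main loop: reports a pending event and clears the flag. */
bool poll_data_ready(void)
{
    if (!data_ready) {
        return false;
    }
    data_ready = false;
    return true;
}

uint32_t events_seen(void)
{
    return event_count;
}

/* Walk-through: calling the handler by hand plays the role of the interrupt. */
void demo_sequence(void)
{
    assert(!poll_data_ready());   /* nothing has happened yet */
    example_irq_handler();        /* the "interrupt" fires */
    assert(poll_data_ready());    /* the main loop notices it */
    assert(!poll_data_ready());   /* and the flag was cleared */
}
```

Without volatile, an optimizing compiler could legally read data_ready once and never look again inside a polling loop. Note that volatile alone does not make the check-and-clear in poll_data_ready atomic; where that matters, interrupts are typically masked briefly or an atomic operation is used.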

5-4. When to Use volatile

Case | Reason | Example
Hardware registers | Hardware changes the value | GPIO, UART, timer registers
Variables shared with interrupts | Interrupt handler changes the value | Flags, counters
Multi-threaded environments | Another thread changes the value (volatile alone is not enough; combine it with the RTOS’s synchronization primitives) | Shared variables with RTOS
Debugger observation | Prevent compiler from eliminating the variable | Our experiment code

Important note: Excessive volatile inhibits optimization and reduces execution speed. Using it only where necessary is the professional approach.

Why this matters will become clear in Episode 4’s “Register Operations.”

✅ Chapter 5 Checklist

  • Can list 3 or more situations where volatile is needed
  • Can explain that the purpose is “suppression of optimization”
  • Can explain why not to make everything volatile

6. Summary — Why C Has Been Embedded’s Common Language for 50 Years

This article explored how C variables, arrays, and structs are physically arranged in memory, confirmed with real addresses.

6-1. Review of Key Points

  • Variable = Memory: Every C variable that is stored in memory occupies a specific physical address. This is why C is called “close to the hardware.”

  • Array = Contiguous: Array elements are laid out with no gaps. This contiguity enables index-based address calculation and high-speed DMA transfers.

  • Struct = Layout: Invisible “padding” is inserted by the compiler to satisfy the CPU’s alignment requirements. Smart member ordering can save RAM.

  • The Role of volatile: An essential keyword for hardware control that forces every read and write to actually reach memory. Use it only where necessary.

6-2. The Fundamental Reason C Keeps Being Chosen

What we learned today relates to why C has been used in embedded development for over 50 years:

  1. Standard-guaranteed layout: The C standard mandates member ordering, making layout predictable
  2. Language simplicity: Pointer access to any address with no complex abstraction layer
  3. Experience and track record: Decades of use, comprehensive microcontroller vendor support, enormous reference base
  4. Controllability: Fine-grained control of compiler behavior through keywords like volatile

These properties perfectly match the embedded system requirement of “reliable operation with limited resources.”

6-3. Looking Ahead

Episode 4 finally dives into hardware register manipulation. The knowledge of “addresses,” “volatile,” and “struct layout” from this article all connect there.

What you’ll learn:

  • What are peripheral registers (GPIO, UART, timers)?
  • Why does writing a value to a specific address make hardware respond?
  • Where volatile really shines
  • Real examples of register definitions using structs

Armed with memory map knowledge, let’s step into the world of directly controlling hardware.


References

Official Documentation

Technical specifications in this article are based on STMicroelectronics official documentation:

C Language Standard

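  • ISO/IEC 9899 (C Language Standard): Array contiguity, struct member ordering, and the semantics of volatile are specified by the C standard. The exact amount of padding and the alignment of each type are left to the implementation.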

📚 Next Episode

Episode 4: "The World of Bits — Register Operations and the BSRR Design"
Registers are collections of bits. Mask operations, shift operations, RMW hazards, and the elegant BSRR design.
