Master Structure and Union Advanced-Level Interview Questions in C (2026)

On: December 8, 2025
Structure and union Advanced Level interview Questions

Master advanced-level structure and union interview questions: memory layout, padding, bitfields, nested structs, and optimization techniques in C.

Imagine you are sitting in a high-stakes technical interview for a senior embedded systems position. The interviewer slides a piece of paper across the table with a seemingly simple question:

“Tell me how much memory this structure or union would occupy, and why?”

At first glance, it looks easy. You’ve been working with C for years. But as you dig deeper, you realize this isn’t just about knowing sizeof(). The question is testing your understanding of memory alignment, padding, bitfields, flexible arrays, nested structures, and efficient memory usage.

In real-world systems, especially in embedded devices with limited memory, every byte counts. Misunderstanding how structures and unions are laid out can lead to memory wastage, subtle bugs, and performance issues.

This advanced-level interview set will take you step by step through challenging scenarios, from nested unions, bitfields, structure packing, and flexible arrays to memory optimization techniques, so you can confidently tackle any tricky structure or union question an interviewer throws your way.

By the end of this guide, you won’t just know the theory; you’ll be able to analyze, calculate, and optimize memory layouts in C like a true expert.

If you’re learning C programming and want to strengthen your fundamentals before practicing Structures and Unions Interview Questions, you should explore this detailed beginner-friendly guide on structures: Structures in C – Complete Guide
It explains structure syntax, memory layout, padding, nesting, and real-world examples in a very simple way. Reading this will give you a strong foundation before jumping into advanced interview questions.

Calculating the size of a structure manually in C/C++ involves understanding data type sizes, alignment, and padding. Let me explain step by step with clarity.

1. Check sizes of individual members

Each data type has a size (on most 32/64-bit systems):

Data Type    Typical Size
char         1 byte
short        2 bytes
int          4 bytes
float        4 bytes
double       8 bytes
pointer      4 or 8 bytes (32-bit or 64-bit system)

2. Understand alignment rules

  • Each data member is placed at an offset that is a multiple of its alignment requirement. For scalar types this is usually the type’s size, capped by any packing limit set with #pragma pack or a similar option.
  • Alignment ensures efficient memory access.
  • For example:
    • int (4 bytes) should be at an address divisible by 4.
    • double (8 bytes) should be at an address divisible by 8.

3. Add padding between members

  • If a member’s natural alignment requires it to start at a certain boundary, but the previous member ends at a different boundary, padding bytes are added.
  • Padding is inserted between members and sometimes at the end of the structure so that the structure’s total size is a multiple of the strictest alignment requirement among its members.

4. Step-by-step example

struct Example {
    char c;     // 1 byte
    int i;      // 4 bytes
    short s;    // 2 bytes
};
  • Step 1: char c → starts at offset 0 → occupies 1 byte.
    Next offset: 1
  • Step 2: int i → needs 4-byte alignment → offset must be multiple of 4 → add 3 bytes padding → offset becomes 4.
    i occupies offsets 4,5,6,7.
  • Step 3: short s → needs 2-byte alignment → next offset is 8 → already aligned → occupies offsets 8,9.
  • Step 4: Struct size must be multiple of largest alignment (here, 4) → add 2 bytes padding at end → total size = 12 bytes.
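The offsets worked out above can be checked with offsetof from <stddef.h>. This is a minimal sketch, assuming a typical ABI where int is 4 bytes with 4-byte alignment and short is 2 bytes; the helper name print_example_layout is only for illustration.

```c
#include <stddef.h>   /* offsetof */
#include <stdio.h>

struct Example {
    char c;
    int i;
    short s;
};

/* Hypothetical helper: prints where the compiler actually placed each member. */
void print_example_layout(void)
{
    printf("offset of c = %zu\n", offsetof(struct Example, c));
    printf("offset of i = %zu\n", offsetof(struct Example, i));
    printf("offset of s = %zu\n", offsetof(struct Example, s));
    printf("sizeof      = %zu\n", sizeof(struct Example));
}
```

Calling print_example_layout() from main on such a platform should report offsets 0, 4, and 8 and a size of 12. On an ABI with different type sizes the numbers change, which is exactly why it is worth printing them.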

5. General formula

struct_size = sum of (member sizes + padding between members) + padding at end
  • End padding ensures that an array of this struct has all elements correctly aligned.

Tip

  • To check in code: sizeof(struct Example)
  • To reduce size: reorder members from largest to smallest to minimize padding.

Calculating the size of a union is much simpler than a structure because of how unions work. Let me explain step by step.

1. Recall how a union works

  • In a union, all members share the same memory location.
  • Only one member can hold a value at a time.
  • Therefore, the size of a union is determined by:
    1. The largest member size, and
    2. The alignment requirements (padding at the end to satisfy the alignment).

2. Union size rule

union_size = maximum(size of all members) rounded up to nearest multiple of the largest alignment requirement

3. Example 1

union MyUnion {
    char c;     // 1 byte
    int i;      // 4 bytes
    double d;   // 8 bytes
};
  • Step 1: Find the size of each member:
    • char c = 1
    • int i = 4
    • double d = 8
  • Step 2: Take the largest size → 8 bytes (from double).
  • Step 3: Check alignment:
    • Largest alignment is usually the largest member’s alignment (double → 8 bytes).
    • 8 bytes is already a multiple of 8 → no extra padding needed.

Union size = 8 bytes

4. Example 2 (no extra padding needed)

union MyUnion2 {
    char c;     // 1 byte
    short s;    // 2 bytes
    int i;      // 4 bytes
};
  • Member sizes: char=1, short=2, int=4
  • Maximum size = 4 bytes
  • Maximum alignment = int → 4 bytes
  • Total size must be a multiple of alignment → 4 bytes
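In both examples above the largest member already satisfies the alignment, so no end padding appears. A union does get end padding when its largest member is not a multiple of the strictest alignment. The sketch below assumes a 4-byte int with 4-byte alignment; union Padded and the helper name are made up for illustration.

```c
#include <stdio.h>

/* Largest member is 5 bytes, but int typically requires 4-byte alignment,
   so the size is rounded up from 5 to the next multiple of 4. */
union Padded {
    char text[5];
    int  i;
};

/* Hypothetical helper: prints the size the compiler chose. */
void print_padded_union_size(void)
{
    printf("sizeof(union Padded) = %zu\n", sizeof(union Padded));
}
```

On such a platform the union occupies 8 bytes: 5 for the array plus 3 bytes of end padding.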

5. Key differences from struct

Feature               Struct                      Union
Memory allocation     Sum of members + padding    Size of largest member + padding
Members can coexist   Yes                         No (share same memory)
Alignment/padding     Between members + end       Only end alignment matters

So, manual calculation of union size comes down to three steps: find the largest member, check the strictest alignment, and add end padding if needed.

The maximum alignment requirement in a structure is determined by the member with the strictest (largest) alignment requirement inside that structure.

Definition

The maximum alignment of a structure = the largest alignment requirement among all its members.

In simple terms:

  • Every data type has a natural alignment (e.g., int → 4 bytes, double → 8 bytes).
  • The structure’s alignment must match the member that needs the highest alignment, so all members stay properly aligned in memory.

Why does this matter?

  • The structure’s total size must be a multiple of this maximum alignment.
  • Compilers may add end padding to satisfy this requirement.
  • It ensures elements of an array of that structure remain correctly aligned.

Example

struct Example {
    char c;       // alignment 1
    int i;        // alignment 4
    double d;     // alignment 8
};

Alignment requirements:

  • char → 1 byte
  • int → 4 bytes
  • double → 8 bytes

Maximum alignment = 8 bytes
So the struct must be aligned to 8 bytes, and its total size will be rounded up to a multiple of 8.
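Since C11 the alignment of any type can be queried directly with alignof from <stdalign.h>. The sketch below is one way to confirm the reasoning above; it assumes a C11 compiler, and print_alignment is an illustrative name.

```c
#include <stdalign.h>  /* alignof (C11) */
#include <stdio.h>

struct Example {
    char c;
    int i;
    double d;
};

/* Hypothetical helper: prints member alignments next to the struct's own. */
void print_alignment(void)
{
    printf("alignof(char)           = %zu\n", alignof(char));
    printf("alignof(int)            = %zu\n", alignof(int));
    printf("alignof(double)         = %zu\n", alignof(double));
    printf("alignof(struct Example) = %zu\n", alignof(struct Example));
    printf("sizeof(struct Example)  = %zu\n", sizeof(struct Example));
}
```

Whatever the platform, sizeof(struct Example) is always a multiple of alignof(struct Example). On a typical 64-bit system the struct alignment is 8 and the size is 16.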

Key Takeaway

The member with the strictest alignment decides the alignment of the whole structure. Rearranging members does not change that alignment, but it can dramatically reduce padding and therefore the total size.


Using unions in embedded systems offers several advantages, especially when memory, performance, and hardware interaction matter. Here are the key benefits:

Advantages of Using Unions in Embedded Systems

1. Efficient Memory Usage (Saves RAM & Flash)

  • A union allocates memory equal to only its largest member.
  • All other members reuse the same memory space.
  • This is extremely useful in microcontrollers where RAM is limited (e.g., 2KB, 4KB).

Example:
Instead of storing different sensor data types separately, a union lets you reuse the same memory.

2. Easy Interpretation of the Same Data in Multiple Formats

  • A union allows you to treat the same data bytes as:
    • an integer
    • a float
    • a byte array
    • a bitfield
    • or protocol frames

This is very important in:

  • communication stacks (CAN, LIN, UART, SPI)
  • protocol parsing
  • sensor data conversion

Example: Reading 4 bytes from UART and interpreting them as either float or uint32_t.

3. Useful for Register Mapping in Microcontrollers

Hardware registers often need:

  • bit-level access
  • byte-level access
  • full-word access

Unions allow creating register definitions like:

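typedef union {
    uint32_t value;        // full-word access
    struct {
        uint8_t low;       // byte at the lowest address (the LSB only on little-endian)
        uint8_t mid;
        uint8_t high;
        uint8_t control;
    } bytes;               // byte-level access
} Reg32;                   // Reg32 is an illustrative name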

This is common in:

  • STM32
  • AVR
  • PIC
  • TI and NXP SoCs

4. Simplifies Communication Protocol Handling

Unions help pack and unpack data frames easily.

Example use cases:

  • CAN frames
  • Modbus frames
  • BLE packets
  • I2C/SPI buffers

You can overlay structs on top of raw byte arrays without copying data, provided the struct layout (padding and byte order) matches the wire format exactly.

5. Speeds Up Data Conversion (No Need for memcpy())

In embedded systems, performance is critical.

A union allows:

  • accessing the same memory as different types
  • without extra memory copy
  • without extra overhead

This reduces CPU cycles and increases real-time performance.

6. Helps Create Memory-Efficient State Machines

Unions can store different state data types in the same memory region.
Useful when only one state is active at a time.

Example:

union {
    struct IdleState idle;
    struct TxState   tx;
    struct RxState   rx;
} stateMachine;

Summary

Using unions in embedded systems provides:

  • Memory efficiency
  • Fast data interpretation without copying
  • Clean hardware register mapping
  • Efficient protocol parsing
  • Better performance
  • More compact embedded designs

Type punning using a union is a technique in C where the same memory location is interpreted as different data types by using different members of a union.

This allows you to “reinterpret” the underlying bits of one type as another type without copying data.

Definition

Type punning through a union means writing a value to one member of a union and reading it from another member of the same union.

This works because:

  • All union members share the same memory space.

Simple Example of Type Punning

#include <inttypes.h>   // PRIu32
#include <stdint.h>
#include <stdio.h>

union Pun {
    float f;
    uint32_t i;
};

union Pun p;
p.f = 3.14f;                    // store float
printf("%" PRIu32 "\n", p.i);   // read the same bytes as an unsigned integer

Here:

  • p.f stores the float value in memory.
  • p.i reads the exact same 4 bytes but interprets them as an integer.

This is type punning: treating the same bytes as different types.
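Wrapped in a function, the same idea gives a small tool for inspecting how a float is encoded. This is a sketch only: it assumes float is a 32-bit IEEE 754 value, and float_bits_via_union is a made-up name.

```c
#include <stdint.h>

/* The pun only makes sense if both members have the same size. */
_Static_assert(sizeof(float) == sizeof(uint32_t),
               "this sketch assumes a 32-bit float");

union Pun {
    float f;
    uint32_t i;
};

/* Returns the raw bit pattern of a float by writing one union member
   and reading the other. */
uint32_t float_bits_via_union(float value)
{
    union Pun p;
    p.f = value;
    return p.i;
}
```

On an IEEE 754 platform, float_bits_via_union(1.0f) returns 0x3F800000: sign 0, biased exponent 127, mantissa 0.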

Why Use Type Punning?

Type punning is very useful in embedded systems for:

1. Inspecting raw binary representation

Example: Seeing how a float is stored in IEEE 754 format.

2. Fast conversion without memcpy()

You can avoid costly data copying and simply reinterpret bits.

3. Hardware register access

Bitfields + raw register value access in the same memory.

4. Communication protocol parsing

Same bytes interpreted as:

  • struct
  • frame header
  • raw bytes

5. Sensor or DSP data interpretation

E.g., interpreting 4 bytes from a sensor as:

  • float
  • int32_t
  • unsigned char buffer[4]

Important: Is it Always Legal?

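In C89, reading a union member other than the one last written was implementation-defined, which left the technique on uncertain ground.

Since C99 (as clarified by Technical Corrigendum 3), and in C11 and later, it is permitted: the bytes of the stored value are reinterpreted as an object of the type being read.

The result still depends on the object representations involved, and it can be a trap representation if the bytes do not form a valid value of the new type. In C++, reading an inactive union member is undefined behavior.

Nearly all embedded compilers (GCC, Clang, ARMCC, IAR) support this in C.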

Takeaway

Type punning using unions is a powerful technique where you reinterpret the same bytes in memory as different data types, commonly used in embedded systems for:

  • protocol decoding
  • register manipulation
  • performance optimization
  • raw data conversion

Endianness has a direct and important effect on how data inside a union is interpreted because unions allow viewing the same memory bytes as different types.

Here’s a clear breakdown:

Effect of Endianness in Unions

Endianness determines how multi-byte data types (like int, float, double, structs) are stored in memory, and since all members of a union share the same memory, the interpretation changes based on byte order.

1. Little Endian vs Big Endian Basics

Little Endian (LE)

  • Least significant byte (LSB) stored at the lowest memory address
  • Common in: x86, ARM (little endian mode)

Big Endian (BE)

  • Most significant byte (MSB) stored at the lowest memory address
  • Found in: some network processors, DSPs, PowerPC

2. How Endianness Affects a Union

Because a union overlays members in the same memory, reading data from another member depends entirely on the byte order inside that memory.

Example to Understand Clearly

union U {
    uint32_t num;
    uint8_t bytes[4];
};

union U u;
u.num = 0x11223344;

On Little Endian:

Memory layout (low → high):

44 33 22 11

bytes[0] = 0x44
bytes[1] = 0x33
bytes[2] = 0x22
bytes[3] = 0x11

On Big Endian:

Memory layout:

11 22 33 44

bytes[0] = 0x11
bytes[1] = 0x22
bytes[2] = 0x33
bytes[3] = 0x44

What This Means

A union’s behavior changes across processors with different endianness.

So:

  • Protocol parsing
  • Register mapping
  • Type punning
  • Byte extraction
  • Casting between integer/float/struct

…will produce different results depending on CPU endianness.
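The same union can be turned into a runtime endianness probe. A minimal sketch, assuming the platform is either little-endian or big-endian (mixed-endian machines are ignored); the names EndianProbe and is_little_endian are illustrative.

```c
#include <stdint.h>

union EndianProbe {
    uint32_t num;
    uint8_t bytes[4];
};

/* Returns 1 if the least significant byte is stored at the lowest
   address (little-endian), 0 otherwise. */
int is_little_endian(void)
{
    union EndianProbe u;
    u.num = 0x11223344u;
    return u.bytes[0] == 0x44;
}
```

Code that must behave identically on both kinds of CPU can branch on this once at startup, although writing byte-order-independent code is usually the cleaner option.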

3. Why It Matters in Embedded Systems

Endianness impacts:

  • Communication protocols (CAN, UART, I2C, Ethernet)
  • CRC calculations
  • Sensor data interpretation
  • Memory-mapped register access
  • Flash/EEPROM data formats
  • Cross-platform firmware portability

A union used for type punning is NOT portable across:

  • different processors
  • compilers
  • architectures

Unless endianness is accounted for manually.

4. Important Point

Unions do NOT fix or convert endianness.

They simply reveal the byte order as the platform stores it.

If you need consistent byte order across devices, you must handle it manually using:

  • bit shifting
  • htonl/ntohl (network byte order)
  • custom byte-swap functions
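Bit shifting is the most portable of these options because it works on values rather than on memory layout. The sketch below stores and loads a 32-bit value in big-endian (network) order regardless of the host CPU; put_u32_be and get_u32_be are hypothetical helper names.

```c
#include <stdint.h>

/* Writes v into out[0..3] with the most significant byte first. */
void put_u32_be(uint8_t out[4], uint32_t v)
{
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
}

/* Rebuilds the value from four bytes stored most significant byte first. */
uint32_t get_u32_be(const uint8_t in[4])
{
    return ((uint32_t)in[0] << 24) |
           ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |
            (uint32_t)in[3];
}
```

Because shifts operate on the numeric value, put_u32_be(buf, 0x11223344) always produces the bytes 11 22 33 44, on little-endian and big-endian hosts alike.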

Takeaway

Endianness determines how the bytes inside a union are laid out, so interpreting the same memory through different union members (type punning) will produce different results on different architectures.

Tagged unions (also called discriminated unions, variant records, or sum types) are a programming technique where a union is combined with a tag (identifier) that indicates which member of the union is currently valid.

In simple terms:

A tagged union = union + tag variable
The tag tells you which member is active, making the union safe to use.

Why Tagged Unions?

A normal C union is unsafe because:

  • You might read from a member that wasn’t written.
  • There is no built-in way to know which member currently holds valid data.

A tagged union solves this problem by storing:

  1. The actual data (in a union)
  2. An enum tag describing the data type inside the union

Example: Tagged Union in C

typedef enum {
    TYPE_INT,
    TYPE_FLOAT,
    TYPE_STRING
} DataType;

typedef struct {
    DataType type;    // The tag
    union {
        int i;
        float f;
        char *s;
    } data;           // The union
} TaggedValue;

Usage:

TaggedValue v;
v.type = TYPE_FLOAT;
v.data.f = 3.14;

if (v.type == TYPE_FLOAT) {
    printf("Value = %f\n", v.data.f);
}

The tag (TYPE_FLOAT) tells you which union member contains valid data.
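In practice the tag check is usually centralized in a switch so that every access goes through it. The following sketch reuses the types above; the function name tagged_value_format and its return convention are assumptions made for illustration.

```c
#include <stdio.h>

typedef enum {
    TYPE_INT,
    TYPE_FLOAT,
    TYPE_STRING
} DataType;

typedef struct {
    DataType type;    /* the tag */
    union {
        int i;
        float f;
        char *s;
    } data;           /* the union */
} TaggedValue;

/* Formats whichever member the tag says is active into buf.
   Returns the snprintf result, or -1 if the tag is not recognized. */
int tagged_value_format(const TaggedValue *v, char *buf, size_t size)
{
    switch (v->type) {
    case TYPE_INT:    return snprintf(buf, size, "%d", v->data.i);
    case TYPE_FLOAT:  return snprintf(buf, size, "%.2f", (double)v->data.f);
    case TYPE_STRING: return snprintf(buf, size, "%s", v->data.s);
    }
    return -1;
}
```

Keeping all reads inside one function like this means a new tag value only has to be handled in one place, and many compilers can warn when a switch over an enum misses a case.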

Where Tagged Unions Are Used?

1. Inter-process communication (IPC)

Different message types in OS kernels (QNX, Linux, RTOS).

2. Communication protocols

Different packet formats inside a single frame.

3. AST nodes in compilers

Each node type stores different structures.

4. Variant data types

Like dynamically-typed values:
JSON, XML, scripting languages.

5. State machines

Different state data stored in one union.

6. Event-driven systems

A single event buffer holds different event types.

Benefits

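  • Helps prevent reading a member that was never written, as long as every access checks the tag
  • Makes union access safer by convention (C itself does not enforce the check)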
  • Easy to maintain and debug
  • Perfect for embedded protocols/state machines
  • Reduces memory usage while remaining safe

Without a Tag → Dangerous

Plain unions are unsafe because:

u.i = 10;
printf("%f", u.f);  // reinterprets the int's bytes as a float: prints a meaningless value, not 10

Tagged unions reduce this hazard, provided the code checks the tag before every access; C itself does not enforce the check.

Summary

Tagged union = union (stores one of many data types) + tag (enum) (tells which data type is stored)

They are memory-efficient, safe, and heavily used in systems programming, embedded systems, and compiler design.

A self-referential structure in C is a structure that contains a pointer to another structure of the same type.
It is not allowed for a structure to contain an actual instance of itself, but it can contain a pointer to itself.

This is the foundation for linked lists, trees, stacks, queues, and other dynamic data structures.

Definition

A self-referential structure is a structure that includes one or more pointers that point to the same structure type.

Example

struct Node {
    int data;
    struct Node *next;   // pointer to another Node
};

Here:

  • struct Node contains an integer data.
  • next is a pointer to another node of the same structure type.

This enables chaining many nodes together dynamically.

Why a Structure Cannot Contain Itself Directly?

This is invalid:

struct Node {
    int data;
    struct Node next; // ❌ Error
};

Because:

  • The compiler needs to know the complete size of the structure.
  • A structure type is incomplete until its closing brace, so a member of that same type has no known size yet. Allowing it would mean infinite recursion → endless memory size.

But a pointer size is fixed (usually 4 or 8 bytes), so this is legal:

struct Node {
    int data;
    struct Node *next; // ✔ valid
};

Where Self-Referential Structures Are Used?

1. Linked Lists

  • singly linked list
  • doubly linked list

2. Trees

  • binary trees
  • AVL trees
  • red-black trees

3. Graphs

  • adjacency list representations

4. Stacks & Queues

When implemented using linked lists.

5. Dynamic memory data structures

Nodes that grow at runtime.

Key Benefits

  • Allow dynamic, flexible data structures
  • Enable efficient insertion/deletion
  • Memory allocated only when needed
  • No fixed size like arrays

Summary

A self-referential structure is a structure that contains a pointer to another instance of the same structure.
It is the building block of most dynamic data structures in C and embedded systems.

To allocate memory dynamically for structures in C, you use functions from stdlib.h—primarily malloc(), calloc(), or realloc().
This allows structures to be created at runtime instead of compile-time.

Let’s break it down simply and clearly.

Using malloc()

malloc() allocates raw memory equal to the size of the structure.

Example:

struct Student {
    int roll;
    float marks;
};

struct Student *s = malloc(sizeof *s);   // no cast needed in C

if (s == NULL) {
    // allocation failed
}

Key Points:

  • Memory size = sizeof(struct Student)
  • Returns a pointer to allocated memory
  • Memory contents are indeterminate (garbage) until you initialize them

Using calloc()

calloc() allocates memory and initializes it to zero.

struct Student *s = calloc(1, sizeof(struct Student));

Key Points:

  • All bytes set to 0
  • Often used when fields must be zero-initialized

Dynamic Allocation for Arrays of Structures

Using malloc:

struct Student *arr = malloc(10 * sizeof(struct Student));

Using calloc:

struct Student *arr = calloc(10, sizeof(struct Student));

This creates a dynamic array of 10 structure objects.

Using realloc() to Resize Structure Arrays

If you want to grow or shrink the number of structures:

struct Student *tmp = realloc(arr, 20 * sizeof *arr);  // resize to 20
if (tmp != NULL) arr = tmp;   // on failure arr is still valid and must not be overwritten

Freeing Dynamically Allocated Structure Memory

Always release memory after use:

free(s);
free(arr);
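The allocate, resize, and free steps fit naturally into one small helper. This is a sketch under stated assumptions: students_resize is a hypothetical function, and it relies on the standard guarantee that realloc(NULL, n) behaves like malloc(n).

```c
#include <stdint.h>   /* SIZE_MAX */
#include <stdlib.h>

struct Student {
    int roll;
    float marks;
};

/* Resizes *arr to hold new_count elements.
   Returns 0 on success; returns -1 on failure, in which case *arr is
   left untouched and still valid. */
int students_resize(struct Student **arr, size_t new_count)
{
    /* Reject zero and any count whose byte size would overflow size_t. */
    if (new_count == 0 || new_count > SIZE_MAX / sizeof **arr) {
        return -1;
    }
    struct Student *tmp = realloc(*arr, new_count * sizeof **arr);
    if (tmp == NULL) {
        return -1;
    }
    *arr = tmp;
    return 0;
}
```

A caller would start with struct Student *arr = NULL, call students_resize(&arr, 10), later students_resize(&arr, 20), and finish with free(arr). Existing elements are preserved across a successful resize.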

Dynamic Allocation for Self-Referential Structures

Common in linked lists, trees, etc.

struct Node {
    int data;
    struct Node *next;
};

struct Node *newNode = malloc(sizeof *newNode);
if (newNode != NULL) {        // malloc can fail
    newNode->data = 10;
    newNode->next = NULL;
}

This is extremely common in:

  • queues
  • stacks
  • trees
  • graph adjacency lists
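A complete, if minimal, singly linked list built on this node type might look like the sketch below. The function names list_push_front, list_length, and list_free are illustrative rather than taken from any library.

```c
#include <stdlib.h>

struct Node {
    int data;
    struct Node *next;
};

/* Prepends a node holding value. Returns the new head, or NULL if
   allocation fails (the existing list is left unchanged). */
struct Node *list_push_front(struct Node *head, int value)
{
    struct Node *n = malloc(sizeof *n);
    if (n == NULL) {
        return NULL;
    }
    n->data = value;
    n->next = head;
    return n;
}

/* Counts the nodes by following next pointers until NULL. */
size_t list_length(const struct Node *head)
{
    size_t count = 0;
    for (; head != NULL; head = head->next) {
        count++;
    }
    return count;
}

/* Frees every node; the next pointer is saved before each free. */
void list_free(struct Node *head)
{
    while (head != NULL) {
        struct Node *next = head->next;
        free(head);
        head = next;
    }
}
```

Saving next before calling free matters: reading head->next after the node has been freed would be a use-after-free.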

Summary

To dynamically allocate memory for structures:

  • Use malloc() → raw memory
  • Use calloc() → zero-initialized memory
  • Use realloc() → resize arrays of structures
  • Use free() → release memory

Dynamic allocation is essential for:

  • data structures
  • embedded dynamic buffers
  • linked lists, trees, queues
  • memory-efficient programs

Yes, you can use memcpy() with structures, and it is very common in systems programming, embedded systems, drivers, and protocol handling—but only when used correctly.

Here is the full explanation

Can We Use memcpy() With Structures?

Yes, memcpy() can be safely used with structures as long as the structure contains only Plain Old Data (POD) types such as:

  • integers
  • floats
  • chars
  • arrays
  • other POD structs

Because these types can be copied bit-by-bit without breaking anything.

Basic Example

struct Data {
    int a;
    float b;
};

struct Data src = {10, 3.14};
struct Data dest;

memcpy(&dest, &src, sizeof(struct Data));

This copies the entire structure byte-by-byte.

When memcpy() is Safe

memcpy() is safe when the structure contains only simple, non-pointer members, like:

struct Packet {
    uint8_t id;
    uint16_t len;
    uint8_t payload[10];
};

These are ideal for:

  • network messages
  • UART packets
  • firmware protocols
  • memory-mapped registers

When memcpy() is NOT Safe

Structures containing pointers

struct Example {
    int *ptr;
    int value;
};

Copying with memcpy() will copy the pointer address, not the data it points to.
This can cause:

  • double free
  • dangling pointers
  • crashes
  • corrupted memory

Structures with dynamic memory

struct Student {
    char *name;   // allocated dynamically
    int age;
};

memcpy() will copy only the pointer, not the actual string.
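When a structure owns dynamic memory, the usual remedy is a deep copy that duplicates the pointed-to data as well. A minimal sketch, assuming name is either NULL or a NUL-terminated string owned by the structure; student_copy is a made-up helper name.

```c
#include <stdlib.h>
#include <string.h>

struct Student {
    char *name;   /* allocated dynamically */
    int age;
};

/* Deep copy: dst receives its own copy of the string, so freeing one
   structure's name never invalidates the other.
   Returns 0 on success, -1 if allocation fails (dst is not modified). */
int student_copy(struct Student *dst, const struct Student *src)
{
    char *name_copy = NULL;
    if (src->name != NULL) {
        size_t len = strlen(src->name) + 1;   /* include the terminator */
        name_copy = malloc(len);
        if (name_copy == NULL) {
            return -1;
        }
        memcpy(name_copy, src->name, len);
    }
    dst->name = name_copy;
    dst->age = src->age;
    return 0;
}
```

After student_copy(&b, &a), the two structures hold equal strings at different addresses, and each name must eventually be released with its own free.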

Structures with virtual tables (C++), constructors, destructors

Not applicable to plain C, but important in embedded C++.

Structures with padding differences

Two compilers or architectures may use different padding/alignment, so using memcpy() across:

  • different machines
  • network packets
  • files

…can cause mismatches.

Use Case: memcpy() is Commonly Used in Embedded Systems

Because embedded structures often represent:

  • Register layouts
  • Protocol headers
  • CAN frames
  • Flash/EEPROM blocks

These are simple POD data structures → perfect for memcpy().

Final Summary

Use memcpy() when:

  • Structure contains only simple POD fields
  • Memory layout is consistent and known
  • No pointers or dynamic memory
  • Copying raw bytes is desired

Avoid memcpy() when:

  • Structure contains pointers
  • Dynamic memory is involved
  • Padding/alignment differs across systems

Using unions for type-casting (type punning) is allowed in C, but the result depends on the platform’s data representation, the code is non-portable, and the same construct is undefined behavior in C++, so it is considered risky in advanced or embedded systems.

Here’s a clear, interview-friendly explanation

Why Using Unions for Type-Casting Is Risky

Unions let different data types share the same memory, so programmers often do:

union {
    float f;
    uint32_t i;
} u;

u.f = 3.14f;
printf("%" PRIx32 "\n", u.i);   // type punning (PRIx32 comes from <inttypes.h>)

This “works” on many compilers—but it has several dangers.

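1. It Relies on Representation and Byte Order

Different CPUs store bytes in different orders:

  • Little-endian: LSB first
  • Big-endian: MSB first

So the interpretation of:

u.f → u.i

depends on how the hardware represents both types. When float and uint32_t share the same byte order, the numeric value of u.i is the same on either kind of CPU. As soon as the union also exposes a byte array, however, the individual bytes come out in a different order.

Works on one processor
Can give different bytes on another

This makes byte-level union type punning non-portable.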

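2. Strict Aliasing Pitfalls

Strict aliasing allows the compiler to assume that:

Pointers to different types do not refer to the same memory

When you do:

u.i = 0x12345678;
float x = u.f;

the access goes through the union itself, so the compiler can see that both members share storage. In C this is permitted, and the bytes are simply reinterpreted.

The trouble starts when the union is bypassed, for example by taking pointers to two different members and passing them to a function, or by casting a float * to a uint32_t *. The compiler may then assume the two pointers never alias.

This can lead to:

  • Undefined behavior
  • Optimization issues
  • Unexpected code generation

In C++, reading a member other than the one last written is undefined behavior even when the access goes through the union.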

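3. Optimizers Rely on Type Assumptions

Compilers may reorder or eliminate memory accesses based on type assumptions.
Code that reaches union members through pointers of different types, rather than through the union object itself, may therefore stop working when optimization is enabled (-O2, -O3).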

4. Padding and Alignment Issues

Some structures have internal padding.
Type punning via union may mistakenly interpret those padding bytes.

5. Compiler Guarantees Differ

For example:

  • GCC: documents that punning through union member access is supported, even with -fstrict-aliasing
  • Other toolchains (MSVC, ARM, IAR): check the compiler manual for what is guaranteed, especially in C++ mode

Thus union-based type casting should be verified for every compiler you target.

6. Floating-Point Formats May Differ

Interpreting float as int assumes IEEE-754 format.

Some MCUs:

  • use different float formats
  • use software FP emulation
  • store FP values differently

Your type pun will be wrong.

7. Undefined Behavior in C++, Caveats in C

In C++:

  • Writing to one union member
  • Then reading another unrelated member

…is undefined behavior.

In C (C99 and later) the read is allowed, but the result depends on the object representation and may be a trap representation. C89 left it implementation-defined.

When Is It Still Used?

Despite risks, union type punning is popular in:

  • Embedded systems
  • DSP code
  • Hardware drivers
  • Bit manipulation
  • Protocol decoding

But only when:

  • Processor is fixed
  • Compiler is fixed
  • Behavior is well understood

Still, using memcpy() is safer and portable.

Safe Alternative to Union Type-Punning

Use:

float f = 3.14;
uint32_t i;
memcpy(&i, &f, sizeof(i));

Why safer?

  • Avoids strict aliasing violation
  • The technique is portable across compilers, in both C and C++ (the bit pattern itself still depends on the platform’s float format)
  • No undefined behavior
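The memcpy() approach is often wrapped in a pair of small conversion functions. The sketch below assumes a 32-bit float; float_to_bits and bits_to_float are illustrative names.

```c
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t),
               "this sketch assumes a 32-bit float");

/* Copies the object representation of a float into an integer. */
uint32_t float_to_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Copies an integer bit pattern back into a float. */
float bits_to_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}
```

Optimizing compilers typically turn a fixed-size memcpy() like this into a plain register move, so the safer form rarely costs anything at run time.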

Final Summary

Using unions for type-casting is risky because:

  • It depends on endianness
  • Can violate strict aliasing rules when members are reached through pointers
  • Is undefined behavior in C++
  • Depends on compiler implementation
  • Can break under optimization
  • Assumes a specific data layout
  • Not portable across architectures

The layout of structure members in memory refers to how the compiler arranges structure fields in RAM, including the effect of alignment, padding, and ordering.

Sequential Arrangement (Declared Order is Preserved)

Members of a structure are stored in the exact order in which they are declared.

Example:

struct A {
    char c;     // 1 byte
    int x;      // 4 bytes
    short s;    // 2 bytes
};

Order in memory:

[c][padding][x x x x][s s][padding]

Alignment Requirement

Each data type has an alignment (1, 2, 4, 8 bytes), meaning:

  • The starting address of the member must be a multiple of its alignment.

Examples:

  • char → 1-byte aligned
  • short → 2-byte aligned
  • int → 4-byte aligned
  • double → 8-byte aligned (on most systems)

Padding is Added

The compiler inserts padding bytes so that:

  • Each member starts at its required aligned boundary.
  • The entire structure size becomes a multiple of max alignment inside the structure.

Detailed Example

struct Test {
    char a;     // offset 0
    int b;      // offset 4 (3 bytes padding before it)
    short c;    // offset 8
};

Memory layout:

Byte offset:   Content
-------------------------------------
0              a
1,2,3          padding
4,5,6,7        b
8,9            c
10,11          padding (structure padding)

Final size: 12 bytes

Structure Padding vs Member Padding

Member padding

Added between members to satisfy alignment.

Structure padding

Added after the last member so total structure size is aligned to max alignment requirement.

Why this layout is required?

To ensure:

  • Faster CPU access
  • Proper aligned memory access
  • Avoid bus errors on architectures where misalignment crashes the program

Some CPUs cannot access unaligned integers or longs, making padding mandatory.

Rearranging Members Can Reduce Padding

Bad layout:

struct Bad {
    char c;
    int x;
    short s;
};

Better layout:

struct Good {
    int x;
    short s;
    char c;
};

Reduces unnecessary padding: with a 4-byte int and a 2-byte short, struct Bad is typically 12 bytes while struct Good is 8.
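The effect of reordering is easy to confirm with sizeof. A minimal sketch, assuming a typical ABI with a 4-byte int and a 2-byte short; print_layout_sizes is a hypothetical helper.

```c
#include <stdio.h>

struct Bad {
    char c;    /* offset 0, then 3 bytes of padding */
    int x;     /* offset 4 */
    short s;   /* offset 8, then 2 bytes of end padding */
};

struct Good {
    int x;     /* offset 0 */
    short s;   /* offset 4 */
    char c;    /* offset 6, then 1 byte of end padding */
};

/* Hypothetical helper: prints both sizes side by side. */
void print_layout_sizes(void)
{
    printf("sizeof(struct Bad)  = %zu\n", sizeof(struct Bad));
    printf("sizeof(struct Good) = %zu\n", sizeof(struct Good));
}
```

Both structures hold the same three members; only the declaration order differs. In an array of a thousand elements that difference adds up to about 4 KB.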

The layout of union members in memory is very different from structures.
In a union, all members share the same memory location, meaning only one member is stored at a time.

All Members Start at the Same Address

Every union member begins at offset 0.

Example:

union U {
    int x;      // 4 bytes
    char c;     // 1 byte
    float f;    // 4 bytes
};

Memory layout (conceptually):

---------------------
|  Shared Memory    |  <-- used by x, c, or f
---------------------

Size of Union = Size of Largest Member

Memory allocated = maximum size among its members.

Example:

  • int = 4 bytes
  • char = 1 byte
  • float = 4 bytes

Union size = 4 bytes

No Padding Between Members

Because only one member exists in memory at a time, there is:

  • No member padding
  • No structure-like alignment gaps
  • No sequential layout

Padding may only appear at the end if required by alignment rules.

Example Layout Visualization

Union definition:

union Data {
    char a;      // 1 byte
    int b;       // 4 bytes
    double c;    // 8 bytes
};

Memory Layout:

Offset 0 → [ union memory block ]
            8 bytes total (size of double)

All members overlap:

a  → uses bytes [0]
b  → uses bytes [0 to 3]
c  → uses bytes [0 to 7]

How union layout affects reading values?

Using one member after writing another causes type punning, which may reveal:

  • Byte-level representation
  • Endianness effects
  • Tricky debugging scenarios

Example:

union Data d;
d.b = 0x12345678;
printf("%x", d.a);   // prints the byte at offset 0: 78 on little-endian, 12 on big-endian

Alignment Requirement Still Applies

Even though members share memory, the union is aligned to the strictest alignment:

Example:

union X {
    char c;       // align 1 byte
    int i;        // align 4 bytes
    double d;     // align 8 bytes
};

Alignment = 8 bytes
Total size here is 8 bytes; in general, the size is rounded up to a multiple of the alignment.

Yes, a structure can be packed differently on different compilers and this is a very important concept in systems programming, embedded systems, and cross-platform development.

Short Answer

Yes.
Different compilers (GCC, Clang, MSVC, Keil, IAR, ARM-GCC, etc.) may apply different alignment rules, padding strategies, and packing behavior, which leads to different structure sizes and layouts.

Why Structure Packing Differs Across Compilers?

1. Different default alignment rules

Each compiler and platform defines:

  • Natural alignment (1, 2, 4, 8 bytes)
  • Maximum alignment allowed
  • ABI (Application Binary Interface) rules

Example:

  • GCC on 32-bit x86 Linux aligns double struct members to 4 bytes
  • MSVC (x86 and x64) aligns double to 8 bytes

This affects padding and total size.

2. Different target architecture requirements

Structure packing may vary between:

  • ARM Cortex-M (embedded)
  • x86 / x64 (PC)
  • RISC-V
  • PowerPC
  • DSP processors

Some architectures crash on unaligned access, so compilers insert padding differently.

3. Compiler-specific pragmas and attributes

Each compiler has its own syntax:

GCC / Clang:

struct __attribute__((packed)) S {
    char a;
    int b;
};

MSVC:

#pragma pack(push, 1)
struct S {
    char a;
    int b;
};
#pragma pack(pop)

Both do the same thing, but syntax differs.
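The #pragma pack form is also accepted by GCC and Clang, which makes it convenient for a side-by-side comparison. The sketch below assumes a 4-byte int; the struct names are invented for illustration.

```c
#include <stddef.h>   /* offsetof */

/* Default layout: padding is inserted so that b is aligned. */
struct DefaultLayout {
    char a;
    int b;
};

/* Packed layout: members are placed back to back with no padding. */
#pragma pack(push, 1)
struct PackedLayout {
    char a;
    int b;
};
#pragma pack(pop)

/* Compile-time guards: the build fails if the layout is not what the
   protocol or file format expects. */
_Static_assert(sizeof(struct PackedLayout) == sizeof(char) + sizeof(int),
               "PackedLayout must contain no padding");
_Static_assert(offsetof(struct PackedLayout, b) == 1,
               "b must follow a immediately");
```

Packing makes b unaligned, so on architectures that fault on unaligned access it is safer to copy the member into a local variable than to take a pointer to it. A _Static_assert on the expected size is a cheap way to catch layout drift between compilers.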

4. ABI (Application Binary Interface) Differences

Different operating systems and compilers follow different ABIs:

  • System V ABI
  • Windows ABI
  • ARM EABI
  • QNX ABI
  • iOS / Darwin ABI

ABIs define:

  • alignment of integers
  • alignment of floating-point
  • structure padding
  • calling conventions

This directly affects structure layout and size.

Example Showing Different Packing

struct Test {
    char a;
    int b;
    short c;
};

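Common results with default settings:

  • GCC (x86): 12 bytes
  • MSVC (x86): 12 bytes
  • ARM-GCC: 12 bytes

Why?
All three align int to 4 bytes by default, so this particular struct comes out the same. The size changes once packing options are applied: with /Zp2 or #pragma pack(2), int is aligned to 2 bytes and the struct shrinks to 8 bytes; with 1-byte packing it shrinks to 7.
Default layouts do differ for other types: struct { char c; double d; } is typically 12 bytes with GCC on 32-bit x86 Linux (double aligned to 4) and 16 bytes with MSVC (double aligned to 8).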

Implications in the Real World

1. Binary file parsing

Structures written by one compiler may fail to be read by another.

2. Network protocols

Padding issues break communication between systems.

3. Embedded systems

Hardware registers must be exactly aligned—incorrect padding causes:

  • malformed packets
  • incorrect register access
  • undefined behavior

4. API / Driver development

Misaligned structure layouts break OS and firmware interfaces.

Yes, unions are guaranteed to store their members in overlapping memory, but with a few important details.

Short Answer

Yes.
In C, all members of a union share the same memory location, meaning they always overlap.
This behavior is guaranteed by the C standard.

What the C Standard Says

The C standard defines a union as:

A type whose members all start at the same memory location.

This means:

  • All members have offset 0
  • They physically overlap
  • The union size is at least the size of the largest member (rounded up to the union's alignment)
  • Writing to one member overwrites the others

This behavior is strictly guaranteed across all compilers, architectures, and platforms.

Memory Overlap Visualization

union U {
    int x;    // 4 bytes
    char c;   // 1 byte
    float f;  // 4 bytes
};

Memory representation:

Offset 0:  [Shared Memory Block for union]
           | 4 bytes |
  • x and f occupy the same 4 bytes; c overlaps the first (lowest-addressed) of them.

But There Is One Exception

While overlap is guaranteed, the alignment requirement may force the union to be larger than its largest member, leaving padding after the overlapped region.

Example:

union U {
    char a[9];   // 9 bytes, align 1
    double b;    // 8 bytes, align 8
};
  • Both members still start at offset 0 and overlap
  • The largest member needs 9 bytes
  • The union's alignment is 8 (from double), so its total size is rounded up to 16 bytes

So the memory overlap is guaranteed, but extra bytes may exist after the largest member (for alignment).

What Is NOT Guaranteed

  1. Reading a member different from what was written
    (C99 and later reinterpret the stored bytes as the new type, which may produce a trap representation; C++ treats this as undefined behavior, so memcpy() is the portable choice)
  2. Endianness behavior
    Overlap is guaranteed, but the byte order of multibyte objects depends on CPU endianness.
  3. Bit-level interpretation
    When overlapping is used to reinterpret data, results vary by architecture.

When a structure contains mixed datatypes like char, int, float, etc., the compiler must arrange them in memory while respecting alignment rules. This leads to padding, alignment, and sometimes increased structure size.

Below is the complete explanation.

Members Are Stored in Declared Order

The compiler does not reorder structure members for optimization.

Example:

struct Mix {
    char c;      
    int i;       
    float f;     
};

Order in memory is always: char → int → float

Alignment Requirements Depend on Each Datatype

Each datatype has a natural alignment:

Type | Typical Size | Typical Alignment
char | 1 byte | 1 byte
short | 2 bytes | 2 bytes
int | 4 bytes | 4 bytes
float | 4 bytes | 4 bytes
double | 8 bytes | 8 bytes

A member must start at an address multiple of its alignment.

Padding Bytes Are Added Between Members

This happens when the next member’s required alignment doesn’t match the current offset.

Example Layout (Typical 32-bit or 64-bit system)

struct Mix {
    char c;     // 1 byte
    int i;      // 4 bytes
    float f;    // 4 bytes
};

Memory Layout Visualization

Offset 0:   c
Offset 1-3: padding (3 bytes)
Offset 4-7: i (4 bytes)
Offset 8-11: f (4 bytes)

Total size = 12 bytes

Structure Size Depends on Largest Member Alignment

The structure is padded at the end so its total size is a multiple of the maximum member alignment.

In this example:

  • Largest alignment = 4 bytes
  • Structure size must be multiple of 4
  • Already 12 → aligned → no extra padding

Why Mixed Types Cause Padding?

Because:

  • char aligns on 1 byte
  • int must align on 4 bytes
  • So after a 1-byte char, the compiler inserts 3 bytes of padding

This ensures the CPU can efficiently access int and float.

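Reordering Members Can Reduce Padding

Reordered version:

struct Mix {
    int i;
    float f;
    char c;
};

Memory layout:

i → 4 bytes
f → 4 bytes
c → 1 byte
3 bytes padding at end (structure padding)

Total: 12 bytes
(Same size here: the 3 padding bytes only move from the middle to the end. Reordering saves memory when several small members can share one alignment slot; for example, { char; int; char; } takes 12 bytes, while { int; char; char; } takes 8.)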

Real-World Impact

Mixed datatypes affect:

Memory footprint

Especially important in embedded systems.

Performance

Aligned access is faster.

Binary compatibility

Different compilers/platforms may produce different padding.

Network protocols & file formats

Padding can break communication unless packed structs are used.

Final Summary

When structure members have mixed datatypes (char, int, float):

  • They are stored in the order declared
  • Compiler adds padding between members
  • Members must follow alignment rules
  • Structure size increases due to both member padding and structure padding
  • Performance improves due to aligned memory access
  • Reordering members can reduce memory waste

In embedded systems, peripherals (GPIO, UART, I2C, timers, etc.) are controlled using hardware registers located at fixed memory addresses.
C allows us to access these registers in a clean and readable way using structures.

What Is Register Mapping?

Mapping hardware registers means:

  • Assigning a C struct to a specific memory address
  • Each field of the struct corresponds to one register
  • Accessing the register becomes easy and readable using -> operator

Example:

GPIO->MODER = 0x01;

Why Use Structures for Register Mapping?

Because:

  • Cleaner code
  • Safe access
  • Easy debugging
  • Eliminates magic numbers
  • Matches datasheet layout

Example from datasheet:

Offset | Register Name
0x00 | MODER
0x04 | OTYPER
0x08 | OSPEEDR
0x0C | PUPDR

You convert this layout into a C struct.

Steps to Map Registers Using Structures

Step 1: Read Register Layout in Datasheet

Example:
GPIO peripheral base address:

GPIOA_BASE = 0x40020000

Mapping Hardware Registers Using Unions in C

Unions allow accessing the same memory location in multiple ways. This is useful when a hardware register has multiple bitfields or when you want both byte-level and word-level access.

Why Use Unions for Hardware Registers?

  • Many hardware registers pack several independent bit fields into one word.
  • You may want both full-register access and bitwise access.
  • Unions combined with structures allow bitfield mapping.

Example scenario:

A 32-bit register:

Bit | Name
31 | RESERVED
30 | ERROR
29 | READY
28 | ENABLE
27-0 | DATA

Steps to Map Registers Using Unions

Step 1: Define Bitfield Struct

typedef struct
{
    uint32_t DATA   : 28;
    uint32_t ENABLE : 1;
    uint32_t READY  : 1;
    uint32_t ERROR  : 1;
    uint32_t RESERVED : 1;
} REG_Bits_t;

Here:

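  • : n specifies number of bits in the field
  • Total must match the register size (32 bits here)
  • Fields are listed from bit 0 upward, which assumes the compiler allocates bitfields LSB-first (the usual case for GCC and Clang on little-endian targets)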

Step 2: Combine With Union

typedef union
{
    uint32_t all;       // Full register access
    REG_Bits_t bits;    // Bit-level access
} REG_t;

Now, REG_t allows two ways to access the same register:

  1. reg.all → full 32-bit value
  2. reg.bits.ENABLE → individual bit access

Step 3: Map to Hardware Address

#define MY_REG   (*(volatile REG_t*) 0x40021000UL)

Step 4: Access Register

// Set ENABLE bit
MY_REG.bits.ENABLE = 1;

// Read READY bit
if (MY_REG.bits.READY)
{
    // Do something
}

// Write full register at once
MY_REG.all = 0x12345678;

Example With Multiple Registers

typedef struct
{
    union {
        uint32_t CTRL;
        struct {
            uint32_t ENABLE : 1;
            uint32_t MODE   : 3;
            uint32_t RESERVED : 28;
        } CTRL_bits;
    };
    union {
        uint32_t STATUS;
        struct {
            uint32_t READY : 1;
            uint32_t ERROR : 1;
            uint32_t RESERVED : 30;
        } STATUS_bits;
    };
} PERIPH_t;

#define PERIPH   (*(volatile PERIPH_t*)0x40020000UL)

// Usage
PERIPH.CTRL_bits.ENABLE = 1;
if(PERIPH.STATUS_bits.READY)
{
    // Process
}

Advantages of Using Unions

  • Bit-level control of registers
  • Full register access when needed
  • Cleaner, readable code
  • Reduces manual masking/shifting

Important Points / Interview Tips

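  1. Use volatile: it forces the compiler to perform every register read and write instead of caching or removing them.
  2. Ensure total bits = register width: otherwise the bitfield struct no longer lines up with the register, or grows beyond it.
  3. Be careful with layout: bitfield order and padding are implementation-defined. __attribute__((packed)) removes padding but does not change bit order, so check sizeof() and the compiler documentation for the target.
  4. Use unions for simple overlays such as a raw word plus its bitfields, but don’t rely on them for complex typecasting.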

Summary

Using unions, you can map a hardware register to both a full-width variable and a bitfield structure, giving flexible access to the same memory location.
This is especially useful in embedded systems where individual bits have special meanings.

Protocols (like CAN, TCP/IP headers, custom binary protocols) often define messages where:

  • A single block of memory can represent different types of data.
  • The same bit pattern can be interpreted in multiple ways depending on the context.

Unions in C are ideal for this because all members share the same memory location, allowing flexible interpretation without copying or converting data.

Key Reasons

Memory Efficiency

  • Union members overlap in memory.
  • No extra storage needed for multiple representations of the same data.
  • Example: a 32-bit message can be accessed as:
    • uint32_t raw → full 32-bit integer
    • struct { uint8_t a,b,c,d; } bytes → individual bytes
  • Both occupy the same 4 bytes.

Flexible Parsing

  • Protocol fields can have different interpretations depending on context (type field, command field, etc.).
  • Example: A packet may carry:
    • Integer, float, or flags depending on type.
  • Union allows one memory location to represent all types, and you choose interpretation dynamically.

Easy Bitfield or Byte-level Access

  • Some protocols need individual bits or bytes.
  • Union with bitfields lets you access whole word or individual bits.
typedef union {
    uint32_t raw;
    struct {
        uint32_t FLAG1 : 1;
        uint32_t FLAG2 : 1;
        uint32_t VALUE : 30;
    } bits;
} PacketField_t;
  • raw → entire 32-bit packet
  • bits.FLAG1 → individual flag

Avoids Typecasting and Copying

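  • Instead of manually shifting and masking bits to interpret data:
uint32_t raw = read_register();
uint8_t flag = raw & 0x1;   // FLAG1, assuming it is bit 0
  • You can use a union:
PacketField_t packet;
packet.raw = read_register();
uint8_t flag = packet.bits.FLAG1;
  • Cleaner to read, though not faster: the compiler generates the same shift-and-mask instructions, and the bitfield version depends on the compiler's bit order.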

Protocol Headers with Multiple Views

  • Many protocols have overlapping fields.
  • Example: a CAN message data field can be interpreted as:
    • A signed int
    • An unsigned int
    • A struct with multiple flags
  • Union allows single memory representation with multiple views.

Example in Practice

typedef union {
    uint32_t raw;      // Full 32-bit message
    struct {
        uint8_t cmd;   // Command byte
        uint8_t len;   // Data length
        uint16_t data; // Payload
    } fields;
} CAN_Message_t;

CAN_Message_t msg;

// Receive raw data from CAN hardware
msg.raw = read_CAN_register();

// Parse fields directly
printf("Command: %u, Length: %u\n", msg.fields.cmd, msg.fields.len);
  • No copying required
  • Memory-efficient
  • Easy to read and maintain
  • Which bytes of raw land in cmd, len, and data depends on the CPU's byte order

The offsetof() macro is defined in <stddef.h>.

It is used to determine the byte offset of a member within a structure.

This is especially useful in:

  • Low-level programming
  • Memory-mapped structures
  • Protocol parsing
  • Implementing container macros (like in Linux kernel)

Syntax

offsetof(TYPE, MEMBER)

Parameters:

Parameter | Description
TYPE | Name of the struct type
MEMBER | Member within the struct

Return Value:
The offset in bytes from the start of the structure to the member.

How It Works

offsetof() essentially computes:

address of member within struct - address of struct start

It does not require an actual struct instance; the result is a compile-time constant.

Example

#include <stdio.h>
#include <stddef.h>

typedef struct
{
    char c;       // 1 byte
    int i;        // 4 bytes (may have padding)
    float f;      // 4 bytes
} MyStruct;

int main() {
    printf("Offset of c: %zu\n", offsetof(MyStruct, c));  // 0
    printf("Offset of i: %zu\n", offsetof(MyStruct, i));  // 4 (likely due to padding)
    printf("Offset of f: %zu\n", offsetof(MyStruct, f));  // 8
    return 0;
}

Output (typical on 32-bit and 64-bit systems):

Offset of c: 0
Offset of i: 4
Offset of f: 8
  • Shows padding bytes inserted by the compiler.

Why Use offsetof()

  1. Memory layout awareness
    Helps understand padding and alignment in structures.
  2. Generic container macros
    • Used in the Linux kernel container_of() macro (shown here in simplified form):
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))
  3. Pointer arithmetic
    When implementing data structures like linked lists, offsetof() allows you to compute struct base addresses from member pointers safely.

Important Points

  • Defined in <stddef.h> → include this header.
  • Returns size_t type.
  • Works at compile-time, no runtime overhead.
  • Useful with packed structures and hardware register mapping.

Quick Summary

The offsetof() macro gives the byte offset of a member within a structure, helping in pointer arithmetic, memory-mapped hardware, and generic data structures. It is widely used in embedded systems, OS kernels, and protocol parsing.

When using unions with bitfields, the ordering of bits in memory is implementation-defined.

This is critical to understand in embedded systems, protocol parsing, and hardware register access.

What Is Bit-Endianness?

  • Endianness determines how multi-byte values are stored in memory.
  • Two types:
  1. Little-endian → least significant byte (or bit) stored first.
  2. Big-endian → most significant byte (or bit) stored first.

Note: Bitfield ordering within a byte is not standardized in C; it depends on the compiler and the target ABI.

Union with Bitfields Example

#include <stdio.h>
#include <stdint.h>

typedef union {
    uint8_t all;
    struct {
        uint8_t bit0 : 1;
        uint8_t bit1 : 1;
        uint8_t bit2 : 1;
        uint8_t bit3 : 1;
        uint8_t bit4 : 1;
        uint8_t bit5 : 1;
        uint8_t bit6 : 1;
        uint8_t bit7 : 1;
    } bits;
} ByteReg_t;

int main(void) {
    ByteReg_t reg;
    reg.all = 0x05;  // 00000101 in binary

    printf("bit0 = %u\n", reg.bits.bit0);
    printf("bit1 = %u\n", reg.bits.bit1);
    printf("bit7 = %u\n", reg.bits.bit7);
    return 0;
}

Output may vary depending on compiler/architecture!

  • On most little-endian targets, bit0 corresponds to the LSB, so this prints 1, 0, 0.
  • On targets that allocate bitfields MSB-first (common on big-endian), bit0 corresponds to the MSB, so it prints 0, 0, 1.

Key Rules

  1. Bitfield ordering is compiler-dependent
    • LSB-first or MSB-first within a byte is not guaranteed by C standard.
  2. Byte order (endianness) is separate from bitfield order
    • Little-endian CPU stores bytes LSB-first.
    • Bitfield within the byte may still be defined differently by compiler.
  3. Union allows multiple views of same memory
    • You can write reg.all = 0xA5 and read individual bits via reg.bits.
    • But exact mapping of bit positions must be verified per compiler/target.

Best Practices

  1. Avoid assumptions on bitfield order across compilers
    • Use only for same compiler/architecture or hardware-specific code.
  2. For cross-platform protocols, manually shift & mask bits instead of relying on bitfield order:
uint8_t bit0 = reg.all & 0x01;
uint8_t bit7 = (reg.all >> 7) & 0x01;
  3. Document your compiler/CPU assumptions if using union bitfields in embedded projects.
  4. Use #pragma pack or __attribute__((packed)) to control padding when needed; packing does not change bit order.

Summary

Bit-endianness inside a union bitfield is compiler- and architecture-dependent. The C standard does not guarantee bit order within bytes, so while unions are convenient for accessing hardware registers or protocol fields, you must verify the layout for your specific compiler/CPU. For portable code, prefer manual bit masking and shifting instead of relying on bitfields.

1. Linked Lists

A linked list is a sequence of nodes where each node contains:

  • Data
  • Pointer to the next node

Structure Example: Singly Linked List

#include <stdio.h>
#include <stdlib.h>

typedef struct Node {
    int data;
    struct Node* next;  // Pointer to the next node
} Node;

int main(void) {
    // Create nodes
    Node* head = (Node*)malloc(sizeof(Node));
    Node* second = (Node*)malloc(sizeof(Node));
    if (head == NULL || second == NULL) {  // Allocation can fail
        free(head);
        free(second);
        return 1;
    }

    head->data = 10;
    second->data = 20;
    second->next = NULL;
    head->next = second;  // Link nodes

    printf("Linked List: %d -> %d\n", head->data, head->next->data);

    // Release the nodes
    free(second);
    free(head);
    return 0;
}

Key Points:

  • struct Node* next makes the structure self-referential.
  • Allows dynamic memory allocation and flexible list sizes.

2. Stacks (Linked List Implementation)

A stack can be implemented using a linked list structure:

  • Push: Insert at head
  • Pop: Remove from head

Structure Example: Stack Node

typedef struct StackNode {
    int data;
    struct StackNode* next;
} StackNode;

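// Push function: returns the new top, or the old top unchanged if allocation fails
StackNode* push(StackNode* top, int value) {
    StackNode* newNode = (StackNode*)malloc(sizeof(StackNode));
    if (newNode == NULL) return top;
    newNode->data = value;
    newNode->next = top;
    return newNode;
}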

// Pop function
StackNode* pop(StackNode* top, int* value) {
    if (top == NULL) return NULL;
    *value = top->data;
    StackNode* temp = top;
    top = top->next;
    free(temp);
    return top;
}

Key Points:

  • Stack is a LIFO (Last-In-First-Out) data structure.
  • Structures allow dynamic memory management for variable stack size.

3. Binary Tree

A binary tree node contains:

  • Data
  • Pointer to left child
  • Pointer to right child

Structure Example: Binary Tree Node

typedef struct TreeNode {
    int data;
    struct TreeNode* left;
    struct TreeNode* right;
} TreeNode;

// Create a new node; returns NULL if allocation fails
TreeNode* createNode(int value) {
    TreeNode* node = (TreeNode*)malloc(sizeof(TreeNode));
    if (node == NULL) return NULL;
    node->data = value;
    node->left = node->right = NULL;
    return node;
}

Key Points:

  • Each node points to its children recursively.
  • Structures make it easy to traverse, insert, or delete nodes in trees.

Why Structures Are Ideal for These Data Structures

  1. Self-referential pointers allow dynamic links.
  2. Heterogeneous data can be stored (e.g., integers, floats, strings in one node).
  3. Dynamic memory allocation with malloc() allows flexible size.
  4. Readable and maintainable code for complex data structures.

Summary

Structures are perfect for implementing dynamic data structures like linked lists, stacks, and trees. By including pointers to the same structure type within the structure, you can create self-referential nodes that enable flexible memory layouts and dynamic linking of nodes.

Both structures and unions can be assigned using = in C:

struct MyStruct s1, s2;
s1 = s2;   // Structure assignment

union MyUnion u1, u2;
u1 = u2;   // Union assignment

But structure assignment often costs more than union assignment. Let’s see why.

Memory Layout Differences

Structure

  • Each member has its own memory (may include padding for alignment).
  • Total size = sum of all members + padding.
  • Assignment copies the whole object, padding included.

Union

  • All members share the same memory.
  • Total size = size of largest member, rounded up for alignment.
  • Assignment also copies the whole object, but that object is only as big as one member.

Example

#include <stdio.h>

typedef struct {
    int a;
    double b;
    char c;
} MyStruct;

typedef union {
    int a;
    double b;
    char c;
} MyUnion;

int main(void) {
    MyStruct s1 = {1, 2.5, 'x'}, s2;
    MyUnion u1 = {1}, u2;

    s2 = s1; // Copies the whole struct: a, b, c and padding
    u2 = u1; // Copies the whole union: sizeof(double) bytes

    printf("sizeof(MyStruct) = %zu\n", sizeof s2);
    printf("sizeof(MyUnion)  = %zu\n", sizeof u2);
    return 0;
}
  • sizeof(MyStruct) → typically 24 bytes (int + 4 padding + double + char + 7 padding); 16 bytes where double aligns to 4, as with GCC on 32-bit x86 Linux
  • sizeof(MyUnion) → 8 bytes (largest member double)
  • Structure assignment copies more bytes than union assignment, making it more expensive.

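Why Union Assignment Is Cheaper

  1. Smaller object → fewer bytes to copy. Both assignments are block copies of sizeof bytes, not member-by-member operations.
  2. A union that fits in a machine word can often be copied with a single load and store.
  3. The saving matters most in embedded systems, where such objects are copied frequently.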

When It Matters

  • In embedded systems or performance-critical code:
    • Large structures → assignment can be expensive.
    • Unions → often used for type-punning, bitfields, or protocol parsing, where assignment is fast.
  • Structures may need looped or optimized copy for very large arrays of structures.

Summary

Feature | Structure Assignment | Union Assignment
Memory copied | Whole struct (all members + padding) | Whole union (size of largest member)
CPU cost | Higher | Lower
Use case | Complex, multiple members | Fast type-punning or overlapping data
Memory size | Sum of members + padding | Size of largest member

Key Point: Both assignments copy the whole object. A union is usually smaller than a structure with the same members, so its assignment is cheaper and faster.

A raw data packet is usually a sequence of bytes received from hardware, network, or sensors. Often, the same bytes need to be interpreted in different ways:

  • As integers, floats, or flags
  • As fields in a protocol header
  • As bitfields for control/status information

Unions make this very convenient.

Why Unions Are Ideal

  1. Multiple Views of the Same Memory
    • You can access the packet as a whole (uint32_t) or as individual fields/bytes.
  2. Memory Efficient
    • No need to copy data into multiple variables.
  3. Bit-level Access
    • Works well with bitfields for protocol flags.
  4. Fast Parsing
    • One assignment of raw bytes → multiple interpretations.

Example: Protocol Packet

Suppose you have a 32-bit network packet:

Bits | Field
31-24 | Command
23-16 | Length
15-0 | Payload

Define Union

#include <stdio.h>
#include <stdint.h>

typedef union {
    uint32_t raw;  // Full 32-bit packet
    struct {       // Member order assumes a little-endian CPU (lowest byte first)
        uint16_t payload;
        uint8_t length;
        uint8_t command;
    } fields;
} Packet_t;

Parsing Raw Data

Packet_t pkt;

// Suppose this comes from hardware/network
pkt.raw = 0x12345678;

printf("Command: 0x%X\n", pkt.fields.command);
printf("Length: 0x%X\n", pkt.fields.length);
printf("Payload: 0x%X\n", pkt.fields.payload);
  • No shifting/masking needed
  • Same memory interpreted in multiple ways

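Note: Byte order matters! On a little-endian CPU this prints Command 0x12, Length 0x34, Payload 0x5678; on a big-endian CPU the same union yields Command 0x78, Length 0x56, Payload 0x1234.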

Example: Bitfield Access in Packet

Sometimes, a field is bit-encoded, like flags:

typedef union {
    uint8_t byte;
    struct {
        uint8_t FLAG1 : 1;
        uint8_t FLAG2 : 1;
        uint8_t MODE  : 2;
        uint8_t UNUSED: 4;
    } bits;
} ControlReg_t;

ControlReg_t reg;
reg.byte = 0x9; // 00001001

printf("FLAG1 = %u\n", reg.bits.FLAG1);
printf("MODE  = %u\n", reg.bits.MODE);
  • Easily extract flags and modes from raw data (with LSB-first bitfields this prints FLAG1 = 1 and MODE = 2).

Advantages for Embedded Systems

  1. Direct mapping: Hardware register or received packet → union
  2. Compact code: The compiler generates the shifts and masks for every field
  3. Cleaner code: Easy to maintain
  4. Memory-efficient: No extra buffers

Key Considerations

  1. Endianness:
    • Ensure union interpretation matches CPU endianness or network byte order.
  2. Bitfield order:
    • Compiler-dependent, so verify before using unions with bitfields for protocols.
  3. Alignment & padding:
    • For packed protocols, use __attribute__((packed)) to avoid compiler padding.

Summary

Unions are perfect for interpreting raw data packets because they allow multiple views of the same memory (full word, individual bytes, or bitfields) without copying or extra memory. This makes packet parsing fast, memory-efficient, and clean, which is essential in embedded systems and network protocols.

Both structures and unions are used to group data in C, but they differ fundamentally in memory layout, which affects how memory-related issues manifest.

Memory Layout

Feature | Structure | Union
Memory used | Sum of all members + padding | Size of largest member only
Member storage | Separate, each member has its own memory | All members share same memory
Access | Each member independent | Only one member valid at a time

Implication:

  • Structures: Memory failures usually affect specific members.
  • Unions: Memory failure or corruption affects all overlapping members simultaneously.

Memory Failure Scenarios

a) Structure

typedef struct {
    int a;
    double b;
    char c;
} MyStruct;

MyStruct s;
  • If s.a is corrupted, s.b and s.c remain unaffected because they occupy different memory locations.
  • Debugging is easier because you know which member is affected.

b) Union

typedef union {
    int a;
    double b;
    char c;
} MyUnion;

MyUnion u;
  • All members share the same memory block.
  • If u.a is overwritten incorrectly, u.b and u.c are automatically corrupted because they occupy the same memory.
  • This makes memory failures harder to isolate.

Assignment and Memory Risks

  • Structure assignment: copies all members → risk of memory corruption in each member if source is invalid.
  • Union assignment: copies only largest member memory block → any memory failure affects all members at once, potentially more critical in embedded systems.

Endianness & Alignment Issues

  • Structures: misaligned members may cause hardware exceptions on some architectures (ARM, DSP). But the failure is usually localized to one member.
  • Unions: misalignment or incorrect type-punning can corrupt multiple members because all share the same memory.

Union memory failures propagate to all members, whereas structure failures are usually contained.

Best Practices to Avoid Memory Failures

For Structures

  • Ensure proper padding/alignment
  • Access members individually
  • Check bounds if using arrays inside structures

For Unions

  • Only access the last assigned member
  • Be cautious with bitfields and type-punning
  • Validate endianness when interpreting raw data
  • Use unions primarily for memory-efficient overlays, not as general storage containers

Summary Table

Aspect | Structure | Union
Memory per member | Separate | Shared
Memory corruption impact | Localized to specific member | Affects all members simultaneously
Assignment cost | Copies all members | Copies only largest memory block
Type safety | Safer, independent members | Risky if wrong member is accessed
Ideal use | General-purpose data grouping | Overlaying different data types, protocol parsing

Key Interview Point

Structures provide memory safety per member, so failures are localized. Unions share memory, so a single memory failure can corrupt multiple interpretations of the data, making debugging and safe access more challenging.
