Linux System Programming explained clearly for beginners, covering core concepts, real system behavior, and practical foundations to build strong low-level Linux skills.
Linux System Programming Part 1 is a beginner-friendly guide designed to help developers understand how software directly interacts with the Linux operating system. This part focuses on the core foundations of system programming, explaining essential concepts such as processes, system calls, file handling, memory layout, and user space vs kernel space in a simple and practical way.
If you are preparing for embedded systems roles, Linux interviews, or low-level programming jobs, this series will help you build strong fundamentals step by step. Real-world examples, clear explanations, and practical insights make complex topics easy to grasp, even for beginners.
By the end of Part 1, you will have a solid understanding of how Linux programs execute, how resources are managed, and how applications communicate with the kernel. This knowledge forms the base for advanced topics like device drivers, multithreading, and performance optimization covered in later parts.
Perfect for students, working professionals, and anyone serious about mastering Linux system programming.
What is the GNU Compiler Collection (GCC)?
Q1: What is GCC and what are its main components?
A1: GCC (GNU Compiler Collection) is a compiler system that supports multiple programming languages such as C, C++, and Fortran. Its main components include:
- Preprocessor (cpp): Handles macros, header inclusion, and conditional compilation.
- Compiler (cc1): Converts preprocessed source code into assembly code.
- Assembler (as): Converts assembly code into machine code object files.
- Linker (ld): Combines object files and libraries into a final executable.
Q2: What are the basic stages of compilation in GCC?
A2: The stages are:
- Preprocessing: Handles #include, #define, and conditional compilation.
- Compilation: Converts preprocessed code into assembly code.
- Assembly: Converts assembly into object code (.o files).
- Linking: Combines object files and libraries into an executable binary.
Explain the Compile & Build Process
Q3: What is the difference between compiling and building?
A3:
- Compiling: Transforming source code into object files (machine code) without producing a complete executable.
- Building: The complete process including compilation, assembly, and linking to generate the final executable.
Q4: What are some common flags used in GCC to control compilation?
A4:
- -c: Compile only, do not link.
- -o <file>: Specify output file name.
- -Wall: Enable all warnings.
- -g: Include debugging information.
- -O / -O2 / -O3: Optimization levels.
What is a Toolchain?
Q5: What is a cross-compilation toolchain?
A5: A cross-compilation toolchain allows compiling code on one platform (host) to run on a different platform (target). It typically includes:
- Cross-compiler (e.g., arm-none-eabi-gcc)
- Assembler
- Linker
- Libraries and headers for the target architecture.
Q6: What is the role of binutils in a toolchain?
A6: binutils is a collection of binary tools like as (assembler), ld (linker), objdump, nm, and ar, which help in generating, inspecting, and managing object files and executables.
Explain Object File Analysis
Q7: How can you analyze an object file generated by GCC?
A7: Object files (.o) can be analyzed using:
- nm <file> → Lists symbols (functions and variables).
- objdump -d <file> → Disassembles code into assembly.
- readelf -h <file> → Shows ELF header information.
- size <file> → Shows memory size used by code, data, and bss.
Q8: What is the difference between .text, .data, and .bss sections in an object file?
A8:
- .text → Contains executable code.
- .data → Contains initialized global/static variables.
- .bss → Contains uninitialized global/static variables (zero-initialized at runtime).
What are Executable Images?
Q9: What is an ELF executable?
A9: ELF (Executable and Linkable Format) is a standard binary format used in Linux for executables, object files, and shared libraries. It contains sections for code, data, symbol tables, dynamic linking info, and headers.
Q10: How does the linker resolve symbols while creating an executable?
A10: The linker combines object files and libraries, resolves undefined symbols by matching references with definitions, and adjusts addresses to produce a single executable. It also handles relocation and dynamic linking information if needed.
Q11: How can you check the dependencies of an executable in Linux?
A11: Using ldd <executable> to list shared libraries required by the executable.
What is a Toolchain? (Detailed)
A toolchain is essentially a set of programming tools used to develop software for a particular platform or processor. It’s called a “chain” because each tool in the chain passes its output to the next tool, ultimately producing a final executable program.
Key Points about a Toolchain:
- Purpose: To take source code and turn it into a binary executable that can run on a target system.
- Components of a Typical Toolchain: For C/C++ development (especially with GCC), a toolchain usually includes:
- Compiler (e.g., gcc, g++) → Converts source code into assembly or object files.
- Assembler (as) → Converts assembly code into machine code (object files, .o).
- Linker (ld) → Combines multiple object files and libraries into a single executable.
- Libraries → Precompiled code you can use (like libc, math libraries).
- Debugger (gdb) → Helps analyze and debug the program.
- Other Utilities (binutils) → Tools like objdump, nm, readelf, ar for managing object files and binaries.
- Cross-Compilation Toolchain: When the development machine (host) is different from the target machine, you need a cross-compiler toolchain. For example, arm-none-eabi-gcc compiles code on x86 Linux for ARM microcontrollers.
- Flow of the Toolchain: Source Code (.c/.cpp) → Compiler → Assembly (.s) → Assembler → Object File (.o) → Linker → Executable
- Why It’s Important:
- Ensures the program runs correctly on the target platform.
- Allows debugging, optimization, and analysis of code.
- Supports embedded development where the target hardware is not the same as the host.
Example: On a Linux system, a simple toolchain flow for C could be:
gcc main.c -o main
Here, gcc acts as a compiler + linker, internally invoking the assembler and linker to generate the executable main.
Introduction to Libraries
What is a Library?
A library is a collection of precompiled code, functions, classes, or routines that you can use in your programs without rewriting them.
Libraries help you reuse code, modularize applications, and reduce compilation time.
- Example: printf() in C comes from the standard C library (libc).
Types of Libraries
- Static Libraries (.a in Linux, .lib in Windows)
- Code is copied into the executable at compile-time.
- Advantages: Faster execution, no dependency at runtime.
- Disadvantages: Larger executable, need to recompile to update.
- Shared / Dynamic Libraries (.so in Linux, .dll in Windows)
- Code is linked at runtime.
- Advantages: Smaller executables, easy to update library without recompiling programs.
- Disadvantages: Requires library to be present at runtime.
Creating Libraries
2.1 Creating a Static Library
- Write your functions in a .c file. Example: mylib.c
// mylib.c
#include <stdio.h>
void greet() {
    printf("Hello from library!\n");
}
- Compile the object file:
gcc -c mylib.c -o mylib.o
- Create the static library:
ar rcs libmylib.a mylib.o
- ar → archive tool
- rcs → replace, create, index
- Use it in your program:
// main.c
void greet();
int main() {
    greet();
    return 0;
}
- Compile with library:
gcc main.c -L. -lmylib -o main
Interview Tip:
- Question: Difference between .a and .so?
- Answer: .a → static, linked at compile-time; .so → shared, linked at runtime.
Creating a Shared / Dynamic Library
- Write your functions (same as above).
- Compile as Position Independent Code (PIC):
gcc -fPIC -c mylib.c -o mylib.o
- Create the shared library:
gcc -shared -o libmylib.so mylib.o
- Compile your program with shared library:
gcc main.c -L. -lmylib -o main
- Run your program (make sure library path is set):
export LD_LIBRARY_PATH=.:$LD_LIBRARY_PATH
./main
Interview Tip:
- Question: Why is -fPIC needed for shared libraries?
- Answer: To generate position-independent code, so the library can be loaded anywhere in memory at runtime.
Using Libraries
Static Library Usage
- Link at compile-time using the -l and -L options.
- Example:
gcc main.c -L. -lmylib -o main
Dynamic Library Usage
- Link at compile-time, but code is loaded at runtime.
- Example: gcc main.c -L. -lmylib -o main
- Use LD_LIBRARY_PATH or /etc/ld.so.conf to locate shared libraries.
Dynamic Loading at Runtime (Optional / Advanced)
- Use dlopen(), dlsym(), dlclose() in Linux.
#include <dlfcn.h>
#include <stdio.h>
int main() {
    // build with: gcc main.c -ldl
    void *handle = dlopen("./libmylib.so", RTLD_LAZY);
    if (!handle) { fprintf(stderr, "%s\n", dlerror()); return 1; }
    void (*greet)(void) = (void (*)(void))dlsym(handle, "greet");
    if (!greet) { fprintf(stderr, "%s\n", dlerror()); dlclose(handle); return 1; }
    greet();
    dlclose(handle);
    return 0;
}
Interview Tip:
- Question: Difference between compile-time and runtime linking?
- Compile-time → Static / shared libraries linked at compilation
- Runtime → Dynamic loading with dlopen()
Managing Libraries
4.1 Dynamic Library Search Paths
- Environment variable: LD_LIBRARY_PATH
- System-wide path: /etc/ld.so.conf
- Run ldconfig to update the cache
4.2 Versioning of Libraries
- Shared libraries often use version numbers: libmylib.so.1.0
- Symbolic link example:
libmylib.so -> libmylib.so.1
libmylib.so.1 -> libmylib.so.1.0
4.3 Tools for Library Management
- ldd <executable> → check linked libraries
- nm <library> → list symbols
- objdump -x <library> → examine library structure
1. What is a library?
A library is a collection of precompiled functions or code that can be used in multiple programs.
- Purpose: Code reuse, modularity, easier maintenance.
- Example: printf() in C comes from libc.
Interview Tip: Mention both static and dynamic libraries when asked.
2. Difference between static and dynamic library
| Feature | Static Library (.a / .lib) | Dynamic / Shared Library (.so / .dll) |
|---|---|---|
| Linking | At compile-time | At runtime |
| Executable Size | Larger (contains library code) | Smaller (code loaded at runtime) |
| Update | Requires recompilation to update library | Can update library without recompiling |
| Dependency | No dependency at runtime | Library must be present at runtime |
| Example | libmylib.a | libmylib.so |
3. Advantages of using libraries
- Reuse of code → avoid rewriting functions.
- Reduce development time.
- Modular programming.
- Easier to maintain and update.
- Can share commonly used functions across multiple programs.
4. How do you create a static library in C/C++?
- Write code: mylib.c
#include <stdio.h>
void greet() {
    printf("Hello from static library!\n");
}
- Compile object file:
gcc -c mylib.c -o mylib.o
- Create static library:
ar rcs libmylib.a mylib.o
5. How to link static library while compiling?
gcc main.c -L. -lmylib -o main
- -L. → library search path
- -lmylib → link libmylib.a
6. Why executable size increases with static library?
- Because all code from the library is copied into the executable at compile-time.
- Even if your program uses only one function, all referenced library code is included.
7. How to create a shared library?
- Write functions (same as above).
- Compile with Position Independent Code (PIC):
gcc -fPIC -c mylib.c -o mylib.o
- Create shared library:
gcc -shared -o libmylib.so mylib.o
8. What is Position Independent Code (PIC)?
- Code that can run at any memory address without modification.
- Required for shared libraries because they can be loaded at different addresses in different programs.
- Compile with -fPIC.
9. How dynamic linking works at runtime?
- The dynamic linker loads the shared library into memory when the program runs.
- Program uses symbols from the library.
- Benefits: smaller executables, library can be updated without recompiling.
10. How to set LD_LIBRARY_PATH?
export LD_LIBRARY_PATH=/path/to/library:$LD_LIBRARY_PATH
- This tells the loader where to find shared libraries at runtime.
11. What is ldconfig?
- Updates the cache of shared libraries used by the dynamic linker.
- Example:
sudo ldconfig
- Ensures programs find shared libraries in system paths.
12. Difference between .so and .a
- .so → Shared library, linked at runtime, uses PIC
- .a → Static library, linked at compile-time, copied into the executable
13. How to version a shared library?
- Use symbolic links:
libmylib.so -> libmylib.so.1
libmylib.so.1 -> libmylib.so.1.0
- This allows multiple versions to coexist.
14. How to check which libraries are linked?
ldd ./executable
- Shows shared libraries used by the executable.
15. Difference between dlopen and static linking
| Feature | Static Linking | dlopen |
|---|---|---|
| Linking | Compile-time | Runtime |
| Flexibility | Fixed | Can load/unload dynamically |
| Example | gcc main.c -lmylib | dlopen("libmylib.so") |
16. What are undefined symbols? How to resolve?
- Undefined symbols: Functions or variables referenced but not defined in the program or linked libraries.
- Resolution:
- Link with correct library
- Check function spelling
- Use nm to see symbols in the library
17. What is weak vs strong symbol in library?
- Strong symbol: Must have one definition, linker uses it first.
- Weak symbol: Can be overridden by another definition.
- Useful for providing default implementations in libraries.
What is a Symbol?
A symbol is a name (function or global variable) that the linker resolves.
Example:
int count; // symbol: count
void foo(); // symbol: foo
Strong Symbol
A strong symbol has a definite definition.
Examples:
int x = 10; // strong global variable
void foo() { } // strong function
Rules:
- Only one strong definition allowed
- Two strong symbols with same name → linker error
multiple definition of `foo`
Weak Symbol
A weak symbol is an optional / overridable definition.
Examples:
__attribute__((weak)) void foo() { }
or (common case):
int x; // tentative definition (weak)
Rules:
- Weak symbol can be overridden by a strong one
- Multiple weak symbols → no linker error
- Used as default implementation
Linker Resolution Rules (IMPORTANT)
| Situation | Result |
|---|---|
| Strong + Strong | ❌ Linker error |
| Weak + Weak | ✔ One chosen |
| Weak + Strong | ✔ Strong wins |
| Only Weak | ✔ Weak used |
Why Weak Symbols Are Used in Libraries
1️⃣ Default Implementation (Override allowed)
// library.c
#include <stdio.h>
__attribute__((weak))
void log_message() {
    printf("Default log\n");
}

// user.c
#include <stdio.h>
void log_message() {
    printf("Custom log\n");
}
✔ User version overrides library version
2️⃣ Optional Features / Hooks
- Startup code (_init)
- OS hooks
- Driver overrides
- Embedded systems
3️⃣ Reduce Hard Dependencies
Library provides fallback behavior if symbol is not defined.
Weak Symbols in Shared Libraries
- Weak symbols are resolved at runtime
- Strong symbols in application override shared library weak symbols
Example:
App strong foo → overrides libfoo.so weak foo
How to Check Weak vs Strong Symbols
nm libfoo.so
Output:
W foo # Weak symbol
T bar # Strong symbol (text)
One-Line Interview Answer
A strong symbol must have exactly one definition, while a weak symbol provides a default definition that can be overridden by a strong one during linking.
Common Pitfalls
- Weak symbols hide missing implementations
- Debugging override issues can be tricky
- Overuse leads to unclear ownership
18. How to handle dependencies between shared libraries?
- Use LD_LIBRARY_PATH, rpath, or system-wide paths
- Example: -Wl,-rpath,/path/to/libs during compilation
- Ensure dependent libraries are available at runtime
Shared libraries often depend on other shared libraries. These dependencies must be correctly resolved at build time and runtime.
In a Makefile-based build, dependencies shared across multiple targets can be defined in a common include file such as common.mk, while target-specific libraries belong in their respective Makefiles.
19. How does the linker search for libraries?
- Compile-time: -L/path
- Runtime: LD_LIBRARY_PATH, /etc/ld.so.conf + ldconfig
- Default system paths like /lib, /usr/lib
20. How can static and shared libraries coexist in a project?
- Use static libraries for critical code that must be embedded
- Use shared libraries for common code that may be updated or reused across programs
- Compile and link appropriately: gcc main.c -L. -lmylib -static-libgcc -o main
What is Process Management?
Process Management is a fundamental function of an operating system (OS).
It is responsible for:
- Creating, scheduling, and terminating processes
- Managing process execution and resources
- Ensuring efficient CPU utilization and system stability
Simply put, process management is how the OS keeps track of all running programs and decides who runs, when, and for how long.
Objectives of Process Management
- Efficient CPU Utilization:
- Maximize CPU usage by scheduling multiple processes efficiently.
- Process Isolation and Protection:
- Ensure processes don’t interfere with each other’s memory or resources.
- Synchronization and Communication:
- Manage inter-process communication (IPC) and prevent race conditions.
- Deadlock Prevention:
- Detect and handle situations where processes wait indefinitely for resources.
- Resource Allocation:
- Allocate CPU, memory, I/O devices fairly to all processes.
Components of Process Management
A. Process Creation
- Processes can be created by:
- User requests (running a program)
- Parent process creating child (fork in Linux)
- Example in Linux:
pid_t pid = fork(); // creates a child process
if (pid == 0) {
    // child process
} else {
    // parent process
}
B. Process Termination
- When a process finishes execution:
- Releases memory, CPU, and other resources
- Parent may collect exit status (using wait() in Linux)
- Types of termination:
- Normal termination: Process completes normally
- Abnormal termination: Due to error or signal
C. Process Scheduling
- Determines which process runs next on the CPU.
- Two main types:
- Long-term scheduling: Decides which job enters the ready queue.
- Short-term scheduling: Decides which ready process gets CPU next.
- Schedulers in Linux:
- CFS (Completely Fair Scheduler) for normal tasks
- RT (Real-Time) Scheduler for high-priority real-time tasks
D. Process States
- New: Process created but not yet admitted to ready queue
- Ready: Waiting to get CPU
- Running: Currently executing
- Waiting/Blocked: Waiting for I/O or event
- Terminated: Finished execution
E. Process Control Block (PCB)
- OS keeps a PCB for every process, containing:
- Process ID (PID)
- Process state
- Program counter (PC)
- CPU registers
- Memory pointers
- Open files
- Scheduling info
PCB acts as a process identity card in the OS.
F. Inter-Process Communication (IPC)
- Process management also ensures processes can communicate safely:
- Shared memory
- Message queues
- Pipes
- Signals
G. Context Switching
- When CPU switches from one process to another:
- Save current process’s context (registers, PC)
- Load next process’s context
- Overhead exists, but it allows multitasking
4. Importance in Linux
- Linux is a multi-tasking, multi-user OS.
- Process management ensures:
- Fair CPU distribution
- Isolation between user processes
- Efficient handling of thousands of processes simultaneously
Summary Table
| Feature | Description |
|---|---|
| Process Creation | Forking, exec, user requests |
| Process Termination | Normal/Abnormal, releasing resources |
| Scheduling | Long-term, short-term, CFS, RT |
| Process States | New, Ready, Running, Waiting, Terminated |
| PCB | Stores process info for OS |
| IPC | Communication between processes |
| Context Switching | Save/restore CPU state for multitasking |
Introduction to Program Loading
Program loading is the process of taking a program from disk storage and making it ready to run in memory (RAM).
Key Steps in Program Loading:
- Compile and link:
- Source code (.c) → Object file (.o) → Executable (a.out or .elf)
- Program is stored on disk:
- As executable files.
- Loader reads the executable:
- Loads text (code), data, bss, heap, and stack into memory.
- Dynamic linking (if required):
- Links shared libraries at load time.
- Program execution begins:
- Sets up stack, heap, and registers.
- Jumps to the program’s entry point (like main() in C).
Interview Tip:
- Be ready to explain static vs dynamic linking in this context.
- Mention ELF format in Linux for executables.
Process: Defined
A process is a running instance of a program.
- Think of a program as a passive entity, and a process as an active entity that executes the code.
- It includes:
- Program code
- Process stack (function calls, local variables)
- Heap (dynamic memory allocation)
- Data segment (global/static variables)
- Registers, program counter (PC), status
- Process control block (PCB)
Characteristics of a process:
- Has a unique PID (Process ID)
- Exists in process states: New → Ready → Running → Waiting → Terminated
Interview Tip:
- Be ready to explain difference between process and thread:
- Process: Own memory space
- Thread: Shares memory with other threads in the same process
Understanding Process Address Space
Every process has a virtual memory space, which is divided as follows:
+-------------------+ <-- Higher memory
| Stack | (Function calls, local variables)
+-------------------+
| Heap | (Dynamic memory: malloc/new)
+-------------------+
| BSS | (Uninitialized global/static vars)
+-------------------+
| Data | (Initialized global/static vars)
+-------------------+
| Text (Code) | (Instructions)
+-------------------+ <-- Lower memory
Key Points:
- Each process has its own isolated virtual address space.
- Memory protection prevents one process from modifying another process’s memory.
- Linux uses paging and MMU (Memory Management Unit) to map virtual addresses to physical addresses.
Interview Tip:
- Explain stack grows downward, heap grows upward.
- Can mention difference between user space and kernel space.
Kernel Process Descriptor
The kernel process descriptor is a data structure in the Linux kernel that represents a process internally.
- In Linux, it’s called task_struct.
- Stored in kernel memory for every process.
- Contains:
- Process ID (PID)
- Parent PID
- State (Running, Waiting, etc.)
- Pointers to memory segments (code, data, stack, heap)
- File descriptors (open files)
- CPU scheduling info (priority, timeslice)
- Signals and signal handlers
- Accounting info (CPU time, memory usage)
Interview Tip:
- Always mention task_struct in Linux when asked about the kernel representation of processes.
- Bonus: You can talk about the thread_info structure for lightweight threads.
Introduction to Linux Process Scheduler
The Linux scheduler decides which process runs on the CPU and for how long.
Key Points:
- Linux supports preemptive multitasking.
- Every process has a priority.
- The scheduler maintains ready queue and waiting queue.
- Common process states in Linux:
- TASK_RUNNING: Ready or running
- TASK_INTERRUPTIBLE: Waiting for an event, can be interrupted
- TASK_UNINTERRUPTIBLE: Waiting, cannot be interrupted
- TASK_STOPPED: Stopped by signal
- TASK_ZOMBIE: Process terminated but not cleaned up
Schedulers in Linux:
- Completely Fair Scheduler (CFS)
- Default for normal processes
- Uses red-black tree to maintain fair CPU distribution
- Ensures no starvation
- Real-Time Scheduler (RT)
- SCHED_FIFO: First in, first out
- SCHED_RR: Round-robin for real-time tasks
Interview Tip:
- Mention priority, timeslice, preemption, and fairness.
- Be able to explain difference between CFS and RT scheduler.
Summary Table for Interview
| Concept | Key Points |
|---|---|
| Program Loading | Loader reads executable → memory → dynamic linking → execution |
| Process | Active program; PID, stack, heap, data, text; process states |
| Process Address Space | Virtual memory: Text, Data, BSS, Heap, Stack; isolated per process |
| Kernel Process Descriptor | task_struct in Linux; holds all process info |
| Linux Process Scheduler | Decides CPU allocation; CFS (fair), RT (real-time), uses queues, priority |
Introduction to Stack
Definition:
A stack is a LIFO (Last In First Out) data structure used in memory to store temporary data such as function calls, local variables, and return addresses.
- LIFO: Last item pushed is the first item popped.
- Primary Use: Function calls, recursion, local variables, CPU context during interrupts, and expression evaluation.
Memory layout:
- Stack usually grows downwards in most architectures (from high memory to low memory).
- It is a part of process memory layout, along with code/text, heap, and data segments.
Interview Q: Why is stack used in function calls?
Answer: Stack stores function parameters, return addresses, and local variables, allowing nested or recursive function calls to execute correctly.
How Stack Grows and Shrinks
Stack Growth:
- Push operation: Adds data to the stack → stack grows downward (in most systems) toward lower memory addresses.
- Pop operation: Removes data from the stack → stack shrinks upward toward higher memory addresses.
Example (x86 32-bit):
High Memory
| |
| |
| |
| |
|-------------| <- Stack starts here (top)
| Local var   | <- Push
| Return Addr |
| Parameters  |
Low Memory
- Registers Involved:
- ESP/RSP → Stack Pointer (points to the top of the stack)
- EBP/RBP → Base Pointer (used to reference the function frame)
Interview Q: Does stack grow upward or downward?
Answer: Usually downward in most architectures like x86, but it depends on CPU architecture.
How Function Parameters Are Passed
Function parameters can be passed:
- Via Stack (common in x86, C default calling conventions)
- Push parameters right-to-left. The function reads them relative to the base pointer.
void foo(int a, int b);
foo(10, 20);
Stack layout before foo executes:
[b = 20] <- top of stack
[a = 10]
[Return Address]
[Old EBP]
Via Registers (common in ARM, x64)
- Parameters passed through registers (r0-r3 in ARM; RCX, RDX, R8, R9 in x64 Windows).
Interview Q: Why sometimes parameters are passed in registers instead of stack?
Answer: Registers are faster than memory access, so small number of parameters are passed in registers for efficiency.
4. Stack Frame (Activation Record)
Definition:
A stack frame is a block of memory created on the stack for a function call. It stores:
- Function parameters
- Local variables
- Return address
- Saved base pointer (old EBP/RBP)
Structure of a stack frame (x86):
Higher Memory
-----------------
Function Params <- passed by caller
Return Address
Old Base Pointer <- EBP of previous function
Local Variables
-----------------
Lower Memory
Creation of stack frame (function call):
- Caller pushes parameters.
- Caller executes the CALL instruction → pushes return address.
- Callee:
- Pushes old EBP
- Sets EBP = ESP
- Allocates space for local variables (ESP -= locals_size)
Destruction of stack frame (function return):
- Restore ESP = EBP
- Pop old EBP
- Return using RET → pops return address from stack
Interview Q: What is an activation record?
Answer: Another name for stack frame, stores all info needed for a function execution.
5. Step-by-Step Example
int sum(int a, int b) {
    int c = a + b;
    return c;
}

int main() {
    int result = sum(10, 20);
    return 0;
}
Execution Stack (simplified):
- Before calling sum():
main() stack frame:
[Return Address to OS]
[Old EBP]
[main locals: result]
- During sum():
sum() stack frame (parameters pushed first, then return address):
[b = 20]
[a = 10]
[Return Address to main]
[Old EBP of main]
[c = 30]
- sum() executes → returns 30 → stack frame destroyed → back to main.
6. How Stack Is Managed by CPU
- Stack Pointer (SP / ESP / RSP): Points to top of stack
- Base Pointer (BP / EBP / RBP): Points to base of current stack frame
- Push/Pop Instructions: Automatically update SP.
- CALL Instruction: Pushes return address, jumps to function.
- RET Instruction: Pops return address, resumes execution.
Interview Q: What registers are used in stack management?
Answer: Stack Pointer (SP) for top of stack, Base Pointer (BP) to access local variables/parameters, Instruction Pointer (IP) for return addresses.
7. Common Interview Questions (with Answers)
Q1: Difference between stack and heap?
A:
| Feature | Stack | Heap |
|---|---|---|
| Allocation | Automatic | Manual (malloc/free) |
| Size | Limited | Larger |
| Access | Fast | Slower |
| Lifetime | Function lifetime | Until freed |
| Growth | Downward | Upward |
Q2: What happens if stack overflows?
A: Stack overflow occurs when recursion or deep function calls exceed stack memory → usually crashes program or triggers segmentation fault.
Q3: Can local variables be accessed after function returns?
A: No, they exist only in the stack frame. After return, memory is reclaimed.
Q4: How recursion uses stack?
A: Each recursive call creates a new stack frame storing its parameters and local variables. Base case stops recursion to avoid overflow.
Q5: What is difference between stack and static memory?
- Stack: Dynamic, automatic, LIFO, local variables
- Static: Fixed, global/static variables, exist throughout program lifetime
Q6: How is stack different from queue?
- Stack: LIFO (last in, first out)
- Queue: FIFO (first in, first out)
Q7: What are saved registers in stack frame?
- Usually callee-saved registers (like EBX, ESI in x86) are saved to maintain state across function calls.
Application Programming Interfaces (API)
1. What is an API? (Definition)
API (Application Programming Interface) is a set of functions, rules, and protocols that allows one software component to communicate with another without knowing its internal implementation.
In simple words:
API = Contract between software components
Example:
int open(const char *pathname, int flags);
You don’t know how open() works internally in the kernel — you just use it.
2. Why Do We Need an API? (Understanding the Need)
Problem without APIs
- Applications would directly access hardware or kernel internals
- Very unsafe
- Hardware dependent
- No portability
- Difficult to maintain
APIs solve these problems
| Problem | How API helps |
|---|---|
| Hardware complexity | Hides hardware details |
| Security | Prevents direct kernel access |
| Portability | Same API works across platforms |
| Maintainability | Internal changes don’t affect apps |
| Reusability | Same API used by multiple apps |
Interview Line
APIs provide abstraction, safety, portability, and standardization for application development.
3. Types of APIs (Interview Important)
User-Space APIs
Used by applications in user mode
Examples:
- POSIX APIs (printf(), malloc(), open())
- C standard library (libc)
- Qt, Android APIs
Kernel APIs (Internal)
Used inside the kernel, not by applications
Examples:
- kmalloc()
- schedule()
- copy_to_user()
User applications cannot directly call kernel APIs
4. API vs System Calls
| Feature | API | System Call |
|---|---|---|
| Level | High-level | Low-level |
| Mode | User mode | Switches to kernel mode |
| Who provides | Libraries (glibc) | OS Kernel |
| Portability | High | Low |
| Direct hardware access | ❌ No | ✅ Yes |
Example Flow
printf("Hello");
Internally:
printf() → write() → system call → kernel → device driver
Key Interview Point
APIs may internally use system calls, but APIs are NOT system calls themselves.
5. What is a System Call? (Quick Recap)
A system call is a controlled entry point that allows a user-space program to request services from the kernel.
Examples:
- read()
- write()
- fork()
- exec()
System calls require mode switching.
6. User Mode vs Kernel Mode (Core OS Concept)
User Mode
- Limited privileges
- Cannot access hardware
- Runs applications
Kernel Mode
- Full privileges
- Direct hardware access
- Runs OS kernel & drivers
| Feature | User Mode | Kernel Mode |
|---|---|---|
| Hardware access | ❌ | ✅ |
| Crash impact | Only app | Entire OS |
| Security | Safe | Critical |
7. User Mode → Kernel Mode Transition (Step-by-Step)
How does it happen?
- Application calls an API
- API invokes a system call
- CPU executes a software interrupt / syscall instruction
- CPU switches to kernel mode
- Kernel performs requested operation
- Control returns to user mode
Architecture Example (x86)
- Instruction:
syscall/int 0x80
Interview Diagram (Explain verbally)
Application (User Mode)
↓
API Call
↓
System Call
↓
Kernel Mode Execution
↓
Return Result
8. Why Shouldn't Applications Call System Calls Directly?
| Reason | Explanation |
|---|---|
| Security | Prevent unauthorized access |
| Stability | Avoid OS crashes |
| Hardware protection | Prevent misuse |
| Standardization | APIs abstract OS differences |
9. APIs and Application Portability (VERY IMPORTANT)
What is Portability?
Ability of software to run on multiple platforms with minimal changes
How APIs Enable Portability
- Same API interface across OSes
- Internals change, API stays same
Example:
printf("Hello");
Runs on:
- Linux
- QNX
- Android
- Embedded Linux
Because:
the printf() API is standardized (POSIX / C standard)
Without APIs
- Direct system calls
- Hardware-specific code
- Rewrite application for every OS
Interview Line
APIs act as a platform-independent layer, enabling application portability.
10. API Example in Embedded / QNX Context
For engineers working with QNX and embedded systems, this topic comes up frequently in interviews.
Example:
read(fd, buffer, size);
- The application doesn’t know whether the data comes from a UART, SPI bus, or audio device
- The QNX kernel handles it via drivers
This allows same application to run on different SoCs.
11. Real-World Analogy (Interview Friendly)
API = Restaurant Menu
- Menu = API
- Kitchen = Kernel
- Customer = Application
The customer doesn’t enter the kitchen
The customer orders via the menu (API)
12. Common Interview Questions & Answers
Q1: Are APIs platform dependent?
APIs are platform-independent
System calls are platform-dependent
Q2: Can API exist without system calls?
Yes (pure user-space APIs like math libraries)
Q3: Does every API call result in a system call?
No
Example:
strlen()
Works entirely in user space
Q4: Why not expose system calls directly to applications?
- Security risks
- Portability issues
- Kernel instability
Complete Interview Questions & Answers
What is an API?
Answer:
An API (Application Programming Interface) is a set of predefined functions, rules, and protocols that allows an application to communicate with another software component or the operating system without knowing internal implementation details.
Why do we need APIs?
Answer:
APIs are needed to:
- Hide hardware and kernel complexity
- Provide security and controlled access
- Improve code reusability
- Enable portability across platforms
- Simplify application development
What problems would occur without APIs?
Answer:
Without APIs:
- Applications would directly access hardware
- Security would be compromised
- System crashes would increase
- Code would be hardware-dependent
- Applications would not be portable
What are the main advantages of APIs?
Answer:
- Abstraction
- Security
- Portability
- Maintainability
- Scalability
- Reusability
What are the different types of APIs?
Answer:
- User-space APIs – Used by applications
- Kernel APIs – Used internally by the OS kernel
- Library APIs – Provided by libraries like libc, Qt
- Web APIs – REST, HTTP APIs
What is a User-Space API?
Answer:
User-space APIs are functions that run in user mode and are called directly by applications.
Examples: printf(), malloc(), open()
What is a Kernel API?
Answer:
Kernel APIs are internal functions used inside the kernel to manage memory, processes, and hardware.
Examples: kmalloc(), schedule()
Can user applications access kernel APIs directly?
Answer:
No
User applications cannot directly access kernel APIs due to security and stability reasons.
What is a System Call?
Answer:
A system call is a mechanism that allows a user-space program to request services from the kernel by switching from user mode to kernel mode.
Examples of System Calls?
Answer:
- read()
- write()
- fork()
- exec()
- exit()
API vs System Call
| Feature | API | System Call |
|---|---|---|
| Level | High-level | Low-level |
| Mode | User mode | Kernel mode |
| Portability | High | Low |
| Hardware access | No | Yes |
| Safety | Safer | Risky |
Are API and system call the same?
Answer:
No
APIs may internally use system calls, but APIs themselves are not system calls.
Does every API call invoke a system call?
Answer:
No
Example:
strlen()
This works completely in user space.
Why not allow applications to use system calls directly?
Answer:
- Security risks
- OS instability
- No abstraction
- Poor portability
What is User Mode?
Answer:
User mode is a restricted CPU mode where applications run with limited privileges and no direct hardware access.
What is Kernel Mode?
Answer:
Kernel mode is a privileged CPU mode where the OS has full control over hardware, memory, and processes.
Difference between User Mode and Kernel Mode?
| Feature | User Mode | Kernel Mode |
|---|---|---|
| Privileges | Limited | Full |
| Hardware access | No | Yes |
| Crash impact | App only | Whole OS |
Why do we need two modes?
Answer:
To:
- Protect the system
- Prevent faulty applications from crashing the OS
- Enforce security boundaries
How does user mode to kernel mode transition happen?
Answer (Step-by-step):
- Application calls API
- API triggers system call
- CPU executes syscall instruction
- CPU switches to kernel mode
- Kernel performs operation
- Control returns to user mode
Which CPU instruction is used for mode switching?
Answer:
- syscall (x86-64)
- sysenter (legacy Intel)
- int 0x80 (older 32-bit x86)
What is an API Wrapper?
Answer:
An API wrapper is a library function that wraps a system call and provides a user-friendly interface.
Example:
printf() → write() → syscall
What is libc?
Answer: libc is the C standard library that provides APIs for:
- File handling
- Memory management
- Process control
How does API improve portability?
Answer:
APIs provide a standard interface, allowing the same application code to run on different platforms without modification.
Why are system calls not portable?
Answer:
Because system calls are:
- OS-specific
- Architecture-dependent
- Kernel-implementation dependent
What is POSIX API?
Answer:
POSIX is a standardized API specification that ensures portability across UNIX-like operating systems.
Example of portability using APIs?
Answer:
read(fd, buf, size);
Works on:
- Linux
- QNX
- Android
- Embedded Linux
API vs Driver
Answer:
- API → Application-facing interface
- Driver → Hardware-facing interface
Can an API exist without kernel involvement?
Answer:
Yes
Example:
- Math APIs
- String APIs
What happens if an API fails?
Answer:
API returns:
- Error codes
- NULL pointers
- errno values
What is errno?
Answer: errno is a thread-local variable (historically a global) set by failed library and system calls to indicate the cause of the most recent failure.
Role of APIs in Embedded Systems?
Answer:
- Hardware abstraction
- RTOS portability
- Driver isolation
- Faster development
API usage in QNX (Interview Bonus)
Answer:
QNX uses POSIX-compliant APIs allowing applications to remain portable across different SoCs and BSPs.
What is the biggest benefit of APIs?
Answer:
Abstraction + Portability
One-line API definition for interviews?
Answer:
An API is a standardized interface that allows applications to interact with the operating system safely and portably.
Virtual Address Space & Process Memory Management
1. Introduction to Virtual Address Space
Q1. What is a Virtual Address?
A virtual address is an address generated by a program during execution.
It does not directly point to physical RAM.
Instead:
- Virtual Address → MMU (Memory Management Unit) → Physical Address
Each process sees its own private virtual address space.
Q2. What is Virtual Address Space (VAS)?
Virtual Address Space is the range of virtual addresses available to a process.
Example:
- 32-bit system → 4 GB virtual address space
- 64-bit system → theoretically 2⁶⁴ bytes (OS limits it)
Q3. Why do we need Virtual Address Space?
Key reasons:
- Process isolation – one process cannot access another’s memory
- Security – prevents accidental/malicious access
- Memory abstraction – programs don’t care about physical RAM layout
- Efficient memory usage – supports paging & swapping
- Simplifies programming – same address layout for all processes
Q4. Difference between Virtual Address and Physical Address
| Virtual Address | Physical Address |
|---|---|
| Used by program | Used by hardware |
| Process-specific | System-wide |
| Translated by MMU | Actual RAM location |
| Not directly accessible | Accessed by CPU |
2. Managing Process Address Space
Q5. What is a Process Address Space?
It is the entire memory layout assigned to a process.
Typical layout (Linux):
High Address
-----------------
Stack
-----------------
Memory Mapped Region
-----------------
Heap
-----------------
BSS
-----------------
Data
-----------------
Text (Code)
-----------------
Low Address
Q6. Who manages the process address space?
- Kernel manages it
- Uses:
- Page tables
- MMU
- Virtual memory subsystem
Each process has its own page table.
Q7. What is Page Table?
A page table maps:
- Virtual Page Number → Physical Frame Number
Used by MMU during address translation.
Q8. What happens during context switch related to memory?
- Kernel switches:
- Page table base register
- TLB entries may be flushed
- New process sees its own virtual memory
3. Stack Allocations
Q9. What is Stack Memory?
Stack is a memory region used for:
- Function calls
- Local variables
- Function parameters
- Return addresses
Q10. How does Stack grow?
- On most architectures (x86, ARM):
- Grows downward (high → low address)
Q11. What is Stack Frame?
A stack frame is created for each function call and contains:
- Function parameters
- Local variables
- Saved registers
- Return address
Q12. Who allocates and deallocates stack memory?
- Automatically managed
- Allocated when function is called
- Deallocated when function returns
Q13. Stack vs Heap (Interview Favorite)
| Stack | Heap |
|---|---|
| Fast | Slower |
| Auto-managed | Programmer-managed |
| Limited size | Larger |
| Lifetime: function scope | Lifetime: until freed |
| Risk: Stack overflow | Risk: Memory leak |
Q14. What is Stack Overflow?
Occurs when:
- Deep recursion
- Large local variables
Leads to segmentation fault.
4. Heap & Data Segment Management
Q15. What is Heap Memory?
Heap is used for:
- Dynamic memory allocation at runtime
Allocated using:
malloc(), calloc(), realloc()
Q16. How does Heap grow?
- Grows upward (low → high address)
Q17. What is Data Segment?
Stores global and static variables.
Q18. Types of Data Segment
Initialized Data Segment
int a = 10;
static int b = 5;
Uninitialized Data Segment (BSS)
int x;
static int y;
Q19. Why is BSS important?
- Occupies no space in executable
- Initialized to zero at runtime
- Saves disk space
5. Memory Maps
Q20. What is a Memory Map?
A memory map shows how a process’s virtual address space is laid out.
Q21. How to view process memory map in Linux?
cat /proc/<pid>/maps
Q22. What does /proc/pid/maps show?
- Address ranges
- Permissions (rwx)
- Mapped files
- Stack, heap, shared libraries
Example:
00400000-00452000 r-xp /bin/app
00652000-00653000 rw-p [heap]
Q23. What is Memory-Mapped I/O?
Mapping files or devices directly into process address space using:
mmap()
Used for:
- Shared memory
- File I/O optimization
- Device access
6. Dynamic Memory Allocation & De-allocation
Q24. What is Dynamic Memory Allocation?
Allocating memory at runtime from heap.
Q25. Functions used in C
| Function | Purpose |
|---|---|
| malloc() | Allocate uninitialized memory |
| calloc() | Allocate zero-initialized memory |
| realloc() | Resize memory |
| free() | Deallocate memory |
Q26. Difference between malloc and calloc
| malloc | calloc |
|---|---|
| Uninitialized | Zero-initialized |
| Faster | Slightly slower |
| Single block | Multiple blocks |
Q27. What is Memory Leak?
Occurs when:
- Allocated memory is not freed
Effects:
- Increased RAM usage
- System slowdown
- Crash in embedded systems
Q28. What is Dangling Pointer?
Pointer referencing memory that has been freed.
Q29. What is Fragmentation?
- Internal fragmentation – unused memory inside allocated block
- External fragmentation – free memory split into small pieces
Q30. How does OS allocate heap memory internally?
Uses:
- brk() / sbrk() → heap extension
- mmap() → large allocations
7. Memory Locking
Q31. What is Memory Locking?
Prevents memory pages from being:
- Swapped out to disk
Q32. Why is Memory Locking required?
Used in:
- Real-time systems
- Audio/video processing
- Embedded & QNX/Linux RTOS
Ensures deterministic performance.
Q33. Functions used for Memory Locking
mlock()
munlock()
mlockall()
munlockall()
Q34. What does mlockall() do?
Locks:
- All current and future memory pages of process
mlockall(MCL_CURRENT | MCL_FUTURE);
Q35. What happens if memory is not locked in RT systems?
- Page faults
- Unpredictable latency
- Missed deadlines
Q36. Any limitations of Memory Locking?
- Requires privileges
- Limited by system settings
- Excessive locking affects overall system
8. Interview Rapid-Fire Questions
Q37. Can two processes have same virtual address?
Yes
But mapped to different physical memory
Q38. Who translates virtual to physical address?
MMU with page tables
Q39. What causes Segmentation Fault?
- Invalid memory access
- Stack overflow
- Dereferencing NULL pointer
Q40. What is Copy-on-Write (CoW)?
- Memory shared until modification
- Used during fork()
Q41. Heap vs mmap allocation – when used?
- Small allocations → Heap
- Large allocations → mmap
9. One-Line Interview Summary
Virtual memory provides each process an isolated address space where stack, heap, data, and mapped regions are managed by the kernel using paging, ensuring security, efficiency, and deterministic behavior when required.
Conclusion
Linux System Programming builds a strong foundation for understanding how software truly works on a Linux system. By learning Linux System Programming, developers gain clarity on process behavior, memory usage, system calls, and kernel interaction, which are essential for writing efficient and reliable programs. These fundamentals not only improve coding confidence but also help in debugging real-world issues and performing better in technical interviews. Whether you are a beginner or an experienced developer, mastering Linux System Programming opens the door to advanced topics like embedded Linux, device drivers, and high-performance system software, making it a valuable skill for long-term career growth.
Frequently Asked Questions (FAQ) : Linux System Programming
1. What is Linux System Programming?
Linux System Programming involves writing programs that interact directly with the Linux operating system using system calls and low-level interfaces.
2. Why is Linux System Programming important?
Linux System Programming helps developers understand how applications communicate with the Linux kernel, improving performance, reliability, and debugging skills.
3. Who should learn Linux System Programming?
Linux System Programming is ideal for embedded engineers, Linux developers, backend programmers, and students preparing for system-level interviews.
4. What are the core topics in Linux System Programming?
Linux System Programming covers processes, system calls, file handling, memory management, signals, and user space vs kernel space concepts.
5. Is Linux System Programming beginner-friendly?
Linux System Programming can be learned by beginners with basic C knowledge when explained step by step with practical examples.
6. Which language is used for Linux System Programming?
Linux System Programming is primarily done using the C programming language because of its close interaction with the Linux kernel.
7. Is Linux System Programming required for embedded systems?
Yes, Linux System Programming is essential for embedded Linux development and understanding low-level system behavior.
8. How does Linux System Programming differ from application programming?
Linux System Programming focuses on low-level OS interaction, while application programming focuses on high-level user functionality.
9. Does Linux System Programming help in debugging?
Linux System Programming improves debugging skills by giving insight into process execution, memory usage, and system resources.
10. Can Linux System Programming improve career opportunities?
Yes, Linux System Programming skills are highly valued in embedded systems, Linux development, and system software roles.
Read More : Audio Device Driver Interview Questions & Answers
Mr. Raj Kumar is a highly experienced Technical Content Engineer with 7 years of dedicated expertise in the intricate field of embedded systems. At Embedded Prep, Raj is at the forefront of creating and curating high-quality technical content designed to educate and empower aspiring and seasoned professionals in the embedded domain.
Throughout his career, Raj has honed a unique skill set that bridges the gap between deep technical understanding and effective communication. His work encompasses a wide range of educational materials, including in-depth tutorials, practical guides, course modules, and insightful articles focused on embedded hardware and software solutions. He possesses a strong grasp of embedded architectures, microcontrollers, real-time operating systems (RTOS), firmware development, and various communication protocols relevant to the embedded industry.
Raj is adept at collaborating closely with subject matter experts, engineers, and instructional designers to ensure the accuracy, completeness, and pedagogical effectiveness of the content. His meticulous attention to detail and commitment to clarity are instrumental in transforming complex embedded concepts into easily digestible and engaging learning experiences. At Embedded Prep, he plays a crucial role in building a robust knowledge base that helps learners master the complexities of embedded technologies.
