System Calls in Linux

First published: Feb 22, 2026 | Last updated: Mar 30, 2026 | By Atif Alam

System calls are the interface between user space and the kernel. When a program needs to read a file, create a process, or send a signal, it invokes a syscall; the CPU switches to kernel mode, the kernel runs the corresponding handler, and control returns to user space with a return value. This page covers how they work, how to trace them (e.g. strace), and how that helps when troubleshooting application startup.

What are system calls?

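User space cannot access hardware or kernel data structures directly. Instead, the kernel exposes system calls: a fixed set of numbered entry points (e.g. read, write, open, fork, kill). The C library (glibc) wraps them; your program calls read(), and the wrapper invokes the read syscall. Wrapper and syscall names do not always match: current glibc implements fork() with clone and open() with openat. Syscalls are documented in section 2 of the manual (e.g. man 2 read).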

Category	Examples
Process control	`fork`, `execve`, `exit`, `waitpid`, `kill`
File operations	`open`, `close`, `read`, `write`, `stat`, `getdents`
Memory	`brk`, `mmap`, `munmap`
Signals	`kill`, `sigaction`, `rt_sigreturn`
IPC	`pipe`, `socket`, `sendmsg`, `recvmsg`
Networking	`socket`, `bind`, `connect`, `accept`, `send`, `recv`
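To make the wrapper relationship concrete, here is a minimal Python sketch that asks for the process ID three ways: through Python's os module, through the libc wrapper, and through libc's generic syscall() entry point. It assumes Linux, and the syscall numbers in the table are the x86-64 and arm64 values only; on other architectures the raw call is skipped.

```python
import ctypes
import os
import platform

# Load the C library already mapped into this process (assumes Linux).
libc = ctypes.CDLL(None, use_errno=True)

# Syscall numbers are architecture-specific. These are the getpid numbers
# for x86-64 and arm64; other architectures are deliberately left out.
SYS_GETPID = {"x86_64": 39, "aarch64": 172}.get(platform.machine())

pid_from_python = os.getpid()   # Python's wrapper around getpid
pid_from_libc = libc.getpid()   # libc wrapper called directly
print("os.getpid():  ", pid_from_python)
print("libc.getpid():", pid_from_libc)

pid_from_raw = None
if SYS_GETPID is not None:
    # syscall(2) takes the syscall number first, then the arguments.
    pid_from_raw = libc.syscall(ctypes.c_long(SYS_GETPID))
    print("raw syscall:  ", pid_from_raw)
```

All three paths end in the same kernel handler, so the values should agree.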

How system calls work

Rough flow when a user program invokes a syscall (e.g. read()):

Stage	What happens
User call	Program calls libc wrapper (e.g. `read(fd, buf, count)`).
Libc wrapper	Puts syscall number and arguments in registers (per architecture ABI), then triggers the switch to kernel (e.g. `syscall` instruction on x86-64).
Mode switch	CPU switches to kernel mode; kernel syscall entry saves user state and dispatches by syscall number.
Kernel syscall	Kernel runs the handler (e.g. for `read`: VFS → filesystem → block layer).
Return to user	Kernel puts the return value in a register (a negative error code on failure), restores user state, and returns to user mode; libc passes a successful value through, or on failure returns -1 and sets `errno`.

So: user call → libc → mode switch → kernel → return.
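The return path is easiest to see with a call that fails. The sketch below, which assumes Linux and uses ctypes to reach libc directly, calls read() on an invalid file descriptor and inspects the -1 return value and errno. It then shows that Python's os.read reports the same failure as an OSError.

```python
import ctypes
import errno
import os

libc = ctypes.CDLL(None, use_errno=True)
libc.read.restype = ctypes.c_ssize_t   # read() returns ssize_t

buf = ctypes.create_string_buffer(16)

# File descriptor -1 is never valid, so the kernel rejects the call.
ret = libc.read(-1, buf, ctypes.c_size_t(len(buf)))
err = ctypes.get_errno()
print("libc read() returned", ret, "errno =", errno.errorcode[err])

# os.read wraps the same call and turns the -1/errno pair into OSError.
python_errno = None
try:
    os.read(-1, 16)
except OSError as exc:
    python_errno = exc.errno
    print("os.read raised OSError:", errno.errorcode[python_errno])
```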

Relevant tools

Tool	Purpose
`strace`	Trace syscalls and signals of a process.
`ltrace`	Trace library function calls (not raw syscalls).
`man 2 <name>`	Syscall documentation.
`perf`	Sampling and tracepoints (e.g. syscalls).
`gdb`	Step through code; can see syscall entry/return.
`dmesg`	Kernel log.
`/proc/<pid>/syscall`	Current syscall (if any) and arguments (kernel-dependent).

How strace works

strace uses the ptrace interface. The kernel allows a tracer process (strace) to attach to a tracee; on each syscall entry and syscall exit, the tracee stops and the tracer is notified. strace reads the syscall number and arguments from registers (and memory for pointer args), then resumes the tracee. On exit it reads the return value. So you see a log of every syscall: name, arguments, return value, and optionally time spent. Example:

strace -e openat,read,write cat /etc/hostname

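In the output, openat opens the file, read fetches its content, and write sends it to stdout. Expect extra openat and read lines before that, because the dynamic loader opens shared libraries such as libc at startup.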
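For comparison, the three calls in that trace can be reproduced by hand. The following is a rough Python sketch of what cat does, not its actual implementation; the function name cat, the chunk size, and the throwaway file content are illustrative assumptions.

```python
import os
import tempfile

def cat(path, out_fd=1, chunk_size=4096):
    """Copy the file at path to out_fd; return the number of bytes written."""
    fd = os.open(path, os.O_RDONLY)           # typically appears as openat in strace
    total = 0
    try:
        while True:
            chunk = os.read(fd, chunk_size)   # read; b"" signals end of file
            if not chunk:
                break
            view = memoryview(chunk)
            while view:                       # write may accept fewer bytes than offered
                written = os.write(out_fd, view)
                view = view[written:]
                total += written
    finally:
        os.close(fd)
    return total

# Demo on a throwaway file (hypothetical content), writing into a pipe
# so the result can be inspected instead of going to the terminal.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example-host\n")
read_end, write_end = os.pipe()
copied = cat(tmp.name, out_fd=write_end)
os.close(write_end)
echoed = os.read(read_end, 64)
os.close(read_end)
os.unlink(tmp.name)
print(copied, echoed)
```

Running a script like this under strace with the same -e filter should show the corresponding openat, read, and write lines, mixed in with the interpreter's own startup activity.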

Tracing library and function calls

Syscalls are the boundary between user and kernel; library and function calls are inside user space. Tools:

ltrace — Intercepts calls into shared libraries (e.g. malloc, printf) and prints them with their arguments and return values. Useful to see which libc/other library functions are called.
gdb — Step through code; bt for backtrace. Can break on any function.
perf — perf record / perf report for sampling; perf trace for a syscall-oriented trace. Can also trace user symbols with the right options.
ftrace — Kernel-side; trace kernel functions and some entry points. Less about user-space function calls, more about kernel internals.

Using the tools

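Trace a command — strace ./myprog or strace -f ./myprog (follow child processes created by fork/clone). Filter: strace -e trace=openat,read,write ./myprog (current glibc opens files with openat, so filtering on open alone can miss them).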
Attach to running process — strace -p <pid>. Use -e trace=file to limit to file-related syscalls.
Man pages — man 2 read, man 2 open.
Current syscall — cat /proc/<pid>/syscall (format is kernel-specific).
Time per syscall — strace -T -p <pid> to see time spent in each syscall (helps spot I/O-bound behavior or a call that blocks).
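The /proc/<pid>/syscall file is terse, so a small parser can help when reading it. This sketch follows the layout described in proc(5): the word running, or -1 followed by the stack pointer and program counter, or a syscall number followed by argument registers, stack pointer, and program counter. The function names and the sample line are hypothetical, and the layout can differ between kernels.

```python
def parse_proc_syscall(text):
    """Parse the contents of /proc/<pid>/syscall into a dict."""
    fields = text.split()
    if fields == ["running"]:
        return {"state": "running"}
    number = int(fields[0])
    if number == -1:
        # Blocked outside a syscall: only stack pointer and program counter follow.
        return {"state": "blocked",
                "sp": int(fields[1], 16), "pc": int(fields[2], 16)}
    return {"state": "in_syscall",
            "number": number,
            "args": [int(a, 16) for a in fields[1:-2]],
            "sp": int(fields[-2], 16),
            "pc": int(fields[-1], 16)}

def read_current_syscall(pid):
    """Read and parse the file for a live process (needs permission to inspect it)."""
    with open(f"/proc/{pid}/syscall") as f:
        return parse_proc_syscall(f.read())

# Hypothetical sample line: syscall number, six argument registers, sp, pc.
sample = "0 0x3 0x7ffc1a2b3c40 0x400 0x0 0x0 0x0 0x7ffc1a2b3b98 0x7f5d2c114992"
parsed = parse_proc_syscall(sample)
print(parsed["number"], [hex(a) for a in parsed["args"]])
```

The syscall number still has to be mapped to a name using the syscall table for the machine's architecture.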

Example: read() flow

In C, read(fd, buf, count) typically results in:

Libc read() is called.
Libc invokes the read syscall with fd, buf, count.
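Kernel resolves fd to the open file and verifies it was opened for reading (file permissions were checked at open time), then copies data into buf, from the page cache if present or via the filesystem and block layer otherwise.
Kernel returns the number of bytes read (0 at end of file), or a negative error code that libc converts to -1 with errno set.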
Libc returns that value to the caller.
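The return values in this flow can be observed with a pipe. The sketch below is a small illustration, assuming a Unix-like system: it shows a read that returns exactly what was requested, a short read that returns fewer bytes than requested, and the zero-length result at end of file.

```python
import os

r, w = os.pipe()
os.write(w, b"hello")

first = os.read(r, 3)      # asks for 3 bytes, 5 are available
second = os.read(r, 100)   # asks for 100, only 2 remain: a short read

os.close(w)                # no writers left on the pipe
at_eof = os.read(r, 100)   # end of file: zero bytes returned
os.close(r)

print(first, second, at_eof)   # b'hel' b'lo' b''
```

Callers should therefore loop until they have the bytes they need, and treat an empty result as end of file rather than as an error.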

Example: ls -l from shell to kernel

When you run ls -l in a shell:

Shell — Parses the command, then typically calls fork() and, in the child, execve("/usr/bin/ls", ["ls", "-l"], env).
execve (syscall) — Kernel loads the ls executable, sets up its stack and argv/env, and starts it at its entry point; the runtime startup code then calls main.
ls — Opens the current directory with open(".", O_RDONLY | O_DIRECTORY) (or similar; strace usually shows openat), then getdents (or getdents64) to read directory entries.
For each entry, lstat (or stat when following symlinks) to get file metadata (permissions, size, mtime) for the long listing; strace may show this as newfstatat or statx.
write to stdout (fd 1) to print the listing; output is buffered, so one write may carry many lines. The child inherits stdout from the shell, and the kernel handles the actual write to the terminal.

So: shell → fork/execve → ls → open, getdents, stat, write — all implemented via syscalls.
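The same sequence can be sketched in Python, whose os module exposes thin wrappers over these calls. This is a simplified illustration, not how a real shell or ls is written: the helper names run and long_listing are made up for the example, and the listing prints only mode, size, and name.

```python
import os
import stat
import sys

def run(path, argv):
    """What a shell does for an external command: fork, execve, then wait."""
    pid = os.fork()
    if pid == 0:
        try:
            os.execve(path, argv, os.environ)   # replaces the child's program image
        finally:
            os._exit(127)                       # reached only if execve failed
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)                 # killed by a signal

def long_listing(path="."):
    """Roughly what ls -l does: open the directory, read entries, lstat each."""
    lines = []
    dir_fd = os.open(path, os.O_RDONLY | os.O_DIRECTORY)
    try:
        for name in sorted(os.listdir(dir_fd)):      # directory entries via getdents
            st = os.lstat(name, dir_fd=dir_fd)       # metadata without following symlinks
            lines.append(f"{stat.filemode(st.st_mode)} {st.st_size:>10} {name}")
    finally:
        os.close(dir_fd)
    return lines

# Demo: run a child that exits with status 3, then list the current directory.
exit_code = run(sys.executable, [sys.executable, "-c", "raise SystemExit(3)"])
print("child exit status:", exit_code)
listing = long_listing(".")
os.write(1, ("\n".join(listing) + "\n").encode())   # a single write to fd 1
```

The exit status 127 for a failed execve mirrors the convention shells use for "command not found".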

Troubleshooting application startup

When an application won’t start, work through these systematically:

Logs — Check the app’s log file, journalctl -u <unit> for systemd, or dmesg for kernel messages (e.g. OOM, segfault).
Is it running? — ps aux | grep <name>; check exit status if it exited (echo $? after running in shell).
Ports — If it should listen: ss -tlnp or netstat to see if the port is in use or if the app bound to it.
Permissions — Wrong user, missing execute bit, or unreadable config/file. Run as the intended user; check ls -l and ownership.
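strace — Run with strace -f -o trace.log ./myprog (the -o option keeps the trace separate from the program’s own output) and look for failing syscalls near the end: lines that return -1 with an errno name such as ENOENT or EACCES (e.g. a failing open, connect, or execve).
Missing libraries — ldd ./myprog shows dynamic libraries; “not found” means install the package that provides the library or fix the library search path (LD_LIBRARY_PATH or the ldconfig cache). PATH does not affect library lookup.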
Environment — Wrong PATH, HOME, or required env vars. Run with env -i to strip env and test, or compare with a working environment.
Limits — ulimit -a; too-low limits (e.g. open files, stack) can cause failure. Adjust or run under systemd with LimitNOFILE=... etc.
SELinux — If enabled, denials can block execution or file access. Check getenforce (Enforcing vs Permissive); inspect ausearch -m avc or audit log for denials. Temporarily set to Permissive to confirm, then fix labels or policy.
AppArmor — Similarly can block execution. Check aa-status; look at the app’s profile and audit logs. Easiest test: aa-complain /path/to/profile or disable for that profile to see if startup succeeds.
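A few of these checks (permissions, ports, limits) are easy to script before reaching for strace. The following is a minimal preflight sketch; the helper names are hypothetical, and it covers only those three items from the list above.

```python
import errno
import os
import resource
import socket

def executable_problems(path):
    """Return a list of reasons the current user cannot execute path."""
    problems = []
    if not os.path.exists(path):
        problems.append("missing")
    elif not os.access(path, os.X_OK):
        problems.append("not executable by this user")
    return problems

def port_in_use(port, host="127.0.0.1"):
    """Try to bind the TCP port; EADDRINUSE means something else holds it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        # SO_REUSEADDR avoids false positives from connections in TIME_WAIT.
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            s.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            raise
    return False

def open_file_limits():
    """Return the (soft, hard) limit on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)

# Demo: occupy a port chosen by the kernel, then check it.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
busy_port = listener.getsockname()[1]
busy = port_in_use(busy_port)
listener.close()
print("port", busy_port, "in use:", busy)
print("open file limits (soft, hard):", open_file_limits())
```

A bind that fails with a different error, such as EACCES on a privileged port, is re-raised because it points at a permissions problem rather than a port conflict.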

Summary table

Topic	Invoked via	Return value	How to trace	Man section
Syscall	User code → libc → kernel	Value (or errno)	strace, perf	man 2
Process	fork, execve, exit	PID, 0, or status	strace -f	man 2
File	open, read, write, stat	fd, bytes, or 0	strace -e trace=file	man 2
Traceability	strace (ptrace)	—	strace -p <pid>	—