Backend Security: Lecture Notes
Core mindset: Attackers don't care about your framework or language. They ask one question: "Where did the developer make an assumption?" Every vulnerability in this chapter comes from a developer assuming that user input would be clean, that users are who they say they are, or that requests come from their own frontend. Security is about being paranoid at every boundary.
1. The Attacker's Mental Model
Your backend speaks multiple languages simultaneously:
User (browser: HTML/JS/CSS)
        ↓
Backend (your code: Go/Python/Node)
   ↓              ↓              ↓
Database (SQL)   OS (shell)   HTML renderer
Every time user input crosses from one language context into another, a vulnerability can arise. The input that is data in one context may become code in another.
Root cause of all injection attacks: Confusion between data and code. Treating data as code, or code as data.
Three questions to ask at every boundary:
- Where is data crossing a boundary?
- What assumptions am I making about this data?
- What if those assumptions are wrong?
2. Injection Attacks
2.1 SQL Injection
The vulnerable pattern: string concatenation.
-- Template
SELECT * FROM users WHERE email = '<user_input>'
-- Legitimate input: alice@gmail.com
SELECT * FROM users WHERE email = 'alice@gmail.com'
-- Malicious input: ' OR '1'='1'--
SELECT * FROM users WHERE email = '' OR '1'='1'--'
What happens:
- The first ' closes the opening quote, making email = '' (an empty string, which matches nothing)
- OR '1'='1' is always true, so the entire WHERE clause becomes true
- -- comments out everything after it, so the trailing ' is ignored
- Result: effectively SELECT * FROM users, and the attacker gets all rows
More destructive variant:
-- Malicious input: '; DROP TABLE users;--
SELECT * FROM users WHERE email = ''; DROP TABLE users;--'
If the driver allows multiple statements per call, the DROP TABLE executes as a second SQL statement, deleting the table.
Further attack surface:
- UNION statements to extract data from other tables (payment info, etc.)
- Database-specific functions to read files from the server filesystem
- In some configs, execute OS commands through the database
Fix: Parameterized Queries (Prepared Statements)
-- Two separate things sent to the database:
1. Query template: SELECT * FROM users WHERE email = $1
2. User data: "alice@gmail.com" (or whatever was passed)
The database treats the content of the $1 slot as PURE DATA, never as SQL syntax. (Placeholder syntax varies by driver: $1, ?, :name.)
The malicious input ' OR '1'='1'-- becomes just a garbage string: it matches no email and causes no harm.
Key points:
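- Every modern DB driver supports parameterized queries
- ORMs use them by default for the queries they generate
- You become vulnerable again when you build raw SQL strings by hand, including through an ORM's raw-query escape hatch
- Placeholders only cover values; table and column names chosen by the user must be checked against an allowlist
- A validation layer should also catch non-email-shaped strings before they reach the DB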
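A minimal sketch of the difference, using Python's standard-library sqlite3 module with an in-memory database. The table layout, the sample rows, and the helper names find_user_unsafe and find_user_safe are assumptions for illustration; sqlite3 writes its placeholder as ? where these notes write $1.

```python
import sqlite3

# In-memory database with two sample rows (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [("alice@gmail.com",), ("bob@gmail.com",)],
)


def find_user_unsafe(email: str) -> list:
    # VULNERABLE: user input is spliced into the SQL text itself.
    query = "SELECT id, email FROM users WHERE email = '" + email + "'"
    return conn.execute(query).fetchall()


def find_user_safe(email: str) -> list:
    # Parameterized: the query template and the value travel separately,
    # so the value is only ever compared as data.
    return conn.execute(
        "SELECT id, email FROM users WHERE email = ?", (email,)
    ).fetchall()


payload = "' OR '1'='1'--"
leaked = find_user_unsafe(payload)   # the WHERE clause is always true
blocked = find_user_safe(payload)    # no stored email equals the payload
```

Here leaked contains both rows while blocked is empty, which is the behavior described above: the same payload is code in one version and an ordinary string in the other.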
NoSQL is not immune: MongoDB query objects support operators ($ne, $gt, $exists). If user-controlled JSON is passed directly as a query object, operators can be injected. Always validate structure, not just values.
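One way to "validate structure, not just values" is sketched below in plain Python, without a MongoDB driver. The helper name build_login_filter and the request shape are hypothetical; the point is only that the filter is rebuilt from checked scalars rather than passed through from the request body.

```python
def build_login_filter(body: dict) -> dict:
    """Build a query filter from a parsed JSON request body.

    Hypothetical helper: only plain strings are accepted, so a value such
    as {"$ne": None} can never reach the database as an operator.
    """
    email = body.get("email")
    if not isinstance(email, str):
        raise ValueError("email must be a string")
    # The filter is built from scratch; the request body is never passed through.
    return {"email": email}


ok_filter = build_login_filter({"email": "alice@gmail.com"})

try:
    build_login_filter({"email": {"$ne": None}})
    rejected = False
except ValueError:
    rejected = True
```

The second call is the injection attempt: a nested object where a string was expected. It is rejected before any query is built.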
2.2 Command Injection
The vulnerable pattern: Constructing OS commands with string concatenation.
# Backend calls FFmpeg with a user-provided filename (flags simplified for illustration)
ffmpeg -height 120 -width 220 -o <user_input>
# Legitimate input: output.jpg
ffmpeg -height 120 -width 220 -o output.jpg
# Malicious input: output.jpg; rm -rf /
ffmpeg -height 120 -width 220 -o output.jpg; rm -rf /
When the shell encounters ;, it treats rm -rf / as a new command and runs it with the backend's privileges, attempting to delete the entire root filesystem.
Attackers can also use | (pipe), & (background execution), && and || (chaining), and $(...) or backticks (command substitution) for more creative exploits.
Fix: Use language-provided functions that accept command and arguments separately.
-- Bad: shell string (user input goes through shell interpreter)
exec("ffmpeg -o " + userInput)
-- Good: argument array (user input goes directly to process, bypasses shell)
exec(["ffmpeg", "-o", userInput])
With argument arrays, the OS passes the user string directly to the process as a single argument: it's treated as data, never interpreted as shell syntax.
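A small runnable sketch of the argument-array form using Python's subprocess module. To avoid depending on FFmpeg being installed, a child Python process stands in for it and simply prints the one argument it received; that stand-in and the sample input are assumptions for illustration.

```python
import subprocess
import sys

user_input = "output.jpg; echo INJECTED"

# Stand-in for ffmpeg: a child Python process that prints the one argument
# it received. With an argument list and the default shell=False, no shell
# parses the string, so ';' has no special meaning.
result = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", user_input],
    capture_output=True,
    text=True,
    check=True,
)
received = result.stdout.strip()
```

The child receives the whole string, semicolon included, as one argument, and nothing after the ; is executed. Note that a value starting with - could still be read as an option by the target program, so validating filenames remains worthwhile.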
Universal injection prevention rule: Whenever you build a string that will be interpreted by another system (SQL, OS, HTML, LDAP) and that string includes user input, stop and find a parameterized alternative. It almost always exists.
3. Authentication Security
3.1 Use an Auth Provider (Production Recommendation)
Implementing production-grade auth yourself means handling:
- Stateful sessions (Redis, revocation, device tracking)
- OAuth flows (Google, GitHub) + account linking between email and social login
- Session-email linking (same email, different providers = same user)
- Token rotation, refresh strategies, timing attacks
Providers like Clerk and Auth0 handle all of this. When billing becomes painful (millions of users), you'll have the revenue to support it. Start with a provider; migrate later if needed.
3.2 Password Storage
Evolution of password storage:
Method 1: Plain Text ❌
DB: { email: "alice@gmail.com", password: "123456" }
- Breach exposes all passwords directly
- Employees/DBAs can see all user passwords
- 70%+ of users reuse passwords → one breach = multiple account takeovers
Method 2: Hashing ⚠️ (not enough alone)
Hashing function properties:
- Takes any input of any length
- Always returns fixed-length output
- Same input → always same output
- One-way: computationally infeasible to reverse
hash("123456") → "8d969eef..." (SHA-256 output, truncated)
On login: hash the provided password, compare with stored hash. Breach exposes hashes, not passwords.
Problem: Rainbow Tables. Attackers precompute a table of common_password → hash for all common hashing algorithms. If your hash matches a rainbow table entry, the password is cracked.
Method 3: Hashing + Salting ✅
Salt: A randomly generated string, unique per user, stored in the DB alongside the hash.
password = "123456"
salt = "sp3xR9kQ..." (cryptographically random, per-user)
hashed = hash(password + salt) → store in DB
Why it defeats rainbow tables: The rainbow table has hash("123456"). But your DB has hash("123456" + "sp3xR9kQ..."). These will never match. Each user's hash is unique even if their password is identical.
Problem: Brute Force with GPUs. Modern GPUs compute billions of SHA-256 hashes/second. With a breached DB (including salts), an attacker can brute-force offline: try every common password, hash it with the stolen salt, compare.
Method 4: Hashing + Salting + Slow Hash Functions ✅✅
Do NOT use for passwords: MD5, SHA-256, SHA-512 (general-purpose, very fast: billions/sec on GPU)
Use for passwords (slow hash functions):
- bcrypt (long-standing default)
- Argon2id (current industry standard)
These have a configurable cost factor / work factor that controls how slow they run:
| Scenario | Speed | Effect |
|---|---|---|
| Genuine user login | 300-400 ms per login | Barely noticeable to the user |
| GPU brute force | Orders of magnitude fewer attempts/sec (a few per second per core at this cost, vs billions with a fast hash) | Would take decades/centuries instead of days |
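A sketch of salted, deliberately slow hashing using only the Python standard library. bcrypt and Argon2id need third-party packages, so this example swaps in scrypt (hashlib.scrypt), another slow, memory-hard function; it is not the algorithm recommended above, only an illustration of the same idea. The cost parameters and function names are assumptions.

```python
import hashlib
import hmac
import secrets

# scrypt cost parameters (assumed values for this sketch; tune on real hardware).
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "dklen": 32}


def hash_password(password: str) -> tuple:
    """Return (salt, digest) for storage; the salt is unique per call."""
    salt = secrets.token_bytes(16)  # CSPRNG-generated, per-user salt
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-hash the attempt with the stored salt and compare in constant time."""
    attempt = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(attempt, expected)


salt_a, hash_a = hash_password("123456")
salt_b, hash_b = hash_password("123456")  # same password, different user
```

Because each call draws a fresh salt, hash_a and hash_b differ even though the password is identical, which is what defeats precomputed tables. The cost parameters are what make each offline guess expensive.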
3.3 Sessions (Stateful Authentication)
After successful login, the server:
Step 1: Generate a cryptographically secure random session ID (128-256 bits)
- Must use a CSPRNG (cryptographically secure pseudo-random number generator)
- 128 bits = 2^128 ≈ 3.4 × 10^38 possible values, so guessing is infeasible
Step 2: Store session in DB/Redis with metadata:
- User ID
- Created timestamp
- Expiry time (e.g., 7 days)
- IP address (for "signed in from X location" feature)
- User agent (device/browser type)
Step 3: Send session ID to browser as a cookie. All subsequent requests automatically include it.
Critical cookie flags:
| Flag | Value | Effect |
|---|---|---|
| HttpOnly | true | JavaScript cannot read this cookie, so XSS attacks can't steal it |
| Secure | true | Cookie only sent over HTTPS, never plain HTTP; prevents interception on public Wi-Fi |
| SameSite | Strict or Lax | Cookie not sent on cross-site requests (Lax still sends it on top-level GET navigations); mitigates CSRF |
Never store a session ID or JWT in localStorage: XSS attacks can steal it. Use HttpOnly cookies.
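The three steps and the cookie flags can be sketched with the standard library alone. The in-memory SESSIONS dict stands in for Redis or a database table, and the function name, cookie name, and sample values are assumptions for illustration.

```python
import secrets
import time
from http.cookies import SimpleCookie

SESSION_TTL_SECONDS = 7 * 24 * 3600  # 7 days, matching the example expiry

# Stand-in for Redis or a sessions table (assumption for this sketch).
SESSIONS: dict = {}


def create_session(user_id: str, ip: str, user_agent: str) -> str:
    """Create a server-side session and return the Set-Cookie header value."""
    session_id = secrets.token_urlsafe(32)  # 32 random bytes (256 bits) from the OS CSPRNG
    now = int(time.time())
    SESSIONS[session_id] = {
        "user_id": user_id,
        "created_at": now,
        "expires_at": now + SESSION_TTL_SECONDS,
        "ip": ip,
        "user_agent": user_agent,
    }
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    morsel = cookie["session_id"]
    morsel["httponly"] = True   # not readable from JavaScript
    morsel["secure"] = True     # only sent over HTTPS
    morsel["samesite"] = "Lax"  # withheld on most cross-site requests
    morsel["max-age"] = SESSION_TTL_SECONDS
    morsel["path"] = "/"
    return morsel.OutputString()


set_cookie_value = create_session("user_123", "203.0.113.7", "ExampleBrowser/1.0")
```

The browser only ever holds the opaque ID; everything else lives server-side, which is what makes revocation a matter of deleting one entry.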
3.4 JWT (Stateless Authentication)
Structure: Three base64url-encoded parts separated by dots.
HEADER.PAYLOAD.SIGNATURE
Header: { "alg": "HS256", "typ": "JWT" }
Payload: { "sub": "user_id_123", "iat": 1710000000, "name": "Alice", "admin": false }
Signature: HMAC_SHA256(base64url(header) + "." + base64url(payload), secret_key)
How it works:
- Server signs the encoded header and payload with a secret key stored in env vars
- Client stores the JWT and sends it in the Authorization: Bearer <token> header
- Server verifies the signature on every request; any tampering invalidates the signature
- Payload is just base64url: readable by anyone, but not modifiable without the secret
Important: Never store sensitive data in the JWT payload (it's not encrypted, just encoded).
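To make the signing step concrete, here is a hand-rolled HS256 sketch using only hmac, hashlib, base64, and json. It is for understanding the format, not for production use, where a maintained JWT library is the safer choice. The function names and the demo secret are assumptions, and the verifier checks the signature only (no exp or alg validation).

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def encode_part(obj: dict) -> str:
    return b64url(json.dumps(obj, separators=(",", ":")).encode("utf-8"))


def sign_jwt(payload: dict, secret: bytes) -> str:
    signing_input = encode_part({"alg": "HS256", "typ": "JWT"}) + "." + encode_part(payload)
    mac = hmac.new(secret, signing_input.encode("utf-8"), hashlib.sha256).digest()
    return signing_input + "." + b64url(mac)


def verify_jwt(token: str, secret: bytes) -> bool:
    """Check the signature only; real verifiers also validate exp, alg, etc."""
    try:
        signing_input, signature = token.rsplit(".", 1)
    except ValueError:
        return False
    mac = hmac.new(secret, signing_input.encode("utf-8"), hashlib.sha256).digest()
    return hmac.compare_digest(b64url(mac).encode("utf-8"), signature.encode("utf-8"))


secret = b"demo-secret-for-illustration-only"
token = sign_jwt({"sub": "user_id_123", "admin": False}, secret)

# Tampering: swap in a payload claiming admin, keep the old signature.
head, _, sig = token.split(".")
forged = head + "." + encode_part({"sub": "user_id_123", "admin": True}) + "." + sig
```

verify_jwt accepts token and rejects forged, because the signature no longer matches the modified payload. Anyone can still decode and read either payload, which is why nothing sensitive belongs in it.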
JWT limitations:
| Problem | Impact |
|---|---|
| Revocation is hard | Can't immediately log out a compromised account from all devices |
| Storage problem | localStorage = vulnerable to XSS; cookies = need HttpOnly, which ends up the same as sessions anyway |
Workarounds:
- Blacklist tokens: Store revoked tokens in Redis; check on every request
- Short access token + refresh token flow:
Login → issue:
  access_token (expires in 5-10 min)
  refresh_token (expires in 1-7 days)
Workflow:
  Request with access_token
  → 401 if expired
  → Send refresh_token to get new access_token + refresh_token
  → Cycle continues
If compromised: attacker has access for at most 5-10 minutes,
then can't refresh without the refresh_token.
Recommendation: Unless you have specific horizontal scaling requirements, prefer stateful sessions over JWTs. The tradeoffs of stateless auth are rarely worth it for typical SaaS. If you use JWTs, use short expiry + refresh tokens + HttpOnly cookies.
3.5 Rate Limiting on Authentication Endpoints
Without rate limiting, attackers can brute-force credentials at thousands of attempts/second, or crash your server with volume.
Layered rate limiting strategy:
| Layer | Mechanism | Bypassed by |
|---|---|---|
| Per-IP | 10 attempts/min per IP | VPNs, botnets, rotating IPs |
| Per-account | 5 failures per 15 min → lock account | Distributed password spray (1 attempt per account) |
| Global | 100 failed attempts/min system-wide → alert + CAPTCHA | Nothing: last resort |
Use all three layers. More restrictive limits for auth endpoints than general API endpoints.
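A sliding-window failure counter is one simple way to implement the per-IP and per-account layers. The sketch below keeps state in process memory, which is an assumption made for brevity; with more than one server the counters would need to live in a shared store such as Redis. The class and method names are hypothetical.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class FailureLimiter:
    """Sliding-window counter of failed logins per key (an IP or an account)."""

    def __init__(self, max_failures: int, window_seconds: float):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._failures = defaultdict(deque)  # key -> timestamps of recent failures

    def _prune(self, key: str, now: float) -> None:
        # Drop failures that have aged out of the window.
        q = self._failures[key]
        while q and now - q[0] >= self.window_seconds:
            q.popleft()

    def record_failure(self, key: str, now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        self._prune(key, now)
        self._failures[key].append(now)

    def is_blocked(self, key: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        self._prune(key, now)
        return len(self._failures[key]) >= self.max_failures


per_ip = FailureLimiter(max_failures=10, window_seconds=60)
per_account = FailureLimiter(max_failures=5, window_seconds=15 * 60)

# Five failures in quick succession (explicit timestamps keep the demo deterministic).
for t in range(5):
    per_account.record_failure("alice@gmail.com", now=float(t))

blocked_now = per_account.is_blocked("alice@gmail.com", now=5.0)
blocked_later = per_account.is_blocked("alice@gmail.com", now=1000.0)
```

The account is blocked right after the fifth failure and unblocked once the 15-minute window has passed. A login handler would consult both limiters before checking the password at all.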
4. Authorization Security
Authentication = Who is this user? (Identity)
Authorization = What is this user allowed to do? (Permissions)
4.1 The False Sense of Security
The mistake: Checking auth at the routing layer, then assuming the user has access to everything.
Routing layer: ✅ authenticated, ✅ has "read:books" permission
↓
Repository: SELECT * FROM books WHERE id = 5 ← NO user check!
Book ID 5 may belong to a different user. The routing layer check doesn't prevent this.
The fix: Check authorization at the point of data access.
-- Wrong: fetches any book with ID 5
SELECT * FROM books WHERE id = 5
-- Right: fetches only if it belongs to the authenticated user
SELECT * FROM books WHERE id = 5 AND user_id = $currentUserId
Apply this to ALL operations: SELECT, UPDATE, DELETE, INSERT.
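The same rule, sketched with sqlite3 for both a read and a write. The books schema, the sample rows, and the function names are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT)")
conn.executemany(
    "INSERT INTO books (id, user_id, title) VALUES (?, ?, ?)",
    [(5, 1, "Alice's notes"), (6, 2, "Bob's notes")],
)


def get_book(book_id: int, current_user_id: int):
    # Ownership is part of the WHERE clause, so another user's row is never loaded.
    return conn.execute(
        "SELECT id, title FROM books WHERE id = ? AND user_id = ?",
        (book_id, current_user_id),
    ).fetchone()


def delete_book(book_id: int, current_user_id: int) -> bool:
    # The same rule applies to writes: scope the statement to the owner.
    cur = conn.execute(
        "DELETE FROM books WHERE id = ? AND user_id = ?",
        (book_id, current_user_id),
    )
    return cur.rowcount == 1


owner_view = get_book(5, current_user_id=1)        # the owner sees the row
other_view = get_book(5, current_user_id=2)        # another user gets nothing
deleted_by_other = delete_book(5, current_user_id=2)  # and cannot delete it
```

Putting the check inside the statement means there is no window in which the row has been fetched but ownership has not yet been verified.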
4.2 BOLA: Broken Object Level Authorization
What: User A can access User B's resources by guessing/enumerating resource IDs.
Example: Attacker iterates IDs in /invoices/101, /invoices/102, /invoices/103... and downloads all invoices.
Additional subtlety: 403 vs 404
❌ Pattern: fetch invoice → if not owner → return 403 Forbidden
Why bad: 403 confirms the invoice exists. Attacker can enumerate
which IDs exist, then plan social engineering attacks.
✅ Pattern: SELECT * FROM invoices WHERE id = 7 AND user_id = $currentUser
If invoice doesn't belong to user → zero rows → return 404 Not Found
Attacker cannot distinguish "doesn't exist" from "exists but not yours."
Sequential IDs enable enumeration. Use random (version 4) UUIDs as primary keys: they are unpredictable and impractical to iterate. They complement the ownership check; they do not replace it.
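A sketch of the 404 pattern at the handler level, with random UUIDs as identifiers. The in-memory INVOICES dict stands in for a database, and the function names and response shape are assumptions.

```python
import uuid

# Illustrative in-memory store: invoice id -> record (stand-in for a database).
INVOICES: dict = {}


def create_invoice(owner_id: str, amount_cents: int) -> str:
    invoice_id = str(uuid.uuid4())  # random (version 4) UUID, not guessable by counting
    INVOICES[invoice_id] = {"owner_id": owner_id, "amount_cents": amount_cents}
    return invoice_id


def get_invoice(invoice_id: str, current_user_id: str) -> tuple:
    """Return (status_code, body). Missing and not-yours look identical."""
    invoice = INVOICES.get(invoice_id)
    if invoice is None or invoice["owner_id"] != current_user_id:
        return 404, {"error": "Not Found"}
    return 200, invoice


alice_invoice = create_invoice("alice", 12000)
as_owner = get_invoice(alice_invoice, "alice")
as_other = get_invoice(alice_invoice, "bob")
missing = get_invoice(str(uuid.uuid4()), "bob")
```

as_other and missing are the same response, so a probing client learns nothing about which IDs exist.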
4.3 BFLA: Broken Function Level Authorization
What: Regular user accesses admin-only functions.
The "security through obscurity" anti-pattern:
"Only admins know the URL /admin/invoices; we don't share it"
Anyone who inspects the frontend bundle or the browser's network tab can find this URL. No role check = anyone can call it.
Fix: Role-based middleware at the routing layer.
/admin/invoices route:
1. requireAuth middleware → is user logged in?
2. requireRole("admin") middleware → is user an admin?
3. Only then → handler runs
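The middleware chain can be approximated in Python with a decorator. This is a framework-free sketch: the request is a plain dict, the handler returns a (status, body) tuple, and require_role folds the requireAuth and requireRole steps into one wrapper. All of these shapes are assumptions for illustration.

```python
from functools import wraps


def require_role(role: str):
    """Decorator standing in for requireAuth + requireRole middleware."""
    def decorator(handler):
        @wraps(handler)
        def wrapper(request: dict):
            user = request.get("user")
            if user is None:
                return 401, "Unauthorized"   # not logged in
            if role not in user.get("roles", []):
                return 403, "Forbidden"      # logged in, but missing the role
            return handler(request)          # only now does the handler run
        return wrapper
    return decorator


@require_role("admin")
def list_all_invoices(request: dict):
    return 200, "all invoices"


anon = list_all_invoices({})
member = list_all_invoices({"user": {"id": 1, "roles": ["member"]}})
admin = list_all_invoices({"user": {"id": 2, "roles": ["admin"]}})
```

Knowing the URL gets a member nothing: the handler body is unreachable without the role.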
4.4 Authorization Framework
| Practice | Description |
|---|---|
| Centralize authorization | All auth logic in one place: consistent, maintainable, not scattered across handlers |
| Default deny | If not explicitly allowed → deny. New endpoints are protected by default, even if you forget |
| Test authorization | Automated tests: user A can't access user B's resources; member can't access admin functions; unauthenticated can't access protected resources |
| Audit logs | Log every access to sensitive functions (admin endpoints) and every authorization failure |
Two categories of authorization attacks:
| Category | What happens | Example vulnerability |
|---|---|---|
| Horizontal | User A → User B's data (same privilege level, different scope) | BOLA (Broken Object Level Auth) |
| Vertical | Regular user → Admin functions (escalating privilege) | BFLA (Broken Function Level Auth) |
5. XSS: Cross-Site Scripting
What: Attacker gets their JavaScript to execute in a genuine user's browser, in the context of your platform.
Why it's dangerous: the attacker's JavaScript can:
- Read all page content including sensitive data
- Make API requests impersonating the logged-in user
- Steal session cookies (if not HttpOnly) or localStorage
- Redirect user to phishing pages
- Alter page content to trick users
5.1 Stored XSS
Attack flow:
- Attacker submits a comment/post containing <script>maliciousCode()</script>
- Server stores the HTML without sanitizing
- Next time any user views that comment, the script executes in their browser
Prevention: Sanitize user-provided markup before storing. Strip script tags, event handlers, and other executable HTML from user input on the server side.
5.2 Root Cause (Same as Injection)
User-defined content (data) is treated as code in the HTML/JS context. Same confusion between data and code as SQL injection, but happening in the browser.
5.3 Prevention
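Primary: Sanitize all user-provided content server-side before storing or rendering. Never trust user input to be safe HTML. Where users are not meant to submit markup at all, escape the content on output so the browser displays it as text.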
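For the common case where users should not be able to submit markup at all, escaping on output is enough to keep the data from becoming code. The sketch below uses html.escape from the standard library; note this is escaping, not sanitizing, so it does not cover the case where some HTML must be allowed (that needs a dedicated allowlist-based sanitizer, not shown). The render_comment helper is hypothetical.

```python
import html


def render_comment(author: str, body: str) -> str:
    """Render a comment as HTML, treating both fields as plain text."""
    # html.escape replaces &, <, > and (by default) both quote characters
    # with entities, so the browser displays the text instead of parsing it.
    return (
        '<div class="comment"><b>' + html.escape(author) + "</b>: "
        + html.escape(body) + "</div>"
    )


rendered = render_comment("mallory", "<script>maliciousCode()</script>")
```

The payload arrives in the page as visible text (&lt;script&gt;...) rather than as a script element.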
Secondary: Content Security Policy (CSP)
- HTTP response header that tells browsers which sources of scripts and other resources they may load and execute
- Content-Security-Policy: script-src 'self' → only run scripts loaded from your own origin (inline scripts are blocked too)
- Content-Security-Policy: script-src 'none' → block all scripts
- CSP is a last line of defense, not a substitute for prevention; fix the root cause first
6. CSRF: Cross-Site Request Forgery
What: Attacker tricks a user's browser into making a request to your site with their cookies attached.
Example:
- User is logged into bank.com (browser has bank.com cookie)
- User visits malicious evil.com
- evil.com triggers a hidden form submission to bank.com
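- Browser automatically includes the bank.com cookie → server thinks it's a legitimate request
Why it's less relevant in modern apps:
- SameSite=Strict or SameSite=Lax on cookies (Chromium-based browsers default to Lax when the attribute is missing) → cookie not sent on cross-site form posts and subrequests → classic CSRF blocked
- CORS stops other origins from reading your responses and forces a preflight for non-simple requests (JSON bodies, custom headers); it does not stop plain cross-site form posts, so it is not a CSRF defense on its own
Verdict: Not a major threat if using modern frameworks and proper cookie config. Don't obsess over it; ensure SameSite is not None, set it explicitly, and never change state in GET handlers (Lax still sends cookies on top-level GET navigations).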
7. Misconfiguration Vulnerabilities
7.1 Secrets in Source Code
❌ const apiKey = "sk-abc123..." // committed to git
✅ const apiKey = process.env.OPENAI_API_KEY
If a secret is committed to git: rotate it immediately. Deleting it in a later commit doesn't help, because it remains in git history.
Store secrets in: environment variables, AWS Parameter Store, HashiCorp Vault, Azure Key Vault.
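A small Python sketch of reading a secret from the environment and failing fast when it is missing, so a misconfigured deployment stops at startup instead of at the first API call. The require_env helper is hypothetical, and the setdefault line exists only so the example runs on its own.

```python
import os


def require_env(name: str) -> str:
    """Read a required secret from the environment, failing fast if it is absent."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


# Demo only: set a placeholder so the sketch runs; in a real deployment the
# value is injected by the platform or a secrets manager, never by the code.
os.environ.setdefault("OPENAI_API_KEY", "placeholder-not-a-real-key")
api_key = require_env("OPENAI_API_KEY")

try:
    require_env("SOME_SECRET_THAT_IS_NOT_SET_12345")
    missing_detected = False
except RuntimeError:
    missing_detected = True
```

The error message names the variable but never prints a value, which keeps secrets out of logs as well as out of git.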
7.2 Debug Mode in Production
LOG_LEVEL=debug in production leaks:
- Stack traces with function/file names and code structure
- Explicit SQL queries and database configs
- Sensitive user data printed during debug
Set LOG_LEVEL=info in production. Debug logs contain internal implementation details that attackers can use to plan targeted attacks.
7.3 Missing Security Headers
Most web frameworks provide a security middleware (one-line setup) that configures all standard headers:
| Header | Protection |
|---|---|
| Content-Security-Policy | Controls which scripts/resources the browser will load and execute |
| X-Frame-Options | Prevents your site from being embedded in iframes (blocks clickjacking) |
| X-Content-Type-Options | Prevents MIME type sniffing |
| Strict-Transport-Security | Tells browsers to use HTTPS only for future requests to the site |
Use your framework's security middleware; don't configure these manually.
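To make concrete what such middleware does, here is a rough Python sketch that merges a baseline set of headers into a response. The specific header values are common starting points but are assumptions here, not a recommendation for any particular site; a real CSP in particular depends on what the pages load. The function name is hypothetical.

```python
# Assumed baseline values; real policies depend on what the site loads.
SECURITY_HEADERS = {
    "Content-Security-Policy": "default-src 'self'",
    "X-Frame-Options": "DENY",
    "X-Content-Type-Options": "nosniff",
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
}


def with_security_headers(response_headers: dict) -> dict:
    """Return a copy of the response headers with the baseline added.

    Headers the handler already set are kept, so a route can override a default.
    """
    merged = dict(SECURITY_HEADERS)
    merged.update(response_headers)
    return merged


headers = with_security_headers({"Content-Type": "text/html; charset=utf-8"})
embeddable = with_security_headers({"X-Frame-Options": "SAMEORIGIN"})
```

Framework middleware applies the same idea to every response automatically, which is why it is preferable to setting headers route by route.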
8. Defense in Depth: Layered Security
No single defense is perfect. Layer them so an attacker must bypass all layers simultaneously:
Layer 1: Input validation
  → Validate everything at the entry point; data leaving validation should
    be exactly the structure expected. No surprises downstream.
Layer 2: Parameterized operations
  → DB queries: parameterized queries/ORMs
  → OS commands: argument arrays, not shell strings
Layer 3: Authorization at point of access
  → Don't rely on routing-layer auth alone
  → Check user ownership in every DB query
Layer 4: Security headers and policies
  → CSP, SameSite cookies, X-Frame-Options
  → Limit blast radius if something gets through
Layer 5: Monitoring and logging
  → Log suspicious activity, failed auth attempts, admin access
  → Alert on anomalies; make attacks visible
9. Further Reading
| Resource | What it covers |
|---|---|
| PortSwigger Web Security Academy | Free, comprehensive, hands-on labs for all major vulnerabilities (SQLi, XSS, CSRF, SSRF, auth attacks, JWT attacks, etc.) |
| OWASP Top 10 | Current list of most critical web vulnerabilities with real-world instances and severity |
| OWASP Cheat Sheet Series | Best practices for specific topics: authentication, session management, input validation, etc. |
| Lucia Auth docs | Guidance for implementing secure authentication with industry best practices |
Quick Revision Checklist
- Security mindset: think like an attacker and ask "where did the developer assume?"
- Root cause of all injection: data treated as code when crossing language boundaries
- SQL injection: string concatenation + user input = vulnerability → fix with parameterized queries
- Command injection: shell string + user input = vulnerability → fix with argument arrays
- NoSQL is not immune: MongoDB operators can be injected if user controls query structure
- Password storage: plain text ❌ → hashing ⚠️ → hashing + salt ✅ → slow hash (bcrypt/Argon2id) + salt ✅✅
- Slow hashes (bcrypt, Argon2id): cost factor makes brute force decades-long instead of days
- Session IDs: 128-256 bit, CSPRNG-generated, stored server-side (Redis/DB), sent as HttpOnly cookie
- Cookie flags: HttpOnly=true, Secure=true, SameSite=Strict or Lax
- JWT: payload is base64url-encoded (readable!), not encrypted; don't store sensitive data
- JWT revocation problem: use short access token (5-10 min) + longer refresh token (1-7 days)
- Prefer stateful sessions over JWTs unless you have specific horizontal scaling needs
- Rate limiting: per-IP + per-account + global (all three layers for auth endpoints)
- BOLA: always include AND user_id = $currentUser in DB queries; don't just check at the routing layer
- Return 404 (not 403) when a user requests another user's resource; this prevents existence confirmation
- BFLA: admin functions need role middleware, not just authentication middleware; security through obscurity fails
- Default deny: new endpoints protected by default until explicitly granted
- Random (v4) UUID primary keys make the enumeration that sequential IDs enable impractical; they do not replace ownership checks
- XSS root cause: user HTML/JS content treated as code in browser context
- XSS fix: sanitize user markup server-side; CSP as last line of defense (not prevention)
- CSRF: largely mitigated by SameSite cookies (Lax/Strict) in modern browsers
- Never commit secrets to git; rotate immediately if accidentally committed
- LOG_LEVEL=info in production; debug logs expose code structure and sensitive data
- Use framework security middleware for headers (CSP, X-Frame-Options, HSTS): one line
- Defense in depth: validation → parameterized ops → auth at access point → headers → monitoring