Backend Development

Mental Model: Backend is about three responsibilities: (1) receiving and validating requests, (2) executing business logic against data, (3) returning structured responses reliably, securely, and at scale. Every backend decision is a tradeoff between consistency, availability, performance, and developer ergonomics.
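
As a rough illustration of the three responsibilities, here is a minimal sketch of a single "create user" handler written without any framework. Every name in it (`validateCreateUser`, `createUser`, `handleCreateUser`, the in-memory `users` map) is hypothetical and stands in for whatever validation library, data layer, and HTTP framework a real service would use.

```typescript
// Hypothetical types and names, for illustration only.
interface CreateUserInput { email: string; name: string }
interface User extends CreateUserInput { id: number }
interface HttpResponse { status: number; body: unknown }

// In-memory stand-in for a database table, keyed by email.
const users = new Map<string, User>();

// (1) Receive and validate: returns clean input, or an error message string.
function validateCreateUser(body: unknown): CreateUserInput | string {
  if (typeof body !== "object" || body === null) return "Body must be a JSON object";
  const { email, name } = body as Record<string, unknown>;
  if (typeof email !== "string" || !email.includes("@")) return "Invalid email";
  if (typeof name !== "string" || name.trim() === "") return "Name is required";
  return { email: email.toLowerCase(), name: name.trim() };
}

// (2) Execute business logic against data: emails must be unique.
function createUser(input: CreateUserInput): User | undefined {
  if (users.has(input.email)) return undefined;
  const user: User = { id: users.size + 1, ...input };
  users.set(user.email, user);
  return user;
}

// (3) Return a structured response with a meaningful status code.
function handleCreateUser(body: unknown): HttpResponse {
  const input = validateCreateUser(body);
  if (typeof input === "string") return { status: 422, body: { error: input } };
  const user = createUser(input);
  if (user === undefined) return { status: 409, body: { error: "Email already registered" } };
  return { status: 201, body: user };
}
```

Keeping the three steps in separate functions means validation failures are rejected before any data is touched, and the response shape is decided in exactly one place.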

Core Technologies—Runtimes & Frameworks

Skill	Core Concepts & Mental Model	Tools & Libraries	Key Techniques	Tradeoffs & Failure Modes	Resources
Node.js Runtime	Single-threaded event loop. Non-blocking I/O means one thread handles thousands of concurrent connections by delegating I/O to the OS (network) or to the `libuv` thread pool (disk, DNS). CPU-bound work blocks the loop and kills concurrency	Node.js, `libuv`, `worker_threads`, `child_process`	Event loop phases (timers → pending callbacks → poll → check → close callbacks), `process.nextTick` vs `setImmediate`, `cluster` module for multi-core, `worker_threads` for CPU-intensive tasks	❌ CPU-bound code (image processing, synchronous crypto, complex calculations) blocks the event loop and stalls all requests; offload it to worker threads or a queue. Since Node.js 15, unhandled promise rejections terminate the process by default	Node.js docs, Node.js event loop deep dive
Express.js	Minimal, unopinionated Node.js web framework. Gives you: routing, middleware pipeline, request/response abstraction. Everything else (validation, auth, error handling) is your responsibility	Express.js, `express.Router()`, middleware ecosystem	Middleware chain (`(req, res, next)` functions run in registration order), router-level vs app-level middleware, error-handling middleware (4 params), route grouping, `Router()` for modular routes	❌ No structure enforced: large Express apps become unmaintainable without discipline. Custom error handling requires explicit 4-param middleware (`(err, req, res, next)`) registered after all routes; without it, errors fall through to Express's default handler, which returns a bare 500. In Express 4, a rejected promise in an async handler never reaches error middleware unless you pass it to `next(err)`	Express.js docs
Fastify	High-performance Node.js framework. Faster than Express due to schema-based serialization (JSON Schema → compiled serializer). Opinionated plugin system with dependency injection	Fastify, `@fastify/cors`, `@fastify/jwt`, Fastify plugins	Schema-based route validation + serialization (`schema: { body, querystring, response }`), plugin encapsulation, hooks (lifecycle), decorators for extending `request`/`reply`	✅ 2-3x faster JSON serialization than Express ❌ Schema-first approach requires more upfront design; smaller ecosystem than Express; plugin system has a learning curve	Fastify docs
NestJS	Opinionated Node.js framework with Angular-inspired architecture. Built on Express (or Fastify). Enforces structure via modules, controllers, services, and dependency injection	NestJS, `@nestjs/common`, `@nestjs/typeorm`, `@nestjs/graphql`	Modules (feature boundaries), Controllers (routes), Services (business logic), Providers (DI), Guards (auth), Interceptors (logging/transform), Pipes (validation), Decorators everywhere	✅ Scalable architecture out of the box, great for teams, full TypeScript, built-in DI ❌ Heavy boilerplate for small apps; steep learning curve; opinionated structure can feel rigid	NestJS docs
Python — FastAPI	Modern Python framework with async support, automatic OpenAPI docs, and Pydantic for runtime type validation. Ideal for ML/AI backends and data-heavy services	FastAPI, Pydantic, Uvicorn (ASGI server), SQLAlchemy	Path parameters, request body Pydantic models, dependency injection, async route handlers, auto-generated `/docs` (Swagger) and `/redoc`	✅ Auto-generated docs, Python ML ecosystem access, fast for I/O-bound ❌ Python GIL limits true parallelism for CPU work (use Celery/multiprocessing); slower than Node.js for pure throughput; type system weaker than TypeScript	FastAPI docs
Python — Django	Batteries-included Python framework. ORM, admin panel, auth, migrations, sessions: all built in. MVC-style architecture (Django calls it Model-View-Template)	Django, Django REST Framework (DRF), Celery	Models → Migrations → Views/Serializers → URLs, DRF serializers for API responses, class-based vs function-based views, Django admin for rapid backoffice	✅ Fastest time-to-production for CRUD apps, mature ecosystem, excellent ORM ❌ Monolithic by default (hard to split into services); synchronous by default (async views are supported since Django 3.1; Django Channels adds WebSockets); heavy for microservices	Django docs, DRF docs
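
The middleware chain and the 4-parameter error handler from the Express.js row can be sketched without Express itself. The dispatcher below is a simplified imitation written for illustration, not Express's actual implementation: it assumes synchronous middleware only, and all names (`run`, `logger`, `requireUser`, `showProfile`, `onError`) are hypothetical. Like Express, it tells error handlers apart by their declared arity of four.

```typescript
interface Req { path: string; user?: string }
interface Res { statusCode: number; body?: unknown }
type Next = (err?: unknown) => void;
type Middleware = (req: Req, res: Res, next: Next) => void;
type ErrorMiddleware = (err: unknown, req: Req, res: Res, next: Next) => void;

// Runs the stack in registration order. Once an error is raised, normal
// middleware is skipped until a 4-parameter error handler is found.
function run(stack: Array<Middleware | ErrorMiddleware>, req: Req): Res {
  const res: Res = { statusCode: 200 };
  let index = 0;
  const next: Next = (err) => {
    while (index < stack.length) {
      const layer = stack[index++];
      const isErrorHandler = layer.length === 4;
      if (err !== undefined && isErrorHandler) {
        (layer as ErrorMiddleware)(err, req, res, next);
        return;
      }
      if (err === undefined && !isErrorHandler) {
        try {
          (layer as Middleware)(req, res, next);
        } catch (thrown) {
          next(thrown); // a synchronous throw is routed to the error handlers
        }
        return;
      }
    }
    // Nothing handled the error: fall back to a generic 500.
    if (err !== undefined) {
      res.statusCode = 500;
      res.body = { error: "Internal Server Error" };
    }
  };
  next();
  return res;
}

const visited: string[] = [];
const logger: Middleware = (req, _res, next) => { visited.push(req.path); next(); };
const requireUser: Middleware = (req, _res, next) => {
  if (!req.user) throw new Error("unauthenticated");
  next();
};
const showProfile: Middleware = (req, res) => { res.body = { hello: req.user }; };
const onError: ErrorMiddleware = (err, _req, res, _next) => {
  res.statusCode = 401;
  res.body = { error: err instanceof Error ? err.message : "unknown error" };
};

const stack = [logger, requireUser, showProfile, onError];
```

Calling `run(stack, { path: "/me" })` skips `showProfile` and lands in `onError`, while a request that carries a `user` never reaches the error handler. Removing `onError` from the stack makes the same failure surface as the generic 500 fallback.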

API Design & Communication

Skill	Core Concepts & Mental Model	Tools & Libraries	Key Techniques	Tradeoffs & Failure Modes	Resources
REST API Design	Resources identified by URLs, stateless, HTTP verbs define the operation. Good REST API design is intuitive — developers can guess endpoints	Postman, Hoppscotch, Swagger/OpenAPI, `express-openapi-validator`	Noun-based URLs (`/users`, `/posts/:id`), proper HTTP verbs, meaningful status codes (201 Created, 204 No Content, 422 Unprocessable Entity), versioning (`/api/v1/`), consistent error response shape	❌ Over/under-fetching, no type contract between client/server, API drift without spec (use OpenAPI), inconsistent error formats make client error handling brittle	REST API Design Best Practices, HTTP Status Codes
GraphQL	Client specifies exact data shape. Single `/graphql` endpoint. Schema = contract. Solves over/under-fetching fundamentally	Apollo Server, GraphQL Yoga, `graphql-js`, DataLoader	Schema-first design (SDL), resolvers, mutations, subscriptions, DataLoader for batching (N+1 fix), persisted queries, schema stitching / federation for microservices	❌ N+1 query problem if not using DataLoader (every resolver fires a separate DB query for each parent); harder HTTP caching (POST requests); complexity overhead for simple CRUD; introspection in production leaks schema	GraphQL.org, Apollo Server docs
gRPC	Binary RPC protocol using Protocol Buffers (`.proto` files as schema). Strongly typed, generates client/server code. Native streaming (unary, server-stream, client-stream, bidirectional)	gRPC (Node.js: `@grpc/grpc-js`), Protobuf, `grpc-web` (browser proxy)	Define service + messages in `.proto`, generate TypeScript types, interceptors for middleware, deadlines/timeouts per call, error codes (`INVALID_ARGUMENT`, `NOT_FOUND`, etc.)	✅ Fastest protocol (binary), built-in streaming, strong typed contract ❌ Not browser-native (requires gRPC-Web + proxy); `.proto` adds overhead for small teams; harder to debug (binary — use `grpcurl`); overkill unless service-to-service at scale	gRPC Node.js, Protobuf docs
tRPC	End-to-end type-safe RPC for TypeScript monorepos. No `.proto` files, no codegen, no REST endpoints — just TypeScript functions. Types flow from server to client automatically	tRPC v11, Zod (input validation), TanStack Query (transport), Next.js / Express adapters	Routers + procedures (`query`/`mutation`/`subscription`), Zod input schemas, context (auth/session per request), middleware on procedures, infinite query support	✅ Zero type drift, best DX for full-stack TypeScript, no schema duplication ❌ Tight client-server coupling (TypeScript monorepo required); not suitable for public APIs or non-TS consumers; can't call from Postman/curl easily	tRPC docs
API Versioning	APIs change — versioning prevents breaking existing clients. Choose one strategy and enforce it from day one	—	URL versioning (`/api/v1/` → most common, explicit), Header versioning (`Accept: application/vnd.api+json;version=2`), query param (`?version=2`). Deprecation headers (`Sunset`, `Deprecation`)	❌ No versioning = breaking change = all clients break simultaneously. URL versioning duplicates routes. Never remove a version without sunset notice + migration period	—
API Documentation	Documentation is a product feature. Auto-generated docs from code are always more accurate than manually written docs	Swagger/OpenAPI (`swagger-ui-express`, `@nestjs/swagger`), Scalar, Redoc, Postman Collections	OpenAPI spec in `openapi.yaml` or generated from decorators/Zod, interactive playground for testing, changelog for breaking changes	❌ Outdated docs are worse than no docs — auto-generate from code annotations or Zod schemas. Missing error response docs frustrate frontend developers	OpenAPI Spec
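
The N+1 problem named in the GraphQL row, and the batching idea behind DataLoader, can be shown with plain arrays and a query counter. This is a sketch under stated assumptions: the "tables" are in-memory arrays, `queryCount` stands in for database round trips, and every function name is hypothetical. It does not use the DataLoader library, which additionally coalesces loads that are requested within the same tick.

```typescript
interface Author { id: number; name: string }
interface Post { id: number; title: string; authorId: number }

// In-memory stand-ins for two database tables.
const authorsTable: Author[] = [
  { id: 1, name: "Ada" },
  { id: 2, name: "Grace" },
];
const postsTable: Post[] = [
  { id: 10, title: "Event loops", authorId: 1 },
  { id: 11, title: "Compilers", authorId: 2 },
  { id: 12, title: "Schedulers", authorId: 1 },
];

let queryCount = 0; // each increment represents one database round trip

// Like: SELECT * FROM authors WHERE id = $1
function findAuthorById(id: number): Author | undefined {
  queryCount++;
  return authorsTable.find((author) => author.id === id);
}

// Like: SELECT * FROM authors WHERE id IN (...)
function findAuthorsByIds(ids: number[]): Map<number, Author> {
  queryCount++;
  const byId = new Map<number, Author>();
  for (const author of authorsTable) {
    if (ids.indexOf(author.id) !== -1) byId.set(author.id, author);
  }
  return byId;
}

// Naive resolver: one author lookup per post (the "N" in N+1).
function resolveNaive(posts: Post[]) {
  return posts.map((post) => ({
    title: post.title,
    author: findAuthorById(post.authorId)?.name,
  }));
}

// Batched resolver: collect the distinct ids first, then fetch them once.
function resolveBatched(posts: Post[]) {
  const ids = Array.from(new Set(posts.map((post) => post.authorId)));
  const byId = findAuthorsByIds(ids);
  return posts.map((post) => ({
    title: post.title,
    author: byId.get(post.authorId)?.name,
  }));
}
```

Both resolvers return the same data; the difference is that the naive version's query count grows with the number of posts, while the batched version stays at one lookup per request level.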

Database Systems

Skill	Core Concepts & Mental Model	Tools & Libraries	Key Techniques	Tradeoffs & Failure Modes	Resources
Relational Databases (SQL)	Tables with fixed schemas, relationships via foreign keys, ACID transactions. The right default for most applications: structured, consistent, queryable	PostgreSQL (recommended), MySQL, SQLite (dev/test)	Schema design (normalization, indexes), JOINs, transactions (`BEGIN`/`COMMIT`/`ROLLBACK`), indexes (B-tree, GIN, partial), `EXPLAIN ANALYZE` for query optimization, `VACUUM` for PostgreSQL maintenance	❌ Schema migrations on production with live traffic are dangerous: some DDL takes locks that block reads and writes (use a migration tool such as `node-pg-migrate` or Prisma Migrate, and test against production-sized data). N+1 queries: always check query count. Missing indexes = full table scans at scale	PostgreSQL docs, Use the Index, Luke
ORM — Prisma	Type-safe database client for Node.js/TypeScript. Schema-first: write `schema.prisma` → generates TypeScript types + migration SQL	Prisma Client, Prisma Migrate, Prisma Studio (GUI)	`schema.prisma` as single source of truth, `prisma migrate dev` (dev) vs `prisma migrate deploy` (prod), `prisma.$transaction()` for atomic ops, relation queries (`include`, `select`), raw SQL via `prisma.$queryRaw`	✅ Best-in-class TypeScript types, readable query API, auto-migration ❌ Generated queries can be inefficient for complex joins — use raw SQL when needed. `findUnique` vs `findFirst` performance difference. Schema changes require migration — carefully managed in prod	Prisma docs
ORM — Drizzle	Lightweight SQL-first TypeScript ORM. Write queries that look like SQL, with no hidden query generation. Very thin runtime layer: the builder translates each call directly into a SQL string. Better for devs who want SQL control	Drizzle ORM, `drizzle-kit` (migrations), Drizzle Studio	Schema defined in TypeScript (not a separate file), `db.select().from(users).where(eq(users.id, 1))`, transactions, joins, type-safe `WHERE` clauses	✅ Lighter than Prisma, SQL-first (no surprises), better for complex queries ❌ Less ergonomic for simple CRUD vs Prisma; smaller ecosystem; fewer built-in conventions	Drizzle docs
NoSQL — MongoDB	Document database. Flexible JSON-like documents, no fixed schema. Horizontal scaling (sharding) built-in. Good for variable-structure data, not ideal for complex relational queries	MongoDB, Mongoose (ODM), MongoDB Atlas (hosted)	Schema design: embed vs reference (embed for "owns", reference for "many-to-many"), indexes, aggregation pipeline, change streams, transactions (replica set required), Atlas Search	❌ No JOINs — you must denormalize or use `$lookup` (slow). Transactions require replica set. No schema enforcement by default = data integrity issues at scale. "Flexible schema" becomes "inconsistent schema" without Mongoose schemas	MongoDB docs, MongoDB Data Modeling
Redis	In-memory key-value store. Sub-millisecond reads/writes. Used as cache, session store, pub/sub broker, rate limiter, and distributed lock, not just a cache	Redis, `ioredis` (Node.js client), `@upstash/redis` (serverless)	String/Hash/List/Set/Sorted Set data structures, `TTL`/`EXPIRE` for cache expiry, `SET key value NX PX <ttl>` for distributed locks (plain `SETNX` sets no expiry, so a crashed holder never releases the lock), pub/sub for real-time messaging, Redis Streams for event log, Lua scripts for atomic operations	❌ In-memory = data lost on restart without persistence config (`RDB`/`AOF`). Memory is expensive: don't cache everything. Cache invalidation is hard: stale cache after DB update. Redis executes commands on a single thread, so avoid large blocking commands (`KEYS *`, `LRANGE` on huge lists)	Redis docs, ioredis
Vector Databases	Store and query high-dimensional embeddings (arrays of floats). Support semantic similarity search (`k-NN`) — "find documents most similar to this query" — unlike SQL which does exact matching	Pinecone, Weaviate, Qdrant, pgvector (PostgreSQL extension)	Store embedding + metadata, query by vector similarity (cosine / dot product / Euclidean), filter by metadata alongside vector search, namespaces/collections for isolation	❌ Approximate nearest neighbor (ANN) — results are probabilistic not exact. pgvector is good for <1M vectors; dedicated vector DB for scale. Embedding model must be consistent — changing model invalidates all stored vectors	pgvector, Pinecone docs
Message Queues	Decouple producers and consumers. Producer writes a job/message; consumer processes it asynchronously, which prevents slow downstream services from blocking the main request flow	BullMQ + Redis (Node.js jobs), RabbitMQ (message broker), Kafka (event streaming), NATS	Job queues (BullMQ): `queue.add('send-email', { to, subject })` + `new Worker(queueName, async (job) => { ... })`. Kafka: topics, partitions, consumer groups, offset management (replay events). RabbitMQ: exchanges, routing keys, dead-letter queues	❌ At-least-once delivery = idempotent consumers required (process same message twice safely). Kafka: message ordering only within a partition, so partition key design matters. BullMQ: failed jobs need dead-letter queue + alerting	BullMQ docs, Kafka docs
Database Indexes	Usually the single most impactful performance optimization. Without a usable index, a query falls back to a full table scan	`CREATE INDEX`, `EXPLAIN ANALYZE` (PostgreSQL), Prisma `@@index`, Mongoose `index: true`	B-tree (default, equality + range), GIN (arrays, JSON, full-text search), partial index (`WHERE active = true`), composite index (column order matters: columns used in equality filters first, then range columns), covering index	❌ Over-indexing slows writes (every write updates all indexes). Index on wrong column = unused. Composite index column order matters: `(a, b)` helps a query on `a` but not on `b` alone. Always `EXPLAIN ANALYZE` before and after adding index	PostgreSQL indexing, Use the Index, Luke
Database Transactions	Group multiple operations into an atomic unit — all succeed or all fail (ACID: Atomicity, Consistency, Isolation, Durability)	`pg` `client.query('BEGIN')`, Prisma `$transaction()`, Mongoose sessions	Pessimistic locking (`SELECT FOR UPDATE`), optimistic locking (version field), transaction isolation levels (Read Committed vs Serializable), saga pattern for distributed transactions	❌ Long-running transactions hold locks and block other queries. Nested transactions need savepoints. Distributed transactions across services are fundamentally hard — use saga pattern with compensating actions instead	PostgreSQL Transactions
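
Optimistic locking with a version field, mentioned in the Database Transactions row, is easiest to see with a conditional update. The sketch below assumes an in-memory `accounts` map in place of a real table; the function names are hypothetical. `updateIfUnchanged` mirrors the spirit of `UPDATE accounts SET balance = $1, version = version + 1 WHERE id = $2 AND version = $3`, where zero affected rows means another writer got there first.

```typescript
interface AccountRow { id: number; balance: number; version: number }

// In-memory stand-in for an "accounts" table.
const accounts = new Map<number, AccountRow>([
  [1, { id: 1, balance: 100, version: 1 }],
]);

// Returns a copy, like a row snapshot read by one client.
function readAccount(id: number): AccountRow {
  const row = accounts.get(id);
  if (row === undefined) throw new Error(`Account ${id} not found`);
  return { ...row };
}

// Conditional write: succeeds only if nobody changed the row since it was read.
function updateIfUnchanged(id: number, newBalance: number, expectedVersion: number): boolean {
  const row = accounts.get(id);
  if (row === undefined || row.version !== expectedVersion) return false;
  accounts.set(id, { id, balance: newBalance, version: expectedVersion + 1 });
  return true;
}

// Read, compute, write; on a version conflict, re-read and try again.
function withdraw(id: number, amount: number, maxAttempts = 3): AccountRow {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const snapshot = readAccount(id);
    if (snapshot.balance < amount) throw new Error("Insufficient funds");
    if (updateIfUnchanged(id, snapshot.balance - amount, snapshot.version)) {
      return readAccount(id);
    }
  }
  throw new Error("Too many concurrent updates, giving up");
}
```

If two clients read the same snapshot and both try to write, the first write bumps the version and the second is rejected instead of silently overwriting it. No lock is held between the read and the write, which is the main difference from `SELECT FOR UPDATE`.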

Authentication & Authorization

Skill	Core Concepts & Mental Model	Tools & Libraries	Key Techniques	Tradeoffs & Failure Modes	Resources
Session-based Auth	Server stores session state. Client holds a session ID in a cookie. Stateful: the server must look up the session on every request	`express-session`, `connect-redis` (Redis session store), `passport.js`	Server-side session in Redis (not in-memory for multi-server), `HttpOnly + Secure + SameSite=Strict` cookie, session rotation on login (prevent session fixation), session expiry + absolute timeout	❌ Doesn't scale horizontally without a shared session store (Redis): sessions held in one server's RAM are invisible to the other instances. Session fixation attack if session ID not regenerated on login	—
JWT (JSON Web Tokens)	Stateless tokens — all data encoded in the token itself. Server signs with secret; client stores and sends on every request. No server-side session lookup	`jsonwebtoken`, `jose` (modern, supports Web Crypto), `passport-jwt`	Access token (short-lived: 15min) + refresh token (long-lived: 7-30 days, stored `HttpOnly` cookie), token rotation on refresh, `RS256` (asymmetric) vs `HS256` (symmetric), `jti` claim for token blacklisting	❌ JWTs cannot be invalidated before expiry without a blocklist (defeats stateless benefit). Storing JWT in `localStorage` = XSS can steal it. Storing in cookie = CSRF risk (mitigate with `SameSite` + CSRF token). Large payload bloats every request	jwt.io, OWASP JWT Cheatsheet
OAuth 2.0 & OpenID Connect	OAuth 2.0 = authorization delegation ("this app can read your Google Drive"). OIDC = identity layer on top ("this is who the user is"). Backend implements the OAuth flow server-side (authorization code + PKCE)	`passport.js` (OAuth strategies), Auth.js/NextAuth, Clerk, Auth0, `openid-client`	Authorization Code + PKCE flow, exchange code for tokens at token endpoint, verify ID token signature, store refresh token server-side, token introspection for API protection	❌ Implicit flow is deprecated (tokens in URL = leaked in logs/history). Never expose client secret in frontend. State parameter required (prevents CSRF on OAuth callback). Token expiry mismatch between access/refresh tokens	OAuth 2.0 RFC, PKCE
RBAC (Role-Based Access Control)	Users have roles; roles have permissions. Authorization = check if user's role permits the action on the resource	Custom middleware, `casl` (Node.js), `accesscontrol`	`user.role = 'admin' | 'editor' | 'viewer'`, permission check middleware `requireRole('admin')`, resource ownership check (`user.id === resource.ownerId`), ABAC (attribute-based) for complex rules
Password Security	Passwords must be hashed (not encrypted) with a slow, salted algorithm. Encryption is reversible; hashing is not	`bcrypt`, `argon2` (preferred — winner of Password Hashing Competition)	Hash before storing (`argon2.hash(password)`), compare on login (`argon2.verify(hash, input)`), never log passwords, minimum work factor (argon2: `memoryCost: 65536, timeCost: 3`), pepper (server-side secret)	❌ MD5/SHA-1/SHA-256 are NOT safe for passwords (too fast — brute-forceable). `bcrypt` max 72 bytes — truncation vulnerability for long passwords. Never store plaintext, never encrypt (decryptable)	OWASP Password Storage
API Key Management	Machine-to-machine authentication. Keys must be stored hashed on server (like passwords) — only shown to user once at creation	Custom implementation, `nanoid` for key generation	Generate cryptographically random key (`crypto.randomBytes(32).toString('hex')`), store only the hash in DB, prefix keys for identification (`sk_live_...`), scope keys to specific permissions, key rotation	❌ API keys in source code / git = major breach (use secret scanning in CI). No expiry on API keys = indefinite access after leak. Missing rate limiting per key = abuse	—
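
The API key techniques in the last row (random generation, hash-only storage, identifying prefix, scopes) can be sketched with Node's built-in `crypto` module. Assumptions to note: the `keyStore` map stands in for a database table, the key layout `sk_live_<id>_<secret>` and the function names are hypothetical, and a single SHA-256 pass is used on the assumption that the key is long and random (unlike a human-chosen password, which still needs `argon2` or `bcrypt`).

```typescript
import { createHash, randomBytes, timingSafeEqual } from "node:crypto";

const KEY_PREFIX = "sk_live_";

interface StoredKey { id: string; hash: string; scopes: string[] }

// Stand-in for a database table. Only the hash is kept, never the key itself.
const keyStore = new Map<string, StoredKey>();

function sha256Hex(value: string): string {
  return createHash("sha256").update(value).digest("hex");
}

// Creates a key and returns it once; the caller must show it to the user now.
function issueApiKey(scopes: string[]): string {
  const id = randomBytes(4).toString("hex"); // public lookup id
  const secret = randomBytes(32).toString("hex"); // 256 bits of randomness
  const apiKey = `${KEY_PREFIX}${id}_${secret}`;
  keyStore.set(id, { id, hash: sha256Hex(apiKey), scopes });
  return apiKey;
}

// Checks a presented key against the stored hash and the required scope.
function verifyApiKey(apiKey: string, requiredScope: string): boolean {
  if (!apiKey.startsWith(KEY_PREFIX)) return false;
  const id = apiKey.slice(KEY_PREFIX.length).split("_")[0] ?? "";
  const record = keyStore.get(id);
  if (record === undefined) return false;
  // Both buffers are 32-byte SHA-256 digests, so lengths always match.
  const expected = Buffer.from(record.hash, "hex");
  const presented = Buffer.from(sha256Hex(apiKey), "hex");
  return timingSafeEqual(expected, presented) && record.scopes.includes(requiredScope);
}
```

Because only the hash is stored, a leaked database dump does not reveal usable keys, and the short id in the key lets the server find the record without scanning every hash. Expiry, rotation, and per-key rate limiting are left out of this sketch.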

Data Validation & Error Handling

Skill	Core Concepts & Mental Model	Tools & Libraries	Key Techniques	Tradeoffs & Failure Modes	Resources
Schema Validation	Validate all incoming data at the boundary (request enters your system). Never trust client data — validate type, format, range, and required fields	Zod (TypeScript-first, preferred), Joi, Yup, `class-validator` (NestJS)	Validate `req.body`, `req.params`, `req.query` before any business logic. Zod: `z.object({...}).parse(req.body)` — throws `ZodError` on failure. Use `.safeParse()` for controlled error handling	❌ Validating after DB query = error after side effects (e.g. charge then validate = bad). Missing `req.params` validation = SQL injection / path traversal. `Joi.object().unknown(true)` (allowing unknown fields) = schema bypass	Zod docs
Error Handling Architecture	Centralized error handling. Every thrown error should flow to one place — not scattered `res.status(500).json(...)` calls everywhere	Express error middleware, NestJS exception filters, custom `AppError` class	Custom error class `class AppError extends Error { constructor(message, statusCode, code) }`, throw from anywhere, catch in Express `(err, req, res, next)` global handler, map error types to HTTP status codes, consistent error response shape	❌ `try/catch` in every route = inconsistent error formats. Unhandled promise rejections crash Node process — add `process.on('unhandledRejection', ...)`. Leaking stack traces / DB errors to client in production = security vulnerability	—
Consistent Error Response Shape	Frontend must be able to reliably parse error responses. Pick a shape and enforce it everywhere	Custom error middleware, Zod error formatter	`{ success: false, error: { code: 'VALIDATION_ERROR', message: '...', details: [...] } }`. Use error codes (not just messages) so frontend can handle programmatically without string matching	❌ Different error shapes per endpoint = frontend needs per-endpoint error handling. HTTP 200 with `{ error: true }` in body = breaks standard HTTP error handling	—
Input Sanitization	Sanitize user input for the context where it is used. Validation asks "is this data valid?"; sanitization says "make this data safe"	`DOMPurify` (client-side), `sanitize-html` (server-side), `helmet` (security headers, defense in depth)	Trim whitespace, strip HTML from non-HTML fields, escape output for the context that renders it (HTML, shell) at display time rather than relying only on storage-time cleaning, use parameterized queries (never string-concat SQL)	❌ Sanitizing once before storage and then trusting stored data = stored XSS whenever another write path or an older row bypassed the sanitizer. SQL built with string concatenation or template literals is injectable; parameterized queries are the fix	—
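
The centralized error handling and consistent response shape described in this table can be combined in one small mapping function. This is a minimal sketch: the `AppError` constructor follows the signature given in the table, while `toErrorResponse`, the `INTERNAL_ERROR` code, and the generic message are hypothetical choices. In an Express app the function would be called from the global `(err, req, res, next)` handler.

```typescript
// Application error carrying an HTTP status and a machine-readable code.
class AppError extends Error {
  readonly statusCode: number;
  readonly code: string;
  readonly details: unknown[];

  constructor(message: string, statusCode: number, code: string, details: unknown[] = []) {
    super(message);
    // Keeps `instanceof AppError` working when compiled to older targets.
    Object.setPrototypeOf(this, AppError.prototype);
    this.name = "AppError";
    this.statusCode = statusCode;
    this.code = code;
    this.details = details;
  }
}

interface ErrorBody {
  success: false;
  error: { code: string; message: string; details: unknown[] };
}

// Single place where any thrown value becomes an HTTP status plus body.
function toErrorResponse(err: unknown): { status: number; body: ErrorBody } {
  if (err instanceof AppError) {
    return {
      status: err.statusCode,
      body: {
        success: false,
        error: { code: err.code, message: err.message, details: err.details },
      },
    };
  }
  // Unknown errors: never leak stack traces or database messages to clients.
  return {
    status: 500,
    body: {
      success: false,
      error: { code: "INTERNAL_ERROR", message: "Something went wrong", details: [] },
    },
  };
}
```

Route code can then throw `new AppError("Email is required", 422, "VALIDATION_ERROR", [{ field: "email" }])` from anywhere, and the client always receives the same shape with a code it can branch on.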