The Microsoft SQL Server Bible

A complete T-SQL reference — fundamentals through window functions, CTEs, stored procedures, error handling, MERGE, indexing, JSON, and query optimization. Every concept explained with real-world use cases.

✍️ Created & curated by Cody Jent

How to Use This Bible

This guide is designed for non-linear learning. Use the sidebar to jump to any topic. Search for any keyword, function, or clause. Each Part stands independently — start wherever you need.

🟢 Beginner

Part I & II. Execution order, SELECT, WHERE, JOINs, aggregation, and built-in functions.

🔵 Intermediate

Parts II–III. Subqueries, set operations, PIVOT, and advanced filtering patterns.

🔴 Advanced

Part III & IV. Window functions, CTEs, stored procedures, triggers, and dynamic SQL.

🏆 Mastery

Part V. Indexes, execution plans, MERGE, JSON/XML, and T-SQL gotchas.

Part I

Foundations — Core T-SQL Query Concepts

Master the mental models that explain how SQL Server actually executes queries, filters, joins, and aggregates data.

Query Execution Order — Start Here Beginner

The most important concept in T-SQL. You write clauses in one order; SQL Server processes them in a completely different order. This single fact explains most query errors and performance pitfalls.

You write SQL as: SELECT → FROM → WHERE → GROUP BY → HAVING → ORDER BY. SQL Server runs it like this:

1FROM / JOINAssemble the full working dataset

2WHEREFilter individual rows (before grouping)

3GROUP BYCollapse rows into groups

4HAVINGFilter groups by aggregate value

5SELECTCompute output columns and aliases

6DISTINCTRemove duplicate output rows

7ORDER BYSort result set

8TOP / OFFSET-FETCHTrim final result set

T-SQL

-- FAILS: alias used in WHERE, but WHERE runs before SELECT creates it
SELECT salary * 1.1 AS adjusted
FROM employees
WHERE adjusted > 100000;   -- ERROR: Invalid column name 'adjusted'

-- FIX 1: repeat the expression in WHERE
SELECT salary * 1.1 AS adjusted
FROM employees
WHERE salary * 1.1 > 100000;

-- FIX 2: wrap in a subquery (derived table)
SELECT * FROM (
    SELECT *, salary * 1.1 AS adjusted FROM employees
) base
WHERE adjusted > 100000;

-- FIX 3: CTE — cleanest and most readable
WITH base AS (
    SELECT *, salary * 1.1 AS adjusted FROM employees
)
SELECT * FROM base WHERE adjusted > 100000;

-- NOTE: Unlike MySQL, SQL Server does NOT allow aliases in GROUP BY
-- You must repeat the expression, or use a subquery / CTE
SELECT YEAR(order_date) AS yr, COUNT(*) AS cnt
FROM orders
GROUP BY YEAR(order_date)   -- must repeat expression, not alias
HAVING COUNT(*) > 10
ORDER BY yr;               -- aliases ARE allowed in ORDER BY

⚠️ SQL Server vs MySQL aliases: SQL Server is stricter — aliases defined in SELECT cannot be referenced in WHERE, GROUP BY, or HAVING. Only ORDER BY can see SELECT aliases. Always use CTEs to avoid repeating complex expressions.

SELECT & FROM Beginner

SELECT defines the shape of your output. FROM is where the data lives. T-SQL adds schema-qualified naming, TOP, and several computed column techniques that differ from other SQL dialects.

T-SQL

-- Always qualify with schema (dbo is the default)
SELECT emp_id, first_name, last_name, salary
FROM dbo.employees;

-- Computed columns and aliases
SELECT
    first_name + ' ' + last_name              AS full_name,
    salary * 12                               AS annual_salary,
    ROUND(salary / 2080.0, 2)               AS hourly_rate,
    DATEDIFF(YEAR, hire_date, GETDATE())    AS years_tenure,
    IIF(salary > 100000, 'Senior', 'Standard') AS tier
FROM dbo.employees;

-- Table aliases: essential for joins
SELECT e.first_name, d.dept_name, l.city
FROM  dbo.employees   e
JOIN  dbo.departments d ON e.dept_id     = d.dept_id
JOIN  dbo.locations   l ON d.location_id = l.location_id;

-- TOP N — SQL Server's row limiter (replaces LIMIT)
SELECT TOP 10 * FROM dbo.orders ORDER BY order_date DESC;

-- TOP with PERCENT
SELECT TOP 5 PERCENT * FROM dbo.sales ORDER BY amount DESC;

-- TOP WITH TIES — includes rows tied at the boundary
SELECT TOP 3 WITH TIES employee_id, salary
FROM dbo.employees
ORDER BY salary DESC;

-- SELECT INTO — creates a new table from a query result
SELECT * INTO dbo.employees_backup
FROM dbo.employees
WHERE dept_id = 5;

-- DISTINCT
SELECT DISTINCT department FROM dbo.employees;
SELECT DISTINCT department, job_title FROM dbo.employees; -- unique combos

📌 Use Case — Clean Report with Calculated Fields

T-SQL

SELECT
    e.emp_id,
    e.first_name + ' ' + e.last_name            AS full_name,
    d.dept_name,
    '$' + FORMAT(e.salary, 'N2')               AS salary_fmt,
    DATEDIFF(YEAR, e.hire_date, GETDATE())     AS years,
    CASE
        WHEN e.salary >= 100000 THEN 'Senior'
        WHEN e.salary >= 70000  THEN 'Mid-Level'
        ELSE 'Junior'
    END                                           AS level
FROM dbo.employees e
JOIN dbo.departments d ON e.dept_id = d.dept_id
WHERE e.is_active = 1
ORDER BY years DESC, full_name;

WHERE & Filtering Beginner

WHERE is your row-level filter — it runs early, before grouping, so it's your primary performance lever. Writing a bad WHERE clause is the fastest way to kill query speed.

T-SQL

-- Basic comparisons
WHERE salary > 80000
WHERE status = 'Active'
WHERE hire_date >= '2020-01-01'
WHERE dept_id != 3           -- or <>

-- AND / OR — AND binds tighter than OR, always use parentheses
WHERE (dept_id = 1 OR dept_id = 2) AND salary > 60000

-- BETWEEN (inclusive on both ends)
WHERE salary BETWEEN 50000 AND 120000
WHERE order_date BETWEEN '2024-01-01' AND '2024-12-31'

-- IN / NOT IN
WHERE dept_id IN (1, 3, 7)
WHERE status NOT IN ('Terminated', 'On Leave')

-- LIKE — pattern matching
WHERE last_name LIKE 'Sm%'       -- starts with Sm
WHERE email LIKE '%@gmail.com'   -- ends with
WHERE phone LIKE '555-____'      -- _ = single character
WHERE code LIKE '[A-C]%'         -- starts with A, B, or C
WHERE code LIKE '[^0-9]%'        -- does NOT start with a digit

-- NULL checks — never use = NULL, always IS NULL / IS NOT NULL
WHERE manager_id IS NULL
WHERE termination_date IS NOT NULL

-- Date filtering — avoid wrapping indexed columns in functions!
-- BAD (non-sargable — can't use index): YEAR(order_date) = 2024
-- GOOD (sargable — can use index on order_date):
WHERE order_date >= '2024-01-01' AND order_date < '2025-01-01'

Sargability — The Performance Rule

A predicate is sargable (Search ARGument ABLE) if SQL Server can use an index to satisfy it. Wrapping a column in a function destroys sargability.

Non-Sargable ❌ (slow)	Sargable ✅ (fast)
`WHERE YEAR(order_date) = 2024`	`WHERE order_date >= '2024-01-01' AND order_date < '2025-01-01'`
`WHERE UPPER(last_name) = 'SMITH'`	`WHERE last_name = 'Smith'` (or use case-insensitive collation)
`WHERE LEN(phone) = 10`	Add a computed column with an index instead
`WHERE salary + bonus > 100000`	`WHERE salary > 100000 - bonus`
`WHERE CONVERT(VARCHAR, id) = '42'`	`WHERE id = 42`

🚨 Implicit type conversions: Comparing a VARCHAR column to an INT literal forces SQL Server to convert every row. Always match data types in your predicates — this is one of the most common accidental performance killers.

JOINs — All Types Intermediate

JOINs combine rows from multiple tables. Understanding which join type returns which rows is fundamental — getting this wrong produces silent data errors that are hard to debug.

T-SQL

-- INNER JOIN — only rows with matches in BOTH tables
SELECT e.first_name, d.dept_name
FROM  dbo.employees   e
JOIN  dbo.departments d ON e.dept_id = d.dept_id;
-- Employees without a dept_id, or depts with no employees, are excluded

-- LEFT JOIN — all rows from left table; NULLs where no right match
SELECT e.first_name, d.dept_name
FROM       dbo.employees   e
LEFT JOIN  dbo.departments d ON e.dept_id = d.dept_id;
-- All employees returned; dept_name is NULL if no department assigned

-- RIGHT JOIN — all rows from right table (rarely used; prefer LEFT JOIN)
SELECT e.first_name, d.dept_name
FROM        dbo.employees   e
RIGHT JOIN  dbo.departments d ON e.dept_id = d.dept_id;

-- FULL OUTER JOIN — all rows from both tables
SELECT e.first_name, d.dept_name
FROM             dbo.employees   e
FULL OUTER JOIN  dbo.departments d ON e.dept_id = d.dept_id;

-- Find orphans on BOTH sides
WHERE e.dept_id IS NULL OR d.dept_id IS NULL;

-- CROSS JOIN — every row × every row (Cartesian product)
SELECT p.product_name, c.color_name
FROM        dbo.products p
CROSS JOIN  dbo.colors   c;
-- 10 products × 5 colors = 50 rows. Use for generating combinations.

-- SELF JOIN — a table joined to itself
SELECT e.first_name AS employee, m.first_name AS manager
FROM  dbo.employees e
LEFT JOIN  dbo.employees m ON e.manager_id = m.emp_id;

Anti-Joins — Finding Non-Matches

T-SQL

-- Pattern 1: LEFT JOIN + IS NULL — find unmatched rows
SELECT e.*
FROM      dbo.employees   e
LEFT JOIN dbo.time_sheets t ON e.emp_id = t.emp_id
    AND t.week_start = '2024-01-01'
WHERE t.emp_id IS NULL;   -- employees with no timesheet that week

-- Pattern 2: NOT EXISTS — often faster, handles NULLs cleanly
SELECT * FROM dbo.employees e
WHERE NOT EXISTS (
    SELECT 1 FROM dbo.time_sheets t
    WHERE t.emp_id = e.emp_id AND t.week_start = '2024-01-01'
);

-- Pattern 3: NOT IN (avoid if subquery can return NULL — entire result becomes empty)
SELECT * FROM dbo.employees
WHERE emp_id NOT IN (SELECT emp_id FROM dbo.time_sheets
                     WHERE emp_id IS NOT NULL);

📌 Use Case — Multi-Table Customer Order Report

T-SQL

SELECT
    c.customer_id,
    c.company_name,
    COUNT(o.order_id)      AS total_orders,
    SUM(oi.quantity * oi.unit_price) AS lifetime_value,
    MAX(o.order_date)      AS last_order_date,
    r.rep_name             AS sales_rep
FROM      dbo.customers    c
LEFT JOIN dbo.orders       o  ON c.customer_id = o.customer_id
LEFT JOIN dbo.order_items  oi ON o.order_id    = oi.order_id
LEFT JOIN dbo.sales_reps   r  ON c.rep_id      = r.rep_id
GROUP BY c.customer_id, c.company_name, r.rep_name
ORDER BY lifetime_value DESC;

GROUP BY & HAVING Intermediate

GROUP BY collapses rows into groups; aggregate functions summarize each group. HAVING filters those groups. Together they power every summary report and analytics query.

T-SQL

-- Basic aggregation
SELECT
    dept_id,
    COUNT(*)              AS headcount,
    COUNT(manager_id)     AS managed_count,  -- ignores NULLs
    COUNT(DISTINCT job_title) AS unique_roles,
    AVG(salary)           AS avg_salary,
    MIN(salary)           AS min_salary,
    MAX(salary)           AS max_salary,
    SUM(salary)           AS payroll_cost,
    STDEV(salary)         AS salary_stdev
FROM dbo.employees
GROUP BY dept_id;

-- HAVING filters groups (WHERE filters rows — they run at different times)
SELECT dept_id, COUNT(*) AS cnt
FROM dbo.employees
WHERE is_active = 1           -- filter rows first (cheap)
GROUP BY dept_id
HAVING COUNT(*) >= 5         -- then filter groups (on aggregate)
ORDER BY cnt DESC;

-- ROLLUP — adds subtotals and grand total
SELECT region, product_line, SUM(sales) AS total
FROM dbo.sales
GROUP BY ROLLUP(region, product_line);
-- NULL in region = grand total; NULL in product_line = region subtotal

-- CUBE — all possible subtotal combinations
SELECT region, product_line, SUM(sales) AS total
FROM dbo.sales
GROUP BY CUBE(region, product_line);

-- GROUPING() — tells you if a NULL is a subtotal or real data
SELECT
    CASE WHEN GROUPING(region) = 1 THEN 'ALL REGIONS' ELSE region END AS region,
    SUM(sales) AS total
FROM dbo.sales
GROUP BY ROLLUP(region);

-- GROUPING SETS — define exactly which grouping combinations you want
SELECT region, product_line, SUM(sales)
FROM dbo.sales
GROUP BY GROUPING SETS (
    (region, product_line),
    (region),
    ()
);

💡 COUNT(*) vs COUNT(col): COUNT(*) counts all rows including NULLs. COUNT(column) only counts non-NULL values in that column. Use COUNT(DISTINCT col) for unique value counts.

ORDER BY & TOP / OFFSET-FETCH Beginner

SQL Server replaces MySQL's LIMIT with TOP for simple cases and OFFSET-FETCH for pagination. TOP is evaluated last, after ORDER BY — never use TOP without ORDER BY unless row order truly doesn't matter.

T-SQL

-- Basic ORDER BY
SELECT first_name, last_name, salary
FROM dbo.employees
ORDER BY salary DESC, last_name ASC;

-- NULLs sort LAST in ASC, FIRST in DESC (SQL Server behavior)
-- Force NULLs to sort last regardless of direction:
ORDER BY CASE WHEN bonus IS NULL THEN 1 ELSE 0 END, bonus DESC;

-- TOP — get first N rows
SELECT TOP 5 emp_id, salary
FROM dbo.employees
ORDER BY salary DESC;

-- TOP with variable
DECLARE @n INT = 10;
SELECT TOP (@n) * FROM dbo.employees ORDER BY emp_id;

-- OFFSET-FETCH — proper pagination (SQL Server 2012+)
-- Page 3, 20 rows per page:
SELECT emp_id, first_name, salary
FROM dbo.employees
ORDER BY emp_id
OFFSET  40 ROWS          -- skip (page-1) × page_size rows
FETCH NEXT 20 ROWS ONLY;  -- take page_size rows

-- Parameterized pagination in a stored procedure:
DECLARE @page INT = 3, @size INT = 20;
SELECT * FROM dbo.employees
ORDER BY emp_id
OFFSET (@page - 1) * @size ROWS
FETCH NEXT @size ROWS ONLY;

⚠️ ORDER BY required for OFFSET-FETCH: Unlike TOP, OFFSET-FETCH mandates an ORDER BY clause. Without deterministic ordering, pagination results are unpredictable between pages.

Part II

Built-In Functions

String manipulation, numeric operations, date/time arithmetic, conditional logic, and NULL handling — the tools you'll use in every query.

String Functions Intermediate

T-SQL string functions differ meaningfully from MySQL equivalents. String concatenation uses + instead of CONCAT in older code, but CONCAT is the safe choice — it handles NULLs gracefully.

T-SQL

-- Concatenation
'Hello' + ' ' + 'World'              -- NULLs propagate! NULL + 'x' = NULL
CONCAT('Hello', ' ', 'World')       -- NULL-safe: treats NULL as empty string
CONCAT_WS(', ', city, state, zip)  -- separator, skips NULLs automatically

-- Case transformation
UPPER('hello')          -- 'HELLO'
LOWER('HELLO')          -- 'hello'

-- Trimming
LTRIM('  hello  ')      -- 'hello  '  (left trim)
RTRIM('  hello  ')      -- '  hello'  (right trim)
TRIM('  hello  ')       -- 'hello'    (both, SQL Server 2017+)
TRIM('x' FROM 'xxhelloxx') -- trim specific character

-- Length
LEN('Hello')            -- 5 (excludes trailing spaces)
DATALENGTH('Hello')     -- 5 for VARCHAR, 10 for NVARCHAR (bytes)

-- Substrings
SUBSTRING('Hello World', 7, 5)  -- 'World' (start, length)
LEFT('Hello World', 5)          -- 'Hello'
RIGHT('Hello World', 5)         -- 'World'

-- Search
CHARINDEX('World', 'Hello World')     -- 7 (1-based, 0 if not found)
CHARINDEX('l', 'Hello', 4)            -- 4 (start search at position 4)
PATINDEX('%[0-9]%', 'abc123')        -- 4 (supports LIKE patterns)

-- Replace & Repeat
REPLACE('Hello World', 'World', 'SQL')  -- 'Hello SQL'
REPLICATE('AB', 3)                      -- 'ABABAB'
STUFF('Hello World', 6, 6, ' T-SQL')   -- 'Hello T-SQL'

-- Padding
FORMAT(42, 'D5')          -- '00042' (left-pad numbers)
RIGHT('00000' + CAST(id AS VARCHAR), 5)  -- '00042' (traditional)

-- String splitting (SQL Server 2016+)
SELECT value FROM STRING_SPLIT('apple,banana,cherry', ',');

-- STRING_AGG — concatenate rows into a string (SQL Server 2017+)
SELECT dept_id,
    STRING_AGG(first_name, ', ') WITHIN GROUP (ORDER BY first_name) AS members
FROM dbo.employees
GROUP BY dept_id;

💡 VARCHAR vs NVARCHAR: Use NVARCHAR for any column that might store non-ASCII characters (names, addresses, etc.). Always prefix Unicode string literals with N: N'Héllo'. Mixing VARCHAR and NVARCHAR in joins or comparisons causes implicit conversions that kill index seeks.

Numeric Functions Beginner

T-SQL numeric handling has a critical trap: integer division. Always ensure at least one operand is a decimal to get the result you expect.

T-SQL

-- Integer division trap
SELECT 7 / 2        -- = 3 (integer division!)
SELECT 7 / 2.0      -- = 3.5 (decimal)
SELECT 7 * 1.0 / 2  -- = 3.5 (force decimal early)
SELECT CAST(7 AS DECIMAL(10,2)) / 2  -- = 3.50

-- Rounding
ROUND(3.456, 2)      -- 3.460
ROUND(3456, -2)      -- 3500  (rounds to hundreds)
FLOOR(3.9)           -- 3     (round down always)
CEILING(3.1)         -- 4     (round up always)

-- Power, sqrt, log
POWER(2, 10)         -- 1024
SQRT(144)            -- 12.0
LOG(100)             -- natural log ≈ 4.605
LOG(100, 10)         -- log base 10 = 2.0
EXP(1)               -- e ≈ 2.718

-- Absolute value, modulo, sign
ABS(-42)             -- 42
17 % 5               -- 2  (modulo operator)
SIGN(-5)             -- -1 / 0 / 1

-- Random integers 1–100 (per-row safe)
ABS(CHECKSUM(NEWID())) % 100 + 1

-- FORMAT — locale-aware number formatting
FORMAT(1234567.89, 'N2')    -- '1,234,567.89'
FORMAT(0.1234, 'P1')        -- '12.3%'
FORMAT(9.99, 'C', 'en-US') -- '$9.99'

-- Safe division (avoid divide by zero error)
SELECT numerator / NULLIF(denominator, 0)                     -- returns NULL
SELECT ISNULL(numerator / NULLIF(denominator, 0), 0)         -- returns 0

Date & Time Functions Intermediate

SQL Server has a rich set of date/time functions and multiple date data types. Choosing the right type and writing sargable date predicates are the two most impactful date-handling decisions you'll make.

Date Data Types

Type	Storage	Range	Precision	Use When
`DATE`	3 bytes	0001–9999	Day	Birth dates, calendar dates
`TIME`	3–5 bytes	—	100ns	Time of day only
`DATETIME`	8 bytes	1753–9999	~3ms	Legacy; avoid in new code
`DATETIME2`	6–8 bytes	0001–9999	100ns	Preferred date+time type
`SMALLDATETIME`	4 bytes	1900–2079	1 min	Low-precision legacy
`DATETIMEOFFSET`	8–10 bytes	0001–9999	100ns + tz	Time zones, global apps

T-SQL

-- Getting current date/time
GETDATE()           -- DATETIME, local server time
GETUTCDATE()        -- DATETIME, UTC time
SYSDATETIME()       -- DATETIME2, higher precision, local
SYSUTCDATETIME()    -- DATETIME2, higher precision, UTC
SYSDATETIMEOFFSET() -- DATETIMEOFFSET with timezone info

-- Extracting date parts
YEAR(order_date)                      -- 2024
MONTH(order_date)                     -- 1–12
DAY(order_date)                       -- 1–31
DATEPART(weekday, order_date)         -- 1=Sunday...7=Saturday
DATEPART(quarter, order_date)        -- 1–4
DATENAME(month, order_date)          -- 'January'
EOMONTH(order_date)                  -- last day of that month
EOMONTH(order_date, 1)              -- last day of next month

-- Date arithmetic
DATEADD(day, 30, order_date)         -- add 30 days
DATEADD(month, -3, GETDATE())        -- 3 months ago
DATEDIFF(day, start_date, end_date)  -- days between
DATEDIFF(year, birth_date, GETDATE()) -- approximate age

-- Truncating to start of period
CAST(order_date AS DATE)                                  -- strip time portion
DATEFROMPARTS(YEAR(order_date), MONTH(order_date), 1)   -- first of month
DATEFROMPARTS(YEAR(order_date), 1, 1)                   -- first of year

-- Converting / parsing dates
CAST('2024-06-15' AS DATE)          -- safe, ISO format always works
CONVERT(DATE, '15/06/2024', 103)    -- style 103 = dd/mm/yyyy
TRY_CAST('bad-date' AS DATE)        -- returns NULL instead of error

-- Formatting dates for display
FORMAT(order_date, 'yyyy-MM-dd')       -- '2024-06-15'
FORMAT(order_date, 'MMMM d, yyyy')     -- 'June 15, 2024'
CONVERT(VARCHAR, order_date, 101)     -- '06/15/2024' (faster than FORMAT)

📌 Use Case — Monthly Sales Summary with Running Total

T-SQL

SELECT
    FORMAT(order_date, 'yyyy-MM')               AS month,
    COUNT(*)                                     AS order_count,
    SUM(total_amount)                            AS revenue,
    SUM(SUM(total_amount)) OVER (
        ORDER BY MIN(order_date) ROWS UNBOUNDED PRECEDING
    )                                            AS running_total
FROM dbo.orders
WHERE order_date >= DATEFROMPARTS(YEAR(GETDATE())-1, 1, 1)
  AND status = 'Completed'
GROUP BY FORMAT(order_date, 'yyyy-MM')
ORDER BY month;

CASE & IIF Beginner

CASE is SQL Server's primary conditional expression — it works anywhere a value is expected: SELECT, WHERE, ORDER BY, GROUP BY, and even inside aggregate functions.

T-SQL

-- Searched CASE (most flexible)
SELECT
    first_name, salary,
    CASE
        WHEN salary >= 120000 THEN 'Executive'
        WHEN salary >= 90000  THEN 'Senior'
        WHEN salary >= 60000  THEN 'Mid-Level'
        ELSE 'Junior'
    END AS grade
FROM dbo.employees;

-- Simple CASE (equality only)
SELECT
    CASE status
        WHEN 'A' THEN 'Active'
        WHEN 'I' THEN 'Inactive'
        WHEN 'T' THEN 'Terminated'
        ELSE 'Unknown'
    END AS status_label
FROM dbo.employees;

-- IIF — shorthand IF/ELSE (SQL Server 2012+)
SELECT IIF(salary > 100000, 'High', 'Standard') AS tier FROM dbo.employees;

-- CASE inside aggregate — conditional aggregation
SELECT
    dept_id,
    COUNT(*) AS total,
    SUM(CASE WHEN is_active = 1 THEN 1 ELSE 0 END) AS active_count,
    SUM(CASE WHEN salary > 100000 THEN salary ELSE 0 END) AS senior_payroll,
    AVG(CASE WHEN gender = 'F' THEN salary END) AS avg_female_salary
FROM dbo.employees
GROUP BY dept_id;

-- CASE in ORDER BY — custom sort order
SELECT priority, task_name FROM dbo.tasks
ORDER BY
    CASE priority
        WHEN 'Critical' THEN 1
        WHEN 'High'     THEN 2
        WHEN 'Medium'   THEN 3
        ELSE 4
    END, task_name;

-- CHOOSE — pick the Nth item from a list (1-indexed)
SELECT CHOOSE(MONTH(GETDATE()), 'Q1','Q1','Q1','Q2','Q2','Q2',
                                   'Q3','Q3','Q3','Q4','Q4','Q4') AS quarter;

NULL Handling Intermediate

NULL is not a value — it means unknown. Any comparison with NULL yields NULL (not TRUE or FALSE), which means rows are silently excluded from results. This causes more data bugs than almost anything else in SQL.

T-SQL

-- NULL arithmetic: NULL propagates through all operations
SELECT NULL + 5         -- NULL
SELECT NULL * 0         -- NULL (not 0!)
SELECT NULL = NULL      -- NULL (not TRUE!)
SELECT NULL IS NULL    -- TRUE — correct way to check for NULL

-- ISNULL — replace NULL with a default (SQL Server specific)
ISNULL(bonus, 0)
ISNULL(manager_name, 'No Manager')

-- COALESCE — returns first non-NULL; ANSI standard, accepts multiple args
COALESCE(bonus, commission, 0)
COALESCE(nickname, first_name, 'Unknown')

-- NULLIF — returns NULL if two values are equal (avoids div/0)
NULLIF(denominator, 0)
numerator / NULLIF(denominator, 0)  -- safe division

-- NULLs in aggregates: all aggregate functions ignore NULLs except COUNT(*)
AVG(bonus)       -- average of non-NULL bonus values only
COUNT(bonus)     -- count of non-NULL bonus values
COUNT(*)         -- count of ALL rows, regardless of NULLs

-- Practical NULL handling in reports
SELECT
    employee_id,
    COALESCE(preferred_name, first_name)                          AS display_name,
    ISNULL(phone_ext, 'N/A')                                      AS ext,
    salary + ISNULL(bonus, 0)                                    AS total_comp,
    CASE WHEN termination_date IS NULL THEN 'Active' ELSE 'Former' END AS status
FROM dbo.employees;

🚨 NOT IN with NULLs: If a subquery used with NOT IN can return NULLs, the entire NOT IN returns no rows. Always add WHERE col IS NOT NULL to the subquery, or use NOT EXISTS instead — it handles NULLs correctly.

Part III

Advanced Queries

Window functions, CTEs, subqueries, set operations, and PIVOT — the patterns that separate intermediate from advanced T-SQL developers.

Window Functions Advanced

Window functions perform calculations across a set of rows related to the current row — without collapsing them like GROUP BY. They are the single biggest productivity multiplier in analytical SQL.

T-SQL

-- Anatomy: function() OVER (PARTITION BY ... ORDER BY ... ROWS/RANGE ...)

-- RANKING functions
SELECT
    first_name, dept_id, salary,
    ROW_NUMBER() OVER (PARTITION BY dept_id ORDER BY salary DESC) AS rn,
    -- unique sequential; ties get different numbers arbitrarily
    RANK()       OVER (PARTITION BY dept_id ORDER BY salary DESC) AS rnk,
    -- ties same rank; gaps in sequence (1,1,3)
    DENSE_RANK() OVER (PARTITION BY dept_id ORDER BY salary DESC) AS drnk,
    -- ties same rank; no gaps (1,1,2)
    NTILE(4)     OVER (PARTITION BY dept_id ORDER BY salary DESC) AS quartile
FROM dbo.employees;

-- OFFSET functions — LAG and LEAD
SELECT
    order_date, revenue,
    LAG(revenue, 1, 0) OVER (ORDER BY order_date)  AS prev_day_rev,
    LEAD(revenue, 1, 0) OVER (ORDER BY order_date) AS next_day_rev,
    revenue - LAG(revenue) OVER (ORDER BY order_date) AS day_over_day
FROM dbo.daily_sales;

-- AGGREGATE window functions — running totals, moving averages
SELECT
    order_date, revenue,
    SUM(revenue) OVER (ORDER BY order_date ROWS UNBOUNDED PRECEDING) AS running_total,
    AVG(revenue) OVER (ORDER BY order_date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS avg_7d,
    SUM(revenue) OVER ()                                              AS grand_total,
    revenue * 100.0 / SUM(revenue) OVER ()                          AS pct_of_total
FROM dbo.daily_sales;

Practical Patterns

T-SQL

-- Top N per group (top 3 earners per department)
WITH ranked AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY dept_id ORDER BY salary DESC) AS rn
    FROM dbo.employees
)
SELECT * FROM ranked WHERE rn <= 3;

-- Deduplicate — keep most recent record per customer
WITH deduped AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY updated_at DESC) AS rn
    FROM dbo.customer_records
)
SELECT * FROM deduped WHERE rn = 1;

-- Year-over-year comparison with LAG
WITH monthly AS (
    SELECT YEAR(order_date) yr, MONTH(order_date) mo, SUM(amount) rev
    FROM dbo.orders
    GROUP BY YEAR(order_date), MONTH(order_date)
)
SELECT *,
    LAG(rev, 12) OVER (ORDER BY yr, mo) AS same_month_last_year,
    rev - LAG(rev, 12) OVER (ORDER BY yr, mo) AS yoy_change
FROM monthly;

CTEs & Recursive CTEs Intermediate

Common Table Expressions (CTEs) are named subqueries that live for the duration of a single statement. They dramatically improve readability and enable recursive queries for hierarchical data.

T-SQL

-- Basic CTE
WITH high_earners AS (
    SELECT emp_id, first_name, salary, dept_id
    FROM dbo.employees
    WHERE salary > 100000
)
SELECT h.first_name, d.dept_name
FROM  high_earners h
JOIN  dbo.departments d ON h.dept_id = d.dept_id;

-- Multiple CTEs chained together
WITH
dept_stats AS (
    SELECT dept_id, AVG(salary) AS avg_sal
    FROM dbo.employees
    GROUP BY dept_id
),
above_avg AS (
    SELECT e.*, ds.avg_sal
    FROM dbo.employees e
    JOIN dept_stats ds ON e.dept_id = ds.dept_id
    WHERE e.salary > ds.avg_sal
)
SELECT * FROM above_avg ORDER BY dept_id, salary DESC;

-- Recursive CTE — walk an org chart
WITH org_tree AS (
    -- Anchor: start with the CEO (no manager)
    SELECT
        emp_id, first_name, manager_id,
        0 AS level,
        CAST(first_name AS VARCHAR(4000)) AS path
    FROM dbo.employees
    WHERE manager_id IS NULL

    UNION ALL

    -- Recursive: join employees to their parent
    SELECT
        e.emp_id, e.first_name, e.manager_id,
        ot.level + 1,
        ot.path + ' > ' + e.first_name
    FROM dbo.employees e
    JOIN org_tree ot ON e.manager_id = ot.emp_id
)
SELECT
    REPLICATE('  ', level) + first_name AS org_chart,
    level, path
FROM org_tree
ORDER BY path;

-- Generate a date series with recursion
WITH nums AS (
    SELECT 0 AS n
    UNION ALL
    SELECT n + 1 FROM nums WHERE n < 364
)
SELECT DATEADD(day, n, '2024-01-01') AS calendar_date
FROM nums
OPTION (MAXRECURSION 365);

💡 CTEs are not materialized: In SQL Server, CTEs are expanded inline. If you reference the same CTE multiple times and it's expensive, consider a #temp table instead.

Subqueries & EXISTS Intermediate

Subqueries nest one query inside another. EXISTS and NOT EXISTS are often more efficient than IN/NOT IN for existence checks and handle NULLs correctly.

T-SQL

-- Scalar subquery in SELECT
SELECT
    first_name, salary,
    (SELECT AVG(salary) FROM dbo.employees) AS company_avg,
    salary - (SELECT AVG(salary) FROM dbo.employees) AS vs_avg
FROM dbo.employees;

-- Correlated subquery — references outer query row
SELECT
    e.first_name, e.dept_id, e.salary,
    (SELECT AVG(salary) FROM dbo.employees i
     WHERE i.dept_id = e.dept_id) AS dept_avg
FROM dbo.employees e;

-- Derived table (subquery in FROM)
SELECT dept_id, avg_sal
FROM (
    SELECT dept_id, AVG(salary) AS avg_sal
    FROM dbo.employees
    GROUP BY dept_id
) AS dept_avgs
WHERE avg_sal > 80000;

-- EXISTS — stops scanning as soon as one match is found (very fast)
SELECT * FROM dbo.customers c
WHERE EXISTS (
    SELECT 1 FROM dbo.orders o
    WHERE o.customer_id = c.customer_id
      AND o.order_date >= '2024-01-01'
);

-- NOT EXISTS — safe alternative to NOT IN (handles NULLs correctly)
SELECT * FROM dbo.products p
WHERE NOT EXISTS (
    SELECT 1 FROM dbo.order_items oi
    WHERE oi.product_id = p.product_id
);

Set Operations Intermediate

Set operations combine result sets from multiple SELECT statements. UNION ALL is almost always preferable to UNION for performance — UNION requires a deduplication sort.

T-SQL

-- UNION ALL — all rows including duplicates (FAST)
SELECT emp_id, first_name, 'Employee' AS type FROM dbo.employees
UNION ALL
SELECT contractor_id, full_name, 'Contractor' FROM dbo.contractors;

-- UNION — deduplicates (requires sort — use only when needed)
SELECT email FROM dbo.customers
UNION
SELECT email FROM dbo.leads;

-- INTERSECT — rows that appear in BOTH result sets
SELECT product_id FROM dbo.online_sales
INTERSECT
SELECT product_id FROM dbo.store_sales;

-- EXCEPT — rows in first set but NOT in second
SELECT product_id FROM dbo.products
EXCEPT
SELECT product_id FROM dbo.order_items;
-- Products that have never been ordered

-- ORDER BY — only one at the end, applies to combined result
SELECT emp_id, name FROM dbo.employees
UNION ALL
SELECT contractor_id, name FROM dbo.contractors
ORDER BY name;

PIVOT & UNPIVOT Advanced

PIVOT rotates rows into columns. UNPIVOT does the reverse. Dynamic PIVOT with dynamic SQL is required when column values aren't known at query-write time.

T-SQL

-- Static PIVOT — quarters as columns
SELECT product_line, [Q1], [Q2], [Q3], [Q4]
FROM (
    SELECT product_line, quarter, sales
    FROM dbo.quarterly_sales
) src
PIVOT (
    SUM(sales)
    FOR quarter IN ([Q1], [Q2], [Q3], [Q4])
) pvt;

-- UNPIVOT — turn columns back into rows
SELECT product_line, quarter, sales
FROM dbo.pivot_table
UNPIVOT (
    sales FOR quarter IN ([Q1], [Q2], [Q3], [Q4])
) unpvt;

-- Dynamic PIVOT — column values at runtime
DECLARE @cols NVARCHAR(MAX), @sql NVARCHAR(MAX);

SELECT @cols = STRING_AGG(QUOTENAME(quarter), ',') WITHIN GROUP (ORDER BY quarter)
FROM (SELECT DISTINCT quarter FROM dbo.quarterly_sales) q;

SET @sql = N'SELECT product_line, ' + @cols + N'
FROM (SELECT product_line, quarter, sales FROM dbo.quarterly_sales) src
PIVOT (SUM(sales) FOR quarter IN (' + @cols + N')) pvt';

EXEC sp_executesql @sql;

💡 PIVOT alternative: Conditional aggregation with CASE is often more readable: SUM(CASE WHEN quarter = 'Q1' THEN sales END) AS Q1. Use it for simple cross-tabs; reserve PIVOT for automatic column generation.

Part IV

Programmability

Stored procedures, user-defined functions, triggers, cursors, dynamic SQL, error handling, and transactions — writing reusable, robust T-SQL.

Variables & Flow Control Intermediate

T-SQL is a full procedural language. Variables, loops, and conditionals enable complex logic — but should be used sparingly in favor of set-based approaches wherever possible.

T-SQL

-- Declaring and assigning variables
DECLARE @name        NVARCHAR(100) = N'Alice';
DECLARE @salary      DECIMAL(10,2) = 95000.00;
DECLARE @hire_date   DATE         = '2022-03-15';
DECLARE @count       INT;  -- NULL by default

-- SET vs SELECT assignment
SET @count = 42;
SELECT @count = COUNT(*) FROM dbo.employees WHERE is_active = 1;

-- Table variable
DECLARE @results TABLE (
    emp_id   INT,
    name     NVARCHAR(100),
    salary   DECIMAL(10,2)
);
INSERT INTO @results
    SELECT emp_id, first_name + ' ' + last_name, salary
    FROM dbo.employees WHERE dept_id = 3;

-- IF / ELSE
IF @salary > 100000
BEGIN
    PRINT 'High earner';
    UPDATE dbo.employees SET grade = 'Senior' WHERE emp_id = 101;
END
ELSE IF @salary > 70000
BEGIN
    PRINT 'Mid-level';
END
ELSE
BEGIN
    PRINT 'Junior';
END

-- WHILE loop
DECLARE @i INT = 1;
WHILE @i <= 10
BEGIN
    PRINT 'Iteration: ' + CAST(@i AS VARCHAR);
    SET @i += 1;
    IF @i = 5 BREAK;    -- exit loop
    IF @i = 3 CONTINUE; -- skip to next iteration
END

Stored Procedures Advanced

Stored procedures are precompiled, named blocks of T-SQL. They support parameters, return values, output parameters, and full procedural logic — the backbone of application-to-database communication.

T-SQL

-- Create a stored procedure
CREATE OR ALTER PROCEDURE dbo.usp_GetEmployeesByDept
    @dept_id    INT,
    @min_salary DECIMAL(10,2) = 0   -- optional parameter with default
AS
BEGIN
    SET NOCOUNT ON;    -- suppress "N rows affected" messages

    SELECT
        emp_id,
        first_name + ' ' + last_name AS full_name,
        salary, hire_date
    FROM dbo.employees
    WHERE dept_id    = @dept_id
      AND salary     >= @min_salary
      AND is_active  = 1
    ORDER BY salary DESC;
END;

-- Execute
EXEC dbo.usp_GetEmployeesByDept @dept_id = 3;
EXEC dbo.usp_GetEmployeesByDept @dept_id = 3, @min_salary = 80000;

-- OUTPUT parameters — return values back to caller
CREATE OR ALTER PROCEDURE dbo.usp_CreateEmployee
    @first_name  NVARCHAR(50),
    @last_name   NVARCHAR(50),
    @salary      DECIMAL(10,2),
    @dept_id     INT,
    @new_emp_id  INT OUTPUT
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO dbo.employees (first_name, last_name, salary, dept_id, hire_date, is_active)
    VALUES (@first_name, @last_name, @salary, @dept_id, GETDATE(), 1);
    SET @new_emp_id = SCOPE_IDENTITY();
END;

-- Call with OUTPUT parameter
DECLARE @new_id INT;
EXEC dbo.usp_CreateEmployee
    @first_name = N'Alice', @last_name = N'Johnson',
    @salary = 85000, @dept_id = 3,
    @new_emp_id = @new_id OUTPUT;
PRINT 'New employee ID: ' + CAST(@new_id AS VARCHAR);

⚠️ Use CREATE OR ALTER PROCEDURE (SQL Server 2016+) to avoid needing separate CREATE and ALTER scripts. Always include SET NOCOUNT ON — row count messages can break calling applications.

User-Defined Functions Advanced

UDFs come in three forms: scalar (return one value), inline table-valued (return a table, like a parameterized view), and multi-statement table-valued. Use inline TVFs wherever possible — they perform like views and the optimizer can look inside them.

T-SQL

-- Scalar UDF — use sparingly; runs row-by-row, can kill performance
CREATE OR ALTER FUNCTION dbo.fn_GetAge (@birth_date DATE)
RETURNS INT
AS
BEGIN
    RETURN
        DATEDIFF(YEAR, @birth_date, GETDATE())
        - CASE WHEN FORMAT(GETDATE(), 'MMdd') < FORMAT(@birth_date, 'MMdd')
               THEN 1 ELSE 0 END;
END;

SELECT dbo.fn_GetAge('1990-06-15');  -- must use schema prefix

-- Inline Table-Valued Function (iTVF) — best performance
-- Behaves like a parameterized view; optimizer can push predicates inside
CREATE OR ALTER FUNCTION dbo.fn_GetDeptEmployees (@dept_id INT)
RETURNS TABLE
AS
RETURN (
    SELECT emp_id, first_name, last_name, salary, hire_date
    FROM dbo.employees
    WHERE dept_id = @dept_id AND is_active = 1
);

-- CROSS APPLY — call a TVF for each row
SELECT d.dept_name, e.first_name, e.salary
FROM dbo.departments d
CROSS APPLY dbo.fn_GetDeptEmployees(d.dept_id) e;

-- OUTER APPLY — like LEFT JOIN, includes depts with no employees
SELECT d.dept_name, e.first_name, e.salary
FROM dbo.departments d
OUTER APPLY dbo.fn_GetDeptEmployees(d.dept_id) e;

🚨 Scalar UDF performance: Scalar UDFs called in WHERE or SELECT on large tables execute once per row and prevent parallelism. In SQL Server 2019+, Scalar UDF Inlining may help, but iTVFs or inline expressions are always safer for performance-critical code.

Triggers Advanced

Triggers fire automatically in response to DML or DDL events. They're powerful but invisible — overuse leads to hidden business logic that's hard to debug and maintain.

T-SQL

-- AFTER trigger — fires after the DML completes
-- INSERTED and DELETED are virtual tables with the affected rows
CREATE OR ALTER TRIGGER trg_employees_audit
ON dbo.employees
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    INSERT INTO dbo.employee_audit (emp_id, field_changed, old_value, new_value, changed_at, changed_by)
    SELECT
        d.emp_id, 'salary',
        CAST(d.salary AS NVARCHAR(50)),
        CAST(i.salary AS NVARCHAR(50)),
        GETDATE(), SYSTEM_USER
    FROM inserted i
    JOIN deleted d ON i.emp_id = d.emp_id
    WHERE i.salary <> d.salary;  -- only log actual changes
END;

-- INSTEAD OF trigger — replaces the DML (useful on views)
CREATE OR ALTER TRIGGER trg_prevent_delete
ON dbo.employees
INSTEAD OF DELETE
AS
BEGIN
    SET NOCOUNT ON;
    UPDATE dbo.employees
    SET is_active = 0, termination_date = GETDATE()
    WHERE emp_id IN (SELECT emp_id FROM deleted);
END;

-- DDL trigger — fires on schema changes
CREATE OR ALTER TRIGGER trg_prevent_drop
ON DATABASE
FOR DROP_TABLE
AS
BEGIN
    PRINT 'Dropping tables is not allowed.';
    ROLLBACK;
END;

-- Managing triggers
DISABLE TRIGGER trg_employees_audit ON dbo.employees;
ENABLE TRIGGER  trg_employees_audit ON dbo.employees;
DROP TRIGGER    trg_employees_audit;

⚠️ INSERTED and DELETED are sets, not single rows. Never write trigger logic that assumes exactly one row was affected — bulk operations will break it. Always JOIN inserted/deleted to get all affected rows.

Cursors (& Set-Based Alternatives) Advanced

Cursors process one row at a time — almost always slower than set-based logic. Understand them well enough to replace them.

T-SQL

-- Cursor anatomy
DECLARE @emp_id INT, @salary DECIMAL(10,2);

DECLARE emp_cursor CURSOR
    LOCAL STATIC READ_ONLY FORWARD_ONLY   -- always specify these options
FOR
    SELECT emp_id, salary FROM dbo.employees WHERE is_active = 1;

OPEN emp_cursor;
FETCH NEXT FROM emp_cursor INTO @emp_id, @salary;

WHILE @@FETCH_STATUS = 0
BEGIN
    IF @salary < 50000
        UPDATE dbo.employees SET salary = salary * 1.1 WHERE emp_id = @emp_id;

    FETCH NEXT FROM emp_cursor INTO @emp_id, @salary;
END;

CLOSE      emp_cursor;
DEALLOCATE emp_cursor;

-- ✅ SET-BASED equivalent (100x faster):
UPDATE dbo.employees
SET salary = salary * 1.1
WHERE is_active = 1 AND salary < 50000;

💡 When cursors are acceptable: Calling a stored procedure once per row, sending row-by-row emails/notifications, processing rows that must execute sequentially with side effects. For all other cases, rewrite as set-based operations.

Dynamic SQL Advanced

Dynamic SQL builds query strings at runtime — enabling flexible queries with variable table/column names. Always use parameterized dynamic SQL to prevent SQL injection.

T-SQL

-- Basic dynamic SQL with sp_executesql (ALWAYS prefer over EXEC())
DECLARE @sql  NVARCHAR(MAX);
DECLARE @dept INT = 3;

SET @sql = N'SELECT * FROM dbo.employees WHERE dept_id = @dept_id';

EXEC sp_executesql
    @sql,
    N'@dept_id INT',    -- parameter definition
    @dept_id = @dept;   -- parameter value

-- Dynamic table name (can't be parameterized — must be whitelisted)
DECLARE @table_name SYSNAME = N'employees';

-- Validate against known list first!
IF @table_name NOT IN ('employees', 'contractors', 'vendors')
BEGIN
    THROW 50001, 'Invalid table name', 1;
    RETURN;
END;

SET @sql = N'SELECT COUNT(*) FROM dbo.' + QUOTENAME(@table_name);
-- QUOTENAME wraps in brackets and escapes internal brackets
EXEC sp_executesql @sql;

-- Dynamic ORDER BY — whitelist column names, never concatenate user input raw
DECLARE @order_col NVARCHAR(50) = N'salary';
IF @order_col NOT IN ('emp_id', 'last_name', 'salary', 'hire_date')
    SET @order_col = N'emp_id';

SET @sql = N'SELECT * FROM dbo.employees ORDER BY ' + QUOTENAME(@order_col) + N' DESC';
EXEC sp_executesql @sql;

🚨 SQL Injection: Never concatenate user input directly into dynamic SQL. Always use sp_executesql with parameters for values, and QUOTENAME() + whitelisting for object names. EXEC(@sql) with concatenated input is a critical vulnerability.

Error Handling — TRY/CATCH Intermediate

T-SQL's TRY/CATCH is modeled after C# — cleaner than the old @@ERROR approach. Always wrap multi-statement operations in TRY/CATCH with transactions to ensure data consistency on failure.

T-SQL

-- Basic TRY/CATCH
BEGIN TRY
    INSERT INTO dbo.employees (first_name, dept_id) VALUES ('Alice', 99);
END TRY
BEGIN CATCH
    PRINT 'Error: '    + ERROR_MESSAGE();
    PRINT 'Number: '   + CAST(ERROR_NUMBER()   AS VARCHAR);
    PRINT 'Severity: ' + CAST(ERROR_SEVERITY() AS VARCHAR);
    PRINT 'Line: '     + CAST(ERROR_LINE()     AS VARCHAR);
END CATCH;

-- TRY/CATCH with transaction — the complete pattern
BEGIN TRY
    BEGIN TRANSACTION;

    UPDATE dbo.accounts SET balance = balance - 500 WHERE account_id = 1;
    UPDATE dbo.accounts SET balance = balance + 500 WHERE account_id = 2;

    IF (SELECT balance FROM dbo.accounts WHERE account_id = 1) < 0
        THROW 50001, 'Insufficient funds', 1;

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;
    THROW;   -- re-throw original error to caller (SQL Server 2012+)
END CATCH;

-- Log error in CATCH block before re-throwing
BEGIN CATCH
    IF @@TRANCOUNT > 0 ROLLBACK;

    INSERT INTO dbo.error_log (error_num, error_msg, error_proc, error_line, logged_at)
    VALUES (ERROR_NUMBER(), ERROR_MESSAGE(), ERROR_PROCEDURE(), ERROR_LINE(), GETDATE());

    THROW;
END CATCH;

Transactions & Locking Advanced

Transactions ensure atomicity. Understanding isolation levels is critical for balancing data consistency against concurrency and blocking.

T-SQL

-- Explicit transaction
BEGIN TRANSACTION;
UPDATE dbo.inventory SET qty = qty - 1 WHERE product_id = 42;
INSERT INTO dbo.sales (product_id, qty, sale_date) VALUES (42, 1, GETDATE());
COMMIT TRANSACTION;

-- Savepoints — partial rollback within a transaction
BEGIN TRANSACTION;
    UPDATE dbo.orders SET status = 'Processing' WHERE order_id = 101;
    SAVE TRANSACTION sp1;
    UPDATE dbo.inventory SET qty = qty - 5 WHERE product_id = 7;
    IF (SELECT qty FROM dbo.inventory WHERE product_id = 7) < 0
        ROLLBACK TRANSACTION sp1;  -- partial rollback to savepoint
COMMIT TRANSACTION;

-- Set isolation level for current session
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;  -- dirty reads OK
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;    -- default
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

-- NOLOCK hint — per-table equivalent of READ UNCOMMITTED
SELECT * FROM dbo.orders WITH (NOLOCK);
-- ⚠️ Can return uncommitted data. Use only for dashboards/approx counts.

-- Enable RCSI — best option for most OLTP workloads
ALTER DATABASE YourDatabase SET READ_COMMITTED_SNAPSHOT ON;

Isolation Level	Dirty Reads	Non-Repeatable	Phantom	Use When
`READ UNCOMMITTED`	Yes ❌	Yes	Yes	Quick approximations; max concurrency
`READ COMMITTED`	No ✅	Yes	Yes	Default; most OLTP workloads
`REPEATABLE READ`	No ✅	No ✅	Yes	Rarely used
`SERIALIZABLE`	No ✅	No ✅	No ✅	Financial; highest consistency
`READ COMMITTED SNAPSHOT`	No ✅	Versions	Versions	Best for mixed OLTP
`SNAPSHOT`	No ✅	No ✅	No ✅	Reporting on live OLTP

Part V

Schema Design & Objects

DDL, DML, MERGE, views, indexes, temp tables, JSON, XML — building and maintaining the database objects that power your queries.

DDL — Tables & Data Types Intermediate

Getting your schema right from the start — choosing correct data types, constraints, and identity behavior — pays dividends in performance, storage, and data integrity for the life of the system.

T-SQL

-- Create table with best-practice patterns
CREATE TABLE dbo.employees (
    emp_id        INT             NOT NULL IDENTITY(1,1),
    first_name    NVARCHAR(50)   NOT NULL,
    last_name     NVARCHAR(50)   NOT NULL,
    email         NVARCHAR(255)  NOT NULL,
    salary        DECIMAL(10,2) NOT NULL DEFAULT 0,
    hire_date     DATE           NOT NULL DEFAULT CAST(GETDATE() AS DATE),
    dept_id       INT            NULL,
    is_active     BIT            NOT NULL DEFAULT 1,
    created_at    DATETIME2      NOT NULL DEFAULT SYSDATETIME(),
    updated_at    DATETIME2      NULL,
    row_version   ROWVERSION     NOT NULL,

    CONSTRAINT PK_employees PRIMARY KEY (emp_id),
    CONSTRAINT UQ_employees_email UNIQUE (email),
    CONSTRAINT FK_employees_dept FOREIGN KEY (dept_id)
        REFERENCES dbo.departments (dept_id) ON DELETE SET NULL,
    CONSTRAINT CK_employees_salary CHECK (salary >= 0)
);

-- Alter table
ALTER TABLE dbo.employees ADD         phone       NVARCHAR(20) NULL;
ALTER TABLE dbo.employees ALTER COLUMN last_name   NVARCHAR(100) NOT NULL;
ALTER TABLE dbo.employees DROP COLUMN  phone;

-- Conditional DDL — only create if it doesn't exist
IF NOT EXISTS (SELECT 1 FROM sys.tables
               WHERE name = 'employees' AND schema_id = SCHEMA_ID('dbo'))
BEGIN
    -- CREATE TABLE ...
END;

Key Data Types Quick Reference

Type	Size	Range / Notes	Use For
`INT`	4 bytes	-2.1B to 2.1B	Most integer IDs and counts
`BIGINT`	8 bytes	±9.2 quintillion	Large IDs, event sequences
`TINYINT`	1 byte	0–255	Status codes, flags
`DECIMAL(p,s)`	5–17 bytes	Exact numeric	Money, measurements
`FLOAT`	8 bytes	Approximate	Scientific; never money
`NVARCHAR(n)`	2×n bytes	Unicode, max 4000	Names, text (use for all strings)
`BIT`	1 bit	0, 1, or NULL	Boolean flags
`UNIQUEIDENTIFIER`	16 bytes	GUID	Distributed IDs
`DATETIME2`	6–8 bytes	High precision	Timestamps (preferred over DATETIME)

DML — Insert, Update, Delete Beginner

T-SQL's OUTPUT clause is uniquely powerful — it lets you capture rows affected by any DML statement without a separate SELECT.

T-SQL

-- INSERT — single and multi-row
INSERT INTO dbo.employees (first_name, last_name, salary, dept_id)
VALUES (N'Alice', N'Johnson', 85000, 3);

INSERT INTO dbo.employees (first_name, last_name, salary)
VALUES
    (N'Bob',   N'Smith',  70000),
    (N'Carol', N'Lee',    90000),
    (N'David', N'Brown',  65000);

-- OUTPUT clause — capture inserted rows
DECLARE @inserted_ids TABLE (emp_id INT, full_name NVARCHAR(100));

INSERT INTO dbo.employees (first_name, last_name, salary)
OUTPUT inserted.emp_id, inserted.first_name + ' ' + inserted.last_name
INTO @inserted_ids
VALUES (N'Eve', N'Davis', 78000);

-- UPDATE with JOIN
UPDATE e
SET e.salary = e.salary * 1.1
FROM dbo.employees e
JOIN dbo.departments d ON e.dept_id = d.dept_id
WHERE d.dept_name = 'Engineering';

-- UPDATE with OUTPUT — capture before and after values
UPDATE dbo.employees
SET salary = salary * 1.05
OUTPUT
    deleted.emp_id,
    deleted.salary AS old_salary,
    inserted.salary AS new_salary
WHERE dept_id = 2;

-- DELETE with JOIN
DELETE e
FROM dbo.employees e
JOIN dbo.departments d ON e.dept_id = d.dept_id
WHERE d.dept_name = 'Discontinued';

-- TRUNCATE — removes all rows, resets identity, faster than DELETE
TRUNCATE TABLE dbo.staging_table;
-- Cannot use WHERE; cannot truncate FK-referenced tables

MERGE Statement Advanced

MERGE (UPSERT) combines INSERT, UPDATE, and DELETE — comparing a source to a target and performing the right action for each row. Ideal for syncing tables and staging loads.

T-SQL

-- Full MERGE pattern
MERGE INTO dbo.employees WITH (HOLDLOCK) AS target
USING (
    SELECT emp_id, first_name, last_name, salary, dept_id
    FROM dbo.employees_staging
) AS source
ON target.emp_id = source.emp_id

WHEN MATCHED AND (
    target.salary  <> source.salary  OR
    target.dept_id <> source.dept_id
) THEN
    UPDATE SET
        target.salary     = source.salary,
        target.dept_id    = source.dept_id,
        target.updated_at = SYSDATETIME()

WHEN NOT MATCHED BY TARGET THEN
    INSERT (first_name, last_name, salary, dept_id, is_active)
    VALUES (source.first_name, source.last_name, source.salary, source.dept_id, 1)

WHEN NOT MATCHED BY SOURCE THEN
    DELETE

OUTPUT
    $action AS merge_action,
    inserted.emp_id, deleted.emp_id AS deleted_id;

-- Simple UPSERT (no DELETE)
MERGE INTO dbo.product_prices AS t
USING (SELECT 42 AS product_id, 29.99 AS price) AS s
ON t.product_id = s.product_id
WHEN MATCHED THEN
    UPDATE SET t.price = s.price
WHEN NOT MATCHED THEN
    INSERT (product_id, price) VALUES (s.product_id, s.price);

🚨 MERGE race condition: In concurrent environments, MERGE can produce duplicate key errors. Add WITH (HOLDLOCK) to the target table reference (shown above) to prevent this.

Views & Indexed Views Intermediate

Views are saved SELECT statements — they simplify complex queries, enforce consistent data access, and can be secured independently. Indexed views materialize the result set for dramatic aggregation speedups.

T-SQL

-- Create a view
CREATE OR ALTER VIEW dbo.vw_active_employees
AS
SELECT
    e.emp_id,
    e.first_name + ' ' + e.last_name  AS full_name,
    d.dept_name,
    e.salary,
    e.hire_date,
    DATEDIFF(YEAR, e.hire_date, GETDATE()) AS tenure_years
FROM dbo.employees   e
JOIN dbo.departments d ON e.dept_id = d.dept_id
WHERE e.is_active = 1;

SELECT * FROM dbo.vw_active_employees WHERE dept_name = 'Engineering';

-- Indexed View — materialized, maintained automatically by SQL Server
-- Strict requirements: WITH SCHEMABINDING, no outer joins/subqueries/DISTINCT
CREATE OR ALTER VIEW dbo.vw_dept_salary_totals
WITH SCHEMABINDING
AS
SELECT
    dept_id,
    COUNT_BIG(*)    AS headcount,   -- COUNT_BIG required for indexed views
    SUM(salary)     AS total_salary
FROM dbo.employees
GROUP BY dept_id;

-- Create the clustered index to materialize it
CREATE UNIQUE CLUSTERED INDEX IX_vw_dept_salary
ON dbo.vw_dept_salary_totals (dept_id);

Indexes & Execution Plans Advanced

Indexes are the single biggest lever for query performance. A good index turns a full table scan into an index seek. Understanding which type to create — and when — separates good from great.

T-SQL

-- Clustered index — defines physical row order. Only 1 per table.
CREATE CLUSTERED INDEX CX_employees_emp_id ON dbo.employees (emp_id);

-- Non-clustered index
CREATE NONCLUSTERED INDEX IX_employees_dept_salary
ON dbo.employees (dept_id, salary);

-- Covering index — add columns to avoid key lookups
CREATE NONCLUSTERED INDEX IX_employees_dept_covering
ON dbo.employees (dept_id, is_active)
INCLUDE (first_name, last_name, salary, hire_date);
-- Query: WHERE dept_id=3 AND is_active=1 selecting these cols = no lookup needed

-- Filtered index — indexes a subset of rows (smaller and faster)
CREATE NONCLUSTERED INDEX IX_orders_pending
ON dbo.orders (order_date, customer_id)
WHERE status = 'Pending';

-- Rebuild vs Reorganize (defragmentation)
ALTER INDEX IX_employees_dept_covering ON dbo.employees REBUILD;
ALTER INDEX IX_employees_dept_covering ON dbo.employees REORGANIZE;
ALTER INDEX ALL ON dbo.employees REBUILD;

-- Statistics I/O — see how many pages were read
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
SELECT * FROM dbo.employees WHERE dept_id = 3;
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;

Reading Execution Plans

Operator	Signal	Action
Index Seek ✅	Efficient	Jumping to specific rows
Index Scan ⚠️	OK for small tables	Reading full index
Table Scan 🚨	No index used	Add or fix an index
Key Lookup 🚨	Extra bookmark lookups	Add INCLUDE columns to non-clustered index
Hash Match ⚠️	Building hash table	Often means missing JOIN index
Sort ⚠️	Expensive sort step	Add ORDER BY-aligned index or reduce rows earlier

💡 The golden rule for indexes: Index columns used in WHERE/JOIN ON (in order of selectivity), then add frequently selected columns in INCLUDE. Never over-index — every index slows INSERT/UPDATE/DELETE. Aim for 5–7 indexes per heavily written table.

Temp Tables & Table Variables Intermediate

SQL Server gives you three ways to store intermediate results: temp tables (#table), table variables (@table), and CTEs. Choosing correctly affects performance and scope.

T-SQL

-- Local temp table — visible only in current session and child procedures
CREATE TABLE #dept_summary (
    dept_id     INT            NOT NULL,
    dept_name   NVARCHAR(100) NOT NULL,
    headcount   INT            NOT NULL,
    avg_salary  DECIMAL(10,2) NOT NULL
);

INSERT INTO #dept_summary
SELECT d.dept_id, d.dept_name, COUNT(*), AVG(e.salary)
FROM dbo.employees   e
JOIN dbo.departments d ON e.dept_id = d.dept_id
GROUP BY d.dept_id, d.dept_name;

-- Can add indexes to temp tables!
CREATE NONCLUSTERED INDEX IX_tmp_dept ON #dept_summary (dept_id);

SELECT * FROM #dept_summary WHERE avg_salary > 80000;
DROP TABLE IF EXISTS #dept_summary;  -- always clean up

-- Table variable — scoped to batch, no statistics
DECLARE @results TABLE (
    emp_id  INT           NOT NULL,
    name    NVARCHAR(100) NOT NULL
);
INSERT INTO @results SELECT emp_id, first_name FROM dbo.employees;

Feature	#Temp Table	@Table Variable	CTE
Scope	Session + child procs	Batch only	Single statement
Indexes	✅ Yes	❌ No	N/A
Statistics	✅ Yes	❌ No	N/A
Transactions	✅ Participates	❌ Does not rollback	N/A
Best for	Large results, reuse	Small sets, simple staging	Readability, single use

💡 Temp tables vs table variables: For more than ~1000 rows or when you need statistics for the optimizer, prefer #temp tables. The optimizer assumes table variables have exactly 1 row, leading to poor plans for large datasets in older versions.

JSON Support Advanced

SQL Server 2016+ provides native JSON support for parsing, querying, and generating JSON — covering most practical needs for API integrations and document storage.

T-SQL

-- Generate JSON from query results
SELECT emp_id, first_name, salary
FROM dbo.employees
WHERE dept_id = 3
FOR JSON PATH;

SELECT emp_id, first_name, salary
FROM dbo.employees
FOR JSON AUTO, ROOT('employees'), INCLUDE_NULL_VALUES;

-- Extract values from JSON string
DECLARE @json NVARCHAR(MAX) = N'{"name":"Alice","salary":85000,"tags":["sql","data"]}';

SELECT
    JSON_VALUE(@json, '$.name')        AS name,       -- scalar value
    JSON_VALUE(@json, '$.salary')      AS salary,
    JSON_QUERY(@json, '$.tags')        AS tags_array; -- object or array

-- OPENJSON with schema — shred JSON array into rows
DECLARE @employees_json NVARCHAR(MAX) = N'
[{"id":1,"name":"Alice","salary":85000},
 {"id":2,"name":"Bob","salary":70000}]';

SELECT emp_id, name, salary
FROM OPENJSON(@employees_json)
WITH (
    emp_id  INT            '$.id',
    name    NVARCHAR(100) '$.name',
    salary  DECIMAL(10,2) '$.salary'
);

-- Modify JSON
SET @json = JSON_MODIFY(@json, '$.salary', 90000);

-- Validate JSON
SELECT ISJSON('{"valid":true}');    -- 1
SELECT ISJSON('not json');           -- 0

XML Support Advanced

SQL Server has deep XML support including the native XML data type and XQuery — still common in legacy integrations, SSRS, and SSAS.

T-SQL

-- FOR XML — convert query results to XML
SELECT emp_id, first_name, salary FROM dbo.employees
FOR XML PATH('employee'), ROOT('employees');

-- Query XML data type
DECLARE @xml XML = '<employees>
  <employee id="1"><name>Alice</name><salary>85000</salary></employee>
  <employee id="2"><name>Bob</name><salary>70000</salary></employee>
</employees>';

-- .value() — extract a scalar
SELECT @xml.value('(/employees/employee[@id=1]/name)[1]', 'NVARCHAR(100)');

-- .nodes() — shred to rows
SELECT
    emp.value('(@id)',      'INT')          AS emp_id,
    emp.value('(name)[1]',   'NVARCHAR(100)') AS name,
    emp.value('(salary)[1]', 'DECIMAL(10,2)') AS salary
FROM @xml.nodes('/employees/employee') AS t(emp);

T-SQL Gotchas & Traps Mastery

These are the patterns that silently corrupt data, destroy performance, or produce wrong results. Every experienced SQL Server developer has been burned by at least half of these.

Data & Logic Traps

T-SQL

-- 1. NULL propagation in expressions
SELECT 5 + NULL                -- NULL, not 5
SELECT salary + ISNULL(bonus, 0) -- correct pattern

-- 2. NOT IN with NULLs — returns ZERO rows if subquery has any NULLs!
WHERE customer_id NOT IN (SELECT customer_id FROM dbo.blocked);
-- Fix: add WHERE customer_id IS NOT NULL, or use NOT EXISTS

-- 3. Implicit type conversion — converts every row
WHERE varchar_id = 42    -- BAD: table scan
WHERE varchar_id = '42'  -- GOOD: same types

-- 4. Integer division
SELECT 1/3               -- = 0 (integer!)
SELECT 1/3.0             -- = 0.333333

-- 5. DATEDIFF counts boundary crossings, not elapsed time
DATEDIFF(year, '2023-12-31', '2024-01-01') -- = 1 (crossed a year boundary)

-- 6. String concatenation with + operator converts numbers
SELECT '1' + '2'          -- = 3 (implicit cast to int!)
SELECT CONCAT('1', '2')  -- = '12' (correct)

-- 7. UPDATE without WHERE — updates EVERY row
UPDATE dbo.employees SET salary = 50000;  -- disaster!
-- Always test with SELECT first, then change to UPDATE

-- 8. SCOPE_IDENTITY vs @@IDENTITY
SELECT @@IDENTITY;       -- last identity in session — DANGEROUS with triggers
SELECT SCOPE_IDENTITY(); -- last identity in current scope — ALWAYS use this

Performance Traps

T-SQL

-- 9. Non-sargable predicates (can't use index)
WHERE YEAR(order_date) = 2024        -- BAD: function on column
WHERE order_date >= '2024-01-01'      -- GOOD
  AND order_date <  '2025-01-01'

-- 10. Leading wildcard LIKE
WHERE last_name LIKE '%son'   -- BAD: can't use index on last_name
WHERE last_name LIKE 'John%'  -- GOOD: can seek on index

-- 11. SELECT * in production code
SELECT *                         -- schema changes break calling code silently
SELECT emp_id, first_name, salary -- explicit = safe

-- 12. Scalar UDFs in WHERE/SELECT for large tables
WHERE dbo.fn_IsEligible(emp_id) = 1  -- runs once per row, prevents parallelism

-- 13. NOLOCK is not free — can return dirty, duplicate, or missing rows
SELECT * FROM dbo.orders WITH (NOLOCK)
-- Use RCSI at the database level instead

-- 14. Always schema-qualify object names
SELECT * FROM employees      -- BAD: checks user schema first
SELECT * FROM dbo.employees  -- GOOD: direct resolution

⚠️ Schema qualifying matters: Always prefix objects with their schema (dbo.employees not just employees). Without a schema, SQL Server checks your user schema first — slowing lookups and creating hard-to-find bugs when two schemas have same-named objects.