A complete T-SQL reference — fundamentals through window functions, CTEs, stored procedures, error handling, MERGE, indexing, JSON, and query optimization. Every concept explained with real-world use cases.
How to Use This Bible
This guide is designed for non-linear learning. Use the sidebar to jump to any topic. Search for any keyword, function, or clause. Each Part stands independently — start wherever you need.
🟢 Beginner
Part I & II. Execution order, SELECT, WHERE, JOINs, aggregation, and built-in functions.
🔵 Intermediate
Parts II–III. Subqueries, set operations, PIVOT, and advanced filtering patterns.
🔴 Advanced
Part III & IV. Window functions, CTEs, stored procedures, triggers, and dynamic SQL.
🏆 Mastery
Part V. Indexes, execution plans, MERGE, JSON/XML, and T-SQL gotchas.
Foundations — Core T-SQL Query Concepts
Master the mental models that explain how SQL Server actually executes queries, filters, joins, and aggregates data.
Query Execution Order — Start Here Beginner
The most important concept in T-SQL. You write clauses in one order; SQL Server processes them in a completely different order. This single fact explains most query errors and performance pitfalls.
You write SQL as: SELECT → FROM → WHERE → GROUP BY → HAVING → ORDER BY. SQL Server runs it like this:
-- FAILS: alias used in WHERE, but WHERE runs before SELECT creates it
SELECT salary * 1.1 AS adjusted
FROM employees
WHERE adjusted > 100000; -- ERROR: Invalid column name 'adjusted'
-- FIX 1: repeat the expression in WHERE
SELECT salary * 1.1 AS adjusted
FROM employees
WHERE salary * 1.1 > 100000;
-- FIX 2: wrap in a subquery (derived table)
SELECT * FROM (
SELECT *, salary * 1.1 AS adjusted FROM employees
) base
WHERE adjusted > 100000;
-- FIX 3: CTE — cleanest and most readable
WITH base AS (
SELECT *, salary * 1.1 AS adjusted FROM employees
)
SELECT * FROM base WHERE adjusted > 100000;
-- NOTE: Unlike MySQL, SQL Server does NOT allow aliases in GROUP BY
-- You must repeat the expression, or use a subquery / CTE
SELECT YEAR(order_date) AS yr, COUNT(*) AS cnt
FROM orders
GROUP BY YEAR(order_date) -- must repeat expression, not alias
HAVING COUNT(*) > 10
ORDER BY yr; -- aliases ARE allowed in ORDER BY
SELECT & FROM Beginner
SELECT defines the shape of your output. FROM is where the data lives. T-SQL adds schema-qualified naming, TOP, and several computed column techniques that differ from other SQL dialects.
-- Always qualify with schema (dbo is the default)
SELECT emp_id, first_name, last_name, salary
FROM dbo.employees;
-- Computed columns and aliases
SELECT
first_name + ' ' + last_name AS full_name,
salary * 12 AS annual_salary,
ROUND(salary / 2080.0, 2) AS hourly_rate,
DATEDIFF(YEAR, hire_date, GETDATE()) AS years_tenure,
IIF(salary > 100000, 'Senior', 'Standard') AS tier
FROM dbo.employees;
-- Table aliases: essential for joins
SELECT e.first_name, d.dept_name, l.city
FROM dbo.employees e
JOIN dbo.departments d ON e.dept_id = d.dept_id
JOIN dbo.locations l ON d.location_id = l.location_id;
-- TOP N — SQL Server's row limiter (replaces LIMIT)
SELECT TOP 10 * FROM dbo.orders ORDER BY order_date DESC;
-- TOP with PERCENT
SELECT TOP 5 PERCENT * FROM dbo.sales ORDER BY amount DESC;
-- TOP WITH TIES — includes rows tied at the boundary
SELECT TOP 3 WITH TIES employee_id, salary
FROM dbo.employees
ORDER BY salary DESC;
-- SELECT INTO — creates a new table from a query result
SELECT * INTO dbo.employees_backup
FROM dbo.employees
WHERE dept_id = 5;
-- DISTINCT
SELECT DISTINCT department FROM dbo.employees;
SELECT DISTINCT department, job_title FROM dbo.employees; -- unique combos
SELECT
e.emp_id,
e.first_name + ' ' + e.last_name AS full_name,
d.dept_name,
'$' + FORMAT(e.salary, 'N2') AS salary_fmt,
DATEDIFF(YEAR, e.hire_date, GETDATE()) AS years,
CASE
WHEN e.salary >= 100000 THEN 'Senior'
WHEN e.salary >= 70000 THEN 'Mid-Level'
ELSE 'Junior'
END AS level
FROM dbo.employees e
JOIN dbo.departments d ON e.dept_id = d.dept_id
WHERE e.is_active = 1
ORDER BY years DESC, full_name;
WHERE & Filtering Beginner
WHERE is your row-level filter — it runs early, before grouping, so it's your primary performance lever. Writing a bad WHERE clause is the fastest way to kill query speed.
-- Basic comparisons
WHERE salary > 80000
WHERE status = 'Active'
WHERE hire_date >= '2020-01-01'
WHERE dept_id != 3 -- or <>
-- AND / OR — AND binds tighter than OR, always use parentheses
WHERE (dept_id = 1 OR dept_id = 2) AND salary > 60000
-- BETWEEN (inclusive on both ends)
WHERE salary BETWEEN 50000 AND 120000
WHERE order_date BETWEEN '2024-01-01' AND '2024-12-31'
-- IN / NOT IN
WHERE dept_id IN (1, 3, 7)
WHERE status NOT IN ('Terminated', 'On Leave')
-- LIKE — pattern matching
WHERE last_name LIKE 'Sm%' -- starts with Sm
WHERE email LIKE '%@gmail.com' -- ends with
WHERE phone LIKE '555-____' -- _ = single character
WHERE code LIKE '[A-C]%' -- starts with A, B, or C
WHERE code LIKE '[^0-9]%' -- does NOT start with a digit
-- NULL checks — never use = NULL, always IS NULL / IS NOT NULL
WHERE manager_id IS NULL
WHERE termination_date IS NOT NULL
-- Date filtering — avoid wrapping indexed columns in functions!
-- BAD (non-sargable — can't use index): YEAR(order_date) = 2024
-- GOOD (sargable — can use index on order_date):
WHERE order_date >= '2024-01-01' AND order_date < '2025-01-01'
Sargability — The Performance Rule
A predicate is sargable (Search ARGument ABLE) if SQL Server can use an index to satisfy it. Wrapping a column in a function destroys sargability.
| Non-Sargable ❌ (slow) | Sargable ✅ (fast) |
|---|---|
WHERE YEAR(order_date) = 2024 | WHERE order_date >= '2024-01-01' AND order_date < '2025-01-01' |
WHERE UPPER(last_name) = 'SMITH' | WHERE last_name = 'Smith' (or use case-insensitive collation) |
WHERE LEN(phone) = 10 | Add a computed column with an index instead |
WHERE salary + bonus > 100000 | WHERE salary > 100000 - bonus |
WHERE CONVERT(VARCHAR, id) = '42' | WHERE id = 42 |
JOINs — All Types Intermediate
JOINs combine rows from multiple tables. Understanding which join type returns which rows is fundamental — getting this wrong produces silent data errors that are hard to debug.
-- INNER JOIN — only rows with matches in BOTH tables
SELECT e.first_name, d.dept_name
FROM dbo.employees e
JOIN dbo.departments d ON e.dept_id = d.dept_id;
-- Employees without a dept_id, or depts with no employees, are excluded
-- LEFT JOIN — all rows from left table; NULLs where no right match
SELECT e.first_name, d.dept_name
FROM dbo.employees e
LEFT JOIN dbo.departments d ON e.dept_id = d.dept_id;
-- All employees returned; dept_name is NULL if no department assigned
-- RIGHT JOIN — all rows from right table (rarely used; prefer LEFT JOIN)
SELECT e.first_name, d.dept_name
FROM dbo.employees e
RIGHT JOIN dbo.departments d ON e.dept_id = d.dept_id;
-- FULL OUTER JOIN — all rows from both tables
SELECT e.first_name, d.dept_name
FROM dbo.employees e
FULL OUTER JOIN dbo.departments d ON e.dept_id = d.dept_id;
-- Find orphans on BOTH sides
WHERE e.dept_id IS NULL OR d.dept_id IS NULL;
-- CROSS JOIN — every row × every row (Cartesian product)
SELECT p.product_name, c.color_name
FROM dbo.products p
CROSS JOIN dbo.colors c;
-- 10 products × 5 colors = 50 rows. Use for generating combinations.
-- SELF JOIN — a table joined to itself
SELECT e.first_name AS employee, m.first_name AS manager
FROM dbo.employees e
LEFT JOIN dbo.employees m ON e.manager_id = m.emp_id;
Anti-Joins — Finding Non-Matches
-- Pattern 1: LEFT JOIN + IS NULL — find unmatched rows
SELECT e.*
FROM dbo.employees e
LEFT JOIN dbo.time_sheets t ON e.emp_id = t.emp_id
AND t.week_start = '2024-01-01'
WHERE t.emp_id IS NULL; -- employees with no timesheet that week
-- Pattern 2: NOT EXISTS — often faster, handles NULLs cleanly
SELECT * FROM dbo.employees e
WHERE NOT EXISTS (
SELECT 1 FROM dbo.time_sheets t
WHERE t.emp_id = e.emp_id AND t.week_start = '2024-01-01'
);
-- Pattern 3: NOT IN (avoid if subquery can return NULL — entire result becomes empty)
SELECT * FROM dbo.employees
WHERE emp_id NOT IN (SELECT emp_id FROM dbo.time_sheets
WHERE emp_id IS NOT NULL);
SELECT
c.customer_id,
c.company_name,
COUNT(o.order_id) AS total_orders,
SUM(oi.quantity * oi.unit_price) AS lifetime_value,
MAX(o.order_date) AS last_order_date,
r.rep_name AS sales_rep
FROM dbo.customers c
LEFT JOIN dbo.orders o ON c.customer_id = o.customer_id
LEFT JOIN dbo.order_items oi ON o.order_id = oi.order_id
LEFT JOIN dbo.sales_reps r ON c.rep_id = r.rep_id
GROUP BY c.customer_id, c.company_name, r.rep_name
ORDER BY lifetime_value DESC;
GROUP BY & HAVING Intermediate
GROUP BY collapses rows into groups; aggregate functions summarize each group. HAVING filters those groups. Together they power every summary report and analytics query.
-- Basic aggregation
SELECT
dept_id,
COUNT(*) AS headcount,
COUNT(manager_id) AS managed_count, -- ignores NULLs
COUNT(DISTINCT job_title) AS unique_roles,
AVG(salary) AS avg_salary,
MIN(salary) AS min_salary,
MAX(salary) AS max_salary,
SUM(salary) AS payroll_cost,
STDEV(salary) AS salary_stdev
FROM dbo.employees
GROUP BY dept_id;
-- HAVING filters groups (WHERE filters rows — they run at different times)
SELECT dept_id, COUNT(*) AS cnt
FROM dbo.employees
WHERE is_active = 1 -- filter rows first (cheap)
GROUP BY dept_id
HAVING COUNT(*) >= 5 -- then filter groups (on aggregate)
ORDER BY cnt DESC;
-- ROLLUP — adds subtotals and grand total
SELECT region, product_line, SUM(sales) AS total
FROM dbo.sales
GROUP BY ROLLUP(region, product_line);
-- NULL in region = grand total; NULL in product_line = region subtotal
-- CUBE — all possible subtotal combinations
SELECT region, product_line, SUM(sales) AS total
FROM dbo.sales
GROUP BY CUBE(region, product_line);
-- GROUPING() — tells you if a NULL is a subtotal or real data
SELECT
CASE WHEN GROUPING(region) = 1 THEN 'ALL REGIONS' ELSE region END AS region,
SUM(sales) AS total
FROM dbo.sales
GROUP BY ROLLUP(region);
-- GROUPING SETS — define exactly which grouping combinations you want
SELECT region, product_line, SUM(sales)
FROM dbo.sales
GROUP BY GROUPING SETS (
(region, product_line),
(region),
()
);
COUNT(*) counts all rows including NULLs. COUNT(column) only counts non-NULL values in that column. Use COUNT(DISTINCT col) for unique value counts.ORDER BY & TOP / OFFSET-FETCH Beginner
SQL Server replaces MySQL's LIMIT with TOP for simple cases and OFFSET-FETCH for pagination. TOP is evaluated last, after ORDER BY — never use TOP without ORDER BY unless row order truly doesn't matter.
-- Basic ORDER BY
SELECT first_name, last_name, salary
FROM dbo.employees
ORDER BY salary DESC, last_name ASC;
-- NULLs sort LAST in ASC, FIRST in DESC (SQL Server behavior)
-- Force NULLs to sort last regardless of direction:
ORDER BY CASE WHEN bonus IS NULL THEN 1 ELSE 0 END, bonus DESC;
-- TOP — get first N rows
SELECT TOP 5 emp_id, salary
FROM dbo.employees
ORDER BY salary DESC;
-- TOP with variable
DECLARE @n INT = 10;
SELECT TOP (@n) * FROM dbo.employees ORDER BY emp_id;
-- OFFSET-FETCH — proper pagination (SQL Server 2012+)
-- Page 3, 20 rows per page:
SELECT emp_id, first_name, salary
FROM dbo.employees
ORDER BY emp_id
OFFSET 40 ROWS -- skip (page-1) × page_size rows
FETCH NEXT 20 ROWS ONLY; -- take page_size rows
-- Parameterized pagination in a stored procedure:
DECLARE @page INT = 3, @size INT = 20;
SELECT * FROM dbo.employees
ORDER BY emp_id
OFFSET (@page - 1) * @size ROWS
FETCH NEXT @size ROWS ONLY;
Built-In Functions
String manipulation, numeric operations, date/time arithmetic, conditional logic, and NULL handling — the tools you'll use in every query.
String Functions Intermediate
T-SQL string functions differ meaningfully from MySQL equivalents. String concatenation uses + instead of CONCAT in older code, but CONCAT is the safe choice — it handles NULLs gracefully.
-- Concatenation
'Hello' + ' ' + 'World' -- NULLs propagate! NULL + 'x' = NULL
CONCAT('Hello', ' ', 'World') -- NULL-safe: treats NULL as empty string
CONCAT_WS(', ', city, state, zip) -- separator, skips NULLs automatically
-- Case transformation
UPPER('hello') -- 'HELLO'
LOWER('HELLO') -- 'hello'
-- Trimming
LTRIM(' hello ') -- 'hello ' (left trim)
RTRIM(' hello ') -- ' hello' (right trim)
TRIM(' hello ') -- 'hello' (both, SQL Server 2017+)
TRIM('x' FROM 'xxhelloxx') -- trim specific character
-- Length
LEN('Hello') -- 5 (excludes trailing spaces)
DATALENGTH('Hello') -- 5 for VARCHAR, 10 for NVARCHAR (bytes)
-- Substrings
SUBSTRING('Hello World', 7, 5) -- 'World' (start, length)
LEFT('Hello World', 5) -- 'Hello'
RIGHT('Hello World', 5) -- 'World'
-- Search
CHARINDEX('World', 'Hello World') -- 7 (1-based, 0 if not found)
CHARINDEX('l', 'Hello', 4) -- 4 (start search at position 4)
PATINDEX('%[0-9]%', 'abc123') -- 4 (supports LIKE patterns)
-- Replace & Repeat
REPLACE('Hello World', 'World', 'SQL') -- 'Hello SQL'
REPLICATE('AB', 3) -- 'ABABAB'
STUFF('Hello World', 6, 6, ' T-SQL') -- 'Hello T-SQL'
-- Padding
FORMAT(42, 'D5') -- '00042' (left-pad numbers)
RIGHT('00000' + CAST(id AS VARCHAR), 5) -- '00042' (traditional)
-- String splitting (SQL Server 2016+)
SELECT value FROM STRING_SPLIT('apple,banana,cherry', ',');
-- STRING_AGG — concatenate rows into a string (SQL Server 2017+)
SELECT dept_id,
STRING_AGG(first_name, ', ') WITHIN GROUP (ORDER BY first_name) AS members
FROM dbo.employees
GROUP BY dept_id;
NVARCHAR for any column that might store non-ASCII characters (names, addresses, etc.). Always prefix Unicode string literals with N: N'Héllo'. Mixing VARCHAR and NVARCHAR in joins or comparisons causes implicit conversions that kill index seeks.Numeric Functions Beginner
T-SQL numeric handling has a critical trap: integer division. Always ensure at least one operand is a decimal to get the result you expect.
-- Integer division trap
SELECT 7 / 2 -- = 3 (integer division!)
SELECT 7 / 2.0 -- = 3.5 (decimal)
SELECT 7 * 1.0 / 2 -- = 3.5 (force decimal early)
SELECT CAST(7 AS DECIMAL(10,2)) / 2 -- = 3.50
-- Rounding
ROUND(3.456, 2) -- 3.460
ROUND(3456, -2) -- 3500 (rounds to hundreds)
FLOOR(3.9) -- 3 (round down always)
CEILING(3.1) -- 4 (round up always)
-- Power, sqrt, log
POWER(2, 10) -- 1024
SQRT(144) -- 12.0
LOG(100) -- natural log ≈ 4.605
LOG(100, 10) -- log base 10 = 2.0
EXP(1) -- e ≈ 2.718
-- Absolute value, modulo, sign
ABS(-42) -- 42
17 % 5 -- 2 (modulo operator)
SIGN(-5) -- -1 / 0 / 1
-- Random integers 1–100 (per-row safe)
ABS(CHECKSUM(NEWID())) % 100 + 1
-- FORMAT — locale-aware number formatting
FORMAT(1234567.89, 'N2') -- '1,234,567.89'
FORMAT(0.1234, 'P1') -- '12.3%'
FORMAT(9.99, 'C', 'en-US') -- '$9.99'
-- Safe division (avoid divide by zero error)
SELECT numerator / NULLIF(denominator, 0) -- returns NULL
SELECT ISNULL(numerator / NULLIF(denominator, 0), 0) -- returns 0
Date & Time Functions Intermediate
SQL Server has a rich set of date/time functions and multiple date data types. Choosing the right type and writing sargable date predicates are the two most impactful date-handling decisions you'll make.
Date Data Types
| Type | Storage | Range | Precision | Use When |
|---|---|---|---|---|
DATE | 3 bytes | 0001–9999 | Day | Birth dates, calendar dates |
TIME | 3–5 bytes | — | 100ns | Time of day only |
DATETIME | 8 bytes | 1753–9999 | ~3ms | Legacy; avoid in new code |
DATETIME2 | 6–8 bytes | 0001–9999 | 100ns | Preferred date+time type |
SMALLDATETIME | 4 bytes | 1900–2079 | 1 min | Low-precision legacy |
DATETIMEOFFSET | 8–10 bytes | 0001–9999 | 100ns + tz | Time zones, global apps |
-- Getting current date/time
GETDATE() -- DATETIME, local server time
GETUTCDATE() -- DATETIME, UTC time
SYSDATETIME() -- DATETIME2, higher precision, local
SYSUTCDATETIME() -- DATETIME2, higher precision, UTC
SYSDATETIMEOFFSET() -- DATETIMEOFFSET with timezone info
-- Extracting date parts
YEAR(order_date) -- 2024
MONTH(order_date) -- 1–12
DAY(order_date) -- 1–31
DATEPART(weekday, order_date) -- 1=Sunday...7=Saturday
DATEPART(quarter, order_date) -- 1–4
DATENAME(month, order_date) -- 'January'
EOMONTH(order_date) -- last day of that month
EOMONTH(order_date, 1) -- last day of next month
-- Date arithmetic
DATEADD(day, 30, order_date) -- add 30 days
DATEADD(month, -3, GETDATE()) -- 3 months ago
DATEDIFF(day, start_date, end_date) -- days between
DATEDIFF(year, birth_date, GETDATE()) -- approximate age
-- Truncating to start of period
CAST(order_date AS DATE) -- strip time portion
DATEFROMPARTS(YEAR(order_date), MONTH(order_date), 1) -- first of month
DATEFROMPARTS(YEAR(order_date), 1, 1) -- first of year
-- Converting / parsing dates
CAST('2024-06-15' AS DATE) -- safe, ISO format always works
CONVERT(DATE, '15/06/2024', 103) -- style 103 = dd/mm/yyyy
TRY_CAST('bad-date' AS DATE) -- returns NULL instead of error
-- Formatting dates for display
FORMAT(order_date, 'yyyy-MM-dd') -- '2024-06-15'
FORMAT(order_date, 'MMMM d, yyyy') -- 'June 15, 2024'
CONVERT(VARCHAR, order_date, 101) -- '06/15/2024' (faster than FORMAT)
SELECT
FORMAT(order_date, 'yyyy-MM') AS month,
COUNT(*) AS order_count,
SUM(total_amount) AS revenue,
SUM(SUM(total_amount)) OVER (
ORDER BY MIN(order_date) ROWS UNBOUNDED PRECEDING
) AS running_total
FROM dbo.orders
WHERE order_date >= DATEFROMPARTS(YEAR(GETDATE())-1, 1, 1)
AND status = 'Completed'
GROUP BY FORMAT(order_date, 'yyyy-MM')
ORDER BY month;
CASE & IIF Beginner
CASE is SQL Server's primary conditional expression — it works anywhere a value is expected: SELECT, WHERE, ORDER BY, GROUP BY, and even inside aggregate functions.
-- Searched CASE (most flexible)
SELECT
first_name, salary,
CASE
WHEN salary >= 120000 THEN 'Executive'
WHEN salary >= 90000 THEN 'Senior'
WHEN salary >= 60000 THEN 'Mid-Level'
ELSE 'Junior'
END AS grade
FROM dbo.employees;
-- Simple CASE (equality only)
SELECT
CASE status
WHEN 'A' THEN 'Active'
WHEN 'I' THEN 'Inactive'
WHEN 'T' THEN 'Terminated'
ELSE 'Unknown'
END AS status_label
FROM dbo.employees;
-- IIF — shorthand IF/ELSE (SQL Server 2012+)
SELECT IIF(salary > 100000, 'High', 'Standard') AS tier FROM dbo.employees;
-- CASE inside aggregate — conditional aggregation
SELECT
dept_id,
COUNT(*) AS total,
SUM(CASE WHEN is_active = 1 THEN 1 ELSE 0 END) AS active_count,
SUM(CASE WHEN salary > 100000 THEN salary ELSE 0 END) AS senior_payroll,
AVG(CASE WHEN gender = 'F' THEN salary END) AS avg_female_salary
FROM dbo.employees
GROUP BY dept_id;
-- CASE in ORDER BY — custom sort order
SELECT priority, task_name FROM dbo.tasks
ORDER BY
CASE priority
WHEN 'Critical' THEN 1
WHEN 'High' THEN 2
WHEN 'Medium' THEN 3
ELSE 4
END, task_name;
-- CHOOSE — pick the Nth item from a list (1-indexed)
SELECT CHOOSE(MONTH(GETDATE()), 'Q1','Q1','Q1','Q2','Q2','Q2',
'Q3','Q3','Q3','Q4','Q4','Q4') AS quarter;
NULL Handling Intermediate
NULL is not a value — it means unknown. Any comparison with NULL yields NULL (not TRUE or FALSE), which means rows are silently excluded from results. This causes more data bugs than almost anything else in SQL.
-- NULL arithmetic: NULL propagates through all operations
SELECT NULL + 5 -- NULL
SELECT NULL * 0 -- NULL (not 0!)
SELECT NULL = NULL -- NULL (not TRUE!)
SELECT NULL IS NULL -- TRUE — correct way to check for NULL
-- ISNULL — replace NULL with a default (SQL Server specific)
ISNULL(bonus, 0)
ISNULL(manager_name, 'No Manager')
-- COALESCE — returns first non-NULL; ANSI standard, accepts multiple args
COALESCE(bonus, commission, 0)
COALESCE(nickname, first_name, 'Unknown')
-- NULLIF — returns NULL if two values are equal (avoids div/0)
NULLIF(denominator, 0)
numerator / NULLIF(denominator, 0) -- safe division
-- NULLs in aggregates: all aggregate functions ignore NULLs except COUNT(*)
AVG(bonus) -- average of non-NULL bonus values only
COUNT(bonus) -- count of non-NULL bonus values
COUNT(*) -- count of ALL rows, regardless of NULLs
-- Practical NULL handling in reports
SELECT
employee_id,
COALESCE(preferred_name, first_name) AS display_name,
ISNULL(phone_ext, 'N/A') AS ext,
salary + ISNULL(bonus, 0) AS total_comp,
CASE WHEN termination_date IS NULL THEN 'Active' ELSE 'Former' END AS status
FROM dbo.employees;
WHERE col IS NOT NULL to the subquery, or use NOT EXISTS instead — it handles NULLs correctly.Advanced Queries
Window functions, CTEs, subqueries, set operations, and PIVOT — the patterns that separate intermediate from advanced T-SQL developers.
Window Functions Advanced
Window functions perform calculations across a set of rows related to the current row — without collapsing them like GROUP BY. They are the single biggest productivity multiplier in analytical SQL.
-- Anatomy: function() OVER (PARTITION BY ... ORDER BY ... ROWS/RANGE ...)
-- RANKING functions
SELECT
first_name, dept_id, salary,
ROW_NUMBER() OVER (PARTITION BY dept_id ORDER BY salary DESC) AS rn,
-- unique sequential; ties get different numbers arbitrarily
RANK() OVER (PARTITION BY dept_id ORDER BY salary DESC) AS rnk,
-- ties same rank; gaps in sequence (1,1,3)
DENSE_RANK() OVER (PARTITION BY dept_id ORDER BY salary DESC) AS drnk,
-- ties same rank; no gaps (1,1,2)
NTILE(4) OVER (PARTITION BY dept_id ORDER BY salary DESC) AS quartile
FROM dbo.employees;
-- OFFSET functions — LAG and LEAD
SELECT
order_date, revenue,
LAG(revenue, 1, 0) OVER (ORDER BY order_date) AS prev_day_rev,
LEAD(revenue, 1, 0) OVER (ORDER BY order_date) AS next_day_rev,
revenue - LAG(revenue) OVER (ORDER BY order_date) AS day_over_day
FROM dbo.daily_sales;
-- AGGREGATE window functions — running totals, moving averages
SELECT
order_date, revenue,
SUM(revenue) OVER (ORDER BY order_date ROWS UNBOUNDED PRECEDING) AS running_total,
AVG(revenue) OVER (ORDER BY order_date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS avg_7d,
SUM(revenue) OVER () AS grand_total,
revenue * 100.0 / SUM(revenue) OVER () AS pct_of_total
FROM dbo.daily_sales;
Practical Patterns
-- Top N per group (top 3 earners per department)
WITH ranked AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY dept_id ORDER BY salary DESC) AS rn
FROM dbo.employees
)
SELECT * FROM ranked WHERE rn <= 3;
-- Deduplicate — keep most recent record per customer
WITH deduped AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY updated_at DESC) AS rn
FROM dbo.customer_records
)
SELECT * FROM deduped WHERE rn = 1;
-- Year-over-year comparison with LAG
WITH monthly AS (
SELECT YEAR(order_date) yr, MONTH(order_date) mo, SUM(amount) rev
FROM dbo.orders
GROUP BY YEAR(order_date), MONTH(order_date)
)
SELECT *,
LAG(rev, 12) OVER (ORDER BY yr, mo) AS same_month_last_year,
rev - LAG(rev, 12) OVER (ORDER BY yr, mo) AS yoy_change
FROM monthly;
CTEs & Recursive CTEs Intermediate
Common Table Expressions (CTEs) are named subqueries that live for the duration of a single statement. They dramatically improve readability and enable recursive queries for hierarchical data.
-- Basic CTE
WITH high_earners AS (
SELECT emp_id, first_name, salary, dept_id
FROM dbo.employees
WHERE salary > 100000
)
SELECT h.first_name, d.dept_name
FROM high_earners h
JOIN dbo.departments d ON h.dept_id = d.dept_id;
-- Multiple CTEs chained together
WITH
dept_stats AS (
SELECT dept_id, AVG(salary) AS avg_sal
FROM dbo.employees
GROUP BY dept_id
),
above_avg AS (
SELECT e.*, ds.avg_sal
FROM dbo.employees e
JOIN dept_stats ds ON e.dept_id = ds.dept_id
WHERE e.salary > ds.avg_sal
)
SELECT * FROM above_avg ORDER BY dept_id, salary DESC;
-- Recursive CTE — walk an org chart
WITH org_tree AS (
-- Anchor: start with the CEO (no manager)
SELECT
emp_id, first_name, manager_id,
0 AS level,
CAST(first_name AS VARCHAR(4000)) AS path
FROM dbo.employees
WHERE manager_id IS NULL
UNION ALL
-- Recursive: join employees to their parent
SELECT
e.emp_id, e.first_name, e.manager_id,
ot.level + 1,
ot.path + ' > ' + e.first_name
FROM dbo.employees e
JOIN org_tree ot ON e.manager_id = ot.emp_id
)
SELECT
REPLICATE(' ', level) + first_name AS org_chart,
level, path
FROM org_tree
ORDER BY path;
-- Generate a date series with recursion
WITH nums AS (
SELECT 0 AS n
UNION ALL
SELECT n + 1 FROM nums WHERE n < 364
)
SELECT DATEADD(day, n, '2024-01-01') AS calendar_date
FROM nums
OPTION (MAXRECURSION 365);
#temp table instead.Subqueries & EXISTS Intermediate
Subqueries nest one query inside another. EXISTS and NOT EXISTS are often more efficient than IN/NOT IN for existence checks and handle NULLs correctly.
-- Scalar subquery in SELECT
SELECT
first_name, salary,
(SELECT AVG(salary) FROM dbo.employees) AS company_avg,
salary - (SELECT AVG(salary) FROM dbo.employees) AS vs_avg
FROM dbo.employees;
-- Correlated subquery — references outer query row
SELECT
e.first_name, e.dept_id, e.salary,
(SELECT AVG(salary) FROM dbo.employees i
WHERE i.dept_id = e.dept_id) AS dept_avg
FROM dbo.employees e;
-- Derived table (subquery in FROM)
SELECT dept_id, avg_sal
FROM (
SELECT dept_id, AVG(salary) AS avg_sal
FROM dbo.employees
GROUP BY dept_id
) AS dept_avgs
WHERE avg_sal > 80000;
-- EXISTS — stops scanning as soon as one match is found (very fast)
SELECT * FROM dbo.customers c
WHERE EXISTS (
SELECT 1 FROM dbo.orders o
WHERE o.customer_id = c.customer_id
AND o.order_date >= '2024-01-01'
);
-- NOT EXISTS — safe alternative to NOT IN (handles NULLs correctly)
SELECT * FROM dbo.products p
WHERE NOT EXISTS (
SELECT 1 FROM dbo.order_items oi
WHERE oi.product_id = p.product_id
);
Set Operations Intermediate
Set operations combine result sets from multiple SELECT statements. UNION ALL is almost always preferable to UNION for performance — UNION requires a deduplication sort.
-- UNION ALL — all rows including duplicates (FAST)
SELECT emp_id, first_name, 'Employee' AS type FROM dbo.employees
UNION ALL
SELECT contractor_id, full_name, 'Contractor' FROM dbo.contractors;
-- UNION — deduplicates (requires sort — use only when needed)
SELECT email FROM dbo.customers
UNION
SELECT email FROM dbo.leads;
-- INTERSECT — rows that appear in BOTH result sets
SELECT product_id FROM dbo.online_sales
INTERSECT
SELECT product_id FROM dbo.store_sales;
-- EXCEPT — rows in first set but NOT in second
SELECT product_id FROM dbo.products
EXCEPT
SELECT product_id FROM dbo.order_items;
-- Products that have never been ordered
-- ORDER BY — only one at the end, applies to combined result
SELECT emp_id, name FROM dbo.employees
UNION ALL
SELECT contractor_id, name FROM dbo.contractors
ORDER BY name;
PIVOT & UNPIVOT Advanced
PIVOT rotates rows into columns. UNPIVOT does the reverse. Dynamic PIVOT with dynamic SQL is required when column values aren't known at query-write time.
-- Static PIVOT — quarters as columns
SELECT product_line, [Q1], [Q2], [Q3], [Q4]
FROM (
SELECT product_line, quarter, sales
FROM dbo.quarterly_sales
) src
PIVOT (
SUM(sales)
FOR quarter IN ([Q1], [Q2], [Q3], [Q4])
) pvt;
-- UNPIVOT — turn columns back into rows
SELECT product_line, quarter, sales
FROM dbo.pivot_table
UNPIVOT (
sales FOR quarter IN ([Q1], [Q2], [Q3], [Q4])
) unpvt;
-- Dynamic PIVOT — column values at runtime
DECLARE @cols NVARCHAR(MAX), @sql NVARCHAR(MAX);
SELECT @cols = STRING_AGG(QUOTENAME(quarter), ',') WITHIN GROUP (ORDER BY quarter)
FROM (SELECT DISTINCT quarter FROM dbo.quarterly_sales) q;
SET @sql = N'SELECT product_line, ' + @cols + N'
FROM (SELECT product_line, quarter, sales FROM dbo.quarterly_sales) src
PIVOT (SUM(sales) FOR quarter IN (' + @cols + N')) pvt';
EXEC sp_executesql @sql;
SUM(CASE WHEN quarter = 'Q1' THEN sales END) AS Q1. Use it for simple cross-tabs; reserve PIVOT for automatic column generation.Programmability
Stored procedures, user-defined functions, triggers, cursors, dynamic SQL, error handling, and transactions — writing reusable, robust T-SQL.
Variables & Flow Control Intermediate
T-SQL is a full procedural language. Variables, loops, and conditionals enable complex logic — but should be used sparingly in favor of set-based approaches wherever possible.
-- Declaring and assigning variables
DECLARE @name NVARCHAR(100) = N'Alice';
DECLARE @salary DECIMAL(10,2) = 95000.00;
DECLARE @hire_date DATE = '2022-03-15';
DECLARE @count INT; -- NULL by default
-- SET vs SELECT assignment
SET @count = 42;
SELECT @count = COUNT(*) FROM dbo.employees WHERE is_active = 1;
-- Table variable
DECLARE @results TABLE (
emp_id INT,
name NVARCHAR(100),
salary DECIMAL(10,2)
);
INSERT INTO @results
SELECT emp_id, first_name + ' ' + last_name, salary
FROM dbo.employees WHERE dept_id = 3;
-- IF / ELSE
IF @salary > 100000
BEGIN
PRINT 'High earner';
UPDATE dbo.employees SET grade = 'Senior' WHERE emp_id = 101;
END
ELSE IF @salary > 70000
BEGIN
PRINT 'Mid-level';
END
ELSE
BEGIN
PRINT 'Junior';
END
-- WHILE loop
DECLARE @i INT = 1;
WHILE @i <= 10
BEGIN
PRINT 'Iteration: ' + CAST(@i AS VARCHAR);
SET @i += 1;
IF @i = 5 BREAK; -- exit loop
IF @i = 3 CONTINUE; -- skip to next iteration
END
Stored Procedures Advanced
Stored procedures are precompiled, named blocks of T-SQL. They support parameters, return values, output parameters, and full procedural logic — the backbone of application-to-database communication.
-- Create a stored procedure
CREATE OR ALTER PROCEDURE dbo.usp_GetEmployeesByDept
@dept_id INT,
@min_salary DECIMAL(10,2) = 0 -- optional parameter with default
AS
BEGIN
SET NOCOUNT ON; -- suppress "N rows affected" messages
SELECT
emp_id,
first_name + ' ' + last_name AS full_name,
salary, hire_date
FROM dbo.employees
WHERE dept_id = @dept_id
AND salary >= @min_salary
AND is_active = 1
ORDER BY salary DESC;
END;
-- Execute
EXEC dbo.usp_GetEmployeesByDept @dept_id = 3;
EXEC dbo.usp_GetEmployeesByDept @dept_id = 3, @min_salary = 80000;
-- OUTPUT parameters — return values back to caller
CREATE OR ALTER PROCEDURE dbo.usp_CreateEmployee
@first_name NVARCHAR(50),
@last_name NVARCHAR(50),
@salary DECIMAL(10,2),
@dept_id INT,
@new_emp_id INT OUTPUT
AS
BEGIN
SET NOCOUNT ON;
INSERT INTO dbo.employees (first_name, last_name, salary, dept_id, hire_date, is_active)
VALUES (@first_name, @last_name, @salary, @dept_id, GETDATE(), 1);
SET @new_emp_id = SCOPE_IDENTITY();
END;
-- Call with OUTPUT parameter
DECLARE @new_id INT;
EXEC dbo.usp_CreateEmployee
@first_name = N'Alice', @last_name = N'Johnson',
@salary = 85000, @dept_id = 3,
@new_emp_id = @new_id OUTPUT;
PRINT 'New employee ID: ' + CAST(@new_id AS VARCHAR);
SET NOCOUNT ON — row count messages can break calling applications.User-Defined Functions Advanced
UDFs come in three forms: scalar (return one value), inline table-valued (return a table, like a parameterized view), and multi-statement table-valued. Use inline TVFs wherever possible — they perform like views and the optimizer can look inside them.
-- Scalar UDF — use sparingly; runs row-by-row, can kill performance
CREATE OR ALTER FUNCTION dbo.fn_GetAge (@birth_date DATE)
RETURNS INT
AS
BEGIN
RETURN
DATEDIFF(YEAR, @birth_date, GETDATE())
- CASE WHEN FORMAT(GETDATE(), 'MMdd') < FORMAT(@birth_date, 'MMdd')
THEN 1 ELSE 0 END;
END;
SELECT dbo.fn_GetAge('1990-06-15'); -- must use schema prefix
-- Inline Table-Valued Function (iTVF) — best performance
-- Behaves like a parameterized view; optimizer can push predicates inside
CREATE OR ALTER FUNCTION dbo.fn_GetDeptEmployees (@dept_id INT)
RETURNS TABLE
AS
RETURN (
SELECT emp_id, first_name, last_name, salary, hire_date
FROM dbo.employees
WHERE dept_id = @dept_id AND is_active = 1
);
-- CROSS APPLY — call a TVF for each row
SELECT d.dept_name, e.first_name, e.salary
FROM dbo.departments d
CROSS APPLY dbo.fn_GetDeptEmployees(d.dept_id) e;
-- OUTER APPLY — like LEFT JOIN, includes depts with no employees
SELECT d.dept_name, e.first_name, e.salary
FROM dbo.departments d
OUTER APPLY dbo.fn_GetDeptEmployees(d.dept_id) e;
Triggers Advanced
Triggers fire automatically in response to DML or DDL events. They're powerful but invisible — overuse leads to hidden business logic that's hard to debug and maintain.
-- AFTER trigger — fires after the DML completes
-- INSERTED and DELETED are virtual tables with the affected rows
CREATE OR ALTER TRIGGER trg_employees_audit
ON dbo.employees
AFTER UPDATE
AS
BEGIN
SET NOCOUNT ON;
INSERT INTO dbo.employee_audit (emp_id, field_changed, old_value, new_value, changed_at, changed_by)
SELECT
d.emp_id, 'salary',
CAST(d.salary AS NVARCHAR(50)),
CAST(i.salary AS NVARCHAR(50)),
GETDATE(), SYSTEM_USER
FROM inserted i
JOIN deleted d ON i.emp_id = d.emp_id
WHERE i.salary <> d.salary; -- only log actual changes
END;
-- INSTEAD OF trigger — replaces the DML (useful on views)
CREATE OR ALTER TRIGGER trg_prevent_delete
ON dbo.employees
INSTEAD OF DELETE
AS
BEGIN
SET NOCOUNT ON;
UPDATE dbo.employees
SET is_active = 0, termination_date = GETDATE()
WHERE emp_id IN (SELECT emp_id FROM deleted);
END;
-- DDL trigger — fires on schema changes
CREATE OR ALTER TRIGGER trg_prevent_drop
ON DATABASE
FOR DROP_TABLE
AS
BEGIN
PRINT 'Dropping tables is not allowed.';
ROLLBACK;
END;
-- Managing triggers
DISABLE TRIGGER trg_employees_audit ON dbo.employees;
ENABLE TRIGGER trg_employees_audit ON dbo.employees;
DROP TRIGGER trg_employees_audit;
Cursors (& Set-Based Alternatives) Advanced
Cursors process one row at a time — almost always slower than set-based logic. Understand them well enough to replace them.
-- Cursor anatomy
DECLARE @emp_id INT, @salary DECIMAL(10,2);
DECLARE emp_cursor CURSOR
LOCAL STATIC READ_ONLY FORWARD_ONLY -- always specify these options
FOR
SELECT emp_id, salary FROM dbo.employees WHERE is_active = 1;
OPEN emp_cursor;
FETCH NEXT FROM emp_cursor INTO @emp_id, @salary;
WHILE @@FETCH_STATUS = 0
BEGIN
IF @salary < 50000
UPDATE dbo.employees SET salary = salary * 1.1 WHERE emp_id = @emp_id;
FETCH NEXT FROM emp_cursor INTO @emp_id, @salary;
END;
CLOSE emp_cursor;
DEALLOCATE emp_cursor;
-- ✅ SET-BASED equivalent (100x faster):
UPDATE dbo.employees
SET salary = salary * 1.1
WHERE is_active = 1 AND salary < 50000;
Dynamic SQL Advanced
Dynamic SQL builds query strings at runtime — enabling flexible queries with variable table/column names. Always use parameterized dynamic SQL to prevent SQL injection.
-- Basic dynamic SQL with sp_executesql (ALWAYS prefer over EXEC())
DECLARE @sql NVARCHAR(MAX);
DECLARE @dept INT = 3;
SET @sql = N'SELECT * FROM dbo.employees WHERE dept_id = @dept_id';
EXEC sp_executesql
@sql,
N'@dept_id INT', -- parameter definition
@dept_id = @dept; -- parameter value
-- Dynamic table name (can't be parameterized — must be whitelisted)
DECLARE @table_name SYSNAME = N'employees';
-- Validate against known list first!
IF @table_name NOT IN ('employees', 'contractors', 'vendors')
BEGIN
THROW 50001, 'Invalid table name', 1;
RETURN;
END;
SET @sql = N'SELECT COUNT(*) FROM dbo.' + QUOTENAME(@table_name);
-- QUOTENAME wraps in brackets and escapes internal brackets
EXEC sp_executesql @sql;
-- Dynamic ORDER BY — whitelist column names, never concatenate user input raw
DECLARE @order_col NVARCHAR(50) = N'salary';
IF @order_col NOT IN ('emp_id', 'last_name', 'salary', 'hire_date')
SET @order_col = N'emp_id';
SET @sql = N'SELECT * FROM dbo.employees ORDER BY ' + QUOTENAME(@order_col) + N' DESC';
EXEC sp_executesql @sql;
sp_executesql with parameters for values, and QUOTENAME() + whitelisting for object names. EXEC(@sql) with concatenated input is a critical vulnerability.Error Handling — TRY/CATCH Intermediate
T-SQL's TRY/CATCH is modeled after C# — cleaner than the old @@ERROR approach. Always wrap multi-statement operations in TRY/CATCH with transactions to ensure data consistency on failure.
-- Basic TRY/CATCH
BEGIN TRY
INSERT INTO dbo.employees (first_name, dept_id) VALUES ('Alice', 99);
END TRY
BEGIN CATCH
PRINT 'Error: ' + ERROR_MESSAGE();
PRINT 'Number: ' + CAST(ERROR_NUMBER() AS VARCHAR);
PRINT 'Severity: ' + CAST(ERROR_SEVERITY() AS VARCHAR);
PRINT 'Line: ' + CAST(ERROR_LINE() AS VARCHAR);
END CATCH;
-- TRY/CATCH with transaction — the complete pattern
BEGIN TRY
BEGIN TRANSACTION;
UPDATE dbo.accounts SET balance = balance - 500 WHERE account_id = 1;
UPDATE dbo.accounts SET balance = balance + 500 WHERE account_id = 2;
IF (SELECT balance FROM dbo.accounts WHERE account_id = 1) < 0
THROW 50001, 'Insufficient funds', 1;
COMMIT TRANSACTION;
END TRY
BEGIN CATCH
IF @@TRANCOUNT > 0
ROLLBACK TRANSACTION;
THROW; -- re-throw original error to caller (SQL Server 2012+)
END CATCH;
-- Log error in CATCH block before re-throwing
BEGIN CATCH
IF @@TRANCOUNT > 0 ROLLBACK;
INSERT INTO dbo.error_log (error_num, error_msg, error_proc, error_line, logged_at)
VALUES (ERROR_NUMBER(), ERROR_MESSAGE(), ERROR_PROCEDURE(), ERROR_LINE(), GETDATE());
THROW;
END CATCH;
Transactions & Locking Advanced
Transactions ensure atomicity. Understanding isolation levels is critical for balancing data consistency against concurrency and blocking.
-- Explicit transaction
BEGIN TRANSACTION;
UPDATE dbo.inventory SET qty = qty - 1 WHERE product_id = 42;
INSERT INTO dbo.sales (product_id, qty, sale_date) VALUES (42, 1, GETDATE());
COMMIT TRANSACTION;
-- Savepoints — partial rollback within a transaction
BEGIN TRANSACTION;
UPDATE dbo.orders SET status = 'Processing' WHERE order_id = 101;
SAVE TRANSACTION sp1;
UPDATE dbo.inventory SET qty = qty - 5 WHERE product_id = 7;
IF (SELECT qty FROM dbo.inventory WHERE product_id = 7) < 0
ROLLBACK TRANSACTION sp1; -- partial rollback to savepoint
COMMIT TRANSACTION;
-- Set isolation level for current session
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; -- dirty reads OK
SET TRANSACTION ISOLATION LEVEL READ COMMITTED; -- default
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
-- NOLOCK hint — per-table equivalent of READ UNCOMMITTED
SELECT * FROM dbo.orders WITH (NOLOCK);
-- ⚠️ Can return uncommitted data. Use only for dashboards/approx counts.
-- Enable RCSI — best option for most OLTP workloads
ALTER DATABASE YourDatabase SET READ_COMMITTED_SNAPSHOT ON;
| Isolation Level | Dirty Reads | Non-Repeatable | Phantom | Use When |
|---|---|---|---|---|
READ UNCOMMITTED | Yes ❌ | Yes | Yes | Quick approximations; max concurrency |
READ COMMITTED | No ✅ | Yes | Yes | Default; most OLTP workloads |
REPEATABLE READ | No ✅ | No ✅ | Yes | Rarely used |
SERIALIZABLE | No ✅ | No ✅ | No ✅ | Financial; highest consistency |
READ COMMITTED SNAPSHOT | No ✅ | Versions | Versions | Best for mixed OLTP |
SNAPSHOT | No ✅ | No ✅ | No ✅ | Reporting on live OLTP |
Schema Design & Objects
DDL, DML, MERGE, views, indexes, temp tables, JSON, XML — building and maintaining the database objects that power your queries.
DDL — Tables & Data Types Intermediate
Getting your schema right from the start — choosing correct data types, constraints, and identity behavior — pays dividends in performance, storage, and data integrity for the life of the system.
-- Create table with best-practice patterns
CREATE TABLE dbo.employees (
emp_id INT NOT NULL IDENTITY(1,1),
first_name NVARCHAR(50) NOT NULL,
last_name NVARCHAR(50) NOT NULL,
email NVARCHAR(255) NOT NULL,
salary DECIMAL(10,2) NOT NULL DEFAULT 0,
hire_date DATE NOT NULL DEFAULT CAST(GETDATE() AS DATE),
dept_id INT NULL,
is_active BIT NOT NULL DEFAULT 1,
created_at DATETIME2 NOT NULL DEFAULT SYSDATETIME(),
updated_at DATETIME2 NULL,
row_version ROWVERSION NOT NULL,
CONSTRAINT PK_employees PRIMARY KEY (emp_id),
CONSTRAINT UQ_employees_email UNIQUE (email),
CONSTRAINT FK_employees_dept FOREIGN KEY (dept_id)
REFERENCES dbo.departments (dept_id) ON DELETE SET NULL,
CONSTRAINT CK_employees_salary CHECK (salary >= 0)
);
-- Alter table
ALTER TABLE dbo.employees ADD phone NVARCHAR(20) NULL;
ALTER TABLE dbo.employees ALTER COLUMN last_name NVARCHAR(100) NOT NULL;
ALTER TABLE dbo.employees DROP COLUMN phone;
-- Conditional DDL — only create if it doesn't exist
IF NOT EXISTS (SELECT 1 FROM sys.tables
WHERE name = 'employees' AND schema_id = SCHEMA_ID('dbo'))
BEGIN
-- CREATE TABLE ...
END;
Key Data Types Quick Reference
| Type | Size | Range / Notes | Use For |
|---|---|---|---|
INT | 4 bytes | -2.1B to 2.1B | Most integer IDs and counts |
BIGINT | 8 bytes | ±9.2 quintillion | Large IDs, event sequences |
TINYINT | 1 byte | 0–255 | Status codes, flags |
DECIMAL(p,s) | 5–17 bytes | Exact numeric | Money, measurements |
FLOAT | 8 bytes | Approximate | Scientific; never money |
NVARCHAR(n) | 2×n bytes | Unicode, max 4000 | Names, text (use for all strings) |
BIT | 1 bit | 0, 1, or NULL | Boolean flags |
UNIQUEIDENTIFIER | 16 bytes | GUID | Distributed IDs |
DATETIME2 | 6–8 bytes | High precision | Timestamps (preferred over DATETIME) |
DML — Insert, Update, Delete Beginner
T-SQL's OUTPUT clause is uniquely powerful — it lets you capture rows affected by any DML statement without a separate SELECT.
-- INSERT — single and multi-row
INSERT INTO dbo.employees (first_name, last_name, salary, dept_id)
VALUES (N'Alice', N'Johnson', 85000, 3);
INSERT INTO dbo.employees (first_name, last_name, salary)
VALUES
(N'Bob', N'Smith', 70000),
(N'Carol', N'Lee', 90000),
(N'David', N'Brown', 65000);
-- OUTPUT clause — capture inserted rows
DECLARE @inserted_ids TABLE (emp_id INT, full_name NVARCHAR(100));
INSERT INTO dbo.employees (first_name, last_name, salary)
OUTPUT inserted.emp_id, inserted.first_name + ' ' + inserted.last_name
INTO @inserted_ids
VALUES (N'Eve', N'Davis', 78000);
-- UPDATE with JOIN
UPDATE e
SET e.salary = e.salary * 1.1
FROM dbo.employees e
JOIN dbo.departments d ON e.dept_id = d.dept_id
WHERE d.dept_name = 'Engineering';
-- UPDATE with OUTPUT — capture before and after values
UPDATE dbo.employees
SET salary = salary * 1.05
OUTPUT
deleted.emp_id,
deleted.salary AS old_salary,
inserted.salary AS new_salary
WHERE dept_id = 2;
-- DELETE with JOIN
DELETE e
FROM dbo.employees e
JOIN dbo.departments d ON e.dept_id = d.dept_id
WHERE d.dept_name = 'Discontinued';
-- TRUNCATE — removes all rows, resets identity, faster than DELETE
TRUNCATE TABLE dbo.staging_table;
-- Cannot use WHERE; cannot truncate FK-referenced tables
MERGE Statement Advanced
MERGE (UPSERT) combines INSERT, UPDATE, and DELETE — comparing a source to a target and performing the right action for each row. Ideal for syncing tables and staging loads.
-- Full MERGE pattern
MERGE INTO dbo.employees WITH (HOLDLOCK) AS target
USING (
SELECT emp_id, first_name, last_name, salary, dept_id
FROM dbo.employees_staging
) AS source
ON target.emp_id = source.emp_id
WHEN MATCHED AND (
target.salary <> source.salary OR
target.dept_id <> source.dept_id
) THEN
UPDATE SET
target.salary = source.salary,
target.dept_id = source.dept_id,
target.updated_at = SYSDATETIME()
WHEN NOT MATCHED BY TARGET THEN
INSERT (first_name, last_name, salary, dept_id, is_active)
VALUES (source.first_name, source.last_name, source.salary, source.dept_id, 1)
WHEN NOT MATCHED BY SOURCE THEN
DELETE
OUTPUT
$action AS merge_action,
inserted.emp_id, deleted.emp_id AS deleted_id;
-- Simple UPSERT (no DELETE)
MERGE INTO dbo.product_prices AS t
USING (SELECT 42 AS product_id, 29.99 AS price) AS s
ON t.product_id = s.product_id
WHEN MATCHED THEN
UPDATE SET t.price = s.price
WHEN NOT MATCHED THEN
INSERT (product_id, price) VALUES (s.product_id, s.price);
WITH (HOLDLOCK) to the target table reference (shown above) to prevent this.Views & Indexed Views Intermediate
Views are saved SELECT statements — they simplify complex queries, enforce consistent data access, and can be secured independently. Indexed views materialize the result set for dramatic aggregation speedups.
-- Create a view
CREATE OR ALTER VIEW dbo.vw_active_employees
AS
SELECT
e.emp_id,
e.first_name + ' ' + e.last_name AS full_name,
d.dept_name,
e.salary,
e.hire_date,
DATEDIFF(YEAR, e.hire_date, GETDATE()) AS tenure_years
FROM dbo.employees e
JOIN dbo.departments d ON e.dept_id = d.dept_id
WHERE e.is_active = 1;
SELECT * FROM dbo.vw_active_employees WHERE dept_name = 'Engineering';
-- Indexed View — materialized, maintained automatically by SQL Server
-- Strict requirements: WITH SCHEMABINDING, no outer joins/subqueries/DISTINCT
CREATE OR ALTER VIEW dbo.vw_dept_salary_totals
WITH SCHEMABINDING
AS
SELECT
dept_id,
COUNT_BIG(*) AS headcount, -- COUNT_BIG required for indexed views
SUM(salary) AS total_salary
FROM dbo.employees
GROUP BY dept_id;
-- Create the clustered index to materialize it
CREATE UNIQUE CLUSTERED INDEX IX_vw_dept_salary
ON dbo.vw_dept_salary_totals (dept_id);
Indexes & Execution Plans Advanced
Indexes are the single biggest lever for query performance. A good index turns a full table scan into an index seek. Understanding which type to create — and when — separates good from great.
-- Clustered index — defines physical row order. Only 1 per table.
CREATE CLUSTERED INDEX CX_employees_emp_id ON dbo.employees (emp_id);
-- Non-clustered index
CREATE NONCLUSTERED INDEX IX_employees_dept_salary
ON dbo.employees (dept_id, salary);
-- Covering index — add columns to avoid key lookups
CREATE NONCLUSTERED INDEX IX_employees_dept_covering
ON dbo.employees (dept_id, is_active)
INCLUDE (first_name, last_name, salary, hire_date);
-- Query: WHERE dept_id=3 AND is_active=1 selecting these cols = no lookup needed
-- Filtered index — indexes a subset of rows (smaller and faster)
CREATE NONCLUSTERED INDEX IX_orders_pending
ON dbo.orders (order_date, customer_id)
WHERE status = 'Pending';
-- Rebuild vs Reorganize (defragmentation)
ALTER INDEX IX_employees_dept_covering ON dbo.employees REBUILD;
ALTER INDEX IX_employees_dept_covering ON dbo.employees REORGANIZE;
ALTER INDEX ALL ON dbo.employees REBUILD;
-- Statistics I/O — see how many pages were read
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
SELECT * FROM dbo.employees WHERE dept_id = 3;
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
Reading Execution Plans
| Operator | Signal | Action |
|---|---|---|
| Index Seek ✅ | Efficient | Jumping to specific rows |
| Index Scan ⚠️ | OK for small tables | Reading full index |
| Table Scan 🚨 | No index used | Add or fix an index |
| Key Lookup 🚨 | Extra bookmark lookups | Add INCLUDE columns to non-clustered index |
| Hash Match ⚠️ | Building hash table | Often means missing JOIN index |
| Sort ⚠️ | Expensive sort step | Add ORDER BY-aligned index or reduce rows earlier |
Temp Tables & Table Variables Intermediate
SQL Server gives you three ways to store intermediate results: temp tables (#table), table variables (@table), and CTEs. Choosing correctly affects performance and scope.
-- Local temp table — visible only in current session and child procedures
CREATE TABLE #dept_summary (
dept_id INT NOT NULL,
dept_name NVARCHAR(100) NOT NULL,
headcount INT NOT NULL,
avg_salary DECIMAL(10,2) NOT NULL
);
INSERT INTO #dept_summary
SELECT d.dept_id, d.dept_name, COUNT(*), AVG(e.salary)
FROM dbo.employees e
JOIN dbo.departments d ON e.dept_id = d.dept_id
GROUP BY d.dept_id, d.dept_name;
-- Can add indexes to temp tables!
CREATE NONCLUSTERED INDEX IX_tmp_dept ON #dept_summary (dept_id);
SELECT * FROM #dept_summary WHERE avg_salary > 80000;
DROP TABLE IF EXISTS #dept_summary; -- always clean up
-- Table variable — scoped to batch, no statistics
DECLARE @results TABLE (
emp_id INT NOT NULL,
name NVARCHAR(100) NOT NULL
);
INSERT INTO @results SELECT emp_id, first_name FROM dbo.employees;
| Feature | #Temp Table | @Table Variable | CTE |
|---|---|---|---|
| Scope | Session + child procs | Batch only | Single statement |
| Indexes | ✅ Yes | ❌ No | N/A |
| Statistics | ✅ Yes | ❌ No | N/A |
| Transactions | ✅ Participates | ❌ Does not rollback | N/A |
| Best for | Large results, reuse | Small sets, simple staging | Readability, single use |
#temp tables. The optimizer assumes table variables have exactly 1 row, leading to poor plans for large datasets in older versions.JSON Support Advanced
SQL Server 2016+ provides native JSON support for parsing, querying, and generating JSON — covering most practical needs for API integrations and document storage.
-- Generate JSON from query results
SELECT emp_id, first_name, salary
FROM dbo.employees
WHERE dept_id = 3
FOR JSON PATH;
SELECT emp_id, first_name, salary
FROM dbo.employees
FOR JSON AUTO, ROOT('employees'), INCLUDE_NULL_VALUES;
-- Extract values from JSON string
DECLARE @json NVARCHAR(MAX) = N'{"name":"Alice","salary":85000,"tags":["sql","data"]}';
SELECT
JSON_VALUE(@json, '$.name') AS name, -- scalar value
JSON_VALUE(@json, '$.salary') AS salary,
JSON_QUERY(@json, '$.tags') AS tags_array; -- object or array
-- OPENJSON with schema — shred JSON array into rows
DECLARE @employees_json NVARCHAR(MAX) = N'
[{"id":1,"name":"Alice","salary":85000},
{"id":2,"name":"Bob","salary":70000}]';
SELECT emp_id, name, salary
FROM OPENJSON(@employees_json)
WITH (
emp_id INT '$.id',
name NVARCHAR(100) '$.name',
salary DECIMAL(10,2) '$.salary'
);
-- Modify JSON
SET @json = JSON_MODIFY(@json, '$.salary', 90000);
-- Validate JSON
SELECT ISJSON('{"valid":true}'); -- 1
SELECT ISJSON('not json'); -- 0
XML Support Advanced
SQL Server has deep XML support including the native XML data type and XQuery — still common in legacy integrations, SSRS, and SSAS.
-- FOR XML — convert query results to XML
SELECT emp_id, first_name, salary FROM dbo.employees
FOR XML PATH('employee'), ROOT('employees');
-- Query XML data type
DECLARE @xml XML = '<employees>
<employee id="1"><name>Alice</name><salary>85000</salary></employee>
<employee id="2"><name>Bob</name><salary>70000</salary></employee>
</employees>';
-- .value() — extract a scalar
SELECT @xml.value('(/employees/employee[@id=1]/name)[1]', 'NVARCHAR(100)');
-- .nodes() — shred to rows
SELECT
emp.value('(@id)', 'INT') AS emp_id,
emp.value('(name)[1]', 'NVARCHAR(100)') AS name,
emp.value('(salary)[1]', 'DECIMAL(10,2)') AS salary
FROM @xml.nodes('/employees/employee') AS t(emp);
T-SQL Gotchas & Traps Mastery
These are the patterns that silently corrupt data, destroy performance, or produce wrong results. Every experienced SQL Server developer has been burned by at least half of these.
Data & Logic Traps
-- 1. NULL propagation in expressions
SELECT 5 + NULL -- NULL, not 5
SELECT salary + ISNULL(bonus, 0) -- correct pattern
-- 2. NOT IN with NULLs — returns ZERO rows if subquery has any NULLs!
WHERE customer_id NOT IN (SELECT customer_id FROM dbo.blocked);
-- Fix: add WHERE customer_id IS NOT NULL, or use NOT EXISTS
-- 3. Implicit type conversion — converts every row
WHERE varchar_id = 42 -- BAD: table scan
WHERE varchar_id = '42' -- GOOD: same types
-- 4. Integer division
SELECT 1/3 -- = 0 (integer!)
SELECT 1/3.0 -- = 0.333333
-- 5. DATEDIFF counts boundary crossings, not elapsed time
DATEDIFF(year, '2023-12-31', '2024-01-01') -- = 1 (crossed a year boundary)
-- 6. String concatenation with + operator converts numbers
SELECT '1' + '2' -- = 3 (implicit cast to int!)
SELECT CONCAT('1', '2') -- = '12' (correct)
-- 7. UPDATE without WHERE — updates EVERY row
UPDATE dbo.employees SET salary = 50000; -- disaster!
-- Always test with SELECT first, then change to UPDATE
-- 8. SCOPE_IDENTITY vs @@IDENTITY
SELECT @@IDENTITY; -- last identity in session — DANGEROUS with triggers
SELECT SCOPE_IDENTITY(); -- last identity in current scope — ALWAYS use this
Performance Traps
-- 9. Non-sargable predicates (can't use index)
WHERE YEAR(order_date) = 2024 -- BAD: function on column
WHERE order_date >= '2024-01-01' -- GOOD
AND order_date < '2025-01-01'
-- 10. Leading wildcard LIKE
WHERE last_name LIKE '%son' -- BAD: can't use index on last_name
WHERE last_name LIKE 'John%' -- GOOD: can seek on index
-- 11. SELECT * in production code
SELECT * -- schema changes break calling code silently
SELECT emp_id, first_name, salary -- explicit = safe
-- 12. Scalar UDFs in WHERE/SELECT for large tables
WHERE dbo.fn_IsEligible(emp_id) = 1 -- runs once per row, prevents parallelism
-- 13. NOLOCK is not free — can return dirty, duplicate, or missing rows
SELECT * FROM dbo.orders WITH (NOLOCK)
-- Use RCSI at the database level instead
-- 14. Always schema-qualify object names
SELECT * FROM employees -- BAD: checks user schema first
SELECT * FROM dbo.employees -- GOOD: direct resolution
dbo.employees not just employees). Without a schema, SQL Server checks your user schema first — slowing lookups and creating hard-to-find bugs when two schemas have same-named objects.