A complete, zero-assumption training system — from absolute beginner to professional Python developer. Every concept explained. Every line of code commented.
How to Use This Bible
This guide is designed for non-linear learning. Jump directly to any Part using the sidebar. Search for any concept, function, or keyword instantly. Each Part builds on the last, but every section stands on its own — start wherever you need.
🟢 Beginner
Parts I–VIII. Core syntax, data types, control flow, functions, and data structures. No prior experience needed.
🔵 Intermediate
Parts IX–XIII. File handling, OOP, error handling, modules, and Pythonic patterns.
🔴 Advanced
Parts XIV–XV. Popular libraries, type hints, and static typing with mypy.
🏆 Mastery
Part XVI. Professional patterns, testing, performance, and production-quality engineering.
Part I — Absolute Foundations
Everything starts here. We'll answer: what is a computer program, what is Python, how do you install it, and how do you write your very first line of code — from scratch, no experience required.
What Is Programming? Beginner
Before we write a single line of Python, let's answer the most fundamental question: what exactly is programming?
Imagine you're teaching a very literal-minded robot to make a peanut butter sandwich. You can't just say "make a sandwich" — the robot has no idea what that means. You have to be incredibly specific: "Open the bread bag. Take out two slices. Set them on the counter. Open the peanut butter jar. Pick up the knife. Scoop peanut butter onto the knife. Spread it on one slice..." and so on. Programming is exactly this — giving a computer step-by-step instructions in a language it can understand.
A computer is extraordinarily fast at following instructions, but it has zero common sense. It will do exactly what you tell it, nothing more, nothing less. Your job as a programmer is to be precise, logical, and thorough.
What: Programming is writing instructions for a computer to follow, expressed in a formal language the computer can understand.
How: You write text in a file, a program converts it to machine instructions, and the computer executes them one by one at millions of steps per second.
When: Anytime you need a computer to do something automatically, reliably, or at a scale no human could manage manually.
Why: To automate repetitive tasks, analyze data, build applications, and solve problems faster and more reliably than any human could manually.
Every app on your phone, every website you visit, every video game you play — all of it is code that someone wrote. Once you learn to program, you can build things too.
What Is Python? Beginner
Python is a programming language — one of many. Think of programming languages like spoken languages: Spanish, French, Mandarin all let you express ideas, but in different ways with different rules. Python is just one way to talk to a computer.
Python was created by Guido van Rossum in 1991. He named it after Monty Python (the British comedy group), not the snake. Today, Python is one of the most popular programming languages in the world, used in:
- Data Science & Machine Learning — companies like Google, Netflix, and NASA use Python to analyze data and build AI
- Web Development — websites like Instagram and Pinterest were built with Python
- Automation & Scripting — automating repetitive tasks like renaming files or sending emails
- Finance — banks use Python to analyze markets and manage risk
- Science & Research — scientists use Python to process experimental data
Why is Python so popular for beginners? Because it was designed to be readable. Compare these two programs that both print "Hello, World!":
Java:
public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello, World!");
    }
}
Python:
print("Hello, World!")  # That's it. One line.
Python's philosophy is captured in a document called The Zen of Python, which includes principles like "readability counts" and "simple is better than complex." Python code is meant to read almost like English.
There are two major versions of Python. Python 2 reached end of life on January 1, 2020 and no longer receives updates, including security fixes. Python 3 is the current, modern version you should learn and use. All examples in this Bible use Python 3. When you install Python, make sure you install Python 3.
Installing Python & VS Code Beginner
Before you can run Python code, you need two things: Python itself (the engine) and a code editor (the place you write code). We'll use VS Code: it's free, powerful, and widely used by professional developers.
Step 1: Install Python
- Go to python.org/downloads
- Click the big yellow "Download Python 3.x.x" button (the exact version number doesn't matter as long as it starts with 3)
- Run the installer
- CRITICAL: On Windows, check the box that says "Add Python to PATH" (labeled "Add python.exe to PATH" in recent installers) before clicking Install. Forgetting this is the most common beginner mistake.
- Click "Install Now"
To verify Python installed correctly, open your terminal (on Mac: search for "Terminal"; on Windows: search for "Command Prompt") and type:
python --version
You should see something like Python 3.11.5. If you see an error, try python3 --version instead (common on Mac).
Step 2: Install VS Code
- Go to code.visualstudio.com
- Download the version for your operating system (Windows, Mac, or Linux)
- Install it like any normal application
Step 3: Install the Python Extension for VS Code
- Open VS Code
- Press Ctrl+Shift+X (Windows) or Cmd+Shift+X (Mac) to open the Extensions panel
- Search for "Python"
- Install the one made by Microsoft (it has millions of downloads)
Once VS Code opens and you can see a file explorer on the left side, you're ready to write code. Create a new folder somewhere on your computer called "python-learning" — this is where all your practice files will live.
Your First Script Beginner
A script is simply a text file containing Python instructions. Python files end in .py. Let's create and run your very first one.
Create the file
- In VS Code, open your "python-learning" folder: File → Open Folder
- Click the "New File" icon in the Explorer panel
- Name it hello.py
- Type the following code:
# This is your very first Python program!
# The pound sign (#) starts a comment: Python ignores everything after it
# Comments are notes for humans, not instructions for the computer
print("Hello, World!")  # print() displays text on the screen
print("My name is Python.")  # You can have multiple print() calls
print("I am a programming language.")
Run the file
Open the terminal in VS Code with Ctrl+` (that's the backtick key, next to 1 on the keyboard) or Terminal → New Terminal. Then type:
python hello.py
You should see:
Hello, World!
My name is Python.
I am a programming language.
Think of running a Python script like playing a recipe. The recipe file (hello.py) contains the instructions. Python is the chef who reads the instructions from top to bottom and executes each one in order. Just like a chef doesn't skip around a recipe randomly, Python reads your code from the very first line to the very last line, in order.
If you get a SyntaxError, you probably have a typo such as a missing quote or mismatched parentheses. Python is also case-sensitive: Print() with a capital P fails with a NameError, because the correct name is print() with a lowercase p.
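To see what a capitalization slip looks like in practice, here is a minimal sketch. It uses try/except, which is covered with error handling in a later Part, and the variable name error_name is only for illustration.

```python
# Calling Print() instead of print() is valid syntax, so Python only
# complains when it runs the line and cannot find that name.
try:
    Print("Hello, World!")  # capital P: Python has no built-in called Print
except NameError as error:
    error_name = type(error).__name__  # the kind of error that was raised
    print(f"{error_name}: {error}")
```

The error is a NameError rather than a SyntaxError: the line is grammatically fine, but the name it refers to does not exist.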
The Python REPL Beginner
There's another way to run Python code besides saving a file: the REPL. REPL stands for Read-Eval-Print Loop. Don't worry about that acronym — just think of it as a conversation with Python where you type one line at a time and Python responds immediately.
To start the REPL, open your terminal and just type:
python
You'll see something like:
Python 3.11.5 (default, ...)
Type "help", "copyright", "credits" or "license" for more information.
>>>
The >>> prompt means Python is waiting for your input. Try typing:
>>> 2 + 2
4
>>> print("Hello from the REPL!")
Hello from the REPL!
>>> 10 * 5
50
To exit the REPL, type quit() and press Enter, or press Ctrl+D (Mac/Linux) or Ctrl+Z then Enter (Windows).
Use the REPL for quick experiments, testing a single line of code, or learning how something works. It's your scratchpad.
Use files for anything you want to save, anything longer than a few lines, and any real project you're building.
The print() Function Beginner
print() is Python's way of displaying information to the screen. It's the most basic tool you have for seeing what your program is doing. Let's explore everything it can do:
# Basic print: displays text (called a "string") to the screen
print("Hello, World!")

# Print a number: no quotes needed for numbers
print(42)

# Print multiple things at once by separating them with commas
# Python automatically adds a space between each item
print("My age is", 25, "years old")
# Output: My age is 25 years old

# Change the separator (the character between items)
# By default, sep=" " (a space), but we can change it
print("2024", "01", "15", sep="-")
# Output: 2024-01-15

# Change the end character (what's printed at the very end)
# By default, end="\n" which means "new line"
# Using end="" means: don't go to a new line after printing
print("Hello", end=" ")  # stays on same line
print("World")  # continues on same line
# Output: Hello World

# Print a blank line (useful for spacing output)
print()
# Output: (empty line)

# Print the result of a calculation directly
print(10 + 5)  # Output: 15
print(3 * 4)  # Output: 12
print() is a function — a reusable set of instructions. The parentheses () are how you "call" or activate a function. The stuff you put inside the parentheses (like the text you want to print) is called an argument — it's the information you're giving the function to work with. We'll explore functions in depth in Part VII.
Comments Beginner
Comments are notes in your code that Python completely ignores. They start with a # symbol.
Code is written once but read many times — by you, your team, your future self. Comments explain the why, not just the what.
# This is a single-line comment
# Everything after # on this line is ignored by Python
print("Hello")  # Comments can also go at the END of a line of code

# Python doesn't have a special "multi-line comment" syntax
# Instead, just use multiple # lines
# Like this.
# And this.

# HOWEVER: triple-quoted strings are sometimes used as multi-line "comments"
# (they're technically strings, but if you don't store or use them, they do nothing)
"""
This is a triple-quoted string.
It spans multiple lines.
When used at the top of a function or file, it becomes a "docstring"
which documents what the code does. More on this in Part VII.
"""

# Good comment: explains WHY, not just what
# BAD comment example (don't do this, it's obvious):
x = 5  # set x to 5   <-- useless, the code already says this

# GOOD comment example:
max_retries = 3  # API calls sometimes fail; retry up to 3 times before giving up
Throughout this guide, every line of code is commented to help you learn. In real professional code, you wouldn't comment every single line — only the parts that aren't immediately obvious. As you grow as a developer, you'll develop a feel for what needs explaining and what doesn't.
Indentation Rules — Why Python Is Different Beginner
This might be the most important concept in Python that trips up beginners: Python uses whitespace (indentation) to define structure. Most programming languages use curly braces { } to group code together. Python uses indentation instead.
Think of an outline. Main topics are at the left margin. Sub-topics are indented one level. Sub-sub-topics are indented two levels. Python works exactly the same way. When code "belongs inside" something (like inside an if statement, or inside a function), you indent it to show that relationship.
# This is a simple "if statement" (covered in depth in Part V)
# The key lesson here is: the INDENTED code belongs to the if
temperature = 35  # this line is NOT indented, it's at the top level
if temperature > 30:  # the colon (:) means "a block is starting"
    print("It's hot!")  # 4 spaces indent: this belongs to the if block
    print("Stay hydrated")  # also 4 spaces, also belongs to the if block
print("This runs no matter what")  # back to 0 indent, NOT inside the if

# Here's what WRONG indentation looks like (this would cause an error):
# if temperature > 30:
# print("It's hot!")   <-- ERROR! This should be indented

# The standard in Python is 4 spaces per level of indentation
# Most editors (including VS Code) automatically do this when you press Tab

# Nested indentation (code inside code inside code):
if temperature > 30:  # level 0 (no indent)
    print("Hot day")  # level 1 (4 spaces)
    if temperature > 40:  # level 1 (4 spaces)
        print("Dangerously hot")  # level 2 (8 spaces)
Why does Python use indentation instead of curly braces?
Guido van Rossum made a deliberate design choice: he believed that if indentation is required, code will automatically look clean and organized. In languages with curly braces, some programmers write messy code where the visual structure doesn't match the logical structure. Python prevents this entirely — in Python, what you see is exactly what happens.
Python requires consistent indentation: use either spaces or tabs, not both. The Python community standard (PEP 8) says to use 4 spaces per indentation level. VS Code's Tab key inserts 4 spaces by default. Never mix tabs and spaces in the same file: when Python 3 cannot tell how the two line up, it raises a TabError, which is a kind of IndentationError.
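The effect of mixing tabs and spaces can be checked without creating a broken file. The sketch below is only an illustration: it stores a tiny program as text in a variable (mixed_source and outcome are made-up names) and asks Python's built-in compile() to check it without running it.

```python
# Two indented lines that look aligned in many editors,
# but the first uses a tab (\t) and the second uses eight spaces.
mixed_source = "if True:\n\tx = 1\n        y = 2\n"

try:
    compile(mixed_source, "<demo>", "exec")  # checks the code; runs nothing
    outcome = "compiled"
except TabError as error:  # TabError is a subclass of IndentationError
    outcome = f"TabError: {error}"

print(outcome)
```

Letting your editor insert spaces whenever you press Tab avoids this problem entirely.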
Part II — Variables & Data Types
Every program needs to store information. Variables are the containers. Data types describe what kind of information goes inside them. This is the bedrock of all programming.
What Is a Variable? Beginner
Imagine a variable as a labeled box. The label is the variable name (like "age" or "username"). The contents of the box is the value (like 25 or "Alice"). You can look inside the box, change what's inside, and use the contents whenever you need them. The box itself stays the same — just what's inside can change.
In Python, you create a variable by choosing a name, writing an equals sign =, and then providing the value. That equals sign is called the assignment operator — it assigns a value to a name.
# Creating (declaring) a variable
# Format: variable_name = value
age = 25  # 'age' is the label, 25 is the value inside
name = "Alice"  # 'name' holds the text "Alice"
height = 5.9  # 'height' holds the decimal number 5.9
is_student = True  # 'is_student' holds the value True

# Using (reading) a variable
print(age)  # prints: 25
print(name)  # prints: Alice
print("My name is", name, "and I am", age, "years old.")
# prints: My name is Alice and I am 25 years old.

# Changing (reassigning) a variable: the box gets new contents
age = 26  # 25 is gone, 26 is now inside the box
print(age)  # prints: 26

# You can even change what TYPE of data is in the variable
# (Python allows this; more on "dynamic typing" shortly)
age = "twenty-six"  # now 'age' holds text instead of a number
print(age)  # prints: twenty-six

# Variables can be used in calculations
price = 100
discount = 20
final_price = price - discount  # store the result of a calculation
print(final_price)  # prints: 80
In Python (and most programming languages), = means "assign this value to this name." It's not asking a question — it's giving an instruction. To compare if two things are equal, you use double equals: ==. We'll cover this in Part IV.
Integers (int) Beginner
An integer (called int in Python) is a whole number — no decimal point. It can be positive, negative, or zero.
# Integers are whole numbers: no decimal point
age = 25  # positive integer
temperature = -10  # negative integer
count = 0  # zero is also an integer

# Python integers can be ARBITRARILY LARGE, with no overflow!
# (Unlike some other languages, Python handles huge numbers natively)
big_number = 999999999999999999999999999999
print(big_number)  # Python handles this fine

# Underscores in numbers make large numbers readable (Python 3.6+)
population = 8_000_000_000  # same as 8000000000 but much easier to read
one_million = 1_000_000  # Python ignores underscores in numbers

# Basic arithmetic with integers
a = 10
b = 3
print(a + b)  # Addition: 13
print(a - b)  # Subtraction: 7
print(a * b)  # Multiplication: 30
print(a / b)  # Division: 3.3333... (always returns a float!)
print(a // b)  # Floor division: 3 (divides and throws away the remainder)
print(a % b)  # Modulo: 1 (the REMAINDER after dividing)
print(a ** b)  # Exponent: 1000 (10 to the power of 3)

# Check the type of a variable using type()
print(type(age))  # Output: <class 'int'>
The modulo operator gives you the remainder after division. 10 ÷ 3 = 3 remainder 1, so 10 % 3 equals 1. This is incredibly useful for things like "is this number even or odd?" (if n % 2 == 0, it's even), or cycling through items in a list.
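Both uses mentioned above can be sketched in a few lines. This is an illustrative example only: is_even, colors, and picked are made-up names, and it borrows functions, lists, and loops, which are covered in later Parts.

```python
def is_even(n):
    # A number is even when dividing by 2 leaves no remainder
    return n % 2 == 0

print(is_even(10))  # True
print(is_even(7))   # False

# Cycling: i % 3 goes 0, 1, 2, 0, 1, ... so the index never runs off the list
colors = ["red", "green", "blue"]
picked = []
for i in range(5):
    picked.append(colors[i % len(colors)])

print(picked)  # ['red', 'green', 'blue', 'red', 'green']
```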
Floats (float) Beginner
A float (short for "floating-point number") is any number with a decimal point. The name "floating point" refers to how computers represent these numbers internally — the decimal point can "float" to different positions.
# Floats have a decimal point
price = 9.99
pi = 3.14159
temperature = -23.5

# Scientific notation works too
speed_of_light = 3e8  # 3 × 10^8 = 300,000,000.0
tiny = 1.5e-4  # 1.5 × 10^-4 = 0.00015
print(type(price))  # Output: <class 'float'>

# IMPORTANT: Floats are not perfectly precise!
# This is a fundamental limitation of how computers store decimal numbers
print(0.1 + 0.2)  # Output: 0.30000000000000004 (not 0.3!)

# For money and financial calculations, use the 'decimal' module instead
from decimal import Decimal
print(Decimal('0.1') + Decimal('0.2'))  # Output: 0.3 (precise!)

# Dividing two integers ALWAYS gives a float in Python 3
result = 10 / 2
print(result)  # Output: 5.0 (not 5)
print(type(result))  # Output: <class 'float'>

# Rounding floats
print(round(3.14159, 2))  # Round to 2 decimal places: 3.14
print(round(3.7))  # Round to nearest whole number: 4
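Because of this imprecision, comparing floats with == can give surprising answers. One common way around it, shown here as a small sketch rather than the only option, is math.isclose() from the standard library, which checks whether two numbers are equal within a tiny tolerance. The variable names are illustrative.

```python
import math

total = 0.1 + 0.2

exact_match = total == 0.3              # strict comparison
close_match = math.isclose(total, 0.3)  # comparison with a small tolerance

print(exact_match)  # False
print(close_match)  # True
```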
Strings (str) Beginner
A string is any sequence of characters — letters, numbers, symbols, spaces — surrounded by quotes. The word "string" comes from the idea of characters strung together like beads on a necklace.
# Strings can use single quotes OR double quotes; both are fine
name1 = 'Alice'  # single quotes
name2 = "Bob"  # double quotes
# Both work identically. Pick one style and stay consistent.

# Use the OTHER type of quote when your string contains quotes
message = "I'm learning Python!"  # string has ' inside, so we use ""
quote = 'She said "hello" to me.'  # string has "" inside, so we use ''

# Strings can contain anything: letters, numbers, symbols
email = "[email protected]"
address = "123 Main St, Anytown, USA"
empty = ""  # an empty string is a valid string (no characters)

# String length: how many characters are in it?
word = "Python"
print(len(word))  # Output: 6 (P-y-t-h-o-n = 6 characters)

# String concatenation: joining strings together with +
first = "Hello"
second = "World"
combined = first + " " + second  # adds a space in the middle
print(combined)  # Output: Hello World

# Repeating strings with *
print("ha" * 3)  # Output: hahaha
print("-" * 30)  # prints a line of 30 dashes

# The number 5 and the string "5" are DIFFERENT things
num = 5  # integer
text = "5"  # string
print(num + num)  # Output: 10 (math addition)
print(text + text)  # Output: 55 (string joining, not addition!)
Booleans (bool) Beginner
A boolean is the simplest data type — it can only be one of two values: True or False. Named after mathematician George Boole who invented boolean algebra.
Think of a light switch. It's either ON or OFF — never both, never neither. Booleans are the light switches of programming. They're used to answer yes/no questions: "Is the user logged in? Is this file empty? Did the payment go through?"
# Booleans: note the capital T and F!
is_raining = True  # True: capital T, no quotes
is_sunny = False  # False: capital F, no quotes

# Booleans come from comparisons (more in Part IV)
age = 20
can_vote = age >= 18  # Is 20 >= 18? Yes! So can_vote = True
print(can_vote)  # Output: True
is_teenager = age < 20  # Is 20 < 20? No! So is_teenager = False
print(is_teenager)  # Output: False

# Boolean values in Python ARE actually integers underneath:
# True == 1, False == 0 (this is useful for counting True values)
print(True + True)  # Output: 2
print(True + False)  # Output: 1
print(type(True))  # Output: <class 'bool'>
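Since True counts as 1 and False as 0, adding up a collection of booleans tells you how many are True. A minimal sketch, with made-up data and variable names, using a list (covered in a later Part) and the built-in sum():

```python
# Hypothetical quiz results: True means the answer was correct
answers = [True, False, True, True]

correct = sum(answers)  # True adds 1, False adds 0
print(correct)  # 3
```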
None — The Absence of a Value Beginner
None is Python's way of representing "nothing," "no value," or "absence of a value." It's not zero, it's not an empty string — it means the variable intentionally has no value.
Imagine a form with a field for "Middle Name." Some people have a middle name (like "Robert"), some leave it blank. None is the equivalent of leaving it blank — not writing "N/A" (that's a string), not writing 0 (that's a number) — just nothing. It represents "this field was intentionally not filled in."
# None represents "no value" (note the capital N!)
result = None
print(result)  # Output: None
print(type(result))  # Output: <class 'NoneType'>

# None is often used as a placeholder until you have real data
user_input = None  # We haven't asked the user yet
# ... later in the program ...
user_input = "Alice"  # Now we have real data

# Functions that don't explicitly return a value return None
def say_hi():
    print("Hi!")  # this function prints but doesn't RETURN anything

output = say_hi()  # Output: Hi!
print(output)  # Output: None (because say_hi() returned nothing)

# Check if something is None using 'is' (not ==)
if result is None:
    print("No value yet")  # This will print
if result is not None:
    print("We have a value!")  # This won't print
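The "Middle Name" idea can be sketched as a function that returns None when there is nothing to return. The function find_middle_name is hypothetical, and it makes the simplifying assumption that a name with a middle name has exactly three words.

```python
def find_middle_name(full_name):
    parts = full_name.split()  # break the name into words
    if len(parts) == 3:
        return parts[1]  # the word between first and last name
    return None  # no middle name: deliberately "nothing"

middle = find_middle_name("Alice Smith")
if middle is None:
    print("No middle name on file")
else:
    print(f"Middle name: {middle}")
```

Returning None rather than an empty string or "N/A" lets the caller tell "no value" apart from a real value with a single `is None` check.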
type() Function & Type Coercion Beginner
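Python gives you a built-in tool to check what type of data any variable holds: the type() function. You can also convert between types explicitly, a process called type conversion or type casting. (The term type coercion is often used loosely for the same thing, though strictly it refers to conversions Python performs automatically.)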
# type() tells you what kind of data something is
print(type(42))  # <class 'int'>
print(type(3.14))  # <class 'float'>
print(type("hello"))  # <class 'str'>
print(type(True))  # <class 'bool'>
print(type(None))  # <class 'NoneType'>

# ── TYPE CONVERSION (CASTING) ──────────────────────────────────
# Convert to integer with int()
num_text = "42"  # this is a string
num_int = int(num_text)  # now it's an integer
print(num_int + 8)  # Output: 50 (math works now!)

# Convert float to int: drops the decimal part (does NOT round!)
print(int(9.9))  # Output: 9 (not 10!)
print(int(3.14))  # Output: 3

# Convert to float with float()
print(float(5))  # Output: 5.0
print(float("3.14"))  # Output: 3.14

# Convert to string with str()
age = 25
message = "I am " + str(age) + " years old"  # must convert int to str to join
print(message)  # Output: I am 25 years old

# Convert to boolean with bool()
# Almost everything converts to True, except these falsy values:
print(bool(0))  # False (zero)
print(bool(""))  # False (empty string)
print(bool(None))  # False (None)
print(bool(42))  # True (any non-zero number)
print(bool("hello"))  # True (any non-empty string)

# WATCH OUT: Some conversions will fail!
# int("hello")   <-- ERROR: "hello" can't become a number
# int("3.14")    <-- ERROR: strings with decimals must go through float first
print(int(float("3.14")))  # Works: "3.14" → 3.14 → 3
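When the text comes from somewhere you don't control, a failed conversion stops the program with a ValueError. One possible way to handle that, sketched here with a hypothetical helper named to_int_or_none and try/except (covered with error handling in a later Part), is to return None when the text isn't a whole number:

```python
def to_int_or_none(text):
    try:
        return int(text)  # works for text like "42"
    except ValueError:
        return None  # text like "hello" or "3.14" cannot become an int

print(to_int_or_none("42"))     # 42
print(to_int_or_none("hello"))  # None
print(to_int_or_none("3.14"))   # None
```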
Naming Conventions Beginner
Variable names are yours to choose, but there are rules (things Python enforces) and conventions (things the community agrees on).
The Rules (Python enforces these — break them and get an error)
- Must start with a letter or underscore _ (NOT a number)
- Can only contain letters, numbers, and underscores: no spaces, hyphens, or special characters
- Case-sensitive: name, Name, and NAME are three different variables
- Cannot be a Python keyword (words Python uses, like if, for, while, class, etc.)
# ✅ VALID variable names
age = 25
first_name = "Alice"  # snake_case: words separated by underscores
_private = "secret"  # underscore prefix: convention for "private" variable
name2 = "Bob"  # numbers OK if not at the start
CONSTANT_VALUE = 3.14  # ALL_CAPS convention for values that never change

# ❌ INVALID variable names (these cause SyntaxError)
# 2name = "Bob"      <-- starts with a number
# first-name = ""    <-- hyphens not allowed
# my name = ""       <-- spaces not allowed
# for = 5            <-- 'for' is a Python keyword

# ── PYTHON NAMING CONVENTIONS (community standards) ───────────
# Variables and functions: snake_case (all lowercase, underscores)
user_name = "Alice"
total_price = 99.99

def calculate_tax():
    pass  # function names: same as variables

# Constants: ALL_CAPS_WITH_UNDERSCORES (Python can't truly enforce constants,
# this is just a convention telling others "don't change this!")
MAX_CONNECTIONS = 100
PI = 3.14159
API_KEY = "abc123"

# Classes: PascalCase (capitalize first letter of each word, no underscores)
# class UserAccount:   <-- correct class naming (Part XII)
# class user_account:  <-- wrong convention for classes

# Be descriptive! Good names make code self-documenting.
# BAD:  x = 86400
# GOOD:
seconds_per_day = 86400  # anyone can understand what this is
Dynamic Typing — Why It Matters Intermediate
Python is a dynamically typed language. This means you don't have to declare what type a variable will hold before you use it — Python figures it out automatically at runtime.
What: The "type" of a variable (int, str, float, etc.) is determined by the value currently stored in it, and can change at any time.
How: Python inspects the value you assign and automatically attaches the appropriate type. When you reassign, the type updates accordingly.
When it matters: When debugging type errors, when working with user input (which always comes in as a string), and when writing code for others to use.
Why: Dynamic typing makes code faster to write, but it requires discipline: accidentally storing the wrong type in a variable causes hard-to-find bugs.
# Python vs statically-typed languages (like Java or C++)
# In Java, you must declare the type:
#   int age = 25;          <-- type declared explicitly
#   age = "twenty-five"    <-- ERROR in Java: can't change type!

# In Python, no type declaration needed:
age = 25  # Python sees an integer
print(type(age))  # <class 'int'>
age = "twenty-five"  # Python now sees a string, no error!
print(type(age))  # <class 'str'>
age = 25.5  # Now a float
print(type(age))  # <class 'float'>

# THE DANGER: dynamic typing can hide bugs
# Example: user input always comes back as a string
user_age = "25"  # Pretend user typed "25": it's a STRING
# user_age + 5   <-- TypeError! Can't add string + int
# You must convert it:
user_age = int("25")  # Now it's a real integer
print(user_age + 5)  # Output: 30 ✅

# SOLUTION for larger codebases: Type Hints (covered in Part XV)
# You can optionally annotate what type a variable SHOULD be:
def greet(name: str, age: int) -> str:  # these are hints, not enforced
    return f"Hello {name}, you are {age} years old"
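"Hints, not enforced" is easy to verify. In this small sketch (the function double is made up for illustration), the hint says int, yet Python happily runs the function with a string. Only a separate checking tool such as mypy would flag the second call.

```python
def double(value: int) -> int:  # the hint says "int", but it is only a hint
    return value * 2

number_result = double(4)    # used as intended
text_result = double("ab")   # wrong type, yet no error at runtime

print(number_result)  # 8
print(text_result)    # abab  (string repetition, not arithmetic)
```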
Part III — Strings In Depth
Strings are everywhere in programming — user input, file contents, API responses, error messages. Master strings and you unlock the ability to work with text in any form.
Indexing & Slicing Beginner
Every character in a string has an address called an index. Python (like most languages) starts counting at 0, not 1. This trips up almost every beginner.
Think of apartments in a building. If the building has floors 0, 1, 2, 3... the ground floor is floor 0, not floor 1. In Python, the first character of a string lives at index 0, the second at index 1, and so on. This feels weird at first but becomes completely natural with practice.
# String indexing: accessing individual characters
word = "Python"
#  P   y   t   h   o   n
#  0   1   2   3   4   5    (positive indices, counting from left)
# -6  -5  -4  -3  -2  -1    (negative indices, counting from right)

# Access a character using square brackets []
print(word[0])  # Output: P (first character)
print(word[1])  # Output: y
print(word[5])  # Output: n (last character, index = length - 1)

# Negative indexing: count from the END
print(word[-1])  # Output: n (last character)
print(word[-2])  # Output: o (second to last)
print(word[-6])  # Output: P (same as word[0])

# IndexError: going out of bounds
# print(word[10])   <-- ERROR! "Python" only has indices 0-5

# ── SLICING: extracting a portion of a string ─────────────────
# Format: string[start:stop]
# Returns characters from 'start' UP TO BUT NOT INCLUDING 'stop'
sentence = "Hello, World!"
print(sentence[0:5])  # Output: Hello (indices 0,1,2,3,4)
print(sentence[7:12])  # Output: World (indices 7,8,9,10,11)

# Omitting start: defaults to 0
print(sentence[:5])  # Output: Hello (same as [0:5])

# Omitting end: goes to the end of the string
print(sentence[7:])  # Output: World!

# Omitting both: copy the entire string
print(sentence[:])  # Output: Hello, World!

# ── STEP: the third slice argument ────────────────────────────
# Format: string[start:stop:step]
# step: how many positions to move forward each time (2 = every 2nd character)
text = "abcdefgh"
print(text[::2])  # Every 2nd character: abcdefgh → aceg
print(text[1:7:2])  # Start at 1, stop at 7, every 2nd: bdf

# The famous trick: reverse a string with [::-1]
# step of -1 means "go backwards"
print("Python"[::-1])  # Output: nohtyP
print("racecar"[::-1])  # Output: racecar (a palindrome!)
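Two follow-ups are worth trying yourself. The sketch below wraps the reversal trick in a hypothetical helper called is_palindrome, and shows that slices, unlike single indices, do not raise an error when they run past the end of the string.

```python
def is_palindrome(word):
    # A palindrome reads the same forwards and backwards
    return word == word[::-1]

print(is_palindrome("racecar"))  # True
print(is_palindrome("Python"))   # False

# Out-of-range slices are quietly trimmed to fit the string
word = "Python"
tail = word[2:100]   # stops at the real end of the string
nothing = word[10:]  # starts past the end, so the result is empty

print(tail)           # thon
print(repr(nothing))  # ''
```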
String Concatenation Beginner
# Concatenation: joining strings together

# Method 1: The + operator
first = "Hello"
second = "World"
result = first + " " + second  # joins the strings
print(result)  # Output: Hello World

# Remember: can only concatenate strings with strings
name = "Alice"
age = 25
# message = "My name is " + name + " and I'm " + age   <-- TypeError!
message = "My name is " + name + " and I'm " + str(age)  # convert first
print(message)

# Method 2: The % formatting (old style; you'll see this in older code)
message = "My name is %s and I'm %d years old" % (name, age)
# %s = string placeholder, %d = integer placeholder
print(message)

# Method 3: .format() (medium-age style)
message = "My name is {} and I'm {} years old".format(name, age)
print(message)

# Method 4: f-strings (modern, recommended; we'll cover them next!)
message = f"My name is {name} and I'm {age} years old"
print(message)
f-Strings — The Modern Way Beginner
f-strings (formatted string literals) were introduced in Python 3.6 and are now the standard way to embed variables and expressions inside strings. The f before the quote tells Python "this is an f-string — treat {curly braces} as code."
# f-strings: put an 'f' before the opening quote
# Then use {curly braces} to embed variables or expressions
name = "Alice"
age = 25
score = 95.678

# Basic f-string
print(f"Hello, {name}!")  # Output: Hello, Alice!
print(f"{name} is {age} years old.")  # Output: Alice is 25 years old.

# You can put EXPRESSIONS (calculations) inside the braces
print(f"Next year she'll be {age + 1}.")  # evaluates age + 1 = 26
print(f"Uppercase name: {name.upper()}")  # can call methods too!
print(f"2 to the power of 10 is {2**10}")  # Output: 1024

# Format specifiers control HOW values are displayed
# Format: {value:format_spec}

# Number formatting
print(f"Score: {score:.2f}")  # .2f = 2 decimal places: 95.68
print(f"Score: {score:.0f}")  # .0f = no decimals: 96
print(f"Big: {1234567:,}")  # comma separator: 1,234,567
print(f"Percent: {0.75:.1%}")  # percentage: 75.0%

# Width and alignment
print(f"{name:10}|")  # left-align, 10 chars wide: "Alice     |"
print(f"{name:>10}|")  # right-align: "     Alice|"
print(f"{name:^10}|")  # center: "  Alice   |"

# Python 3.8+ bonus: the = specifier (debugging helper)
x = 42
print(f"{x=}")  # Output: x=42 (shows name AND value!)

# Multiline f-strings
info = (
    f"Name: {name}\n"
    f"Age: {age}\n"
    f"Score: {score:.1f}"
)
print(info)
f-strings are the fastest, most readable, and most modern way to format strings in Python. Use them as your default. Only use .format() if you need to support Python versions before 3.6 (very rare), or the old % style if you're maintaining old code.
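Width and alignment specifiers are most useful together, for example to line up columns of text. A minimal sketch with made-up menu data (items and lines are illustrative names), using a list and a loop from later Parts:

```python
# Hypothetical menu: (item name, price) pairs
items = [("Coffee", 3.5), ("Sandwich", 7.25), ("Tea", 2.0)]

lines = []
for name, price in items:
    # Name left-aligned in 10 characters, price right-aligned in 8
    # characters with exactly 2 decimal places
    lines.append(f"{name:<10}{price:>8.2f}")

for line in lines:
    print(line)
```

Every line comes out 18 characters wide, so the prices form a neat right-hand column regardless of how long each name is.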
Common String Methods Beginner
A method is a function that belongs to an object. For strings, you call methods using string.method_name(). Think of them as built-in tools that every string comes with.
String methods never change the original string — they always return a NEW string. This is because strings in Python are immutable (cannot be changed after creation). Always capture the result: new_text = old_text.upper(), not just old_text.upper().
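Immutability can be seen directly in a short sketch (the variable names are illustrative; try/except is covered with error handling in a later Part):

```python
greeting = "hello"

greeting.upper()   # the uppercase copy is created, then thrown away
print(greeting)    # hello  (the original is untouched)

shouted = greeting.upper()  # capture the new string in a variable
print(shouted)     # HELLO

# Trying to change a character in place is not allowed
try:
    greeting[0] = "H"
except TypeError as error:
    message = str(error)
    print(message)
```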
# All string methods return a NEW string; they don't change the original
text = "  Hello, World!  "

# ── CASE METHODS ──────────────────────────────────────────────
print(text.upper())       # "  HELLO, WORLD!  " (all uppercase)
print(text.lower())       # "  hello, world!  " (all lowercase)
print(text.title())       # "  Hello, World!  " (first letter of each word capitalized)
print(text.capitalize())  # "  hello, world!  " (uppercases only the very first character,
                          #  which is a space here, and lowercases everything else)
print(text.swapcase())    # "  hELLO, wORLD!  " (UPPER→lower, lower→UPPER)

# ── WHITESPACE METHODS ────────────────────────────────────────
print(text.strip())       # "Hello, World!" (removes leading/trailing whitespace)
print(text.lstrip())      # "Hello, World!  " (removes only LEFT whitespace)
print(text.rstrip())      # "  Hello, World!" (removes only RIGHT whitespace)

# ── SEARCH METHODS ────────────────────────────────────────────
sentence = "Python is great. Python is fun."
print(sentence.find("Python"))    # 0: index of FIRST occurrence
print(sentence.rfind("Python"))   # 17: index of LAST occurrence
print(sentence.find("Java"))      # -1: not found returns -1
print(sentence.count("Python"))   # 2: how many times "Python" appears

# in keyword: check if substring exists (returns True/False)
print("Python" in sentence)       # True
print("Java" in sentence)         # False

# startswith and endswith
print(sentence.startswith("Python"))  # True
print(sentence.endswith("fun."))      # True

# ── REPLACE METHOD ────────────────────────────────────────────
new = sentence.replace("Python", "Java")       # replace ALL occurrences
print(new)    # Java is great. Java is fun.
new2 = sentence.replace("Python", "Ruby", 1)   # replace only FIRST occurrence
print(new2)   # Ruby is great. Python is fun.

# ── SPLIT AND JOIN ────────────────────────────────────────────
csv_line = "Alice,25,Engineer,New York"

# split(): break a string into a LIST of substrings
parts = csv_line.split(",")   # split at every comma
print(parts)                  # ['Alice', '25', 'Engineer', 'New York']

# split() with no argument splits on ANY run of whitespace
words = "Hello   World  Python".split()
print(words)                  # ['Hello', 'World', 'Python']

# join(): the REVERSE of split, combining a list into a string
fruits = ["apple", "banana", "cherry"]
print(", ".join(fruits))      # "apple, banana, cherry"
print(" | ".join(fruits))     # "apple | banana | cherry"
print("".join(fruits))        # "applebananacherry"

# ── CHECK METHODS ─────────────────────────────────────────────
print("hello".isalpha())      # True: only letters?
print("123".isdigit())        # True: only digits?
print("abc123".isalnum())     # True: only letters and digits?
print("   ".isspace())        # True: only whitespace?
print("HELLO".isupper())      # True: all uppercase?
print("hello".islower())      # True: all lowercase?

# ── PADDING ───────────────────────────────────────────────────
print("5".zfill(3))            # "005": pad with zeros on left
print("hello".center(11))      # "   hello   ": center in given width
print("hello".ljust(10, "-"))  # "hello-----": pad right with dashes
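Because every string method returns a new string, calls can be chained, and the original value never changes. The snippet below is a minimal sketch of that idea; the sample text and variable names are made up for illustration.

```python
raw = "  hELLo, wORLD!  "      # hypothetical messy input

# Each method returns a new string, so calls can be chained left to right
cleaned = raw.strip().lower().replace("world", "Python")
print(cleaned)   # hello, Python!

# Calling a method without capturing the result changes nothing
raw.upper()
print(repr(raw))  # '  hELLo, wORLD!  ' (unchanged)
```

Reading the chain left to right mirrors the order in which the transformations are applied, which tends to be easier to follow than a series of temporary variables.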
Multi-line Strings & Raw Strings Beginner
# ── MULTI-LINE STRINGS ────────────────────────────────────────
# Triple quotes let you write strings that span multiple lines
# Can use ''' or """
poem = """Roses are red,
Violets are blue,
Python is awesome,
And so are you."""
print(poem)
# Output:
# Roses are red,
# Violets are blue,
# Python is awesome,
# And so are you.

# Triple-quoted strings preserve ALL whitespace including newlines
sql_query = """
    SELECT name, age, email
    FROM users
    WHERE age > 18
    ORDER BY name
"""
# great for embedding SQL, HTML, JSON templates in your code

# ── RAW STRINGS ───────────────────────────────────────────────
# A raw string treats backslashes (\) as literal characters
# Prefix the string with r or R

# The problem: backslash has special meaning in normal strings
normal = "C:\new_folder\test.txt"   # \n is "newline", \t is "tab"!
print(normal)
# Output:
# C:
# ew_folder    est.txt   <-- NOT what we wanted!

# The solution: raw strings (prefix with r)
raw_path = r"C:\new_folder\test.txt"   # backslashes are literal
print(raw_path)
# Output: C:\new_folder\test.txt   <-- correct!

# Raw strings are ESSENTIAL when working with:
# 1. Windows file paths
# 2. Regular expressions (regex patterns, very common!)
import re
pattern = r"\d{3}-\d{4}"
# Without the r prefix, "\d" is not a valid string escape: Python keeps the
# backslash but emits a warning (a SyntaxWarning since Python 3.12).
# With the r prefix, the regex engine receives \d unchanged, meaning "a digit".
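To see the raw-string pattern doing real work, here is a small sketch that uses it with re.findall. The sample sentence is invented for illustration.

```python
import re

# Hypothetical sample text for illustration
text = "Call 555-1234 or 555-9876 before noon."

# Raw string: the backslashes reach the regex engine unchanged
pattern = r"\d{3}-\d{4}"

matches = re.findall(pattern, text)
print(matches)            # ['555-1234', '555-9876']

# A raw string and a manually escaped normal string hold the same characters
print(r"\d" == "\\d")     # True
```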
Escape Characters Beginner
Escape characters are special two-character sequences that represent characters you can't easily type. They always start with a backslash \.
| Escape Sequence | What It Represents | Example |
|---|---|---|
| \n | New line | print("Hello\nWorld") prints on two lines |
| \t | Tab character | print("Name:\tAlice") |
| \\ | Literal backslash | print("C:\\Users") → C:\Users |
| \' | Single quote | print('It\'s fine') |
| \" | Double quote | print("She said \"hi\"") |
| \r | Carriage return | Used in Windows line endings (\r\n) |
| \0 | Null character | Rarely used directly |
# Escape characters in action print("Line 1\nLine 2\nLine 3") # Output: # Line 1 # Line 2 # Line 3 print("Name:\tAlice") # Output: Name: Alice print("Say \"hello\"") # Output: Say "hello" print('It\'s raining') # Output: It's raining print("C:\\Users\\Alice") # Output: C:\Users\Alice # Unicode characters — use \u followed by 4 hex digits print("\u2764") # Output: ❤ (heart symbol) print("\U0001F600") # Output: 😀 (emoji!)
Part IV — Operators & Expressions
Operators are the symbols that let you do things with data — math, comparisons, logic. Understanding them deeply lets you write conditions and calculations with precision and confidence.
Arithmetic Operators Beginner
| Operator | Name | Example | Result |
|---|---|---|---|
| + | Addition | 10 + 3 | 13 |
| - | Subtraction | 10 - 3 | 7 |
| * | Multiplication | 10 * 3 | 30 |
| / | Division (always float) | 10 / 3 | 3.333... |
| // | Floor division (rounds down) | 10 // 3 | 3 |
| % | Modulo (remainder) | 10 % 3 | 1 |
| ** | Exponent / Power | 2 ** 10 | 1024 |
| -x | Negation (unary) | -x where x = 5 | -5 |
# Practical uses of the less-obvious operators # Floor division (//) — get whole number result, throw away remainder minutes = 137 hours = minutes // 60 # How many full hours? 2 remaining = minutes % 60 # How many leftover minutes? 17 print(f"{minutes} minutes = {hours}h {remaining}m") # 137 minutes = 2h 17m # Modulo (%) — incredibly useful! # Check if number is even (even numbers have no remainder when divided by 2) for n in [1, 2, 3, 4, 5]: if n % 2 == 0: print(f"{n} is even") else: print(f"{n} is odd") # Exponent (**) — powers and roots print(2**8) # 256 (2 to the 8th power) print(9**0.5) # 3.0 (square root of 9 = 9^0.5) print(8**(1/3)) # 2.0 (cube root of 8) # Built-in math functions (no import needed) print(abs(-42)) # 42 — absolute value print(max(3, 7, 1)) # 7 — largest value print(min(3, 7, 1)) # 1 — smallest value print(round(3.7)) # 4 — round to nearest integer print(sum([1, 2, 3])) # 6 — sum a list of numbers print(pow(2, 10)) # 1024 — same as 2**10
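The examples above use only positive numbers. As a short sketch of the edge cases, the snippet below shows how // and % behave with negative or float operands, and how the built-in divmod() combines both operations.

```python
# Floor division rounds DOWN (toward negative infinity), not toward zero
print(7 // 2)     # 3
print(-7 // 2)    # -4, because -3.5 rounds down to -4

# The remainder takes the sign of the divisor (the right-hand side)
print(7 % 2)      # 1
print(-7 % 2)     # 1, since -7 == (-4 * 2) + 1

# divmod() returns the quotient and the remainder together
hours, remaining = divmod(137, 60)
print(hours, remaining)   # 2 17

# Floor division with a float operand returns a float
print(7.0 // 2)   # 3.0
```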
Comparison Operators Beginner
Comparison operators compare two values and always return a Boolean (True or False). They're the foundation of all decision-making in code.
a = 10 b = 5 print(a == b) # Equal to: False (10 is not equal to 5) print(a != b) # Not equal to: True (they are different) print(a > b) # Greater than: True (10 > 5) print(a < b) # Less than: False (10 is not less than 5) print(a >= b) # Greater than or equal: True (10 >= 5) print(a <= b) # Less than or equal: False (10 is not <= 5) # Python allows CHAINED comparisons — very readable! age = 25 print(18 <= age <= 65) # True — is age between 18 and 65? # This is equivalent to: (18 <= age) and (age <= 65) # == vs is — a crucial distinction! # == checks if VALUES are equal # is checks if they're the SAME OBJECT in memory list1 = [1, 2, 3] list2 = [1, 2, 3] print(list1 == list2) # True — same values print(list1 is list2) # False — different objects in memory # Use 'is' ONLY for None, True, False comparisons: value = None print(value is None) # ✅ Correct way to check for None print(value == None) # Works, but is considered bad style # Strings comparison — alphabetical/lexicographic print("apple" < "banana") # True — 'a' comes before 'b' print("Z" < "a") # True — uppercase letters come before lowercase in ASCII
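One place where == often surprises people is floating-point arithmetic. The sketch below shows the classic case and one common workaround, math.isclose from the standard library.

```python
import math

total = 0.1 + 0.2
print(total)                      # 0.30000000000000004
print(total == 0.3)               # False: tiny rounding error in binary floats

# math.isclose compares within a small tolerance instead of exactly
print(math.isclose(total, 0.3))   # True
```

For whole numbers and strings, == remains the right tool; the tolerance-based comparison is only needed for floats.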
Logical Operators — and, or, not Beginner
Logical operators combine multiple conditions. They work on boolean values (or values that can be treated as boolean).
age = 25
has_ticket = True
is_vip = False

# ── AND: both conditions must be True ─────────────────────────
print(age >= 18 and has_ticket)   # True AND True = True
print(age >= 18 and is_vip)       # True AND False = False
# AND truth table: T+T=T, T+F=F, F+T=F, F+F=F

# ── OR: at least one condition must be True ────────────────────
print(has_ticket or is_vip)       # True OR False = True
print(False or False)             # False OR False = False
# OR truth table: T+T=T, T+F=T, F+T=T, F+F=F

# ── NOT: reverses a boolean ────────────────────────────────────
print(not is_vip)                 # not False = True
print(not has_ticket)             # not True = False

# Practical example: entrance to an event
if (age >= 18 and has_ticket) or is_vip:
    print("Welcome in!")
else:
    print("Sorry, you can't enter.")

# 'not in' and 'is not': clean negative checks
fruits = ["apple", "banana"]
print("mango" not in fruits)      # True: mango is NOT in the list
value = None
print(value is not None)          # False: it IS None
Assignment & Augmented Assignment Operators Beginner
# Basic assignment
x = 10    # assign 10 to x

# ── AUGMENTED ASSIGNMENT: shorthand for x = x OPERATOR value ──
# Instead of writing x = x + 5, write x += 5
x += 5    # same as: x = x + 5   → x is now 15
x -= 3    # same as: x = x - 3   → x is now 12
x *= 2    # same as: x = x * 2   → x is now 24
x /= 4    # same as: x = x / 4   → x is now 6.0
x //= 2   # same as: x = x // 2  → x is now 3.0
x **= 3   # same as: x = x ** 3  → x is now 27.0
x %= 5    # same as: x = x % 5   → x is now 2.0
print(x)  # 2.0

# String and list augmented assignment works too
name = "Hello"
name += " World"    # name = name + " World" → "Hello World"
items = [1, 2]
items += [3, 4]     # extends the existing list in place → [1, 2, 3, 4]

# Walrus operator := (Python 3.8+): assign AND use in one step
# Useful in while loops and comprehensions
import random
while (n := random.randint(1, 10)) != 5:   # assign n, then check if != 5
    print(f"Got {n}, trying again...")
print(f"Finally got {n}!")
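For numbers and strings, x += y and x = x + y give the same result. For lists there is a subtle difference worth seeing once: += changes the existing list, while + builds a new one. The variable names below are illustrative only.

```python
a = [1, 2]
alias = a          # both names refer to the same list
a += [3]           # in place: behaves like a.extend([3])
print(alias)       # [1, 2, 3]  (the alias sees the change)

b = [1, 2]
other = b
b = b + [3]        # builds a new list and rebinds b
print(other)       # [1, 2]     (the old list is untouched)
print(b)           # [1, 2, 3]
```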
Operator Precedence Intermediate
When multiple operators appear in one expression, Python follows rules about which ones to evaluate first — just like math's order of operations (remember PEMDAS?).
| Priority | Operators | Description |
|---|---|---|
| 1 (highest) | () | Parentheses |
| 2 | ** | Exponentiation |
| 3 | +x, -x, ~x | Unary (negation) |
| 4 | *, /, //, % | Multiplication, division, modulo |
| 5 | +, - | Addition, subtraction |
| 6 | ==, !=, <, >, <=, >=, is, is not, in, not in | Comparison, identity, membership |
| 7 | not | Logical NOT |
| 8 | and | Logical AND |
| 9 (lowest) | or | Logical OR |
# Python follows standard math order of operations
print(2 + 3 * 4)      # 14, not 20! (* before +)
print((2 + 3) * 4)    # 20: parentheses override
print(2 ** 3 ** 2)    # 512: ** is RIGHT associative: 2^(3^2) = 2^9
print(10 - 3 - 2)     # 5: left to right: (10-3)-2

# ADVICE: When in doubt, use parentheses!
# This is clearer than relying on everyone knowing precedence rules.
# (Sample values so the expression below runs on its own.)
price, quantity = 20.0, 3
shipping_fee, discount_rate = 5.0, 0.25
result = (price * quantity) + (shipping_fee * (1 - discount_rate))
print(result)         # 63.75
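A few expressions where the precedence table gives a result that people often do not expect are sketched below. Each one follows directly from the priorities listed above.

```python
# ** binds tighter than unary minus, so this is -(2 ** 2)
print(-2 ** 2)      # -4
print((-2) ** 2)    # 4

# Comparisons bind tighter than 'not', so this is not (1 == 2)
print(not 1 == 2)   # True

# 'and' binds tighter than 'or'
print(True or False and False)     # True:  True or (False and False)
print((True or False) and False)   # False
```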
Short-Circuit Evaluation Intermediate
Python is smart about evaluating and and or — it stops as soon as it knows the answer. This is called short-circuit evaluation and it's a crucial concept for writing safe code.
With and: if the LEFT side is falsy, Python skips the right side entirely. A false first condition makes the whole and expression false, so there is no need to check further.
With or: if the LEFT side is truthy, Python skips the right side entirely. A true first condition makes the whole or expression true, so there is no need to check further.
# Short-circuit with AND — prevents errors! name = None # Without short-circuit, this would crash: # print(len(name) > 0) ← TypeError: object of type 'NoneType' has no len() # With short-circuit, the right side is never evaluated if left is False: if name is not None and len(name) > 0: print(f"Name: {name}") else: print("No name provided") # Because name IS None, Python stops at 'name is not None' (False) # It never tries to call len(None) — so no crash! # Short-circuit with OR — provide default values user_name = "" # empty string — falsy display_name = user_name or "Anonymous" # if user_name is falsy, use "Anonymous" print(display_name) # Output: Anonymous user_name = "Alice" # truthy display_name = user_name or "Anonymous" # user_name is truthy, stop there print(display_name) # Output: Alice # This "or default" pattern is very common in Python! # It's a clean way to set fallback values config_value = None timeout = config_value or 30 # if no config, default to 30 seconds
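The "or default" pattern works because and and or return one of their operands rather than a strict True or False. The sketch below shows that behaviour, along with one pitfall: a legitimate value of 0 is falsy and gets replaced. The name configured is hypothetical.

```python
# 'and' / 'or' return one of their operands, not necessarily True/False
print("" or "Anonymous")        # Anonymous (left is falsy, right is returned)
print("Alice" or "Anonymous")   # Alice     (left is truthy, evaluation stops)
print("Alice" and "Bob")        # Bob       (left is truthy, right is returned)
print(None and "Bob")           # None      (left is falsy, evaluation stops)

# Pitfall: 0 is falsy, so a deliberate setting of 0 is overwritten
configured = 0
timeout = configured or 30
print(timeout)                  # 30, probably not what was intended

# Checking for None explicitly keeps the 0
timeout = 30 if configured is None else configured
print(timeout)                  # 0
```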
Part V — Control Flow
Control flow is how you make programs make decisions. Without it, code would just run top to bottom every time. With it, your programs can react to conditions and take different paths.
if / elif / else Beginner
Think of if/elif/else as a flowchart. "IF it's raining, take an umbrella. ELSE IF it's sunny, wear sunscreen. ELSE, just go normally." Python evaluates conditions top to bottom, takes the first path that's true, then skips everything else.
# ── BASIC IF STATEMENT ───────────────────────────────────────── temperature = 35 if temperature > 30: # condition — must be True or False print("It's hot outside!") # only runs if condition is True print("Drink water.") # still part of this if block (indented) # back to indent level 0 — this always runs: print("Done checking temperature.") # ── IF / ELSE ───────────────────────────────────────────────── age = 16 if age >= 18: print("You can vote!") # runs if age >= 18 else: print("Too young to vote.") # runs if age < 18 # Exactly ONE of these will always run # ── IF / ELIF / ELSE ────────────────────────────────────────── # elif = "else if" — check another condition if the previous was False score = 82 if score >= 90: grade = "A" # only if score is 90 or higher elif score >= 80: grade = "B" # only if score is 80-89 elif score >= 70: grade = "C" # only if score is 70-79 elif score >= 60: grade = "D" # only if score is 60-69 else: grade = "F" # if none of the above matched (score < 60) print(f"Your grade: {grade}") # Output: Your grade: B # ── TERNARY / CONDITIONAL EXPRESSION ───────────────────────── # A one-line way to write a simple if/else # Format: value_if_true if condition else value_if_false status = "adult" if age >= 18 else "minor" print(status) # Output: minor (because age is 16) # This is equivalent to: if age >= 18: status = "adult" else: status = "minor" # Only use the ternary for simple cases — readability matters!
Truthy & Falsy Values Beginner
Python conditions don't need to be literally True or False. Any value can be used in an if statement. Python automatically converts it to boolean. Values that convert to False are called falsy; everything else is truthy.
# FALSY values — these all behave like False in conditions: # False, None, 0, 0.0, "", '', [], {}, set(), () # TRUTHY values — everything else: # True, any non-zero number, any non-empty string/list/dict/set # Practical examples: name = "" # empty string — falsy if name: # if name is truthy (non-empty)... print(f"Hello, {name}!") else: print("No name entered.") # This runs — empty string is falsy items = [] # empty list — falsy if items: # if list is non-empty... print(f"You have {len(items)} items") else: print("Cart is empty.") # This runs # This is MUCH cleaner than: if len(items) > 0: # Both work, but 'if items:' is more Pythonic count = 0 # zero — falsy if count: print("Count is non-zero") else: print("Count is zero.") # This runs # Falsy check: None result = None if result is None: # explicit None check — preferred print("No result yet.") # Alternatively: if not result: — but this would also catch 0, "", etc. # Be explicit about None unless you want to catch all falsy values
Nested Conditions Beginner
# Nested if — an if statement inside another if statement age = 20 has_id = True if age >= 18: # outer if print("You're old enough.") if has_id: # inner if (only reached if outer is True) print("ID verified. Welcome!") else: # inner else print("Need to see your ID.") else: print("Sorry, you must be 18+.") # Often, nested ifs can be flattened with 'and' — more readable! # BAD (too nested): if age >= 18: if has_id: print("Welcome!") # GOOD (flat with 'and'): if age >= 18 and has_id: print("Welcome!") # "Guard clauses" — return early to avoid deep nesting def process_order(order): # Instead of deeply nested ifs, check failure conditions first and return if order is None: return "No order provided" # early return! if not order.get("items"): return "Order has no items" # early return! if order.get("total", 0) <= 0: return "Invalid total" # early return! # If we reach here, the order is valid — process it return "Order processed successfully"
match / case — Structural Pattern Matching 🐍 3.10+ Intermediate
Python 3.10 introduced structural pattern matching via match/case. It's similar to switch/case in other languages but far more powerful — it can match against values, types, structures, and more.
# Basic value matching — like a cleaner if/elif chain command = "quit" match command: case "quit": print("Exiting...") case "help": print("Showing help...") case "start": print("Starting...") case _: # _ is the "wildcard" — matches anything else print(f"Unknown command: {command}") # Multiple values in one case using | day = "Saturday" match day: case "Saturday" | "Sunday": print("Weekend!") case _: print("Weekday") # Matching with guards (conditions) point = (3, 7) match point: case (0, 0): print("Origin") case (x, 0): print(f"On x-axis at {x}") # x is captured from the match case (0, y): print(f"On y-axis at {y}") case (x, y) if x == y: # guard condition print(f"On diagonal at {x}") case (x, y): print(f"At coordinates ({x}, {y})") # this one matches (3, 7)
Common Beginner Mistakes Beginner
# ❌ MISTAKE 1: Using = instead of == x = 5 # if x = 10: ← SyntaxError! = is assignment, == is comparison if x == 10: # ✅ correct print("ten") # ❌ MISTAKE 2: Forgetting the colon after the condition # if x > 3 ← SyntaxError! Missing colon if x > 3: # ✅ colon required print("big") # ❌ MISTAKE 3: Wrong indentation # if x > 3: # print("big") ← IndentationError! Must be indented if x > 3: print("big") # ✅ indented 4 spaces # ❌ MISTAKE 4: Comparing with 'is' instead of == num = 1000 # if num is 1000: ← Works sometimes but WRONG conceptually # 'is' tests identity (same object), not equality (same value) if num == 1000: # ✅ correct for comparing values print("one thousand") # ❌ MISTAKE 5: Thinking elif is checked even when if was True # Python takes the FIRST matching branch and skips all others score = 95 if score >= 60: print("Passing") # This runs elif score >= 90: print("Excellent") # This NEVER runs! The if above already matched! # Put more specific conditions FIRST
Part VI — Loops
Loops are what make computers powerful. Instead of writing the same instruction 1000 times, you write it once and tell Python to repeat it. Loops are how you process lists, iterate over files, and automate repetitive tasks.
for Loops Beginner
A for loop is like telling someone to "do this for each item in this list." Imagine you have a grocery list: "For each item on the grocery list, add it to the cart." You don't write separate instructions for milk, eggs, and bread — you write the instruction once and it applies to all items.
# Basic for loop — iterate over a list # Format: for variable in iterable: # 'variable' takes on each value from the iterable, one at a time fruits = ["apple", "banana", "cherry"] for fruit in fruits: # fruit = "apple", then "banana", then "cherry" print(f"I like {fruit}") # runs once for each item # Output: # I like apple # I like banana # I like cherry # You can loop over ANYTHING iterable — strings too! for letter in "Python": print(letter, end=" ") # prints each letter separated by space # Output: P y t h o n print() # new line after the loop # Loop over a dictionary person = {"name": "Alice", "age": 25, "city": "NYC"} for key in person: # iterates over KEYS by default print(f"{key}: {person[key]}") for key, value in person.items(): # .items() gives (key, value) pairs print(f"{key} = {value}") # Accumulating a result — running total numbers = [1, 2, 3, 4, 5] total = 0 # start with 0 for num in numbers: total += num # add each number to total print(f"Added {num}, total is now {total}") print(f"Final total: {total}") # Output: 15
while Loops Beginner
A while loop keeps running as long as a condition is True. Use it when you don't know in advance how many times you need to loop.
# Basic while loop count = 0 while count < 5: # keep looping while count is less than 5 print(f"Count is: {count}") count += 1 # CRITICAL: must update the condition variable! # Without this, loop runs forever! # Output: Count is: 0, 1, 2, 3, 4 # While loops are great for user input validation while True: # infinite loop — but we'll break out of it user_input = input("Enter a number between 1-10: ") if user_input.isdigit(): # check it's actually a number number = int(user_input) if 1 <= number <= 10: # check it's in range break # exit the loop — we have valid input! print("Invalid! Try again.") # only prints if we didn't break print(f"You chose: {number}") # Countdown example countdown = 10 while countdown > 0: print(countdown, end=" ") countdown -= 1 print("Blastoff! 🚀")
If you forget to update the condition variable in a while loop, it will run forever, freezing your program. If this happens, press Ctrl+C in the terminal to force-stop the program. The most common cause: forgetting count += 1 inside the loop, or accidentally resetting the counter at the end.
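One defensive habit, sketched below, is to add a safety cap so a loop cannot run forever even if its main condition never becomes false. The limit of 5 and the variable names are arbitrary choices for this example.

```python
max_attempts = 5          # hypothetical safety limit
attempts = 0
value = 1

while value < 1000:       # main condition
    value *= 2            # progress toward ending the loop
    attempts += 1
    if attempts >= max_attempts:
        print("Giving up after", attempts, "attempts")
        break             # safety exit

print(value, attempts)    # 32 5
```

Here the cap triggers before the main condition does, which is exactly the situation it is meant to catch.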
range() — Generating Number Sequences Beginner
# range() generates a sequence of integers # range(stop) — 0 up to (not including) stop # range(start, stop) — start up to (not including) stop # range(start, stop, step) — start to stop, counting by step # Count from 0 to 4 for i in range(5): # range(5) generates: 0, 1, 2, 3, 4 print(i, end=" ") # Output: 0 1 2 3 4 print() # Count from 1 to 10 for i in range(1, 11): # range(1, 11) generates: 1, 2, ..., 10 print(i, end=" ") print() # Count by 2 (even numbers) for i in range(0, 20, 2): # step of 2: 0, 2, 4, 6, ..., 18 print(i, end=" ") print() # Count backwards for i in range(10, 0, -1): # step of -1: 10, 9, 8, ..., 1 print(i, end=" ") print() # Using range to access list items by index colors = ["red", "green", "blue"] for i in range(len(colors)): # range(3) = 0, 1, 2 print(f"Color {i}: {colors[i]}") # Run something N times for _ in range(3): # _ means "I don't care about the variable value" print("This repeats 3 times") # range() is LAZY — it doesn't create the full list in memory # range(1_000_000) uses almost no memory — it generates numbers one at a time big = range(1_000_000) # instant! no big list created print(list(range(5))) # convert to list if you need to see it: [0, 1, 2, 3, 4]
enumerate() & zip() Intermediate
# ── ENUMERATE — loop with index AND value ───────────────────── # Problem: you want to know BOTH the position and value of each item fruits = ["apple", "banana", "cherry"] # The old, clunky way: for i in range(len(fruits)): print(f"{i}: {fruits[i]}") # The Pythonic way with enumerate(): for index, fruit in enumerate(fruits): # gives (index, value) pairs print(f"{index}: {fruit}") # Output: # 0: apple # 1: banana # 2: cherry # Start counting from 1 instead of 0 for num, fruit in enumerate(fruits, start=1): print(f"{num}. {fruit}") # Output: 1. apple, 2. banana, 3. cherry # ── ZIP — loop over multiple iterables simultaneously ────────── # Pairs up elements from multiple iterables names = ["Alice", "Bob", "Charlie"] scores = [95, 87, 92] grades = ["A", "B", "A"] for name, score, grade in zip(names, scores, grades): print(f"{name}: {score} ({grade})") # Output: # Alice: 95 (A) # Bob: 87 (B) # Charlie: 92 (A) # zip() stops at the SHORTEST iterable a = [1, 2, 3] b = ["x", "y"] # shorter! print(list(zip(a, b))) # [(1, 'x'), (2, 'y')] — 3 is dropped! # Create a dictionary from two parallel lists keys = ["name", "age", "city"] values = ["Alice", 25, "NYC"] person = dict(zip(keys, values)) print(person) # {'name': 'Alice', 'age': 25, 'city': 'NYC'}
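When silently dropping the extra items is not acceptable, the standard library offers itertools.zip_longest, which pads the shorter iterable instead. A small sketch:

```python
from itertools import zip_longest

a = [1, 2, 3]
b = ["x", "y"]

print(list(zip(a, b)))                         # [(1, 'x'), (2, 'y')]
print(list(zip_longest(a, b)))                 # [(1, 'x'), (2, 'y'), (3, None)]
print(list(zip_longest(a, b, fillvalue="?")))  # [(1, 'x'), (2, 'y'), (3, '?')]
```

On Python 3.10 and newer, zip(a, b, strict=True) is another option: it raises a ValueError when the lengths differ instead of padding or truncating.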
break, continue, pass Beginner
# ── BREAK — exit the loop immediately ───────────────────────── for i in range(10): if i == 5: break # stop the loop when i is 5 print(i, end=" ") # prints: 0 1 2 3 4 print("Loop ended") # this runs after break # Practical: find first match and stop names = ["Alice", "Bob", "Charlie", "Dave"] search = "Charlie" for name in names: if name == search: print(f"Found {search}!") break # no need to keep looking once found # ── CONTINUE — skip this iteration, go to next ──────────────── for i in range(10): if i % 2 == 0: continue # skip even numbers — go to next iteration print(i, end=" ") # prints only odd: 1 3 5 7 9 # Practical: skip invalid items data = [1, None, 3, None, 5] total = 0 for item in data: if item is None: continue # skip None values total += item # only add actual numbers print(total) # Output: 9 # ── PASS — do nothing (placeholder) ─────────────────────────── # pass is used when syntax requires a block but you don't want to do anything yet for i in range(5): pass # loop runs 5 times but does absolutely nothing # Common use: placeholder while building code def function_ill_write_later(): pass # without pass, Python requires something here — SyntaxError class MyClass: pass # empty class placeholder
else on Loops — Python's Secret Feature Intermediate
Python has a unique feature almost no other language has: you can put an else block on a loop. The else runs when the loop completes normally (without hitting a break).
# Loop else: runs if the loop finished WITHOUT a break # Example: search for an item items = ["apple", "banana", "cherry"] target = "mango" for item in items: if item == target: print(f"Found {target}!") break # break exits — else does NOT run else: print(f"{target} not found.") # runs because no break occurred # Output: mango not found. # Without this feature, you'd need a flag variable: found = False # messy way for item in items: if item == target: found = True break if not found: print("Not found.") # the else clause is cleaner! # While loop else works the same way n = 2 while n < 100: if n % 7 == 0: print(f"Found multiple of 7: {n}") break n += 1 else: print("No multiples of 7 found below 100") # won't print — 7 exists!
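A classic use of the loop else clause is a prime check: the else branch runs only when no divisor triggered a break. The is_prime helper below is a simple illustrative sketch, not an optimized implementation.

```python
def is_prime(n):
    if n < 2:
        return False
    for divisor in range(2, int(n ** 0.5) + 1):
        if n % divisor == 0:
            break            # found a divisor: not prime
    else:
        return True          # loop finished without break: prime
    return False

primes = [n for n in range(20) if is_prime(n)]
print(primes)   # [2, 3, 5, 7, 11, 13, 17, 19]
```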
Nested Loops Beginner
# A loop inside a loop = nested loop # The inner loop runs COMPLETELY for each single iteration of the outer loop # Multiplication table for i in range(1, 4): # outer loop: rows 1, 2, 3 for j in range(1, 4): # inner loop: columns 1, 2, 3 print(f"{i}x{j}={i*j:2}", end=" ") print() # new line after each row # Output: # 1x1= 1 1x2= 2 1x3= 3 # 2x1= 2 2x2= 4 2x3= 6 # 3x1= 3 3x2= 6 3x3= 9 # Nested loops with a list of lists (2D data) matrix = [ [1, 2, 3], # row 0 [4, 5, 6], # row 1 [7, 8, 9] # row 2 ] for row in matrix: # each row is a list for cell in row: # each cell is a number print(cell, end=" ") print() # Output: 1 2 3 / 4 5 6 / 7 8 9 # PERFORMANCE WARNING: nested loops multiply complexity # One loop of 1000: 1,000 operations # Two nested loops of 1000: 1,000,000 operations! # Three nested loops: 1,000,000,000 — very slow! # Think carefully before nesting more than 2 levels deep
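Many nested loops exist only to answer "is this item also in that other collection?". As a rough sketch, converting one collection to a set lets a single loop do the same job. The sample data is generated just for this example.

```python
list_a = list(range(0, 2000, 2))   # hypothetical data: even numbers
list_b = list(range(0, 2000, 3))   # hypothetical data: multiples of 3

# Nested loops: every item of list_a is compared with every item of list_b
common_slow = []
for a in list_a:
    for b in list_b:
        if a == b:
            common_slow.append(a)

# A set gives fast membership tests, so one loop is enough
set_b = set(list_b)
common_fast = []
for a in list_a:
    if a in set_b:
        common_fast.append(a)

print(common_slow == common_fast)   # True
print(len(common_fast))             # 334 (the multiples of 6 below 2000)
```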
Part VII — Functions
Functions are the building blocks of organized, reusable code. Instead of writing the same logic 10 times, you write it once as a function and call it whenever you need it.
Defining Functions Beginner
Think of a function like a recipe. It has a name ("Chocolate Cake"), ingredients (inputs/parameters), and step-by-step instructions. Once written, you can "make the cake" any time by calling the function — without rewriting any instructions.
# Define a function with 'def' keyword # Format: def function_name(parameters): def say_hello(): """Print a greeting.""" # docstring — describes what function does print("Hello, World!") # CALL the function (execute it) say_hello() # Output: Hello, World! say_hello() # Call as many times as needed # Function with a parameter def greet(name): # 'name' is a parameter (placeholder) print(f"Hello, {name}!") greet("Alice") # 'Alice' is the argument greet("Bob") # Function that returns a value def square(n): return n ** 2 # 'return' sends a value back to the caller result = square(5) # result = 25 print(result) # Multiple parameters def rectangle_area(width, height): """Calculate area of a rectangle.""" return width * height print(rectangle_area(5, 3)) # 15 — positional: order matters print(rectangle_area(height=4, width=6)) # 24 — keyword: order doesn't matter
Default Parameter Values Beginner
# Default values: used when caller doesn't provide that argument def greet(name, greeting="Hello"): print(f"{greeting}, {name}!") greet("Alice") # uses default: Hello, Alice! greet("Bob", "Hi") # overrides: Hi, Bob! greet("Charlie", greeting="Hey") # keyword: Hey, Charlie! # ⚠️ GOTCHA: Never use mutable objects (lists, dicts) as defaults! def add_item_BAD(item, my_list=[]): # [] is shared across ALL calls! my_list.append(item) return my_list print(add_item_BAD("a")) # ['a'] — seems fine print(add_item_BAD("b")) # ['a', 'b'] — WRONG! 'a' persists! # Correct pattern: use None and create fresh inside def add_item_GOOD(item, my_list=None): if my_list is None: my_list = [] # fresh list every time my_list.append(item) return my_list
# ❌ Buggy: the default list is created once and shared by every call
def add_tag(item, tags=[]):
    tags.append(item)
    return tags

print(add_tag("python"))  # ['python']
print(add_tag("code"))    # ['python', 'code'] 😱

# ✅ Fixed: use None as the default and build a fresh list inside
def add_tag(item, tags=None):
    if tags is None:
        tags = []
    tags.append(item)
    return tags

print(add_tag("python"))  # ['python']
print(add_tag("code"))    # ['code'] ✓
Default arguments are evaluated once when the function is defined, not each time it's called. Using a mutable object like [] or {} as a default means all calls share the same object. Always use None as a sentinel and create the mutable object inside the function body.
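The shared default can be observed directly: Python stores default values on the function object in its __defaults__ attribute. The sketch below deliberately uses the buggy signature to make the shared list visible.

```python
def add_tag(item, tags=[]):      # buggy on purpose, to inspect the default
    tags.append(item)
    return tags

print(add_tag.__defaults__)      # ([],) the one list created at definition time
add_tag("python")
add_tag("code")
print(add_tag.__defaults__)      # (['python', 'code'],) same list, now mutated

# Passing an explicit list sidesteps the shared default entirely
print(add_tag("solo", []))       # ['solo']
```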
*args and **kwargs Intermediate
# *args: collect unlimited POSITIONAL args into a TUPLE def add_all(*args): print(f"args: {args}") # it's a tuple return sum(args) print(add_all(1, 2)) # args=(1,2) → 3 print(add_all(1, 2, 3, 4, 5)) # args=(1,2,3,4,5) → 15 # **kwargs: collect unlimited KEYWORD args into a DICT def print_info(**kwargs): print(f"kwargs: {kwargs}") # it's a dict for key, val in kwargs.items(): print(f" {key} = {val}") print_info(name="Alice", age=25, city="NYC") # Combining all types — must be in this EXACT order: # def f(positional, *args, keyword_only, **kwargs) def full_example(required, *args, keyword_only="default", **kwargs): print(required, args, keyword_only, kwargs) full_example(1, 2, 3, keyword_only="hi", x=10) # Output: 1 (2, 3) hi {'x': 10} # Unpacking with * and ** when CALLING a function numbers = [1, 2, 3] print(*numbers) # same as: print(1, 2, 3) config = {"end": "!\n", "sep": "-"} print("a", "b", **config) # unpacks dict as keyword args
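A common practical use of *args and **kwargs together is a wrapper that accepts anything and forwards it unchanged to another function. The sketch below uses two hypothetical functions, describe_pet and log_call; the bare * in describe_pet marks vaccinated as keyword-only.

```python
def describe_pet(name, species="dog", *, vaccinated=False):   # hypothetical function
    status = "vaccinated" if vaccinated else "not vaccinated"
    return f"{name} the {species} ({status})"

def log_call(func, *args, **kwargs):
    """Print the call, then forward every argument unchanged."""
    print(f"Calling {func.__name__} with args={args} kwargs={kwargs}")
    return func(*args, **kwargs)   # unpack again when calling

result = log_call(describe_pet, "Rex", "cat", vaccinated=True)
print(result)   # Rex the cat (vaccinated)
```

Because log_call never names the wrapped function's parameters, it keeps working even if describe_pet later gains new arguments.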
Scope — Local vs Global Intermediate
Local scope: variables created INSIDE a function. They exist only while that function runs and are invisible outside it.
Global scope: variables created OUTSIDE any function. They are visible everywhere in the file. Functions can read them, but cannot reassign them without the global keyword.
company = "TechCorp" # global variable def show(): print(company) # ✅ can READ globals local_var = "local" # this only lives inside show() print(local_var) show() # print(local_var) ← NameError! Can't see inside the function # Assigning inside function creates LOCAL, doesn't touch global counter = 0 def bad_increment(): counter = 10 # creates LOCAL counter — global unchanged! bad_increment() print(counter) # still 0! # To modify a global, declare with 'global' def good_increment(): global counter counter += 1 # now modifies the actual global good_increment() print(counter) # 1 ✅ # Better approach: pass in, return out (avoid globals!) def increment(n): return n + 1 counter = increment(counter) # clean, testable, no hidden state
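A related error worth recognizing is UnboundLocalError. It appears when a function both reads and assigns a name that also exists globally, without declaring it global. The sketch below triggers it on purpose; broken_increment is an illustrative name.

```python
counter = 0

def broken_increment():
    # Assigning to 'counter' anywhere in the body makes it local for the
    # WHOLE function, so the read on the right-hand side finds no value yet.
    counter = counter + 1
    return counter

error_message = None
try:
    broken_increment()
except UnboundLocalError as error:
    error_message = str(error)
    print("UnboundLocalError:", error_message)

print(counter)   # 0, the global was never touched
```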
Docstrings Beginner
def calculate_bmi(weight_kg, height_m):
    """
    Calculate Body Mass Index (BMI).

    Args:
        weight_kg (float): Weight in kilograms
        height_m (float): Height in meters

    Returns:
        float: BMI value (weight / height^2)

    Example:
        >>> calculate_bmi(70, 1.75)
        22.86
    """
    return round(weight_kg / (height_m ** 2), 2)

# Access a function's docstring
print(calculate_bmi.__doc__)

# Or use the built-in help() function
help(calculate_bmi)           # prints nicely formatted docs

# Good docstrings answer:
# - WHAT does this function do?
# - WHAT does each parameter mean?
# - WHAT does it return?
# - Any caveats or examples
Lambda Functions Intermediate
A lambda is a small, anonymous (unnamed) function written in a single line. Use them for short, simple operations — especially when passing a function as an argument to another function.
# Format: lambda parameters: expression
# The expression is automatically returned; no 'return' keyword

# Regular function vs lambda: same result
def square_regular(x):
    return x ** 2

square_lambda = lambda x: x ** 2    # anonymous function stored in a variable

print(square_regular(5))    # 25
print(square_lambda(5))     # 25 (same result)

# Multiple parameters
add = lambda a, b: a + b
print(add(3, 4))            # 7

# The REAL power of lambdas: passing functions as arguments
# Sort a list of dicts by a specific key
students = [
    {"name": "Charlie", "grade": 85},
    {"name": "Alice", "grade": 92},
    {"name": "Bob", "grade": 78},
]

# sorted() takes a 'key' argument: a function that says "sort by THIS"
by_grade = sorted(students, key=lambda s: s["grade"])
for s in by_grade:
    print(f"{s['name']}: {s['grade']}")

# Sort by name length
words = ["banana", "apple", "kiwi", "strawberry"]
print(sorted(words, key=lambda w: len(w)))

# map() and filter() with lambdas
nums = [1, 2, 3, 4, 5]
squared = list(map(lambda x: x**2, nums))        # [1, 4, 9, 16, 25]
evens = list(filter(lambda x: x%2==0, nums))     # [2, 4]
print(squared, evens)

# NOTE: For complex logic, use a regular function; lambdas are for simplicity
Pure vs Impure Functions Intermediate
Pure function: given the same inputs, it ALWAYS returns the same output and has no side effects (it doesn't modify anything outside itself). Easy to test and reason about.
Impure function: it has side effects, meaning it modifies state outside itself (prints, writes files, changes global variables, modifies mutable arguments). Harder to test, but necessary.
# PURE function: same input, always same output, no side effects
def add(a, b):
    return a + b              # predictable, testable, no surprises

# IMPURE function: has side effects (modifies external list)
data = [1, 2, 3]

def append_and_sum_impure(item):
    data.append(item)         # SIDE EFFECT: modifies external list
    return sum(data)

print(append_and_sum_impure(4))   # 10 (data is now [1,2,3,4])
print(append_and_sum_impure(4))   # 14 (data is now [1,2,3,4,4]!)
# Same call with same argument, different output: unpredictable!

# Prefer pure: pass data in, return new data out
def append_and_sum_pure(lst, item):
    new_list = lst + [item]   # create NEW list, don't modify original
    return new_list, sum(new_list)

new_data, total = append_and_sum_pure(data, 5)
# data is unchanged; new_data is the modified version
🔑 Key Takeaways — Functions
- Functions are the primary unit of reuse — if you write the same logic twice, make it a function
- Default arguments are evaluated once at definition time; use None as a sentinel for mutable defaults
- *args collects extra positional arguments as a tuple; **kwargs collects keyword arguments as a dict
- Closures capture variables by reference, not by value; watch for late binding in loops
- Every public function should have a docstring explaining what it does, its parameters, and its return value
- Prefer pure functions (same input → same output, no side effects) — they're easier to test and reason about
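The late-binding takeaway is easiest to see with lambdas created in a loop. The sketch below is illustrative only; binding the loop variable as a default argument is one common workaround, not the only one.

```python
# Late binding: each lambda looks up i when it is CALLED, not when it is created
late = [lambda: i for i in range(3)]
late_results = [f() for f in late]

# Workaround: capture the current value as a default argument
bound = [lambda i=i: i for i in range(3)]
bound_results = [f() for f in bound]

print(late_results)    # [2, 2, 2]
print(bound_results)   # [0, 1, 2]
```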
Part VIII — Data Structures
Data structures are ways of organizing and storing collections of data. Python has four built-in data structures that cover nearly every real-world need: lists, tuples, dictionaries, and sets.
Lists Beginner
A list is like a numbered shopping list. Items are in order, you can add or remove items, and you can access any item by its position number (index). Lists maintain insertion order and allow duplicates.
# Creating lists: use square brackets []
fruits = ["apple", "banana", "cherry"]     # list of strings
numbers = [1, 2, 3, 4, 5]                  # list of ints
mixed = ["Alice", 25, True, None]          # lists can hold ANYTHING
empty = []                                 # empty list
nested = [[1, 2], [3, 4], [5, 6]]          # list of lists

# ── ACCESSING ITEMS ────────────────────────────────────────────
print(fruits[0])      # "apple" (first item)
print(fruits[-1])     # "cherry" (last item)
print(fruits[1:3])    # ['banana', 'cherry'] (slicing)

# ── MODIFYING LISTS ────────────────────────────────────────────
fruits[1] = "blueberry"                    # change item at index 1
print(fruits)                              # ['apple', 'blueberry', 'cherry']

# Adding items
fruits.append("date")                      # adds to the END (most common)
fruits.insert(1, "avocado")                # insert at index 1, shifts others right
fruits.extend(["elderberry", "fig"])       # add multiple items from another list

# Removing items
fruits.remove("blueberry")   # remove first occurrence of this value (ValueError if absent)
popped = fruits.pop()        # remove and RETURN last item
popped2 = fruits.pop(0)      # remove and return item at index 0
del fruits[0]                # delete item at index 0 (no return)
fruits.clear()               # remove ALL items → []

# ── SEARCHING ──────────────────────────────────────────────────
colors = ["red", "green", "blue", "red"]
print("red" in colors)          # True (membership test)
print(colors.index("green"))    # 1 (index of first match)
print(colors.count("red"))      # 2 (how many times it appears)

# ── SORTING ────────────────────────────────────────────────────
nums = [3, 1, 4, 1, 5, 9, 2]
nums.sort()                     # sort IN-PLACE: modifies the list
print(nums)                     # [1, 1, 2, 3, 4, 5, 9]
nums.sort(reverse=True)         # descending
print(nums)                     # [9, 5, 4, 3, 2, 1, 1]

original = [3, 1, 4, 1, 5]
sorted_copy = sorted(original)  # returns NEW sorted list, original unchanged
print(original)                 # [3, 1, 4, 1, 5] (untouched)

# Sort by custom key
words = ["banana", "apple", "kiwi"]
print(sorted(words, key=len))   # sort by string length: ['kiwi', 'apple', 'banana']

# ── OTHER USEFUL OPERATIONS ────────────────────────────────────
nums = [1, 2, 3, 4, 5]
print(len(nums))      # 5 (number of items)
print(sum(nums))      # 15 (sum of all items)
print(min(nums))      # 1 (smallest)
print(max(nums))      # 5 (largest)
nums.reverse()        # reverse in place
copy = nums[:]        # shallow copy (slicing trick)
copy2 = nums.copy()   # same as above, more explicit

# ── LIST COMPREHENSIONS: create lists with elegance ────────────
# Format: [expression for item in iterable if condition]

# Old way: loop + append
squares_old = []
for n in range(1, 6):
    squares_old.append(n ** 2)

# New way: list comprehension (concise, fast, Pythonic)
squares = [n ** 2 for n in range(1, 6)]
print(squares)        # [1, 4, 9, 16, 25]

# With a filter condition
evens = [n for n in range(20) if n % 2 == 0]
print(evens)          # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

# Transforming strings
names = ["alice", "bob", "charlie"]
upper_names = [name.upper() for name in names]
print(upper_names)    # ['ALICE', 'BOB', 'CHARLIE']

# Nested comprehension (flattening a 2D list)
matrix = [[1,2,3],[4,5,6],[7,8,9]]
flat = [num for row in matrix for num in row]
print(flat)           # [1, 2, 3, 4, 5, 6, 7, 8, 9]
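The "shallow copy" label on `nums[:]` and `nums.copy()` matters once a list contains other lists. As a minimal sketch (the variable names are illustrative), a shallow copy shares its inner lists with the original, while `copy.deepcopy` from the standard library does not:

```python
import copy

grid = [[1, 2], [3, 4]]
shallow = grid[:]               # new outer list, SAME inner lists
deep = copy.deepcopy(grid)      # fully independent copy

grid[0].append(99)              # mutate an inner list
grid.append([5, 6])             # mutate the outer list

print(shallow)   # [[1, 2, 99], [3, 4]]
print(deep)      # [[1, 2], [3, 4]]
```

The shallow copy sees the change to the shared inner list but not the row appended to the outer list.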
Tuples — Immutable Sequences Beginner
A tuple is like a list but immutable — once created, it cannot be changed. No adding, removing, or changing items. Use parentheses () to create them.
Think of a tuple as a record engraved in stone. A GPS coordinate (latitude, longitude) should never be accidentally modified — it represents a fixed point. A person's birthday (year, month, day) doesn't change. These are perfect tuples: data that belongs together and must stay together unchanged.
# Creating tuples: parentheses (optional but conventional)
point = (3, 7)                     # 2D coordinate
rgb = (255, 128, 0)                # color values
person = ("Alice", 25, "NYC")      # grouped data
single = (42,)                     # ← trailing comma REQUIRED for 1-item tuples!
empty = ()                         # empty tuple

# Accessing: same as lists
print(point[0])       # 3
print(rgb[-1])        # 0
print(person[1:3])    # (25, 'NYC')

# Immutability: cannot change after creation
# point[0] = 10  ← TypeError! Tuples don't support item assignment

# ── WHY USE TUPLES OVER LISTS? ────────────────────────────────
# 1. Immutability signals intent: "this data shouldn't change"
# 2. Tuples are slightly faster than lists
# 3. Tuples can be used as DICTIONARY KEYS; lists cannot!
locations = {
    (40.7128, -74.0060): "New York",       # tuple as key ✅
    (34.0522, -118.2437): "Los Angeles",
}

# 4. Tuples are returned by functions returning multiple values
def divmod_custom(a, b):
    return a // b, a % b           # returns a tuple automatically

quotient, remainder = divmod_custom(17, 5)
print(quotient, remainder)         # 3 2

# Tuple unpacking: assign multiple variables at once
name, age, city = person           # unpack all 3 values
print(name, age, city)             # Alice 25 NYC

x, y = 10, 20                      # create and unpack in one line
x, y = y, x                        # SWAP two variables (Python magic!)
print(x, y)                        # 20 10

# Extended unpacking with *
first, *middle, last = [1, 2, 3, 4, 5]
print(first)     # 1
print(middle)    # [2, 3, 4]
print(last)      # 5
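Point 3 above (tuples as dictionary keys) comes down to hashability. The sketch below shows what happens when a list is used as a key. It also shows a caveat worth knowing: a tuple is only hashable if everything inside it is.

```python
locations = {(40.7128, -74.0060): "New York"}

# A list cannot be a dictionary key
try:
    locations[[34.0522, -118.2437]] = "Los Angeles"
    list_key_error = None
except TypeError as exc:
    list_key_error = str(exc)

# A tuple that CONTAINS a list is not hashable either
try:
    hash((1, [2, 3]))
    nested_hashable = True
except TypeError:
    nested_hashable = False

print(list_key_error)     # message mentions an unhashable type
print(nested_hashable)    # False
```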
Dictionaries Beginner
A dictionary is like a real dictionary (the book). Every entry has a word (the key) and a definition (the value). You look things up by word, not by page number. In Python, dictionaries store key→value pairs and let you retrieve any value instantly by its key.
# Creating dictionaries: curly braces {key: value}
person = {
    "name": "Alice",                   # key: value
    "age": 25,
    "city": "NYC",
    "hobbies": ["coding", "hiking"]    # value can be anything!
}

# ── ACCESSING VALUES ───────────────────────────────────────────
print(person["name"])                  # "Alice" (direct access)
# person["missing"]  ← KeyError if key doesn't exist!

# .get(): safe access with optional default
print(person.get("name"))              # "Alice"
print(person.get("email"))             # None (key doesn't exist, no error)
print(person.get("email", "N/A"))      # "N/A" (custom default)

# ── MODIFYING DICTIONARIES ─────────────────────────────────────
person["age"] = 26                             # update existing key
person["email"] = "[email protected]"           # add new key
person.update({"age": 27, "country": "USA"})   # update multiple at once
del person["city"]                             # delete a key-value pair
removed = person.pop("email")                  # remove and return value

# ── CHECKING EXISTENCE ─────────────────────────────────────────
print("name" in person)                # True (checks KEYS)
print("Alice" in person)               # False (doesn't check values)
print("Alice" in person.values())      # True (explicitly check values)

# ── ITERATING ──────────────────────────────────────────────────
for key in person:                     # iterates over keys
    print(key)
for key in person.keys():              # explicit keys iteration
    print(key)
for value in person.values():          # iterate over values
    print(value)
for key, value in person.items():      # iterate over key-value pairs
    print(f"{key}: {value}")

# ── DICT COMPREHENSION ─────────────────────────────────────────
# {key_expr: val_expr for item in iterable}
squares = {n: n**2 for n in range(1, 6)}
print(squares)                         # {1: 1, 2: 4, 3: 9, 4: 16, 5: 25}

# Invert a dictionary (swap keys and values)
original = {"a": 1, "b": 2, "c": 3}
inverted = {v: k for k, v in original.items()}
print(inverted)                        # {1: 'a', 2: 'b', 3: 'c'}

# Merge two dicts (Python 3.9+)
defaults = {"theme": "light", "language": "en"}
user_prefs = {"theme": "dark"}
merged = defaults | user_prefs         # user_prefs overrides defaults
print(merged)                          # {'theme': 'dark', 'language': 'en'}
Sets — Unique Collections Beginner
A set is an unordered collection of unique items. Duplicates are automatically removed. Sets are perfect for deduplication and mathematical set operations.
# Creating sets: curly braces, but no key:value pairs
colors = {"red", "green", "blue"}
nums = {1, 2, 3, 4}

# Duplicates are automatically removed!
dupes = {1, 2, 2, 3, 3, 3}
print(dupes)                    # {1, 2, 3}

# Create a set from a list (deduplicate a list!)
raw = ["apple", "banana", "apple", "cherry", "banana"]
unique = set(raw)               # remove duplicates
print(unique)                   # {'apple', 'banana', 'cherry'} (order not guaranteed)
unique_list = list(unique)      # convert back to list if needed

# IMPORTANT: empty set can't use {} (that creates an empty DICT!)
empty_set = set()               # ✅ correct
empty_dict = {}                 # this is a dict, not a set!

# Adding and removing
colors.add("yellow")            # add one item
colors.discard("pink")          # remove if exists (no error if missing)
colors.remove("red")            # remove (KeyError if missing)

# ── SET MATHEMATICS: the real power of sets ────────────────────
a = {1, 2, 3, 4, 5}
b = {4, 5, 6, 7, 8}
print(a | b)    # Union: all items from both: {1,2,3,4,5,6,7,8}
print(a & b)    # Intersection: items in BOTH: {4, 5}
print(a - b)    # Difference: in a but NOT b: {1, 2, 3}
print(a ^ b)    # Symmetric diff: in one but not both: {1,2,3,6,7,8}

# Practical: find common users between two lists
list_a = ["Alice", "Bob", "Charlie"]
list_b = ["Bob", "Dave", "Alice"]
common = set(list_a) & set(list_b)
print(common)   # {'Bob', 'Alice'} (order not guaranteed)

# Subset and superset checks
print({1, 2}.issubset({1, 2, 3}))       # True: {1,2} is inside {1,2,3}
print({1, 2, 3}.issuperset({1, 2}))     # True: {1,2,3} contains {1,2}
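Because set order is not guaranteed, `list(set(raw))` can shuffle the items. When the original order matters, one common alternative (shown here as a sketch, not as the only option) relies on dicts keeping insertion order in Python 3.7+:

```python
raw = ["apple", "banana", "apple", "cherry", "banana"]

# dict.fromkeys keeps the FIRST occurrence of each item, in order
unique_ordered = list(dict.fromkeys(raw))

print(unique_ordered)   # ['apple', 'banana', 'cherry']
```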
When to Use Each Data Structure Beginner
| Structure | Ordered? | Mutable? | Duplicates? | Best For |
|---|---|---|---|---|
| list | ✅ Yes | ✅ Yes | ✅ Yes | Ordered collections you'll modify; sequences |
| tuple | ✅ Yes | ❌ No | ✅ Yes | Fixed data (coordinates, dates); dict keys; function return values |
| dict | ✅ Yes (3.7+) | ✅ Yes | Keys: No | Key→value lookups; structured records; counting |
| set | ❌ No | ✅ Yes | ❌ No | Deduplication; membership testing; set math |
# Decision guide: which structure to choose?

# "I need ordered items I'll loop through" → LIST
todo_list = ["Buy groceries", "Call dentist", "Write report"]

# "I have fixed data that shouldn't change" → TUPLE
SCREEN_SIZE = (1920, 1080)      # won't and shouldn't change
birthday = (1999, 4, 15)

# "I need to look things up by name/key" → DICT
user = {"username": "alice99", "score": 1500, "level": 5}
print(user["score"])            # O(1) average-case lookup by key

# "I need unique items / membership testing" → SET
visited_pages = {"/home", "/about", "/contact"}
if "/home" in visited_pages:    # O(1) on average: sets are FAST for this
    print("Already visited home")

# Performance note: checking 'x in list' is O(n), slow for large lists
# Checking 'x in set' is O(1) on average, regardless of set size!
big_list = list(range(1_000_000))
big_set = set(range(1_000_000))
# 999_999 in big_list → scans up to 1 million items
# 999_999 in big_set  → a single hash lookup
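To see the difference rather than take it on faith, the standard `timeit` module can time both membership checks. This is a rough sketch; the sizes and repeat count are arbitrary choices, and the absolute numbers will vary by machine.

```python
import timeit

n = 100_000
big_list = list(range(n))
big_set = set(big_list)
target = n - 1                   # worst case for the list: the last element

# Run each membership test 200 times and record the total seconds
list_seconds = timeit.timeit(lambda: target in big_list, number=200)
set_seconds = timeit.timeit(lambda: target in big_set, number=200)

print(f"list: {list_seconds:.4f}s   set: {set_seconds:.6f}s")
```

On typical hardware the list time is orders of magnitude larger, and the gap widens as `n` grows.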
The collections Module — Supercharged Data Structures Intermediate
The collections module provides specialized containers that solve common problems more cleanly than using plain lists and dicts. These show up constantly in real Python codebases.
from collections import defaultdict, Counter, deque, namedtuple

# ── defaultdict: never get a KeyError again ────────────────────
# PROBLEM: counting word frequency with a plain dict requires a key check
# APPROACH: defaultdict auto-creates missing keys with a default factory

# Plain dict: cumbersome
word_count = {}
for word in ["apple", "banana", "apple", "cherry", "banana", "apple"]:
    if word not in word_count:
        word_count[word] = 0
    word_count[word] += 1

# defaultdict: clean and Pythonic
word_count = defaultdict(int)          # int() returns 0, the default
for word in ["apple", "banana", "apple", "cherry", "banana", "apple"]:
    word_count[word] += 1              # no KeyError: missing keys default to 0
print(dict(word_count))                # {'apple': 3, 'banana': 2, 'cherry': 1}

# defaultdict(list): group items
by_length = defaultdict(list)
for word in ["cat", "dog", "elephant", "ant", "bear"]:
    by_length[len(word)].append(word)
# {3: ['cat', 'dog', 'ant'], 8: ['elephant'], 4: ['bear']}

# ── Counter: count anything in one line ────────────────────────
votes = ["Alice", "Bob", "Alice", "Alice", "Bob", "Charlie"]
tally = Counter(votes)
print(tally)                    # Counter({'Alice': 3, 'Bob': 2, 'Charlie': 1})
print(tally.most_common(2))     # [('Alice', 3), ('Bob', 2)] (top 2)
print(tally["Alice"])           # 3
print(tally["Zara"])            # 0 (missing keys return 0, not KeyError)

# Counter arithmetic
a = Counter("aabbcc")
b = Counter("abcd")
print(a + b)                    # Counter({'a': 3, 'b': 3, 'c': 3, 'd': 1})
print(a - b)                    # Counter({'a': 1, 'b': 1, 'c': 1})

# ── deque: O(1) append/pop from BOTH ends ──────────────────────
# list.insert(0, x) is O(n), slow for large lists
# deque.appendleft(x) is O(1), always fast
queue = deque(["first", "second"])
queue.append("third")           # add to right: ['first', 'second', 'third']
queue.appendleft("zeroth")      # add to left: ['zeroth', 'first', ...]
print(queue.popleft())          # remove from left: 'zeroth'
print(queue.pop())              # remove from right: 'third'

# Fixed-size sliding window (maxlen)
recent = deque(maxlen=3)        # keeps only last 3 items
for n in [1, 2, 3, 4, 5]:
    recent.append(n)
print(recent)                   # deque([3, 4, 5], maxlen=3)

# ── namedtuple: lightweight named records ──────────────────────
# Like a tuple but fields have names; more readable than index access
Point = namedtuple("Point", ["x", "y"])
p = Point(3, 7)
print(p.x, p.y)                 # 3 7 (named access)
print(p[0], p[1])               # 3 7 (still works as a tuple)
print(p)                        # Point(x=3, y=7) (readable repr)
x, y = p                        # still unpackable

# Better than: color = (255, 128, 0) and having to remember indices
Color = namedtuple("Color", ["red", "green", "blue"])
orange = Color(255, 128, 0)
print(orange.red)               # 255 (clearer than orange[0])
Which collections tool? Use defaultdict when you'd write if key not in d: d[key] = default_value. Use Counter when you're counting occurrences of anything. Use deque when you need a queue (FIFO) or need fast prepend. Use namedtuple for simple immutable records where you want named fields but not full class overhead — consider @dataclass (Part XII) if you need mutability or methods.
Many developers write 5-line boilerplate patterns (like if key not in d: d[key] = [] then d[key].append(...)) not realizing defaultdict(list) does the same in one line. Know what collections provides before hand-rolling common patterns.
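The boilerplate mentioned above, and two shorter equivalents, can be compared side by side. This sketch reuses the word list from the earlier grouping example; all three loops build the same mapping.

```python
from collections import defaultdict

words = ["cat", "dog", "elephant", "ant", "bear"]

# 1. Hand-rolled key check
manual = {}
for word in words:
    if len(word) not in manual:
        manual[len(word)] = []
    manual[len(word)].append(word)

# 2. dict.setdefault: one line, plain dict
with_setdefault = {}
for word in words:
    with_setdefault.setdefault(len(word), []).append(word)

# 3. defaultdict(list): missing keys start as an empty list
grouped = defaultdict(list)
for word in words:
    grouped[len(word)].append(word)

print(dict(grouped))   # {3: ['cat', 'dog', 'ant'], 8: ['elephant'], 4: ['bear']}
```

One difference to keep in mind: merely reading a missing key from a `defaultdict` inserts it, which a plain dict never does.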
🔑 Key Takeaways — Data Structures
- Lists for ordered, mutable sequences. Tuples for fixed records. Dicts for key→value lookups. Sets for uniqueness and O(1) membership testing
- in on a list is O(n), which is slow on large collections. in on a set or dict is O(1) on average
- defaultdict eliminates boilerplate key-existence checks. Counter counts anything in one line. deque is a proper queue with O(1) operations on both ends
- Dict keys must be hashable (in practice, immutable): strings, numbers, tuples; not lists or dicts
- As of Python 3.7+, regular dicts maintain insertion order
Part IX — File Handling
Most real programs need to read data from files or write results to them. Python makes file I/O (Input/Output) straightforward with simple, readable syntax — and two standard modules (csv, json) handle the most common file formats.
open() and File Modes Beginner
# open(filename, mode): open a file and return a file object
# Mode tells Python what you want to DO with the file

# ── FILE MODES ─────────────────────────────────────────────────
# "r"  : Read (default). File must exist. Cannot write.
# "w"  : Write. Creates file if not exists. ERASES existing content!
# "a"  : Append. Creates if not exists. Adds to END of file.
# "x"  : Exclusive create. Fails if file already exists.
# "r+" : Read and write. File must exist.
# "b"  : Binary mode (add to any: "rb", "wb"), for images, PDFs, etc.

# The old way: you MUST close the file manually!
f = open("notes.txt", "r")      # open for reading
content = f.read()              # read entire file into a string
f.close()                       # MUST close: releases the file handle

# Problem: if an error happens between open() and close(),
# close() never gets called and the file stays open.
# Solution: use 'with' statement (next section)
| Mode | Read? | Write? | Creates? | Erases? |
|---|---|---|---|---|
| "r" | ✅ | ❌ | ❌ | ❌ |
| "w" | ❌ | ✅ | ✅ | ✅ Yes! |
| "a" | ❌ | ✅ | ✅ | ❌ |
| "x" | ❌ | ✅ | ✅ | ❌ |
| "r+" | ✅ | ✅ | ❌ | ❌ |
The with Statement — Always Use This Beginner
The with statement is a context manager. It automatically closes the file when the block ends — even if an error occurs. Always use with open(...) instead of manual open/close.
# ❌ Manual open/close
f = open("notes.txt", "r")
content = f.read()
# If this crashes, f never closes!
process(content)
f.close()  # easy to forget

# ✅ with statement
with open("notes.txt", "r") as f:
    content = f.read()
    process(content)
# File closes automatically here,
# even if process() raised an error
# The 'with' statement: ALWAYS use this for files
# When the 'with' block ends (normally or via exception),
# Python automatically closes the file for you.

# ── READING ────────────────────────────────────────────────────
with open("notes.txt", "r") as f:      # 'f' is the file object
    content = f.read()                 # read entire file as one string
    print(content)
# file is automatically closed here; no f.close() needed!

# Read line by line (memory-efficient for huge files)
with open("notes.txt", "r") as f:
    for line in f:                     # file objects are iterable!
        print(line.strip())            # .strip() removes the \n at end of each line

# Read all lines into a list
with open("notes.txt", "r") as f:
    lines = f.readlines()              # list of strings, each ending with \n
    print(f"{len(lines)} lines")

# ── WRITING ────────────────────────────────────────────────────
with open("output.txt", "w") as f:     # "w" ERASES existing content!
    f.write("Hello, File!\n")          # write a string (\n = new line)
    f.write("Second line.\n")

# Write multiple lines at once
lines = ["Line 1\n", "Line 2\n", "Line 3\n"]
with open("output.txt", "w") as f:
    f.writelines(lines)                # write a list of strings

# ── APPENDING ──────────────────────────────────────────────────
with open("log.txt", "a") as f:        # "a" adds to end, doesn't erase
    f.write("New log entry\n")

# ── SPECIFYING ENCODING (always do this!) ──────────────────────
# Different computers/systems use different text encodings
# UTF-8 is the universal standard; always specify it explicitly
with open("data.txt", "r", encoding="utf-8") as f:
    content = f.read()

# Reading/writing non-English text requires this!
with open("greetings.txt", "w", encoding="utf-8") as f:
    f.write("こんにちは\n")              # Japanese: works fine with UTF-8
    f.write("Héllo Wörld\n")           # Accented chars: also fine
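The claim that `with` closes the file even when an error occurs can be checked directly. The sketch below writes to a temporary file (created with the standard `tempfile` module so nothing real is touched) and raises a made-up error inside the block.

```python
import os
import tempfile

# Create a throwaway file so the example is self-contained
fd, tmp_path = tempfile.mkstemp(suffix=".txt")
os.close(fd)

try:
    with open(tmp_path, "w", encoding="utf-8") as f:
        f.write("partial\n")
        raise RuntimeError("simulated failure inside the block")
except RuntimeError:
    pass                          # the error still propagates out of 'with'

closed_after_error = f.closed     # the file object remembers its state
print(closed_after_error)         # True

os.remove(tmp_path)               # clean up the temporary file
```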
CSV Files — the csv Module Intermediate
CSV (Comma-Separated Values) is the most common format for tabular data — spreadsheets, database exports, data feeds. Python's built-in csv module handles them properly (including fields that contain commas inside quotes).
import csv    # built-in module, no installation needed

# ── READING A CSV ──────────────────────────────────────────────
# Imagine data.csv contains:
#   name,age,city
#   Alice,25,New York
#   Bob,30,Los Angeles

# Read as lists
with open("data.csv", "r", newline="", encoding="utf-8") as f:
    # newline="" is recommended by Python docs to handle line endings correctly
    reader = csv.reader(f)             # creates a CSV reader object
    header = next(reader)              # read (and skip) the first row = header
    print(f"Columns: {header}")        # ['name', 'age', 'city']
    for row in reader:                 # each row is a list of strings
        name, age, city = row          # unpack
        print(f"{name} is {age} from {city}")

# Read as dicts (DictReader is cleaner for most use cases)
with open("data.csv", "r", newline="", encoding="utf-8") as f:
    reader = csv.DictReader(f)         # first row becomes dict keys
    for row in reader:
        print(row)                     # {'name': 'Alice', 'age': '25', 'city': 'New York'}
        print(row["name"])             # access by column name: much cleaner!

# ── WRITING A CSV ──────────────────────────────────────────────
data = [
    ["name", "age", "city"],           # header row
    ["Alice", 25, "New York"],
    ["Bob", 30, "Los Angeles"],
    ["Charlie", 22, "Chicago"],
]
with open("output.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerows(data)             # write all rows at once

# Write dicts with DictWriter
people = [
    {"name": "Alice", "score": 95},
    {"name": "Bob", "score": 87},
]
fields = ["name", "score"]             # define column order
with open("scores.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=fields)
    writer.writeheader()               # writes "name,score" header row
    writer.writerows(people)           # writes all dict rows
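The point about commas inside quoted fields is why splitting on "," by hand is unreliable. A small sketch using an in-memory string (via `io.StringIO`, so no file is needed) shows the difference; the sample row is invented for illustration.

```python
import csv
import io

line = '"Smith, Jane","Portland, OR"'

# Naive approach: splits inside the quoted fields too
naive = line.split(",")

# csv.reader understands the quoting rules
parsed = next(csv.reader(io.StringIO(line)))

# csv.writer adds quotes automatically when a field contains a comma
buffer = io.StringIO()
csv.writer(buffer).writerow(["Smith, Jane", 42])
written = buffer.getvalue()

print(len(naive))   # 4
print(parsed)       # ['Smith, Jane', 'Portland, OR']
print(written)      # "Smith, Jane",42 followed by a \r\n line ending
```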
JSON Files — the json Module Intermediate
JSON (JavaScript Object Notation) is the universal language of the web — API responses, config files, and data storage all use JSON. Python's json module converts between Python objects and JSON text seamlessly.
import json

# Python ↔ JSON type mapping:
#   Python dict       ↔  JSON object {}
#   Python list       ↔  JSON array []
#   Python str        ↔  JSON string ""
#   Python int/float  ↔  JSON number
#   Python True/False ↔  JSON true/false
#   Python None       ↔  JSON null

# ── WRITING JSON (serializing Python → JSON) ───────────────────
user = {
    "name": "Alice",
    "age": 25,
    "active": True,
    "scores": [95, 87, 92],
    "address": {"city": "NYC", "zip": "10001"}
}

# json.dump: write to a file
with open("user.json", "w", encoding="utf-8") as f:
    json.dump(user, f, indent=4)       # indent=4 makes it human-readable
# Creates a nicely formatted file

# json.dumps: convert to string (not file)
json_string = json.dumps(user, indent=2)
print(json_string)

# ── READING JSON (deserializing JSON → Python) ─────────────────
# json.load: read from a file
with open("user.json", "r", encoding="utf-8") as f:
    loaded_user = json.load(f)
print(type(loaded_user))               # <class 'dict'> (it's a Python dict now)
print(loaded_user["name"])             # Alice
print(loaded_user["scores"][0])        # 95

# json.loads: parse from a string (common with API responses)
raw_json = '{"status": "ok", "count": 42}'   # a string, not a dict
parsed = json.loads(raw_json)
print(parsed["count"])                 # 42 (now a proper Python dict)

# Handling JSON that might be malformed
try:
    data = json.loads('{"broken": json')     # invalid JSON
except json.JSONDecodeError as e:
    print(f"JSON error: {e}")
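The type mapping is not perfectly symmetric, which can surprise you on a round trip. As a hedged sketch with made-up data: tuples come back as lists, non-string keys come back as strings, and types outside the mapping (such as sets) are rejected.

```python
import json

original = {"point": (3, 7), 1: "one"}
restored = json.loads(json.dumps(original))

print(restored)    # {'point': [3, 7], '1': 'one'}

# Types outside the mapping raise TypeError when serializing
try:
    json.dumps({"tags": {"a", "b"}})
    set_supported = True
except TypeError:
    set_supported = False

print(set_supported)   # False
```

Converting such values yourself before dumping (for example `sorted(tags)`) is usually the simplest fix.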
On Windows, use raw strings or forward slashes for paths: r"C:\Users\Alice\data.csv" or "C:/Users/Alice/data.csv". Even better, use Python's pathlib module for cross-platform paths: from pathlib import Path; p = Path("data") / "file.csv". A relative path like "data.txt" is relative to where you run the script, not where the script lives — a common source of "file not found" errors.
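Why raw strings matter for Windows paths: in a normal string, sequences such as `\t` and `\n` are escape codes. The path below is invented for illustration, and `PureWindowsPath` is used only so the comparison behaves the same on any operating system.

```python
from pathlib import PureWindowsPath

plain = "C:\temp\new.txt"      # \t becomes TAB, \n becomes NEWLINE
raw = r"C:\temp\new.txt"       # raw string: backslashes stay literal
forward = "C:/temp/new.txt"    # forward slashes need no escaping

# Windows path objects treat both slash styles as the same path
same_path = PureWindowsPath(raw) == PureWindowsPath(forward)

print(len(plain), len(raw))    # 13 15
print(same_path)               # True
```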
pathlib — Modern File Path Handling 🐍 3.4+ Intermediate
pathlib gives you path objects instead of plain strings, making file path operations cross-platform, readable, and chainable. It's the modern replacement for os.path.
# os.path (older style)
import os
base = os.path.dirname(__file__)
data = os.path.join(base, "data", "input.csv")
if os.path.exists(data):
    with open(data) as f:
        content = f.read()

# pathlib (modern style)
from pathlib import Path
base = Path(__file__).parent
data = base / "data" / "input.csv"
if data.exists():
    content = data.read_text()
from pathlib import Path

# ── Creating Paths ─────────────────────────────────────────────
p = Path("data/report.csv")        # relative path
p = Path.cwd()                     # current working directory
p = Path.home()                    # user's home directory (~)
p = Path(__file__).parent          # directory of the current script

# ── Path arithmetic with / operator ───────────────────────────
config_dir = Path.home() / ".config" / "myapp"
data_file = Path("project") / "data" / "sales.csv"

# ── Inspecting paths ──────────────────────────────────────────
p = Path("reports/annual_2024.pdf")
print(p.name)      # "annual_2024.pdf" (filename with extension)
print(p.stem)      # "annual_2024" (filename without extension)
print(p.suffix)    # ".pdf" (extension including the dot)
print(p.parent)    # "reports" (parent directory)
print(p.parts)     # ('reports', 'annual_2024.pdf')

# ── Checking existence ────────────────────────────────────────
p = Path("myfile.txt")
print(p.exists())    # True/False
print(p.is_file())   # True if it's a file (not a directory)
print(p.is_dir())    # True if it's a directory

# ── Reading and writing: no open() needed! ────────────────────
path = Path("output.txt")
path.write_text("Hello, pathlib!")     # write string to file
content = path.read_text()             # read entire file as string
path.write_bytes(b"binary data")       # write bytes (replaces the file's content)
data = path.read_bytes()               # read as bytes

# ── Creating directories ──────────────────────────────────────
Path("new/nested/dir").mkdir(parents=True, exist_ok=True)
# parents=True: create intermediate dirs; exist_ok=True: don't error if exists

# ── Globbing: find files matching a pattern ───────────────────
src = Path(".")
for py_file in src.glob("*.py"):       # all .py in current dir
    print(py_file.name)
for py_file in src.rglob("*.py"):      # recursive: all .py everywhere
    print(py_file)

# ── Renaming and moving ───────────────────────────────────────
old = Path("draft.txt")
old.rename(Path("final.txt"))          # rename in-place

# ── Changing extensions ───────────────────────────────────────
p = Path("data.csv")
new_p = p.with_suffix(".json")         # Path("data.json")
new_p = p.with_name("output.csv")      # Path("output.csv")
Never build paths with "folder" + "/" + "file.txt" — this breaks on Windows (which uses backslashes). Use Path("folder") / "file.txt" or pass components as separate arguments. The / operator in pathlib is not division — it's path joining, and it works on every OS.
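The cross-platform behaviour of `/` can be previewed on any machine with pathlib's "pure" path classes, which apply one operating system's rules without touching the disk. A minimal sketch with made-up folder names:

```python
from pathlib import PurePosixPath, PureWindowsPath

# The same joining code, evaluated under each OS's rules
posix = PurePosixPath("folder") / "sub" / "file.txt"
windows = PureWindowsPath("folder") / "sub" / "file.txt"

print(str(posix))          # folder/sub/file.txt
print(str(windows))        # folder\sub\file.txt
print(windows.as_posix())  # folder/sub/file.txt
```

In everyday code you would use plain `Path`, which picks the right flavour for the machine it runs on.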
🔑 Key Takeaways — File Handling
- Always use with open(...) as f: because it guarantees the file is closed even if an error occurs
- Use "r" to read, "w" to write (overwrites!), "a" to append. Add encoding="utf-8" explicitly
- Use csv.DictReader for CSV files; named column access is cleaner than numeric indices
- json.dumps() converts Python → JSON string; json.loads() converts JSON string → Python
- Prefer pathlib.Path over os.path: it's object-oriented, cross-platform, and reads like English
- A relative path is relative to where you run the script, not where the script lives. Use Path(__file__).parent for script-relative paths
Part X — Error Handling
Errors are not failures — they're information. Professional Python code anticipates what can go wrong and handles it gracefully instead of crashing. Error handling is what separates brittle scripts from robust programs.
What Are Exceptions? Beginner
Think of exceptions like alarms. When something goes wrong in your code — dividing by zero, opening a file that doesn't exist, accessing a dictionary key that isn't there — Python "raises" an exception. It's like pulling a fire alarm: it signals that something unexpected happened and normal execution cannot continue. Without error handling, the alarm just crashes your program. With error handling, you "catch" the alarm and respond appropriately.
# Common exceptions you'll encounter
# (each statement below raises on its own; try them one at a time)

# ValueError: right type, wrong value
int("hello")                  # ValueError: invalid literal for int()

# TypeError: wrong type entirely
5 + "ten"                     # TypeError: unsupported operand type(s)

# KeyError: dictionary key doesn't exist
d = {"a": 1}
d["b"]                        # KeyError: 'b'

# IndexError: list index out of range
lst = [1, 2, 3]
lst[10]                       # IndexError: list index out of range

# NameError: variable doesn't exist
print(undefined_variable)     # NameError: name 'undefined_variable' is not defined

# FileNotFoundError: file doesn't exist
open("missing.txt")           # FileNotFoundError: No such file or directory

# ZeroDivisionError: dividing by zero
10 / 0                        # ZeroDivisionError: division by zero

# AttributeError: object doesn't have that attribute/method
(5).upper()                   # AttributeError: 'int' object has no attribute 'upper'
# (parentheses needed: a bare 5.upper() is a SyntaxError, because "5." starts a float)
try / except / else / finally Beginner
# Basic try/except: the minimum you need
try:
    number = int("hello")              # this will fail
except ValueError:                     # catch that specific type of error
    print("That's not a valid number!")
# Output: That's not a valid number! (program continues, no crash)

# Catch multiple exception types
def safe_divide(a, b):
    try:
        result = a / b
    except ZeroDivisionError:
        print("Error: cannot divide by zero")
        return None
    except TypeError:
        print("Error: both arguments must be numbers")
        return None
    return result

print(safe_divide(10, 2))      # 5.0
print(safe_divide(10, 0))      # Error: cannot divide by zero → None
print(safe_divide("a", 2))     # Error: both arguments must be numbers → None

# Access the exception object with 'as'
try:
    result = int("abc")
except ValueError as e:        # 'e' is the exception object
    print(f"Error: {e}")       # Error: invalid literal for int() with base 10: 'abc'

# ── FULL STRUCTURE: try/except/else/finally ────────────────────
def read_file_safely(filename):
    try:
        # try: code that MIGHT raise an exception
        # 'with' closes the file even if read() fails partway through
        with open(filename, "r", encoding="utf-8") as f:
            content = f.read()
    except FileNotFoundError:
        # except: runs ONLY if the error occurs
        print(f"File '{filename}' not found.")
        content = None
    except PermissionError:
        # multiple except blocks for different errors
        print("Permission denied.")
        content = None
    else:
        # else: runs ONLY if NO exception occurred
        print(f"Successfully read {len(content)} characters.")
    finally:
        # finally: ALWAYS runs, error or not
        # Use for cleanup: close connections, release locks, etc.
        print("Finished attempting to read file.")
    return content

# Catch ALL exceptions (use sparingly, it hides bugs!)
try:
    risky_operation = 1 / 0
except Exception as e:         # 'Exception' catches almost everything
    print(f"Something went wrong: {type(e).__name__}: {e}")

# ⚠️ Avoid bare 'except:' with no exception type; it catches even keyboard interrupts!
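The order in which the four blocks run can be confirmed by recording each step. In this small sketch, `trace` is a made-up helper that returns the parsed number along with the list of blocks that executed.

```python
def trace(value):
    steps = []
    try:
        steps.append("try")
        number = int(value)        # may raise ValueError
    except ValueError:
        steps.append("except")     # only when int() failed
        number = None
    else:
        steps.append("else")       # only when int() succeeded
    finally:
        steps.append("finally")    # always
    return number, steps

ok = trace("42")
bad = trace("abc")

print(ok)    # (42, ['try', 'else', 'finally'])
print(bad)   # (None, ['try', 'except', 'finally'])
```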
Raising Exceptions Intermediate
# You can raise exceptions yourself to signal invalid conditions def set_age(age): if not isinstance(age, int): raise TypeError(f"Age must be an integer, got {type(age).__name__}") if age < 0 or age > 150: raise ValueError(f"Age must be between 0 and 150, got {age}") return age try: set_age("twenty") except TypeError as e: print(f"TypeError: {e}") try: set_age(-5) except ValueError as e: print(f"ValueError: {e}") # Re-raise an exception (after logging it) def process(data): try: result = risky_operation(data) except Exception as e: print(f"Logging error: {e}") # log it raise # re-raise the SAME exception
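A bare raise re-raises the same exception. When you would rather translate a low-level error into one that makes sense to your caller, Python's raise ... from ... keeps the original attached as the cause. Here is a minimal sketch; ConfigError and parse_port are hypothetical names invented for this example.

```python
class ConfigError(Exception):
    """Hypothetical error type for bad configuration values."""

def parse_port(text):
    """Convert a port setting to int, translating low-level errors."""
    try:
        port = int(text)
    except ValueError as e:
        # 'from e' records the original exception as __cause__
        raise ConfigError(f"Port must be a whole number, got {text!r}") from e
    if not 1 <= port <= 65535:
        raise ConfigError(f"Port must be between 1 and 65535, got {port}")
    return port

try:
    parse_port("http")
except ConfigError as err:
    cause = err.__cause__               # the original ValueError
    print(f"ConfigError: {err}")
    print(f"Caused by: {cause!r}")
```

If the ConfigError is never caught, the traceback shows both errors, joined by "The above exception was the direct cause of the following exception".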
Custom Exceptions Intermediate
# Create your own exception classes by inheriting from Exception # This lets callers catch YOUR specific error type class ValidationError(Exception): """Raised when user-provided data fails validation.""" pass # no extra code needed — inheriting from Exception is enough class InsufficientFundsError(Exception): """Raised when a bank account has insufficient funds.""" def __init__(self, amount, balance): self.amount = amount self.balance = balance # Call parent's __init__ with a descriptive message super().__init__(f"Tried to withdraw {amount}, but balance is {balance}") # Using custom exceptions def withdraw(balance, amount): if amount <= 0: raise ValidationError("Withdrawal amount must be positive") if amount > balance: raise InsufficientFundsError(amount, balance) return balance - amount try: new_balance = withdraw(100, 150) except InsufficientFundsError as e: print(f"Can't withdraw: {e}") print(f"You need {e.amount - e.balance} more.") # access custom attributes except ValidationError as e: print(f"Invalid input: {e}")
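When a project defines several custom exceptions, it often helps to give them a shared base class so callers can catch the whole family with one except clause. The sketch below assumes a small hierarchy of its own (BankError, InvalidAmountError, OverdraftError, and withdraw_checked are illustrative names, not part of the example above).

```python
class BankError(Exception):
    """Base class for every error raised by this illustrative banking code."""

class InvalidAmountError(BankError):
    """Raised when an amount is zero or negative."""

class OverdraftError(BankError):
    """Raised when a withdrawal exceeds the balance."""

def withdraw_checked(balance, amount):
    if amount <= 0:
        raise InvalidAmountError("Withdrawal amount must be positive")
    if amount > balance:
        raise OverdraftError(f"Tried to withdraw {amount}, but balance is {balance}")
    return balance - amount

results = []
for amount in (50, -5, 500):
    try:
        results.append(withdraw_checked(100, amount))
    except BankError as e:              # one clause catches both subclasses
        results.append(type(e).__name__)

print(results)  # [50, 'InvalidAmountError', 'OverdraftError']
```

Callers who care about the difference can still catch the subclasses individually, as long as the more specific clauses come first.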
Logging Basics Intermediate
For production code, use Python's logging module instead of print(). Logging gives you levels (DEBUG, INFO, WARNING, ERROR, CRITICAL), timestamps, file output, and filtering — all things print can't do.
import logging # Configure logging (do this once at the start of your program) logging.basicConfig( level=logging.DEBUG, # minimum level to log format="%(asctime)s - %(levelname)s - %(message)s", handlers=[ logging.StreamHandler(), # print to console logging.FileHandler("app.log") # also write to file ] ) # Five logging levels (least → most severe) logging.debug("Detailed diagnostic info — only for debugging") logging.info("Normal operation — server started, user logged in") logging.warning("Something unexpected but not breaking — disk 80% full") logging.error("A function failed — couldn't connect to database") logging.critical("System-level failure — application is crashing") # Use in error handling def safe_parse(text): try: value = int(text) logging.info(f"Parsed value: {value}") return value except ValueError: logging.error(f"Could not parse '{text}' as integer", exc_info=True) # exc_info=True includes the full traceback in the log return None
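The calls above go through the root logger. In larger programs it is common to create a named logger per module so messages can be filtered by where they came from. A minimal sketch follows; the logger name "inventory" and the restock function are made up here, and in a real module you would usually pass __name__ to getLogger.

```python
import logging

# Named logger; "inventory" is a placeholder for this example
logger = logging.getLogger("inventory")
logger.setLevel(logging.INFO)           # DEBUG messages are filtered out

handler = logging.StreamHandler()       # writes to the console (stderr)
handler.setFormatter(logging.Formatter("%(name)s - %(levelname)s - %(message)s"))
logger.addHandler(handler)

def restock(item, quantity):
    if quantity <= 0:
        # %-style arguments are only formatted if the message is actually emitted
        logger.warning("Ignoring restock of %s with quantity %d", item, quantity)
        return False
    logger.info("Restocked %d x %s", quantity, item)
    return True

restock("widget", 5)                    # logged at INFO
restock("gadget", 0)                    # logged at WARNING
logger.debug("Not shown: below the INFO threshold")
```

Passing the values as arguments instead of building an f-string lets the logging module skip the formatting work for messages that end up filtered.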
🔑 Key Takeaways — Error Handling
- Catch specific exceptions, not bare except: (a bare except catches everything, including keyboard interrupts and system exits)
- The else block runs only if no exception occurred; finally always runs, so use it for cleanup
- Raise exceptions with context: raise ValueError("must be positive, got -3"). The message is read by humans
- Use logging instead of print() for production code: you can control log levels and output destinations without changing code
- Create custom exception classes by inheriting from Exception to make errors self-documenting
- Avoid using exceptions for ordinary control flow: put try/except around code you expect to succeed, not in place of if/else
Part XI — Modules & Packages
No programmer writes everything from scratch. Modules let you reuse code across files. The Python standard library gives you hundreds of tools for free. Packages from PyPI extend Python into virtually every domain imaginable.
The Import System Beginner
# ── IMPORTING MODULES ────────────────────────────────────────── # import the whole module — access things with module.name import math print(math.pi) # 3.141592653589793 print(math.sqrt(16)) # 4.0 print(math.floor(3.7)) # 3 print(math.ceil(3.2)) # 4 # from module import specific_thing — use it directly from math import sqrt, pi print(sqrt(25)) # 5.0 — no 'math.' prefix needed print(pi) # 3.14... # import with alias — shorter name import numpy as np # convention: everyone uses 'np' import pandas as pd # convention: everyone uses 'pd' from datetime import datetime as dt # from module import * — import EVERYTHING (avoid this!) # from math import * ← pollutes namespace, hard to know where names come from # ── YOUR OWN MODULES ─────────────────────────────────────────── # Any .py file is a module. Create utils.py: # ┌─ utils.py ──────────────────────────┐ # │ def add(a, b): │ # │ return a + b │ # │ PI = 3.14159 │ # └─────────────────────────────────────┘ # In main.py (same folder): # import utils # print(utils.add(3, 4)) # 7 # print(utils.PI) # 3.14159 # The if __name__ == "__main__" guard # When Python runs a file directly, __name__ = "__main__" # When a file is imported as a module, __name__ = the file's name # This lets code run when executed directly but not when imported def main(): print("Running as main program") if __name__ == "__main__": main() # only runs when you execute this file directly
Standard Library Tour Intermediate
The Python Standard Library is a massive collection of modules included with every Python installation. "Batteries included" is Python's philosophy — you don't need to install anything to do most common tasks.
# ── os: operating system interface ────────────────────────────
import os
print(os.getcwd())                              # current working directory
os.makedirs("new_folder", exist_ok=True)        # create folder (ok if exists)
print(os.listdir("."))                          # list files in current directory
print(os.path.exists("data.csv"))               # does this file exist?
print(os.path.join("data", "file.csv"))         # cross-platform path join
print(os.path.basename("/home/user/file.txt"))  # "file.txt"

# Better: pathlib (modern, object-oriented paths)
from pathlib import Path
p = Path("data") / "users" / "file.csv"         # / operator joins paths!
print(p.exists())                               # True/False
print(p.suffix)                                 # ".csv"
print(p.stem)                                   # "file"

# ── sys: system-specific parameters ──────────────────────────
import sys
print(sys.version)                              # Python version string
print(sys.argv)                                 # command-line arguments: ['script.py', 'arg1', ...]
# sys.exit(0)                                   # exit program (0 = success, non-zero = error)
#                                                 left commented out: it would stop this script right here

# ── math: mathematical functions ──────────────────────────────
import math
print(math.pi, math.e, math.tau)                # pi, Euler's number, tau (2π)
print(math.sin(math.pi / 2))                    # 1.0
print(math.log(100, 10))                        # 2.0 (log base 10 of 100)
print(math.factorial(5))                        # 120
print(math.gcd(48, 18))                         # 6 (greatest common divisor)

# ── random: random number generation ──────────────────────────
import random
print(random.random())                          # float in the range [0.0, 1.0)
print(random.randint(1, 100))                   # random integer 1-100 (inclusive)
print(random.choice(["a", "b", "c"]))           # pick random item from list
items = [1, 2, 3, 4, 5]
random.shuffle(items)                           # shuffle list in place
print(random.sample(items, 3))                  # 3 unique random items (no replacement)
random.seed(42)                                 # set seed; only calls made AFTER this are reproducible

# ── datetime: dates and times ─────────────────────────────────
from datetime import datetime, date, timedelta
now = datetime.now()                            # current date and time
print(now)                                      # e.g. 2024-01-15 14:30:22.123456
print(now.year, now.month, now.day)             # e.g. 2024 1 15
print(now.strftime("%Y-%m-%d %H:%M"))           # format: "2024-01-15 14:30"
birthday = date(1990, 6, 15)                    # create a specific date
today = date.today()
# Age in whole years: subtract 1 if this year's birthday hasn't happened yet
# (dividing the day count by 365 drifts because of leap years)
age = today.year - birthday.year - ((today.month, today.day) < (birthday.month, birthday.day))

# timedelta: time differences
one_week_later = now + timedelta(weeks=1)
yesterday = now - timedelta(days=1)

# ── collections: specialized containers ───────────────────────
from collections import Counter, defaultdict, namedtuple, deque

# Counter: count occurrences of items
words = ["apple", "banana", "apple", "cherry", "apple", "banana"]
counts = Counter(words)
print(counts)                                   # Counter({'apple': 3, 'banana': 2, 'cherry': 1})
print(counts.most_common(2))                    # [('apple', 3), ('banana', 2)]
# Counter also works on strings!
char_counts = Counter("hello world")
print(char_counts)

# defaultdict: dict that auto-creates missing keys
# A normal dict raises KeyError for missing keys; defaultdict creates a default value instead
word_groups = defaultdict(list)                 # missing key creates empty list
for word in ["apple", "avocado", "banana", "blueberry"]:
    word_groups[word[0]].append(word)           # group by first letter
print(dict(word_groups))                        # {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry']}

# namedtuple: tuples with named fields (like a lightweight class)
Point = namedtuple("Point", ["x", "y"])
p = Point(3, 7)
print(p.x, p.y)                                 # 3 7 (access by name)
print(p[0])                                     # 3 (still works as a tuple)

# deque: double-ended queue, fast at both ends
dq = deque([1, 2, 3])
dq.appendleft(0)                                # fast prepend to front
dq.append(4)                                    # fast append to end
print(dq)                                       # deque([0, 1, 2, 3, 4])
Installing Packages with pip Beginner
pip is Python's package manager. It downloads and installs packages from PyPI (the Python Package Index) — a repository of over 400,000 third-party packages.
# Install a package
pip install requests

# Install a specific version
pip install requests==2.31.0

# Install a minimum version
# (quote the specifier, otherwise the shell treats > as output redirection)
pip install "requests>=2.28.0"

# Install multiple packages
pip install requests pandas numpy matplotlib

# Upgrade an existing package
pip install --upgrade requests

# Uninstall a package
pip uninstall requests

# List all installed packages
pip list

# Show info about a specific package
pip show requests

# Check for outdated packages
pip list --outdated

# Save all installed packages to a file
pip freeze > requirements.txt

# Install from a requirements file
pip install -r requirements.txt
Virtual Environments Intermediate
Imagine you're working on two projects: Project A needs version 1.0 of a library, Project B needs version 2.0. Without virtual environments, installing one would break the other. Virtual environments are isolated Python installations — each project gets its own sandbox with its own packages, completely separate from everything else.
# Create a virtual environment in a folder called 'venv' python -m venv venv # Activate it (MUST DO THIS every time you work on the project!) # On Windows: venv\Scripts\activate # On Mac/Linux: source venv/bin/activate # Your prompt now shows (venv) to confirm it's active # (venv) $ pip install requests ← installs ONLY in this venv # Deactivate when done deactivate # Never commit the venv/ folder to git! # Add it to .gitignore: # venv/ # __pycache__/ # *.pyc
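If you are ever unsure whether a virtual environment is active, you can ask Python itself. Inside an environment created with venv, sys.prefix points at the environment folder while sys.base_prefix points at the Python installation it was created from. The helper name below is made up for this sketch.

```python
import sys

def in_virtual_environment():
    """Return True when this interpreter is running inside a venv."""
    # Outside a venv both values are the same path
    return sys.prefix != sys.base_prefix

print(f"Interpreter:  {sys.executable}")
print(f"Environment:  {sys.prefix}")
print(f"Inside venv?  {in_virtual_environment()}")
```

Run it once with the environment activated and once after deactivate to see the difference.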
requirements.txt Beginner
A requirements.txt file lists all the packages your project needs. Anyone who wants to run your project can install all dependencies with one command.
# requirements.txt — pin exact versions for reproducibility requests==2.31.0 pandas==2.0.3 numpy==1.25.2 matplotlib==3.7.2 # Or with flexible versions requests>=2.28.0,<3.0.0
# Generate requirements.txt from current environment pip freeze > requirements.txt # Install everything in requirements.txt (for collaborators / deployment) pip install -r requirements.txt
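To confirm from inside Python which version of a package actually got installed, the standard library's importlib.metadata module (Python 3.8+) can look it up. This is a small sketch; installed_version is a helper name invented here, and the package names are only examples.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None

# Example names only; the result depends on what is installed in your environment
for name in ["requests", "pandas", "surely-not-a-real-package-xyz-123"]:
    found = installed_version(name)
    print(f"{name}: {found if found else 'not installed'}")
```

Comparing this output against the pins in requirements.txt is a quick sanity check that you are running in the environment you think you are.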
The datetime Module — Working with Dates & Times Intermediate
Dates and times are surprisingly tricky. Python's datetime module handles the complexity, but it has important gotchas — especially around timezones.
from datetime import datetime, date, time, timedelta, timezone # ── Creating dates and times ─────────────────────────────────── today = date.today() # just the date: 2024-03-15 now = datetime.now() # local date + time (naive!) utc_now = datetime.now(timezone.utc) # ✅ UTC-aware datetime specific = datetime(2024, 3, 15, 14, 30) # year, month, day, hour, min print(today) # 2024-03-15 print(now) # 2024-03-15 14:30:00.123456 print(utc_now) # 2024-03-15 19:30:00.123456+00:00 # ── Formatting: datetime → string (strftime) ─────────────────── # strftime = "string format time" now = datetime(2024, 3, 15, 14, 30, 45) print(now.strftime("%Y-%m-%d")) # "2024-03-15" print(now.strftime("%B %d, %Y")) # "March 15, 2024" print(now.strftime("%I:%M %p")) # "02:30 PM" print(now.strftime("%A, %B %d")) # "Friday, March 15" # Key format codes: # %Y=year(4dig) %m=month(01-12) %d=day(01-31) # %H=hour(24h) %M=minute %S=second # %I=hour(12h) %p=AM/PM # %A=weekday(full) %B=month(full) # ── Parsing: string → datetime (strptime) ───────────────────── # strptime = "string parse time" date_str = "2024-03-15" parsed = datetime.strptime(date_str, "%Y-%m-%d") print(parsed) # 2024-03-15 00:00:00 # fromisoformat — parse ISO 8601 strings (simpler than strptime) dt = datetime.fromisoformat("2024-03-15T14:30:00") # ── Date arithmetic with timedelta ──────────────────────────── today = date.today() tomorrow = today + timedelta(days=1) next_week = today + timedelta(weeks=1) in_90_days = today + timedelta(days=90) two_hours_ago = datetime.now() - timedelta(hours=2) # Difference between two dates start = date(2024, 1, 1) end = date(2024, 12, 31) diff = end - start print(diff.days) # 365 # ── Timestamps ──────────────────────────────────────────────── import time as time_module ts = datetime.now(timezone.utc).timestamp() # Unix timestamp (float) back = datetime.fromtimestamp(ts, tz=timezone.utc) # back to datetime
datetime.now() and datetime.utcnow() return naive datetimes: they carry no timezone info. This causes silent bugs when converting between timezones or comparing datetimes from different sources. datetime.utcnow() is also deprecated as of Python 3.12. Always use datetime.now(timezone.utc) for UTC-aware datetimes. Store and compare everything in UTC; only convert to local time for display.
from datetime import datetime
# No timezone — causes bugs
now = datetime.utcnow()
# now has no tz info!
from datetime import datetime, timezone
# Always timezone-aware
now = datetime.now(timezone.utc)
# now.tzinfo = UTC ✓
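To see why mixing the two kinds matters, the sketch below compares a naive and an aware datetime, then converts the aware one for display. It uses a fixed UTC-5 offset purely for illustration; real local zones with daylight saving are better handled with the zoneinfo module.

```python
from datetime import datetime, timedelta, timezone

naive = datetime(2024, 3, 15, 14, 30)                        # no tzinfo
aware = datetime(2024, 3, 15, 14, 30, tzinfo=timezone.utc)   # UTC-aware

# Ordering comparisons between naive and aware datetimes are refused outright
try:
    naive < aware
    mixed_ok = True
except TypeError as e:
    mixed_ok = False
    print(f"TypeError: {e}")

# Convert to another offset only when displaying
utc_minus_5 = timezone(timedelta(hours=-5))                  # fixed offset, example only
local = aware.astimezone(utc_minus_5)
print(local.isoformat())    # 2024-03-15T09:30:00-05:00
print(local == aware)       # True: same instant, different wall-clock time
```

The comparison error is actually the friendly case. The silent bugs come from naive values that are assumed to be UTC in one place and local time in another.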
Regular Expressions — Pattern Matching with re Intermediate
Regular expressions (regex) let you find, validate, and transform text using patterns. They handle things that simple string methods can't — like "find all email addresses" or "extract every number from this log file."
When to use regex vs. str methods: Use str.find(), str.replace(), str.split() for fixed, simple patterns. Reach for re when you need: repetition (one or more digits), alternatives (cat or dog), groups/captures, or patterns that vary in structure (email, URL, phone number).
import re

# ── Core functions ─────────────────────────────────────────────
# re.search()  : find FIRST match anywhere in string
# re.match()   : match only at START of string
# re.findall() : return list of ALL matches
# re.sub()     : replace matches with something else
# re.compile() : pre-compile a pattern for repeated use (faster)

text = "Call us at 555-1234 or 555-5678 for info."

# re.search returns a Match object or None
match = re.search(r"\d{3}-\d{4}", text)
if match:
    print(match.group())   # "555-1234" (the matched text)
    print(match.start())   # 11 (start index)
    print(match.end())     # 19 (end index)

# re.findall returns all matches as a list
phones = re.findall(r"\d{3}-\d{4}", text)
print(phones)              # ['555-1234', '555-5678']

# re.sub replaces all matches
cleaned = re.sub(r"\d{3}-\d{4}", "[REDACTED]", text)
print(cleaned)             # "Call us at [REDACTED] or [REDACTED] for info."

# re.compile lets you reuse a pattern (more efficient in loops)
phone_pattern = re.compile(r"\d{3}-\d{4}")
for line in ["555-1234", "no match", "999-0000"]:
    if phone_pattern.search(line):
        print(f"Found phone in: {line}")

# ── Essential pattern syntax ───────────────────────────────────
# .       any character except newline
# *       0 or more of previous
# +       1 or more of previous
# ?       0 or 1 of previous (optional)
# {n}     exactly n of previous
# {n,m}   between n and m of previous
# ^       start of string
# $       end of string
# [abc]   any of: a, b, or c
# [^abc]  anything EXCEPT a, b, or c
# \d      any digit (0-9)
# \w      any word character (letter, digit, underscore)
# \s      any whitespace (space, tab, newline)
# (abc)   capturing group
# a|b     a or b

# ── Capturing groups ──────────────────────────────────────────
# Parentheses create groups you can extract separately
email = "alice@example.com"
match = re.search(r"(\w+)@(\w+)\.(\w+)", email)
if match:
    print(match.group(0))  # "alice@example.com" (whole match)
    print(match.group(1))  # "alice" (group 1)
    print(match.group(2))  # "example" (group 2)
    print(match.group(3))  # "com" (group 3)

# Named groups are cleaner than numbered ones
match = re.search(r"(?P<user>\w+)@(?P<domain>\w+)\.(?P<tld>\w+)", email)
if match:
    print(match.group("user"))    # "alice"
    print(match.group("domain"))  # "example"

# ── Raw strings: always use r"..." for patterns ───────────────
# "\\d"  without a raw string, the backslash must be doubled
# r"\d"  with a raw string, the backslash is preserved  ← always use raw strings!

# ── Flags ─────────────────────────────────────────────────────
text = "Hello\nWorld"
re.search(r"hello", text, re.IGNORECASE)   # case-insensitive
re.search(r"^World", text, re.MULTILINE)   # ^ matches at the start of each line
Regex has overhead. For simple checks, "error" in line is more readable and usually faster than re.search(r"error", line). Use regex when you need patterns with structure, repetition, or alternatives, not for simple substring checks.
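Two more functions from the re module come up often in practice: fullmatch(), which succeeds only if the whole string fits the pattern (handy for validation), and finditer(), which yields Match objects one at a time. The sketch below uses a made-up date pattern and log line to show both.

```python
import re

# Named groups for a YYYY-MM-DD shaped date (shape only, not calendar validity)
DATE_PATTERN = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")

def is_iso_date(text):
    """True only if the ENTIRE string looks like YYYY-MM-DD."""
    return DATE_PATTERN.fullmatch(text) is not None

print(is_iso_date("2024-03-15"))         # True
print(is_iso_date("2024-03-15 extra"))   # False (search() would still find a match)

# finditer gives full Match objects, so named groups stay available
log = "Deployed 2024-03-15, rolled back 2024-03-16."
dates = [m.group() for m in DATE_PATTERN.finditer(log)]
days = [int(m.group("day")) for m in DATE_PATTERN.finditer(log)]
print(dates)   # ['2024-03-15', '2024-03-16']
print(days)    # [15, 16]
```

Note that "2024-99-99" also passes is_iso_date, since the pattern checks only the shape. To verify a real date, parse it with datetime.strptime or date.fromisoformat.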
🔑 Key Takeaways — Modules, datetime & Regex
- Use import module for whole modules; from module import name for specific items. Avoid from module import *, which pollutes the namespace
- Virtual environments isolate project dependencies: always create one per project, never install globally for project work
- Always use datetime.now(timezone.utc), not datetime.utcnow(). The former returns an aware datetime; the latter returns a naive one that silently causes timezone bugs
- Always use raw strings r"pattern" for regex: backslashes in patterns behave differently without the r prefix
- Use re.compile() when using the same pattern multiple times: it pre-compiles for efficiency
- Reach for regex only when str methods (find, replace, split, startswith) aren't expressive enough
Part XII — Object Oriented Programming
OOP is a way of organizing code by modeling the world as objects — things that have both data (attributes) and behavior (methods). It's the paradigm behind most large software systems, frameworks, and libraries.
What Is OOP and Why Does It Exist? Intermediate
Think of a blueprint for a house. The blueprint (class) describes what every house of that type has: rooms, doors, windows. An actual house built from that blueprint (object/instance) is real — it has specific values: 3 bedrooms, 2 bathrooms, blue door. You can build many houses from one blueprint. Each house shares the same structure but has its own unique characteristics.
OOP organizes code into "objects" that bundle related data and behavior together, mimicking how we think about real-world entities.
As programs grow large, procedural code becomes tangled. OOP gives structure: group related things, hide complexity, reuse code through inheritance.
Classes and __init__ Intermediate
# CLASS — the blueprint/template # Classes use PascalCase by convention class Dog: """Represents a dog with a name and breed.""" # __init__ = "initializer" — called automatically when you create an instance # 'self' refers to the specific object being created # 'self' MUST be the first parameter of every method — Python passes it automatically def __init__(self, name, breed, age): # self.attribute = value — these become INSTANCE attributes # Each Dog object gets its own copy of these self.name = name # store name on THIS dog object self.breed = breed # store breed on THIS dog object self.age = age # store age on THIS dog object self.tricks = [] # each dog starts with an empty tricks list # METHOD — a function that belongs to the class def bark(self): # 'self' lets the method access the object's data print(f"{self.name} says: Woof!") def learn_trick(self, trick): self.tricks.append(trick) print(f"{self.name} learned {trick}!") def describe(self): tricks_str = ", ".join(self.tricks) if self.tricks else "none" print(f"{self.name} is a {self.age}-year-old {self.breed}. Tricks: {tricks_str}") # INSTANCE — a specific object created from the class blueprint dog1 = Dog("Rex", "German Shepherd", 3) # create dog1 dog2 = Dog("Luna", "Labrador", 5) # create dog2 — independent object! # Access attributes print(dog1.name) # Rex print(dog2.breed) # Labrador # Call methods dog1.bark() # Rex says: Woof! dog1.learn_trick("sit") # Rex learned sit! dog1.learn_trick("shake hands") # Rex learned shake hands! dog1.describe() dog2.bark() # Luna says: Woof! — same method, different object dog2.describe() # tricks: none — dog2 is independent of dog1
Instance vs Class Attributes Intermediate
class BankAccount: # CLASS ATTRIBUTE — shared by ALL instances # Defined at class level, not inside __init__ interest_rate = 0.02 # 2% for all accounts account_count = 0 # tracks total number of accounts created def __init__(self, owner, balance=0): # INSTANCE ATTRIBUTES — unique to each account self.owner = owner self.balance = balance BankAccount.account_count += 1 # update class attribute def apply_interest(self): # Access class attribute via self or class name self.balance *= (1 + BankAccount.interest_rate) def deposit(self, amount): self.balance += amount # Class attributes are accessed via the class OR any instance acc1 = BankAccount("Alice", 1000) acc2 = BankAccount("Bob", 500) print(BankAccount.interest_rate) # 0.02 — class attribute print(acc1.interest_rate) # 0.02 — also accessible via instance print(BankAccount.account_count) # 2 — two accounts created # Instance attributes are per-object print(acc1.owner, acc1.balance) # Alice 1000 print(acc2.owner, acc2.balance) # Bob 500 # Changing class attribute affects ALL instances BankAccount.interest_rate = 0.03 # now 3% for everyone acc1.apply_interest() print(acc1.balance) # 1030.0
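One gotcha follows from the lookup rules above: assigning to a class attribute through an instance does not change the class. It creates a new instance attribute that shadows the shared one. The Account class below is a stripped-down stand-in written just for this sketch.

```python
class Account:
    interest_rate = 0.02          # class attribute, shared by all instances

    def __init__(self, owner):
        self.owner = owner

a = Account("Alice")
b = Account("Bob")

a.interest_rate = 0.05            # creates an INSTANCE attribute on 'a' only

print(Account.interest_rate)      # 0.02 (class value untouched)
print(a.interest_rate)            # 0.05 (instance attribute shadows the class one)
print(b.interest_rate)            # 0.02 (still reads the class attribute)
print("interest_rate" in a.__dict__)   # True
print("interest_rate" in b.__dict__)   # False

Account.interest_rate = 0.03      # change the shared value
print(b.interest_rate)            # 0.03 (b follows the class)
print(a.interest_rate)            # 0.05 (a keeps its own copy)
```

To change the value for everyone, assign through the class name, as the BankAccount example does.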
Inheritance & super() Intermediate
Inheritance is like a family tree. A child inherits traits from their parent but can also have their own unique traits. A SavingsAccount IS-A BankAccount — it has everything a BankAccount has, plus extra features like withdrawal limits.
# Parent (base) class class Animal: def __init__(self, name, species): self.name = name self.species = species self.is_alive = True def eat(self): print(f"{self.name} is eating.") def describe(self): print(f"{self.name} is a {self.species}") # Child (derived) class — inherits from Animal class Dog(Animal): # (Animal) means "inherit from Animal" def __init__(self, name, breed): # super() calls the PARENT's __init__ — don't duplicate code! super().__init__(name, species="Canis lupus familiaris") self.breed = breed # Dog-specific attribute self.tricks = [] # Override parent method def describe(self): # replaces Animal's describe() super().describe() # call parent's version first print(f" Breed: {self.breed}") # then add dog-specific info # New method only dogs have def bark(self): print(f"{self.name}: Woof!") class Cat(Animal): def __init__(self, name, indoor=True): super().__init__(name, species="Felis catus") self.indoor = indoor def purr(self): print(f"{self.name}: Purrr...") # Usage rex = Dog("Rex", "German Shepherd") whiskers = Cat("Whiskers") rex.eat() # inherited from Animal rex.bark() # Dog-specific rex.describe() # overridden version whiskers.eat() # also inherited from Animal whiskers.purr() # Cat-specific # isinstance() — check if object is an instance of a class (or its parents) print(isinstance(rex, Dog)) # True print(isinstance(rex, Animal)) # True! Dog IS-A Animal print(isinstance(rex, Cat)) # False
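The same pattern applies to the SavingsAccount idea mentioned at the start of this section. The sketch below is one possible shape, with a simplified BankAccount defined inline so it runs on its own; the withdrawal_limit parameter and its default are assumptions made for the example.

```python
class BankAccount:
    def __init__(self, owner, balance=0):
        self.owner = owner
        self.balance = balance

    def withdraw(self, amount):
        if amount > self.balance:
            raise ValueError("Insufficient funds")
        self.balance -= amount
        return self.balance

class SavingsAccount(BankAccount):           # SavingsAccount IS-A BankAccount
    def __init__(self, owner, balance=0, withdrawal_limit=500):
        super().__init__(owner, balance)     # reuse the parent's setup
        self.withdrawal_limit = withdrawal_limit

    def withdraw(self, amount):
        # Extra rule first, then delegate to the parent's logic
        if amount > self.withdrawal_limit:
            raise ValueError(f"Limit is {self.withdrawal_limit} per withdrawal")
        return super().withdraw(amount)

savings = SavingsAccount("Alice", balance=1000, withdrawal_limit=200)
print(savings.withdraw(150))                 # 850

try:
    savings.withdraw(300)                    # over the per-withdrawal limit
except ValueError as e:
    limit_message = str(e)
    print(limit_message)                     # Limit is 200 per withdrawal
```

The child class adds only what is different. The balance check itself still lives in one place, in BankAccount.withdraw.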
Dunder / Magic Methods Intermediate
Dunder methods (double underscore methods, like __str__) let your classes work with Python's built-in functions and operators. They're what makes Python's objects feel natural.
class Book: def __init__(self, title, author, pages): self.title = title self.author = author self.pages = pages # __str__: called by print() and str() — human-readable representation def __str__(self): return f"'{self.title}' by {self.author}" # __repr__: called in REPL and repr() — developer representation # Should ideally be enough to recreate the object def __repr__(self): return f"Book(title={self.title!r}, author={self.author!r}, pages={self.pages})" # __len__: called by len() def __len__(self): return self.pages # __eq__: called by == def __eq__(self, other): if not isinstance(other, Book): return NotImplemented return self.title == other.title and self.author == other.author # __lt__: called by < — enables sorting! def __lt__(self, other): return self.pages < other.pages # __contains__: called by 'in' operator def __contains__(self, word): return word.lower() in self.title.lower() # __add__: called by + def __add__(self, other): """Combine two books into an anthology.""" return Book( title=f"{self.title} & {other.title}", author=f"{self.author}, {other.author}", pages=self.pages + other.pages ) # In action b1 = Book("Python Basics", "Alice", 300) b2 = Book("Advanced Python", "Bob", 500) b3 = Book("Python Basics", "Alice", 300) print(b1) # 'Python Basics' by Alice (__str__) print(repr(b1)) # Book(title='Python Basics', ...) (__repr__) print(len(b1)) # 300 (__len__) print(b1 == b3) # True (__eq__) print(b1 == b2) # False print(b1 < b2) # True (300 < 500) (__lt__) print("Python" in b1) # True (__contains__) anthology = b1 + b2 # creates a new Book (__add__) print(anthology) print(sorted([b2, b1])) # sorts by pages using __lt__
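Two details are worth knowing once you define comparison dunders. First, functools.total_ordering can fill in <=, >, and >= from __eq__ and __lt__. Second, a class that defines __eq__ without __hash__ becomes unhashable, so it cannot go in a set or be used as a dict key. The Version class below is a made-up example showing both.

```python
from functools import total_ordering

@total_ordering                       # derives <=, >, >= from __eq__ and __lt__
class Version:
    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def _key(self):
        return (self.major, self.minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() < other._key()

    def __hash__(self):
        # Defined explicitly because __eq__ alone would disable hashing
        return hash(self._key())

    def __repr__(self):
        return f"Version({self.major}, {self.minor})"

v1, v2 = Version(1, 4), Version(2, 0)
print(v1 <= v2)                       # True (generated by total_ordering)
print(v1 >= v2)                       # False
print(max(v1, v2))                    # Version(2, 0)
print(len({Version(1, 4), Version(1, 4)}))   # 1 (equal objects hash alike)
```

Hashing on the same fields that __eq__ compares keeps the two methods consistent, which sets and dicts rely on.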
Dataclasses — Boilerplate-Free Classes Intermediate
from dataclasses import dataclass, field # @dataclass decorator auto-generates __init__, __repr__, __eq__ # You just declare the fields — no boilerplate! @dataclass class Point: x: float # type hint required! y: float z: float = 0.0 # default value p = Point(1.0, 2.0) print(p) # Point(x=1.0, y=2.0, z=0.0) — __repr__ auto-generated print(p.x, p.y) # 1.0, 2.0 — attributes work normally p2 = Point(1.0, 2.0) print(p == p2) # True — __eq__ auto-generated (compares all fields) # Frozen dataclass — immutable (like a named tuple with types) @dataclass(frozen=True) class Color: r: int g: int b: int red = Color(255, 0, 0) # red.r = 100 ← FrozenInstanceError! Can't modify frozen dataclass # Dataclass with methods @dataclass class Student: name: str grades: list = field(default_factory=list) # ← correct way to have mutable default def average(self) -> float: return sum(self.grades) / len(self.grades) if self.grades else 0.0 alice = Student("Alice") alice.grades.extend([95, 87, 92]) print(f"{alice.name}'s average: {alice.average():.1f}")
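Dataclasses have a few more switches that save boilerplate. The sketch below shows two of them: order=True, which generates the comparison methods, and __post_init__, which runs right after the generated __init__ and is a natural place for validation. The Task class and its fields are invented for this example.

```python
from dataclasses import dataclass, field

@dataclass(order=True)                    # auto-generates <, <=, >, >=
class Task:
    priority: int
    name: str = field(compare=False)      # ignored by == and by ordering

    def __post_init__(self):
        # Runs after the generated __init__ has set the fields
        if self.priority < 0:
            raise ValueError(f"priority must be >= 0, got {self.priority}")

tasks = [Task(3, "write docs"), Task(1, "fix bug"), Task(2, "review PR")]
ordered = [t.name for t in sorted(tasks)]
print(ordered)                            # ['fix bug', 'review PR', 'write docs']

try:
    Task(-1, "impossible")
except ValueError as e:
    print(f"Rejected: {e}")
```

Because name is excluded from comparison, two tasks with the same priority count as equal regardless of their names.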
🔑 Key Takeaways — Object Oriented Programming
- A class is a blueprint; an instance is a concrete object built from that blueprint. Each instance has its own attribute values
- self is always the first parameter of instance methods. Python passes it automatically when you call a method
- Use @dataclass when you mostly need data storage with automatic __init__, __repr__, and __eq__, and want to avoid boilerplate
- Inherit when you have an "is-a" relationship. Use composition (store an instance as an attribute) for "has-a" relationships
- Implement __str__ for human-readable output and __repr__ for developer/debug output. If you write only one, make it __repr__, because str() falls back to it
- Ask "do I need shared state and multiple related behaviors?" before reaching for a class. Functions often suffice
Part XIII — Intermediate Pythonic Patterns
These patterns separate good Python from great Python. Generators, decorators, and context managers are the tools that make Python code elegant, memory-efficient, and maintainable at scale.
Comprehensions In Depth Intermediate
# ── LIST COMPREHENSION ───────────────────────────────────────── # [expression for item in iterable if condition] # All in one line — more readable once you know the pattern squares = [x**2 for x in range(10)] evens = [x for x in range(20) if x % 2 == 0] # Conditional expression inside (ternary in comprehension) labeled = ["even" if x % 2 == 0 else "odd" for x in range(6)] print(labeled) # ['even', 'odd', 'even', 'odd', 'even', 'odd'] # Nested comprehension — flatten a 2D list matrix = [[1,2],[3,4],[5,6]] flat = [n for row in matrix for n in row] print(flat) # [1, 2, 3, 4, 5, 6] # ── DICT COMPREHENSION ───────────────────────────────────────── # {key_expr: val_expr for item in iterable if condition} # Word length mapping words = ["python", "is", "great"] lengths = {word: len(word) for word in words} print(lengths) # {'python': 6, 'is': 2, 'great': 5} # Filter while building scores = {"Alice": 92, "Bob": 68, "Charlie": 85} passing = {name: score for name, score in scores.items() if score >= 80} print(passing) # {'Alice': 92, 'Charlie': 85} # ── SET COMPREHENSION ────────────────────────────────────────── # {expression for item in iterable} # Unique word lengths sentence = "the quick brown fox jumps over the lazy dog" unique_lengths = {len(word) for word in sentence.split()} print(unique_lengths) # {3, 4, 5} (unordered, unique) # When NOT to use comprehensions: # If the logic is complex, use a regular loop — readability wins! # BAD — too complex to read quickly: result = [str(x**2) for x in range(100) if x % 3 == 0 if x % 5 == 0 if x < 50] # Better: write a regular loop with comments
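For comparison, here is the over-packed comprehension from the end of the block rewritten as a regular loop. This is just one reasonable way to unpack it.

```python
result = []
for x in range(50):                 # range(50) replaces the "x < 50" filter
    if x % 15 != 0:                 # divisible by both 3 and 5 means divisible by 15
        continue                    # skip everything else
    result.append(str(x ** 2))      # keep the square, as text

print(result)                       # ['0', '225', '900', '2025']
```

It takes a few more lines, but each condition now has room for a comment, and a reader no longer has to untangle three chained if clauses.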
Generator Expressions & Functions Intermediate
Generators produce values on-demand, one at a time, instead of creating the entire sequence in memory at once.
For large datasets, generators use tiny amounts of memory. A generator for a billion numbers uses the same memory as one for ten.
# List comprehension: creates ALL values in memory at once squares_list = [x**2 for x in range(1_000_000)] # ~8MB of memory # Generator expression: produces values ONE AT A TIME (lazy evaluation) # Same syntax as list comprehension but with () instead of [] squares_gen = (x**2 for x in range(1_000_000)) # ~200 bytes! # You can iterate over it just like a list for sq in squares_gen: if sq > 100: break # only generated values up to this point # Generators work great with sum(), max(), any(), all() total = sum(x**2 for x in range(100)) # no intermediate list created! has_negative = any(x < 0 for x in [1, -2, 3]) # stops as soon as it finds -2 # ── GENERATOR FUNCTIONS (using yield) ───────────────────────── # 'yield' is like 'return' but pauses the function instead of ending it # Each time next() is called, execution resumes from after the yield def count_up(start, step=1): """Generate integers starting from 'start', incrementing by 'step'.""" current = start while True: # infinite generator — only generates when asked yield current # pause here and send 'current' to the caller current += step # then resume from here on next call gen = count_up(10, 5) print(next(gen)) # 10 — get first value print(next(gen)) # 15 print(next(gen)) # 20 # Generator remembers its state between calls! # Practical generator: reading a huge file line by line def read_large_file(filepath): """Read file without loading it all into memory.""" with open(filepath, "r") as f: for line in f: yield line.strip() # yield one line at a time # The file is read one line at a time — works for 100GB files! for line in read_large_file("huge_data.txt"): process(line) # handle one line, then discard it
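Generators can also be chained into pipelines, and itertools.islice lets you take a fixed number of items from an infinite one. The sketch below repeats count_up so it runs on its own; only_even is a helper made up for the example. It also shows a common surprise: a generator can be consumed only once.

```python
from itertools import islice

def count_up(start, step=1):
    """Infinite generator, as defined in the block above."""
    current = start
    while True:
        yield current
        current += step

def only_even(numbers):
    """Pass through only the even values from another iterable."""
    for n in numbers:
        if n % 2 == 0:
            yield n

# Nothing is computed until values are requested
pipeline = only_even(count_up(1, 3))        # source yields 1, 4, 7, 10, 13, 16, ...
first_three = list(islice(pipeline, 3))     # take just three values, then stop
print(first_three)                          # [4, 10, 16]

# A generator is exhausted after one pass
gen = (x * x for x in range(3))
first_pass = list(gen)                      # [0, 1, 4]
second_pass = list(gen)                     # [] (nothing left)
print(first_pass, second_pass)
```

If you need to loop over the values more than once, build a list, or create a fresh generator for each pass.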
Decorators Advanced
A decorator is like putting a fancy frame around a picture. The picture (function) doesn't change — the frame (decorator) just adds something around it. You can add logging, timing, caching, or access control to any function without modifying the function itself.
import time from functools import wraps # ── HOW DECORATORS WORK ──────────────────────────────────────── # A decorator is a function that takes a function and returns a function def timer(func): # 'func' is the function being decorated @wraps(func) # preserves original function's metadata def wrapper(*args, **kwargs): # wrapper captures all arguments start = time.time() # do something BEFORE result = func(*args, **kwargs) # call the original function end = time.time() # do something AFTER print(f"{func.__name__} took {end - start:.4f}s") return result # return the original result return wrapper # return the wrapper, not the result # Apply with @ syntax — this is just syntactic sugar for: # slow_function = timer(slow_function) @timer def slow_function(n): """Sleep for n seconds.""" time.sleep(n) return f"Done sleeping for {n}s" result = slow_function(1) # slow_function took 1.0002s print(result) # Done sleeping for 1s # ── DECORATOR WITH ARGUMENTS ─────────────────────────────────── def retry(max_attempts=3): # outer function takes decorator args def decorator(func): # middle function takes the function @wraps(func) def wrapper(*args, **kwargs): # inner function runs the logic for attempt in range(max_attempts): try: return func(*args, **kwargs) except Exception as e: print(f"Attempt {attempt+1} failed: {e}") if attempt + 1 == max_attempts: raise return wrapper return decorator @retry(max_attempts=3) def flaky_function(): import random if random.random() < 0.7: # fails 70% of the time raise ConnectionError("Network error") return "Success!" 
# ── BUILT-IN DECORATORS ──────────────────────────────────────── class Circle: def __init__(self, radius): self._radius = radius # _ prefix = "private by convention" # @property: access a method like an attribute (no parentheses) @property def radius(self): return self._radius # @radius.setter: called when you set circle.radius = value @radius.setter def radius(self, value): if value < 0: raise ValueError("Radius cannot be negative") self._radius = value @property def area(self): import math return math.pi * self._radius ** 2 # @staticmethod: belongs to class but doesn't need self or cls @staticmethod def is_valid_radius(r): return r > 0 # @classmethod: alternative constructor, gets class as first arg (cls) @classmethod def from_diameter(cls, diameter): return cls(diameter / 2) c = Circle(5) print(c.radius) # 5 — property accessed like attribute print(c.area) # 78.54 — computed on the fly c.radius = 10 # calls the setter # c.radius = -1 ← ValueError! c2 = Circle.from_diameter(20) # alternative constructor print(c2.radius) # 10.0 print(Circle.is_valid_radius(5)) # True — no instance needed
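One detail the examples above leave implicit is what happens when several decorators are stacked. The sketch below uses a hypothetical `tag` decorator factory (not part of any library) to show that the decorator closest to the function is applied first and therefore ends up innermost.

```python
from functools import wraps

def tag(label):
    """Hypothetical decorator factory: wraps the result in [label]...[/label]."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            return f"[{label}]{func(*args, **kwargs)}[/{label}]"
        return wrapper
    return decorator

@tag("outer")      # applied second, so it runs outermost
@tag("inner")      # applied first, closest to the function
def greet(name):
    """Return a greeting."""
    return f"Hello, {name}"

# Equivalent to: greet = tag("outer")(tag("inner")(greet))
print(greet("Ada"))        # [outer][inner]Hello, Ada[/inner][/outer]
print(greet.__name__)      # greet
```

Because both layers use `@wraps`, the function keeps its original name through the whole stack.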
Context Managers Advanced
Context managers (the with statement) ensure resources are properly set up and torn down. You've already used one with with open(...). Here's how to build your own.
import time

# Class-based context manager
# __enter__: called at the start of the 'with' block
# __exit__: called at the end, even if an error occurs
class Timer:
    """Context manager that measures elapsed time."""

    def __enter__(self):
        self.start = time.perf_counter()   # perf_counter is designed for measuring intervals
        return self                        # 'as' clause receives this return value

    def __exit__(self, exc_type, exc_val, exc_tb):
        # exc_type, exc_val, exc_tb = exception info (None if no error)
        self.elapsed = time.perf_counter() - self.start
        print(f"Elapsed: {self.elapsed:.4f}s")
        return False                       # False = don't suppress exceptions

with Timer() as t:                         # t = what __enter__ returned (self)
    time.sleep(0.5)
    print(f"Mid-way, elapsed so far: {time.perf_counter() - t.start:.2f}s")
# __exit__ called here, e.g.: Elapsed: 0.5001s

# ── contextlib: easier context managers ───────────────────────
from contextlib import contextmanager

@contextmanager
def managed_resource(name):
    print(f"Setting up {name}")            # runs BEFORE the with block
    try:
        yield name                         # pause here: this is the 'with' block
    finally:
        print(f"Cleaning up {name}")       # runs AFTER, even on error

with managed_resource("database") as resource:
    print(f"Using {resource}")
# Output:
# Setting up database
# Using database
# Cleaning up database

# Multiple context managers in one with statement.
# The comma form works in every Python 3 version; wrapping the managers
# in parentheses to split them across lines is officially Python 3.10+.
with open("input.txt") as infile, open("output.txt", "w") as outfile:
    outfile.write(infile.read().upper())
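The `Timer` example returns `False` from `__exit__`, which lets exceptions propagate. As a minimal sketch of the opposite case, the hypothetical `Suppress` class below returns `True` for selected exception types, which tells Python the exception has been handled. The standard library's `contextlib.suppress` offers the same behavior ready-made.

```python
class Suppress:
    """Hypothetical context manager that swallows the given exception types."""

    def __init__(self, *exc_types):
        self.exc_types = exc_types
        self.caught = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        # exc_type is None when the block finished without an error.
        if exc_type is not None and issubclass(exc_type, self.exc_types):
            self.caught = exc_val
            return True       # True = exception handled, do not re-raise
        return False          # anything else propagates as usual

with Suppress(ZeroDivisionError) as guard:
    result = 1 / 0            # raises, but __exit__ suppresses it
    print("never reached")    # skipped: the block ends at the exception

print(type(guard.caught).__name__)   # ZeroDivisionError
```

Suppressing exceptions hides failures, so reserve this for errors you have deliberately decided to ignore.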
itertools — Efficient Iteration Tools Intermediate
itertools is a collection of fast, memory-efficient functions for working with iterables. They all return lazy iterators — they compute values on demand, never building the full result in memory at once.
import itertools # ── chain() — flatten multiple iterables into one ────────────── letters = ["a", "b"] numbers = [1, 2, 3] symbols = ("!", "?") for item in itertools.chain(letters, numbers, symbols): print(item, end=" ") # a b 1 2 3 ! ? # vs the naive approach: for item in letters + numbers + list(symbols) ... # chain() doesn't build a new list — saves memory on large collections # ── product() — Cartesian product (replaces nested for loops) ── sizes = ["S", "M", "L"] colors = ["red", "blue"] # Naive approach with nested loop: # for size in sizes: # for color in colors: print(size, color) # With product: for size, color in itertools.product(sizes, colors): print(f"{size}-{color}") # S-red, S-blue, M-red, M-blue, L-red, L-blue # ── groupby() — group consecutive items by a key ─────────────── # IMPORTANT: data must be sorted by the key first! data = [ {"name": "Alice", "dept": "Eng"}, {"name": "Bob", "dept": "Eng"}, {"name": "Carol", "dept": "HR"}, {"name": "Dave", "dept": "HR"}, ] data.sort(key=lambda x: x["dept"]) # sort first! for dept, members in itertools.groupby(data, key=lambda x: x["dept"]): names = [m["name"] for m in members] print(f"{dept}: {names}") # Eng: ['Alice', 'Bob'] # HR: ['Carol', 'Dave'] # ── islice() — lazy slicing of any iterable ──────────────────── # Useful when you can't use [start:stop] (e.g., on a generator) def infinite_counter(): n = 0 while True: yield n n += 1 first_10 = list(itertools.islice(infinite_counter(), 10)) print(first_10) # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] # ── combinations() & permutations() ─────────────────────────── items = ["A", "B", "C"] print(list(itertools.combinations(items, 2))) # [('A','B'), ('A','C'), ('B','C')] — order doesn't matter print(list(itertools.permutations(items, 2))) # [('A','B'), ('A','C'), ('B','A'), ('B','C'), ...] — order matters
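The warning that `groupby()` needs sorted input is easier to remember after seeing what goes wrong without it. This sketch uses a made-up word list and a hypothetical `first_letter` key function; `groupby()` starts a new group every time the key changes, so unsorted data yields repeated keys.

```python
import itertools

words = ["apple", "avocado", "banana", "apricot", "blueberry"]

def first_letter(word):
    return word[0]

# Unsorted input: a new group starts whenever the key changes,
# so "a" and "b" each show up twice.
unsorted_groups = [(key, list(group))
                   for key, group in itertools.groupby(words, key=first_letter)]
print(unsorted_groups)
# [('a', ['apple', 'avocado']), ('b', ['banana']), ('a', ['apricot']), ('b', ['blueberry'])]

# Sorted by the same key first: one group per letter.
by_letter = sorted(words, key=first_letter)
sorted_groups = [(key, list(group))
                 for key, group in itertools.groupby(by_letter, key=first_letter)]
print(sorted_groups)
# [('a', ['apple', 'avocado', 'apricot']), ('b', ['banana', 'blueberry'])]
```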
functools — Higher-Order Function Tools Intermediate
functools provides tools for working with functions as first-class objects — caching, partial application, and decoration helpers.
from functools import lru_cache, cache, partial, reduce, wraps
import operator

# ── @lru_cache: memoization in one decorator ──────────────────
# PROBLEM: recursive fibonacci recomputes the same values millions of times
# APPROACH: cache results. If we've seen the inputs before, return the cached result

# Without cache: exponential time O(2^n)
def fib_slow(n):
    if n < 2:
        return n
    return fib_slow(n-1) + fib_slow(n-2)

# With lru_cache: linear time O(n)
@lru_cache(maxsize=None)       # cache unlimited results
def fib_fast(n):
    if n < 2:
        return n
    return fib_fast(n-1) + fib_fast(n-2)

print(fib_fast(50))            # instant! fib_slow(50) would need ~40 billion calls
print(fib_fast.cache_info())   # CacheInfo(hits=48, misses=51, ...)

# @cache is a simpler alias for @lru_cache(maxsize=None)
@cache                         # Python 3.9+
def expensive_computation(n):
    return n ** 2

# ── partial(): pre-fill function arguments ────────────────────
# Creates a new function with some arguments already filled in
def power(base, exponent):
    return base ** exponent

square = partial(power, exponent=2)   # exponent always 2
cube = partial(power, exponent=3)     # exponent always 3
print(square(5))   # 25
print(cube(3))     # 27

# Useful with sorted(): pre-fill the key function
data = ["banana", "apple", "Cherry", "date"]
sort_ci = partial(sorted, key=str.lower)   # case-insensitive sort
print(sort_ci(data))   # ['apple', 'banana', 'Cherry', 'date']

# ── @wraps: every decorator MUST use this ─────────────────────
# Without @wraps, decorators destroy the function's metadata
def timer_bad(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper   # wrapper.__name__ = "wrapper": WRONG!

def timer_good(func):
    @wraps(func)     # copies __name__, __doc__, __annotations__ from func
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper   # wrapper.__name__ = func.__name__: correct!

@timer_good
def calculate(x):
    """Squares x."""
    return x ** 2

print(calculate.__name__)   # "calculate", preserved by @wraps
print(calculate.__doc__)    # "Squares x.", preserved by @wraps

# ── reduce(): fold a sequence into a single value ─────────────
numbers = [1, 2, 3, 4, 5]
total = reduce(operator.add, numbers)     # 15, same as sum(numbers)
product = reduce(operator.mul, numbers)   # 120, i.e. 1*2*3*4*5

# Generally: prefer sum(), max(), min() over reduce() when possible
# Use reduce() for operations that don't have a built-in equivalent
Every decorator you write should apply @wraps(func) to its inner wrapper function. Without it, the decorated function loses its __name__, __doc__, and __annotations__. This breaks help(), introspection tools, pytest output, and type checkers. It's a one-line fix with no downside.
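One further effect of `@wraps` worth knowing: it stores the original function on the wrapper as `__wrapped__`. The sketch below uses a hypothetical `shout` decorator to show how that attribute lets you reach the undecorated function, for example in a unit test that should bypass the decorator.

```python
from functools import wraps

def shout(func):
    """Hypothetical decorator that upper-cases the result."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs).upper()
    return wrapper

@shout
def greet(name):
    """Return a greeting."""
    return f"hello, {name}"

print(greet("ada"))              # HELLO, ADA
print(greet.__wrapped__("ada"))  # hello, ada  (the undecorated function)
print(greet.__doc__)             # Return a greeting.
```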
🔑 Key Takeaways — Intermediate Patterns
- List comprehensions are faster than equivalent for loops for building new collections — but don't nest them more than 2 levels deep
- Generators are lazy — they compute values on demand. Use them when you don't need all values at once, especially for large datasets
- Decorators are functions that wrap other functions. Always use @wraps(func) inside your decorators to preserve metadata
- @lru_cache / @cache adds memoization in one line, transforming exponential recursive algorithms to linear
- itertools.chain() and itertools.product() replace nested loops with clean, memory-efficient iterators
- Context managers (with blocks) guarantee cleanup even when exceptions occur; use @contextmanager for concise custom ones
Part XIV — Popular Libraries — Practical Depth
These libraries cover much of what Python professionals do every day: pandas for data manipulation, numpy for numerical computing, requests for HTTP, and matplotlib and seaborn for visualization.
pandas — Data Analysis Intermediate
pandas is the gold standard for working with tabular data in Python. Its two core objects are Series (a column) and DataFrame (a table).
pip install pandas
import pandas as pd            # 'pd' is the universal alias
import numpy as np

# ── SERIES: a labeled 1D array (like a column) ────────────────
temps = pd.Series([72, 75, 68, 80, 79],
                  index=["Mon", "Tue", "Wed", "Thu", "Fri"])
print(temps)
print(temps["Wed"])            # 68, accessed by label
print(temps.mean())            # 74.8
print(temps[temps > 74])       # filter: only days above 74

# ── DATAFRAME: a 2D table ─────────────────────────────────────
# Create from a dict of lists (each list = one column)
data = {
    "name":   ["Alice", "Bob", "Charlie", "Diana", "Eve"],
    "dept":   ["Eng", "Mktg", "Eng", "HR", "Eng"],
    "salary": [95000, 72000, 105000, 68000, 98000],
    "years":  [3, 5, 7, 2, 4],
}
df = pd.DataFrame(data)
print(df)
print(df.head(3))              # first 3 rows
print(df.tail(2))              # last 2 rows
print(df.shape)                # (5, 4): rows, columns
print(df.dtypes)               # data type of each column
print(df.describe())           # statistics: mean, std, min, max...
df.info()                      # column types, null counts, memory
                               # (info() prints directly and returns None)

# ── READING/WRITING CSV ────────────────────────────────────────
df.to_csv("employees.csv", index=False)    # save to CSV
df2 = pd.read_csv("employees.csv")         # read from CSV
df2 = pd.read_csv("employees.csv",
                  dtype={"salary": float},     # specify column types
                  na_values=["N/A", "-"])      # what counts as missing

# Reading Excel
# df_excel = pd.read_excel("data.xlsx", sheet_name="Sheet1")

# ── SELECTING DATA ─────────────────────────────────────────────
# Single column (returns Series)
print(df["name"])
# Multiple columns (returns DataFrame)
print(df[["name", "salary"]])

# By position: iloc[rows, columns]
print(df.iloc[0])              # first row (as Series)
print(df.iloc[1:3])            # rows 1-2
print(df.iloc[0, 2])           # row 0, column 2 → 95000

# By label: loc[rows, columns]
print(df.loc[0, "salary"])     # row with index 0, "salary" column

# ── FILTERING ─────────────────────────────────────────────────
# Boolean mask: the condition returns a Series of True/False
engineers = df[df["dept"] == "Eng"]
print(engineers)

# Multiple conditions: must use & (and) | (or), not 'and'/'or'
senior_eng = df[(df["dept"] == "Eng") & (df["years"] >= 4)]
print(senior_eng)

high_salary = df[df["salary"] > 90000]
print(high_salary["name"])     # just the names column

# .isin(): match any value in a list
tech_depts = df[df["dept"].isin(["Eng", "IT"])]

# .str accessor for string operations on columns
df["name_upper"] = df["name"].str.upper()
df["name_length"] = df["name"].str.len()

# ── GROUPBY: split-apply-combine ──────────────────────────────
# GroupBy is one of the most powerful pandas features
# "Group by dept, then calculate the mean salary for each group"
dept_stats = df.groupby("dept")["salary"].agg(["mean", "min", "max", "count"])
print(dept_stats)

# Group by multiple columns
# df.groupby(["dept", "seniority"])["salary"].mean()

# ── MERGE (SQL-style JOIN) ─────────────────────────────────────
projects = pd.DataFrame({
    "name":    ["Alice", "Bob", "Alice"],
    "project": ["Alpha", "Beta", "Gamma"]
})
merged = pd.merge(df, projects, on="name", how="left")   # left join
print(merged)

# ── APPLY: custom function on each value/row/column ───────────
def grade_salary(salary):
    if salary >= 100000:
        return "Senior"
    if salary >= 80000:
        return "Mid"
    return "Junior"

df["level"] = df["salary"].apply(grade_salary)   # apply to each value in the column
print(df[["name", "salary", "level"]])

# ── PIVOT TABLE ────────────────────────────────────────────────
pivot = df.pivot_table(values="salary", index="dept", aggfunc="mean")
print(pivot)

# ── HANDLING MISSING VALUES ────────────────────────────────────
df_with_nulls = pd.DataFrame({"a": [1, None, 3], "b": [4, 5, None]})
print(df_with_nulls.isnull())          # True where value is missing
print(df_with_nulls.isnull().sum())    # count missing per column
df_dropped = df_with_nulls.dropna()    # remove rows with ANY null
df_filled = df_with_nulls.fillna(0)    # replace null with 0
df_filled2 = df_with_nulls.fillna(df_with_nulls.mean())   # fill with column mean
numpy — Numerical Computing Intermediate
numpy provides n-dimensional arrays and the mathematical operations on them. It's the foundation of pandas, matplotlib, scikit-learn — virtually all scientific Python. It's fast because operations happen in C underneath.
import numpy as np # ── CREATING ARRAYS ──────────────────────────────────────────── a = np.array([1, 2, 3, 4, 5]) # from a Python list b = np.array([[1,2,3],[4,5,6]]) # 2D array (matrix) print(a.shape) # (5,) — 1D array with 5 elements print(b.shape) # (2, 3) — 2 rows, 3 columns print(b.ndim) # 2 — number of dimensions print(b.dtype) # int64 — data type print(b.size) # 6 — total number of elements # Common array creation functions zeros = np.zeros((3, 4)) # 3x4 array of zeros ones = np.ones((2, 3)) # 2x3 array of ones eye = np.eye(4) # 4x4 identity matrix rng = np.arange(0, 10, 2) # [0, 2, 4, 6, 8] — like range() but returns array lin = np.linspace(0, 1, 5) # 5 evenly spaced from 0 to 1: [0, 0.25, 0.5, 0.75, 1] rand = np.random.rand(3, 3) # 3x3 random floats [0, 1) randn = np.random.randn(100) # 100 values from normal distribution # ── VECTORIZED OPERATIONS — no loops needed! ─────────────────── # Operations apply to ALL elements at once — much faster than Python loops prices = np.array([10.0, 20.0, 30.0, 40.0]) tax_rate = 0.08 # No loop needed — operates on entire array prices_with_tax = prices * (1 + tax_rate) print(prices_with_tax) # [10.8, 21.6, 32.4, 43.2] discounted = prices - 5 # subtract 5 from each doubled = prices * 2 # multiply each by 2 squared = prices ** 2 # square each # Math functions that work element-wise x = np.linspace(0, 2*np.pi, 100) y = np.sin(x) # sin of every element y2 = np.exp(np.array([1, 2, 3])) # [e^1, e^2, e^3] # ── BROADCASTING — operations between different-shaped arrays ── # numpy auto-expands smaller array to match larger matrix = np.array([[1,2,3],[4,5,6],[7,8,9]]) row = np.array([10, 20, 30]) # shape (3,) result = matrix + row # row broadcasts to match matrix print(result) # [[11, 22, 33], # [14, 25, 36], # [17, 28, 39]] # ── ARRAY MATH AND STATISTICS ────────────────────────────────── data = np.random.randn(1000) # 1000 random numbers print(np.mean(data)) # mean (≈ 0 for standard normal) print(np.std(data)) # standard 
deviation (≈ 1) print(np.min(data), np.max(data)) print(np.percentile(data, [25, 50, 75])) # quartiles print(np.sum(data)) print(np.cumsum(np.array([1,2,3,4]))) # [1, 3, 6, 10] cumulative sum # ── INDEXING AND SLICING ─────────────────────────────────────── arr = np.arange(12).reshape(3, 4) # reshape 1D to 3x4 matrix print(arr) print(arr[1, 2]) # row 1, col 2 → 6 print(arr[0, :]) # entire first row print(arr[:, 1]) # entire second column print(arr[1:, 2:]) # bottom-right 2x2 block # Boolean indexing — filter with a condition print(arr[arr > 6]) # only values > 6 # ── WHY IS NUMPY FASTER THAN LISTS? ─────────────────────────── # Python lists store objects (with overhead). numpy stores raw numbers. # Operations use optimized C/Fortran code under the hood. # For 1 million elements: # Python loop: ~100ms # numpy vectorized: ~1ms (100x faster!) import time n = 1_000_000 # Python list approach py_list = list(range(n)) start = time.time() result = [x**2 for x in py_list] print(f"Python: {time.time() - start:.3f}s") # numpy approach np_arr = np.arange(n) start = time.time() result = np_arr ** 2 print(f"numpy: {time.time() - start:.3f}s") # ~50-100x faster!
requests — HTTP for Humans Intermediate
The requests library is the standard way to make HTTP requests in Python — fetching web pages, calling APIs, submitting data. Its tagline is literally "HTTP for Humans."
import requests

# ── GET REQUEST: fetch data ───────────────────────────────────
response = requests.get("https://api.github.com")

# Response object properties
print(response.status_code)    # 200 = success, 404 = not found, 500 = server error
print(response.ok)             # True if status_code < 400
print(response.headers)        # response headers dict
print(response.text)           # response body as string
data = response.json()         # parse JSON body → Python dict
print(data)

# GET with query parameters (URL: ?page=1&per_page=5)
params = {"page": 1, "per_page": 5}
response = requests.get(
    "https://api.github.com/repos/python/cpython/issues",
    params=params              # requests builds the query string for you
)
issues = response.json()
for issue in issues:
    print(f"#{issue['number']}: {issue['title']}")

# ── POST REQUEST: send data ───────────────────────────────────
# Sending form data
response = requests.post(
    "https://httpbin.org/post",    # test endpoint that echoes back what you send
    data={"username": "alice", "password": "secret"}
)

# Sending JSON
response = requests.post(
    "https://httpbin.org/post",
    json={"user": "alice", "score": 100}   # auto-sets Content-Type: application/json
)
print(response.json())

# ── HEADERS: authentication, content type ─────────────────────
headers = {
    "Authorization": "Bearer YOUR_API_TOKEN_HERE",
    "Accept": "application/json",
    "User-Agent": "MyApp/1.0"
}
response = requests.get("https://api.example.com/data", headers=headers)

# ── ERROR HANDLING ─────────────────────────────────────────────
def safe_request(url, **kwargs):
    """Make a GET request with proper error handling."""
    try:
        response = requests.get(url, timeout=10, **kwargs)   # timeout in seconds
        response.raise_for_status()    # raises HTTPError for 4xx/5xx responses
        return response.json()
    except requests.exceptions.Timeout:
        print("Request timed out")
    except requests.exceptions.ConnectionError:
        print("Could not connect (no network, DNS failure, or refused connection)")
    except requests.exceptions.HTTPError as e:
        print(f"HTTP Error: {e.response.status_code}")
    except requests.exceptions.JSONDecodeError:   # available in requests 2.27+
        print("Response was not valid JSON")
    return None

# ── SESSION: reuse connection and headers ─────────────────────
# Sessions are more efficient for multiple requests to the same host
with requests.Session() as session:
    session.headers["Authorization"] = "Bearer TOKEN"   # set once, used for all
    session.headers["Accept"] = "application/json"
    r1 = session.get("https://api.example.com/users")
    r2 = session.get("https://api.example.com/posts")
    # both use the same headers and connection pool

# ── OTHER METHODS ──────────────────────────────────────────────
# requests.put(url, json=data)     update a resource
# requests.patch(url, json=data)   partial update
# requests.delete(url)             delete a resource
# requests.head(url)               get headers only (no body)
matplotlib & seaborn — Visualization Intermediate
pip install matplotlib seaborn
import matplotlib.pyplot as plt # 'plt' is the universal alias import seaborn as sns # 'sns' is the universal alias import numpy as np import pandas as pd # ── LINE CHART ───────────────────────────────────────────────── x = np.linspace(0, 2*np.pi, 100) # 100 points from 0 to 2π plt.figure(figsize=(10, 4)) # create figure: 10 wide, 4 tall (inches) plt.plot(x, np.sin(x), label="sin(x)", color="blue", linewidth=2) plt.plot(x, np.cos(x), label="cos(x)", color="red", linestyle="--") plt.title("Sine and Cosine", fontsize=14) plt.xlabel("x") plt.ylabel("y") plt.legend() # show legend with labels plt.grid(True, alpha=0.3) # light grid lines plt.tight_layout() # auto-adjust spacing plt.savefig("sine_cosine.png", dpi=150) # save to file plt.show() # display (blocks until window closed) # ── BAR CHART ────────────────────────────────────────────────── categories = ["Engineering", "Marketing", "HR", "Finance"] values = [95000, 72000, 68000, 85000] colors = ["#1a4a8c", "#2e6dbf", "#a8c4e8", "#0f2d5a"] plt.figure(figsize=(8, 5)) bars = plt.bar(categories, values, color=colors, edgecolor="white", linewidth=1.5) plt.title("Average Salary by Department") plt.ylabel("Salary ($)") # Add value labels on top of each bar for bar, val in zip(bars, values): plt.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 500, f"${val:,}", ha="center", va="bottom", fontweight="bold") plt.tight_layout() plt.show() # ── SCATTER PLOT ─────────────────────────────────────────────── n = 100 x = np.random.randn(n) y = 2 * x + np.random.randn(n) # linear relationship with noise colors_scatter = np.abs(y) # color by y value plt.figure(figsize=(7, 6)) scatter = plt.scatter(x, y, c=colors_scatter, cmap="Blues", alpha=0.7) plt.colorbar(scatter, label="|y|") # add color legend plt.title("Scatter Plot with Color Mapping") plt.show() # ── SUBPLOTS — multiple plots in one figure ──────────────────── fig, axes = plt.subplots(2, 2, figsize=(12, 8)) # 2x2 grid of plots fig.suptitle("Dashboard", fontsize=16) 
axes[0,0].plot([1,2,3,4], [1,4,2,3]) axes[0,0].set_title("Line") axes[0,1].bar(["A","B","C"], [3,7,2], color="#1a4a8c") axes[0,1].set_title("Bar") axes[1,0].scatter(np.random.randn(50), np.random.randn(50)) axes[1,0].set_title("Scatter") axes[1,1].hist(np.random.randn(500), bins=20, color="#2e6dbf") axes[1,1].set_title("Histogram") plt.tight_layout() plt.show() # ── SEABORN — prettier, higher-level plots ───────────────────── sns.set_theme(style="whitegrid") # set a nice default theme # Sample dataset df = pd.DataFrame({ "dept": ["Eng","Eng","Mktg","Mktg","HR"]*10, "salary": np.random.normal([95000,95000,72000,72000,68000]*10, 5000) }) # Box plot — distribution by category plt.figure(figsize=(8, 5)) sns.boxplot(data=df, x="dept", y="salary", palette="Blues") plt.title("Salary Distribution by Department") plt.show() # Heatmap — great for correlation matrices corr_data = pd.DataFrame(np.random.randn(100, 4), columns=["Revenue","Costs","Profit","Growth"]) plt.figure(figsize=(7, 6)) sns.heatmap(corr_data.corr(), annot=True, fmt=".2f", cmap="Blues", vmin=-1, vmax=1, center=0) plt.title("Correlation Matrix") plt.tight_layout() plt.show()
Part XV — Type Hints & Static Typing
Type hints help turn a dynamically typed Python script into a codebase that tools can check. They don't change runtime behavior; they add a communication layer that tools, IDEs, and teammates all rely on.
Why Type Hints Exist Intermediate
Imagine a restaurant kitchen where every ingredient container is labeled. You could work with unlabeled containers — you'd figure it out eventually — but labeled ones prevent mistakes, speed up new cooks, and make the whole kitchen more reliable. Type hints are labels for your code's data.
Python is dynamically typed — you never have to declare what type a variable holds. This is flexible, but it means type errors only surface at runtime, when real users hit them. Type hints add a voluntary "second layer" that tools can verify before the code ever runs.
# Without type hints
def double(x):
    return x * 2

double("hello")
# Returns "hellohello", not an error!
# But probably a bug. You find out
# when production breaks at 3am.

# With type hints
def double(x: int) -> int:
    return x * 2

double("hello")
# mypy: Argument 1 to "double" has
# incompatible type "str"; expected "int"
# Caught before code even runs!
When to use type hints: Working on a team (improves communication), building anything over ~200 lines (catches bugs early), using an IDE (enables autocomplete and inline error detection), or building a library (your users get better tooling). For one-off scripts or quick experiments, hints are optional — the investment pays off as codebase size grows.
Type hints are not enforced at runtime. def f(x: int): ... will happily accept a string at runtime — Python ignores the hint. Type hints only matter when you run a tool like mypy or use an IDE that reads them. Think of them as documentation that tools can verify.
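A short sketch makes this concrete. The hints are stored as metadata on the function, and the call with the "wrong" type simply runs. The `double_checked` variant is a hypothetical illustration of adding an explicit runtime check when you genuinely need one.

```python
def double(x: int) -> int:
    return x * 2

# The hints are stored as plain metadata on the function...
print(double.__annotations__)   # {'x': <class 'int'>, 'return': <class 'int'>}

# ...and Python does not consult them when the function is called.
print(double("ha"))             # haha

# If you need a runtime guarantee, check explicitly.
def double_checked(x: int) -> int:
    if not isinstance(x, int):
        raise TypeError(f"expected int, got {type(x).__name__}")
    return x * 2

print(double_checked(21))       # 42
```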
Basic Type Annotations 🐍 3.9+ built-in generics Intermediate
# ── Variable annotations ─────────────────────────────────────── name: str = "Alice" age: int = 25 score: float = 98.6 is_active: bool = True # You can annotate without assigning (declares intent) user_id: int # not yet assigned, but tools know it must be int # ── Function annotations ────────────────────────────────────── def greet(name: str, times: int = 1) -> str: """Return greeting repeated `times` times.""" return (name + "!") * times def log_message(msg: str) -> None: # -> None = returns nothing print(f"[LOG] {msg}") # ── Collection annotations ───────────────────────────────────── # Python 3.9+: use built-in types directly scores: list[int] = [95, 87, 91] lookup: dict[str, int] = {"alice": 1, "bob": 2} coords: tuple[float, float] = (51.5, -0.1) unique: set[str] = {"python", "rust"} # Python 3.7-3.8: must import from typing from typing import List, Dict, Tuple, Set scores_old: List[int] = [95, 87, 91] # same, older syntax # Nested collections matrix: list[list[float]] = [[1.0, 0.0], [0.0, 1.0]] users: list[dict[str, str]] = [{"name": "Alice"}, {"name": "Bob"}] # ── Typed class ─────────────────────────────────────────────── class User: name: str age: int email: str def __init__(self, name: str, age: int, email: str) -> None: self.name = name self.age = age self.email = email def greet(self) -> str: return f"Hi, I'm {self.name}"
The typing Module — Advanced Annotations 🐍 3.10+ union syntax Intermediate
from typing import Optional, Union, Any, Callable, TypeVar # ── Optional — value can be the type OR None ─────────────────── # Use when a function might return nothing def find_user(user_id: int) -> Optional[str]: # returns str or None users = {1: "Alice", 2: "Bob"} return users.get(user_id) # returns None if not found # Python 3.10+ shorthand: str | None (no import needed!) def find_user_modern(user_id: int) -> str | None: # 3.10+ syntax users = {1: "Alice"} return users.get(user_id) # ── Union — value can be one of several types ────────────────── def process(value: Union[int, str]) -> str: return str(value) # Python 3.10+: int | str (much cleaner!) def process_modern(value: int | str) -> str: return str(value) # ── Any — escape hatch (use sparingly!) ─────────────────────── from typing import Any def flexible(data: Any) -> Any: # turns off type checking for this function return data # ── Callable — annotating functions that take functions ──────── from typing import Callable def apply_twice(func: Callable[[int], int], x: int) -> int: # func: a callable that takes int → returns int return func(func(x)) apply_twice(lambda n: n * 2, 3) # 12 (3 → 6 → 12) # ── TypedDict — typed structure for dict data ────────────────── from typing import TypedDict class MovieRecord(TypedDict): title: str year: int rating: float # mypy now knows exactly what fields a MovieRecord has movie: MovieRecord = {"title": "Inception", "year": 2010, "rating": 8.8} # ── Literal — restrict to specific values ───────────────────── from typing import Literal def set_direction(direction: Literal["left", "right", "up", "down"]) -> None: print(f"Moving {direction}") # mypy error if you pass "diagonal" — only the 4 literals allowed
Any silently disables type checking for everything it touches. Using Any on a parameter means mypy won't catch any errors involving that parameter — it's a complete opt-out, not a partial one. Use Any only as a last resort when a proper type is genuinely impossible to express. If you find yourself writing Any frequently, you're losing the benefits of type hints entirely.
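When a function really must accept anything, annotating the parameter as `object` is often a safer choice than `Any`. The sketch below uses two hypothetical functions to contrast them: a type checker accepts any attribute access on an `Any` value, while an `object` value has to be narrowed with `isinstance` before string methods are allowed.

```python
from typing import Any

def shout_any(value: Any) -> str:
    # mypy accepts any attribute access on Any, so this passes type checking
    # even though it fails at runtime for values without .upper().
    return value.upper()

def shout_object(value: object) -> str:
    # With `object`, mypy requires narrowing before using str methods.
    if isinstance(value, str):
        return value.upper()
    return str(value).upper()

print(shout_object("hi"))   # HI
print(shout_object(42))     # 42
try:
    shout_any(42)           # type checks fine, fails when run
except AttributeError as exc:
    print(f"runtime failure: {exc}")
```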
Running mypy — Static Type Checking Mastery
mypy reads your type hints and checks them without running your code. It catches entire categories of bugs before they ever reach production.
# Install mypy pip install mypy # Check a single file mypy script.py # Check an entire package mypy src/ # Strict mode — catches more issues mypy --strict script.py # Common useful flags mypy --ignore-missing-imports script.py # suppress errors for untyped 3rd-party libs mypy --show-error-codes script.py # show error codes (e.g., [arg-type])
# Example mypy errors and what they mean:

# script.py:12: error: Argument 1 to "greet" has incompatible type "int";
#     expected "str"  [arg-type]
# → You passed an int where a str was expected on line 12

# script.py:18: error: Item "None" of "Optional[str]" has no attribute "upper"
#     [union-attr]
# → You called .upper() on something that might be None. Add a None check!

# script.py:25: error: Incompatible return value type (got "int", expected "str")
#     [return-value]
# → Your function signature says -> str but you returned an int

# Suppressing a specific line when you KNOW it's safe.
# Put the error code mypy reports for that line inside the brackets:
result = some_api_call()  # type: ignore[no-untyped-call]
# Use sparingly. Prefer fixing the underlying type issue

# mypy config: a [mypy] section in mypy.ini, or [tool.mypy] in pyproject.toml
# [mypy]
# strict = true
# ignore_missing_imports = true
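The `[union-attr]` error above is the one most people meet first, so here is a minimal sketch of the fix. The `USERS` table and both functions are hypothetical. The point is that an explicit `is None` check lets the type checker narrow `Optional[str]` down to `str`.

```python
from typing import Optional

USERS = {1: "Alice", 2: "Bob"}   # hypothetical lookup table

def find_user(user_id: int) -> Optional[str]:
    return USERS.get(user_id)

def shout_name(user_id: int) -> str:
    name = find_user(user_id)
    # Calling name.upper() here would trigger [union-attr]: name may be None.
    if name is None:
        return "UNKNOWN"
    # After the check, mypy narrows name from Optional[str] to str.
    return name.upper()

print(shout_name(1))    # ALICE
print(shout_name(99))   # UNKNOWN
```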
Gradual vs. strict typing: For new codebases, start with mypy --strict from day one, since it's easier to maintain strict typing than to retrofit it. For existing codebases, use gradual typing: add hints file by file, module by module. Use # type: ignore on legacy code temporarily, then remove it as you annotate more. Even partial coverage helps, because every annotated function boundary is one more place where a mismatch can be caught.
🔑 Key Takeaways — Type Hints
- Type hints are documentation that tools can verify — they don't change runtime behavior or performance
- Use str | None (Python 3.10+) or Optional[str] for values that might be None, and always check for None before using them
- Use list[int] directly in Python 3.9+. Use from typing import List; List[int] for 3.7/3.8
- Avoid Any: it silently disables all type checking for everything it touches
- TypedDict gives you typed dicts without needing a full class; great for API response shapes
- Run mypy --strict on new projects; use gradual typing on existing codebases
Part XVI — Mastery
Knowing syntax is not mastery. Mastery is writing code others can read, maintain, and trust. It's knowing when NOT to be clever. These habits separate junior developers from senior ones.
Writing Clean Pythonic Code Mastery
# ── PYTHONIC PATTERNS ──────────────────────────────────────────

# ❌ C-style loop                      ✅ Direct iteration
for i in range(len(items)):
    print(items[i])                    # bad
for item in items:
    print(item)                        # good

# ❌ Verbose swap                      ✅ Tuple swap
temp = a; a = b; b = temp              # bad
a, b = b, a                            # good: Pythonic one-liner

# ❌ Checking empty list               ✅ Truth value
if len(items) > 0: ...                 # bad
if items: ...                          # good

# ❌ Verbose membership check          ✅ .get() with default
val = d[k] if k in d else "default"    # verbose
val = d.get(k, "default")              # clean

# ❌ String concat in loop (O(n²))     ✅ join
result = ""
for w in words:
    result += w + ", "                 # slow, and leaves a trailing ", "
result = ", ".join(words)              # fast and clean

# ❌ Counting with a loop              ✅ Counter
from collections import Counter
counts = Counter(words)                # one call, no hand-written loop

# ❌ Building a dict with a loop       ✅ Dict comprehension
lengths = {w: len(w) for w in words}

# ✅ Use enumerate for index + value
for i, item in enumerate(items, 1):
    print(f"{i}. {item}")

# ✅ Unpack meaningfully (don't use index[0], index[1])
first, *rest = items
name, age, city = user_tuple

# ✅ Use walrus := to assign and test in one step
if (n := len(data)) > 10:
    print(f"Large dataset: {n} items")

# ✅ "Ask forgiveness" (EAFP) rather than "look before you leap" (LBYL)
# LBYL style:
if "key" in my_dict:
    val = my_dict["key"]
# EAFP style (more Pythonic):
try:
    val = my_dict["key"]
except KeyError:
    val = None
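The patterns above use placeholder names (`items`, `words`, `d`) that are never defined, so here is a small runnable sketch of two of them with made-up data: `Counter` against a hand-written counting loop, and `join` against concatenation in a loop.

```python
from collections import Counter

words = ["spam", "eggs", "spam", "ham", "spam", "eggs"]

# Manual counting: works, but is more code to read and maintain.
manual = {}
for w in words:
    manual[w] = manual.get(w, 0) + 1

counts = Counter(words)
print(dict(counts) == manual)    # True
print(counts.most_common(2))     # [('spam', 3), ('eggs', 2)]
print(counts["missing"])         # 0, no KeyError for unseen keys

# The loop-and-concatenate version also leaves a trailing separator.
joined_loop = ""
for w in words[:3]:
    joined_loop += w + ", "
print(repr(joined_loop))             # 'spam, eggs, spam, '
print(repr(", ".join(words[:3])))    # 'spam, eggs, spam'
```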
PEP 8 — Python Style Guide Mastery
| Rule | Wrong | Correct |
|---|---|---|
| Indentation | 2 spaces or tabs | 4 spaces always |
| Line length | 200+ chars | ≤79 chars (88 with Black) |
| Variable names | myVar, MyVar | my_var (snake_case) |
| Constants | max_retries | MAX_RETRIES |
| Class names | my_class | MyClass (PascalCase) |
| Spaces around = | x=5+y | x = 5 + y |
| Comparison to None | x == None | x is None |
| Truthy check | if x == True | if x: |
| Import order | anywhere in file | Top of file: stdlib → third-party → local |
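As a rough illustration of the table, the sketch below applies several of the rules in one place. The names `MAX_RETRIES`, `RetryPolicy`, and `should_retry` are hypothetical and exist only to show the naming and spacing conventions.

```python
# Imports go at the top of the file (standard library first)
import math

MAX_RETRIES = 3  # constant: UPPER_SNAKE_CASE


class RetryPolicy:  # class name: PascalCase
    # PEP 8 nuance: keyword defaults take no spaces around "="
    def __init__(self, base_delay=0.5):
        self.base_delay = base_delay  # 4-space indentation, snake_case

    def delay_for(self, attempt):
        # Spaces around binary operators
        return self.base_delay * math.pow(2, attempt)


def should_retry(attempt, last_error=None):
    if last_error is None:  # compare to None with "is", not "=="
        return False
    return attempt < MAX_RETRIES


policy = RetryPolicy()
print(policy.delay_for(2))                      # 2.0
print(should_retry(1, last_error="timeout"))    # True
```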
# Auto-format and lint: install these essential tools
pip install black ruff mypy
black my_script.py         # auto-formats code to PEP 8
ruff check my_script.py    # fast linting: catches bugs and style issues
mypy my_script.py          # type checking
Type Hints Advanced
from typing import Optional

# Function signatures with type hints
def greet(name: str) -> str:
    return f"Hello, {name}!"

def add(a: int, b: int) -> int:
    return a + b

# Optional: can be None or the type
def find_user(uid: int) -> Optional[str]:
    return "Alice" if uid == 1 else None

# Python 3.10+: cleaner union syntax
def process(value: int | str) -> str:
    return str(value)

# Collection hints (Python 3.9+: use built-in types)
def get_names() -> list[str]:           # list of strings
    return ["Alice", "Bob"]

def get_scores() -> dict[str, int]:     # dict mapping str to int
    return {"Alice": 95}

def get_pair() -> tuple[str, int]:      # tuple of exactly str, int
    return ("Alice", 25)

# Variable annotations
name: str = "Alice"
ages: list[int] = [25, 30, 35]

# Type hints DON'T enforce at runtime; use mypy to validate
# mypy my_script.py  → reports type errors without running the code
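To make the "hints are not enforced at runtime" point concrete, here is a minimal sketch. It reuses the `add` function from above and deliberately calls it with the wrong types. A static checker such as mypy would be expected to flag the call, but the interpreter runs it without complaint.

```python
def add(a: int, b: int) -> int:
    return a + b


# The hints say int, but Python does not check them at runtime,
# so this call simply concatenates two strings.
result = add("py", "thon")
print(result)                  # python

# The annotations are only metadata attached to the function
print(add.__annotations__)

# Correctly typed calls behave as expected
print(add(2, 3))               # 5
```

This is why a type checker belongs in the workflow: the hints only help if a tool actually reads them.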
Testing with pytest Advanced
pip install pytest
# pytest conventions:
# - File names start with test_ or end with _test.py
# - Function names start with test_
# - Use plain 'assert' statements; pytest makes them informative
import pytest

# The code being tested (normally in a separate file)
def divide(a, b):
    if b == 0:
        raise ZeroDivisionError("Cannot divide by zero")
    return a / b

def clamp(value, min_val, max_val):
    """Return value clamped to [min_val, max_val]."""
    return max(min_val, min(value, max_val))

# ── BASIC TESTS ────────────────────────────────────────────────
def test_divide_normal():
    assert divide(10, 2) == 5.0

def test_divide_float():
    assert divide(7, 2) == pytest.approx(3.5)   # use approx for floats!

def test_divide_by_zero():
    # Test that an exception IS raised
    with pytest.raises(ZeroDivisionError, match="Cannot divide by zero"):
        divide(5, 0)

def test_clamp_within_range():
    assert clamp(5, 1, 10) == 5       # value already in range

def test_clamp_below_min():
    assert clamp(-5, 1, 10) == 1      # below minimum → return minimum

def test_clamp_above_max():
    assert clamp(100, 1, 10) == 10    # above maximum → return maximum

# ── PARAMETRIZE: run same test with multiple inputs ────────────
@pytest.mark.parametrize("a,b,expected", [
    (10, 2, 5.0),    # case 1
    (9, 3, 3.0),     # case 2
    (1, 4, 0.25),    # case 3
    (0, 5, 0.0),     # case 4: zero numerator
])
def test_divide_parametrized(a, b, expected):
    assert divide(a, b) == pytest.approx(expected)
# This runs 4 separate tests from one function!

# ── FIXTURES: shared test setup ───────────────────────────────
@pytest.fixture
def sample_data():
    """Provide a fresh dict for each test that needs it."""
    return {"Alice": 95, "Bob": 87, "Charlie": 92}

def test_top_scorer(sample_data):   # pytest injects the fixture automatically
    assert max(sample_data, key=sample_data.get) == "Alice"
# Run tests
pytest                       # run all test files
pytest test_math_utils.py    # run one file
pytest -v                    # verbose output: see each test name
pytest -k "test_divide"      # run only tests matching pattern
pytest --tb=short            # shorter traceback on failures
pytest --cov=mymodule        # measure code coverage (pip install pytest-cov)
Common Interview Questions Mastery
# ── Q: What is the difference between a list and a tuple? ──────
# A: Both are ordered sequences. Lists are mutable (can change),
#    tuples are immutable (cannot change). A tuple is hashable
#    (usable as a dict key) when all of its elements are hashable;
#    lists never are.

# ── Q: What are *args and **kwargs? ───────────────────────────
# A: *args collects extra positional arguments into a tuple.
#    **kwargs collects extra keyword arguments into a dict.
#    They let functions accept any number of arguments.

# ── Q: Explain list vs generator ──────────────────────────────
# A: A list stores all values in memory at once.
#    A generator produces values on-demand, one at a time.
#    Generators use far less memory for large datasets.

# ── Q: What's the difference between == and is? ────────────────
a = [1, 2, 3]
b = [1, 2, 3]
print(a == b)   # True: same VALUES
print(a is b)   # False: different OBJECTS in memory
# Use == for value equality, 'is' only for None/True/False checks

# ── Q: What is a decorator? ────────────────────────────────────
# A: A function that takes a function and returns a modified function.
#    Used to add behavior (logging, timing, auth) without changing
#    the original function.

# ── Q: What is the GIL? ────────────────────────────────────────
# A: The Global Interpreter Lock. In standard CPython builds, only one
#    thread executes Python bytecode at a time (threads can still
#    overlap while waiting on I/O). For true CPU parallelism, use
#    multiprocessing instead of threading.

# ── Q: Mutable vs immutable? ──────────────────────────────────
# Immutable: int, float, str, tuple, bool, None
#            (cannot be changed after creation)
# Mutable:   list, dict, set, and most objects
#            (can be modified in place)

# ── Q: What is a shallow vs deep copy? ────────────────────────
import copy
original = [[1, 2], [3, 4]]
shallow = original.copy()    # or original[:]
# shallow copy: new outer list, but inner lists are SHARED
shallow[0].append(99)        # modifies the shared inner list!
print(original[0])           # [1, 2, 99]: original changed!

deep = copy.deepcopy(original)   # fully independent copy at all levels
deep[0].append(99)
print(original[0])           # still [1, 2, 99]: the deep copy's append did not touch original

# ── Q: What is a context manager? ─────────────────────────────
# A: An object implementing __enter__ and __exit__ used with 'with'.
#    Ensures resources (files, connections) are properly released
#    even when exceptions occur.

# ── Q: How does Python handle memory? ─────────────────────────
# A: Reference counting + cyclic garbage collector. In CPython, when an
#    object's reference count drops to 0, it's freed immediately.
#    The cyclic collector (gc module) finds reference cycles that
#    reference counting alone cannot free.
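Several of the answers above (decorators, `*args`/`**kwargs`, generators, context managers) are described only in words, so here is a small sketch that shows them working together. The names `timed`, `announce`, `squares`, and `total_squares` are hypothetical, and using `contextlib.contextmanager` is just one way to build a context manager; a class with `__enter__` and `__exit__` would work too.

```python
import functools
import time
from contextlib import contextmanager


def timed(func):
    """Decorator: report how long each call to func takes."""
    @functools.wraps(func)            # keep func's name and docstring
    def wrapper(*args, **kwargs):     # accept any positional/keyword args
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            print(f"{func.__name__} took {elapsed:.6f}s")
    return wrapper


@contextmanager
def announce(label):
    """Context manager: code before yield runs on entry, code after on exit."""
    print(f"start {label}")
    try:
        yield
    finally:
        print(f"end {label}")         # runs even if the body raises


def squares(n):
    """Generator: yields one value at a time instead of building a list."""
    for i in range(n):
        yield i * i


@timed
def total_squares(n, *, offset=0):
    return sum(squares(n)) + offset


with announce("demo"):
    # 0 + 1 + 4 + 9, plus offset 1, gives 15
    answer = total_squares(4, offset=1)

print(answer)                         # 15
```

In an interview, being able to write a decorator like `timed` from memory, including `functools.wraps`, usually matters more than reciting the definition.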
Anti-Patterns to Avoid Mastery
# ── ANTI-PATTERN 1: Mutable default arguments ─────────────────
def bad(item, items=[]):       # [] is created ONCE and shared!
    items.append(item)
    return items

def good(item, items=None):    # None is safe
    if items is None:
        items = []
    items.append(item)
    return items

# ── ANTI-PATTERN 2: Bare except ───────────────────────────────
try:
    risky()
except:                        # catches EVERYTHING including KeyboardInterrupt! Bad!
    print("error")

try:
    risky()
except Exception as e:         # does NOT catch KeyboardInterrupt (Ctrl+C) or SystemExit
    print(f"Error: {e}")

# ── ANTI-PATTERN 3: Global state ──────────────────────────────
total = 0                      # bad: hidden dependency
def add_to_total(n):
    global total
    total += n

def add(current, n):           # good: explicit, testable
    return current + n

# ── ANTI-PATTERN 4: Deep nesting ──────────────────────────────
# (wrapped in functions so that 'return' is valid; the names are illustrative)
# bad: the "arrow anti-pattern"
def handle_nested(user, data):
    if user:
        if user.is_active():
            if user.has_permission("admin"):
                if data:
                    process(data)

# good: early return / guard clauses
def handle_flat(user, data):
    if not user:
        return
    if not user.is_active():
        return
    if not user.has_permission("admin"):
        return
    if not data:
        return
    process(data)

# ── ANTI-PATTERN 5: Catching without logging ──────────────────
try:
    something()
except Exception:
    pass                       # silently swallows errors: debugging nightmare!

import logging
try:
    something()
except Exception:
    logging.exception("something() failed")   # logs full traceback!
    raise                                     # then re-raise

# ── ANTI-PATTERN 6: Type checking with type() ────────────────
# bad: breaks with subclasses
if type(x) == int: ...
# good: allows subclasses (True for bool, which IS-A int)
if isinstance(x, int): ...
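The mutable-default trap is easier to believe once you watch it happen. The sketch below repeats the `bad` and `good` functions from Anti-Pattern 1 and calls each twice. It also checks the exception hierarchy that Anti-Pattern 2 depends on: `KeyboardInterrupt` and `SystemExit` derive from `BaseException`, not from `Exception`.

```python
def bad(item, items=[]):        # the default list is created once, at def time
    items.append(item)
    return items


def good(item, items=None):     # None sentinel: a fresh list on every call
    if items is None:
        items = []
    items.append(item)
    return items


first_bad = bad("a")
second_bad = bad("b")
print(second_bad)               # ['a', 'b']  (leftover 'a' from the first call)
print(first_bad is second_bad)  # True: both names point at the same shared list

print(good("a"))                # ['a']
print(good("b"))                # ['b']

# Why "except Exception" lets Ctrl+C through while a bare except does not
print(issubclass(KeyboardInterrupt, Exception))      # False
print(issubclass(KeyboardInterrupt, BaseException))  # True
print(issubclass(SystemExit, Exception))             # False
```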
Performance Tips Mastery
# ── PROFILING: find the bottleneck first, then optimize ────────
import cProfile
cProfile.run("my_slow_function()")   # see where time is spent

# Line-by-line profiling
# pip install line_profiler
# @profile   ← decorator (only when running with kernprof)
# kernprof -l -v my_script.py

# Time a snippet
import timeit
print(timeit.timeit("[x**2 for x in range(1000)]", number=1000))

# ── TIP 1: Generators over lists for large data ────────────────
# Bad: creates full list in memory
total = sum([x**2 for x in range(1_000_000)])
# Good: generates values on the fly
total = sum(x**2 for x in range(1_000_000))

# ── TIP 2: Set for membership testing ─────────────────────────
valid_ids = {1, 2, 3, 4, 5}            # O(1) average lookup
valid_ids_list = [1, 2, 3, 4, 5]       # O(n) lookup: slow for large lists
if user_id in valid_ids:               # fast regardless of set size
    process(user_id)

# ── TIP 3: Local variables are faster ─────────────────────────
import math
# In a tight loop, local references are faster than global lookups
local_sqrt = math.sqrt                 # cache the function locally
results = [local_sqrt(x) for x in range(10000)]

# ── TIP 4: String joining > concatenation ─────────────────────
parts = ["a", "b", "c", "d"]
result = "".join(parts)                # O(n): one allocation
# vs: result = "a" + "b" + "c" + "d"   (O(n²) in the worst case: a new string per +)

# ── TIP 5: Use numpy/pandas for numerical work ────────────────
# Rough figures; measure on your own machine:
# Pure Python loop:  ~100ms for 1M items
# numpy vectorized:  ~1ms for 1M items (about 100x faster)

# ── TIP 6: Avoid global variable access in hot loops ──────────
CONSTANT = 42
# Slower: Python looks up CONSTANT in global scope each iteration
def slow():
    return [x * CONSTANT for x in range(1000000)]
# Faster: pass as argument → local variable access
def fast(c=CONSTANT):
    return [x * c for x in range(1000000)]

# ── TIP 7: Prefer dict.get() over try/except for common misses
# If key is usually present: try/except is fast
# If key often missing: .get() is faster (no exception overhead)

# ── TIP 8: Use __slots__ for memory-efficient classes ──────────
class Point:
    __slots__ = ["x", "y"]   # no per-instance __dict__: often roughly 40-50% less memory per instance
    def __init__(self, x, y):
        self.x = x
        self.y = y

# Useful when creating millions of instances
points = [Point(i, i*2) for i in range(1_000_000)]
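To see what `__slots__` actually changes, the sketch below compares a regular class with a slotted one. `PlainPoint` and `SlotPoint` are hypothetical names. `sys.getsizeof` reports only shallow sizes, so treat the numbers as a rough indication rather than a precise measurement.

```python
import sys


class PlainPoint:
    def __init__(self, x, y):
        self.x = x
        self.y = y


class SlotPoint:
    __slots__ = ("x", "y")    # fixed attribute set, no per-instance __dict__

    def __init__(self, x, y):
        self.x = x
        self.y = y


plain = PlainPoint(1, 2)
slotted = SlotPoint(1, 2)

print(hasattr(plain, "__dict__"))     # True
print(hasattr(slotted, "__dict__"))   # False

# A plain instance pays for the object plus its attribute dict
plain_size = sys.getsizeof(plain) + sys.getsizeof(plain.__dict__)
slot_size = sys.getsizeof(slotted)
print(plain_size, slot_size)          # exact numbers vary by Python version

# Trade-off: slotted instances reject attributes not listed in __slots__
try:
    slotted.z = 3
except AttributeError as exc:
    print(f"AttributeError: {exc}")
```

The saving only matters when you create very large numbers of instances. For a handful of objects, the lost flexibility is usually not worth it.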
Next Steps Roadmap Mastery
Learn FastAPI (modern, async REST APIs) or Django (full-featured web framework with ORM, admin panel, auth). FastAPI is the modern choice for APIs; Django for full apps.
Go deep on pandas + numpy, then learn scikit-learn (classical ML), matplotlib/seaborn (visualization), and eventually PyTorch or TensorFlow for deep learning.
Python is the scripting language of the cloud. Learn boto3 (AWS), paramiko (SSH), subprocess, Selenium/Playwright (browser automation), and Ansible.
When Python is too slow: Cython (compiles Python-like code to C), Numba (JIT compiler for numeric loops), or C/C++ extensions. For concurrency, use asyncio (many I/O-bound tasks on one thread); for true CPU parallelism, use multiprocessing.
# WEEK 1-2: Fundamentals (Parts I-VI of this Bible)
# - Variables, types, operators, control flow, loops
# - Project: Number guessing game, FizzBuzz, basic calculator

# WEEK 3-4: Core Python (Parts VII-XI)
# - Functions, data structures, files, errors, modules
# - Project: File organizer, contact book, word frequency counter

# MONTH 2: Intermediate (Parts XII-XIII)
# - OOP, comprehensions, generators, decorators
# - Project: Bank account simulator, web scraper with requests

# MONTH 3: Specialized Track
# - Pick ONE track based on your goal
# - Data: pandas, numpy, matplotlib → analyze a real dataset
# - Web: FastAPI → build a REST API
# - Automation: selenium, schedule → automate something you do daily

# MONTH 4+: Real Projects
# - Contribute to open source (find "good first issue" on GitHub)
# - Build something you actually want to use
# - Practice on LeetCode/HackerRank for interview prep

# DAILY HABITS of great Python developers:
# 1. Read other people's code (GitHub, Real Python, PyPI source)
# 2. Write tests before you think you need them
# 3. Use type hints consistently
# 4. Keep functions small and focused
# 5. Refactor mercilessly: first make it work, then make it right
You've covered everything from what a variable is to writing production-grade Python with tests, type hints, and professional patterns. The only thing left is practice. Write code every day — even just 30 minutes. Build things that interest you. Break things. Fix them. That's how you get from knowing Python to thinking in Python.
🔑 Key Takeaways — Mastery
- Mastery is writing code that others can read, maintain, and trust — not code that shows off what you know
- PEP 8 is the community standard. Follow it. Use a formatter like black to automate it
- Write tests first (or immediately after). Code without tests is code you can't safely change
- Type hints + mypy + pytest is the professional feedback loop: define types → write functions → write tests → run both tools
- Anti-patterns are patterns that seem smart but make code harder to maintain. The most dangerous is "clever" code
- Performance: measure first with cProfile, optimize second. Most bottlenecks are not where you think they are
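The "define types → write functions → write tests → run both tools" loop from the takeaways can be sketched in a single file. The function `average` and the file name `test_stats_utils.py` are hypothetical. In a real project the function and its tests would normally live in separate files.

```python
# test_stats_utils.py (hypothetical file name, for illustration only)

def average(values: list[float]) -> float:
    """Return the arithmetic mean of a non-empty list."""
    if not values:
        raise ValueError("values must not be empty")
    return sum(values) / len(values)


# Tests use plain assert, so pytest can collect them by their test_ prefix
def test_average_basic() -> None:
    assert average([2.0, 4.0, 6.0]) == 4.0


def test_average_rejects_empty() -> None:
    try:
        average([])
    except ValueError:
        pass                          # expected path
    else:
        raise AssertionError("expected ValueError for empty input")


# Run the checks directly as well, so the sketch works without pytest
test_average_basic()
test_average_rejects_empty()
print("all checks passed")
```

With both tools installed, `mypy test_stats_utils.py` checks the annotations and `pytest test_stats_utils.py` runs the two tests. Running both after every change is the feedback loop in practice.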
The Python Bible
A complete Python training system — from absolute zero to professional mastery.
Created & curated by Cody Jent