Decision Trees and Random Forests
Build ensemble models and visualize decision boundaries
What are Decision Trees?
Decision Trees are intuitive ML models that make decisions by asking a series of yes/no questions. Think of one as a flowchart: start at the top and follow the branches until you reach a decision. Random Forests combine many trees to produce more powerful, accurate predictions.
💡 The Core Idea
Decision Tree Construction: The CART Algorithm
🌳 How Trees Grow: Recursive Partitioning
The Greedy Algorithm
🔍 Finding the Best Split
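The greedy split search can be sketched in a few lines: for every candidate threshold on a feature, compute the weighted impurity of the two children and keep the split that minimizes it. This is an illustrative hand-rolled version (toy data, made-up feature values), not scikit-learn's internals:

```python
import numpy as np

def gini(labels):
    """Gini impurity of a set of class labels."""
    if len(labels) == 0:
        return 0.0
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def best_split(x, y):
    """Greedy search: try every threshold, keep the lowest weighted impurity."""
    best_t, best_score = None, float("inf")
    for t in np.unique(x)[:-1]:                  # candidate thresholds
        left, right = y[x <= t], y[x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score

x = np.array([1, 2, 3, 10, 11, 12])              # one toy feature
y = np.array([0, 0, 0, 1, 1, 1])
t, score = best_split(x, y)                      # x <= 3 separates the classes perfectly
```

With perfectly separable toy data the search finds the clean split (weighted impurity 0); real trees repeat this search over every feature at every node.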
🛑 Stopping Criteria
• Max depth reached: prevents overly complex trees (3-10 typical for good generalization)
• Min samples per node: not enough data to split reliably; prevents fitting noise
• Node is pure: perfect classification achieved; no benefit from further splitting
• No useful splits left: features exhausted or uninformative; can't improve predictions
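In scikit-learn, for example, each of these stopping criteria maps to a constructor argument (the dataset here is synthetic, just to make the snippet runnable):

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, random_state=0)

# Each stopping criterion has a matching hyperparameter:
tree = DecisionTreeClassifier(
    max_depth=5,                  # stop at a maximum depth
    min_samples_split=10,         # don't split nodes with too few samples
    min_impurity_decrease=0.01,   # require a real impurity reduction to split
    random_state=0,
).fit(X, y)

depth = tree.get_depth()          # guaranteed <= max_depth
```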
⏱️ Computational Complexity
📚 Classic Algorithms
✅ Advantages
✗ Disadvantages
💡 Key Insight
1. Build Your Decision Tree
🌳 Interactive: Construct a Tree
🌳 How it works: Start at the root, ask a question about a feature, branch left or right based on the answer, repeat until reaching a leaf node (final decision).
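A fitted tree really is that flowchart; scikit-learn's `export_text` prints it as nested if/else threshold questions. A minimal sketch on the Iris dataset:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(iris.data, iris.target)

# Prints the tree as a cascade of "feature <= threshold" questions,
# ending in leaf nodes (the final decisions).
rules = export_text(tree, feature_names=list(iris.feature_names))
print(rules)
```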
Splitting Criteria: Gini vs Entropy
📊 Measuring Impurity: How Mixed Is This Node?
🎯 What Is Impurity?
📐 Gini Impurity: Probability of Misclassification
🌀 Entropy: Information-Theoretic Uncertainty
⚖️ Gini vs Entropy: Side-by-Side
📈 Information Gain: The Reduction in Impurity
🔑 Key Takeaway
💡 Practical Tip
2. Splitting Criteria: Gini vs Entropy
📊 Interactive: Measure Impurity
Gini Impurity
Entropy
⚖️ Maximum impurity - completely mixed classes
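Both measures can be computed directly for a binary node with class probability p; both peak when the classes are completely mixed (p = 0.5: Gini = 0.5, entropy = 1 bit) and drop to 0 for a pure node:

```python
import numpy as np

def gini(p):
    """Gini impurity of a binary node: 1 - p^2 - (1-p)^2."""
    return 1.0 - (p ** 2 + (1 - p) ** 2)

def entropy(p):
    """Entropy in bits: -p*log2(p) - (1-p)*log2(1-p), with 0*log(0) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

g_mixed, h_mixed = gini(0.5), entropy(0.5)   # maximum impurity: 0.5 and 1.0
g_pure, h_pure = gini(1.0), entropy(1.0)     # pure node: both 0
```

Entropy penalizes mixed nodes slightly more sharply near the extremes, but the two criteria pick the same split in the vast majority of cases.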
3. Feature Importance
⭐ Interactive: Which Features Matter Most?
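Feature importances come for free with a fitted forest; scikit-learn exposes them as `feature_importances_` (impurity-based, normalized to sum to 1). A quick sketch on Iris:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(iris.data, iris.target)

# Higher value = the feature produced larger impurity reductions across all trees.
importances = dict(zip(iris.feature_names, rf.feature_importances_))
total = sum(importances.values())   # normalized: sums to 1
```

On Iris the petal measurements dominate, which matches the well-known structure of that dataset.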
4. Preventing Overfitting
✂️ Interactive: Prune Your Tree
⚖️ Moderate complexity. Consider slightly more constraints.
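The pruning controls from the interactive above translate directly into hyperparameters. This sketch (synthetic data) contrasts an unconstrained tree, which memorizes the training set, with one pruned into the typical ranges:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_informative=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Unconstrained: grows until every leaf is pure, memorizing noise.
full = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)

# Pruned: depth and leaf-size limits in the ranges suggested above.
pruned = DecisionTreeClassifier(max_depth=5, min_samples_leaf=10,
                                random_state=0).fit(X_tr, y_tr)

train_full = full.score(X_tr, y_tr)                        # perfect on train
gap_full = train_full - full.score(X_te, y_te)             # train/test gap
gap_pruned = pruned.score(X_tr, y_tr) - pruned.score(X_te, y_te)
```

The pruned tree typically shows a much smaller train/test gap, the signature of reduced overfitting.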
Random Forests: Ensemble Learning Power
🌳 From One Tree to a Forest
🎯 The Problem with Single Decision Trees
🌲 Random Forest Algorithm
(Probability of a given sample being selected at least once: 1 - (1-1/N)^N → 1 - 1/e ≈ 0.632 as N → ∞)
• Classification: √m features (e.g., 16 features → sample 4)
• Regression: m/3 features (e.g., 12 features → sample 4)
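The 0.632 figure is easy to check empirically: draw a bootstrap sample of the same size as the dataset and count the fraction of distinct original rows that appear in it (a quick Monte-Carlo sketch):

```python
import numpy as np

# Fraction of distinct original indices appearing in a same-size
# bootstrap sample; should hover around 1 - 1/e ≈ 0.632.
rng = np.random.default_rng(0)
N = 1000
fractions = [
    len(np.unique(rng.integers(0, N, size=N))) / N
    for _ in range(200)
]
avg = float(np.mean(fractions))
```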
📊 Why Random Forests Work: Bias-Variance Tradeoff
📈 Out-of-Bag (OOB) Error Estimation
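Since each tree never sees roughly 37% of the rows, those out-of-bag rows give a free validation estimate with no held-out set. In scikit-learn this is a single flag (synthetic data again for runnability):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, random_state=0)

# oob_score=True: each row is scored only by the trees that did NOT
# train on it, giving an unbiased accuracy estimate from training data alone.
rf = RandomForestClassifier(n_estimators=200, oob_score=True,
                            random_state=0).fit(X, y)
oob = rf.oob_score_
```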
⚔️ Random Forest vs Other Ensembles
🔑 Key Insight
💡 Practical Tip
5. Build a Random Forest
🌲 Interactive: Grow Your Forest
6. Bagging: Bootstrap Aggregating
🎒 Interactive: Sample & Aggregate
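Bagging by hand makes the sample-and-aggregate loop concrete: resample with replacement, "train" a trivial model (here, just the mean of a toy regression target) on each sample, then average the predictions:

```python
import numpy as np

rng = np.random.default_rng(0)
data = np.array([2.0, 4.0, 6.0, 8.0, 10.0])   # toy regression targets

estimates = []
for _ in range(500):
    # Bootstrap sample: same size, drawn WITH replacement.
    sample = rng.choice(data, size=len(data), replace=True)
    estimates.append(sample.mean())            # each "model" predicts the mean

bagged = float(np.mean(estimates))             # aggregate: average the models
spread = float(np.std(estimates))              # individual models vary...
```

Individual bootstrap estimates scatter around the true mean, but their average is far more stable: that stability is exactly the variance reduction bagging buys.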
7. Make a Prediction
🔮 Interactive: Loan Approval Predictor
💡 Decision Path: Age ≥ 25 ✓ → Income ≥ $30K ✓ → Approved!
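A toy version of the loan predictor can be fit in a few lines; the six applicants and approval labels below are made up purely to reproduce an age/income decision path like the one shown:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Made-up training data: columns are [age, annual income]; label 1 = approved.
X = np.array([[22, 20_000], [23, 50_000], [30, 25_000],
              [28, 40_000], [35, 60_000], [40, 35_000]])
y = np.array([0, 0, 0, 1, 1, 1])

tree = DecisionTreeClassifier(random_state=0).fit(X, y)

# A new applicant walks the root-to-leaf path of threshold questions.
applicant = np.array([[27, 45_000]])
decision = tree.predict(applicant)[0]
```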
8. Ensemble Voting Mechanism
🗳️ Interactive: Democracy of Trees
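The "democracy of trees" can be inspected directly: every fitted tree in `estimators_` casts its own vote, and the majority wins. (Strictly, scikit-learn's forest averages class probabilities rather than counting hard votes, but the two agree on clear-cut cases like this one.)

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
rf = RandomForestClassifier(n_estimators=25, random_state=0).fit(iris.data, iris.target)

sample = iris.data[:1]                         # one well-separated setosa flower
# Poll each tree individually, then take the majority class.
votes = np.array([t.predict(sample)[0] for t in rf.estimators_]).astype(int)
majority = int(np.bincount(votes).argmax())
forest_says = int(rf.predict(sample)[0])       # the ensemble's answer
```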
9. Random Feature Selection
🎲 Interactive: Feature Subsampling
Each tree in a Random Forest considers only a random subset of features at each split. This creates diverse, decorrelated trees and prevents overfitting. Typical: √m features for classification, m/3 for regression (where m is the total number of features).
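Both rules of thumb map onto the `max_features` parameter. A sketch with 16 and 12 features so the arithmetic works out to 4 in each case (note that recent scikit-learn regressors default to all features, so m/3 must be set explicitly, here as the fraction 1/3):

```python
from sklearn.datasets import make_classification, make_regression
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

Xc, yc = make_classification(n_samples=100, n_features=16, random_state=0)
Xr, yr = make_regression(n_samples=100, n_features=12, random_state=0)

# √16 = 4 features tried per split for classification;
# 12 * (1/3) = 4 features per split for regression.
clf = RandomForestClassifier(max_features="sqrt", random_state=0).fit(Xc, yc)
reg = RandomForestRegressor(max_features=1/3, random_state=0).fit(Xr, yr)

n_clf = clf.estimators_[0].max_features_   # resolved per-tree feature count
n_reg = reg.estimators_[0].max_features_
```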
10. Single Tree vs Random Forest
⚖️ Interactive: Compare Performance
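The comparison is easy to reproduce offline with cross-validation on a synthetic dataset; the label noise (`flip_y=0.1`) is chosen here specifically to let the single tree overfit:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Noisy labels: a single tree will happily memorize the flipped ones.
X, y = make_classification(n_samples=500, n_features=20, n_informative=8,
                           flip_y=0.1, random_state=0)

tree_acc = cross_val_score(DecisionTreeClassifier(random_state=0),
                           X, y, cv=5).mean()
forest_acc = cross_val_score(RandomForestClassifier(n_estimators=100,
                                                    random_state=0),
                             X, y, cv=5).mean()
```

On noisy tabular data like this, the forest's cross-validated accuracy reliably beats the single tree's.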
🎯 Key Takeaways
Decision Trees are Intuitive
Easy to understand and visualize. Great for interpretability. But single trees overfit easily - they memorize training data noise.
Random Forests Fix Overfitting
Combine many trees trained on random subsets of data and features. Ensemble voting reduces variance. Industry standard for tabular data!
Feature Importance is Gold
Random Forests tell you which features matter most. Use this for feature selection, data understanding, and stakeholder communication.
Pruning Prevents Overfitting
Limit max_depth (3-10 typical) and set min_samples_leaf (5-20) as pre-pruning, or use cost-complexity post-pruning (ccp_alpha in scikit-learn). Don't let trees grow too deep or they'll memorize noise.
Bagging Builds Better Models
Bootstrap Aggregating: train each tree on a random sample drawn with replacement. Averaging n uncorrelated trees cuts variance by a factor of up to n; correlation between trees limits the gain in practice. The foundation of Random Forests.
When to Use Random Forests
Tabular data, classification or regression, when accuracy matters more than speed. Alternatives: XGBoost/LightGBM (faster, often better), Neural Nets (unstructured data).