Index
This file is auto-generated. Do not edit by hand.
Activations
- Activation Functions
- ELU (Exponential Linear Unit)
- GELU (Gaussian Error Linear Unit)
- Leaky ReLU
- ReLU (Rectified Linear Unit)
- SELU (Scaled ELU)
- Sigmoid (Logistic)
- Softmax
- Softplus
- Swish / SiLU
- Tanh (Hyperbolic Tangent)
Calculus
- Chain Rule
- Derivative
- Gradient
- Hessian
- Integral
- Jacobian
- Multivariable Chain Rule
- Partial Derivative
- Taylor Expansion
Deep Learning
- Attention
- Backpropagation
- Batch Normalization
- Cosine Embedding Loss (Metric Learning)
- Cross-Attention
- Dropout
- Embedding
- Feedforward Network (Transformer FFN)
- Layer Normalization
- MLP (Multi-Layer Perceptron)
- Multi-Head Attention
- Positional Encoding
- Residual Connection (Skip Connection)
- Self-Attention
- Transformer
Graphs
- Adjacency Matrix
- Augmenting Path
- Bellman-Ford Algorithm
- Breadth-First Search (BFS)
- Clustering Coefficient
- Connected Components
- Degree Matrix
- Depth-First Search (DFS)
- Dijkstra's Algorithm
- Dinic's Algorithm
- Edge Cut vs Vertex Cut
- Edmonds-Karp Algorithm
- Flow Network
- Ford-Fulkerson Method
- Global Minimum Cut
- Graph Algorithms (Overview)
- Graph Fourier Transform
- Graph Laplacian
- Graph Motifs
- Karger's Algorithm (Min Cut)
- Kruskal's Algorithm
- Label Propagation (LPA)
- Max-Flow Min-Cut Theorem
- Maximum Flow
- Message Passing (on Graphs)
- Minimum Cut
- Minimum Spanning Tree (MST)
- Modularity (Community Quality)
- Network Significance Profile
- PageRank
- Prim's Algorithm
- Residual Graph
- Shortest Path (Overview)
- Spectral Clustering
- Topological Sort
Information Theory
- Channel Capacity
- Conditional Entropy
- Cross Entropy
- Data Processing Inequality
- Entropy Rate
- Fisher Information
- Jensen-Shannon Divergence
- Joint Entropy
- KL Divergence (Relative Entropy)
- Mutual Information
- Perplexity
- Shannon Entropy
Linear Algebra
- Cholesky Decomposition
- Condition Number
- Determinant
- Dot Product
- Eigendecomposition
- Eigenvalues and Eigenvectors
- Least Squares
- Linear Independence
- Matrix Rank
- Nuclear Norm (Trace Norm)
- Orthogonal Projection
- Orthonormal Basis
- PCA Explained Variance Ratio
- Positive Definite Matrix
- Principal Component Analysis (PCA)
- Pseudo-inverse (Moore-Penrose)
- QR Decomposition
- Scree Plot
- Singular Value Decomposition (SVD)
- Span
- Trace
- Vector Norms
Machine Learning
- Bias-Variance Tradeoff
- Categorical Encoding
- Class Imbalance
- Concept Drift
- Cross-Validation
- Data Drift
- Data Leakage
- DBSCAN
- Decision Tree
- Elastic Net
- Feature Scaling (Standardization vs Normalization)
- Feature Store
- Gradient Boosting
- Hierarchical Clustering
- Imputation
- K-Means Clustering
- K-Nearest Neighbors (k-NN)
- Lasso Regression (L1)
- Linear Regression
- Logistic Regression
- Machine Learning Pipeline
- Missing Data
- Model Interpretability
- Partial Dependence Plot (PDP)
- Permutation Feature Importance
- Probability Calibration
- Random Forest
- Ridge Regression (L2)
- SHAP Values
- Threshold Selection
- Train/Validation/Test Split
ML Metrics
- Adjusted Rand Index (ARI)
- Akaike Information Criterion (AIC)
- Average Precision
- Bayesian Information Criterion (BIC)
- Brier Score
- Calibration Error (ECE)
- Calinski-Harabasz Score
- Cohen's Kappa
- Confusion Matrix
- Davies-Bouldin Index
- Density-Based Clustering Validation (DBCV)
- Dice Coefficient (Sørensen-Dice)
- F1 Score
- Fowlkes-Mallows Index
- Jaccard Similarity / IoU
- Log Loss (Binary Cross-Entropy)
- Matthews Correlation Coefficient
- Mean Absolute Error (MAE)
- Mean Squared Error (MSE)
- Negative Log-Likelihood (NLL)
- Normalized Mutual Information (NMI)
- Precision
- Precision-Recall Curve
- R^2 (Coefficient of Determination)
- Recall
- ROC AUC
- ROC Curve
- Root Mean Squared Error (RMSE)
- Sensitivity (True Positive Rate)
- Silhouette Score
- Specificity (True Negative Rate)
NLP
- Causal Language Modeling
- Inverse Document Frequency (IDF)
- Language Model
- Masked Language Modeling (MLM)
- Next-Token Prediction
- Subword Tokenization (BPE / WordPiece)
- Temperature (Sampling)
- Term Frequency (TF)
- TF-IDF
- Tokenization
- Top-k Sampling
- Top-p Sampling (Nucleus)
Optimization
- Adam Optimizer
- AdamW Optimizer
- Early Stopping
- Gradient Clipping
- Gradient Descent
- KKT Conditions
- Lagrangian
- Lagrangian Duality
- Learning Rate Schedule
- Learning Rate Warmup
- Line Search
- Momentum (SGD with Momentum)
- Newton's Method
- Proximal Operator
- Regularization (L1/L2)
- RMSProp
- Stochastic Gradient Descent
- Weight Decay
Probability and Statistics
- A/B Testing
- ANOVA (Analysis of Variance)
- Bayes' Rule
- Benjamini-Hochberg Procedure
- Bernoulli Distribution
- Bernoulli Trial
- Beta Distribution
- Binomial Distribution
- Bootstrap
- CDF (Cumulative Distribution Function)
- Central Limit Theorem
- Characteristic Function
- Chebyshev's Inequality
- Chi-Square Test
- Cohen's d (Effect Size)
- Conditional Expectation
- Conditional Probability
- Confidence Interval
- Correlation
- Covariance
- Expectation
- Exponential Distribution
- False Discovery Rate (FDR)
- Gamma Distribution
- Geometric Distribution
- Independence
- Jensen's Inequality
- Joint Probability
- Law of Large Numbers
- Law of Total Probability
- Lognormal Distribution
- Marginal Probability
- Markov's Inequality
- Maximum A Posteriori (MAP)
- Maximum Likelihood Estimation (MLE)
- Mean (Expected Value)
- Median
- Moment Generating Function
- Multinomial Distribution
- Multiple Hypothesis Testing
- Normal Distribution (Gaussian)
- Null Hypothesis
- One-Sample t-Test
- P-Value
- Paired t-Test
- PDF (Probability Density Function)
- Permutation Test
- PMF (Probability Mass Function)
- Poisson Distribution
- Power Analysis (Sample Size Planning)
- Probability Distribution
- Random Variable
- Standard Deviation
- Statistical Power
- Two-Sample t-Test (Welch)
- Type I and Type II Errors
- Uniform Distribution
- Variance
- Z-Score