Matthews Correlation Coefficient Calculator
Matthews Correlation Coefficient (MCC) calculator. Binary classification metric for imbalanced data.
Why This Statistical Analysis Matters
Why: MCC condenses a 2×2 confusion matrix into a single score that stays meaningful even when one class vastly outnumbers the other.
How: Enter the four confusion matrix counts (TP, FP, TN, FN); the calculator computes MCC along with F1, informedness, markedness, phi, and χ².
Matthews Correlation Coefficient — The Gold Standard for Imbalanced Data
MCC ranges over [−1, +1] and is more informative than accuracy when classes are imbalanced. The calculator also computes F1, informedness, markedness, the phi coefficient, and χ².
For educational and informational purposes only. Verify with a qualified professional.
Key Takeaways
- MCC ranges from -1 to +1: +1 = perfect, 0 = random, -1 = inverse prediction
- MCC is more informative than accuracy for imbalanced datasets — it considers all four confusion matrix cells
- MCC = √(Informedness × Markedness) — geometric mean of two complementary metrics
- Phi coefficient equals MCC for 2×2 contingency tables
- χ² = n × MCC² relates MCC to chi-square test statistic
How It Works
1. The MCC Formula
MCC = (TP×TN - FP×FN) / √((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The denominator ensures the result is in [-1, 1].
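The formula translates directly into code. A minimal sketch in Python (the `mcc` helper name is ours; returning 0 for a zero denominator follows the common convention):

```python
from math import sqrt

def mcc(tp: int, fp: int, tn: int, fn: int) -> float:
    """Matthews Correlation Coefficient from confusion-matrix counts."""
    num = tp * tn - fp * fn
    den = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return num / den if den else 0.0  # convention: 0 when any marginal sum is zero

print(round(mcc(85, 15, 885, 15), 3))  # 0.833
```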
2. Interpretation
MCC > 0.5: good; 0.3–0.5: moderate; 0–0.3: weak; < 0: worse than random. MCC = 0 for random or when any row/column sum is zero.
3. Informedness and Markedness
Informedness = TPR + TNR - 1 (Youden's J). Markedness = PPV + NPV - 1. MCC = √(Informedness × Markedness).
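The identity can be checked numerically. One subtlety: Informedness and Markedness always share MCC's sign (all three have the sign of TP×TN − FP×FN), so the square root must be given that sign back. A sketch with hypothetical helper names:

```python
from math import sqrt, copysign

def informedness(tp, fp, tn, fn):
    return tp / (tp + fn) + tn / (tn + fp) - 1  # TPR + TNR - 1 (Youden's J)

def markedness(tp, fp, tn, fn):
    return tp / (tp + fp) + tn / (tn + fn) - 1  # PPV + NPV - 1

tp, fp, tn, fn = 85, 15, 885, 15
j = informedness(tp, fp, tn, fn)
m = markedness(tp, fp, tn, fn)
# J and M always share MCC's sign, so restore it after the square root
mcc = copysign(sqrt(j * m), j)
print(round(j, 3), round(m, 3), round(mcc, 3))  # 0.833 0.833 0.833
```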
4. Phi and Chi-Square
Phi = MCC for 2×2 tables. χ² = n × MCC². Use chi-square test for significance of association.
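The identity χ² = n × MCC² can be verified by computing Pearson's χ² directly from observed and expected counts (no continuity correction), using the same example counts as elsewhere on this page:

```python
from math import sqrt

tp, fp, tn, fn = 85, 15, 885, 15
n = tp + fp + tn + fn

# Phi / MCC directly from the counts
mcc = (tp * tn - fp * fn) / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))

# Pearson chi-square from the same 2x2 table (no continuity correction)
obs = [[tp, fn], [fp, tn]]
row = [tp + fn, fp + tn]   # actual +, actual -
col = [tp + fp, fn + tn]   # predicted +, predicted -
chi2 = sum((obs[i][j] - row[i] * col[j] / n) ** 2 / (row[i] * col[j] / n)
           for i in range(2) for j in range(2))

print(round(chi2, 2), round(n * mcc ** 2, 2))  # the two values agree
```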
5. Why MCC Over Accuracy?
With 99% negatives, predicting all negative gives 99% accuracy but MCC = 0. MCC penalizes such degenerate solutions.
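The 99%-negatives scenario is easy to reproduce: an "always negative" classifier on a 990:10 split scores 99% accuracy while its MCC denominator collapses to zero.

```python
from math import sqrt

def mcc(tp, fp, tn, fn):
    den = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / den if den else 0.0

# "Always negative" classifier on 1000 samples, 990 of them negative
tp, fp, tn, fn = 0, 0, 990, 10
accuracy = (tp + tn) / (tp + fp + tn + fn)
print(accuracy, mcc(tp, fp, tn, fn))  # 0.99 0.0
```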
Expert Tips
Use MCC for imbalanced data
When classes are 90:10 or worse, accuracy is misleading. MCC remains interpretable.
Compare with F1
F1 focuses on positives; MCC balances all four cells. Use both for full picture.
Threshold tuning
Maximize MCC (or Youden) to find optimal classification threshold.
Multi-class MCC
MCC generalizes to multi-class via confusion matrix; sklearn supports it.
Confusion Matrix Layout
| | Predicted + | Predicted − |
|---|---|---|
| Actual + | TP | FN |
| Actual − | FP | TN |
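Given raw label sequences rather than counts, the four cells of this layout can be tallied directly. A sketch (the `confusion_counts` helper is ours):

```python
def confusion_counts(actual, predicted, positive=1):
    """Count TP, FN, FP, TN with the layout above (rows = actual)."""
    tp = sum(a == positive and p == positive for a, p in zip(actual, predicted))
    fn = sum(a == positive and p != positive for a, p in zip(actual, predicted))
    fp = sum(a != positive and p == positive for a, p in zip(actual, predicted))
    tn = sum(a != positive and p != positive for a, p in zip(actual, predicted))
    return tp, fn, fp, tn

actual    = [1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0]
print(confusion_counts(actual, predicted))  # (2, 1, 1, 2)
```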
Frequently Asked Questions
When should I use MCC instead of accuracy?
Use MCC when classes are imbalanced (e.g., 95% negative). Accuracy can be 95% with a useless "always negative" classifier; MCC would be 0.
What does MCC = 0 mean?
Random prediction, or when any row/column sum is zero (no positives predicted, or no negatives, etc.).
How does MCC relate to F1?
F1 focuses on precision and recall (positive class). MCC considers all four cells and is symmetric. Both are useful; MCC is better for severe imbalance.
What is Informedness (Youden's J)?
J = TPR + TNR - 1. It measures how informed the predictions are by the actual condition, i.e., how much better than chance the classifier detects it. MCC = √(J × Markedness).
What is Markedness?
Markedness = PPV + NPV - 1. It measures how consistently the actual condition is marked by the predictions. It complements Informedness.
Is Phi the same as MCC?
Yes, for 2×2 contingency tables. Phi coefficient and MCC use the same formula.
How do I interpret MCC < 0?
The classifier is inversely correlated with the truth — swapping predictions would improve performance.
Can MCC be used for multi-class?
Yes. The multi-class MCC generalizes the formula using the full confusion matrix. scikit-learn supports it.
Worked Example: Imbalanced Medical Screening
TP=85, FP=15, TN=885, FN=15. N=1000. Accuracy = (85+885)/1000 = 97%. But prevalence = 10%.
MCC = (85×885 - 15×15) / √(100×100×900×900) = (75225 - 225) / 90000 = 75000 / 90000 ≈ 0.833. This reflects strong performance.
Informedness = 85/100 + 885/900 - 1 ≈ 0.85 + 0.983 - 1 = 0.833. Markedness = 85/100 + 885/900 - 1 ≈ 0.833. MCC ≈ √(0.833×0.833) ≈ 0.833.
A classifier with 97% accuracy could be "predict negative for everyone" (0% sensitivity). MCC would be 0. MCC correctly identifies this as useless.
Step-by-Step MCC Calculation
Step 1: From the confusion matrix, identify TP, FP, TN, FN.
Step 2: Compute the numerator: N = TP×TN − FP×FN.
Step 3: Compute the denominator: D = √((TP+FP)(TP+FN)(TN+FP)(TN+FN)).
Step 4: MCC = N / D. If D = 0 (degenerate case), MCC is undefined or 0.
Step 5: Interpret: MCC ∈ [−1, 1]. +1 = perfect, 0 = random, −1 = inverse.
Example: TP=90, FP=10, TN=895, FN=5. N = 90×895 − 10×5 = 80550 − 50 = 80500. D = √(100×95×905×900) = √7737750000 ≈ 87965. MCC ≈ 80500/87965 ≈ 0.915.
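The steps above can be checked in a few lines of Python for TP=90, FP=10, TN=895, FN=5:

```python
from math import sqrt

tp, fp, tn, fn = 90, 10, 895, 5
num = tp * tn - fp * fn                                       # Step 2: numerator
den = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))     # Step 3: denominator
print(num, round(num / den, 4))  # 80500 and MCC ≈ 0.9151
```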
Degenerate Cases and Edge Conditions
MCC is undefined (or conventionally 0) when the denominator is zero. This happens when: (1) All predictions are positive (TN+FN=0), (2) All predictions are negative (TP+FP=0), (3) All actuals are positive (TN+FP=0), or (4) All actuals are negative (TP+FN=0).
In such cases, accuracy can still be computed (e.g., 100% if predicting all positive when all are positive), but MCC correctly indicates that the classifier has no discriminative ability. This is why MCC is preferred for model selection when dealing with imbalanced or edge-case datasets.
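All four degenerate conditions can be enumerated to confirm the zero-denominator behavior. A sketch following the return-0 convention (also used by scikit-learn):

```python
from math import sqrt

def mcc_or_zero(tp, fp, tn, fn):
    den = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / den if den else 0.0

# (tp, fp, tn, fn) tuples covering the four degenerate conditions
cases = {
    "all predicted positive": (10, 90, 0, 0),   # TN + FN = 0
    "all predicted negative": (0, 0, 90, 10),   # TP + FP = 0
    "all actually positive":  (10, 0, 0, 90),   # TN + FP = 0
    "all actually negative":  (0, 10, 90, 0),   # TP + FN = 0
}
for name, counts in cases.items():
    print(name, mcc_or_zero(*counts))  # each prints 0.0
```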
MCC vs Other Metrics: When to Use Each
| Metric | Best For | Limitation |
|---|---|---|
| MCC | Imbalanced binary classification | Less intuitive than accuracy |
| Accuracy | Balanced data, quick check | Misleading when imbalanced |
| F1 | Focus on positive class | Ignores true negatives |
| Balanced Accuracy | Imbalanced data, simple | Ignores prediction-side rates (PPV/NPV) |
| Precision | When FP costly (e.g., spam) | Ignores FN |
| Recall | When FN costly (e.g., cancer) | Ignores FP |
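Computing the main metrics side by side for this page's screening example (TP=85, FP=15, TN=885, FN=15) makes the table concrete:

```python
from math import sqrt

tp, fp, tn, fn = 85, 15, 885, 15
n = tp + fp + tn + fn

accuracy  = (tp + tn) / n
precision = tp / (tp + fp)
recall    = tp / (tp + fn)
f1        = 2 * precision * recall / (precision + recall)
bal_acc   = (recall + tn / (tn + fp)) / 2
mcc       = (tp * tn - fp * fn) / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))

for name, v in [("accuracy", accuracy), ("precision", precision),
                ("recall", recall), ("F1", f1),
                ("balanced accuracy", bal_acc), ("MCC", mcc)]:
    print(f"{name}: {v:.3f}")
```

Accuracy (0.970) is the most flattering number; MCC (0.833) is the most conservative because it accounts for every cell.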
Common Pitfalls When Using MCC
- Using accuracy instead of MCC for imbalanced datasets — an "always negative" classifier can have 99% accuracy but MCC = 0
- Ignoring the sign of MCC — MCC < 0 means the classifier is inversely correlated; swapping predictions would improve performance
- Expecting MCC to be defined when any row or column sum is zero — MCC is undefined (or 0) in such degenerate cases
- Comparing MCC across different datasets without considering class balance — MCC is more interpretable than accuracy but context still matters
- Assuming high MCC means the model is "good enough" — always inspect the confusion matrix for actionable insights
Numerical Stability and Implementation Notes
When implementing MCC in code, avoid overflow for large counts. The product TP×TN and FP×FN can be large; consider using log-space or incremental computation for very large confusion matrices. scikit-learn's matthews_corrcoef handles edge cases and returns 0 for degenerate matrices.
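One way to sketch the log-space idea (note that pure-Python integers are arbitrary precision, so overflow is only a concern for fixed-width integer types such as NumPy's int64; the `mcc_stable` name is ours):

```python
from math import log, exp, copysign

def mcc_stable(tp, fp, tn, fn):
    """MCC with the denominator evaluated in log space, so the product of
    the four marginal sums is never formed explicitly."""
    sums = [tp + fp, tp + fn, tn + fp, tn + fn]
    if 0 in sums:
        return 0.0  # degenerate case, same convention as scikit-learn
    num = tp * tn - fp * fn
    log_den = 0.5 * sum(log(s) for s in sums)
    return copysign(exp(log(abs(num)) - log_den), num) if num else 0.0

print(round(mcc_stable(85, 15, 885, 15), 4))  # 0.8333
```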
For multi-class MCC, the formula generalizes using the full confusion matrix. The result is still in [−1, 1] and measures the correlation between predicted and actual class labels across all classes.
Applications of MCC Beyond Binary Classification
MCC generalizes to multi-class classification. The multi-class MCC uses the full confusion matrix and is computed as a correlation coefficient between predicted and actual labels.
In bioinformatics, MCC is the standard for evaluating protein structure prediction, gene function prediction, and drug-target interaction models. In medical AI, MCC is preferred over accuracy when disease prevalence is low.
For recommender systems and information retrieval, MCC may be less common than precision@k or NDCG, but for binary relevance (click vs no-click), MCC provides a balanced single metric.
References and Further Reading
Matthews, B. W. (1975). Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochimica et Biophysica Acta (BBA) - Protein Structure, 405(2), 442–451. The original paper introducing MCC for protein structure prediction.
Boughorbel, S., Jarray, F., & El-Anbari, M. (2017). Optimal classifier for imbalanced data using Matthews Correlation Coefficient metric. BMC Genomics. Discusses MCC for imbalanced classification in genomics.
Powers, D. M. W. (2011). Evaluation: From Precision, Recall and F-Measure to ROC, Informedness, Markedness & Correlation. Journal of Machine Learning Technologies. Comprehensive survey linking MCC to informedness and markedness.
Use this calculator to evaluate binary classifiers from confusion matrices. MCC is the recommended metric for imbalanced datasets in ML competitions and research. Compare with accuracy, F1, and balanced accuracy for a complete picture.
Disclaimer: This calculator is for educational and ML model evaluation. For medical or critical applications, consult domain experts and use established validation frameworks.
MCC is symmetric: swapping predicted and actual labels does not change MCC. All metrics are computed from the confusion matrix.
Related Calculators
- Average Rating Calculator (Statistics): Compute weighted average rating from star ratings. Bayesian average, Wilson score confidence interval, distribution, mode, median, and standard deviation.
- Box Plot Calculator (Statistics): Generate box-and-whisker plots from data. Quartiles, IQR, fences, outlier detection with Tukey method.
- Class Width Calculator (Statistics): Calculate optimal class width for histograms using Sturges', Scott's, Freedman-Diaconis, Rice, and Square Root rules.
- Coefficient of Variation Calculator (Statistics): Computes CV = (SD/mean) × 100%. Measures relative variability. Compare variability of datasets with different units or scales.
- Constant of Proportionality Calculator (Statistics): Find the constant k in y = kx (direct) or y = k/x (inverse) from data. Test whether data follows a proportional relationship with R² goodness of fit.
- Correlation Coefficient Calculator (Statistics): Compute Pearson's r, Spearman's ρ, and Kendall's τ from paired data. Scatter plot, regression line, residuals, p-value, 95% confidence interval, and R².