
Matthews Correlation Coefficient Calculator

Matthews Correlation Coefficient (MCC) calculator. Binary classification metric for imbalanced data.


Why This Statistical Analysis Matters

Why: Accuracy alone can hide a useless classifier on imbalanced data; MCC summarizes all four confusion matrix cells in a single correlation-style score.

How: Enter the four confusion matrix counts (TP, FP, TN, FN) and the calculator computes MCC along with accuracy, F1, informedness, markedness, and χ².

📊
BINARY CLASSIFICATION

Matthews Correlation Coefficient — The Gold Standard for Imbalanced Data

MCC ranges [-1, +1]. More informative than accuracy when classes are imbalanced. Also computes F1, informedness, markedness, phi, χ².


Confusion Matrix Inputs

Calculated Results (TP=90, FN=5, FP=10, TN=895; n=1000)

| Metric | Value |
| --- | --- |
| Matthews Correlation Coefficient | 0.9151 |
| Accuracy | 98.50% |
| Balanced Accuracy | 96.82% |
| F1 | 92.31% |
| Informedness | 0.9363 |
| Markedness | 0.8944 |
| χ² = n×MCC² | 837.49 |

Confusion Matrix Heatmap

|          | Predicted + | Predicted − |
| --- | --- | --- |
| Actual + | 90 (TP) | 5 (FN) |
| Actual − | 10 (FP) | 895 (TN) |


MCC Scale Visualization (−1 to +1): MCC = 0.9151 — Good

📐 Step-by-Step Formulas

Numerator = TP×TN − FP×FN = 90×895 − 10×5 = 80550 − 50 = 80500
Denominator = √((TP+FP)(TP+FN)(TN+FP)(TN+FN)) = √(100×95×905×900) ≈ 87964
MCC = 80500 / 87964 ≈ 0.9151
χ² = n × MCC² = 1000 × 0.9151² ≈ 837.49

For educational and informational purposes only. Verify with a qualified professional.

Key Takeaways

  • MCC ranges from -1 to +1: +1 = perfect, 0 = random, -1 = inverse prediction
  • MCC is more informative than accuracy for imbalanced datasets — it considers all four confusion matrix cells
  • MCC = √(Informedness × Markedness) — geometric mean of two complementary metrics
  • Phi coefficient equals MCC for 2×2 contingency tables
  • χ² = n × MCC² relates MCC to chi-square test statistic

Did You Know?

📊MCC was introduced by biochemist Brian Matthews in 1975 for protein structure prediction
⚖️MCC is the only metric that is high only when all four confusion matrix cells are correctly predicted
🎯A random classifier always has MCC = 0, regardless of class balance
🔬Phi coefficient and MCC are identical for binary classification — both measure correlation
📐Informedness (Youden's J) measures how informed the predictor is about the true condition; Markedness measures how consistently the true condition is marked by the prediction
🤖scikit-learn uses matthews_corrcoef() for binary and multi-class MCC
🩺Medical AI often reports MCC because accuracy is misleading when disease prevalence is low
📧Spam filters with 99% accuracy can have low MCC if they miss many spam emails (high FN)

How It Works

1. The MCC Formula

MCC = (TP×TN - FP×FN) / √((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The denominator ensures the result is in [-1, 1].
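A minimal Python sketch of this formula, returning 0 for a zero denominator (the convention scikit-learn also uses):

```python
import math

def mcc(tp: int, fp: int, tn: int, fn: int) -> float:
    """Matthews Correlation Coefficient from the four confusion matrix cells."""
    num = tp * tn - fp * fn
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return num / den if den else 0.0

# Calculator example: TP=90, FP=10, TN=895, FN=5
print(round(mcc(90, 10, 895, 5), 4))  # 0.9151
```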

2. Interpretation

MCC > 0.5: good; 0.3–0.5: moderate; 0–0.3: weak; < 0: worse than random. MCC = 0 for random or when any row/column sum is zero.

3. Informedness and Markedness

Informedness = TPR + TNR - 1 (Youden's J). Markedness = PPV + NPV - 1. MCC = √(Informedness × Markedness).
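The geometric-mean identity can be checked numerically against the calculator's example (a plain Python sketch; variable names are illustrative):

```python
import math

tp, fp, tn, fn = 90, 10, 895, 5
j = tp / (tp + fn) + tn / (tn + fp) - 1  # Informedness (Youden's J), about 0.9363
m = tp / (tp + fp) + tn / (tn + fn) - 1  # Markedness, about 0.8944
# For a better-than-random classifier, MCC = sqrt(J * Markedness);
# in general MCC carries the sign of TP*TN - FP*FN.
print(round(math.sqrt(j * m), 4))  # 0.9151
```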

4. Phi and Chi-Square

Phi = MCC for 2×2 tables. χ² = n × MCC². Use chi-square test for significance of association.
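For instance, the example matrix's χ² can be reproduced exactly in plain Python by squaring before dividing, which avoids rounding MCC first:

```python
# Chi-square for a 2x2 table via the identity: chi2 = n * MCC^2
tp, fp, tn, fn = 90, 10, 895, 5
n = tp + fp + tn + fn
num = tp * tn - fp * fn
den_sq = (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
chi2 = n * num ** 2 / den_sq  # no intermediate rounding of MCC
print(round(chi2, 2))  # 837.49
```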

5. Why MCC Over Accuracy?

With 99% negatives, predicting all negative gives 99% accuracy but MCC = 0. MCC penalizes such degenerate solutions.
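A quick numerical illustration of this point (plain Python sketch):

```python
import math

def mcc(tp, fp, tn, fn):
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / den if den else 0.0

# "Always predict negative" on a 99:1 imbalanced set of 1000 samples
tp, fp, tn, fn = 0, 0, 990, 10
accuracy = (tp + tn) / (tp + fp + tn + fn)
print(accuracy)             # 0.99 (looks impressive)
print(mcc(tp, fp, tn, fn))  # 0.0 (no discriminative ability)
```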

Expert Tips

Use MCC for imbalanced data

When classes are 90:10 or worse, accuracy is misleading. MCC remains interpretable.

Compare with F1

F1 focuses on positives; MCC balances all four cells. Use both for full picture.

Threshold tuning

Maximize MCC (or Youden) to find optimal classification threshold.
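This tip can be sketched in plain Python; the scores and labels below are made-up illustration data, not output of any real model:

```python
import math

def mcc_at_threshold(scores, labels, t):
    """MCC of the rule 'predict positive when score >= t'."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
    tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / den if den else 0.0

# Hypothetical scores and true labels, for illustration only
scores = [0.1, 0.3, 0.45, 0.5, 0.55, 0.8, 0.9]
labels = [0,   0,   1,    0,   1,    1,   1  ]

# Try each observed score as a candidate threshold; keep the MCC-maximizing one
best_t = max(sorted(set(scores)), key=lambda t: mcc_at_threshold(scores, labels, t))
print(best_t, round(mcc_at_threshold(scores, labels, best_t), 2))  # 0.55 0.75
```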

Multi-class MCC

MCC generalizes to multi-class via confusion matrix; sklearn supports it.

Confusion Matrix Layout

|          | Predicted + | Predicted − |
| --- | --- | --- |
| Actual + | TP | FN |
| Actual − | FP | TN |

Frequently Asked Questions

When should I use MCC instead of accuracy?

Use MCC when classes are imbalanced (e.g., 95% negative). Accuracy can be 95% with a useless "always negative" classifier; MCC would be 0.

What does MCC = 0 mean?

Random prediction, or when any row/column sum is zero (no positives predicted, or no negatives, etc.).

How does MCC relate to F1?

F1 focuses on precision and recall (positive class). MCC considers all four cells and is symmetric. Both are useful; MCC is better for severe imbalance.

What is Informedness (Youden's J)?

J = TPR + TNR - 1. It measures how well the classifier separates positives from negatives, independent of prevalence. MCC = √(J × Markedness).

What is Markedness?

Markedness = PPV + NPV - 1. It measures how reliable the classifier's predictions are, i.e., how strongly a prediction marks the true class. Complements Informedness.

Is Phi the same as MCC?

Yes, for 2×2 contingency tables. Phi coefficient and MCC use the same formula.

How do I interpret MCC < 0?

The classifier is inversely correlated with the truth — swapping predictions would improve performance.

Can MCC be used for multi-class?

Yes. The multi-class MCC generalizes the formula using the full confusion matrix. scikit-learn supports it.

MCC Interpretation Scale

  • +1: Perfect
  • 0.5–1: Good
  • 0–0.5: Weak to moderate
  • −1–0: Random to inverse

Worked Example: Imbalanced Medical Screening

TP=85, FP=15, TN=885, FN=15. N=1000. Accuracy = (85+885)/1000 = 97%. But prevalence = 10%.

MCC = (85×885 − 15×15) / √(100×100×900×900) = (75225 − 225) / 90000 = 75000 / 90000 ≈ 0.833. This reflects strong performance.

Informedness = 85/100 + 885/900 - 1 ≈ 0.85 + 0.983 - 1 = 0.833. Markedness = 85/100 + 885/900 - 1 ≈ 0.833. MCC ≈ √(0.833×0.833) ≈ 0.833.

A classifier with 97% accuracy could be "predict negative for everyone" (0% sensitivity). MCC would be 0. MCC correctly identifies this as useless.

Step-by-Step MCC Calculation

Step 1: From the confusion matrix, identify TP, FP, TN, FN.

Step 2: Compute the numerator: N = TP×TN − FP×FN.

Step 3: Compute the denominator: D = √((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

Step 4: MCC = N / D. If D = 0 (degenerate case), MCC is undefined or 0.

Step 5: Interpret: MCC ∈ [−1, 1]. +1 = perfect, 0 = random, −1 = inverse.

Example: TP=90, FP=10, TN=895, FN=5. N = 90×895 − 10×5 = 80550 − 50 = 80500. D = √(100×95×905×900) = √7737750000 ≈ 87964. MCC ≈ 80500/87964 ≈ 0.9151.
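The steps above translate directly to Python; the printed intermediates match the worked numbers:

```python
import math

# Step 1: the four confusion matrix cells
tp, fp, tn, fn = 90, 10, 895, 5
# Step 2: numerator
num = tp * tn - fp * fn  # 80550 - 50 = 80500
# Step 3: denominator
den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
# Step 4: MCC = N / D
print(num, round(den), round(num / den, 4))  # 80500 87964 0.9151
```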

Degenerate Cases and Edge Conditions

MCC is undefined (or conventionally 0) when the denominator is zero. This happens when: (1) All predictions are positive (TN+FN=0), (2) All predictions are negative (TP+FP=0), (3) All actuals are positive (TN+FP=0), or (4) All actuals are negative (TP+FN=0).

In such cases, accuracy can still be computed (e.g., 100% if predicting all positive when all are positive), but MCC correctly indicates that the classifier has no discriminative ability. This is why MCC is preferred for model selection when dealing with imbalanced or edge-case datasets.

MCC vs Other Metrics: When to Use Each

| Metric | Best For | Limitation |
| --- | --- | --- |
| MCC | Imbalanced binary classification | Less intuitive than accuracy |
| Accuracy | Balanced data, quick check | Misleading when imbalanced |
| F1 | Focus on positive class | Ignores true negatives |
| Balanced Accuracy | Imbalanced data, simple | Does not consider all four cells |
| Precision | When FP costly (e.g., spam) | Ignores FN |
| Recall | When FN costly (e.g., cancer) | Ignores FP |

Common Pitfalls When Using MCC

  • Using accuracy instead of MCC for imbalanced datasets — an "always negative" classifier can have 99% accuracy but MCC = 0
  • Ignoring the sign of MCC — MCC < 0 means the classifier is inversely correlated; swapping predictions would improve performance
  • Expecting MCC to be defined when any row or column sum is zero — MCC is undefined (or 0) in such degenerate cases
  • Comparing MCC across different datasets without considering class balance — MCC is more interpretable than accuracy but context still matters
  • Assuming high MCC means the model is "good enough" — always inspect the confusion matrix for actionable insights

Numerical Stability and Implementation Notes

When implementing MCC in code, avoid overflow for large counts. The product TP×TN and FP×FN can be large; consider using log-space or incremental computation for very large confusion matrices. scikit-learn's matthews_corrcoef handles edge cases and returns 0 for degenerate matrices.
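Python's integers are arbitrary-precision, so overflow mainly bites in fixed-width languages, but the log-space idea above can be sketched as follows (a sketch of the technique, not scikit-learn's implementation):

```python
import math

def mcc_stable(tp, fp, tn, fn):
    """MCC with the denominator computed in log-space, so the product of the
    four marginal sums is never formed explicitly (helps with fixed-width ints)."""
    sums = [tp + fp, tp + fn, tn + fp, tn + fn]
    if 0 in sums:
        return 0.0  # degenerate matrix: 0 by convention
    log_den = 0.5 * sum(math.log(s) for s in sums)
    return (tp * tn - fp * fn) * math.exp(-log_den)

print(round(mcc_stable(90, 10, 895, 5), 4))  # 0.9151
```

Note that in a fixed-width language the numerator TP×TN − FP×FN would also need a wider type.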

For multi-class MCC, the formula generalizes using the full confusion matrix. The result is still in [−1, 1] and measures the correlation between predicted and actual class labels across all classes.
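A self-contained sketch of the multi-class generalization (Gorodkin's R_K statistic, the formula scikit-learn's matthews_corrcoef implements; the example labels below are made up):

```python
from collections import Counter

def multiclass_mcc(y_true, y_pred):
    """Multi-class MCC (R_K): correlation between predicted and actual labels."""
    n = len(y_true)
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    t_counts = Counter(y_true)  # actual samples per class
    p_counts = Counter(y_pred)  # predicted samples per class
    classes = set(t_counts) | set(p_counts)
    cov_tp = correct * n - sum(t_counts[k] * p_counts[k] for k in classes)
    cov_pp = n * n - sum(c * c for c in p_counts.values())
    cov_tt = n * n - sum(c * c for c in t_counts.values())
    den = (cov_pp * cov_tt) ** 0.5
    return cov_tp / den if den else 0.0

# Made-up 3-class example
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
print(round(multiclass_mcc(y_true, y_pred), 4))  # 0.5222
```

For two classes this reduces to the binary MCC formula, so the same function covers both cases.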

Applications of MCC Beyond Binary Classification

MCC generalizes to multi-class classification. The multi-class MCC uses the full confusion matrix and is computed as a correlation coefficient between predicted and actual labels.

In bioinformatics, MCC is the standard for evaluating protein structure prediction, gene function prediction, and drug-target interaction models. In medical AI, MCC is preferred over accuracy when disease prevalence is low.

For recommender systems and information retrieval, MCC may be less common than precision@k or NDCG, but for binary relevance (click vs no-click), MCC provides a balanced single metric.

References and Further Reading

Matthews, B. W. (1975). Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochimica et Biophysica Acta (BBA) - Protein Structure, 405(2), 442–451. The original paper introducing MCC for protein structure prediction.

Boughorbel, S., Jarray, F., & El-Anbari, M. (2017). Optimal classifier for imbalanced data using Matthews Correlation Coefficient metric. BMC Genomics. Discusses MCC for imbalanced classification in genomics.

Powers, D. M. W. (2011). Evaluation: From Precision, Recall and F-Measure to ROC, Informedness, Markedness & Correlation. Journal of Machine Learning Technologies. Comprehensive survey linking MCC to informedness and markedness.

Use this calculator to evaluate binary classifiers from confusion matrices. MCC is the recommended metric for imbalanced datasets in ML competitions and research. Compare with accuracy, F1, and balanced accuracy for a complete picture.

Disclaimer: This calculator is for educational and ML model evaluation. For medical or critical applications, consult domain experts and use established validation frameworks.

MCC is symmetric: swapping predicted and actual labels does not change MCC. All metrics are computed from the confusion matrix.
