P-Value Calculator: Z-Score to P-Value (All Tail Types)

P-Value Calculator: Z-Scores, Tail Probabilities, and Hypothesis Testing

A p-value is the probability of obtaining a test statistic at least as extreme as the one observed, assuming the null hypothesis is true. For a Z-test with Z-score = 2.0: the two-tailed p-value is 2×P(Z > 2) = 2×0.0228 = 0.0456 — a 4.56% chance of seeing |Z| ≥ 2 by chance. Since 0.0456 < 0.05, we reject the null at the 5% significance level.

Two-tailed p-value: p = 2 × (1 − Φ(|Z|)) | Critical value for α=0.05: |Z| ≥ 1.96

Z-Score	Two-Tailed P	Significance
1.645	0.1000	10% level
1.960	0.0500	5% level ★
2.576	0.0100	1% level

The p-value calculator converts between Z-scores and all five standard probability measures: left tail, right tail, two-tailed, center, and between bounds. It also runs in reverse — enter a significance level α and get the critical Z-score. This is the core tool for Z-tests in statistics and scientific research.

One-Tailed vs Two-Tailed Tests

The choice between one-tailed and two-tailed depends on the hypothesis. A two-tailed test checks for any difference (either direction) and uses p = 2×P(|Z| > |z_obs|). Example: "does drug A change blood pressure (either direction)?" A one-tailed test tests a directional hypothesis and uses p = P(Z > z_obs) for right-tail (or P(Z < z_obs) for left-tail). Example: "does drug A lower blood pressure (specifically lower)?" One-tailed tests have more statistical power for detecting the specified direction but cannot detect effects in the opposite direction. Two-tailed tests are more conservative and more commonly required in publication standards.

Interpreting the P-Value

A p-value of 0.03 means: if the null hypothesis were true, there would be only a 3% chance of observing a test statistic this extreme (or more) by chance. It does NOT mean: "there is a 3% chance the null hypothesis is true" (Bayesian prior) or "there is a 97% chance the alternative hypothesis is true." The p-value is a statement about the data given the null hypothesis, not about the probability of hypotheses. The widely used threshold α = 0.05 is a convention (introduced by R.A. Fisher), not a fundamental law — many fields use 0.01, and particle physics uses 0.0000003 (5σ).

The Five P-Value Regions for Z = 2.0

For Z = 2.0, all five probability regions are: Left tail P(x < 2) = Φ(2) ≈ 0.9772 (97.72% of the distribution lies below Z=2). Right tail P(x > 2) = 1 − 0.9772 = 0.0228. Two-tailed P(|x| > 2) = 2 × 0.0228 = 0.0456. Between P(−2 < x < 2) = 0.9544. Center P(0 < x < 2) = 0.9772 − 0.5 = 0.4772. These five values are interrelated: left tail + right tail = 1; two-tailed = 2 × right tail (for |Z|); between = 1 − two-tailed; center = between/2.

Frequently Asked Questions

What is a p-value in statistics?

A p-value is the probability of observing a test statistic as extreme as (or more extreme than) the one calculated from your data, assuming the null hypothesis is true. For a Z-test with Z=1.96, the two-tailed p-value is P(|Z| ≥ 1.96) ≈ 0.05 — a 5% chance of getting a Z-score this far from zero if the null is true. Small p-values (typically p < 0.05) provide evidence against the null hypothesis. Important: the p-value is NOT the probability that the null hypothesis is true, nor the probability that the result occurred by chance alone.

What Z-score corresponds to p = 0.05 (5% significance level)?

For a two-tailed test at α=0.05, the critical Z-score is ±1.960: P(|Z| > 1.960) = 0.05, meaning 5% of the standard normal distribution lies in the two tails beyond ±1.96. For a one-tailed right test at α=0.05, the critical value is Z=1.645: P(Z > 1.645) = 0.05. For α=0.01 two-tailed: Z=±2.576. For α=0.001 two-tailed: Z=±3.291. The critical value Z* = Φ⁻¹(1−α/2) for two-tailed or Φ⁻¹(1−α) for one-tailed right.

When should I use a one-tailed vs two-tailed p-value?

Use a two-tailed test when testing for any difference from the null (either direction) — this is the standard in most research. Use a one-tailed test only when: (1) there is a clear, pre-specified directional hypothesis based on theory, and (2) the opposite direction would have no scientific meaning. Example of valid one-tailed: testing whether a new drug reduces (not just changes) blood pressure. Two-tailed tests are more conservative (harder to reject the null) and are required by most journals. Never choose the tail after seeing the data — that is p-hacking.

What is statistical significance and what does p < 0.05 mean?

Statistical significance at level α means the p-value is less than α, indicating the observed result is unlikely under the null hypothesis. p < 0.05 is the most common threshold, introduced by R.A. Fisher in the 1920s. For Z=1.96: p = 0.05, just at the boundary. For Z=2.58: p = 0.01. For Z=3.29: p = 0.001 (3 stars in most tables). Statistical significance does not imply practical significance — a tiny, unimportant effect can be highly significant in a large sample. Effect size (Cohen's d, r²) measures practical importance and should always accompany p-values.

How is a Z-score test statistic calculated?

For a one-sample Z-test: Z = (x̄ − μ₀) / (σ/√n), where x̄ is the sample mean, μ₀ is the hypothesized population mean, σ is the known population standard deviation, and n is the sample size. Example: test whether a sample (n=50, x̄=22, σ=5) differs from μ₀=20: Z = (22−20)/(5/√50) = 2/0.707 ≈ 2.83; two-tailed p = 2×(1−Φ(2.83)) ≈ 0.0046 — significant at 1% level. Use the t-distribution (and t-test) instead of Z when σ is estimated from the sample rather than known.

📉P-Value Calculator

P-Value by Tail Type (%)

✦ Ask AI Instead