
[BUG] Pareto _pdf and _log_pdf return nonzero values outside support (x < scale) #967

@ANANYA542

Describe the bug
The Pareto distribution has support x >= scale, so its PDF should be exactly 0 for x < scale. However, Pareto._pdf and Pareto._log_pdf in skpro/distributions/pareto.py evaluate the density formula for all x without checking the support, so they return incorrect, often large, positive densities for inputs below scale.
For example, with alpha=3, scale=2, calling _pdf(1.0) returns 24.0 instead of 0.0.
The _cdf method in the same file already correctly handles this boundary using np.where(x < scale, 0, ...), so the fix pattern is straightforward.
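
For reference, the standard Pareto (Type I) density can be written piecewise as below; the symbol x_m for the scale parameter is notation assumed here, not taken from the skpro source. Substituting the example values into the in-support branch shows where the 24.0 comes from when the support check is skipped:

```latex
f(x;\alpha,x_m) =
\begin{cases}
  \dfrac{\alpha\, x_m^{\alpha}}{x^{\alpha+1}}, & x \ge x_m \\[6pt]
  0, & x < x_m
\end{cases}
\qquad
\left.\frac{\alpha\, x_m^{\alpha}}{x^{\alpha+1}}\right|_{\alpha=3,\; x_m=2,\; x=1}
  = \frac{3\cdot 2^{3}}{1^{4}} = 24
```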

To Reproduce

import numpy as np
from scipy.stats import pareto as scipy_pareto
from skpro.distributions.pareto import Pareto
alpha, scale = 3.0, 2.0
d = Pareto(alpha=alpha, scale=scale)
test_x = [0.5, 1.0, 1.5, 2.0, 3.0, 5.0]
print(f"{'x':>5}  {'in support?':>12}  {'skpro _pdf':>12}  {'scipy pdf':>12}")
for x in test_x:
    x_arr = np.array([[x]])
    skpro_val = float(np.asarray(d._pdf(x_arr)).flat[0])
    scipy_val = scipy_pareto.pdf(x, alpha, scale=scale)
    in_supp = "YES" if x >= scale else "NO"
    print(f"{x:>5.1f}  {in_supp:>12}  {skpro_val:>12.4f}  {scipy_val:>12.4f}")

Output:

    x   in support?     skpro _pdf     scipy pdf
  0.5            NO     384.0000        0.0000
  1.0            NO      24.0000        0.0000
  1.5            NO       4.7407        0.0000
  2.0           YES       1.5000        1.5000
  3.0           YES       0.2963        0.2963
  5.0           YES       0.0384        0.0384

_log_pdf has the same issue: it returns finite values (positive in these examples) instead of -inf:

  x=0.5: skpro=5.9506  scipy=-inf
  x=1.0: skpro=3.1781  scipy=-inf

A screenshot of the full local reproduction is attached to the issue.

Expected behavior
1. _pdf(x) should return 0.0 for any x < scale, matching scipy.stats.pareto.
2. _log_pdf(x) should return -inf for any x < scale.

Environment

  • OS: macOS
  • Python: 3.11
  • skpro: latest main branch

Additional context
The _cdf method in the same file (line 145) already handles the support boundary correctly:

cdf_arr = np.where(x < scale, 0, 1 - np.power(scale / x, alpha))

The _pdf and _log_pdf methods need the same np.where guard, a one-line addition per method: _pdf should return 0 outside the support and _log_pdf should return -inf.
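
A minimal standalone sketch of that guard is below. It is not the skpro implementation: the function names pareto_pdf and pareto_log_pdf are hypothetical, and the real methods would apply the same idea to the distribution's broadcast alpha and scale parameters. Because np.where evaluates both branches, the sketch substitutes scale for out-of-support x before applying the formula, so np.log never sees a non-positive input.

```python
import numpy as np


def pareto_pdf(x, alpha, scale):
    """Pareto density that is exactly 0 for x < scale (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    in_support = x >= scale
    # Replace out-of-support inputs with `scale` so the formula stays finite;
    # those positions are overwritten with 0.0 below.
    safe_x = np.where(in_support, x, scale)
    dens = alpha * np.power(scale, alpha) / np.power(safe_x, alpha + 1)
    return np.where(in_support, dens, 0.0)


def pareto_log_pdf(x, alpha, scale):
    """Pareto log-density that is -inf for x < scale (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    in_support = x >= scale
    safe_x = np.where(in_support, x, scale)
    log_dens = (
        np.log(alpha) + alpha * np.log(scale) - (alpha + 1) * np.log(safe_x)
    )
    return np.where(in_support, log_dens, -np.inf)
```

With alpha=3 and scale=2, pareto_pdf(1.0, 3.0, 2.0) gives 0.0 and pareto_pdf(2.0, 3.0, 2.0) gives 1.5, in line with the scipy column in the table above.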
