Describe the bug
The Pareto distribution is only defined for x >= scale. For x < scale, the PDF should be exactly 0. However, Pareto._pdf and Pareto._log_pdf in skpro/distributions/pareto.py evaluate the density formula for all x without checking the support, so they return positive (and often large) values for inputs below scale.
For example, with alpha=3, scale=2, calling _pdf(1.0) returns 24.0 instead of 0.0.
The _cdf method in the same file already correctly handles this boundary using np.where(x < scale, 0, ...), so the fix pattern is straightforward.
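The reported values are consistent with the standard Pareto Type I density, alpha * scale**alpha / x**(alpha + 1), evaluated with no support check. The sketch below is a hand check of that arithmetic only. The helper name unguarded_pareto_pdf is hypothetical and is not skpro code, and the assumption that skpro uses exactly this expression is inferred from the numbers in this report.

```python
import math


def unguarded_pareto_pdf(x, alpha, scale):
    """Pareto Type I density formula with no support check (hypothetical helper)."""
    return alpha * scale**alpha / x ** (alpha + 1)


alpha, scale = 3.0, 2.0

# All three inputs lie below scale, so the correct density is 0.0 for each.
for x in (0.5, 1.0, 1.5):
    value = unguarded_pareto_pdf(x, alpha, scale)
    print(f"x={x}: pdf={value:.4f}, log_pdf={math.log(value):.4f}")
```

For x = 1.0 this gives 3 * 2**3 / 1**4 = 24.0, which matches the value returned by _pdf.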
To Reproduce
import numpy as np
from scipy.stats import pareto as scipy_pareto
from skpro.distributions.pareto import Pareto
alpha, scale = 3.0, 2.0
d = Pareto(alpha=alpha, scale=scale)
test_x = [0.5, 1.0, 1.5, 2.0, 3.0, 5.0]
print(f"{'x':>5} {'in support?':>12} {'skpro _pdf':>12} {'scipy pdf':>12}")
for x in test_x:
    x_arr = np.array([[x]])
    skpro_val = float(np.asarray(d._pdf(x_arr)).flat[0])
    scipy_val = scipy_pareto.pdf(x, alpha, scale=scale)
    in_supp = "YES" if x >= scale else "NO"
    print(f"{x:>5.1f} {in_supp:>12} {skpro_val:>12.4f} {scipy_val:>12.4f}")
Output:
    x  in support?   skpro _pdf    scipy pdf
  0.5           NO     384.0000       0.0000
  1.0           NO      24.0000       0.0000
  1.5           NO       4.7407       0.0000
  2.0          YES       1.5000       1.5000
  3.0          YES       0.2963       0.2963
  5.0          YES       0.0384       0.0384
_log_pdf has the same issue. For x < scale it returns finite positive values instead of -inf:
x=0.5: skpro=5.9506 scipy=-inf
x=1.0: skpro=3.1781 scipy=-inf
Expected behavior
1. _pdf(x) should return 0.0 for any x < scale, matching scipy.stats.pareto.
2. _log_pdf(x) should return -inf for x < scale.
Environment
- OS: macOS
- Python: 3.11
- skpro: latest main branch
Additional context
The _cdf method in the same file (line 145) already handles the support boundary correctly:
cdf_arr = np.where(x < scale, 0, 1 - np.power(scale / x, alpha))
The _pdf and _log_pdf methods need the same kind of np.where guard, returning 0 and -inf respectively for x < scale. This is a one-line addition per method.
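A minimal sketch of what such a guard could look like, written as standalone NumPy functions rather than as a patch to skpro. The names guarded_pareto_pdf and guarded_pareto_log_pdf are hypothetical, and the density expression is assumed to be the standard Pareto Type I form. The actual method bodies in pareto.py may differ (for example in how alpha and scale are broadcast).

```python
import numpy as np


def guarded_pareto_pdf(x, alpha, scale):
    """Pareto density that is 0 below the support boundary (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    # np.where evaluates both branches, so silence warnings from x <= 0.
    with np.errstate(divide="ignore", invalid="ignore"):
        raw = alpha * np.power(scale, alpha) / np.power(x, alpha + 1)
    return np.where(x < scale, 0.0, raw)


def guarded_pareto_log_pdf(x, alpha, scale):
    """Pareto log-density that is -inf below the support boundary (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    with np.errstate(divide="ignore", invalid="ignore"):
        raw = np.log(alpha) + alpha * np.log(scale) - (alpha + 1) * np.log(x)
    return np.where(x < scale, -np.inf, raw)


test_x = np.array([0.5, 1.0, 1.5, 2.0, 3.0, 5.0])
pdf_vals = guarded_pareto_pdf(test_x, alpha=3.0, scale=2.0)
log_pdf_vals = guarded_pareto_log_pdf(test_x, alpha=3.0, scale=2.0)
print(pdf_vals)      # first three entries are 0.0, then 1.5 at x = 2.0
print(log_pdf_vals)  # first three entries are -inf
```

Because np.where computes both branches before selecting, the np.errstate context only suppresses warnings for non-positive x. It does not change the returned values.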