
Commit 296ac07

blalterman and claude authored
feat(fitfunctions): add hinge, composite, and Heaviside fit functions (#422)
* fix(fitfunctions): catch FitFailedError in make_fit when return_exception=True

  The exception handler on line 813 only caught RuntimeError and ValueError, but FitFailedError (raised by _run_least_squares when max_nfev exceeded) inherits from FitFunctionError, not RuntimeError. This caused make_fit to raise instead of returning the exception when return_exception=True.

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(fitfunctions): add HingeSaturation class for saturation modeling

  Piecewise linear function with hinge point for modeling saturation behavior:
  - Rising region: f(x) = m1*(x-x1) where m1 = yh/(xh-x1)
  - Plateau region: f(x) = m2*(x-x2) where x2 = xh - yh/m2

  Parameters: xh (hinge x), yh (hinge y), x1 (x-intercept), m2 (plateau slope)

  Includes 24 comprehensive tests covering:
  - Function evaluation (rising, plateau, sloped plateau)
  - Parameter recovery from clean and noisy data (2σ tolerance)
  - Initial parameter estimation
  - Weighted fitting with heteroscedastic noise
  - Edge cases and error handling

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* test(fitfunctions): add tests for hinge piecewise linear functions

  Add comprehensive test coverage for TwoLine, Saturation, HingeMin, HingeMax, and HingeAtPoint fit functions. Tests include:
  - Function evaluation with known parameters
  - Parameter recovery from clean and noisy data
  - Derived property consistency (xs, s, theta, m2, x_intercepts)
  - Continuity at hinge points
  - Initial guess (p0) estimation
  - Edge cases and numerical stability

  Tests written first following TDD - implementations in subsequent commits.

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* test(fitfunctions): add tests for Gaussian-Heaviside composite functions

  Add comprehensive test coverage for GaussianPlusHeavySide, GaussianTimesHeavySide, and GaussianTimesHeavySidePlusHeavySide. Tests include:
  - Function evaluation with known parameters
  - Parameter recovery from clean and noisy data
  - Gaussian component behavior (normalization, peak location)
  - Heaviside step transitions
  - Component interaction verification
  - Initial guess (p0) estimation with guess_x0 parameter

  Tests written first following TDD - implementations in subsequent commits.

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* test(fitfunctions): add tests for HeavySide step function

  Add comprehensive test coverage for HeavySide fit function. Tests include:
  - Function evaluation with known parameters
  - Step transition behavior (x < x0, x == x0, x > x0)
  - Parameter recovery from clean and noisy data
  - Initial guess (p0) estimation with optional guess parameters
  - Edge cases (step at data boundary, flat data)
  - TeX function representation

  Tests written first following TDD - implementation in subsequent commit.

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(fitfunctions): add hinge piecewise linear functions

  Add five piecewise linear fit functions for modeling transitions:
  - TwoLine: Two intersecting lines (minimum), params: x1, x2, m1, m2
  - Saturation: Linear rise with saturation plateau, params: x1, xs, s, theta
  - HingeMin: Minimum of two lines at hinge point, params: m1, x1, x2, h
  - HingeMax: Maximum of two lines at hinge point, params: m1, x1, x2, h
  - HingeAtPoint: Piecewise linear with specified hinge point, params: m1, b1, m2, b2

  All classes include:
  - Analytic function definitions using np.minimum/np.maximum
  - Data-driven initial guess (p0) estimation
  - Derived properties (xs, s, theta, m2, x_intercepts as applicable)
  - TeX function representations for plotting

  Contributed from nh/vanishing_speed_hinge_fits.py with improvements:
  - Consistent API with existing FitFunction classes
  - TODO comments for future data-driven p0 estimation

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(fitfunctions): add Gaussian-Heaviside composite functions

  Add three composite fit functions combining Gaussian and Heaviside:
  - GaussianPlusHeavySide: Gaussian + Heaviside step, params: x0, y0, y1, mu, sigma, A
  - GaussianTimesHeavySide: Gaussian × Heaviside step, params: x0, mu, sigma, A
  - GaussianTimesHeavySidePlusHeavySide: (Gaussian × Heaviside) + Heaviside, params: x0, y1, mu, sigma, A

  All classes include:
  - Analytic function definitions
  - Data-driven initial guess (p0) estimation
  - Optional guess_x0 parameter for step location hint
  - TeX function representations

  Contributed from nh/vanishing_speed_hinge_fits.py with fixes:
  - Fixed typo bug: return gaussian_heavy_size -> gaussian_heavy_side
  - Renamed p0_x0 to guess_x0 for API consistency

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(fitfunctions): add HeavySide step function

  Add HeavySide fit function for modeling abrupt transitions:
  - HeavySide: Step function using np.heaviside, params: x0 (transition point), y0 (baseline), y1 (step height)

  Features:
  - Analytic function: y1 * H(x0 - x) + y0
  - Data-driven initial guess (p0) estimation
  - Optional guess_x0, guess_y0, guess_y1 parameters
  - TeX function representation

  Contributed from nh/vanishing_speed_hinge_fits.py with fixes:
  - Implemented p0 estimation (original raised NotImplementedError)

  Also updates __init__.py to export all new classes:
  - TwoLine, Saturation, HingeMin, HingeMax, HingeAtPoint
  - GaussianPlusHeavySide, GaussianTimesHeavySide, GaussianTimesHeavySidePlusHeavySide
  - HeavySide

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs(fitfunctions): add module-specific contribution guide

  Add comprehensive CONTRIBUTING.md for the fitfunctions module covering:
  - Development workflow (TDD: tests before implementation)
  - FitFunction class requirements (function, p0, TeX_function)
  - Data-driven p0 estimation (no hardcoded domain values)
  - Test categories E1-E7 with tolerance specifications
  - Test patterns and anti-patterns
  - Non-trivial test criteria (6 requirements)
  - Test parameterization for DRY multi-case tests
  - Quality checklist for PR submissions

  This standalone document will be integrated into unified project docs once all submodules have contribution standards.

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* style(fitfunctions): apply Black formatting to source and test files

  Fix CI validation failure caused by Black formatting violations in:
  - solarwindpy/fitfunctions/composite.py (2 line-length issues)
  - tests/fitfunctions/test_composite.py
  - tests/fitfunctions/test_heaviside.py
  - tests/fitfunctions/test_hinge.py

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
1 parent 6bbaa8a commit 296ac07

9 files changed

Lines changed: 6936 additions & 3 deletions


Lines changed: 374 additions & 0 deletions
@@ -0,0 +1,374 @@
1+
# Contributing to fitfunctions
2+
3+
This document defines the standards, conventions, and quality requirements for contributing
4+
to the `solarwindpy.fitfunctions` module. It is standalone and will be integrated into
5+
unified project documentation once all submodules have contribution standards.
6+
7+
## 1. Overview
8+
9+
The `fitfunctions` module provides a framework for fitting mathematical models to data
10+
using `scipy.optimize.curve_fit`. Each fit function is a class that inherits from
11+
`FitFunction` and implements three required abstract properties.
12+
13+
**Key files:**
14+
- `core.py` - Base `FitFunction` class and exceptions
15+
- `hinge.py`, `lines.py`, `gaussians.py`, etc. - Concrete implementations
16+
- `tests/fitfunctions/` - Test suite
17+
18+
## 2. Development Workflow (TDD)
19+
20+
Follow Test-Driven Development with separate commits for tests and implementation:
21+
22+
```
1. Requirements   → What does this function model? What are the parameters?
2. Test Writing   → Commit: test(fitfunctions): add tests for <ClassName>
3. Implementation → Commit: feat(fitfunctions): add <ClassName>
4. Verification   → All tests pass, including existing tests
```
28+
29+
**Commit order matters:** Tests are committed before implementation. This documents the
30+
expected behavior and ensures tests are not written to pass existing code.
31+
32+
## 3. FitFunction Class Requirements
33+
34+
### 3.1 Required Abstract Properties
35+
36+
Every `FitFunction` subclass MUST implement these three properties:
37+
38+
| Property | Returns | Purpose |
39+
|----------|---------|---------|
40+
| `function` | callable | The mathematical function `f(x, *params)` to fit |
41+
| `p0` | list | Initial parameter guesses (data-driven) |
42+
| `TeX_function` | str | LaTeX representation for plotting |
43+
44+
**Minimal implementation:**
45+
46+
```python
from .core import FitFunction


class MyFunction(FitFunction):
    r"""One-line description.

    Extended description with math:

    .. math::

        f(x) = m \cdot x + b

    Parameters
    ----------
    xobs : array-like
        Independent variable observations.
    yobs : array-like
        Dependent variable observations.
    **kwargs
        Additional arguments passed to :class:`FitFunction`.
    """

    @property
    def function(self):
        def my_func(x, m, b):
            return m * x + b

        return my_func

    @property
    def p0(self) -> list:
        assert self.sufficient_data
        x = self.observations.used.x
        y = self.observations.used.y
        # Data-driven estimation (see §3.2)
        m = (y[-1] - y[0]) / (x[-1] - x[0])
        b = y[0] - m * x[0]
        return [m, b]

    @property
    def TeX_function(self) -> str:
        return r"f(x) = m \cdot x + b"
```
88+
89+
### 3.2 p0 Estimation (Data-Driven)
90+
91+
Initial parameter guesses MUST be data-driven. Hardcoded domain values are prohibited.
92+
93+
**REQUIRED pattern:**
94+
95+
```python
@property
def p0(self) -> list:
    assert self.sufficient_data
    x = self.observations.used.x
    y = self.observations.used.y

    # Data-driven estimation examples:
    x0 = (x.max() + x.min()) / 2          # Midpoint for transitions
    y0 = np.median(y[x > x0])             # Baseline from data
    m = np.polyfit(x[:10], y[:10], 1)[0]  # Slope from segment
    A = y.max() - y.min()                 # Amplitude from range

    return [x0, y0, m, A]
```
110+
111+
**PROHIBITED (hardcoded values):**
112+
113+
```python
# BAD: Domain-specific hardcoded values
x0 = 425     # Solar wind speed
m1 = 0.0163  # Kasper 2007 value
```
118+
119+
**Why data-driven?**
120+
- Works on arbitrary datasets, not just solar wind
121+
- Enables reuse across scientific domains
122+
- Reduces "magic number" bugs
123+
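As a concrete illustration of the data-driven approach, the sketch below estimates a starting point for a step of the form `y1 * H(x0 - x) + y0` using only the observations. The function name `estimate_step_p0` and the largest-jump heuristic are assumptions made for this example; they are not taken from the module.

```python
import numpy as np


def estimate_step_p0(x, y):
    """Guess [x0, y0, y1] for y = y1 * H(x0 - x) + y0 from the data alone."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x)
    x, y = x[order], y[order]

    # The largest jump between neighbouring points locates the transition.
    i = int(np.argmax(np.abs(np.diff(y))))
    x0 = 0.5 * (x[i] + x[i + 1])

    y0 = np.median(y[x > x0])       # baseline: level to the right of the step
    y1 = np.median(y[x < x0]) - y0  # step height relative to the baseline
    return [x0, y0, y1]


x = np.linspace(0.0, 10.0, 101)
y = np.where(x < 4.05, 5.0, 2.0)
p0 = estimate_step_p0(x, y)
```

No number in the estimate depends on the physical domain, so the same code works after rescaling `x` or `y`. On noisy data a single-point jump can be spurious, so a real implementation would probably smooth or bin `y` before locating the transition.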
124+
### 3.3 Optional Overrides
125+
126+
**Custom `__init__` with guess parameters:**
127+
128+
```python
def __init__(
    self,
    xobs,
    yobs,
    guess_x0: float | None = None,  # Optional user hint
    **kwargs,
):
    self._guess_x0 = guess_x0
    super().__init__(xobs, yobs, **kwargs)
```
139+
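One way a stored hint might be consumed is sketched below: `p0` prefers the user's value and otherwise falls back to a data-driven estimate. `StepGuessDemo` is a hypothetical stand-alone class written for this example; it does not inherit from `FitFunction` and is not the library's implementation.

```python
import numpy as np


class StepGuessDemo:
    """Toy stand-in for a FitFunction subclass (hypothetical, not library code)."""

    def __init__(self, xobs, yobs, guess_x0=None):
        self._x = np.asarray(xobs, dtype=float)
        self._y = np.asarray(yobs, dtype=float)
        self._guess_x0 = guess_x0

    @property
    def p0(self):
        x, y = self._x, self._y
        # Prefer the user's hint; otherwise fall back to the data midpoint.
        if self._guess_x0 is not None:
            x0 = float(self._guess_x0)
        else:
            x0 = 0.5 * (x.max() + x.min())
        y0 = float(np.median(y[x > x0]))  # baseline right of the transition
        return [x0, y0]


x = np.linspace(0.0, 10.0, 11)
y = np.where(x < 3.0, 4.0, 1.0)
default_p0 = StepGuessDemo(x, y).p0
hinted_p0 = StepGuessDemo(x, y, guess_x0=3.0).p0
```

Comparing against `None` rather than testing truthiness matters here, because `guess_x0=0.0` is a legitimate hint.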
140+
**Derived properties:**
141+
142+
```python
@property
def xs(self) -> float:
    """Saturation x-coordinate (derived from fitted params)."""
    return self.popt["x1"] + self.popt["yh"] / self.popt["m1"]
```
148+
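Derived quantities lend themselves to consistency checks. The sketch below uses the hinge relations stated in the commit description (`m1 = yh/(xh-x1)` and `x2 = xh - yh/m2`) to check that two forms of the plateau agree and that the curve is continuous at the hinge. `hinge_saturation` is a stand-alone function written for this example under those stated relations; it is not a copy of `HingeSaturation`.

```python
import numpy as np


def hinge_saturation(x, xh, yh, x1, m2):
    """Rising line through (x1, 0) and (xh, yh), then a plateau of slope m2."""
    m1 = yh / (xh - x1)           # rising slope, derived from the hinge point
    rising = m1 * (x - x1)
    plateau = yh + m2 * (x - xh)  # same line as m2 * (x - x2), x2 = xh - yh / m2
    return np.where(x < xh, rising, plateau)


xh, yh, x1, m2 = 5.0, 10.0, 1.0, 0.5
m1 = yh / (xh - x1)
x2 = xh - yh / m2

# E6-style consistency: both forms of the plateau agree beyond the hinge.
x_plateau = np.array([6.0, 8.0, 12.0])
np.testing.assert_allclose(
    hinge_saturation(x_plateau, xh, yh, x1, m2),
    m2 * (x_plateau - x2),
    rtol=1e-10,  # exact arithmetic, no fitting involved
    err_msg="plateau should equal m2 * (x - x2)",
)

# E7-style continuity: values just left and right of the hinge meet at yh.
eps = 1e-9
left = hinge_saturation(np.array([xh - eps]), xh, yh, x1, m2)
right = hinge_saturation(np.array([xh + eps]), xh, yh, x1, m2)
np.testing.assert_allclose(left, right, atol=1e-6, err_msg="jump at hinge")
```

Writing the plateau as `yh + m2 * (x - xh)` avoids dividing by `m2`, so a perfectly flat plateau (`m2 = 0`) still evaluates cleanly, whereas `x2` is undefined in that case.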
149+
### 3.4 Code Conventions
150+
151+
| Convention | Standard | Example |
152+
|------------|----------|---------|
153+
| Class names | PascalCase | `HingeSaturation`, `GaussianPlusHeavySide` |
154+
| Method names | snake_case | `make_fit()`, `build_plotter()` |
155+
| Property names | snake_case (the `TeX` prefix of `TeX_function` keeps its capitals) | `popt`, `TeX_function` |
156+
| Docstrings | NumPy style with `r"""` | See example above |
157+
| Type hints | Selective (params with defaults) | `guess_x0: float \| None = None` |
158+
| Imports | Relative, future annotations | `from .core import FitFunction` |
159+
| LaTeX | Raw strings | `r"$\chi^2_\nu$"` |
160+
161+
**Import template:**
162+
163+
```python
r"""Module docstring."""

from __future__ import annotations

import numpy as np

from .core import FitFunction
```
172+
173+
## 4. Test Requirements
174+
175+
### 4.1 Test Categories (E1-E7)
176+
177+
Every FitFunction MUST have tests in categories E1-E5. E6-E7 are recommended where applicable.
178+
179+
| Category | Purpose | Tolerance | Required? |
180+
|----------|---------|-----------|-----------|
181+
| E1. Function Evaluation | Verify exact f(x) values | `rtol=1e-10` | YES |
182+
| E2. Parameter Recovery (Clean) | Fit recovers known params | `rel_error < 2%` | YES |
183+
| E3. Parameter Recovery (Noisy) | Statistical precision | `deviation < 2σ` | YES |
184+
| E4. Initial Parameter (p0) | p0 enables convergence | `isfinite(popt)` | YES |
185+
| E5. Edge Cases | Error handling | `raises Exception` | YES |
186+
| E6. Derived Properties | Internal consistency | `rtol=1e-6` | If applicable |
187+
| E7. Behavioral | Continuity, transitions | `rtol=0.1` | If applicable |
188+
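The E3 criterion can be sketched without the fitting framework. In the example below `np.polyfit` stands in for `scipy.optimize.curve_fit` to keep the sketch dependent on NumPy only, and `within_n_sigma` is a hypothetical helper name. The per-parameter uncertainty comes from the diagonal of the covariance matrix.

```python
import numpy as np


def within_n_sigma(fitted, truth, sigma, n=2.0):
    """Return one boolean per parameter: |fitted - truth| < n * sigma."""
    fitted, truth, sigma = map(np.asarray, (fitted, truth, sigma))
    return np.abs(fitted - truth) < n * sigma


rng = np.random.default_rng(42)  # deterministic seed
true_m, true_b = 2.0, 1.0
noise_std = 0.5

x = np.linspace(0.0, 10.0, 200)
y = true_m * x + true_b + rng.normal(0.0, noise_std, x.size)
w = np.ones_like(x) / noise_std  # weights are 1 / sigma_y

# cov="unscaled" keeps the covariance tied to the stated noise_std instead of
# rescaling it by the reduced chi-square of the fit.
popt, pcov = np.polyfit(x, y, 1, w=w, cov="unscaled")
sigma = np.sqrt(np.diag(pcov))

ok = within_n_sigma(popt, [true_m, true_b], sigma, n=2.0)
```

A 2σ bound is expected to fail for roughly one parameter in twenty across random draws, which is why the fixed seed is part of the test rather than a convenience.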
189+
### 4.2 Fixture Pattern
190+
191+
All fixtures MUST return `(x, y, w, true_params)`:
192+
193+
```python
@pytest.fixture
def clean_gaussian_data():
    """Clean Gaussian data with known parameters."""
    true_params = {"mu": 5.0, "sigma": 1.0, "A": 10.0}
    x = np.linspace(0, 10, 200)
    y = gaussian(x, **true_params)
    w = np.ones_like(x)
    return x, y, w, true_params


@pytest.fixture
def noisy_gaussian_data():
    """Noisy Gaussian data with known parameters."""
    rng = np.random.default_rng(42)  # Deterministic seed
    true_params = {"mu": 5.0, "sigma": 1.0, "A": 10.0}
    noise_std = 0.5  # 5% of amplitude

    x = np.linspace(0, 10, 200)
    y_true = gaussian(x, **true_params)
    y = y_true + rng.normal(0, noise_std, len(x))
    w = np.ones_like(x) / noise_std

    return x, y, w, true_params
```
218+
219+
**Conventions:**
220+
- Random seed: `np.random.default_rng(42)` for reproducibility
221+
- Noise level: 3-5% of signal amplitude
222+
- Weights: `w = np.ones_like(x) / noise_std` for noisy data
223+
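The fixtures above call a `gaussian` helper that this guide does not define. A minimal sketch of such a helper, together with a plain function that builds the same `(x, y, w, true_params)` tuple, might look like the following. The unnormalized form with peak height `A` is an assumption, and `make_noisy_gaussian_data` is a hypothetical name; the test suite's own helper may differ.

```python
import numpy as np


def gaussian(x, mu, sigma, A):
    """Assumed test helper: Gaussian with peak height A at x = mu."""
    return A * np.exp(-0.5 * ((x - mu) / sigma) ** 2)


def make_noisy_gaussian_data(seed=42, noise_frac=0.05):
    """Build (x, y, w, true_params) following the fixture conventions."""
    rng = np.random.default_rng(seed)
    true_params = {"mu": 5.0, "sigma": 1.0, "A": 10.0}
    noise_std = noise_frac * true_params["A"]  # noise as a fraction of amplitude

    x = np.linspace(0.0, 10.0, 200)
    y = gaussian(x, **true_params) + rng.normal(0.0, noise_std, x.size)
    w = np.ones_like(x) / noise_std
    return x, y, w, true_params


x, y, w, true_params = make_noisy_gaussian_data()
```

Deriving `noise_std` from the amplitude keeps the 3-5% convention visible in the code instead of leaving it to a comment.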
224+
### 4.3 Assertion Patterns
225+
226+
**REQUIRED: Use `np.testing.assert_allclose` with `err_msg`:**
227+
228+
```python
np.testing.assert_allclose(
    result, expected, rtol=0.02,
    err_msg=f"param: fitted={result:.4f}, expected={expected:.4f}"
)
```
234+
235+
**Tolerance Reference:**
236+
237+
| Test Type | Tolerance | Justification |
238+
|-----------|-----------|---------------|
239+
| Exact math (E1) | `rtol=1e-10` | Floating point precision |
240+
| Clean fitting (E2) | `rel_error < 0.02` | curve_fit convergence |
241+
| Noisy fitting (E3) | `deviation < 2*sigma` | ~95% coverage for Gaussian errors |
242+
| Derived quantities (E6) | `rtol=1e-6` | Computed from fitted params |
243+
| Behavioral (E7) | `rtol=0.1` | Approximate behavior |
244+
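When `popt` is a dict of named parameters, the assertion pattern can be applied per parameter so that a failure names the offending one. The helper below is a sketch; `assert_params_close` is a hypothetical name and the `popt` values are illustrative, not fit results.

```python
import numpy as np


def assert_params_close(popt, true_params, rtol=0.02):
    """Compare each fitted parameter to its known value with a clear message."""
    for name, expected in true_params.items():
        fitted = popt[name]
        np.testing.assert_allclose(
            fitted,
            expected,
            rtol=rtol,  # 2%: clean-data curve_fit convergence (E2)
            err_msg=f"{name}: fitted={fitted:.4f}, expected={expected:.4f}",
        )


true_params = {"mu": 5.0, "sigma": 1.0, "A": 10.0}
popt = {"mu": 5.01, "sigma": 0.99, "A": 10.1}  # illustrative values only
assert_params_close(popt, true_params)
```

`assert_allclose` checks `|actual - desired| <= atol + rtol * |desired|`, so a purely relative tolerance demands exact equality when the true value is zero. A case such as `mu = 0.0` therefore needs an explicit `atol` as well.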
245+
### 4.4 Test Parameterization (REQUIRED for multi-case tests)
246+
247+
Use `@pytest.mark.parametrize` to avoid code duplication:
248+
249+
**Pattern 1: Multiple parameter sets**
250+
251+
```python
@pytest.mark.parametrize(
    "true_params",
    [
        {"mu": 5.0, "sigma": 1.0, "A": 10.0},  # Standard case
        {"mu": 0.0, "sigma": 0.5, "A": 5.0},   # Edge: mu at origin
        {"mu": 10.0, "sigma": 2.0, "A": 1.0},  # Edge: small amplitude
    ],
    ids=["standard", "mu_at_origin", "small_amplitude"],
)
def test_gaussian_recovers_parameters(true_params):
    """Test parameter recovery across configurations."""
    # Single test logic, multiple cases
```
265+
266+
**Pattern 2: Multiple classes**
267+
268+
```python
@pytest.mark.parametrize("cls", [Line, LineXintercept])
def test_make_fit_success(cls, simple_linear_data):
    """All line classes should fit successfully."""
    x, y, w, _ = simple_linear_data
    fit = cls(x, y)
    fit.make_fit()
    assert np.isfinite(fit.popt['m'])
```
277+
278+
**Best practices:**
279+
- Use `ids=` for readable test names
280+
- Use dict for multiple parameters
281+
- Document each case with comments
282+
- Include edge cases
283+
284+
## 5. Test Patterns (DO THIS)
285+
286+
| Pattern | Purpose | Example |
287+
|---------|---------|---------|
288+
| Analytic expected values | Verify math is correct | `expected = m * x + b` |
289+
| Known parameter recovery | Verify fitting works | Generate data, fit, compare |
290+
| Statistical bounds | Handle noise properly | `assert deviation < 2 * sigma` |
291+
| Boundary conditions | Verify edge behavior | Test at x=x0 for step functions |
292+
| Derived property consistency | Verify internal math | `assert_allclose(m2, (y2-y1)/(x2-x1), rtol=1e-6)` |
293+
| Documented tolerances | Explain precision | `rtol=0.02 # curve_fit convergence` |
294+
| Error messages with context | Enable debugging | `err_msg=f"fitted={x:.3f}"` |
295+
| Deterministic random seeds | Reproducibility | `rng = np.random.default_rng(42)` |
296+
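The boundary-condition pattern can be made concrete for a step of the form `y1 * H(x0 - x) + y0`. `np.heaviside` takes the value at zero as its second argument, so the result at `x == x0` depends on that choice. The value 0.5 used below is an assumption made for this sketch, not necessarily what `HeavySide` uses.

```python
import numpy as np


def step(x, x0, y0, y1, at_x0=0.5):
    """y1 * H(x0 - x) + y0; at_x0 is the value H takes when x == x0."""
    return y1 * np.heaviside(x0 - x, at_x0) + y0


x0, y0, y1 = 2.0, 1.0, 4.0
x = np.array([1.0, 2.0, 3.0])  # below, at, and above the transition
expected = np.array([y0 + y1, y0 + 0.5 * y1, y0])

np.testing.assert_allclose(
    step(x, x0, y0, y1),
    expected,
    rtol=1e-10,  # exact arithmetic, no fitting involved
    err_msg="step should be y0 + y1 left of x0 and y0 right of x0",
)
```

Testing all three regions catches both a reversed step direction and an unintended convention at the transition point.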
297+
## 6. Test Anti-Patterns (DO NOT DO THIS)
298+
299+
| Anti-Pattern | Why It's Bad | Good Alternative |
300+
|--------------|--------------|------------------|
301+
| `assert fit.popt is not None` | Proves nothing about correctness | `assert_allclose(fit.popt['m'], 2.0, rtol=0.02)` |
302+
| `assert isinstance(fit.popt, dict)` | Verifies structure, not behavior | Verify actual parameter values |
303+
| `assert len(fit.popt) == 3` | Trivial, no math validation | Verify each parameter value |
304+
| `rtol=0.1 # works` | Unexplained, arbitrary | `rtol=0.02 # curve_fit convergence` |
305+
| `assert a == b` (no message) | Hard to debug failures | `assert a == b, f"got {a}, expected {b}"` |
306+
| `rtol=1e-15` for noisy data | Flaky tests | `rtol=0.02` for fitting |
307+
| Only test clean data | Misses real-world behavior | Include noisy data with 2σ bounds |
308+
| `np.random.normal(...)` | Non-reproducible failures | `rng.normal(...)` with fixed seed |
309+
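The last row of the table comes down to where the random state lives. A locally seeded `Generator` gives the same draws on every run and in any test order, as the short sketch below shows; `make_noise` is a hypothetical helper name.

```python
import numpy as np


def make_noise(seed, size=5, noise_std=0.5):
    """Draw Gaussian noise from a locally seeded Generator."""
    rng = np.random.default_rng(seed)
    return rng.normal(0.0, noise_std, size)


first = make_noise(42)
second = make_noise(42)  # same seed, same draws
other = make_noise(43)   # different seed, different draws

np.testing.assert_array_equal(first, second)
```

Calls to `np.random.normal(...)` draw from NumPy's shared global state instead, so their output changes with whatever other code has consumed random numbers beforehand.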
310+
## 7. Non-Trivial Test Criteria
311+
312+
**Definition:** A test is **non-trivial** if it would FAIL on a plausible incorrect implementation.
313+
314+
Every test MUST satisfy ALL of these criteria:
315+
316+
| Criterion | Requirement | Anti-Example |
317+
|-----------|-------------|--------------|
318+
| Numeric assertion | Uses `assert_allclose` with explicit tolerance | `assert popt is not None` |
319+
| Known expected value | Expected value is analytically computed | `assert result` (truthy) |
320+
| Justified tolerance | rtol/atol documented with reasoning | `rtol=0.1 # seems to work` |
321+
| Failure diagnostic | Error message shows actual vs expected | Bare `AssertionError` |
322+
| Mathematical meaning | Tests a model property, not structure | `assert len(popt) == 4` |
323+
| Would fail if broken | A plausible bug would cause failure | Test that always passes |
324+
325+
**Example non-trivial test:**
326+
327+
```python
def test_line_evaluates_correctly():
    """Line: f(x) = m*x + b should give exact values.

    Non-trivial because:
    - Tests specific numeric values, not just "runs"
    - Would fail if formula were m*x - b (sign error)
    - Tolerance is 1e-10 (floating point, not fitting)
    """
    m, b = 2.0, 1.0
    x = np.array([0.0, 1.0, 2.0])
    expected = np.array([1.0, 3.0, 5.0])  # Analytically computed

    fit = Line(x, expected)
    result = fit.function(x, m, b)

    np.testing.assert_allclose(
        result, expected, rtol=1e-10,
        err_msg=f"Line(x, m={m}, b={b}) should equal m*x + b"
    )
```
348+
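The "would fail if broken" criterion can be checked directly by running the same assertion against a deliberately wrong implementation. The sketch below does this with plain functions instead of the `Line` class; `line_sign_bug` and `passes_e1` are hypothetical names introduced for the illustration.

```python
import numpy as np


def line(x, m, b):
    return m * x + b


def line_sign_bug(x, m, b):
    return m * x - b  # deliberate sign error


def passes_e1(func):
    """Run the E1-style check against func; return True if it passes."""
    m, b = 2.0, 1.0
    x = np.array([0.0, 1.0, 2.0])
    expected = np.array([1.0, 3.0, 5.0])  # analytically computed
    try:
        np.testing.assert_allclose(func(x, m, b), expected, rtol=1e-10)
    except AssertionError:
        return False
    return True


results = {"correct": passes_e1(line), "sign_bug": passes_e1(line_sign_bug)}
```

The choice of test values matters: with `b = 0.0` both functions would return the same numbers and the sign error would go unnoticed.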
349+
## 8. Quality Checklist
350+
351+
Before submitting a PR, verify:
352+
353+
- [ ] All 3 abstract properties implemented (`function`, `p0`, `TeX_function`)
354+
- [ ] p0 is data-driven (no hardcoded domain values)
355+
- [ ] Tests cover categories E1-E5 minimum
356+
- [ ] All tests are non-trivial (pass criteria in §7)
357+
- [ ] Docstrings complete with `.. math::` blocks
358+
- [ ] `make_fit()` converges on clean data
359+
- [ ] `make_fit()` within 2σ on noisy data
360+
- [ ] Class exported in `__init__.py`
361+
- [ ] No regressions (all existing tests pass)
362+
363+
## 9. Complete Examples
364+
365+
For complete implementation examples, see:
366+
367+
- **Implementation:** `hinge.py:HingeSaturation` (lines 20-180)
368+
- **Tests:** `tests/fitfunctions/test_hinge.py:TestHingeSaturation`
369+
370+
For test organization patterns:
371+
372+
- **Parameterization:** `tests/fitfunctions/test_composite.py` (lines 359-365)
373+
- **Fixtures:** `tests/fitfunctions/test_hinge.py` (fixture definitions)
374+
- **Categories E1-E7:** `tests/fitfunctions/test_heaviside.py` (section headers)
