Collaborator

tswast commented Oct 27, 2025

This commit adds the bigframes.bigquery.st_regionstats method, which lets users compute statistics for a raster band within a given geography.

The implementation includes:

  • A new StRegionStatsOp in bigframes/operations/geo_ops.py.
  • Compiler implementations for both the SQLGlot and Ibis backends.
  • A unit test with a SQL snapshot.
  • A code sample in samples/snippets/wildfire_risk.py that demonstrates the use of the new function.


tswast requested review from a team as code owners October 27, 2025 21:13
tswast requested review from glasnt and jialuoo October 27, 2025 21:13
product-auto-label bot added the "size: l" label (pull request size is large) Oct 27, 2025
product-auto-label bot added the "api: bigquery" label (issues related to the googleapis/python-bigquery-dataframes API) Oct 27, 2025
raster: bigframes.series.Series,
band: str,
*,
options: Mapping[str, Union[str, int, float]] = {},
Collaborator Author

Missing "include".

Also, "options" might not work if it is keyword-only.

Collaborator Author

Also, don't use mutable values as the default.
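
The concern about the `{}` default can be illustrated with a minimal sketch. The helper names below (`collect_buggy`, `normalize_options`) are hypothetical and not part of BigFrames; they only show why a `None` sentinel is the usual replacement for a mutable default value.

```python
from typing import Dict, List, Mapping, Optional, Union

OptionValue = Union[str, int, float]


def collect_buggy(item: int, bucket: List[int] = []) -> List[int]:
    # The default list is created once, when the function is defined,
    # and is then shared by every call that omits `bucket`.
    bucket.append(item)
    return bucket


def normalize_options(
    options: Optional[Mapping[str, OptionValue]] = None,
) -> Dict[str, OptionValue]:
    # Hypothetical helper: use None as the sentinel and copy the caller's
    # mapping, so no state leaks between calls or back to the caller.
    return dict(options) if options is not None else {}
```

With the shared default, a second call to `collect_buggy` still sees the item appended by the first call. `normalize_options` returns a fresh dictionary on every call, so mutating one result affects neither later calls nor the caller's mapping.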

if op.options:
    args.append(bigframes_vendored.ibis.literal(op.options, type="json"))
return bigframes_vendored.ibis.remote_function(
    "st_regionstats", args, output_type="struct<min: float, max: float, sum: float, count: int, mean: float>"  # type: ignore

# See the License for the specific language governing permissions and
# limitations under the License.

import bigframes.bigquery as bbq
Collaborator Author

Should put this inside a test file so we actually run it.

pytest.importorskip("pytest_snapshot")


class TestGeoCompiler:
Collaborator Author

Should use pytest style not unittest style.
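
As a rough illustration of the requested style, a pytest-style test is a plain module-level function that uses bare `assert` statements. The `render_function_call` helper below is a hypothetical stand-in for the code under test. It is not the BigFrames compiler, and the string it produces is not a real SQL snapshot.

```python
def render_function_call(name: str, *args: str) -> str:
    # Hypothetical stand-in for the compiler under test; it only renders
    # NAME(arg1, arg2, ...) and is not part of BigFrames.
    return f"{name.upper()}({', '.join(args)})"


def test_render_function_call_joins_arguments():
    # pytest collects module-level functions named test_*; no TestCase
    # subclass or self.assert* helper is required.
    sql = render_function_call(
        "st_regionstats", "geography", "raster_id", "'band1'"
    )
    assert sql == "ST_REGIONSTATS(geography, raster_id, 'band1')"
```

Compared with a test class, the function form keeps shared setup in fixtures rather than in instance state, which is likely the motivation for the comment.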

google-labs-jules bot and others added 7 commits October 27, 2025 21:54
This commit adds the `bigframes.bigquery.st_regionstats` method, which allows users to compute statistics for a raster band within a given geography.

The implementation includes:
- A new `StRegionStatsOp` in `bigframes/operations/geo_ops.py`.
- Compiler implementations for both the SQLGlot and Ibis backends.
- A unit test with a SQL snapshot.
- A system test that demonstrates the use of the new function.

This commit also addresses feedback from the code review, including:
- Adding `area` to the output struct of `st_regionstats`.
- Making the `options` parameter a positional argument.
- Adding comments to explain the use of `pass_op`.
- Converting the unit test to a pytest-style function.
- Moving the sample code to a system test.

This commit introduces the `st_regionstats` method in `bigframes.bigquery`, allowing users to compute statistics for a raster band within a given geography.

Key changes:
- Added `StRegionStatsOp` in `bigframes/operations/geo_ops.py`.
- Implemented compilation logic for the operation in both Ibis and SQLGlot compilers.
- Exposed the `st_regionstats` function in `bigframes/bigquery/_operations/geo.py` and the public API.
- Added a new `_apply_ternary_op` method to `bigframes.series.Series`.
- Included a unit test with a snapshot to verify the generated SQL.
- Added a system test that demonstrates the functionality by converting a complex wildfire risk analysis query from SQL to BigFrames.
- Refactored compiler registries to support `pass_op=True` for ternary operations, enabling access to operator parameters during compilation.
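
The `pass_op=True` idea in the last bullet can be sketched with a toy registry. Everything below (`ToyTernaryOp`, `ToyOpRegistry`, `register_ternary_op`, `compile`) is a hypothetical illustration written for this note and is not the BigFrames registry API. It only shows how a registry might forward the operator instance so that an implementation can read parameters stored on the op.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Type


@dataclass(frozen=True)
class ToyTernaryOp:
    # Hypothetical op carrying a parameter that is not a column input.
    name: str


class ToyOpRegistry:
    def __init__(self) -> None:
        self._impls: Dict[Type, Callable[..., str]] = {}

    def register_ternary_op(self, op_type: Type, pass_op: bool = False):
        def decorator(impl: Callable[..., str]) -> Callable[..., str]:
            def wrapper(op, x: str, y: str, z: str) -> str:
                # Forward the op instance only when the implementation
                # asked for it via pass_op=True.
                return impl(x, y, z, op) if pass_op else impl(x, y, z)

            self._impls[op_type] = wrapper
            return impl

        return decorator

    def compile(self, op, x: str, y: str, z: str) -> str:
        # Look up the implementation by the op's concrete type.
        return self._impls[type(op)](op, x, y, z)


registry = ToyOpRegistry()


@registry.register_ternary_op(ToyTernaryOp, pass_op=True)
def _compile_toy_op(x: str, y: str, z: str, op: ToyTernaryOp) -> str:
    # The op's own parameter (name) is available during compilation.
    return f"{op.name}({x}, {y}, {z})"
```

Under these assumptions, `registry.compile(ToyTernaryOp(name="TOY_FN"), "a", "b", "c")` produces `TOY_FN(a, b, c)`. An implementation registered without `pass_op` would receive only the three inputs.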
product-auto-label bot added the "size: xl" label (pull request size is extra large) and removed the "size: l" label (pull request size is large) Oct 28, 2025
Collaborator Author

Bad merge. Revert these changes.

Collaborator Author

Needs to be reverted too.

product-auto-label bot added the "size: l" label (pull request size is large) and removed the "size: xl" label (pull request size is extra large) Oct 28, 2025
snippet-bot bot commented Oct 29, 2025

Here is the summary of changes.

You are about to add 1 region tag.

This comment is generated by snippet-bot.
If you find problems with this result, please file an issue at:
https://github.com/googleapis/repo-automation-bots/issues.
Comment on lines +41 to +43
# TODO: Add st_simplify when it is available in BigFrames.
# https://github.com/googleapis/python-bigquery-dataframes/issues/1497
# countries["simplified_geometry"] = bq.st_simplify(countries["geometry"], 10000)
Collaborator Author

I'd like to add st_simplify first so that this sample can work without any todos for us.

# [START bigquery_dataframes_st_regionstats]
from typing import cast

import bigframes.bigquery as bq
Collaborator Author

Let's use bbq instead of bq.

Collaborator Author

tswast commented Nov 4, 2025

Closing in favor of #2228

tswast closed this Nov 4, 2025