feat: Add BigFrames.bigquery.st_regionstats method #2200
This commit adds the `BigFrames.bigquery.st_regionstats` method, which allows users to compute statistics for a raster band within a given geography.

The implementation includes:
- A new `StRegionStatsOp` in `bigframes/operations/geo_ops.py`.
- Compiler implementations for both the SQLGlot and Ibis backends.
- A unit test with a SQL snapshot.
- A code sample in `samples/snippets/wildfire_risk.py` that demonstrates the use of the new function.
raster: bigframes.series.Series,
band: str,
*,
options: Mapping[str, Union[str, int, float]] = {},
Missing "include".
Also, "options" might not work if it is keyword-only.
Also, don't use mutable values as the default.
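The concern about mutable defaults can be illustrated with a minimal sketch. The function names here (`append_shared`, `normalize_options`) are hypothetical and are not part of BigFrames; they only show why `= {}` is risky and how a `None` default avoids the problem.

```python
from typing import Dict, Mapping, Optional, Union

OptionValue = Union[str, int, float]


def append_shared(key: str, bucket: Dict[str, int] = {}) -> Dict[str, int]:
    """Anti-pattern: the default dict is created once and reused by every call."""
    bucket[key] = bucket.get(key, 0) + 1
    return bucket


def normalize_options(
    options: Optional[Mapping[str, OptionValue]] = None,
) -> Dict[str, OptionValue]:
    """Safer pattern: default to None and build a fresh dict on each call."""
    return dict(options) if options is not None else {}


first = append_shared("a")
second = append_shared("b")
# Both names refer to the same dict, so state leaked between the two calls.
leaked = first is second

fresh_one = normalize_options()
fresh_two = normalize_options()
# Each call returns its own empty dict, so nothing is shared.
independent = fresh_one is not fresh_two
```

A `Mapping` annotation signals read-only intent, but the shared default object still exists at runtime, so the `None` sentinel is the usual way to rule out accidental sharing.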
if op.options:
    args.append(bigframes_vendored.ibis.literal(op.options, type="json"))
return bigframes_vendored.ibis.remote_function(
    "st_regionstats", args, output_type="struct<min: float, max: float, sum: float, count: int, mean: float>"  # type: ignore
Per https://cloud.google.com/bigquery/docs/reference/standard-sql/geography_functions#st_regionstats it should also include area.
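One way to make a missing field harder to overlook is to build the type string from a field table instead of writing it inline. This is only a sketch: `struct_type` and `REGION_STATS_FIELDS` are hypothetical names, the field list is limited to the fields mentioned in this thread, and the `float` type for `area` is an assumption that should be checked against the linked documentation.

```python
from typing import Mapping

# Fields named in this review thread. The type of "area" is an assumption.
REGION_STATS_FIELDS = {
    "min": "float",
    "max": "float",
    "sum": "float",
    "count": "int",
    "mean": "float",
    "area": "float",  # assumed type; verify against the BigQuery docs
}


def struct_type(fields: Mapping[str, str]) -> str:
    """Render a field table as a struct<...> type string, preserving order."""
    inner = ", ".join(f"{name}: {dtype}" for name, dtype in fields.items())
    return f"struct<{inner}>"


output_type = struct_type(REGION_STATS_FIELDS)
```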
samples/snippets/wildfire_risk.py
Outdated
# See the License for the specific language governing permissions and
# limitations under the License.

import bigframes.bigquery as bbq
Should put this inside a test file so we actually run it.
pytest.importorskip("pytest_snapshot")


class TestGeoCompiler:
Should use pytest style not unittest style.
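The pytest style requested here can be sketched as a module-level function with bare `assert` statements instead of a test class. `build_regionstats_sql` below is a hypothetical stand-in for the compiler under test, and the SQL it renders is illustrative only; it is not the output of the BigFrames compiler.

```python
def build_regionstats_sql(geo_col: str, raster_col: str, band: str) -> str:
    """Hypothetical stand-in for the code under test (not the real compiler)."""
    return f"ST_REGIONSTATS({geo_col}, {raster_col}, '{band}')"


def test_st_regionstats_sql():
    # pytest discovers functions named test_*; no class or self is required.
    sql = build_regionstats_sql("geo", "raster", "band1")
    assert sql == "ST_REGIONSTATS(geo, raster, 'band1')"
```

In the real test, the hard-coded comparison would presumably be replaced by the snapshot fixture that `pytest_snapshot` provides.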
This commit adds the `BigFrames.bigquery.st_regionstats` method, which allows users to compute statistics for a raster band within a given geography.

The implementation includes:
- A new `StRegionStatsOp` in `bigframes/operations/geo_ops.py`.
- Compiler implementations for both the SQLGlot and Ibis backends.
- A unit test with a SQL snapshot.
- A system test that demonstrates the use of the new function.

This commit also addresses feedback from the code review, including:
- Adding `area` to the output struct of `st_regionstats`.
- Making the `options` parameter a positional argument.
- Adding comments to explain the use of `pass_op`.
- Converting the unit test to a pytest-style function.
- Moving the sample code to a system test.
This commit introduces the `st_regionstats` method in `bigframes.bigquery`, allowing users to compute statistics for a raster band within a given geography.

Key changes:
- Added `StRegionStatsOp` in `bigframes/operations/geo_ops.py`.
- Implemented compilation logic for the operation in both Ibis and SQLGlot compilers.
- Exposed the `st_regionstats` function in `bigframes/bigquery/_operations/geo.py` and the public API.
- Added a new `_apply_ternary_op` method to `bigframes.series.Series`.
- Included a unit test with a snapshot to verify the generated SQL.
- Added a system test that demonstrates the functionality by converting a complex wildfire risk analysis query from SQL to BigFrames.
- Refactored compiler registries to support `pass_op=True` for ternary operations, enabling access to operator parameters during compilation.
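The `pass_op=True` idea in the last bullet can be sketched with a toy registry. Everything below (`TernaryRegistry`, `WhereOp`, `LabeledOp`, and the rendered strings) is hypothetical and only illustrates the pattern of optionally handing the operator object to its compile function; it is not the BigFrames registry.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple, Type


@dataclass(frozen=True)
class WhereOp:
    """Hypothetical ternary op with no parameters."""


@dataclass(frozen=True)
class LabeledOp:
    """Hypothetical ternary op that carries a compile-time parameter."""

    label: str


class TernaryRegistry:
    """Maps op types to compile functions, remembering whether each wants the op."""

    def __init__(self) -> None:
        self._impls: Dict[Type[Any], Tuple[Callable[..., str], bool]] = {}

    def register(self, op_type: Type[Any], pass_op: bool = False):
        def decorator(func: Callable[..., str]) -> Callable[..., str]:
            self._impls[op_type] = (func, pass_op)
            return func

        return decorator

    def compile(self, op: Any, x: str, y: str, z: str) -> str:
        func, pass_op = self._impls[type(op)]
        # With pass_op=True the op is passed along so its parameters are visible.
        return func(x, y, z, op) if pass_op else func(x, y, z)


registry = TernaryRegistry()


@registry.register(WhereOp)
def _compile_where(cond: str, a: str, b: str) -> str:
    return f"IF({cond}, {a}, {b})"


@registry.register(LabeledOp, pass_op=True)
def _compile_labeled(x: str, y: str, z: str, op: LabeledOp) -> str:
    return f"DEMO({x}, {y}, {z}, '{op.label}')"
```

Keeping `pass_op` opt-in means existing compile functions that take only the three operands need no changes.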
Bad merge. Revert these changes.
Needs to be reverted too.
Here is the summary of changes. You are about to add 1 region tag.
This comment is generated by snippet-bot.
# TODO: Add st_simplify when it is available in BigFrames.
# https://github.com/googleapis/python-bigquery-dataframes/issues/1497
# countries["simplified_geometry"] = bq.st_simplify(countries["geometry"], 10000)
I'd like to add st_simplify first so that this sample can work without any todos for us.
# [START bigquery_dataframes_st_regionstats]
from typing import cast

import bigframes.bigquery as bq
Let's use bbq instead of bq.
Closing in favor of #2228