Releases: googleapis/python-bigquery-dataframes
v2.9.0
2.9.0 (2025-06-30)
Features
- Add bpd.read_arrow to convert an Arrow object into a bigframes DataFrame (#1855) (633bf98)
- Add experimental polars execution (#1747) (daf0c3b)
- Add size op support in local engine (#1865) (942e66c)
- Create deploy_remote_function and deploy_udf functions to immediately deploy functions to BigQuery (#1832) (c706759)
- Support index item assign in Series (#1868) (c5d251a)
- Support item assignment in series (#1859) (25684ff)
- Support local execution of comparison ops (#1849) (1c45ccb)
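Item assignment on a Series can be sketched as follows. The snippet uses plain pandas as a stand-in so that it runs without a BigQuery project; it assumes the bigframes Series follows the same pandas semantics, which these notes do not spell out.

```python
import pandas as pd

# Plain pandas stand-in for a bigframes Series (assumption: same semantics).
s = pd.Series([10, 20, 30], index=["a", "b", "c"])

# Assign a single item by its index label; other items are left untouched.
s["b"] = 99

print(s.tolist())
```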
Bug Fixes
- Fix bug selecting column repeatedly (#1858) (cc339e9)
- Fix bug with DataFrame.agg for string values (#1870) (81e4d64)
- Generate GoogleSQL instead of legacy SQL data types for dry_run=True from bpd._read_gbq_colab with local pandas DataFrame (#1867) (fab3c38)
- Revert dict back to protobuf in the iam binding update (#1838) (9fb3cb4)
Documentation
v2.8.0
2.8.0 (2025-06-23)
⚠ BREAKING CHANGES
- add required param 'engine' to multimodal functions (#1834)
Features
- Add bpd.options.compute.maximum_result_rows option to limit client data download (#1829) (e22a3f6)
- Add bpd.options.display.repr_mode = "anywidget" to create an interactive display of the results (#1820) (be0a3cf)
- Add DataFrame.ai.forecast() support (#1828) (7bc7f36)
- Add describe() method to Series (#1827) (a4205f8)
- Add required param 'engine' to multimodal functions (#1834) (37666e4)
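For orientation, this is what describe() returns on a pandas Series. It is a sketch under the assumption that the bigframes method mirrors its pandas counterpart; exact quantile values may differ on BigQuery, so only the exact statistics are inspected here.

```python
import pandas as pd

# Plain pandas stand-in (assumption: bigframes Series.describe() mirrors this).
s = pd.Series([1, 2, 3, 4])

# describe() returns a Series of summary statistics keyed by name:
# count, mean, std, min, 25%, 50%, 75%, max.
summary = s.describe()

print(summary[["count", "mean", "min", "max"]])
```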
Performance Improvements
Documentation
v2.7.0
2.7.0 (2025-06-16)
Features
- Add bbq.json_query_array and warn bbq.json_extract_array deprecated (#1811) (dc9eb27)
- Add bbq.json_value_array and deprecate bbq.json_extract_string_array (#1818) (019051e)
- Add groupby cumcount (#1798) (18f43e8)
- Support custom build service account in remote_function (#1796) (e586151)
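As an illustration of groupby cumcount, the pandas sketch below numbers the rows within each group, starting at 0. It assumes the bigframes implementation follows the pandas behavior; the column name and data are made up for the example.

```python
import pandas as pd

# Plain pandas stand-in; "team" is a hypothetical column name.
df = pd.DataFrame({"team": ["a", "b", "a", "a", "b"]})

# cumcount() gives each row its 0-based position within its group,
# in the original row order.
position = df.groupby("team").cumcount()

print(position.tolist())
```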
Bug Fixes
- Correct read_csv behaviours with use_cols, names, index_col (#1804) (855031a)
- Fix single row broadcast with null index (#1803) (080eb7b)
Documentation
v2.6.0
2.6.0 (2025-06-09)
Features
- Add blob.transcribe function (#1773) (86159a7)
- Implement ai.classify() (#1781) (8af26d0)
- Implement item() for Series and Index (#1792) (d2154c8)
- Implement ST_ISCLOSED geography function (#1789) (36bc179)
- Implement ST_LENGTH geography function (#1791) (c5b7fda)
- Support isin with bigframes.pandas.Index arg (#1779) (e480d29)
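The item() method can be illustrated with plain pandas, assuming bigframes matches the pandas contract: it returns the single element of a one-element Series or Index and raises ValueError otherwise.

```python
import pandas as pd

# Plain pandas stand-in (assumption: bigframes item() follows the same contract).
only_value = pd.Series([42]).item()
only_label = pd.Index(["only"]).item()

# item() is only defined for exactly one element.
try:
    pd.Series([1, 2]).item()
    raised = False
except ValueError:
    raised = True

print(only_value, only_label, raised)
```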
Bug Fixes
- Address read_csv with both index_col and use_cols behavior inconsistency with pandas (#1785) (ba7c313)
- Allow KMeans model init parameter as k-means++ alias (#1790) (0b59cf1)
- Replace function now can handle bpd.NA value. (#1786) (7269512)
Documentation
v2.5.0
2.5.0 (2025-05-30)
⚠ BREAKING CHANGES
- the updated ai.map() parameter list is not backward-compatible
Features
- Add bpd.options.bigquery.requests_transport_adapters option (#1755) (bb45db8)
- Add bbq.json_query and warn bbq.json_extract deprecated (#1756) (ec81dd2)
- Add bpd.options.reset() method (#1743) (36c359d)
- Add DataFrame.round method (#1742) (3ea6043)
- Add deferred data uploading (#1720) (1f6442e)
- Add deprecation warning to Gemini-1.5-X, text-embedding-004, and remove legacy models in notebooks and docs (#1723) (80aad9a)
- Add structured output for ai map, ai filter and ai join (#1746) (133ac6b)
- Add support for df.loc[list, column(s)] (#1761) (768a757)
- Include bq schema and query string in dry run results (#1752) (bb51147)
- Support inplace=True in rename and rename_axis (#1744) (734cc65)
- Support unique() for Index (#1750) (27fac78)
- Support astype conversions to and from JSON dtypes (#1716) (8ef4de1)
- Support dict param for dataframe.agg() (#1772) (f9c29c8)
- Support dtype parameter in read_csv for bigquery engine (#1749) (50dca4c)
- Use read api for some peek ops (#1731) (108f4d2)
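Several of the DataFrame features listed above (round, df.loc[list, column(s)], and a dict argument to agg()) can be sketched with plain pandas. The sketch assumes the bigframes methods mirror their pandas counterparts; the column names and values are illustrative only.

```python
import pandas as pd

# Plain pandas stand-in with hypothetical data.
df = pd.DataFrame(
    {"a": [1, 2, 3], "b": [1.234, 2.718, 3.141]},
    index=["x", "y", "z"],
)

# DataFrame.round: round every numeric column to 2 decimal places.
rounded = df.round(2)

# df.loc[list, column(s)]: select rows by a list of labels and a column subset.
subset = df.loc[["x", "z"], ["b"]]

# agg() with a dict: apply a different aggregation to each column.
totals = df.agg({"a": "sum", "b": "max"})

print(rounded["b"].tolist(), subset.index.tolist(), totals["a"])
```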
Bug Fixes
- Fix clip int series with float bounds (#1739) (d451aef)
- Fix error with self-merge operations (#1774) (e5fe143)
- Fix the default value for na_value for numpy conversions (#1766) (0629cac)
- Include location in Session-based temporary storage manager DDL queries (#1780) (acba032)
- Prevent creating unnecessary client objects in multithreaded environments (#1757) (1cf9f5e)
- Reduce bigquery table modification via DML for to_gbq (#1737) (545cdca)
- Stop ignoring arguments to MatrixFactorization.score(X, y) (#1726) (55c07e9)
- Support JSON and STRUCT for bbq.sql_scalar (#1754) (190390b)
- Support str.replace re.compile with flags (#1736) (f8d2cd2)
Performance Improvements
- Faster local data comparison using identity (#1738) (2858b1e)
- Optimize repr for unordered gbq table (#1778) (2bc4fbc)
- Use JOB_CREATION_OPTIONAL when allow_large_results=False (#1763) (15f3f2a)
Dependencies
Documentation
- Add llm output_schema notebook (#1732) (b2261cc)
- Add MatrixFactorization to the table of contents (#1725) (611e43b)
- Fix typo for "population" in the GeminiTextGenerator.predict(..., output_schema={...}) sample notebook (#1748) (bd07e05)
- Integrations notebook extracts token from bqclient._http.credentials instead of bqclient._credentials (#1784) (6e63eca)
- Updated multimodal notebook instructions (#1745) (1df8ca6)
- Use partial ordering mode in the quickstart sample (#1734) (476b7dd)
v2.4.0
2.4.0 (2025-05-12)
Features
- Add "dayofyear" property for dt accessors (#1692) (9d4a59d)
- Add .dt.days, .dt.seconds, dt.microseconds, and dt.total_seconds() for timedelta series. (#1713) (2b3a45f)
- Add DatetimeIndex class (#1719) (c3c830c)
- Add isocalendar() for dt accessor (#1717) (0479763)
- Add bigframes.bigquery.json_value (#1697) (46a9c53)
- Add blob.exif function support (#1703) (3f79528)
- Add inplace arg support to sort methods (#1710) (d1ccb52)
- Improve error message in Series.apply for direct udfs (#1673) (1a658b2)
- Publish bigframes blob(Multimodal) to preview (#1693) (e4c85ba)
- Support () operator between timedeltas (#1702) (edaac89)
- Support forecast_limit_lower_bound and forecast_limit_upper_bound in ARIMA_PLUS (and ARIMA_PLUS_XREG) models (#1305) (b16740e)
- Support to_strip parameter for str.strip, str.lstrip and str.rstrip (#1705) (a84ee75)
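The new dt accessor members can be previewed with plain pandas. This is a sketch that assumes bigframes returns the same values as pandas for dayofyear, isocalendar(), and the timedelta accessors; the sample dates and durations are arbitrary.

```python
import pandas as pd

# Plain pandas stand-in (assumption: bigframes dt accessors match pandas).
dates = pd.Series(pd.to_datetime(["2025-01-01", "2025-12-31"]))
deltas = pd.Series(pd.to_timedelta(["1 days 00:00:05", "2 days 01:00:00"]))

# Day of the year, starting at 1.
day_of_year = dates.dt.dayofyear

# ISO calendar as a DataFrame with "year", "week", and "day" columns.
# 2025-12-31 falls in ISO week 1 of 2026.
iso = dates.dt.isocalendar()

# Timedelta components: whole days, the leftover seconds, and total seconds.
days = deltas.dt.days
seconds = deltas.dt.seconds
total = deltas.dt.total_seconds()

print(day_of_year.tolist(), iso["year"].tolist(), total.tolist())
```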
Bug Fixes
- Fix dayofyear doc test (#1701) (9b777a0)
- Fix issues with chunked arrow data (#1700) (e3289b7)
- Rename columns with protected names such as _TABLE_SUFFIX in to_gbq() (#1691) (8ec6079)
Performance Improvements
- Defer query in read_gbq with wildcard tables (#1661) (5c125c9)
- Rechunk result pages client side (#1680) (67d8760)
Dependencies
Documentation
- Add snippets for Matrix Factorization tutorials (#1630) (24b37ae)
- Deprecate bpd.options.bigquery.allow_large_results in favor of bpd.options.compute.allow_large_results (#1597) (18780b4)
- Include import statement in the bigframes code snippet (#1699) (08d70b6)
- Include the clean-up step in the udf code snippet (#1698) (48992e2)
- Move multimodal notebook out of experimental folder (#1712) (68b6532)
- Update blob_display option in snippets (#1714) (8b30143)
v2.3.0
2.3.0 (2025-05-06)
Features
Bug Fixes
- Guarantee guid thread safety across threads (#1684) (cb0267d)
- Support large lists of lists in bpd.Series() constructor (#1662) (0f4024c)
- Use value equality to check types for unix epoch functions and timestamp diff (#1690) (81e8fb8)
Performance Improvements
Documentation
v2.2.0
2.2.0 (2025-04-30)
Features
- Add gemini-2.0-flash-001 and gemini-2.0-flash-lite-001 to fine tune score endpoints and multimodal endpoints (#1650) (4fb54df)
- Add GeminiTextGenerator.predict structured output (#1653) (6199023)
- DataFrames.__getitem__ support for slice input (#1668) (563f0cb)
- Print right origin of PreviewWarning for the bpd.udf (#1629) (48d10d1)
- Session.bytes_processed_sum will be updated when allow_large_re… (#1669) (ae312db)
- Short circuit query for local scan (#1618) (e84f232)
- Support names parameter in read_csv for bigquery engine (#1659) (3388191)
- Support passing list of values to bigframes.core.sql.simple_literal (#1641) (102d363)
- Support write api as loading option (#1617) (c46ad06)
Bug Fixes
- DataFrame accessors are not populated (#1639) (28afa2c)
- Prefer remote schema instead of throwing on materialize conflicts (#1644) (53fc25b)
- Remove itertools.pairwise usage (#1638) (9662745)
- Resolve issue where pre-release versions of google-auth are installed (#1491) (ebb7a5e)
- Resolve some of the typo errors (#1655) (cd7fbde)
Performance Improvements
Dependencies
Documentation
v2.1.0
2.1.0 (2025-04-22)
Features
- Add bigframes.bigquery.st_distance function (#1637) (bf1ae70)
- Enable local json string validations (#1614) (233347a)
- Enhance read_csv index_col parameter support (#1631) (f4e5b26)
Bug Fixes
- Add retry for test_clean_up_via_context_manager (#1627) (58e7cb0)
- Improve robustness of managed udf code extraction (#1634) (8cc56d5)
Documentation
v2.0.0
2.0.0 (2025-04-17)
⚠ BREAKING CHANGES
- make dataset and name params mandatory in udf (#1619)
- Locational endpoints support is not available in BigFrames 2.0.
- change default LLM model to gemini-2.0-flash-001, drop PaLM2TextGenerator and PaLM2TextEmbeddingGenerator (#1558)
- change default ingress setting for remote_function to internal-only (#1544)
- make remote_function params keyword only (#1537)
- make remote_function default service account explicit (#1537)
- set allow_large_results=False by default (#1541)
Features
- Add on parameter in dataframe.rolling() and dataframe.groupby.rolling() (#1556) (45c9d9f)
- Add component to manage temporary tables (#1559) (0a4e245)
- Add Series.to_pandas_batches() method (#1592) (09ce979)
- Add support for creating a Matrix Factorization model (#1330) (b5297f9)
- Allow input_types, output_type, and dataset to be used positionally in remote_function (#1560) (bcac8c6)
- Allow pandas.cut 'labels' parameter to accept a list of strings (#1549) (af842b1)
- Change default ingress setting for remote_function to internal-only (#1544) (c848a80)
- Detect duplicate column/index names in read_gbq before sending the query. (#1615) (40d6960)
- Drop support for locational endpoints (#1542) (4bf2e43)
- Enable time range rolling for DataFrame, DataFrameGroupBy and SeriesGroupBy (#1605) (b4b7073)
- Improve local data validation (#1598) (815e471)
- Make remote_function default service account explicit (#1537) (9eb9089)
- Set allow_large_results=False by default (#1541) (e9fb712)
- Support bigquery connection in managed function (#1554) (f6f697a)
- Support bq connection path format (#1550) (e7eb918)
- Support gemini-2.0-X models (#1558) (3104fab)
- Support inlining small list, struct, json data (#1589) (2ce891f)
- Support time range rolling on Series. (#1590) (6e98a2c)
- Use session temp tables for all ephemeral storage (#1569) (9711b83)
- Use validated local storage for data uploads (#1612) (aee4159)
- Warn the deprecated max_download_size, random_state and sampling_method parameters in (DataFrame|Series).to_pandas() (#1573) (b9623da)
Bug Fixes
- to_pandas_batches() respects page_size and max_results again (#1572) (27c5905)
- Ensure page_size works correctly in to_pandas_batches when max_results is not set (#1588) (570cff3)
- Include role and service account in IAM exception (#1564) (8c50755)
- Make dataset and name params mandatory in udf (#1619) (637e860)
- Pandas.cut returns labels index for numeric breaks when labels=False (#1548) (b2375de)
- Prevent KeyError in bpd.concat with empty DF and struct/array types DF (#1568) (b4da1cf)
- Read_csv supports tilde local paths and includes index for bigquery_stream write engine (#1580) (352e8e4)
- Use dictionaries to avoid problematic google.iam namespace (#1611) (b03e44f)
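The two to_pandas_batches() fixes above concern how page_size and max_results interact. The pure-Python sketch below illustrates the intended contract under the assumption that page_size caps the rows in each batch and max_results caps the total rows across all batches. The helper iter_batches is hypothetical and is not bigframes code.

```python
from typing import Iterator, List, Optional


def iter_batches(
    rows: List[int],
    page_size: Optional[int] = None,
    max_results: Optional[int] = None,
) -> Iterator[List[int]]:
    """Yield batches of rows (hypothetical helper, not the bigframes API)."""
    # max_results limits the total number of rows returned.
    if max_results is not None:
        rows = rows[:max_results]
    # Without a page_size, everything is returned as a single batch.
    if page_size is None:
        page_size = max(len(rows), 1)
    # page_size limits each batch, whether or not max_results is set.
    for start in range(0, len(rows), page_size):
        yield rows[start:start + page_size]


# page_size alone: 10 rows in batches of at most 4.
sizes_paged = [len(b) for b in iter_batches(list(range(10)), page_size=4)]

# page_size together with max_results: only the first 5 rows are returned.
sizes_capped = [
    len(b) for b in iter_batches(list(range(10)), page_size=2, max_results=5)
]

print(sizes_paged, sizes_capped)
```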
Performance Improvements
Dependencies
- Remove jellyfish dependency (#1604) (1ac0e1e)
- Remove parsy dependency (#1610) (293f676)
- Remove test dependency on pytest-mock package (#1622) (1ba72ea)
- Support shapely versions 1.8.5+ (#1621) (e39ee3b)
Documentation
- Add details for bigquery_connection in @bpd.udf docstring (#1609) ...