Releases: googleapis/python-bigquery-dataframes
v0.5.0
0.5.0 (2023-09-28)
Features
- Add `DataFrame.kurtosis`/`DF.kurt` method (c1900c2)
- Add `DataFrame.rolling` and `DataFrame.expanding` methods (c1900c2)
- Add `items`, `apply` methods to `DataFrame`. (#43) (3adc1b3)
- Add axis param to simple df aggregations (#52) (9cf9972)
- Add index `dtype`, `astype`, `drop`, `fillna`, aggregate attributes. (#38) (1a254a4)
- Add ml.preprocessing.LabelEncoder (#50) (2510461)
- Add ml.preprocessing.MaxAbsScaler (#56) (14b262b)
- Add ml.preprocessing.MinMaxScaler (#64) (392113b)
- Add more index methods (#54) (a6e32aa)
- Support `calculate_p_values` parameter in `bigframes.ml.linear_model.LinearRegression` (c1900c2)
- Support `class_weights="balanced"` in `LogisticRegression` model (c1900c2)
- Support `df[column_name] = df_only_one_column` (c1900c2)
- Support `early_stop` parameter in `bigframes.ml.linear_model.LinearRegression` (c1900c2)
- Support `enable_global_explain` parameter in `bigframes.ml.linear_model.LinearRegression` (c1900c2)
- Support `l2_reg` parameter in `bigframes.ml.linear_model.LinearRegression` (c1900c2)
- Support `learn_rate_strategy` parameter in `bigframes.ml.linear_model.LinearRegression` (c1900c2)
- Support `ls_init_learn_rate` parameter in `bigframes.ml.linear_model.LinearRegression` (c1900c2)
- Support `max_iterations` parameter in `bigframes.ml.linear_model.LinearRegression` (c1900c2)
- Support `min_rel_progress` parameter in `bigframes.ml.linear_model.LinearRegression` (c1900c2)
- Support `optimize_strategy` parameter in `bigframes.ml.linear_model.LinearRegression` (c1900c2)
- Support casting string to integer or float (#59) (3502f83)
Bug Fixes
- Fix header skipping logic in `read_csv` (#49) (d56258c)
- Generate unique ids on join to avoid id collisions (#65) (7ab65e8)
- LabelEncoder params consistent with Sklearn (#60) (632caec)
- Loosen filter items tests to accommodate shifting pandas impl (#41) (edabdbb)
Performance Improvements
- Add ability to cache dataframe and series to session table (#51) (416d7cb)
- Inline small `Series` and `DataFrames` in query text (#45) (5e199ec)
- Reimplement unpivot to use cross join rather than union (#47) (f9a93ce)
- Simplify join order to use multiple order keys instead of string. (#36) (5056da6)
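The unpivot entry above contrasts two strategies. The following is a minimal pure-Python sketch of that difference, not the library's implementation: the function names, the `variable`/`value` output keys, and the sample rows are all hypothetical. The union form scans the table once per value column and concatenates the results, while the cross-join form pairs every row with every column label in a single pass.

```python
from itertools import product


def unpivot_union(rows, id_col, value_cols):
    """Unpivot by scanning `rows` once per value column (UNION ALL style)."""
    out = []
    for col in value_cols:  # one full pass over the table per column
        for row in rows:
            out.append({id_col: row[id_col], "variable": col, "value": row[col]})
    return out


def unpivot_cross_join(rows, id_col, value_cols):
    """Unpivot by cross joining rows with the column labels (single pass)."""
    return [
        {id_col: row[id_col], "variable": col, "value": row[col]}
        for row, col in product(rows, value_cols)
    ]


# Hypothetical sample data: two rows, two value columns.
rows = [{"id": 1, "x": 10, "y": 20}, {"id": 2, "x": 30, "y": 40}]
union_result = unpivot_union(rows, "id", ["x", "y"])
cross_result = unpivot_cross_join(rows, "id", ["x", "y"])
```

Both functions return the same set of long-format records; only the row order differs (column-major for the union form, row-major for the cross join).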
Documentation
- Link to Remote Functions code samples from README and API reference (c1900c2)
v0.4.0
0.4.0 (2023-09-16)
Features
- Add `axis` parameter to `droplevel` and `reorder_levels` (7c6b0dd)
- Add `bfill` and `ffill` to `DataFrame` and `Series` (7c6b0dd)
- Add `DataFrame.combine` and `DataFrame.combine_first` (#27) (7c6b0dd)
- Add `DataFrame.nlargest`, `nsmallest` (7c6b0dd)
- Add `DataFrame.pct_change` and `Series.pct_change` (7c6b0dd)
- Add `DataFrame.skew` and `GroupBy.skew` (7c6b0dd)
- Add `DataFrame.to_dict`, `to_excel`, `to_latex`, `to_records`, `to_string`, `to_markdown`, `to_pickle`, `to_orc` (7c6b0dd)
- Add `diff` method to `DataFrame` and `GroupBy` (7c6b0dd)
- Add `filter` and `reindex` to `Series` and `DataFrame` (7c6b0dd)
- Add `reindex_like` to `DataFrame` and `Series` (7c6b0dd)
- Add `swaplevel` to `DataFrame` and `Series` (7c6b0dd)
- Add partial support for `Series.replace` (7c6b0dd)
- Support `DataFrame.loc[bool_series, column] = scalar` (7c6b0dd)
- Support a persistent `name` in `remote_function` (7c6b0dd)
Bug Fixes
- `remote_function` uses same credentials as other APIs (7c6b0dd)
- Add type hints to models (7c6b0dd)
- Raise error when ARIMAPlus is used with Pipeline (7c6b0dd)
- Remove `transforms` parameter in `model.fit` (breaking change) (7c6b0dd)
- Support column joins with "None indexer" (7c6b0dd)
- Use `Int64Dtype` for literals in `cut` (7c6b0dd)
- Use lowercase strings for parameter literals in `bigframes.ml` (breaking change) (7c6b0dd)
Performance Improvements
- `bigframes-api` label to I/O query jobs (7c6b0dd)
Documentation
v0.3.2
v0.3.1
v0.3.0
0.3.0 (2023-09-02)
Features
- Add `bigframes.get_global_session()` and `bigframes.reset_session()` aliases (a32b747)
- Add `bigframes.pandas.read_pickle` function (a32b747)
- Add `components_`, `explained_variance_`, and `explained_variance_ratio_` properties to `bigframes.ml.decomposition.PCA` (89b9503)
- Add `fit_transform` to `bigquery.ml` transformers (a32b747)
- Add `Series.dropna` and `DataFrame.fillna` (8fab755)
- Add `Series.str` methods `isalpha`, `isdigit`, `isdecimal`, `isalnum`, `isspace`, `islower`, `isupper`, `zfill`, `center` (a32b747)
- Support `bigframes.pandas.merge()` (8fab755)
- Support `DataFrame.isin` with list and dict inputs (8fab755)
- Support `DataFrame.pivot` (a32b747)
- Support `DataFrame.stack` (89b9503)
- Support `DataFrame`-`DataFrame` binary operations (8fab755)
- Support `df[my_column] = [a python list]` (89b9503)
- Support `Index.is_monotonic` (8fab755)
- Support `np.arcsin`, `np.arccos`, `np.arctan`, `np.sinh`, `np.cosh`, `np.tanh`, `np.arcsinh`, `np.arccosh`, `np.arctanh`, `np.exp` with Series argument (89b9503)
- Support `np.sin`, `np.cos`, `np.tan`, `np.log`, `np.log10`, `np.sqrt`, `np.abs` with Series argument (89b9503)
- Support `pow()` and power operator in `DataFrame` and `Series` (8fab755)
- Support `read_json` with `engine=bigquery` for newline-delimited JSON files (89b9503)
- Support `Series.corr` (89b9503)
- Support `Series.map` (8fab755)
- Support for `np.add`, `np.subtract`, `np.multiply`, `np.divide`, `np.power` (8fab755)
- Support MultiIndex for DataFrame columns (a32b747)
- Use `pandas.Index` for column labels (a32b747)
- Use default session and connection in `ml.llm` and `ml.imported` (8fab755)
Bug Fixes
- Add error message to `set_index` (a32b747)
- Align column names with pandas in `DataFrame.agg` results (89b9503)
- Allow (but still not recommended) `ORDER BY` in `read_gbq` input when an `index_col` is defined (89b9503)
- Check for IAM role on the BigQuery connection when initializing a `remote_function` (89b9503)
- Check that types are specified in `read_gbq_function` (a32b747)
- Don't use query cache for Session construction (a32b747)
- Include survey link in abstract `NotImplementedError` exception messages (89b9503)
- Label temp table creation jobs with `source=bigquery-dataframes-temp` label (89b9503)
- Make `X_train` argument names consistent across methods (8fab755)
- Raise AttributeError for unimplemented pandas methods (89b9503)
- Raise exception for invalid function in `read_gbq_function` (a32b747)
- Support spaces in column names in `DataFrame` initializer (89b9503)
Performance Improvements
- Add local cache for `__repr_*__` methods (a32b747)
- Lazily instantiate client library objects (89b9503)
- Use `row_number()` filter for `head`/`tail` (8fab755)
Documentation
- Add ML section under Overview (a32b747)
- Add release status to table of contents (a32b747)
- Add samples and best practices to `read_gbq` docs (a32b747)
- Correct the return types of Dataframe and Series (a32b747)
- Create subfolders for notebooks (a32b747)
- Fix link to GitHub (89b9503)
- Highlight bigframes is open-source (a32b747)
- Sample ML Drug Name Generation notebook (a32b747)
- Set `options.bigquery.project` in sample code (89b9503)
- Transform remote function user guide into sample code (a32b747)
- Update remote function notebook with read_gbq_function usage (8fab755)
Version 0.2.0
0.2.0 (2023-08-17)
Features
- Add KMeans.cluster_centers_.
- Allow column labels to be any type handled by bq df; column labels can now be integers.
- Add dataframegroupby.agg().
- Add Series Property is_monotonic_increasing and is_monotonic_decreasing.
- Add match, fullmatch, get, pad str methods.
- Add series isin function.
Bug Fixes
- Update ML package to use sessions for queries.
- Optimize `read_gbq` with `index_col` set to cluster by `index_col`.
- Raise ValueError if the location mismatched.
- `read_gbq` no longer uses 'time travel' with query inputs.
Documentation
- Add docstring to _uniform_sampling to discourage users from calling it.
Version 0.1.1
Documentation
- Correct link to code repository in `setup.py` and use correct terminology for `console.cloud.google.com` links.
Version 0.1.0
0.1.0 (2023-08-11)
Features
- Add `bigframes.pandas` package with an API compatible with pandas. Supported data sources include: BigQuery SQL queries, BigQuery tables, CSV (local and GCS), Parquet (local and Cloud Storage), and more.
- Add `bigframes.ml` package with an API inspired by scikit-learn. Train machine learning models and run batch prediction, powered by BigQuery ML.