Changelog

PyPI History

2.25.0 (2025-10-13)

Features

Add barh, pie plot types (#2146) (5cc3c5b)
Add Index.eq for consts, aligned objects (#2141) (8514200)
Add output_schema parameter to ai.generate() (#2139) (ef0b0b7)
Create session-scoped cut, DataFrame, MultiIndex, Index, Series, to_datetime, and to_timedelta methods (#2157) (5e1e809)
Replace ML.GENERATE_TEXT with AI.GENERATE for audio transcription (#2151) (a410d0a)
Support string literal inputs for AI functions (#2152) (7600001)

Bug Fixes

Address typo in error message (#2142) (cdf2dd5)
Avoid possible circular imports in global session (#2115) (095c0b8)
Fix too many cluster columns requested by caching (#2155) (35c1c33)
Show progress even in job optional queries (#2119) (1f48d3a)
Yield row count from read session if otherwise unknown (#2148) (8997d4d)

Documentation

Add a brief intro notebook for bbq AI functions (#2150) (1f434fb)
Fix ai function related docs (#2149) (93a0749)
Remove progress bar from getting started template (#2143) (d13abad)

2.24.0 (2025-10-07)

Features

Add ai.classify() to bigframes.bigquery package (#2137) (56e5033)
Add ai.generate() to bigframes.bigquery module (#2128) (3810452)
Add ai.if_() and ai.score() to bigframes.bigquery package (#2132) (32502f4)

Bug Fixes

Fix internal type errors with temporal accessors (#2125) (c390da1)
Fix row count local execution bug (#2133) (ece0762)
Join on, how args are now positional (#2140) (b711815)
Only show JSON dtype warning when accessing dtypes directly (#2136) (eca22ee)
Remove noisy AmbiguousWindowWarning from partial ordering mode (#2129) (4607f86)

Performance Improvements

Scale read stream workers to cpu count (#2135) (67e46cd)

2.23.0 (2025-09-29)

Features

Add ai.generate_double to bigframes.bigquery package (#2111) (6b8154c)

Bug Fixes

Prevent invalid syntax for no-op .replace ops (#2112) (c311876)

Documentation

Add timedelta notebook sample (#2124) (d1a9888)

2.22.0 (2025-09-25)

Features

Add GroupBy.__iter__ (#1394) (c56a78c)
Add ai.generate_int to bigframes.bigquery package (#2109) (af6b862)
Add GroupBy.describe() (#2088) (328a765)
Implement Index.to_list() (#2106) (60056ca)
Implement inplace parameter for DataFrame.drop (#2105) (3487f13)
Support callable for series map method (#2100) (ac25618)
Support df.info() with null index (#2094) (fb81eea)

Bug Fixes

Avoid ibis fillna warning in compiler (#2113) (7ef667b)
Negative start and stop parameter values in Series.str.slice() (#2104) (f57a348)
Throw type error for incomparable join keys (#2098) (9dc9695)
Transformers with non-standard column names throw errors (#2089) (a2daa3f)

2.21.0 (2025-09-17)

Features

Add bigframes.bigquery.to_json (#2078) (0fc795a)
Support average=’binary’ in precision_score() (#2080) (920f381)
Support pandas series in ai.generate_bool (#2086) (a3de53f)

Bug Fixes

Allow bigframes.options.bigquery.credentials to be None (#2092) (78f4001)

2.20.0 (2025-09-16)

Features

Add __dataframe__ interchange support (#2063) (3b46a0d)
Add ai_generate_bool to the bigframes.bigquery package (#2060) (70d6562)
Add bigframes.bigquery.to_json_string (#2076) (41e8f33)
Add rank(pct=True) support (#2084) (c1e871d)
Add StreamingDataFrame.to_bigtable and .to_pubsub start_timestamp parameter (#2066) (a63cbae)
Can call agg with some callables (#2055) (17a1ed9)
Support astype to json (#2073) (6bd6738)
Support pandas.Index as key for DataFrame.setitem() (#2062) (b3cf824)
Support pd.cut() for array-like type (#2064) (21eb213)
Support casting struct to json (#2067) (b0ff718)

Bug Fixes

Deflake ai_gen_bool multimodel test (#2085) (566a37a)
Do not scroll page selector in anywidget repr_mode (#2082) (5ce5d63)
Fix the potential invalid VPC egress configuration (#2068) (cce4966)
Return a DataFrame containing query stats for all non-SELECT statements (#2071) (a52b913)
Use the remote and managed functions for bigframes results (#2079) (49b91e8)

Performance Improvements

Avoid re-authenticating if credentials have already been fetched (#2058) (913de1b)
Improve apply axis=1 performance (#2077) (12e4380)

2.19.0 (2025-09-09)

Features

Add str.join method (#2054) (8804ada)
Support display.max_colwidth option (#2053) (5229e07)
Support VPC egress setting in remote function (#2059) (5df779d)

Bug Fixes

Fix issue mishandling chunked array while loading data (#2051) (873d0ee)
Remove warning for slot_millis_sum (#2047) (425a691)

2.18.0 (2025-09-03)

⚠ BREAKING CHANGES

add allow_large_results option to read_gbq_query, aligning with bpd.options.compute.allow_large_results option (#1935)

Features

Add allow_large_results option to read_gbq_query, aligning with bpd.options.compute.allow_large_results option (#1935) (a7963fe)
Add parameter shuffle for ml.model_selection.train_test_split (#2030) (2c72c56)
Can pivot unordered, unindexed dataframe (#2040) (1a0f710)
Local date accessor execution support (#2034) (7ac6fe1)
Support args in dataframe apply method (#2026) (164c481)
Support args in series apply method (#2013) (d9d725c)
Support callable for dataframe mask method (#2020) (9d4504b)
Support multi-column assignment for DataFrame (#2028) (ba0d23b)
Support string matching in local executor (#2032) (c0b54f0)

Bug Fixes

Fix scalar op lowering tree walk (#2029) (935af10)
Read_csv fails when checking file size for wildcard gcs files (#2019) (b0d620b)
Resolve the validation issue for other arg in dataframe where method (#2042) (8689199)

Performance Improvements

Improve axis=1 aggregation performance (#2036) (fbb2094)
Improve iter_nodes_topo performance using Kahn’s algorithm (#2038) (3961637)
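
For readers unfamiliar with the technique named in the entry above, here is a minimal, generic sketch of Kahn's algorithm for topological ordering. It is not the bigframes implementation: the function name topo_order, the dict-of-dependencies input shape, and the sample plan are all hypothetical and chosen only for illustration. Every dependency is assumed to appear as a key in the graph.

```python
from collections import deque


def topo_order(graph):
    """Order the nodes of a DAG so each node comes after its dependencies.

    `graph` maps a node to the list of nodes it depends on (hypothetical
    input shape; every dependency must also be a key of `graph`).
    """
    # Number of not-yet-emitted dependencies for each node.
    pending = {node: len(deps) for node, deps in graph.items()}
    # Reverse edges: which nodes are waiting on a given node.
    dependents = {node: [] for node in graph}
    for node, deps in graph.items():
        for dep in deps:
            dependents[dep].append(node)

    # Start from nodes that have no dependencies at all.
    ready = deque(node for node, count in pending.items() if count == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        # Emitting a node may unblock the nodes that depend on it.
        for child in dependents[node]:
            pending[child] -= 1
            if pending[child] == 0:
                ready.append(child)

    # Any node left unemitted sits on a cycle.
    if len(order) != len(graph):
        raise ValueError("graph contains a cycle")
    return order


# Hypothetical query-plan-like graph, for illustration only.
plan = {
    "scan": [],
    "filter": ["scan"],
    "project": ["filter"],
    "join": ["project", "scan"],
}
order = topo_order(plan)
```

Each node and edge is visited once, so the walk is linear in the size of the graph and uses an explicit queue rather than recursion.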

2.17.0 (2025-08-22)

Features

Add isin local execution impl (#1993) (26df6e6)
Add reset_index names, col_level, col_fill, allow_duplicates args (#2017) (c02a1b6)
Support callable for series mask method (#2014) (5ac32eb)

2.16.0 (2025-08-20)

Features

Add bigframes.pandas.options.display.precision option (#1979) (15e6175)
Add level, inplace params to reset_index (#1988) (3446950)
Add ML code samples from dbt blog post (#1978) (ebaa244)
Add where, coalesce, fillna, casewhen, invert local impl (#1976) (f7f686c)
Adjust anywidget CSS to prevent overflow (#1981) (204f083)
Format page number in table widget (#1992) (e83836e)
Or, And, Xor can execute locally (#1994) (59c52a5)
Support callable bigframes function for dataframe where (#1990) (44c1ec4)
Support callable for series where method (#2005) (768b82a)
When using repr_mode = "anywidget", numeric values align right (15e6175)

Bug Fixes

Address the packages issue for bigframes function (#1991) (68f1d22)
Correct pypdf dependency specifier for remote PDF functions (#1980) (0bd5e1b)
Enable default retries in calls to BQ Storage Read API (#1985) (f25d7bd)
Fix the copyright year in dbt sample files (#1996) (fad5722)

Performance Improvements

Faster session startup by deferring anon dataset fetch (#1982) (2720c4c)

Documentation

Add examples of running bigframes in kaggle (#2002) (7d89d76)
Remove preview warning from partial ordering mode sample notebook (#1986) (132e0ed)

2.15.0 (2025-08-11)

Features

Add st_buffer, st_centroid, and st_convexhull and their corresponding GeoSeries methods (#1963) (c4c7fa5)
Add first, last support to GroupBy (#1969) (41dda88)
Add value_counts to GroupBy classes (#1974) (82175a4)
Allow callable as a conditional or replacement input in DataFrame.where (#1971) (a8d57d2)
Can cast locally in hybrid engine (#1944) (d9bc4a5)
Df.join lsuffix and rsuffix support (#1857) (26515c3)

Bug Fixes

Add warnings for duplicated or conflicting type hints in bigfram… (#1956) (d38e42c)
Make remote_function more robust when there are create_function retries (#1973) (cd954ac)
Make ExecutionMetrics stats tracking more robust to missing stats (#1977) (feb3ff4)

Performance Improvements

Remove an unnecessary extra dry_run query from read_gbq_table (#1972) (d17b711)

Documentation

Divide BQ DataFrames quickstart code cell (#1975) (fedb8f2)

2.14.0 (2025-08-05)

Features

Dynamic table width for better display across devices (#1948) (a6d30ae)
Retry AI/ML jobs that fail more often (#1965) (25bde9f)
Support series input in managed function (#1920) (62a189f)

Bug Fixes

Enhance type error messages for bigframes functions (#1958) (770918e)

Performance Improvements

Use promote_offsets for consistent row number generation for index.get_loc (#1957) (c67a25a)

Documentation

Add code snippet for storing dataframes to a CSV file (#1943) (a511e09)
Add code snippet for storing dataframes to a CSV file (#1953) (a298a02)

2.13.0 (2025-07-25)

Features

_read_gbq_colab creates hybrid session (#1901) (31b17b0)
Add CSS styling for TableWidget pagination interface (#1934) (5b232d7)
Add row numbering local pushdown in hybrid execution (#1932) (92a2377)
Implement Index.get_loc (#1921) (bbbcaf3)

Bug Fixes

Add license header and correct issues in dbt sample (#1931) (ab01b0a)

Dependencies

Replace google-cloud-iam with grpc-google-iam-v1 (#1864) (e5ff8f7)

2.12.0 (2025-07-23)

Features

Add code samples for dbt bigframes integration (#1898) (7e03252)
Add isin local execution to hybrid engine (#1915) (c0cefd3)
Add ml.metrics.mean_absolute_error method (#1910) (15b8449)
Allow local arithmetic execution in hybrid engine (#1906) (ebdcd02)
Provide day_of_year and day_of_week for dt accessor (#1911) (40e7638)
Support params max_batching_rows, container_cpu, and container_memory for udf (#1897) (8baa912)
Support typed pyarrow.Scalar in assignment (#1930) (cd28e12)

Bug Fixes

Correct min field from max() to min() in remote function tests (#1917) (d5c54fc)
Resolve location reset issue in bigquery options (#1914) (c15cb8a)
Series.str.isdigit in unicode superscripts and fractions (#1924) (8d46c36)

Documentation

Add code snippets for session and IO public docs (#1919) (6e01cbe)
Add snippets for performance optimization doc (#1923) (4da309e)

2.11.0 (2025-07-15)

Features

Add __contains__ to Index, Series, DataFrame (#1899) (07222bf)
Add thresh param for DataFrame.dropna (#1885) (1395a50)
Add concat pushdown for hybrid engine (#1891) (813624d)
Add pagination buttons (prev/next) to anywidget mode for DataFrames (#1841) (8eca767)
Add total_rows property to pandas batches iterator (#1888) (e3f5e65)
Hybrid engine local join support (#1900) (1aa7950)
Support date data type for to_datetime() (#1902) (24050cb)
Support bpd.Series(json_data, dtype="json") (#1882) (05cb7d0)

Bug Fixes

Bpd.merge on common columns (#1905) (a1fa112)
DataFrame string addition respects order (#1894) (52c8233)
Show slot_millis_sum warning only when allow_large_results=False (#1892) (25efabc)
Use query row count metadata instead of table metadata (#1893) (e1ebc53)

2.10.0 (2025-07-08)

Features

df.to_pandas_batches() returns one empty DataFrame if df is empty (#1878) (e43d15d)
Add filter pushdown to hybrid engine (#1871) (6454aff)
Add simple stats support to hybrid local pushdown (#1873) (8715105)

Bug Fixes

Fix issues where duration type returned as int (#1875) (f30f750)

Documentation

Update gsutil commands to gcloud commands (#1876) (c289f70)

2.9.0 (2025-06-30)

Features

Add bpd.read_arrow to convert an Arrow object into a bigframes DataFrame (#1855) (633bf98)
Add experimental polars execution (#1747) (daf0c3b)
Add size op support in local engine (#1865) (942e66c)
Create deploy_remote_function and deploy_udf functions to immediately deploy functions to BigQuery (#1832) (c706759)
Support index item assign in Series (#1868) (c5d251a)
Support item assignment in series (#1859) (25684ff)
Support local execution of comparison ops (#1849) (1c45ccb)

Bug Fixes

Fix bug selecting column repeatedly (#1858) (cc339e9)
Fix bug with DataFrame.agg for string values (#1870) (81e4d64)
Generate GoogleSQL instead of legacy SQL data types for dry_run=True from bpd._read_gbq_colab with local pandas DataFrame (#1867) (fab3c38)
Revert dict back to protobuf in the iam binding update (#1838) (9fb3cb4)

Documentation

Add data visualization samples for public doc (#1847) (15e1277)
Changed broken logo (#1866) (e3c06b4)
Update ai.forecast notebook (#1844) (1863538)

2.8.0 (2025-06-23)

⚠ BREAKING CHANGES

add required param ‘engine’ to multimodal functions (#1834)

Features

Add bpd.options.compute.maximum_result_rows option to limit client data download (#1829) (e22a3f6)
Add bpd.options.display.repr_mode = "anywidget" to create an interactive display of the results (#1820) (be0a3cf)
Add DataFrame.ai.forecast() support (#1828) (7bc7f36)
Add describe() method to Series (#1827) (a4205f8)
Add required param ‘engine’ to multimodal functions (#1834) (37666e4)

Performance Improvements

Produce simpler sql (#1836) (cf9c22a)

Documentation

Add ai.forecast notebook (#1840) (2430497)

2.7.0 (2025-06-16)

Features

Add bbq.json_query_array and warn bbq.json_extract_array deprecated (#1811) (dc9eb27)
Add bbq.json_value_array and deprecate bbq.json_extract_string_array (#1818) (019051e)
Add groupby cumcount (#1798) (18f43e8)
Support custom build service account in remote_function (#1796) (e586151)

Bug Fixes

Correct read_csv behaviours with use_cols, names, index_col (#1804) (855031a)
Fix single row broadcast with null index (#1803) (080eb7b)

Documentation

Document how to use ai.map() for information extraction (#1808) (b586746)
Rearrange README.rst to include a short code sample (#1812) (f6265db)
Use pandas API instead of pandas-like or pandas-compatible (#1825) (aa32369)

2.6.0 (2025-06-09)

Features

Add blob.transcribe function (#1773) (86159a7)
Implement ai.classify() (#1781) (8af26d0)
Implement item() for Series and Index (#1792) (d2154c8)
Implement ST_ISCLOSED geography function (#1789) (36bc179)
Implement ST_LENGTH geography function (#1791) (c5b7fda)
Support isin with bigframes.pandas.Index arg (#1779) (e480d29)

Bug Fixes

Address read_csv with both index_col and use_cols behavior inconsistency with pandas (#1785) (ba7c313)
Allow KMeans model init parameter as k-means++ alias (#1790) (0b59cf1)
Replace function now can handle bpd.NA value. (#1786) (7269512)

Documentation

Adjust strip method examples to match latest pandas (#1797) (817b0c0)
Fix docstrings to improve html rendering of code examples (#1788) (38d9b73)

2.5.0 (2025-05-30)

⚠ BREAKING CHANGES

the updated ai.map() parameter list is not backward-compatible

Features

Add bpd.options.bigquery.requests_transport_adapters option (#1755) (bb45db8)
Add bbq.json_query and warn bbq.json_extract deprecated (#1756) (ec81dd2)
Add bpd.options.reset() method (#1743) (36c359d)
Add DataFrame.round method (#1742) (3ea6043)
Add deferred data uploading (#1720) (1f6442e)
Add deprecation warning to Gemini-1.5-X, text-embedding-004, and remove legacy models in notebooks and docs (#1723) (80aad9a)
Add structured output for ai map, ai filter and ai join (#1746) (133ac6b)
Add support for df.loc[list, column(s)] (768a757)
Include bq schema and query string in dry run results (#1752) (bb51147)
Support inplace=True in rename and rename_axis (#1744) (734cc65)
Support unique() for Index (#1750) (27fac78)
Support astype conversions to and from JSON dtypes (#1716) (8ef4de1)
Support dict param for dataframe.agg() (#1772) (f9c29c8)
Support dtype parameter in read_csv for bigquery engine (#1749) (50dca4c)
Use read api for some peek ops (#1731) (108f4d2)

Bug Fixes

Fix clip int series with float bounds (#1739) (d451aef)
Fix error with self-merge operations (#1774) (e5fe143)
Fix the default value for na_value for numpy conversions (#1766) (0629cac)
Include location in Session-based temporary storage manager DDL queries (#1780) (acba032)
Prevent creating unnecessary client objects in multithreaded environments (#1757) (1cf9f5e)
Reduce bigquery table modification via DML for to_gbq (#1737) (545cdca)
Stop ignoring arguments to MatrixFactorization.score(X, y) (#1726) (55c07e9)
Support JSON and STRUCT for bbq.sql_scalar (#1754) (190390b)
Support str.replace re.compile with flags (#1736) (f8d2cd2)

Performance Improvements

Faster local data comparison using identity (#1738) (2858b1e)
Optimize repr for unordered gbq table (#1778) (2bc4fbc)
Use JOB_CREATION_OPTIONAL when allow_large_results=False (#1763) (15f3f2a)

Dependencies

Avoid gcsfs==2025.5.0 (#1762) (68d5e2c)

Documentation

Add llm output_schema notebook (#1732) (b2261cc)
Add MatrixFactorization to the table of contents (#1725) (611e43b)
Fix typo for “population” in the GeminiTextGenerator.predict(..., output_schema={...}) sample notebook (#1748) (bd07e05)
Integrations notebook extracts token from bqclient._http.credentials instead of bqclient._credentials (#1784) (6e63eca)
Updated multimodal notebook instructions (#1745) (1df8ca6)
Use partial ordering mode in the quickstart sample (#1734) (476b7dd)

2.4.0 (2025-05-12)

Features

Add “dayofyear” property for dt accessors (#1692) (9d4a59d)
Add .dt.days, .dt.seconds, dt.microseconds, and dt.total_seconds() for timedelta series. (#1713) (2b3a45f)
Add DatetimeIndex class (#1719) (c3c830c)
Add isocalendar() for dt accessor (#1717) (0479763)
Add bigframes.bigquery.json_value (#1697) (46a9c53)
Add blob.exif function support (#1703) (3f79528)
Add inplace arg support to sort methods (#1710) (d1ccb52)
Improve error message in Series.apply for direct udfs (#1673) (1a658b2)
Publish bigframes blob(Multimodal) to preview (#1693) (e4c85ba)
Support () operator between timedeltas (#1702) (edaac89)
Support forecast_limit_lower_bound and forecast_limit_upper_bound in ARIMA_PLUS (and ARIMA_PLUS_XREG) models (#1305) (b16740e)
Support to_strip parameter for str.strip, str.lstrip and str.rstrip (#1705) (a84ee75)

Bug Fixes

Fix dayofyear doc test (#1701) (9b777a0)
Fix issues with chunked arrow data (#1700) (e3289b7)
Rename columns with protected names such as _TABLE_SUFFIX in to_gbq() (#1691) (8ec6079)

Performance Improvements

Defer query in read_gbq with wildcard tables (#1661) (5c125c9)
Rechunk result pages client side (#1680) (67d8760)

Dependencies

Move bigtable and pubsub to extras (#1696) (597d817)

Documentation

Add snippets for Matrix Factorization tutorials (#1630) (24b37ae)
Deprecate bpd.options.bigquery.allow_large_results in favor of bpd.options.compute.allow_large_results (#1597) (18780b4)
Include import statement in the bigframes code snippet (#1699) (08d70b6)
Include the clean-up step in the udf code snippet (#1698) (48992e2)
Move multimodal notebook out of experimental folder (#1712) (68b6532)
Update blob_display option in snippets (#1714) (8b30143)

2.3.0 (2025-05-06)

Features

Add dry_run parameter to read_gbq(), read_gbq_table() and read_gbq_query() (#1674) (4c5dee5)

Bug Fixes

Guarantee guid thread safety across threads (#1684) (cb0267d)
Support large lists of lists in bpd.Series() constructor (#1662) (0f4024c)
Use value equality to check types for unix epoch functions and timestamp diff (#1690) (81e8fb8)

Performance Improvements

to_datetime() now avoids caching inputs unless data is inspected to infer format (#1667) (dd08857)

Documentation

Add a visualization notebook to BigFrame samples (#1675) (ee062bf)
Fix spacing of k-means code snippet (#1687) (99f45dd)
Update snippet for Create a k-means model tutorial (#1664) (761c364)

2.2.0 (2025-04-30)

Features

Add gemini-2.0-flash-001 and gemini-2.0-flash-lite-001 to fine tune score endpoints and multimodal endpoints (#1650) (4fb54df)
Add GeminiTextGenerator.predict structured output (#1653) (6199023)
DataFrames.getitem support for slice input (#1668) (563f0cb)
Print right origin of PreviewWarning for the bpd.udf (#1629) (48d10d1)
Session.bytes_processed_sum will be updated when allow_large_re… (#1669) (ae312db)
Short circuit query for local scan (#1618) (e84f232)
Support names parameter in read_csv for bigquery engine (#1659) (3388191)
Support passing list of values to bigframes.core.sql.simple_literal (#1641) (102d363)
Support write api as loading option (#1617) (c46ad06)

Bug Fixes

DataFrame accessors are not populated (#1639) (28afa2c)
Prefer remote schema instead of throwing on materialize conflicts (#1644) (53fc25b)
Remove itertools.pairwise usage (#1638) (9662745)
Resolve issue where pre-release versions of google-auth are installed (#1491) (ebb7a5e)
Resolve some of the typo errors (#1655) (cd7fbde)
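
As background for the itertools.pairwise entry above: itertools.pairwise only exists in Python 3.10 and later, so code that must also run on older interpreters needs an equivalent. The sketch below is the well-known tee-based recipe, not the code bigframes adopted. The changelog does not state why the usage was removed, so compatibility with older Python versions is an assumption here.

```python
import itertools


def pairwise(iterable):
    """Yield overlapping pairs: s -> (s0, s1), (s1, s2), (s2, s3), ..."""
    # Two independent iterators over the same input.
    first, second = itertools.tee(iterable)
    # Advance the second iterator by one so the pairs overlap.
    next(second, None)
    return zip(first, second)


pairs = list(pairwise([1, 2, 3, 4]))
```

An input with fewer than two items yields no pairs.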

Performance Improvements

Fold row count ops when known (#1656) (c958dbe)
Use flyweight for node fields (#1654) (8482bfc)

Dependencies

Support shapely 1.8.5+ again (#1651) (ae83e61)

Documentation

Add JSON data types notebook (#1647) (9128c4a)
Add sample code snippets for udf (#1649) (53caa8d)
Fix bq_dataframes_template notebook to work if partial ordering mode is enabled (#1665) (f442e7a)
Note that udf is in preview and must be python 3.11 compatible (#1629) (48d10d1)

2.1.0 (2025-04-22)

Features

Add bigframes.bigquery.st_distance function (#1637) (bf1ae70)
Enable local json string validations (#1614) (233347a)
Enhance read_csv index_col parameter support (#1631) (f4e5b26)

Bug Fixes

Add retry for test_clean_up_via_context_manager (#1627) (58e7cb0)
Improve robustness of managed udf code extraction (#1634) (8cc56d5)

Documentation

Add code samples in the udf API docstring (#1632) (f68b80c)

2.0.0 (2025-04-17)

⚠ BREAKING CHANGES

make dataset and name params mandatory in udf (#1619)
Locational endpoints support is not available in BigFrames 2.0.
change default LLM model to gemini-2.0-flash-001, drop PaLM2TextGenerator and PaLM2TextEmbeddingGenerator (#1558)
change default ingress setting for remote_function to internal-only (#1544)
make remote_function params keyword only (#1537)
make remote_function default service account explicit (#1537)
set allow_large_results=False by default (#1541)

Features

Add on parameter in dataframe.rolling() and dataframe.groupby.rolling() (#1556) (45c9d9f)
Add component to manage temporary tables (#1559) (0a4e245)
Add Series.to_pandas_batches() method (#1592) (09ce979)
Add support for creating a Matrix Factorization model (#1330) (b5297f9)
Allow input_types, output_type, and dataset to be used positionally in remote_function (#1560) (bcac8c6)
Allow pandas.cut ‘labels’ parameter to accept a list of strings (#1549) (af842b1)
Change default ingress setting for remote_function to internal-only (#1544) (c848a80)
Detect duplicate column/index names in read_gbq before sending the query. (#1615) (40d6960)
Drop support for locational endpoints (#1542) (4bf2e43)
Enable time range rolling for DataFrame, DataFrameGroupBy and SeriesGroupBy (#1605) (b4b7073)
Improve local data validation (#1598) (815e471)
Make remote_function default service account explicit (#1537) (9eb9089)
Set allow_large_results=False by default (#1541) (e9fb712)
Support bigquery connection in managed function (#1554) (f6f697a)
Support bq connection path format (#1550) (e7eb918)
Support gemini-2.0-X models (#1558) (3104fab)
Support inlining small list, struct, json data (#1589) (2ce891f)
Support time range rolling on Series. (#1590) (6e98a2c)
Use session temp tables for all ephemeral storage (#1569) (9711b83)
Use validated local storage for data uploads (#1612) (aee4159)
Warn about the deprecated max_download_size, random_state and sampling_method parameters in (DataFrame|Series).to_pandas() (#1573) (b9623da)

Bug Fixes

to_pandas_batches() respects page_size and max_results again (#1572) (27c5905)
Ensure page_size works correctly in to_pandas_batches when max_results is not set (#1588) (570cff3)
Include role and service account in IAM exception (#1564) (8c50755)
Make dataset and name params mandatory in udf (#1619) (637e860)
Pandas.cut returns labels index for numeric breaks when labels=False (#1548) (b2375de)
Prevent KeyError in bpd.concat with empty DF and struct/array types DF (#1568) (b4da1cf)
Read_csv supports tilde local paths and includes index for bigquery_stream write engine (#1580) (352e8e4)
Use dictionaries to avoid problematic google.iam namespace (#1611) (b03e44f)

Performance Improvements

Directly read gbq table for simple plans (#1607) (6ad38e8)

Dependencies

Remove jellyfish dependency (#1604) (1ac0e1e)
Remove parsy dependency (#1610) (293f676)
Remove test dependency on pytest-mock package (#1622) (1ba72ea)
Support shapely versions 1.8.5+ (#1621) (e39ee3b)

Documentation

Add details for bigquery_connection in @bpd.udf docstring (#1609) (ef63772)
Add explain forecast snippet to multiple time series tutorial (#1586) (40c55a0)
Add message to remove default model for version 3.0 (#1563) (910be2b)
Add samples for ArimaPlus time_series_id_col feature (#1577) (1e4cd9c)
Add warning for bigframes 2.0 (#1557) (3f0eaa1)
Deprecate default model in TextEmbeddingGenerator, GeminiTextGenerator, and other bigframes.ml.llm classes (#1570) (89ab33e)
Include all licenses for vendored packages in the root LICENSE file (#1626) (8116ed0)
Remove gemini-1.5 deprecation warning for GeminiTextGenerator (#1562) (0cc6784)
Use restructured text to allow publishing to PyPI (#1565) (d1e9ec2)

Miscellaneous Chores

Make remote_function params keyword only (#1537) (9eb9089)

1.42.0 (2025-03-27)

Features

Add closed parameter in rolling() (#1539) (8bcc89b)
Add GeoSeries.difference() and bigframes.bigquery.st_difference() (#1471) (e9fe815)
Add GeoSeries.intersection() and bigframes.bigquery.st_intersection() (#1529) (8542bd4)
Add df.take and series.take (#1509) (7d00be6)
Add Linear_Regression.global_explain() (#1446) (7e5b6a8)
Allow iloc to support lists of negative indices (#1497) (a9cf215)
Support dry_run in to_pandas() (#1436) (75fc7e0)
Support window partition by geo column (#1512) (bdcb1e7)
Upgrade BQ managed udf to preview (#1536) (4a7fe4d)

Bug Fixes

Add deprecation warning to TextEmbeddingGenerator model, especially gemini-1.0-X and gemini-1.5-X (#1534) (c93e720)
Change the default value for pdf extract/chunk (#1517) (a70a607)
Local data always has sequential index (#1514) (014bd33)
Read_pandas inline returns None when exceeds limit (#1525) (578081e)
Temporary fix for StreamingDataFrame not working backend bug (#1533) (6ab4ffd)
Tolerate BQ connection service account propagation delay (#1505) (6681f1f)

Performance Improvements

Update shape to use query_and_wait (#1519) (34ab9b8)

Documentation

Update GeoSeries.difference() and bigframes.bigquery.st_difference() docs (#1526) (d553fa2)

1.41.0 (2025-03-19)

Features

Add support for the ‘right’ parameter in ‘pandas.cut’ (#1496) (8aff128)
Support BQ managed functions through read_gbq_function (#1476) (802183d)
Warn when the BigFrames version is more than a year old (#1455) (00e0750)

Bug Fixes

Fix pandas.cut errors with empty bins (#1499) (434fb5d)
Fix read_gbq with ORDER BY query and index_col set (#963) (de46d2f)

Performance Improvements

Eliminate count queries in llm retry (#1489) (1c934c2)

Documentation

Add a sample notebook for vector search (#1500) (f3bf139)

1.40.0 (2025-03-11)

⚠ BREAKING CHANGES

reading JSON data as a custom arrow extension type (#1458)

Features

Reading JSON data as a custom arrow extension type (#1458) (e720f41)
Support list output for managed function (#1457) (461e9e0)

Bug Fixes

Fix list-like indexers in partial ordering mode (#1456) (fe72ada)
Fix the merge issue between 1424 and 1373 (#1461) (7b6e361)
Use == instead of is for timedelta type equality checks (#1480) (0db248b)

Performance Improvements

Compilation no longer bounded by recursion (#1464) (27ab028)

1.39.0 (2025-03-05)

Features

(Preview) Support diff() for date series (#1423) (521e987)
(Preview) Support aggregations over timedeltas (#1418) (1251ded)
(Preview) Support arithmetics between dates and timedeltas (#1413) (962b152)
(Preview) Support automatic load of timedelta from BQ tables. (#1429) (b2917bb)
Add allow_large_results option to many I/O methods. Set to False to reduce latency (#1428) (dd2f488)
Add GeoSeries.boundary() (#1435) (32cddfe)
Add allow_large_results to peek (#1448) (67487b9)
Add groupby.rank() (#1433) (3a633d5)
Iloc multiple columns selection. (#1437) (ddfd02a)
Support interface for BigQuery managed functions (#1373) (2bbf53f)
Warn if default ingress_settings is used in remote_functions (#1419) (dfd891a)

Bug Fixes

Do not compare schema description during schema validation (#1452) (03a3a56)
Remove warnings for null index and partial ordering mode in prep for GA (#1431) (6785aee)
Warn if default cloud_function_service_account is used in remote_function (#1424) (fe7463a)
Window operations over JSON columns (#1451) (0070e77)
Write chunked text instead of dummy text for pdf chunk (#1444) (96b0e8a)

Performance Improvements

Speed up DataFrame corr, cov (#1309) (c598c0a)

Documentation

Add snippet for explaining the linear regression model prediction (#1427) (7c37c7d)

1.38.0 (2025-02-24)

Features

(Preview) Support diff aggregation for timestamp series. (#1405) (abe48d6)
Add GeoSeries.from_wkt() and GeoSeries.to_wkt() (#1401) (2993b28)
Support DF.__array__(copy=True) (#1403) (693ed8c)
Support routines with ARRAY return type in read_gbq_function (#1412) (4b60049)

Bug Fixes

Calling to_timedelta() over timedeltas no longer changes their values (#1411) (650a190)
Replace empty dict with None to avoid mutable default arguments (#1416) (fa4e3ad)
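
The mutable-default fix above refers to a general Python pitfall, sketched here for context. The function names tag_buggy and tag_fixed are hypothetical and unrelated to the bigframes code that was changed.

```python
def tag_buggy(name, labels={}):
    # The default dict is created once, when the function is defined,
    # so every call that omits `labels` mutates the same shared object.
    labels[name] = True
    return labels


def tag_fixed(name, labels=None):
    # None acts as a sentinel; a fresh dict is built on each call.
    if labels is None:
        labels = {}
    labels[name] = True
    return labels


# Both calls return the very same dict, which now holds both keys.
first = tag_buggy("a")
second = tag_buggy("b")

# Each call gets its own dict.
third = tag_fixed("a")
fourth = tag_fixed("b")
```

With the None sentinel, state no longer leaks between calls that rely on the default.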

Performance Improvements

Avoid redundant SQL casts (#1399) (6ee48d5)

Dependencies

Remove scikit-learn and sqlalchemy as required dependencies (#1296) (fd8bc89)

Documentation

Add samples using SQL methods via the bigframes.bigquery module (#1358) (f54e768)
Add snippets for visualizing a time series and creating a time series model for the Limit forecasted values in time series model tutorial (#1310) (c6c9120)

1.37.0 (2025-02-19)

Features

(Preview) Support add, sub, mult, div, and more between timedeltas (#1396) (ffa63d4)
(Preview) Support comparison, ordering, and filtering for timedeltas (#1387) (34d01b2)
(Preview) Support subtraction in DATETIME/TIMESTAMP columns with timedelta columns (#1390) (50ad3a5)
JSON dtype support for read_pandas and Series constructor (#1391) (44f4137)

Bug Fixes

Ensure binops with pandas objects returns bigquery dataframes (#1404) (3cee24b)

Performance Improvements

Prune projections more aggressively (#1398) (7990262)
Simplify sum aggregate SQL text (#1395) (0145656)
Use simple null constraints to simplify queries (#1381) (00611d4)

Documentation

Add DataFrame.struct docs (#1348) (7e9e93a)

1.36.0 (2025-02-11)

Features

(Preview) Support addition between a timestamp and a timedelta (#1369) (b598aa8)
(Preview) Support casting floats and list-likes to timedelta series (#1362) (65933b6)
(Preview) Support timestamp subtractions (#1346) (86b7e72)
Add bigframes.bigquery.st_area and suggest it from GeoSeries.area (#1318) (8b5ffa8)
Add GeoSeries.from_xy() (#1364) (3c3e14c)

Bug Fixes

Dtype parameter ineffective in Series/DataFrame construction (#1354) (b9bdca8)
Translate labels to col ids when copying dataframes (#1372) (0c55b07)

Performance Improvements

Prune unused operations from sql (#1365) (923da03)
Simplify merge join key coalescing (#1361) (7ae565d)

1.35.0 (2025-02-04)

Features

(Preview) Support timedeltas for read_pandas() (#1349) (866ba9e)
Add Series.keys() (#1342) (deb015d)
Allow case_when to change dtypes if case list contains the condition (True, some_default_value) (#1311) (5c2a2c6)
Support python type as astype arg (#1316) (b26e135)
Support time_series_id_col in ARIMAPlus (#1282) (97532c9)

Bug Fixes

Exclude DataFrame and Series __call__ from unimplemented API metrics (#1351) (f2d5264)
Make DataFrame __getattr__ and __setattr__ more robust to subclassing (#1352) (417de3a)

Performance Improvements

Fall back to ordering by bq pk when possible (#1350) (3c4abf2)
Improve isin performance (#1203) (db087b0)
Prevent inlining of remote ops (#1347) (012081a)

Dependencies

Add support for Python 3.13 for everything but remote functions (#1307) (533db96)

Documentation

Add GeoSeries docs (#1327) (05f83d1)
Add link to DataFrames intro to improve SEO (#1176) (aafb5be)
Add snippet to explain the univariate model’s forecast result in the Forecast a single time series with a univariate model tutorial (#1272) (c22126b)

1.34.0 (2025-01-27)

⚠ BREAKING CHANGES

Enable reading JSON data with dbjson extension dtype (#1139)

Features

(df|s).hist(), (df|s).line(), (df|s).area(), (df|s).bar(), df.scatter() (#1320) (bd3f584)
(Preview) Define timedelta type and to_timedelta function (#1317) (3901951)
Add DataFrame.corrwith method (#1315) (b503355)
Add DataFrame.mask method (#1302) (8b8155f)
Enable reading JSON data with dbjson extension dtype (#1139) (f672262)

1.33.0 (2025-01-22)

Features

Add bigframes.bigquery.sql_scalar() to apply SQL syntax on Series objects (#1293) (aa2f73a)
Add unix_seconds, unix_millis and unix_micros for timestamp series. (#1297) (e4b0c8d)
DataFrame.join supports Series other (#1303) (ee37a0a)
Support array output in remote_function (#1057) (bdee173)

Bug Fixes

Dataframe sort_values Series input keyerror. (#1285) (5a2731b)
Fix read_gbq_function issue in dataframe apply method (#1174) (0318764)
Series sort_index and sort_values now raise when axis!=0 (#1294) (94bc2f2)

Documentation

Add snippet to forecast future time series in the Forecast a single time series with a univariate model tutorial (#1271) (a687050)
Update bigframes.pandas.Series docs (#1273) (0cac64f)

1.32.0 (2025-01-13)

Features

Add max_retries to TextEmbeddingGenerator and Claude3TextGenerator (#1259) (8077ff4)
bigframes.bigquery.parse_json (#1265) (27bbd80)
Support DataFrame.astype(dict) (#1262) (5934f8e)

Bug Fixes

Avoid global mutation in BigQueryOptions.client_endpoints_override (#1280) (788f6e9)
Fix erroneous window bounds removal during compilation (#1163) (f91756a)

Dependencies

Relax sqlglot upper bound (#1278) (c71ec09)

Documentation

Add BQ Studio links that allow users to generate Jupyter notebooks in BQ Studio with GitHub contents (#1266) (58f13cb)
Add snippet to evaluate ARIMA plus model in the Forecast a single time series with a univariate model tutorial (#1267) (3dcae2d)
Add snippet to see the ARIMA coefficients in the Forecast a single time series with a univariate model tutorial (#1268) (059a564)
Update bigframes.pandas.pandas docstrings (#1247) (c4bffc3)
Use 002 model for better scalability in text generation (#1270) (bb7a850)

1.31.0 (2025-01-05)

Features

Implement confirmation threshold for semantic operators (#1251) (5ba4511)

Bug Fixes

Raise if trying to change ordering_mode after session has started (#1252) (8cfaae8)
Reduce the number of labels added to query jobs (#1245) (fdcdc18)

Documentation

Remove bq studio link (#1258) (dd4fd2e)
Update bigframes.pandas.DatetimeMethods docstrings (#1246) (10f08da)
Update semantic_operators.ipynb (#1260) (a2ed989)

1.30.0 (2024-12-30)

Features

Add GeoSeries.x and GeoSeries.y (#1126) (4c3548f)
Add LinearRegression.predict_explain() to generate ML.EXPLAIN_PREDICT columns (#1190) (e13eca2)
Add LogisticRegression.predict_explain() to generate ML.EXPLAIN_PREDICT columns (#1222) (bcbc732)
Add write_engine parameter to read_FORMATNAME methods to control how data is written to BigQuery (#371) (ed47ef1)
Add client side retry to GeminiTextGenerator (#1242) (8193abe)
Add Gemini-pro-1.5 to GeminiTextGenerator Tuning and Support score() method in Gemini-pro-1.5 (#1208) (298fc73)
Add support for LinearRegression.predict_explain and LogisticRegression.predict_explain parameter, top_k_features (#1228) (3068e19)
Support dataframe where method (#1166) (71b4053)

Bug Fixes

Arima model series input. (#1237) (f7d52d9)
Json in struct destination type (#1187) (200c9bb)
Throw an error message when setting is_row_processor=True to read a multi param function (#1160) (b2816a5)

Documentation

Add an “open in BQ Studio” link to all BigFrames sample notebooks (#1223) (e0a8288)
Add bq studio link for a new ipynb file called “bq_dataframes_template.ipynb” (#1239) (840aaff)
Add example for logistic regression (#1240) (4d854fd)
Add examples for ml PCA and SimpleImputer (#1236) (0d84459)
Add KMeans example (#1234) (d87ab97)
Add linear model example (#1235) (2c3e1fd)
Add ml.model_selection examples (#1238) (50648e4)
Add python snippet for “Create the time series model” section of the Forecast a single time series with a univariate model tutorial (#1227) (20f3190)

1.29.0 (2024-12-12)

Features

Add Gemini 2.0 preview text model support (#1209) (1021d57)

Documentation

Add Gemini 2.0 text gen sample notebook (#1211) (9596b66)
Update bigframes.pandas.index docs return types (#1191) (c63e7da)

1.28.0 (2024-12-11)

Features

(Series | DataFrame).plot.bar (#1152) (0fae2e0)
bigframes.bigquery.vector_search supports use_brute_force and fraction_lists_to_search parameters (#1158) (131edc3)
Add ARIMAPlus.predict_explain() to generate forecasts with explanation columns (#1177) (05f8b4d)
Add client_endpoints_override to bq options (#1167) (be74b99)
Add support for temporal types in dataframe’s describe() method (#1189) (2d564a6)
Allow join-free alignment of analytic expressions (#1168) (daef4f0)
Series.isin supports bigframes.Series arg (#1195) (0d8a16b)
Update llm.TextEmbeddingGenerator to 005 (#1186) (3072d38)

Bug Fixes

Fix error loading local dataframes into bigquery (#1165) (5b355ef)
Fix null index join with ‘on’ arg (#1153) (9015c33)
Fix series.isin using local path always (#1202) (a44eafd)

Performance Improvements

Update df.corr, df.cov to support more than 30 columns (#1161) (9dcf1aa)

Dependencies

Remove ibis-framework by vendoring a fork of the package to bigframes_vendored. (#1170) (421d24d)

Documentation

Add a code sample using bpd.options.bigquery.ordering_mode = "partial" (#909) (f80d705)
Add snippet for creating boosted tree model (#1142) (a972668)
Add snippet for evaluating a boosted tree model (#1154) (9d8970a)
Add snippet for predicting classifications using a boosted tree model (#1156) (e7b83f1)
Add third party pandas.Index methods and docstrings (#1171) (a970294)
Fix Bigframes.Pandas.General_Function missing docs (#1164) (de923d0)
Update bigframes.pandas.Index docstrings (#1144) (557ab8d)

1.27.0 (2024-11-16)

Features

Add astype(type, errors='null') to cast safely (#1122) (b4d17ff)

Bug Fixes

Dataframe fillna with scalar. (#1132) (37f8c32)
Exclude index columns from model fitting processes. (#1138) (8d4da15)
Unordered mode too many labels issue. (#1148) (7216b21)

Documentation

Document groupby.head and groupby.size methods (#1111) (a61eb4d)

1.26.0 (2024-11-12)

Features

Add basic geopandas functionality (#962) (3759c63)
Support json_extract_string_array in the bigquery module (#1131) (4ef8bac)

Bug Fixes

Fix Series.to_frame generating string label instead of int where name is None (#1118) (14e32b5)
Update the API documentation with newly added rep (#1120) (72c228b)

Performance Improvements

Reduce CURRENT_TIMESTAMP queries (#1114) (32274b1)
Reduce dry runs from read_gbq with table (#1129) (f7e4354)

Documentation

Add file for Classification with a Boosted Tree Model and snippet for preparing sample data (#1135) (7ac6639)
Add snippet for Linear Regression tutorial Predict Outcomes section (#1101) (108f4a9)
Update DataFrame docstrings to include the errors section (#1127) (a38d4c4)
Update GroupBy docstrings (#1103) (9867a78)
Update Session docstrings to include exceptions (#1130) (a870421)

1.25.0 (2024-10-29)

Features

Add the ground_with_google_search option for GeminiTextGenerator predict (#1119) (ca02cd4)
Add warning when user tries to access struct series fields with __getitem__ (#1082) (20e5c58)
Allow fit to take additional eval data in linear and ensemble models (#1096) (254875c)
Support context manager for bigframes session (#1107) (5f7b8b1)

Performance Improvements

Improve series.unique performance and replace drop_duplicates i… (#1108) (499f24a)

1.24.0 (2024-10-24)

Features

Support series items method (#1089) (245a89c)

Documentation

Update docstrings of DataFrame and related files (#1092) (15e9fd5)

1.23.0 (2024-10-23)

Features

Add bigframes.bigquery.create_vector_index to assist in creating vector index on ARRAY<FLOAT64> columns (#1024) (863d694)
Add gemini-1.5-pro-002 and gemini-1.5-flash-002 to known Gemini model list. (#1105) (7094c85)
Add support for pandas series & data frames as inputs for ml models. (#1088) (30c8883)
Cleanup temp resources with session deletion (#1068) (1d5373d)
Show possible correct key(s) in .__getitem__ KeyError message (#1097) (32fab96)
Support uploading local geo data (#1036) (51cdd33)

Bug Fixes

Escape ids more consistently in ml module (#1074) (103e998)
Model.fit metric not collected issue. (#1085) (06cec00)
Remove index requirement from some dataframe APIs (#1073) (2d16f6d)
Update session metrics in read_gbq_query (#1084) (dced460)

Performance Improvements

Speed up tree transforms during sql compile (#1071) (d73fe9d)
Utilize ORDER BY LIMIT over ROW_NUMBER where possible (#1077) (7003d1a)

Documentation

Add ml tutorial for Evaluate the model (#1038) (a120bae)
Show best practice of closing the session to cleanup resources in sample notebooks (#1095) (62a88e8)
Update docstrings of Session and related files (#1087) (bf93e80)

1.22.0 (2024-10-09)

Features

Support regional endpoints for more bigquery locations (#1061) (45b672a)
Update LLM generators to warn user about model name instead of raising error. (#1048) (650d80d)

Bug Fixes

Access MATERIALIZED_VIEW with read_gbq (#1070) (601e984)
Correct zero row count in DataFrame from table view (#1062) (b536070)
Fix generic error message when entering an incorrect column name (#1031) (5ac217d)
Make explode respect the index labels (#1064) (99ca0df)
Make invalid location warning case-insensitive (#1044) (b6cd55a)
Remove palm2 test case from llm load test (#1063) (575a10a)
Show warning for unknown location set through .ctor (#1052) (02c2da7)

Performance Improvements

Reduce schema tracking overhead (#1056) (1c3879d)
Repr generates fewer queries (#1046) (d204603)
Speedup internal tree comparisons (#1060) (4379438)

Documentation

Add docstring return type section to BigQueryOptions class (#964) (307385f)

1.21.0 (2024-10-02)

Features

Add deprecation warning to PaLM2TextGenerator model (#1035) (1183b0f)
Add DeprecationWarning for PaLM2TextEmbeddingGenerator (#1018) (4af5bbb)
Add ml.model_selection.cross_validate support (#1020) (1a38063)
Allow access of struct fields with dot operators on Series (#1019) (ef76f13)

Bug Fixes

Ensure no double execution for to_pandas (#1032) (4992cc2)
Remove pre-caching of remote function results (#1028) (0359bc8)

Documentation

Add ml cross-validation notebook (#1037) (057f3f0)

1.20.0 (2024-09-25)

Features

Add bigframes.bigquery.approx_top_count (#1010) (3263bd7)
Add bigframes.ml.compose.SQLScalarColumnTransformer to create custom SQL-based transformations (#955) (1930b4e)
Allow multiple columns input for llm models (#998) (2fe5e48)

Bug Fixes

Fix repr caching with partial ordering (#1016) (208a984)

Documentation

Limit pypi notebook to 7 days and add more info about differences with partial ordering mode (#1013) (3c54399)
Move and edit existing linear-regression tutorial snippet (#991) (4cb62fd)

1.19.0 (2024-09-24)

Features

Add ml.model_selection.KFold class (#1001) (952cab9)
Support bool and bytes types in describe(include='all') (#994) (cc48f58)
Support ingress settings in remote_function (#1011) (8e9919b)

Bug Fixes

Fix miscasting issues with case_when (#1003) (038139d)

Performance Improvements

Join op discards child ordering in unordered mode (#923) (1b5b0ee)

Dependencies

Update ibis version in prerelease tests (#1012) (f89785f)

1.18.0 (2024-09-18)

Features

Add “include” param to describe for string types (#973) (deac6d2)
Add subset parameter to DataFrame.dropna to select which columns to consider (#981) (f7c03dc)

Bug Fixes

DataFrameGroupby.agg now works with unnamed tuples (#985) (0f047b4)
Fix a bug that raises exception when re-indexing columns with their original order (#988) (596b03b)
Make the Series.apply outcome assignable to the original dataframe in partial ordering mode (#874) (c94ead9)

Dependencies

Limit ibis-framework version to 9.2.0 (#989) (06c1b33)
Update to ibis-framework 9.x and newer sqlglot (#827) (89ea44f)

1.17.0 (2024-09-11)

Features

Add __version__ alias to bigframes.pandas (#967) (9ce10b4)
Add Gemini 1.5 stable models support (#945) (c1cde19)
Allow setting table labels in to_gbq (#941) (cccc6ca)
Define list accessor for bigframes Series (#946) (8e8279d)
Enable read_csv() to process other files (#940) (3b35860)
Include the bigframes package version alongside the feedback link in error messages (#936) (7b59b6d)

Bug Fixes

Astype Decimal to Int64 conversion. (#957) (27764a6)
Make read_gbq_function work for multi-param functions (#947) (c750be6)
Support read_gbq_function for axis=1 application (#950) (86e54b1)

Documentation

Add docstring returns section to Options (#937) (a2640a2)
Update title of pypi notebook example to reflect use of the PyPI public dataset (#952) (cd62e60)

1.16.0 (2024-09-04)

Features

Add DataFrame.struct.explode to add struct subfields to a DataFrame (#916) (ad2f75e)
Implement bigframes.bigquery.json_extract_array (#910) (575a29e)
Recover struct column from exploded Series (#904) (7dd304c)

Bug Fixes

Fix issue with iterating on >10gb dataframes (#949) (2b0f0fa)
Improve Series.replace for dict input (#907) (4208044)
NullIndex in ML model.predict error (#917) (612271d)
Struct field non-nullable type issue. (#914) (149d5ff)
Unordered mode errors in ml train_test_split (#925) (85d7c21)

Performance Improvements

Improve repr performance (#918) (46f2dd7)

Dependencies

Re-introduce support for numpy 1.24.x (#931) (3d71913)
Update minimum support to Pandas 1.5.3 and Pyarrow 10.0.1 (#903) (7ed3962)

Documentation

Add Claude3 ML and RemoteFunc notebooks (#930) (cfd16c1)
Create sample notebook to manipulate struct and array data (#883) (3031903)
Update struct examples. (#953) (d632cd0)
Use unstack() from BigQuery DataFrames instead of pandas in the PyPI sample notebook (#890) (d1883cc)

1.15.0 (2024-08-20)

Features

Add llm.TextEmbeddingGenerator to support new embedding models (#905) (6bc6a41)
Add ml.llm.Claude3TextGenerator model (#901) (7050038)

Documentation

Add columns for “requires ordering/index” to supported APIs summary (#892) (d2fc51a)
Remove duplicate description for kms_key_name (#898) (1053d56)
Update embedding model notebooks (#906) (d9b8ef5)

1.14.0 (2024-08-14)

Features

Implement bigframes.bigquery.json_extract (#868) (3dbf84b)
Implement Series.str.__getitem__ (#897) (e027b7e)

Bug Fixes

Fix caching from generating row numbers in partial ordering mode (#872) (52b7786)

Performance Improvements

Generate SQL with fewer CTEs (#877) (eb60804)
Speed up compilation by reducing redundant type normalization (#896) (e0b11bc)

Documentation

Add streaming html docs (#884) (171da6c)
Fix the DisplayOptions doc rendering (#893) (3eb6a17)
Update streaming notebook (#887) (6e6f9df)

1.13.0 (2024-08-05)

Features

df.apply(axis=1) to support remote function with multiple params (#851) (2158818)
Allow windowing in ‘partial’ ordering mode (#861) (ca26fe5)
Create a separate OrderingModePartialPreviewWarning for more fine-grained warning filters (#879) (8753bdd)

Bug Fixes

Fix issue with invalid sql generated by ml distance functions (#865) (9959fc8)

Documentation

Create sample notebook using ordering_mode="partial" (#880) (c415eb9)
Update streaming notebook (#875) (e9b0557)

1.12.0 (2024-07-31)

Features

Add bigframes-mode label to query jobs (#832) (c9eaff0)
Add config option to set partial ordering mode (#855) (823c0ce)
Add stratify param support to ml.model_selection.train_test_split method (#815) (27f8631)
Add streaming.StreamingDataFrame class (#864) (a7d7197)
Allow DataFrame.join for self-join on Null index (#860) (e950533)
Support remote function cleanup with session.close (#818) (ed06436)
Support to_csv/parquet/json to local files/objects (#858) (d0ab9cc)

Bug Fixes

Fewer relation joins from df self-operations (#823) (0d24f73)
Fix ‘sql’ property for null index (#844) (1b6a556)
Fix unordered mode using ordered path to print frame (#839) (93785cb)
Reduce redundant remote_function deployments (#856) (cbf2d42)

Documentation

Add partner attribution steps to integrations sample notebook (#835) (d7b333f)
Make get_global_session/close_session/reset_session appear in the docs (#847) (01d6bbb)

1.11.1 (2024-07-08)

Documentation

Remove session and connection in llm notebook (#821) (74170da)
Remove the experimental flask icon from the public docs (#820) (067ff17)

1.11.0 (2024-07-01)

Features

Add .agg support for size (#792) (87e6018)
Add bigframes.bigquery.json_set (#782) (1b613e0)
Add bigframes.streaming.to_pubsub method to create continuous query that writes to Pub/Sub (#801) (b47f32d)
Add DataFrame.to_arrow to create Arrow Table from DataFrame (#807) (1e3feda)
Add PolynomialFeatures support to to_gbq and pipelines (#805) (57d98b9)
Add Series.peek to preview data efficiently (#727) (580e1b9)
Expose gcf memory param in remote_function (#803) (014765c)
More informative error when query plan too complex (#811) (136dc24)

Bug Fixes

Include internally required packages in remote_function hash (#799) (4b8fc15)

Documentation

Document dtype limitation on row processing remote_function (#800) (487dff6)

1.10.0 (2024-06-21)

Features

Add dataframe.insert (#770) (e8bab68)
Add groupby head API (#791) (44202bc)
Add ml.preprocessing.PolynomialFeatures class (#793) (b4fbb51)
bigframes.streaming module for continuous queries (#703) (0433a1c)
Include index columns in DataFrame.sql if they are named (#788) (c8d16c0)

Bug Fixes

Allow __repr__ to work with uninitialized DataFrame/Series/Index (#778) (e14c7a9)
df.loc with the 2nd input as bigframes boolean Series (#789) (a4ac82e)
Ensure numpy version matches in remote_function deployment (#798) (324d93c)
Fix temp table creation retries by not throwing if table already exists. (#787) (0e57d1f)
Self-join optimization doesn’t needlessly invalidate caching (#797) (1b96b80)

1.9.0 (2024-06-10)

Features

Allow functions returned from bpd.read_gbq_function to execute outside of apply (#706) (ad7d8ac)
Support bigquery.vector_search() (#736) (dad66fd)
Support score() in GeminiTextGenerator (#740) (b2c7d8b)
Support bytes type in remote_function (#761) (4915424)
Support fit() in GeminiTextGenerator (#758) (d751f5c)

Bug Fixes

ARIMAPlus loads auto_arima_min_order param (#752) (39d7013)
Improve to_pandas_batches for large results (#746) (61f18cb)
Resolve issue with unset thread-local options (#741) (d93dbaf)

Documentation

Fix ML.EVALUATE spelling (#749) (7899749)
Remove LogisticRegression normal_equation strategy (#753) (ea5d367)

1.8.0 (2024-05-31)

Features

merge only generates a default index if both inputs already have an index (#733) (25d049c)
Add +, - as unary ops, ^ binary op (#724) (968d825)
Add GroupBy.size() to get number of rows in each group (#479) (1fca588)
Add DataFrame ~ operator (#721) (354abc1)
Add GeminiText 1.5 Preview models (#737) (56cbd3b)
Add slot_millis and add stats to session object (#725) (72e9583)
Add bigframes.bigquery.array_to_string to convert array elements to delimited strings (#731) (f12c906)
Allow functions decorated with bpd.remote_function() to execute locally (#704) (d850da6)
Ensure "bigframes-api" label is always set on jobs, even if the API is unknown (#722) (1832778)
Support ml.SimpleImputer in bigframes (#708) (4c4415f)
Support type annotations to supply input and output types to bpd.remote_function() decorator (#717) (4a12e3c)
Support type annotations with bpd.remote_function() and axis=1 (a preview feature) (#730) (e5a2992)

Bug Fixes

Correct index labels in multiple aggregations for DataFrameGroupBy (#723) (6a78c89)
Fix Null index assign series to column (#711) (ffb4b57)
Set the input_types and output_types defaults of bpd.remote_function() to None to allow omitting them when type annotations are present (#729) (0e25a3b)
Warn and disable time travel for linked datasets (#712) (085fa9d)

Performance Improvements

Optimize dataframe-series alignment on axis=1 (#732) (3d39221)

Documentation

Add examples to DataFrameGroupBy and SeriesGroupBy (#701) (e7da0f0)

1.7.0 (2024-05-20)

Features

read_gbq_query supports filters (9386373)
read_gbq suggests a correct column name when one is not found (9386373)
Add DefaultIndexKind.NULL to use as index_col in read_gbq*, creating an indexless DataFrame/Series (#662) (29e4886)
bigframes.bigquery.array_agg(SeriesGroupBy|DataFrameGroupBy) (#663) (412f28b)
to_datetime supports utc=False for string inputs (#579) (adf9889)

Bug Fixes

read_gbq_table respects primary keys even when filters are set (#689) (9386373)
Fix type error in test_cluster (#698) (14d81c1)
Improve escaping of literals and identifiers (#682) (da9b136)
Properly identify non-unique index in tables without primary keys (#699) (6e0f4d8)
Remove a usage of the resource package when not available, such as on Windows (#681) (96243f2)
Fix the imported samples error and use peek() (#688) (1a0b744)

Performance Improvements

Don’t run query immediately from read_gbq_table if filters is set (9386373)
Use a LIMIT clause when max_results is set (9386373)

Documentation

Add code snippets for imported onnx tutorials (#684) (cb36e46)
Add code snippets for imported tensorflow model (#679) (b02c401)
Use class_weight="balanced" in the logistic regression prediction tutorial (#678) (b951549)

1.6.0 (2024-05-13)

Features

Add DataFrame.__delitem__ (#673) (2218c21)
Add Series.case_when() (#673) (2218c21)
Add strategy="quantile" in KBinsDiscretizer (#654) (c6c487f)
Add Series.combine (#680) (2fd1b81)
Series.str.split (#675) (6eb19a7)
Suggest correct options in bpd.options.bigquery.location (#666) (57ccabc)
Support axis=1 in df.apply for scalar outputs (#629) (f6bdc4a)
Support gcf vpc connector in remote_function (#677) (9ca92d0)
Warn with a more specific DefaultLocationWarning category when no location can be detected (#648) (e084e54)

Bug Fixes

Include index_col when selecting columns and filters in read_gbq_table (#648) (e084e54)

Dependencies

Add jellyfish as a dependency for spelling correction (57ccabc)

Documentation

Add code snippets for LLM text generation (#669) (93416ed)
Add logistic regression samples (#673) (2218c21)
Address lint errors in code samples (#665) (4fc8964)
Document inlining of small data in read_* APIs (#670) (306953a)

1.5.0 (2024-05-07)

Features

bigframes.options and bigframes.option_context now use thread-local variables to prevent context managers in separate threads from affecting each other (#652) (651fd7d)
Add ARIMAPlus.coef_ property exposing ML.ARIMA_COEFFICIENTS functionality (#585) (81d1262)
Add a unique session_id to Session and allow cleaning up sessions (#553) (c8d4e23)
Add the bigframes.bigquery sub-package with a bigframes.bigquery.array_length function (#630) (9963f85)
Always do a query dry run when option.repr_mode == "deferred" (#652) (651fd7d)
Custom query labels for compute options (#638) (f561799)
Warn with DefaultIndexWarning from read_gbq on clustered/partitioned tables with no index_col or filters set (#631, #658) (2715d2b, 73064dd)
Support index_col=False in read_csv and engine="bigquery" (73064dd)
Support gcf max instance count in remote_function (#657) (36578ab)

Bug Fixes

Don’t raise UnknownLocationWarning for US or EU multi-regions (#653) (8e4616b)
Fix bug with na in the column labels in stack (#659) (4a34293)
Use explicit session in PaLM2TextGenerator (#651) (e4f13c3)

Documentation

Add python code sample for multiple forecasting time series (#531) (16866d2)
Fix the Palm2TextGenerator output token size (#649) (c67e501)

1.4.0 (2024-04-29)

Features

Add .cache() method to persist intermediate dataframe (#626) (a5c94ec)
Add transpose support for small homogeneously typed DataFrames. (#621) (054075d)
Allow single input type in remote_function (#641) (3aa643f)
Expose gcf max timeout in remote_function (#639) (dfeaad0)
Series binary ops compatible with more types (#618) (518d315)
Support the score method for PaLM2TextGenerator (#634) (3ffc1d2)

Bug Fixes

Allow to_pandas to download more than 10GB (#637) (ce56495)
Extend row hash to 128 bits to guarantee unique row id (#632) (9005c6e)
LLM fine tuning tests (#627) (4724a1a)
LLM PaLM score tests (#643) (cf4ec3a)

Performance Improvements

Automatically condense internal expression representation (#516) (03c1b0d)
Cache transpose to allow performant retranspose (#635) (44b738d)

Documentation

Add supported pandas apis on the main page (#628) (8d2a51c)
Add the first sample for the Single time-series forecasting from Google Analytics data tutorial (#623) (2b84c4f)
Address more technical writers’ feedback (#640) (1e7793c)

1.3.0 (2024-04-22)

Features

Add Series.struct.dtypes property (#599) (d924ec2)
Add fine tuning fit() for Palm2TextGenerator (#616) (9c106bd)
Add quantile statistic (#613) (bc82804)
Expose max_batching_rows in remote_function (#622) (240a1ac)
Support primary key(s) in read_gbq by using as the index_col by default (#625) (75bb240)
Warn if location is set to unknown location (#609) (3706b4f)

Bug Fixes

Address technical writers’ feedback (#611) (9f8f181)
Infer narrowest numeric type when combining numeric columns (#602) (8f9ece6)
Use exact median implementation by default (#619) (9d205ae)

Documentation

Fix rendering of examples for multiple apis (#620) (9665e39)
Set index_cols in read_gbq as a best practice (#624) (70015b7)

1.2.0 (2024-04-15)

Features

Add hasnans, combine_first, update to Series (#600) (86e0f38)
Add MultiIndex subclass. (#596) (5d0f149)
Add pivot_table for DataFrame. (#473) (5f1d670)
Add Series.autocorr (#605) (4ec8034)
Support list of numerics in pandas.cut (#580) (290f95d)

Bug Fixes

Address more technical writers’ feedback (#581) (4b08d92)
Error for object dtype on read_pandas (#570) (8702dcf)
Inverting int now does bitwise inversion rather than sign flip (#574) (5f1db8b)
Loc setitem dtype issue. (#603) (b94bae9)
Toc menu missing plotting name (#591) (eed12c1)

Documentation

(Series|DataFrame).dtypes (#598) (edef48f)
Add code samples for str accessor methods (#594) (a557ea2)
Add docs for DataFrame and Series dunder methods (#562) (8fc26c4)
Add examples for at/iat (#582) (3be4a2e)

1.1.0 (2024-04-04)

Features

(Series|DataFrame).explode (#556) (9e32f57)
Add DataFrame.eval and DataFrame.query (#361) (5e28ebd)
Add ColumnTransformer save/load (#541) (9d8cf67)
Add ml.metrics.mean_squared_error (#559) (853c25e)
Add support for numpy expm1, log1p, floor, ceil, arctan2 ops (#505) (e8e66cf)
Add transformers save/load (#552) (d805241)
Allow DataFrame binary ops to align on either axis and with loc… (#544) (6d8f3af)
Expose DataFrame.bqclient to assist in integrations (#519) (0be8911)
read_pandas accepts pandas Series and Index objects (#573) (f8821fe)
Support ML.GENERATE_EMBEDDING in PaLM2TextEmbeddingGenerator (#539) (1156c1e)
Support max_columns in repr and make repr more efficient (#515) (54e49cf)

Bug Fixes

Assign NaN scalar to column error. (#513) (0a4153c)
Don’t download 100gb onto local python machine in load test (#537) (082c58b)
Exclude list-like s parameter in plot.scatter (#568) (1caac27)
Fix case where df.peek would fail to execute even with force=True (#511) (8eca99a)
Fix error in Series.drop(0) (#575) (75dd786)
Include all names in MultiIndex repr (#564) (b188146)
Plot.scatter s parameter cannot accept float-like column (#563) (8d39187)
Product operation produces float result for all input types (#501) (6873b30)
Reloaded transformer .transform error (#569) (39fe474)
Rename PaLM2TextEmbeddingGenerator.predict output columns to be backward compatible (#561) (4995c00)
Respect hard stack size limit and swallow limit change exception. (#558) (4833908)
Restore string to date/time type coercion (#565) (4ae0262)
Sync the notebook with embedding changes (#550) (347f2dd)
Use bytes limit on frame inlining rather than element count (#576) (659a161)

Performance Improvements

Add multi-query execution capability for complex dataframes (#427) (d2d7e33)

Dependencies

Include pyarrow as a dependency (#529) (9b1525a)

Documentation

bigframes.options.bigquery.project and location are optional in some circumstances (#548) (90bcec5)
Add “Supported pandas APIs” reference to the documentation (#542) (74c3915)
Add General Availability banner to README (#507) (262ff59)
Add operations in API docs (#557) (ea95761)
Add progress_bar code sample (#508) (92a1af3)
Add the code samples for metrics.{auc, roc_auc_score, roc_curve} (#520) (5f37b09)
Address more comments from technical writers to meet legal purposes (#571) (9084df3)
Fix docs of ARIMAPlus.predict (#512) (3b80f95)
Include Index in table-of-contents (#564) (b188146)
Mark Gemini model as Pre-GA (#543) (769868b)
Migrate the overview page to Bigframes official landing page (#536) (a0fb8bb)

1.0.0 (2024-03-25)

⚠ BREAKING CHANGES

rename model parameter min_rel_progress to tol
early_stop setting no longer supported, always uses True
rename model parameter n_parallell_trees to n_estimators
rename class_weights to class_weight
rename learn_rate to learning_rate
PCA n_components supports float value and None, default to None
rename various ml model parameters for consistency with sklearn (https://github.com/googleapis/python-bigquery-dataframes/pull/491)

Features

Add configuration option to read_gbq (#401) (85cede2)
Add ml ARIMAPlus model params (#488) (352cb85)
Add ml KMeans model params (#477) (23a8d9a)
Add ml LogisticRegression model params (#481) (f959b65)
Add ml PCA model params (#474) (fb5d83b)
Add params for LinearRegression model (#464) (21b2188)
Add support for Python 3.12 (#231) (df2976f)
Allow assigning directly to Series.name property (#495) (ad0e99e)
Ensure Series.str.len() can get length of array columns (#497) (10c0446)
Option to use bq connection without check (#460) (0b3f8e5)
PCA n_components supports float value and None, default to None (65c6f47)
Rename class_weights to class_weight (65c6f47)
Rename learn_rate to learning_rate (65c6f47)
Rename model parameter min_rel_progress to tol (65c6f47)
Rename model parameter n_parallell_trees to n_estimators (65c6f47)
Rename various ml model parameters for consistency with sklearn (https://github.com/googleapis/python-bigquery-dataframes/pull/491) (65c6f47)
Support BQ regional endpoints for europe-west9, europe-west3, us-east4, and us-west1 (#504) (fbada4a)
Support dataframe.cov (#498) (c4beafd)
Support Series.dt.floor (#493) (2dd01c2)
Support Series.dt.normalize (#483) (0bf1e91)
Update plot sample to 1000 rows (#458) (60d4a7b)

Bug Fixes

early_stop setting no longer supported, always uses True (65c6f47)
Fix -1 offset lookups failing (#463) (2dfb9c2)
Plot.scatter c argument functionalities (#494) (d6ee994)
Properly support format param for numerical input. (#486) (ae20c35)
Re-enable to_csv and to_json related tests (#468) (2b9a01d)
Sampling plot cannot preserve ordering if index is not ordered (#475) (a5345fe)
Use actual BigQuery types rather than ibis types in to_pandas (#500) (82b4f91)

Dependencies

Support pandas 2.2 (#492) (e2cf50e)

Documentation

Add code samples for metrics.{accuracy_score, confusion_matrix} (#478) (3e3329a)
Add code samples for metrics.{recall_score, precision_score, f1_score} (#502) (370fe90)
Improve API documentation (#489) (751266e)
Update bigquery connection documentation (#499) (4bfe094)
Update LLM + K-means notebook to handle partial failures (#496) (97afad9)

0.26.0 (2024-03-20)

⚠ BREAKING CHANGES

exclude remote models for .register() (#465)

Features

(Series|DataFrame).plot (#438) (1c3e668)
read_gbq_table supports LIKE as an operator in filters (#454) (d2d425a)
Add DataFrame.pipe() method (#421) (95f5a6e)
Set force=True by default in DataFrame.peek() (#469) (4e8e97d)
Support datetime related casting in (Series|DataFrame|Index).astype (#442) (fde339b)
Support Series.dt.strftime (#453) (8f6e955)

Bug Fixes

Any() on empty set now correctly returns False (#471) (f55680c)
Df.drop_na preserves columns dtype (#457) (3bab1a9)
Disable to_json and to_csv related tests (#462) (874026d)
Exclude remote models for .register() (#465) (73fe0f8)
Fix broken link in covid notebook (#450) (adadb06)
Fix broken multiindex loc cases (#467) (b519197)
Fix grouping series on multiple other series (#455) (3971bd2)
Groupby aggregates no longer check if grouping keys are numeric (#472) (4fbf938)
Raise ValueError when read_pandas() receives a bigframes DataFrame (#447) (b28f9fd)
Series.(to_csv|to_json) leverages bq export (#452) (718a00c)
Warn when read_gbq / read_gbq_table uses the snapshot time cache (#441) (e16a8c0)

Documentation

Add code samples for ml.metrics.r2_score (#459) (85fefa2)
Add the docs for loc and iloc indexers (#446) (14ab8d8)
Add the pages for at and iat indexers (#456) (340f0b5)
Add version information to bug template (#437) (91bd39e)
Indicate that project and location are optional in example notebooks (#451) (1df0140)

0.25.0 (2024-03-14)

Features

(Series|DataFrame).plot.(line|area|scatter) (#431) (0772510)
Support CMEK for remote_function cloud functions (#430) (2fd69f4)

0.24.0 (2024-03-12)

⚠ BREAKING CHANGES

read_parquet uses a “pandas” engine to parse files by default. Use engine="bigquery" for the previous behavior
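
A minimal migration sketch for the change above, assuming the usual `import bigframes.pandas as bpd` alias. The function name and the path argument are hypothetical; only `read_parquet` and `engine="bigquery"` come from the entry itself.

```python
def read_parquet_previous_behavior(path):
    """Read a Parquet file with the pre-0.24.0 engine.

    `path` is a hypothetical placeholder, e.g. a Cloud Storage URI.
    """
    # Imported lazily so the sketch can be loaded without bigframes installed.
    import bigframes.pandas as bpd

    # Passing engine="bigquery" opts back into the previous behavior;
    # omitting it uses the new "pandas" default.
    return bpd.read_parquet(path, engine="bigquery")
```

Calling the function requires an installed bigframes package and BigQuery credentials.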

Features

(Series|DataFrame).plot.hist() (#420) (4aadff4)
Add detect_anomalies to ml ARIMAPlus and KMeans models (#426) (6df28ed)
Add engine parameter to read_parquet (#413) (31325a1)
Add ml PCA.detect_anomalies method (#422) (8d82945)
Support BYOSA in remote_function (#407) (d92ced2)
Support CMEK for BQ tables (#403) (9a678e3)

Bug Fixes

Move third_party.bigframes_vendored to bigframes_vendored (#424) (763edeb)
Only do row identity based joins when joining by index (#356) (76b252f)
Read_pandas inline respects location (#412) (ae0e3ea)

Documentation

Add predict sample to samples/snippets/bqml_getting_started_test.py (#388) (6a3b0cc)
Document minimum IAM requirement (#416) (36173b0)
Fix the note rendering for DataFrame methods: nlargest, nsmallest (#417) (38bd2ba)

0.23.0 (2024-03-05)

Features

Add ml.metrics.pairwise.euclidean_distance (#397) (1726588)
Add TextEmbedding model version support (#394) (e0f1ab0)

Bug Fixes

Code exception in remote_function now prevents retry and surfaces in the client (#387) (dd3643d)
Docs link for metrics.pairwise (#400) (a60aba7)

Dependencies

Update ibis to version 8.0.0 and refactor remote_function to use ibis UDF method (#277) (350499b)

Documentation

Update README to point to new summary pages (#402) (bfe2b23)

0.22.0 (2024-02-27)

⚠ BREAKING CHANGES

rename cosine_similarity to paired_cosine_distances (#393)
move model optional args to kwargs (#381)
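
The rename above reflects the difference between a similarity and a distance. The sketch below is stand-alone Python, not the bigframes implementation; it assumes the common scikit-learn convention that cosine distance equals one minus cosine similarity, and both helper functions are local illustrations.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def paired_cosine_distances(xs, ys):
    """Row-wise distance: 1 - similarity for each (xs[i], ys[i]) pair."""
    return [1.0 - cosine_similarity(u, v) for u, v in zip(xs, ys)]


xs = [[1.0, 0.0], [1.0, 1.0]]
ys = [[0.0, 1.0], [2.0, 2.0]]
distances = paired_cosine_distances(xs, ys)
# The first pair is orthogonal (distance near 1); the second is parallel (near 0).
print(distances)
```

Code that interpreted the old result as a similarity (higher means closer) would need to invert that logic under this convention.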

Features

Add DataFrame.corr() method (#379) (67fd434)
Add ml.metrics.pairwise.manhattan_distance (#392) (9d31865)
Enable regional endpoints for me-central2 (#386) (469674d)

Bug Fixes

Avoid ibis warning for “database” table() method argument (#390) (a0490a4)
Correct the numeric literal dtype (#365) (93b02cd)
Rename cosine_similarity to paired_cosine_distances (#393) (81ece46)

Performance Improvements

Inline read_pandas for small data (#383) (59b446b)

Dependencies

Add minimum version constraint for sqlglot to 19.9.0 (#389) (8b62d77)

Documentation

Add a code sample for creating a kmeans model (#267) (4291d65)
Fix bigframes.pandas.concat documentation (#382) (234b61c)

Miscellaneous Chores

Release 0.22.0 (#396) (8f73d9e)

Code Refactoring

Move model optional args to kwargs (#381) (4037992)

0.21.0 (2024-02-13)

Features

Add Series.cov method (#368) (443db22)
Add ml.llm.GeminiTextGenerator model (#370) (de1e0a4)
Add ml.metrics.pairwise.cosine_similarity function (#374) (126f566)
Add XGBoostModel (#363) (d5518b2)
Limited support of lambdas in Series.apply (#345) (208e081)
Support bigframes.pandas.to_datetime for scalars, iterables and series. (#372) (ffb0d15)
Support read_gbq wildcard table path (#377) (90caf86)

Bug Fixes

Error message fix. (#375) (930cf6b)

Documentation

Clarify ADC pre-auth in a non-interactive environment (#348) (99a9e6e)

0.20.1 (2024-02-06)

Performance Improvements

Make repr cache the block where appropriate (#350) (068879f)

Documentation

Add a sample to demonstrate the evaluation results (#364) (cff0919)
Fix the DataFrame.apply code sample (#366) (1866a26)

0.20.0 (2024-01-30)

Features

Add DataFrame.peek() as an efficient alternative to head() results preview (#318) (9c34d83)
Add ARIMA_EVALUATE options in forecasting models (#336) (73e997b)
Add Index constructor, repr, copy, get_level_values, to_series (#334) (e5d054e)
Improve error message for drive based BQ table reads (#344) (0794788)
Update cut to work without labels = False and show intervals as dict (#335) (4ff53db)

Bug Fixes

Change default connection name in getting_started.ipynb (#347) (677f014)
Series iteration correctly returns values instead of index (#339) (2c6af9b)

Documentation

Add code samples for Series.{between, cumprod} (#353) (09a52fd)

0.19.2 (2024-01-22)

Bug Fixes

Read_gbq large response issue (#332) (b8178b9)
Use object dtype for ARRAY columns in to_pandas() with pandas 1.x (#329) (374ddb5)

Documentation

Add DataFrame.applymap documentation (#326) (bd531a1)
Add code samples for series methods (#323) (32cc6fa)
Add remote model requirements (#333) (c91f70c)

0.19.1 (2024-01-17)

Bug Fixes

Handle multi-level columns for df aggregates properly (#305) (5bb45ba)
Update max_output_token limitation. (#308) (5cccd36)

Documentation

Add code samples for Series.corr (#316) (9150c16)

0.19.0 (2024-01-09)

Features

Add ‘columns’ as an alias for ‘col_order’ (#298) (a01b271)
Add Series dt.tz and dt.unit properties (#303) (2e1a403)
Add to_gbq() method for LLM models (#299) (dafbc1b)
Allow manually set clustering_columns in dataframe.to_gbq (#302) (9c21323)
Support assigning to columns like a property (#304) (f645c56)
Support upcasting numeric columns in concat (#294) (e3a056a)

Bug Fixes

DF.drop tuple input as multi-index (#301) (21391a9)
Fix bug converting non-string labels to sql ids (#296) (a61c5fe)

Documentation

Add code samples for Series.ffill and DataFrame.ffill (#307) (1c63b45)

0.18.0 (2024-01-02)

Features

Add dataframe.to_html (#259) (2cd6489)
Add IntervalIndex support to bigframes.pandas.cut (#254) (6c1969a)
Add replace method to DataFrame (#261) (5092215)
Specific pyarrow mappings for decimal, bytes types (#283) (a1c0631)

Bug Fixes

Dataframes to_gbq now creates dataset if it doesn’t exist (#222) (bac62f7)
Exclude pandas 2.2.0rc0 to unblock prerelease tests (#292) (ac1a745)
Fix DataFrameGroupby.agg() issue with as_index=False (#273) (ab49350)
Make Series.str.replace work for simple strings (#285) (ad67465)
Update dataframe.to_gbq to dedup column names. (#286) (746115d)
Use setuptools.find_namespace_packages (#246) (9ec352a)

Dependencies

Migrate to ibis-framework >= "7.1.0" (#53) (9798a2b)

Documentation

Add code snippets for explore query result page (#278) (7cbbb7d)
Code samples for astype common to DataFrame and Series (#280) (95b673a)
Code samples for DataFrame.copy and Series.copy (#290) (7cbc2b0)
Code samples for drop and fillna (#284) (9c5012e)
Code samples for isna, isnull, dropna, isin (#289) (ad51035)
Code samples for rename, size (#293) (eb69f60)
Code samples for reset_index and sort_values (#282) (acc0eb7)
Code samples for sample, get, Series.round (#295) (c2b1892)
Code samples for Series.{add, replace, unique, T, transpose} (#287) (0e1bbfc)
Code samples for Series.{map, to_list, count} (#290) (7cbc2b0)
Code samples for Series.{name, std, agg} (#293) (eb69f60)
Code samples for Series.groupby and Series.{sum,mean,min,max} (#280) (95b673a)
Code samples for DataFrame set_index, items (#295) (c2b1892)
Fix the rendering for get_dummies (#291) (252f3a2)

0.17.0 (2023-12-14)

Features

Add filters argument to read_gbq for enhanced data querying (#198) (034f71f)
Add module/class level api tracking (#272) (4f3db3d)
Deprecate use_regional_endpoints (#199) (319a1f2)

Bug Fixes

Increase recursion limit, cache compilation tree hashes (#184) (b54791c)
Replaced raise NotImplementedError with return NotImplemented (#258) (a133822)

Documentation

Add code samples for values and value_counts (#249) (f247d95)
Add sample for getting started with BQML (#141) (fb14f54)

0.16.0 (2023-12-12)

Features

Add ARIMAPlus.predict parameters (#264) (99598c7)
Add DataFrame from_dict and from_records methods (#244) (8d81e24)
Add DataFrame.select_dtypes method (#242) (1737acc)
Add nunique method to Series/DataFrameGroupby (#256) (c8ec245)
Support dataframe.loc with conditional columns selection (#233) (3febea9)

Bug Fixes

Enforce pandas version requirement <2.1.4 (#265) (9dd63f6)
Exclude pandas 2.1.4 from prerelease tests to unblock e2e tests (b02fc2c)
Fix value_counts column label for normalize=True (#245) (d3fa6f2)
Migrate e2e tests to bigframes-load-testing project (8766ac6)
Ml.sql logic (#262) (68c6fdf)
Update the llm_kmeans notebook (#247) (66d1839)

Documentation

Add code samples for shape and head (#257) (5bdcc65)
Add example for dataframe.melt, dataframe.pivot, dataframe.stac… (#252) (8c63697)
Add example to dataframe.nlargest, dataframe.nsmallest, datafra… (#234) (e735412)
Add examples for dataframe.cummin, dataframe.cummax, dataframe.cumsum, dataframe.cumprod (#243) (0523a31)
Add examples for dataframe.nunique, dataframe.diff, dataframe.a… (#251) (77074ec)
Correct the docs for option_context (#263) (d21c6dd)
Correct the params rendering for ml.remote and ml.ensemble modules (#248) (c2829e3)
Fix return annotation in API docstrings (#253) (89a1c67)

0.15.0 (2023-11-29)

⚠ BREAKING CHANGES

model.predict returns all the columns (#204)

Features

Add info and memory_usage methods to dataframe (#219) (9d6613d)
Add remote vertex model support (#237) (0bfc4fb)
Add the recent api method for ML component (#225) (ed8876d)
Model.predict returns all the columns (#204) (416171a)
Send warnings on LLM prediction partial failures (#216) (81125f9)

Bug Fixes

Add df snapshots lookup for read_gbq (#229) (d0d9b84)
Avoid unnecessary row_number() on sort key for io (#211) (a18d40e)
Dedup special character (#209) (dd78acb)
Invalid JSON type of the notebook (#215) (a729831)
Make to_pandas override enable_downsampling when sampling_method is manually set. (#200) (ae03756)
Polish the llm+kmeans notebook (#208) (e8532b1)
Update the llm+kmeans notebook with recent change (#236) (f8917ab)
Use anonymous dataset to create remote_function (#205) (69b016e)

Documentation

Add code samples for index and column properties (#212) (c88d38e)
Add code samples for df reshaping, function, merge, and join methods (#203) (010486c)
Add examples for dataframe.kurt, dataframe.std, dataframe.count (#232) (f9c6e72)
Add examples for dataframe.mean, dataframe.median, dataframe.va… (#228) (edd0522)
Add examples for dataframe.min, dataframe.max and dataframe.sum (#227) (3a375e8)
Code samples for Series.dot and DataFrame.dot (#226) (b62a07a)
Code samples for Series.where and Series.mask (#217) (52dfad2)
Code samples for dataframe.any, dataframe.all and dataframe.prod (#223) (d7957fa)
Make the code samples reflect default bq connection usage (#206) (71844b0)

Miscellaneous Chores

Release 0.15.0 (#241) (6c899be)

0.14.1 (2023-11-16)

Bug Fixes

Correctly handle null values when initializing fingerprint ordering (#210) (8324f13)

Documentation

Add an example notebook about line graphs (#197) (f957b27)

0.14.0 (2023-11-14)

Features

Add ‘cross’ join support (#176) (765446a)
Add ‘index’, ‘pad’, ‘nearest’ interpolate methods (#162) (6a28403)
Add series.sample (identical to existing dataframe.sample) (#187) (37914a4)
Add unordered sql compilation (#156) (58f420c)
Log most recent API calls as recent-bigframes-api-xx labels on BigQuery jobs (#145) (4ea33b7)
Read_gbq creates order deterministically without table copy (#191) (8ab81de)
Support date_series.astype("string[pyarrow]") to cast DATE to STRING (#186) (aee0e8e)
Support series.at[row_label] = scalar (#173) (0c8bd33)
Temporary resources no longer use BigQuery Sessions (#194) (4a02cac)

Bug Fixes

All sort operations are now stable (#195) (3a2761f)
Default to 7 days expiration for read_csv, read_json, read_parquet (#193) (03606cd)
Deprecate the remote_service_type in llm model (#180) (a8a409a)
For reset_index on unnamed multiindex, always use level_[n] label (#182) (f95000d)
Match pandas behavior when assigning listlike to empty dfs (#172) (c1d1f42)
Use anonymous dataset instead of session dataset for temp tables (#181) (800d44e)
Use random table for read_pandas (#192) (741c75e)
Use random table when loading data for read_csv, read_json, read_parquet (#175) (9d2e6dc)
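
For readers unfamiliar with the term in the stable-sort entry above: a stable sort keeps rows with equal keys in their original relative order. The sketch below illustrates the property with Python's built-in `sorted`, which is guaranteed stable; it is not bigframes code, and the sample rows are invented for illustration.

```python
# Each row is (label, sort_key); two pairs of rows tie on the key.
rows = [("b", 2), ("a", 1), ("c", 2), ("d", 1)]

# Sorting by the key alone: tied rows keep their input order,
# so "a" stays before "d" and "b" stays before "c".
by_key = sorted(rows, key=lambda row: row[1])
print(by_key)  # [('a', 1), ('d', 1), ('b', 2), ('c', 2)]
```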

Documentation

Add code samples for read_gbq_function using community UDFs (#188) (7506eab)
Add docstring code samples for Series.apply and DataFrame.map (#185) (c816d84)
Add llm kmeans notebook as an included example (#177) (d49ae42)
Use head() to get top n results, not to preview results (#190) (87f84c9)

0.13.0 (2023-11-07)

Features

to_gbq without a destination table writes to a temporary table (#158) (e1817c9)
Add DataFrame.__iter__, DataFrame.iterrows, DataFrame.itertuples, and DataFrame.keys methods (#164) (c065071)
Add Series.__iter__ method (#164) (c065071)
Add interpolate() to series and dataframe (#157) (b9cb55c)
Support 32k text-generation and multilingual embedding models (#161) (5f0ea37)

Bug Fixes

Update default temp table expiration to 7 days (#174) (4ff26cd)

0.12.0 (2023-11-01)

Features

Add DataFrame.melt (#113) (4e4409c)
Add DataFrame.to_pandas_batches() to download large DataFrame objects (#136) (3afd4a3)
Add bigframes.options.compute.maximum_bytes_billed option that sets maximum bytes billed on query jobs (#133) (63c7919)
Add pandas.qcut (#104) (8e44518)
Add pd.get_dummies (#149) (d8baad5)
Add unstack to series, add level param (#115) (5edcd19)
Implement operator @ for DataFrame.dot (#139) (79a638e)
Populate ibis version in user agent (#140) (c639a36)

Bug Fixes

Don’t override the global logging config (#138) (2ddbf74)
Fix bug with column names under repeated column assignment (#150) (29032d0)
Resolve plotly rendering issue by using ipython html for job pro… (#134) (39df43e)
Use indexee’s session for loc listlike cases (#152) (27c5725)

Documentation

Add arithmetic df sample code (#153) (ac44ccd)
Fix indentation on read_gbq_function code sample (#163) (0801d96)
Link to ML.EVALUATE BQML page for score() methods (#137) (45c617f)

0.11.0 (2023-10-26)

Features

Add back reset_session as an alias for close_session (#124) (694a85a)
Change query parameter to query_or_table in read_gbq (#127) (f9bb3c4)

Bug Fixes

Expose bigframes.pandas.reset_session as a public API (#128) (b17e1f4)
Use series’s own session in series.reindex listlike case (#135) (95bff3f)

Documentation

Add runnable code samples for DataFrames I/O methods and property (#129) (6fea8ef)
Add runnable code samples for reading methods (#125) (a669919)

0.10.0 (2023-10-19)

Features

Implement DataFrame.dot for matrix multiplication (#67) (29dd414)

0.9.0 (2023-10-18)

⚠ BREAKING CHANGES

rename bigframes.pandas.reset_session to close_session (#101)

Features

Add bigframes.options.bigquery.application_name for partner attribution (#117) (52d64ff)
Add AtIndexer getitems (#107) (752b01f)
Rename bigframes.pandas.reset_session to close_session (#101) (36693bf)
Send BigQuery cancel request when canceling bigframes process (#103) (e325fbb)
Support external packages in remote_function (#98) (ec10c4a)
Use ArrowDtype for STRUCT columns in to_pandas (#85) (9238fad)

Bug Fixes

Support multiindex for three loc getitem overloads (#113) (68e3cd3)

Performance Improvements

If primary keys are defined, read_gbq avoids copying table data (#112) (e6c0cd1)

Documentation

Add documentation for Series.struct.field and Series.struct.explode (#114) (a6dab9c)
Add open-source link in API doc (#106) (db51fe3)
Update ML overview API doc (#105) (1b3f3a5)

0.8.0 (2023-10-12)

⚠ BREAKING CHANGES

The default behavior of to_parquet is changing from no compression to 'snappy' compression.

Features

Support compression in to_parquet (a8c286f)

Bug Fixes

Create session dataset for remote functions only when needed (#94) (1d385be)

0.7.0 (2023-10-11)

Features

Add aliases for several series properties (#80) (c0efec8)
Add equals methods to series/dataframe (#76) (636a209)
Add iat and iloc accessing by tuples of integers (#90) (228aeba)
Add level param to DataFrame.stack (#88) (97b8bec)
Allow df.drop to take an index object (#68) (740c451)
Use default session connection (#87) (4ae4ef9)

Bug Fixes

Change the invalid url in docs (#93) (969800d)

Documentation

Add more preprocessing models into the docs menu. (#97) (1592315)

0.6.0 (2023-10-04)

Features

Add df.unstack (#63) (4a84714)
Add idxmin, idxmax to series, dataframe (#74) (781307e)
Add ml.preprocessing.KBinsDiscretizer (#81) (24c6256)
Add multi-column dataframe merge (#73) (c9fa85c)
Add update and align methods to dataframe (#57) (bf050cf)
Support STRUCT data type with Series.struct.field to extract child fields (#71) (17afac9)

Bug Fixes

Avoid 403 response too large to return error with read_gbq and large query results (#77) (8f3b5b2)
Change return type of Series.loc[scalar] (#40) (fff3d45)
Fix df/series.iloc by list with multiindex (#79) (971d091)

0.5.0 (2023-09-28)

Features

Add DataFrame.kurtosis / DF.kurt method (c1900c2)
Add DataFrame.rolling and DataFrame.expanding methods (c1900c2)
Add items, apply methods to DataFrame. (#43) (3adc1b3)
Add axis param to simple df aggregations (#52) (9cf9972)
Add index dtype, astype, drop, fillna, aggregate attributes. (#38) (1a254a4)
Add ml.preprocessing.LabelEncoder (#50) (2510461)
Add ml.preprocessing.MaxAbsScaler (#56) (14b262b)
Add ml.preprocessing.MinMaxScaler (#64) (392113b)
Add more index methods (#54) (a6e32aa)
Support calculate_p_values parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
Support class_weights="balanced" in LogisticRegression model (c1900c2)
Support df[column_name] = df_only_one_column (c1900c2)
Support early_stop parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
Support enable_global_explain parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
Support l2_reg parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
Support learn_rate_strategy parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
Support ls_init_learn_rate parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
Support max_iterations parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
Support min_rel_progress parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
Support optimize_strategy parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
Support casting string to integer or float (#59) (3502f83)

Bug Fixes

Fix header skipping logic in read_csv (#49) (d56258c)
Generate unique ids on join to avoid id collisions (#65) (7ab65e8)
LabelEncoder params consistent with Sklearn (#60) (632caec)
Loosen filter items tests to accommodate shifting pandas impl (#41) (edabdbb)

Performance Improvements

Add ability to cache dataframe and series to session table (#51) (416d7cb)
Inline small Series and DataFrames in query text (#45) (5e199ec)
Reimplement unpivot to use cross join rather than union (#47) (f9a93ce)
Simplify join order to use multiple order keys instead of string. (#36) (5056da6)

Documentation

Link to Remote Functions code samples from README and API reference (c1900c2)

0.4.0 (2023-09-16)

Features

Add axis parameter to droplevel and reorder_levels (7c6b0dd)
Add bfill and ffill to DataFrame and Series (7c6b0dd)
Add DataFrame.combine and DataFrame.combine_first (#27) (7c6b0dd)
Add DataFrame.nlargest, nsmallest (7c6b0dd)
Add DataFrame.pct_change and Series.pct_change (7c6b0dd)
Add DataFrame.skew and GroupBy.skew (7c6b0dd)
Add DataFrame.to_dict, to_excel, to_latex, to_records, to_string, to_markdown, to_pickle, to_orc (7c6b0dd)
Add diff method to DataFrame and GroupBy (7c6b0dd)
Add filter and reindex to Series and DataFrame (7c6b0dd)
Add reindex_like to DataFrame and Series (7c6b0dd)
Add swaplevel to DataFrame and Series (7c6b0dd)
Add partial support for Series.replace (7c6b0dd)
Support DataFrame.loc[bool_series, column] = scalar (7c6b0dd)
Support a persistent name in remote_function (7c6b0dd)

Bug Fixes

remote_function uses same credentials as other APIs (7c6b0dd)
Add type hints to models (7c6b0dd)
Raise error when ARIMAPlus is used with Pipeline (7c6b0dd)
Remove transforms parameter in model.fit (breaking change) (7c6b0dd)
Support column joins with “None indexer” (7c6b0dd)
Use Int64Dtype for literals in cut (7c6b0dd)
Use lowercase strings for parameter literals in bigframes.ml (breaking change) (7c6b0dd)

Performance Improvements

Add bigframes-api label to I/O query jobs (7c6b0dd)

Documentation

Document possible parameter values for PaLM2TextGenerator (7c6b0dd)
Document region logic in README (7c6b0dd)
Fix OneHotEncoder sample (7c6b0dd)

0.3.2 (2023-09-06)

Bug Fixes

Make release.sh script for PyPI upload executable (#20) (9951610)

0.3.1 (2023-09-05)

Bug Fixes

release: Use correct directory name for release build config (#17) (3dd25b3)

0.3.0 (2023-09-02)

Features

Add bigframes.get_global_session() and bigframes.reset_session() aliases (a32b747)
Add bigframes.pandas.read_pickle function (a32b747)
Add components_, explained_variance_, and explained_variance_ratio_ properties to bigframes.ml.decomposition.PCA (89b9503)
Add fit_transform to bigquery.ml transformers (a32b747)
Add Series.dropna and DataFrame.fillna (8fab755)
Add Series.str methods isalpha, isdigit, isdecimal, isalnum, isspace, islower, isupper, zfill, center (a32b747)
Support bigframes.pandas.merge() (8fab755)
Support DataFrame.isin with list and dict inputs (8fab755)
Support DataFrame.pivot (a32b747)
Support DataFrame.stack (89b9503)
Support DataFrame-DataFrame binary operations (8fab755)
Support df[my_column] = [a python list] (89b9503)
Support Index.is_monotonic (8fab755)
Support np.arcsin, np.arccos, np.arctan, np.sinh, np.cosh, np.tanh, np.arcsinh, np.arccosh, np.arctanh, np.exp with Series argument (89b9503)
Support np.sin, np.cos, np.tan, np.log, np.log10, np.sqrt, np.abs with Series argument (89b9503)
Support pow() and power operator in DataFrame and Series (8fab755)
Support read_json with engine=bigquery for newline-delimited JSON files (89b9503)
Support Series.corr (89b9503)
Support Series.map (8fab755)
Support for np.add, np.subtract, np.multiply, np.divide, np.power (8fab755)
Support MultiIndex for DataFrame columns (a32b747)
Use pandas.Index for column labels (a32b747)
Use default session and connection in ml.llm and ml.imported (8fab755)

Bug Fixes

Add error message to set_index (a32b747)
Align column names with pandas in DataFrame.agg results (89b9503)
Allow (but still not recommended) ORDER BY in read_gbq input when an index_col is defined (89b9503)
Check for IAM role on the BigQuery connection when initializing a remote_function (89b9503)
Check that types are specified in read_gbq_function (a32b747)
Don’t use query cache for Session construction (a32b747)
Include survey link in abstract NotImplementedError exception messages (89b9503)
Label temp table creation jobs with source=bigquery-dataframes-temp label (89b9503)
Make X_train argument names consistent across methods (8fab755)
Raise AttributeError for unimplemented pandas methods (89b9503)
Raise exception for invalid function in read_gbq_function (a32b747)
Support spaces in column names in DataFrame initializer (89b9503)

Performance Improvements

Add local cache for __repr_*__ methods (a32b747)
Lazily instantiate client library objects (89b9503)
Use row_number() filter for head / tail (8fab755)

Documentation

Add ML section under Overview (a32b747)
Add release status to table of contents (a32b747)
Add samples and best practices to read_gbq docs (a32b747)
Correct the return types of DataFrame and Series (a32b747)
Create subfolders for notebooks (a32b747)
Fix link to GitHub (89b9503)
Highlight bigframes is open-source (a32b747)
Sample ML Drug Name Generation notebook (a32b747)
Set options.bigquery.project in sample code (89b9503)
Transform remote function user guide into sample code (a32b747)
Update remote function notebook with read_gbq_function usage (8fab755)

0.2.0 (2023-08-17)

Features

Add KMeans.cluster_centers_.
Allow column labels to be any type handled by bq df, column labels can be integers now.
Add dataframegroupby.agg().
Add Series Property is_monotonic_increasing and is_monotonic_decreasing.
Add match, fullmatch, get, pad str methods.
Add series isin function.

Bug Fixes

Update ML package to use sessions for queries.
Optimize read_gbq with index_col set to cluster by index_col.
Raise ValueError if the location mismatched.
read_gbq no longer uses ‘time travel’ with query inputs.

Documentation

Add docstring to _uniform_sampling to avoid user using it.

0.1.1 (2023-08-14)

Documentation

Correct link to code repository in setup.py and use correct terminology for console.cloud.google.com links.

0.1.0 (2023-08-11)

Features

Add bigframes.pandas package with an API compatible with pandas. Supported data sources include: BigQuery SQL queries, BigQuery tables, CSV (local and GCS), Parquet (local and Cloud Storage), and more.
Add bigframes.ml package with an API inspired by scikit-learn. Train machine learning models and run batch prediction, powered by BigQuery ML.

0.0.0 (2023-02-22)

Empty package to reserve package name.