Changelog
0.6.0 (2023-10-04)
Features
Bug Fixes
0.5.0 (2023-09-28)
Features
Add DataFrame.kurtosis / DF.kurt method (c1900c2)
Add DataFrame.rolling and DataFrame.expanding methods (c1900c2)
Add index dtype, astype, drop, fillna, aggregate attributes. (#38) (1a254a4)
Support calculate_p_values parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
Support class_weights="balanced" in LogisticRegression model (c1900c2)
Support df[column_name] = df_only_one_column (c1900c2)
Support early_stop parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
Support enable_global_explain parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
Support l2_reg parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
Support learn_rate_strategy parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
Support ls_init_learn_rate parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
Support max_iterations parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
Support min_rel_progress parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
Support optimize_strategy parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
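For readers unfamiliar with the two window types named in the feature list above, the following pure-Python sketch illustrates the pandas semantics that rolling and expanding windows follow: a rolling window has a fixed size, while an expanding window grows from the first row. It does not call bigframes; the helper names rolling_sum and expanding_sum are hypothetical and exist only for this illustration.

```python
from typing import List, Optional


def rolling_sum(values: List[float], window: int) -> List[Optional[float]]:
    """Sum over a fixed-size trailing window; None until the window is full."""
    out: List[Optional[float]] = []
    for i in range(len(values)):
        if i + 1 < window:
            # pandas reports NaN for incomplete windows by default
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window : i + 1]))
    return out


def expanding_sum(values: List[float]) -> List[float]:
    """Sum over every value seen so far; the window grows with each row."""
    out: List[float] = []
    total = 0.0
    for v in values:
        total += v
        out.append(total)
    return out


data = [1.0, 2.0, 3.0, 4.0]
print(rolling_sum(data, window=2))  # [None, 3.0, 5.0, 7.0]
print(expanding_sum(data))          # [1.0, 3.0, 6.0, 10.0]
```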
Bug Fixes
Generate unique ids on join to avoid id collisions (#65) (7ab65e8)
Loosen filter items tests to accommodate shifting pandas impl (#41) (edabdbb)
Performance Improvements
Add ability to cache dataframe and series to session table (#51) (416d7cb)
Inline small Series and DataFrames in query text (#45) (5e199ec)
Reimplement unpivot to use cross join rather than union (#47) (f9a93ce)
Simplify join order to use multiple order keys instead of string. (#36) (5056da6)
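As a rough illustration of the unpivot entry above, the sketch below contrasts the two strategies on an in-memory table. It is not the bigframes implementation, which generates SQL; the function names unpivot_cross_join and unpivot_union and the sample data are assumptions made for this example. The point it shows is only that pairing every row with every column label (a cross join) yields the same long-format rows as concatenating one pass per column (a union), while reading the input once.

```python
from itertools import product
from typing import Dict, List, Tuple

Row = Dict[str, object]
LongRow = Tuple[object, str, object]


def unpivot_cross_join(rows: List[Row], id_col: str, value_cols: List[str]) -> List[LongRow]:
    """Pair every row with every column label, then pick the matching value."""
    return [(row[id_col], label, row[label]) for row, label in product(rows, value_cols)]


def unpivot_union(rows: List[Row], id_col: str, value_cols: List[str]) -> List[LongRow]:
    """Make one pass over the table per value column and concatenate the results."""
    out: List[LongRow] = []
    for label in value_cols:
        out.extend((row[id_col], label, row[label]) for row in rows)
    return out


table = [{"id": 1, "a": 10, "b": 20}, {"id": 2, "a": 30, "b": 40}]
long_rows = unpivot_cross_join(table, "id", ["a", "b"])
print(long_rows)  # [(1, 'a', 10), (1, 'b', 20), (2, 'a', 30), (2, 'b', 40)]

# Both strategies produce the same rows, only in a different order.
assert sorted(long_rows) == sorted(unpivot_union(table, "id", ["a", "b"]))
```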
Documentation
- Link to Remote Functions code samples from README and API reference (c1900c2)
0.4.0 (2023-09-16)
Features
Add axis parameter to droplevel and reorder_levels (7c6b0dd)
Add bfill and ffill to DataFrame and Series (7c6b0dd)
Add DataFrame.combine and DataFrame.combine_first (#27) (7c6b0dd)
Add DataFrame.nlargest, nsmallest (7c6b0dd)
Add DataFrame.pct_change and Series.pct_change (7c6b0dd)
Add DataFrame.skew and GroupBy.skew (7c6b0dd)
Add DataFrame.to_dict, to_excel, to_latex, to_records, to_string, to_markdown, to_pickle, to_orc (7c6b0dd)
Add diff method to DataFrame and GroupBy (7c6b0dd)
Add filter and reindex to Series and DataFrame (7c6b0dd)
Add reindex_like to DataFrame and Series (7c6b0dd)
Add swaplevel to DataFrame and Series (7c6b0dd)
Add partial support for Series.replace (7c6b0dd)
Support DataFrame.loc[bool_series, column] = scalar (7c6b0dd)
Support a persistent name in remote_function (7c6b0dd)
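The bfill and ffill entry in the list above refers to the pandas-style fill methods. A minimal pure-Python sketch of their semantics follows, assuming None stands in for a missing value; the standalone functions ffill and bfill here are illustrative helpers, not the bigframes methods themselves.

```python
from typing import List, Optional


def ffill(values: List[Optional[int]]) -> List[Optional[int]]:
    """Forward fill: replace each gap with the most recent non-missing value."""
    out: List[Optional[int]] = []
    last: Optional[int] = None
    for v in values:
        if v is not None:
            last = v
        out.append(last)
    return out


def bfill(values: List[Optional[int]]) -> List[Optional[int]]:
    """Backward fill: a forward fill applied to the reversed sequence."""
    return ffill(values[::-1])[::-1]


data = [None, 1, None, None, 4, None]
print(ffill(data))  # [None, 1, 1, 1, 4, 4]
print(bfill(data))  # [1, 1, 4, 4, 4, None]
```

Leading gaps survive a forward fill and trailing gaps survive a backward fill, because there is no earlier or later value to copy from.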
Bug Fixes
remote_function uses same credentials as other APIs (7c6b0dd)
Add type hints to models (7c6b0dd)
Raise error when ARIMAPlus is used with Pipeline (7c6b0dd)
Remove transforms parameter in model.fit (breaking change) (7c6b0dd)
Support column joins with “None indexer” (7c6b0dd)
Use Int64Dtype for literals in cut (7c6b0dd)
Use lowercase strings for parameter literals in bigframes.ml (breaking change) (7c6b0dd)
Performance Improvements
Add bigframes-api label to I/O query jobs (7c6b0dd)
Documentation
Document possible parameter values for PaLM2TextGenerator (7c6b0dd)
Document region logic in README (7c6b0dd)
Fix OneHotEncoder sample (7c6b0dd)
0.3.2 (2023-09-06)
Bug Fixes
0.3.1 (2023-09-05)
Bug Fixes
0.3.0 (2023-09-02)
Features
Add bigframes.get_global_session() and bigframes.reset_session() aliases (a32b747)
Add bigframes.pandas.read_pickle function (a32b747)
Add components_, explained_variance_, and explained_variance_ratio_ properties to bigframes.ml.decomposition.PCA (89b9503)
Add fit_transform to bigquery.ml transformers (a32b747)
Add Series.dropna and DataFrame.fillna (8fab755)
Add Series.str methods isalpha, isdigit, isdecimal, isalnum, isspace, islower, isupper, zfill, center (a32b747)
Support bigframes.pandas.merge() (8fab755)
Support DataFrame.isin with list and dict inputs (8fab755)
Support DataFrame.pivot (a32b747)
Support DataFrame.stack (89b9503)
Support DataFrame-DataFrame binary operations (8fab755)
Support df[my_column] = [a python list] (89b9503)
Support Index.is_monotonic (8fab755)
Support np.arcsin, np.arccos, np.arctan, np.sinh, np.cosh, np.tanh, np.arcsinh, np.arccosh, np.arctanh, np.exp with Series argument (89b9503)
Support np.sin, np.cos, np.tan, np.log, np.log10, np.sqrt, np.abs with Series argument (89b9503)
Support pow() and power operator in DataFrame and Series (8fab755)
Support read_json with engine=bigquery for newline-delimited JSON files (89b9503)
Support Series.corr (89b9503)
Support Series.map (8fab755)
Support for np.add, np.subtract, np.multiply, np.divide, np.power (8fab755)
Support MultiIndex for DataFrame columns (a32b747)
Use pandas.Index for column labels (a32b747)
Use default session and connection in ml.llm and ml.imported (8fab755)
Bug Fixes
Add error message to set_index (a32b747)
Align column names with pandas in DataFrame.agg results (89b9503)
Allow (but still not recommended) ORDER BY in read_gbq input when an index_col is defined (89b9503)
Check for IAM role on the BigQuery connection when initializing a remote_function (89b9503)
Check that types are specified in read_gbq_function (a32b747)
Don’t use query cache for Session construction (a32b747)
Include survey link in abstract NotImplementedError exception messages (89b9503)
Label temp table creation jobs with source=bigquery-dataframes-temp label (89b9503)
Make X_train argument names consistent across methods (8fab755)
Raise AttributeError for unimplemented pandas methods (89b9503)
Raise exception for invalid function in read_gbq_function (a32b747)
Support spaces in column names in DataFrame initializer (89b9503)
Performance Improvements
Add local cache for __repr_*__ methods (a32b747)
Lazily instantiate client library objects (89b9503)
Use row_number() filter for head / tail (8fab755)
Documentation
Add ML section under Overview (a32b747)
Add release status to table of contents (a32b747)
Add samples and best practices to read_gbq docs (a32b747)
Correct the return types of DataFrame and Series (a32b747)
Create subfolders for notebooks (a32b747)
Fix link to GitHub (89b9503)
Highlight bigframes is open-source (a32b747)
Sample ML Drug Name Generation notebook (a32b747)
Set options.bigquery.project in sample code (89b9503)
Transform remote function user guide into sample code (a32b747)
Update remote function notebook with read_gbq_function usage (8fab755)
0.2.0 (2023-08-17)
Features
Add KMeans.cluster_centers_.
Allow column labels to be any type handled by bq df; column labels can now be integers.
Add dataframegroupby.agg().
Add Series properties is_monotonic_increasing and is_monotonic_decreasing.
Add match, fullmatch, get, pad str methods.
Add series isin function.
Bug Fixes
Update ML package to use sessions for queries.
Optimize read_gbq with index_col set to cluster by index_col.
Raise ValueError if the location mismatched.
read_gbq no longer uses ‘time travel’ with query inputs.
Documentation
- Add docstring to _uniform_sampling to discourage users from using it.
0.1.1 (2023-08-14)
Documentation
- Correct link to code repository in setup.py and use correct terminology for console.cloud.google.com links.
0.1.0 (2023-08-11)
Features
Add bigframes.pandas package with an API compatible with pandas. Supported data sources include: BigQuery SQL queries, BigQuery tables, CSV (local and GCS), Parquet (local and Cloud Storage), and more.
Add bigframes.ml package with an API inspired by scikit-learn. Train machine learning models and run batch prediction, powered by BigQuery ML.
0.0.0 (2023-02-22)
- Empty package to reserve package name.