[[["易于理解","easyToUnderstand","thumb-up"],["解决了我的问题","solvedMyProblem","thumb-up"],["其他","otherUp","thumb-up"]],[["很难理解","hardToUnderstand","thumb-down"],["信息或示例代码不正确","incorrectInformationOrSampleCode","thumb-down"],["没有我需要的信息/示例","missingTheInformationSamplesINeed","thumb-down"],["翻译问题","translationIssue","thumb-down"],["其他","otherDown","thumb-down"]],["最后更新时间 (UTC):2025-08-27。"],[[["\u003cp\u003eThis page details various schema design patterns for storing time series data in Bigtable, building upon general schema design concepts.\u003c/p\u003e\n"],["\u003cp\u003eTime series data, which includes measurements and timestamps, can be stored using patterns like time buckets, where rows represent a time period, or single-timestamp rows, where each row represents a single event.\u003c/p\u003e\n"],["\u003cp\u003eTime bucket patterns offer advantages like better performance and data compression but are more complex to design, while single-timestamp patterns can be easier to implement but may lead to performance issues.\u003c/p\u003e\n"],["\u003cp\u003eWithin time buckets, data can be structured by adding new columns for each event, useful when changes in data aren't crucial, or by adding new timestamped cells to existing columns, allowing for change measurement over time.\u003c/p\u003e\n"],["\u003cp\u003eSingle-timestamp rows can utilize serialized or unserialized data storage, with serialized data offering storage efficiency and speed at the cost of flexibility, while unserialized data provides ease of use but reduced performance.\u003c/p\u003e\n"]]],[],null,["# Schema design for time series data\n==================================\n\nThis page describes schema design patterns for storing time series data in\nBigtable. 
This page builds on [Designing your schema](/bigtable/docs/schema-design) and
assumes you are familiar with the concepts and recommendations described on that
page.

A time series is a collection of data that consists of measurements and the
times when the measurements are recorded. Examples of time series include the
following:

- The plot of memory usage on your computer
- Temperature over time on a news report
- Stock market prices over a period of time

A good schema results in excellent performance and scalability, and a bad schema
can lead to a poorly performing system. However, no single schema design
provides the best fit for all use cases.

The patterns described on this page provide a starting point. Your unique
dataset and the queries you plan to use are the most important things to
consider as you design a schema for your time-series data.

The basic design patterns for storing time-series data in
Bigtable are as follows:

- [Rows are time buckets](#time-buckets)
  - [New columns for new events](#new-columns)
  - [New cells for new events](#new-cells)
- [Rows represent single timestamps](#single-timestamp)
  - [Serialized column data](#serialized)
  - [Unserialized column data](#unserialized)

Data for examples
-----------------

To illustrate the differences between patterns, the examples on this page assume
that you are storing data for an app that records the measurements that weather
balloons take once every minute. We use *event* to mean a single request that
writes one or more cells at the same time. Location IDs correspond to
Google Cloud regions.

Time buckets
------------

In a time bucket pattern, each row in your table represents a "bucket" of time,
such as an hour, day, or month.
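An event's bucket can be derived by truncating its timestamp to the bucket
size. The helper below is a minimal sketch, not part of any Bigtable API; the
function name and the supported bucket sizes are assumptions for illustration.

```python
from datetime import datetime, timezone

def bucket_start(ts: datetime, size: str) -> datetime:
    """Truncate an event timestamp to the start of its time bucket."""
    if size == "hour":
        return ts.replace(minute=0, second=0, microsecond=0)
    if size == "day":
        return ts.replace(hour=0, minute=0, second=0, microsecond=0)
    raise ValueError(f"unsupported bucket size: {size}")

# A measurement taken at 14:35:12 belongs to the 14:00 hourly bucket.
ts = datetime(2025, 8, 27, 14, 35, 12, tzinfo=timezone.utc)
print(bucket_start(ts, "hour").isoformat())  # 2025-08-27T14:00:00+00:00
```

All events whose timestamps truncate to the same value are written to the same
row, which is what keeps related measurements physically together.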
A row key includes a non-timestamp identifier,
such as `week49`, for the time period recorded in the row, along with other
identifying data.

The size of the bucket that you use, such as minute, hour, or day,
depends on the queries that you plan to use and on
[Bigtable data size limits](/bigtable/quotas#limits-data-size). For example, if
rows that contain an hour of data are bigger than the recommended maximum size
per row of 100 MB, then rows that represent a half hour
or a minute are probably a better choice.

**Advantages** of time bucket patterns include the following:

- You'll see better [performance](/bigtable/docs/performance). For example, if you store 100
  measurements, Bigtable writes and reads those measurements faster
  if they are in one row than if they are in 100 rows.

- Data stored in this way is compressed more efficiently than data in tall,
  narrow tables.

**Disadvantages** include the following:

- Time-bucket schema design patterns are more complicated than single-timestamp
  patterns and can take more time and effort to develop.

### Adding new columns for new events

In this time bucket pattern, you write a new column to a row for each event,
storing the *data in the column qualifier rather than as a cell value*.
This means that for each cell, you send the column family, column qualifier, and
timestamp, but no value.

Using this pattern for the sample weather balloon data, each row contains all
the measurements for a single metric, such as `pressure`, for a single weather
balloon, over the course of a week. Each row key contains the location, balloon
ID, metric that you are recording in the row, and a week number. Every time a
balloon reports its data for a metric, you add a new column to the row.
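The row key and qualifier-as-data scheme described above can be sketched as
follows. This is a minimal illustration, not a Bigtable client call: the `#`
delimiter, the helper name, and the sample values are assumptions.

```python
from datetime import date

def bucket_row_key(location: str, balloon_id: str, metric: str, day: date) -> str:
    """Build a weekly time-bucket row key such as 'us-west2#3698#pressure#week49'."""
    week = day.isocalendar()[1]  # ISO week number of the event
    return f"{location}#{balloon_id}#{metric}#week{week}"

key = bucket_row_key("us-west2", "3698", "pressure", date(2023, 12, 6))
print(key)  # us-west2#3698#pressure#week49

# Each event becomes a new column in that row: the qualifier carries the
# reading itself (pressure in Pascals), and the cell value stays empty.
qualifier = str(94558)
value = b""
```

Because the reading lives in the column qualifier, the write sends only the
column family, qualifier, and timestamp, which is what saves storage space.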
The
column qualifier contains the measurement, the pressure in Pascals, for the
minute identified by the cell timestamp.

In this example, after three minutes a row contains three columns, one for
each reported pressure measurement.

**Use cases** for this pattern include the following:

- You don't need to measure changes in your time series data.

- You want to save storage space by [using column qualifiers as data](/bigtable/docs/schema-design#columns).

### Adding new cells for new events

In this time bucket pattern, you add new cells to existing columns when you
write a new event. This pattern takes advantage of
Bigtable's ability to store multiple timestamped cells in
a given row and column. It's important to specify garbage collection rules when
you use this pattern.

Using the weather balloon data as an example, each row contains all the
measurements for a single weather balloon over the course of a week. The row key
prefix is an identifier for the week, so you can read an entire week's worth of
data for multiple balloons with a single query. The other row key segments are
the location where the balloon operates and the ID number for the balloon. The
table has one column family, `measurements`, and that column family has one
column for each type of measurement: `pressure`, `temperature`, `humidity`, and
`altitude`.

Every time a balloon sends its measurements, the application
writes new values to the row that holds the current week's data for the balloon,
writing additional timestamped cells to each column. At the end of the week,
each column in each row has one measurement for each minute of the week, or
10,080 cells (if your garbage collection policy allows it).
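As a minimal in-memory sketch of this pattern (illustrative names only, not
the Bigtable client API), a row maps each column qualifier to a list of
timestamped cells, and a full week of per-minute cells adds up to 10,080:

```python
from datetime import datetime, timedelta, timezone

# A row modeled as column qualifier -> list of (timestamp, value) cells.
# Cells are appended oldest-first here; Bigtable returns them newest-first.
row: dict[str, list[tuple[datetime, int]]] = {"pressure": [], "temperature": []}

start = datetime(2023, 12, 4, tzinfo=timezone.utc)  # start of the weekly bucket
for minute in range(3):
    ts = start + timedelta(minutes=minute)
    row["pressure"].append((ts, 94558 + minute))  # pressure in Pascals
    row["temperature"].append((ts, 9))

print(len(row["pressure"]))  # 3 timestamped cells after three minutes
print(7 * 24 * 60)           # 10080 cells per column after a full week
```

Without a garbage collection rule capping cell versions or age, each column
keeps accumulating one cell per minute for the life of the row.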
In this
case, after three minutes, the first two columns in a row each hold three
timestamped cells, one per minute.

**Use cases** for this pattern include the following:

- You want to be able to measure changes in measurements over time.

Single-timestamp rows
---------------------

In this pattern, you create a row for each new event or measurement instead of
adding cells to columns in existing rows. The row key suffix is the timestamp
value. Tables that follow this pattern tend to be *tall and narrow*, and each
column in a row contains only one cell.

| **Important:** To avoid hotspots, *never* use a timestamp value as a row key prefix.

### Single-timestamp serialized

In this pattern, you store all the data for a row in a single column in a
serialized format such as a protocol buffer (protobuf). This approach is
described in more detail in [Designing your schema](/bigtable/docs/schema-design#query-flux).

For example, if you use this pattern to store the weather balloon data, after
four minutes your table contains four rows, each holding one serialized set of
measurements.

**Advantages** of this pattern include the following:

- Storage efficiency

- Speed

**Disadvantages** include the following:

- The inability to retrieve only certain columns when you read the data

- The need to deserialize the data after it's read

**Use cases** for this pattern include the following:

- You are not sure how you will query the data, or your queries might
  fluctuate.

- Your need to keep costs down outweighs your need to be able to filter data
  before you retrieve it from Bigtable.

- Each event contains so many measurements that you might exceed the
  100 MB per-row limit if you store the data in multiple
  columns.

### Single-timestamp unserialized

In this pattern, you store each event in its own row, even if you are recording
only one measurement.
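A single-timestamp row key can be sketched as follows. The delimiter, field
order, and timestamp format are assumptions for illustration; the one firm
rule, per the note above, is that the timestamp stays a suffix, never a prefix.

```python
from datetime import datetime, timezone

def event_row_key(location: str, balloon_id: str, ts: datetime) -> str:
    """Build a row key with the timestamp as a suffix, never a prefix."""
    return f"{location}#{balloon_id}#{ts.strftime('%Y%m%d%H%M')}"

key = event_row_key("us-west2", "3698", datetime(2023, 12, 6, 12, 1, tzinfo=timezone.utc))
print(key)  # us-west2#3698#202312061201
```

Leading with the location and balloon ID spreads writes across the key space,
while the timestamp suffix keeps a balloon's events contiguous so a range scan
can read a window of time for one balloon.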
The data in the columns is not serialized.

**Advantages** of this pattern include the following:

- It is generally easier to implement than a time-bucket pattern.

- You might spend less time [refining your schema](/bigtable/docs/schema-design-steps#refine) before using
  it.

**Disadvantages** of this pattern often outweigh the advantages:

- Bigtable is less performant with this pattern.

- Data stored this way is not compressed as efficiently as data in wider
  columns.

- Even when the timestamp is at the end of the row key, this pattern can result
  in hotspots.

**Use cases** for this pattern include the following:

- You always want to retrieve all columns but only a specified range of
  timestamps, and you have a reason not to store the data in a serialized
  structure.

- You want to store an unbounded number of events.

Using the [weather balloon example data](#example-data), the column family and
column qualifiers are the same as in the example that uses time buckets and new
cells. In this pattern, however, every set of reported measurements for each
weather balloon is written to a *new row*: five minutes of reports from one
balloon produce five separate rows.

Additional strategies
---------------------

If you need to send multiple different queries for the same dataset, consider
storing your data in multiple tables, each with a row key designed for one of
the queries.

You can also combine patterns in some cases.
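As one sketch of such a combination, serialized measurements can be written as
timestamped cells in a weekly time-bucket row. JSON is used here as a
stand-in for a protobuf, and the field names and row key layout are
assumptions for illustration.

```python
import json

def serialize_event(pressure: int, temperature: int, humidity: int, altitude: int) -> bytes:
    """Serialize one set of measurements into a single cell value."""
    return json.dumps({
        "pressure": pressure,
        "temperature": temperature,
        "humidity": humidity,
        "altitude": altitude,
    }).encode("utf-8")

row_key = "week49#us-west2#3698"  # weekly bucket; timestamp is never a prefix
cell = serialize_event(94558, 9, 65, 601)

# The trade-off carries over from the serialized pattern: the whole value
# must be deserialized after a read, and individual fields can't be
# filtered server-side.
decoded = json.loads(cell)
print(decoded["pressure"])  # 94558
```

Keeping each serialized event small, and capping how many events land in one
bucket row, is what keeps the combined row under the recommended 100 MB limit.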
For example, you can store
[serialized data](#serialized) in rows that represent
[time buckets](#time-buckets), as long as you don't let the rows become too big.

What's next
-----------

- Review the steps involved in [planning a schema](/bigtable/docs/schema-design-steps).
- Understand the [best practices for designing a schema](/bigtable/docs/schema-design).
- Read about [the performance you can expect from Bigtable](/bigtable/docs/performance).
- Explore the diagnostic capabilities of [Key Visualizer](/bigtable/docs/keyvis-overview).