Feature preprocessing overview
Feature preprocessing is one of the most important steps in the machine
learning lifecycle. It consists of creating features and cleaning the training data. Creating features is also referred to as feature engineering.
BigQuery ML provides the following feature preprocessing techniques:
Automatic preprocessing. BigQuery ML performs automatic preprocessing
during training. For more information, see Automatic feature preprocessing.
Manual preprocessing. You can use the TRANSFORM clause
in the CREATE MODEL statement to define custom preprocessing using manual
preprocessing functions.
You can also use these functions outside of the TRANSFORM clause to
process training data before creating the model.
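The manual preprocessing option can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive recipe: the dataset `mydataset`, the tables `trips` and `trips_preprocessed`, the model name `fare_model`, and all column names are hypothetical, and ML.STANDARD_SCALER, ML.QUANTILE_BUCKETIZE, and ML.MIN_MAX_SCALER are only one possible choice of manual preprocessing functions. The first statement applies preprocessing inside the TRANSFORM clause; the second applies a scaler outside of it to prepare a table before any model is created.

```sql
-- Hypothetical dataset, table, model, and column names; adjust to your schema.

-- 1. Custom preprocessing inside the TRANSFORM clause of CREATE MODEL.
CREATE OR REPLACE MODEL `mydataset.fare_model`
  TRANSFORM(
    -- Standardize a numeric column using statistics over the training data.
    ML.STANDARD_SCALER(trip_distance) OVER () AS trip_distance_scaled,
    -- Assign a numeric column to one of four quantile-based buckets.
    ML.QUANTILE_BUCKETIZE(passenger_age, 4) OVER () AS age_bucket,
    -- Pass a column through unchanged.
    payment_type,
    -- List the label so that it remains available to training.
    fare_amount
  )
  OPTIONS (
    model_type = 'LINEAR_REG',
    input_label_cols = ['fare_amount']
  ) AS
SELECT
  trip_distance,
  passenger_age,
  payment_type,
  fare_amount
FROM
  `mydataset.trips`;

-- 2. The same kind of function used outside TRANSFORM, to prepare
--    training data in a separate table before creating a model.
CREATE OR REPLACE TABLE `mydataset.trips_preprocessed` AS
SELECT
  ML.MIN_MAX_SCALER(trip_distance) OVER () AS trip_distance_scaled,
  payment_type,
  fare_amount
FROM
  `mydataset.trips`;
```

As the BigQuery ML documentation describes it, preprocessing defined in the TRANSFORM clause is stored with the model and reapplied at prediction and evaluation time. A table preprocessed outside the clause leaves it to you to repeat the same steps on new data.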
Get feature information
You can use the
ML.FEATURE_INFO function to
retrieve the statistics of all input feature columns.
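A minimal sketch of such a call, assuming a hypothetical, already trained model named `mydataset.fare_model`:

```sql
-- Hypothetical model name; replace it with a model that exists in your project.
-- Returns one row per input feature column with its statistics.
SELECT
  *
FROM
  ML.FEATURE_INFO(MODEL `mydataset.fare_model`);
```

The result should include per-feature statistics, for example the minimum, maximum, and mean of numeric columns. Check the function reference for the exact output columns.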
Recommended knowledge
By using the default settings in the CREATE MODEL statement and the inference functions, you can create and use BigQuery ML models even without much ML knowledge. However, basic knowledge of the ML development lifecycle, such as feature engineering and model training, helps you optimize both your data and your model to deliver better results. We recommend using the following resources to develop
familiarity with ML techniques and processes:
Machine Learning Crash Course: https://developers.google.com/machine-learning/crash-course
Intro to Machine Learning: https://www.kaggle.com/learn/intro-to-machine-learning
Data Cleaning: https://www.kaggle.com/learn/data-cleaning
Feature Engineering: https://www.kaggle.com/learn/feature-engineering
Intermediate Machine Learning: https://www.kaggle.com/learn/intermediate-machine-learning
What's next
Learn about feature serving in BigQuery ML.
Last updated 2025-08-17 UTC.