Feature preprocessing overview
Feature preprocessing is one of the most important steps in the machine learning lifecycle. It consists of creating features and cleaning the training data. Creating features is also referred to as feature engineering.
BigQuery ML provides the following feature preprocessing techniques:
Automatic preprocessing: BigQuery ML performs automatic preprocessing during training. For more information, see Automatic feature preprocessing.
Manual preprocessing: You can use the TRANSFORM clause in the CREATE MODEL statement to define custom preprocessing using manual preprocessing functions.
You can also use these functions outside of the TRANSFORM clause to process training data before creating the model.
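As an illustration of the manual approach, the following is a minimal sketch of a CREATE MODEL statement with a TRANSFORM clause. The dataset, model, table, and column names (mydataset.sample_model, mydataset.training_data, age, income, label) are hypothetical, and the choice of ML.STANDARD_SCALER, ML.BUCKETIZE, and a linear regression model is only an assumption made for the example.

```sql
-- Hypothetical names: mydataset.sample_model, mydataset.training_data,
-- and the columns age, income, and label are placeholders.
CREATE OR REPLACE MODEL `mydataset.sample_model`
  TRANSFORM(
    -- Standardize a numerical column across the whole training set.
    ML.STANDARD_SCALER(age) OVER () AS age_scaled,
    -- Map a numerical column to buckets using explicit split points.
    ML.BUCKETIZE(income, [20000, 50000, 100000]) AS income_bucket,
    -- Pass the label column through unchanged so training can use it.
    label
  )
  OPTIONS(
    model_type = 'linear_reg',
    input_label_cols = ['label']
  )
AS
SELECT
  age,
  income,
  label
FROM
  `mydataset.training_data`;
```

Only the columns produced inside TRANSFORM are used for training, which is why label is listed there explicitly. The same functions can also be called in an ordinary SELECT statement to prepare a table before the model is created.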
Get feature information
You can use the ML.FEATURE_INFO function to retrieve the statistics of all input feature columns.
Recommended knowledge
By using the default settings in the CREATE MODEL statements and the inference functions, you can create and use BigQuery ML models even without much ML knowledge. However, having basic knowledge about the ML development lifecycle, such as feature engineering and model training, helps you optimize both your data and your model to deliver better results. We recommend using the following resources to develop familiarity with ML techniques and processes:
Machine Learning Crash Course (https://developers.google.com/machine-learning/crash-course)
Intro to Machine Learning (https://www.kaggle.com/learn/intro-to-machine-learning)
Data Cleaning (https://www.kaggle.com/learn/data-cleaning)
Feature Engineering (https://www.kaggle.com/learn/feature-engineering)
Intermediate Machine Learning (https://www.kaggle.com/learn/intermediate-machine-learning)
What's next
Learn about feature serving in BigQuery ML.
Last updated: 2025-01-07 (UTC).
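To make the ML.FEATURE_INFO call mentioned under "Get feature information" concrete, here is a minimal sketch. The model name mydataset.sample_model is hypothetical and stands for any model that has already been trained in your project.

```sql
-- Hypothetical model name; replace it with an existing trained model.
-- Returns one row of statistics per input feature column.
SELECT
  *
FROM
  ML.FEATURE_INFO(MODEL `mydataset.sample_model`);
```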