Schedule data exports
This page describes how to schedule exports of your Firestore data. To run exports on a schedule, we recommend using Cloud Run functions and Cloud Scheduler.
Before you begin
Before you schedule managed data exports, you must complete the following tasks:
- Enable billing for your Google Cloud project. Only Google Cloud projects with billing enabled can use the export and import feature.
- Export operations require a destination Cloud Storage bucket. Create a Cloud Storage bucket in a location near your Firestore database location. You cannot use a Requester Pays bucket for export operations.
Create a Cloud Function and a Cloud Scheduler job
Follow the steps below to create a Node.js Cloud Function that initiates a Firestore data export and a Cloud Scheduler job to call that function:
Firebase CLI
- Install the Firebase CLI. In a new directory, initialize the CLI for Cloud Run functions:
    firebase init functions --project PROJECT_ID
  - Select JavaScript for the language.
  - Optionally, enable ESLint.
  - Enter y to install dependencies.
- Replace the code in the functions/index.js file with the following:
    const functions = require('firebase-functions');
    const firestore = require('@google-cloud/firestore');
    const client = new firestore.v1.FirestoreAdminClient();

    // Replace BUCKET_NAME
    const bucket = 'gs://BUCKET_NAME';

    exports.scheduledFirestoreExport = functions.pubsub
      .schedule('every 24 hours')
      .onRun((context) => {
        // GCP_PROJECT is not set in every runtime, so fall back to GCLOUD_PROJECT.
        const projectId = process.env.GCP_PROJECT || process.env.GCLOUD_PROJECT;
        const databaseName = client.databasePath(projectId, '(default)');

        return client.exportDocuments({
          name: databaseName,
          outputUriPrefix: bucket,
          // Leave collectionIds empty to export all collections
          // or set to a list of collection IDs to export,
          // collectionIds: ['users', 'posts']
          collectionIds: []
        })
          .then(responses => {
            const response = responses[0];
            console.log(`Operation Name: ${response['name']}`);
          })
          .catch(err => {
            console.error(err);
            throw new Error('Export operation failed');
          });
      });
- In the code above, modify the following:
  - Replace BUCKET_NAME with the name of your bucket.
  - Modify every 24 hours to set your export schedule. Use either App Engine cron.yaml syntax or the unix-cron format (* * * * *).
  - Modify collectionIds: [] to export only the specified collection groups. Leave as is to export all collections.
- Deploy the scheduled function:
    firebase deploy --only functions
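The request object passed to exportDocuments can be assembled separately, which makes it easier to check before deploying. The following is a minimal sketch, not part of the Firestore client library: buildExportRequest is a hypothetical helper, and writing each export under a UTC date folder is an assumption chosen for illustration. The database name is built as the same string that client.databasePath(projectId, '(default)') returns.

```javascript
// Hypothetical helper (not part of @google-cloud/firestore).
// Builds the request object that would be passed to client.exportDocuments().
function buildExportRequest(projectId, bucketName, collectionIds = [], now = new Date()) {
  // Same string that client.databasePath(projectId, '(default)') returns.
  const name = `projects/${projectId}/databases/(default)`;

  // Assumption: group exports under a UTC date folder such as gs://my-bucket/2024-01-31.
  const datePrefix = now.toISOString().slice(0, 10);

  return {
    name,
    outputUriPrefix: `gs://${bucketName}/${datePrefix}`,
    // An empty array exports all collections.
    collectionIds,
  };
}

// Example with placeholder values and a fixed date.
const request = buildExportRequest(
  'my-project',
  'my-bucket',
  ['users'],
  new Date('2024-01-31T12:00:00Z')
);
console.log(request);
```

Note that with this naming scheme, two runs on the same UTC day target the same folder, so a finer-grained prefix may be preferable if you export more than once a day.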
Google Cloud console
Create a Cloud Function
- Go to the Cloud Functions page in the Google Cloud console.
- Click Create Function.
- Enter a function name such as firestoreExport.
- Under Trigger, select Cloud Pub/Sub.
- Under Topic, select Create new Topic. Enter a name for the Pub/Sub topic, such as initiateFirestoreExport. Take note of the topic name as you need it to create your Cloud Scheduler job.
- Under Source code, select Inline editor. Enter the following code under index.js:
    const firestore = require('@google-cloud/firestore');
    const client = new firestore.v1.FirestoreAdminClient();

    // Replace BUCKET_NAME
    const bucket = 'gs://BUCKET_NAME';

    exports.scheduledFirestoreExport = (event, context) => {
      const databaseName = client.databasePath(
        process.env.GCP_PROJECT,
        '(default)'
      );

      return client
        .exportDocuments({
          name: databaseName,
          outputUriPrefix: bucket,
          // Leave collectionIds empty to export all collections
          // or define a list of collection IDs:
          // collectionIds: ['users', 'posts']
          collectionIds: [],
        })
        .then(responses => {
          const response = responses[0];
          console.log(`Operation Name: ${response['name']}`);
          return response;
        })
        .catch(err => {
          // The error is only logged here; it is not rethrown.
          console.error(err);
        });
    };
  In the code above, modify the following:
  - Replace BUCKET_NAME with the name of your bucket.
  - Modify collectionIds: [] to export only the specified collection groups. Leave as is to export all collections.
- Under package.json, add the following dependency:
    {
      "dependencies": {
        "@google-cloud/firestore": "^1.3.0"
      }
    }
- Under Function to execute, enter scheduledFirestoreExport, the name of the function in index.js.
- Click Create to deploy the Cloud Function.
Create a Cloud Scheduler job
Next, create a Cloud Scheduler job that calls your Cloud Function:
- Go to the Cloud Scheduler page in the Google Cloud console.
- Click Create Job.
- Enter a Name for the job such as scheduledFirestoreExport.
- Enter a Frequency, for example, every 24 hours.
- Select a Timezone.
- Under Target, select Pub/Sub. In the Topic field, enter the name of the Pub/Sub topic you defined alongside your Cloud Function, initiateFirestoreExport in the example above.
- In the Payload field, enter start export. The job requires a payload, but the Cloud Function above does not actually use this value.
- Click Create.
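Although the function above ignores the payload, it can be useful to log it when debugging. The following is a minimal sketch, assuming the (event, context) background-function signature used above, where the Pub/Sub message data arrives base64-encoded in event.data. The helper name decodePayload is hypothetical.

```javascript
// Hypothetical helper: decode the payload of a Pub/Sub-triggered background function.
function decodePayload(event) {
  // Pub/Sub message data is base64-encoded; it may be missing entirely.
  if (!event || !event.data) {
    return '';
  }
  return Buffer.from(event.data, 'base64').toString('utf8');
}

// Simulate the event produced by the Cloud Scheduler payload "start export".
const sampleEvent = { data: Buffer.from('start export').toString('base64') };
const payload = decodePayload(sampleEvent);
console.log(payload);
```

Inside the Cloud Function, a call such as decodePayload(event) could be logged before the export starts, which helps confirm that an invocation came from the scheduler job.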
Configure access permissions
Next, give the Cloud Function permission to start export operations and to write to your Cloud Storage bucket.
This Cloud Function uses your project's default service account to authenticate and authorize its export operations. When you create a project, a default service account is created for you with the following name:
PROJECT_ID@appspot.gserviceaccount.com
This service account requires permission to start an export operation and to write to your Cloud Storage bucket. To grant these permissions, assign the following IAM roles to the default service account:
- Cloud Datastore Import Export Admin
- Owner or Storage Admin role on the bucket
You can use the gcloud and gsutil command-line tools to assign these roles. If they are not already installed, you can access these tools from Cloud Shell in the Google Cloud console:
Start Cloud Shell
- Assign the Cloud Datastore Import Export Admin role. Replace PROJECT_ID, and run the following command:
    gcloud projects add-iam-policy-binding PROJECT_ID \
        --member serviceAccount:PROJECT_ID@appspot.gserviceaccount.com \
        --role roles/datastore.importExportAdmin
- Assign the Storage Admin role on your bucket. Replace PROJECT_ID and BUCKET_NAME, and run the following command:
    gsutil iam ch serviceAccount:PROJECT_ID@appspot.gserviceaccount.com:admin \
        gs://BUCKET_NAME
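Because both commands repeat the project ID and the service account address, it is easy to miss one of the placeholders. The following is a small sketch that fills them in from two values and prints the commands shown above on single lines. The helper name buildGrantCommands is hypothetical, and the sketch assumes the default service account follows the PROJECT_ID@appspot.gserviceaccount.com pattern described earlier.

```javascript
// Hypothetical helper: fill in the placeholders of the two commands above.
function buildGrantCommands(projectId, bucketName) {
  // Assumption: the default service account is PROJECT_ID@appspot.gserviceaccount.com.
  const member = `serviceAccount:${projectId}@appspot.gserviceaccount.com`;

  return [
    // Grants the Cloud Datastore Import Export Admin role on the project.
    `gcloud projects add-iam-policy-binding ${projectId} ` +
      `--member ${member} --role roles/datastore.importExportAdmin`,
    // Grants the Storage Admin role on the bucket.
    `gsutil iam ch ${member}:admin gs://${bucketName}`,
  ];
}

// Example with placeholder values.
const commands = buildGrantCommands('my-project', 'my-bucket');
commands.forEach((command) => console.log(command));
```

The printed lines can be reviewed and then pasted into Cloud Shell.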
If you disable or delete your App Engine default service account, your App Engine app will lose access to your Firestore database. If you disabled your App Engine service account, you can re-enable it; see enabling a service account. If you deleted your App Engine service account within the last 30 days, you can restore it; see undeleting a service account.
Test your Cloud Scheduler job and Cloud Function
You can test your Cloud Scheduler job in the Cloud Scheduler page of the Google Cloud console.
Go to the Cloud Scheduler page in the Google Cloud console.
In the row for your new Cloud Scheduler job, click Run now.
After a few seconds, the Cloud Scheduler job should update the result column to Success and Last run to the current time. You may need to click Refresh.
The Cloud Scheduler page only confirms that the job called your Cloud Function. Open the Cloud Function page to see your function's logs.
View the Cloud Function logs
To see if the Cloud Function successfully started an export operation, open the function's logs:
Firebase console
Go to the Cloud Run functions page in the Firebase console.
Google Cloud console
Go to the Cloud Run functions page in the Google Cloud console.
View export progress
You can use the gcloud firestore operations list command to view the progress of your export operations; see managing export and import operations.
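The Cloud Function logs the name of the operation it started, which helps when matching a log entry to an entry in the operations list. The following is a minimal sketch that splits such a name into its parts, assuming it follows the projects/PROJECT_ID/databases/DATABASE_ID/operations/OPERATION_ID pattern. The helper name parseOperationName and the sample operation ID are hypothetical.

```javascript
// Hypothetical helper: split a logged operation name into its components.
// Assumes the pattern projects/PROJECT_ID/databases/DATABASE_ID/operations/OPERATION_ID.
function parseOperationName(name) {
  const pattern = /^projects\/([^/]+)\/databases\/([^/]+)\/operations\/([^/]+)$/;
  const match = pattern.exec(name);
  if (!match) {
    // The name does not follow the assumed pattern.
    return null;
  }
  const [, projectId, databaseId, operationId] = match;
  return { projectId, databaseId, operationId };
}

// Example with a made-up operation ID.
const parsed = parseOperationName(
  'projects/my-project/databases/(default)/operations/EXAMPLE_OPERATION_ID'
);
console.log(parsed);
```

Returning null for names that do not match keeps the helper safe to use on arbitrary log lines.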
After an export operation completes, you can view the output files in your Cloud Storage bucket:
Open the Cloud Storage browser