Scheduling backups in a remote server

This page describes how to schedule backups for Cassandra without using Cloud Storage. In this method, backups are stored on a remote server that you specify instead of in a Cloud Storage bucket. Apigee uses SSH to communicate with the remote server.

You must schedule the backups as cron jobs. Once a backup schedule has been applied to your hybrid cluster, a Kubernetes backup job runs periodically in the runtime plane according to that schedule. The job interacts with each Cassandra pod in your hybrid cluster to collect all the data, create a compressed archive file of the data, and send the archive to the server specified in your overrides.yaml file.

The following steps include common examples for completing specific tasks, like creating an SSH key pair. Use the methods that are appropriate to your installation.

The procedure has the following parts:

Set up the server and SSH
Set the schedule and destination for backup

Set up the server and SSH

Designate a Linux or Unix server for your backups. This server must be reachable using SSH from your Apigee hybrid runtime plane. It must have enough storage for your backups.
Set up an SSH server on the server, or ensure that it has a secure SSH server configured.
Caution: For security purposes, make sure your SSH server is up to date.

Create an SSH key pair and store the private key file in a path that is accessible from your hybrid runtime plane. You must use an empty passphrase for your key pair or the backup will fail. For example:

ssh-keygen -t rsa -b 4096 -C exampleuser@example.com
  Enter file in which to save the key (/Users/exampleuser/.ssh/id_rsa): $APIGEE_HOME/hybrid-files/certs/ssh_key
  Enter passphrase (empty for no passphrase):
  Enter same passphrase again:
  Your identification has been saved in ssh_key
  Your public key has been saved in ssh_key.pub
  The key fingerprint is:
  SHA256:DWKo334XMZcZYLOLrd/8HNpjTERPJJ0mc11UYmrPvSA exampleuser@example.com
  The key's randomart image is:
  +---[RSA 4096]----+
  |          +.  ++X|
  |     .   . o.=.*+|
  |    . o . . o==o |
  |   . . . =oo+o...|
  |  .     S +E oo .|
  |   . .   .. . o .|
  |    . . .  . o.. |
  |     .  ...o ++. |
  |      .. .. +o+. |
  +----[SHA256]-----+

Where exampleuser@example.com is the key comment. Any string that follows -C in the ssh-keygen command is stored as a comment in the newly created SSH key. Using an account name in the form exampleuser@example.com lets you quickly identify which account goes with the key.
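
If you prefer to skip the interactive prompts, the same key pair can be generated in one command. The following is a minimal sketch, assuming OpenSSH's ssh-keygen is installed; the helper name generate_backup_key is hypothetical and not part of Apigee. The -N "" option sets the empty passphrase and -f sets the output file.

```shell
#!/bin/sh
# Sketch: generate the backup key pair without interactive prompts.
# generate_backup_key is a hypothetical helper name, not part of Apigee.
generate_backup_key() {
  key_file=$1   # where to write the private key; the public key gets a .pub suffix
  comment=$2    # free-form comment stored in the key, for example an account name
  # -N "" sets an empty passphrase, which the backup job requires.
  # -q suppresses the fingerprint and randomart output.
  ssh-keygen -q -t rsa -b 4096 -N "" -C "$comment" -f "$key_file"
}

# Example call, using the path from the interactive example:
# generate_backup_key "$APIGEE_HOME/hybrid-files/certs/ssh_key" "exampleuser@example.com"
```

The function writes the private key to the given path and the public key to the same path with a .pub suffix, matching the ssh_key and ssh_key.pub files in the interactive example.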

Create a user account on the backup server with the name apigee. Make sure the new apigee user has a home directory under /home.
On the backup server, create an .ssh directory in the new /home/apigee directory.
Copy the public key (ssh_key.pub in the previous example) into a file named authorized_keys in the new /home/apigee/.ssh directory. For example:
```
cd /home/apigee
mkdir .ssh
cd .ssh
vi authorized_keys
```
On your backup server, create a backup directory within the /home/apigee/ directory. The backup directory can be any directory as long as the apigee user has access to it. For example:
```
cd /home/apigee
mkdir cassandra-backup
```
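
The account preparation steps above can also be scripted. The following is a sketch only: the helper name install_backup_key is hypothetical, and the permission settings are an addition not stated in the steps above. They reflect the default StrictModes behavior of OpenSSH's sshd, which ignores an authorized_keys file when it or its directory is writable by other users.

```shell
#!/bin/sh
# Sketch: prepare the apigee account's .ssh directory and backup directory.
# install_backup_key is a hypothetical helper; run it on the backup server.
install_backup_key() {
  home_dir=$1      # for example /home/apigee
  pubkey_file=$2   # a copy of the public key, for example ssh_key.pub
  backup_dir=$3    # for example /home/apigee/cassandra-backup

  mkdir -p "$home_dir/.ssh" "$backup_dir" || return 1
  # Append rather than overwrite so any existing authorized keys survive.
  cat "$pubkey_file" >> "$home_dir/.ssh/authorized_keys" || return 1
  # Restrict both paths to the owner so sshd accepts the key.
  chmod 700 "$home_dir/.ssh" || return 1
  chmod 600 "$home_dir/.ssh/authorized_keys" || return 1
}

# Example call, using the names from the steps above:
# install_backup_key /home/apigee ./ssh_key.pub /home/apigee/cassandra-backup
```

If you run the function as root rather than as the apigee user, change the ownership of the .ssh directory and the backup directory to the apigee user afterward, so that the account can read its key and write backups.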

Set the schedule and destination for backup

You set the schedule and destination for backups in your overrides.yaml file.

Add the following parameters to your overrides.yaml file:

Parameters

cassandra:
  backup:
    enabled: true
    keyFile: "PATH_TO_PRIVATE_KEY_FILE"
    server: "BACKUP_SERVER_IP"
    storageDirectory: "/home/apigee/BACKUP_DIRECTORY"
    cloudProvider: "HYBRID" # required verbatim "HYBRID" (all caps)
    schedule: "SCHEDULE"

Example

cassandra:
  backup:
    enabled: true
    keyFile: "private.key" # path relative to apigee-datastore path
    server: "34.56.78.90"
    storageDirectory: "/home/apigee/cassbackup"
    cloudProvider: "HYBRID"
    schedule: "0 2 * * *"

Where:

Property	Description
`backup:enabled`	Backup is disabled by default. You must set this property to `true`.
`backup:keyFile`	`PATH_TO_PRIVATE_KEY_FILE` The path on your local file system to the SSH private key file (named `ssh_key` in the step where you created the SSH key pair). This path must be relative to the `apigee-datastore` chart directory.
`backup:server`	`BACKUP_SERVER_IP` The IP address of your backup server.
`backup:storageDirectory`	`BACKUP_DIRECTORY` The name of the backup directory on your backup server. This must be a directory within `/home/apigee` (the backup directory is named `cassandra-backup` in the step where you created the backup directory).
`backup:cloudProvider`	`HYBRID` For a remote server backup, set the property to `HYBRID`.
`backup:schedule`	`SCHEDULE` The time when the backup starts, specified in standard crontab syntax. Times are in the local time zone of the Kubernetes cluster. Default: `0 2 * * *`. Note: Avoid scheduling a backup that starts a short time after you apply the backup configuration to your cluster.
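
Standard crontab syntax has five space-separated fields: minute, hour, day of month, month, and day of week. The default `0 2 * * *` therefore means a daily backup at 02:00. As a quick sanity check before you apply the configuration, the following sketch counts the fields in a schedule string. The helper name has_five_cron_fields is hypothetical, and the check covers the field count only, not the value ranges inside each field.

```shell
#!/bin/sh
# Sketch: check that a schedule string has the five fields of standard
# crontab syntax (minute, hour, day of month, month, day of week).
# has_five_cron_fields is a hypothetical helper; it counts fields only.
has_five_cron_fields() {
  # awk splits on whitespace; NF is the number of fields on the line.
  printf '%s\n' "$1" | awk '{ exit (NF == 5 ? 0 : 1) }'
}

# "0 2 * * *" is minute 0, hour 2, every day: a daily backup at 02:00.
if has_five_cron_fields "0 2 * * *"; then
  echo "schedule looks well formed"
fi
```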

Apply the backup configuration to the storage scope of your cluster:
```
helm upgrade datastore apigee-datastore/ \
  --install \
  --namespace APIGEE_NAMESPACE \
  --atomic \
  -f OVERRIDES_FILE.yaml
```
Where OVERRIDES_FILE is the path to the overrides file you just edited.

Verify the backup job. For example:

kubectl get cronjob -n APIGEE_NAMESPACE

NAME                      SCHEDULE     SUSPEND   ACTIVE   LAST SCHEDULE   AGE
apigee-cassandra-backup   33 * * * *   False     0        <none>          94s
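
To compare the deployed schedule with the value in your overrides file, you can read it directly from the CronJob instead of from the table output. The following is a minimal sketch: the helper name deployed_backup_schedule and the KUBECTL variable are hypothetical, and the CronJob name apigee-cassandra-backup is taken from the example output above.

```shell
#!/bin/sh
# Sketch: print the schedule that the deployed backup CronJob carries.
# deployed_backup_schedule is a hypothetical helper. KUBECTL can be
# overridden to point at a different kubectl binary or wrapper.
KUBECTL=${KUBECTL:-kubectl}

deployed_backup_schedule() {
  namespace=$1
  # apigee-cassandra-backup is the CronJob name from the example output.
  "$KUBECTL" get cronjob apigee-cassandra-backup \
    -n "$namespace" -o jsonpath='{.spec.schedule}'
}

# Example call:
# deployed_backup_schedule APIGEE_NAMESPACE
```

The printed value should match the `schedule` property you set under `cassandra.backup`. If it does not, check that the helm upgrade used the overrides file you edited.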