New to Stash? Please start here.
A complete backup or restore process may consist of several steps. For example, to back up a PostgreSQL database, we first need to dump the database and upload the dumped file to a backend. Then we need to update the respective Repository and BackupSession status and send Prometheus metrics. In Stash, we call each such individual step a Function.
A Function is a Kubernetes CustomResourceDefinition (CRD) that specifies a template for a container that performs a single, specific action. For example, the postgres-backup-* function only dumps the database and uploads the dumped file to the backend, whereas the update-status function updates the status of the respective BackupSession and Repository and sends Prometheus metrics to the pushgateway based on the output of the postgres-backup-* function.
When you install Stash, some Functions will be pre-installed for supported targets like databases, etc. However, you can create your own function to customize or extend the backup/restore process.
Like any official Kubernetes resource, a Function has TypeMeta, ObjectMeta and Spec sections. However, unlike other Kubernetes resources, it does not have a Status section.
A sample Function object to back up a PostgreSQL database is shown below:
apiVersion: stash.appscode.com/v1beta1
kind: Function
metadata:
  name: postgres-backup-11.2
spec:
  image: stashed/postgres-stash:11.2
  args:
  - backup-pg
  - --provider=${REPOSITORY_PROVIDER:=}
  - --bucket=${REPOSITORY_BUCKET:=}
  - --endpoint=${REPOSITORY_ENDPOINT:=}
  - --path=${REPOSITORY_PREFIX:=}
  - --secret-dir=/etc/repository/secret
  - --scratch-dir=/tmp
  - --hostname=${HOSTNAME:=host-0}
  - --pg-args=${pgArgs:=}
  - --namespace=${NAMESPACE:=default}
  - --app-binding=${TARGET_NAME:=}
  - --retention-keep-last=${RETENTION_KEEP_LAST:=0}
  - --retention-prune=${RETENTION_PRUNE:=false}
  - --output-dir=${outputDir:=}
  - --enable-cache=${ENABLE_CACHE:=true}
  - --max-connections=${MAX_CONNECTIONS:=0}
  volumeMounts:
  - name: ${secretVolume}
    mountPath: /etc/repository/secret
  runtimeSettings:
    container:
      resources:
        requests:
          memory: 256M
        limits:
          memory: 256M
      securityContext:
        runAsUser: 5000
        runAsGroup: 5000
A sample Function that updates the BackupSession and Repository status and sends metrics to the Prometheus pushgateway is shown below:
apiVersion: stash.appscode.com/v1beta1
kind: Function
metadata:
  name: update-status
spec:
  image: appscode/stash:pg
  args:
  - update-status
  - --namespace=${NAMESPACE:=default}
  - --repository=${REPOSITORY_NAME:=}
  - --backup-session=${BACKUP_SESSION:=}
  - --restore-session=${RESTORE_SESSION:=}
  - --output-dir=${outputDir:=}
Here, we describe the various sections of a Function CRD.

Spec

A Function object has the following fields in the spec section:
spec.image specifies the docker image to use to create a container using the template specified in this Function.
spec.command specifies the commands to be executed by the container. Docker image’s ENTRYPOINT will be executed if no commands are specified.
spec.args specifies a list of arguments that will be passed to the entrypoint. You can templatize this section using envsubst-style variables. Stash will resolve all the variables before creating the respective container. A variable should follow one of the following patterns: ${VARIABLE_NAME:=default-value} or ${VARIABLE_NAME:=}.
In the first case, if Stash can’t resolve the variable, the default value will be used in place of this variable. In the second case, if Stash can’t resolve the variable, an empty string will be used to replace the variable.
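As a rough illustration of the two cases described above, the fragment below takes three arguments from the postgres-backup-11.2 sample and shows what the container might receive after resolution. The resolved values are assumptions chosen for the example, not output captured from Stash.

```yaml
# Template arguments as written in the Function
args:
- --provider=${REPOSITORY_PROVIDER:=}   # no default value
- --hostname=${HOSTNAME:=host-0}        # default value is "host-0"
- --endpoint=${REPOSITORY_ENDPOINT:=}   # no default value

# Assuming REPOSITORY_PROVIDER resolves to "gcs" and the other two
# variables cannot be resolved, the container would be started with:
#   --provider=gcs
#   --hostname=host-0
#   --endpoint=
```

Giving a variable a default value is a reasonable choice when the binary inside the image would otherwise reject an empty flag value.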
The Stash operator provides the following built-in variables based on BackupConfiguration, BackupSession, RestoreSession, Repository, Task, Function, BackupBlueprint, etc.
| Environment Variable | Usage |
|---|---|
| NAMESPACE | Namespace of backup or restore job/workload |
| BACKUP_SESSION | Name of the respective BackupSession object |
| RESTORE_SESSION | Name of the respective RestoreSession object |
| REPOSITORY_NAME | Name of the Repository object that holds respective backend information |
| REPOSITORY_PROVIDER | Type of storage provider, e.g. gcs, s3, aws, local etc. |
| REPOSITORY_SECRET_NAME | Name of the secret that holds the credentials to access the backend |
| REPOSITORY_BUCKET | Name of the bucket where backed up data will be stored |
| REPOSITORY_PREFIX | A prefix of the directory inside bucket where backed up data will be stored |
| REPOSITORY_ENDPOINT | URL of S3 compatible Minio/Rook server |
| REPOSITORY_URL | URL of the REST server for REST backend |
| HOSTNAME | An identifier for the backed up data. If multiple pods back up to the same Repository (e.g. StatefulSet or DaemonSet), this host name is used to identify the data of each individual host. |
| SOURCE_HOSTNAME | An identifier of the host whose backed up data will be restored |
| TARGET_NAME | Name of the target of backup or restore |
| TARGET_API_VERSION | API version of the target of backup or restore |
| TARGET_KIND | Kind of the target of backup or restore |
| TARGET_NAMESPACE | Namespace of the target object for backup or restore |
| TARGET_MOUNT_PATH | Directory where target PVC will be mounted in stand-alone PVC backup or restore |
| TARGET_PATHS | Array of file paths that are subject to backup |
| RESTORE_PATHS | Array of file paths that are subject to restore |
| RESTORE_SNAPSHOTS | Name of the snapshot that will be restored |
| TARGET_APP_VERSION | Version of the application pointed by an AppBinding |
| TARGET_APP_GROUP | The application group where the app pointed by an AppBinding belongs |
| TARGET_APP_RESOURCE | The resource kind under an application group that the app pointed by an AppBinding works with |
| TARGET_APP_TYPE | The full type of the application, i.e. TARGET_APP_GROUP/TARGET_APP_RESOURCE |
| TARGET_APP_REPLICAS | Number of replicas of an application targeted for backup or restore |
| RETENTION_KEEP_LAST | Number of latest snapshots to keep |
| RETENTION_KEEP_HOURLY | Number of hourly snapshots to keep |
| RETENTION_KEEP_DAILY | Number of daily snapshots to keep |
| RETENTION_KEEP_WEEKLY | Number of weekly snapshots to keep |
| RETENTION_KEEP_MONTHLY | Number of monthly snapshots to keep |
| RETENTION_KEEP_YEARLY | Number of yearly snapshots to keep |
| RETENTION_KEEP_TAGS | Keep only those snapshots that have these tags |
| RETENTION_PRUNE | Specify whether to remove data of old snapshot completely from the backend |
| RETENTION_DRY_RUN | Specify whether to run cleanup in test mode |
| ENABLE_CACHE | Specify whether to use cache while backup or restore |
| MAX_CONNECTIONS | Specifies number of parallel connections to upload/download data to/from backend |
| NICE_ADJUSTMENT | Adjustment value to configure nice to throttle the load on cpu. |
| IONICE_CLASS | Name of the ionice class |
| IONICE_CLASS_DATA | Value of the ionice class data |
| ENABLE_STATUS_SUBRESOURCE | Specifies whether the CRD has the status subresource enabled |
| PROMETHEUS_PUSHGATEWAY_URL | URL of the Prometheus pushgateway that collects the backup/restore metrics |
| INTERIM_DATA_DIR | Directory to store backed up or restored data temporarily before uploading to the backend or injecting into the target |
If you want to use a variable that is not present in this table, you have to provide its value in the spec.task.params section of the BackupConfiguration CRD.
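For instance, the pgArgs variable used by the sample postgres-backup-11.2 Function is not a built-in variable, so its value has to come from the BackupConfiguration. The sketch below assumes that each entry under spec.task.params is a name/value pair; the object name, namespace, Task name, and parameter value are all hypothetical.

```yaml
apiVersion: stash.appscode.com/v1beta1
kind: BackupConfiguration
metadata:
  name: sample-postgres-backup   # hypothetical name
  namespace: demo                # hypothetical namespace
spec:
  task:
    name: postgres-backup-11.2   # assumed Task name, for illustration only
    params:
    - name: pgArgs               # variable referenced as ${pgArgs:=} in the Function
      value: "--no-owner"        # hypothetical value
  # repository, schedule, target, and retention settings omitted for brevity
```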
spec.workDir specifies the container’s working directory. If this field is not specified, the container’s runtime default will be used.
spec.ports specifies a list of the ports to expose from the respective container that will be created for this function.
spec.volumeMounts specifies a list of volume names and their mountPath that will be mounted into the container that will be created for this function.
spec.volumeDevices specifies a list of the block devices to be used by the container that will be created for this function.
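A minimal sketch of how these four fields might appear together is shown below. It assumes that ports, volumeMounts, and volumeDevices take the same shape as the corresponding fields of a core Kubernetes container; the Function name, image, paths, and the targetVolume variable are hypothetical.

```yaml
apiVersion: stash.appscode.com/v1beta1
kind: Function
metadata:
  name: my-custom-backup                 # hypothetical Function name
spec:
  image: example.com/my-backup:latest    # hypothetical image
  workDir: /work                         # field name as given in this document
  ports:
  - containerPort: 8080                  # assumed shape: core container port
  volumeMounts:
  - name: ${secretVolume}
    mountPath: /etc/repository/secret
  volumeDevices:
  - name: ${targetVolume}                # hypothetical variable, supplied via params
    devicePath: /dev/target              # hypothetical device path
```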
spec.runtimeSettings.container allows you to configure the runtime environment of a backup job at the container level. You can configure the following container-level parameters:
| Field | Usage |
|---|---|
| resources | Compute resources required by sidecar container or backup job. To know how to manage resources for containers, please visit here. |
| livenessProbe | Periodic probe of backup sidecar/job container’s liveness. Container will be restarted if the probe fails. |
| readinessProbe | Periodic probe of backup sidecar/job container’s readiness. Container will be removed from service endpoints if the probe fails. |
| lifecycle | Actions that the management system should take in response to container lifecycle events. |
| securityContext | Security options that backup sidecar/job’s container should run with. For more details, please visit here. |
| nice | Set CPU scheduling priority for the backup process. For more details about nice, please visit here. |
| ionice | Set I/O scheduling class and priority for the backup process. For more details about ionice, please visit here. |
| env | A list of the environment variables to set in the container that will be created for this function. |
| envFrom | Sets environment variables in the container that will be created for this function from a Secret or ConfigMap. |
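
As a hedged sketch, several of these parameters could be combined as follows. The resources and securityContext values are taken from the sample Function above. The sub-field names under nice and ionice (adjustment, class, classData) and the environment variable are assumptions made for illustration and should be checked against the installed CRD.

```yaml
spec:
  runtimeSettings:
    container:
      resources:
        requests:
          memory: 256M
        limits:
          memory: 256M
      securityContext:
        runAsUser: 5000
        runAsGroup: 5000
      nice:
        adjustment: 5          # assumed field name; lowers CPU scheduling priority
      ionice:
        class: 2               # assumed field name; 2 is the ionice best-effort class
        classData: 4           # assumed field name; priority within the class
      env:
      - name: EXAMPLE_VAR      # hypothetical environment variable
        value: "example"
```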
If you are using a PSP-enabled cluster and the function needs any specific permission, you can specify the PSP name using the spec.podSecurityPolicyName field. Stash will add this PSP to the respective RBAC roles that will be created for this function.
Note that the Stash operator can’t give a backup job permission to use a PSP if the operator itself does not have permission to use it. So, if you want to specify a PSP name in this section, make sure to add it to the stash-operator ClusterRole too.
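The extra permission could be sketched as a standard Kubernetes RBAC rule like the one below, added under the rules of the stash-operator ClusterRole. The PSP name my-backup-psp is hypothetical and should match whatever is set in spec.podSecurityPolicyName.

```yaml
# Rule to append under `rules:` of the stash-operator ClusterRole
- apiGroups:
  - policy
  resources:
  - podsecuritypolicies
  resourceNames:
  - my-backup-psp        # hypothetical PSP name
  verbs:
  - use
```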
Learn how to use a Function to create a Task from here.