Data Backup and Restoration

Instructions for restoring Obsrv from backups


📘 Overview

This guide provides comprehensive instructions to restore Obsrv data using Velero, Redis, and PostgreSQL backups. It also includes procedures for Druid data migration between S3 buckets.


🧾 Introduction

This document provides a step-by-step guide for restoring data from backups in an Obsrv deployment. It covers:

  • Restoration of critical services (PostgreSQL, Redis, etc.)

  • Velero-based system-wide recovery

  • Data migration between object storage buckets (e.g., for Druid segments)


🗂 Backup Storage Details

| Service | S3 Bucket Naming Convention |
| --- | --- |
| PostgreSQL | backups-{building_block}-{env}-{account-id} |
| Denorm Redis | backups-{building_block}-{env}-{account-id} |
| Dedup Redis | backups-{building_block}-{env}-{account-id} |
| Dataset Events | {building_block}-{env}-{account-id} |
| Terraform State | Stored in: AWS_TERRAFORM_BACKEND_BUCKET_NAME |
| Velero Backups | velero-{building_block}-{env}-{account-id} |
| Flink Checkpoints | checkpoint-{building_block}-{env}-{account-id} |


🔄 Velero: Full Restoration Workflow

✅ Prerequisites

Ensure the following tools are installed:

  • AWS CLI

  • Velero CLI

To install Velero:
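A minimal sketch for Linux, assuming the CLI is installed from the Velero GitHub release tarball. The version `v1.13.0` and the `linux-amd64` architecture are placeholders, not values taken from this guide; pick the release that matches the Velero server in your cluster. The `run` helper only prints each command unless `DRY_RUN=0` is set.

```shell
# Sketch: install the Velero CLI from a GitHub release tarball.
# VELERO_VERSION and VELERO_ARCH are assumptions; match them to your cluster.
VELERO_VERSION="${VELERO_VERSION:-v1.13.0}"
VELERO_ARCH="${VELERO_ARCH:-linux-amd64}"

# Print commands by default; set DRY_RUN=0 to execute them.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then printf '+ %s\n' "$*"; else "$@"; fi; }

velero_tarball() { echo "velero-${VELERO_VERSION}-${VELERO_ARCH}.tar.gz"; }

velero_url() {
  echo "https://github.com/vmware-tanzu/velero/releases/download/${VELERO_VERSION}/$(velero_tarball)"
}

install_velero() {
  run curl -fsSLO "$(velero_url)"                 # download the tarball
  run tar -xzf "$(velero_tarball)"                # unpack it
  run sudo mv "velero-${VELERO_VERSION}-${VELERO_ARCH}/velero" /usr/local/bin/velero
  run velero version --client-only                # confirm the CLI works
}
```

Calling `install_velero` prints the four commands for review; `DRY_RUN=0 install_velero` executes them.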

Ensure:

  • AWS credentials are configured

  • Kubernetes cluster is accessible

  • Velero is installed and running in the cluster
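The checks above can be scripted. The sketch below assumes Velero runs in the `velero` namespace (its default); `VELERO_NAMESPACE`, `missing_tools`, and `check_prerequisites` are illustrative names, not part of Obsrv.

```shell
# Print every tool from the argument list that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

check_prerequisites() {
  missing=$(missing_tools aws kubectl velero)
  if [ -n "$missing" ]; then
    echo "missing tools:" $missing
    return 1
  fi
  # AWS credentials are configured if the caller identity resolves.
  aws sts get-caller-identity >/dev/null || { echo "AWS credentials not configured"; return 1; }
  # The cluster is accessible if the API server answers.
  kubectl cluster-info >/dev/null || { echo "cluster not reachable"; return 1; }
  # Velero server pods should be listed as Running (namespace is an assumption).
  kubectl get pods -n "${VELERO_NAMESPACE:-velero}" || return 1
}
```

Run `check_prerequisites` before starting; a non-zero exit status means one of the prerequisites is not met.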


📦 Restoration Procedure

🔁 Restore All Services
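A hedged sketch of a full restore with the Velero CLI. The backup name is whatever `velero backup get` lists for your environment; the `<backup>-restore` naming of the restore object is an illustrative choice. Commands are printed, not executed, unless `DRY_RUN=0` is set.

```shell
# Print commands by default; set DRY_RUN=0 to execute them.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then printf '+ %s\n' "$*"; else "$@"; fi; }

restore_all() {
  backup_name="$1"
  # List available backups to confirm the name and its status.
  run velero backup get
  # Restore everything captured in the backup and wait for completion.
  run velero restore create "${backup_name}-restore" --from-backup "$backup_name" --wait
}
```

For example, `restore_all nightly-backup` (a hypothetical backup name) prints the two commands to review before running them for real.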

🔧 Restore a Specific Service (e.g., PostgreSQL)
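A partial restore can be sketched by narrowing the restore to one namespace and label selector. The namespace and label used in the usage example are assumptions about how PostgreSQL is deployed, so check them with `kubectl get pods -A --show-labels` first. Commands are printed unless `DRY_RUN=0` is set.

```shell
# Print commands by default; set DRY_RUN=0 to execute them.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then printf '+ %s\n' "$*"; else "$@"; fi; }

restore_service() {
  backup_name="$1"   # name listed by `velero backup get`
  namespace="$2"     # namespace that holds the service
  selector="$3"      # label selector matching the service's resources
  run velero restore create "${backup_name}-partial" \
    --from-backup "$backup_name" \
    --include-namespaces "$namespace" \
    --selector "$selector" \
    --wait
}
```

For example: `restore_service nightly-backup postgresql app.kubernetes.io/name=postgresql`, where all three values are hypothetical.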

✅ Verify Restoration
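One way to verify a restore is to inspect the restore object itself. This sketch assumes you know the restore name (it is shown by `velero restore get`); commands are printed unless `DRY_RUN=0` is set.

```shell
# Print commands by default; set DRY_RUN=0 to execute them.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then printf '+ %s\n' "$*"; else "$@"; fi; }

verify_restore() {
  restore_name="$1"
  run velero restore get                               # phase should be Completed
  run velero restore describe "$restore_name" --details  # per-resource results, warnings, errors
  run velero restore logs "$restore_name"              # full log if anything failed
}
```

A phase of `PartiallyFailed` or `Failed` in the first command is the cue to read the describe output and logs closely.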


🧪 Post-Restoration Checklist

  • Ensure all pods are running (PostgreSQL, Redis, Prometheus, etc.)

  • Verify data in all critical databases

  • Check that the data processors (Kafka, Flink, Druid) resume from their last committed state

  • Run test queries to validate dataset access and accuracy
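The first item on this checklist can be partly automated. The sketch below filters the output of `kubectl get pods -A --no-headers` and prints pods whose STATUS column is neither `Running` nor `Completed`; `unhealthy_pods` is an illustrative helper name. It does not catch pods that are `Running` but not ready, so the READY column still deserves a look.

```shell
# Reads `kubectl get pods -A --no-headers` output on stdin.
# Columns: NAMESPACE NAME READY STATUS RESTARTS AGE, so STATUS is field 4.
unhealthy_pods() {
  awk '$4 != "Running" && $4 != "Completed" { print $1 "/" $2, $4 }'
}

# Usage against a live cluster:
#   kubectl get pods -A --no-headers | unhealthy_pods
```

Empty output means every pod reported `Running` or `Completed`.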


💾 Redis & PostgreSQL: Targeted Restoration

Use this procedure to restore only Redis and PostgreSQL when the S3 bucket paths are unchanged.


🔒 Pre-Restoration

Pause these services:

  • Flink Jobs

  • Web Console

  • Obsrv APIs

  • Command Service

  • Druid

  • Superset
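Pausing can be sketched as scaling each workload to zero replicas. The `namespace/kind/name` entries in the usage example are hypothetical; list the real workloads with `kubectl get deployments,statefulsets -A`. If a workload is managed by an operator, scaling it directly may be reverted, in which case pause it through the operator instead. Commands are printed unless `DRY_RUN=0` is set.

```shell
# Print commands by default; set DRY_RUN=0 to execute them.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then printf '+ %s\n' "$*"; else "$@"; fi; }

# scale_services REPLICAS namespace/kind/name [namespace/kind/name ...]
scale_services() {
  replicas="$1"; shift
  for entry in "$@"; do
    namespace="${entry%%/*}"   # text before the first slash
    workload="${entry#*/}"     # kind/name after the first slash
    run kubectl scale "$workload" -n "$namespace" --replicas="$replicas"
  done
}
```

For example, `scale_services 0 web-console/deployment/web-console superset/deployment/superset` pauses two (hypothetically named) services, and the same call with the original replica count resumes them after the restore.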


🐘 PostgreSQL Restoration

1. Access the Pod

2. Pre-Check & Clean Old Data

3. Copy Backup File

4. Restore
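The four steps above might look like the following sketch. It assumes the backup is a plain-SQL dump already downloaded from the backup bucket, that `psql` inside the pod can authenticate as the given user, and that the namespace, pod, and user defaults (all hypothetical) are replaced with real values. It only lists existing databases for the pre-check; what to drop before restoring depends on the contents of the dump and is left to the operator. Commands are printed unless `DRY_RUN=0` is set.

```shell
# Print commands by default; set DRY_RUN=0 to execute them.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then printf '+ %s\n' "$*"; else "$@"; fi; }

# Hypothetical defaults; adjust to your deployment.
PG_NAMESPACE="${PG_NAMESPACE:-postgresql}"
PG_POD="${PG_POD:-postgresql-0}"
PG_USER="${PG_USER:-postgres}"

# Steps 1 and 2 run commands inside the pod through kubectl exec.
pg_exec() { run kubectl exec -n "$PG_NAMESPACE" "$PG_POD" -- "$@"; }

restore_postgres() {
  backup_file="$1"   # local path to the SQL dump
  # Pre-check: list the databases that currently exist.
  pg_exec psql -U "$PG_USER" -c '\l'
  # Copy the backup file into the pod.
  run kubectl cp "$backup_file" "$PG_NAMESPACE/$PG_POD:/tmp/restore.sql"
  # Restore by replaying the dump.
  pg_exec psql -U "$PG_USER" -f /tmp/restore.sql
}
```

For an interactive session instead, `kubectl exec -it -n <namespace> <pod> -- bash` opens a shell in the pod.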


🧠 Redis Restoration

1. Download and Decompress

2. Access the Pod

3. Disable Persistence Temporarily

4. Replace and Restart

5. Verify Keys

6. Revert Configs
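A sketch of the six steps, under several assumptions: the backup is a gzip-compressed RDB file, the pod keeps its data in `/data`, and the namespace and pod names are placeholders. Note that Redis loads the AOF in preference to `dump.rdb` at startup when `appendonly` is enabled in its persistent configuration, so check that setting for your deployment before relying on this flow. Commands are printed unless `DRY_RUN=0` is set.

```shell
# Print commands by default; set DRY_RUN=0 to execute them.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then printf '+ %s\n' "$*"; else "$@"; fi; }

# Hypothetical defaults; adjust to your deployment.
REDIS_NAMESPACE="${REDIS_NAMESPACE:-redis}"
REDIS_POD="${REDIS_POD:-denorm-redis-master-0}"
REDIS_DATA_DIR="${REDIS_DATA_DIR:-/data}"

redis_cli() { run kubectl exec -n "$REDIS_NAMESPACE" "$REDIS_POD" -- redis-cli "$@"; }

restore_redis() {
  s3_uri="$1"   # s3://<backup-bucket>/<path-to>/dump.rdb.gz
  # 1. Download and decompress.
  run aws s3 cp "$s3_uri" ./dump.rdb.gz
  run gunzip -f ./dump.rdb.gz
  # 3. Note current settings, then disable persistence so a shutdown
  #    save does not overwrite the restored file.
  redis_cli CONFIG GET appendonly
  redis_cli CONFIG GET save
  redis_cli CONFIG SET appendonly no
  redis_cli CONFIG SET save ""
  # 4. Replace the RDB file and restart the pod (wait until it is Ready again).
  run kubectl cp ./dump.rdb "$REDIS_NAMESPACE/$REDIS_POD:$REDIS_DATA_DIR/dump.rdb"
  run kubectl delete pod -n "$REDIS_NAMESPACE" "$REDIS_POD"
  # 5. Verify keys.
  redis_cli DBSIZE
  # 6. Revert configs to the values noted in step 3 (placeholders here).
  redis_cli CONFIG SET appendonly "${ORIG_APPENDONLY:-yes}"
  redis_cli CONFIG SET save "${ORIG_SAVE:-3600 1 300 100 60 10000}"
}
```

Repeat the procedure for each Redis instance (Denorm and Dedup) with its own pod name and backup object.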


✅ Final Steps After Restoration

Once PostgreSQL and Redis are restored, resume the services paused during pre-restoration, then validate all datasets and operational behavior through the Obsrv console.


🚚 Druid Data Migration

Use this section to migrate Druid segment data from one S3 bucket to another.


📌 Pre-Migration Checklist

  • Ensure all segments are copied to the target bucket.

  • Scale down Druid components:
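Both checklist items can be sketched with the AWS CLI and `kubectl`. Comparing object counts is only a coarse check that the copy is complete, and the `druid` namespace plus the assumption that Druid runs as Deployments and StatefulSets are hypothetical. Scale commands are printed unless `DRY_RUN=0` is set.

```shell
# Print commands by default; set DRY_RUN=0 to execute them.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then printf '+ %s\n' "$*"; else "$@"; fi; }

# Extract N from the "Total Objects: N" line of
# `aws s3 ls <uri> --recursive --summarize` output read on stdin.
object_count() { awk '/Total Objects:/ { print $3 }'; }

# Succeeds when source and target prefixes hold the same number of objects.
segments_in_sync() {
  src=$(aws s3 ls "$1" --recursive --summarize | object_count)
  dst=$(aws s3 ls "$2" --recursive --summarize | object_count)
  [ -n "$src" ] && [ "$src" = "$dst" ]
}

# Scale every Deployment and StatefulSet in the Druid namespace to zero.
scale_down_druid() {
  namespace="${1:-druid}"
  run kubectl scale deployment --all -n "$namespace" --replicas=0
  run kubectl scale statefulset --all -n "$namespace" --replicas=0
}
```

For example, `segments_in_sync s3://<old-bucket>/<segments-prefix> s3://<new-bucket>/<segments-prefix> && scale_down_druid druid`.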


📦 Migration Execution

  1. Deploy a temporary Python pod:

  2. Inside the pod, run the migration script (using the AWS CLI or a boto3 script) to copy the segment paths from the old bucket to the new bucket.
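The two steps above could be sketched as follows, using the AWS CLI variant. The pod name, the `python:3.11-slim` image, and the use of `aws s3 sync` are assumptions rather than the project's actual migration script, and the pod needs AWS credentials (for example through its service account) for the copy to work. Commands are printed unless `DRY_RUN=0` is set.

```shell
# Print commands by default; set DRY_RUN=0 to execute them.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then printf '+ %s\n' "$*"; else "$@"; fi; }

MIGRATION_POD="${MIGRATION_POD:-druid-migration}"   # hypothetical pod name

# 1. Start a long-lived Python pod to run the migration from.
start_migration_pod() {
  namespace="$1"
  run kubectl run "$MIGRATION_POD" -n "$namespace" \
    --image=python:3.11-slim --restart=Never --command -- sleep infinity
}

# 2. Install the AWS CLI in the pod and copy the old prefix to the new bucket.
copy_segments() {
  namespace="$1"; src="$2"; dst="$3"
  run kubectl exec -n "$namespace" "$MIGRATION_POD" -- pip install --quiet awscli
  run kubectl exec -n "$namespace" "$MIGRATION_POD" -- aws s3 sync "$src" "$dst"
}

# Remove the temporary pod once the copy has been verified.
cleanup_migration_pod() { run kubectl delete pod "$MIGRATION_POD" -n "$1"; }
```

`aws s3 sync` only copies objects that are missing or different in the target, so it is safe to re-run if the copy is interrupted.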


⚠️ Important Notes

  • Always take snapshots before migration or restoration.

  • Confirm with stakeholders before starting restoration in a production environment.

  • Monitor resource usage post-restoration for anomalies.

