Migration Plan: Obsrv GA to Obsrv 1.0


Overview

This document outlines the migration strategy from Obsrv GA to Obsrv 1.0, with a focus on data integrity, minimal downtime, and operational continuity.


Migration Timeline & Activities

The following visual (to be added) captures the phased migration activities and associated timelines:


Step-by-Step Migration Process

1. Provision Obsrv 1.0 Cluster

  • Deploy a dedicated Obsrv 1.0 cluster.

  • Ensure isolation from live traffic during setup and validation.


2. Configure Backup Paths

Update Obsrv 1.0 to point to existing GA backup locations:

  • Druid: Reuse GA S3 backup path.

  • Secor: Reuse GA S3 backup path.

  • Velero: Reuse GA S3 backup path.
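The backup-path reuse above can be wired in at deploy time. This is a sketch only: the release name, chart path, and value keys below are illustrative assumptions, not the actual Obsrv 1.0 Helm chart schema, so confirm the real key names in the chart's values file before running it.

```shell
# Point Obsrv 1.0 at the existing GA backup locations in S3.
# Value keys are assumptions -- verify against the 1.0 chart's values.yaml.
helm upgrade obsrv ./obsrv-1.0-chart \
  --set druid.deepStorage.s3Path="s3://<ga-backup-bucket>/druid" \
  --set secor.backup.s3Path="s3://<ga-backup-bucket>/secor" \
  --set velero.backupLocation.s3Path="s3://<ga-backup-bucket>/velero"
```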


3. Disable Ingestion in Obsrv 1.0

  • Temporarily disable all ingestion pipelines to prevent data writes during migration.
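Disabling ingestion typically means scaling the pipeline workloads to zero. A minimal sketch, assuming the deployment name `ingestion-pipeline`, the label `app=ingestion-pipeline`, and the namespace `obsrv` (all placeholders for the actual workloads in your cluster):

```shell
# Stop all ingestion writes in the 1.0 cluster during migration.
kubectl scale deployment ingestion-pipeline --replicas=0 -n obsrv

# Verify no pipeline pods are still running (and consuming).
kubectl get pods -n obsrv -l app=ingestion-pipeline
```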


4. Decommission Obsrv GA Ingestion

a. Infosys Transformer

  • Scale down the transformer job.

  • Record current consumer group offsets.

b. Connectors

  • Scale down GA connectors: Kafka, Debezium, and Neo4j.

  • Capture the consumer group offsets per partition for each connector.
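Offsets are usually captured with Kafka's `kafka-consumer-groups.sh` tool. The snippet below shows the live command as a comment, then parses a hard-coded sample of its `--describe` output (group name, topic, and offset values are illustrative) into `topic,partition,current-offset` records suitable for the runbook log:

```shell
# Live capture would be:
#   kafka-consumer-groups.sh --bootstrap-server "$BROKER" \
#     --describe --group ga-transformer > offsets-ga-transformer.txt
# Here we parse a sample of that output instead (values are made up).
describe_output='GROUP          TOPIC     PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG
ga-transformer telemetry 0         1042           1042           0
ga-transformer telemetry 1         998            998            0'

# Keep topic,partition,current-offset triples; skip the header row.
offsets=$(echo "$describe_output" | awk 'NR>1 {print $2","$3","$4}')
echo "$offsets"
```

Record one such triple per partition per connector before decommissioning; these values drive the offset reset in step 10.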


5. Resolve Pipeline Backlogs

  • Ensure all pipelines in GA are fully drained.

  • Specifically, clear:

    • Secor lags

    • Druid ingestion lags

  • Temporarily scale up infrastructure, if needed, to clear lags faster.


6. Trigger Final Backups

  • Run manual backups for Postgres and Redis after pipeline clearance.

  • Confirm successful sync to S3.
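The final backups can be triggered with the standard Postgres and Redis tooling. Hostnames, credentials, database name, and bucket paths below are placeholders:

```shell
# Postgres: take a final custom-format dump and push it to S3.
pg_dump -h "$PG_HOST" -U "$PG_USER" -d obsrv -F c -f obsrv-final.dump
aws s3 cp obsrv-final.dump "s3://<ga-backup-bucket>/postgres/obsrv-final.dump"

# Redis: trigger a background snapshot, then confirm it completed.
redis-cli -h "$REDIS_HOST" BGSAVE
redis-cli -h "$REDIS_HOST" LASTSAVE   # timestamp should advance once BGSAVE finishes
```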


7. Restore & Migrate Core Data

a. Postgres

  • Restore and migrate GA data using the official table migration script.

b. Redis

  • Restore Redis snapshot and convert to the 1.0-compatible format.

c. Schema Transformation

  • Execute the schema migration automation to convert all legacy table formats.

🔄 Post-Migration Adjustments

  • Update Redis path references to Valkey.

  • Remove the <env> prefix from topic values in the dataset_config column.
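The prefix removal amounts to stripping the environment name from each stored topic value. A minimal illustration of the transformation, assuming the environment prefix is `dev.` and a sample topic value (the actual update would be applied across the dataset_config column, e.g. via a SQL UPDATE):

```shell
# Sample topic value as stored in dataset_config (illustrative).
topic="dev.telemetry-events"

# Strip the environment prefix, leaving the bare topic name.
cleaned="${topic#dev.}"
echo "$cleaned"   # telemetry-events
```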


8. Validate Migrated Data

  • Verify dataset visibility via the Obsrv 1.0 console.

  • Confirm API-level queryability.

  • Ensure all Druid supervisors are healthy and running.
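Supervisor health can be checked via Druid's Overlord API rather than the console. The router host is a placeholder; `?state=true` is a standard option that returns each supervisor's id, state, and health flag:

```shell
# List all supervisors with state and health; every entry should show
# "state": "RUNNING" and "healthy": true.
curl -s "http://$DRUID_ROUTER:8888/druid/indexer/v1/supervisor?state=true"
```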


9. Reinitialize Supervisors

  • Re-run the ingestion Helm task to create the system events datasource.

  • Suspend and resume all supervisors through the Druid console.
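If scripting is preferred over the Druid console, the same suspend/resume cycle is available through the Overlord API. `<supervisor-id>` is a placeholder for each datasource's supervisor:

```shell
# Suspend, then resume, a supervisor to force it to pick up fresh state.
curl -s -X POST "http://$DRUID_ROUTER:8888/druid/indexer/v1/supervisor/<supervisor-id>/suspend"
curl -s -X POST "http://$DRUID_ROUTER:8888/druid/indexer/v1/supervisor/<supervisor-id>/resume"
```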


10. Reconfigure Connectors in Obsrv 1.0

Update each connector to resume from the last committed GA offset:

  • Kafka, Neo4j, Debezium

  • Infosys Transformer (via automation script)

Update the corresponding consumer group details in the 1.0 configuration.
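Resuming from the last committed GA offset can be done with Kafka's offset-reset tooling, one partition at a time. The group, topic, partition, and offset values below are placeholders; substitute the per-partition offsets recorded during GA decommissioning (step 4):

```shell
# Reset the 1.0 consumer group to the recorded GA offset for one partition.
kafka-consumer-groups.sh --bootstrap-server "$BROKER" \
  --group obsrv-1-0-kafka-connector \
  --topic telemetry:0 \
  --reset-offsets --to-offset 1042 \
  --execute
```

Repeat for each topic/partition pair and each connector's consumer group before re-enabling ingestion.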


11. Republish Datasets

  • Republish all datasets, adhering to the 1.0 schema structure.


12. Scale Up Obsrv 1.0

  • Bring ingestion pipelines and connectors back online.

  • Begin real-time ingestion and processing.


Post-Migration Validation

  • Monitor end-to-end ingestion in 1.0.

  • Validate:

    • Data availability

    • API and query response integrity

    • Consumer group offset correctness

    • System performance and stability


Final Cutover

Once confidence is established:

  • Redirect all downstream consumers to Obsrv 1.0 APIs.

  • Retire remaining Obsrv GA components gracefully.

