AWS Installation Guide

Installation Steps

1. Clone the Obsrv Automation Repository

Start by cloning the Obsrv automation repository and check out either the latest release tag or the main branch.

git clone git@github.com:Sanketika-Obsrv/obsrv-automation.git
cd obsrv-automation
git checkout <latest_release_tag>   # or: git checkout main

2. Configure the Kubernetes Cluster

Execute the following steps to bring up the Kubernetes cluster in the configured region of your AWS environment.

  1. Navigate to the Configuration Directory:

    cd ./obsrv-automation/terraform/aws/vars
  2. Update Configuration Files:

    • Open cluster_overrides.tf and modify the configuration values to match your environment.

    building_block = "obsrv"
    env = "dev"
    region = "us-east-2"
    availability_zones = ["us-east-2a", "us-east-2b", "us-east-2c"]
    timezone = "UTC"
    create_kong_ingress_ip = "false"  # Set to "true" if Kong service type is LoadBalancer, otherwise set to "false" for NodePort.
    create_vpc = "false"
    create_velero_user = "false"
    eks_node_group_instance_type = ["t2.xlarge"] # Choose an instance type based on your CPU and memory requirements
    eks_node_group_capacity_type = "ON_DEMAND"
    eks_node_group_scaling_config = { desired_size = 5, max_size = 5, min_size = 1 } # Adjust the scaling configuration based on your expected load
    eks_node_disk_size = 100
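
    If you need to confirm the availability zones for your chosen region, they can be listed with the AWS CLI, for example:

    aws ec2 describe-availability-zones --region us-east-2 --query "AvailabilityZones[].ZoneName"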
    
  3. Configure S3 for Cluster State:

    • Open obsrv.conf in the obsrv-automation/infra-setup directory and update your AWS credentials and bucket names.

    💡 Note: If the EC2 instance is configured with an assumed identity (IAM role), there is no need to define the AWS credentials.

    AWS_ACCESS_KEY_ID=<your_access_key_id>
    AWS_SECRET_ACCESS_KEY=<your_secret_access_key>
    AWS_DEFAULT_REGION="us-east-2"
    KUBE_CONFIG_PATH="$HOME/.kube/obsrv-kube-config.yaml"
    AWS_TERRAFORM_BACKEND_BUCKET_NAME="obsrv-tfstate"
    AWS_TERRAFORM_BACKEND_BUCKET_REGION="us-east-2"
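
    If the Terraform backend bucket named above does not already exist in your account, it can be created with the AWS CLI (the bucket name and region below are the example values from obsrv.conf):

    aws s3 mb s3://obsrv-tfstate --region us-east-2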

3. Run the Installation Script

  1. Make the Script Executable: The file is located in the obsrv-automation/infra-setup directory.

  2. Run the Installation:

    • To start the installation, run the script (a sketch is shown after this list).

    • If you want the installer to automatically handle dependencies, set install_dependencies=true.
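
A minimal sketch of these two steps, assuming the installation script in infra-setup is named setup.sh (check the directory for the actual file name):

cd obsrv-automation/infra-setup
chmod +x setup.sh                  # make the script executable (setup.sh is an assumed name)
export install_dependencies=true   # optional: let the installer handle dependencies automatically
./setup.sh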

4. Verify the Cluster

Once the installation completes, verify that your Kubernetes cluster is up and running:
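
For example, using the kubeconfig path configured earlier in obsrv.conf:

export KUBECONFIG="$HOME/.kube/obsrv-kube-config.yaml"
kubectl get nodes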

The result of the above command should show the nodes in your Kubernetes cluster.


Helm Chart Configuration

1. Navigate to the Helm Chart Directory

2. Update AWS Cloud Configuration

Modify global-cloud-values-aws.yaml with the appropriate values for your environment:

3. Configure Domain

Update the global-values.yaml file, replacing <domain> with your actual domain, Elastic IP, or cluster node IP and port, depending on your Kong service type:

  • LoadBalancer: If Kong's service type is LoadBalancer, retrieve the Elastic IP from the AWS console and use the following format: Domain: <eip>.sslip.io

  • NodePort: If Kong's service type is NodePort, use the external IP of the cluster node where Kong is deployed, along with Kong's NodePort, in this format: Domain: <Cluster node external IP>:<node-port-of-kong>
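
Kong's service can be inspected to find these values; for the NodePort case, the node port appears in the PORT(S) column:

kubectl get svc -n kong-ingress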

Important: Security Group Update (NodePort Only)

This step is only required if Kong's service type is NodePort. You must update the security group's inbound rules.

Instructions for Modifying Security Group Inbound Rules:

  1. Add a new inbound rule.

  2. Set the Type to "Custom TCP."

  3. For Port Range, specify the port used by Kong. Retrieve this port by running: kubectl get svc -n kong-ingress. This command will show you Kong's NodePort.

  4. To restrict access, set the Source to your organization's IP address in /32 CIDR format. Ensure the Port Range matches Kong's NodePort.

  5. Additionally, set the Source to the external IPs of all nodes in your EKS cluster (also in /32 CIDR format). Again, the Port Range must be Kong's NodePort. This ensures proper cluster access restriction.
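
For reference, an equivalent inbound rule can also be added with the AWS CLI (the security group ID, port, and CIDR below are placeholders):

aws ec2 authorize-security-group-ingress \
  --group-id <node-security-group-id> \
  --protocol tcp \
  --port <kong-nodeport> \
  --cidr <your-org-ip>/32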

4. Clone the Obsrv Client Automation Repository

Start by cloning the Obsrv client automation repository and check out either the latest release tag or the main branch.

5. Install Obsrv

Make the script executable, set the required environment variables, and run the installation. The enterprise.sh file is located in the /obsrv-scripts-infy/automation-v2/enterprise-automation/kitchen/ directory.
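
A minimal sketch of this step (the environment variables required by enterprise.sh depend on your deployment and are not listed here):

cd /obsrv-scripts-infy/automation-v2/enterprise-automation/kitchen/
chmod +x enterprise.sh
# export the environment variables required for your deployment, then run:
./enterprise.sh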


Post-Installation Verification

After completing the installation, follow these steps to verify that all components are running correctly:

1. Check Kubernetes Components

  1. Verify all pods are running:

    All pods should be in Running state (see the commands after this list). Common namespaces to check:

    • flink: Core Pipeline

    • monitoring: Monitoring stack

    • dataset-api: Dataset APIs

    • web-console: Dataset Management console

  2. Check Services:

    Verify that essential services have external IPs assigned, particularly the Kong service.

If any component fails these checks, refer to the component-specific logs.
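
A minimal set of verification commands (the namespace and pod name in the last command are placeholders):

# list all pods across namespaces and confirm they are in Running state
kubectl get pods -A

# check that essential services, particularly Kong, have external IPs or node ports assigned
kubectl get svc -n kong-ingress

# inspect the logs of a specific component if a check fails
kubectl logs -n <namespace> <pod-name>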


By following these steps, you will ensure a successful installation and configuration of Obsrv on AWS.

Sanity Checklist

After installation, perform a sanity test to validate the deployment. Mark each check item as ✔ or ✘.

Ingestion

  • All ingestion connectors running with the expected replicas (✔/✘)

  • Data flowing from all expected upstream sources (✔/✘)

  • No ingestion backlog in Kafka topics (✔/✘)

  • Schema validation passing for incoming messages (✔/✘)

  • No ingestion error messages in the connector pod logs (✔/✘)

  • Resource configurations are correct for the environment and load (✔/✘)

Processing

  • The unified pipeline, cache-indexer, and lakehouse-connector jobs in RUNNING state with the expected replica configurations (✔/✘)

  • Checkpointing active and stable (✔/✘)

  • 0% failed events (no schema-validation or duplicate-event failures) and no elevated lag (✔/✘)

  • Kafka partition counts and Flink job configurations are correct for the load and environment (✔/✘)

  • No errors in the pod logs (✔/✘)

Querying

  • Druid ingestion tasks running and segments published (✔/✘)

  • Hudi datasets up to date and queryable (✔/✘)

  • Query APIs responding within acceptable latency (✔/✘)

  • Able to query real-time and historical data from both Hudi and Druid (✔/✘)

  • Spot checks return correct and fresh data (✔/✘)

Storage

  • Velero backups completed successfully (✔/✘)

  • Kafka/Druid/Hudi backups available (✔/✘)

  • Secor backup service running and healthy (✔/✘)

  • Secor backup files for dataset events available in blob storage (✔/✘)

  • No errors or elevated lag in the Secor service (✔/✘)

  • Restore test performed in staging (optional) (✔/✘)

Monitoring

  • All key metrics collected (Kafka, Flink, Druid, Hudi, APIs) (✔/✘)

  • Grafana dashboards rendering without gaps (✔/✘)

  • No abnormal spikes in error rates, latency, or usage (✔/✘)

Alerts

  • All alerting rules enabled and targeting the correct channels (✔/✘)

  • Test alerts sent and acknowledged (✔/✘)

  • Critical alert thresholds correctly configured (✔/✘)

Management Console

  • Management console is accessible (✔/✘)

  • All datasets are healthy (✔/✘)

  • CPU, memory, and volume usage are not abnormal (✔/✘)

  • All service pods in Running state with the expected number of restarts (✔/✘)

Final

  • End-to-end data flow verified (Ingestion → Processing → Storage → Query) (✔/✘)
