
Configure Lakehouse Store

This guide explains how to configure Apache Hudi lakehouse storage for your dataset.

Lakehouse storage is configurable but disabled by default. It must be enabled in both the web console and the dataset API service.

Follow these steps:

  1. Update the management-console/web-console Helm chart’s values.yaml file by setting lake_house: true under the STORAGE_TYPES configuration:

    STORAGE_TYPES: '{"lake_house":true,"realtime_store":true}'
  2. Update the dataset-api Helm chart’s values.yaml file by setting lake_house: true under the storage_types configuration:

    storage_types: '{"lake_house":true,"realtime_store":true}'
  3. Upgrade both the management-console/web-console and dataset-api Helm charts for the changes to take effect.

  4. Open the Obsrv console and configure the lakehouse store for your dataset.
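
Step 3 can be sketched as a small script. This is a minimal sketch, not an official procedure: the release names, chart paths, and the `obsrv` namespace are assumptions, so replace them with the values from your own deployment. By default it only prints the `helm upgrade` commands for review; set `DRY_RUN=0` to execute them.

```shell
#!/usr/bin/env bash
# Sketch only: every name below is an assumption; adjust to your deployment.
NAMESPACE="${NAMESPACE:-obsrv}"                            # assumed namespace
WEB_CONSOLE_RELEASE="${WEB_CONSOLE_RELEASE:-web-console}"  # assumed release name
WEB_CONSOLE_CHART="${WEB_CONSOLE_CHART:-./web-console}"    # assumed chart path
DATASET_API_RELEASE="${DATASET_API_RELEASE:-dataset-api}"  # assumed release name
DATASET_API_CHART="${DATASET_API_CHART:-./dataset-api}"    # assumed chart path
DRY_RUN="${DRY_RUN:-1}"                                    # 1 = print only, 0 = execute

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Re-deploy each chart with its edited values.yaml.
run helm upgrade "$WEB_CONSOLE_RELEASE" "$WEB_CONSOLE_CHART" \
  --namespace "$NAMESPACE" --values "$WEB_CONSOLE_CHART/values.yaml"
run helm upgrade "$DATASET_API_RELEASE" "$DATASET_API_CHART" \
  --namespace "$NAMESPACE" --values "$DATASET_API_CHART/values.yaml"
```

Printing the commands first makes it easy to confirm that each release points at the chart whose values.yaml you edited in steps 1 and 2.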


How to configure the lakehouse store for your dataset?

Follow this video to learn how to configure lakehouse storage for your dataset:

Watch: Configure Lakehouse Storage for a Dataset (Google Drive)


You can query the data lake in two ways:

1. Query through Superset:

Watch: Query Data Lake via Superset (Google Drive)

2. Query using the Trino API:

Watch: Query Data Lake via Trino API (Google Drive)

The same request with curl, where {{datasource}} is a placeholder for your datasource name:

curl --location 'http://localhost:8080/v1/statement' \
--header 'X-Trino-User: trino' \
--header 'X-Trino-Catalog: lakehouse' \
--header 'X-Trino-Schema: hms' \
--header 'Content-Type: text/plain' \
--data 'SELECT * FROM {{datasource}}'
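
The Trino statement endpoint usually does not return the full result in its first response. Each response may carry a `nextUri` field, and the client is expected to keep requesting that URI until it is absent. The following is a minimal sketch of that loop, assuming `curl` and `jq` are installed and reusing the host, user, catalog, and schema from the request above. The `trino_query` function name and the `TRINO_URL` variable are hypothetical and exist only for illustration.

```shell
#!/usr/bin/env bash
# Sketch: submit a statement to Trino and follow nextUri until results end.
# Assumes curl and jq are available; TRINO_URL is an illustrative variable.
TRINO_URL="${TRINO_URL:-http://localhost:8080}"

trino_query() {
  local sql="$1" response next_uri

  # The initial POST submits the SQL text.
  response="$(curl --silent --location "${TRINO_URL}/v1/statement" \
    --header 'X-Trino-User: trino' \
    --header 'X-Trino-Catalog: lakehouse' \
    --header 'X-Trino-Schema: hms' \
    --header 'Content-Type: text/plain' \
    --data "$sql")"

  while :; do
    # A response containing an "error" object means the query failed.
    if [ "$(printf '%s' "$response" | jq -r 'has("error")')" = "true" ]; then
      printf '%s' "$response" | jq -c '.error' >&2
      return 1
    fi

    # Print any rows in this page, one JSON array per line.
    printf '%s' "$response" | jq -c '.data[]?'

    # Stop once the server no longer returns a nextUri.
    next_uri="$(printf '%s' "$response" | jq -r '.nextUri // empty')"
    [ -z "$next_uri" ] && break

    # Fetch the next page with a GET on nextUri.
    response="$(curl --silent --header 'X-Trino-User: trino' "$next_uri")"
  done
}

# Usage (my_datasource is a placeholder for your datasource name):
# trino_query 'SELECT * FROM my_datasource LIMIT 10'
```

Adding a `LIMIT` while exploring keeps the number of pages small, since `SELECT *` on a large table can produce many of them.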