Configure Lakehouse Store
This guide explains how to configure Apache Hudi lakehouse data storage for your dataset.
Enable the Lakehouse Store
Lakehouse storage is configurable but disabled by default. It must be enabled in both the web console and the dataset API service.
Follow these steps:
- Update the management-console/web-console Helm chart’s values.yaml file by setting lake_house: true under the STORAGE_TYPES configuration:
  STORAGE_TYPES: '{"lake_house":true,"realtime_store":true}'
- Update the dataset-api Helm chart’s values.yaml file by setting lake_house: true under the storage_types configuration:
  storage_types: '{"lake_house":true,"realtime_store":true}'
- Upgrade both the management-console/web-console and dataset-api Helm charts for the changes to take effect.
- Access the Obsrv console and configure the lakehouse store for your dataset.
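Both STORAGE_TYPES and storage_types are JSON strings embedded in YAML, so a quoting or spelling slip is easy to miss. The following is a minimal sketch, not part of Obsrv itself, that builds the value in the compact form shown above and checks that a given string enables the lakehouse store. The helper names storage_types_value and lakehouse_enabled are hypothetical.

```python
import json


def storage_types_value(lake_house: bool = True, realtime_store: bool = True) -> str:
    """Build the compact JSON string used for STORAGE_TYPES / storage_types."""
    flags = {"lake_house": lake_house, "realtime_store": realtime_store}
    # Compact separators reproduce the spacing used in the values.yaml examples.
    return json.dumps(flags, separators=(",", ":"))


def lakehouse_enabled(raw: str) -> bool:
    """Return True only if raw is a JSON object with lake_house set to true."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return isinstance(parsed, dict) and parsed.get("lake_house") is True
```

Running lakehouse_enabled on the string copied from each values.yaml before upgrading the charts is one way to catch a malformed value early.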
How to configure the lakehouse store for your dataset?
Watch this video to learn how to configure lakehouse storage for your dataset:
Watch: Configure Lakehouse Storage for a Dataset (Google Drive)
How to query the data lake?
You can query the data lake in two ways:
1. Query through Superset:
Watch: Query Data Lake via Superset (Google Drive)
2. Query using Trino API:
Watch: Query Data Lake via Trino API (Google Drive)
curl --location 'http://localhost:8080/v1/statement' \
  --header 'X-Trino-User: trino' \
  --header 'X-Trino-Catalog: lakehouse' \
  --header 'X-Trino-Schema: hms' \
  --header 'Content-Type: text/plain' \
  --data 'SELECT * FROM {{datasource}}'
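The curl request above only submits the query. In Trino's REST protocol the response is a JSON document that may carry a nextUri field, and the client is expected to keep requesting that URI until it is absent, collecting the data rows from each page. The sketch below illustrates that loop using only the Python standard library. The URL and headers are taken from the curl example; the names run_query and http_fetch are hypothetical, and no authentication is assumed.

```python
import json
import urllib.request
from typing import Any, Callable, Dict, Iterator, List, Optional

# Endpoint and headers copied from the curl example; adjust for your deployment.
TRINO_URL = "http://localhost:8080/v1/statement"
HEADERS = {
    "X-Trino-User": "trino",
    "X-Trino-Catalog": "lakehouse",
    "X-Trino-Schema": "hms",
    "Content-Type": "text/plain",
}

Fetch = Callable[[str, Optional[str]], Dict[str, Any]]


def http_fetch(url: str, body: Optional[str] = None) -> Dict[str, Any]:
    """POST the SQL text when a body is given, otherwise GET the next page."""
    data = body.encode("utf-8") if body is not None else None
    method = "POST" if body is not None else "GET"
    request = urllib.request.Request(url, data=data, headers=HEADERS, method=method)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def run_query(sql: str, fetch: Fetch = http_fetch) -> Iterator[List[Any]]:
    """Submit sql, then follow nextUri until the query finishes, yielding rows."""
    page = fetch(TRINO_URL, sql)
    while True:
        if "error" in page:
            raise RuntimeError(page["error"].get("message", "query failed"))
        # A page may have no "data" key while the query is still queued or running.
        yield from page.get("data", [])
        next_uri = page.get("nextUri")
        if next_uri is None:
            return
        page = fetch(next_uri, None)
```

For example, iterating over run_query("SELECT * FROM my_datasource"), where my_datasource stands in for your own datasource name, yields one list per result row. The fetch parameter is there so the paging logic can be exercised without a running Trino server.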