Hive Connector
Arneb connects to Apache Hive Metastore (HMS) to discover and query tables managed by Hive catalogs. The connector supports HMS 4.x via an async Thrift client.
Configuration
[[catalogs]]
name = "datalake"
type = "hive"
metastore_uri = "127.0.0.1:9083"
default_schema = "default"
[catalogs.storage.s3]
region = "us-east-1"
endpoint = "http://localhost:9000"
allow_http = true
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | yes | Catalog name used in 3-part table references (catalog.schema.table) |
| type | string | yes | Must be "hive" |
| metastore_uri | string | yes | host:port of the Hive Metastore Thrift service (no scheme) |
| default_schema | string | no | Default schema to use when schema is not specified |
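The field rules in the table can be expressed as a small check. The following is a minimal sketch, not Arneb's own validation logic: the function name `check_hive_catalog` and the error messages are hypothetical, and the catalog is given as an already-parsed dictionary rather than read from a TOML file.

```python
REQUIRED_FIELDS = ("name", "type", "metastore_uri")


def check_hive_catalog(catalog: dict) -> list:
    """Return a list of problems in one [[catalogs]] entry (empty if none)."""
    problems = [f"missing required field: {field}"
                for field in REQUIRED_FIELDS if field not in catalog]
    if "type" in catalog and catalog["type"] != "hive":
        problems.append('type must be "hive"')
    uri = catalog.get("metastore_uri")
    if uri is not None:
        if "://" in uri:
            # The table asks for host:port with no scheme such as thrift://
            problems.append("metastore_uri must not include a scheme")
        else:
            host, sep, port = uri.rpartition(":")
            if not (sep and host and port.isdigit()):
                problems.append("metastore_uri must be host:port")
    return problems


# Mirrors the [[catalogs]] entry shown above, as a parsed dictionary.
catalog = {
    "name": "datalake",
    "type": "hive",
    "metastore_uri": "127.0.0.1:9083",
    "default_schema": "default",
}
print(check_hive_catalog(catalog))

# Hypothetical bad entry: the URI carries a scheme.
print(check_hive_catalog({"name": "bad", "type": "hive",
                          "metastore_uri": "thrift://127.0.0.1:9083"}))
```

The first call reports no problems; the second flags the scheme in `metastore_uri`. `default_schema` is optional, so the sketch does not check it.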
Three-Part Table References
Tables in Hive catalogs use three-part naming:
SELECT * FROM datalake.demo.cities;
--            ^^^^^^^^ ^^^^ ^^^^^^
--            catalog  schema table
Storage Configuration
Hive tables are typically stored in object stores. Configure storage credentials either globally or per-catalog:
# Global storage (used by all catalogs unless overridden)
[storage.s3]
region = "us-east-1"
# Per-catalog override
[catalogs.storage.s3]
region = "us-east-1"
endpoint = "http://localhost:9000"
allow_http = true
access_key_id = "minioadmin"
secret_access_key = "minioadmin"
Per-catalog settings merge with and override global [storage] settings.
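One way to picture "merge with and override" is a key-by-key merge inside each storage backend table. The sketch below assumes that granularity (it is not confirmed by this page), and both the function name `merge_storage` and the `us-west-2` region are hypothetical, chosen only to make the override visible.

```python
def merge_storage(global_storage: dict, catalog_storage: dict) -> dict:
    """Merge per-catalog storage settings over the global ones.

    Assumption: keys are merged per backend (for example "s3"), and a
    per-catalog value wins when both sides define the same key.
    """
    # Copy the global settings so the inputs are left untouched.
    merged = {backend: dict(settings)
              for backend, settings in global_storage.items()}
    for backend, overrides in catalog_storage.items():
        merged.setdefault(backend, {}).update(overrides)
    return merged


# Equivalent of [storage.s3]
global_storage = {"s3": {"region": "us-east-1"}}

# Equivalent of [catalogs.storage.s3]; the region is a hypothetical override.
catalog_storage = {
    "s3": {
        "region": "us-west-2",
        "endpoint": "http://localhost:9000",
        "allow_http": True,
    }
}

effective = merge_storage(global_storage, catalog_storage)
print(effective["s3"])
```

Under these assumptions the catalog sees the overridden region together with the endpoint and `allow_http` settings, while other catalogs would keep using the global values.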
Local Demo Walkthrough
Arneb includes a Docker Compose setup with HMS 4.2.0 and MinIO for local development.
Prerequisites
- Docker and Docker Compose
- Rust toolchain
Step 1: Start Services
docker compose up -d
This starts:
- MinIO: S3-compatible object store on ports 9000 (API) and 9001 (console)
- Hive Metastore: HMS 4.2.0 on port 9083
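Before moving on, it can help to confirm that the three ports accept connections. This is a minimal sketch using only the Python standard library; it assumes the services are published on 127.0.0.1, and the helper name `port_open` is hypothetical. A successful TCP connect only shows that something is listening, not that the service has finished initializing.

```python
import socket


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Ports listed above for the local Docker Compose setup.
SERVICES = {"MinIO API": 9000, "MinIO console": 9001, "Hive Metastore": 9083}

for name, port in SERVICES.items():
    state = "reachable" if port_open("127.0.0.1", port) else "not reachable"
    print(f"{name} (port {port}): {state}")
```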
Step 2: Seed TPC-H Data
docker compose run --rm tpch-seed
This creates 8 TPC-H tables in the tpch schema on MinIO via Trino CTAS.
Step 3: Start Arneb
cargo run --bin arneb -- --config benchmarks/tpch/tpch-hive.toml
The config (benchmarks/tpch/tpch-hive.toml) connects to the local HMS and MinIO:
bind_address = "127.0.0.1"
port = 5432
[storage.s3]
region = "us-east-1"
endpoint = "http://localhost:9000"
allow_http = true
access_key_id = "minioadmin"
secret_access_key = "minioadmin"
[[catalogs]]
name = "datalake"
type = "hive"
metastore_uri = "127.0.0.1:9083"
default_schema = "tpch"
Step 4: Run Queries
psql -h 127.0.0.1 -p 5432 -c "SELECT COUNT(*) FROM datalake.tpch.nation;"
psql -h 127.0.0.1 -p 5432 -c "SELECT COUNT(*) FROM datalake.tpch.lineitem;"
Step 5: Tear Down
docker compose down
HMS Compatibility
The Hive connector uses auto-generated Thrift bindings from the Hive 4.2.0 IDL. It communicates with HMS using plain TBinaryProtocol (buffered codec), compatible with standard HMS deployments.
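To make "plain TBinaryProtocol" concrete, the sketch below hand-encodes a request in the strict binary protocol layout as generally documented for Apache Thrift: a version-and-type word, the method name as a length-prefixed string, a sequence id, then the argument struct. It is an illustration in Python, not the connector's Rust code. `get_all_databases` is used as an example of an HMS method that takes no arguments; the helper names are hypothetical.

```python
import struct

VERSION_1 = 0x80010000  # strict-mode version marker in the first 32-bit word
CALL = 1                # Thrift message type for a request
T_STOP = b"\x00"        # field-stop byte that terminates a struct


def encode_call_header(method: str, seqid: int) -> bytes:
    """Encode a strict TBinaryProtocol message header for a CALL."""
    name = method.encode("utf-8")
    return (
        struct.pack(">I", VERSION_1 | CALL)      # version | message type
        + struct.pack(">i", len(name)) + name    # method name: i32 length + bytes
        + struct.pack(">i", seqid)               # sequence id
    )


# An argument struct with no fields is just the stop byte.
message = encode_call_header("get_all_databases", 1) + T_STOP
print(len(message), message.hex())
```

With a buffered (unframed) transport, these bytes would be written to the socket as-is, with no extra length prefix in front of the message; a framed transport would add one.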
The Thrift bindings are in the hive-metastore crate. To regenerate after modifying the IDL:
cargo run -p hive-metastore-thrift-build