Code of Conduct
Search before asking
Describe the bug
In environments with multiple Hive Metastore (HMS) instances backed by different Kerberos principals, Spark engines launched by Kyuubi cannot authenticate against non-default HMS instances. Catalog operations against those HMSes fail with `DIGEST-MD5: IO error acquiring password`.
The root cause is in the engine-side credential push path, `SparkTBinaryFrontendService#addHiveToken`. The current logic assumes a single HMS and selects only tokens whose `getService()` is empty:
val newToken = newTokens
  .find { case (uris, token) =>
    val matched = uris.toString.split(",").exists(uriSet.contains) &&
      token.getService == new Text()
    ...
  }
Any Hive delegation token that has its `service` field populated — the natural way to disambiguate per-HMS tokens that share a single `Credentials` — is silently dropped. As a result, tokens for non-default HMSes never reach the engine UGI even when a (custom) provider has issued them correctly.
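A minimal sketch of a possible fix, assuming the only defect is the `token.getService == new Text()` guard: keep every token whose URI list intersects the engine's HMS URIs, regardless of whether its service field is populated. Tokens are modeled here with plain `String`s instead of Hadoop's `Token`/`Text` so the selection logic can be shown standalone; `DelegationToken` and `pickHiveTokens` are illustrative names, not Kyuubi APIs.

```scala
// Illustrative model: service == "" represents the default (unnamed) HMS token.
case class DelegationToken(service: String)

// Select all tokens whose comma-separated URI list overlaps the engine's
// configured HMS URIs. Unlike the current code, a populated service field
// does NOT disqualify a token, so per-HMS tokens survive the push.
def pickHiveTokens(
    newTokens: Map[String, DelegationToken], // key: comma-separated HMS URIs
    uriSet: Set[String]): Seq[DelegationToken] =
  newTokens.collect {
    case (uris, token) if uris.split(",").exists(uriSet.contains) => token
  }.toSeq
```

With this shape, both the default token (empty service) and signature-tagged tokens for HMS A/B reach the engine UGI whenever their URIs match.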
Affects Version(s)
1.11.1
Kyuubi Server Log Output
Kyuubi Engine Log Output
Authentication fails on the engine side because the token for HMS B never reached the engine UGI:
Caused by: MetaException(message:Could not connect to meta store using any of the URIs provided.
Most recent failure: org.apache.thrift.transport.TTransportException:
Peer indicated failure: DIGEST-MD5: IO error acquiring password
at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(...)
at org.apache.thrift.transport.TSaslTransport.open(...)
at org.apache.thrift.transport.TSaslClientTransport.open(...)
at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(...)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(...)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(...)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(...)
Kyuubi Server Configurations
Kyuubi Engine Configurations
A Spark session configured with **two Iceberg catalogs, each backed by a separate HMS with different Kerberos principals** — a common setup for domain-partitioned warehouses:
# Catalog A — HMS A
spark.sql.catalog.warehouse_a org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.warehouse_a.type hive
spark.sql.catalog.warehouse_a.uri thrift://hms-a.example.com:9083
spark.sql.catalog.warehouse_a.warehouse s3a://warehouse-a/warehouse
spark.sql.catalog.warehouse_a.io-impl org.apache.iceberg.aws.s3.S3FileIO
spark.sql.catalog.warehouse_a.hadoop.hive.metastore.uris thrift://hms-a.example.com:9083
spark.sql.catalog.warehouse_a.hadoop.hive.metastore.sasl.enabled true
spark.sql.catalog.warehouse_a.hadoop.hive.metastore.kerberos.principal hive-a/_HOST@EXAMPLE.COM
# Tells the HMS client which delegation token to pick from UGI credentials.
# Must match the alias used when the token was added to Credentials (= instance name).
spark.sql.catalog.warehouse_a.hadoop.hive.metastore.token.signature warehouse_a
# Catalog B — HMS B (different Kerberos principal)
spark.sql.catalog.warehouse_b org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.warehouse_b.type hive
spark.sql.catalog.warehouse_b.uri thrift://hms-b.example.com:9083
spark.sql.catalog.warehouse_b.warehouse s3a://warehouse-b/warehouse
spark.sql.catalog.warehouse_b.io-impl org.apache.iceberg.aws.s3.S3FileIO
spark.sql.catalog.warehouse_b.hadoop.hive.metastore.uris thrift://hms-b.example.com:9083
spark.sql.catalog.warehouse_b.hadoop.hive.metastore.sasl.enabled true
spark.sql.catalog.warehouse_b.hadoop.hive.metastore.kerberos.principal hive-b/_HOST@EXAMPLE.COM
spark.sql.catalog.warehouse_b.hadoop.hive.metastore.token.signature warehouse_b
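The configuration above hinges on the alias-equals-signature convention noted in the comments: when `hive.metastore.token.signature` is set, the HMS client picks the delegation token from the UGI credentials whose alias matches the signature. A sketch of that contract, with `Credentials` modeled as a plain `Map` (real code uses `org.apache.hadoop.security.Credentials` keyed by `org.apache.hadoop.io.Text`); `HiveToken`, `selectTokenBySignature`, and `issueFor` are hypothetical names for illustration only.

```scala
// Illustrative token model; the service field disambiguates per-HMS tokens.
case class HiveToken(service: String)

// What the HMS client conceptually does when token.signature is configured:
// look up the token under the alias equal to the signature.
def selectTokenBySignature(
    credentials: Map[String, HiveToken], // alias -> token
    tokenSignature: String): Option[HiveToken] =
  credentials.get(tokenSignature)

// What a per-HMS token provider must therefore do: add each token under the
// catalog's alias AND populate its service field to match, so the lookup
// above succeeds for every catalog.
def issueFor(credentials: Map[String, HiveToken], catalog: String): Map[String, HiveToken] =
  credentials + (catalog -> HiveToken(service = catalog))
```

This is exactly why the `getService == new Text()` filter breaks the setup: a correctly issued `warehouse_b` token necessarily has a non-empty service field, and is therefore dropped before it ever reaches the engine UGI.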
Additional context
Are you willing to submit PR?