
[Bug] SparkTBinaryFrontendService#addHiveToken silently drops Hive delegation tokens with non-empty service field (multi-HMS scenario) #7433

@Sunwoo-Shin

Description


Code of Conduct

Search before asking

  • I have searched in the issues and found no similar issues.

Describe the bug

In environments with multiple Hive Metastore (HMS) instances backed by different Kerberos principals, Spark engines launched by Kyuubi cannot authenticate against non-default HMS instances. Catalog operations against those HMSes fail with `DIGEST-MD5: IO error acquiring password`.

The root cause is in the engine-side credential push path, SparkTBinaryFrontendService#addHiveToken. The current logic assumes a single HMS and selects only tokens whose getService() is empty:

```scala
val newToken = newTokens
  .find { case (uris, token) =>
    val matched = uris.toString.split(",").exists(uriSet.contains) &&
      token.getService == new Text()
    ...
  }
```

Any Hive delegation token that has its service field populated — the natural way to disambiguate per-HMS tokens that share a single Credentials — is silently dropped. As a result, tokens for non-default HMSes never reach the engine UGI even when a (custom) provider has issued them correctly.
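To make the flaw concrete, here is a minimal sketch of the selection logic using plain Scala types in place of Hadoop's `Credentials`/`Text` (a simplification for illustration; `HiveToken` and `selectTokens` are hypothetical names, not Kyuubi APIs). It shows that once the empty-service check is dropped and tokens are matched on URI overlap alone, a token carrying a non-empty service field is no longer discarded:

```scala
// Simplified model of the token selection in addHiveToken. The shipped
// code additionally requires token.getService == new Text(), i.e. an
// empty service; this version keeps every token whose URIs intersect
// the engine's configured HMS URIs, regardless of the service field.
case class HiveToken(uris: String, service: String)

def selectTokens(newTokens: Seq[HiveToken], engineUris: Set[String]): Seq[HiveToken] =
  newTokens.filter(t => t.uris.split(",").exists(engineUris.contains))

val tokens = Seq(
  HiveToken("thrift://hms-a.example.com:9083", ""),           // default HMS, empty service
  HiveToken("thrift://hms-b.example.com:9083", "warehouse_b") // non-default HMS, named service
)

// Without the empty-service check, the HMS B token survives selection.
val selected = selectTokens(tokens, Set("thrift://hms-b.example.com:9083"))
```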

Affects Version(s)

1.11.1

Kyuubi Server Log Output

Kyuubi Engine Log Output

Authentication fails on the engine side because the token for HMS B never reached the engine UGI:


```
Caused by: MetaException(message:Could not connect to meta store using any of the URIs provided.
  Most recent failure: org.apache.thrift.transport.TTransportException:
  Peer indicated failure: DIGEST-MD5: IO error acquiring password
    at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(...)
    at org.apache.thrift.transport.TSaslTransport.open(...)
    at org.apache.thrift.transport.TSaslClientTransport.open(...)
    at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(...)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(...)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(...)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(...)
```

Kyuubi Server Configurations

Kyuubi Engine Configurations

A Spark session configured with **two Iceberg catalogs, each backed by a separate HMS with different Kerberos principals** — a common setup for domain-partitioned warehouses:


```properties
# Catalog A — HMS A
spark.sql.catalog.warehouse_a                                          org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.warehouse_a.type                                     hive
spark.sql.catalog.warehouse_a.uri                                      thrift://hms-a.example.com:9083
spark.sql.catalog.warehouse_a.warehouse                                s3a://warehouse-a/warehouse
spark.sql.catalog.warehouse_a.io-impl                                  org.apache.iceberg.aws.s3.S3FileIO
spark.sql.catalog.warehouse_a.hadoop.hive.metastore.uris               thrift://hms-a.example.com:9083
spark.sql.catalog.warehouse_a.hadoop.hive.metastore.sasl.enabled       true
spark.sql.catalog.warehouse_a.hadoop.hive.metastore.kerberos.principal hive-a/_HOST@EXAMPLE.COM
# Tells the HMS client which delegation token to pick from UGI credentials.
# Must match the alias used when the token was added to Credentials (= instance name).
spark.sql.catalog.warehouse_a.hadoop.hive.metastore.token.signature    warehouse_a

# Catalog B — HMS B (different Kerberos principal)
spark.sql.catalog.warehouse_b                                          org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.warehouse_b.type                                     hive
spark.sql.catalog.warehouse_b.uri                                      thrift://hms-b.example.com:9083
spark.sql.catalog.warehouse_b.warehouse                                s3a://warehouse-b/warehouse
spark.sql.catalog.warehouse_b.io-impl                                  org.apache.iceberg.aws.s3.S3FileIO
spark.sql.catalog.warehouse_b.hadoop.hive.metastore.uris               thrift://hms-b.example.com:9083
spark.sql.catalog.warehouse_b.hadoop.hive.metastore.sasl.enabled       true
spark.sql.catalog.warehouse_b.hadoop.hive.metastore.kerberos.principal hive-b/_HOST@EXAMPLE.COM
spark.sql.catalog.warehouse_b.hadoop.hive.metastore.token.signature    warehouse_b
```
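The `hive.metastore.token.signature` settings above only work if a matching token actually reaches the engine's credentials: at connect time the HMS client looks up a delegation token keyed by that signature. The sketch below models that lookup with a plain `Map` rather than Hadoop's `Credentials` (a simplification; the alias-to-token entries and `tokenForSignature` are illustrative assumptions, not Hive APIs):

```scala
// Model of the per-signature token lookup: each catalog's
// token.signature value ("warehouse_a" / "warehouse_b", from the
// configs above) selects one token from the engine UGI's credentials.
val ugiCredentials = Map(
  "warehouse_a" -> "delegation-token-for-hms-a",
  "warehouse_b" -> "delegation-token-for-hms-b"
)

// Returns None when no token was pushed under that signature — the
// failure mode this issue describes: addHiveToken drops the HMS B
// token, so the lookup for "warehouse_b" would find nothing.
def tokenForSignature(signature: String): Option[String] =
  ugiCredentials.get(signature)
```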

Additional context

Are you willing to submit PR?

  • Yes. I would be willing to submit a PR with guidance from the Kyuubi community to fix.
  • No. I cannot submit a PR at this time.
