feat: add schema conversion from avro timestamp-millis #2173

matthias-Q · 2025-07-05T15:05:24Z

Rationale for this change

The schema conversion utility from Avro schema to Iceberg schema ignored the timestamp-millis logical type.

Are these changes tested?

Added tests for timestamp-millis and timestamp-micros, as a test for the latter was missing.

Are there any user-facing changes?

no

matthias-Q · 2025-07-05T15:08:50Z

Some notes:

UUID handling is also not fully in line with the Avro schema specification, but I saw that other pending PRs fix UUID-related issues (Fix UUID support #2007).
Avro schemas have no notion of field-id and element-id. I could add a helper function that adds them. I know this is not the core responsibility of this library. I was using this to create Iceberg tables from Kafka topics whose schemas are stored in the schema registry. I think this is a viable use case, so these helpers would add value.

kevinjqliu

LGTM. Here's the Avro doc for timestamp-millis.
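For context, the Avro specification defines timestamp-millis and timestamp-micros as logical types annotating a `long`, counting milliseconds or microseconds since the Unix epoch, while Iceberg's timestamp type uses microsecond precision. The following is a minimal standalone sketch of that relationship, not the code in this PR; the function name `timestamp_to_micros` and the lookup table are hypothetical and chosen only for illustration.

```python
from typing import Any, Optional

# Hypothetical lookup (an assumption for this sketch): how many microseconds
# one stored unit represents for each Avro timestamp logical type.
_MICROS_PER_UNIT = {"timestamp-millis": 1_000, "timestamp-micros": 1}


def timestamp_to_micros(avro_type: dict[str, Any], value: int) -> Optional[int]:
    """Return `value` as microseconds since the epoch.

    Returns None when `avro_type` is not a long annotated with one of the
    two timestamp logical types handled here.
    """
    if avro_type.get("type") != "long":
        return None
    factor = _MICROS_PER_UNIT.get(avro_type.get("logicalType"))
    return None if factor is None else value * factor


# Avro type declarations as they would appear inside a record field.
millis_type = {"type": "long", "logicalType": "timestamp-millis"}
micros_type = {"type": "long", "logicalType": "timestamp-micros"}
plain_long = {"type": "long"}
```

A plain `long` without a logical type is deliberately left alone, since nothing in the schema says it is a timestamp.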

kevinjqliu · 2025-07-06T20:04:55Z

Avro schemas have no notion of field-id and element-id. I could add a helper function that adds them. I know this is not the core responsibility of this library. I was using this to create Iceberg tables from Kafka topics whose schemas are stored in the schema registry. I think this is a viable use case, so these helpers would add value.

@matthias-Q I'm curious about the specific use case. I think field-id and element-id are already part of the Avro schema.

According to the Iceberg spec (https://iceberg.apache.org/spec/#avro), under Field IDs:

Iceberg struct, list, and map types identify nested types by ID. When writing data to Avro files, these IDs must be stored in the Avro schema to support ID-based column pruning.

Also see:

iceberg-python/tests/avro/test_reader.py

Line 262 in ecc5218

"type": ["null", {"element-id": 133, "type": "array", "items": "long"}],

matthias-Q · 2025-07-07T06:48:41Z

@kevinjqliu Yes, they are part of the Iceberg spec, but not of the Avro specification (see https://avro.apache.org/docs/1.12.0/specification/). They are optional in an Avro schema, but the conversion function requires them.

The specific use case is that I get an Avro schema from a Kafka schema registry and want to use it to create or evolve Iceberg tables.
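A helper of the kind discussed here could look roughly like the sketch below. It is only an illustration under stated assumptions, not an existing pyiceberg function: the name `assign_ids` is hypothetical, IDs are handed out sequentially starting at 1, and any IDs already present are overwritten. The attribute names (`field-id`, `element-id`, `key-id`, `value-id`) are the ones the Iceberg spec uses for Avro schemas.

```python
import copy
from itertools import count
from typing import Any


def assign_ids(avro_schema: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `avro_schema` with fresh Iceberg ID attributes.

    Hypothetical helper: IDs are sequential and replace any existing ones.
    """
    schema = copy.deepcopy(avro_schema)  # leave the caller's schema untouched
    next_id = count(1)

    def visit(node: Any) -> None:
        if isinstance(node, list):  # union, e.g. ["null", {...}]
            for branch in node:
                visit(branch)
        elif isinstance(node, dict):
            kind = node.get("type")
            if kind == "record":
                # Number all fields of this record first, then descend.
                for field in node["fields"]:
                    field["field-id"] = next(next_id)
                for field in node["fields"]:
                    visit(field["type"])
            elif kind == "array":
                node["element-id"] = next(next_id)
                visit(node["items"])
            elif kind == "map":
                node["key-id"] = next(next_id)
                node["value-id"] = next(next_id)
                visit(node["values"])
        # Primitive type names (plain strings) carry no IDs.

    visit(schema)
    return schema


# Example schema as it might come from a schema registry (made up for this sketch).
registry_schema = {
    "type": "record",
    "name": "event",
    "fields": [
        {"name": "ts", "type": {"type": "long", "logicalType": "timestamp-millis"}},
        {"name": "tags", "type": ["null", {"type": "array", "items": "string"}]},
    ],
}
with_ids = assign_ids(registry_schema)
```

Sequential numbering is only reasonable when creating a table for the first time. Iceberg identifies columns by ID, so evolving an existing table would need the IDs to stay stable across schema versions, which this sketch does not attempt.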

feat(schema conversion): add schema conversion from avro timestamp-mi…

a12bf10

…llis

kevinjqliu approved these changes Jul 6, 2025
