Commit 973e6fc
# Which issue does this PR close?
- Closes #4886
- Stacked on #8584
# Rationale for this change
This PR brings Arrow-Avro round‑trip coverage up to date with modern
Arrow types and the latest Avro logical types. In particular, Avro 1.12
adds `timestamp-nanos` and `local-timestamp-nanos`. Enabling these
logical types and filling in missing Avro writer encoders for Arrow’s
newer *view* and list families allows lossless read/write and simpler
pipelines.
It also hardens timestamp/time scaling in the writer to avoid silent
overflow when converting seconds to milliseconds, surfacing a clear
error instead.
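The overflow guard described above can be sketched with Rust's checked arithmetic. This is a minimal illustration rather than the crate's actual encoder code: the `InvalidArgument` struct is a hypothetical stand-in for the crate's error variant, and both function names are invented for the example.

```rust
/// Hypothetical stand-in for the crate's `InvalidArgument` error.
#[derive(Debug, PartialEq)]
struct InvalidArgument(String);

/// Scale a 64-bit seconds value to milliseconds.
/// `checked_mul` returns `None` on overflow, which is turned into an error
/// instead of letting the value wrap silently.
fn seconds_to_millis_i64(secs: i64) -> Result<i64, InvalidArgument> {
    secs.checked_mul(1_000).ok_or_else(|| {
        InvalidArgument(format!("timestamp of {secs} s overflows i64 milliseconds"))
    })
}

/// The same guard for 32-bit time-of-day values, as used by `Time32`.
fn seconds_to_millis_i32(secs: i32) -> Result<i32, InvalidArgument> {
    secs.checked_mul(1_000).ok_or_else(|| {
        InvalidArgument(format!("time of {secs} s overflows i32 milliseconds"))
    })
}
```

Under these assumptions, `seconds_to_millis_i64(2)` yields `Ok(2_000)`, while `seconds_to_millis_i64(i64::MAX)` yields an `Err` rather than a wrapped value.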
# What changes are included in this PR?
* **Nanosecond timestamps**: Introduces a `TimestampNanos(bool)` codec
in `arrow-avro` that maps Avro `timestamp-nanos` /
`local-timestamp-nanos` to Arrow `Timestamp(Nanosecond, tz)`. The
reader/decoder, union field kinds, and Arrow `DataType` mapping are all
extended accordingly. Logical type detection is wired through both
`logicalType` and the `arrowTimeUnit="nanosecond"` attribute.
* **UUID logical type round‑trip fix**: When reading Avro
`logicalType="uuid"` fields, preserve that logical type in Arrow field
metadata so writers can round‑trip it back to Avro.
* **Avro writer encoders**: Add the missing array encoders and coverage
for Arrow’s `ListView`, `LargeListView`, and `FixedSizeList`, and extend
array encoder support to `BinaryView` and `Utf8View`. (See large
additions in `writer/encoder.rs`.)
* **Safer time/timestamp scaling**: Guard second-to-millisecond
conversions in the `Time32`/`Timestamp` encoders against overflow;
encoding now returns a clear `InvalidArgument` error in those cases.
* **Schema utilities**: Add `AvroSchemaOptions` with `null_order` and
`strip_metadata` flags so Avro JSON can be built while optionally
omitting internal Arrow keys during round‑trip schema generation.
* **Tests & round‑trip coverage**: Add unit tests for nanosecond
timestamp decoding (UTC, local, and with nulls) and additional
end‑to‑end/round‑trip tests for the updated writer paths.
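The nanosecond timestamp mapping from the first bullet can be sketched roughly as follows. The `Codec` enum here is a hypothetical, self-contained mirror of the `TimestampNanos(bool)` variant, not the crate's type. Two things are assumptions rather than facts from this PR: that the `bool` is `true` for UTC-adjusted values, and that UTC is spelled `"+00:00"` in the Arrow timezone while local timestamps carry no timezone.

```rust
/// Hypothetical mirror of the codec variant named in this PR.
/// Assumption: the `bool` is `true` for UTC-adjusted values.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Codec {
    TimestampNanos(bool),
}

/// Resolve an Avro `logicalType` string to the sketch codec.
/// Unrecognized names return `None` so the caller can fall back
/// to the underlying primitive type.
fn codec_for_logical_type(name: &str) -> Option<Codec> {
    match name {
        "timestamp-nanos" => Some(Codec::TimestampNanos(true)),
        "local-timestamp-nanos" => Some(Codec::TimestampNanos(false)),
        _ => None,
    }
}

/// Timezone component of the resulting Arrow `Timestamp(Nanosecond, tz)`.
/// Assumption: UTC maps to "+00:00" and local timestamps have no zone.
fn arrow_timezone(codec: Codec) -> Option<&'static str> {
    match codec {
        Codec::TimestampNanos(true) => Some("+00:00"),
        Codec::TimestampNanos(false) => None,
    }
}
```

Keeping the UTC/local distinction in a single `bool` means both Avro logical types share one decode path and differ only in the timezone attached to the Arrow field.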
# Are these changes tested?
Yes.
* New decoder tests validate `Timestamp(Nanosecond, tz)` behavior for
UTC and local timestamps and for nullable unions.
* Writer tests validate the nanosecond encoder and exercise an overflow
path for second→millisecond conversion that now returns an error.
* Additional round‑trip tests were added alongside the new encoders.
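As a rough illustration of what a list-view encoder has to produce, the sketch below writes one row of a list-view of `i64` values as an Avro array. It follows the Avro binary encoding (zigzag varint longs; an array is a block of `count` items followed by a zero-count terminator), but it is not the code in `writer/encoder.rs`: the function names are invented, and plain `usize` slices stand in for Arrow's offset and size buffers.

```rust
/// Append `n` as an Avro `long`: zigzag-encode, then write a base-128 varint.
fn write_long(out: &mut Vec<u8>, n: i64) {
    let mut z = ((n << 1) ^ (n >> 63)) as u64;
    while z >= 0x80 {
        // Low seven bits with the continuation bit set.
        out.push(((z & 0x7F) as u8) | 0x80);
        z >>= 7;
    }
    out.push(z as u8);
}

/// Encode row `row` of a list-view as one Avro array.
/// In a list-view layout each row has its own offset and size, so rows
/// may overlap or appear out of order in `values`.
fn encode_list_view_row(
    out: &mut Vec<u8>,
    values: &[i64],
    offsets: &[usize],
    sizes: &[usize],
    row: usize,
) {
    let start = offsets[row];
    let len = sizes[row];
    if len > 0 {
        // One block holding every item in the row.
        write_long(out, len as i64);
        for v in &values[start..start + len] {
            write_long(out, *v);
        }
    }
    // A zero count terminates the array.
    write_long(out, 0);
}
```

Unlike a plain list, where row `i` ends where row `i + 1` begins, the encoder here cannot derive a row's length from neighbouring offsets and has to read the size explicitly.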
# Are there any user-facing changes?
N/A since `arrow-avro` is not public yet.
1 parent 161adba, commit 973e6fc
8 files changed, +1234 -86 lines changed