Skip to content

feat(arrow): Streamline Apache Arrow extension types #823

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 25 commits into from
Apr 27, 2023

Conversation

candiduslynx
Copy link
Contributor

@candiduslynx candiduslynx commented Apr 25, 2023

Common

Use pointer receivers:

  1. For Array types: to have all functions have the same receiver type (other functions do have pointer receivers, so updating String() method receiver is natural, see net.IPNet as a reference.
  2. For Type implementations: we do return a pointer from NewXYZType function, so it's only natural to define the receiver as pointer as well.

Inet

InetBuilder

  1. Now accepts *net.IPNet (pointer) in Append, UnsafeAppend & AppendValues
  2. AppendValues now accounts for valid param

InetArray

  1. String method uses %q format as mainstream Apache Arrow types
  2. Added Value(i int) *net.IPNet function
  3. GetOneForMarshal utilizes Value implementation

JSON

JSONBuilder

  1. Marshaling with json.DisableHTMLEscape() option per Destination plugins receive escaped data #622
  2. AppendValues now accounts for valid param
  3. UnmarshalJSON implementation is streamlined with helper Unmarshal(dec *json.Decoder) error function (as mainstream Apache Arrow types)

JSONArray

  1. String method uses %q format as mainstream Apache Arrow types
  2. Added Value(i int) any function that will unmarshal the stored data & return the concrete type
  3. Use json.UnmarshalNoEscape per Destination plugins receive escaped data #622
  4. GetOneForMarshal now doesn't perform unmarshaling and, instead, returns the stored json.RawMessage

Mac

MacBuilder

  1. AppendValues now accounts for valid param

MacArray

  1. String method uses %q format as mainstream Apache Arrow types
  2. Added Value(i int) net.HardwareAddr function

UUID

UUIDBuilder

  1. AppendValues now accounts for valid param
  2. Use uuid.Must func instead of panic in code

UUIDArray

  1. String method uses %q format as mainstream Apache Arrow types
  2. Added Value(i int) uuid.UUID func
  3. Use uuid.Must func instead of panic in code

@github-actions github-actions bot added the feat label Apr 25, 2023
@candiduslynx candiduslynx force-pushed the feat/arrow-extension-valuer branch 2 times, most recently from d43f3a6 to 6b0ea1d Compare April 25, 2023 08:28
@github-actions
Copy link

github-actions bot commented Apr 25, 2023

⏱️ Benchmark results

Comparing with 1a8072f

  • DefaultConcurrencyDFS-2 resources/s: 11,068 ⬆️ 2.50% increase vs. 1a8072f
  • DefaultConcurrencyRoundRobin-2 resources/s: 12,190 ⬆️ 2.03% increase vs. 1a8072f
  • Glob-2 ns/op: 173.7 ⬇️ 43.58% decrease vs. 1a8072f
  • TablesWithChildrenDFS-2 resources/s: 29,793 ⬆️ 18.92% increase vs. 1a8072f
  • TablesWithChildrenRoundRobin-2 resources/s: 28,855 ⬆️ 9.48% increase vs. 1a8072f
  • TablesWithRateLimitingDFS-2 resources/s: 28.35 ⬇️ 0.53% decrease vs. 1a8072f
  • TablesWithRateLimitingRoundRobin-2 resources/s: 842.1 ⬆️ 3.46% increase vs. 1a8072f
  • BufferedScanner-2 ns/op: 9.368 ⬇️ 24.15% decrease vs. 1a8072f
  • LogReader-2 ns/op: 30.47 ⬇️ 25.40% decrease vs. 1a8072f

@codecov
Copy link

codecov bot commented Apr 25, 2023

Codecov Report

Patch coverage: 48.12% and project coverage change: +0.20 🎉

Comparison is base (1a8072f) 46.98% compared to head (af39af0) 47.19%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #823      +/-   ##
==========================================
+ Coverage   46.98%   47.19%   +0.20%     
==========================================
  Files          76       76              
  Lines        7872     7844      -28     
==========================================
+ Hits         3699     3702       +3     
+ Misses       3682     3642      -40     
- Partials      491      500       +9     
Impacted Files Coverage Δ
schema/arrow.go 13.39% <0.00%> (ø)
types/inet.go 41.73% <40.54%> (-0.67%) ⬇️
types/uuid.go 43.75% <40.62%> (+3.62%) ⬆️
types/mac.go 45.73% <52.00%> (-3.12%) ⬇️
types/json.go 60.30% <58.06%> (+10.70%) ⬆️

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@candiduslynx candiduslynx changed the title feat(arrow): Extension Value(int) <type> methods feat(arrow): Streamline Apache Arrow extension types Apr 26, 2023
@github-actions github-actions bot added feat and removed feat labels Apr 26, 2023
Copy link
Contributor

@yevgenypats yevgenypats left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Im not sure I understand the reasoning or the value of using pointer receivers in some of the extensions instead of regular params? Were there any performance issue? we turn them into string anyway and by using pointers we just introduce more risk of the data being changed inside while we don't really want that to happen so I can't see the value unless Im missing something.

@candiduslynx candiduslynx force-pushed the feat/arrow-extension-valuer branch from 77efac2 to af39af0 Compare April 27, 2023 15:13
@candiduslynx candiduslynx merged commit f32fac3 into main Apr 27, 2023
@candiduslynx candiduslynx deleted the feat/arrow-extension-valuer branch April 27, 2023 15:26
candiduslynx pushed a commit that referenced this pull request Apr 28, 2023
🤖 I have created a release *beep* *boop*
---


##
[2.5.0](v2.4.0...v2.5.0)
(2023-04-28)


### Features

* Add table description to Arrow schema metadata
([#824](#824))
([1a8072f](1a8072f))
* **arrow:** Streamline Apache Arrow extension types
([#823](#823))
([f32fac3](f32fac3))
* **test:** Add double migration test
([#827](#827))
([4cd3872](4cd3872))
* Time values are truncated uniformly
([#825](#825))
([ffb97b0](ffb97b0))


### Bug Fixes

* TransformWithStruct/DefaultNameTransformer change for invalid column
names ([#820](#820))
([01e6649](01e6649))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).
kodiakhq bot pushed a commit that referenced this pull request May 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants