Fix filepath wildcard for avro/parquet #564
Codecov Report
@@            Coverage Diff             @@
##           master     #564      +/-   ##
==========================================
+ Coverage   70.34%   70.66%   +0.32%
==========================================
  Files          40       40
  Lines        1703     1708       +5
  Branches      150      146       -4
==========================================
+ Hits         1198     1207       +9
+ Misses        505      501       -4
I'm a fan of adding an optional parameter called |
ratatool-sampling/src/main/scala/com/spotify/ratatool/samplers/BigSampler.scala
@benkonz, I like the approach. I will create a ticket to add |
just the one typo
ratatool-sampling/src/main/scala/com/spotify/ratatool/samplers/BigSampler.scala
  input match {
-   case avroPath if input.endsWith("avro") =>
+   case avroPath if fileNames.exists(_.endsWith("avro")) =>
can we log which one we resolve it as? might help us later on
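One way the logging suggestion could look, as a minimal sketch rather than the actual BigSampler code: `Format`, `FormatResolver` and `resolveFormat` are hypothetical names introduced for illustration, the `parquet` suffix check is assumed by analogy with the `avro` one in the diff, and `println` stands in for whatever logger the project uses.

```scala
// Hypothetical types and helper for illustration; not part of BigSampler.
sealed trait Format
case object Avro extends Format
case object Parquet extends Format

object FormatResolver {
  /** Guess the format from the file names matched by a (possibly wildcard) input path. */
  def resolveFormat(input: String, fileNames: Seq[String]): Option[Format] = {
    val resolved: Option[Format] =
      if (fileNames.exists(_.endsWith("avro"))) Some(Avro)
      else if (fileNames.exists(_.endsWith("parquet"))) Some(Parquet) // assumed suffix
      else None
    // Assumption: println stands in for the project's real logger.
    println(s"Resolved input '$input' as ${resolved.getOrElse("unknown")}")
    resolved
  }
}
```

Logging both the raw input and the resolved format makes it easier to see later why a wildcard path was routed to a given sampler.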
The changes below have been tested and confirmed to work. I'm also proposing an alternative approach, since I'm not a big fan of guessing whether we are dealing with Avro or Parquet:
Approach (1): AvroSampler or ParquetSample
Approach (2): mode (which can be either parquet, avro or bigquery) and have that be a required field. It will be a breaking change, but then it will help avoid guessing which Sampler to use.
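A rough sketch of how Approach (2) might be validated, assuming the arguments have already been parsed into a `Map[String, String]`. The `Mode` type, `parse` and `fromArgs` are hypothetical names, not existing ratatool APIs; only the three values `avro`, `parquet` and `bigquery` come from the proposal above.

```scala
// Hypothetical sketch of a required `mode` argument; not ratatool's actual API.
sealed trait Mode
object Mode {
  case object Avro extends Mode
  case object Parquet extends Mode
  case object BigQuery extends Mode

  /** Parse the raw value, rejecting anything outside the three proposed modes. */
  def parse(raw: String): Either[String, Mode] =
    raw.trim.toLowerCase match {
      case "avro"     => Right(Avro)
      case "parquet"  => Right(Parquet)
      case "bigquery" => Right(BigQuery)
      case other      => Left(s"Unsupported mode '$other'; expected avro, parquet or bigquery")
    }

  /** Treat `mode` as required: a missing key is an error instead of a guess. */
  def fromArgs(args: Map[String, String]): Either[String, Mode] =
    args.get("mode").toRight("Missing required argument: mode").flatMap(parse)
}
```

With something like this, the sampler choice becomes a plain pattern match on `Mode`, and a missing or misspelled value fails fast instead of falling back to inspecting file extensions.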