feat: KEP-2655: Add data cache system#2755
Conversation
9f5e926 to
f82b6e2
Compare
Pull Request Test Coverage Report for Build 17837270829Details
💛 - Coveralls |
3b7359b to
b640ffa
Compare
| arrow = "55.0.0" | ||
| arrow-schema = "55.0.0" |
There was a problem hiding this comment.
Shall we migrate to Arrow 55.2 as @comphead suggested ?
There was a problem hiding this comment.
I am using datafusion 47.0.0 and it builds on arrow 55.0.0 - https://github.com/apache/datafusion/blob/e4433049b04ca2c1e2031eb05d1a0990210f11d6/Cargo.toml#L90. Let me try to address as a followup which might require additional validation
There was a problem hiding this comment.
@akshaychitneni Can you create tracking issue for that please ?
We can address it in the followup PRs
| #[derive(Serialize, Deserialize, Debug)] | ||
| struct IndexPair { | ||
| start: u64, | ||
| end: u64, | ||
| } |
There was a problem hiding this comment.
Can we use this struct from main.rs ?
|
|
||
| pub async fn init(&mut self) -> Result<()> { | ||
| let _ = self.fetch_data_files().await; | ||
| let df = self.ctx.sql("select * from memtable").await?.collect().await?; |
There was a problem hiding this comment.
Does DataFusion create the memtable for us ?
There was a problem hiding this comment.
we register a table with name memtable here - https://github.com/kubeflow/trainer/pull/2755/files#diff-d6c85a17f636003dada6db08feb3052bf87b12f6672723af5b503ec390d9c980R94. Probably I will use the var instead of using name directly.
bff2ac3 to
a3e89ae
Compare
a3e89ae to
e472c01
Compare
andreyvelich
left a comment
There was a problem hiding this comment.
@akshaychitneni I left a few comments.
Overall looks good.
Can you fix the CI please ?
/cc @kubeflow/kubeflow-trainer-team @rudeigerc @raravena80 @comphead
in case you want to provide any additional comments.
| arrow = "55.0.0" | ||
| arrow-schema = "55.0.0" |
There was a problem hiding this comment.
@akshaychitneni Can you create tracking issue for that please ?
We can address it in the followup PRs
|
@andreyvelich: GitHub didn't allow me to request PR reviews from the following users: kubeflow/kubeflow-trainer-team, raravena80. Note that only kubeflow members and repo collaborators can review this PR, and authors cannot review their own PRs. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
@akshaychitneni Please can you also add a new scope to the PR title verification, e.g. |
e472c01 to
082f592
Compare
| [[package]] | ||
| name = "tracing-subscriber" | ||
| version = "0.3.19" | ||
| source = "registry+https://github.com/rust-lang/crates.io-index" | ||
| checksum = "e8189decb5ac0fa7bc8b96b7cb9b2701d60d48805aca84a238004d665fcc4008" | ||
| dependencies = [ | ||
| "nu-ansi-term", | ||
| "sharded-slab", | ||
| "smallvec", | ||
| "thread_local", | ||
| "tracing-core", | ||
| "tracing-log", | ||
| ] |
Check notice
Code scanning / Trivy
tracing-subscriber: Tracing log pollution Low test
082f592 to
492063d
Compare
| @@ -0,0 +1,51 @@ | |||
| use std::env; | |||
There was a problem hiding this comment.
@akshaychitneni Could you also move main files for head and worker to the cmd Similar to controller manger: https://github.com/kubeflow/trainer/tree/master/cmd/trainer-controller-manager ?
I would suggest this structure:
cmd/cache/Dockerfile
cmd/cache/head/main.rs
cmd/cache/worker/main.rs
WDYT ?
There was a problem hiding this comment.
However, I am not sure if Cargo supports such folder structure:
trainer/pkg/data_cache/Cargo.toml
Lines 27 to 33 in 492063d
There was a problem hiding this comment.
This might not be helpful as it involves moving outside of cargo
andreyvelich
left a comment
There was a problem hiding this comment.
Thanks for this awesome effort 🚀
/lgtm
/assign @astefanutti @tenzen-y @Electronic-Waste @kubeflow/wg-training-leads @kubeflow/kubeflow-trainer-team
Please let us know if you have any additional comments before move this forward.
492063d to
4f20335
Compare
|
/lgtm |
Signed-off-by: Akshay Chitneni <achitneni@apple.com>
4f20335 to
607de09
Compare
|
/lgtm |
andreyvelich
left a comment
There was a problem hiding this comment.
We can move forward with this PR and address the remaining comments in the followup changes.
Thanks again for this huge effort @akshaychitneni!
/approve
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: andreyvelich The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Signed-off-by: Akshay Chitneni <achitneni@apple.com> Co-authored-by: Akshay Chitneni <achitneni@apple.com>
What this PR does / why we need it:
This PR adds distributed data cache as described in https://docs.google.com/document/d/1xj3K6bOT4f0EPiC4zwr2OsbRkROjIA8u_TBRZRVxrHI
Followup PRs will include SDK and trainer integrations
Proposal - kubeflow/community#864
KEP - #2655
Which issue(s) this PR fixes (optional, in
Fixes #<issue number>, #<issue number>, ...format, will close the issue(s) when PR gets merged):Fixes # #2757
Checklist: