Update accessors to store weak reference to data #894

Merged: thehomebrewnerd merged 12 commits into main from weak-ref on May 12, 2021

thehomebrewnerd
Contributor

thehomebrewnerd self-assigned this on May 5, 2021
thehomebrewnerd marked this pull request as draft on May 5, 2021 18:00

codecov bot commented May 5, 2021

Codecov Report

Merging #894 (b467f81) into main (637a019) will not change coverage.
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##              main      #894   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           44        44           
  Lines         6878      6887    +9     
=========================================
+ Hits          6878      6887    +9     
| Impacted Files | Coverage Δ |
|---|---|
| woodwork/column_accessor.py | 100.00% <100.00%> (ø) |
| woodwork/table_accessor.py | 100.00% <100.00%> (ø) |
| woodwork/tests/accessor/test_column_accessor.py | 100.00% <100.00%> (ø) |
| woodwork/tests/accessor/test_table_accessor.py | 100.00% <100.00%> (ø) |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

freddyaboulton commented May 6, 2021

@thehomebrewnerd I think this implementation solves the original issue!

I visualized the object graph with this branch compared to the latest release and the accessor no longer has a reference to the original dataframe:

This branch: [image: weakref-ww-backrefs]

0.3.0 release: [image: ww-backrefs]
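
For readers who want to see the effect without the object graph images, here is a minimal sketch of the general idea, not the actual Woodwork code. `Frame`, `StrongAccessor`, `WeakAccessor`, `attach`, and `freed_by_refcount_alone` are all hypothetical names invented for illustration, and the sketch assumes CPython's reference counting. It models the case where the data object holds its accessor: if the accessor also holds a strong reference back to the data, the two form a cycle that only the cyclic garbage collector can free, whereas a `weakref.ref` lets the data be released as soon as the last outside reference goes away.

```python
import gc
import weakref


class Frame:
    """Hypothetical stand-in for a DataFrame; not a pandas or Woodwork class."""

    def __init__(self, values):
        self.values = values


class StrongAccessor:
    """Hypothetical accessor that keeps a strong reference to its data."""

    def __init__(self, data):
        self._data = data  # strong back-reference: creates a frame <-> accessor cycle


class WeakAccessor:
    """Hypothetical accessor that keeps only a weak reference to its data."""

    def __init__(self, data):
        self._data_ref = weakref.ref(data)  # does not keep `data` alive

    @property
    def data(self):
        data = self._data_ref()  # returns None once the data has been freed
        if data is None:
            raise ReferenceError("underlying data has been garbage collected")
        return data


def attach(frame, accessor_cls):
    """Store an accessor on the frame, mimicking data that caches its accessor."""
    frame.accessor = accessor_cls(frame)
    return frame


def freed_by_refcount_alone(accessor_cls):
    """Return True if dropping the last name for the frame frees it immediately."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep the cyclic collector from running during the check
    try:
        frame = attach(Frame([1, 2, 3]), accessor_cls)
        probe = weakref.ref(frame)  # lets us observe whether the frame is still alive
        del frame
        result = probe() is None
    finally:
        if was_enabled:
            gc.enable()
        gc.collect()  # clean up any cycle left behind by the strong variant
    return result


print("strong accessor freed immediately:", freed_by_refcount_alone(StrongAccessor))
print("weak accessor freed immediately:", freed_by_refcount_alone(WeakAccessor))
```

The trade-off in this sketch is that `WeakAccessor.data` has to handle the case where the data is already gone, which is why it raises instead of returning `None`.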

The memory footprint looks a lot better in evalml. Here is the memory consumption of automl search on the NYC taxi dataset (1.5 million rows):

This branch: [image: ww-weak-ref-branch]

0.3.0 release: [image: ww_0_3_taxi]

I've also run our full set of benchmark datasets and we see similar improvements in memory without sacrificing accuracy/performance.

So thank you! 👏

thehomebrewnerd and others added 2 commits May 11, 2021 15:40
Contributor

tamargrey left a comment

Looks good! One comment about something that I was a little confused about, but feel free to ignore if you think adding a second example will be redundant!

thehomebrewnerd merged commit f6b2907 into main on May 12, 2021
thehomebrewnerd deleted the weak-ref branch on May 12, 2021 18:38
thehomebrewnerd mentioned this pull request on May 12, 2021

Successfully merging this pull request may close these issues.

Use weak references to refer to the original data in the accessor
4 participants