Split Session._collection_node_cache to 3 mutually exclusive parts #6555

bluetech · 2020-01-24T12:01:50Z

Previously, this cache was used with 3 different and mutually exclusive key-type -> value-type combinations. Mypy can't properly type this. It's also quite confusing.

Split to 3 different dicts instead.

bluetech · 2020-01-24T12:02:41Z

Suggestions welcome for better names than 1, 2, 3 :)

blueyed · 2020-01-24T12:34:43Z

JFI: IIRC the node cache might not be necessary in the first place - that's something I think to remember from when looking into fixing collection issues, especially with regard to Package. I might be wrong though, and it might be more about fixing it (where this then certainly would help).

nicoddemus · 2020-01-24T13:31:22Z

Can we please do that in features? Those attribute have changed names there already.

src/_pytest/main.py

bluetech · 2020-01-25T12:27:33Z

Can we please do that in features? Those attribute have changed names there already.

Done!

JFI: IIRC the node cache might not be necessary in the first place

Yeah, hopefully this will make it easier to look at, at least. For reference, it was added in fedc785 to address issue #3358 (which is still open -- don't know if it just wasn't closed though).

Previously, this cache was used with 3 different and mutually exclusive key-type -> value-type combinations. Mypy can't properly type this. It's also quite confusing. Split to 3 different dicts instead.

blueyed · 2020-01-25T13:10:19Z

Thanks for the reference (this was part of #3389).
I think we could use this PR to try to get rid of it / minimize its usage.
That would also allow us for postponing the quest for finding appropriate names.. :)

bluetech · 2020-01-25T14:51:36Z

TBH, I find doing refactors without type checks to be a frightening prospect, at least in a "delicate" codebase like pytest 😺. That's why I try to focus on adding types and only make changes when I have to.

blueyed · 2020-01-26T19:05:40Z

TBH, I find doing refactors without type checks to be a frightening prospect, at least in a "delicate" codebase like pytest . That's why I try to focus on adding types and only make changes when I have to.

Yeah. But given that it is likely that this code can be refactored/removed, it could be done in this PR already, based on having annotations here now.

bluetech · 2020-01-28T09:49:26Z

Yeah. But given that it is likely that this code can be refactored/removed, it could be done in this PR already, based on having annotations here now.

I'd prefer not to do it in this PR, because I don't intend to work on it ATM. IMO this PR makes things slightly better already (in that whoever refactors it will have an easier time). The reason I'd like to merge it is that it blocks #6556 which in turn blocks more typing work.

blueyed · 2020-01-28T13:36:46Z

@bluetech not sure why it blocks #6556, but my point is that it is better to get rid of the caches instead of renaming them (again). I think it should have been just removed / investigated with 7b1e3d1 (#6491) to begin with.

It seems like all tests are passing with this, so it might only be useful for caching, not for logic (or is not covered really):

 src/_pytest/main.py | 46 ++++++++--------------------------------------
 1 file changed, 8 insertions(+), 38 deletions(-)

diff --git i/src/_pytest/main.py w/src/_pytest/main.py
index a0cae18ed..451923679 100644
--- i/src/_pytest/main.py
+++ w/src/_pytest/main.py
@@ -409,17 +409,6 @@ def __init__(self, config: Config) -> None:
         self.startdir = config.invocation_dir
         self._initialpaths = frozenset()  # type: FrozenSet[py.path.local]
 
-        # Keep track of any collected nodes in here, so we don't duplicate fixtures
-        self._collection_node_cache1 = (
-            {}
-        )  # type: Dict[py.path.local, Sequence[nodes.Collector]]
-        self._collection_node_cache2 = (
-            {}
-        )  # type: Dict[Tuple[Type[nodes.Collector], py.path.local], nodes.Collector]
-        self._collection_node_cache3 = (
-            {}
-        )  # type: Dict[Tuple[Type[nodes.Collector], str], CollectReport]
-
         # Dirnames of pkgs with dunder-init files.
         self._collection_pkg_roots = {}  # type: Dict[py.path.local, Package]
 
@@ -537,9 +526,6 @@ def collect(self):
                 self._notfound.append((report_arg, sys.exc_info()[1]))
 
             self.trace.root.indent -= 1
-        self._collection_node_cache1.clear()
-        self._collection_node_cache2.clear()
-        self._collection_node_cache3.clear()
         self._collection_pkg_roots.clear()
 
     def _collect(self, argpath, names):
@@ -557,13 +543,10 @@ def _collect(self, argpath, names):
                 if parent.isdir():
                     pkginit = parent.join("__init__.py")
                     if pkginit.isfile():
-                        if pkginit not in self._collection_node_cache1:
-                            col = self._collectfile(pkginit, handle_dupes=False)
-                            if col:
-                                if isinstance(col[0], Package):
-                                    self._collection_pkg_roots[parent] = col[0]
-                                # always store a list in the cache, matchnodes expects it
-                                self._collection_node_cache1[col[0].fspath] = [col[0]]
+                        col = self._collectfile(pkginit, handle_dupes=False)
+                        if col:
+                            if isinstance(col[0], Package):
+                                self._collection_pkg_roots[parent] = col[0]
 
         # If it's a directory argument, recurse and look for any Subpackages.
         # Let the Package collector deal with subnodes, don't collect here.
@@ -590,21 +573,12 @@ def _collect(self, argpath, names):
 
                 for x in self._collectfile(path):
                     key = (type(x), x.fspath)
-                    if key in self._collection_node_cache2:
-                        yield self._collection_node_cache2[key]
-                    else:
-                        self._collection_node_cache2[key] = x
-                        yield x
+                    yield x
         else:
             assert argpath.check(file=1)
 
-            if argpath in self._collection_node_cache1:
-                col = self._collection_node_cache1[argpath]
-            else:
-                collect_root = self._collection_pkg_roots.get(argpath.dirname, self)
-                col = collect_root._collectfile(argpath, handle_dupes=False)
-                if col:
-                    self._collection_node_cache1[argpath] = col
+            collect_root = self._collection_pkg_roots.get(argpath.dirname, self)
+            col = collect_root._collectfile(argpath, handle_dupes=False)
             m = self.matchnodes(col, names)
             # If __init__.py was the only file requested, then the matched node will be
             # the corresponding Package, and the first yielded item will be the __init__
@@ -717,11 +691,7 @@ def _matchnodes(self, matching, names):
                 continue
             assert isinstance(node, nodes.Collector)
             key = (type(node), node.nodeid)
-            if key in self._collection_node_cache3:
-                rep = self._collection_node_cache3[key]
-            else:
-                rep = collect_one_node(node)
-                self._collection_node_cache3[key] = rep
+            rep = collect_one_node(node)
             if rep.passed:
                 has_matched = False
                 for x in rep.result:

But since you do not plan to work on it here, let's get it improved slightly then.

bluetech mentioned this pull request Jan 24, 2020

Type annotate main.py and some parts related to collection #6556

Closed

RonnyPfannschmidt reviewed Jan 24, 2020

View reviewed changes

src/_pytest/main.py Outdated Show resolved Hide resolved

bluetech force-pushed the nodes-cache-split branch from 998f1a1 to aa475d0 Compare January 25, 2020 12:09

bluetech changed the base branch from master to features January 25, 2020 12:09

bluetech changed the title ~~Split Session._node_cache to 3 mutually exclusive parts~~ Split Session._collection_node_cache to 3 mutually exclusive parts Jan 25, 2020

bluetech force-pushed the nodes-cache-split branch from aa475d0 to 4510507 Compare January 25, 2020 12:10

Split Session._collection_node_cache to 3 mutually exclusive parts

f396733

Previously, this cache was used with 3 different and mutually exclusive key-type -> value-type combinations. Mypy can't properly type this. It's also quite confusing. Split to 3 different dicts instead.

bluetech force-pushed the nodes-cache-split branch from 4510507 to f396733 Compare January 25, 2020 12:32

This comment has been minimized.

Sign in to view

blueyed approved these changes Jan 28, 2020

View reviewed changes

bluetech merged commit e440b43 into pytest-dev:features Jan 28, 2020

bluetech deleted the nodes-cache-split branch January 28, 2020 20:49

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Split Session._collection_node_cache to 3 mutually exclusive parts #6555

Split Session._collection_node_cache to 3 mutually exclusive parts #6555

bluetech commented Jan 24, 2020 •

edited

Loading

bluetech commented Jan 24, 2020

blueyed commented Jan 24, 2020

nicoddemus commented Jan 24, 2020

bluetech commented Jan 25, 2020

blueyed commented Jan 25, 2020

bluetech commented Jan 25, 2020

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

blueyed commented Jan 26, 2020

bluetech commented Jan 28, 2020

blueyed commented Jan 28, 2020

Split Session._collection_node_cache to 3 mutually exclusive parts #6555

Split Session._collection_node_cache to 3 mutually exclusive parts #6555

Conversation

bluetech commented Jan 24, 2020 • edited Loading

bluetech commented Jan 24, 2020

blueyed commented Jan 24, 2020

nicoddemus commented Jan 24, 2020

bluetech commented Jan 25, 2020

blueyed commented Jan 25, 2020

bluetech commented Jan 25, 2020

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

blueyed commented Jan 26, 2020

bluetech commented Jan 28, 2020

blueyed commented Jan 28, 2020

bluetech commented Jan 24, 2020 •

edited

Loading