Add allMutationsAffectingDocumentKeys to FSTMutationQueue #1479

wilhuff · 2018-06-29T01:22:51Z

This is a building block for eliminating quadratic behavior found recently when recomputing the view for large batches.

var-const

Thanks for doing this!

var-const · 2018-06-29T18:14:08Z

Firestore/Source/Local/FSTLevelDBMutationQueue.mm

@@ -396,6 +396,39 @@ - (nullable FSTMutationBatch *)nextMutationBatchAfterBatchID:(FSTBatchID)batchID
  return result;
 }

+- (NSArray<FSTMutationBatch *> *)allMutationBatchesAffectingDocumentKeys:
+    (const DocumentKeySet &)documentKeys {
+  NSString *userID = self.userID;


Question: is storing this value in a variable intended as an optimization or readability improvement? In the latter case, I think moving it inline would be slightly better.

It's a minor optimization. This is done fairly consistently throughout so I prefer to leave it as is.

As we rewrite to C++ this will become an instance variable reference.

var-const · 2018-06-29T18:18:47Z

Firestore/Source/Local/FSTMutationQueue.h

+ *
+ * Note that because of this requirement implementations are free to return mutation batches that
+ * don't contain any of the given document keys at all if it's convenient.
+ */


Since the implementations return batches in sorted order, perhaps it's worth it to call it out in this method's description?

I'm not sure that's a contractual thing we care about, and isn't something that can necessarily be guaranteed on the other platforms.

@wilhuff Sorry to reopen old thread, but it has relevant context. @mikelehen Michael pointed out that it would matter for batches to be in order when applying them. On Android, it's more natural to return batches unsorted, although of course it's possible to sort them as well.

Ah right. Of course.

I think it's reasonable to add a comment that indicates these come out in sorted order. I don't think it's worth changing to a data structure that necessarily enforces that order though.

Oh! I guess the implementation does actually do the right thing (I thought it didn't guarantee order) since std::set is apparently a sorted set. So yeah, I guess we just need to update the comment.

var-const · 2018-06-29T18:21:39Z

Firestore/Source/Local/FSTMutationQueue.h

@@ -115,10 +116,21 @@ NS_ASSUME_NONNULL_BEGIN
 * don't contain the document key at all if it's convenient.
 */
 // TODO(mcg): This should really return an NSEnumerator
-// also for b/32992024, all backing stores should really index by document key
 - (NSArray<FSTMutationBatch *> *)allMutationBatchesAffectingDocumentKey:


Question: perhaps this function can now be removed or reimplemented in terms of the new one? (I can do that if you think it's a good idea)

I believe the other can be removed since it's an anti-pattern. I purely went with the addition of this new method here to give you the chance to figure out how to integrate it into the reworked higher-level algorithm.

Ok, I'll do that in a follow-up.

var-const · 2018-06-29T18:24:09Z

Firestore/Source/Local/FSTLevelDBMutationQueue.mm

+  for (const DocumentKey &documentKey : documentKeys) {
+    std::string indexPrefix =
+        [FSTLevelDBDocumentMutationKey keyPrefixWithUserID:userID resourcePath:documentKey.path()];
+    indexIterator->Seek(indexPrefix);


Very optional nit: I think you can move this expression to the initialization of for loop to avoid having a missing clause.

var-const · 2018-06-29T18:27:54Z

Firestore/Source/Local/FSTLevelDBMutationQueue.mm

+      // the index rows for documentKey contiguously. In particular, all the rows for documentKey
+      // will occur before any rows for documents nested in a subcollection beneath documentKey so
+      // we can stop as soon as we hit any such row.
+      if (!absl::StartsWith(indexIterator->key(), indexPrefix) ||


There seems to be four variations of this check throughout the file. One could go away if the single-key version of this function were to be removed or rewritten to delegate to this function. Nevertheless, perhaps this check deserves its own named function? I can do this in a subsequent PR if you think it's worth it.

The trouble is that these aren't all the same. In three out of four cases we examine other parts of the row key after decoding it. In one case we consult the size of the key rather than an exact match. I'm also not keen on hiding the decoding the key too much.

The best way to solve this would probably be to add some mechanism for constructing the prefix including the marker that would start the next segment. That way we could do something like:

if (![rowKey decodeKey:indexIterator->key() withExactPrefix:suffixedIndexPrefix]) { break; }

This would effectively combine the cheap indexPrefix check we have now with the exact DocumentKey check.

However, I'm struggling with what to call that thing, since we need both kinds: we need exact matches as we do in this case, but also any path matching the prefix as we need for allMutationBatchesAffectingQuery.

Given everything we have to do I'm inclined to just leave it as is. I'd rather not obsess about this too much--we need to make progress on the larger goals we have and this, while verbose, avoids being magical.

var-const · 2018-06-29T18:32:39Z

Firestore/Source/Local/FSTLevelDBMutationQueue.mm

+        [FSTLevelDBDocumentMutationKey keyPrefixWithUserID:userID resourcePath:documentKey.path()];
+    indexIterator->Seek(indexPrefix);
+    for (; indexIterator->Valid(); indexIterator->Next()) {
+      // Only consider rows matching exactly the specific key of interest. Note that because we


Optional: it's probably too much trouble, but I think a hypothetical example of how rows are laid out would make this comment easier to grasp and perhaps allow making it shorter.

Done. This helped tighten the text too.

var-const · 2018-06-29T18:39:58Z

Firestore/Example/Tests/Local/FSTMutationQueueTests.mm

+
+  self.persistence.run("testAllMutationBatchesAffectingDocumentKey", [&]() {
+    NSArray<FSTMutation *> *mutations = @[
+      FSTTestSetMutation(@"fob/bar",


Ultranit: could you replace the b in fob with a letter that has less resemblance to an o, say, x or k or y? It took me three passes to finally spot the difference between this element and the next one.

Sorry about that. The idea is that this key precedes the others so I've chosen i.

* add a method to find batches affecting a set of keys (port of [1479](firebase/firebase-ios-sdk#1479)); * use the newly-added method to avoid rereading batches when getting documents in `LocalDocumentsView` (port of [1505](firebase/firebase-ios-sdk#1505)); * avoid rereading batches when searching for documents in a collection (port of [1533](firebase/firebase-ios-sdk#1533)). Speedup was measured by running tests in browser and checking time spent writing 10 batches of 500 mutations each, and then querying the resulting 5K docs collection from cache in offline mode. For this case, the writing speedup is about 3x, and querying speedup is about 6x (see PR for more details).

* Implement global resume token (#1052) * Add a spec test that shows correct global resume token handling * Minimum implementation to handle global resume tokens * Remove unused QueryView.resumeToken * Avoid persisting the resume token unless required * Persist the resume token on unlisten * Add a type parameter to Persistence (#1047) * Cherry pick sequence number starting point * Working on typed transactions * Start plumbing in sequence number * Back out sequence number changes * [AUTOMATED]: Prettier Code Styling * Fix tests * [AUTOMATED]: Prettier Code Styling * Fix lint * [AUTOMATED]: Prettier Code Styling * Uncomment line * MemoryPersistenceTransaction -> MemoryTransaction * [AUTOMATED]: Prettier Code Styling * Review updates * Style * Lint and style * Review feedback * [AUTOMATED]: Prettier Code Styling * Revert some unintentional import churn * Line 44 should definitely be empty * Checkpoint before adding helper function for stores * Use a helper for casting PersistenceTransaction to IndexedDbTransaction * [AUTOMATED]: Prettier Code Styling * Remove errant generic type * Lint * Fix typo * Port optimizations to LocalDocumentsView from iOS (#1055) * add a method to find batches affecting a set of keys (port of [1479](firebase/firebase-ios-sdk#1479)); * use the newly-added method to avoid rereading batches when getting documents in `LocalDocumentsView` (port of [1505](firebase/firebase-ios-sdk#1505)); * avoid rereading batches when searching for documents in a collection (port of [1533](firebase/firebase-ios-sdk#1533)). Speedup was measured by running tests in browser and checking time spent writing 10 batches of 500 mutations each, and then querying the resulting 5K docs collection from cache in offline mode. For this case, the writing speedup is about 3x, and querying speedup is about 6x (see PR for more details). * Add a CHANGELOG entry for #1052 (#1071) * Add a CHANGELOG entry for #1052 * Add notes for #1055 * Rename idleTimer and fix comments. (#1068) * Merge (#1073)

* Merging PersistentStream refactor * [AUTOMATED]: Prettier Code Styling * Typo * Remove canUseNetwork state. (#1076) * Merging the latest merge into the previous merge (#1077) * Implement global resume token (#1052) * Add a spec test that shows correct global resume token handling * Minimum implementation to handle global resume tokens * Remove unused QueryView.resumeToken * Avoid persisting the resume token unless required * Persist the resume token on unlisten * Add a type parameter to Persistence (#1047) * Cherry pick sequence number starting point * Working on typed transactions * Start plumbing in sequence number * Back out sequence number changes * [AUTOMATED]: Prettier Code Styling * Fix tests * [AUTOMATED]: Prettier Code Styling * Fix lint * [AUTOMATED]: Prettier Code Styling * Uncomment line * MemoryPersistenceTransaction -> MemoryTransaction * [AUTOMATED]: Prettier Code Styling * Review updates * Style * Lint and style * Review feedback * [AUTOMATED]: Prettier Code Styling * Revert some unintentional import churn * Line 44 should definitely be empty * Checkpoint before adding helper function for stores * Use a helper for casting PersistenceTransaction to IndexedDbTransaction * [AUTOMATED]: Prettier Code Styling * Remove errant generic type * Lint * Fix typo * Port optimizations to LocalDocumentsView from iOS (#1055) * add a method to find batches affecting a set of keys (port of [1479](firebase/firebase-ios-sdk#1479)); * use the newly-added method to avoid rereading batches when getting documents in `LocalDocumentsView` (port of [1505](firebase/firebase-ios-sdk#1505)); * avoid rereading batches when searching for documents in a collection (port of [1533](firebase/firebase-ios-sdk#1533)). Speedup was measured by running tests in browser and checking time spent writing 10 batches of 500 mutations each, and then querying the resulting 5K docs collection from cache in offline mode. For this case, the writing speedup is about 3x, and querying speedup is about 6x (see PR for more details). * Add a CHANGELOG entry for #1052 (#1071) * Add a CHANGELOG entry for #1052 * Add notes for #1055 * Rename idleTimer and fix comments. (#1068) * Merge (#1073)

* Catch invalid provider id error (#1064) * RxFire: Api Change and documentation (#1066) * api changes and doc updates * fixes * Refactor PersistentStream (no behavior changes). (#1041) This breaks out a number of changes I made as prep for b/80402781 (Continue retrying streams for 1 minute (idle delay)). PersistentStream changes: * Rather than providing a stream event listener to every call of start(), the stream listener is now provided once to the constructor and cannot be changed. * Streams can now be restarted indefinitely, even after a call to stop(). * PersistentStreamState.Stopped was removed and we just return to 'Initial' after a stop() call. * Added `closeCount` member to PersistentStream in order to avoid bleedthrough issues with auth and stream events once stop() has been called. * Calling stop() now triggers the onClose() event listener, which simplifies stream cleanup. * PersistentStreamState.Auth renamed to 'Starting' to better reflect that it encompasses both authentication and opening the stream. RemoteStore changes: * Creates streams once and just stop() / start()s them as necessary, never recreating them completely. * Added networkEnabled flag to track whether the network is enabled or not, since we no longer null out the streams. * Refactored disableNetwork() / enableNetwork() to remove stream re-creation. Misc: * Comment improvements including a state diagram on PersistentStream. * Fixed spec test shutdown to schedule via the AsyncQueue to fix sequencing order I ran into. * Merging Persistent Stream refactor (#1069) * Merging PersistentStream refactor * [AUTOMATED]: Prettier Code Styling * Typo * Remove canUseNetwork state. (#1076) * Merging the latest merge into the previous merge (#1077) * Implement global resume token (#1052) * Add a spec test that shows correct global resume token handling * Minimum implementation to handle global resume tokens * Remove unused QueryView.resumeToken * Avoid persisting the resume token unless required * Persist the resume token on unlisten * Add a type parameter to Persistence (#1047) * Cherry pick sequence number starting point * Working on typed transactions * Start plumbing in sequence number * Back out sequence number changes * [AUTOMATED]: Prettier Code Styling * Fix tests * [AUTOMATED]: Prettier Code Styling * Fix lint * [AUTOMATED]: Prettier Code Styling * Uncomment line * MemoryPersistenceTransaction -> MemoryTransaction * [AUTOMATED]: Prettier Code Styling * Review updates * Style * Lint and style * Review feedback * [AUTOMATED]: Prettier Code Styling * Revert some unintentional import churn * Line 44 should definitely be empty * Checkpoint before adding helper function for stores * Use a helper for casting PersistenceTransaction to IndexedDbTransaction * [AUTOMATED]: Prettier Code Styling * Remove errant generic type * Lint * Fix typo * Port optimizations to LocalDocumentsView from iOS (#1055) * add a method to find batches affecting a set of keys (port of [1479](firebase/firebase-ios-sdk#1479)); * use the newly-added method to avoid rereading batches when getting documents in `LocalDocumentsView` (port of [1505](firebase/firebase-ios-sdk#1505)); * avoid rereading batches when searching for documents in a collection (port of [1533](firebase/firebase-ios-sdk#1533)). Speedup was measured by running tests in browser and checking time spent writing 10 batches of 500 mutations each, and then querying the resulting 5K docs collection from cache in offline mode. For this case, the writing speedup is about 3x, and querying speedup is about 6x (see PR for more details). * Add a CHANGELOG entry for #1052 (#1071) * Add a CHANGELOG entry for #1052 * Add notes for #1055 * Rename idleTimer and fix comments. (#1068) * Merge (#1073)

Pod updates for Cocapods 1.5.3

c6c2a9c

wilhuff added the api: firestore label Jun 29, 2018

wilhuff assigned var-const Jun 29, 2018

wilhuff requested a review from var-const June 29, 2018 01:22

googlebot added the cla: yes label Jun 29, 2018

Add allMutationsAffectingDocumentKeys

76bf96a

wilhuff force-pushed the wilhuff/leveldb-mutations branch from 5e934b2 to 76bf96a Compare June 29, 2018 01:58

var-const approved these changes Jun 29, 2018

View reviewed changes

var-const assigned wilhuff and unassigned var-const Jun 29, 2018

Review feedback

532a13a

var-const approved these changes Jul 9, 2018

View reviewed changes

wilhuff merged commit a5eb895 into master Jul 9, 2018

wilhuff deleted the wilhuff/leveldb-mutations branch July 10, 2018 01:27

var-const mentioned this pull request Jul 27, 2018

Port optimizations to LocalDocumentsView from iOS firebase/firebase-js-sdk#1055

Merged

firebase locked and limited conversation to collaborators Oct 31, 2019

Add allMutationsAffectingDocumentKeys to FSTMutationQueue #1479

Add allMutationsAffectingDocumentKeys to FSTMutationQueue #1479

Uh oh!

Conversation

wilhuff commented Jun 29, 2018

Uh oh!

var-const left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!