
Memory leak when using ParseQuery in a loop #111

Closed
ghugues opened this issue Dec 7, 2015 · 34 comments
Labels
type:bug Impaired feature or lacking behavior that is likely assumed

Comments

@ghugues commented Dec 7, 2015

The title of this issue might be misleading as I don't know exactly what is happening.

I originally tried to write a background job that loops over all the rows of a table. For now I'm running it on my computer until I find a fix (which is why I use the node environment).

I've tried to use both ParseQuery.each() and my own version of this method (using find() and a chain of Promises), and in both cases the memory usage of node increases until it reaches the 1.4G limit and node crashes. I think that the objects that are queried are never released either because of a memory leak or because of a chain of retains that I have yet to understand.
I've tried running node with --expose-gc and calling gc() at different points, but it makes no difference.

Here is sample code that reproduces the issue. It queries all the objects of a table and then starts over, indefinitely (in case there are not enough objects in the table for the script to reach the 1.4G limit).
If useParseQueryEach is set to true, it uses ParseQuery.each(); otherwise it uses my own implementation (queryEachWithBatchSize).

If this is not related to the Parse SDK but rather to the way I implemented this, please let me know, I will repost it on StackOverflow.

var Parse = require('parse/node').Parse;
var TestObject = Parse.Object.extend("TestObject");

var useParseQueryEach = true;
var queriesBatchSize = 1000;
var statusReportInterval = 1000;

Parse._initialize("APPLICATION_ID", "JAVASCRIPT_KEY", "MASTER_KEY");
Parse.Cloud.useMasterKey();

return queryAllTestObjects(1).then(null, function(error) {
    console.error("Script failed with error " + error.code + " : " + error.message);
});


function queryAllTestObjects(loopCount)
{
    console.log("Will process all test objects (loop count = " + loopCount + ")...");

    var processedTestObjectsCount = 0;
    var allTestObjectsQuery = new Parse.Query(TestObject);

    var callbackFunction = function(testObject) {
        if (++processedTestObjectsCount % statusReportInterval === 0) {
            console.log(processedTestObjectsCount + " test objects processed...");
        }
    };
    var completionHandler = function() {
        console.log(processedTestObjectsCount + " test objects processed.");
        return queryAllTestObjects(loopCount + 1);
    };

    if (useParseQueryEach) {
        return allTestObjectsQuery.each(callbackFunction, {batchSize:queriesBatchSize}).then(completionHandler);
    } else {
        return queryEachWithBatchSize(allTestObjectsQuery, queriesBatchSize, callbackFunction).then(completionHandler);
    }
}

function queryEachWithBatchSize(query, batchSize, callback)
{
    var process = function _process(startCreatedAtDate)
    {
        query.ascending("createdAt");
        query.limit(batchSize);
        if (startCreatedAtDate) {
            query.greaterThan("createdAt", startCreatedAtDate);
        }

        return query.find().then(function(objects) {
            var objectsCount = objects.length;
            for (var i = 0 ; i < objectsCount ; i++) {
                callback(objects[i]);
            }
            if (objectsCount === batchSize) {
                var lastCreatedAt = objects[batchSize - 1].createdAt;
                objects.length = 0;
                return _process(lastCreatedAt);
            } else {
                objects.length = 0;
                return Parse.Promise.as();
            }
        });
    };

    return process(null);
}
@TylerBrock (Contributor)

I'm experiencing the same sort of behavior, and I don't know if it's a problem with our own code or related to the SDK. I'd be interested to see what @andrewimm has to say about this.

@ghugues (Author) commented Dec 10, 2015

Hi,
Can we have some feedback on this? Are you able to reproduce this issue on your side?

@ghugues (Author) commented Dec 11, 2015

I just tried it with version 1.5.0 of the SDK and things work much better.

By default, it seems memory is still not cleared immediately after each query but it does get cleared at some point, usually after the end of each loop. So I get big memory spikes if I don't change the max_old_space_size value, but in this case it's probably tied to the way garbage collection works and has nothing to do with the SDK.
I've tried running the sample code from above with version 1.5.0 and --max_old_space_size=256 and it works fine (at least up to 250k objects in only one loop, at which point I stopped the test).

If I run the same test with version 1.6.9 and --max_old_space_size=256, it crashes after querying around 38k objects (FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - process out of memory).

Just for the record, I also tested this in cloud code, but it fails with "the service is currently unavailable" in the dashboard and "V8 Server Error" in the logs for both the 1.5.0 and 1.6.7 versions of the SDK (after 5 minutes and 39,000 objects for 1.5.0, and 1 minute and 34,000 objects for 1.6.7). But it might be a completely different issue, so let's concentrate on the 1.6.9 SDK issue for now.

I would really appreciate your feedback on this. I've had this issue for a long time now. I didn't report it before because I didn't have time to investigate further, but this is a serious issue.

@ghugues (Author) commented Dec 11, 2015

Also, I should mention that the objects I ran these tests on are rather big (each has an array of approximately 20 Parse objects). So even though I don't include the related objects, decoding the response does create empty Parse objects, which is why memory gets used up so quickly with such a small number of objects.
It doesn't change the fact that there is an issue with the SDK, though.

@andrewimm (Contributor)

So we've pinpointed the reason behind this error, but we'll need community input to resolve it. Ultimately, the JS SDK is designed as a client SDK: it's optimized for single-user web and react native apps. When an object is created, we store its server data in the Object Store, and let the ParseObject instance act as an interaction layer for that data. This gives us a number of abilities we didn't have in the past. This is a pretty common scenario when dealing with client-side caches. However, this also means that object data does not get garbage collected when the shallow instances are.

We've never really designed the JS SDK to be used to perform long-running jobs over many thousands of objects. If we were to build something to that end, it probably would not resemble the client/server SDK.

A short-term solution would be to give you the ability to release() objects when you're sure they're not used elsewhere (ideal if you're just running an .each() loop). There is already an internal method that can clear all data from the ObjectStore, but this would be a bit more granular. We could even go further with some level of ref-counting and such, but that seems a bit ridiculous.

myQuery.each((obj) => {
  // ... process obj
  // ...
  // ...

  obj.release();
});

Thoughts?

@roddylindsay

How about writing a piece of express middleware that would track the cache entries during a request and free them when the response is issued? I think that would get us most of the way there in a node/express environment without sprinkling release() calls everywhere in the code.
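The middleware idea above could be sketched roughly like this. Note this is purely illustrative: obj.release() is the hypothetical API proposed earlier in this thread, and trackParseObject is an invented hook name, not part of Express or the Parse SDK.

```javascript
// Sketch of the proposed middleware, assuming a hypothetical obj.release() API.
function parseReleaseMiddleware(req, res, next) {
  var tracked = [];
  // Request handlers register any Parse objects they fetch:
  req.trackParseObject = function (obj) { tracked.push(obj); };
  // Once the response has been sent, release everything that was tracked:
  res.on('finish', function () {
    tracked.forEach(function (obj) { obj.release(); });
    tracked.length = 0;
  });
  next();
}
```

This keeps the release bookkeeping in one place, at the cost of having to remember to call the tracking hook wherever objects are fetched.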

@andrewimm (Contributor)

If we add release(), there are a number of solutions that can be implemented. The middleware idea is one; another is to special-case .each() with an option that releases each object when the loop finishes.

myQuery.each((obj) => {
  // ...
}, { releaseObjects: true })

The logic could be something along the lines of: if the object was not previously allocated, and options.releaseObjects is true, automatically release each object when the each() callback completes.
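That logic could be sketched as follows. Everything here is hypothetical: the SDK has no eachWithRelease, and the store/preexisting bookkeeping stands in for the internal object store.

```javascript
// Hedged sketch of the proposed releaseObjects logic. `store` stands in for
// the internal object store (keyed by object id); `preexisting` is the set of
// ids that were already allocated before this each() loop began.
function eachWithRelease(results, preexisting, callback, options, store) {
  results.forEach(function (obj) {
    callback(obj);
    if (options.releaseObjects && !preexisting.has(obj.id)) {
      // Safe to drop: this object only existed for the duration of the loop.
      store.delete(obj.id);
    }
  });
}
```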

@roddylindsay

@andrewimm sorry, just to be clear, does this issue affect long-running node servers running parse queries as well? or is this limited to a job scenario where one is touching lots of objects in a single job?

@andrewimm (Contributor)

This would probably affect long-running servers that see a significantly large number of objects. It may be that we need some restructuring when running in single-instance object mode.

FWIW, I'm unofficially exploring some new directions around a server SDK that might eventually become part of the core of the standard JS SDK -- things like purely-functional APIs for interacting with objects, which reduce side effects and remove the need for any sort of global store. It's completely experimental for now.

@ghugues (Author) commented Dec 11, 2015

First of all, I added the following method to Parse.js. I call it in my custom queryEachWithBatchSize function after processing the queried objects and it works great.

Parse.clearObjectStore = function () {
    _interopRequireWildcard(require('./ObjectState'))._clearAllState();
    _CoreManager2['default'].getStorageController().clear();
};

Please note that I have absolutely no idea what I'm doing here, so if that code made your eyes bleed, I'm very sorry. But I think something like this could be a good quick fix (even if it remains a private undocumented method).

Going further, regarding your proposal: again, I'm an iOS developer, not a JavaScript developer, but here are my thoughts anyway:

I think that manually releasing objects is going to be a pain. I'm guessing that if an object I just queried has a pointer to another Parse object, an entry for that object is also added to the object store? That means I'd have to recursively release all referenced objects to clean up everything properly. And then there are model changes with new properties and objects, and cycles, and it gets very complicated to handle, if not impossible.

I've seen somewhere in the source a mention of 'single instance objects' that apparently can be enabled or disabled? I'm guessing you have something similar to the iOS SDK local datastore, where there is only one instance for a given objectId, and that instance gets reused even if I requery the object using its id?
If so, that's probably why you took this approach with a shared object store. But what if I don't want this functionality? If 'single instance objects' are disabled, then there is no need for a shared ObjectState pool, and every ParseObject can have its own ObjectState? I think that's how it works in the iOS SDK when the local datastore is disabled.
My thinking is that 'single instance objects' are great for a user-facing app (a browser), but in other cases (cloud code triggers, background jobs and even express apps), I don't see the benefits.

Going even further, what about implementing the ObjectStore using a WeakMap or a WeakSet? I've seen these as part of ES6, and if they behave the way I think they do, this could solve the issue. ParseObjects would keep a strong reference to their ObjectState, and the ObjectStore would keep a weak reference?

Bottom line: I see 3 solutions (that could be implemented one after the other, in this order):

  • Add a public or private clearObjectStore() method
  • Add the ability to disable the shared ObjectStore (if that's even possible)
  • Implement using weak references (if Javascript is even capable of such a thing)
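The WeakMap idea in the third option can be illustrated in a few lines. This is a minimal sketch of the proposal, not the SDK's actual code (setServerData/getServerData are invented names):

```javascript
// Object state keyed weakly by the ParseObject instance itself: when the
// instance has no other references, its entry becomes eligible for GC too.
var objectStore = new WeakMap();

function setServerData(parseObject, serverData) {
  objectStore.set(parseObject, serverData);
}

function getServerData(parseObject) {
  // undefined if the object was never stored (or was already collected)
  return objectStore.get(parseObject);
}
```

Since WeakMap keys must be objects and cannot be enumerated, this only works if state is always looked up by instance rather than by (className, objectId) pairs.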

@ghugues (Author) commented Dec 11, 2015

And I almost said: what about a 'command line tool SDK'? You're right, it seems like a missing piece.
I once wrote a command line tool using the OS X SDK (yes!). It really didn't work well, but it was just for uploading stuff, and at the time I felt a small wrapper around the REST API with objects would be great.

@andrewimm (Contributor)

@ghugues https://github.com/ParsePlatform/Parse-SDK-JS/blob/master/src/ParseObject.js#L1031
Already got you covered with the unpublicized Parse.Object._clearAllState() 😉

You're correct in your assessment of Single-instance objects, and the reasoning behind them. Nearly all use cases of the JS SDK are for clients, so it's optimized for these cases. Single-instance mode is disabled by default for users.
The way JS implements GC complicates things here, but I think the answer is "have a better API to handle scenarios like this" rather than "modify the JS SDK to have completely divergent behavior when a toggle is flipped".

@ghugues (Author) commented Dec 11, 2015

Ah, thanks! I searched for a private API to make this call without modifying the source but didn't find it.

@ghugues (Author) commented Dec 11, 2015

Personally I have nothing against a deadly toggle as long as it's properly documented.

But a good API could solve the issue as well. However, as I mentioned, it should include a recursiveRelease() that releases an object and all the objects it references (and handles cycles correctly). But I guess you already have that code somewhere for the save() method.

@kevinwritescode

I was about to post this separately, but it seems like the same situation. I have been running Parse 1.5.0 in Node 4.2.2 and memory has held an average of 256 MB, but if I try to upgrade to any version of Parse 1.6.*, I see steadily growing memory usage that climbs to 1 GB each day and triggers alerts on my Heroku servers.

Along with changing the package.json file, I am updating the Parse require:

var Parse = require("parse").Parse; // 1.5.0
var Parse = require("parse/node").Parse; // 1.6.*

I am also commenting out the following line when migrating to 1.6.*

Parse.User.enableUnsafeCurrentUser(); // 1.5.0
// Parse.User.enableUnsafeCurrentUser(); // 1.6.*

I have no other differences between my two releases. @andrewimm, is there a recommended way of using Parse in the Node environment so I can avoid these memory spikes? The discussion here seems unique to the ways that @ghugues and I may be using Parse objects?

I feel like I'm using Parse in a fairly standard behavior of querying objects, using them to handle various bits of logic during a user's API request, maybe performing an insert or update, then closing the request. I do stub objects for querying purposes, but those are only used per user API request.


@TylerBrock (Contributor)

Yep, Kevin, I'm seeing this exactly.

@andrewimm, is it your recommendation that users running a web application call Parse.Object._clearAllState() at the end of each request (or every n requests) in middleware somewhere?

I see it's defined, but the SDK itself does not call it anywhere.

@andrewimm (Contributor)

_clearAllState() is actually designed to be used by tests, but it achieves this end as well. The reason I want to explore a more granular version is that the Object Store is a singleton; if you have n parallel requests in progress, clearing object state for one will likely corrupt the rest.

I do think that to solve this, we may need to avoid the global store in single-instance mode, which would be a significant refactor but not impossible. However, I also personally feel that there is room for an API that is significantly more tailored towards the server experience.

@TylerBrock (Contributor)

I agree on all counts. This is painful right now in production, so any efforts to ease this problem are greatly appreciated.

@andrewimm (Contributor)

This is currently my top priority

@andrewimm andrewimm added the type:bug Impaired feature or lacking behavior that is likely assumed label Dec 21, 2015
@andrewimm (Contributor)

Since this requires some major restructuring under the hood for non-single-instance mode, I'm going to reserve these changes for 1.7.0 (even though they don't affect public APIs in any way).

I'll try to get much of that squared away today.

@andrewimm andrewimm assigned andrewimm and unassigned peterdotjs Dec 21, 2015
@TylerBrock (Contributor)

Hey Andrew, any updates on this one?

@ghost commented Jan 7, 2016

Hello Andrew,

I too am experiencing odd Parse job errors in cloud code with my custom functions that query large datasets. I'm receiving "the service is currently unavailable" as my returned status. It seems like Parse completely terminates the function without running through promise error handling. Here is a sample of my code:

Parse.Cloud.job('userJob', function (request, status) {
    var Users = queryLarge(1, startQuery, endQuery);
    Parse.Promise.when([Users]).then(function (Users) {
        // do stuff
    },function(error) {
        // error
    });
});

function setQuery (choice, startQuery, endQuery) {

    var QueryObject = "_User";

    // Query QueryObject
    var query = new Parse.Query(QueryObject);

    if (choice == 1) {
        // _User
        query.greaterThanOrEqualTo('createdAt', startQuery);
        query.lessThanOrEqualTo('createdAt', endQuery);
    }

    return query;

}

function queryLarge (choice, startQuery, endQuery) {

    // Start at first 1000
    // Can only pull 1000 at a time

    var skip = 0;
    var query = setQuery(choice, startQuery, endQuery);
    var QueryObject = setQueryObject(choice);

    Parse.Cloud.useMasterKey();

    return query.count().then(function (length) {

        var Items = [];

        // Create a Promise
        var Promise = new Parse.Promise();

        async.whilst(
            function() {return skip < length;},
            function(callback) {

                if ((skip+1000) > length) {
                    console.log(" Pulling "+skip+" to "+length+" "+QueryObject+"s...");
                } else {
                    console.log(" Pulling "+skip+" to "+(skip+1000)+" "+QueryObject+"s...");
                }

                // If skip exceeds 10,000, the call fails. Therefore, a createdAt check
                // must be performed to ensure that all objects are returned.

                if (Items.length > 0) {
                    // If we've gone through the function at least once, let's set the createdAt date
                    // of the final element.
                    var createdAt = Items[Items.length - 1];
                    createdAt = createdAt.createdAt;

                } else {
                    // If first time through, set createdAt to a future date
                    var createdAt = '2020-01-14T02:05:24.711Z';
                }

                runQuery(skip, createdAt, choice, startQuery, endQuery, function (newItems) {
                    Items = Items.concat(newItems);
                    skip = skip + 1000;
                    callback(null);
                });

            },
            //callback for the whilst loop. this will be called once the condition is no longer met (i < length)
            function(err) {

                // All done

                // Set oldest user to top
                var ItemsPromise = Items.reverse();

                // Resolve Promise
                Promise.resolve(Items);

            });

        // Return Resolved Promise
        return Parse.Promise.when([Promise]);

    }).then(function (Items) {

        // Success
        // Return Items to Function Call
        return Items;

    }, function (error) {

        // Error
        return Parse.Promise.error(error);

    });

}

function runQuery (skip, createdAt, choice, startQuery, endQuery, callback) {

    // Run query object

    var query = setQuery(choice, startQuery, endQuery);

    query.limit(1000);
    query.descending("createdAt");
    query.lessThan('createdAt', createdAt);

    Parse.Cloud.useMasterKey();

    query.find().then(function (Results) {

        callback(Results);

    }, function (error) {

        // Error
        console.log(error);
        callback(error);

    });

}

@ghost commented Jan 8, 2016

I transferred my function to a local machine, and received this error after about 20 minutes of running my loop:

FATAL ERROR: CALL_AND_RETRY_2 Allocation failed - process out of memory
Abort trap: 6

This call also gets extremely slow once the Items array gets very large:

Items = Items.concat(newItems);

I haven't tested with .push() but expect similar results. I'm unsure whether it's the size of the array or the memory provided for the entire execution of node.
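One way to sidestep the ever-growing array is to process each batch as it arrives and drop it before fetching the next. A generic sketch, where fetchBatch and processBatch are hypothetical stand-ins for the createdAt-paging and calculation code above:

```javascript
// Process pages one at a time so earlier batches can be garbage collected.
// fetchBatch(cursor) resolves to { items, nextCursor } (nextCursor null when
// there are no more pages).
function processInBatches(fetchBatch, processBatch) {
  function step(cursor) {
    return fetchBatch(cursor).then(function (page) {
      processBatch(page.items);   // run the calculations on this batch...
      page.items.length = 0;      // ...then drop references to its objects
      return page.nextCursor != null ? step(page.nextCursor) : undefined;
    });
  }
  return step(null);
}
```

This only helps if the calculations can be run incrementally (e.g. keeping running totals) rather than needing every object in memory at once.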

@jmornar commented Jan 11, 2016

@andrewimm First of all, thanks for such a fast reply regarding the 1.7 release date. Regarding the suggested workaround (Parse.Object._clearAllState()), I'm not sure it's a good idea in our case: we have a node instance that communicates with Parse and serves multiple clients (iOS, Android, website), so we have many requests in parallel, and as you noted above that could lead to unwanted behaviour. I wanted to hear your thoughts on one possible workaround: if we use the Parse SDK to construct a query and then use the serialized query (query.toJSON()) to retrieve data via the Parse REST API (without calling the SDK's find() at all), could that be a solution to the memory leak and the global store issue?

@andrewimm (Contributor)

I recommend clearing objects individually ("releasing" them), when you're sure you're done with them. This should be possible with ObjectStore.removeState(), but it really depends on knowing whether it's safe in your specific environment. _clearAllState is absolutely dangerous in real scenarios, as you point out.

You could also definitely use pieces of the SDK like Parse.Query's toJSON, and the methods of RESTController (Parse.CoreManager.get('RESTController')) to make your requests without allocating new Parse Objects. When it comes to the future of effective server-side programming for Parse, I believe that there is a significant value in exploring purely-functional SDK functionality that operates over immutable data, without allocating or interacting with OOP-style instances.
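The toJSON()-plus-RESTController route might look something like this. The request() signature used here is an assumption based on this thread; in a real app, restController would be Parse.CoreManager.get('RESTController'):

```javascript
// Run a query through the REST controller so results stay plain JSON and
// never get decoded into ParseObject instances (and thus never enter the
// global object store). findRaw is a hypothetical helper, not an SDK API.
function findRaw(restController, className, query) {
  var params = typeof query.toJSON === 'function' ? query.toJSON() : query;
  return restController.request(
    'GET',
    'classes/' + className,
    params,                     // serialized where / limit / order constraints
    { useMasterKey: true }
  ).then(function (response) {
    return response.results;    // plain objects; no ParseObject allocation
  });
}
```

The trade-off is losing the ParseObject API (get/set/save) on the results; they are raw rows you must interpret yourself.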

@TylerBrock (Contributor)

So... I made something awful that abuses the cluster module to seamlessly replace workers after they consume some amount of resident memory OR service a given number of requests:
https://github.com/HustleInc/regiment

Wanted to share in case anyone else is feeling brave while waiting for 1.7.

@TylerBrock (Contributor)

@andrewimm I don't think it's manageable in large applications to keep track of all the objects and individually release them. When including related objects, would you need to removeState() for them as well? What about a role instance that retrieves users via a relation? There are too many edge cases that should simply be covered by GC.

@andrewimm (Contributor)

As I said, it's really tied to your specific code. If all your function does is run an .each() over a bunch of objects and mutate them all in some way (which seems to be a common cloud operation), you could be pretty safe about this. In other scenarios, it's a lot riskier.
There are also some solutions that work a lot better in Cloud Code's isolated sandbox environment than they do in a node environment where all memory is shared.

Ultimately, it's probably not worth too much thought. I estimate we can have 1.7 out in the next day.

@andrewimm (Contributor)

(or at least an RC of it... I actually think I want to release it that way first)

@ghost commented Jan 12, 2016

Andrew,

How different is Cloud Code's sandbox from Node + the Parse JavaScript SDK? I've certainly dealt with issues getting node modules working correctly. Do you suggest any modifications to my code to get around the memory issue of holding a large number of objects in an array for processing? (I don't save anything; I just run calculations.)

@andrewimm (Contributor)

@bionetmarco cloud code and node are both based on the v8 engine, but node adds a lot of abstractions and core libs that aren't available in cloud code. In node, you have a single long-running process for a server, while in cloud code each request spins up a new execution environment for memory safety. I'd love to dig into ways to be memory-efficient when just running calculations, but I think the best bet is now to just install 1.7.0-rc1 if you're running on node.

For everyone: these memory issues should be resolved in the 1.7.0 RC, available with npm install [email protected]. We'll probably do a full release of 1.7.0 in the next week (including a release for cloud code), assuming no major issues crop up.

@TylerBrock (Contributor)

@andrewimm Thanks for all the hard work here. I can see that you are now using WeakMap for object state, which will help a ton, but I wanted to see if you have any guidance on taking advantage of this restructuring beyond just swapping the library in for 1.6.x (other changes aside).

@andrewimm (Contributor)

@TylerBrock I understand you're wary of just dropping in new SDK versions :)
In node environments, where single instance mode is disabled by default, there should be absolutely no behavioral changes. Before, object data was fetched through a combination of classname, objectId, and a unique number generated in each object's constructor. Now, they are fetched by a reference to each individual ParseObject instance. This maintains the invariant that each new ParseObject() has its own set of server data and local changes, but improves upon the old style by garbage collecting this data when the object itself is GC'd.

Both the unit tests and our internal integration tests suggest that there are no behavior changes (at least related to this individual change). The move to A+ compliance by default means that certain internal exceptions are now passed to reject clauses of Promises, rather than cascading up through the stack.

@TylerBrock (Contributor)

Yup, thanks. The change log was legit this time around, I just wanted to make sure that there wasn't anything that needed to be done other than requiring parse/node (which selects the single instance store).

Our init code and tests disable A+ for now so that the behavior stays the same, although we are itching to take advantage of it in the future. Thanks again for all the hard work.

7 participants