Memory leak when using ParseQuery in a loop #111
Comments
I'm experiencing the same sort of behavior and I don't know if it is a problem with our own code or if it is related to the SDK. Would be interested to see what @andrewimm had to say about this. |
Hi, |
I just tried it with version By default, it seems memory is still not cleared immediately after each query, but it does get cleared at some point, usually after the end of each loop. So I get big memory spikes if I don't change the If I run the same test with version Just for the record, I also tested this in cloud code, but it fails with I would really appreciate your feedback on this. I've had this issue for a long time now. I didn't report it before because I didn't have time to investigate further, but this is a serious issue. |
Also, I should mention that the objects I have run these tests on are rather big (each has an array of approximately 20 Parse objects). So even though I don't include the related objects, decoding the response does create empty Parse objects, which is why memory gets used up so quickly with such a small number of objects. |
So we've pinpointed the reason behind this error, but we'll need community input to resolve it. Ultimately, the JS SDK is designed as a client SDK: it's optimized for single-user web and React Native apps. When an object is created, we store its server data in the Object Store, and let the ParseObject instance act as an interaction layer for that data. This gives us a number of abilities we didn't have in the past. This is a pretty common scenario when dealing with client-side caches. However, this also means that object data does not get garbage collected when the shallow instances are. We've never really designed the JS SDK to be used to perform long-running jobs over many thousands of objects. If we were to build something to that end, it probably would not resemble the client/server SDK. A short-term solution would be to give you the ability to
myQuery.each((obj) => {
  // ... process obj
  // ...
  // ...
  obj.release();
});
Thoughts? |
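To make the retention problem described above concrete, here is a minimal sketch of a global object cache. It is not the SDK's actual code: `objectStore`, `ShallowObject`, `processAll`, and the `release()` method are all hypothetical names chosen for illustration. It only shows why data stays reachable after the shallow instance is dropped, and how an explicit release would change that.

```javascript
// Hypothetical miniature of a client-side object cache; not the SDK's real code.
const objectStore = new Map(); // global table: object id -> server data

class ShallowObject {
  constructor(id, serverData) {
    this.id = id;
    // The instance keeps only the id; the data itself lives in the global store.
    objectStore.set(id, serverData);
  }

  get(key) {
    const data = objectStore.get(this.id);
    return data ? data[key] : undefined;
  }

  // Hypothetical release(): removes the store entry so the data becomes collectable.
  release() {
    objectStore.delete(this.id);
  }
}

// Creates `count` objects, reads one field from each, and optionally releases them.
// Returns the number of entries left in the global store afterwards.
function processAll(count, releaseEach) {
  for (let i = 0; i < count; i++) {
    const obj = new ShallowObject('id' + i, { payload: new Array(100).fill(i) });
    obj.get('payload'); // ... process obj
    if (releaseEach) {
      obj.release();
    }
    // `obj` becomes unreachable here, but without release() its data does not.
  }
  return objectStore.size;
}
```

Without the release call, the store grows by one entry per processed object even though no instance is still referenced, which matches the steady memory growth reported in this thread.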
How about writing a piece of express middleware that would track the cache entries during a request and free them when the response is issued? I think that would get us most of the way there in a node/express environment without introducing sprinkling |
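A rough sketch of what such middleware could look like follows. It assumes a release function is supplied by the caller; `createReleaseMiddleware`, `releaseFn`, and `req.trackObject` are hypothetical names, not part of the SDK or of express. The only external behavior relied on is that Node's response object emits `finish` once the response has been sent.

```javascript
// Sketch only: the per-request registry below is hypothetical, not an SDK API.
function createReleaseMiddleware(releaseFn) {
  return function releaseOnFinish(req, res, next) {
    const tracked = new Set();

    // Request handlers call req.trackObject(obj) for every object they fetch.
    req.trackObject = (obj) => {
      tracked.add(obj);
      return obj;
    };

    // 'finish' is emitted by Node's http.ServerResponse after the response is sent.
    res.on('finish', () => {
      tracked.forEach((obj) => releaseFn(obj));
      tracked.clear();
    });

    next();
  };
}
```

It would be registered like any other express-style middleware, for example `app.use(createReleaseMiddleware(myRelease))`, where `myRelease` is whatever per-object cleanup ends up being available. A connection that closes before the response completes may not emit `finish`, so a real implementation would probably need to listen for `close` as well.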
If we add myQuery.each((obj) => {
// ...
}, { releaseObjects: true }) The logic could be something along the lines of: if the object was not previously allocated, and |
@andrewimm sorry, just to be clear, does this issue affect long-running node servers running parse queries as well? or is this limited to a job scenario where one is touching lots of objects in a single job? |
This would probably affect long-running servers that see a significantly large number of objects. It may be that we need some restructuring when running in single-instance object mode. FWIW, I'm unofficially exploring some new directions around a server SDK that might eventually become part of the core of the standard JS SDK -- things like purely-functional APIs for interacting with objects, which reduce side effects and remove the need for any sort of global store. It's completely experimental for now |
First of all, I added the following method to Parse.clearObjectStore = function () {
_interopRequireWildcard(require('./ObjectState'))._clearAllState();
_CoreManager2['default'].getStorageController().clear();
};
Please note that I have absolutely no idea what I'm doing here, so if that code made your eyes bleed, I'm very sorry. But I think something like this could be a good quick fix (even if it remains a private undocumented method). Going further, about your proposal: again, I'm an iOS developer, not a JavaScript developer, but here are my thoughts anyway. I think that manually releasing objects is going to be a pain. I'm guessing that if an object I just queried has a pointer to another Parse object, then an entry for that object is also added to the object store? That means that I have to recursively release all referenced objects if I want to clean up everything properly. And then there are model changes with new properties and objects, and loops, and it gets very complicated to handle, if not impossible. I've seen somewhere in the source a mention of 'single instance objects' that apparently can be enabled or disabled? I'm guessing you have something similar to the iOS SDK local data store, where I have only one instance for a given objectId, and that instance gets reused even if I requery the object using its id? Going even further, what about implementing the ObjectStore using a WeakMap or a WeakSet? I've seen these objects as part of ES6, and if they behave the way I think they do, this could solve the issue. ParseObjects would keep a strong reference to their objectState and the ObjectStore would keep a weak reference? Bottom line is I see 3 solutions (that could be implemented one after the other, in this order)
|
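The WeakMap idea above can be sketched as follows. This is an illustration under assumptions, not the SDK's implementation: `stateByInstance` and `WeakBackedObject` are hypothetical names, and the state is keyed by the instance itself rather than by objectId.

```javascript
// State is keyed by the instance, so there is no global id -> data table.
// A WeakMap does not keep its keys alive: once an instance is unreachable,
// the engine is free to collect both the instance and its state.
const stateByInstance = new WeakMap();

class WeakBackedObject {
  constructor(id, serverData) {
    this.id = id;
    stateByInstance.set(this, { serverData });
  }

  get(key) {
    return stateByInstance.get(this).serverData[key];
  }
}
```

The trade-off of this design is that two instances created for the same id no longer share state, so it only fits a mode where single-instance behavior is not required.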
And I almost said : what about a 'command line tool SDK' ? You're right, it seems like a missing part. |
@ghugues https://github.com/ParsePlatform/Parse-SDK-JS/blob/master/src/ParseObject.js#L1031 You're correct in your assessment of Single-instance objects, and the reasoning behind them. Nearly all use cases of the JS SDK are for clients, so it's optimized for these cases. Single-instance mode is disabled by default for users. |
Ah thanks ! I searched for some private API to make this call without modifying the source but didn't find it. |
Personally I have nothing against a deadly toggle as long as it's properly documented. But a good API can solve the issue as well. However as I mentioned, it should include a |
I was about to post this separately but this seems like the same situation. I have been running Parse 1.5.0 in Node 4.2.2 and memory has maintained an average of 256 MB, but if I try to upgrade to any version of Parse 1.6.*, I start to see steady memory usage that overflows up to 1GB each day and triggers alerts on my Heroku servers. Along with changing the package.json file, I am updating the Parse require:
var Parse = require("parse").Parse; // 1.5.0
var Parse = require("parse/node").Parse; // 1.6.*
I am also commenting out the following line when migrating to 1.6.*:
Parse.User.enableUnsafeCurrentUser(); // 1.5.0
// Parse.User.enableUnsafeCurrentUser(); // 1.6.*
I have no other differences between my two releases. @andrewimm, is there a recommended way of using Parse in the Node environment so I can avoid these memory spikes? The discussion here seems unique to the ways that @ghugues and I may be using Parse objects? I feel like I'm using Parse in a fairly standard behavior of querying objects, using them to handle various bits of logic during a user's API request, maybe performing an insert or update, then closing the request. I do stub objects for querying purposes, but those are only used per user API request. |
Yep, Kevin, I'm seeing exactly this. Andrewimm, is it your recommendation that users running a web application call Parse.Object._clearAllState() at the end of each request (or n requests) in middleware somewhere? I see it's defined but the SDK itself does not call it anywhere. |
I do think that to solve this, we may need to avoid the global store in single-instance mode, which would be a significant refactor but not impossible. However, I also personally feel that there is room for an API that is significantly more tailored towards the server experience. |
I agree on all accounts. This is painful right now in production so any efforts to ease this problem are greatly appreciated. |
This is currently my top priority |
Since this requires some major restructuring under the hood for non-single-instance mode, I'm going to reserve these changes for 1.7.0 (even though they don't affect public APIs in any way). I'll try to get much of that squared away today. |
Hey Andrew, any updates on this one? |
Hello Andrew, I too am experiencing odd Parse Job errors in Cloud Code with my custom functions that query large datasets. I'm receiving "the service is currently unavailable", as my returned status. It seems like Parse completely terminates the function, without running through promise error handling. Here is a sample of my code:
|
I transferred my function to a local machine, and received this error after about 20 minutes of running my loop:
This call also gets extremely slow once the Items size gets very large:
I haven't tested with .push() but expect similar results. I'm unsure whether it's the size of the array or the memory provided for the entire execution of node. |
@andrewimm First of all, thanks for such a fast reply regarding the version 1.7 release date. Regarding the suggested workaround (Parse.Object._clearAllState()), I'm not sure it's a good idea in our case: we have a node instance that communicates with Parse and serves multiple clients (iOS, Android, website), so we have many requests in parallel and, as you noted above, that could lead to unwanted behaviour. I wanted to hear your thoughts on one possible workaround. What if we use the Parse SDK to construct a Parse query and then use the constructed query ( query.toJSON() ) to retrieve data through the Parse REST API (without calling the SDK method find() at all)? Could that be a possible solution to the memory leak and the global store issue? |
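As a rough sketch of that workaround, the helper below turns a plain query description into a REST URL. `buildRestQueryUrl` and the server URL are hypothetical, and the input is assumed to be a plain object shaped like `{ where: {...}, limit: 10 }`; whether the REST API accepts every field that query.toJSON() produces is an assumption to verify, not something this thread confirms.

```javascript
// Builds a GET URL for a class query from a plain JSON description of the query.
// `serverUrl` and the shape of `queryJson` are assumptions for illustration.
function buildRestQueryUrl(serverUrl, className, queryJson) {
  const params = new URLSearchParams();
  for (const [key, value] of Object.entries(queryJson)) {
    // Object-valued constraints such as `where` are sent as JSON strings;
    // scalar options such as `limit` or `skip` are sent as plain values.
    const encoded = typeof value === 'object' ? JSON.stringify(value) : String(value);
    params.set(key, encoded);
  }
  return `${serverUrl}/classes/${encodeURIComponent(className)}?${params.toString()}`;
}
```

The resulting URL would then be requested with any HTTP client, together with the usual application credentials in the request headers. Because the response is plain JSON and never passes through the SDK's object decoding, nothing is added to the SDK's global store.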
I recommend clearing objects individually ("releasing" them), when you're sure you're done with them. This should be possible with You could also definitely use pieces of the SDK like Parse.Query's |
So... I made something awful that abuses the Wanted to share in case anyone else was brave and waiting for 1.7 |
@andrewimm I don't think it's manageable in large applications to keep track of all the objects and individually release them. When including related objects would you need to |
As I said, it's really tied to your specific code. If all your function does is run an Ultimately, it's probably not worth too much thought. I estimate we can have 1.7 out in the next day. |
(or at least an RC of it... I actually think I want to release it that way first) |
Andrew, how different is Cloud Code's sandbox from Node + the Parse JavaScript SDK? I've certainly dealt with issues getting node modules working correctly. Do you suggest any modifications to my code to get around the memory issue of holding a large number of objects in an array for processing? (I do not save anything, just run calculations). |
@bionetmarco cloud code and node are both based on the v8 engine, but node adds a lot of abstractions and core libs that aren't available in cloud code. In node, you have a single long-running process for a server, while in cloud code each request spins up a new execution environment for memory safety. I'd love to dig into ways to be memory-efficient when just running calculations, but I think the best bet is now to just install 1.7.0-rc1 if you're running on node. For everyone: these memory issues should be resolved in the 1.7.0 RC, available with |
@andrew Thanks for all the hard work here. I can see that you are now using WeakMap for object state which will help a ton but I wanted to see if you have any guidance around taking advantage of this restructuring beyond just swapping the library in for 1.6.x (accounting for other changes aside) |
@TylerBrock I understand you're wary of just dropping in new SDK versions :) Both the unit tests and our internal integration tests suggest that there are no behavior changes (at least related to this individual change). The move to A+ compliance by default means that certain internal exceptions are now passed to reject clauses of Promises, rather than cascading up through the stack. |
Yup, thanks. The change log was legit this time around, I just wanted to make sure that there wasn't anything that needed to be done other than requiring Our init code and tests disable A+ for now so that the behavior there is the same for now although we are itching to take advantage of it in the future. Thank you again for all the hard work. |
The title of this issue might be misleading as I don't know exactly what is happening.
I originally tried to write a background job that loops over all the rows of a table. Now I'm running this on my computer until I find a fix (which is why I use the node environment).
I've tried to use both ParseQuery.each() and my own version of this method (using find() and a chain of Promises), and in both cases the memory usage of node increases until it reaches the 1.4G limit and node crashes. I think that the objects that are queried are never released, either because of a memory leak or because of a chain of retains that I have yet to understand.
I've tried to run node with --expose-gc and call gc() at different points, but it doesn't work.
Here is a sample code that will reproduce the issue. It queries all the objects of a table and then starts over, indefinitely (in case there are not enough objects in the table to make the script reach the 1.4G limit).
If useParseQueryEach is set to true, it will use ParseQuery.each(); otherwise it will use my own implementation (queryEachWithBatchSize).
If this is not related to the Parse SDK but rather to the way I implemented this, please let me know, and I will repost it on StackOverflow.
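For readers without the original sample, a batched iteration of the kind described might look like the sketch below. It is not the author's queryEachWithBatchSize: `eachInBatches` and the `fetchPage(lastId, batchSize)` callback are hypothetical, with `fetchPage` standing in for something like a find() limited to `batchSize` results and ordered by id.

```javascript
// Iterates over a table in pages, holding at most one batch in memory at a time.
// `fetchPage(lastId, batchSize)` is a hypothetical callback resolving to an array
// of objects with an `id` field, ordered by id and starting after `lastId`.
async function eachInBatches(fetchPage, batchSize, callback) {
  let lastId = null;
  let processed = 0;
  for (;;) {
    let batch = await fetchPage(lastId, batchSize);
    for (const obj of batch) {
      await callback(obj);
      processed++;
    }
    if (batch.length < batchSize) {
      return processed; // a short (or empty) page means the table is exhausted
    }
    lastId = batch[batch.length - 1].id;
    batch = null; // drop the reference before requesting the next page
  }
}
```

A loop written this way keeps no reference to earlier batches, so if memory still grows while it runs, the objects are being retained somewhere other than the calling code.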