-
Notifications
You must be signed in to change notification settings - Fork 58
[generator] Use GC.KeepAlive for reference type method parameters. #722
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -162,6 +162,7 @@ public partial class MyClass { | |||
_members.InstanceMethods.InvokeVirtualVoidMethod (__id, this, __args); | |||
} finally { | |||
JNIEnv.DeleteLocalRef (native_value); | |||
global::System.GC.KeepAlive (value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does not need to be addressed for d16-8: We don't need a GC.KeepAlive()
for types which will be "deep marshaled". This includes System.String
and array types. For master, would we be able to skip this?
@@ -260,6 +261,7 @@ public partial class MyClass { | |||
return __rm; | |||
} finally { | |||
JNIEnv.DeleteLocalRef (native_key); | |||
global::System.GC.KeepAlive (key); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does not need to be addressed for d16-8: We don't need a GC.KeepAlive()
for types which will be "deep marshaled". This includes System.String
and array types. For master, would we be able to skip this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we should be able to do that.
For the array types, would we never need the KeepAlive
, or does it depend on the array element type?
ie: int[]
versus object[]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it will depend on the array element type, as if we have a Java.Lang.Object[]
, then everything within the array needs to be kept alive, and GC.KeepAlive(array)
will ensure that everything within the array is kept alive…
So maybe all array types should retain the GC.KeepAlive()
…
@@ -103,6 +103,7 @@ public partial class MyClass { | |||
__args [0] = new JniArgumentValue (value); | |||
_members.InstanceMethods.InvokeAbstractVoidMethod (__id, this, __args); | |||
} finally { | |||
global::System.GC.KeepAlive (value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
byte
is a value type. Can we remove this GC.KeepAlive()
?
Commit message: Context: https://github.com/xamarin/java.interop/issues/719
@brendanzagaeski has been investigating a Xamarin.Android app crash:
JNI DETECTED ERROR IN APPLICATION: use of deleted global reference 0x3d86
from android.view.View crc64720bb2db43a66fe9.FragmentContainer.n_onCreateView(android.view.LayoutInflater, android.view.ViewGroup, android.os.Bundle)
…
at crc64720bb2db43a66fe9.FragmentContainer.n_onCreateView(Native method)
at crc64720bb2db43a66fe9.FragmentContainer.onCreateView(FragmentContainer.java:41)
This had been a head-scratcher, but we had a GREF log, so all should
be clear, right?
09-10 17:56:48.280 10123 10123 I monodroid-gref: +g+ grefc 1141 gwrefc 0 obj-handle 0x79/I -> new-handle 0x3d86/G from thread '(null)'(1)
09-10 17:56:48.294 10123 10123 I monodroid-gref: +w+ grefc 1140 gwrefc 2 obj-handle 0x3d86/G -> new-handle 0x1e3/W from thread 'finalizer'(10123)
09-10 17:56:48.294 10123 10123 I monodroid-gref: -g- grefc 1139 gwrefc 2 handle 0x3d86/G from thread 'finalizer'(10123)
The GREF log *wasn't* immediately clear: sure, the GREF was turned
into a Weak-GREF, and the Weak-GREF was then collected, but none of
this explained *how* were were using this deleted GREF.
(We were at this state of affairs for months: we "know" we're using
a deleted GREF, but we don't know *how* or *why*. It was a very hard
to hit bug.)
Eventually we had a ["that's funny"][0] event: *sure*, the GREF is
deleted, but:
1. Why is it being deleted by the finalizer?
2. …14ms *after construction*?
A [Garbage Collection][1] refresher may be in order, but in short:
1. All `Java.Lang.Object` subclasses are "bridged" objects.
2. During a GC, all "collectable" bridged objects are gathered.
A collectable object is one in which nothing in the managed
GC references the object.
3. Once the GC is complete, all gathered collectable bridged objects
are passed to a "cross references" callback.
The callback is called *outside* the "scope" of a GC; "the world"
is *not* frozen, other threads may be executing.
4. The "cross references" callback is the
`MonoGCBridgeCallbacks::cross_references` field provided provided
to [`mono_gc_register_bridge_callbacks()`][2].
In a Xamarin.Android app, the "cross references" callback will
"toggle" a JNI Global Reference to a JNI Weak Global Reference,
invoke a Java-side GC, and then try to obtain a
JNI Global Reference from the JNI Weak Global Reference.
If a non-`NULL` pointer is returned, the bridged object is kept
alive. If a `NULL` pointer is returned, the bridged object will
be considered dead, and will be added to the Finalization Queue
(as `Java.Lang.Object` has a finalizer).
Thus, it seemed "funny" that within 14ms an instance was created,
GC'd, and determined to be garbage, *especially* when we *knew* that
this instance was being passed to Java, which we expected to retain
the instance. (Yet wasn't…?)
After much banging of heads, and the yeoman's work of creating a
simplified and consistently reproducible test case, we *think* we
know the cause of the crash.
Consider our normal Managed-to-Java marshaling code, e.g.
partial class MaterialButton {
public unsafe MaterialButton (global::Android.Content.Context context)
: base (IntPtr.Zero, JniHandleOwnership.DoNotTransfer)
{
const string __id = "(Landroid/content/Context;)V";
if (((global::Java.Lang.Object) this).Handle != IntPtr.Zero)
return;
try {
JniArgumentValue* __args = stackalloc JniArgumentValue [1];
/* 1 */ __args [0] = new JniArgumentValue ((context == null) ? IntPtr.Zero : ((global::Java.Lang.Object) context).Handle);
/* 2 */ var __r = _members.InstanceMethods.StartCreateInstance (__id, ((object) this).GetType (), __args);
SetHandle (__r.Handle, JniHandleOwnership.TransferLocalRef);
/* 3 */ _members.InstanceMethods.FinishCreateInstance (__id, this, __args);
} finally {
}
}
}
At (1), `context.Handle` is -- a JNI GREF -- is stored into a
`JniArgumentValue* __args` value, and at (3) `__args` is passed into
JNI, which will presumably "Do Something" with that handle.
However, nothing ties `context.Handle` to `context`, so from
(2) onward, `context` *may* be eligible for garbage collection.
See also Chris Brumme's [Lifetime, GC.KeepAlive, handle recycling][3]
blog article. It's about .NET Framework, but the same fundamental
multithreading concepts apply.
The fix is to *ensure* that `context` is kept alive
*for as long as* `context.Handle` will be used, i.e. across the
JNI `_members.InstanceMethods.FinishCreateInstance()` call:
partial class MaterialButton {
public unsafe MaterialButton (global::Android.Content.Context context)
: base (IntPtr.Zero, JniHandleOwnership.DoNotTransfer)
{
const string __id = "(Landroid/content/Context;)V";
if (((global::Java.Lang.Object) this).Handle != IntPtr.Zero)
return;
try {
JniArgumentValue* __args = stackalloc JniArgumentValue [1];
/* 1 */ __args [0] = new JniArgumentValue ((context == null) ? IntPtr.Zero : ((global::Java.Lang.Object) context).Handle);
/* 2 */ var __r = _members.InstanceMethods.StartCreateInstance (__id, ((object) this).GetType (), __args);
SetHandle (__r.Handle, JniHandleOwnership.TransferLocalRef);
/* 3 */ _members.InstanceMethods.FinishCreateInstance (__id, this, __args);
} finally {
/* 4 */ global::System.GC.KeepAlive (context);
}
}
}
This should prevent e.g. `context` from being prematurely GC'd,
which in turn should prevent the `JNI DETECTED ERROR` message.
Update `tools/generator` to emit `GC.KeepAlive()` statements for
every parameter type which isn't a value type (`enum`, `int`, `string`,
etc.). `string` is considered a value type because we always send a
"deep copy" of the string contents, so it won't matter if the `string`
instance is GC'd immediately.
[0]: https://quoteinvestigator.com/2015/03/02/eureka-funny/
[1]: https://docs.microsoft.com/xamarin/android/internals/garbage-collection
[2]: http://docs.go-mono.com/?link=xhtml%3adeploy%2fmono-api-gc.html
[3]: https://docs.microsoft.com/archive/blogs/cbrumme/lifetime-gc-keepalive-handle-recycling |
Commit message comments:
There's a tiny typo of a double
I got tripped up when I first read through this because when I was reading 2 before seeing 3, I thought 2 was saying that all bridged objects are sent to
Based on the log output from the test apps, it seemed like the conversion to a weak global reference and back to a plain (strong) global reference all happened on the first GC right after the problematic type was instantiated, so it didn't seem like there were two GCs involved. Maybe in the latest versions, a single GC covers all 3 steps: marking the managed types (world stopped), running the Java GC (world running), and then completing the managed collection (world stopped)? Example log lines to illustrate, with an initial gref handle value of
If the grefs can indeed turn over on the first GC after they're created, then there could be a chance that the "good" lifetime for the problematic object would also have been quite close to 14ms in the original few test scenarios. For example, if by chance the nursery filled up just after
In case it might be desirable to match it up with the scenario from the original larger test app that was associated with the partial class MaterialButton {
public unsafe MaterialButton (global::Android.Content.Context context)
: base (IntPtr.Zero, JniHandleOwnership.DoNotTransfer)
{
const string __id = "(Landroid/content/Context;)V";
if (((global::Java.Lang.Object) this).Handle != IntPtr.Zero)
return;
try {
JniArgumentValue* __args = stackalloc JniArgumentValue [1];
/* 1 */ __args [0] = new JniArgumentValue ((context == null) ? IntPtr.Zero : ((global::Java.Lang.Object) context).Handle);
/* 2 */ var __r = _members.InstanceMethods.StartCreateInstance (__id, ((object) this).GetType (), __args);
SetHandle (__r.Handle, JniHandleOwnership.TransferLocalRef);
/* 3 */ _members.InstanceMethods.FinishCreateInstance (__id, this, __args);
} finally {
}
}
} And then
Optionally, the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah ha, I just noticed something that might be good to fix up before merging this PR. I got the example marshaling code for the MaterialButton
constructor by building the AndroidX bindings, and for that build I was actually using the installer package from the xamarin-android PR that includes the change from this Java.Interop PR: dotnet/android#5136.
For other generated methods in the output, GC.KeepAlive()
is included as expected:
public virtual unsafe void AddOnCheckedChangeListener (global::Google.Android.Material.Button.MaterialButton.IOnCheckedChangeListener listener)
{
const string __id = "addOnCheckedChangeListener.(Lcom/google/android/material/button/MaterialButton$OnCheckedChangeListener;)V";
try {
JniArgumentValue* __args = stackalloc JniArgumentValue [1];
__args [0] = new JniArgumentValue ((listener == null) ? IntPtr.Zero : ((global::Java.Lang.Object) listener).Handle);
_members.InstanceMethods.InvokeVirtualVoidMethod (__id, this, __args);
} finally {
global::System.GC.KeepAlive (listener);
}
}
But it's not yet being added for the MaterialButton
constructors:
// Metadata.xml XPath constructor reference: path="/api/package[@name='com.google.android.material.button']/class[@name='MaterialButton']/constructor[@name='MaterialButton' and count(parameter)=3 and parameter[1][@type='android.content.Context'] and parameter[2][@type='android.util.AttributeSet'] and parameter[3][@type='int']]"
[Register (".ctor", "(Landroid/content/Context;Landroid/util/AttributeSet;I)V", "")]
public unsafe MaterialButton (global::Android.Content.Context context, global::Android.Util.IAttributeSet attrs, int defStyleAttr)
: base (IntPtr.Zero, JniHandleOwnership.DoNotTransfer)
{
const string __id = "(Landroid/content/Context;Landroid/util/AttributeSet;I)V";
if (((global::Java.Lang.Object) this).Handle != IntPtr.Zero)
return;
try {
JniArgumentValue* __args = stackalloc JniArgumentValue [3];
__args [0] = new JniArgumentValue ((context == null) ? IntPtr.Zero : ((global::Java.Lang.Object) context).Handle);
__args [1] = new JniArgumentValue ((attrs == null) ? IntPtr.Zero : ((global::Java.Lang.Object) attrs).Handle);
__args [2] = new JniArgumentValue (defStyleAttr);
var __r = _members.InstanceMethods.StartCreateInstance (__id, ((object) this).GetType (), __args);
SetHandle (__r.Handle, JniHandleOwnership.TransferLocalRef);
_members.InstanceMethods.FinishCreateInstance (__id, this, __args);
} finally {
}
}
This looks good, there are some excessive KeepAlive's. I agree with @jonpryor that we don't need the keep alive for non-ref types, or arrays of non-ref. Overdoing it is not the end of the world, since the overhead is tiny, but I would add a comment somewhere if we are being conservative about it. |
@brendanzagaeski Good catch! It looks like constructors use a different code path. I can add that on Monday. |
Replying before I update the commit message…
Insufficient information provided; it's impossible to tell. All we know is that the GREF My (incorrect?) understanding/asumption is that the GC bridge is built upon/reuses some of the infrastructure for finalizable objects, and that -- like finalizable objects -- bridged objects are first collected "normally", and when they are considered to be garbage/unreferenced, they are placed on a "finalization queue", and it is during the a subsequent GC which processes the finalization queue that the GC bridge callback is invoked. @BrzVlad: Is the above correct? Are bridgeable objects "like" finalizable objects? Furthermore, considering that e.g.
The provided example does have Mono's GC log messages enabled, which is handy, and further suggests that I'm wrong (yay?), and that the bridge callback is invoked as part of the first GC, and not part of a "second" "process the finalization queue" GC. That said, Mono's GC messages don't appear to print when the GC started, only after it's completed, which is a tad irksome. |
Since I have worked only sporadically with bridge code, take what I say with a grain of salt. I haven't tested this in practice, but the flow should be as follows : There are two objects OC#, OJava. OC# always has a finalizer. Let's say both objects are no longer reachable. Mono GC1 occurs, OC# is seen as being dead, but, because it is bridge object, it is resurrected and part of the bridge objects whose liveness needs to be checked on java side. World is resumed, Gref for OJava is switched to weak ref. Collection is triggered on Java side. OJava is dead after collection, Mono GC is informed that OC# is technically no longer a bridge object (meaning java no longer keeps this object alive). Mono GC2 occurs, OC# is seen as being dead, but, because it has a finalizer, it is resurrected and its finalizer is enqueued. Mono GC3 occurs. At this point if time, OC# is probably dead for good, its memory is reclaimed. In conclusion, java object can die immediately after C# object is initially not reachable, while the C# object will survive a few more GC iterations, since it gets resurrected. |
Added a commit to:
Note: For constructors that take a |
NOTE: This is a PR against
d16-8
. This will have to rewritten formaster
due to other changes in this code.Context: #719
Under certain circumstances, it is possible for a method parameter to get GC collected before native code is finished using it. Add an appropriate
GC.Collect (param)
to ensure the object stays alive until we are done with it. As an optimization, this is only needed for reference types.