Skip to content

Bump to dotnet/java-interop/main@d3d3a1bf #9921

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 15 commits into
base: main
Choose a base branch
from

Conversation

jonpryor
Copy link
Member

@jonpryor jonpryor commented Mar 14, 2025

Fixes: #9862

Changes: dotnet/java-interop@8221b7d...d3d3a1b

Context: 5c23bcd

Issue #9862 is an observed "race condition" around
Android.App.Application subclass creation. Two instances of
AndroidApp were created, one from the "normal" app startup:

at MailClient.Mobile.Droid.AndroidApp..ctor(IntPtr handle, JniHandleOwnership ownership)
…
at Java.Lang.Object.GetObject(IntPtr , JniHandleOwnership , Type )
at Java.Lang.Object._GetObject[Application](IntPtr , JniHandleOwnership )
at Java.Lang.Object.GetObject[Application](IntPtr handle, JniHandleOwnership transfer)
at Java.Lang.Object.GetObject[Application](IntPtr jnienv, IntPtr handle, JniHandleOwnership transfer)
at Android.App.Application.n_OnCreate(IntPtr jnienv, IntPtr native__this)
at crc647fae2f69c19dcd0d.AndroidApp.n_onCreate(Native Method)
at crc647fae2f69c19dcd0d.AndroidApp.onCreate(AndroidApp.java:25)
at android.app.Instrumentation.callApplicationOnCreate(Instrumentation.java:1316)

and another from an androidx.work.WorkerFactory, via parameter:

at MailClient.Mobile.Droid.AndroidApp..ctor(IntPtr handle, JniHandleOwnership ownership)
…
at Java.Lang.Object.GetObject(IntPtr , JniHandleOwnership , Type )
at Android.Runtime.JNIEnv.<>c.<CreateNativeArrayElementToManaged>b__70_9(Type type, IntPtr source, Int32 index)
at Android.Runtime.JNIEnv.GetObjectArray(IntPtr , Type[] )
at Java.Interop.TypeManager.n_Activate(IntPtr jnienv, IntPtr jclass, IntPtr typename_ptr, IntPtr signature_ptr, IntPtr jobject, IntPtr parameters_ptr)
at mono.android.TypeManager.n_activate(Native Method)
at mono.android.TypeManager.Activate(TypeManager.java:7)
at crc647fae2f69c19dcd0d.SyncWorker.<init>(SyncWorker.java:23)
at java.lang.reflect.Constructor.newInstance0(Native Method)
at java.lang.reflect.Constructor.newInstance(Constructor.java:343)
at androidx.work.WorkerFactory.createWorkerWithDefaultFallback(WorkerFactory.java:95)

However, what was odd about this "race condition" was that the
second instance created would reliably win!

Further investigation suggested that this was less of a
"race condition" and more a bug in AndroidValueManager, wherein
when "Replaceable" instances were created, an existing instance would
always be replaced, even if the new instance was also Replaceable!
This feels bananas; yes, Replaceable should be replaceable, but the
expectation was that it would be replaced by non-Replaceable
instances, not just any instance that came along later.

Aside: a "Replaceable" instance is an instance created via
JniRuntime.JniValueManager.CreatePeer() /
TypeManager.CreateInstance(). "Replaceable" instances can be
replaced in the JniRuntime.JniValueManager by instances
created via the constructor, e.g. new Integer(42).

JniPeerReference h;
using (var x = new Java.Lang.Integer(1))
  h = x.PeerReference.NewLocalRef();
// No mapping for `h` in JniValueManager, as x was disposed.

var a = JniEnvironment.Runtime.ValueManager.CreatePeer (ref h, JniObjectReferenceOptions.Copy, null);
  // a is "Replaceable"
var b = new Java.Lang.Integer(x.Handle, JniHandleOwnership.DoNotTransfer);
  // b is *not* Replaceable; created via constructor

var p = JniEnvironment.Runtime.ValueManager.PeekPeer(h);
// p should be the non-Replaceable value b, *not* a

dotnet/java-interop@d3d3a1bf updates
JniRuntimeJniValueManagerContract to add a new
CreatePeer_ReplaceableDoesNotReplace() test to codify the desired
semantic that Replaceable instances do not replace Replaceable
instances.

Update AndroidValueManager so that the new
CreatePeer_ReplaceableDoesNotReplace() test passes. Supporting
this new semantic required "extending" Replaceable lifetime; it turns
out that the existing code within TypeManager.CreateInstance():

var result = CreateProxy (type, handle, transfer);
if (Runtime.IsGCUserPeer (result.PeerReference.Handle)) {
  result.SetJniManagedPeerState (JniManagedPeerStates.Replaceable | JniManagedPeerStates.Activatable);
}
return result;

sets .Replaceable too late, as during execution of the activation
constructor the instance thinks it isn't replaceable, and thus
creation of a new instance via the activation constructor will
replace an already existing replaceable instance; it's not until
after the constructor finished executing that we set .Replaceable.

Address this by updating CreatePeer() to split up instance creation:

  1. Create an un-constructed instance using
    RuntimeHelpers.GetUninitializedObject(Type).

  2. Set .Replaceable on the instance from (1)

  3. Then invoke the activation constructor.

This allows JniRuntime.JniValueManager.AddPeer() to reliably
determine that an instance is .Replaceable, and thus not replace
a .Replaceable instance with another .Replaceable instance.

dotnet/java-interop@d3d3a1bf does similar things with
JniRuntime.JniValueManager. Update ManagedValueManager to
override the new TryConstructPeer() method, as TryCreatePeer()
has been removed.

Finally, an "interesting" crash happened when adding the
RuntimeHelpers.GetUninitializedObject() + invoke constructor fix:

E droid.NET_Test: JNI ERROR (app bug): accessed deleted Global 0x3056
F droid.NET_Test: java_vm_ext.cc:570] JNI DETECTED ERROR IN APPLICATION: use of deleted global reference 0x3056

or, when it didn't up and crash, we could see "bizarre" unit test
failures from JnienvTest.MoarThreadingTests():

I NUnit   : 1) Java.InteropTests.JnienvTest.MoarThreadingTests (Mono.Android.NET-Tests)
I NUnit   :   No exception should be thrown [t2]! Got: System.ObjectDisposedException: Cannot access disposed object with JniIdentityHashCode=158880748.
I NUnit   : Object name: 'Java.Lang.Integer'.
I NUnit   :    at Java.Interop.JniPeerMembers.AssertSelf(IJavaPeerable self)
I NUnit   :    at Java.Interop.JniPeerMembers.JniInstanceMethods.InvokeAbstractInt32Method(String encodedMember, IJavaPeerable self, JniArgumentValue* parameters)
I NUnit   :    at Java.Lang.Integer.IntValue()
I NUnit   :    at Java.Lang.Integer.System.IConvertible.ToInt32(IFormatProvider provider)
I NUnit   :    at System.Convert.ChangeType(Object value, Type conversionType, IFormatProvider provider)
I NUnit   :    at Android.Runtime.JNIEnv.GetObjectArray(IntPtr array_ptr, Type[] element_types)
I NUnit   :    at Java.InteropTests.JnienvTest.<>c__DisplayClass26_0.<MoarThreadingTests>b__1()
I NUnit   :   Expected: null
I NUnit   :   But was:  <System.ObjectDisposedException: Cannot access disposed object with JniIdentityHashCode=158880748.

or:

I NUnit   : 1) Java.InteropTests.JnienvTest.MoarThreadingTests (Mono.Android.NET-Tests)
I NUnit   :   No exception should be thrown [t2]! Got: System.ArgumentException: Handle must be valid. (Parameter 'instance')
I NUnit   :    at Java.Interop.JniEnvironment.InstanceMethods.CallIntMethod(JniObjectReference instance, JniMethodInfo method, JniArgumentValue* args)
I NUnit   :    at Java.Interop.JniPeerMembers.JniInstanceMethods.InvokeAbstractInt32Method(String encodedMember, IJavaPeerable self, JniArgumentValue* parameters)
I NUnit   :    at Java.Lang.Integer.IntValue()
I NUnit   :    at Java.Lang.Integer.System.IConvertible.ToInt32(IFormatProvider provider)
I NUnit   :    at System.Convert.ChangeType(Object value, Type conversionType, IFormatProvider provider)
I NUnit   :    at Android.Runtime.JNIEnv.GetObjectArray(IntPtr array_ptr, Type[] element_types)
I NUnit   :    at Java.InteropTests.JnienvTest.<>c__DisplayClass26_0.<MoarThreadingTests>b__1()

The thread using the invalid global reference is t1 in
MoarThreadingTests(), which uses
JNIEnv.CopyObjectArray<int>(IntPtr, int[]), which in turn uses
JavaConvert.FromJniHandle<T>().

Why would that be failing?

The answer: consider JavaConvert.JniHandleConverters:

partial class JavaConvert {
  static Dictionary<Type, Func<IntPtr, JniHandleOwnership, object>> JniHandleConverters = new() {
    { typeof (int), (handle, transfer) => {
      using (var value = new Java.Lang.Integer (handle, transfer | JniHandleOwnership.DoNotRegister))
        return value.IntValue ();
    } },
  }
}

Note that we're invoking the Integer(IntPtr, JniHandleOwnership)
constructor. As per the above Aside about Replaceable instances,
this means that value should replace any existing mapping
within the JniRuntime.JniValueManager. However, here was also
provide JniHandleOwnership.DoNotRegister, which prevents the
instance from being placed into the instance mapping, i.e.
Object.PeekObject() should not find the instance.

Next, turn to Object.SetHandle() from 5c23bcd:

partial class /* Java.Lang. */ Object {
  protected void SetHandle (IntPtr value, JniHandleOwnership transfer)
  {
    var reference = new JniObjectReference (value);
    JNIEnvInit.ValueManager?.ConstructPeer (
        this,
        ref reference,
        value == IntPtr.Zero ? JniObjectReferenceOptions.None : JniObjectReferenceOptions.Copy);
    JNIEnv.DeleteRef (value, transfer);
  }
}

What's missing? A check for JniHandleOwnership.DoNotRegister!

Which means that the new Integer(…) will be in the instance map,
and replace any existing mappings, only to be subsequently disposed.

This permits the following "total execution order" to happen between
threads t1 and t2 in MoarThreadingTests():

  • Main thread:

    // JNI equivalent to Java `new Object[]{ new Integer (1) }`
    IntPtr lrefJliArray = JNIEnv.NewObjectArray<int>(new[]{1})
    
  • t1 Thread:

    IntPtr lrefInteger  = JNIEnv.GetObjectArrayElement(lrefJliArray, 0);
    var value = new Integer(lrefInteger, .TransferLocalRef);
      // via JavaConvert.JniHandleConverters
    
  • t2 Thread:

    IntPtr lrefInteger  = JNIEnv.GetObjectArrayElement(lrefJliArray, 0);
    var x = Java.Lang.Object.GetObject (lrefInteger, .TransferLocalRef);
      // via JNIEnv.NativeArrayElementToManaged[typeof(IJavaObject)]
    

    Note that x is the same instance as value.

  • t1 Thread:

    int v = value.IntValue();
    value.Dispose();
    
  • t2 Thread: does anything with x. As x and value are
    the same instance, x has been disposed by thread t1.

    Depending on where thread scheduling, this would explain both the
    ObjectDisposedException as well as the JNI ERROR (app bug).

The fix is to correct the oversight from 5c23bcd, and forward the
JniHandleOwnership.DoNotRegister flag to JniObjectReferenceOptions.

On the way to investigating this…

Update tests/Mono.Android-Tests to use @(AndroidEnvironment) in
Debug builds to set debug.mono.debug=1. This allows for filenames
and line numbers in stack traces.

Update src/native so that the GREF log can be created; previously,
adb logcat would contain:

W monodroid: Failed to create directory '/data/user/0/Mono.Android.NET_Tests/files/.__override__/arm64-v8a'. No such file or directory
…
E monodroid: fopen failed for file /data/user/0/Mono.Android.NET_Tests/files/.__override__/arm64-v8a/grefs.txt: No such file or directory

The apparent cause for this is that the ABI is now part of the path,
…/.__override__/arm64-v8a/grefs.txt and not …/.__override__/grefs.txt.
Additionally, create_public_directory() was a straight mkdir() call,
which does not create intermediate directories. Update
create_public_directory() to use create_directory() instead,
which does create intermediate directories. This allows
grefs.txt to be created.

Next, the contents of grefs.txt occasionally looked "wrong".
Turns out, WriteGlobalReferenceLine() and WriteLocalReferenceLine()
didn't consistently append a newline to the message, which impacts e.g.
the Created … message:

Created PeerReference=0x3c06/G IdentityHashCode=0x2e29bd Instance=0xbe0aab36 Instance.Type=Android.OS.Bundle, Java.Type=android/os/Bundle+g+ grefc 19 gwrefc 0 obj-handle 0x706f711035/L -> new-handle 0x3d06/G from thread '<null>'(1)

Update WriteGlobalReferenceLine() and WriteLocalReferenceLine()
to always append a newline to the message.

Context: dotnet/java-interop#1323
Context: #9862 (comment)

Does It Build™?

(The expectation is that it *does* build -- only unit tests are changed
in dotnet/java-interop#1323 -- but that the new
`JniRuntimeJniValueManagerContract.cs.CreatePeer_ReplaceableDoesNotReplace()`
test will fail.)`
@jonpryor jonpryor force-pushed the dev/jonp/jonp-try-ji-1323 branch from e67229e to d6e0e7d Compare April 4, 2025 19:13
@jonpryor
Copy link
Member Author

jonpryor commented Apr 4, 2025

So I expected JniRuntimeJniValueManagerContract.CreatePeer_ReplaceableDoesNotReplace() to fail, but it failed for the wrong reason:

Expected peer1.JniManagedPeerState to have .Replaceable, but was None.

Further investigation was because we weren't hitting this line:

result.SetJniManagedPeerState (JniManagedPeerStates.Replaceable | JniManagedPeerStates.Activatable);

because Runtime.IsGCUserPeer() returned false, because AnotherJavaInterfaceImpl.java does not implement GCUserPeerable.

Okay, fair.

What I no longer remember -- and thus need to dig around for -- is why we're "protect" setting .Replacable with IsGCUserPeer(). Offhand, this feels like something we could always do?

@jonpryor jonpryor marked this pull request as ready for review April 15, 2025 16:04
@jonpryor jonpryor changed the title Try dotnet/java-interop#1323 Bump to dotnet/java-interop/main@d3d3a1bf Apr 15, 2025
Remove the `Console.WriteLine()`s from `AndroidRuntime.cs` and `TypeManager.cs`.

Add an `@(AndroidEnvironment)` to `Mono.Android.NET-Tests.csproj` to
set `debug.mono.log=default,gref+`, so that on the next run we can get
*some* gref logging to help diagnose an apparent crash:

    JNI ERROR (app bug): accessed deleted Global 0x3056
We're seeing a recurrent crash on CI, which is still inexplicable:

	E droid.NET_Test: JNI ERROR (app bug): accessed deleted Global 0x3056
	F droid.NET_Test: java_vm_ext.cc:570] JNI DETECTED ERROR IN APPLICATION: use of deleted global reference 0x3056

The obvious thing to do to track this down is to enable GREF logging,
which makes the error disappear.  (Figures.)

However, attempting to enable GREF logging found *other* issues:

The GREF log couldn't be created (!):

	 W monodroid: Failed to create directory '/data/user/0/Mono.Android.NET_Tests/files/.__override__/arm64-v8a'. No such file or directory
	 …
	 E monodroid: fopen failed for file /data/user/0/Mono.Android.NET_Tests/files/.__override__/arm64-v8a/grefs.txt: No such file or directory

The apparent cause for this is that the ABI is now part of the path,
`…/.__override__/arm64-v8a/grefs.txt` and not `…/.__override__/grefs.txt`.
Additionally, `create_public_directory()` was a straight `mkdir()` call,
which *does not* create intermediate directories.
Update `create_public_directory()` to use `create_directory()` instead,
which *does* create intermediate directories.  This allows `grefs.txt`
to be created.

*Next*, I started getting a *bizarre* failure within `MoarThreadingTests()`:

	I NUnit   : 1) Java.InteropTests.JnienvTest.MoarThreadingTests (Mono.Android.NET-Tests)
	I NUnit   :   No exception should be thrown [t2]! Got: System.ObjectDisposedException: Cannot access disposed object with JniIdentityHashCode=158880748.
	I NUnit   : Object name: 'Java.Lang.Integer'.
	I NUnit   :    at Java.Interop.JniPeerMembers.AssertSelf(IJavaPeerable self)
	I NUnit   :    at Java.Interop.JniPeerMembers.JniInstanceMethods.InvokeAbstractInt32Method(String encodedMember, IJavaPeerable self, JniArgumentValue* parameters)
	I NUnit   :    at Java.Lang.Integer.IntValue()
	I NUnit   :    at Java.Lang.Integer.System.IConvertible.ToInt32(IFormatProvider provider)
	I NUnit   :    at System.Convert.ChangeType(Object value, Type conversionType, IFormatProvider provider)
	I NUnit   :    at Android.Runtime.JNIEnv.GetObjectArray(IntPtr array_ptr, Type[] element_types)
	I NUnit   :    at Java.InteropTests.JnienvTest.<>c__DisplayClass26_0.<MoarThreadingTests>b__1()
	I NUnit   :   Expected: null
	I NUnit   :   But was:  <System.ObjectDisposedException: Cannot access disposed object with JniIdentityHashCode=158880748.

Relevant code from `JNIEnv.GetObjectArray()`:

	for (int i = 0; i < cnt; i++) {
		Type? targetType	= (element_types != null && i < element_types.Length) ? element_types [i] : null;
		object? value    = converter ((targetType == null || targetType.IsValueType) ? null : targetType,
				array_ptr, i);

		ret [i] = value;
		ret [i] = targetType == null || targetType.IsInstanceOfType (value)
			? value
			: Convert.ChangeType (value, targetType, CultureInfo.InvariantCulture);
	}

It's *conceivable* that `value` is finalized *during* the
`Convert.ChangeType()` call (…really?! I'm grasping at straws here!),
which could explain the `ObjectDisposedException` being thrown.
(This doesn't *actually* make sense, but it's the best I can come up with.)
Add a `GC.KeepAlive()` call to keep the object alive until after
`Convert.ChangeType()` is called:

	ret [i] = targetType == null || targetType.IsInstanceOfType (value)
		? value
		: Convert.ChangeType (value, targetType, CultureInfo.InvariantCulture);
	GC.KeepAlive (value);

This "fixed" the `ObjectDisposedException` issue, only to raise a new,
different, and *not* fixed issue:

	I NUnit   : 1) Java.InteropTests.JnienvTest.MoarThreadingTests (Mono.Android.NET-Tests)
	I NUnit   :   No exception should be thrown [t2]! Got: System.ArgumentException: Handle must be valid. (Parameter 'instance')
	I NUnit   :    at Java.Interop.JniEnvironment.InstanceMethods.CallIntMethod(JniObjectReference instance, JniMethodInfo method, JniArgumentValue* args) in /Volumes/Xamarin-Work/src/dotnet/android/external/Java.Interop/src/Java.Interop/obj/Debug/net9.0/JniEnvironment.g.cs:line 20191
	I NUnit   :    at Java.Interop.JniPeerMembers.JniInstanceMethods.InvokeAbstractInt32Method(String encodedMember, IJavaPeerable self, JniArgumentValue* parameters) in /Volumes/Xamarin-Work/src/dotnet/android/external/Java.Interop/src/Java.Interop/Java.Interop/JniPeerMembers.JniInstanceMethods_Invoke.cs:line 492
	I NUnit   :    at Java.Lang.Integer.IntValue() in /Volumes/Xamarin-Work/src/dotnet/android/src/Mono.Android/obj/Debug/net10.0/android-36/mcw/Java.Lang.Integer.cs:line 354
	I NUnit   :    at Java.Lang.Integer.System.IConvertible.ToInt32(IFormatProvider provider) in /Volumes/Xamarin-Work/src/dotnet/android/src/Mono.Android/Java.Lang/Integer.cs:line 58
	I NUnit   :    at System.Convert.ChangeType(Object value, Type conversionType, IFormatProvider provider)
	I NUnit   :    at Android.Runtime.JNIEnv.GetObjectArray(IntPtr array_ptr, Type[] element_types) in /Volumes/Xamarin-Work/src/dotnet/android/src/Mono.Android/Android.Runtime/JNIEnv.cs:line 1070
	I NUnit   :    at Java.InteropTests.JnienvTest.<>c__DisplayClass26_0.<MoarThreadingTests>b__1() in /Volumes/Xamarin-Work/src/dotnet/android/tests/Mono.Android-Tests/Mono.Android-Tests/Java.Interop/JnienvTest.cs:line 390

which makes *no* sense, and I'm starting to have flashbacks to
issue #9039, which was a GC bug fixed in dotnet/runtime#112825.
(Don't we have that fix?!  Perhaps not.)

Next, the contents of `grefs.txt` occasionally looked "wrong".
Turns out, `WriteGlobalReferenceLine()` doesn't consistently append
a newline to the message, which impacts e.g. the `Created …` message:

	Created PeerReference=0x3c06/G IdentityHashCode=0x2e29bd Instance=0xbe0aab36 Instance.Type=Android.OS.Bundle, Java.Type=android/os/Bundle+g+ grefc 19 gwrefc 0 obj-handle 0x706f711035/L -> new-handle 0x3d06/G from thread '<null>'(1)

Update `WriteGlobalReferenceLine()` to always append a newline
to the message.

Finally, update `env.txt` to always set debug.mono.debug=1.
Filenames and line numbers in stack traces are handy!
@jonpryor
Copy link
Member Author

The story so far…?

dotnet/java-interop@d3d3a1bf added several unit tests, one of which is CreatePeer_ReplaceableDoesNotReplace(), which was the unit test for #9862.

CI for this PR is observing crashes in tests/Mono.Android-Tests, which don't entirely make sense.

For testing, PR #10047 is a "similar" bump that also bumps to dotnet/java-interop@d3d3a1bf and fixes ManagedValueManager so that it builds, but does not contain fixes for CreatePeer_ReplaceableDoesNotReplace(). It shows two meaningful results:

  1. CreatePeer_ReplaceableDoesNotReplace() fails, which is expected, and
  2. It does not crash.

This suggests that the crash isn't caused by dotnet/java-interop@d3d3a1bf (yay), and may be "random" based on history.

Next step: copy over the remaining changes in this PR into #10047. Maybe it'll pass the tests?

jonpryor added a commit that referenced this pull request Apr 17, 2025
Context: #9921 (comment)

Do tests pass without crashing?

If the tests *do* crash, next step is to *revert* changes to
`AndroidRuntime.cs`.  This should re-introduce failures within
`CreatePeer_ReplaceableDoesNotReplace()`, but would narrow down
what may be in play.
Re-introduces `CreatePeer_ReplaceableDoesNotReplace()` failure.

No crash.
`CreatePeer_ReplaceableDoesNotReplace()` still fails.

No crash.
NOW the weirdness begins.

Remember 8aaefed?

> *Next*, I started getting a *bizarre* failure within `MoarThreadingTests()`:
>
>         I NUnit   : 1) Java.InteropTests.JnienvTest.MoarThreadingTests (Mono.Android.NET-Tests)
>         I NUnit   :   No exception should be thrown [t2]! Got: System.ObjectDisposedException: Cannot access disposed object with JniIdentityHashCode=158880748.

I'm seeing it again!

	E NUnit   :      [FAIL]
	E NUnit   :  :   No exception should be thrown [t2]! Got: System.ObjectDisposedException: Cannot access disposed object with JniIdentityHashCode=158880748.
	E NUnit   : Object name: 'Java.Lang.Integer'.
	E NUnit   :    at Java.Interop.JniPeerMembers.AssertSelf(IJavaPeerable self) in …/android/external/Java.Interop/src/Java.Interop/Java.Interop/JniPeerMembers.cs:line 153
	E NUnit   :    at Java.Interop.JniPeerMembers.JniInstanceMethods.InvokeAbstractInt32Method(String encodedMember, IJavaPeerable self, JniArgumentValue* parameters) in …/android/external/Java.Interop/src/Java.Interop/Java.Interop/JniPeerMembers.JniInstanceMethods_Invoke.cs:line 484
	E NUnit   :    at Java.Lang.Integer.IntValue() in …/android/src/Mono.Android/obj/Debug/net10.0/android-36/mcw/Java.Lang.Integer.cs:line 354
	E NUnit   :    at Java.Lang.Integer.System.IConvertible.ToInt32(IFormatProvider provider) in …/android/src/Mono.Android/Java.Lang/Integer.cs:line 58
	E NUnit   :    at System.Convert.ChangeType(Object value, Type conversionType, IFormatProvider provider)
	E NUnit   :    at Android.Runtime.JNIEnv.GetObjectArray(IntPtr array_ptr, Type[] element_types) in …/android/src/Mono.Android/Android.Runtime/JNIEnv.cs:line 1070
	E NUnit   :    at Java.InteropTests.JnienvTest.<>c__DisplayClass26_0.<MoarThreadingTests>b__1() in …/android/tests/Mono.Android-Tests/Mono.Android-Tests/Java.Interop/JnienvTest.cs:line 389
	E NUnit   :   Expected: null
	E NUnit   :   But was:  <System.ObjectDisposedException: Cannot access disposed object with JniIdentityHashCode=158880748.
	E NUnit   : Object name: 'Java.Lang.Integer'.

To reiterate: creating an *uninitialized* `Java.Lang.Integer` instance
***and not using it*** results in the `ObjectDisposedException`.

***HOW***?!
…and now we crash in `ConversionsAndThreadsAndInstanceMappingsOhMy()`.

We crash bigly.

	04-25 15:27:37.527 19954 20022 F droid.NET_Tests: java_vm_ext.cc:616] JNI DETECTED ERROR IN APPLICATION: JNI ERROR (app bug): jobject is an invalid global reference: 0x430a (deleted reference at index 536)
	04-25 15:27:37.527 19954 20022 F droid.NET_Tests: java_vm_ext.cc:616]     in call to NewLocalRef
	…
	04-25 15:27:37.625 19954 20022 F droid.NET_Tests: runtime.cc:708] "Thread-3" prio=10 tid=18 Runnable
	04-25 15:27:37.625 19954 20022 F droid.NET_Tests: runtime.cc:708]   | group="" sCount=0 ucsCount=0 flags=0 obj=0x1430c000 self=0xb400006eb4f45fe0
	04-25 15:27:37.625 19954 20022 F droid.NET_Tests: runtime.cc:708]   | sysTid=20022 nice=-8 cgrp=default sched=0/0 handle=0x6d9a5986c0
	04-25 15:27:37.625 19954 20022 F droid.NET_Tests: runtime.cc:708]   | state=R schedstat=( 6609743 480021 19 ) utm=0 stm=0 core=6 HZ=100
	04-25 15:27:37.625 19954 20022 F droid.NET_Tests: runtime.cc:708]   | stack=0x6d9a399000-0x6d9a39b000 stackSize=2045KB
	04-25 15:27:37.625 19954 20022 F droid.NET_Tests: runtime.cc:708]   | held mutexes= "abort lock" "mutator lock"(shared held)
	04-25 15:27:37.625 19954 20022 F droid.NET_Tests: runtime.cc:708]   native: #00 pc 004fc1e0  /apex/com.android.art/lib64/libart.so (art::DumpNativeStack+108) (BuildId: 4ccb65ae9ac5ad5da3af5a342d5b0b92)
	04-25 15:27:37.625 19954 20022 F droid.NET_Tests: runtime.cc:708]   native: #1 pc 004fe898  /apex/com.android.art/lib64/libart.so (art::Thread::DumpStack const+376) (BuildId: 4ccb65ae9ac5ad5da3af5a342d5b0b92)
	04-25 15:27:37.625 19954 20022 F droid.NET_Tests: runtime.cc:708]   native: #2 pc 0050068c  /apex/com.android.art/lib64/libart.so (art::DumpCheckpoint::Run+216) (BuildId: 4ccb65ae9ac5ad5da3af5a342d5b0b92)
	04-25 15:27:37.625 19954 20022 F droid.NET_Tests: runtime.cc:708]   native: #3 pc 00553b2c  /apex/com.android.art/lib64/libart.so (art::ThreadList::RunCheckpoint+684) (BuildId: 4ccb65ae9ac5ad5da3af5a342d5b0b92)
	04-25 15:27:37.625 19954 20022 F droid.NET_Tests: runtime.cc:708]   native: #4 pc 004fffc0  /apex/com.android.art/lib64/libart.so (art::ThreadList::Dump+292) (BuildId: 4ccb65ae9ac5ad5da3af5a342d5b0b92)
	04-25 15:27:37.625 19954 20022 F droid.NET_Tests: runtime.cc:708]   native: #5 pc 009377bc  /apex/com.android.art/lib64/libart.so (art::AbortState::Dump const+204) (BuildId: 4ccb65ae9ac5ad5da3af5a342d5b0b92)
	04-25 15:27:37.625 19954 20022 F droid.NET_Tests: runtime.cc:708]   native: #6 pc 00933c2c  /apex/com.android.art/lib64/libart.so (art::Runtime::Abort+712) (BuildId: 4ccb65ae9ac5ad5da3af5a342d5b0b92)
	04-25 15:27:37.625 19954 20022 F droid.NET_Tests: runtime.cc:708]   native: #7 pc 000160fc  /apex/com.android.art/lib64/libbase.so (android::base::SetAborter::$_0::__invoke+80) (BuildId: 1470f61c05962eb04fafe76bd58bf664)
	04-25 15:27:37.625 19954 20022 F droid.NET_Tests: runtime.cc:708]   native: #8 pc 000156d0  /apex/com.android.art/lib64/libbase.so (android::base::LogMessage::~LogMessage+516) (BuildId: 1470f61c05962eb04fafe76bd58bf664)
	04-25 15:27:37.625 19954 20022 F droid.NET_Tests: runtime.cc:708]   native: #9 pc 0043627c  /apex/com.android.art/lib64/libart.so (art::JavaVMExt::JniAbort+1696) (BuildId: 4ccb65ae9ac5ad5da3af5a342d5b0b92)
	04-25 15:27:37.625 19954 20022 F droid.NET_Tests: runtime.cc:708]   native: #10 pc 0088eaf4  /apex/com.android.art/lib64/libart.so (art::JavaVMExt::JniAbortV+108) (BuildId: 4ccb65ae9ac5ad5da3af5a342d5b0b92)
	04-25 15:27:37.625 19954 20022 F droid.NET_Tests: runtime.cc:708]   native: #11 pc 005a1a88  /apex/com.android.art/lib64/libart.so (art::::ScopedCheck::AbortF +140) (BuildId: 4ccb65ae9ac5ad5da3af5a342d5b0b92)
	04-25 15:27:37.625 19954 20022 F droid.NET_Tests: runtime.cc:708]   native: #12 pc 0059b970  /apex/com.android.art/lib64/libart.so (art::::ScopedCheck::Check +1412) (BuildId: 4ccb65ae9ac5ad5da3af5a342d5b0b92)
	04-25 15:27:37.625 19954 20022 F droid.NET_Tests: runtime.cc:708]   native: #13 pc 00597784  /apex/com.android.art/lib64/libart.so (art::::CheckJNI::NewRef +196) (BuildId: 4ccb65ae9ac5ad5da3af5a342d5b0b92)
	04-25 15:27:37.625 19954 20022 F droid.NET_Tests: runtime.cc:708]   native: #14 pc 001aa580  /data/app/~~rrsXQYRBNR6mJq3O2vECbA==/Mono.Android.NET_Tests-5qYa1AD9g6qP-Hr2BrRRnQ==/lib/arm64/libmonosgen-2.0.so (???) (BuildId: dd7058e39c565cec8c5368d64fb956bacc2f52ad)
	04-25 15:27:37.625 19954 20022 F droid.NET_Tests: runtime.cc:708]   native: #15 pc 001a8d38  /data/app/~~rrsXQYRBNR6mJq3O2vECbA==/Mono.Android.NET_Tests-5qYa1AD9g6qP-Hr2BrRRnQ==/lib/arm64/libmonosgen-2.0.so (???) (BuildId: dd7058e39c565cec8c5368d64fb956bacc2f52ad)
	04-25 15:27:37.625 19954 20022 F droid.NET_Tests: runtime.cc:708]   native: #16 pc 0019e068  /data/app/~~rrsXQYRBNR6mJq3O2vECbA==/Mono.Android.NET_Tests-5qYa1AD9g6qP-Hr2BrRRnQ==/lib/arm64/libmonosgen-2.0.so (???) (BuildId: dd7058e39c565cec8c5368d64fb956bacc2f52ad)
	04-25 15:27:37.625 19954 20022 F droid.NET_Tests: runtime.cc:708]   native: #17 pc 0019bcec  /data/app/~~rrsXQYRBNR6mJq3O2vECbA==/Mono.Android.NET_Tests-5qYa1AD9g6qP-Hr2BrRRnQ==/lib/arm64/libmonosgen-2.0.so (???) (BuildId: dd7058e39c565cec8c5368d64fb956bacc2f52ad)
	04-25 15:27:37.625 19954 20022 F droid.NET_Tests: runtime.cc:708]   native: #18 pc 00258434  /data/app/~~rrsXQYRBNR6mJq3O2vECbA==/Mono.Android.NET_Tests-5qYa1AD9g6qP-Hr2BrRRnQ==/lib/arm64/libmonosgen-2.0.so (mono_runtime_invoke_checked+140) (BuildId: dd7058e39c565cec8c5368d64fb956bacc2f52ad)
	04-25 15:27:37.625 19954 20022 F droid.NET_Tests: runtime.cc:708]   native: #19 pc 0026f19c  /data/app/~~rrsXQYRBNR6mJq3O2vECbA==/Mono.Android.NET_Tests-5qYa1AD9g6qP-Hr2BrRRnQ==/lib/arm64/libmonosgen-2.0.so (???) (BuildId: dd7058e39c565cec8c5368d64fb956bacc2f52ad)
	04-25 15:27:37.625 19954 20022 F droid.NET_Tests: runtime.cc:708]   native: #20 pc 00070374  /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_start+180) (BuildId: 3cad38c74be79004f5b2927be886e077)
	04-25 15:27:37.625 19954 20022 F droid.NET_Tests: runtime.cc:708]   native: #21 pc 000620a0  /apex/com.android.runtime/lib64/bionic/libc.so (__start_thread+64) (BuildId: 3cad38c74be79004f5b2927be886e077)
	04-25 15:27:37.625 19954 20022 F droid.NET_Tests: runtime.cc:708]   (no managed stack frames)
	…
	04-25 15:27:37.625 19954 20022 F droid.NET_Tests: runtime.cc:716] JNI DETECTED ERROR IN APPLICATION: JNI ERROR (app bug): jobject is an invalid global reference: 0x430a (deleted reference at index 536)
	04-25 15:27:37.625 19954 20022 F droid.NET_Tests: runtime.cc:716]     in call to NewLocalRef
	04-25 15:27:37.625 19954 20022 F libc    : Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 20022 (Thread-3), pid 19954 (droid.NET_Tests)
	…
	04-25 15:27:37.937 20026 20026 F DEBUG   : Abort message: 'JNI DETECTED ERROR IN APPLICATION: JNI ERROR (app bug): jobject is an invalid global reference: 0x430a (deleted reference at index 536)\12    in call to NewLocalRef'
	…
	04-25 15:27:37.937 20026 20026 F DEBUG   : 17 total frames
	04-25 15:27:37.937 20026 20026 F DEBUG   : backtrace:
	04-25 15:27:37.937 20026 20026 F DEBUG   :       #00 pc 000000000005e8fc  /apex/com.android.runtime/lib64/bionic/libc.so (abort+156) (BuildId: 3cad38c74be79004f5b2927be886e077)
	04-25 15:27:37.937 20026 20026 F DEBUG   :       #1 pc 0000000000933abc  /apex/com.android.art/lib64/libart.so (art::Runtime::Abort(char const*)+344) (BuildId: 4ccb65ae9ac5ad5da3af5a342d5b0b92)
	04-25 15:27:37.937 20026 20026 F DEBUG   :       #2 pc 00000000000160fc  /apex/com.android.art/lib64/libbase.so (android::base::SetAborter(std::__1::function<void (char const*)>&&)::$_0::__invoke(char const*)+80) (BuildId: 1470f61c05962eb04fafe76bd58bf664)
	04-25 15:27:37.937 20026 20026 F DEBUG   :       #3 pc 00000000000156d0  /apex/com.android.art/lib64/libbase.so (android::base::LogMessage::~LogMessage()+516) (BuildId: 1470f61c05962eb04fafe76bd58bf664)
	04-25 15:27:37.937 20026 20026 F DEBUG   :       #4 pc 000000000043627c  /apex/com.android.art/lib64/libart.so (art::JavaVMExt::JniAbort(char const*, char const*)+1696) (BuildId: 4ccb65ae9ac5ad5da3af5a342d5b0b92)
	04-25 15:27:37.937 20026 20026 F DEBUG   :       #5 pc 000000000088eaf4  /apex/com.android.art/lib64/libart.so (art::JavaVMExt::JniAbortV(char const*, char const*, std::__va_list)+108) (BuildId: 4ccb65ae9ac5ad5da3af5a342d5b0b92)
	04-25 15:27:37.937 20026 20026 F DEBUG   :       #6 pc 00000000005a1a88  /apex/com.android.art/lib64/libart.so (art::(anonymous namespace)::ScopedCheck::AbortF(char const*, ...) (.__uniq.99033978352804627313491551960229047428)+140) (BuildId: 4ccb65ae9ac5ad5da3af5a342d5b0b92)
	04-25 15:27:37.937 20026 20026 F DEBUG   :       #7 pc 000000000059b970  /apex/com.android.art/lib64/libart.so (art::(anonymous namespace)::ScopedCheck::Check(art::ScopedObjectAccess&, bool, char const*, art::(anonymous namespace)::JniValueType*) (.__uniq.99033978352804627313491551960229047428)+1412) (BuildId: 4ccb65ae9ac5ad5da3af5a342d5b0b92)
	04-25 15:27:37.937 20026 20026 F DEBUG   :       #8 pc 0000000000597784  /apex/com.android.art/lib64/libart.so (art::(anonymous namespace)::CheckJNI::NewRef(char const*, _JNIEnv*, _jobject*, art::IndirectRefKind) (.__uniq.99033978352804627313491551960229047428)+196) (BuildId: 4ccb65ae9ac5ad5da3af5a342d5b0b92)
	04-25 15:27:37.937 20026 20026 F DEBUG   :       #9 pc 00000000001aa580  /data/app/~~rrsXQYRBNR6mJq3O2vECbA==/Mono.Android.NET_Tests-5qYa1AD9g6qP-Hr2BrRRnQ==/lib/arm64/libmonosgen-2.0.so (BuildId: dd7058e39c565cec8c5368d64fb956bacc2f52ad)
	04-25 15:27:37.937 20026 20026 F DEBUG   :       #10 pc 00000000001a8d38  /data/app/~~rrsXQYRBNR6mJq3O2vECbA==/Mono.Android.NET_Tests-5qYa1AD9g6qP-Hr2BrRRnQ==/lib/arm64/libmonosgen-2.0.so (BuildId: dd7058e39c565cec8c5368d64fb956bacc2f52ad)
	04-25 15:27:37.937 20026 20026 F DEBUG   :       #11 pc 000000000019e068  /data/app/~~rrsXQYRBNR6mJq3O2vECbA==/Mono.Android.NET_Tests-5qYa1AD9g6qP-Hr2BrRRnQ==/lib/arm64/libmonosgen-2.0.so (BuildId: dd7058e39c565cec8c5368d64fb956bacc2f52ad)
	04-25 15:27:37.937 20026 20026 F DEBUG   :       #12 pc 000000000019bcec  /data/app/~~rrsXQYRBNR6mJq3O2vECbA==/Mono.Android.NET_Tests-5qYa1AD9g6qP-Hr2BrRRnQ==/lib/arm64/libmonosgen-2.0.so (BuildId: dd7058e39c565cec8c5368d64fb956bacc2f52ad)
	04-25 15:27:37.937 20026 20026 F DEBUG   :       #13 pc 0000000000258434  /data/app/~~rrsXQYRBNR6mJq3O2vECbA==/Mono.Android.NET_Tests-5qYa1AD9g6qP-Hr2BrRRnQ==/lib/arm64/libmonosgen-2.0.so (mono_runtime_invoke_checked+140) (BuildId: dd7058e39c565cec8c5368d64fb956bacc2f52ad)
	04-25 15:27:37.937 20026 20026 F DEBUG   :       #14 pc 000000000026f19c  /data/app/~~rrsXQYRBNR6mJq3O2vECbA==/Mono.Android.NET_Tests-5qYa1AD9g6qP-Hr2BrRRnQ==/lib/arm64/libmonosgen-2.0.so (BuildId: dd7058e39c565cec8c5368d64fb956bacc2f52ad)
	04-25 15:27:37.937 20026 20026 F DEBUG   :       #15 pc 0000000000070374  /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_start(void*)+180) (BuildId: 3cad38c74be79004f5b2927be886e077)
	04-25 15:27:37.937 20026 20026 F DEBUG   :       #16 pc 00000000000620a0  /apex/com.android.runtime/lib64/bionic/libc.so (__start_thread+64) (BuildId: 3cad38c74be79004f5b2927be886e077)

*Fortunately*, `adb logcat` output contains the *thread ID*, which
shows that it's from thread 20022.  What else is from thread 20022?

	04-25 15:27:37.527 19954 20022 I DOTNET  : # t1 iter: 35

It's `t1` in `ConversionsAndThreadsAndInstanceMappingsOhMy()`, which
calls `JNIEnv.CopyObjectArray()`, which uses
`JavaConvert.FromJniHandle<T>()`:

	var t1 = new Thread (() => {
		int[] output_array1 = new int[1];
		for (int i = 0; i < 2000; ++i) {
			Console.WriteLine ("# t1 iter: {0}", i);
			try {
				JNIEnv.CopyObjectArray (grefJliArray, output_array1);
			} catch (Exception e) {
				ignore_t1 = e;
				break;
			}
		}
	});

Time to start investigating `JNIEnv.CopyObjectArray()` and
`JavaConvert.FromJniHandle<T>()`…
Context: 5c23bcd

As per 1cb72eb, we're looking at a JNI error:

	JNI DETECTED ERROR IN APPLICATION: JNI ERROR (app bug): jobject is an invalid global reference: 0x430a (deleted reference at index 536)
	    in call to NewLocalRef

The thread using the invalid global reference is `t1` in
`ConversionsAndThreadsAndInstanceMappingsOhMy()`, which uses
`JNIEnv.CopyObjectArray<int>(IntPtr, int[])`, which in turn uses
`JavaConvert.FromJniHandle<T>()`.

Why would that be failing?

A plausible answer: consider `JavaConvert.JniHandleConverters`:

	partial class JavaConvert {
	  static Dictionary<Type, Func<IntPtr, JniHandleOwnership, object>> JniHandleConverters = new() {
	    { typeof (int), (handle, transfer) => {
	      using (var value = new Java.Lang.Integer (handle, transfer | JniHandleOwnership.DoNotRegister))
	        return value.IntValue ();
	    } },
	  }
	}

Note that we're invoking the `Integer(IntPtr, JniHandleOwnership)`
constructor, ensuring that it has `JniHandleOwnership.DoNotRegister`.
This *prevents* the instance from being placed into the instance
mapping, i.e. `Object.PeekObject()` *will not find the instance*.

Next, turn to `Object.SetHandle()` from 5c23bcd:

	partial class /* Java.Lang. */ Object {
	  protected void SetHandle (IntPtr value, JniHandleOwnership transfer)
	  {
	    var reference = new JniObjectReference (value);
	    JNIEnvInit.ValueManager?.ConstructPeer (
	        this,
	        ref reference,
	        value == IntPtr.Zero ? JniObjectReferenceOptions.None : JniObjectReferenceOptions.Copy);
	    JNIEnv.DeleteRef (value, transfer);
	  }
	}

What's missing?  There is no check for `JniHandleOwnership.DoNotRegister`!

Which means that the `new Integer(…)` will be in the instance map,
*and replace any existing mappings*, only to be subsequently disposed.

This permits the following "total execution order" to happen between
threads `t1` and `t2` in `ConversionsAndThreadsAndInstanceMappingsOhMy()`:

	setup: JNI equivalent to Java `Object[] array = new Object[]{new Integer(1)};
	t0: a = new Integer(handle, TransferLocalRef);
	  // via `Object.GetObject()`, via NativeArrayElementToManaged; creates instance map
	t1: b = new Integer(handle, TransferLocalRef | DoNotRegister)
	  // via `JNIEnv.CopyObjectArray()`, *replaces* instance mapping
	t2: c = Object.GetObject(handle)
	  // via `JNIEnv.GetArray<T>` -> `JNIEnv.CopyArray<T>` -> `NativeArrayElementToManaged` -> `Object.GetObject()`
	  // finds `b`, not `a`!
	t1: `b.Dispose()` begins unregistering `b`
	t2: `c.IntValue()`
	*boom*

"*boom*" can be the previously observed `ObjectDisposedException`, or
can be the JNI crash, depending on where things are between t1 and t2.

The fix is to correct the oversight from 5c23bcd, and forward the
`DoNotRegister` information.
Remove printfs
Copy link
Member

@jonathanpeppers jonathanpeppers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this change, can we remove the categories here:

[Test]
[Category ("CoreCLRIgnore")] //TODO: https://github.com/dotnet/android/issues/10069
[Category ("NativeAOTIgnore")] //TODO: https://github.com/dotnet/android/issues/10079
public void JnienvCreateInstance_RegistersMultipleInstances ()

And maybe the test passes on CoreCLR/NativeAOT?

This reverts commit c864fea.

Looks like `JnienvCreateInstance_RegistersMultipleInstances()`
is NOT fixed.  I'll need to investigate later.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Possible race condition in proxy object creation
2 participants