Skip to content

[vm/ffi] Remove workaround for type feedback missing in FFI trampolines #44454

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
dcharkes opened this issue Dec 11, 2020 · 6 comments
Open
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. library-ffi

Comments

@dcharkes
Copy link
Contributor

dcharkes commented Dec 11, 2020

Update 2021-01-05: The issue is that the JIT trampolines do not report type feedback that Struct. _addressOf contains TypedData on return values and arguments.

This could be addressed by making the VM know the layout of Struct and its subtypes. However, that makes the tree shaker no longer understand those types and their constructor calls. So, we might not want to make those types and their constructors opaque to the CFE&TFA.

We have a workaround, so this does not crash.

===============================================================================

In https://dart-review.googlesource.com/c/sdk/+/169221 we add nested structs and copying of nested structs.

Edit: the CQ also caught this: https://logs.chromium.org/logs/dart/buildbucket/cr-buildbucket.appspot.com/8861204090372525392/+/steps/test_results/0/logs/new_test_failures__logs_/0

Test succeeds:

  reset();
  print(typedDataBackedStruct.a1);
  typedDataBackedStruct.a1 = typedDataBackedStruct.a0;
  Expect.equals(5, typedDataBackedStruct.a1.a0);
  Expect.equals(6, typedDataBackedStruct.a1.a1);

Test fails:

  reset();
  typedDataBackedStruct.a1 = typedDataBackedStruct.a0; // This assignment does not work.
  Expect.equals(5, typedDataBackedStruct.a1.a0);
  Expect.equals(6, typedDataBackedStruct.a1.a1);
$ tools/build.py --no-start-goma -m debug -a x64 runtime ffi_test_functions dart_precompiled_runtime && tools/test.py -a x64 -c dartk,dartkp ffi_2
Done. Made 331 targets from 93 files in 400ms
ninja -C out/DebugX64 -j1000 -l64 runtime ffi_test_functions dart_precompiled_runtime
ninja: Entering directory `out/DebugX64'
ninja: no work to do.
ninja -C out/DebugX64 -j1000 -l64 runtime ffi_test_functions dart_precompiled_runtime done.
The build took 0.714 seconds
Warning: combination of compiler 'dartkp' and runtime 'vm' is invalid. Skipping this combination.
Warning: combination of compiler 'dartk' and runtime 'dart_precompiled' is invalid. Skipping this combination.
Test configurations:
    custom configuration(architecture: x64, compiler: dartk, mode: debug, runtime: vm, system: linux)
Suites tested: ffi_2
    custom configuration(architecture: x64, compiler: dartkp, mode: debug, runtime: dart_precompiled, system: linux)
Suites tested: ffi_2
[00:09 |  47% | +  213 | -    0]

FAILED: dartk-vm debug_x64 ffi_2/function_callbacks_structs_by_value_test
Expected: Pass
Actual: RuntimeError

--- Command "vm" (took 07.000348s):
DART_CONFIGURATION=DebugX64 out/DebugX64/dart --deterministic --optimization-counter-threshold=5 --use-slow-path --stacktrace-every=100 --ignore-unrecognized-flags --packages=/usr/local/google/home/dacoharkes/dart-sdk/sdk/.packages /usr/local/google/home/dacoharkes/dart-sdk/sdk/tests/ffi_2/function_callbacks_structs_by_value_test.dart

exit code:
255

stdout:
callbackPassStructRecurisive(10, (1, 2, 3, 4, 5))
PassStruct20BytesHomogeneousInt32x10(9, (2, 2, 3, 4, 5), 0x7fd435817000)
callbackPassStructRecurisive(8, (3, 2, 3, 4, 5))
PassStruct20BytesHomogeneousInt32x10(7, (4, 2, 3, 4, 5), 0x7fd435817000)
callbackPassStructRecurisive(6, (5, 2, 3, 4, 5))
PassStruct20BytesHomogeneousInt32x10(5, (6, 2, 3, 4, 5), 0x7fd435817000)
callbackPassStructRecurisive(4, (7, 2, 3, 4, 5))
PassStruct20BytesHomogeneousInt32x10(3, (8, 2, 3, 4, 5), 0x7fd435817000)
callbackPassStructRecurisive(2, (9, 2, 3, 4, 5))
PassStruct20BytesHomogeneousInt32x10(1, (10, 2, 3, 4, 5), 0x7fd435817000)
callbackPassStructRecurisive(0, (11, 2, 3, 4, 5))
returning
callbackPassStructRecurisive(11, (1, 2, 3, 4, 5))
PassStruct20BytesHomogeneousInt32x10(10, (2, 2, 3, 4, 5), 0x7fd435817000)
callbackPassStructRecurisive(9, (3, 2, 3, 4, 5))
PassStruct20BytesHomogeneousInt32x10(8, (4, 2, 3, 4, 5), 0x7fd435817000)
callbackPassStructRecurisive(7, (5, 2, 3, 4, 5))
PassStruct20BytesHomogeneousInt32x10(6, (6, 2, 3, 4, 5), 0x7fd435817000)
callbackPassStructRecurisive(5, (7, 2, 3, 4, 5))
PassStruct20BytesHomogeneousInt32x10(4, (8, 2, 3, 4, 5), 0x7fd435817000)
callbackPassStructRecurisive(3, (9, 2, 3, 4, 5))
PassStruct20BytesHomogeneousInt32x10(2, (10, 2, 3, 4, 5), 0x7fd435817000)
callbackPassStructRecurisive(1, (11, 2, 3, 4, 5))
PassStruct20BytesHomogeneousInt32x10(0, (12, 2, 3, 4, 5), 0x7fd435817000)
CallbackWithStruct(0x7fd43581700a)

stderr:
Unhandled exception:
Expect.equals(expected: <5>, actual: <-3085>) fails.
#0      Expect._fail (package:expect/expect.dart:702:5)
#1      Expect.equals (package:expect/expect.dart:132:5)
#2      testCopyLogic (file:///usr/local/google/home/dacoharkes/dart-sdk/sdk/tests/ffi_2/function_callbacks_structs_by_value_test.dart:130:10)
#3      main (file:///usr/local/google/home/dacoharkes/dart-sdk/sdk/tests/ffi_2/function_callbacks_structs_by_value_test.dart:21:5)
#4      _delayEntrypointInvocation.<anonymous closure> (dart:isolate-patch/isolate_patch.dart:283:19)
#5      _RawReceivePortImpl._handleMessage (dart:isolate-patch/isolate_patch.dart:184:12)

--- Re-run this test:
python tools/test.py -c dartk,dartkp ffi_2/function_callbacks_structs_by_value_test
[01:12 | 100% | +  447 | -    1]
@dcharkes dcharkes added area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. library-ffi labels Dec 11, 2020
@dcharkes
Copy link
Contributor Author

dcharkes commented Dec 14, 2020

When the test fails, a nested struct accessed on a larger struct which has a typed data as backing store, is returned as a struct with a pointer as backing store, rather than a typed data view. This happens because _addressOf is inferred to have the type Pointer (which it has not in this case, it's a TypedData). This might be speculative optimization.

==== file:///usr/local/google/home/dacoharkes/dart-sdk/sdk/tests/ffi_2/function_structs_by_value_generated_test.dart_Struct8BytesNestedInt_get_a0 (GetterFunction)
B0[graph]:0 {
      v0 <- Constant(#null)
      v1 <- Constant(#<optimized out>)
      v7 <- Constant(#Type: Pointer*) T{_Type}
      v10 <- Constant(#true) T{bool}
      v19 <- Constant(#_ImmutableList len:3) T{_ImmutableList}
      v26 <- Constant(#_ImmutableList len:3) T{_ImmutableList}
      v35 <- Constant(#TypeArguments: (H1d7e78bd) [Type: Struct4BytesHomogeneousInt16*]) T{TypeArguments}
}
B1[function entry]:2 {
      v2 <- Parameter(0) T{Struct8BytesNestedInt}
}
    CheckStackOverflow:8(stack=0, loop=0)
    v3 <- AllocateObject(Struct4BytesHomogeneousInt16) T{Struct4BytesHomogeneousInt16}
    v5 <- InstanceCall:10( get:_addressOf@8050071<0>, v2 IC[0: ]) T{Object?}
    v8 <- InstanceCall:12( _simpleInstanceOf@0150898<0>, v5, v7 IC[0: ]) T{*?}

[...]

After ApplyClassIds
==== file:///usr/local/google/home/dacoharkes/dart-sdk/sdk/tests/ffi_2/function_structs_by_value_generated_test.dart_Struct8BytesNestedInt_get_a0 (GetterFunction)
B0[graph]:0 {
      v0 <- Constant(#null)
      v1 <- Constant(#<optimized out>)
      v7 <- Constant(#Type: Pointer*) T{_Type}
      v10 <- Constant(#true) T{bool}
      v19 <- Constant(#_ImmutableList len:3) T{_ImmutableList}
      v26 <- Constant(#_ImmutableList len:3) T{_ImmutableList}
      v35 <- Constant(#TypeArguments: (H1d7e78bd) [Type: Struct4BytesHomogeneousInt16*]) T{TypeArguments}
}
B1[function entry]:2 {
      v2 <- Parameter(0) T{Struct8BytesNestedInt}
}
    CheckStackOverflow:8(stack=0, loop=0)
    v3 <- AllocateObject(Struct4BytesHomogeneousInt16) T{Struct4BytesHomogeneousInt16}
    CheckClass:10(v2 Cids[1: Struct8BytesNestedInt etc.  cid 1132])
    v5 <- LoadField(v2 . _addressOf@8050071 {final}) T{Pointer}

On failing runs with --trace-field-guards --optimization-counter-threshold=5

Store Field <Struct._addressOf@8050071>: final <?> <- Pointer<Struct20BytesHomogeneousInt32>: address=0x55b86aabf9c0
    => <not-nullable Pointer>

On succeeding runs with --trace-field-guards --optimization-counter-threshold=20

Store Field <Struct._addressOf@8050071>: final <?> <- Pointer<Struct20BytesHomogeneousInt32>: address=0x562806815980
    => <not-nullable Pointer>
Store Field <Struct._addressOf@8050071>: final <not-nullable Pointer> <- TypedDataView(cid: 104)
    => <*>

This looks like speculative optimization failing to deoptimize when assumptions no longer hold.

@dcharkes
Copy link
Contributor Author

Working theory:

Fragment FlowGraphBuilder::WrapTypedDataBaseInStruct(
    const AbstractType& struct_type) {
  const auto& struct_sub_class = Class::ZoneHandle(Z, struct_type.type_class());
  struct_sub_class.EnsureIsFinalized(thread_);
  const auto& lib_ffi = Library::Handle(Z, Library::FfiLibrary());
  const auto& struct_class =
      Class::Handle(Z, lib_ffi.LookupClass(Symbols::Struct()));
  const auto& struct_addressof = Field::ZoneHandle(
      Z, struct_class.LookupInstanceFieldAllowPrivate(Symbols::_addressOf()));
  ASSERT(!struct_addressof.IsNull());

  Fragment body;
  LocalVariable* typed_data = MakeTemporary("typed_data_base");
  body += AllocateObject(TokenPosition::kNoSource, struct_sub_class, 0);
  body += LoadLocal(MakeTemporary("struct"));  // Duplicate Struct.
  body += LoadLocal(typed_data);
  body += StoreInstanceField(struct_addressof,
                             StoreInstanceFieldInstr::Kind::kInitializing);
  body += DropTempsPreserveTop(1);  // Drop TypedData.
  return body;
}

runtime/vm/compiler/frontend/kernel_to_il.cc

This allocates an object, stores a TypedDataView into it, without telling the JIT that that now Struct._addressOf can contain a TypedDataView. Which makes the JIT rely on that it has only seen Pointers in Struct._addressOf, which makes this test fail.

(Trying to add the guard fails, because FFI trampolines are force optimized.)

  body += StoreInstanceFieldGuarded(
      struct_addressof, StoreInstanceFieldInstr::Kind::kInitializing);

@mkustermann mkustermann added the crash Process exits with SIGSEGV, SIGABRT, etc. An unhandled exception is not a crash. label Dec 16, 2020
@dcharkes dcharkes added this to the January 2021 milestone Dec 16, 2020
@dcharkes dcharkes changed the title [vm/ffi] Flaky crash on copying nested structs [vm/ffi] Crash in JIT due type feedback missing in FFI trampolines Dec 17, 2020
@dcharkes dcharkes changed the title [vm/ffi] Crash in JIT due type feedback missing in FFI trampolines [vm/ffi] Crash in JIT due to type feedback missing in FFI trampolines Dec 17, 2020
@dcharkes dcharkes changed the title [vm/ffi] Crash in JIT due to type feedback missing in FFI trampolines [vm/ffi] Crash in JIT due to type feedback missing in FFI trampolines (in not yet landed CL) Dec 17, 2020
@dcharkes
Copy link
Contributor Author

I'm unable to construct a case of this on master (without the CL):

  • All loads and stores work no matter whether the struct is backed by TypedData or Pointer.
  • addressOf always does an assert assignable due to the type argument having to be checked, even if the type feedback only contains Pointer.

It fails in the nested struct CL because we branch on Pointer<T extends NativeType> instead of on Pointer<T> in various places where the code branches on whether something is a Pointer or TypedData.

dart-bot pushed a commit that referenced this issue Dec 18, 2020
This CL adds support for nested structs in FFI calls, callbacks, and
memory loads and stores through the Struct classes itself.

Nesting empty structs and nesting a structs in themselves (directly or
indirectly) is reported as error.

This feature is almost fully implemented in the CFE transformation.

Because structs depend on the sizes of their nested structs, the structs
now need to be processed in topological order.

Field access to nested structs branches at runtime on making a derived
Pointer if the backing memory of the outer struct was a Pointer or
making a TypedDataView if the backing memory of the outer struct was
a TypedData.

Assigning to a nested struct is a byte for byte copy from the source.

The only changes in the VM are contained in the native calling
convention calculation which now recursively needs to reason about
fundamental types instead of just 1 struct deep.

Because of the amount of corner cases in the calling conventions that
need to be covered, the tests are generated, rather than hand-written.

ABIs tested on CQ: x64 (Linux, MacOS, Windows), ia32 (Linux, Windows),
arm (Android softFP, Linux hardFP), arm64 Android.
ABIs tested locally through Flutter: arm64 iOS.
ABIs not tested: ia32 Android (emulator), x64 iOS (simulator), arm iOS.
TEST=runtime/bin/ffi_test/ffi_test_functions_generated.cc
TEST=runtime/bin/ffi_test/ffi_test_functions.cc
TEST=tests/{ffi,ffi_2}/function_structs_by_value_generated_test.dart
TEST=tests/{ffi,ffi_2}/function_callbacks_structs_by_value_generated_tes
TEST=tests/{ffi,ffi_2}/function_callbacks_structs_by_value_test.dart
TEST=tests/{ffi,ffi_2}/vmspecific_static_checks_test.dart

Closes #37271.

Contains a temporary workaround for
#44454.

Change-Id: I5e5d10e09e5c3fc209f5f7e997efe17bd362214d
Cq-Include-Trybots: luci.dart.try:dart-sdk-linux-try,dart-sdk-mac-try,dart-sdk-win-try,vm-ffi-android-debug-arm-try,vm-ffi-android-debug-arm64-try,vm-kernel-asan-linux-release-x64-try,vm-kernel-mac-debug-x64-try,vm-kernel-linux-debug-ia32-try,vm-kernel-linux-debug-x64-try,vm-kernel-nnbd-linux-debug-x64-try,vm-kernel-nnbd-linux-debug-ia32-try,vm-kernel-nnbd-mac-release-x64-try,vm-kernel-nnbd-win-debug-x64-try,vm-kernel-precomp-linux-debug-x64-try,vm-kernel-precomp-linux-debug-simarm_x64-try,vm-kernel-precomp-nnbd-linux-debug-x64-try,vm-kernel-precomp-win-release-x64-try,vm-kernel-reload-linux-debug-x64-try,vm-kernel-reload-rollback-linux-debug-x64-try,vm-kernel-win-debug-x64-try,vm-kernel-win-debug-ia32-try,vm-precomp-ffi-qemu-linux-release-arm-try,vm-kernel-precomp-obfuscate-linux-release-x64-try,vm-kernel-msan-linux-release-x64-try,vm-kernel-precomp-msan-linux-release-x64-try,vm-kernel-precomp-android-release-arm_x64-try,analyzer-analysis-server-linux-try
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/169221
Commit-Queue: Daco Harkes <[email protected]>
Reviewed-by: Martin Kustermann <[email protected]>
Reviewed-by: Dmitry Stefantsov <[email protected]>
@dcharkes dcharkes changed the title [vm/ffi] Crash in JIT due to type feedback missing in FFI trampolines (in not yet landed CL) [vm/ffi] Remove workaround for type feedback missing in FFI trampolines Jan 5, 2021
@mkustermann mkustermann removed this from the January 2021 milestone Jan 5, 2021
@dcharkes
Copy link
Contributor Author

dcharkes commented Jan 5, 2021

Because we have a workaround, this is not crashing.

@dcharkes dcharkes removed the crash Process exits with SIGSEGV, SIGABRT, etc. An unhandled exception is not a crash. label Jan 5, 2021
@mkustermann
Copy link
Member

mkustermann commented Jan 5, 2021

As discussed in today's meeting we should separate allocation from sizeof calculation (similar to C code) for code size / tree shake reasons and performance reasons. We could use an API such as

// In "dart:ffi"
abstract class Allocator {
  Pointer<Void> allocate(int numBytes);
  void free(Pointer<Void> pointer);
}

// In "dart:ffi"
extern Pointer<T> allocate<T extends Struct>(Allocator allocator);

// In "package:ffi/ffi.dart"
class MallocAllocator extends Allocator { ... }
class ZoneAllocator extends Allocator { ... }

final malloc = MallocAllocator();

// User code:
void main() {
  Pointer<Foo> foop = allocate<Foo>(malloc);
  ...
  free<Foo>(foop, malloc);
}
// User code (lowered to kernel):
void main() {
  Pointer<Foo> foop = malloc.allocate(sizeOf<Foo>(malloc) /* <-- or rather it's lowered form */).cast<Foo>();
  ...
  malloc.free(foop.cast<Void>());
}

So we'll

  • Make CL to demonstrate these changes
  • Make breaking change request with this information and motivation for it (treeshake-ability, performance)
  • Discuss API in next meeting
  • Possible make old API deprecated before Flutter 2.0 / FFI 1.0

@dcharkes
Copy link
Contributor Author

dcharkes commented Feb 7, 2024

fd2e9b9 Made trampolines explicit in the CFE.

If we make the FFI trampoline actually return TypedData instead of T extends Struct and wrap the result in the struct constructor, both the TFA and JIT would know about the struct constructor invocations. We'd need to do this for both the closures and the @Native external functions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. library-ffi
Projects
None yet
Development

No branches or pull requests

2 participants