[native_assets_cli] BuildOutput extension: addDataAssetDirectories #2097

MichealReed · 2025-03-12T20:16:49Z

This PR adds helper extensions to BuildOutputBuilder to assist with adding assets and directories to the package.

addFoundCodeAssets recursively searches the input config's target directory for libraries matching the given name. The prefix and extension are determined by the targetOS helper libraryFileName. Because the CodeAsset name does not always match the name of the file that contains the library, we use a mapping in which the key is used to find the file and the value is the name under which the asset becomes available to Dart. Otherwise we see an error like the one below, as the fallback does not seem to work in this case.

Invalid argument(s): Couldn't resolve native function 'add' in 'package:download_asset/native_add.dart' : No asset with id 'package:download_asset/native_add.dart' found. Available native assets: package:download_asset/native_add_windows_x64.dart. Attempted to fallback to process lookup.

example usage:

      await output.addFoundCodeAssets(
        input: input,
        assetMappings: [
          // asset to find : name to add it as
          { 'native_add_windows_x64' : 'native_add.dart'},
        ],
      );

addDataAssetDirectories takes a list of directory or file paths and adds them as DataAsset dependencies, throwing an error if a given path is not found.

example usage:

await output.addDataAssetDirectories([
  'assets_a/file.txt', // adds the parent dir and file to dependencies
  'assets_b', // recursively adds uris
], input: input);
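
To make the described behavior concrete, here is a minimal sketch of the path handling only, using just dart:io. It is not the code in this PR: the function name collectAssetFiles is hypothetical, and it returns the file URIs it finds rather than registering DataAssets on the output.

```dart
import 'dart:io';

/// Hypothetical helper: resolves each entry of [paths] against [packageRoot]
/// and returns the URIs of all files found. Directories are walked
/// recursively. Throws if an entry is neither a file nor a directory.
List<Uri> collectAssetFiles(Uri packageRoot, List<String> paths) {
  final files = <Uri>[];
  for (final path in paths) {
    final resolved = packageRoot.resolve(path);
    final type = FileSystemEntity.typeSync(resolved.toFilePath());
    if (type == FileSystemEntityType.file) {
      // A single file is added as-is.
      files.add(resolved);
    } else if (type == FileSystemEntityType.directory) {
      // A directory contributes every file below it.
      final entries = Directory.fromUri(resolved).listSync(recursive: true);
      files.addAll(entries.whereType<File>().map((file) => file.uri));
    } else {
      throw ArgumentError.value(path, 'paths', 'Not found under $packageRoot');
    }
  }
  return files;
}
```

A hook could call this with input.packageRoot and then decide, per returned URI, which asset name to use and which dependencies to record.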

Tests have been added to data_assets/validation_test.dart and code_assets/validation_test.dart and the download example was updated to use addFoundCodeAssets.

Open Questions:

If we add a code asset then addDataAssetDirectories adds the folder that asset exists in, will its inclusion as a data asset cause conflict?
Is there a better way to derive name.dart for the CodeAsset without mappings?
BuildOutputBuilder seemed like a better place for the extension so these are accessible in the hook. Any thoughts?

closes #1346

dcharkes

Thanks @MichealReed!

If we add a code asset then addDataAssetDirectories adds the folder that asset exists in, will its inclusion as a data asset cause conflict?

Code assets and data assets live in a separate namespace, so that should not lead to issues.

Is there a better way to derive name.dart for the CodeAsset without mappings?

I'm inclined to not add support for addFoundCodeAssets. If users need to do a mapping from found files to asset id, they probably want to do it programmatically. Also, they might want to map asset id to file path instead. Or they might want to map targetOS+targetArch to both filename and asset id. So I don't assume that code assets will be regular enough for a helper function to make sense.

Do you have a use case for addFoundCodeAssets?

BuildOutputBuilder seemed like a better place for the extension so these are accessible in the hook.

sgtm! 👍

dcharkes · 2025-03-12T20:31:44Z

pkgs/native_assets_cli/lib/src/config.dart

+    final packageName = input.packageName;
+    final packageRoot = input.packageRoot;
+    for (final path in paths) {
+      final resolvedUri = packageRoot.resolve('$packageName/$path');


Why do we have $packageName/ here? Wouldn't we simply want to resolve $path?

Files created in the tests were showing up under $packageName. Is there something else moving the assets to this path?

I was kind of expecting this function to do what pkgs/native_assets_builder/test_data/simple_data_asset/hook/build.dart does, and that file being refactored to simply a call to this function.

How can test_data/simple_data_asset be tested? It depends on a nonexistent dart:asset and ByteAsset.

Pushed a fix for the name that mirrors that example; we still must resolve the relative $packageName/$path to find the file though. Do you want this to be refactored for compatibility with system absolute paths, or relative only?

The way to test for now is to run dart test in package:native_assets_builder.

we still must resolve the relative $packageName/$path to find the file though.

I don't understand: if I change the helper function to final resolvedUri = packageRoot.resolve(path);, then dart test in package:native_assets_builder succeeds.

Do you want this to be refactored for compatibility with system absolute paths or relative only?

I think relative paths in the package are fine for now. That's the most common use case. (For desktop applications that use an LLM that's installed on the system, I can imagine wanting to support absolute paths. So we could add such support in the future. We should be able to distinguish a string containing an absolute path from a relative path.)

This should be good now. makeDataBuildInput in the data assets validation tests was not resolving the packageRoot correctly, which caused the failure in the validation test. Now both simple_data_asset and the validation tests pass with resolve(path).
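
For readers following the resolve discussion, the sketch below shows how Uri.resolve from dart:core treats the two variants. The function names and the example root file:///work/my_pkg/ are made up for illustration; this is not the helper from the PR.

```dart
/// Resolves [path] directly against [packageRoot], as suggested in review.
Uri resolveInPackage(Uri packageRoot, String path) =>
    packageRoot.resolve(path);

/// The earlier variant, which prepends the package name before resolving.
Uri resolveWithPackageName(
  Uri packageRoot,
  String packageName,
  String path,
) => packageRoot.resolve('$packageName/$path');
```

With packageRoot set to file:///work/my_pkg/, the first function maps assets/file.txt to file:///work/my_pkg/assets/file.txt, while the second yields file:///work/my_pkg/my_pkg/assets/file.txt. Note also that Uri.resolve replaces the last path segment when the base has no trailing slash, so file:///work/my_pkg (without the slash) resolves assets/file.txt to file:///work/assets/file.txt.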

MichealReed · 2025-03-12T21:30:30Z

Thanks @MichealReed!

Gladly!

I'm inclined to not add support for addFoundCodeAssets. If users need to do a mapping from found files to asset id, they probably want to do it programmatically. Also, they might want to map asset id to file path instead. Or they might want to map targetOS+targetArch to both filename and asset id. So I don't assume that code assets will be regular enough for a helper function to make sense.

Do you have a use case for addFoundCodeAssets?

addFoundCodeAssets is not for mapping, it's for recursively finding libraries in the build output. It needs mappings to map to the name ffi expects.

I think the changes in the download_asset hook demonstrate this sort of programmatic mapping. The generated file includes extensions so those must be stripped for the hash to validate and be normalized for library finding.

      await output.addFoundCodeAssets(
        input: input,
        assetMappings: [
          // asset to find : name to add it as
          { targetName : 'native_add.dart'},
        ],
      );

If we provide targetName directly (without mapping) as the CodeAsset name argument, we see:

Couldn't resolve native function 'add' in 'package:download_asset/native_add.dart' : 
No asset with id 'package:download_asset/native_add.dart' found. 

Available native assets: package:download_asset/native_add_windows_x64.dart.

The name is still somewhat ambiguous to me; from what I can tell, it maps to the generated Dart file, meaning most variable parts must be stripped. Unless the CodeAsset name reads exactly native_add.dart, the program will not run and, as mentioned, the fallback method does not find add in native_add_windows_x64.dart. The mapping just maps files with varying architecture and platform suffixes, or unexpected names, to the library name expected on the Dart side.

In another use case, some libraries expect other dynamic libraries to be bundled with the executable and have nested ways of opening them. This helper lets users find multiple libraries from a single build so they can be bundled with the output.

There are also cases, as with CMake, where output libraries are heavily nested or libraries from child builds are needed, and users need a quick way to find these.

dcharkes · 2025-03-13T07:43:58Z

addFoundCodeAssets is not for mapping, it's for recursively finding libraries in the build output.

There are also cases, as with CMake, where output libraries are heavily nested or libraries from child builds are needed, and users need a quick way to find these.

Ah got it! You want to simply invoke the cmake process and then read the file system of what it output. (Does CMake not have a way for it to tell us what dylibs it output? Such that we would parse that information instead of scanning the file system?)

If we want to infer information from the files on disk, I think we should make it more general.

It should then also work for (a) downloading dynamic libraries, (b) grabbing dynamic libraries from an external package manager directory, and (c) dynamic libraries checked in to the package git [please don't!].

To make it more general:

1. Recognize whether it's a dylib or static lib. And if both with the same name are available, respect the CodeConfig.linkModePreference.
2. Use a tool to recognize the target architecture instead of assuming it's the CodeConfig.targetArchitecture, and filter the dylibs/static libs on whether the architecture matches the CodeConfig.targetArchitecture. We are using such tools already in the tests in package:native_toolchain_c. However, I would then not want such an extension to be defined in package:native_assets_cli; it should be in package:native_toolchain_c most likely.
3. Use a tool to recognize the target OS. I don't know if this exists. Can we recognize whether a dylib is meant for Android or Linux? Both will be .so. Can we recognize whether a dylib is meant for iOS or macOS?

If we go that route, I think assetMappings should be a callback function makeAssetId(CodeAssetInfo info) that has access to all the inferred information, so that users can write arbitrary Dart code in there to come up with an assetId (info) => '$packageName/src/${info.file.path.split(...).last}.dart'.
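
A rough sketch of what such a callback-based shape could look like follows. Everything in it is hypothetical: CodeAssetInfo here only carries the file URI, and assignAssetIds is an illustrative stand-in, not an API of package:native_assets_cli.

```dart
/// Hypothetical record of what a finder inferred about one library file.
/// A fuller version might also carry OS, architecture, and link mode.
class CodeAssetInfo {
  CodeAssetInfo(this.file);

  final Uri file;
}

/// Hypothetical callback type: user code that derives an asset id.
typedef MakeAssetId = String Function(CodeAssetInfo info);

/// Calls [makeAssetId] once per found library and returns asset id -> file.
/// Throws if two libraries end up with the same asset id.
Map<String, Uri> assignAssetIds(
  Iterable<Uri> foundLibraries,
  MakeAssetId makeAssetId,
) {
  final result = <String, Uri>{};
  for (final file in foundLibraries) {
    final id = makeAssetId(CodeAssetInfo(file));
    if (result.containsKey(id)) {
      throw StateError('Duplicate asset id "$id": $file and ${result[id]}');
    }
    result[id] = file;
  }
  return result;
}
```

Because the callback runs once per file, several libraries can be picked up in one call as long as the callback produces distinct ids, for example by deriving the id from info.file.pathSegments.last.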

And we should also not limit it to the output directory, we should allow using an arbitrary directory. (For example allowing users to run an external package manager and then pointing to a directory with dylibs in there.)

This is a very different change from the data asset change, so I'd say that should be a separate PR if we go that way.

Also, in package:native_toolchain_cmake you can assume that you have the right targetArchitecture and the right targetOS and something respecting the link mode preference. So, maybe it makes more sense for now to just keep the extension private in that package and not try to solve it for the general case just yet. That would avoid having to detect architecture/os/link mode from the files. WDYT?

MichealReed · 2025-03-13T15:53:46Z

Ah got it! You want to simply invoke the cmake process and then read the file system of what it output. (Does CMake not have a way for it to tell us what dylibs it output? Such that we would parse that information instead of scanning the file system?)

I think it's infeasible for it to do so, since one build process can produce many libraries. A single build of Dawn contains many projects which produce static or dynamic libraries of their own. Sometimes we build it for the monolithic dynamic lib it produces; sometimes we want to grab the statics or other dynamic libs for linking.

It should then also work for (a) downloading dynamic libraries, (b) grabbing dynamic libraries from an external package manager directory, and (c) dynamic libraries checked in to the package git [please don't!].

It should already cover these cases: just download, build, or otherwise place the libraries in a folder, then provide the folder and a list of names, and this helper will find them.

Recognize whether it's a dylib or static lib. And if both with the same name are available, respect the CodeConfig.linkModePreference.

This already does a selective find to get the proper lib based on:

input.config.code.targetOS.libraryFileName(
  searchName,
  linkMode,
);
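
As an illustration of the naming conventions such a lookup relies on, here is a standalone sketch. The enums and the function conventionalLibraryFileName are hypothetical stand-ins written for this example; they are not the targetOS.libraryFileName implementation and only cover the common prefix and extension rules.

```dart
/// Hypothetical, reduced set of target operating systems.
enum TargetOS { android, iOS, linux, macOS, windows }

/// Hypothetical link kinds, named to avoid Dart's built-in identifiers.
enum LinkKind { dynamicLibrary, staticLibrary }

/// Returns the conventional file name for a library called [name].
String conventionalLibraryFileName(TargetOS os, String name, LinkKind kind) {
  final isDynamic = kind == LinkKind.dynamicLibrary;
  switch (os) {
    case TargetOS.windows:
      // Windows uses no "lib" prefix.
      return isDynamic ? '$name.dll' : '$name.lib';
    case TargetOS.macOS:
    case TargetOS.iOS:
      return isDynamic ? 'lib$name.dylib' : 'lib$name.a';
    case TargetOS.android:
    case TargetOS.linux:
      // Android and Linux share the same ".so" convention, which is why
      // the file name alone cannot tell them apart.
      return isDynamic ? 'lib$name.so' : 'lib$name.a';
  }
}
```

For native_add this gives native_add.dll on Windows and libnative_add.so on Linux and Android, so a recursive search for that exact file name already filters by OS family and link mode, but not by architecture.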

2. Use a tool to recognize the target architecture instead of assuming it's the CodeConfig.targetArchitecture, and filter the dylibs/static libs on whether the architecture matches the CodeConfig.targetArchitecture. We are using such tools already in the tests in package:native_toolchain_c. However, I would then not want such an extension to be defined in package:native_assets_cli; it should be in package:native_toolchain_c most likely.
3. Use a tool to recognize the target OS. I don't know if this exists. Can we recognize whether a dylib is meant for Android or Linux? Both will be .so. Can we recognize whether a dylib is meant for iOS or macOS?

Is the concern here that a monolithic output might contain libraries for many different OSes/architectures, and that matching on prefix and extension could find the wrong one? Even CMake, mature as it is, does not detect this and opted for a hints-and-paths approach that gives more manual control over library finding. The targetOS and the library naming convention should protect us in most cases, as seen in the updated download example. It may be over-engineering to handle this where even advanced toolchains do not; with this helper, developers would still have the option to add CodeAssets manually.

it should be in package:native_toolchain_c most likely.

Maybe a bigger discussion about separation? Many of the helpers in toolchain_c are beneficial to other toolchains and it does not seem logical to include toolchain_c in projects that do not need a cbuilder.

If we go that route, I think assetMappings should be a callback function makeAssetId(CodeAssetInfo info) that has access to all the inferred information, so that users can write arbitrary Dart code in there to come up with an assetId (info) => '$packageName/src/${info.file.path.split(...).last}.dart'.

Then only one library could be found at a time. I could not find ffi.so and dep.so in one go as they will be added with the same name. Couldn't the user effectively accomplish this via string interpolation in the map or by providing a prebuilt variable to the map?

And we should also not limit it to the output directory, we should allow using an arbitrary directory. (For example allowing users to run an external package manager and then pointing to a directory with dylibs in there.)

This is why outputDirectory is nullable and falls back to input.outputDirectory when null. A user can provide any directory and the recursive find will run there; by default, it searches the input's output folder.

This is a very different change from the data asset change, so I'd say that should be a separate PR if we go that way.

Understood, can strip this out if there's no way we can land a version of this here. I think it already covers most of the cases mentioned, and those it does not would be easily recognizable and fixable by devs through errors.

MichealReed · 2025-03-13T16:50:17Z

Probably the only way to be certain about an architecture/operating system would be to inspect the binary of the file to identify through ELF, PE, and Mach-O with additional symbol inspection per platform to determine OS differences, primarily between iOS/macOS/visionOS and Linux, Android and Fuchsia (if even needed, they may be compatible anyway, maybe try to load it?). Getting a comprehensive map of fallback symbols per platform would probably be the most intensive part of doing this, but it all could be done in Dart versus relying on external tools. We may get lost in a maze of edge cases trying to make the detection flawless though.
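
As a rough illustration of the first step of such an inspection, the sketch below classifies a file by its leading magic bytes. It is a simplification written for this thread: the names BinaryFormat and detectBinaryFormat are hypothetical, it does not determine architecture or OS, and a real PE check would also follow the DOS header to the "PE\0\0" signature.

```dart
/// Container formats this sketch can tell apart.
enum BinaryFormat { elf, pe, machO, machOFat, unknown }

/// Classifies a binary by the first bytes of the file in [header].
BinaryFormat detectBinaryFormat(List<int> header) {
  if (header.length < 4) return BinaryFormat.unknown;

  bool startsWith(List<int> magic) {
    for (var i = 0; i < magic.length; i++) {
      if (header[i] != magic[i]) return false;
    }
    return true;
  }

  // ELF: 0x7F followed by "ELF".
  if (startsWith([0x7F, 0x45, 0x4C, 0x46])) return BinaryFormat.elf;
  // PE files begin with the DOS stub signature "MZ".
  if (startsWith([0x4D, 0x5A])) return BinaryFormat.pe;
  // Thin Mach-O: 0xFEEDFACE (32-bit) or 0xFEEDFACF (64-bit), in either
  // byte order as stored on disk.
  if (startsWith([0xFE, 0xED, 0xFA, 0xCE]) ||
      startsWith([0xCE, 0xFA, 0xED, 0xFE]) ||
      startsWith([0xFE, 0xED, 0xFA, 0xCF]) ||
      startsWith([0xCF, 0xFA, 0xED, 0xFE])) {
    return BinaryFormat.machO;
  }
  // Fat (universal) Mach-O: 0xCAFEBABE. Java class files share this magic.
  if (startsWith([0xCA, 0xFE, 0xBA, 0xBE])) return BinaryFormat.machOFat;
  return BinaryFormat.unknown;
}
```

Telling Linux from Android, or iOS from macOS, would need the further per-platform inspection described above; the magic bytes alone only identify the container format.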

dcharkes · 2025-03-14T08:22:35Z

It should then also work for (a) downloading dynamic libraries, (b) grabbing dynamic libraries from an external package manager directory, and (c) dynamic libraries checked in to the package git [please don't!].

It should already cover these cases: just download, build, or otherwise place the libraries in a folder, then provide the folder and a list of names, and this helper will find them.

I'm thinking users might download/build for all target architectures and OSes in a single directory, like a tarball. That means it would pick up the wrong ones as well. Ditto someone might write a build script that outputs fat binaries, containing more than one architecture. So I don't believe it covers those cases in general. Therefore, I am hesitant to provide something in the API that everyone sees that covers only a subset of the use cases.

Maybe a bigger discussion about separation? Many of the helpers in toolchain_c are beneficial to other toolchains and it does not seem logical to include toolchain_c in projects that do not need a cbuilder.

Agreed!

And the tools abstraction itself should also be reusable for other toolchains than just C toolchains.

[native_toolchain_c] Export tools in package #856

But that kind of requires a refactoring as well, as there are some issues with how it's currently set up.

[native_toolchain_c] Export tools in package #856 (comment)

I don't have the cycles to deal with that unfortunately atm. Maybe such packages should be community owned as well.

(The way to make progress on your package now is probably to either do an ugly src/ import of package:native_toolchain_c and pin the dependency. Or, maybe better, duplicate the subset of logic you need in your own package.)

Then only one library could be found at a time.

Why? Wouldn't you call the callback for every single library you find?

Understood, can strip this out if there's no way we can land a version of this here

Yes, let's land the data asset extension. 👍

Probably the only way to be certain about an architecture/operating system would be to inspect the binary of the file to identify through ELF, PE, and Mach-O with additional symbol inspection per platform to determine OS differences, primarily between iOS/macOS/visionOS and Linux, Android and Fuchsia (if even needed, they may be compatible anyway, maybe try to load it?). Getting a comprehensive map of fallback symbols per platform would probably be the most intensive part of doing this, but it all could be done in Dart versus relying on external tools. We may get lost in a maze of edge cases trying to make the detection flawless though.

Agreed, therefore I'm leaning towards not adding this for "all dylibs and static libs in a directory". It feels like a lot of engineering effort to get it right for the general case. For your specific case you already know the target OS and target architecture is right, because you passed it in to CMake.

MichealReed · 2025-03-14T15:17:04Z

I'm thinking users might download/build for all target architectures and OSes in a single directory, like a tarball. That means it would pick up the wrong ones as well. Ditto someone might write a build script that outputs fat binaries, containing more than one architecture. So I don't believe it covers those cases in general.

If libraries have the same name, they would be split into platform-specific folders; they cannot have the same name in the same folder. If they are named ambiguously and the helper grabs the wrong library on the first attempt, it will error, and the developer can provide platform-specific paths to point at the correct output directory. Fat libraries would work as well, since the builder supports them. This already works for the download_asset example without conflict.

The generalization here is in line with the libraryFileName helper. Rejecting it under that premise is a bit discouraging, but it's your show to run here.

identify through ELF, PE, and Mach-O

This is very viable though, and the external tooling should be replaced with native Dart code that does this. I've prototyped something here:

https://gist.github.com/MichealReed/bc2aa5d059133ab17bf7571b353335fe

For your specific case you already know the target OS and target architecture is right, because you passed it in to CMake.

This would hold in other cases too: builds will either be platform-isolated by the temporary-directory approach the builder currently uses, or manually isolated into separate build folders (build/platform/arch). Building everything for all platforms into the same folder is an anti-pattern that breaks other things and that developers should avoid, not an edge case to cover.

Yes, let's land the data asset extension. 👍

Roger that, will remove this code over the weekend and we can finalize on Monday. Let me know if you have more questions or change your mind 👍

github-actions · 2025-03-17T09:31:15Z

PR Health

Breaking changes ✔️
Changelog Entry ✔️
API leaks ✔️
License Headers ✔️ (no missing headers)

Unrelated files missing license headers:
pkgs/jni/lib/src/third_party/generated_bindings.dart
pkgs/objective_c/lib/src/ns_input_stream.dart

coveralls · 2025-03-19T22:02:14Z

coverage: 86.652% (-0.3%) from 86.948%
when pulling 57d6d41 on MichealReed:dependency_helpers
into 0cd2e55 on dart-lang:main.

MichealReed added 7 commits March 12, 2025 12:56

add BuildOutput extensions addFoundCodeAssets and addDataAssetDirecto…

80af89c

…ries

update changelog

28e4dd2

should use targetOS

ec9b98c

tests should use targetOS

ba20b2d

mappings instead of names, libraries don't always match convention.

80f7576

fix comments

42d26a3

test nesting

6795c87

github-actions bot added the package:hooks label Mar 12, 2025

MichealReed changed the title ~~BuildOutput extensions addFoundCodeAssets and addDataAssetDirectories~~ [native_assets_cli] BuildOutput extensions addFoundCodeAssets and addDataAssetDirectories Mar 12, 2025

optional outputDirectory URI

703a5c5

dcharkes reviewed Mar 12, 2025

MichealReed added 2 commits March 12, 2025 15:56

normalize keys instead of edit generated file, needs new issue

6c91856

use dependency from dart

7da0b8f

MichealReed added 4 commits March 12, 2025 16:38

remove main from ffigen

303e701

clean-up unnecessary show

4e99e19

show AddDataAssetsDirectoryExtension and GetLinkMode

d78a647

BuildOutputBuilder in changelog.

2836592

dcharkes reviewed Mar 13, 2025

pkgs/native_assets_cli/test/data_assets/validation_test.dart (outdated, resolved)

dcharkes reviewed Mar 13, 2025

pkgs/native_assets_cli/lib/src/config.dart (outdated, resolved)

MichealReed added 5 commits March 15, 2025 19:44

remove AddFoundCodeAssets extension

82a4b05

remove AddFoundCodeAssets extension

09cbb6b

remove GetLinkMode extension

a9b87b9

remove addfound tests

379a76e

dont add directories

b76ec8f

MichealReed added 6 commits March 15, 2025 19:46

dont add directories

768827c

remove directory tests

590811f

remove addfoundlibrary from download_asset example

8f29e61

update changelog

7b0d8fb

no need for code asset import in config

ca389e5

remove config important from validation tests

6423e27

dcharkes reviewed Mar 17, 2025

actually add the data asset

56cbc0b

github-actions bot added the package:hooks_runner label Mar 17, 2025

MichealReed added 2 commits March 17, 2025 11:47

fix asset name

3cf0742

fix config imports

7719d15

dcharkes mentioned this pull request Mar 18, 2025

[native_assets_cli] Fix download_asset example #2105

Merged

dcharkes reviewed Mar 18, 2025

MichealReed added 4 commits March 18, 2025 11:19

restore download_asset

a60dff5

move to data_assets

af8218c

restore download_asset example

c21eba0

fix validation test, cleanup new extension

8034a81

MichealReed changed the title ~~[native_assets_cli] BuildOutput extensions addFoundCodeAssets and addDataAssetDirectories~~ [native_assets_cli] BuildOutput extension: addDataAssetDirectories Mar 18, 2025

MichealReed added 4 commits March 18, 2025 13:21

restore base config.dart imports

7424003

addDependency directory

7d6c35d

remove import

2c316ea

restore build_runner_test

bc33b02

dcharkes reviewed Mar 19, 2025

pkgs/native_assets_cli/lib/src/data_assets/config.dart (resolved)

add directory uri

57d6d41

dcharkes approved these changes Mar 20, 2025

dcharkes merged commit 8132054 into dart-lang:main Mar 20, 2025
32 checks passed
