-
Notifications
You must be signed in to change notification settings - Fork 67
[native_assets_cli] BuildOutput extension: addDataAssetDirectories #2097
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @MichealReed!
- If we add a code asset then addDataAssetDirectories adds the folder that asset exists in, will its inclusion as a data asset cause conflict?
Code assets and data assets live in a separate namespace, so that should not lead to issues.
- Is there a better way to derive name.dart for the CodeAsset without mappings?
I'm inclined to not add support for addFoundCodeAssets
. If users need to do a mapping from found files to asset id, they probably want to do it programmatically. Also, they might want to map asset id to file path instead. Or they might want to map targetOS+targetArch to both filename and asset id. So I'm not assuming that code assets will be so regular a helper function would make sense.
Do you have a use case where you want addFoundCodeAssets
for?
- BuildOutputBuilder seemed like a better place for the extension so these are accessible in the hook.
sgtm! 👍
final packageName = input.packageName; | ||
final packageRoot = input.packageRoot; | ||
for (final path in paths) { | ||
final resolvedUri = packageRoot.resolve('$packageName/$path'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we have $packageName/
here? Wouldn't we simply want to resolve $path
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Files created in the tests were showing up under $packageName, is there something else moving the assets to this path?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was kind of expecting this function to do what pkgs/native_assets_builder/test_data/simple_data_asset/hook/build.dart does, and that file being refactored to simply a call to this function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How can test_data/simple_data_asset be tested? It depends on a nonexistent dart:asset and ByteAsset.
Pushed a fix for name that mirrors that example, we still must resolve the relative $packageName/$path to find the file though. Do you want this to be refactored for compatibility with system absolute paths or relative only?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way to test for now is to run dart test
in package:native_assets_builder
.
we still must resolve the relative $packageName/$path to find the file though.
I don't understand, if I change the helper function to final resolvedUri = packageRoot.resolve(path);
, then dart test
in package:native_assets_builder
succeeds.
Do you want this to be refactored for compatibility with system absolute paths or relative only?
I think relative paths in the package is fine for now. That's the most common use case. (For desktop applications that use an LLM that's installed on the system, I can imagine wanting to support absolute paths. So we could add such support in the future. We should be able to distinguish a string containing an absolute path from a relative path.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be good now, the data assets validation tests makeDataBuildInput was not resolving the packageRoot correctly, which caused the failure in the validation test. Now both simple_data_asset and the validation tests pass with resolve(path).
Gladly!
addFoundCodeAssets is not for mapping, it's for recursively finding libraries in the build output. It needs mappings to map to the name ffi expects. I think the changes in the download_asset hook demonstrate this sort of programmatic mapping. The generated file includes extensions so those must be stripped for the hash to validate and be normalized for library finding. await output.addFoundCodeAssets(
input: input,
assetMappings: [
// asset to find : name to add it as
{ targetName : 'native_add.dart'},
],
); If we provide targetName directly (without mapping) as the CodeAsset name argument we see Couldn't resolve native function 'add' in 'package:download_asset/native_add.dart' :
No asset with id 'package:download_asset/native_add.dart' found.
Available native assets: package:download_asset/native_add_windows_x64.dart. Name is still somewhat ambiguous at the moment to me and from what I can tell, maps to the generated dart file. Meaning most variables must be stripped. Unless this reads exactly native_add.dart for the codeasset name, the program will not run and as mentioned the fallback method does not find add in native_add_windows_x64.dart. The mapping is just to map files of varying architecture and platform types or unexpected names to the expected library name from the dart side. In another use case, there are sometimes libraries that expect other dynamic libraries be bundled with the executable and have nested ways to open the library. This allows users to find multiple libraries from a single build to be bundled with the output. There are also cases like with cmake output libraries will be heavily nested, or libraries from child builds will be needed and users need a quick way to find these. |
Ah got it! You want to simply invoke the cmake process and then read the file system of what it output. (Does CMake not have a way for it to tell us what dylibs it output? Such that we would parse that information instead of scanning the file system?) If we want to infer information from the files on disk, I think we should make it more general. It should then also work for (a) downloading dynamic libraries, (b) grabbing dynamic libraries from an external package manager directory, and (c) dynamic libraries checked in to the package git [please don't!]. To make it more general:
If we go that route, I think And we should also not limit it to the output directory, we should allow using an arbitrary directory. (For example allowing users to run an external package manager and then pointing to a directory with dylibs in there.) This is a very different change from the data asset change, so I'd say that should be a separate PR if we go that way. Also, in |
I think it's infeasible for it to do so, one build process can have many libraries. A single build of Dawn contains many projects which produce static or dynamic libraries of their own. Sometimes we built it for the monolithic dynamic lib produced, sometimes we want to grab the statics or other dynamic libs for linking.
It should already cover these cases, just download/build/make appear the libraries to a folder and provide the folder and a list of a names and this helper would find them.
This already does a selective find to get the proper lib based on input.config.code.targetOS.libraryFileName(
searchName,
linkMode,
);
Is the concern here that a monolithic output might be produced with many different os/architectures and this would find based on prefix and extension potentially the wrong one? Even CMake in its maturity does not detect this and opted for a hints and paths approach for more manual control over library finding. TargetOS and library naming convention should protect us in most cases here like seen in the updated download example. It may be over-engineering to try to handle this where advanced toolchains even do not, developers still with this helper would have the option to manually add CodeAssets.
Maybe a bigger discussion about separation? Many of the helpers in toolchain_c are beneficial to other toolchains and it does not seem logic to include toolchain_c in projects that do not need a cbuilder.
Then only one library could be found at a time. I could not find ffi.so and dep.so in one go as they will be added with the same name. Couldn't the user effectively accomplish this via string interpolation in the map or by providing a prebuilt variable to the map?
This is why outputDirectory is nullable and null assigned as input.outputDirectory. A user can provide any directory as is and it will run the recursive find there or as default, search the input's out folder.
Understood, can strip this out if there's no way we can land a version of this here. I think it already covers most cases mentioned and those not, would easily be recognizable and fixable by devs through errors. |
Probably the only way to be certain about an architecture/operating system would be to inspect the binary of the file to identify through ELF, PE, and Mach-O with additional symbol inspection per platform to determine OS differences, primarily between iOS/MacOS/VisionOS and Linux, Android and Fuschia (if even needed, they may be compatible anyways, maybe try to load it?). Getting a comprehensive map of fallback symbols per platform would probably be the most intensive part of doing this, but it all could be done in dart versus relying on external tools. We may get lost in a maze of edge-cases trying to make the detection flawless though. |
I'm thinking users might download/build for all target architectures and OSes in a single directory, like a tarball. That means it would pick up the wrong ones as well. Ditto someone might write a build script that outputs fat binaries, containing more than one architecture. So I don't believe it covers those cases in general. Therefore, I am hesitant to provide something in the API that everyone sees that covers only a subset of the use cases.
Agreed! And the tools abstraction itself should also be reusable for other toolchains than just C toolchains. But that kind of requires a refactoring as well, as there are some issues with how it's currently set up. I don't have the cycles to deal with that unfortunately atm. Maybe such packages should be community owned as well. (The way to make progress on your package now is probably to either do an ugly
Why? Wouldn't you call the callback for every single library you find?
Yes, let's land the data asset extension. 👍
Agreed, therefore I'm leaning towards not adding this for "all dylibs and static libs in a directory". It feels like a lot of engineering effort to get it right for the general case. For your specific case you already know the target OS and target architecture is right, because you passed it in to CMake. |
If libraries have the same name, they would be split into platform specific folders, they cannot have the same name in the same folder. If they are named ambiguously and it grabs the wrong library the first implementation, it will error, and a developer can provide platform specific paths to find the correct output directory. Fat libraries would work as well as the builder supports this. This already works for the download_asset example without conflict. The generalization here is in line with the libraryFileName helper. Rejecting it under that premise is a bit dis-encouraging, but it's your show to run here.
This is very viable though and the external tooling should be replaced with dart native code that does this. I've prototyped something here https://gist.github.com/MichealReed/bc2aa5d059133ab17bf7571b353335fe
In other cases, this would be so too, builds will either be platform isolated with the temporary directory approach the builder currently uses, or they will be manually isolated into different build folders (build/platform/arch). Building everything for all platforms to the same folder is an anti-pattern that breaks other stuff and developers should avoid, not an edge case to cover.
Roger that, will remove this code over the weekend and we can finalize on Monday. Let me know if you have more questions or change your mind 👍 |
PR HealthBreaking changes ✔️
Changelog Entry ✔️
Changes to files need to be accounted for in their respective changelogs. API leaks ✔️The following packages contain symbols visible in the public API, but not exported by the library. Export these symbols or remove them from your publicly visible API.
License Headers ✔️
All source files should start with a license header. Unrelated files missing license headers
|
pkgs/native_assets_cli/example/build/download_asset/hook/build.dart
Outdated
Show resolved
Hide resolved
final packageName = input.packageName; | ||
final packageRoot = input.packageRoot; | ||
for (final path in paths) { | ||
final resolvedUri = packageRoot.resolve('$packageName/$path'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was kind of expecting this function to do what pkgs/native_assets_builder/test_data/simple_data_asset/hook/build.dart does, and that file being refactored to simply a call to this function.
pkgs/native_assets_cli/example/build/download_asset/hook/build.dart
Outdated
Show resolved
Hide resolved
final packageName = input.packageName; | ||
final packageRoot = input.packageRoot; | ||
for (final path in paths) { | ||
final resolvedUri = packageRoot.resolve('$packageName/$path'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way to test for now is to run dart test
in package:native_assets_builder
.
we still must resolve the relative $packageName/$path to find the file though.
I don't understand, if I change the helper function to final resolvedUri = packageRoot.resolve(path);
, then dart test
in package:native_assets_builder
succeeds.
Do you want this to be refactored for compatibility with system absolute paths or relative only?
I think relative paths in the package is fine for now. That's the most common use case. (For desktop applications that use an LLM that's installed on the system, I can imagine wanting to support absolute paths. So we could add such support in the future. We should be able to distinguish a string containing an absolute path from a relative path.)
This PR adds helper extensions to BuildOutputBuilder to assist with adding assets and directories to the package.
addFoundCodeAssets will recursively search the input config's target directory to find any libraries matching the given name. The prefix and extension are determined through use of the targetOS helper libraryFileName. Because the CodeAsset name does not always match the file that the library is contained in, we need to use a mapping where the key is used to find the file, and the name is the name it will become available to dart as, otherwise we see an error like below as the fallback does not seem to work in this case.
Invalid argument(s): Couldn't resolve native function 'add' in 'package:download_asset/native_add.dart' : No asset with id 'package:download_asset/native_add.dart' found. Available native assets: package:download_asset/native_add_windows_x64.dart. Attempted to fallback to process lookup.
example usage:
addDataAssetDirectories takes in a list of paths or files and adds them as DataAsset dependencies, throwing an error if a given path is not found.
example usage:
Tests have been added to data_assets/validation_test.dart and code_assets/validation_test.dart and the download example was updated to use addFoundCodeAssets.
Open Questions:
closes #1346