Refactor tokenizer test and add to cmake #8450

Merged: 1 commit merged into main from lfq.cmake-tokenizer-test on Feb 15, 2025

Conversation

@lucylq (Contributor) commented Feb 13, 2025

Summary

Refactor tokenizer test to use env instead of resource path.

Add to cmake tests.
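
As a rough illustration of the idea behind the refactor, a test can find its fixture files through an environment variable supplied by the build system instead of a path baked into the test itself. The sketch below is written in shell for brevity. The variable name `RESOURCES_PATH`, the helper `resolve_resource`, and the file name `tokenizer.model` are assumptions made for illustration and are not taken from this PR.

```shell
# Hypothetical helper: resolve a test fixture from an environment variable.
# RESOURCES_PATH is an assumed variable name, not confirmed by this PR.
resolve_resource() {
  local name="$1"
  if [ -z "${RESOURCES_PATH:-}" ]; then
    echo "error: RESOURCES_PATH is not set" >&2
    return 1
  fi
  # Strip one trailing slash so the joined path has a single separator.
  printf '%s/%s\n' "${RESOURCES_PATH%/}" "$name"
}

# Usage: the caller (for example, a test runner) supplies the directory.
# RESOURCES_PATH=/path/to/resources resolve_resource tokenizer.model
```

Failing loudly when the variable is unset keeps a misconfigured test run from silently looking in the wrong directory.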

Test plan

build et

./install_executorch.sh

build test

CMAKE_PREFIX_PATH="$(python3 -c 'import torch as _; print(_.__path__[0])')"
  cmake . \
    -DCMAKE_INSTALL_PREFIX=cmake-out \
    -DCMAKE_PREFIX_PATH="${CMAKE_PREFIX_PATH}" \
    -DEXECUTORCH_USE_CPP_CODE_COVERAGE=ON \
    -DEXECUTORCH_BUILD_KERNELS_OPTIMIZED=ON \
    -DEXECUTORCH_BUILD_KERNELS_QUANTIZED=ON \
    -DEXECUTORCH_BUILD_EXTENSION_DATA_LOADER=ON \
    -DEXECUTORCH_BUILD_EXTENSION_FLAT_TENSOR=ON \
    -DEXECUTORCH_BUILD_EXTENSION_LLM=ON \
    -DEXECUTORCH_BUILD_EXTENSION_MODULE=ON \
    -DEXECUTORCH_BUILD_EXTENSION_RUNNER_UTIL=ON \
    -DEXECUTORCH_BUILD_EXTENSION_TENSOR=ON \
    -DEXECUTORCH_BUILD_DEVTOOLS=ON \
    -DEXECUTORCH_BUILD_XNNPACK=ON \
    -DEXECUTORCH_BUILD_TESTS=ON \
    -Bcmake-out
  cmake --build cmake-out -j9 --target install
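
The first line of the build step asks Python for the directory of the installed `torch` package and passes it to CMake as `CMAKE_PREFIX_PATH`. If `torch` cannot be imported, that variable ends up empty and the configure step may fail later with a less obvious error. A minimal sketch of a guard, assuming a hypothetical helper named `torch_prefix_path` (not part of this PR or of ExecuTorch):

```shell
# Hypothetical helper: print the torch package directory, or fail with a message.
# The optional first argument is the Python interpreter to use (default: python3).
torch_prefix_path() {
  local py="${1:-python3}"
  local path
  if ! path="$("$py" -c 'import torch as _; print(_.__path__[0])' 2>/dev/null)"; then
    echo "error: could not import torch with $py" >&2
    return 1
  fi
  printf '%s\n' "$path"
}

# Usage: abort before configuring if torch cannot be found.
# CMAKE_PREFIX_PATH="$(torch_prefix_path)" || exit 1
```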

test

cd cmake-out
ctest -R tokenizer

Test project /data/users/lfq/executorch/cmake-out
    Start 54: extension_llm_tokenizer_test
1/1 Test #54: extension_llm_tokenizer_test .....   Passed    3.66 sec

100% tests passed, 0 tests failed out of 1

Total Test time (real) =   3.67 sec
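
In the test step, `ctest -R tokenizer` runs only the tests whose names match the regular expression `tokenizer`. The sketch below wraps that step in a small function that checks for the build directory first and prints test output only on failure. The function name `run_tokenizer_tests` is hypothetical; `-R` and `--output-on-failure` are standard `ctest` options.

```shell
# Hypothetical wrapper around the ctest step; build_dir defaults to cmake-out.
run_tokenizer_tests() {
  local build_dir="${1:-cmake-out}"
  if [ ! -d "$build_dir" ]; then
    echo "error: build directory '$build_dir' not found; run the build step first" >&2
    return 1
  fi
  # -R filters tests by regex; --output-on-failure prints logs only for failing tests.
  # The subshell keeps the caller's working directory unchanged.
  (cd "$build_dir" && ctest -R tokenizer --output-on-failure)
}

# Usage, from the repository root after building:
# run_tokenizer_tests cmake-out
```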

pytorch-bot bot commented Feb 13, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/8450

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit b1a9127 with merge base 8148603:

BROKEN TRUNK - The following job failed, but the failure was also present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the `CLA Signed` label Feb 13, 2025
@lucylq force-pushed the lfq.cmake-tokenizer-test branch 6 times, most recently from 41b88a2 to 15d4cd0, February 14, 2025 05:14
@lucylq changed the title from "tokenizer" to "Refactor tokenizer test and add to cmake" Feb 14, 2025
@lucylq force-pushed the lfq.cmake-tokenizer-test branch from 15d4cd0 to 71976c7, February 14, 2025 05:17
@lucylq marked this pull request as ready for review February 14, 2025 05:18
@facebook-github-bot (Contributor)

@lucylq has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@lucylq added the `release notes: build` label Feb 14, 2025
facebook-github-bot pushed a commit that referenced this pull request Feb 14, 2025
Summary:
Refactor tokenizer test to use env instead of resource path.

Add to cmake tests.


Test Plan:
## Internal
```
buck2 test  fbsource//xplat/executorch/extension/llm/tokenizer/test:test_tiktoken
buck2 test  fbsource//xplat/executorch/extension/llm/tokenizer/test:test_bpe_tokenizer
buck2 test  fbcode//executorch/extension/llm/tokenizer/test:test_tiktoken
buck2 test  fbcode//executorch/extension/llm/tokenizer/test:test_bpe_tokenizer
```

## OSS
build et
```
./install_executorch.sh
```
build test
```
CMAKE_PREFIX_PATH="$(python3 -c 'import torch as _; print(_.__path__[0])')"
  cmake . \
    -DCMAKE_INSTALL_PREFIX=cmake-out \
    -DCMAKE_PREFIX_PATH="${CMAKE_PREFIX_PATH}" \
    -DEXECUTORCH_USE_CPP_CODE_COVERAGE=ON \
    -DEXECUTORCH_BUILD_KERNELS_OPTIMIZED=ON \
    -DEXECUTORCH_BUILD_KERNELS_QUANTIZED=ON \
    -DEXECUTORCH_BUILD_EXTENSION_DATA_LOADER=ON \
    -DEXECUTORCH_BUILD_EXTENSION_FLAT_TENSOR=ON \
    -DEXECUTORCH_BUILD_EXTENSION_LLM=ON \
    -DEXECUTORCH_BUILD_EXTENSION_MODULE=ON \
    -DEXECUTORCH_BUILD_EXTENSION_RUNNER_UTIL=ON \
    -DEXECUTORCH_BUILD_EXTENSION_TENSOR=ON \
    -DEXECUTORCH_BUILD_DEVTOOLS=ON \
    -DEXECUTORCH_BUILD_XNNPACK=ON \
    -DEXECUTORCH_BUILD_TESTS=ON \
    -Bcmake-out
  cmake --build cmake-out -j9 --target install
```

test
```
cd cmake-out
ctest -R tokenizer

Test project /data/users/lfq/executorch/cmake-out
    Start 54: extension_llm_tokenizer_test
1/1 Test #54: extension_llm_tokenizer_test .....   Passed    3.66 sec

100% tests passed, 0 tests failed out of 1

Total Test time (real) =--sanitized--
```

Differential Revision: D69642007

Pulled By: lucylq

@facebook-github-bot (Contributor)

This pull request was exported from Phabricator. Differential Revision: D69642007

facebook-github-bot pushed a commit that referenced this pull request Feb 14, 2025
Summary:
Refactor tokenizer test to use env instead of resource path.

Add to cmake tests.


Test Plan:
## Internal
```
buck2 test  fbsource//xplat/executorch/extension/llm/tokenizer/test:test_tiktoken
buck2 test  fbsource//xplat/executorch/extension/llm/tokenizer/test:test_bpe_tokenizer
buck2 test  fbcode//executorch/extension/llm/tokenizer/test:test_tiktoken
buck2 test  fbcode//executorch/extension/llm/tokenizer/test:test_bpe_tokenizer
```

## OSS
build et
```
./install_executorch.sh
```
build test
```
CMAKE_PREFIX_PATH="$(python3 -c 'import torch as _; print(_.__path__[0])')"
  cmake . \
    -DCMAKE_INSTALL_PREFIX=cmake-out \
    -DCMAKE_PREFIX_PATH="${CMAKE_PREFIX_PATH}" \
    -DEXECUTORCH_USE_CPP_CODE_COVERAGE=ON \
    -DEXECUTORCH_BUILD_KERNELS_OPTIMIZED=ON \
    -DEXECUTORCH_BUILD_KERNELS_QUANTIZED=ON \
    -DEXECUTORCH_BUILD_EXTENSION_DATA_LOADER=ON \
    -DEXECUTORCH_BUILD_EXTENSION_FLAT_TENSOR=ON \
    -DEXECUTORCH_BUILD_EXTENSION_LLM=ON \
    -DEXECUTORCH_BUILD_EXTENSION_MODULE=ON \
    -DEXECUTORCH_BUILD_EXTENSION_RUNNER_UTIL=ON \
    -DEXECUTORCH_BUILD_EXTENSION_TENSOR=ON \
    -DEXECUTORCH_BUILD_DEVTOOLS=ON \
    -DEXECUTORCH_BUILD_XNNPACK=OFF \
    -DEXECUTORCH_BUILD_TESTS=ON \
    -Bcmake-out
  cmake --build cmake-out -j9 --target install
```

test
```
cd cmake-out
ctest -R tokenizer

Test project /data/users/lfq/executorch/cmake-out
    Start 54: extension_llm_tokenizer_test
1/1 Test #54: extension_llm_tokenizer_test .....   Passed    3.66 sec

100% tests passed, 0 tests failed out of 1

Total Test time (real) =--sanitized--
```

Differential Revision: D69642007

Pulled By: lucylq

@facebook-github-bot merged commit adf3956 into main Feb 15, 2025
45 of 48 checks passed
@facebook-github-bot deleted the lfq.cmake-tokenizer-test branch February 15, 2025 01:26
Labels
CLA Signed (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
fb-exported
release notes: build (Changes related to build, including dependency upgrades, build flags, optimizations, etc.)
3 participants