refactor: move engine function outside of the main #15
Conversation
| pub logs: String,
| pub output: serde_json::Value,
Conceptually these aren't statistics to me- I think we're missing a parent ExecutionResult concept that would own the logs, output, and statistics.
Alternatively, we could rename this struct to the more generic ExecutionResult (or similar name) and not bother having a separate struct just for the statistics
I think that decision comes down to whether or not there's value in separating the display portion into separate modules for logs/output and statistics or if it's easier to just have one display implementation for the entire result- I don't have a strong opinion on this 🤷🏻
e.g.
struct RunStatistics { ... }
impl Display for RunStatistics { /* just prints out the runtime/memory info */ }

// separate file
struct Result { stats: RunStatistics, logs: String, output: serde_json::Value }
impl Display for Result { /* print out logs and output, then format!("{:#}", stats) */ }
Good idea, I was hesitant on the name as well. I will implement the first option since I don't think that there will be a need to separate the displaying of the different sections.
ErinRenNumber1 left a comment
This is great work!!! I'm impressed.
src/run_statistics.rs
Outdated
| @@ -4,11 +4,23 @@ use std::{fmt, time::Duration};
| pub struct RunStatistics {
|     pub runtime: Duration,
|     pub threshold: Duration,
I don't understand the point of the duration here? It's hardcoded in the run() function, and serves no purpose other than being compared against in the test (IIUC). Can we just write the test as:
assert!(statistics.runtime <= Duration::from_millis(5));
Since I think it's only the test that needs to know.
Would it be better to pass the threshold to the run function instead of having it hardcoded in it? I use the threshold when the result gets printed, I display it when the runtime is over a given threshold.
I think hardcoding it is reasonable- it's completely rigid from the perspective of the user, and unlikely to change anytime soon.
| .typed::<(), (), _>(&store)?
| .call(&mut store, ());
|
| runtime = start.elapsed();
I'm just wondering if we are measuring the right thing here? 🤔
Just for context, the purpose of this script-runner is so that the devs can measure performance locally, instead of always needing to execute in production. Therefore, we want to measure as closely as possible to what's done in runtime-engine. Disclaimer, I am not an expert in runtime-engine, so these might be dumb questions. I see that in runtime-engine, the timeout is set on the entire module run (here). What you are measuring is more analogous to the main_duration done in the finer measurements here. I know that this is the only part that's relevant to the dev's code, but should we still measure the entire execution, including start_duration?
Another question that might be outside the scope of your PR. @jianghong: do you know if we should emulate the code in the runtime-engine a bit closer? Like in here, I see it's loading the default function, whereas the runtime-engine is looking for a typed function _start. I am not sure if the differences have any performance impact at all. WDYT?
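The start_duration / main_duration distinction being discussed could be separated like this (a std-only sketch; `instantiate_module` and `call_start` are hypothetical stand-ins for the wasmtime instantiation and the `_start` call, not the actual script-runner code):

```rust
use std::time::{Duration, Instant};

// Hypothetical stand-ins for the wasmtime operations: instantiating the
// module (what start_duration covers) and calling _start (main_duration).
fn instantiate_module() {
    std::thread::sleep(Duration::from_millis(2));
}

fn call_start() {
    std::thread::sleep(Duration::from_millis(5));
}

fn main() {
    let t0 = Instant::now();
    instantiate_module();
    let start_duration = t0.elapsed();

    let t1 = Instant::now();
    call_start();
    let main_duration = t1.elapsed();

    // Measuring from t0 to the end covers the entire execution, which is
    // what runtime-engine's module-level timeout applies to.
    let total_duration = start_duration + main_duration;
    println!(
        "start: {:?}, main: {:?}, total: {:?}",
        start_duration, main_duration, total_duration
    );
}
```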
Yeah good catch here, we should be looking for the _start func to run here. You can do something like
let instance = linker.instantiate(&mut store, &module)?;
let result = instance
    .get_typed_func::<(), (), _>(&mut store, "_start")?
    .call(&mut store, ());
There should be no performance impact. This will align more closely with our wasm API spec:
MUST export a `_start` function of type `(func)` (taking 0 arguments and having no return value)
Updated my code snippet above. The main change here is that you'll need to get an instance explicitly to use get_typed_func. This more closely mirrors what we're doing in our runtime-engine.
Thanks for the heads up! @jianghong I had to use `store`, because `store.0` was not a thing. Let me know if you know why.
Oops yeah, I typo'd there. I'll DM you some info about that
(Reply after 1:1 discussion with Erin) Since Andrew said that the result will be compared to a benchmark script and therefore the benchmark is relative, we don't need to have an absolute comparison with what we use in the runtime-engine.
andrewhassan left a comment
Overall this is looking really great!! I only had a few nitpicky comments in addition to Erin's 😄
src/execution_result.rs
Outdated
| pub struct ExecutionResult { | ||
| pub runtime: Duration, | ||
| pub threshold: Duration, | ||
| pub logs: String, | ||
| pub output: serde_json::Value, | ||
| } |
I love this approach! In the future, we plan to integrate this tool into the Shopify CLI, so having a struct like this would be very useful in allowing the caller to specify the output format (e.g. human-readable text, JSON, etc.).
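The multi-format idea mentioned here could be sketched like this (a std-only sketch, not the PR's code: field names follow the PR's struct, but `output` is a plain String instead of serde_json::Value, and `to_json` is a hypothetical hand-rolled serializer standing in for what serde would provide):

```rust
use std::fmt;
use std::time::Duration;

pub struct FunctionRunResult {
    pub runtime: Duration,
    pub logs: String,
    pub output: String, // assumed to already hold a JSON document
}

// Human-readable format for direct CLI use.
impl fmt::Display for FunctionRunResult {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        writeln!(f, "Logs:\n{}", self.logs)?;
        writeln!(f, "Output:\n{}", self.output)?;
        write!(f, "Runtime: {:?}", self.runtime)
    }
}

impl FunctionRunResult {
    // Machine-readable format a caller like the Shopify CLI could consume.
    // {:?} on the logs String produces a quoted, escaped JSON-compatible string.
    pub fn to_json(&self) -> String {
        format!(
            "{{\"runtime_ms\":{},\"logs\":{:?},\"output\":{}}}",
            self.runtime.as_millis(),
            self.logs,
            self.output
        )
    }
}

fn main() {
    let result = FunctionRunResult {
        runtime: Duration::from_millis(3),
        logs: "hello world".to_string(),
        output: "{\"value\":2}".to_string(),
    };
    println!("{}", result);        // human-readable
    println!("{}", result.to_json()); // machine-readable
}
```

The caller picks the representation, so the struct itself stays format-agnostic.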
src/main.rs
Outdated
| let engine = Engine::default();
| let module = Module::from_file(&engine, &opts.script)
|     .map_err(|e| anyhow!("Couldn't load script {:?}: {}", &opts.script, e))?;
| let statistics = run(opts.script, opts.input)?;
Nit: run returns Result<ExecutionResult> so execution_result might be a better variable name here.
But based on my comment, we should call it run_result or function_run_result instead.
alright, I'll correct these :)
src/execution_result.rs
Outdated
| use colored::Colorize;
| use std::{fmt, time::Duration};
|
| pub struct ExecutionResult {
Nitpick: In the past, we were told not to use the term execution externally since it has negative connotations. Instead, we've switched to using run everywhere we expose this concept to partners (for example, on the Partners Dashboard). I think we should rename this struct (and associated things) to RunResult (or FunctionRunResult if you want to be more specific).
TIL, I'll use FunctionRunResult since RunResult suggests the action of running a result, which doesn't really make sense imo
src/function_run_result.rs
Outdated
|
| pub struct FunctionRunResult {
|     pub runtime: Duration,
|     pub threshold: Duration,
I know that this isn't part of this PR, but we don't need to check the threshold in script-runner. We only need to report the amount of time the function took to run.
This is because the absolute threshold on a partner's machine isn't very useful, since its performance profile will be different from our production servers.
In the future, this tool will be integrated into the Shopify CLI which will use it to run a benchmark function, which is close to the resource limits in production, and the partner's function. It will then compare the results and present them to the partner. So all script-runner needs to do is output the time taken to run the function.
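The relative comparison described above could look something like this (a sketch with made-up numbers; nothing here comes from the actual CLI integration):

```rust
use std::time::Duration;

fn main() {
    // Hypothetical measurements: the benchmark function approximates the
    // production resource limits; the partner's function is what we compare.
    let benchmark_runtime = Duration::from_millis(5);
    let partner_runtime = Duration::from_millis(3);

    // A relative comparison stays meaningful even though absolute timings
    // differ between a partner's machine and production servers.
    let ratio = partner_runtime.as_secs_f64() / benchmark_runtime.as_secs_f64();
    if ratio <= 1.0 {
        println!("Function ran within {:.0}% of the benchmark runtime", ratio * 100.0);
    } else {
        println!("Function exceeded the benchmark runtime by {:.0}%", (ratio - 1.0) * 100.0);
    }
}
```

This is why script-runner only needs to report the raw runtime: the thresholding happens in the caller, relative to the benchmark.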
Thanks for the clarification, I removed the threshold logic.
| #[cfg(test)]
| mod tests {
|     use super::*;
|     use std::path::Path;
|
|     #[test]
|     fn test_runtime_under_threshold() {
|         let function_run_result = run(
|             Path::new("tests/benchmarks/hello_world.wasm").to_path_buf(),
|             Path::new("tests/benchmarks/hello_world.json").to_path_buf(),
|         )
|         .unwrap();
|
|         assert!(function_run_result.runtime <= Duration::from_millis(5));
|     }
|
|     #[test]
|     fn test_runtime_over_threshold() {
|         let function_run_result = run(
|             Path::new("tests/benchmarks/sleeps.wasm").to_path_buf(),
|             Path::new("tests/benchmarks/sleeps.json").to_path_buf(),
|         )
|         .unwrap();
|
|         assert!(function_run_result.runtime > Duration::from_millis(5));
|     }
| }
Good idea adding unit tests!
I think it would also be useful to have a test with a Wasm module that reads from the input, logs some stuff, and outputs something. That way we can verify that the returned FunctionRunResult is what we expect.
For example, the input could be something like `{ value: 1 }` and the function is expected to output `value + 1`. We could also verify that it logs something like `hello world`.
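The shape of the suggested test could be sketched like this (std-only; `run_function` is a hypothetical pure-Rust stand-in for the actual `run()` that would execute a Wasm module, and the result struct mirrors the PR's FunctionRunResult with `output` as a String for brevity):

```rust
use std::time::Duration;

// Mirrors the PR's FunctionRunResult shape, simplified for this sketch.
struct FunctionRunResult {
    runtime: Duration,
    logs: String,
    output: String,
}

// Stand-in for run(): "executes" a function that reads { value: 1 },
// logs "hello world", and outputs value + 1.
fn run_function(input_value: i64) -> FunctionRunResult {
    FunctionRunResult {
        runtime: Duration::from_millis(1),
        logs: "hello world".to_string(),
        output: format!("{{\"value\":{}}}", input_value + 1),
    }
}

fn main() {
    // The real test would load a .wasm and .json fixture instead.
    let result = run_function(1);
    assert_eq!(result.logs, "hello world");
    assert_eq!(result.output, "{\"value\":2}");
    assert!(result.runtime > Duration::ZERO);
    println!("ok");
}
```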
What are you trying to accomplish?
Refactor the code by moving the WASI engine outside of the main function, making it easier to test and more readable.
What should reviewers focus on?
The way I used the Result and String types; I'm not fully sure about that.
The impact of these changes
On Shopify
Easier to test. De-bloats the main.
On merchants
No change
On third-party apps
No change.
Tophat 🎩
You can test your own script by running `cargo run --release -- <path/to/input.json> -s <path/to/script.wasm>`. Note that `--release` is mandatory to benchmark since the runtime is much shorter when the binary is optimized. Additionally, I wrote automated tests (that are executed in an opt-3 environment), so `cargo test` will show the results of these tests. The tests use the inputs and the scripts located at `tests/benchmarks/`: `hello_world.wasm` is a simple and short script available here, and `sleeps.wasm` was compiled by adding a sleep of 42ms to `hello_world.wasm`.
Before you deploy