refactor: move engine function outside of the main #15
Conversation
| pub logs: String,
| pub output: serde_json::Value,
Conceptually these aren't statistics to me- I think we're missing a parent ExecutionResult concept that would own the logs, output, and statistics.
Alternatively, we could rename this struct to the more generic ExecutionResult (or similar name) and not bother having a separate struct just for the statistics
I think that decision comes down to whether or not there's value in separating the display portion into separate modules for logs/output and statistics or if it's easier to just have one display implementation for the entire result- I don't have a strong opinion on this 🤷🏻
e.g.
struct RunStatistics { ... }
impl Display for RunStatistics { /* just prints out the runtime/memory info */ }

// separate file
struct Result { stats: RunStatistics, logs: String, output: serde_json::Value }
impl Display for Result { /* print out logs and output, then format!("{:#}", stats) */ }
Good idea, I was hesitant on the name as well. I will implement the first option since I don't think that there will be a need to separate the displaying of the different sections.
ErinRenNumber1 left a comment
This is great work!!! I'm impressed.
src/run_statistics.rs
Outdated
| @@ -4,11 +4,23 @@ use std::{fmt, time::Duration};
| pub struct RunStatistics {
|     pub runtime: Duration,
|     pub threshold: Duration,
I don't understand the point of the duration here? It's hardcoded in the run() function, and serves no purpose other than being compared against in the test (IIUC). Can we just write the test as:
assert!(statistics.runtime <= Duration::from_millis(5));
Since I think it's only the test that needs to know.
Would it be better to pass the threshold to the run function instead of having it hardcoded in it? I use the threshold when the result gets printed, I display it when the runtime is over a given threshold.
I think hardcoding it is reasonable- it's completely rigid from the perspective of the user, and unlikely to change anytime soon.
| .typed::<(), (), _>(&store)?
| .call(&mut store, ());
|
| runtime = start.elapsed();
I'm just wondering if we are measuring the right thing here? 🤔
Just for context, the purpose of this script-runner is so that the devs can measure performance locally, instead of always needing to execute in production. Therefore, we want to measure as closely as possible to what's done in runtime-engine. Disclaimer, I am not an expert in runtime-engine, so these might be dumb questions. I see that in runtime-engine, the timeout is set on the entire module run (here). What you are measuring is more analogous to the main_duration done in the finer measurements here. I know that this is the only part that's relevant to the dev's code, but should we still measure the entire execution, including start_duration?
Another question that might be outside the scope of your PR. @jianghong: do you know if we should emulate the code in the runtime-engine a bit closer? Like in here, I see it's loading the default function, whereas the runtime-engine is looking for a typed function _start. I am not sure if the differences have any performance impact at all. WDYT?
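The start_duration / main_duration distinction being discussed could be separated like this (a std-only sketch; `instantiate_module` and `call_start` are hypothetical stand-ins for the wasmtime instantiation and the `_start` call, not the actual script-runner code):

```rust
use std::time::{Duration, Instant};

// Hypothetical stand-ins for the wasmtime operations: instantiating the
// module (what start_duration covers) and calling _start (main_duration).
fn instantiate_module() {
    std::thread::sleep(Duration::from_millis(2));
}

fn call_start() {
    std::thread::sleep(Duration::from_millis(5));
}

fn main() {
    let t0 = Instant::now();
    instantiate_module();
    let start_duration = t0.elapsed();

    let t1 = Instant::now();
    call_start();
    let main_duration = t1.elapsed();

    // Measuring from t0 to the end covers the entire execution, which is
    // what runtime-engine's module-level timeout applies to.
    let total_duration = start_duration + main_duration;
    println!(
        "start: {:?}, main: {:?}, total: {:?}",
        start_duration, main_duration, total_duration
    );
}
```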
Yeah good catch here, we should be looking for the _start func to run here. You can do something like
let instance = linker.instantiate(&mut store, &module)?;
let result = instance
    .get_typed_func::<(), (), _>(&mut store, "_start")?
    .call(&mut store, ());
There should be no performance impact. This will align more closely with our wasm API spec:
MUST export a `_start` function of type `(func)` (taking 0 arguments and having no return value)
Updated my code snippet above. The main change here is that you'll need to get an instance explicitly to use get_typed_func. This more closely mirrors what we're doing in our runtime-engine.
Thanks for the heads up! @jianghong I had to use `store`, because `store.0` was not a thing. Let me know if you know why.
Oops yeah, I typo'd there. I'll DM you some info about that
(Reply after 1:1 discussion with Erin) Since Andrew said that the result will be compared to a benchmark script and therefore the benchmark is relative, we don't need to have an absolute comparison with what we use in the runtime-engine.
andrewhassan left a comment
Overall this is looking really great!! I only had a few nitpicky comments in addition to Erin's 😄
src/execution_result.rs
Outdated
| pub struct ExecutionResult { | ||
| pub runtime: Duration, | ||
| pub threshold: Duration, | ||
| pub logs: String, | ||
| pub output: serde_json::Value, | ||
| } |
I love this approach! In the future, we plan to integrate this tool into the Shopify CLI, so having a struct like this would be very useful in allowing the caller to specify the output format (e.g. human-readable text, JSON, etc.).
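The multi-format idea mentioned here could be sketched like this (a std-only sketch, not the PR's code: field names follow the PR's struct, but `output` is a plain String instead of serde_json::Value, and `to_json` is a hypothetical hand-rolled serializer standing in for what serde would provide):

```rust
use std::fmt;
use std::time::Duration;

pub struct FunctionRunResult {
    pub runtime: Duration,
    pub logs: String,
    pub output: String, // assumed to already hold a JSON document
}

// Human-readable format for direct CLI use.
impl fmt::Display for FunctionRunResult {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        writeln!(f, "Logs:\n{}", self.logs)?;
        writeln!(f, "Output:\n{}", self.output)?;
        write!(f, "Runtime: {:?}", self.runtime)
    }
}

impl FunctionRunResult {
    // Machine-readable format a caller like the Shopify CLI could consume.
    // {:?} on the logs String produces a quoted, escaped JSON-compatible string.
    pub fn to_json(&self) -> String {
        format!(
            "{{\"runtime_ms\":{},\"logs\":{:?},\"output\":{}}}",
            self.runtime.as_millis(),
            self.logs,
            self.output
        )
    }
}

fn main() {
    let result = FunctionRunResult {
        runtime: Duration::from_millis(3),
        logs: "hello world".to_string(),
        output: "{\"value\":2}".to_string(),
    };
    println!("{}", result);        // human-readable
    println!("{}", result.to_json()); // machine-readable
}
```

The caller picks the representation, so the struct itself stays format-agnostic.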
src/main.rs
Outdated
| let engine = Engine::default();
| let module = Module::from_file(&engine, &opts.script)
|     .map_err(|e| anyhow!("Couldn't load script {:?}: {}", &opts.script, e))?;
| let statistics = run(opts.script, opts.input)?;
Nit: run returns Result<ExecutionResult> so execution_result might be a better variable name here.
But based on my comment, we should call it run_result or function_run_result instead.
alright, I'll correct these :)
src/execution_result.rs
Outdated
| use colored::Colorize;
| use std::{fmt, time::Duration};
|
| pub struct ExecutionResult {
Nitpick: In the past, we were told not to use the term execution externally since it has negative connotations. Instead, we've switched to using run everywhere we expose this concept to partners (for example, on the Partners Dashboard). I think we should rename this struct (and associated things) to RunResult (or FunctionRunResult if you want to be more specific).
TIL, I'll use FunctionRunResult since RunResult suggests the action of running a result, which doesn't really make sense imo
src/function_run_result.rs
Outdated
|
| pub struct FunctionRunResult {
|     pub runtime: Duration,
|     pub threshold: Duration,
I know that this isn't part of this PR, but we don't need to check the threshold in script-runner. We only need to report the amount of time the function took to run.
This is because the absolute threshold on a partner's machine isn't very useful, since its performance profile will be different from our production servers.
In the future, this tool will be integrated into the Shopify CLI which will use it to run a benchmark function, which is close to the resource limits in production, and the partner's function. It will then compare the results and present them to the partner. So all script-runner needs to do is output the time taken to run the function.
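The relative comparison described above could look something like this (a sketch with made-up numbers; nothing here comes from the actual CLI integration):

```rust
use std::time::Duration;

fn main() {
    // Hypothetical measurements: the benchmark function approximates the
    // production resource limits; the partner's function is what we compare.
    let benchmark_runtime = Duration::from_millis(5);
    let partner_runtime = Duration::from_millis(3);

    // A relative comparison stays meaningful even though absolute timings
    // differ between a partner's machine and production servers.
    let ratio = partner_runtime.as_secs_f64() / benchmark_runtime.as_secs_f64();
    if ratio <= 1.0 {
        println!("Function ran within {:.0}% of the benchmark runtime", ratio * 100.0);
    } else {
        println!("Function exceeded the benchmark runtime by {:.0}%", (ratio - 1.0) * 100.0);
    }
}
```

This is why script-runner only needs to report the raw runtime: the thresholding happens in the caller, relative to the benchmark.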
Thanks for the clarification, I removed the threshold logic.
| #[cfg(test)]
| mod tests {
|     use super::*;
|     use std::path::Path;
|
|     #[test]
|     fn test_runtime_under_threshold() {
|         let function_run_result = run(
|             Path::new("tests/benchmarks/hello_world.wasm").to_path_buf(),
|             Path::new("tests/benchmarks/hello_world.json").to_path_buf(),
|         )
|         .unwrap();
|
|         assert!(function_run_result.runtime <= Duration::from_millis(5));
|     }
|
|     #[test]
|     fn test_runtime_over_threshold() {
|         let function_run_result = run(
|             Path::new("tests/benchmarks/sleeps.wasm").to_path_buf(),
|             Path::new("tests/benchmarks/sleeps.json").to_path_buf(),
|         )
|         .unwrap();
|
|         assert!(function_run_result.runtime > Duration::from_millis(5));
|     }
| }
Good idea adding unit tests!
I think it would also be useful to have a test with a Wasm module that reads from the input, logs some stuff, and outputs something. That way we can verify that the returned FunctionRunResult is what we expect.
For example, the input could be something like `{ value: 1 }` and the function is expected to output `value + 1`. We could also verify that it logs something like `hello world`.
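The shape of the suggested test could be sketched like this (std-only; `run_function` is a hypothetical pure-Rust stand-in for the actual `run()` that would execute a Wasm module, and the result struct mirrors the PR's FunctionRunResult with `output` as a String for brevity):

```rust
use std::time::Duration;

// Mirrors the PR's FunctionRunResult shape, simplified for this sketch.
struct FunctionRunResult {
    runtime: Duration,
    logs: String,
    output: String,
}

// Stand-in for run(): "executes" a function that reads { value: 1 },
// logs "hello world", and outputs value + 1.
fn run_function(input_value: i64) -> FunctionRunResult {
    FunctionRunResult {
        runtime: Duration::from_millis(1),
        logs: "hello world".to_string(),
        output: format!("{{\"value\":{}}}", input_value + 1),
    }
}

fn main() {
    // The real test would load a .wasm and .json fixture instead.
    let result = run_function(1);
    assert_eq!(result.logs, "hello world");
    assert_eq!(result.output, "{\"value\":2}");
    assert!(result.runtime > Duration::ZERO);
    println!("ok");
}
```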
What are you trying to accomplish?
Refactor the code by moving the WASI engine outside of the main function, making it easier to test and more readable.
What should reviewers focus on?
The way I used the Result and String types; I'm not fully sure about that.
The impact of these changes
On Shopify
Easier to test. De-bloats the main.
On merchants
No change
On third-party apps
No change.
Tophat 🎩
You can test your own script by running `cargo run --release -- <path/to/input.json> -s <path/to/script.wasm>`. Note that `--release` is mandatory to benchmark since the runtime is much shorter when the binary is optimized. Additionally, I wrote automated tests (that are executed in an opt-3 environment), so `cargo test` will show the results of these tests. The tests use the inputs and the scripts located at `tests/benchmarks/`: `hello_world.wasm` is a simple and short script available here, and `sleeps.wasm` was compiled by adding a sleep of 42ms to `hello_world.wasm`.
Before you deploy