Your current environment
The bug is not related to the environment
Model Input Dumps
The bug is not related to the model
🐛 Describe the bug
QUESTION 1:
How are the RequestMetrics in RequestOutput calculated? Please see the screenshot below (highlighted in yellow):

I found here, in L. 696, that last_token_time is set equal to arrival_time. Is this a bug?
Could you also tell me what unit these times are in: seconds or nanoseconds? I believe the timestamp is taken with something like the example below (correct me if I am wrong):
import time
arrival_time = time.perf_counter()
QUESTION 2:
How can I calculate the output tokens/second, TTFT (time to first token), TBT (time between tokens), throughput, and total time?
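To make the question concrete, here is a minimal sketch of the arithmetic I have in mind. It assumes every timestamp is a float in seconds taken from the same clock, which is exactly what I am asking you to confirm. The `Timings` container and its field names are hypothetical and only for illustration; they are not meant to mirror the actual RequestMetrics definition.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Timings:
    """Hypothetical per-request record; timestamps assumed to be seconds."""
    arrival_time: float       # when the request was received
    first_token_time: float   # when the first output token was produced
    finished_time: float      # when the last output token was produced
    num_output_tokens: int


def summarize(t: Timings) -> dict:
    """Per-request latency metrics derived from the three timestamps."""
    total_time = t.finished_time - t.arrival_time
    ttft = t.first_token_time - t.arrival_time
    decode_time = t.finished_time - t.first_token_time
    # Mean gap between consecutive output tokens; undefined for a single token.
    tbt: Optional[float] = (
        decode_time / (t.num_output_tokens - 1)
        if t.num_output_tokens > 1 else None
    )
    tokens_per_second: Optional[float] = (
        t.num_output_tokens / total_time if total_time > 0 else None
    )
    return {
        "total_time": total_time,
        "ttft": ttft,
        "tbt": tbt,
        "tokens_per_second": tokens_per_second,
    }


def throughput(batch: List[Timings]) -> float:
    """Output tokens per second over the wall-clock span of a batch."""
    span = max(t.finished_time for t in batch) - min(t.arrival_time for t in batch)
    return sum(t.num_output_tokens for t in batch) / span
```

If the real timestamps are not in seconds, or come from different clocks, the differences above would need to be rescaled or would not be meaningful at all.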
Before submitting a new issue...