Hi all,
The Backend-Infra team has landed a change that excludes the latency of failed requests from the percentile calculation (https://phabricator.noc.tvlk.cloud/D89437).
Previously, the ltcy.p95 metrics were calculated from the latency of all requests, so they could be skewed in either direction: failed requests that take a long time inflate ltcy.p95, while requests that fail very quickly reduce it.
Please let us know if you see any anomalies after deploying your application to the test, staging, or production environment.
Note: the changes only affect the RPC and API methods.
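For anyone who wants to see the effect in numbers, here is a minimal sketch of how failed requests can pull p95 in either direction. It is only an illustration: the nearest-rank percentile method, the function name p95, and the sample data are all assumptions, not the actual implementation in D89437.

```python
def p95(latencies_ms):
    """Nearest-rank 95th percentile (an assumed method, for illustration only)."""
    if not latencies_ms:
        return None
    ordered = sorted(latencies_ms)
    # 1-based nearest rank, ceil(0.95 * n), computed with integer arithmetic.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def p95_all_and_ok(requests):
    """Return (p95 over all requests, p95 over successful requests only)."""
    all_latencies = [ms for ms, ok in requests]
    ok_latencies = [ms for ms, ok in requests if ok]
    return p95(all_latencies), p95(ok_latencies)


# Hypothetical samples of (latency in ms, succeeded?).
# Case 1: two slow failures (e.g. timeouts) among 18 normal successes.
slow_failures = [(100 + i, True) for i in range(18)] + [(5000, False)] * 2
# Case 2: many fast failures (e.g. immediate rejections) and one slow success.
fast_failures = [(1, False)] * 19 + [(300, True)]

print(p95_all_and_ok(slow_failures))  # old value is inflated by the failures
print(p95_all_and_ok(fast_failures))  # old value is reduced by the failures
```

In both cases the second number in each tuple is the one closer to what the metrics should now report, since it only reflects requests that succeeded.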
Thank you!