Added export of error metric#3956
Added export of error metric#3956vladvildanov wants to merge 1 commit intofeat/async-observabilityfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR instruments the asyncio Redis client/cluster/connection code paths to emit the resiliency “error count” metric (including network peer attributes and retry-attempt metadata), and adds/updates asyncio tests to validate the emitted metric attributes.
Changes:
- Extend
redis.asyncio.retry.Retry.call_with_retry()to optionally pass a failure counter into the failure callback. - Emit
record_error_count()from asyncio client/cluster/connection error paths with peer + retry metadata. - Add/expand asyncio unit tests validating error metric emission and attributes.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
redis/asyncio/retry.py |
Adds optional with_failure_count + is_retryable support to propagate retry/failure counts. |
redis/asyncio/client.py |
Records error-count metrics on async command execution failures, capturing retry attempt counts. |
redis/asyncio/connection.py |
Records error-count metrics on connect failures; adds connection-count helper and extends disconnect() signature for potential error reporting. |
redis/asyncio/cluster.py |
Records error-count metrics for cluster execution failures and propagates connection context through exceptions. |
tests/test_asyncio/test_observability/test_recorder.py |
Expands recorder tests to validate additional attributes and is_internal behavior. |
tests/test_asyncio/test_client.py |
New tests verifying async client calls result in error metrics being recorded. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| operation_name = "MULTI" | ||
| else: | ||
| execute = self._execute_pipeline | ||
| operation_name = "PIPELINE" |
There was a problem hiding this comment.
operation_name is set in both branches but never used. This will raise a Ruff F841 lint error; remove it or incorporate it into the metrics/error recording logic.
| operation_name = "MULTI" | |
| else: | |
| execute = self._execute_pipeline | |
| operation_name = "PIPELINE" | |
| else: | |
| execute = self._execute_pipeline |
|
|
||
| # Track actual retry attempts for error reporting | ||
| actual_retry_attempts = [0] | ||
| stack_len = len(stack) |
There was a problem hiding this comment.
stack_len is computed but never used, which will trigger Ruff F841. Remove it or use it as the batch_size/context attribute if that's the intent.
| stack_len = len(stack) |
| from ..observability.attributes import AttributeBuilder, DB_CLIENT_CONNECTION_POOL_NAME, DB_CLIENT_CONNECTION_STATE, \ | ||
| ConnectionState |
There was a problem hiding this comment.
This import uses a backslash line-continuation and packs multiple names onto one line, which is inconsistent with the surrounding import style and is likely to be reformatted by automated formatters. Prefer wrapping the import in parentheses and one name per line (or group) to keep it formatter-friendly.
| from ..observability.attributes import AttributeBuilder, DB_CLIENT_CONNECTION_POOL_NAME, DB_CLIENT_CONNECTION_STATE, \ | |
| ConnectionState | |
| from ..observability.attributes import ( | |
| AttributeBuilder, | |
| DB_CLIENT_CONNECTION_POOL_NAME, | |
| DB_CLIENT_CONNECTION_STATE, | |
| ConnectionState, | |
| ) |
| async def disconnect( | ||
| self, | ||
| nowait: bool = False, | ||
| error: Optional[Exception] = None, | ||
| failure_count: Optional[int] = None, | ||
| health_check_failed: bool = False, | ||
| ) -> None: |
There was a problem hiding this comment.
disconnect() now accepts error/failure_count, and the metric emission depends on them, but there are currently no internal call sites passing these arguments (all calls are disconnect() / disconnect(nowait=...)). As a result, this new error-metric path is effectively dead code; either update retry failure callbacks to pass error and failure_count (using with_failure_count=True) or remove these params/logic to avoid misleading behavior.
| import copy | ||
| import inspect | ||
| import re | ||
| import time |
There was a problem hiding this comment.
time is imported but never used in this module, which will trigger Ruff/flake8 unused-import lint failures. Remove the import or use it where intended.
| import time |
| if not len(args) == 0: | ||
| command_name = args[0] | ||
| else: | ||
| command_name = None |
There was a problem hiding this comment.
command_name is assigned but never used. This will raise a Ruff F841 (unused local variable) lint error; either use it (e.g., for observability attributes) or remove the assignment.
| if not len(args) == 0: | |
| command_name = args[0] | |
| else: | |
| command_name = None |
Description of change
Please provide a description of the change here.
Pull Request check-list
Please make sure to review and check all of these items:
NOTE: these things are not required to open a PR and can be done
afterwards / while the PR is open.