-
Notifications
You must be signed in to change notification settings - Fork 14
feat: Add StatusMessageWatcher
#407
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from all commits
Commits
Show all changes
29 commits
Select commit
Hold shift + click to select a range
a71ae41
TODO: Figure out hwo to mock response with steram
Pijukatel 69ff84c
WIP
Pijukatel 862cacc
Polish spliting of messages and setting the log level
Pijukatel 753427a
Draft with async implementation and example tests
Pijukatel cc0d944
Add `raw=True`
Pijukatel cbcabd3
Add chunck processing
Pijukatel 81577e8
Merge remote-tracking branch 'origin/master' into redirected-actor-logs
Pijukatel b9bc44d
Add sync version of the logging.
Pijukatel 9720327
Finalize, update comments
Pijukatel 85ead2f
Add `from_start` argument for streaming from stand-by actors
Pijukatel 4ad39fa
Skip first logs based on datetime of the marker
Pijukatel 74595f9
Self review.
Pijukatel cba571f
Handle bytestream edgecase of chunk containing only half of the multi…
Pijukatel 02a1eb2
Review comments
Pijukatel 2674cf2
Remove unnecessary `actor_name` argument
Pijukatel 2a6f2ec
Update split pattern to deal with multiple times redirected log
Pijukatel 1263450
Review comment
Pijukatel b1338f1
Regenerate `uv.lock` with new version of `uv`
Pijukatel 669a749
Test data time alignment.
Pijukatel 737cde9
Add status redirector
Pijukatel 2914e50
TODO: Finalize tests
Pijukatel 8fbbffa
Finalize tests.
Pijukatel 8e70e59
Merge remote-tracking branch 'origin/master' into redirect-status-mes…
Pijukatel a3a629e
Update syntax to avoid https://github.com/PyCQA/redbaron/issues/212
Pijukatel 18f4f51
Update client names in tests to match their type
Pijukatel 268e568
Review comments
Pijukatel 335b8c3
Properly set _force_propagate
Pijukatel 1e5e976
Use whitespace in default redirect logger name instead of `-`
Pijukatel 350fc67
Review comments
Pijukatel File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4,9 +4,10 @@ | |
import logging | ||
import re | ||
import threading | ||
import time | ||
from asyncio import Task | ||
from contextlib import asynccontextmanager, contextmanager | ||
from datetime import datetime, timezone | ||
from datetime import datetime, timedelta, timezone | ||
from threading import Thread | ||
from typing import TYPE_CHECKING, Any, cast | ||
|
||
|
@@ -23,6 +24,8 @@ | |
import httpx | ||
from typing_extensions import Self | ||
|
||
from apify_client.clients import RunClient, RunClientAsync | ||
|
||
|
||
class LogClient(ResourceClient): | ||
"""Sub-client for manipulating logs.""" | ||
|
@@ -228,9 +231,9 @@ def __init__(self, to_logger: logging.Logger, *, from_start: bool = True) -> Non | |
logs for long-running actors in stand-by. | ||
|
||
""" | ||
self._to_logger = to_logger | ||
if self._force_propagate: | ||
to_logger.propagate = True | ||
self._to_logger = to_logger | ||
self._stream_buffer = list[bytes]() | ||
self._split_marker = re.compile(rb'(?:\n|^)(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z)') | ||
self._relevancy_time_limit: datetime | None = None if from_start else datetime.now(tz=timezone.utc) | ||
|
@@ -350,13 +353,16 @@ def start(self) -> Task: | |
self._streaming_task = asyncio.create_task(self._stream_log()) | ||
return self._streaming_task | ||
|
||
def stop(self) -> None: | ||
async def stop(self) -> None: | ||
"""Stop the streaming task.""" | ||
if not self._streaming_task: | ||
raise RuntimeError('Streaming task is not active') | ||
|
||
self._streaming_task.cancel() | ||
self._streaming_task = None | ||
try: | ||
await self._streaming_task | ||
except asyncio.CancelledError: | ||
self._streaming_task = None | ||
|
||
async def __aenter__(self) -> Self: | ||
"""Start the streaming task within the context. Exiting the context will cancel the streaming task.""" | ||
|
@@ -367,7 +373,7 @@ async def __aexit__( | |
self, exc_type: type[BaseException] | None, exc_val: BaseException | None, exc_tb: TracebackType | None | ||
) -> None: | ||
"""Cancel the streaming task.""" | ||
self.stop() | ||
await self.stop() | ||
|
||
async def _stream_log(self) -> None: | ||
async with self._log_client.stream(raw=True) as log_stream: | ||
|
@@ -378,3 +384,163 @@ async def _stream_log(self) -> None: | |
|
||
# If the stream is finished, then the last part will be also processed. | ||
self._log_buffer_content(include_last_part=True) | ||
|
||
|
||
class StatusMessageWatcher: | ||
"""Utility class for logging status messages from another Actor run. | ||
|
||
Status message is logged at fixed time intervals, and there is no guarantee that all messages will be logged, | ||
especially in cases of frequent status message changes. | ||
""" | ||
|
||
_force_propagate = False | ||
# This is final sleep time to try to get the last status and status message of finished Actor run. | ||
# The status and status message can get set on the Actor run with a delay. Sleep time does not guarantee that the | ||
# final message will be captured, but increases the chances of that. | ||
_final_sleep_time_s = 6 | ||
|
||
def __init__(self, *, to_logger: logging.Logger, check_period: timedelta = timedelta(seconds=5)) -> None: | ||
"""Initialize `StatusMessageWatcher`. | ||
|
||
Args: | ||
to_logger: The logger to which the status message will be redirected. | ||
check_period: The period with which the status message will be polled. | ||
""" | ||
if self._force_propagate: | ||
to_logger.propagate = True | ||
self._to_logger = to_logger | ||
self._check_period = check_period.total_seconds() | ||
self._last_status_message = '' | ||
|
||
def _log_run_data(self, run_data: dict[str, Any] | None) -> bool: | ||
"""Get relevant run data, log them if changed and return `True` if more data is expected. | ||
|
||
Args: | ||
run_data: The dictionary that contains the run data. | ||
|
||
Returns: | ||
`True` if more data is expected, `False` otherwise. | ||
""" | ||
if run_data is not None: | ||
status = run_data.get('status', 'Unknown status') | ||
status_message = run_data.get('statusMessage', '') | ||
new_status_message = f'Status: {status}, Message: {status_message}' | ||
|
||
if new_status_message != self._last_status_message: | ||
self._last_status_message = new_status_message | ||
self._to_logger.info(new_status_message) | ||
|
||
return not (run_data.get('isStatusMessageTerminal', False)) | ||
return True | ||
|
||
|
||
class StatusMessageWatcherAsync(StatusMessageWatcher): | ||
"""Async variant of `StatusMessageWatcher` that is logging in task.""" | ||
|
||
def __init__( | ||
self, *, run_client: RunClientAsync, to_logger: logging.Logger, check_period: timedelta = timedelta(seconds=1) | ||
) -> None: | ||
"""Initialize `StatusMessageWatcherAsync`. | ||
|
||
Args: | ||
run_client: The client for run that will be used to get a status and message. | ||
to_logger: The logger to which the status message will be redirected. | ||
check_period: The period with which the status message will be polled. | ||
""" | ||
super().__init__(to_logger=to_logger, check_period=check_period) | ||
self._run_client = run_client | ||
self._logging_task: Task | None = None | ||
|
||
def start(self) -> Task: | ||
"""Start the logging task. The caller has to handle any cleanup by manually calling the `stop` method.""" | ||
if self._logging_task: | ||
raise RuntimeError('Logging task already active') | ||
self._logging_task = asyncio.create_task(self._log_changed_status_message()) | ||
return self._logging_task | ||
|
||
async def stop(self) -> None: | ||
"""Stop the logging task.""" | ||
if not self._logging_task: | ||
raise RuntimeError('Logging task is not active') | ||
|
||
self._logging_task.cancel() | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm afraid there might be GC-related warnings if you don't await the task (docs) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok, added |
||
try: | ||
await self._logging_task | ||
except asyncio.CancelledError: | ||
self._logging_task = None | ||
|
||
async def __aenter__(self) -> Self: | ||
"""Start the logging task within the context. Exiting the context will cancel the logging task.""" | ||
self.start() | ||
return self | ||
|
||
async def __aexit__( | ||
self, exc_type: type[BaseException] | None, exc_val: BaseException | None, exc_tb: TracebackType | None | ||
) -> None: | ||
"""Cancel the logging task.""" | ||
await asyncio.sleep(self._final_sleep_time_s) | ||
await self.stop() | ||
|
||
async def _log_changed_status_message(self) -> None: | ||
while True: | ||
run_data = await self._run_client.get() | ||
if not self._log_run_data(run_data): | ||
break | ||
await asyncio.sleep(self._check_period) | ||
|
||
|
||
class StatusMessageWatcherSync(StatusMessageWatcher): | ||
"""Sync variant of `StatusMessageWatcher` that is logging in thread.""" | ||
|
||
def __init__( | ||
self, *, run_client: RunClient, to_logger: logging.Logger, check_period: timedelta = timedelta(seconds=1) | ||
) -> None: | ||
"""Initialize `StatusMessageWatcherSync`. | ||
|
||
Args: | ||
run_client: The client for run that will be used to get a status and message. | ||
to_logger: The logger to which the status message will be redirected. | ||
check_period: The period with which the status message will be polled. | ||
""" | ||
super().__init__(to_logger=to_logger, check_period=check_period) | ||
self._run_client = run_client | ||
self._logging_thread: Thread | None = None | ||
self._stop_logging = False | ||
|
||
def start(self) -> Thread: | ||
"""Start the logging thread. The caller has to handle any cleanup by manually calling the `stop` method.""" | ||
if self._logging_thread: | ||
raise RuntimeError('Logging thread already active') | ||
self._stop_logging = False | ||
self._logging_thread = threading.Thread(target=self._log_changed_status_message) | ||
self._logging_thread.start() | ||
return self._logging_thread | ||
|
||
def stop(self) -> None: | ||
"""Signal the _logging_thread thread to stop logging and wait for it to finish.""" | ||
if not self._logging_thread: | ||
raise RuntimeError('Logging thread is not active') | ||
time.sleep(self._final_sleep_time_s) | ||
self._stop_logging = True | ||
self._logging_thread.join() | ||
self._logging_thread = None | ||
self._stop_logging = False | ||
|
||
def __enter__(self) -> Self: | ||
"""Start the logging task within the context. Exiting the context will cancel the logging task.""" | ||
self.start() | ||
return self | ||
|
||
def __exit__( | ||
self, exc_type: type[BaseException] | None, exc_val: BaseException | None, exc_tb: TracebackType | None | ||
) -> None: | ||
"""Cancel the logging task.""" | ||
self.stop() | ||
|
||
def _log_changed_status_message(self) -> None: | ||
while True: | ||
if not self._log_run_data(self._run_client.get()): | ||
break | ||
if self._stop_logging: | ||
break | ||
time.sleep(self._check_period) |
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should these new classes be exposed publicly? They seem like implementation details to me - you usually create these using helper methods on the resource client, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Helper methods on clients are convenient constructors for these classes, but the user will interact with them directly calling either
start
,close
or using them as context managers.(From
ActorClient
point of view this is indeed implementation detail hidden in thecall
method, but fromRunClient
point of view it is actual public return value of one of the public method.)