Skip to content

batch: different behaviour for all-items-failed in cold start vs. warm start #1021

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
huonw opened this issue Feb 10, 2022 · 3 comments · Fixed by #1022
Closed

batch: different behaviour for all-items-failed in cold start vs. warm start #1021

huonw opened this issue Feb 10, 2022 · 3 comments · Fixed by #1022
Labels
bug Something isn't working p1

Comments

@huonw
Copy link
Contributor

huonw commented Feb 10, 2022

We're attempting to use aws_lambda_powertools.utilities.batch to handle retries etc. for an SQS queue. We're observing differences between a cold lambda and a warm lambda (i.e. previously processed an SQS message) for a batch containing a single failing message:

  • cold lambda: we get the BatchProcessingError: All records failed processing. 1 individual errors loggedseparately below. exception (i.e. the whole invocation fails)
  • warm lambda: we get a successful invocation reporting {"batchItemFailures": ...}

This difference in behaviour makes us a bit suspicious that there's a deeper bug hiding?

Expected Behavior

The cold and warm lambdas should behave the same.

Current Behavior

Cold lambdas have the invocation fail, warm lambdas have it succeed.

Possible Solution

🤷

Steps to Reproduce (for bugs)

This is the distilled version of our code, without the business logic.

from typing import Any
from aws_lambda_powertools.logging import Logger
from aws_lambda_powertools.tracing import Tracer
from aws_lambda_powertools.utilities.batch import (
    batch_processor,
    BatchProcessor,
    EventType,
)
from aws_lambda_powertools.utilities.data_classes.sqs_event import SQSRecord

logger = Logger()
tracer = Tracer()

from . import client

processor = BatchProcessor(event_type=EventType.SQS)


@tracer.capture_method
def record_handler(record: SQSRecord) -> None:
    raise Exception()

@logger.inject_lambda_context
@tracer.capture_lambda_handler
@batch_processor(record_handler=record_handler, processor=processor)
def handler(event: Any, context: Any) -> Any:
    response = processor.response()
    logger.info(response)
    return response

We observe the difference when the lambda is invoked with a batch containing a single record. I haven't checked for batches with multiple records.

Environment

  • Powertools version used: 1.24.2
  • Packaging format (Layers, PyPi): layers
  • AWS Lambda function runtime: python3.9
  • Debugging logs

Cold start:

START RequestId: 3bd01a85-d651-5443-a0b9-40fa74025d5e Version: $LATEST
2022-02-10 23:30:39,023 aws_lambda_powertools.tracing.tracer [DEBUG] Annotating cold start
2022-02-10 23:30:39,023 aws_lambda_powertools.tracing.tracer [DEBUG] Calling lambda handler
2022-02-10 23:30:39,024 aws_lambda_powertools.tracing.tracer [DEBUG] Calling method: record_handler
2022-02-10 23:30:39,024 aws_lambda_powertools.tracing.tracer [ERROR] Exception received from 'record_handler' method
Traceback (most recent call last):
File "/opt/python/aws_lambda_powertools/tracing/tracer.py", line 632, in decorate
response = method(*args, **kwargs)
File "/var/task/shared/notifications/pinpoint_sender.py", line 22, in record_handler
raise Exception()
Exception
2022-02-10 23:30:39,024 aws_lambda_powertools.utilities.batch.base [DEBUG] Record processing exception: <class 'Exception'>:
2022-02-10 23:30:39,024 aws_lambda_powertools.middleware_factory.factory [ERROR] Caught exception in batch_processor
Traceback (most recent call last):
File "/opt/python/aws_lambda_powertools/middleware_factory/factory.py", line 133, in wrapper
response = middleware()
File "/opt/python/aws_lambda_powertools/utilities/batch/base.py", line 197, in batch_processor
processor.process()
File "/opt/python/aws_lambda_powertools/utilities/batch/base.py", line 95, in __exit__
self._clean()
File "/opt/python/aws_lambda_powertools/utilities/batch/base.py", line 381, in _clean
raise BatchProcessingError(
aws_lambda_powertools.utilities.batch.exceptions.BatchProcessingError: All records failed processing. 1 individual errors loggedseparately below.
Traceback (most recent call last):
File "/opt/python/aws_lambda_powertools/utilities/batch/base.py", line 367, in _process_record
result = self.handler(record=data)
File "/opt/python/aws_lambda_powertools/tracing/tracer.py", line 632, in decorate
response = method(*args, **kwargs)
File "/var/task/shared/notifications/pinpoint_sender.py", line 22, in record_handler
raise Exception()
Exception
2022-02-10 23:30:39,025 aws_lambda_powertools.tracing.tracer [ERROR] Exception received from handler
Traceback (most recent call last):
File "/opt/python/aws_lambda_powertools/tracing/tracer.py", line 315, in decorate
response = lambda_handler(event, context, **kwargs)
File "/opt/python/aws_lambda_powertools/middleware_factory/factory.py", line 133, in wrapper
response = middleware()
File "/opt/python/aws_lambda_powertools/utilities/batch/base.py", line 197, in batch_processor
processor.process()
File "/opt/python/aws_lambda_powertools/utilities/batch/base.py", line 95, in __exit__
self._clean()
File "/opt/python/aws_lambda_powertools/utilities/batch/base.py", line 381, in _clean
raise BatchProcessingError(
aws_lambda_powertools.utilities.batch.exceptions.BatchProcessingError: All records failed processing. 1 individual errors loggedseparately below.
Traceback (most recent call last):
File "/opt/python/aws_lambda_powertools/utilities/batch/base.py", line 367, in _process_record
result = self.handler(record=data)
File "/opt/python/aws_lambda_powertools/tracing/tracer.py", line 632, in decorate
response = method(*args, **kwargs)
File "/var/task/shared/notifications/pinpoint_sender.py", line 22, in record_handler
raise Exception()
Exception
[ERROR] BatchProcessingError: All records failed processing. 1 individual errors loggedseparately below.Traceback (most recent call last):  File "/opt/python/aws_lambda_powertools/utilities/batch/base.py", line 367, in _process_record    result = self.handler(record=data)  File "/opt/python/aws_lambda_powertools/tracing/tracer.py", line 632, in decorate    response = method(*args, **kwargs)  File "/var/task/shared/notifications/pinpoint_sender.py", line 22, in record_handler    raise Exception()ExceptionTraceback (most recent call last):  File "/opt/python/aws_lambda_powertools/logging/logger.py", line 354, in decorate    return lambda_handler(event, context)  File "/opt/python/aws_lambda_powertools/tracing/tracer.py", line 315, in decorate    response = lambda_handler(event, context, **kwargs)  File "/opt/python/aws_lambda_powertools/middleware_factory/factory.py", line 133, in wrapper    response = middleware()  File "/opt/python/aws_lambda_powertools/utilities/batch/base.py", line 197, in batch_processor    processor.process()  File "/opt/python/aws_lambda_powertools/utilities/batch/base.py", line 95, in __exit__    self._clean()  File "/opt/python/aws_lambda_powertools/utilities/batch/base.py", line 381, in _clean    raise BatchProcessingError(
END RequestId: 3bd01a85-d651-5443-a0b9-40fa74025d5e
REPORT RequestId: 3bd01a85-d651-5443-a0b9-40fa74025d5e	Duration: 4.89 ms	Billed Duration: 5 ms	Memory Size: 1024 MB	Max Memory Used: 131 MB	Init Duration: 2747.49 ms	XRAY TraceId: 1-6205a01b-7a6ccc85114d85b16cae1bf2	SegmentId: 027240f94d75adf6	Sampled: true

Warm start:

START RequestId: 4c561c3a-8657-5c98-bf47-46452430d17f Version: $LATEST
2022-02-10 23:31:05,393 aws_lambda_powertools.tracing.tracer [DEBUG] Annotating cold start
2022-02-10 23:31:05,394 aws_lambda_powertools.tracing.tracer [DEBUG] Calling lambda handler
2022-02-10 23:31:05,394 aws_lambda_powertools.tracing.tracer [DEBUG] Calling method: record_handler
2022-02-10 23:31:05,394 aws_lambda_powertools.tracing.tracer [ERROR] Exception received from 'record_handler' method
Traceback (most recent call last):
File "/opt/python/aws_lambda_powertools/tracing/tracer.py", line 632, in decorate
response = method(*args, **kwargs)
File "/var/task/shared/notifications/pinpoint_sender.py", line 22, in record_handler
raise Exception()
Exception
2022-02-10 23:31:05,394 aws_lambda_powertools.utilities.batch.base [DEBUG] Record processing exception: <class 'Exception'>:
{     "level": "INFO",     "location": "/var/task/shared/notifications/pinpoint_sender.py:handler:56",     "message": {         "batchItemFailures": [             {                 "itemIdentifier": "490fd65e-664f-48be-b264-0a569ba14a51"             }         ]     },     "timestamp": "2022-02-10 23:31:05,394+0000",     "service": "api",     "sampling_rate": "1",     "commit": "6b83e37dab11205fd9a78a08d3221b8a405a5c6e",     "cold_start": false,     "function_name": "serverless-pr-2250-pinpoint_sender",     "function_memory_size": "1024",     "function_arn": "arn:aws:lambda:ap-southeast-2:561074087650:function:serverless-pr-2250-pinpoint_sender",     "function_request_id": "4c561c3a-8657-5c98-bf47-46452430d17f",     "xray_trace_id": "1-6205a039-5b188aab3a9891836e5a5de5" }
2022-02-10 23:31:05,394 aws_lambda_powertools.tracing.tracer [DEBUG] Received lambda handler response successfully
END RequestId: 4c561c3a-8657-5c98-bf47-46452430d17f
REPORT RequestId: 4c561c3a-8657-5c98-bf47-46452430d17f	Duration: 2.68 ms	Billed Duration: 3 ms	Memory Size: 1024 MB	Max Memory Used: 131 MB	XRAY TraceId: 1-6205a039-5b188aab3a9891836e5a5de5	SegmentId: 1ad8d244381e4642	Sampled: true

@huonw huonw added bug Something isn't working triage Pending triage from maintainers labels Feb 10, 2022
michaelbrewer added a commit to gyft/aws-lambda-powertools-python that referenced this issue Feb 11, 2022
@michaelbrewer
Copy link
Contributor

michaelbrewer commented Feb 11, 2022

@heitorlessa @huonw - a fix (#1022) with unit tests validating it

@heitorlessa heitorlessa removed the triage Pending triage from maintainers label Feb 14, 2022
@heitorlessa
Copy link
Contributor

Uh oh, that's definitely a critical bug caused by the many refactorings when working on the doc - thank you @michaelbrewer for the fix!

@huonw Gonna work on this now before meetings start so we can have this emergency fix released.

@heitorlessa heitorlessa added p1 pending-release Fix or implementation already in dev waiting to be released labels Feb 14, 2022
@github-actions
Copy link
Contributor

This is now released under 1.25.10 version!

@github-actions github-actions bot removed the pending-release Fix or implementation already in dev waiting to be released label Apr 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working p1
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants