Aggregation of execution context when partitioning [BATCH-53] #3524

Closed
@spring-projects-issues

Description

Wayne Lund opened BATCH-53 and commented

Batch Statistics Aggregation: some months back we discussed modeling our schema so that the parent tables store the summarized, aggregated data for the detail at the next level down. Clients are now asking how we are going to do this in our partitioning scenarios. Chong and Lucas discussed it yesterday, and when Lucas and I reviewed we realized that we were looking at the issue a little differently. I remember considerable noise at one of our clients because we didn't cover all of the scenarios correctly when aggregating statistics. For example, we had an issue getting statistics aggregated correctly on jobs that had been restarted multiple times, so that they reported the true number of records processed. Here are some thoughts on what I think we're aggregating.

a. Skips & Records processed for the job ::= statistics for all records processed by the job, regardless of how many skips. I remember that in our early performance testing Mike Tsay kept a table of records, down to tracking how many we were processing per time unit. We had five steps running on the CDX batch jobs, each with its own set of record types to process (e.g. participant, addresses, employers, court cases, etc.). POINT OF AGGREGATION
b. Skips & Records processed for the Partitioned Step ::= all records and skips, aggregated across all partitions of the step. POINT OF AGGREGATION
c. Skips & Records processed for a Step ::= all records within a single step. Different from B because the step is concerned only with its own records.

A & B need to produce the same results whether or not a job has been restarted. In other words, the statistics and summaries should be identical regardless of how many restarts were involved. Would we meet project needs if we didn't expose status and statistics at this level of detail?
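
To make the three levels concrete, here is a minimal sketch of the roll-up in plain Java. It is not the Spring Batch API or schema: `StatsRollup`, `StepAttempt` and `Totals` are hypothetical names invented for illustration. It also assumes that each execution attempt records only the records and skips it handled itself, so that summing over attempts gives the true totals after a restart.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch: none of these types exist in Spring Batch.
class StatsRollup {

    // One attempt at running one partition of one step.
    // Assumption: processed/skipped count only this attempt's own work.
    record StepAttempt(String step, String partition, int attempt,
                       long processed, long skipped) {}

    // Aggregated counters at any level (partition, partitioned step, job).
    record Totals(long processed, long skipped) {
        Totals plus(Totals o) {
            return new Totals(processed + o.processed, skipped + o.skipped);
        }
    }

    // Sums the counters of every attempt that matches the filter.
    private static Totals sum(List<StepAttempt> attempts, Predicate<StepAttempt> filter) {
        return attempts.stream()
                .filter(filter)
                .map(a -> new Totals(a.processed(), a.skipped()))
                .reduce(new Totals(0, 0), Totals::plus);
    }

    // (a) Job level: every step, every partition, every attempt.
    static Totals forJob(List<StepAttempt> attempts) {
        return sum(attempts, a -> true);
    }

    // (b) Partitioned step level: all partitions of one step.
    static Totals forPartitionedStep(List<StepAttempt> attempts, String step) {
        return sum(attempts, a -> a.step().equals(step));
    }

    // (c) Single step execution: one partition only.
    static Totals forPartition(List<StepAttempt> attempts, String step, String partition) {
        return sum(attempts, a -> a.step().equals(step) && a.partition().equals(partition));
    }

    public static void main(String[] args) {
        // Same work done in one pass, and with partition p0 restarted once.
        List<StepAttempt> clean = List.of(
                new StepAttempt("load", "p0", 1, 100, 2),
                new StepAttempt("load", "p1", 1, 80, 1));
        List<StepAttempt> restarted = List.of(
                new StepAttempt("load", "p0", 1, 40, 1),
                new StepAttempt("load", "p0", 2, 60, 1),
                new StepAttempt("load", "p1", 1, 80, 1));
        // Checks that restarts do not change the job-level summary.
        System.out.println(forJob(clean).equals(forJob(restarted)));
    }
}
```

If attempts instead stored cumulative counts, the roll-up would have to take the latest attempt per partition rather than sum them. Which of the two the schema should store is the kind of detail this issue needs to settle.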


This issue is a sub-task of BATCH-677
