Description
Is your feature request related to a problem? Please describe
Currently, the core profiler only supports timing information for queries (it times weight creation, scoring, matching, etc.). Two limitations arise from this: a user cannot add more information to the breakdown, and a plugin cannot add any profiling information. It would be useful for any plugin involved in the search process to be able to contribute profiling information if it wants to provide telemetry on how a search query behaves inside the plugin. Currently, there are no extension points for plugins to add profiling information. The first limitation arises from the fact that the code is too narrowly focused on basic query timing information through its strict use of the QueryTimingType enum. Abstracting some of these classes leaves room to design solutions that let plugins add information. In addition to not being able to supply additional timing information, the current implementation prevents non-timing information from being included in the breakdown (e.g. the size of the result set after a filter). Extending the capability of the profiler would allow plugins to add useful profiling information.
Describe the solution you'd like
Context
This is the general design of the profiler and how it interacts with search:
Profile Breakdown
Current Breakdown Structure
Proposed Breakdown Structure
This is the key component of the profiler. Each type of profiler has its own information that it wants to display through a breakdown. Currently, the breakdown only supports timing information, but abstracting it allows for more than timing information:
public abstract class AbstractProfileBreakdown {
/** Sole constructor. */
public AbstractProfileBreakdown() {}
/**
* Gather important metrics for current instance
*/
public abstract Map<String, Long> toImportantMetricsMap();
/**
* Build a breakdown for current instance
*/
public abstract Map<String, Long> toBreakdownMap();
/**
* Fetch extra debugging information.
*/
public Map<String, Object> toDebugMap() {
return emptyMap();
}
}
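As a sketch of what a plugin-supplied, non-timing breakdown could look like under this abstraction (the class name and the `result_set_size` metric are hypothetical; the base class is elided here so the snippet stands alone):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical plugin breakdown reporting a non-timing metric:
// the size of the result set after a filter is applied.
class FilterSizeProfileBreakdown /* extends AbstractProfileBreakdown */ {

    private long resultSetSize = 0;

    /** Called by the plugin after its filter phase completes. */
    public void recordResultSetSize(long size) {
        this.resultSetSize = size;
    }

    /** Top-level metrics surfaced as "important_metrics". */
    public Map<String, Long> toImportantMetricsMap() {
        return Collections.singletonMap("result_set_size", resultSetSize);
    }

    /** Full breakdown; non-timing values are legal here. */
    public Map<String, Long> toBreakdownMap() {
        Map<String, Long> map = new HashMap<>();
        map.put("result_set_size", resultSetSize);
        return map;
    }
}
```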
Profile Timing Breakdown
Queries, or code in a plugin, might want to time specific operations, so it would be useful to have a further abstraction just for timing information, where the timers are kept in a map keyed by an enum specified for this breakdown:
/**
* Base class for all timing profile breakdowns.
*/
public abstract class AbstractTimingProfileBreakdown extends AbstractProfileBreakdown {
protected final Map<String, Timer> timers = new HashMap<>();
public AbstractTimingProfileBreakdown() {}
public Timer getTimer(String type) {
...
}
public long toNodeTime() {
...
}
/**
* Build a timing count breakdown for current instance
*/
@Override
public Map<String, Long> toBreakdownMap() {
...
}
}
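A minimal sketch of how the elided methods could behave, with a simplified stand-in for the profiler's Timer class (the count-emitting key convention is an illustrative assumption):

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for the profiler's Timer: accumulated time plus a count.
class Timer {
    private long approximateTiming = 0;
    private long count = 0;

    public void record(long nanos) { approximateTiming += nanos; count++; }
    public long getApproximateTiming() { return approximateTiming; }
    public long getCount() { return count; }
}

// Sketch of the timing breakdown: timers are created on demand per key.
class TimingProfileBreakdownSketch {
    protected final Map<String, Timer> timers = new HashMap<>();

    /** Lazily create a timer for the requested type. */
    public Timer getTimer(String type) {
        return timers.computeIfAbsent(type, t -> new Timer());
    }

    /** Total time attributed to this node of the profile tree. */
    public long toNodeTime() {
        long total = 0;
        for (Timer t : timers.values()) {
            total += t.getApproximateTiming();
        }
        return total;
    }

    /** Emit both the time and the invocation count for each timer. */
    public Map<String, Long> toBreakdownMap() {
        Map<String, Long> map = new HashMap<>();
        for (Map.Entry<String, Timer> e : timers.entrySet()) {
            map.put(e.getKey(), e.getValue().getApproximateTiming());
            map.put(e.getKey() + "_count", e.getValue().getCount());
        }
        return map;
    }
}
```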
Profile Tree
This is another important part of the profiler. It keeps track of each breakdown in the query tree by constructing its own tree. It does not need any changes from the current implementation. Each profiler has its own subclass of a profile tree because it needs to return the specific breakdown instance for that profiler. These subclasses can also contain other information (e.g. timing the rewrite of a query). The tree keeps track of the breakdown instances along with the queries associated with each breakdown.
public abstract class AbstractInternalProfileTree<PB extends AbstractProfileBreakdown, E>
Profile Result
This gets created after the query is done and the entire ProfileTree is needed. Each breakdown gets converted into a result. In order to statically parse any ProfileResult, a new map, importantMetrics, had to be added to display this top-level information. Currently, time_in_nanos and other information is displayed in it:
public class ProfileResult implements Writeable, ToXContentObject {
protected final String type;
protected final String description;
protected final Map<String, Long> importantMetrics;
protected final Map<String, Long> breakdown;
protected final Map<String, Object> debug;
protected List<ProfileResult> children;
}
Profile Shard Result
A profile result gets converted into a profile shard result to be sent to other nodes. This should be abstracted in case plugins (or aggregations) don’t need to send additional shard-level information (rewrite time and collector info in the case of a query).
Profiler
This was already abstracted, but a generic type parameter for the profile shard result was added to make it cleaner to create a shard result from the profiler.
public abstract class AbstractProfiler<PB extends AbstractProfileBreakdown, E, SR extends AbstractProfileShardResult> {
...
public abstract SR createProfileShardResult();
}
Profilers
This is what actually gets created when the profile flag is specified. It has a list of QueryProfiler (it starts with one query profiler, but aggregations add more to the list) and an AggregationProfiler. For the Solution 1 PoC, a list of AbstractProfiler for plugins was added. It uses the SearchPlugin to get each plugin's profiler (if specified) and adds it to the list. Importantly, this is where concurrency is specified when creating profilers. With Solution 1 (just like the current code), all of these classes have to have a concurrent subclass that specifies what to do in the concurrent case.
Proposed Solutions
The proposal has two high-level suggestions:
1) Single Profiler design
The SearchPlugin has a hook that accepts a Class<AbstractTimingProfileBreakdown>. This means that the core side only has one profile tree structure (InternalQueryProfileTree) and the query profiler maintains a reference to the plugin's profile breakdown class. When the profile tree needs to create an instance, it simply instantiates it. The plugin would only have to create an AbstractProfileBreakdown subclass.
This would be the theoretical output of this design:
"profile" : {
"shards" : [
{
"id" : "[pxAj3PB9QPqb4R-YcH-yMQ][test_index][0]",
"inbound_network_time_in_millis" : 0,
"outbound_network_time_in_millis" : 0,
"searches" : [
{
"query" : [
{
"type" : "KNNQuery",
"description" : "",
"important_metrics" : {
"avg_slice_time_in_nanos" : 0,
"min_slice_time_in_nanos" : 123456,
"time_in_nanos" : 123456,
"max_slice_time_in_nanos" : 123456
},
"breakdown" : {
"avg_score_count" : 4,
"next_doc" : 2000625,
"score_count" : 19,
"score" : 1991375,
"max_next_doc_count" : 15,
"ann_search" : 19130750,
"search_leaf_count" : 10,
"bitset_creation" : 18868750,
/* Both plugin info and query info in same breakdown */
"build_scorer" : 20872417,
"avg_ann_search_count" : 2,
"create_weight" : 196542,
"avg_next_doc_count" : 7,
"ann_search_count" : 10,
"max_search_leaf_count" : 3,
"min_score_count" : 2
}
}
],
"rewrite_time" : 12207,
"collector" : [
{
...
}
]
}
],
"aggregations" : [ ]
}
]
}
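The registration side of this design could be wired up roughly as follows; the class names and the reflection-based instantiation are illustrative assumptions, not the actual SearchPlugin API:

```java
import java.util.HashMap;
import java.util.Map;

// Minimal stand-in for a plugin breakdown base class.
abstract class PluginProfileBreakdown {
    public abstract Map<String, Long> toBreakdownMap();
}

// Hypothetical plugin-supplied breakdown.
class KnnProfileBreakdown extends PluginProfileBreakdown {
    @Override
    public Map<String, Long> toBreakdownMap() {
        Map<String, Long> map = new HashMap<>();
        map.put("ann_search", 0L);
        map.put("bitset_creation", 0L);
        return map;
    }
}

// Sketch of the core side: it holds only the Class object registered by the
// plugin and instantiates a fresh breakdown per node of the profile tree.
class ProfileTreeSketch {
    private final Class<? extends PluginProfileBreakdown> pluginBreakdownClass;

    ProfileTreeSketch(Class<? extends PluginProfileBreakdown> clazz) {
        this.pluginBreakdownClass = clazz;
    }

    PluginProfileBreakdown createPluginBreakdown() {
        try {
            return pluginBreakdownClass.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not instantiate plugin breakdown", e);
        }
    }
}
```

The important property is that core never needs a plugin-specific tree: one tree type instantiates whatever breakdown class was registered.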
2) Multi-profiler design
The SearchPlugin has a hook that accepts an AbstractProfiler. Profilers maintains a list of the profilers provided by plugins. The final output contains a separate tree structure for each plugin profiler. The plugin would have to implement many classes, such as AbstractProfileBreakdown, AbstractProfiler, AbstractProfileTree, AbstractShardResult, etc. A query can be executed concurrently, and the profiler uses separate classes (profile tree, profile breakdown, etc.) to handle the profiling stats across slices. Since the plugin would implement its own profiler, it would also have to implement the concurrent versions of these classes. An enum would still be used for timing, but it would be specific to each profiler (core would only have to worry about the basic query timing types).
This would be the theoretical output of this design:
"profile" : {
"shards" : [
{
"id" : "[pxAj3PB9QPqb4R-YcH-yMQ][test_index][0]",
"inbound_network_time_in_millis" : 0,
"outbound_network_time_in_millis" : 0,
"searches" : [
{
"query" : [
{
"type" : "KNNQuery",
"description" : "",
"important_metrics" : {
"avg_slice_time_in_nanos" : 0,
"min_slice_time_in_nanos" : 123456,
"time_in_nanos" : 123456,
"max_slice_time_in_nanos" : 123456
},
"breakdown" : {
"avg_score_count" : 4,
"next_doc" : 2000625,
"score_count" : 19,
"score" : 1991375,
"max_next_doc_count" : 15,
/* Query timing info shown here */
"build_scorer" : 20872417,
"create_weight" : 196542,
"avg_next_doc_count" : 7,
"min_score_count" : 2
}
}
],
"rewrite_time" : 12207,
"collector" : [
{
...
}
]
}
],
"plugins" : [
{
"knn-query" : [
{
"type" : "KNNQuery",
"description" : "",
"breakdown" : {
"search_leaf_count" : 10,
"ann_search" : 58718123,
"bitset_creation" : 21792,
/* Plugin-specific info shown here */
"search_leaf" : 60492126,
"cardinality" : 0,
"exact_search" : 0
}
}
]
}
],
"aggregations" : [ ]
}
]
}
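Under this design, the plugin-facing surface could be sketched as follows; all class and method names here are illustrative stand-ins, not the real API:

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the profiler type a plugin would implement under Solution 2.
abstract class PluginProfilerSketch {
    /** Name under which this profiler's tree appears in the "plugins" section. */
    abstract String getName();
}

// Hypothetical k-NN plugin profiler producing the "knn-query" tree.
class KnnProfilerSketch extends PluginProfilerSketch {
    @Override
    String getName() { return "knn-query"; }
}

// Sketch of Profilers: alongside the query and aggregation profilers, it keeps
// a list of plugin profilers gathered from each SearchPlugin hook, and each
// one produces its own independent tree in the response.
class ProfilersSketch {
    private final List<PluginProfilerSketch> pluginProfilers = new ArrayList<>();

    void addPluginProfiler(PluginProfilerSketch profiler) {
        pluginProfilers.add(profiler);
    }

    List<PluginProfilerSketch> getPluginProfilers() {
        return pluginProfilers;
    }
}
```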
Related component
Search:Query Insights
Describe alternatives you've considered
Concurrency
Running a query non-concurrently means that a single profile breakdown is generated per LeafReaderContext, and each segment belonging to that context gets processed sequentially with its metrics updated sequentially. However, when running concurrently, the segments get split into slices that run in parallel, so multiple LeafReaderContexts are generated (one per slice). Therefore, once profiling is done on each slice, the profile results need to be aggregated: for each metric, the min/max/avg across slices is calculated, providing concurrency information in the breakdown. In addition, each metric is summed across slices to get the final result for that particular query in the profile tree (what non-concurrent execution would normally produce). With Solution 1, this is quite easy because the plugin profile breakdown belongs to the query profile breakdown, so when the results get aggregated, the plugin breakdown metrics are aggregated along with them. However, with Solution 2, where the plugin has its own profiler, the plugin needs to implement all of the aggregation of its metrics itself. For Solution 2, some sort of abstracted class could be created that does the aggregation for a single metric. This would make it simpler for plugins using Solution 2 to create a concurrent profiler. Without it, a plugin author would need to write a concurrent version of each class: profiler, profile tree, and profile breakdown.
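The per-slice aggregation described above can be sketched for a single metric: given that metric's value on each slice, compute min/max/avg for the concurrency stats plus the sum for the final breakdown value (the key-naming convention mirrors the example output; the class itself is hypothetical):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of aggregating one metric's per-slice values into the entries a
// concurrent profile reports: min/max/avg across slices plus the summed
// total that non-concurrent execution would have produced.
class SliceMetricAggregator {
    static Map<String, Long> aggregate(String metric, List<Long> sliceValues) {
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        long sum = 0;
        for (long v : sliceValues) {
            min = Math.min(min, v);
            max = Math.max(max, v);
            sum += v;
        }
        Map<String, Long> out = new HashMap<>();
        out.put(metric, sum);                          // final tree value
        out.put("min_" + metric, min);                 // concurrency stats
        out.put("max_" + metric, max);
        out.put("avg_" + metric, sum / sliceValues.size());
        return out;
    }
}
```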
Enum elimination
Enums are used to build the timers and to keep track of which timer to start. Ultimately, only the QueryTimingType enum gets used inside the concurrent breakdown builder. This does not allow plugins to add their own timers and have them displayed in the breakdown. Instead, it would be ideal to have no enum structure in core and to have the timing breakdown dynamically create and track timers based on, say, a string passed in. Then the concurrent breakdown wouldn't have to be based on one specific enum and could iterate through all timers.
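An enum-free concurrent breakdown could be sketched as follows: instead of looping over QueryTimingType values, it walks every timer key present in the per-slice maps, so timers a plugin registered by string are picked up automatically (class and method names are illustrative):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: combine per-slice breakdown maps by iterating whatever keys exist
// rather than a fixed enum, so plugin-created timers are included for free.
class DynamicConcurrentBreakdown {
    static Map<String, Long> combine(List<Map<String, Long>> sliceBreakdowns) {
        Map<String, Long> combined = new HashMap<>();
        for (Map<String, Long> slice : sliceBreakdowns) {
            for (Map.Entry<String, Long> e : slice.entrySet()) {
                // Sum each metric across slices, whatever its name.
                combined.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return combined;
    }
}
```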
Additional context