Provider get() is called too many times #305

Closed
@seanmil

Use Case

The get() method on a provider is currently called too many times. The operations performed by get() are often time-consuming (e.g. a slow command or a slow API). The Resource API specification at https://github.com/puppetlabs/puppet-specifications/blob/master/language/resource-api/README.md acknowledges this when it states that calls to get() should be minimal.

The problem, however, is that in practice the calls to get() are not always minimal.

Here are the problems with the current implementation:

  • The instances() method is called in primarily (only?) two scenarios: 1) when "puppet resource" is used, and 2) by resources which use "resource generation" (probably to support purging: see "resources" and "crayfishx/purge"). instances() attempts to cache results with the "cache_current_state" method, but that caches state into each resource instance, and those instances are not preserved between calls to instances(). The cached state is therefore available neither to other instances() calls nor to Puppet for the normal apply transaction.
  • When the provider does not use "simple_get_filter", the state is retrieved in refresh_current_state per resource instance. However, only the result for the sought resource is saved; everything else returned by get() is discarded. get() is called this way once per resource: https://github.com/puppetlabs/puppet-resource_api/blob/main/lib/puppet/resource_api.rb#L256

The case where get() is called in a type which does not support "simple_get_filter", and therefore returns all resources, is particularly bad: for a heavily used resource type, every resource is enumerated perhaps hundreds (or thousands) of times, with all results but one discarded each time.
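To make the cost concrete, here is a minimal standalone sketch of the pattern described above. It does not use the real Resource API: `CountingProvider` and `refresh_each` are hypothetical stand-ins that only mimic a provider without "simple_get_filter" and a per-resource refresh.

```ruby
# Hypothetical stand-in for a Resource API provider without simple_get_filter.
class CountingProvider
  attr_reader :get_calls

  def initialize(names)
    @names = names
    @get_calls = 0
  end

  # Enumerates every resource on each call and counts the calls.
  def get(_context)
    @get_calls += 1
    @names.map { |name| { name: name, ensure: 'present' } }
  end
end

# Mimics the per-resource refresh: call get(), keep the one matching entry,
# discard everything else.
def refresh_each(provider, wanted_names)
  wanted_names.map do |name|
    provider.get(nil).find { |resource| resource[:name] == name }
  end
end

names = (1..100).map { |i| "item#{i}" }
provider = CountingProvider.new(names)
states = refresh_each(provider, names)

# Total entries returned by get() minus the ones actually kept.
discarded = provider.get_calls * names.size - states.size
puts "get() calls: #{provider.get_calls}, discarded results: #{discarded}"
```

With 100 managed resources, this sketch calls get() 100 times and throws away 9900 of the 10000 entries returned.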

Describe the Solution You Would Like

Per the Resource API specification, the Resource API should actually attempt to minimize the calls to get(). Once a resource is retrieved, it should not be retrieved a second time.

The Resource API should establish and manage a cache of all values fetched from the provider's get(), persisting for the duration of a single "puppet resource", "puppet apply", or "puppet agent" transaction.
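One possible shape for such a cache is sketched below. This is not existing Resource API code: `TransactionStateCache`, `fetch` and `clear` are hypothetical names, the sketch assumes resources are identified by a `:name` key, and it covers only the case where get() returns all resources (no "simple_get_filter").

```ruby
# Hypothetical cache holding provider state for the length of one transaction.
class TransactionStateCache
  def initialize
    @states = {} # type name => { resource name => state hash }
  end

  # Returns the cached state for one resource. The block, which is expected
  # to wrap provider.get, runs only on the first request for a given type.
  def fetch(type_name, resource_name)
    unless @states.key?(type_name)
      all_states = yield
      @states[type_name] = all_states.to_h { |state| [state[:name], state] }
    end
    @states[type_name][resource_name]
  end

  # Would be called when the transaction ends so stale state is never reused.
  def clear
    @states.clear
  end
end

calls = 0
fake_get = lambda do
  calls += 1
  [{ name: 'a', ensure: 'present' }, { name: 'b', ensure: 'absent' }]
end

cache = TransactionStateCache.new
state_a = cache.fetch('example_type', 'a') { fake_get.call }
state_b = cache.fetch('example_type', 'b') { fake_get.call }
state_c = cache.fetch('example_type', 'c') { fake_get.call }
```

In this sketch the three lookups trigger a single call to the fake get(), and a resource that get() did not return simply comes back as nil. A real implementation would also have to decide how to invalidate entries after set() changes a resource.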

Describe Alternatives You've Considered

Each provider could implement its own cache at the provider level. Alternatively, the Resource API could provide an enhancement of, or alternative to, the SimpleProvider which implements caching entirely at the provider layer as a type of wrapper. However, if the goal is to call get() only once (or no more than once per resource), then this should probably be implemented within the Resource API, so that the person implementing a new RSAPI provider need not concern themselves with the particulars of how Puppet/RSAPI gets resources.
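For comparison, the provider-level alternative could look roughly like the following sketch. `CachedGet` and `SlowProvider` are hypothetical names, not part of the Resource API. The sketch assumes the same provider instance serves every call within a run, and it uses the plain get(context) / set(context, changes) signatures without "simple_get_filter".

```ruby
# Hypothetical wrapper that memoizes get() and drops the cache on set().
module CachedGet
  def get(context)
    @cached_get ||= super
  end

  def set(context, changes)
    @cached_get = nil # state may have changed, so force a fresh get()
    super
  end
end

# Hypothetical provider standing in for one with an expensive get().
class SlowProvider
  prepend CachedGet

  attr_reader :get_calls

  def initialize
    @get_calls = 0
  end

  def get(_context)
    @get_calls += 1
    [{ name: 'a', ensure: 'present' }]
  end

  def set(_context, _changes); end
end

slow = SlowProvider.new
3.times { slow.get(nil) }
calls_before_set = slow.get_calls

slow.set(nil, {})
slow.get(nil)
calls_after_set = slow.get_calls
```

Every provider author would have to opt in to something like this, which is the drawback noted above: the caching concern leaks into each provider instead of being handled once by the Resource API.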
