Netbox API returns duplicate resources during paging with offset #18729

Closed
dankotrajkovic opened this issue Feb 25, 2025 · 5 comments · Fixed by #18805
Labels
severity: medium Results in substantially degraded or broken functionality for specific workflows status: accepted This issue has been accepted for implementation type: bug A confirmed report of unexpected behavior in the application

Comments

@dankotrajkovic

dankotrajkovic commented Feb 25, 2025

Deployment Type

Self-hosted

NetBox Version

v4.2.3

Python Version

3.12

Steps to Reproduce

Use Python or Postman to GET the clusters from the NetBox API by paging through the resources.
The real-world use case is loading a larger list of clusters via the paging mechanism.

To reproduce, run the following code:

import requests


def main():
    """
    Pull NetBox clusters from demo.netbox.dev using limit=5.

    The small limit simulates a bigger database where multiple requests
    (for example, pages of 50 items to load some 250 clusters) are needed
    to retrieve the full list. The idea is to show that a few duplicates
    appear among the returned items.
    """
    cluster_list = []  # Accumulates the clusters returned by each request
    cluster_unique_ids = set()  # Unique names of clusters loaded from NetBox

    # Collect the clusters
    headers = {
        'Accept': 'application/json',
        'Authorization': 'Token 6a768e6363830a536ffa07abf261c1d64d365b9a'
    }

    parameters = {
        'limit': 5,
        'offset': 0
    }
    while True:
        response = requests.get('https://demo.netbox.dev/api/virtualization/clusters/',
                                headers=headers, params=parameters)
        response.raise_for_status()  # Stop instead of looping forever on an error response
        data = response.json()
        print(f'Collected clusters from NetBox with offset: {parameters["offset"]}, limit: {parameters["limit"]}')
        cluster_list.extend(data['results'])
        parameters['offset'] += parameters['limit']
        if not data['next']:
            break

    # Check whether any duplicates are present among the collected clusters
    for cluster in cluster_list:
        if cluster['name'] not in cluster_unique_ids:
            cluster_unique_ids.add(cluster['name'])
        else:
            print(f'Duplicate Cluster. Name: {cluster["name"]}, ID: {cluster["id"]}')


if __name__ == '__main__':
    main()

To reproduce in Postman:
Issue a GET request to the following path:
https://demo.netbox.dev/api/virtualization/clusters/?limit=5&offset=5
and then issue again with
https://demo.netbox.dev/api/virtualization/clusters/?limit=5&offset=30

You will see that the cluster with ID 10 appears in both responses. The duplicated ID may vary between runs.

The example above uses demo.netbox.dev, but we experience the same behavior on our self-hosted, on-prem instance. With 150 clusters we see about 30-35 duplicates.
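Until the server-side ordering is fixed, affected clients can defend themselves by deduplicating on the primary key while paging. A minimal sketch (the `merge_pages` helper is hypothetical, not part of the report):

```python
def merge_pages(pages):
    """Flatten paginated 'results' lists, dropping duplicates by 'id'.

    When the server pages an unordered queryset with LIMIT/OFFSET, the
    same row can appear on two pages; this keeps the first occurrence
    and collects the repeats for inspection.
    """
    seen_ids = set()
    merged = []
    duplicates = []
    for page in pages:
        for item in page:
            if item["id"] in seen_ids:
                duplicates.append(item)
            else:
                seen_ids.add(item["id"])
                merged.append(item)
    return merged, duplicates
```

Note this only hides the symptom: rows can also be *skipped* between pages for the same reason, which no client-side filter can recover.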

Expected Behavior

Expect not to have duplicates returned as we page through the clusters.

Observed Behavior

Clusters with duplicate IDs are present in the responses.

We see that a few IDs are duplicated in the responses:

Collected clusters from Netbox with offset: 0, limit: 5
Collected clusters from Netbox with offset: 5, limit: 5
Collected clusters from Netbox with offset: 10, limit: 5
Collected clusters from Netbox with offset: 15, limit: 5
Collected clusters from Netbox with offset: 20, limit: 5
Collected clusters from Netbox with offset: 25, limit: 5
Collected clusters from Netbox with offset: 30, limit: 5
Duplicate Cluster. Name: gc-us-west1, ID: 10
Duplicate Cluster. Name: gc-europe-west4, ID: 22
@dankotrajkovic dankotrajkovic added status: needs triage This issue is awaiting triage by a maintainer type: bug A confirmed report of unexpected behavior in the application labels Feb 25, 2025
@bctiemann bctiemann added status: needs owner This issue is tentatively accepted pending a volunteer committed to its implementation severity: medium Results in substantially degraded or broken functionality for specific workflows and removed status: needs triage This issue is awaiting triage by a maintainer labels Feb 27, 2025
@bctiemann
Contributor

This seems fairly high severity as the API pagination ought to be predictable and orderly. Is this reproducible in any other models?

@dankotrajkovic
Author

In our local environment, we can reproduce this with the IPAddress model. That is where we initially found it, but we could not reproduce it on the public NetBox instance, so we held back from raising the issue.

We thought the duplicates were caused by having 500,000 IPAddresses in NetBox across various VRFs. But even then the duplicates were not severe: fetching with limit=1000 (so 500 pages), we were getting only about 10 duplicates. That was still very problematic for our code, and we had to write methods to recover from it; luckily, on the Cluster model the issue is severe enough that it should lead to quicker discovery of the problem. It's possible we are doing something wrong, but either way, knowing how to fix this would really help us.

@atownson
Contributor

I have seen this issue as well, when performing GET requests for Services.

@bctiemann bctiemann self-assigned this Mar 4, 2025
@bctiemann
Contributor

It looks like the issue is just that Django isn't obeying the model's ordering setting when annotation is applied to the queryset, i.e. in the case of ClusterViewSet:

queryset = Cluster.objects.prefetch_related('virtual_machines').annotate(
    allocated_vcpus=Sum('virtual_machines__vcpus'),
    allocated_memory=Sum('virtual_machines__memory'),
    allocated_disk=Sum('virtual_machines__disk'),
)

Note that ordering = ["name"] for Cluster:

In [29]: queryset = Cluster.objects.all()

In [30]: [(r.id, r.name) for r in queryset[0:10]]
Out[30]: 
[(9, 'DO-AMS3'),
 (8, 'DO-BLR1'),
 (7, 'DO-FRA1'),
 (6, 'DO-LON1'),
 (1, 'DO-NYC1'),
 (2, 'DO-NYC3'),
 (3, 'DO-SFO3'),
 (5, 'DO-SGP1'),
 (4, 'DO-TOR1'),
 (36, 'gc-asia-east1')]
In [27]: queryset = Cluster.objects.prefetch_related('virtual_machines').annotate(
    ...:         allocated_vcpus=Sum('virtual_machines__vcpus'),
    ...:         allocated_memory=Sum('virtual_machines__memory'),
    ...:         allocated_disk=Sum('virtual_machines__disk'),
    ...:     )

In [28]: [(r.id, r.name) for r in queryset[0:10]]
Out[28]: 
[(4, 'DO-TOR1'),
 (34, 'gc-asia-southeast1'),
 (40, 'gc-asia-northeast3'),
 (10, 'gc-us-west1'),
 (9, 'DO-AMS3'),
 (7, 'DO-FRA1'),
 (35, 'gc-asia-southeast2'),
 (38, 'gc-asia-northeast1'),
 (15, 'gc-us-east1'),
 (6, 'DO-LON1')]

If .order_by("name") is added to the custom queryset, it sorts predictably. Same if you add &ordering=name to the query parameters on the API call.
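The query-parameter workaround can also be applied from the client side without waiting for a release. A hedged sketch (the `build_page_url` helper is mine, not from the thread):

```python
from urllib.parse import urlencode


def build_page_url(base, limit, offset, ordering="name"):
    """Build a paged NetBox API URL with an explicit sort key.

    Passing 'ordering' pins a deterministic server-side sort, which is
    what makes LIMIT/OFFSET paging stable when the model's default
    ordering is lost (e.g. when the queryset is annotated).
    """
    query = urlencode({"limit": limit, "offset": offset, "ordering": ordering})
    return f"{base}?{query}"
```

For example, `build_page_url("https://demo.netbox.dev/api/virtualization/clusters/", 5, 0)` yields a request equivalent to adding `&ordering=name` by hand.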

https://code.djangoproject.com/ticket/32811

We may need to identify all the ViewSets that use annotation in this way and add explicit ordering to the queryset statement.
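The underlying reason is plain SQL: without an ORDER BY, the database makes no guarantee about row order, so consecutive LIMIT/OFFSET windows may overlap or skip rows. A minimal, self-contained illustration (SQLite standing in for NetBox's database; table and names are invented) that an explicit ORDER BY makes offset paging cover each row exactly once:

```python
import sqlite3

# In-memory table of 15 fake clusters, mimicking paged API reads.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cluster (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO cluster (id, name) VALUES (?, ?)",
    [(i, f"cluster-{i:03d}") for i in range(1, 16)],
)


def fetch_page(limit, offset):
    # The ORDER BY pins a total order, so each LIMIT/OFFSET window is
    # a disjoint slice; drop it and the slices are no longer guaranteed
    # to be disjoint or exhaustive.
    return conn.execute(
        "SELECT id FROM cluster ORDER BY name LIMIT ? OFFSET ?",
        (limit, offset),
    ).fetchall()


ids = [row[0] for off in (0, 5, 10) for row in fetch_page(5, off)]
assert len(ids) == len(set(ids))  # no duplicates across pages
```

This matches the observed fix: restoring an explicit `order_by()` on the annotated queryset restores disjoint pages.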
