Netbox API returns duplicate resources during paging with offset #18729

Closed
dankotrajkovic opened this issue Feb 25, 2025 · 5 comments · Fixed by #18805
Labels
severity: medium Results in substantially degraded or broken functionality for specific workflows status: accepted This issue has been accepted for implementation type: bug A confirmed report of unexpected behavior in the application

Comments

@dankotrajkovic

dankotrajkovic commented Feb 25, 2025

Deployment Type

Self-hosted

NetBox Version

v4.2.3

Python Version

3.12

Steps to Reproduce

Use Python or Postman to GET the clusters from the NetBox API by paging through the resources.
The real-world use case is loading a larger list of clusters via the paging mechanism.

To reproduce, run the following code:

import requests


def main():
    """
    Pull NetBox clusters from demo.netbox.dev using limit=5.

    The small limit simulates a bigger database where multiple requests
    (for example, pages of 50 items to load some 250 clusters) are needed
    to retrieve the full list. The idea is to show that a few duplicates
    appear among the returned items.
    """
    cluster_list = []  # Accumulates the clusters returned by each request
    cluster_unique_ids = set()  # Unique names of clusters loaded from NetBox

    # Collect the clusters
    headers = {
        'Accept': 'application/json',
        'Authorization': 'Token 6a768e6363830a536ffa07abf261c1d64d365b9a'
    }

    parameters = {
        'limit': 5,
        'offset': 0
    }
    while True:
        response = requests.get('https://demo.netbox.dev/api/virtualization/clusters/',
                                headers=headers, params=parameters)
        response.raise_for_status()  # Stop instead of looping forever on an error response
        data = response.json()
        print(f'Collected clusters from NetBox with offset: {parameters["offset"]}, limit: {parameters["limit"]}')
        cluster_list.extend(data['results'])
        parameters['offset'] += parameters['limit']
        if not data['next']:
            break

    # Check whether any duplicates are present among the collected clusters
    for cluster in cluster_list:
        if cluster['name'] not in cluster_unique_ids:
            cluster_unique_ids.add(cluster['name'])
        else:
            print(f'Duplicate Cluster. Name: {cluster["name"]}, ID: {cluster["id"]}')


if __name__ == '__main__':
    main()

To reproduce in Postman:
Issue a GET request to the following path:
https://demo.netbox.dev/api/virtualization/clusters/?limit=5&offset=5
and then issue again with
https://demo.netbox.dev/api/virtualization/clusters/?limit=5&offset=30

You will see that the cluster with ID 10 appears in both responses. The duplicated ID may vary between runs.

The example above uses demo.netbox.dev, but we experience the same behavior on our self-hosted, on-prem instance. With 150 clusters we see about 30-35 duplicates.
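Until the server-side ordering is fixed, affected clients can defend themselves by deduplicating on the primary key while paging. A minimal sketch (the `merge_pages` helper is hypothetical, not part of the report):

```python
def merge_pages(pages):
    """Flatten paginated 'results' lists, dropping duplicates by 'id'.

    When the server pages an unordered queryset with LIMIT/OFFSET, the
    same row can appear on two pages; this keeps the first occurrence
    and collects the repeats for inspection.
    """
    seen_ids = set()
    merged = []
    duplicates = []
    for page in pages:
        for item in page:
            if item["id"] in seen_ids:
                duplicates.append(item)
            else:
                seen_ids.add(item["id"])
                merged.append(item)
    return merged, duplicates
```

Note this only hides the symptom: rows can also be *skipped* between pages for the same reason, which no client-side filter can recover.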

Expected Behavior

Expect not to have duplicates returned as we page through the clusters.

Observed Behavior

Clusters with duplicate IDs are present in the responses.

We see that a few IDs are duplicated in the responses:

Collected clusters from Netbox with offset: 0, limit: 5
Collected clusters from Netbox with offset: 5, limit: 5
Collected clusters from Netbox with offset: 10, limit: 5
Collected clusters from Netbox with offset: 15, limit: 5
Collected clusters from Netbox with offset: 20, limit: 5
Collected clusters from Netbox with offset: 25, limit: 5
Collected clusters from Netbox with offset: 30, limit: 5
Duplicate Cluster. Name: gc-us-west1, ID: 10
Duplicate Cluster. Name: gc-europe-west4, ID: 22
@dankotrajkovic dankotrajkovic added status: needs triage This issue is awaiting triage by a maintainer type: bug A confirmed report of unexpected behavior in the application labels Feb 25, 2025
@bctiemann bctiemann added status: needs owner This issue is tentatively accepted pending a volunteer committed to its implementation severity: medium Results in substantially degraded or broken functionality for specific workflows and removed status: needs triage This issue is awaiting triage by a maintainer labels Feb 27, 2025
@bctiemann
Contributor

This seems fairly high severity as the API pagination ought to be predictable and orderly. Is this reproducible in any other models?

@dankotrajkovic
Author

In our local environment, we can reproduce this with the IPAddress model. That is where we initially found it, but we could not reproduce it on the public NetBox instance, so we held back from raising the issue.

We thought the duplicates were caused by having 500,000 IPAddresses in NetBox across various VRFs. But even then the duplicates were not severe: fetching with limit=1000 (so 500 pages), we were getting only about 10 duplicates. That was still very problematic for our code, and we had to write methods to recover from it; luckily, on the Cluster model the issue is severe enough that it should lead to quicker discovery of the problem. It's possible we are doing something wrong, but either way, knowing how to fix this would really help us.

@atownson
Contributor

I have seen this issue as well, when performing GET requests for Services.

@bctiemann bctiemann self-assigned this Mar 4, 2025
@bctiemann
Contributor

It looks like the issue is just that Django isn't obeying the model's ordering setting when annotation is applied to the queryset, i.e. in the case of ClusterViewSet:

queryset = Cluster.objects.prefetch_related('virtual_machines').annotate(
    allocated_vcpus=Sum('virtual_machines__vcpus'),
    allocated_memory=Sum('virtual_machines__memory'),
    allocated_disk=Sum('virtual_machines__disk'),
)

Note that ordering = ["name"] for Cluster:

In [29]: queryset = Cluster.objects.all()

In [30]: [(r.id, r.name) for r in queryset[0:10]]
Out[30]: 
[(9, 'DO-AMS3'),
 (8, 'DO-BLR1'),
 (7, 'DO-FRA1'),
 (6, 'DO-LON1'),
 (1, 'DO-NYC1'),
 (2, 'DO-NYC3'),
 (3, 'DO-SFO3'),
 (5, 'DO-SGP1'),
 (4, 'DO-TOR1'),
 (36, 'gc-asia-east1')]
In [27]: queryset = Cluster.objects.prefetch_related('virtual_machines').annotate(
    ...:         allocated_vcpus=Sum('virtual_machines__vcpus'),
    ...:         allocated_memory=Sum('virtual_machines__memory'),
    ...:         allocated_disk=Sum('virtual_machines__disk'),
    ...:     )

In [28]: [(r.id, r.name) for r in queryset[0:10]]
Out[28]: 
[(4, 'DO-TOR1'),
 (34, 'gc-asia-southeast1'),
 (40, 'gc-asia-northeast3'),
 (10, 'gc-us-west1'),
 (9, 'DO-AMS3'),
 (7, 'DO-FRA1'),
 (35, 'gc-asia-southeast2'),
 (38, 'gc-asia-northeast1'),
 (15, 'gc-us-east1'),
 (6, 'DO-LON1')]

If .order_by("name") is added to the custom queryset, it sorts predictably. Same if you add &ordering=name to the query parameters on the API call.
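The query-parameter workaround can also be applied from the client side without waiting for a release. A hedged sketch (the `build_page_url` helper is mine, not from the thread):

```python
from urllib.parse import urlencode


def build_page_url(base, limit, offset, ordering="name"):
    """Build a paged NetBox API URL with an explicit sort key.

    Passing 'ordering' pins a deterministic server-side sort, which is
    what makes LIMIT/OFFSET paging stable when the model's default
    ordering is lost (e.g. when the queryset is annotated).
    """
    query = urlencode({"limit": limit, "offset": offset, "ordering": ordering})
    return f"{base}?{query}"
```

For example, `build_page_url("https://demo.netbox.dev/api/virtualization/clusters/", 5, 0)` yields a request equivalent to adding `&ordering=name` by hand.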

https://code.djangoproject.com/ticket/32811

We may need to identify all the ViewSets that use annotation in this way and add explicit ordering to the queryset statement.
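The underlying reason is plain SQL: without an ORDER BY, the database makes no guarantee about row order, so consecutive LIMIT/OFFSET windows may overlap or skip rows. A minimal, self-contained illustration (SQLite standing in for NetBox's database; table and names are invented) that an explicit ORDER BY makes offset paging cover each row exactly once:

```python
import sqlite3

# In-memory table of 15 fake clusters, mimicking paged API reads.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cluster (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO cluster (id, name) VALUES (?, ?)",
    [(i, f"cluster-{i:03d}") for i in range(1, 16)],
)


def fetch_page(limit, offset):
    # The ORDER BY pins a total order, so each LIMIT/OFFSET window is
    # a disjoint slice; drop it and the slices are no longer guaranteed
    # to be disjoint or exhaustive.
    return conn.execute(
        "SELECT id FROM cluster ORDER BY name LIMIT ? OFFSET ?",
        (limit, offset),
    ).fetchall()


ids = [row[0] for off in (0, 5, 10) for row in fetch_page(5, off)]
assert len(ids) == len(set(ids))  # no duplicates across pages
```

This matches the observed fix: restoring an explicit `order_by()` on the annotated queryset restores disjoint pages.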
