Skip to content

Missing 'distinct' clause in subquery #276

@nyordanoff

Description

@nyordanoff

Hello,

I noticed that distinct clause is missing in a subquery which results in many returned duplicate records. The setup is as follows:

I have two entities - SystemInstance:

@Entity(name = "systemInstance")
@Table(name = "tenants_apps")
public class SystemInstanceEntity {
    ...
    @OneToMany(mappedBy = "systemInstance", fetch = FetchType.LAZY)
    private Set<BundleEntity> consumptionBundles;
    ...
}

and Bundle:

@Entity(name = "consumptionBundle")
@Table(name = "tenants_bundles")
public class BundleEntity {
    ...
    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "app_id", insertable = false, updatable = false)
    private SystemInstanceEntity systemInstance;
    ...
}

For testing purposes, let's say we have 1 SystemInstance which has 2 Bundles.

When I execute: /systemInstances?$expand=consumptionBundles, these are some of the generated queries:

[EL Fine]: sql: 2024-02-26 14:42:53.903--ServerSession(1700334050)--Connection(79428551)--Thread(Thread[http-nio-8080-exec-3,5,main])--SELECT DISTINCT local_tenant_id, product_type, application_namespace, fa_formation_id, formation_type_id, description, assignment_id, id, system_number, base_url, name FROM tenants_apps WHERE (((target_id = ?) AND ((formation_id = ?) OR (formation_id = ?))) AND (tenant_id = ?))
	bind => [0ac08747-bd76-4697-8a2d-4a8c851c052a, 53d874d3-f80c-4572-9a0d-209d51f5e46f, acb93a87-5b10-4b2d-9c79-ad53454acba1, bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb]
[EL Fine]: sql: 2024-02-26 14:42:53.908--ServerSession(1700334050)--Connection(1141354648)--Thread(Thread[http-nio-8080-exec-3,5,main])--SELECT t0.name, t0.ord_id, t0.description, t0.local_tenant_id, t0.app_id, t0.id, t0.short_description, t0.version, t0.last_update FROM tenants_bundles t0, tenants_apps t1 WHERE (((((t1.target_id = ?) AND ((t1.formation_id = ?) OR (t1.formation_id = ?))) AND (t1.tenant_id = ?)) AND (((t0.formation_id = ?) OR (t0.formation_id = ?)) AND (t0.tenant_id = ?))) AND (t0.app_id = t1.id)) ORDER BY t0.app_id ASC
	bind => [0ac08747-bd76-4697-8a2d-4a8c851c052a, 53d874d3-f80c-4572-9a0d-209d51f5e46f, acb93a87-5b10-4b2d-9c79-ad53454acba1, bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb, 53d874d3-f80c-4572-9a0d-209d51f5e46f, acb93a87-5b10-4b2d-9c79-ad53454acba1, bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb]

We can see that the second query is missing distinct clause. How is this a problem?

Well, in our testing example, tenants_apps have 2 matching records and tenants_bundles have 4 matching records (due to the various where clauses generated from @EdmProtectedBy). Then full combinations will be made which will result in 8 records. That's why in the response from the ODATA query we have 8 consumptionBundles instead of 2. Most of them are duplicates:

{
    "@odata.context": "$metadata#systemInstances(consumptionBundles())",
    "value": [
        {
            "description": null,
            "title": "system-instance-name",
            "id": "0ac08747-bd76-4697-8a2d-4a8c851c052a",
            "consumptionBundles": [
                {
                    "title": "bar",
                    "id": "454b2c49-ee20-4f2c-a1f2-b60c454e6e08"
                },
                {
                    "title": "bar",
                    "id": "454b2c49-ee20-4f2c-a1f2-b60c454e6e08"
                },
                {
                    "title": "foo",
                    "id": "814b2e02-abfa-428f-82b6-1484bb2e4125"
                },
                {
                    "title": "foo",
                    "id": "814b2e02-abfa-428f-82b6-1484bb2e4125"
                },
                {
                    "title": "bar",
                    "id": "454b2c49-ee20-4f2c-a1f2-b60c454e6e08"
                },
                {
                    "title": "bar",
                    "id": "454b2c49-ee20-4f2c-a1f2-b60c454e6e08"
                },
                {
                    "title": "foo",
                    "id": "814b2e02-abfa-428f-82b6-1484bb2e4125"
                },
                {
                    "title": "foo",
                    "id": "814b2e02-abfa-428f-82b6-1484bb2e4125"
                }
            ]
        }
    ]
}

Another interesting observation from the generated queries is that distinct clause is always present if the relation between the entities is modelled with:

    @ElementCollection
    @CollectionTable(name = "correlation_ids_applications", joinColumns = @JoinColumn(name = "application_id"))
    private List<ArrayElement> correlationIds;
[EL Fine]: sql: 2024-02-26 14:33:48.011--ServerSession(1843609566)--Connection(1086777970)--Thread(Thread[http-nio-8080-exec-3,5,main])--SELECT DISTINCT t0.id, t1.value FROM tenants_apps t0, correlation_ids_applications t1 WHERE ((((t0.target_id = ?) AND (t0.formation_id = ?)) AND (t0.tenant_id = ?)) AND (t1.application_id = t0.id)) ORDER BY t0.id ASC
        bind => [eeeeeeee-eeee-eeee-eeee-eeeeeeeeeeee, aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa, 3e64ebae-38b5-46a0-b1ed-9ccee153a0ae]

which made me wondering if the distinct clause is missing in the subqueries only if the relation is modelled with @OneToMany/@ManyToOne/@ManyToMany relations?

As a conclusion, the point of this issue is to figure out why distinct clause is missing from generated subqueries which results in duplicate records in the ODATA response.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions