Skip to content

$filter counting navigation property is slow #239

Closed
@wog48

Description

@wog48

Requests like GET .../AdministrativeDivisions?$top=1&$filter=Children/$count ge 1 can become slow depending on the amount of data and the choose database.

Such a request would generate a sql query like this:

select distinct t0."ParentCodeID" as a1,
	t0."ParentDivisionCode" as a2,
	t0."AlternativeCode" as a3,
	t0."CountryISOCode" as a4,
	t0."Population" as a5,
	t0."CodeID" as a6,
	t0."Area" as a7,
	t0."DivisionCode" as a8,
	t0."CodePublisher" as a9
from "AdministrativeDivision" t0
where exists ( select t1."CodePublisher"
	from "AdministrativeDivision" t1
	where (((t1."CodePublisher" = t0."CodePublisher")
			and (t1."ParentCodeID" = t0."CodeID"))
			and (t1."ParentDivisionCode" = t0."DivisionCode"))
	group by
		t1."CodePublisher",
		t1."ParentCodeID",
		t1."ParentDivisionCode"
	having (COUNT(t1."DivisionCode") >= 1))
order by
	t0."CodePublisher" asc,
	t0."CodeID" asc,
	t0."DivisionCode" asc
limit 1 offset 0

On some databases this is slow. As an alternative an IN clause instead of the EXISTS could be used:

select distinct t0."ParentCodeID" as a1,
	t0."ParentDivisionCode" as a2,
	t0."AlternativeCode" as a3,
	t0."CountryISOCode" as a4,
	t0."Population" as a5,
	t0."CodeID" as a6,
	t0."Area" as a7,
	t0."DivisionCode" as a8,
	t0."CodePublisher" as a9
from "OLINGO"."AdministrativeDivision" t0
where (t0."CodePublisher", t0."CodeID", t0."DivisionCode") in (
	select t1."CodePublisher",t1."ParentCodeID", t1."ParentDivisionCode"
	from "OLINGO"."AdministrativeDivision" t1
	group by t1."CodePublisher", 
t1."ParentCodeID",
		t1."ParentDivisionCode"
	having (COUNT(t1."DivisionCode") >= 1) )
order by
	t0."CodePublisher" asc,
	t0."CodeID" asc,
	t0."DivisionCode" asc
limit 1 offset 0

This is possible as EclipseLink is now supporting such IN clauses, which were not correctly generated at least till version 2.7.9.
The following table shall give an impression of the difference:

Database Number of Rows EXISTS IN
MariaDB ~22000 104 sec 0.031 sec
PostgreSQL ~22000 51.5 sec 0.012 sec
SAP HANA ~71000 0.023 sec 0.027 sec

(All the database run in parallel on my local machine)

Additional remark:
Surprisingly AdministrativeDivisions?$filter=Parent/Children/$count eq 2 takes onyl 1.5 sec on PostrgeSQL with EXISTS and takes 0,025 sec with IN

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions