perf: Optimize Grafana query for trip view to leverage indexes more effectively #4964
Nice find, and thanks for your contribution!
|
🚀 Nice finding! This one looks even better to me, and it may be easier to read and understand. @jaypark0006, could you retest with this one?
SELECT floor(extract(epoch from date) / 5) * 5 AS time,
  avg(latitude) AS latitude,
  avg(longitude) AS longitude
from positions
where
  car_id = '2'
  and date BETWEEN '2025-09-22T16:00:00Z' AND '2025-09-23T16:00:00Z'
  and (drive_id is null or drive_id in (select id from drives where start_date BETWEEN '2025-09-22T16:00:00Z' AND '2025-09-23T16:00:00Z'))
GROUP BY 1
ORDER BY 1 ASC
Hi, sorry for the long wait, I finally got the rest today. Your version's condition differs from the original one. In the original, date BETWEEN '2025-09-22T16:00:00Z' AND '2025-09-23T16:00:00Z' is not a global condition applied to every row (see #4791). I believe Matthias Wirtz wanted to capture all positions of a drive even if the drive itself extends beyond this time range.
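To make the difference concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than PostgreSQL. The simplified schema and the three sample rows are hypothetical and only illustrate the point: a drive that starts inside the window but ends after it loses its later positions when the date filter is applied globally.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE drives (id INTEGER PRIMARY KEY, start_date TEXT);
CREATE TABLE positions (id INTEGER PRIMARY KEY, car_id INTEGER, drive_id INTEGER, date TEXT);
-- Hypothetical drive: starts inside the window, ends after it.
INSERT INTO drives VALUES (1, '2025-09-23T15:50:00Z');
INSERT INTO positions (id, car_id, drive_id, date) VALUES
  (1, 2, 1,    '2025-09-23T15:55:00Z'),  -- drive position inside the window
  (2, 2, 1,    '2025-09-23T16:05:00Z'),  -- same drive, after the window ends
  (3, 2, NULL, '2025-09-23T12:00:00Z');  -- logged while not driving
""")
params = {"s": "2025-09-22T16:00:00Z", "e": "2025-09-23T16:00:00Z"}

# Date filter applied to every row (the shape of the suggested rewrite).
global_filter = conn.execute("""
    SELECT count(*) FROM positions
    WHERE car_id = 2
      AND date BETWEEN :s AND :e
      AND (drive_id IS NULL
           OR drive_id IN (SELECT id FROM drives WHERE start_date BETWEEN :s AND :e))
""", params).fetchone()[0]

# Date filter applied only to positions logged while not driving.
per_branch = conn.execute("""
    SELECT count(*) FROM positions
    WHERE car_id = 2
      AND ((drive_id IS NULL AND date BETWEEN :s AND :e)
           OR drive_id IN (SELECT id FROM drives WHERE start_date BETWEEN :s AND :e))
""", params).fetchone()[0]

print(global_filter, per_branch)
```

With these rows the global filter drops position 2, while the per-branch filter keeps the whole drive.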
|
Adding more detail on why the conditions differ and what we can do to keep the semantics while improving index usage.
Before PR version:
It's me ;) Yes, you're absolutely right. I wanted to avoid showing different data in different panels: since the other panels filter drives by start date only, I wanted the map to show the full drive positions as well.
with unioned_positions as (
  -- fetch all positions based on start_date of drives so the map aligns with data shown in other panels
  select p.* from positions p
  inner join drives d on p.drive_id = d.id
  where p.car_id = '2' and start_date between '2025-09-22T16:00:00Z' and '2025-09-23T16:00:00Z'
  union all
  -- get all positions logged while not driving
  select * from positions p
  where p.car_id = '2' and drive_id is null and date between '2025-09-22T16:00:00Z' and '2025-09-23T16:00:00Z'
)
SELECT floor(extract(epoch from date) / 5) * 5 AS time,
  avg(latitude) AS latitude,
  avg(longitude) AS longitude
from unioned_positions
GROUP BY 1
ORDER BY 1 ASC
This one should work just fine, right?
|
@swiffer Yes, your last version is correct. I think it works well in my database and also avoids the double grouping.
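As a side note on why avoiding the double grouping can matter beyond readability: if each branch is averaged per 5-second bucket first and the partial results are averaged again, a bucket containing rows from both branches becomes an average of averages, which weights the rows differently from a single grouping. The sketch below uses plain Python and made-up sample values (not data from this PR) to show the effect.

```python
from collections import defaultdict
from statistics import mean


def bucket(epoch, width=5):
    """Mirror floor(extract(epoch from date) / 5) * 5 from the query."""
    return int(epoch // width) * width


def avg_by_bucket(rows):
    """Group (epoch, value) pairs into buckets and average each bucket."""
    groups = defaultdict(list)
    for epoch, value in rows:
        groups[bucket(epoch)].append(value)
    return {b: mean(values) for b, values in groups.items()}


# Hypothetical samples that all fall into the bucket starting at epoch 100.
drive_rows = [(100, 10.0), (101, 10.0), (102, 10.0)]  # positions of a drive
idle_rows = [(104, 50.0)]                             # position logged while not driving

# Single grouping over the unioned rows (the CTE shape).
single = avg_by_bucket(drive_rows + idle_rows)

# Double grouping: average each branch, then average the partial results.
partials = list(avg_by_bucket(drive_rows).items()) + list(avg_by_bucket(idle_rows).items())
double = avg_by_bucket(partials)

print(single, double)
```

The two approaches agree whenever no bucket receives rows from both branches, so whether they differ in practice depends on the data.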
Signed-off-by: jaypark0006 <[email protected]>
|
Hi @swiffer I tested your new SQL with the same dataset. The results are consistent, and the performance is also the same. Could you please review it again?
double group
Sort (cost=40470.08..40470.58 rows=200 width=96) (actual time=24.210..24.256 rows=1054 loops=1)
  Sort Key: ((floor((EXTRACT(epoch FROM p.date) / '5'::numeric)) * '5'::numeric))
  Sort Method: quicksort Memory: 102kB
  -> HashAggregate (cost=40459.44..40462.44 rows=200 width=96) (actual time=23.123..23.862 rows=1054 loops=1)
        Group Key: ((floor((EXTRACT(epoch FROM p.date) / '5'::numeric)) * '5'::numeric))
        Batches: 1 Memory Usage: 849kB
        -> Append (cost=39673.67..40319.40 rows=18672 width=96) (actual time=20.918..22.058 rows=1056 loops=1)
              -> HashAggregate (cost=39673.67..40139.37 rows=18628 width=96) (actual time=20.917..21.750 rows=971 loops=1)
                    Group Key: (floor((EXTRACT(epoch FROM p.date) / '5'::numeric)) * '5'::numeric)
                    Batches: 1 Memory Usage: 1169kB
                    -> Nested Loop (cost=0.43..39533.96 rows=18628 width=48) (actual time=0.048..15.015 rows=14747 loops=1)
                          -> Seq Scan on drives d (cost=0.00..130.22 rows=6 width=4) (actual time=0.025..0.226 rows=3 loops=1)
                                Filter: ((start_date >= '2025-09-22 16:00:00'::timestamp without time zone) AND (start_date <= '2025-09-23 16:00:00'::timestamp without time zone))
                                Rows Removed by Filter: 2599
                          -> Index Scan using positions_drive_id_date_index on positions p (cost=0.43..6496.48 rows=3976 width=28) (actual time=0.012..2.349 rows=4916 loops=3)
                                Index Cond: (drive_id = d.id)
                                Filter: (car_id = '1'::smallint)
              -> HashAggregate (cost=85.57..86.67 rows=44 width=96) (actual time=0.185..0.239 rows=85 loops=1)
                    Group Key: (floor((EXTRACT(epoch FROM p_1.date) / '5'::numeric)) * '5'::numeric)
                    Batches: 1 Memory Usage: 88kB
                    -> Index Scan using positions_drive_id_date_index on positions p_1 (cost=0.43..85.24 rows=44 width=48) (actual time=0.029..0.121 rows=85 loops=1)
                          Index Cond: ((drive_id IS NULL) AND (date >= '2025-09-22 16:00:00'::timestamp without time zone) AND (date <= '2025-09-23 16:00:00'::timestamp without time zone))
                          Filter: (car_id = '1'::smallint)
Planning Time: 0.442 ms
Execution Time: 24.594 ms
cte version
Sort (cost=40238.68..40239.18 rows=200 width=96) (actual time=28.572..28.618 rows=1054 loops=1)
  Sort Key: ((floor((EXTRACT(epoch FROM "*SELECT* 1".date) / '5'::numeric)) * '5'::numeric))
  Sort Method: quicksort Memory: 102kB
  -> HashAggregate (cost=40226.04..40231.04 rows=200 width=96) (actual time=27.381..28.203 rows=1054 loops=1)
        Group Key: (floor((EXTRACT(epoch FROM "*SELECT* 1".date) / '5'::numeric)) * '5'::numeric)
        Batches: 1 Memory Usage: 849kB
        -> Result (cost=0.43..40086.00 rows=18672 width=48) (actual time=0.059..20.438 rows=14832 loops=1)
              -> Append (cost=0.43..39712.56 rows=18672 width=24) (actual time=0.051..12.072 rows=14832 loops=1)
                    -> Subquery Scan on "*SELECT* 1" (cost=0.43..39533.96 rows=18628 width=24) (actual time=0.051..11.121 rows=14747 loops=1)
                          -> Nested Loop (cost=0.43..39347.68 rows=18628 width=200) (actual time=0.050..9.801 rows=14747 loops=1)
                                -> Seq Scan on drives d (cost=0.00..130.22 rows=6 width=4) (actual time=0.028..0.222 rows=3 loops=1)
                                      Filter: ((start_date >= '2025-09-22 16:00:00'::timestamp without time zone) AND (start_date <= '2025-09-23 16:00:00'::timestamp without time zone))
                                      Rows Removed by Filter: 2599
                                -> Index Scan using positions_drive_id_date_index on positions p (cost=0.43..6496.48 rows=3976 width=28) (actual time=0.012..2.402 rows=4916 loops=3)
                                      Index Cond: (drive_id = d.id)
                                      Filter: (car_id = '1'::smallint)
                    -> Subquery Scan on "*SELECT* 2" (cost=0.43..85.24 rows=44 width=24) (actual time=0.025..0.089 rows=85 loops=1)
                          -> Index Scan using positions_drive_id_date_index on positions p_1 (cost=0.43..84.80 rows=44 width=200) (actual time=0.024..0.080 rows=85 loops=1)
                                Index Cond: ((drive_id IS NULL) AND (date >= '2025-09-22 16:00:00'::timestamp without time zone) AND (date <= '2025-09-23 16:00:00'::timestamp without time zone))
                                Filter: (car_id = '1'::smallint)
Planning Time: 0.698 ms
Execution Time: 28.753 ms
|
Perfect, thanks for checking and adapting. Nice outcome, and ready to be merged!
…ffectively (#4964) * refactor: Optimize Grafana query for trip view * refactor: avoid double group by using CTE Signed-off-by: jaypark0006 <[email protected]> --------- Signed-off-by: jaypark0006 <[email protected]> Co-authored-by: qilei.riley <[email protected]>
This PR replaces the original positions aggregation query that used OR + subquery with a UNION ALL based query shape. The new form preserves the same result but allows PostgreSQL to leverage indexes more effectively.
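The equivalence claimed above relies on the two branches being disjoint: a row either has a NULL drive_id or belongs to a drive, so UNION ALL cannot return a row twice. A minimal sketch of that check with Python's built-in sqlite3 module follows. The simplified schema, the sample rows, and the exact OR form are assumptions made for illustration; they are not the PR's actual old query, and SQLite's planner does not reflect PostgreSQL's index behaviour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE drives (id INTEGER PRIMARY KEY, start_date TEXT);
CREATE TABLE positions (id INTEGER PRIMARY KEY, car_id INTEGER, drive_id INTEGER, date TEXT);
-- Hypothetical data: drive 1 starts inside the window, drive 2 before it.
INSERT INTO drives VALUES (1, '2025-09-23T15:50:00Z'), (2, '2025-09-21T08:00:00Z');
INSERT INTO positions (id, car_id, drive_id, date) VALUES
  (1, 2, 1,    '2025-09-23T15:55:00Z'),
  (2, 2, 1,    '2025-09-23T16:05:00Z'),
  (3, 2, 2,    '2025-09-21T08:05:00Z'),
  (4, 2, NULL, '2025-09-23T12:00:00Z'),
  (5, 2, NULL, '2025-09-24T12:00:00Z');
""")
params = {"s": "2025-09-22T16:00:00Z", "e": "2025-09-23T16:00:00Z"}

# Assumed OR + subquery shape: one scan, two alternative conditions.
or_form = sorted(row[0] for row in conn.execute("""
    SELECT p.id FROM positions p
    WHERE p.car_id = 2
      AND ((p.drive_id IS NULL AND p.date BETWEEN :s AND :e)
           OR p.drive_id IN (SELECT id FROM drives WHERE start_date BETWEEN :s AND :e))
""", params))

# UNION ALL shape: each branch has a simple predicate of its own.
union_form = sorted(row[0] for row in conn.execute("""
    SELECT p.id FROM positions p
    JOIN drives d ON p.drive_id = d.id
    WHERE p.car_id = 2 AND d.start_date BETWEEN :s AND :e
    UNION ALL
    SELECT p.id FROM positions p
    WHERE p.car_id = 2 AND p.drive_id IS NULL AND p.date BETWEEN :s AND :e
""", params))

print(or_form, union_form)
```

Both forms select the same positions here; the benefit described in this PR comes from how PostgreSQL plans each branch, which this sketch does not measure.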
old version:
new version:
for instance:
old version:
new version:
explain:
old version:
new version: