I have a little web application that is using sqlite3 as its DB (the DB is fairly small).
Right now, I am generating some content to display using the following query:
    SELECT dbId, dlState, retreivalTime, seriesName, <snip irrelevant columns>
    FROM DataItems
    GROUP BY seriesName
    ORDER BY retreivalTime DESC
    LIMIT ? OFFSET ?;
The limit is typically ~200 and the offset is 0 (they drive a pagination mechanism).
Anyways, right now, this one query is completely killing my performance. It takes approximately 800 milliseconds to execute on a table with ~67K rows.
I have indexes on both seriesName and retreivalTime:

    sqlite> SELECT name FROM sqlite_master WHERE type='index' ORDER BY name;
    <snip irrelevant indexes>
    DataItems_seriesName_index
    DataItems_time_index    // This is the index on retreivalTime. Yeah, it's poorly named
EXPLAIN QUERY PLAN seems to indicate they're not being used:
    sqlite> EXPLAIN QUERY PLAN SELECT dbId, dlState, retreivalTime, seriesName FROM DataItems GROUP BY seriesName ORDER BY retreivalTime DESC LIMIT 200 OFFSET 0;
    0|0|0|SCAN TABLE DataItems
    0|0|0|USE TEMP B-TREE FOR GROUP BY
    0|0|0|USE TEMP B-TREE FOR ORDER BY
The index on seriesName is COLLATE NOCASE, if that's relevant.
If I drop the GROUP BY, it behaves as expected:

    sqlite> EXPLAIN QUERY PLAN SELECT dbId, dlState, retreivalTime, seriesName FROM DataItems ORDER BY retreivalTime DESC LIMIT 200 OFFSET 0;
    0|0|0|SCAN TABLE DataItems USING INDEX DataItems_time_index
Basically, my naive assumption would be that the best way to perform this query is to walk backwards from the latest value of retreivalTime, and every time a new value for seriesName is seen, append that row to a temporary list, finally returning that list. That would have somewhat poor performance for cases where OFFSET is large, but that happens very rarely in this application.
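To make the intent concrete, here is a rough Python sketch of the traversal I have in mind, against an in-memory stand-in for the table (the schema is simplified, and I'm assuming "the row per series" means the one with the newest retreivalTime, which plain GROUP BY doesn't actually guarantee):

```python
import sqlite3

# Tiny in-memory stand-in for DataItems (schema simplified).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DataItems (dbId INTEGER PRIMARY KEY, dlState INTEGER, "
    "retreivalTime INTEGER, seriesName TEXT COLLATE NOCASE)")
conn.execute("CREATE INDEX DataItems_time_index ON DataItems (retreivalTime)")
conn.executemany(
    "INSERT INTO DataItems VALUES (?, ?, ?, ?)",
    [(1, 0, 100, 'A'), (2, 0, 200, 'B'), (3, 0, 300, 'A'), (4, 0, 400, 'C')])

def newest_per_series(conn, limit, offset=0):
    # Walk the retreivalTime index backwards; keep only the first (newest)
    # row seen for each seriesName, then apply LIMIT/OFFSET at the end.
    seen = set()
    out = []
    cur = conn.execute(
        "SELECT dbId, dlState, retreivalTime, seriesName "
        "FROM DataItems ORDER BY retreivalTime DESC")
    for row in cur:
        key = row[3].lower()  # seriesName is COLLATE NOCASE
        if key not in seen:
            seen.add(key)
            out.append(row)
    return out[offset:offset + limit]

print(newest_per_series(conn, 200))
```

That is effectively what I want the query planner to do for me, without pulling every row into application code.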
How can I optimize this query? I can provide the raw query operations if needed.
Insert performance is not critical here, so if I need to create an additional index or two, that's fine.
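For instance, I'd be fine adding a composite index over both columns; a quick sanity check of the plan with Python's sqlite3 module (the index name here is just my placeholder):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DataItems (dbId INTEGER PRIMARY KEY, dlState INTEGER, "
    "retreivalTime INTEGER, seriesName TEXT COLLATE NOCASE)")
# Hypothetical composite index: seriesName first so the GROUP BY can walk it,
# retreivalTime second so the newest entry per series is adjacent.
conn.execute(
    "CREATE INDEX DataItems_series_time_index "
    "ON DataItems (seriesName COLLATE NOCASE, retreivalTime)")
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT dbId, dlState, retreivalTime, seriesName "
    "FROM DataItems GROUP BY seriesName "
    "ORDER BY retreivalTime DESC LIMIT 200 OFFSET 0").fetchall()
for row in plan:
    print(row)  # exact plan text varies with the SQLite version
```

I haven't yet checked whether that actually removes both temp b-trees on the real data.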
My current thought is a commit-hook that updates a separate table used to track only unique items, but that seems like overkill.
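Something like the following is what I mean (table and trigger names are just placeholders): a trigger keeps a one-row-per-series summary table current on every insert, so the page query only ever touches unique series:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DataItems (dbId INTEGER PRIMARY KEY, dlState INTEGER,
                        retreivalTime INTEGER, seriesName TEXT COLLATE NOCASE);

-- Placeholder summary table: the newest row per seriesName.
CREATE TABLE LatestItems (seriesName TEXT COLLATE NOCASE PRIMARY KEY,
                          dbId INTEGER, dlState INTEGER, retreivalTime INTEGER);

-- Placeholder trigger: evict any older entry for this series, then insert
-- the new row (INSERT OR IGNORE leaves a still-newer entry untouched).
CREATE TRIGGER DataItems_latest AFTER INSERT ON DataItems
BEGIN
    DELETE FROM LatestItems
     WHERE seriesName = NEW.seriesName AND retreivalTime <= NEW.retreivalTime;
    INSERT OR IGNORE INTO LatestItems
    VALUES (NEW.seriesName, NEW.dbId, NEW.dlState, NEW.retreivalTime);
END;
""")
conn.executemany(
    "INSERT INTO DataItems VALUES (?, ?, ?, ?)",
    [(1, 0, 100, 'A'), (2, 0, 200, 'B'), (3, 0, 300, 'A')])
page = conn.execute(
    "SELECT dbId, dlState, retreivalTime, seriesName FROM LatestItems "
    "ORDER BY retreivalTime DESC LIMIT 200 OFFSET 0").fetchall()
print(page)
```

That keeps the pagination query trivial, at the cost of extra bookkeeping on every insert, which is why it feels like overkill for a table this size.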