'Re: Range search on primary key does not use index scan'

[prev in list] [next in list] [prev in thread] [next in thread] 

List:       postgresql-general
Subject:    Re: Range search on primary key does not use index scan
From:       Tom Lane <tgl () sss ! pgh ! pa ! us>
Date:       2020-12-24 1:43:08
Message-ID: 750937.1608774188 () sss ! pgh ! pa ! us
[Download RAW message or body]

Shantanu Shekhar <shekharshan@yahoo.com> writes:
> (2) Here I am searching for employee_id < 123. I was expecting the plan would use \
> the index on employees_pk to find all leaf nodes where employee_id < 123 and then \
> issue read of table blocks for each of the matching entries in the index leaf. But \
> looks like the query plan has decided on using full table scan instead and not \
> using the index. Any ideas why is this happening? testdb# explain select \
> first_name, last_name from testschema.employees where employee_id < 123;            \
> QUERY PLAN                                        \
> ════════════════════════════ \
> ══════════════════════════════════ \
> Seq Scan on employees   (cost=0.00..10.62 rows=17 width=1032)     Filter: \
> (employee_id < 123)(2 rows)

In the absence of any statistics (and since you have no data in the table,
there are no stats either), the default assumption about an inequality
condition is that it's not very selective --- I think it's assumed to
select one-third of the table, but am too lazy to go check that right now.
In any case, it's enough of the table to discourage use of an indexscan.
With Postgres' normal cost settings, a potentially indexable condition
has to be estimated to select just a few percent of the table, else a
seqscan is going to look cheaper.

If you'd used an actual range condition (which I take to be a BETWEEN
clause or equivalent), you probably would have gotten an indexscan plan,
because the default estimate for that is a lot tighter.  But a one-sided
inequality can't reasonably be assumed to be selecting just a small
part of the table without any evidence to back that.  Now, if you'd
populated and analyzed the table, the planner might discover that only
a few percent of the rows have employee_id < 123 (if you've got many
thousands of employees), and then it'd pick an indexscan for the
query as it stands.

Taking a few steps back here, the way you are testing things is not going
to do anything except mislead you.  Queries on empty tables that lack any
statistics are unlikely to produce the same plans as queries on populated,
analyzed tables.  See for instance the advice at

https://www.postgresql.org/docs/current/using-explain.html#USING-EXPLAIN-CAVEATS

			regards, tom lane


[prev in list] [next in list] [prev in thread] [next in thread]
Configure | About | News | Add a list | Sponsored by KoreLogic