Andrus schrieb: > Richard, > >> These are the same but the times are different. I'd be very surprised if >> you can reproduce these times reliably. > > I re-tried today again and got same results: in production database > pattern query is many times slower that equality query. > toode and rid base contain only single product starting with 99000010 > So both queries should scan exactly same numbers of rows. > >> Can I give you some wider-ranging suggestions Andrus? >> 1. Fix the vacuuming issue in your hash-join question. > > I have ran VACUUM FULL VERBOSE ANALYSE and set max_fsm_pages=150000 > So issue is fixed before those tests. > >> 2. Monitor the system to make sure you know if/when disk activity is >> high. > > I optimized this system. Now there are short (some seconds) sales > queries about after every 5 - 300 seconds which cause few disk > activity and add few new rows to some tables. > I havent seen that this activity affects to this test result. > >> 3. *Then* start to profile individual queries and look into their plans. >> Change the queries one at a time and monitor again. > > How to change pattern matching query to faster ? > > Andrus. > > Btw. > > I tried to reproduce this big difference in test server in 8.3 using > sample data script below and got big difference but in opposite > direction. > > explain analyze SELECT sum(1) > FROM orders > JOIN orders_products USING (order_id) > JOIN products USING (product_id) > WHERE orders.order_date>'2006-01-01' and ... > > different where clauses produce different results: > > AND orders_products.product_id = '3370000000000000' -- 880 .. 926 ms > AND orders_products.product_id like '3370000000000000%' -- 41 ..98 ms > > So patter index is 10 .. 20 times (!) faster always. > No idea why. > > Test data creation script: > > begin; > CREATE OR REPLACE FUNCTION Counter() RETURNS int IMMUTABLE AS > $_$ > SELECT 3500000; > $_$ LANGUAGE SQL; > > CREATE TEMP TABLE orders (order_id INTEGER NOT NULL, order_date DATE > NOT NULL); > CREATE TEMP TABLE products (product_id CHAR(20) NOT NULL, product_name > char(70) NOT NULL, quantity numeric(12,2) default 1); > CREATE TEMP TABLE orders_products (order_id INTEGER NOT NULL, > product_id CHAR(20), > id serial, price numeric(12,2) default 1 ); > > INSERT INTO products SELECT (n*power( 10,13))::INT8::CHAR(20), > 'product number ' || n::TEXT FROM generate_series(0,13410) AS n; > > INSERT INTO orders > SELECT n,'2005-01-01'::date + (4000.0 * n/Counter() * '1 DAY'::interval) > FROM generate_series(0, Counter()/3 ) AS n; > > SET work_mem TO 2097151; > > INSERT INTO orders_products SELECT > generate_series/3 as order_id, > ( (1+ (generate_series % 13410))*power( 10,13))::INT8::CHAR(20) AS > product_id > FROM generate_series(1, Counter()); > > ALTER TABLE orders ADD PRIMARY KEY (order_id); > ALTER TABLE products ADD PRIMARY KEY (product_id); > ALTER TABLE orders_products ADD PRIMARY KEY (id); > > ALTER TABLE orders_products ADD FOREIGN KEY (product_id) REFERENCES > products(product_id); > ALTER TABLE orders_products ADD FOREIGN KEY (order_id) REFERENCES > orders(order_id) ON DELETE CASCADE; > > CREATE INDEX orders_date ON orders( order_date ); > CREATE INDEX order_product_pattern_idx ON orders_products( product_id > bpchar_pattern_ops ); > > COMMIT; > SET work_mem TO DEFAULT; > ANALYZE; > No wonder that = compares bad, you created the index this way: CREATE INDEX order_product_pattern_idx ON orders_products( product_id bpchar_pattern_ops ); why not: CREATE INDEX order_product_pattern_idx ON orders_products( product_id); explain analyze SELECT sum(1) FROM orders JOIN orders_products USING (order_id) JOIN products USING (product_id) WHERE orders.order_date>'2006-01-01' AND orders_products.product_id = '3370000000000000'; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=3013.68..3013.69 rows=1 width=0) (actual time=8.206..8.207 rows=1 loops=1) -> Nested Loop (cost=10.83..3013.21 rows=185 width=0) (actual time=2.095..7.962 rows=189 loops=1) -> Index Scan using products_pkey on products (cost=0.00..8.27 rows=1 width=18) (actual time=0.036..0.038 rows=1 loops=1) Index Cond: ((product_id)::text = '3370000000000000'::text) -> Nested Loop (cost=10.83..3003.09 rows=185 width=18) (actual time=2.052..7.474 rows=189 loops=1) -> Bitmap Heap Scan on orders_products (cost=10.83..949.68 rows=253 width=22) (actual time=0.161..0.817 rows=261 loops=1) Recheck Cond: ((product_id)::text = '3370000000000000'::text) -> Bitmap Index Scan on foo (cost=0.00..10.76 rows=253 width=0) (actual time=0.116..0.116 rows=261 loops=1) Index Cond: ((product_id)::text = '3370000000000000'::text) -> Index Scan using orders_pkey on orders (cost=0.00..8.10 rows=1 width=4) (actual time=0.020..0.021 rows=1 loops=261) Index Cond: (orders.order_id = orders_products.order_id) Filter: (orders.order_date > '2006-01-01'::date) Total runtime: 8.268 ms -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance