[prev in list] [next in list] [prev in thread] [next in thread]
List: postgresql-general
Subject: Re: Query plan: SELECT vs INSERT from same select
From: Alban Hertroys <haramrae () gmail ! com>
Date: 2019-07-24 9:24:23
Message-ID: 346DB5C5-3B26-4FA6-9B17-93FDED88A738 () gmail ! com
[Download RAW message or body]
> On 23 Jul 2019, at 22:29, Alexander Voytsekhovskyy <young.inbox@gmail.com> wrote:
>
> I have quite complicated query:
>
> SELECT axis_x1, axis_y1, SUM(delivery_price) as v_1 FROM (
> SELECT to_char(delivery_data.delivery_date, 'YYYY-MM') as axis_x1, \
> clients.id_client as axis_y1, delivery_data.amount * production_price.price * \
> groups.discount as delivery_price
> FROM delivery_data
> JOIN client_tt ON (client_tt.id_client_tt = delivery_data.id_client_tt)
> JOIN clients ON (client_tt.id_client = clients.id_client)
> JOIN production ON (production.id = delivery_data.id_product)
> JOIN groups ON (groups.id = delivery_data.delivery_group_id AND client_tt.id_group \
> = groups.id AND groups.id = clients.id_group)
Are client_tt.id_group and clients.id_group ever different from each other? It looks \
like you might have redundant information there, but... If they are guaranteed to be \
the same then you don't need the JOIN to clients, which would both remove a JOIN and \
reduce the complexity of the JOIN condition on groups.
Or (assuming the group id's are indeed supposed to be equal), you could
JOIN clients ON (client_tt.id_client = clients.id_client AND client_tt.id_group = \
clients.id_group) instead of putting that condition within the JOIN condition on \
groups.
I don't think either option will make a huge difference (the first probably more than \
the second, as it reduces an entire join), but it could be enough to help the \
database figure out a better plan.
> LEFT JOIN production_price on (delivery_data.id_product = \
> production_price.id_production AND groups.price_list_id = \
> production_price.price_list_id AND delivery_data.delivery_date BETWEEN \
> production_price.date_from AND production_price.date_to)
> WHERE delivery_data.delivery_date between '2019-03-01' AND '2019-06-30'
> AND delivery_data.delivery_group_id IN (...short list of values...)
> AND delivery_data.id_product IN ()) AS tmpsource
You don't have a price if your goods weren't produced in the delivery window you set? \
Or do you have goods that get delivered without having a price?
You seem to be using this query for a report on nett sales by month, but I have my \
doubts whether that LEFT JOIN, and especially the condition on the production date \
window, is really what you want: Your formula for delivery_price includes the price \
column from that LEFT JOIN, so you're going to get 0 values when there is no \
production_price record in your delivery-window, resulting in a SUM that's too low if \
the product was produced before (or after, but that seems unlikely) the delivery \
window.
> WHERE TRUE
This line is unnecessary.
> GROUP BY GROUPING SETS ((axis_x1, axis_y1), (axis_x1), (axis_y1), ())
Apparently (I'm new to these statements), CUBE (axis_x1, axis_y1) is a shorthand for \
the above. They seem to have been introduced at the same time (in 9.6?). See: \
https://www.postgresql.org/docs/11/queries-table-expressions.html#QUERIES-GROUPING-SETS
> It runs well, took 1s and returns 4000 rows.
I won't go into the performance issue ash this point, other more knowledgeable people \
already did.
Regards,
Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic