/    Sign up×
Community /Pin to ProfileBookmark

LEFT JOIN distorting SUM results

I have a table of invoices with costs associated.
I have a linked table with misc costs as well
relationship is 1 to many misc costs

When I do a SUM() and multiple rows exist on RIGHT table, the values in the LEFT table are multiplied by the number of rows found.

DB Fiddle: https://www.db-fiddle.com/f/9CKGRShXPoT3k4Wcm83fVD/5

Original query with incorrect results for LABOUR, PARTS, POLS and SUBLET:

“`
SELECT SUM(ad.labour_cost) AS LABOUR,
SUM(ad.part_cost) AS PARTS,
SUM(ad.pol_cost) AS POLS,
SUM(ad.sublet_cost) AS SUBLET,
SUM(am.misc_sales_amt) AS MISC
FROM AdvisorSalesData ad
LEFT JOIN AdvisorMiscSalesData am
ON (ad.customer_id=am.customer_id AND ad.invoice_no=am.invoice_no)
WHERE ad.customer_id IN (3)
“`

Adjusted query – with incorrect result for MISC (NULL):

“`
SELECT SUM(ad.labour_cost) AS LABOUR,
SUM(ad.part_cost) AS PARTS,
SUM(ad.pol_cost) AS POLS,
SUM(ad.sublet_cost) AS SUBLET,
(SELECT SUM(misc_sales_amt) FROM AdvisorMiscSalesData WHERE customer_id IN (3) AND invoice_no=ad.invoice_no ) AS MISC
FROM AdvisorSalesData ad
WHERE ad.customer_id IN (3)
“`

How can I adjust to get the correct results for all 5 columns in one single query?

to post a comment

10 Comments(s)

Copy linkTweet thisAlerts:
@NogDogFeb 26.2020 — > @php-bgrader#1615258 When I do a SUM() and multiple rows exist on RIGHT table, the values in the LEFT table are multiplied by the number of rows found.

I believe you can use SUM(DISTINCT left_table.column_name) for that case, if that helps?
Copy linkTweet thisAlerts:
@php-bgraderauthorFeb 26.2020 — > @NogDog#1615260 I believe you can use SUM(DISTINCT left_table.column_name) for that case, if that helps?

For some reason, this also provides incorrect figures for LABOUR, PARTS, POLS and SUBLET (marginally lower than expected)
Copy linkTweet thisAlerts:
@NogDogFeb 26.2020 — Hmm...so really both of the tables on that query actually represent separate many-to-one relationships to the invoice/customer pair? I'm sort of feeling like that may be throwing us off. I'm wondering if it might work better to include those tables...
<i>
</i>select
cus.id as customer_id, -- guessing at table/column names
inv.id as invoice_id, -- ditto
SUM(distinct ad.labour_cost) AS LABOUR,
SUM(distinct ad.part_cost) AS PARTS,
SUM(distinct ad.pol_cost) AS POLS,
SUM(distinct ad.sublet_cost) AS SUBLET,
SUM(distinct am.misc_sales_amt) AS MISC
from customer cus
inner join invoice inv on inv.customer_id = cus.id
left join AdvisorSalesData ad on ad.customer_id = cus.id and ad.invoice_no = inv.id
left join AdvisorMiscSalesData am on am.customer_id = cus.id and am.invoice_no = inv.id
where customer.id in (3)
group by cus.id, inv.id

Or something like that. Just quickly hacking something together to see if it sparks any ideas for you. :)
Copy linkTweet thisAlerts:
@php-bgraderauthorFeb 26.2020 — > @NogDog#1615263 Hmm...so really both of the tables on that query actually represent separate many-to-one relationships to the invoice/customer pair?

Not really, the AdvisorSalesData table is the master - invoice_no is UNIQUE

Does the DISTINCT return distinct invoice_no rows or distinct labour_cost rows? So will it ignore invoice_no #2 if the labour_cost value is the same as invoice_no #1?
Copy linkTweet thisAlerts:
@NogDogFeb 27.2020 — This runs...does it give the results you're looking for?
<i>
</i>SELECT
ad.customer_id,
SUM(ad.labour_cost) AS LABOUR,
SUM(ad.part_cost) AS PARTS,
SUM(ad.pol_cost) AS POLS,
SUM(ad.sublet_cost) AS SUBLET,
(
SELECT SUM(misc_sales_amt)
FROM AdvisorMiscSalesData
WHERE customer_id = ad.customer_id
) AS MISC
FROM AdvisorSalesData ad
WHERE ad.customer_id IN (3)
GROUP BY ad.customer_id

<i>
</i>customer_id LABOUR PARTS POLS SUBLET MISC
3 245.37 662.95 83.32 0 -252.46
Copy linkTweet thisAlerts:
@php-bgraderauthorFeb 27.2020 — It partially works until I expand the resultset by date (not in sample data)

``<i>
</i>SELECT
ad.customer_id,
SUM(ad.labour_cost) AS LABOUR,
SUM(ad.part_cost) AS PARTS,
SUM(ad.pol_cost) AS POLS,
SUM(ad.sublet_cost) AS SUBLET,
(
SELECT SUM(misc_sales_amt)
FROM AdvisorMiscSalesData
WHERE customer_id = ad.customer_id
) AS MISC
FROM AdvisorSalesData ad
WHERE ad.customer_id IN (3)
AND ad.ro_close BETWEEN DATE_FORMAT(NOW() ,'%Y-%m-01') AND NOW()
GROUP BY ad.customer_id<i>
</i>
``
Copy linkTweet thisAlerts:
@NogDogFeb 27.2020 — Probably have to add the date range to the sub-query on AdvisorMiscSalesData as well? :|
Copy linkTweet thisAlerts:
@php-bgraderauthorFeb 28.2020 — > @NogDog#1615376 Probably have to add the date range to the sub-query on AdvisorMiscSalesData as well?

There is no date on the Misc table which makes it more difficult

I could do 2 queries, but I would obviously prefer to combine and reduce overheads:

Invoice Costs:

``<i>
</i>SELECT SUM(labour_cost) AS LABOUR,
SUM(part_cost) AS PARTS,
SUM(pol_cost) AS POLS,
SUM(sublet_cost) AS SUBLET
FROM AdvisorSalesData
WHERE customer_id IN (3)
AND ro_close BETWEEN DATE_FORMAT(NOW() ,'%Y-%m-01') AND NOW()<i>
</i>
`</CODE>
<br/>
MISC Costs:

<CODE>
`<i>
</i>SELECT SUM(misc_sales_amt) AS MISC
FROM AdvisorMiscSalesData
WHERE customer_id IN (3)
AND invoice_no IN (SELECT invoice_no
FROM AdvisorSalesData
WHERE customer_id IN (3)
AND ad.ro_close BETWEEN DATE_FORMAT(NOW() ,'%Y-%m-01') AND NOW()
)<i>
</i>
``
Copy linkTweet thisAlerts:
@php-bgraderauthorFeb 28.2020 — I think I am going to give up trying to solve it.

Going to extract all invoices and then do the calcs in PHP:

``<i>
</i>SELECT ad.invoice_no,
ad.labour_cost AS LABOUR,
ad.part_cost AS PARTS,
ad.pol_cost AS POLS,
ad.sublet_cost AS SUBLET,
SUM(am.misc_sales_amt) AS MISC
FROM AdvisorSalesData ad
LEFT JOIN AdvisorMiscSalesData am
ON (ad.customer_id=am.customer_id AND ad.invoice_no=am.invoice_no)
WHERE ad.customer_id IN (3)
AND ad.ro_close BETWEEN DATE_FORMAT(NOW() ,'%Y-%m-01') AND NOW()
GROUP BY invoice_no<i>
</i>
``
Copy linkTweet thisAlerts:
@NogDogFeb 28.2020 — Ugh...feels like we need some database re-design/re-normalization, perhaps? If that's not possible, then all I can think of is adding something like this to the where clause of the sub-query in my last example:
<i>
</i>(
SELECT SUM(misc_sales_amt)
FROM AdvisorMiscSalesData
WHERE customer_id = ad.customer_id AND invoice_no IN (
SELECT invoice_no FROM AdvisorSalesData
WHERE ro_close BETWEEN DATE_FORMAT(NOW() ,'%Y-%m-01') AND NOW()
) AS invoices
) AS MISC

All of which is making me die a little bit inside. ;)

Frankly, it might perform better under 2 separate queries anyway. :|
×

Success!

Help @php-bgrader spread the word by sharing this article on Twitter...

Tweet This
Sign in
Forgot password?
Sign in with TwitchSign in with GithubCreate Account
about: ({
version: 0.1.9 BETA 4.24,
whats_new: community page,
up_next: more Davinci•003 tasks,
coming_soon: events calendar,
social: @webDeveloperHQ
});

legal: ({
terms: of use,
privacy: policy
});
changelog: (
version: 0.1.9,
notes: added community page

version: 0.1.8,
notes: added Davinci•003

version: 0.1.7,
notes: upvote answers to bounties

version: 0.1.6,
notes: article editor refresh
)...
recent_tips: (
tipper: @Yussuf4331,
tipped: article
amount: 1000 SATS,

tipper: @darkwebsites540,
tipped: article
amount: 10 SATS,

tipper: @Samric24,
tipped: article
amount: 1000 SATS,
)...