Using GROUP BY Inside a CTE for Aggregation












1















I'm using TSQL and AdventureWorks 2014 to try to compute the SUM of all the LineTotals for each unique order, specified by SalesOrderID. I'm joining on the SalesOrderHead.SalesOrderID because in SalesOrderHeader, SalesOrderID is the primary key. This query takes forever to run, and returns duplicates of the SalesOrderIDs, when there sould only be one record returned for each SalesOrderID. Hope that makes sense...



What I'm trying to do is calculate the SUM(LineTotal) grouped by SalesOrderID inside the SumPerOrderCTE, and then join this to SalesOrderHeader to select some additional tables, and only select from SumPerOrder the records who have the SalesOrderID matching SalesOrderHeader.



Any suggestions? I can't seem to figure out why it's not doing what it should..



Here's the output (notice the SalesOrderID's are all the same, and th differences should mostly be 0 but are not):
buggy_output



Small sample of how the output should generally look:
desired_output



CREATE OR ALTER PROC [dbo].[SalesOrderDataReviewCTE]
AS
WITH SumPerOrder (SalesOrderID, CalculatedSubTotalFromDetail)
AS
(
SELECT
SalesOrderID
,SUM(LineTotal)
FROM AdventureWorks2014.Sales.SalesOrderDetail
GROUP BY SalesOrderID
)
SELECT
soh.SalesOrderID
,soh.SalesOrderNumber
,soh.SubTotal AS OriginalSubTotal
,spo.CalculatedSubTotalFromDetail
,(soh.SubTotal - spo.CalculatedSubTotalFromDetail) AS Difference
FROM AdventureWorks2014.Sales.SalesOrderHeader soh
INNER JOIN SumPerOrder spo
ON soh.SalesOrderID = spo.SalesOrderID









share|improve this question




















  • 1





    It would help immensely if we had (cut down) tables definitions, sample data and expected results. Yes, a narrative can be good but for data related questions, nothing beats being able to actually work with the data.

    – Damien_The_Unbeliever
    Nov 21 '18 at 7:45











  • Even better if you could also prepare a fiddle here for example sqlfiddle.com

    – ZorgoZ
    Nov 21 '18 at 7:46











  • SalesOrderID is PK/FK in those tables, your query should work as-is. After the join you can't have different CalculatedSubTotalFromDetail for the same SalesOrderID, that's impossible. There must be something else, looks like a Cross Join. How many rows are returned by that Select and how many rows exist in SalesOrderHeader

    – dnoeth
    Nov 21 '18 at 8:05













  • Why the CTE instead of a SUM(LineTotal) in the SELECT clause?

    – Panagiotis Kanavos
    Nov 21 '18 at 8:08











  • @PanagiotisKanavos: aggregation before join is probably more efficient than aggregation *after join.

    – dnoeth
    Nov 21 '18 at 8:10
















1















I'm using TSQL and AdventureWorks 2014 to try to compute the SUM of all the LineTotals for each unique order, specified by SalesOrderID. I'm joining on the SalesOrderHead.SalesOrderID because in SalesOrderHeader, SalesOrderID is the primary key. This query takes forever to run, and returns duplicates of the SalesOrderIDs, when there sould only be one record returned for each SalesOrderID. Hope that makes sense...



What I'm trying to do is calculate the SUM(LineTotal) grouped by SalesOrderID inside the SumPerOrderCTE, and then join this to SalesOrderHeader to select some additional tables, and only select from SumPerOrder the records who have the SalesOrderID matching SalesOrderHeader.



Any suggestions? I can't seem to figure out why it's not doing what it should..



Here's the output (notice the SalesOrderID's are all the same, and th differences should mostly be 0 but are not):
buggy_output



Small sample of how the output should generally look:
desired_output



CREATE OR ALTER PROC [dbo].[SalesOrderDataReviewCTE]
AS
WITH SumPerOrder (SalesOrderID, CalculatedSubTotalFromDetail)
AS
(
SELECT
SalesOrderID
,SUM(LineTotal)
FROM AdventureWorks2014.Sales.SalesOrderDetail
GROUP BY SalesOrderID
)
SELECT
soh.SalesOrderID
,soh.SalesOrderNumber
,soh.SubTotal AS OriginalSubTotal
,spo.CalculatedSubTotalFromDetail
,(soh.SubTotal - spo.CalculatedSubTotalFromDetail) AS Difference
FROM AdventureWorks2014.Sales.SalesOrderHeader soh
INNER JOIN SumPerOrder spo
ON soh.SalesOrderID = spo.SalesOrderID









share|improve this question




















  • 1





    It would help immensely if we had (cut down) tables definitions, sample data and expected results. Yes, a narrative can be good but for data related questions, nothing beats being able to actually work with the data.

    – Damien_The_Unbeliever
    Nov 21 '18 at 7:45











  • Even better if you could also prepare a fiddle here for example sqlfiddle.com

    – ZorgoZ
    Nov 21 '18 at 7:46











  • SalesOrderID is PK/FK in those tables, your query should work as-is. After the join you can't have different CalculatedSubTotalFromDetail for the same SalesOrderID, that's impossible. There must be something else, looks like a Cross Join. How many rows are returned by that Select and how many rows exist in SalesOrderHeader

    – dnoeth
    Nov 21 '18 at 8:05













  • Why the CTE instead of a SUM(LineTotal) in the SELECT clause?

    – Panagiotis Kanavos
    Nov 21 '18 at 8:08











  • @PanagiotisKanavos: aggregation before join is probably more efficient than aggregation *after join.

    – dnoeth
    Nov 21 '18 at 8:10














1












1








1


2






I'm using TSQL and AdventureWorks 2014 to try to compute the SUM of all the LineTotals for each unique order, specified by SalesOrderID. I'm joining on the SalesOrderHead.SalesOrderID because in SalesOrderHeader, SalesOrderID is the primary key. This query takes forever to run, and returns duplicates of the SalesOrderIDs, when there sould only be one record returned for each SalesOrderID. Hope that makes sense...



What I'm trying to do is calculate the SUM(LineTotal) grouped by SalesOrderID inside the SumPerOrderCTE, and then join this to SalesOrderHeader to select some additional tables, and only select from SumPerOrder the records who have the SalesOrderID matching SalesOrderHeader.



Any suggestions? I can't seem to figure out why it's not doing what it should..



Here's the output (notice the SalesOrderID's are all the same, and th differences should mostly be 0 but are not):
buggy_output



Small sample of how the output should generally look:
desired_output



CREATE OR ALTER PROC [dbo].[SalesOrderDataReviewCTE]
AS
WITH SumPerOrder (SalesOrderID, CalculatedSubTotalFromDetail)
AS
(
SELECT
SalesOrderID
,SUM(LineTotal)
FROM AdventureWorks2014.Sales.SalesOrderDetail
GROUP BY SalesOrderID
)
SELECT
soh.SalesOrderID
,soh.SalesOrderNumber
,soh.SubTotal AS OriginalSubTotal
,spo.CalculatedSubTotalFromDetail
,(soh.SubTotal - spo.CalculatedSubTotalFromDetail) AS Difference
FROM AdventureWorks2014.Sales.SalesOrderHeader soh
INNER JOIN SumPerOrder spo
ON soh.SalesOrderID = spo.SalesOrderID









share|improve this question
















I'm using TSQL and AdventureWorks 2014 to try to compute the SUM of all the LineTotals for each unique order, specified by SalesOrderID. I'm joining on the SalesOrderHead.SalesOrderID because in SalesOrderHeader, SalesOrderID is the primary key. This query takes forever to run, and returns duplicates of the SalesOrderIDs, when there sould only be one record returned for each SalesOrderID. Hope that makes sense...



What I'm trying to do is calculate the SUM(LineTotal) grouped by SalesOrderID inside the SumPerOrderCTE, and then join this to SalesOrderHeader to select some additional tables, and only select from SumPerOrder the records who have the SalesOrderID matching SalesOrderHeader.



Any suggestions? I can't seem to figure out why it's not doing what it should..



Here's the output (notice the SalesOrderID's are all the same, and th differences should mostly be 0 but are not):
buggy_output



Small sample of how the output should generally look:
desired_output



CREATE OR ALTER PROC [dbo].[SalesOrderDataReviewCTE]
AS
WITH SumPerOrder (SalesOrderID, CalculatedSubTotalFromDetail)
AS
(
SELECT
SalesOrderID
,SUM(LineTotal)
FROM AdventureWorks2014.Sales.SalesOrderDetail
GROUP BY SalesOrderID
)
SELECT
soh.SalesOrderID
,soh.SalesOrderNumber
,soh.SubTotal AS OriginalSubTotal
,spo.CalculatedSubTotalFromDetail
,(soh.SubTotal - spo.CalculatedSubTotalFromDetail) AS Difference
FROM AdventureWorks2014.Sales.SalesOrderHeader soh
INNER JOIN SumPerOrder spo
ON soh.SalesOrderID = spo.SalesOrderID






sql sql-server tsql






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 21 '18 at 7:55







leilanihagen

















asked Nov 21 '18 at 7:42









leilanihagenleilanihagen

84




84








  • 1





    It would help immensely if we had (cut down) tables definitions, sample data and expected results. Yes, a narrative can be good but for data related questions, nothing beats being able to actually work with the data.

    – Damien_The_Unbeliever
    Nov 21 '18 at 7:45











  • Even better if you could also prepare a fiddle here for example sqlfiddle.com

    – ZorgoZ
    Nov 21 '18 at 7:46











  • SalesOrderID is PK/FK in those tables, your query should work as-is. After the join you can't have different CalculatedSubTotalFromDetail for the same SalesOrderID, that's impossible. There must be something else, looks like a Cross Join. How many rows are returned by that Select and how many rows exist in SalesOrderHeader

    – dnoeth
    Nov 21 '18 at 8:05













  • Why the CTE instead of a SUM(LineTotal) in the SELECT clause?

    – Panagiotis Kanavos
    Nov 21 '18 at 8:08











  • @PanagiotisKanavos: aggregation before join is probably more efficient than aggregation *after join.

    – dnoeth
    Nov 21 '18 at 8:10














  • 1





    It would help immensely if we had (cut down) tables definitions, sample data and expected results. Yes, a narrative can be good but for data related questions, nothing beats being able to actually work with the data.

    – Damien_The_Unbeliever
    Nov 21 '18 at 7:45











  • Even better if you could also prepare a fiddle here for example sqlfiddle.com

    – ZorgoZ
    Nov 21 '18 at 7:46











  • SalesOrderID is PK/FK in those tables, your query should work as-is. After the join you can't have different CalculatedSubTotalFromDetail for the same SalesOrderID, that's impossible. There must be something else, looks like a Cross Join. How many rows are returned by that Select and how many rows exist in SalesOrderHeader

    – dnoeth
    Nov 21 '18 at 8:05













  • Why the CTE instead of a SUM(LineTotal) in the SELECT clause?

    – Panagiotis Kanavos
    Nov 21 '18 at 8:08











  • @PanagiotisKanavos: aggregation before join is probably more efficient than aggregation *after join.

    – dnoeth
    Nov 21 '18 at 8:10








1




1





It would help immensely if we had (cut down) tables definitions, sample data and expected results. Yes, a narrative can be good but for data related questions, nothing beats being able to actually work with the data.

– Damien_The_Unbeliever
Nov 21 '18 at 7:45





It would help immensely if we had (cut down) tables definitions, sample data and expected results. Yes, a narrative can be good but for data related questions, nothing beats being able to actually work with the data.

– Damien_The_Unbeliever
Nov 21 '18 at 7:45













Even better if you could also prepare a fiddle here for example sqlfiddle.com

– ZorgoZ
Nov 21 '18 at 7:46





Even better if you could also prepare a fiddle here for example sqlfiddle.com

– ZorgoZ
Nov 21 '18 at 7:46













SalesOrderID is PK/FK in those tables, your query should work as-is. After the join you can't have different CalculatedSubTotalFromDetail for the same SalesOrderID, that's impossible. There must be something else, looks like a Cross Join. How many rows are returned by that Select and how many rows exist in SalesOrderHeader

– dnoeth
Nov 21 '18 at 8:05







SalesOrderID is PK/FK in those tables, your query should work as-is. After the join you can't have different CalculatedSubTotalFromDetail for the same SalesOrderID, that's impossible. There must be something else, looks like a Cross Join. How many rows are returned by that Select and how many rows exist in SalesOrderHeader

– dnoeth
Nov 21 '18 at 8:05















Why the CTE instead of a SUM(LineTotal) in the SELECT clause?

– Panagiotis Kanavos
Nov 21 '18 at 8:08





Why the CTE instead of a SUM(LineTotal) in the SELECT clause?

– Panagiotis Kanavos
Nov 21 '18 at 8:08













@PanagiotisKanavos: aggregation before join is probably more efficient than aggregation *after join.

– dnoeth
Nov 21 '18 at 8:10





@PanagiotisKanavos: aggregation before join is probably more efficient than aggregation *after join.

– dnoeth
Nov 21 '18 at 8:10












2 Answers
2






active

oldest

votes


















0














Your query seems to be working fine:



enter image description here



Please check for some other issue.






share|improve this answer
























  • Hmmm strange... I will look into that. Thank you!

    – leilanihagen
    Nov 21 '18 at 8:09











  • Yes, there was just an issue with how I was executing it, thanks again.

    – leilanihagen
    Nov 21 '18 at 8:15











  • @leilanihagen: Can you share the issue?

    – dnoeth
    Nov 21 '18 at 8:16











  • I'm not exactly sure what it is, I'm using the DBeaver IDE and I had two versions of my stored procedure in one file. I was highlighting only this one when I executed it, but was getting the weird data and taking forever to execute when I did that. When I commented out the other version, it worked fine.

    – leilanihagen
    Nov 21 '18 at 20:05



















1














Why use the CTE?



The CTE is just a handy and one-query way to create a temp table with repeatable code. When you do this, the SumPerOrder table is missing all the indexes of SalesOrderDetail. Thus, the join with SalesOrderHeader takes much longer than what a direct join of SalesOrderHeader with SalesOrderDetail would do.



The simpler way would be:



SELECT
soh.SalesOrderID
,soh.SalesOrderNumber
,soh.SubTotal AS OriginalSubTotal
,spo.SUM(LineTotal) as CalculatedSubTotalFromDetail
,(soh.SubTotal - spo.SUM(LineTotal)) AS Difference
FROM AdventureWorks2014.Sales.SalesOrderHeader soh
INNER JOIN AdventureWorks2014.Sales.SalesOrderDetail spo
ON soh.SalesOrderID = spo.SalesOrderID
GROUP BY spo.SalesOrderID,soh.SalesOrderNumber,soh.SubTotal





share|improve this answer
























  • Thank you! I'm still learning how everything is executed/scopes and such in SQL. Helps a lot!

    – leilanihagen
    Nov 21 '18 at 8:08











  • The join doesn't take much longer, because it's a way smaller amount of data to be joined. And you still have to do do the aggregation. Btw, a CTE is exactly the same as a Derived Table, the optimizer doesn't have to materialize it.

    – dnoeth
    Nov 21 '18 at 8:19











Your Answer






StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














draft saved

draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53407330%2fusing-group-by-inside-a-cte-for-aggregation%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























2 Answers
2






active

oldest

votes








2 Answers
2






active

oldest

votes









active

oldest

votes






active

oldest

votes









0














Your query seems to be working fine:



enter image description here



Please check for some other issue.






share|improve this answer
























  • Hmmm strange... I will look into that. Thank you!

    – leilanihagen
    Nov 21 '18 at 8:09











  • Yes, there was just an issue with how I was executing it, thanks again.

    – leilanihagen
    Nov 21 '18 at 8:15











  • @leilanihagen: Can you share the issue?

    – dnoeth
    Nov 21 '18 at 8:16











  • I'm not exactly sure what it is, I'm using the DBeaver IDE and I had two versions of my stored procedure in one file. I was highlighting only this one when I executed it, but was getting the weird data and taking forever to execute when I did that. When I commented out the other version, it worked fine.

    – leilanihagen
    Nov 21 '18 at 20:05
















0














Your query seems to be working fine:



enter image description here



Please check for some other issue.






share|improve this answer
























  • Hmmm strange... I will look into that. Thank you!

    – leilanihagen
    Nov 21 '18 at 8:09











  • Yes, there was just an issue with how I was executing it, thanks again.

    – leilanihagen
    Nov 21 '18 at 8:15











  • @leilanihagen: Can you share the issue?

    – dnoeth
    Nov 21 '18 at 8:16











  • I'm not exactly sure what it is, I'm using the DBeaver IDE and I had two versions of my stored procedure in one file. I was highlighting only this one when I executed it, but was getting the weird data and taking forever to execute when I did that. When I commented out the other version, it worked fine.

    – leilanihagen
    Nov 21 '18 at 20:05














0












0








0







Your query seems to be working fine:



enter image description here



Please check for some other issue.






share|improve this answer













Your query seems to be working fine:



enter image description here



Please check for some other issue.







share|improve this answer












share|improve this answer



share|improve this answer










answered Nov 21 '18 at 8:01









satishcsesatishcse

1,1741712




1,1741712













  • Hmmm strange... I will look into that. Thank you!

    – leilanihagen
    Nov 21 '18 at 8:09











  • Yes, there was just an issue with how I was executing it, thanks again.

    – leilanihagen
    Nov 21 '18 at 8:15











  • @leilanihagen: Can you share the issue?

    – dnoeth
    Nov 21 '18 at 8:16











  • I'm not exactly sure what it is, I'm using the DBeaver IDE and I had two versions of my stored procedure in one file. I was highlighting only this one when I executed it, but was getting the weird data and taking forever to execute when I did that. When I commented out the other version, it worked fine.

    – leilanihagen
    Nov 21 '18 at 20:05



















  • Hmmm strange... I will look into that. Thank you!

    – leilanihagen
    Nov 21 '18 at 8:09











  • Yes, there was just an issue with how I was executing it, thanks again.

    – leilanihagen
    Nov 21 '18 at 8:15











  • @leilanihagen: Can you share the issue?

    – dnoeth
    Nov 21 '18 at 8:16











  • I'm not exactly sure what it is, I'm using the DBeaver IDE and I had two versions of my stored procedure in one file. I was highlighting only this one when I executed it, but was getting the weird data and taking forever to execute when I did that. When I commented out the other version, it worked fine.

    – leilanihagen
    Nov 21 '18 at 20:05

















Hmmm strange... I will look into that. Thank you!

– leilanihagen
Nov 21 '18 at 8:09





Hmmm strange... I will look into that. Thank you!

– leilanihagen
Nov 21 '18 at 8:09













Yes, there was just an issue with how I was executing it, thanks again.

– leilanihagen
Nov 21 '18 at 8:15





Yes, there was just an issue with how I was executing it, thanks again.

– leilanihagen
Nov 21 '18 at 8:15













@leilanihagen: Can you share the issue?

– dnoeth
Nov 21 '18 at 8:16





@leilanihagen: Can you share the issue?

– dnoeth
Nov 21 '18 at 8:16













I'm not exactly sure what it is, I'm using the DBeaver IDE and I had two versions of my stored procedure in one file. I was highlighting only this one when I executed it, but was getting the weird data and taking forever to execute when I did that. When I commented out the other version, it worked fine.

– leilanihagen
Nov 21 '18 at 20:05





I'm not exactly sure what it is, I'm using the DBeaver IDE and I had two versions of my stored procedure in one file. I was highlighting only this one when I executed it, but was getting the weird data and taking forever to execute when I did that. When I commented out the other version, it worked fine.

– leilanihagen
Nov 21 '18 at 20:05













1














Why use the CTE?



The CTE is just a handy and one-query way to create a temp table with repeatable code. When you do this, the SumPerOrder table is missing all the indexes of SalesOrderDetail. Thus, the join with SalesOrderHeader takes much longer than what a direct join of SalesOrderHeader with SalesOrderDetail would do.



The simpler way would be:



SELECT
soh.SalesOrderID
,soh.SalesOrderNumber
,soh.SubTotal AS OriginalSubTotal
,spo.SUM(LineTotal) as CalculatedSubTotalFromDetail
,(soh.SubTotal - spo.SUM(LineTotal)) AS Difference
FROM AdventureWorks2014.Sales.SalesOrderHeader soh
INNER JOIN AdventureWorks2014.Sales.SalesOrderDetail spo
ON soh.SalesOrderID = spo.SalesOrderID
GROUP BY spo.SalesOrderID,soh.SalesOrderNumber,soh.SubTotal





share|improve this answer
























  • Thank you! I'm still learning how everything is executed/scopes and such in SQL. Helps a lot!

    – leilanihagen
    Nov 21 '18 at 8:08











  • The join doesn't take much longer, because it's a way smaller amount of data to be joined. And you still have to do do the aggregation. Btw, a CTE is exactly the same as a Derived Table, the optimizer doesn't have to materialize it.

    – dnoeth
    Nov 21 '18 at 8:19
















1














Why use the CTE?



The CTE is just a handy and one-query way to create a temp table with repeatable code. When you do this, the SumPerOrder table is missing all the indexes of SalesOrderDetail. Thus, the join with SalesOrderHeader takes much longer than what a direct join of SalesOrderHeader with SalesOrderDetail would do.



The simpler way would be:



SELECT
soh.SalesOrderID
,soh.SalesOrderNumber
,soh.SubTotal AS OriginalSubTotal
,spo.SUM(LineTotal) as CalculatedSubTotalFromDetail
,(soh.SubTotal - spo.SUM(LineTotal)) AS Difference
FROM AdventureWorks2014.Sales.SalesOrderHeader soh
INNER JOIN AdventureWorks2014.Sales.SalesOrderDetail spo
ON soh.SalesOrderID = spo.SalesOrderID
GROUP BY spo.SalesOrderID,soh.SalesOrderNumber,soh.SubTotal





share|improve this answer
























  • Thank you! I'm still learning how everything is executed/scopes and such in SQL. Helps a lot!

    – leilanihagen
    Nov 21 '18 at 8:08











  • The join doesn't take much longer, because it's a way smaller amount of data to be joined. And you still have to do do the aggregation. Btw, a CTE is exactly the same as a Derived Table, the optimizer doesn't have to materialize it.

    – dnoeth
    Nov 21 '18 at 8:19














1












1








1







Why use the CTE?



The CTE is just a handy and one-query way to create a temp table with repeatable code. When you do this, the SumPerOrder table is missing all the indexes of SalesOrderDetail. Thus, the join with SalesOrderHeader takes much longer than what a direct join of SalesOrderHeader with SalesOrderDetail would do.



The simpler way would be:



SELECT
soh.SalesOrderID
,soh.SalesOrderNumber
,soh.SubTotal AS OriginalSubTotal
,spo.SUM(LineTotal) as CalculatedSubTotalFromDetail
,(soh.SubTotal - spo.SUM(LineTotal)) AS Difference
FROM AdventureWorks2014.Sales.SalesOrderHeader soh
INNER JOIN AdventureWorks2014.Sales.SalesOrderDetail spo
ON soh.SalesOrderID = spo.SalesOrderID
GROUP BY spo.SalesOrderID,soh.SalesOrderNumber,soh.SubTotal





share|improve this answer













Why use the CTE?



The CTE is just a handy and one-query way to create a temp table with repeatable code. When you do this, the SumPerOrder table is missing all the indexes of SalesOrderDetail. Thus, the join with SalesOrderHeader takes much longer than what a direct join of SalesOrderHeader with SalesOrderDetail would do.



The simpler way would be:



SELECT
soh.SalesOrderID
,soh.SalesOrderNumber
,soh.SubTotal AS OriginalSubTotal
,spo.SUM(LineTotal) as CalculatedSubTotalFromDetail
,(soh.SubTotal - spo.SUM(LineTotal)) AS Difference
FROM AdventureWorks2014.Sales.SalesOrderHeader soh
INNER JOIN AdventureWorks2014.Sales.SalesOrderDetail spo
ON soh.SalesOrderID = spo.SalesOrderID
GROUP BY spo.SalesOrderID,soh.SalesOrderNumber,soh.SubTotal






share|improve this answer












share|improve this answer



share|improve this answer










answered Nov 21 '18 at 8:03









George MenoutisGeorge Menoutis

2,803521




2,803521













  • Thank you! I'm still learning how everything is executed/scopes and such in SQL. Helps a lot!

    – leilanihagen
    Nov 21 '18 at 8:08











  • The join doesn't take much longer, because it's a way smaller amount of data to be joined. And you still have to do do the aggregation. Btw, a CTE is exactly the same as a Derived Table, the optimizer doesn't have to materialize it.

    – dnoeth
    Nov 21 '18 at 8:19



















  • Thank you! I'm still learning how everything is executed/scopes and such in SQL. Helps a lot!

    – leilanihagen
    Nov 21 '18 at 8:08











  • The join doesn't take much longer, because it's a way smaller amount of data to be joined. And you still have to do do the aggregation. Btw, a CTE is exactly the same as a Derived Table, the optimizer doesn't have to materialize it.

    – dnoeth
    Nov 21 '18 at 8:19

















Thank you! I'm still learning how everything is executed/scopes and such in SQL. Helps a lot!

– leilanihagen
Nov 21 '18 at 8:08





Thank you! I'm still learning how everything is executed/scopes and such in SQL. Helps a lot!

– leilanihagen
Nov 21 '18 at 8:08













The join doesn't take much longer, because it's a way smaller amount of data to be joined. And you still have to do do the aggregation. Btw, a CTE is exactly the same as a Derived Table, the optimizer doesn't have to materialize it.

– dnoeth
Nov 21 '18 at 8:19





The join doesn't take much longer, because it's a way smaller amount of data to be joined. And you still have to do do the aggregation. Btw, a CTE is exactly the same as a Derived Table, the optimizer doesn't have to materialize it.

– dnoeth
Nov 21 '18 at 8:19


















draft saved

draft discarded




















































Thanks for contributing an answer to Stack Overflow!


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53407330%2fusing-group-by-inside-a-cte-for-aggregation%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

Guess what letter conforming each word

Port of Spain

Run scheduled task as local user group (not BUILTIN)