SQL Pull Latest Distinct Records back based upon a column value
I have data that where most of the columns have unique data. There are only three columns from the table that I am interested in and two of them have unique data.
Example data:
Ins_Cd | Encounter | Date
-------------------------------
A00 | 12345678 | 01-01-2001
A00 | 98765432 | 02-01-2001
From the above I want to return the second record
Ins_Cd | Encounter | Date
-------------------------------
A00 | 98765432 | 02-01-2001
I wrote the following code, which I think can be improved upon. It runs fairly quickly ~ 9 seconds whith just under 2 million records in the view.
SELECT Pyr1_Co_Plan_Cd
, PtNo_Num
, Dsch_Date
, [rn] = ROW_NUMBER() over(
partition by pyr1_co_plan_cd
order by dsch_date desc
)
into #temp
FROM schema.my_view
where Med_Rec_No is not null
and Dsch_Date is not null
and LEFT(PtNo_Num, 1) != '2'
and LEFT(ptno_num, 4) != '1999'
and LEFT(ptno_num, 1) != '9'
order by Pyr1_Co_Plan_Cd
, Dsch_Date desc
;
select a.Pyr1_Co_Plan_Cd
, a.PtNo_Num
, a.Dsch_Date
from #temp as a
where a.rn = 1
order by a.Pyr1_Co_Plan_Cd
;
drop table #temp
;
The above does give me what I want. How can I write this a bit more efficiently? Or should I be posting this on codereview
add a comment |
I have data that where most of the columns have unique data. There are only three columns from the table that I am interested in and two of them have unique data.
Example data:
Ins_Cd | Encounter | Date
-------------------------------
A00 | 12345678 | 01-01-2001
A00 | 98765432 | 02-01-2001
From the above I want to return the second record
Ins_Cd | Encounter | Date
-------------------------------
A00 | 98765432 | 02-01-2001
I wrote the following code, which I think can be improved upon. It runs fairly quickly ~ 9 seconds whith just under 2 million records in the view.
SELECT Pyr1_Co_Plan_Cd
, PtNo_Num
, Dsch_Date
, [rn] = ROW_NUMBER() over(
partition by pyr1_co_plan_cd
order by dsch_date desc
)
into #temp
FROM schema.my_view
where Med_Rec_No is not null
and Dsch_Date is not null
and LEFT(PtNo_Num, 1) != '2'
and LEFT(ptno_num, 4) != '1999'
and LEFT(ptno_num, 1) != '9'
order by Pyr1_Co_Plan_Cd
, Dsch_Date desc
;
select a.Pyr1_Co_Plan_Cd
, a.PtNo_Num
, a.Dsch_Date
from #temp as a
where a.rn = 1
order by a.Pyr1_Co_Plan_Cd
;
drop table #temp
;
The above does give me what I want. How can I write this a bit more efficiently? Or should I be posting this on codereview
add a comment |
I have data that where most of the columns have unique data. There are only three columns from the table that I am interested in and two of them have unique data.
Example data:
Ins_Cd | Encounter | Date
-------------------------------
A00 | 12345678 | 01-01-2001
A00 | 98765432 | 02-01-2001
From the above I want to return the second record
Ins_Cd | Encounter | Date
-------------------------------
A00 | 98765432 | 02-01-2001
I wrote the following code, which I think can be improved upon. It runs fairly quickly ~ 9 seconds whith just under 2 million records in the view.
SELECT Pyr1_Co_Plan_Cd
, PtNo_Num
, Dsch_Date
, [rn] = ROW_NUMBER() over(
partition by pyr1_co_plan_cd
order by dsch_date desc
)
into #temp
FROM schema.my_view
where Med_Rec_No is not null
and Dsch_Date is not null
and LEFT(PtNo_Num, 1) != '2'
and LEFT(ptno_num, 4) != '1999'
and LEFT(ptno_num, 1) != '9'
order by Pyr1_Co_Plan_Cd
, Dsch_Date desc
;
select a.Pyr1_Co_Plan_Cd
, a.PtNo_Num
, a.Dsch_Date
from #temp as a
where a.rn = 1
order by a.Pyr1_Co_Plan_Cd
;
drop table #temp
;
The above does give me what I want. How can I write this a bit more efficiently? Or should I be posting this on codereview
I have data that where most of the columns have unique data. There are only three columns from the table that I am interested in and two of them have unique data.
Example data:
Ins_Cd | Encounter | Date
-------------------------------
A00 | 12345678 | 01-01-2001
A00 | 98765432 | 02-01-2001
From the above I want to return the second record
Ins_Cd | Encounter | Date
-------------------------------
A00 | 98765432 | 02-01-2001
I wrote the following code, which I think can be improved upon. It runs fairly quickly ~ 9 seconds whith just under 2 million records in the view.
SELECT Pyr1_Co_Plan_Cd
, PtNo_Num
, Dsch_Date
, [rn] = ROW_NUMBER() over(
partition by pyr1_co_plan_cd
order by dsch_date desc
)
into #temp
FROM schema.my_view
where Med_Rec_No is not null
and Dsch_Date is not null
and LEFT(PtNo_Num, 1) != '2'
and LEFT(ptno_num, 4) != '1999'
and LEFT(ptno_num, 1) != '9'
order by Pyr1_Co_Plan_Cd
, Dsch_Date desc
;
select a.Pyr1_Co_Plan_Cd
, a.PtNo_Num
, a.Dsch_Date
from #temp as a
where a.rn = 1
order by a.Pyr1_Co_Plan_Cd
;
drop table #temp
;
The above does give me what I want. How can I write this a bit more efficiently? Or should I be posting this on codereview
edited Nov 14 '18 at 19:49
scsimon
20.8k41536
20.8k41536
asked Nov 14 '18 at 19:21
MCP_infiltrator
1,50283060
1,50283060
add a comment |
add a comment |
1 Answer
1
active
oldest
votes
This should prbably go on codereview but since you are here...
Your code seems to have columns that isn't in your data... specifically your partitioned column. Maybe this is the correct column to partition on, but just seems odd.
One way to speed this up and to not use a temp table. This will speed things up because the INSERT operation doesn't have to be done. If you look at your actual execution plan, I bet the insert is a large % of the total query cost. Instead, use a derived table or CTE.
select
Pyr1_Co_Plan_Cd
, PtNo_Num
, Dsch_Date
from
(SELECT
Pyr1_Co_Plan_Cd
, PtNo_Num
, Dsch_Date
, [rn] = ROW_NUMBER() over(partition by pyr1_co_plan_cd order by dsch_date desc)
FROM
schema.my_view
where
Med_Rec_No is not null
and Dsch_Date is not null
and LEFT(PtNo_Num, 1) != '2'
and LEFT(ptno_num, 4) != '1999'
and LEFT(ptno_num, 1) != '9') x
where RN = 1
order by Pyr1_Co_Plan_Cd
If you are set on using a TEMP TABLE then you can also speed things up by removing the order by Pyr1_Co_Plan_Cd, Dsch_Date desc on the SELECT INTO #temp portion. This ordering is unnecessary and doesn't benefit you at all especially because you are ordering your results in the final select, and are using a window function to calculate your RN which has an order by.
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53307402%2fsql-pull-latest-distinct-records-back-based-upon-a-column-value%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
This should prbably go on codereview but since you are here...
Your code seems to have columns that isn't in your data... specifically your partitioned column. Maybe this is the correct column to partition on, but just seems odd.
One way to speed this up and to not use a temp table. This will speed things up because the INSERT operation doesn't have to be done. If you look at your actual execution plan, I bet the insert is a large % of the total query cost. Instead, use a derived table or CTE.
select
Pyr1_Co_Plan_Cd
, PtNo_Num
, Dsch_Date
from
(SELECT
Pyr1_Co_Plan_Cd
, PtNo_Num
, Dsch_Date
, [rn] = ROW_NUMBER() over(partition by pyr1_co_plan_cd order by dsch_date desc)
FROM
schema.my_view
where
Med_Rec_No is not null
and Dsch_Date is not null
and LEFT(PtNo_Num, 1) != '2'
and LEFT(ptno_num, 4) != '1999'
and LEFT(ptno_num, 1) != '9') x
where RN = 1
order by Pyr1_Co_Plan_Cd
If you are set on using a TEMP TABLE then you can also speed things up by removing the order by Pyr1_Co_Plan_Cd, Dsch_Date desc on the SELECT INTO #temp portion. This ordering is unnecessary and doesn't benefit you at all especially because you are ordering your results in the final select, and are using a window function to calculate your RN which has an order by.
add a comment |
This should prbably go on codereview but since you are here...
Your code seems to have columns that isn't in your data... specifically your partitioned column. Maybe this is the correct column to partition on, but just seems odd.
One way to speed this up and to not use a temp table. This will speed things up because the INSERT operation doesn't have to be done. If you look at your actual execution plan, I bet the insert is a large % of the total query cost. Instead, use a derived table or CTE.
select
Pyr1_Co_Plan_Cd
, PtNo_Num
, Dsch_Date
from
(SELECT
Pyr1_Co_Plan_Cd
, PtNo_Num
, Dsch_Date
, [rn] = ROW_NUMBER() over(partition by pyr1_co_plan_cd order by dsch_date desc)
FROM
schema.my_view
where
Med_Rec_No is not null
and Dsch_Date is not null
and LEFT(PtNo_Num, 1) != '2'
and LEFT(ptno_num, 4) != '1999'
and LEFT(ptno_num, 1) != '9') x
where RN = 1
order by Pyr1_Co_Plan_Cd
If you are set on using a TEMP TABLE then you can also speed things up by removing the order by Pyr1_Co_Plan_Cd, Dsch_Date desc on the SELECT INTO #temp portion. This ordering is unnecessary and doesn't benefit you at all especially because you are ordering your results in the final select, and are using a window function to calculate your RN which has an order by.
add a comment |
This should prbably go on codereview but since you are here...
Your code seems to have columns that isn't in your data... specifically your partitioned column. Maybe this is the correct column to partition on, but just seems odd.
One way to speed this up and to not use a temp table. This will speed things up because the INSERT operation doesn't have to be done. If you look at your actual execution plan, I bet the insert is a large % of the total query cost. Instead, use a derived table or CTE.
select
Pyr1_Co_Plan_Cd
, PtNo_Num
, Dsch_Date
from
(SELECT
Pyr1_Co_Plan_Cd
, PtNo_Num
, Dsch_Date
, [rn] = ROW_NUMBER() over(partition by pyr1_co_plan_cd order by dsch_date desc)
FROM
schema.my_view
where
Med_Rec_No is not null
and Dsch_Date is not null
and LEFT(PtNo_Num, 1) != '2'
and LEFT(ptno_num, 4) != '1999'
and LEFT(ptno_num, 1) != '9') x
where RN = 1
order by Pyr1_Co_Plan_Cd
If you are set on using a TEMP TABLE then you can also speed things up by removing the order by Pyr1_Co_Plan_Cd, Dsch_Date desc on the SELECT INTO #temp portion. This ordering is unnecessary and doesn't benefit you at all especially because you are ordering your results in the final select, and are using a window function to calculate your RN which has an order by.
This should prbably go on codereview but since you are here...
Your code seems to have columns that isn't in your data... specifically your partitioned column. Maybe this is the correct column to partition on, but just seems odd.
One way to speed this up and to not use a temp table. This will speed things up because the INSERT operation doesn't have to be done. If you look at your actual execution plan, I bet the insert is a large % of the total query cost. Instead, use a derived table or CTE.
select
Pyr1_Co_Plan_Cd
, PtNo_Num
, Dsch_Date
from
(SELECT
Pyr1_Co_Plan_Cd
, PtNo_Num
, Dsch_Date
, [rn] = ROW_NUMBER() over(partition by pyr1_co_plan_cd order by dsch_date desc)
FROM
schema.my_view
where
Med_Rec_No is not null
and Dsch_Date is not null
and LEFT(PtNo_Num, 1) != '2'
and LEFT(ptno_num, 4) != '1999'
and LEFT(ptno_num, 1) != '9') x
where RN = 1
order by Pyr1_Co_Plan_Cd
If you are set on using a TEMP TABLE then you can also speed things up by removing the order by Pyr1_Co_Plan_Cd, Dsch_Date desc on the SELECT INTO #temp portion. This ordering is unnecessary and doesn't benefit you at all especially because you are ordering your results in the final select, and are using a window function to calculate your RN which has an order by.
answered Nov 14 '18 at 19:49
scsimon
20.8k41536
20.8k41536
add a comment |
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Some of your past answers have not been well-received, and you're in danger of being blocked from answering.
Please pay close attention to the following guidance:
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53307402%2fsql-pull-latest-distinct-records-back-based-upon-a-column-value%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown