Trying to simplify a SQL query without UNION












4















I'm very bad at explaining, so let me try to lay out my issue. I have a table that resembles the following:



 Source    Value    User
======== ======= ======
old1 1 Phil
new 2 Phil
old2 3 Phil
new 4 Phil
old1 1 Mike
old2 2 Mike
new 1 Jeff
new 2 Jeff


What I need to do is create a query that gets values for users based on the source and the value. It should follow this rule:




For every user, get the highest value. However, disregard the 'new'
source if either 'old1' or 'old2' exists for that user.




So based on those rules, my query should return the following from this table:



 Value    User
======= ======
3 Phil
2 Mike
2 Jeff


I've come up with a query that does close to what is asked:



SELECT      MAX([Value]), [User]
FROM
(
SELECT CASE [Source]
WHEN 'old1' THEN 1
WHEN 'old2' THEN 1
WHEN 'new' THEN 2
END AS [SourcePriority],
[Value],
[User]
FROM #UserValues
) MainPriority
WHERE [SourcePriority] = 1
GROUP BY [User]
UNION
SELECT MAX([Value]), [User]
FROM
(
SELECT CASE [Source]
WHEN 'old1' THEN 1
WHEN 'old2' THEN 1
WHEN 'new' THEN 2
END AS [SourcePriority],
[Value],
[User]
FROM #UserValues
) SecondaryPriority
WHERE [SourcePriority] = 2
GROUP BY [User]


However this returns the following results:



 Value    User
======= ======
3 Phil
4 Phil
2 Mike
2 Jeff


Obviously that extra value for Phil=4 is not desired. How should I attempt to fix this query? I also understand that this is a pretty convoluted solution and that it can probably be more easily solved by proper use of aggregates, however I'm not too familiar with aggregates yet which resulted in me resorting to a union. Essentially I'm looking for help creating the cleanest-looking solution possible.



Here is the SQL code if anyone wanted to populate the table themselves to give it a try:



CREATE TABLE #UserValues
(
[Source] VARCHAR(10),
[Value] INT,
[User] VARCHAR(10)
)
INSERT INTO #UserValues VALUES
('old1', 1, 'Phil'),
('new', 2, 'Phil'),
('old2', 3, 'Phil'),
('new', 4, 'Phil'),
('old1', 1, 'Mike'),
('old2', 2, 'Mike'),
('new', 1, 'Jeff'),
('new', 2, 'Jeff')









share|improve this question



























    4















    I'm very bad at explaining, so let me try to lay out my issue. I have a table that resembles the following:



     Source    Value    User
    ======== ======= ======
    old1 1 Phil
    new 2 Phil
    old2 3 Phil
    new 4 Phil
    old1 1 Mike
    old2 2 Mike
    new 1 Jeff
    new 2 Jeff


    What I need to do is create a query that gets values for users based on the source and the value. It should follow this rule:




    For every user, get the highest value. However, disregard the 'new'
    source if either 'old1' or 'old2' exists for that user.




    So based on those rules, my query should return the following from this table:



     Value    User
    ======= ======
    3 Phil
    2 Mike
    2 Jeff


    I've come up with a query that does close to what is asked:



    SELECT      MAX([Value]), [User]
    FROM
    (
    SELECT CASE [Source]
    WHEN 'old1' THEN 1
    WHEN 'old2' THEN 1
    WHEN 'new' THEN 2
    END AS [SourcePriority],
    [Value],
    [User]
    FROM #UserValues
    ) MainPriority
    WHERE [SourcePriority] = 1
    GROUP BY [User]
    UNION
    SELECT MAX([Value]), [User]
    FROM
    (
    SELECT CASE [Source]
    WHEN 'old1' THEN 1
    WHEN 'old2' THEN 1
    WHEN 'new' THEN 2
    END AS [SourcePriority],
    [Value],
    [User]
    FROM #UserValues
    ) SecondaryPriority
    WHERE [SourcePriority] = 2
    GROUP BY [User]


    However this returns the following results:



     Value    User
    ======= ======
    3 Phil
    4 Phil
    2 Mike
    2 Jeff


    Obviously that extra value for Phil=4 is not desired. How should I attempt to fix this query? I also understand that this is a pretty convoluted solution and that it can probably be more easily solved by proper use of aggregates, however I'm not too familiar with aggregates yet which resulted in me resorting to a union. Essentially I'm looking for help creating the cleanest-looking solution possible.



    Here is the SQL code if anyone wanted to populate the table themselves to give it a try:



    CREATE TABLE #UserValues
    (
    [Source] VARCHAR(10),
    [Value] INT,
    [User] VARCHAR(10)
    )
    INSERT INTO #UserValues VALUES
    ('old1', 1, 'Phil'),
    ('new', 2, 'Phil'),
    ('old2', 3, 'Phil'),
    ('new', 4, 'Phil'),
    ('old1', 1, 'Mike'),
    ('old2', 2, 'Mike'),
    ('new', 1, 'Jeff'),
    ('new', 2, 'Jeff')









    share|improve this question

























      4












      4








      4








      I'm very bad at explaining, so let me try to lay out my issue. I have a table that resembles the following:



       Source    Value    User
      ======== ======= ======
      old1 1 Phil
      new 2 Phil
      old2 3 Phil
      new 4 Phil
      old1 1 Mike
      old2 2 Mike
      new 1 Jeff
      new 2 Jeff


      What I need to do is create a query that gets values for users based on the source and the value. It should follow this rule:




      For every user, get the highest value. However, disregard the 'new'
      source if either 'old1' or 'old2' exists for that user.




      So based on those rules, my query should return the following from this table:



       Value    User
      ======= ======
      3 Phil
      2 Mike
      2 Jeff


      I've come up with a query that does close to what is asked:



      SELECT      MAX([Value]), [User]
      FROM
      (
      SELECT CASE [Source]
      WHEN 'old1' THEN 1
      WHEN 'old2' THEN 1
      WHEN 'new' THEN 2
      END AS [SourcePriority],
      [Value],
      [User]
      FROM #UserValues
      ) MainPriority
      WHERE [SourcePriority] = 1
      GROUP BY [User]
      UNION
      SELECT MAX([Value]), [User]
      FROM
      (
      SELECT CASE [Source]
      WHEN 'old1' THEN 1
      WHEN 'old2' THEN 1
      WHEN 'new' THEN 2
      END AS [SourcePriority],
      [Value],
      [User]
      FROM #UserValues
      ) SecondaryPriority
      WHERE [SourcePriority] = 2
      GROUP BY [User]


      However this returns the following results:



       Value    User
      ======= ======
      3 Phil
      4 Phil
      2 Mike
      2 Jeff


      Obviously that extra value for Phil=4 is not desired. How should I attempt to fix this query? I also understand that this is a pretty convoluted solution and that it can probably be more easily solved by proper use of aggregates, however I'm not too familiar with aggregates yet which resulted in me resorting to a union. Essentially I'm looking for help creating the cleanest-looking solution possible.



      Here is the SQL code if anyone wanted to populate the table themselves to give it a try:



      CREATE TABLE #UserValues
      (
      [Source] VARCHAR(10),
      [Value] INT,
      [User] VARCHAR(10)
      )
      INSERT INTO #UserValues VALUES
      ('old1', 1, 'Phil'),
      ('new', 2, 'Phil'),
      ('old2', 3, 'Phil'),
      ('new', 4, 'Phil'),
      ('old1', 1, 'Mike'),
      ('old2', 2, 'Mike'),
      ('new', 1, 'Jeff'),
      ('new', 2, 'Jeff')









      share|improve this question














      I'm very bad at explaining, so let me try to lay out my issue. I have a table that resembles the following:



       Source    Value    User
      ======== ======= ======
      old1 1 Phil
      new 2 Phil
      old2 3 Phil
      new 4 Phil
      old1 1 Mike
      old2 2 Mike
      new 1 Jeff
      new 2 Jeff


      What I need to do is create a query that gets values for users based on the source and the value. It should follow this rule:




      For every user, get the highest value. However, disregard the 'new'
      source if either 'old1' or 'old2' exists for that user.




      So based on those rules, my query should return the following from this table:



       Value    User
      ======= ======
      3 Phil
      2 Mike
      2 Jeff


      I've come up with a query that does close to what is asked:



      SELECT      MAX([Value]), [User]
      FROM
      (
      SELECT CASE [Source]
      WHEN 'old1' THEN 1
      WHEN 'old2' THEN 1
      WHEN 'new' THEN 2
      END AS [SourcePriority],
      [Value],
      [User]
      FROM #UserValues
      ) MainPriority
      WHERE [SourcePriority] = 1
      GROUP BY [User]
      UNION
      SELECT MAX([Value]), [User]
      FROM
      (
      SELECT CASE [Source]
      WHEN 'old1' THEN 1
      WHEN 'old2' THEN 1
      WHEN 'new' THEN 2
      END AS [SourcePriority],
      [Value],
      [User]
      FROM #UserValues
      ) SecondaryPriority
      WHERE [SourcePriority] = 2
      GROUP BY [User]


      However this returns the following results:



       Value    User
      ======= ======
      3 Phil
      4 Phil
      2 Mike
      2 Jeff


      Obviously that extra value for Phil=4 is not desired. How should I attempt to fix this query? I also understand that this is a pretty convoluted solution and that it can probably be more easily solved by proper use of aggregates, however I'm not too familiar with aggregates yet which resulted in me resorting to a union. Essentially I'm looking for help creating the cleanest-looking solution possible.



      Here is the SQL code if anyone wanted to populate the table themselves to give it a try:



      CREATE TABLE #UserValues
      (
      [Source] VARCHAR(10),
      [Value] INT,
      [User] VARCHAR(10)
      )
      INSERT INTO #UserValues VALUES
      ('old1', 1, 'Phil'),
      ('new', 2, 'Phil'),
      ('old2', 3, 'Phil'),
      ('new', 4, 'Phil'),
      ('old1', 1, 'Mike'),
      ('old2', 2, 'Mike'),
      ('new', 1, 'Jeff'),
      ('new', 2, 'Jeff')






      sql sql-server tsql






      share|improve this question













      share|improve this question











      share|improve this question




      share|improve this question










      asked Nov 20 '18 at 17:25









      Jon WarrenJon Warren

      137112




      137112
























          5 Answers
          5






          active

          oldest

          votes


















          1














          You can use priorities order by with row_number() :



          select top (1) with ties uv.*
          from #UserValues uv
          order by row_number() over (partition by [user]
          order by (case when source = 'old2' then 1 when source = 'old1' then 2 else 3 end), value desc
          );


          However, if you have only source limited with 3 then you can also do :



          . . . 
          order by row_number() over (partition by [user]
          order by (case when source = 'new' then 2 else 1 end), value desc
          )





          share|improve this answer
























          • Thanks for the solutions everyone! This one, when adapted to fit my actual business problem, ended up being the most compact solution as well as providing the results quickest

            – Jon Warren
            Nov 21 '18 at 13:29



















          2














          You can solve it fairly easily without resorting to window functions. In this case, you need the maximum value where ((not new) OR (there isn't an old1 or old2 entry)).



          Here's a query that works correctly with your sample data:



          SELECT
          MAX(U1.[Value]) as 'Value'
          ,U1.[User]
          FROM
          #UserValues U1
          WHERE
          U1.[Source] <> 'new'
          OR NOT EXISTS (SELECT * FROM #UserValues U2 WHERE U2.[User] = U1.[User] AND U2.[Source] IN ('old1','old2'))
          GROUP BY U1.[User]





          share|improve this answer































            1














            with raw_data
            as (
            select row_number() over(partition by a.[user] order by a.value desc) as rnk
            ,count(case when a.source in('old1','old2') then 1 end) over(partition by a.[user]) as cnt_old
            ,a.*
            from uservalues a
            )
            ,curated_data
            as(select *
            ,row_number() over(partition by rd.[user] order by rd.value desc) as rnk2
            from raw_data rd
            where 0 = case when rnk=1 and source='new' and cnt_old>0 then 1 else 0 end
            )
            select *
            from curated_data
            where rnk2=1


            I am doing the following




            1. raw_data ->First i get rank the values on the basis of max available value per user. Also i get to check if the user has any records which are pegged at old1 or old2 in the source column


            2. curated_data ->i eliminate records which have the highest value(rnk=1) as new if they have cnt_old >0. Also now i rank(rnk2) the records on the highest value available from this result set.


            3. I select the highest available value from curated_data(ie rnk2=1)







            share|improve this answer































              1














              I think you should consider setting up an XREF table to define which source is what priority, for a possible more complicated priorisation in the future. I do it with a temp table:



              CREATE TABLE #SourcePriority
              (
              [Source] VARCHAR(10),
              [SourcePriority] INT
              )
              INSERT INTO #SourcePriority VALUES
              ('old1', 1),
              ('old2', 1),
              ('new', 2)


              You might also create a View to look up the SourcePriority to the original table. I do it wit a CTE + possible implementation how to look up the top priority with the highest value:



              ;WITH CTE as (
              SELECT s.[SourcePriority], u.[Value], u.[User]
              FROM #UserValues as u
              INNER JOIN #SourcePriority as s on u.[Source] = s.[Source]
              )
              SELECT MAX (v.[Value]) as [Value], v.[User]
              FROM (
              SELECT MIN ([SourcePriority]) as [TopPriority], [User]
              FROM cte
              GROUP BY [User]
              ) as s
              INNER JOIN cte as v
              ON s.[User] = v.[User] and s.[TopPriority] = v.[SourcePriority]
              GROUP BY v.[User]





              share|improve this answer































                0














                I think you want:



                select top (1) with ties uv.*
                from (select uv.*,
                sum(case when source in ('old1', 'old2') then 1 else 0 end) over (partition by user) as cnt_old
                from #UserValues uv
                ) uv
                where cnt_old = 0 or source <> 'new'
                order by row_number() over (partition by user order by value desc);





                share|improve this answer

























                  Your Answer






                  StackExchange.ifUsing("editor", function () {
                  StackExchange.using("externalEditor", function () {
                  StackExchange.using("snippets", function () {
                  StackExchange.snippets.init();
                  });
                  });
                  }, "code-snippets");

                  StackExchange.ready(function() {
                  var channelOptions = {
                  tags: "".split(" "),
                  id: "1"
                  };
                  initTagRenderer("".split(" "), "".split(" "), channelOptions);

                  StackExchange.using("externalEditor", function() {
                  // Have to fire editor after snippets, if snippets enabled
                  if (StackExchange.settings.snippets.snippetsEnabled) {
                  StackExchange.using("snippets", function() {
                  createEditor();
                  });
                  }
                  else {
                  createEditor();
                  }
                  });

                  function createEditor() {
                  StackExchange.prepareEditor({
                  heartbeatType: 'answer',
                  autoActivateHeartbeat: false,
                  convertImagesToLinks: true,
                  noModals: true,
                  showLowRepImageUploadWarning: true,
                  reputationToPostImages: 10,
                  bindNavPrevention: true,
                  postfix: "",
                  imageUploader: {
                  brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
                  contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
                  allowUrls: true
                  },
                  onDemand: true,
                  discardSelector: ".discard-answer"
                  ,immediatelyShowMarkdownHelp:true
                  });


                  }
                  });














                  draft saved

                  draft discarded


















                  StackExchange.ready(
                  function () {
                  StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53398354%2ftrying-to-simplify-a-sql-query-without-union%23new-answer', 'question_page');
                  }
                  );

                  Post as a guest















                  Required, but never shown

























                  5 Answers
                  5






                  active

                  oldest

                  votes








                  5 Answers
                  5






                  active

                  oldest

                  votes









                  active

                  oldest

                  votes






                  active

                  oldest

                  votes









                  1














                  You can use priorities order by with row_number() :



                  select top (1) with ties uv.*
                  from #UserValues uv
                  order by row_number() over (partition by [user]
                  order by (case when source = 'old2' then 1 when source = 'old1' then 2 else 3 end), value desc
                  );


                  However, if you have only source limited with 3 then you can also do :



                  . . . 
                  order by row_number() over (partition by [user]
                  order by (case when source = 'new' then 2 else 1 end), value desc
                  )





                  share|improve this answer
























                  • Thanks for the solutions everyone! This one, when adapted to fit my actual business problem, ended up being the most compact solution as well as providing the results quickest

                    – Jon Warren
                    Nov 21 '18 at 13:29
















                  1














                  You can use priorities order by with row_number() :



                  select top (1) with ties uv.*
                  from #UserValues uv
                  order by row_number() over (partition by [user]
                  order by (case when source = 'old2' then 1 when source = 'old1' then 2 else 3 end), value desc
                  );


                  However, if you have only source limited with 3 then you can also do :



                  . . . 
                  order by row_number() over (partition by [user]
                  order by (case when source = 'new' then 2 else 1 end), value desc
                  )





                  share|improve this answer
























                  • Thanks for the solutions everyone! This one, when adapted to fit my actual business problem, ended up being the most compact solution as well as providing the results quickest

                    – Jon Warren
                    Nov 21 '18 at 13:29














                  1












                  1








                  1







                  You can use priorities order by with row_number() :



                  select top (1) with ties uv.*
                  from #UserValues uv
                  order by row_number() over (partition by [user]
                  order by (case when source = 'old2' then 1 when source = 'old1' then 2 else 3 end), value desc
                  );


                  However, if you have only source limited with 3 then you can also do :



                  . . . 
                  order by row_number() over (partition by [user]
                  order by (case when source = 'new' then 2 else 1 end), value desc
                  )





                  share|improve this answer













                  You can use priorities order by with row_number() :



                  select top (1) with ties uv.*
                  from #UserValues uv
                  order by row_number() over (partition by [user]
                  order by (case when source = 'old2' then 1 when source = 'old1' then 2 else 3 end), value desc
                  );


                  However, if you have only source limited with 3 then you can also do :



                  . . . 
                  order by row_number() over (partition by [user]
                  order by (case when source = 'new' then 2 else 1 end), value desc
                  )






                  share|improve this answer












                  share|improve this answer



                  share|improve this answer










                  answered Nov 20 '18 at 17:32









                  Yogesh SharmaYogesh Sharma

                  32.6k51438




                  32.6k51438













                  • Thanks for the solutions everyone! This one, when adapted to fit my actual business problem, ended up being the most compact solution as well as providing the results quickest

                    – Jon Warren
                    Nov 21 '18 at 13:29



















                  • Thanks for the solutions everyone! This one, when adapted to fit my actual business problem, ended up being the most compact solution as well as providing the results quickest

                    – Jon Warren
                    Nov 21 '18 at 13:29

















                  Thanks for the solutions everyone! This one, when adapted to fit my actual business problem, ended up being the most compact solution as well as providing the results quickest

                  – Jon Warren
                  Nov 21 '18 at 13:29





                  Thanks for the solutions everyone! This one, when adapted to fit my actual business problem, ended up being the most compact solution as well as providing the results quickest

                  – Jon Warren
                  Nov 21 '18 at 13:29













                  2














                  You can solve it fairly easily without resorting to window functions. In this case, you need the maximum value where ((not new) OR (there isn't an old1 or old2 entry)).



                  Here's a query that works correctly with your sample data:



                  SELECT
                  MAX(U1.[Value]) as 'Value'
                  ,U1.[User]
                  FROM
                  #UserValues U1
                  WHERE
                  U1.[Source] <> 'new'
                  OR NOT EXISTS (SELECT * FROM #UserValues U2 WHERE U2.[User] = U1.[User] AND U2.[Source] IN ('old1','old2'))
                  GROUP BY U1.[User]





                  share|improve this answer




























                    2














                    You can solve it fairly easily without resorting to window functions. In this case, you need the maximum value where ((not new) OR (there isn't an old1 or old2 entry)).



                    Here's a query that works correctly with your sample data:



                    SELECT
                    MAX(U1.[Value]) as 'Value'
                    ,U1.[User]
                    FROM
                    #UserValues U1
                    WHERE
                    U1.[Source] <> 'new'
                    OR NOT EXISTS (SELECT * FROM #UserValues U2 WHERE U2.[User] = U1.[User] AND U2.[Source] IN ('old1','old2'))
                    GROUP BY U1.[User]





                    share|improve this answer


























                      2












                      2








                      2







                      You can solve it fairly easily without resorting to window functions. In this case, you need the maximum value where ((not new) OR (there isn't an old1 or old2 entry)).



                      Here's a query that works correctly with your sample data:



                      SELECT
                      MAX(U1.[Value]) as 'Value'
                      ,U1.[User]
                      FROM
                      #UserValues U1
                      WHERE
                      U1.[Source] <> 'new'
                      OR NOT EXISTS (SELECT * FROM #UserValues U2 WHERE U2.[User] = U1.[User] AND U2.[Source] IN ('old1','old2'))
                      GROUP BY U1.[User]





                      share|improve this answer













                      You can solve it fairly easily without resorting to window functions. In this case, you need the maximum value where ((not new) OR (there isn't an old1 or old2 entry)).



                      Here's a query that works correctly with your sample data:



                      SELECT
                      MAX(U1.[Value]) as 'Value'
                      ,U1.[User]
                      FROM
                      #UserValues U1
                      WHERE
                      U1.[Source] <> 'new'
                      OR NOT EXISTS (SELECT * FROM #UserValues U2 WHERE U2.[User] = U1.[User] AND U2.[Source] IN ('old1','old2'))
                      GROUP BY U1.[User]






                      share|improve this answer












                      share|improve this answer



                      share|improve this answer










                      answered Nov 20 '18 at 17:39









                      Shawn PenceShawn Pence

                      14125




                      14125























                          1














                          with raw_data
                          as (
                          select row_number() over(partition by a.[user] order by a.value desc) as rnk
                          ,count(case when a.source in('old1','old2') then 1 end) over(partition by a.[user]) as cnt_old
                          ,a.*
                          from uservalues a
                          )
                          ,curated_data
                          as(select *
                          ,row_number() over(partition by rd.[user] order by rd.value desc) as rnk2
                          from raw_data rd
                          where 0 = case when rnk=1 and source='new' and cnt_old>0 then 1 else 0 end
                          )
                          select *
                          from curated_data
                          where rnk2=1


                          I am doing the following




                          1. raw_data ->First i get rank the values on the basis of max available value per user. Also i get to check if the user has any records which are pegged at old1 or old2 in the source column


                          2. curated_data ->i eliminate records which have the highest value(rnk=1) as new if they have cnt_old >0. Also now i rank(rnk2) the records on the highest value available from this result set.


                          3. I select the highest available value from curated_data(ie rnk2=1)







                          share|improve this answer




























                            1














                            with raw_data
                            as (
                            select row_number() over(partition by a.[user] order by a.value desc) as rnk
                            ,count(case when a.source in('old1','old2') then 1 end) over(partition by a.[user]) as cnt_old
                            ,a.*
                            from uservalues a
                            )
                            ,curated_data
                            as(select *
                            ,row_number() over(partition by rd.[user] order by rd.value desc) as rnk2
                            from raw_data rd
                            where 0 = case when rnk=1 and source='new' and cnt_old>0 then 1 else 0 end
                            )
                            select *
                            from curated_data
                            where rnk2=1


                            I am doing the following




                            1. raw_data ->First i get rank the values on the basis of max available value per user. Also i get to check if the user has any records which are pegged at old1 or old2 in the source column


                            2. curated_data ->i eliminate records which have the highest value(rnk=1) as new if they have cnt_old >0. Also now i rank(rnk2) the records on the highest value available from this result set.


                            3. I select the highest available value from curated_data(ie rnk2=1)







                            share|improve this answer


























                              1












                              1








                              1







                              with raw_data
                              as (
                              select row_number() over(partition by a.[user] order by a.value desc) as rnk
                              ,count(case when a.source in('old1','old2') then 1 end) over(partition by a.[user]) as cnt_old
                              ,a.*
                              from uservalues a
                              )
                              ,curated_data
                              as(select *
                              ,row_number() over(partition by rd.[user] order by rd.value desc) as rnk2
                              from raw_data rd
                              where 0 = case when rnk=1 and source='new' and cnt_old>0 then 1 else 0 end
                              )
                              select *
                              from curated_data
                              where rnk2=1


                              I am doing the following




                              1. raw_data ->First i get rank the values on the basis of max available value per user. Also i get to check if the user has any records which are pegged at old1 or old2 in the source column


                              2. curated_data ->i eliminate records which have the highest value(rnk=1) as new if they have cnt_old >0. Also now i rank(rnk2) the records on the highest value available from this result set.


                              3. I select the highest available value from curated_data(ie rnk2=1)







                              share|improve this answer













                              with raw_data
                              as (
                              select row_number() over(partition by a.[user] order by a.value desc) as rnk
                              ,count(case when a.source in('old1','old2') then 1 end) over(partition by a.[user]) as cnt_old
                              ,a.*
                              from uservalues a
                              )
                              ,curated_data
                              as(select *
                              ,row_number() over(partition by rd.[user] order by rd.value desc) as rnk2
                              from raw_data rd
                              where 0 = case when rnk=1 and source='new' and cnt_old>0 then 1 else 0 end
                              )
                              select *
                              from curated_data
                              where rnk2=1


                              I am doing the following




                              1. raw_data ->First i get rank the values on the basis of max available value per user. Also i get to check if the user has any records which are pegged at old1 or old2 in the source column


                              2. curated_data ->i eliminate records which have the highest value(rnk=1) as new if they have cnt_old >0. Also now i rank(rnk2) the records on the highest value available from this result set.


                              3. I select the highest available value from curated_data(ie rnk2=1)








                              share|improve this answer












                              share|improve this answer



                              share|improve this answer










                              answered Nov 20 '18 at 17:52









                              George JosephGeorge Joseph

                              1,59059




                              1,59059























                                  1














                                  I think you should consider setting up an XREF table to define which source is what priority, for a possible more complicated priorisation in the future. I do it with a temp table:



                                  CREATE TABLE #SourcePriority
                                  (
                                  [Source] VARCHAR(10),
                                  [SourcePriority] INT
                                  )
                                  INSERT INTO #SourcePriority VALUES
                                  ('old1', 1),
                                  ('old2', 1),
                                  ('new', 2)


                                  You might also create a View to look up the SourcePriority to the original table. I do it wit a CTE + possible implementation how to look up the top priority with the highest value:



                                  ;WITH CTE as (
                                  SELECT s.[SourcePriority], u.[Value], u.[User]
                                  FROM #UserValues as u
                                  INNER JOIN #SourcePriority as s on u.[Source] = s.[Source]
                                  )
                                  SELECT MAX (v.[Value]) as [Value], v.[User]
                                  FROM (
                                  SELECT MIN ([SourcePriority]) as [TopPriority], [User]
                                  FROM cte
                                  GROUP BY [User]
                                  ) as s
                                  INNER JOIN cte as v
                                  ON s.[User] = v.[User] and s.[TopPriority] = v.[SourcePriority]
                                  GROUP BY v.[User]





                                  share|improve this answer




























                                    1














                                    I think you should consider setting up an XREF table to define which source is what priority, for a possible more complicated priorisation in the future. I do it with a temp table:



                                    CREATE TABLE #SourcePriority
                                    (
                                    [Source] VARCHAR(10),
                                    [SourcePriority] INT
                                    )
                                    INSERT INTO #SourcePriority VALUES
                                    ('old1', 1),
                                    ('old2', 1),
                                    ('new', 2)


                                    You might also create a View to look up the SourcePriority to the original table. I do it wit a CTE + possible implementation how to look up the top priority with the highest value:



                                    ;WITH CTE as (
                                    SELECT s.[SourcePriority], u.[Value], u.[User]
                                    FROM #UserValues as u
                                    INNER JOIN #SourcePriority as s on u.[Source] = s.[Source]
                                    )
                                    SELECT MAX (v.[Value]) as [Value], v.[User]
                                    FROM (
                                    SELECT MIN ([SourcePriority]) as [TopPriority], [User]
                                    FROM cte
                                    GROUP BY [User]
                                    ) as s
                                    INNER JOIN cte as v
                                    ON s.[User] = v.[User] and s.[TopPriority] = v.[SourcePriority]
                                    GROUP BY v.[User]





                                    share|improve this answer


























                                      1












                                      1








                                      1







                                      I think you should consider setting up an XREF table to define which source is what priority, for a possible more complicated priorisation in the future. I do it with a temp table:



                                      CREATE TABLE #SourcePriority
                                      (
                                      [Source] VARCHAR(10),
                                      [SourcePriority] INT
                                      )
                                      INSERT INTO #SourcePriority VALUES
                                      ('old1', 1),
                                      ('old2', 1),
                                      ('new', 2)


                                      You might also create a View to look up the SourcePriority to the original table. I do it wit a CTE + possible implementation how to look up the top priority with the highest value:



                                      ;WITH CTE as (
                                      SELECT s.[SourcePriority], u.[Value], u.[User]
                                      FROM #UserValues as u
                                      INNER JOIN #SourcePriority as s on u.[Source] = s.[Source]
                                      )
                                      SELECT MAX (v.[Value]) as [Value], v.[User]
                                      FROM (
                                      SELECT MIN ([SourcePriority]) as [TopPriority], [User]
                                      FROM cte
                                      GROUP BY [User]
                                      ) as s
                                      INNER JOIN cte as v
                                      ON s.[User] = v.[User] and s.[TopPriority] = v.[SourcePriority]
                                      GROUP BY v.[User]





                                      share|improve this answer













                                      I think you should consider setting up an XREF table to define which source is what priority, for a possible more complicated priorisation in the future. I do it with a temp table:



                                      CREATE TABLE #SourcePriority
                                      (
                                      [Source] VARCHAR(10),
                                      [SourcePriority] INT
                                      )
                                      INSERT INTO #SourcePriority VALUES
                                      ('old1', 1),
                                      ('old2', 1),
                                      ('new', 2)


                                      You might also create a View to look up the SourcePriority to the original table. I do it wit a CTE + possible implementation how to look up the top priority with the highest value:



                                      ;WITH CTE as (
                                      SELECT s.[SourcePriority], u.[Value], u.[User]
                                      FROM #UserValues as u
                                      INNER JOIN #SourcePriority as s on u.[Source] = s.[Source]
                                      )
                                      SELECT MAX (v.[Value]) as [Value], v.[User]
                                      FROM (
                                      SELECT MIN ([SourcePriority]) as [TopPriority], [User]
                                      FROM cte
                                      GROUP BY [User]
                                      ) as s
                                      INNER JOIN cte as v
                                      ON s.[User] = v.[User] and s.[TopPriority] = v.[SourcePriority]
                                      GROUP BY v.[User]






                                      share|improve this answer












                                      share|improve this answer



                                      share|improve this answer










                                      answered Nov 20 '18 at 18:33









                                      Dávid LaczkóDávid Laczkó

                                      429128




                                      429128























                                          0














                                          I think you want:



                                          select top (1) with ties uv.*
                                          from (select uv.*,
                                          sum(case when source in ('old1', 'old2') then 1 else 0 end) over (partition by user) as cnt_old
                                          from #UserValues uv
                                          ) uv
                                          where cnt_old = 0 or source <> 'new'
                                          order by row_number() over (partition by user order by value desc);





                                          share|improve this answer






























                                            0














                                            I think you want:



                                            select top (1) with ties uv.*
                                            from (select uv.*,
                                            sum(case when source in ('old1', 'old2') then 1 else 0 end) over (partition by user) as cnt_old
                                            from #UserValues uv
                                            ) uv
                                            where cnt_old = 0 or source <> 'new'
                                            order by row_number() over (partition by user order by value desc);





                                            share|improve this answer




























                                              0












                                              0








                                              0







                                              I think you want:



                                              select top (1) with ties uv.*
                                              from (select uv.*,
                                              sum(case when source in ('old1', 'old2') then 1 else 0 end) over (partition by user) as cnt_old
                                              from #UserValues uv
                                              ) uv
                                              where cnt_old = 0 or source <> 'new'
                                              order by row_number() over (partition by user order by value desc);





                                              share|improve this answer















                                              I think you want:



                                              select top (1) with ties uv.*
                                              from (select uv.*,
                                              sum(case when source in ('old1', 'old2') then 1 else 0 end) over (partition by user) as cnt_old
                                              from #UserValues uv
                                              ) uv
                                              where cnt_old = 0 or source <> 'new'
                                              order by row_number() over (partition by user order by value desc);






                                              share|improve this answer














                                              share|improve this answer



                                              share|improve this answer








                                              edited Nov 20 '18 at 18:39

























                                              answered Nov 20 '18 at 17:47









                                              Gordon LinoffGordon Linoff

                                              783k35310414




                                              783k35310414






























                                                  draft saved

                                                  draft discarded




















































                                                  Thanks for contributing an answer to Stack Overflow!


                                                  • Please be sure to answer the question. Provide details and share your research!

                                                  But avoid



                                                  • Asking for help, clarification, or responding to other answers.

                                                  • Making statements based on opinion; back them up with references or personal experience.


                                                  To learn more, see our tips on writing great answers.




                                                  draft saved


                                                  draft discarded














                                                  StackExchange.ready(
                                                  function () {
                                                  StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53398354%2ftrying-to-simplify-a-sql-query-without-union%23new-answer', 'question_page');
                                                  }
                                                  );

                                                  Post as a guest















                                                  Required, but never shown





















































                                                  Required, but never shown














                                                  Required, but never shown












                                                  Required, but never shown







                                                  Required, but never shown

































                                                  Required, but never shown














                                                  Required, but never shown












                                                  Required, but never shown







                                                  Required, but never shown







                                                  Popular posts from this blog

                                                  Guess what letter conforming each word

                                                  Run scheduled task as local user group (not BUILTIN)

                                                  Port of Spain