Amazon Redshift query to get delinquent amount and days past due at the end of month





.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty{ height:90px;width:728px;box-sizing:border-box;
}







0















Changing the question because of a misunderstanding in use case.



Amazon Redshift Query for the following problem statement.



The data structure:




  1. id - primary key

  2. acc_id - id unique to a loan account (this id will be same for all
    emi's for a particular loan account, this maybe repeated 6 times or
    12 times based on loan tenure which can be 6 months or 12 months
    respectively)

  3. status - PAID or UNPAID (the emi's unpaid are followed my unpaid
    emi's only)

  4. s_id - just a scheduling id which would be consecutive numbers for a
    a particular loan id

  5. due_date - the due date for that particular emi

  6. principal - amount that is due


The table:



   id       acc_id status   s_id    due_date            principal
9999957 10003 PAID 102 2018-07-02 12:00:00 4205
9999958 10003 UNPAID 103 2018-08-02 12:00:00 4100
9999959 10003 UNPAID 104 2018-09-02 12:00:00 4266
9999960 10003 UNPAID 105 2018-10-02 12:00:00 4286
9999962 10004 PAID 106 2018-07-02 12:00:00 3200
9999963 10004 PAID 107 2018-08-02 12:00:00 3100
9999964 10004 UNPAID 108 2018-09-02 12:00:00 3266
9999965 10004 UNPAID 109 2018-10-02 12:00:00 3286


The use case -



The unpaid amount becomes delinquent (overdue) after the due_date.



So I need to calculate delinquent amount at the end of every month from the first due_date in this case is 2nd July to last due_date (assume it to be 2nd November which is the current month)



I also need to calculate days past due at the end of that month.



Illustration from the above data:




  • From the sample data provided, no EMI is due at the end of July so amount delinquent is 0

  • But at the end of August - the id 9999958 is due - as of 31st August
    the amount delinquent is 4100 and days past due is 29 (31st August minus 2nd August)


The catch: I need to calculate this for the loan (acc_id) and not the emi.



To further explain, A first EMI will be 29 days due on 1st month and 59 days due on second month, also second EMI will be 29 days due on second month. But I need this at loan level (acc_id).



The same example continued for 30th september, the acc_id 10003 is due since 2nd August so as of 30th September the due amount is 8366 (4100 + 4266) and DPD (days_past_due) is 59 (29 + 30).



Also acc_id 10004 is due 3100 and DPD is 28 (30th september - 2nd september).



The final output would be something like this:



Month_End   DPD_Band    Amount
2018/08/31 0-29 4100
2018/08/31 30-59 0
2018/08/31 60-89 0
2018/08/31 90+ 0
2018/09/30 0-29 3100
2018/09/30 30-59 8366
2018/09/30 60-89 0
2018/09/30 90+ 0


Query attempt: DPD bands can be created based on case statements on delinquent days. I need real help in first creating End-of-months and then finding the portfolio level amounts as explained above for different delinquent days.










share|improve this question




















  • 1





    I have removed all those conflicting dbms tags. Put one of them back, the one for the dbms actually used.

    – jarlh
    Nov 22 '18 at 10:16













  • Specify the expected result as well - as formatted text. And show us your current query attempt.

    – jarlh
    Nov 22 '18 at 10:17











  • Okay, new here. Thanks for noticing.

    – Abhinav sql
    Nov 22 '18 at 10:20











  • why mysql tag? is this mysql or redshift?

    – Jon Scott
    Nov 22 '18 at 13:16











  • The data is avialble on both mysql and Redshift.

    – Abhinav sql
    Nov 22 '18 at 13:46


















0















Changing the question because of a misunderstanding in use case.



Amazon Redshift Query for the following problem statement.



The data structure:




  1. id - primary key

  2. acc_id - id unique to a loan account (this id will be same for all
    emi's for a particular loan account, this maybe repeated 6 times or
    12 times based on loan tenure which can be 6 months or 12 months
    respectively)

  3. status - PAID or UNPAID (the emi's unpaid are followed my unpaid
    emi's only)

  4. s_id - just a scheduling id which would be consecutive numbers for a
    a particular loan id

  5. due_date - the due date for that particular emi

  6. principal - amount that is due


The table:



   id       acc_id status   s_id    due_date            principal
9999957 10003 PAID 102 2018-07-02 12:00:00 4205
9999958 10003 UNPAID 103 2018-08-02 12:00:00 4100
9999959 10003 UNPAID 104 2018-09-02 12:00:00 4266
9999960 10003 UNPAID 105 2018-10-02 12:00:00 4286
9999962 10004 PAID 106 2018-07-02 12:00:00 3200
9999963 10004 PAID 107 2018-08-02 12:00:00 3100
9999964 10004 UNPAID 108 2018-09-02 12:00:00 3266
9999965 10004 UNPAID 109 2018-10-02 12:00:00 3286


The use case -



The unpaid amount becomes delinquent (overdue) after the due_date.



So I need to calculate delinquent amount at the end of every month from the first due_date in this case is 2nd July to last due_date (assume it to be 2nd November which is the current month)



I also need to calculate days past due at the end of that month.



Illustration from the above data:




  • From the sample data provided, no EMI is due at the end of July so amount delinquent is 0

  • But at the end of August - the id 9999958 is due - as of 31st August
    the amount delinquent is 4100 and days past due is 29 (31st August minus 2nd August)


The catch: I need to calculate this for the loan (acc_id) and not the emi.



To further explain, A first EMI will be 29 days due on 1st month and 59 days due on second month, also second EMI will be 29 days due on second month. But I need this at loan level (acc_id).



The same example continued for 30th september, the acc_id 10003 is due since 2nd August so as of 30th September the due amount is 8366 (4100 + 4266) and DPD (days_past_due) is 59 (29 + 30).



Also acc_id 10004 is due 3100 and DPD is 28 (30th september - 2nd september).



The final output would be something like this:



Month_End   DPD_Band    Amount
2018/08/31 0-29 4100
2018/08/31 30-59 0
2018/08/31 60-89 0
2018/08/31 90+ 0
2018/09/30 0-29 3100
2018/09/30 30-59 8366
2018/09/30 60-89 0
2018/09/30 90+ 0


Query attempt: DPD bands can be created based on case statements on delinquent days. I need real help in first creating End-of-months and then finding the portfolio level amounts as explained above for different delinquent days.










share|improve this question




















  • 1





    I have removed all those conflicting dbms tags. Put one of them back, the one for the dbms actually used.

    – jarlh
    Nov 22 '18 at 10:16













  • Specify the expected result as well - as formatted text. And show us your current query attempt.

    – jarlh
    Nov 22 '18 at 10:17











  • Okay, new here. Thanks for noticing.

    – Abhinav sql
    Nov 22 '18 at 10:20











  • why mysql tag? is this mysql or redshift?

    – Jon Scott
    Nov 22 '18 at 13:16











  • The data is avialble on both mysql and Redshift.

    – Abhinav sql
    Nov 22 '18 at 13:46














0












0








0








Changing the question because of a misunderstanding in use case.



Amazon Redshift Query for the following problem statement.



The data structure:




  1. id - primary key

  2. acc_id - id unique to a loan account (this id will be same for all
    emi's for a particular loan account, this maybe repeated 6 times or
    12 times based on loan tenure which can be 6 months or 12 months
    respectively)

  3. status - PAID or UNPAID (the emi's unpaid are followed my unpaid
    emi's only)

  4. s_id - just a scheduling id which would be consecutive numbers for a
    a particular loan id

  5. due_date - the due date for that particular emi

  6. principal - amount that is due


The table:



   id       acc_id status   s_id    due_date            principal
9999957 10003 PAID 102 2018-07-02 12:00:00 4205
9999958 10003 UNPAID 103 2018-08-02 12:00:00 4100
9999959 10003 UNPAID 104 2018-09-02 12:00:00 4266
9999960 10003 UNPAID 105 2018-10-02 12:00:00 4286
9999962 10004 PAID 106 2018-07-02 12:00:00 3200
9999963 10004 PAID 107 2018-08-02 12:00:00 3100
9999964 10004 UNPAID 108 2018-09-02 12:00:00 3266
9999965 10004 UNPAID 109 2018-10-02 12:00:00 3286


The use case -



The unpaid amount becomes delinquent (overdue) after the due_date.



So I need to calculate delinquent amount at the end of every month from the first due_date in this case is 2nd July to last due_date (assume it to be 2nd November which is the current month)



I also need to calculate days past due at the end of that month.



Illustration from the above data:




  • From the sample data provided, no EMI is due at the end of July so amount delinquent is 0

  • But at the end of August - the id 9999958 is due - as of 31st August
    the amount delinquent is 4100 and days past due is 29 (31st August minus 2nd August)


The catch: I need to calculate this for the loan (acc_id) and not the emi.



To further explain, A first EMI will be 29 days due on 1st month and 59 days due on second month, also second EMI will be 29 days due on second month. But I need this at loan level (acc_id).



The same example continued for 30th september, the acc_id 10003 is due since 2nd August so as of 30th September the due amount is 8366 (4100 + 4266) and DPD (days_past_due) is 59 (29 + 30).



Also acc_id 10004 is due 3100 and DPD is 28 (30th september - 2nd september).



The final output would be something like this:



Month_End   DPD_Band    Amount
2018/08/31 0-29 4100
2018/08/31 30-59 0
2018/08/31 60-89 0
2018/08/31 90+ 0
2018/09/30 0-29 3100
2018/09/30 30-59 8366
2018/09/30 60-89 0
2018/09/30 90+ 0


Query attempt: DPD bands can be created based on case statements on delinquent days. I need real help in first creating End-of-months and then finding the portfolio level amounts as explained above for different delinquent days.










share|improve this question
















Changing the question because of a misunderstanding in use case.



Amazon Redshift Query for the following problem statement.



The data structure:




  1. id - primary key

  2. acc_id - id unique to a loan account (this id will be same for all
    emi's for a particular loan account, this maybe repeated 6 times or
    12 times based on loan tenure which can be 6 months or 12 months
    respectively)

  3. status - PAID or UNPAID (the emi's unpaid are followed my unpaid
    emi's only)

  4. s_id - just a scheduling id which would be consecutive numbers for a
    a particular loan id

  5. due_date - the due date for that particular emi

  6. principal - amount that is due


The table:



   id       acc_id status   s_id    due_date            principal
9999957 10003 PAID 102 2018-07-02 12:00:00 4205
9999958 10003 UNPAID 103 2018-08-02 12:00:00 4100
9999959 10003 UNPAID 104 2018-09-02 12:00:00 4266
9999960 10003 UNPAID 105 2018-10-02 12:00:00 4286
9999962 10004 PAID 106 2018-07-02 12:00:00 3200
9999963 10004 PAID 107 2018-08-02 12:00:00 3100
9999964 10004 UNPAID 108 2018-09-02 12:00:00 3266
9999965 10004 UNPAID 109 2018-10-02 12:00:00 3286


The use case -



The unpaid amount becomes delinquent (overdue) after the due_date.



So I need to calculate delinquent amount at the end of every month from the first due_date in this case is 2nd July to last due_date (assume it to be 2nd November which is the current month)



I also need to calculate days past due at the end of that month.



Illustration from the above data:




  • From the sample data provided, no EMI is due at the end of July so amount delinquent is 0

  • But at the end of August - the id 9999958 is due - as of 31st August
    the amount delinquent is 4100 and days past due is 29 (31st August minus 2nd August)


The catch: I need to calculate this for the loan (acc_id) and not the emi.



To further explain, A first EMI will be 29 days due on 1st month and 59 days due on second month, also second EMI will be 29 days due on second month. But I need this at loan level (acc_id).



The same example continued for 30th september, the acc_id 10003 is due since 2nd August so as of 30th September the due amount is 8366 (4100 + 4266) and DPD (days_past_due) is 59 (29 + 30).



Also acc_id 10004 is due 3100 and DPD is 28 (30th september - 2nd september).



The final output would be something like this:



Month_End   DPD_Band    Amount
2018/08/31 0-29 4100
2018/08/31 30-59 0
2018/08/31 60-89 0
2018/08/31 90+ 0
2018/09/30 0-29 3100
2018/09/30 30-59 8366
2018/09/30 60-89 0
2018/09/30 90+ 0


Query attempt: DPD bands can be created based on case statements on delinquent days. I need real help in first creating End-of-months and then finding the portfolio level amounts as explained above for different delinquent days.







mysql sql amazon-redshift






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 22 '18 at 14:55







Abhinav sql

















asked Nov 22 '18 at 10:15









Abhinav sqlAbhinav sql

32




32








  • 1





    I have removed all those conflicting dbms tags. Put one of them back, the one for the dbms actually used.

    – jarlh
    Nov 22 '18 at 10:16













  • Specify the expected result as well - as formatted text. And show us your current query attempt.

    – jarlh
    Nov 22 '18 at 10:17











  • Okay, new here. Thanks for noticing.

    – Abhinav sql
    Nov 22 '18 at 10:20











  • why mysql tag? is this mysql or redshift?

    – Jon Scott
    Nov 22 '18 at 13:16











  • The data is avialble on both mysql and Redshift.

    – Abhinav sql
    Nov 22 '18 at 13:46














  • 1





    I have removed all those conflicting dbms tags. Put one of them back, the one for the dbms actually used.

    – jarlh
    Nov 22 '18 at 10:16













  • Specify the expected result as well - as formatted text. And show us your current query attempt.

    – jarlh
    Nov 22 '18 at 10:17











  • Okay, new here. Thanks for noticing.

    – Abhinav sql
    Nov 22 '18 at 10:20











  • why mysql tag? is this mysql or redshift?

    – Jon Scott
    Nov 22 '18 at 13:16











  • The data is avialble on both mysql and Redshift.

    – Abhinav sql
    Nov 22 '18 at 13:46








1




1





I have removed all those conflicting dbms tags. Put one of them back, the one for the dbms actually used.

– jarlh
Nov 22 '18 at 10:16







I have removed all those conflicting dbms tags. Put one of them back, the one for the dbms actually used.

– jarlh
Nov 22 '18 at 10:16















Specify the expected result as well - as formatted text. And show us your current query attempt.

– jarlh
Nov 22 '18 at 10:17





Specify the expected result as well - as formatted text. And show us your current query attempt.

– jarlh
Nov 22 '18 at 10:17













Okay, new here. Thanks for noticing.

– Abhinav sql
Nov 22 '18 at 10:20





Okay, new here. Thanks for noticing.

– Abhinav sql
Nov 22 '18 at 10:20













why mysql tag? is this mysql or redshift?

– Jon Scott
Nov 22 '18 at 13:16





why mysql tag? is this mysql or redshift?

– Jon Scott
Nov 22 '18 at 13:16













The data is avialble on both mysql and Redshift.

– Abhinav sql
Nov 22 '18 at 13:46





The data is avialble on both mysql and Redshift.

– Abhinav sql
Nov 22 '18 at 13:46












1 Answer
1






active

oldest

votes


















0














Edited to be RedShift compatible after the op clarified which RDBMS. (MySQL would need a different answer)



The following creates one record for each month between your first record, and the end of last month.



It then joins on to your unpaid records, and the aggregation chooses which bracket to put the results in to.



WITH
first_month AS
(
SELECT LAST_DAY(MIN(due_date)) AS end_date FROM yourTable
),
months AS
(
SELECT
LAST_DAY(ADD_MONTHS(first_month.end_date, s.id)) AS end_date
FROM
first_month
CROSS JOIN
generate_series(
1,
DATEDIFF(month, (SELECT end_date FROM first_month), CURRENT_DATE)
)
AS s(id)
),
monthly_delinquents AS
(
SELECT
yourTable.*,
months.end_date AS month_end_date,
DATEDIFF(DAY, yourTable.due_date, months.end_date) AS days_past_due
FROM
months
LEFT JOIN
yourTable
ON yourTable.status = 'UNPAID'
AND yourTable.due_date < months.end_date
)
SELECT
month_end_date,
SUM(CASE WHEN days_past_due >= 00 AND days_past_due < 30 THEN principal ELSE 0 END) AS dpd_00_29,
SUM(CASE WHEN days_past_due >= 30 AND days_past_due < 60 THEN principal ELSE 0 END) AS dpd_30_59,
SUM(CASE WHEN days_past_due >= 60 AND days_past_due < 90 THEN principal ELSE 0 END) AS dpd_60_89,
SUM(CASE WHEN days_past_due >= 90 THEN principal ELSE 0 END) AS dpd_90plus
FROM
monthly_delinquents
GROUP BY
month_end_date
ORDER BY
month_end_date


That said, normally the idea of pivoting things like this is a bad idea. What happens when something is a year past due? It just sits in the 90plus category and never moves. And, if you want to expand it you need to change the query and any other query you ever write that depends on it.



Instead, you could normalise your output...



WITH
first_month AS
(
SELECT LAST_DAY(MIN(due_date)) AS end_date FROM yourTable
),
months AS
(
SELECT
LAST_DAY(ADD_MONTHS(first_month.end_date, s.id)) AS end_date
FROM
first_month
CROSS JOIN
generate_series(
1,
DATEDIFF(month, (SELECT end_date FROM first_month), CURRENT_DATE)
)
AS s(id)
),
monthly_delinquents AS
(
SELECT
yourTable.*,
months.end_date AS month_end_date,
DATEDIFF(DAY, yourTable.due_date, months.end_date) AS days_past_due
FROM
months
LEFT JOIN
yourTable
ON yourTable.status = 'UNPAID'
AND yourTable.due_date < months.end_date
)
SELECT
month_end_date,
(days_past_due / 30) * 30 AS days_past_due_band,
SUM(principal) AS total_principal,
COUNT(*) AS total_rows
FROM
monthly_delinquents
GROUP BY
month_end_date,
(days_past_due / 30) * 30
ORDER BY
month_end_date,
(days_past_due / 30) * 30





share|improve this answer


























  • This is perfect Mat. But can you see my updated use case. The only thing additional on this is instead of individual emi, the sum(principal) should be for all unpaid loans before the date for that particular user.

    – Abhinav sql
    Nov 22 '18 at 15:02











  • and dpd should be calculated from the first unpaid loan

    – Abhinav sql
    Nov 22 '18 at 15:26











  • Hey Mat, can you help me with the updated use case? I can provide more understanding, im really struggling with it

    – Abhinav sql
    Nov 23 '18 at 7:27











  • Do you want me to create another question?

    – Abhinav sql
    Nov 26 '18 at 8:15












Your Answer






StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














draft saved

draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53428626%2famazon-redshift-query-to-get-delinquent-amount-and-days-past-due-at-the-end-of-m%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes









0














Edited to be RedShift compatible after the op clarified which RDBMS. (MySQL would need a different answer)



The following creates one record for each month between your first record, and the end of last month.



It then joins on to your unpaid records, and the aggregation chooses which bracket to put the results in to.



WITH
first_month AS
(
SELECT LAST_DAY(MIN(due_date)) AS end_date FROM yourTable
),
months AS
(
SELECT
LAST_DAY(ADD_MONTHS(first_month.end_date, s.id)) AS end_date
FROM
first_month
CROSS JOIN
generate_series(
1,
DATEDIFF(month, (SELECT end_date FROM first_month), CURRENT_DATE)
)
AS s(id)
),
monthly_delinquents AS
(
SELECT
yourTable.*,
months.end_date AS month_end_date,
DATEDIFF(DAY, yourTable.due_date, months.end_date) AS days_past_due
FROM
months
LEFT JOIN
yourTable
ON yourTable.status = 'UNPAID'
AND yourTable.due_date < months.end_date
)
SELECT
month_end_date,
SUM(CASE WHEN days_past_due >= 00 AND days_past_due < 30 THEN principal ELSE 0 END) AS dpd_00_29,
SUM(CASE WHEN days_past_due >= 30 AND days_past_due < 60 THEN principal ELSE 0 END) AS dpd_30_59,
SUM(CASE WHEN days_past_due >= 60 AND days_past_due < 90 THEN principal ELSE 0 END) AS dpd_60_89,
SUM(CASE WHEN days_past_due >= 90 THEN principal ELSE 0 END) AS dpd_90plus
FROM
monthly_delinquents
GROUP BY
month_end_date
ORDER BY
month_end_date


That said, normally the idea of pivoting things like this is a bad idea. What happens when something is a year past due? It just sits in the 90plus category and never moves. And, if you want to expand it you need to change the query and any other query you ever write that depends on it.



Instead, you could normalise your output...



WITH
first_month AS
(
SELECT LAST_DAY(MIN(due_date)) AS end_date FROM yourTable
),
months AS
(
SELECT
LAST_DAY(ADD_MONTHS(first_month.end_date, s.id)) AS end_date
FROM
first_month
CROSS JOIN
generate_series(
1,
DATEDIFF(month, (SELECT end_date FROM first_month), CURRENT_DATE)
)
AS s(id)
),
monthly_delinquents AS
(
SELECT
yourTable.*,
months.end_date AS month_end_date,
DATEDIFF(DAY, yourTable.due_date, months.end_date) AS days_past_due
FROM
months
LEFT JOIN
yourTable
ON yourTable.status = 'UNPAID'
AND yourTable.due_date < months.end_date
)
SELECT
month_end_date,
(days_past_due / 30) * 30 AS days_past_due_band,
SUM(principal) AS total_principal,
COUNT(*) AS total_rows
FROM
monthly_delinquents
GROUP BY
month_end_date,
(days_past_due / 30) * 30
ORDER BY
month_end_date,
(days_past_due / 30) * 30





share|improve this answer


























  • This is perfect Mat. But can you see my updated use case. The only thing additional on this is instead of individual emi, the sum(principal) should be for all unpaid loans before the date for that particular user.

    – Abhinav sql
    Nov 22 '18 at 15:02











  • and dpd should be calculated from the first unpaid loan

    – Abhinav sql
    Nov 22 '18 at 15:26











  • Hey Mat, can you help me with the updated use case? I can provide more understanding, im really struggling with it

    – Abhinav sql
    Nov 23 '18 at 7:27











  • Do you want me to create another question?

    – Abhinav sql
    Nov 26 '18 at 8:15
















0














Edited to be RedShift compatible after the op clarified which RDBMS. (MySQL would need a different answer)



The following creates one record for each month between your first record, and the end of last month.



It then joins on to your unpaid records, and the aggregation chooses which bracket to put the results in to.



WITH
first_month AS
(
SELECT LAST_DAY(MIN(due_date)) AS end_date FROM yourTable
),
months AS
(
SELECT
LAST_DAY(ADD_MONTHS(first_month.end_date, s.id)) AS end_date
FROM
first_month
CROSS JOIN
generate_series(
1,
DATEDIFF(month, (SELECT end_date FROM first_month), CURRENT_DATE)
)
AS s(id)
),
monthly_delinquents AS
(
SELECT
yourTable.*,
months.end_date AS month_end_date,
DATEDIFF(DAY, yourTable.due_date, months.end_date) AS days_past_due
FROM
months
LEFT JOIN
yourTable
ON yourTable.status = 'UNPAID'
AND yourTable.due_date < months.end_date
)
SELECT
month_end_date,
SUM(CASE WHEN days_past_due >= 00 AND days_past_due < 30 THEN principal ELSE 0 END) AS dpd_00_29,
SUM(CASE WHEN days_past_due >= 30 AND days_past_due < 60 THEN principal ELSE 0 END) AS dpd_30_59,
SUM(CASE WHEN days_past_due >= 60 AND days_past_due < 90 THEN principal ELSE 0 END) AS dpd_60_89,
SUM(CASE WHEN days_past_due >= 90 THEN principal ELSE 0 END) AS dpd_90plus
FROM
monthly_delinquents
GROUP BY
month_end_date
ORDER BY
month_end_date


That said, normally the idea of pivoting things like this is a bad idea. What happens when something is a year past due? It just sits in the 90plus category and never moves. And, if you want to expand it you need to change the query and any other query you ever write that depends on it.



Instead, you could normalise your output...



WITH
first_month AS
(
SELECT LAST_DAY(MIN(due_date)) AS end_date FROM yourTable
),
months AS
(
SELECT
LAST_DAY(ADD_MONTHS(first_month.end_date, s.id)) AS end_date
FROM
first_month
CROSS JOIN
generate_series(
1,
DATEDIFF(month, (SELECT end_date FROM first_month), CURRENT_DATE)
)
AS s(id)
),
monthly_delinquents AS
(
SELECT
yourTable.*,
months.end_date AS month_end_date,
DATEDIFF(DAY, yourTable.due_date, months.end_date) AS days_past_due
FROM
months
LEFT JOIN
yourTable
ON yourTable.status = 'UNPAID'
AND yourTable.due_date < months.end_date
)
SELECT
month_end_date,
(days_past_due / 30) * 30 AS days_past_due_band,
SUM(principal) AS total_principal,
COUNT(*) AS total_rows
FROM
monthly_delinquents
GROUP BY
month_end_date,
(days_past_due / 30) * 30
ORDER BY
month_end_date,
(days_past_due / 30) * 30





share|improve this answer


























  • This is perfect Mat. But can you see my updated use case. The only thing additional on this is instead of individual emi, the sum(principal) should be for all unpaid loans before the date for that particular user.

    – Abhinav sql
    Nov 22 '18 at 15:02











  • and dpd should be calculated from the first unpaid loan

    – Abhinav sql
    Nov 22 '18 at 15:26











  • Hey Mat, can you help me with the updated use case? I can provide more understanding, im really struggling with it

    – Abhinav sql
    Nov 23 '18 at 7:27











  • Do you want me to create another question?

    – Abhinav sql
    Nov 26 '18 at 8:15














0












0








0







Edited to be RedShift compatible after the op clarified which RDBMS. (MySQL would need a different answer)



The following creates one record for each month between your first record, and the end of last month.



It then joins on to your unpaid records, and the aggregation chooses which bracket to put the results in to.



WITH
first_month AS
(
SELECT LAST_DAY(MIN(due_date)) AS end_date FROM yourTable
),
months AS
(
SELECT
LAST_DAY(ADD_MONTHS(first_month.end_date, s.id)) AS end_date
FROM
first_month
CROSS JOIN
generate_series(
1,
DATEDIFF(month, (SELECT end_date FROM first_month), CURRENT_DATE)
)
AS s(id)
),
monthly_delinquents AS
(
SELECT
yourTable.*,
months.end_date AS month_end_date,
DATEDIFF(DAY, yourTable.due_date, months.end_date) AS days_past_due
FROM
months
LEFT JOIN
yourTable
ON yourTable.status = 'UNPAID'
AND yourTable.due_date < months.end_date
)
SELECT
month_end_date,
SUM(CASE WHEN days_past_due >= 00 AND days_past_due < 30 THEN principal ELSE 0 END) AS dpd_00_29,
SUM(CASE WHEN days_past_due >= 30 AND days_past_due < 60 THEN principal ELSE 0 END) AS dpd_30_59,
SUM(CASE WHEN days_past_due >= 60 AND days_past_due < 90 THEN principal ELSE 0 END) AS dpd_60_89,
SUM(CASE WHEN days_past_due >= 90 THEN principal ELSE 0 END) AS dpd_90plus
FROM
monthly_delinquents
GROUP BY
month_end_date
ORDER BY
month_end_date


That said, normally the idea of pivoting things like this is a bad idea. What happens when something is a year past due? It just sits in the 90plus category and never moves. And, if you want to expand it you need to change the query and any other query you ever write that depends on it.



Instead, you could normalise your output...



WITH
first_month AS
(
SELECT LAST_DAY(MIN(due_date)) AS end_date FROM yourTable
),
months AS
(
SELECT
LAST_DAY(ADD_MONTHS(first_month.end_date, s.id)) AS end_date
FROM
first_month
CROSS JOIN
generate_series(
1,
DATEDIFF(month, (SELECT end_date FROM first_month), CURRENT_DATE)
)
AS s(id)
),
monthly_delinquents AS
(
SELECT
yourTable.*,
months.end_date AS month_end_date,
DATEDIFF(DAY, yourTable.due_date, months.end_date) AS days_past_due
FROM
months
LEFT JOIN
yourTable
ON yourTable.status = 'UNPAID'
AND yourTable.due_date < months.end_date
)
SELECT
month_end_date,
(days_past_due / 30) * 30 AS days_past_due_band,
SUM(principal) AS total_principal,
COUNT(*) AS total_rows
FROM
monthly_delinquents
GROUP BY
month_end_date,
(days_past_due / 30) * 30
ORDER BY
month_end_date,
(days_past_due / 30) * 30





share|improve this answer















Edited to be RedShift compatible after the op clarified which RDBMS. (MySQL would need a different answer)



The following creates one record for each month between your first record, and the end of last month.



It then joins on to your unpaid records, and the aggregation chooses which bracket to put the results in to.



WITH
first_month AS
(
SELECT LAST_DAY(MIN(due_date)) AS end_date FROM yourTable
),
months AS
(
SELECT
LAST_DAY(ADD_MONTHS(first_month.end_date, s.id)) AS end_date
FROM
first_month
CROSS JOIN
generate_series(
1,
DATEDIFF(month, (SELECT end_date FROM first_month), CURRENT_DATE)
)
AS s(id)
),
monthly_delinquents AS
(
SELECT
yourTable.*,
months.end_date AS month_end_date,
DATEDIFF(DAY, yourTable.due_date, months.end_date) AS days_past_due
FROM
months
LEFT JOIN
yourTable
ON yourTable.status = 'UNPAID'
AND yourTable.due_date < months.end_date
)
SELECT
month_end_date,
SUM(CASE WHEN days_past_due >= 00 AND days_past_due < 30 THEN principal ELSE 0 END) AS dpd_00_29,
SUM(CASE WHEN days_past_due >= 30 AND days_past_due < 60 THEN principal ELSE 0 END) AS dpd_30_59,
SUM(CASE WHEN days_past_due >= 60 AND days_past_due < 90 THEN principal ELSE 0 END) AS dpd_60_89,
SUM(CASE WHEN days_past_due >= 90 THEN principal ELSE 0 END) AS dpd_90plus
FROM
monthly_delinquents
GROUP BY
month_end_date
ORDER BY
month_end_date


That said, normally the idea of pivoting things like this is a bad idea. What happens when something is a year past due? It just sits in the 90plus category and never moves. And, if you want to expand it you need to change the query and any other query you ever write that depends on it.



Instead, you could normalise your output...



WITH
first_month AS
(
SELECT LAST_DAY(MIN(due_date)) AS end_date FROM yourTable
),
months AS
(
SELECT
LAST_DAY(ADD_MONTHS(first_month.end_date, s.id)) AS end_date
FROM
first_month
CROSS JOIN
generate_series(
1,
DATEDIFF(month, (SELECT end_date FROM first_month), CURRENT_DATE)
)
AS s(id)
),
monthly_delinquents AS
(
SELECT
yourTable.*,
months.end_date AS month_end_date,
DATEDIFF(DAY, yourTable.due_date, months.end_date) AS days_past_due
FROM
months
LEFT JOIN
yourTable
ON yourTable.status = 'UNPAID'
AND yourTable.due_date < months.end_date
)
SELECT
month_end_date,
(days_past_due / 30) * 30 AS days_past_due_band,
SUM(principal) AS total_principal,
COUNT(*) AS total_rows
FROM
monthly_delinquents
GROUP BY
month_end_date,
(days_past_due / 30) * 30
ORDER BY
month_end_date,
(days_past_due / 30) * 30






share|improve this answer














share|improve this answer



share|improve this answer








edited Nov 22 '18 at 14:03

























answered Nov 22 '18 at 10:32









MatBailieMatBailie

60.8k1477112




60.8k1477112













  • This is perfect Mat. But can you see my updated use case. The only thing additional on this is instead of individual emi, the sum(principal) should be for all unpaid loans before the date for that particular user.

    – Abhinav sql
    Nov 22 '18 at 15:02











  • and dpd should be calculated from the first unpaid loan

    – Abhinav sql
    Nov 22 '18 at 15:26











  • Hey Mat, can you help me with the updated use case? I can provide more understanding, im really struggling with it

    – Abhinav sql
    Nov 23 '18 at 7:27











  • Do you want me to create another question?

    – Abhinav sql
    Nov 26 '18 at 8:15



















  • This is perfect Mat. But can you see my updated use case. The only thing additional on this is instead of individual emi, the sum(principal) should be for all unpaid loans before the date for that particular user.

    – Abhinav sql
    Nov 22 '18 at 15:02











  • and dpd should be calculated from the first unpaid loan

    – Abhinav sql
    Nov 22 '18 at 15:26











  • Hey Mat, can you help me with the updated use case? I can provide more understanding, im really struggling with it

    – Abhinav sql
    Nov 23 '18 at 7:27











  • Do you want me to create another question?

    – Abhinav sql
    Nov 26 '18 at 8:15

















This is perfect Mat. But can you see my updated use case. The only thing additional on this is instead of individual emi, the sum(principal) should be for all unpaid loans before the date for that particular user.

– Abhinav sql
Nov 22 '18 at 15:02





This is perfect Mat. But can you see my updated use case. The only thing additional on this is instead of individual emi, the sum(principal) should be for all unpaid loans before the date for that particular user.

– Abhinav sql
Nov 22 '18 at 15:02













and dpd should be calculated from the first unpaid loan

– Abhinav sql
Nov 22 '18 at 15:26





and dpd should be calculated from the first unpaid loan

– Abhinav sql
Nov 22 '18 at 15:26













Hey Mat, can you help me with the updated use case? I can provide more understanding, im really struggling with it

– Abhinav sql
Nov 23 '18 at 7:27





Hey Mat, can you help me with the updated use case? I can provide more understanding, im really struggling with it

– Abhinav sql
Nov 23 '18 at 7:27













Do you want me to create another question?

– Abhinav sql
Nov 26 '18 at 8:15





Do you want me to create another question?

– Abhinav sql
Nov 26 '18 at 8:15




















draft saved

draft discarded




















































Thanks for contributing an answer to Stack Overflow!


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53428626%2famazon-redshift-query-to-get-delinquent-amount-and-days-past-due-at-the-end-of-m%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

Guess what letter conforming each word

Run scheduled task as local user group (not BUILTIN)

Port of Spain