Amazon Redshift query to get delinquent amount and days past due at the end of month
.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty{ height:90px;width:728px;box-sizing:border-box;
}
Changing the question because of a misunderstanding in use case.
Amazon Redshift Query for the following problem statement.
The data structure:
- id - primary key
- acc_id - id unique to a loan account (this id will be same for all
emi's for a particular loan account, this maybe repeated 6 times or
12 times based on loan tenure which can be 6 months or 12 months
respectively) - status - PAID or UNPAID (the emi's unpaid are followed my unpaid
emi's only) - s_id - just a scheduling id which would be consecutive numbers for a
a particular loan id - due_date - the due date for that particular emi
- principal - amount that is due
The table:
id acc_id status s_id due_date principal
9999957 10003 PAID 102 2018-07-02 12:00:00 4205
9999958 10003 UNPAID 103 2018-08-02 12:00:00 4100
9999959 10003 UNPAID 104 2018-09-02 12:00:00 4266
9999960 10003 UNPAID 105 2018-10-02 12:00:00 4286
9999962 10004 PAID 106 2018-07-02 12:00:00 3200
9999963 10004 PAID 107 2018-08-02 12:00:00 3100
9999964 10004 UNPAID 108 2018-09-02 12:00:00 3266
9999965 10004 UNPAID 109 2018-10-02 12:00:00 3286
The use case -
The unpaid amount becomes delinquent (overdue) after the due_date.
So I need to calculate delinquent amount at the end of every month from the first due_date in this case is 2nd July to last due_date (assume it to be 2nd November which is the current month)
I also need to calculate days past due at the end of that month.
Illustration from the above data:
- From the sample data provided, no EMI is due at the end of July so amount delinquent is 0
- But at the end of August - the id 9999958 is due - as of 31st August
the amount delinquent is 4100 and days past due is 29 (31st August minus 2nd August)
The catch: I need to calculate this for the loan (acc_id) and not the emi.
To further explain, A first EMI will be 29 days due on 1st month and 59 days due on second month, also second EMI will be 29 days due on second month. But I need this at loan level (acc_id).
The same example continued for 30th september, the acc_id 10003 is due since 2nd August so as of 30th September the due amount is 8366 (4100 + 4266) and DPD (days_past_due) is 59 (29 + 30).
Also acc_id 10004 is due 3100 and DPD is 28 (30th september - 2nd september).
The final output would be something like this:
Month_End DPD_Band Amount
2018/08/31 0-29 4100
2018/08/31 30-59 0
2018/08/31 60-89 0
2018/08/31 90+ 0
2018/09/30 0-29 3100
2018/09/30 30-59 8366
2018/09/30 60-89 0
2018/09/30 90+ 0
Query attempt: DPD bands can be created based on case statements on delinquent days. I need real help in first creating End-of-months and then finding the portfolio level amounts as explained above for different delinquent days.
mysql sql amazon-redshift
add a comment |
Changing the question because of a misunderstanding in use case.
Amazon Redshift Query for the following problem statement.
The data structure:
- id - primary key
- acc_id - id unique to a loan account (this id will be same for all
emi's for a particular loan account, this maybe repeated 6 times or
12 times based on loan tenure which can be 6 months or 12 months
respectively) - status - PAID or UNPAID (the emi's unpaid are followed my unpaid
emi's only) - s_id - just a scheduling id which would be consecutive numbers for a
a particular loan id - due_date - the due date for that particular emi
- principal - amount that is due
The table:
id acc_id status s_id due_date principal
9999957 10003 PAID 102 2018-07-02 12:00:00 4205
9999958 10003 UNPAID 103 2018-08-02 12:00:00 4100
9999959 10003 UNPAID 104 2018-09-02 12:00:00 4266
9999960 10003 UNPAID 105 2018-10-02 12:00:00 4286
9999962 10004 PAID 106 2018-07-02 12:00:00 3200
9999963 10004 PAID 107 2018-08-02 12:00:00 3100
9999964 10004 UNPAID 108 2018-09-02 12:00:00 3266
9999965 10004 UNPAID 109 2018-10-02 12:00:00 3286
The use case -
The unpaid amount becomes delinquent (overdue) after the due_date.
So I need to calculate delinquent amount at the end of every month from the first due_date in this case is 2nd July to last due_date (assume it to be 2nd November which is the current month)
I also need to calculate days past due at the end of that month.
Illustration from the above data:
- From the sample data provided, no EMI is due at the end of July so amount delinquent is 0
- But at the end of August - the id 9999958 is due - as of 31st August
the amount delinquent is 4100 and days past due is 29 (31st August minus 2nd August)
The catch: I need to calculate this for the loan (acc_id) and not the emi.
To further explain, A first EMI will be 29 days due on 1st month and 59 days due on second month, also second EMI will be 29 days due on second month. But I need this at loan level (acc_id).
The same example continued for 30th september, the acc_id 10003 is due since 2nd August so as of 30th September the due amount is 8366 (4100 + 4266) and DPD (days_past_due) is 59 (29 + 30).
Also acc_id 10004 is due 3100 and DPD is 28 (30th september - 2nd september).
The final output would be something like this:
Month_End DPD_Band Amount
2018/08/31 0-29 4100
2018/08/31 30-59 0
2018/08/31 60-89 0
2018/08/31 90+ 0
2018/09/30 0-29 3100
2018/09/30 30-59 8366
2018/09/30 60-89 0
2018/09/30 90+ 0
Query attempt: DPD bands can be created based on case statements on delinquent days. I need real help in first creating End-of-months and then finding the portfolio level amounts as explained above for different delinquent days.
mysql sql amazon-redshift
1
I have removed all those conflicting dbms tags. Put one of them back, the one for the dbms actually used.
– jarlh
Nov 22 '18 at 10:16
Specify the expected result as well - as formatted text. And show us your current query attempt.
– jarlh
Nov 22 '18 at 10:17
Okay, new here. Thanks for noticing.
– Abhinav sql
Nov 22 '18 at 10:20
why mysql tag? is this mysql or redshift?
– Jon Scott
Nov 22 '18 at 13:16
The data is avialble on both mysql and Redshift.
– Abhinav sql
Nov 22 '18 at 13:46
add a comment |
Changing the question because of a misunderstanding in use case.
Amazon Redshift Query for the following problem statement.
The data structure:
- id - primary key
- acc_id - id unique to a loan account (this id will be same for all
emi's for a particular loan account, this maybe repeated 6 times or
12 times based on loan tenure which can be 6 months or 12 months
respectively) - status - PAID or UNPAID (the emi's unpaid are followed my unpaid
emi's only) - s_id - just a scheduling id which would be consecutive numbers for a
a particular loan id - due_date - the due date for that particular emi
- principal - amount that is due
The table:
id acc_id status s_id due_date principal
9999957 10003 PAID 102 2018-07-02 12:00:00 4205
9999958 10003 UNPAID 103 2018-08-02 12:00:00 4100
9999959 10003 UNPAID 104 2018-09-02 12:00:00 4266
9999960 10003 UNPAID 105 2018-10-02 12:00:00 4286
9999962 10004 PAID 106 2018-07-02 12:00:00 3200
9999963 10004 PAID 107 2018-08-02 12:00:00 3100
9999964 10004 UNPAID 108 2018-09-02 12:00:00 3266
9999965 10004 UNPAID 109 2018-10-02 12:00:00 3286
The use case -
The unpaid amount becomes delinquent (overdue) after the due_date.
So I need to calculate delinquent amount at the end of every month from the first due_date in this case is 2nd July to last due_date (assume it to be 2nd November which is the current month)
I also need to calculate days past due at the end of that month.
Illustration from the above data:
- From the sample data provided, no EMI is due at the end of July so amount delinquent is 0
- But at the end of August - the id 9999958 is due - as of 31st August
the amount delinquent is 4100 and days past due is 29 (31st August minus 2nd August)
The catch: I need to calculate this for the loan (acc_id) and not the emi.
To further explain, A first EMI will be 29 days due on 1st month and 59 days due on second month, also second EMI will be 29 days due on second month. But I need this at loan level (acc_id).
The same example continued for 30th september, the acc_id 10003 is due since 2nd August so as of 30th September the due amount is 8366 (4100 + 4266) and DPD (days_past_due) is 59 (29 + 30).
Also acc_id 10004 is due 3100 and DPD is 28 (30th september - 2nd september).
The final output would be something like this:
Month_End DPD_Band Amount
2018/08/31 0-29 4100
2018/08/31 30-59 0
2018/08/31 60-89 0
2018/08/31 90+ 0
2018/09/30 0-29 3100
2018/09/30 30-59 8366
2018/09/30 60-89 0
2018/09/30 90+ 0
Query attempt: DPD bands can be created based on case statements on delinquent days. I need real help in first creating End-of-months and then finding the portfolio level amounts as explained above for different delinquent days.
mysql sql amazon-redshift
Changing the question because of a misunderstanding in use case.
Amazon Redshift Query for the following problem statement.
The data structure:
- id - primary key
- acc_id - id unique to a loan account (this id will be same for all
emi's for a particular loan account, this maybe repeated 6 times or
12 times based on loan tenure which can be 6 months or 12 months
respectively) - status - PAID or UNPAID (the emi's unpaid are followed my unpaid
emi's only) - s_id - just a scheduling id which would be consecutive numbers for a
a particular loan id - due_date - the due date for that particular emi
- principal - amount that is due
The table:
id acc_id status s_id due_date principal
9999957 10003 PAID 102 2018-07-02 12:00:00 4205
9999958 10003 UNPAID 103 2018-08-02 12:00:00 4100
9999959 10003 UNPAID 104 2018-09-02 12:00:00 4266
9999960 10003 UNPAID 105 2018-10-02 12:00:00 4286
9999962 10004 PAID 106 2018-07-02 12:00:00 3200
9999963 10004 PAID 107 2018-08-02 12:00:00 3100
9999964 10004 UNPAID 108 2018-09-02 12:00:00 3266
9999965 10004 UNPAID 109 2018-10-02 12:00:00 3286
The use case -
The unpaid amount becomes delinquent (overdue) after the due_date.
So I need to calculate delinquent amount at the end of every month from the first due_date in this case is 2nd July to last due_date (assume it to be 2nd November which is the current month)
I also need to calculate days past due at the end of that month.
Illustration from the above data:
- From the sample data provided, no EMI is due at the end of July so amount delinquent is 0
- But at the end of August - the id 9999958 is due - as of 31st August
the amount delinquent is 4100 and days past due is 29 (31st August minus 2nd August)
The catch: I need to calculate this for the loan (acc_id) and not the emi.
To further explain, A first EMI will be 29 days due on 1st month and 59 days due on second month, also second EMI will be 29 days due on second month. But I need this at loan level (acc_id).
The same example continued for 30th september, the acc_id 10003 is due since 2nd August so as of 30th September the due amount is 8366 (4100 + 4266) and DPD (days_past_due) is 59 (29 + 30).
Also acc_id 10004 is due 3100 and DPD is 28 (30th september - 2nd september).
The final output would be something like this:
Month_End DPD_Band Amount
2018/08/31 0-29 4100
2018/08/31 30-59 0
2018/08/31 60-89 0
2018/08/31 90+ 0
2018/09/30 0-29 3100
2018/09/30 30-59 8366
2018/09/30 60-89 0
2018/09/30 90+ 0
Query attempt: DPD bands can be created based on case statements on delinquent days. I need real help in first creating End-of-months and then finding the portfolio level amounts as explained above for different delinquent days.
mysql sql amazon-redshift
mysql sql amazon-redshift
edited Nov 22 '18 at 14:55
Abhinav sql
asked Nov 22 '18 at 10:15
Abhinav sqlAbhinav sql
32
32
1
I have removed all those conflicting dbms tags. Put one of them back, the one for the dbms actually used.
– jarlh
Nov 22 '18 at 10:16
Specify the expected result as well - as formatted text. And show us your current query attempt.
– jarlh
Nov 22 '18 at 10:17
Okay, new here. Thanks for noticing.
– Abhinav sql
Nov 22 '18 at 10:20
why mysql tag? is this mysql or redshift?
– Jon Scott
Nov 22 '18 at 13:16
The data is avialble on both mysql and Redshift.
– Abhinav sql
Nov 22 '18 at 13:46
add a comment |
1
I have removed all those conflicting dbms tags. Put one of them back, the one for the dbms actually used.
– jarlh
Nov 22 '18 at 10:16
Specify the expected result as well - as formatted text. And show us your current query attempt.
– jarlh
Nov 22 '18 at 10:17
Okay, new here. Thanks for noticing.
– Abhinav sql
Nov 22 '18 at 10:20
why mysql tag? is this mysql or redshift?
– Jon Scott
Nov 22 '18 at 13:16
The data is avialble on both mysql and Redshift.
– Abhinav sql
Nov 22 '18 at 13:46
1
1
I have removed all those conflicting dbms tags. Put one of them back, the one for the dbms actually used.
– jarlh
Nov 22 '18 at 10:16
I have removed all those conflicting dbms tags. Put one of them back, the one for the dbms actually used.
– jarlh
Nov 22 '18 at 10:16
Specify the expected result as well - as formatted text. And show us your current query attempt.
– jarlh
Nov 22 '18 at 10:17
Specify the expected result as well - as formatted text. And show us your current query attempt.
– jarlh
Nov 22 '18 at 10:17
Okay, new here. Thanks for noticing.
– Abhinav sql
Nov 22 '18 at 10:20
Okay, new here. Thanks for noticing.
– Abhinav sql
Nov 22 '18 at 10:20
why mysql tag? is this mysql or redshift?
– Jon Scott
Nov 22 '18 at 13:16
why mysql tag? is this mysql or redshift?
– Jon Scott
Nov 22 '18 at 13:16
The data is avialble on both mysql and Redshift.
– Abhinav sql
Nov 22 '18 at 13:46
The data is avialble on both mysql and Redshift.
– Abhinav sql
Nov 22 '18 at 13:46
add a comment |
1 Answer
1
active
oldest
votes
Edited to be RedShift compatible after the op clarified which RDBMS. (MySQL would need a different answer)
The following creates one record for each month between your first record, and the end of last month.
It then joins on to your unpaid records, and the aggregation chooses which bracket to put the results in to.
WITH
first_month AS
(
SELECT LAST_DAY(MIN(due_date)) AS end_date FROM yourTable
),
months AS
(
SELECT
LAST_DAY(ADD_MONTHS(first_month.end_date, s.id)) AS end_date
FROM
first_month
CROSS JOIN
generate_series(
1,
DATEDIFF(month, (SELECT end_date FROM first_month), CURRENT_DATE)
)
AS s(id)
),
monthly_delinquents AS
(
SELECT
yourTable.*,
months.end_date AS month_end_date,
DATEDIFF(DAY, yourTable.due_date, months.end_date) AS days_past_due
FROM
months
LEFT JOIN
yourTable
ON yourTable.status = 'UNPAID'
AND yourTable.due_date < months.end_date
)
SELECT
month_end_date,
SUM(CASE WHEN days_past_due >= 00 AND days_past_due < 30 THEN principal ELSE 0 END) AS dpd_00_29,
SUM(CASE WHEN days_past_due >= 30 AND days_past_due < 60 THEN principal ELSE 0 END) AS dpd_30_59,
SUM(CASE WHEN days_past_due >= 60 AND days_past_due < 90 THEN principal ELSE 0 END) AS dpd_60_89,
SUM(CASE WHEN days_past_due >= 90 THEN principal ELSE 0 END) AS dpd_90plus
FROM
monthly_delinquents
GROUP BY
month_end_date
ORDER BY
month_end_date
That said, normally the idea of pivoting things like this is a bad idea. What happens when something is a year past due? It just sits in the 90plus category and never moves. And, if you want to expand it you need to change the query and any other query you ever write that depends on it.
Instead, you could normalise your output...
WITH
first_month AS
(
SELECT LAST_DAY(MIN(due_date)) AS end_date FROM yourTable
),
months AS
(
SELECT
LAST_DAY(ADD_MONTHS(first_month.end_date, s.id)) AS end_date
FROM
first_month
CROSS JOIN
generate_series(
1,
DATEDIFF(month, (SELECT end_date FROM first_month), CURRENT_DATE)
)
AS s(id)
),
monthly_delinquents AS
(
SELECT
yourTable.*,
months.end_date AS month_end_date,
DATEDIFF(DAY, yourTable.due_date, months.end_date) AS days_past_due
FROM
months
LEFT JOIN
yourTable
ON yourTable.status = 'UNPAID'
AND yourTable.due_date < months.end_date
)
SELECT
month_end_date,
(days_past_due / 30) * 30 AS days_past_due_band,
SUM(principal) AS total_principal,
COUNT(*) AS total_rows
FROM
monthly_delinquents
GROUP BY
month_end_date,
(days_past_due / 30) * 30
ORDER BY
month_end_date,
(days_past_due / 30) * 30
This is perfect Mat. But can you see my updated use case. The only thing additional on this is instead of individual emi, the sum(principal) should be for all unpaid loans before the date for that particular user.
– Abhinav sql
Nov 22 '18 at 15:02
and dpd should be calculated from the first unpaid loan
– Abhinav sql
Nov 22 '18 at 15:26
Hey Mat, can you help me with the updated use case? I can provide more understanding, im really struggling with it
– Abhinav sql
Nov 23 '18 at 7:27
Do you want me to create another question?
– Abhinav sql
Nov 26 '18 at 8:15
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53428626%2famazon-redshift-query-to-get-delinquent-amount-and-days-past-due-at-the-end-of-m%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
Edited to be RedShift compatible after the op clarified which RDBMS. (MySQL would need a different answer)
The following creates one record for each month between your first record, and the end of last month.
It then joins on to your unpaid records, and the aggregation chooses which bracket to put the results in to.
WITH
first_month AS
(
SELECT LAST_DAY(MIN(due_date)) AS end_date FROM yourTable
),
months AS
(
SELECT
LAST_DAY(ADD_MONTHS(first_month.end_date, s.id)) AS end_date
FROM
first_month
CROSS JOIN
generate_series(
1,
DATEDIFF(month, (SELECT end_date FROM first_month), CURRENT_DATE)
)
AS s(id)
),
monthly_delinquents AS
(
SELECT
yourTable.*,
months.end_date AS month_end_date,
DATEDIFF(DAY, yourTable.due_date, months.end_date) AS days_past_due
FROM
months
LEFT JOIN
yourTable
ON yourTable.status = 'UNPAID'
AND yourTable.due_date < months.end_date
)
SELECT
month_end_date,
SUM(CASE WHEN days_past_due >= 00 AND days_past_due < 30 THEN principal ELSE 0 END) AS dpd_00_29,
SUM(CASE WHEN days_past_due >= 30 AND days_past_due < 60 THEN principal ELSE 0 END) AS dpd_30_59,
SUM(CASE WHEN days_past_due >= 60 AND days_past_due < 90 THEN principal ELSE 0 END) AS dpd_60_89,
SUM(CASE WHEN days_past_due >= 90 THEN principal ELSE 0 END) AS dpd_90plus
FROM
monthly_delinquents
GROUP BY
month_end_date
ORDER BY
month_end_date
That said, normally the idea of pivoting things like this is a bad idea. What happens when something is a year past due? It just sits in the 90plus category and never moves. And, if you want to expand it you need to change the query and any other query you ever write that depends on it.
Instead, you could normalise your output...
WITH
first_month AS
(
SELECT LAST_DAY(MIN(due_date)) AS end_date FROM yourTable
),
months AS
(
SELECT
LAST_DAY(ADD_MONTHS(first_month.end_date, s.id)) AS end_date
FROM
first_month
CROSS JOIN
generate_series(
1,
DATEDIFF(month, (SELECT end_date FROM first_month), CURRENT_DATE)
)
AS s(id)
),
monthly_delinquents AS
(
SELECT
yourTable.*,
months.end_date AS month_end_date,
DATEDIFF(DAY, yourTable.due_date, months.end_date) AS days_past_due
FROM
months
LEFT JOIN
yourTable
ON yourTable.status = 'UNPAID'
AND yourTable.due_date < months.end_date
)
SELECT
month_end_date,
(days_past_due / 30) * 30 AS days_past_due_band,
SUM(principal) AS total_principal,
COUNT(*) AS total_rows
FROM
monthly_delinquents
GROUP BY
month_end_date,
(days_past_due / 30) * 30
ORDER BY
month_end_date,
(days_past_due / 30) * 30
This is perfect Mat. But can you see my updated use case. The only thing additional on this is instead of individual emi, the sum(principal) should be for all unpaid loans before the date for that particular user.
– Abhinav sql
Nov 22 '18 at 15:02
and dpd should be calculated from the first unpaid loan
– Abhinav sql
Nov 22 '18 at 15:26
Hey Mat, can you help me with the updated use case? I can provide more understanding, im really struggling with it
– Abhinav sql
Nov 23 '18 at 7:27
Do you want me to create another question?
– Abhinav sql
Nov 26 '18 at 8:15
add a comment |
Edited to be RedShift compatible after the op clarified which RDBMS. (MySQL would need a different answer)
The following creates one record for each month between your first record, and the end of last month.
It then joins on to your unpaid records, and the aggregation chooses which bracket to put the results in to.
WITH
first_month AS
(
SELECT LAST_DAY(MIN(due_date)) AS end_date FROM yourTable
),
months AS
(
SELECT
LAST_DAY(ADD_MONTHS(first_month.end_date, s.id)) AS end_date
FROM
first_month
CROSS JOIN
generate_series(
1,
DATEDIFF(month, (SELECT end_date FROM first_month), CURRENT_DATE)
)
AS s(id)
),
monthly_delinquents AS
(
SELECT
yourTable.*,
months.end_date AS month_end_date,
DATEDIFF(DAY, yourTable.due_date, months.end_date) AS days_past_due
FROM
months
LEFT JOIN
yourTable
ON yourTable.status = 'UNPAID'
AND yourTable.due_date < months.end_date
)
SELECT
month_end_date,
SUM(CASE WHEN days_past_due >= 00 AND days_past_due < 30 THEN principal ELSE 0 END) AS dpd_00_29,
SUM(CASE WHEN days_past_due >= 30 AND days_past_due < 60 THEN principal ELSE 0 END) AS dpd_30_59,
SUM(CASE WHEN days_past_due >= 60 AND days_past_due < 90 THEN principal ELSE 0 END) AS dpd_60_89,
SUM(CASE WHEN days_past_due >= 90 THEN principal ELSE 0 END) AS dpd_90plus
FROM
monthly_delinquents
GROUP BY
month_end_date
ORDER BY
month_end_date
That said, normally the idea of pivoting things like this is a bad idea. What happens when something is a year past due? It just sits in the 90plus category and never moves. And, if you want to expand it you need to change the query and any other query you ever write that depends on it.
Instead, you could normalise your output...
WITH
first_month AS
(
SELECT LAST_DAY(MIN(due_date)) AS end_date FROM yourTable
),
months AS
(
SELECT
LAST_DAY(ADD_MONTHS(first_month.end_date, s.id)) AS end_date
FROM
first_month
CROSS JOIN
generate_series(
1,
DATEDIFF(month, (SELECT end_date FROM first_month), CURRENT_DATE)
)
AS s(id)
),
monthly_delinquents AS
(
SELECT
yourTable.*,
months.end_date AS month_end_date,
DATEDIFF(DAY, yourTable.due_date, months.end_date) AS days_past_due
FROM
months
LEFT JOIN
yourTable
ON yourTable.status = 'UNPAID'
AND yourTable.due_date < months.end_date
)
SELECT
month_end_date,
(days_past_due / 30) * 30 AS days_past_due_band,
SUM(principal) AS total_principal,
COUNT(*) AS total_rows
FROM
monthly_delinquents
GROUP BY
month_end_date,
(days_past_due / 30) * 30
ORDER BY
month_end_date,
(days_past_due / 30) * 30
This is perfect Mat. But can you see my updated use case. The only thing additional on this is instead of individual emi, the sum(principal) should be for all unpaid loans before the date for that particular user.
– Abhinav sql
Nov 22 '18 at 15:02
and dpd should be calculated from the first unpaid loan
– Abhinav sql
Nov 22 '18 at 15:26
Hey Mat, can you help me with the updated use case? I can provide more understanding, im really struggling with it
– Abhinav sql
Nov 23 '18 at 7:27
Do you want me to create another question?
– Abhinav sql
Nov 26 '18 at 8:15
add a comment |
Edited to be RedShift compatible after the op clarified which RDBMS. (MySQL would need a different answer)
The following creates one record for each month between your first record, and the end of last month.
It then joins on to your unpaid records, and the aggregation chooses which bracket to put the results in to.
WITH
first_month AS
(
SELECT LAST_DAY(MIN(due_date)) AS end_date FROM yourTable
),
months AS
(
SELECT
LAST_DAY(ADD_MONTHS(first_month.end_date, s.id)) AS end_date
FROM
first_month
CROSS JOIN
generate_series(
1,
DATEDIFF(month, (SELECT end_date FROM first_month), CURRENT_DATE)
)
AS s(id)
),
monthly_delinquents AS
(
SELECT
yourTable.*,
months.end_date AS month_end_date,
DATEDIFF(DAY, yourTable.due_date, months.end_date) AS days_past_due
FROM
months
LEFT JOIN
yourTable
ON yourTable.status = 'UNPAID'
AND yourTable.due_date < months.end_date
)
SELECT
month_end_date,
SUM(CASE WHEN days_past_due >= 00 AND days_past_due < 30 THEN principal ELSE 0 END) AS dpd_00_29,
SUM(CASE WHEN days_past_due >= 30 AND days_past_due < 60 THEN principal ELSE 0 END) AS dpd_30_59,
SUM(CASE WHEN days_past_due >= 60 AND days_past_due < 90 THEN principal ELSE 0 END) AS dpd_60_89,
SUM(CASE WHEN days_past_due >= 90 THEN principal ELSE 0 END) AS dpd_90plus
FROM
monthly_delinquents
GROUP BY
month_end_date
ORDER BY
month_end_date
That said, normally the idea of pivoting things like this is a bad idea. What happens when something is a year past due? It just sits in the 90plus category and never moves. And, if you want to expand it you need to change the query and any other query you ever write that depends on it.
Instead, you could normalise your output...
WITH
first_month AS
(
SELECT LAST_DAY(MIN(due_date)) AS end_date FROM yourTable
),
months AS
(
SELECT
LAST_DAY(ADD_MONTHS(first_month.end_date, s.id)) AS end_date
FROM
first_month
CROSS JOIN
generate_series(
1,
DATEDIFF(month, (SELECT end_date FROM first_month), CURRENT_DATE)
)
AS s(id)
),
monthly_delinquents AS
(
SELECT
yourTable.*,
months.end_date AS month_end_date,
DATEDIFF(DAY, yourTable.due_date, months.end_date) AS days_past_due
FROM
months
LEFT JOIN
yourTable
ON yourTable.status = 'UNPAID'
AND yourTable.due_date < months.end_date
)
SELECT
month_end_date,
(days_past_due / 30) * 30 AS days_past_due_band,
SUM(principal) AS total_principal,
COUNT(*) AS total_rows
FROM
monthly_delinquents
GROUP BY
month_end_date,
(days_past_due / 30) * 30
ORDER BY
month_end_date,
(days_past_due / 30) * 30
Edited to be RedShift compatible after the op clarified which RDBMS. (MySQL would need a different answer)
The following creates one record for each month between your first record, and the end of last month.
It then joins on to your unpaid records, and the aggregation chooses which bracket to put the results in to.
WITH
first_month AS
(
SELECT LAST_DAY(MIN(due_date)) AS end_date FROM yourTable
),
months AS
(
SELECT
LAST_DAY(ADD_MONTHS(first_month.end_date, s.id)) AS end_date
FROM
first_month
CROSS JOIN
generate_series(
1,
DATEDIFF(month, (SELECT end_date FROM first_month), CURRENT_DATE)
)
AS s(id)
),
monthly_delinquents AS
(
SELECT
yourTable.*,
months.end_date AS month_end_date,
DATEDIFF(DAY, yourTable.due_date, months.end_date) AS days_past_due
FROM
months
LEFT JOIN
yourTable
ON yourTable.status = 'UNPAID'
AND yourTable.due_date < months.end_date
)
SELECT
month_end_date,
SUM(CASE WHEN days_past_due >= 00 AND days_past_due < 30 THEN principal ELSE 0 END) AS dpd_00_29,
SUM(CASE WHEN days_past_due >= 30 AND days_past_due < 60 THEN principal ELSE 0 END) AS dpd_30_59,
SUM(CASE WHEN days_past_due >= 60 AND days_past_due < 90 THEN principal ELSE 0 END) AS dpd_60_89,
SUM(CASE WHEN days_past_due >= 90 THEN principal ELSE 0 END) AS dpd_90plus
FROM
monthly_delinquents
GROUP BY
month_end_date
ORDER BY
month_end_date
That said, normally the idea of pivoting things like this is a bad idea. What happens when something is a year past due? It just sits in the 90plus category and never moves. And, if you want to expand it you need to change the query and any other query you ever write that depends on it.
Instead, you could normalise your output...
WITH
first_month AS
(
SELECT LAST_DAY(MIN(due_date)) AS end_date FROM yourTable
),
months AS
(
SELECT
LAST_DAY(ADD_MONTHS(first_month.end_date, s.id)) AS end_date
FROM
first_month
CROSS JOIN
generate_series(
1,
DATEDIFF(month, (SELECT end_date FROM first_month), CURRENT_DATE)
)
AS s(id)
),
monthly_delinquents AS
(
SELECT
yourTable.*,
months.end_date AS month_end_date,
DATEDIFF(DAY, yourTable.due_date, months.end_date) AS days_past_due
FROM
months
LEFT JOIN
yourTable
ON yourTable.status = 'UNPAID'
AND yourTable.due_date < months.end_date
)
SELECT
month_end_date,
(days_past_due / 30) * 30 AS days_past_due_band,
SUM(principal) AS total_principal,
COUNT(*) AS total_rows
FROM
monthly_delinquents
GROUP BY
month_end_date,
(days_past_due / 30) * 30
ORDER BY
month_end_date,
(days_past_due / 30) * 30
edited Nov 22 '18 at 14:03
answered Nov 22 '18 at 10:32
MatBailieMatBailie
60.8k1477112
60.8k1477112
This is perfect Mat. But can you see my updated use case. The only thing additional on this is instead of individual emi, the sum(principal) should be for all unpaid loans before the date for that particular user.
– Abhinav sql
Nov 22 '18 at 15:02
and dpd should be calculated from the first unpaid loan
– Abhinav sql
Nov 22 '18 at 15:26
Hey Mat, can you help me with the updated use case? I can provide more understanding, im really struggling with it
– Abhinav sql
Nov 23 '18 at 7:27
Do you want me to create another question?
– Abhinav sql
Nov 26 '18 at 8:15
add a comment |
This is perfect Mat. But can you see my updated use case. The only thing additional on this is instead of individual emi, the sum(principal) should be for all unpaid loans before the date for that particular user.
– Abhinav sql
Nov 22 '18 at 15:02
and dpd should be calculated from the first unpaid loan
– Abhinav sql
Nov 22 '18 at 15:26
Hey Mat, can you help me with the updated use case? I can provide more understanding, im really struggling with it
– Abhinav sql
Nov 23 '18 at 7:27
Do you want me to create another question?
– Abhinav sql
Nov 26 '18 at 8:15
This is perfect Mat. But can you see my updated use case. The only thing additional on this is instead of individual emi, the sum(principal) should be for all unpaid loans before the date for that particular user.
– Abhinav sql
Nov 22 '18 at 15:02
This is perfect Mat. But can you see my updated use case. The only thing additional on this is instead of individual emi, the sum(principal) should be for all unpaid loans before the date for that particular user.
– Abhinav sql
Nov 22 '18 at 15:02
and dpd should be calculated from the first unpaid loan
– Abhinav sql
Nov 22 '18 at 15:26
and dpd should be calculated from the first unpaid loan
– Abhinav sql
Nov 22 '18 at 15:26
Hey Mat, can you help me with the updated use case? I can provide more understanding, im really struggling with it
– Abhinav sql
Nov 23 '18 at 7:27
Hey Mat, can you help me with the updated use case? I can provide more understanding, im really struggling with it
– Abhinav sql
Nov 23 '18 at 7:27
Do you want me to create another question?
– Abhinav sql
Nov 26 '18 at 8:15
Do you want me to create another question?
– Abhinav sql
Nov 26 '18 at 8:15
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53428626%2famazon-redshift-query-to-get-delinquent-amount-and-days-past-due-at-the-end-of-m%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
1
I have removed all those conflicting dbms tags. Put one of them back, the one for the dbms actually used.
– jarlh
Nov 22 '18 at 10:16
Specify the expected result as well - as formatted text. And show us your current query attempt.
– jarlh
Nov 22 '18 at 10:17
Okay, new here. Thanks for noticing.
– Abhinav sql
Nov 22 '18 at 10:20
why mysql tag? is this mysql or redshift?
– Jon Scott
Nov 22 '18 at 13:16
The data is avialble on both mysql and Redshift.
– Abhinav sql
Nov 22 '18 at 13:46