Problem description & analysis:
A database table records the payment cycles of multiple projects (IDs). Each payment cycle consists of regular months followed by one closing month. A regular month has only that month’s amount and no invoice (Invoiced=0); the closing month has both that month’s amount and the invoice (Invoiced=1).
Task: Identify the payment month (the closing month) of each payment cycle for each project and calculate the total amount for that cycle. Note that the grouping criterion depends on the order of the records: a new group starts whenever the previous month’s Invoiced=1. This differs from ordinary equivalence grouping, where records are grouped by equal key values regardless of order.
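To make the grouping rule concrete, here is a minimal Python sketch of it. It is not code from either tool compared below; the sample rows and the function name `payment_cycles` are invented for illustration, and the rows are assumed to be sorted by ID and Date already.

```python
# Hypothetical sample rows, already sorted by ID and Date.
rows = [
    {"ID": 1, "Date": "2023-01", "Invoiced": 0, "Amount": 10},
    {"ID": 1, "Date": "2023-02", "Invoiced": 1, "Amount": 20},
    {"ID": 1, "Date": "2023-03", "Invoiced": 0, "Amount": 30},
    {"ID": 1, "Date": "2023-04", "Invoiced": 1, "Amount": 40},
    {"ID": 2, "Date": "2023-01", "Invoiced": 1, "Amount": 5},
    {"ID": 2, "Date": "2023-02", "Invoiced": 0, "Amount": 7},
]

def payment_cycles(rows):
    """Return one summary row per payment cycle (assumes sorted input)."""
    cycles = []
    prev = None
    for row in rows:
        # A new cycle starts on the first row, when the ID changes,
        # or when the previous month was a closing month.
        starts_new = (
            prev is None
            or prev["ID"] != row["ID"]
            or prev["Invoiced"] == 1
        )
        if starts_new:
            cycles.append(dict(row))
        else:
            current = cycles[-1]
            current["Date"] = row["Date"]          # last month seen so far
            current["Invoiced"] = row["Invoiced"]  # 1 only on a closing month
            current["Amount"] += row["Amount"]     # running cycle total
        prev = row
    return cycles

result = payment_cycles(rows)
```

With this sample data, the last cycle of project 2 has no closing month yet, so it is reported with Invoiced=0.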
Code comparisons:
SQL:
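WITH cte AS (
    -- Accumulate Invoiced in reverse date order, so that each closing month
    -- and the regular months before it share the same grp value.
    SELECT *, SUM(Invoiced) OVER (PARTITION BY ID ORDER BY Date DESC) AS grp
    FROM mytable
)
SELECT ID, MAX(Date) AS Date, MAX(Invoiced) AS Invoiced, SUM(Amount) AS Amount
FROM cte
GROUP BY ID, grp
ORDER BY ID, Date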
SQL has no direct ordered grouping. It has to build a helper column with a window function in a subquery (here, a CTE), and then group and aggregate by that column. The SQL above computes the helper column by accumulating Invoiced in reverse date order, which is hard to follow.
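The reverse accumulation is easier to see on a small example. The Python sketch below reproduces the helper column for a single project; the function name `reverse_cumsum_groups` and the sample flags are invented for illustration, and it assumes one row per month so that dates do not tie.

```python
def reverse_cumsum_groups(invoiced):
    """Given one project's Invoiced flags in date order, return the
    helper column: the running sum of the flags taken from the last
    month backwards."""
    grp = [0] * len(invoiced)
    running = 0
    for i in range(len(invoiced) - 1, -1, -1):
        running += invoiced[i]
        grp[i] = running
    return grp

# Hypothetical flags in date order: two complete cycles, then an open one.
flags = [0, 1, 0, 0, 1, 0]
grp = reverse_cumsum_groups(flags)  # [2, 2, 1, 1, 1, 0]
```

Each closing month increases the running sum by one, so it shares its value with the regular months before it, and trailing months without a closing month fall into group 0.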
SPL:
SPL supports convenient ordered calculations, and the code is straightforward.
A1: Load the data. Note that the data is already sorted.
A2: If the ID is unchanged and the previous month is a regular month, set Amount to the cumulative value (the previous record’s Amount plus the current month’s amount); otherwise (in the first month of each payment cycle), reset Amount to the current month’s amount. [-1] refers to the previous record.
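The A2 logic described above can be sketched in Python as follows. This is not the SPL code itself, only an illustration of the same accumulate-or-reset rule; the sample rows and the function name `accumulate_in_cycle` are invented, and the rows are assumed to be sorted by ID and Date.

```python
# Hypothetical sample rows, already sorted by ID and Date.
rows = [
    {"ID": 1, "Date": "2023-01", "Invoiced": 0, "Amount": 10},
    {"ID": 1, "Date": "2023-02", "Invoiced": 1, "Amount": 20},
    {"ID": 1, "Date": "2023-03", "Invoiced": 0, "Amount": 30},
    {"ID": 2, "Date": "2023-01", "Invoiced": 0, "Amount": 5},
    {"ID": 2, "Date": "2023-02", "Invoiced": 1, "Amount": 7},
]

def accumulate_in_cycle(rows):
    """Return copies of the rows with Amount replaced by the running
    total within the current payment cycle."""
    out = []
    for row in rows:
        cur = dict(row)
        prev = out[-1] if out else None  # previous record, already accumulated
        # Same project and the previous month was a regular month:
        # keep accumulating. Otherwise Amount stays the current month's amount.
        if prev is not None and prev["ID"] == cur["ID"] and prev["Invoiced"] == 0:
            cur["Amount"] += prev["Amount"]
        out.append(cur)
    return out

accumulated = accumulate_in_cycle(rows)
# The closing-month rows now carry the total of their payment cycle.
closing = [r for r in accumulated if r["Invoiced"] == 1]
```

Because each record only looks at the one before it, a single pass over the sorted data is enough and no helper column is needed.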
esProc SPL is open-source and now available here: Open-Source Address.