I was fiddling around with some data, looking at making a quick aging report. I probably should have reached for Power BI Desktop, but sometimes I can be stubborn, and I wanted to get my T-SQL to work–if nothing else to figure out how to make it work.
My Data
My data was essentially a really long list of widgets, with dates–really simple & straightforward:
Widget Name | Widget Date |
---|---|
Widget A | 2019-06-19 |
Widget B | 2019-07-14 |
Widget C | 2018-11-04 |
… | … |
Widget D | 2015-11-30 |
My reporting requirements
I wanted to count widgets based on how old they were. The data spanned several years, so I wanted to be able to put them into 3 or 6 month buckets. Essentially, I wanted to use N-month sized buckets to make my report to look something like this:
Widget Age | Widget Count |
---|---|
0 – 6 Months Old | 8752 |
6 – 12 Months Old | 9836 |
12 – 18 Months old | 7852 |
… | … |
60 – 66 Months old | 5872 |
Let’s write the report
First, we need some test data we can fool around with. I’m going to create a temp table with one row per day, using the Numbers table from my DBA Database. For this test data, that will make the numbers per bucket pretty similar, but that makes for easy validation!
SELECT WidgetName = 'Widget ' + LEFT(NEWID(),3),
WidgetDate = CONVERT(DATE,DATEADD(DAY,-1*[Number],GETDATE()))
INTO #Widgets
FROM DBA.dbo.Numbers
WHERE [Number] < 75000;
Now, it’s pretty easy to calculate the age of these rows–we just use DATEDIFF
to compute the number of days/weeks/months/years between the WidgetDate
and today. But that doesn’t help me with my report. I could do a GROUP BY MonthsAgo
to count per month… but I really wanted to put these into N-month buckets.
SELECT WidgetDate,
DaysOld = DATEDIFF(DAY,WidgetDate,GETDATE()),
MonthsOld = DATEDIFF(MONTH,WidgetDate,GETDATE())
FROM #Widgets;
Next, I want to put those “Months Ago” numbers into buckets–and I want to be able to easily adjust the size of those buckets as I evaluate the data & work on how I want to present it. Each bucket has a “bottom end” and “top end” value–referred to as a floor & ceiling.
Here’s where my math nerd-iness comes out… There’s a little trick for “bucketizing” numbers (in this case, turning “Months” into “Month Buckets”):
- Take a number
- Divide it by your bucket size
- Round that number down to a whole number–We’ll call this the “divided number”
- Multiply the “divided number” by the bucket size–This is your bucket floor
- Add one to the “divided number” and then multiply that by your bucket size–This is your bucket ceiling
Converting that into T-SQL, we can even save a step: Our MonthsOld
column is always a whole number (ie, an int
). If our bucket size is also an int
, then dividing the two will always result in an int
, which has been rounded down–this saves me from having to manually do any rounding! (Though, I could totally throw in a call to the FLOOR()
function if I wanted to be explicit.)
Now I can convert that into T-SQL and continue to add on to our last query, and spot-check the results:
DECLARE @BucketSize int = 6;
SELECT WidgetDate,
DaysOld = DATEDIFF(DAY,WidgetDate,GETDATE()),
MonthsOld = DATEDIFF(MONTH,WidgetDate,GETDATE()),
MonthBucketFloor = DATEDIFF(MONTH,WidgetDate,GETDATE()) / @BucketSize * @BucketSize,
MonthBucketCeiling = (DATEDIFF(MONTH,WidgetDate,GETDATE()) / @BucketSize + 1) * @BucketSize
FROM #Widgets;
I know there’s someone else out there shouting “No, Andy! That formula is too complicated! There’s another way to do that with fewer steps!” Yes, there is another way. This option might be easier to understand why it works, but I find the code harder to read.
- Take a number
- Perform a modulo with the bucket size as the divisor to get the remainder
- Subtract the remainder from the original number–This is your bucket floor
- Take your bucket floor and add your bucket size–This is your bucket ceiling
OK, so technically, that’s only 4 steps. But I don’t think that makes it any simpler. Here’s what this alternate version looks like when you convert it into T-SQL:
--Alternate to my prior query
DECLARE @BucketSize int = 6;
SELECT WidgetDate,
DaysOld = DATEDIFF(DAY,WidgetDate,GETDATE()),
MonthsOld = DATEDIFF(MONTH,WidgetDate,GETDATE()),
MonthBucketFloor = DATEDIFF(MONTH,WidgetDate,GETDATE()) - (DATEDIFF(MONTH,WidgetDate,GETDATE()) % @BucketSize),
MonthBucketCeiling = DATEDIFF(MONTH,WidgetDate,GETDATE()) - (DATEDIFF(MONTH,WidgetDate,GETDATE()) % @BucketSize) + @BucketSize
FROM #Widgets;
I’m not going to use this version, but know that it’s 100% logically equivalent to the prior version. Maybe some of my fellow math nerds will slide into the comments with the mathematical proof that these two queries are logically equivalent.
My last step is simply to do some counts grouped by the buckets–and to convert the floor/ceiling numbers into a pretty label:
DECLARE @BucketSize int = 6;
WITH WidgetsWithAge AS (
SELECT WidgetDate,
DaysOld = DATEDIFF(DAY,WidgetDate,GETDATE()),
MonthsOld = DATEDIFF(MONTH,WidgetDate,GETDATE()),
MonthBucketFloor = FLOOR(DATEDIFF(MONTH,WidgetDate,GETDATE())/@BucketSize) * @BucketSize,
MonthBucketCeiling = (FLOOR(DATEDIFF(MONTH,WidgetDate,GETDATE())/@BucketSize) + 1) * @BucketSize
FROM #Widgets
)
SELECT WidgetAge = CONVERT(varchar(100), MonthBucketFloor) + ' - ' + CONVERT(varchar(100), MonthBucketCeiling) + ' Months Old',
WidgetCount = COUNT(*)
FROM WidgetsWithAge
GROUP BY MonthBucketFloor, MonthBucketCeiling;
Boom! That’s it. That’s my report. And since I sneaked in that @BucketSize
as a parameter, it’s pretty easy for me to adjust my bucket size for different data. I can easily switch that to a 36-month bucket with the flick of a keystroke:
DECLARE @BucketSize int = 36;
WITH WidgetsWithAge AS (
SELECT WidgetDate,
DaysOld = DATEDIFF(DAY,WidgetDate,GETDATE()),
MonthsOld = DATEDIFF(MONTH,WidgetDate,GETDATE()),
MonthBucketFloor = FLOOR(DATEDIFF(MONTH,WidgetDate,GETDATE())/@BucketSize) * @BucketSize,
MonthBucketCeiling = (FLOOR(DATEDIFF(MONTH,WidgetDate,GETDATE())/@BucketSize) + 1) * @BucketSize
FROM #Widgets
)
SELECT WidgetAge = CONVERT(varchar(100), MonthBucketFloor) + ' - ' + CONVERT(varchar(100), MonthBucketCeiling) + ' Months Old',
WidgetCount = COUNT(*)
FROM WidgetsWithAge
GROUP BY MonthBucketFloor, MonthBucketCeiling;
The post How to create date buckets in T-SQL appeared first on Andy M Mallon - AM².
Top comments (0)