Now we all know what a date dimension is and there are plenty of really great examples out there for creating them in various languages. Well, here’s my U-SQL version creating the output from scratch using a numbers table. Remember that U-SQL needs to be handled slightly differently because we don’t have any iterative functionality available. Plus its ability to massively parallelise jobs means we can’t write something that relies on procedural code.
This is version 1…
//Enter start and end dates for the date dimension:
DECLARE @StartDate DateTime = new DateTime(2017,1,1);
DECLARE @EndDate DateTime = new DateTime(2018,12,31);
//Create numbers table
@Numbers0 = SELECT * FROM (VALUES(0)) AS Row (Number);
@Numbers1 = SELECT [Number] FROM @Numbers0
UNION ALL SELECT [Number]+1 AS Number FROM @Numbers0;
@Numbers2 = SELECT [Number] FROM @Numbers1
UNION ALL SELECT [Number]+2 AS Number FROM @Numbers1;
@Numbers4 = SELECT [Number] FROM @Numbers2
UNION ALL SELECT [Number]+4 AS Number FROM @Numbers2;
@Numbers8 = SELECT [Number] FROM @Numbers4
UNION ALL SELECT [Number]+8 AS Number FROM @Numbers4;
@Numbers16 = SELECT [Number] FROM @Numbers8
UNION ALL SELECT [Number]+16 AS Number FROM @Numbers8;
@Numbers32 = SELECT [Number] FROM @Numbers16
UNION ALL SELECT [Number]+32 AS Number FROM @Numbers16;
@Numbers64 = SELECT [Number] FROM @Numbers32
UNION ALL SELECT [Number]+64 AS Number FROM @Numbers32;
//Double it again if you want it bigger...
//Create date dimension
@DateDimension =
SELECT
int.Parse([Date].ToString("yyyyMMdd")) AS DateKey,
[Date],
[Date].ToString("dd/MM/yyyy") AS DateString,
[Date].Day AS Day,
[Date].Month AS Month,
Math.Floor(((decimal)[Date].Month + 2) / 3) AS Quater,
[Date].Year AS Year,
Convert.ToInt32([Date].DayOfWeek) + 1 AS DayOfWeekNo,
[Date].ToString("dddd") AS DayName,
[Date].ToString("MMMM") AS MonthName,
[Date].Month >=4 ? [Date].ToString("yyyy")+"/"+([Date].AddYears(+1)).ToString("yy")
: ([Date].Year - 1).ToString() + "/" + [Date].ToString("yy") AS FinancialYear,
DateTimeFormatInfo.CurrentInfo.Calendar.GetWeekOfYear(
[Date], CalendarWeekRule.FirstDay, System.DayOfWeek.Sunday) AS WeekNoOfYear
FROM
(
SELECT
@StartDate.AddDays(Convert.ToDouble([RowNumber]) -1) AS Date
FROM
(
SELECT
ROW_NUMBER() OVER (ORDER BY n1.[Number]) AS RowNumber
FROM
@Numbers64 AS n1
CROSS JOIN @Numbers64 AS n2 //make it big!
) AS x
) AS y
WHERE
[Date] <= @EndDate; //cheat to cut off results
//Output files
OUTPUT @DateDimension
TO "/Stuff/DateDimension.csv"
ORDER BY [Date] ASC
USING Outputters.Csv(quoting : true, outputHeader : true);
//Get a numbers table as a bonus :-)
OUTPUT @Numbers64
TO "/Stuff/Numbers.csv"
ORDER BY [Number] ASC
USING Outputters.Csv(quoting : true, outputHeader : true);
In version 2 I may use a replicated string that EXPLODE's from an array rather than using a numbers table. But that's for another time. I included the numbers table as an output in this one as a little bonus 🙂
Hope this helps you out while swimming in the Azure Data Lake.
Many thanks for reading.