Monthly Archives: November 2017
What can you expect from my SQLBits pre-conference training day in February 2018 at the London Olympia?
Well my friends, in short, we are going to take a whirlwind tour of the entire business intelligence stack of services in Azure. No stone will be left unturned. No service will be left without scalability. We'll cover them all, and we certainly aren't going to check with the Azure bill payer before turning up the compute on our data transforms.
What will we actually cover?
With new cloud services and advancements in locally hosted platforms, developing a lambda architecture is becoming the new normal. In this full day of high-level training we'll learn how to architect hybrid business intelligence solutions using Microsoft Azure offerings. We'll explore the roles of these cloud data services and how to make them work for you in this complete overview of business intelligence on the Microsoft cloud data platform.
Here’s how we’ll break that down during the day…
Module 1 – Getting Started with Azure
Using platform-as-a-service products is great, but let's take a step back. To kick off we'll cover the basics for deploying and managing your Azure services. Navigating the Azure portal and building dashboards isn't always as intuitive as we'd like. What's a resource group? And why is it important to understand your Azure Active Directory tenant?
Module 2 – An Overview of BI in Azure
What's available for the business intelligence architect in the cloud, and how might these services relate to traditional on-premises ETL and cube data flows? Is ETL enough for our big unstructured data sources, or do we need to mix things up and add some more letters to the acronym in the cloud?
Module 3 – Databases in Azure (SQL DB, SQL DW, Cosmos DB, SQL MI)
It's SQL Server Jim, but not as we know it. Check out the PaaS flavours of our long-term on-premises friends. Can we trade the agent and an operating system for that sliding bar of scalable compute? DTUs and DWUs are here to stay, with new SLAs relating to throughput. Who's on ACID, and as BI people do we care?
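To make that sliding bar of compute a little more concrete, here's a minimal sketch of nudging an Azure SQL DB's service objective from .NET. The server, database and credentials are all made up; the ALTER DATABASE ... MODIFY (SERVICE_OBJECTIVE = ...) statement is the real mechanism we'll talk about on the day.

using System.Data.SqlClient;

class ScaleAzureSqlDb
{
    static void Main()
    {
        // Hypothetical logical server; the scaling command runs against master.
        const string connStr =
            "Server=tcp:myserver.database.windows.net;Database=master;" +
            "User ID=serveradmin;Password=...;Encrypt=True;";

        using (var conn = new SqlConnection(connStr))
        {
            conn.Open();
            // Slide the compute bar: move MyDb up to the S3 service objective.
            new SqlCommand(
                "ALTER DATABASE [MyDb] MODIFY (SERVICE_OBJECTIVE = 'S3');",
                conn).ExecuteNonQuery();
        }
    }
}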
Module 4 – The Azure Machines are here to Learn
Data scientist or developer? Azure Machine Learning was designed for applied machine learning. Use best-in-class algorithms in a simple drag-and-drop interface. We’ll go from idea to deployment in a matter of clicks. Without a terminator in sight!
Module 5 – Swimming in the Data Lake with U-SQL
Let's understand the role of this hyper-scale, two-tier big data technology and how to harness its power with U-SQL, the offspring of T-SQL and C#. We'll cover everything you need to know to get started developing solutions with Azure Data Lake.
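If you fancy a sneak peek, below is roughly what submitting a tiny U-SQL job from .NET looks like. This is a sketch based on my reading of the Data Lake Analytics SDK (Microsoft.Azure.Management.DataLake.Analytics); the tenant, service principal and account names are placeholders, so treat it as a shape rather than gospel.

using System;
using Microsoft.Azure.Management.DataLake.Analytics;        // NuGet: Microsoft.Azure.Management.DataLake.Analytics
using Microsoft.Azure.Management.DataLake.Analytics.Models;
using Microsoft.Rest.Azure.Authentication;

class SubmitUSqlJob
{
    static void Main()
    {
        // Hypothetical AAD service principal used to authenticate.
        var creds = ApplicationTokenProvider.LoginSilentAsync(
            "mytenant.onmicrosoft.com", "myClientId", "myClientSecret").Result;

        var jobClient = new DataLakeAnalyticsJobManagementClient(creds);

        // A tiny U-SQL script: extract a CSV from the lake, write it back out.
        const string script = @"
@rows =
    EXTRACT name string, value int
    FROM ""/input/data.csv""
    USING Extractors.Csv();
OUTPUT @rows TO ""/output/data.csv"" USING Outputters.Csv();";

        var job = new JobInformation(
            "demo-usql-job", JobType.USql, new USqlJobProperties(script),
            degreeOfParallelism: 1);

        // "myadlaaccount" is a placeholder Data Lake Analytics account name.
        jobClient.Job.Create("myadlaaccount", Guid.NewGuid(), job);
    }
}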
Module 6 – IoT, Event Hubs and Azure Stream Analytics
Real-time data is everywhere. We need to unlock it as a rich source of information that can be channelled to react to events, produce alerts from sensor values, or in 9000 other scenarios. In this module we'll learn how, using Azure's event and IoT hubs together with Azure Stream Analytics.
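As a flavour of the plumbing, here's a minimal sketch of pushing a sensor reading into an Event Hub using the Microsoft.Azure.EventHubs package. The connection string and payload are invented; a Stream Analytics query would then sit on top of this stream, windowing the values to produce those alerts.

using System.Text;
using System.Threading.Tasks;
using Microsoft.Azure.EventHubs; // NuGet: Microsoft.Azure.EventHubs

class SensorSender
{
    static async Task Main()
    {
        // Hypothetical namespace connection string; EntityPath names the hub.
        var client = EventHubClient.CreateFromConnectionString(
            "Endpoint=sb://mybihub.servicebus.windows.net/;" +
            "SharedAccessKeyName=send;SharedAccessKey=...;EntityPath=sensors");

        // One fake sensor reading, serialised as JSON.
        var reading = "{\"deviceId\":\"probe-01\",\"temperature\":21.4}";
        await client.SendAsync(new EventData(Encoding.UTF8.GetBytes(reading)));
        await client.CloseAsync();
    }
}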
Module 7 – Power BI, our Semantic Layer, is it All Things to All People?
Power BI combines all our data sources in one place with rich visuals and a flexible data modelling tool. It takes it all: small data, big data, streaming data, website content and more. But we really need a Venn diagram to decide when and where it's needed.
Module 8 – Data Integration with Azure Data Factory and SSIS
The new integration runtime is here. But how do we unlock the scale-out potential of our control flow and data flow? Let's learn to create the perfect dependency-driven pipeline for our data flows. Plus, how to work with the Azure Batch Service should you need that extensibility.
Finally, we'll wrap up the day by playing the Azure icon game, which you'll all now be familiar with and able to complete with a perfect score, having completed this training day 🙂
Many thanks for reading and I hope to see you in February, it's going to be magic 😉
All training day content is subject to change, dependent on timings and the will of the demo gods!
Did you know it’s now possible to RDP to your Azure Batch Service compute nodes?
I've used the Batch service to handle the compute for my Azure Data Factory custom activities for a while now, and basically blindly at that: the code execution and logging are surfaced through ADF, with no visibility of the underlying pool of VMs doing the work. Well, no more is this the case!
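For context, this is the shape of the code Batch has been running for me all along: an ADF (v1) custom activity is just a class implementing IDotNetActivity, and its Execute method is what actually lands on a pool node. A stripped-down example:

using System.Collections.Generic;
using Microsoft.Azure.Management.DataFactories.Models;  // NuGet: Microsoft.Azure.Management.DataFactories
using Microsoft.Azure.Management.DataFactories.Runtime;

public class MyCustomActivity : IDotNetActivity
{
    public IDictionary<string, string> Execute(
        IEnumerable<LinkedService> linkedServices,
        IEnumerable<Dataset> datasets,
        Activity activity,
        IActivityLogger logger)
    {
        // Until now this logger was our only window into the node.
        logger.Write("Hello from an Azure Batch compute node");

        // Transformation work would go here.
        return new Dictionary<string, string>();
    }
}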
In the Azure portal go to your Batch Service > Pools > Select Pool > Nodes > Select Node > Connect.
The Connect button then presents you with the option to add a new user, before giving you the node's external IP address along with an RDP file.
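If you'd rather script this than click through the portal, the Batch .NET SDK appears to expose the same operations. Here's a sketch assuming the Microsoft.Azure.Batch package, with a made-up account URL, key, pool and node ID; the method names are from my reading of the SDK, so double check them against the current docs:

using Microsoft.Azure.Batch;       // NuGet: Microsoft.Azure.Batch
using Microsoft.Azure.Batch.Auth;

class ConnectToNode
{
    static void Main()
    {
        // Hypothetical Batch account credentials.
        var cred = new BatchSharedKeyCredentials(
            "https://myaccount.northeurope.batch.azure.com", "myaccount", "myKey==");

        using (var batchClient = BatchClient.Open(cred))
        {
            // Add a user we can log in with over RDP.
            ComputeNodeUser user = batchClient.PoolOperations
                .CreateComputeNodeUser("myPool", "tvm-1234567890_1-20171101t000000z");
            user.Name = "rdpuser";
            user.Password = "S0me!Passw0rd";
            user.IsAdmin = true;
            user.Commit();

            // Download an .rdp file pointing at the node's external endpoint.
            ComputeNode node = batchClient.PoolOperations
                .GetComputeNode("myPool", "tvm-1234567890_1-20171101t000000z");
            node.GetRDPFile("node.rdp");
        }
    }
}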
Once you've connected you'll find a virtual machine, but with a few slight differences:
- The OS is on the D drive, rather than C.
- The amount of storage probably won’t match what you requested when you created the VM compute pool (that’s for another post).
- The VM has a bunch of special environment variables that you'll want to use for any jobs being run; there's a quick example of reading them below. More info on these here: https://docs.microsoft.com/en-us/azure/batch/batch-compute-node-environment-variables
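Picking those up from inside a custom activity is trivial. The variable names below come straight from the docs page linked above; the rest is just illustrative:

using System;

class BatchEnvironment
{
    static void Main()
    {
        // Populated by the Batch service on every compute node.
        string taskDir  = Environment.GetEnvironmentVariable("AZ_BATCH_TASK_WORKING_DIR");
        string nodeRoot = Environment.GetEnvironmentVariable("AZ_BATCH_NODE_ROOT_DIR");
        string jobId    = Environment.GetEnvironmentVariable("AZ_BATCH_JOB_ID");

        Console.WriteLine($"Task working dir: {taskDir}");
        Console.WriteLine($"Node root dir:    {nodeRoot}");
        Console.WriteLine($"Job id:           {jobId}");
    }
}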
The directory on the VM used for any ADF custom activities will be something like the following path:
I hope this was helpful when you go beyond the basics of Creating Azure Data Factory Custom Activities.