Azure Synapse Series: What is Workload Management (part2)
So in part 1 we found out what Workload Management was and why we wanted to use it. (If you haven’t read that part, please click here and read that post before you carry on, as we’re jumping in at the deep end!) In this part we’ll be learning how to configure it.
Continuing from the end of the previous post, you have two main options to configure this. First is to assign a user or role to one of the existing system defined workload groups, second is to create a custom workload group and assign a user or role to this instead.
First let’s look at adding a user to an existing workload group. Choose the Workload Group you’d like to add to and click the number in the Classifiers column.
Name – Give the Classifier a name (This can be anything)
Member – Select the user/role that you’d like this to apply to. (Run, SELECT USER_NAME() ,against your SQL Pool to find the actual username authenticated with. If you have sa access, this may be replaced with “sa”).
Request importance – This holds 5 levels, that can be used to prioritise conflicting queries. (Think about how important it is for different roles within your environment to access data quickly and who should go first with conflicts).
Label – When you run a query you can add, OPTION (LABEL = ‘Your_Label_Name’) ,to the end of your queries and it will apply whatever Label Name you put in to your query. If the label matches any of the classifiers then it will use that resource class. (Think about having an ‘ETL’ label or ‘ML’ or something similar).
Start and End Time – The time that the Classifier will be active for. (You may only want to grant increased resource usage during particular times. (Think about an overnight process needing a large amount of resource, compared to the same user/role only needing a reduced amount of resource during the day due to conflicting priorities).
Session Context – When you make a connection to your SQL Pool, rather than having to apply a label to every query, you can instead apply a Session Context that will last for that entire connection. This is a session bound variable rather than the usual batch bound variable. You can create this by running, EXEC sys.sp_set_session_context @key = ‘wlm_context’, @value = ‘Your_Session_Name’ ,at the start of your connection. The session name you enter will then apply to this Classifier.
These can all be combined a variety of ways, but each of them have specific weighting:
These add together, so if you have a user in two different classifiers, one with a label (64 + 16), and the other with a context and time (64 + 8 +4), then the one with the label would be chosen as it has the higher total. 80 to 76.
Secondly creating your own custom Workload Group allows you to be more precise with the allocation of system resources. To create one of these click on the “New workload group” button. There are three suggestions and a custom option. The three suggestions apply some values as a guide, whilst the Custom option just provides default values.
After this is created, you add Classifiers exactly the same as above.
This is a very complex thing to get right, so here are my final thoughts:
Think about system usage, volume of users vs performance of queries. You can’t assign every user 100% of resources. You’ll cause waits all over the place. How many users are querying at once? How complex are the queries?
Create Custom Workload Groups and Assign Classifiers. The system-defined groups are a great starting point to get away from smallrc, but not great for flexibility.
Assign roles to workload groups with different priorities. How important is it that the report loads quickly for an end user vs a manager vs the board..?
Assign times for specific workloads, overnight ELT, large ML query etc. 100% is sometimes okay… maybe… Scale Up, use 100% resource for intensive Transformation or ML query, Scale Down for everyday usage?
Test, monitor, amend, monitor, amend, monitor, amend… Constant Development Cycle. Listen to users, find out what is going wrong, monitor the usage stats during heavy load on the server, change the configuration to improve things.
Additional reading can be done on the Microsoft website: Here
Any questions? Let me know in the comments below or contact me on twitter: @PurpleFrogReiss