Has a client ever asked you how much it actually costs to run a single pipeline in Azure Data Factory? Have you ever thought ADF pricing is just a black box?
Well, hopefully this blog post will give you an indication of how you can start calculating the cost of a pipeline run!
I will base my analysis on a sample pipeline containing the following activities:
- 1 x Lookup Activity (Pipeline Activity)
- 1 x Copy Data Activity (Data Movement Activity)
- 1 x Stored Procedure (External Activity)
At the time of writing (December 2022) there are four pricing components used to generate a pipeline run cost (excluding data flow costs).
Note: All activities which are charged by the hour are prorated by the minute and rounded up. For example, an activity with a run time of just 1 second is rounded up to 1 minute and charged as 1/60 = 0.016667hrs.
- Price Per Orchestration Run / Activity Run = £0.000864 (Here you are charged once for each trigger or debug run, plus once for each activity run inside it)
- Price Per Hour – Pipeline Activity (i.e. Lookup, Get Metadata, Delete) = £0.005 (Here you are charged a minimum of 1 minute for each activity in your pipeline)
- Price Per Data Integration Unit Hour (DIU-Hour) = £0.216, so 1hr * 4 DIUs * £0.216 = £0.864 (This charge only applies to a Copy Data Activity)
- Price Per Hour – External Activity (i.e. Databricks, Stored Procedures) = £0.000216 (Here you are charged a minimum of 1 minute for each external activity in your pipeline)
The above costs are based on Azure Region UK South, a Standard Azure Integration Runtime (Non VNET) and 4 DIUs used in the Copy Data Activity.
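To make the rounding rule concrete, here is a minimal sketch of how a run time could be converted into billed hours. The function name billed_hours is my own for illustration and is not part of any Azure SDK; it simply assumes every duration is rounded up to whole minutes with a 1 minute minimum, as described in the note above.

```python
import math


def billed_hours(duration_seconds: float) -> float:
    """Round a run time up to whole minutes (minimum 1) and return it in hours."""
    billed_minutes = max(1, math.ceil(duration_seconds / 60))
    return billed_minutes / 60


# A 1 second run is billed as 1 minute, a 63 second run as 2 minutes.
print(round(billed_hours(1), 6))
print(round(billed_hours(63), 6))
```

The billed hours are then multiplied by whichever hourly rate applies to the activity type.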
So let’s see how much it costs us to run our sample pipeline once:
- 1 x Trigger Run / Debug Run = £0.000864
- 3 x Activity Runs (Lookup, Copy & Stored Proc) = 3 * £0.000864 = £0.002592
- 1 x Lookup Activity with a duration of 30 seconds = 0.016667hrs * £0.005 = £0.000083
- 1 x Copy Data Activity with a duration of 1 minute and 3 seconds (rounded up to 2 minutes) = 0.033333hrs * £0.864 = £0.0288
- 1 x Stored Procedure Activity with a duration of 21 seconds = 0.016667hrs * £0.000216 = £0.000004
Therefore the total cost to run our pipeline = £0.000864 + £0.002592 + £0.000083 + £0.0288 + £0.000004 = £0.032343 or 3p.
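The calculation above can be sketched in a few lines of Python. This is only an illustration using the rates quoted in this post; the constant and variable names are my own, and the prices will differ by region and over time. It bills the 63 second copy as exactly two full minutes (2/60 of an hour, with no intermediate rounding), which gives a total of about £0.0323.

```python
import math

# Rates quoted in this post (UK South, Azure IR, December 2022) - assumptions, check current pricing.
ORCHESTRATION_PER_RUN = 0.000864      # per trigger/debug run and per activity run
PIPELINE_ACTIVITY_PER_HOUR = 0.005    # Lookup, Get Metadata, Delete
EXTERNAL_ACTIVITY_PER_HOUR = 0.000216  # Databricks, Stored Procedure
DIU_HOUR = 0.216                      # Copy Data, per DIU per hour
DIUS = 4


def billed_hours(seconds: float) -> float:
    """Round up to whole minutes (minimum 1) and convert to hours."""
    return max(1, math.ceil(seconds / 60)) / 60


orchestration = (1 + 3) * ORCHESTRATION_PER_RUN            # 1 trigger run + 3 activity runs
lookup = billed_hours(30) * PIPELINE_ACTIVITY_PER_HOUR     # 30 s -> 1 min
copy = billed_hours(63) * DIUS * DIU_HOUR                  # 1 min 3 s -> 2 min
stored_proc = billed_hours(21) * EXTERNAL_ACTIVITY_PER_HOUR  # 21 s -> 1 min

total = orchestration + lookup + copy + stored_proc
print(f"Total per run: £{total:.6f}")
```

Swapping in your own durations and activity counts gives a quick estimate for any similar pipeline.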
Now, this doesn’t sound an awful lot, but imagine a scenario where we are copying data directly from a source ERP system into a set of staging tables in an Azure SQL DB.
Let’s say we had 20 staging tables that we needed to populate and each table takes 30 minutes to populate.
Our new copy data activity will now cost 20 x 0.5hrs x £0.864 = £8.64.
Now if we multiply that by the 30 days in a typical month, the total cost becomes £259.20!
Note: If your copy data activity settings are left on “Auto”, ADF will use at least 4 DIUs. A good way to halve the copy cost in a dev/test environment and for smaller workloads is to select 2 DIUs rather than “Auto”, provided the copy does not take noticeably longer with fewer DIUs.
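As a rough sketch of the staging-table scenario, the snippet below compares a month of daily loads at 4 DIUs and at 2 DIUs. The function name monthly_copy_cost is hypothetical, the £0.216 DIU-hour rate is the one quoted in this post, and it assumes a 30 day month and that the copy duration stays the same at 2 DIUs, which may not hold for larger tables.

```python
DIU_HOUR = 0.216  # price per DIU-hour quoted in this post (assumption, check current pricing)


def monthly_copy_cost(tables: int, hours_per_table: float, dius: int, days: int = 30) -> float:
    """Copy Data cost for a daily load of `tables` tables over `days` days."""
    return tables * hours_per_table * dius * DIU_HOUR * days


cost_4_dius = monthly_copy_cost(tables=20, hours_per_table=0.5, dius=4)
cost_2_dius = monthly_copy_cost(tables=20, hours_per_table=0.5, dius=2)

print(f"4 DIUs: £{cost_4_dius:.2f} per month")
print(f"2 DIUs: £{cost_2_dius:.2f} per month")
```

This only covers the data movement charge; the orchestration and activity run charges come on top, although they are small by comparison.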
There are other “smaller” costs you should be aware of which can skew your bill:
Read/Write Operations:
£0.432 per 50,000 creations, deletions or modifications of any entities such as a dataset, linked service etc.
Monitoring Operations:
£0.216 per 50,000 run records retrieved. Monitoring operations include get and list for pipeline, activity, trigger, and debug runs.
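These operation charges can be estimated in the same way. The sketch below assumes the charge is applied pro rata per operation rather than in whole blocks of 50,000, which is my reading of the pricing and worth checking against your own bill; the function name and the example volumes are made up for illustration.

```python
# Prices per 50,000 operations quoted in this post (assumptions, check current pricing).
READ_WRITE_PER_50K = 0.432
MONITORING_PER_50K = 0.216


def operations_cost(operation_count: int, price_per_50k: float) -> float:
    """Pro rata cost for a number of operations priced per 50,000 (assumed billing model)."""
    return operation_count / 50_000 * price_per_50k


# Hypothetical monthly volumes: 1,000 entity changes and 10,000 run records retrieved.
read_write = operations_cost(1_000, READ_WRITE_PER_50K)
monitoring = operations_cost(10_000, MONITORING_PER_50K)

print(f"Read/write: £{read_write:.5f}")
print(f"Monitoring: £{monitoring:.5f}")
```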
Hopefully this blog post helps you get a handle on how much your pipelines are actually costing to run!