What’s New in Azure Data Factory Version 2 (ADFv2)

I’m sure that for most cloud data wranglers the release of Azure Data Factory Version 2 has been long overdue. Well, good news, friends: it’s here! So, what new features does the service now offer for handling our Azure data solutions? In short, loads!

In this post, I’ll try and give you an overview of what’s new and what to expect from ADFv2. However, I’m sure more questions than answers will be raised here. As developers we must ask why and how when presented with anything. But let’s start somewhere.

Note: the order of the sub headings below was intentional.

Before diving into the new and shiny I think we need to deal with a couple of concepts to understand why ADFv2 is a completely new service and not just an extension of what version 1 offered.

Let’s compare Azure Data Factory Version 1 and Version 2 at a high level.

  • ADFv1 – is a service designed for the batch data processing of time series data.
  • ADFv2 – is a very general-purpose hybrid data integration service with very flexible execution patterns.

This makes ADFv2 a very different animal, one that can now handle scale-out control flow and data flow patterns for all our ETL needs. Microsoft seems to have got the message here, following lots of feedback from the community, that this is the framework we want for developing our data flows. Plus, it’s how we’ve been working for a long time with the very mature SQL Server Integration Services (SSIS).
 
 
 

Concepts:

Integration Runtime (IR)

Everything done in Azure Data Factory v2 will use the Integration Runtime engine. The IR is the core service component for ADFv2. It is to the ADFv2 JSON framework of instructions what the Common Language Runtime (CLR) is to the .Net framework.

Currently the IR can be virtualised to live in Azure, or it can be used on premises as a local emulator/endpoint. To give each of these instances their proper JSON label, the IR can be ‘SelfHosted’ or ‘Managed’. To try and put that into context, consider the ADFv1 Data Management Gateway as a self-hosted IR endpoint (for now). This distinction between self-hosted and managed IRs will also be reflected in the data movement costs on your subscription bill, but let’s not get distracted with pricing yet.

The new IR is designed to perform three operations:

  1. Move data.
  2. Execute ADF activities.
  3. Execute SSIS packages.

Of course, points 1 and 2 here aren’t really anything new as we could already do this in ADFv1, but point 3 is what should spark the excitement. This ability to transform our data is what has been missing from Azure, and we’ve badly needed it.

With the IR in ADFv2 this means we can now lift and shift our existing on premises SSIS packages into the cloud or start with a blank canvas and create cloud based scale out control flow and data flow pipelines, facilitated by the new capabilities in ADFv2.

Without crossing any lines, the IR will become the way you start using SSIS in Azure, regardless of whether you decide to wrap it in ADFv2 or not.

Branching

I assume this next concept won’t be new to anyone who has used SSIS. But it’s great to learn that we now have it available in the ADFv2 control flow (at an activity level).

Post execution our downstream activities can now be dependent on four possible outcomes as standard.

  • On success
  • On failure
  • On completion
  • On skip

Also, custom ‘if’ conditions will be available for branching based on expressions (more on expressions later).
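To make the four outcomes a little more concrete, here is a minimal Python sketch of how a downstream activity might decide whether to run. This is a toy model and not how the service itself is implemented. The function name and the rule that ‘Completed’ matches both success and failure are my assumptions for illustration.

```python
# Toy model of the four branching outcomes; not the ADFv2 engine itself.
CONDITIONS = {"Succeeded", "Failed", "Completed", "Skipped"}


def should_run(upstream_status: str, dependency_conditions: list[str]) -> bool:
    """Return True if a downstream activity should run for the given upstream status.

    Assumption: 'Completed' matches both 'Succeeded' and 'Failed', and listing
    several conditions means any one of them is enough.
    """
    for condition in dependency_conditions:
        if condition not in CONDITIONS:
            raise ValueError(f"unknown dependency condition: {condition}")
        if condition == "Completed" and upstream_status in ("Succeeded", "Failed"):
            return True
        if condition == upstream_status:
            return True
    return False


if __name__ == "__main__":
    print(should_run("Failed", ["Succeeded"]))  # False
    print(should_run("Failed", ["Completed"]))  # True
    print(should_run("Skipped", ["Skipped"]))   # True
```

The point is simply that the branch taken is decided per downstream activity, based on how the upstream activity finished.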


That’s the high-level concepts dealt with. Now, for ease of reading let’s break the new features down into two main sections. The service level changes and then the additions to our toolkit of ADF activities.

Service Features:

Web Based Developer UI

This won’t be available for use until later in the year, but having a web based development tool to build our ADF pipelines is very exciting! No more hand crafting the JSON. I’ll leave this point just with a sneaky picture. I’m sure this explains more than I can in words.

It will include an interface to GitHub for source control and the ability to execute the activities directly in the development environment.

For field mappings between source and destination the new UI will also support a drag and drop panel, like SSIS.

Better quality screenshots to follow as soon as it’s available.

Expressions & Parameters

Like most other Microsoft data tools, expressions give us that valuable bit of inline extensibility to achieve things more dynamically when developing. Within our ADFv2 JSON we can now influence the values of our attributes in a similar way using a rich new set of custom inner syntax, secondary to the ADF JSON. To support the expressions factory-wide, parameters will become first class citizens in the service.

As a basic example, before we might do something like this:

"name": "value"

Now we can have an expression and return the value from elsewhere, maybe using a parameter like this:

"name": "@parameters('StartingDatasetName')"

The @ symbol is important here as it marks the start of the inline expression. The expression syntax is rich and offers a host of inline functions to call and manipulate our service. These include:

  • String functions – concat, substring, replace, indexof etc.
  • Collection functions – length, union, first, last etc.
  • Logic functions – equals, less, greater, and, or, not etc.
  • Conversion functions – coalesce, xpath, array, int, string, json etc.
  • Math functions – add, sub, div, mod, min, max etc.
  • Date functions – utcnow, addminutes, addhours, format etc.
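To illustrate the idea of an attribute value being resolved at run time, here is a small Python sketch. It only understands the simplest @parameters('Name') form shown above and is in no way the real ADFv2 expression engine. The function name and the parameter value are made up for the example.

```python
import re

# Matches only the simplest expression form: @parameters('Name')
_PARAM_REF = re.compile(r"^@parameters\('([^']+)'\)$")


def resolve(value: str, parameters: dict) -> str:
    """Resolve a JSON attribute value that may be a parameter expression."""
    match = _PARAM_REF.match(value)
    if match is None:
        return value  # a plain literal is returned unchanged
    name = match.group(1)
    if name not in parameters:
        raise KeyError(f"parameter not supplied: {name}")
    return parameters[name]


if __name__ == "__main__":
    params = {"StartingDatasetName": "SalesOrders"}  # hypothetical value
    print(resolve("value", params))                               # value
    print(resolve("@parameters('StartingDatasetName')", params))  # SalesOrders
```

Anything without the leading @ is treated as a literal, which is the same distinction the JSON examples above are making.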

System Variables

As a good follow on from the new expressions/parameters available we now also have a handful of system variables to support our JSON. These are scoped at two levels with ADFv2.

  1. Pipeline scoped.
  2. Trigger scoped (more on triggers later).

The system variables extend the parameter syntax allowing us to return values like the data factory name, the pipeline name and a specific run ID. Variables can be called in the following way using the new @ symbol prefix to reference the dynamic content:

"attribute": "@pipeline().RunId"

Inline Pipelines

For me this is a deployment convenience thing. Until now our linked services, datasets and pipelines have been separate JSON files within our Visual Studio solution. Now an inline pipeline can house all its required parts within its own properties. Personally, I like having a single reusable linked service for various datasets in one place that only needs updating with new credentials once. Why would you duplicate these settings as part of several pipelines? Maybe if you want some complex expressions to influence your data handling and you are limited by the scope of a system variable, an inline pipeline may then be required.

Anyway, this is what the JSON looks like:

{
    "name": "SomePipeline",
    "properties": {
        "activities": [],      // before
        "linkedServices": [],  // now available
        "datasets": [],        // now available
        "parameters": []       // now available
    }
}

Beware: if you use the ADF copy wizard via the Azure portal, an inline pipeline is what you’ll now get back.

Activity Retry & Pipeline Concurrency

In ADFv2 our activities will be categorised as control and non-control types. This is mainly to support the use of our new activities like ‘ForEach’ (more on the activity itself later). A ‘ForEach’ activity sits within the category of a control type, meaning it will not have retry, long retry and concurrency options available within its JSON policy block. I think it’s logical that something like a sequential loop can’t run concurrently, so just be aware that such JSON attributes will now be validated depending on the category of the activity.

Our familiar and existing activities like ‘Copy’, ‘Hive’ and ‘U-SQL’ will therefore be categorised as non-control types with policy attributes remaining the same.

Event Triggers

Like our close friend Azure Logic Apps, ADFv2 can perform actions based on triggered events. So far, the only working example of this requires an Azure Blob Storage account that will output a file arrival event. It will be great to replace those time series polling activities that needed to keep retrying until the file appeared with this event based approach.

Scheduled Triggers

You guessed it. We can now finally schedule our ADF executions using a defined recurrence pattern (with enough JSON). This schedule will sit above our pipelines as a separate component within ADFv2.

  • A trigger will be able to start multiple pipelines.
  • A pipeline can be started by multiple scheduled triggers.

Let’s look at some JSON to help with the understanding.

{
  "properties": {
    "type": "ScheduleTrigger",
    "typeProperties": {
      "recurrence": {
        "frequency": "<Minute | Hour | Day | Week | Year>",
        "interval": <integer>,  // optional, how often to fire (defaults to 1)
        "startTime": "<datetime>",
        "endTime": "<datetime>",
        "timeZone": "<time zone>",
        "schedule": {  // optional (advanced scheduling specifics)
          "hours": [<0-23>],
          "weekDays": [<Monday-Sunday>],
          "minutes": [<0-59>],
          "monthDays": [<1-31>],
          "monthlyOccurrences": [
            {
              "day": "<Monday-Sunday>",
              "occurrence": <1-5>
            }
          ]
        }
      }
    },
    "pipelines": [  // pipelines started by this trigger
      {
        "pipelineReference": {
          "type": "PipelineReference",
          "referenceName": "<pipeline name>"
        },
        "parameters": {
          "<parameter 1 name>": {
            "type": "Expression",
            "value": "<parameter 1 value>"
          },
          "<parameter 2 name>": "<parameter 2 value>"
        }
      }
    ]
  }
}
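If a template full of placeholders is hard to picture, here is a hedged Python sketch that builds a concrete hourly trigger definition as a dictionary and prints it as JSON. The helper name, the pipeline name and the dates are invented for the example. I have assumed that the pipelines array sits alongside typeProperties, and I have left out the optional schedule block entirely.

```python
import json
from datetime import datetime, timezone


def build_schedule_trigger(pipeline_name: str, start: datetime, end: datetime) -> dict:
    """Build an hourly schedule trigger definition as a plain dictionary.

    Assumption: start and end are timezone-aware datetimes; they are
    converted to UTC before being formatted.
    """
    iso = "%Y-%m-%dT%H:%M:%SZ"
    return {
        "properties": {
            "type": "ScheduleTrigger",
            "typeProperties": {
                "recurrence": {
                    "frequency": "Hour",
                    "interval": 1,
                    "startTime": start.astimezone(timezone.utc).strftime(iso),
                    "endTime": end.astimezone(timezone.utc).strftime(iso),
                    "timeZone": "UTC",
                }
            },
            "pipelines": [
                {
                    "pipelineReference": {
                        "type": "PipelineReference",
                        "referenceName": pipeline_name,
                    },
                    "parameters": {},  # no pipeline parameters in this sketch
                }
            ],
        }
    }


if __name__ == "__main__":
    trigger = build_schedule_trigger(
        "SomePipeline",  # hypothetical pipeline name
        datetime(2017, 10, 1, tzinfo=timezone.utc),
        datetime(2017, 10, 2, tzinfo=timezone.utc),
    )
    print(json.dumps(trigger, indent=2))
```

Generating the JSON from code like this is also a handy way to avoid unbalanced brackets when the schedule block gets complicated.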

Tumbling Window Triggers

For me, ADFv1 time slices simply have a new name. A tumbling window is a time slice in ADFv2. Enough said on that I think.
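For anyone who didn’t live through ADFv1 time slices, the idea can be sketched in a few lines of Python: fixed size, back to back windows that never overlap. This is only an illustration of the concept, and the function name is my own.

```python
from datetime import datetime, timedelta


def tumbling_windows(start: datetime, end: datetime, size: timedelta):
    """Yield (window_start, window_end) pairs covering [start, end) without overlap.

    Only whole windows are produced; a trailing partial window is dropped.
    """
    if size <= timedelta(0):
        raise ValueError("window size must be positive")
    window_start = start
    while window_start + size <= end:
        yield window_start, window_start + size
        window_start += size


if __name__ == "__main__":
    windows = tumbling_windows(
        datetime(2017, 10, 1, 0, 0),
        datetime(2017, 10, 1, 3, 0),
        timedelta(hours=1),
    )
    for window_start, window_end in windows:
        print(window_start.isoformat(), "->", window_end.isoformat())
```

Each window starts exactly where the previous one ended, which is the behaviour we relied on with time slices.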

Depends On

We know that ADF is a dependency driven tool in terms of datasets. But now activities are also dependency driven, with the execution of one providing the necessary information for the execution of the second. The introduction of a new ‘dependsOn’ attribute/clause can be used within an activity to drive this behaviour.

The ‘dependsOn’ clause will also provide the branching behaviour mentioned above. Quick example:

"dependsOn": [ { "activity": "UpstreamActivity", "dependencyConditions": [ "Succeeded" ] } ]

More to come with this explanation later when we talk about the new ‘LookUp’ activity.
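In the meantime, here is a rough Python sketch of what a dependency driven control flow means in practice: given a list of activities and their ‘dependsOn’ entries, work out a valid execution order. It uses the standard library graphlib module (Python 3.9 or later). The activity names are made up, and the real service obviously does far more than a topological sort.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def execution_order(activities: list[dict]) -> list[str]:
    """Return activity names ordered so that every activity follows its dependencies."""
    graph = {
        activity["name"]: {dep["activity"] for dep in activity.get("dependsOn", [])}
        for activity in activities
    }
    # TopologicalSorter expects each node mapped to the nodes it depends on.
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    pipeline_activities = [  # hypothetical activities
        {"name": "NotifyActivity", "dependsOn": [{"activity": "CopyActivity"}]},
        {"name": "CopyActivity", "dependsOn": [{"activity": "LookupActivity"}]},
        {"name": "LookupActivity"},
    ]
    print(execution_order(pipeline_activities))
    # ['LookupActivity', 'CopyActivity', 'NotifyActivity']
```

Note that the order the activities appear in the JSON doesn’t matter; only the dependencies do.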

Azure Monitor & OMS Integration

Diagnostic logs for various other Azure services have been available for a while in Azure Monitor and OMS. Now with a little bit of setup ADFv2 will be able to output much richer logs with various metrics available across a data factory’s services. These metrics will include:

  • Successful pipeline runs.
  • Failed pipeline runs.
  • Successful activity runs.
  • Failed activity runs.
  • Successful trigger runs.
  • Failed trigger runs.

This will be a great improvement on the current PowerShell or .Net work required with version 1 just to monitor issues at a high level.
If you want to know more about Azure Monitor go here: https://docs.microsoft.com/en-us/azure/monitoring-and-diagnostics/monitoring-overview-azure-monitor

PowerShell

It’s worth being aware that to support ADFv2 there will be a new set of PowerShell cmdlets available within the Azure module. Basically, they are all named the same as the cmdlets used for version 1 of the service, but now include ‘V2’ somewhere in the cmdlet name and accept parameters specific to the new features.

Let’s start with the obvious one:

New-AzureRmDataFactoryV2 `
	-ResourceGroupName "ADFv2" `
	-Name "PaulsFunFactoryV2" `
	-Location "NorthEurope"

Or, a splatting friendly version for the PowerShell geeks 🙂

$parameters = @{
    Name = "PaulsFunFactoryV2"
    Location = "NorthEurope"
    ResourceGroupName = "ADFv2"
}
New-AzureRmDataFactoryV2  @parameters

Pricing

This isn’t a new feature as such, but probably worth mentioning that with all the new components and functionality in ADFv2 there is a new pricing model that you’ll need to do battle with. More details here: https://azure.microsoft.com/en-gb/pricing/details/data-factory/v2

Note: the new pricing tables for SSIS as a service with variations on CPU, RAM and Storage!


Activities:

Lookup

This is not an SSIS data transformation lookup! For ADFv2 we can lookup a list of datasets to be used in another downstream activity, like a Copy. I mentioned earlier that we now have a ‘DependsOn’ clause in our JSON, lookup is a good example of why we might use it.

Scenario: we have a pipeline containing two activities. The first looks up a list of datasets (maybe some tables in a SQLDB). The second performs the data movement using the results of the lookup so it knows what to copy. This is very much a dataset level handling operation and not a row level data join. I think a picture is required:

Here’s a JSON snippet, which will probably be a familiar structure for those of you that have ever created an ARM Template.

{
    "name": "SomePipeline",
    "properties": {
        "activities": [
            {
                "name": "LookupActivity",  // First
                "type": "Lookup"
            },
            {
                "name": "CopyActivity",  // Second
                "type": "Copy",
                "dependsOn": [  // Dependency
                    {
                        "activity": "LookupActivity"
                    }
                ],
                "inputs": [],  // From Lookup
                "outputs": []
            }
        ]
    }
}

Currently the following sources can be used as lookups, all of which need to return a JSON dataset.

  • Azure Storage (Blob and Table)
  • On Premises Files
  • Azure SQL DB

HTTP

With the HTTP activity, we can call out to any web service directly from our pipelines. The call itself is a little more involved than a typical web hook and requires an XML job request to be created within a workspace. Like other activities, ADF doesn’t handle the work itself. It passes off the instructions to some other service. In this case it uses the Azure Queue Service. The queue service is the compute for this activity that handles the request and HTTP response; if successful, this gets thrown back up to ADF.

There’s something about needing XML inside JSON for this activity that just seems perverse. So much so that I’m not going to give you a code snippet 🙂

Web (REST)

Our new web activity type is simply a REST API caller. Which I assume doesn’t require much more explanation. In ADFv1 if we wanted to make a REST call a custom activity was required and we needed C# for the interface interaction. Now we can do it directly from the JSON with child attributes to cover all the usual suspects for REST APIs:

  • URL
  • Method (GET, POST, PUT)
  • Headers
  • Body
  • Authentication
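To show how those attributes map onto an ordinary REST call, here is a small Python sketch using only the standard library. It builds the request object without sending it. The URL, header and body are invented for the example, and authentication is left out to keep it short.

```python
import json
import urllib.request


def build_request(url: str, method: str, headers: dict, body=None) -> urllib.request.Request:
    """Build (but do not send) an HTTP request from the usual REST ingredients."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return urllib.request.Request(url, data=data, headers=headers, method=method)


if __name__ == "__main__":
    request = build_request(
        "https://example.com/api/refresh",       # hypothetical endpoint
        "POST",
        {"Content-Type": "application/json"},
        {"table": "Sales"},                      # hypothetical body
    )
    print(request.get_method(), request.full_url)
    print(request.data)
```

In ADFv1 this is roughly the plumbing we had to write ourselves in a C# custom activity; now it is just attributes in the JSON.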

ForEach

The ForEach activity is probably self-explanatory for anyone with an ounce of programming experience, but ADFv2 brings some enhancements to it. You can use a ForEach activity to simply iterate over a collection of defined items one at a time, as you would expect. This is done by setting the ‘isSequential’ attribute of the activity to true. But you also have the ability to perform the iterations in parallel, speeding up the processing time and using the scaling power of Azure.

For example: if you had a ‘ForEach’ activity iterating over a ‘Copy’ operation with 10 different items and the ‘isSequential’ attribute set to false, all copies will execute at once. ForEach offers a maximum of 20 concurrent iterations, compared to a single non-control activity with its concurrency supporting only a maximum of 10.

To try and clarify, the ForEach activity accepts a collection of items and is developed as a loop. But on execution you can choose to process the items sequentially or in parallel (up to a maximum of 20). Maybe a picture will help:

Going even deeper, the ‘ForEach’ activity is not confined to only processing a single activity, it can also iterate over a collection of other activities, meaning we can nest activities in a workflow where ‘ForEach’ is the parent/master activity. The items clause for the looping still needs to be provided as a JSON array, maybe by an expression and parameter within your pipeline. But those items can reference another inner block of activities.
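Until that follow up post arrives, the sequential versus parallel behaviour can be sketched in Python with a thread pool. This is only an analogy for how the activity behaves, not how the service runs it. The function and argument names are mine, and the ceiling of 20 is simply the figure quoted above.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_PARALLEL = 20  # the parallel ceiling quoted in the text above


def for_each(items, activity, is_sequential=True, batch_count=MAX_PARALLEL):
    """Apply `activity` to every item and return the results in item order.

    With is_sequential=False the items are processed by a thread pool whose
    size is capped at MAX_PARALLEL.
    """
    if is_sequential:
        return [activity(item) for item in items]
    workers = max(1, min(batch_count, MAX_PARALLEL))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(activity, items))


if __name__ == "__main__":
    tables = [f"table{i}" for i in range(10)]  # hypothetical items

    def copy(name):
        return f"copied {name}"

    print(for_each(tables, copy, is_sequential=True))
    print(for_each(tables, copy, is_sequential=False))
```

Both calls return the same results in the same order; the only difference is whether the ten copies run one after another or side by side.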

There will definitely be a follow up blog post on this one with some more detail and a better explanation, come back soon 🙂

Meta Data

Let’s start by defining what metadata is within the context of ADFv2. Metadata includes the structure, size and last modified date information about a dataset. A metadata activity will take a dataset as an input and output the various information about what it has found. This output could then be used as a point of validation for some downstream operation, or for some dynamic data transformation task that needs to be told what dataset structure to expect.

The input JSON for this dataset type needs to know the basic file format and location. Then the structure will be worked out based on what it finds.

{
    "name": "MyDataset",
    "properties": {
        "type": "AzureBlob",
        "linkedServiceName": {
            "referenceName": "StorageLinkedService",
            "type": "LinkedServiceReference"
        },
        "typeProperties": {
            "folderPath": "container/folder",
            "fileName": "file.json",
            "format": {
                "type": "JsonFormat",
                "nestingSeparator": ","
            }
        }
    }
}

Currently, only datasets within Azure blob storage are supported.

I’m hoping you are beginning to see how branching, depends on conditions, expressions and parameters are bringing you new options when working with ADFv2, where one new feature uses the other.


The next couple as you’ll know aren’t new activities, but do have some new options available when creating them.

Custom

Previously in our .Net custom activity code we could only pass static extended properties from the ADF JSON down to the C# class. Now we have a new ‘referenceObjects’ attribute that can be used to access information about linked services and datasets. Example JSON snippet below for an ADFv2 custom activity:

{
    "name": "SomePipeline",
    "properties": {
        "activities": [
            {
                "type": "DotNetActivity",
                "linkedServiceName": {
                    "referenceName": "AzureBatchLinkedService",
                    "type": "LinkedServiceReference"
                },
                "referenceObjects": {  // new bits
                    "linkedServices": [],
                    "datasets": []
                },
                "extendedProperties": {}
            }
        ]
    }
}

This completes the configuration data for our C# methods giving us access to things like the connection credentials used in our linked services. Within the IDotNetActivity class we need the following methods to get these values.

// Namespaces and generic type arguments below assume the Data Factory .NET SDK
// model types (Activity, CustomActivity, LinkedService, Dataset).
using System.Collections.Generic;
using System.IO;
using Microsoft.Azure.Management.DataFactory.Models;
using Microsoft.Rest.Serialization;
using Newtonsoft.Json;

class Program
{
    static void Main(string[] args)
    {
        // The activity definition, including its extended properties.
        CustomActivity customActivity =
            SafeJsonConvert.DeserializeObject<Activity>(
                File.ReadAllText("activity.json"), DeserializationSettings) as CustomActivity;

        // The linked services listed under referenceObjects.
        List<LinkedService> linkedServices =
            SafeJsonConvert.DeserializeObject<List<LinkedService>>(
                File.ReadAllText("linkedServices.json"), DeserializationSettings);

        // The datasets listed under referenceObjects.
        List<Dataset> datasets =
            SafeJsonConvert.DeserializeObject<List<Dataset>>(
                File.ReadAllText("datasets.json"), DeserializationSettings);
    }

    static JsonSerializerSettings DeserializationSettings
    {
        get
        {
            var settings = new JsonSerializerSettings
            {
                DateFormatHandling = DateFormatHandling.IsoDateFormat,
                DateTimeZoneHandling = DateTimeZoneHandling.Utc,
                NullValueHandling = NullValueHandling.Ignore,
                ReferenceLoopHandling = ReferenceLoopHandling.Serialize
            };

            // One converter per polymorphic base type, keyed on the JSON "type" attribute.
            settings.Converters.Add(new PolymorphicDeserializeJsonConverter<Activity>("type"));
            settings.Converters.Add(new PolymorphicDeserializeJsonConverter<LinkedService>("type"));
            settings.Converters.Add(new PolymorphicDeserializeJsonConverter<Dataset>("type"));
            settings.Converters.Add(new TransformationJsonConverter());

            return settings;
        }
    }
}

Copy

This can be a short one as we know what copy does. The activity now supports the following new data sources and destinations:

  • Dynamics CRM
  • Dynamics 365
  • Salesforce (with Azure Key Vault credentials)

Also as standard ‘copy’ will be able to return the number of rows processed as a parameter. This could then be used with a branching ‘if’ condition when the number of expected rows isn’t available for example.


Hopefully that’s everything and you’re now fully up to date with ADFv2 and all the new and exciting things it has to offer. Stay tuned for more in depth posts soon.

For more information check out the Microsoft documentation on ADF here: https://docs.microsoft.com/en-gb/azure/data-factory/introduction

Many thanks for reading.

 

Special thanks to Rob Sewell for reviewing and contributing towards the post.


45 Responses to What’s New in Azure Data Factory Version 2 (ADFv2)

  • This is great stuff – thanks for sharing!

  • This is a huge step forward. As a consultant I see many companies with tons of SSIS packages. The ability to move these into Azure enables full migration to Azure! The additional ADF functionality is where data integration needs to go. Very excited about this!

  • Thank you for this superb description !!!!

  • Nice summary, well put together Paul; it’s good to see it all summarized in one place in this way rather than having to leap all around the MS Documentation….cheers 🙂

  • Great post as always. Very helpful.

  • Thanks a lot for this overview and its details. Really helps on understanding concrete implications and options.

    What’s your estimate/guess on when – approximately – we will move out of Preview with v2?

  • Thanks for the great description.
    Why creating of Linked Services, Datasets and Pipelines through JSON has been removed in V2?
    The only way is Powershell and SDK?

  • Hi Paul
    I am trying to create ADF V2 using .Net Library but whenever going to create ADF then getting error “Operation returned an invalid status code ‘Forbidden’”. Please help me.
    I am following these Microsoft blog:
    https://docs.microsoft.com/en-us/azure/data-factory/quickstart-create-data-factory-dot-net

    • Hi Brijesh, thanks for your comment. Typically a forbidden message relates to a permissions issue. I suggest double checking how your methods are accessing their Azure services. The credentials are not taken from the ADF linked service as the code is actually executing in the context of the Azure Batch Service. Cheers Paul

  • Do you think 3rd Party SSIS components such as KingswaySoft CRM toolbox would work with this?

  • Very Informative!

  • Hi Paul!

    Thanks for sharing your overview with us!

    Any comments on using ADF v2 vs Apache Sqoop to move data from on prem database SQL server to Azure Blob Storage ?
    Eg from performance and manageability standpoint

    Regards,
    -Yuriy

    • Hi Yuriy, thanks for your comment. I’m afraid I have zero experience with Apache Sqoop so unable to comment. Cheers Paul

  • Hi Paul,
    Great post!
    Do you know when the Web Based Developer UI is coming out?
    In the mean time, does it mean the only way to generate Data Factory v2 pipelines is programatically, rather than through the portal?
    Thanks

    • Hi Patrick, thanks for your comment. As far as I’m aware the UI will be coming later this year. But I don’t know anything more specific than that. I’m planning to pester the Microsoft peeps at the PASS Summit in a couple of weeks to get a better roadmap for the new service. In the mean time, yes, the .Net SDK or PowerShell are the only means of working with ADFv2. The Azure Portal blades as you’ve seen give you nothing! Cheers Paul

      • Thanks for the reply.
        I might start hassling them online as well – I’ve got up and running with the python sdk – but a gui would be great.

        • @Patrick, were you able to start scheduled triggers using the Python SDK? if so, do these steps look familiar?

          pipeline_reference = PipelineReference(reference_name=pipeline_name)
          trigger_pipeline_reference = TriggerPipelineReference(pipeline_reference)
          recurrence = ScheduleTriggerRecurrence(frequency='Minute', interval=2,
                                                 start_time=start_time,
                                                 end_time=end_time,
                                                 time_zone='UTC')
          schedule_trigger = ScheduleTrigger(pipelines=pl_scheduled, recurrence=recurrence)
          adf_client.triggers.start(resource_group_name=rg_name, factory_name=df_name,
                                    trigger_name='test')

          At this point it fails, getting a “Missing pipeline parameters for trigger “name of trigger”, pipeline “pipeline name”

  • Hi Paul,

    Any ideas on the expected release date for v2 ?

    And any more extra features that we may see before it is actually released ?

    Thanks

  • Hey Paul,

    The article is great and shows great stuff what is coming up!
    We’ve build up stage reservoir for one application with ADF v2 and it is up and running. (we’ve also build it with ADF v1)
    For what is coming up, I’m really excited!

    I’m curious about your opinion Paul.
    On ADF v1 we had a pipeline for each source table. We had tons of pipelines and saw that is getting complex when implementing incremental load. Besides the overhead with tons of pipelines was killing.
    For ADF v2 we have 1 pipeline for staging, for every table we implement activity within a pipeline. This way we only have one pipeline to manage, dependency is easy to implement and gives less administration overhead.
    The only cons I saw is, there is a limit in how big the pipeline can be. We could add activities in one pipeline for only 35 tables.

    Secondly I have one question regarding the load of the data.
    In the pipeline I see only properties for the source. But for the destination, it is only defined with the output dataset. I’m curious in how the data load to the destination table is handled.
    Can we alter data load options, like table lock, identity insert, etc?

    If I can provide you some information about our project, let me know.

    Kind regards,

  • Great post, I’ve been trying to create scheduled triggers using the Python SDK, regardless of the SDK, I still haven’t found an example of how to use the pipeline parameters in scheduledTriggers

    "parameters": {
        "": {
            "type": "Expression",
            "value": ""
        },

    Are these optional?

    • Hi Sual, yes, as far as I’m aware the parameters array is optional throughout ADF. Thanks for your comment. Paul

      • thanks Paul, yeah I just noticed that, another question , any ideas on how can we have dynamic folder paths like in ADF V1 when defining the data sets and using the partitioned by option (SliceStart), for example an Azure Data Lake Store data set and having a folder path like ingest/{Year}-{Month}-{Day} ?

  • Great post, please update if you get any further info on the release of the GUI! We are eagerly awaiting this.

  • Hi Paul,
    I agree to the great post comments above.

    What is your opinion about building new on prem DWs with SSIS. Would you avoide using SSIS in order to have a more smooth migration into the cloud. Or would you go ahead and use existing well know SSIS patterns and utilize the new ADF features if/when a migration into the cloud takes Place?

    • Hi Daniel, thanks for the comment. Great question. Using known and established services like SSIS is always nice compared to bleeding edge PaaS equivalents in Azure. But those Azure services do offer things that are difficult to implement with on premises services. I think as we stand right now, today! I’d be tempted to use SSIS. Then later, lift and shift it to Azure as required. There are several features coming soon that will allow you to schedule Azure SSIS package that don’t require ADFv2. Eg. SQL MI Agent. Really tough call though. Ask me again at the end of the year 🙂 Good luck. Cheers Paul

  • I am using using Azure DataLake as Source and Blob as Sink but I am getting the following error. I think the feature is not complete or is not working
    ErrorCode=UserErrorPluginNotRegistered,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Unrecognized plugin type 'AzureDataLakeStoreFile'.,Source=Microsoft.DataTransfer.ClientLibrary,'

    • Hi Daniel, thanks for the comment. I’m not surprised! Maybe raise it with Microsoft or stick it on Stack Overflow. I’ll add it to my list to try out. Cheers Paul

  • Hi Paul,
    We are using Azure Data Factory version 1, how to convert or build existing ADF version 1 to version 2.

    • Hi Kumar, great question. The answer will have to be: it depends. The services are very different. Depending on what your existing ADFv1 service is doing I would probably suggest a rebuild in v2 to ensure you take advantage of all the new features. Cheers Paul

  • Hi Paul. Thank you for the article. Very helpful. I’m curious though why you place such a huge emphasis on understanding what’s going on at the JSON level? I’m really hoping we won’t have to work at that level very often, if at all, in creating pipelines. This seems reasonable since no one in their right mind would want to try and create or modify SSIS packages at the XML level.

    • Hi Randy, thanks for the comment. When you start to automate deployments, and generate your ADF JSON with meta data, it’s important to understand the low-level syntax. Like using BIML to automate SSIS. The same principles of injecting into the XML apply. Developing with UIs can only get you so far. Cheers Paul

  • Hi Paul
    is the Execute SSIS packages component available (even if available for public review) or is this still in private review
    If still in private review when will it be available for public review , available for use
    Yogan

  • Hey Paul,

    Could you please let me know how we can see the IR for existing ADFv1.
    We have one issue, our ADF hosted in North Europe, but its execution location is East US2.
    Please help.

    Thanks,
    Chandan


Paul’s Frog Blog

Paul is a Microsoft Data Platform MVP with 10+ years’ experience working with the complete on premises SQL Server stack in a variety of roles and industries. Now as the Business Intelligence Consultant at Purple Frog Systems has turned his keyboard to big data solutions in the Microsoft cloud. Specialising in Azure Data Lake Analytics, Azure Data Factory, Azure Stream Analytics, Event Hubs and IoT. Paul is also a STEM Ambassador for the networking education in schools’ programme, PASS chapter leader for the Microsoft Data Platform Group – Birmingham, SQL Bits, SQL Relay, SQL Saturday speaker and helper. Currently the Stack Overflow top user for Azure Data Factory. As well as very active member of the technical community.
Thanks for visiting.
@mrpaulandrew