7 Features of Microsoft Azure Data Factory You Should Know

  Jul 02, 2021 12:21:00  |  Joseph C V  #ADF #Azure #DataEngineering #ETL #Analytics

Most organizations are gradually moving their data-related projects to the cloud. Why? Because of benefits such as easy scaling, flexibility, cost-effective offerings, and secure backup and recovery. This migration is also a great opportunity to unify your siloed data. But which tool should you choose to move your data to the cloud?

Microsoft’s answer to this is Azure Data Factory (ADF). It’s one of the best tools on the market for moving, integrating, and preparing your data in the cloud.

Let’s look at 7 features of ADF that come in handy when shifting your data stores to the cloud.

7 Features of Azure Data Factory That You Shouldn’t Miss

1.   Hybrid Data Integration at Scale

Organizations collect huge volumes of data every day. When that data comes from sources as disparate as multiple clouds, SaaS apps, and on-premises systems, collecting, ingesting, and processing it can be a huge challenge. It can cause integration and quality issues, and it costs time and effort. Microsoft ADF is designed to handle such issues.

ADF is a serverless, cloud-based solution that offers excellent security and flexibility at scale.

  • Easy to use: Its drag-and-drop, no-code/low-code interface covers ingesting, preparing, transforming, and serving data for analytics.
  • Cost-effective: The pay-as-you-go model means you pay only for the resources you consume.
  • Powerful: Pipelines orchestrate and monitor end-to-end jobs. ADF connects to a wide range of enterprise databases and serves your ETL/ELT needs with a robust solution.
  • Intelligent: ADF offers intelligent and autonomous ETL and helps democratize data by empowering citizen data professionals.

2.   Connect to Enterprise Data

In the age of social media and voluminous data, pulling and merging data from various sources can be complex and time-consuming, and it can demand specialist expertise. The backstage job of connecting to data stores can become a nightmare of multiple tools, plug-ins, utilities, and vendors. Microsoft’s Azure Data Factory addresses this problem out of the box.

  • ADF comes with 90+ built-in connectors to connect with various data sources.
  • Connects with big data sources like HDFS, Google BigQuery, etc.
  • Connects with enterprise-level data warehouses like Teradata and Oracle Exadata.
  • Connects with SaaS apps like ServiceNow and Salesforce.
  • Supports various file formats like JSON, XML, Excel, and Delta.
  • To connect with a data source that has no built-in connector, you can use the generic ODBC connector or a similar generic connector.
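To make the last point concrete, here is a minimal sketch of what a linked service for the generic ODBC connector could look like, written as a Python function that builds the JSON definition. The linked service name, the connection string, and the integration runtime name are all hypothetical, and the sketch assumes a self-hosted integration runtime is already set up for the ODBC driver. Treat the shape as illustrative and check it against the current connector documentation before using it.

```python
import json


def odbc_linked_service(name, connection_string, integration_runtime):
    """Build a linked-service definition for the generic ODBC connector.

    This only assembles a dictionary; it does not call Azure.
    """
    return {
        "name": name,
        "properties": {
            "type": "Odbc",
            "typeProperties": {
                "connectionString": connection_string,
                # "Anonymous" keeps the sketch short. Real credentials
                # should come from a secret store, not from this file.
                "authenticationType": "Anonymous",
            },
            # Assumption: the ODBC driver runs on a self-hosted
            # integration runtime with this (hypothetical) name.
            "connectVia": {
                "referenceName": integration_runtime,
                "type": "IntegrationRuntimeReference",
            },
        },
    }


# Hypothetical values, for illustration only.
legacy_erp = odbc_linked_service(
    name="LegacyErpOdbc",
    connection_string="Driver={Some ODBC Driver};Server=erp-host;Database=erp",
    integration_runtime="SelfHostedIR",
)
print(json.dumps(legacy_erp, indent=2))
```

The printed JSON is the kind of definition you would store alongside your other ADF entities in source control.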

3.   Run On-Premises SSIS Packages

If you want to move your SSIS packages to the cloud because of the limitations of SSIS's desktop-based tooling and its heavy memory consumption on servers, ADF makes the move smooth. Here are some of the benefits of going from on-premises to the cloud with ADF:

  • Unlike license-based SSIS, ADF offers a pay-as-you-go model that can lower your TCO by up to 88%.
  • Migration is easy, with fully compatible features to support existing SSIS packages.
  • ADF comes with a deployment wizard tool and abundant how-to guides and other support materials.
  • ADF helps in data cataloging with data lineage, tracking, and tagging of data from disparate sources, which is vital for data democratization.
  • Analytics and machine learning initiatives materialize easily with hybrid data that is cleansed, consumable, and insight-ready.

4.   DataOps

Continuous integration (CI) and continuous delivery (CD) are the practices of testing every change and delivering it as early as possible. Shortening the lag between development (or change), testing, and deployment in a data project is called DataOps, by analogy with DevOps.

ADF promotes CI/CD by integrating with Azure DevOps or GitHub. You can automate deployments of various branches into Dev, Test, and Prod environments. ADF uses Azure Resource Manager (ARM) templates to define and store the configuration of the ADF entities in your project, such as pipelines, triggers, and datasets.

You can use either of the following methods to promote your data factory from one environment to another:

  • Automatically deploying the data factory to a different environment using Azure Pipelines.
  • Manually uploading an ARM template through Azure Resource Manager.
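Whichever method you pick, the same ARM template is usually deployed with a different set of parameter values per environment. As a rough sketch of that idea, the Python below builds one ARM parameters file per environment from shared defaults plus per-environment overrides. The factory names and the `sourceFolder` parameter are hypothetical; `factoryName` is the parameter name ADF typically generates, but check your own exported template for the exact parameter list.

```python
import json

# Schema URL used by ARM deployment parameter files.
PARAMETERS_SCHEMA = (
    "https://schema.management.azure.com/schemas/2019-04-01/"
    "deploymentParameters.json#"
)


def build_parameters_file(shared, overrides):
    """Merge shared defaults with environment overrides into an
    ARM parameters document. Overrides win on conflicting keys."""
    values = {**shared, **overrides}
    return {
        "$schema": PARAMETERS_SCHEMA,
        "contentVersion": "1.0.0.0",
        "parameters": {key: {"value": value} for key, value in values.items()},
    }


# Hypothetical parameter values, for illustration only.
SHARED = {"sourceFolder": "incoming"}
ENVIRONMENTS = {
    "dev": {"factoryName": "adf-sales-dev", "sourceFolder": "incoming-sample"},
    "test": {"factoryName": "adf-sales-test"},
    "prod": {"factoryName": "adf-sales-prod"},
}

parameter_files = {
    env: build_parameters_file(SHARED, overrides)
    for env, overrides in ENVIRONMENTS.items()
}

for env, document in parameter_files.items():
    print(f"--- {env} ---")
    print(json.dumps(document, indent=2))
```

Keeping the overrides small makes it easy to see at a glance what actually differs between Dev, Test, and Prod.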

5.   Code-Free Data Transformation at Scale

ADF provides an environment to transform data at scale. Mapping data flows, a feature of ADF, are visually designed data transformations that help citizen data professionals:

  • To build transformation logic graphically without writing code,
  • To perform ETL and ELT tasks visually,
  • To save the time otherwise spent writing code by configuring sources, destinations, and ports with a few clicks,
  • To perform various data wrangling operations like data cleansing, aggregations, transformation, and conversions visually on a canvas,
  • To rely on the Azure-managed Apache Spark behind ADF for code generation,
  • To work without having to understand Spark programming or clusters,
  • To test the logic in ADF’s debug mode.

Citizen data professionals can prepare and wrangle data using the Power Query feature in ADF. This comes with the following benefits:

  • A UI similar to Microsoft Excel for the data prep process.
  • A visual, agile process that needs no coding expertise.
  • Data analysis to find anomalies and outliers and to validate your data.
  • A range of data prep functions like combining tables, adding and updating columns, row management, etc.

6.   Run Your Code on Any Compute

ADF supports various external compute environments to process your data. Some of the compute engines you can connect your ADF project with are:

  • Azure Databricks
  • Azure HDInsight
  • Azure Synapse Analytics
  • Azure Machine Learning
  • Azure Functions
  • Azure Batch
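To give a feel for how a pipeline hands work to one of these engines, here is a minimal sketch in Python that builds a pipeline definition with a single Azure Databricks notebook activity. The pipeline name, linked service name, notebook path, and parameter are hypothetical, and the sketch assumes a Databricks linked service with that name already exists in the factory.

```python
import json


def notebook_pipeline(pipeline_name, databricks_linked_service,
                      notebook_path, parameters=None):
    """Build a pipeline definition with one Databricks notebook activity.

    This only assembles a dictionary; it does not call Azure.
    """
    activity = {
        "name": "RunNotebook",
        "type": "DatabricksNotebook",
        # Points at an existing Databricks linked service (assumed).
        "linkedServiceName": {
            "referenceName": databricks_linked_service,
            "type": "LinkedServiceReference",
        },
        "typeProperties": {
            "notebookPath": notebook_path,
            # Values passed to the notebook as parameters.
            "baseParameters": dict(parameters or {}),
        },
    }
    return {"name": pipeline_name, "properties": {"activities": [activity]}}


# Hypothetical values, for illustration only.
pipeline = notebook_pipeline(
    pipeline_name="DailySalesCleanup",
    databricks_linked_service="DatabricksWorkspace",
    notebook_path="/Shared/clean_sales",
    parameters={"run_date": "2021-07-02"},
)
print(json.dumps(pipeline, indent=2))
```

The other compute engines follow the same pattern: an activity of the matching type that references a linked service for that engine.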

7.   Security and Compliance

With ADF, all the cloud-based security features of Microsoft Azure come by default, making it one of the most secure, best-in-class cloud-based ETL tools.

  • ADF is available in more than 25 regions globally and complies with the local regulatory norms of each region.
  • Data Factory holds various ISO certifications, along with HIPAA BAA, HITRUST, and CSA STAR, making it a highly secure tool.
  • ADF protects the credentials of every connected database and data store by encrypting them with Microsoft-issued certificates, which are renewed and rotated every 2 years.
  • All data in transit between data stores is exchanged over an HTTPS or TLS-secured channel, whichever the data store supports.
  • If a data store supports encryption of data at rest, ADF recommends enabling that feature.
  • Azure Key Vault, a core component of any Azure solution, holds credential values for use by linked services. This extra layer of security is recommended. When it is used, ADF retrieves values from the vault using its own Managed Service Identity (MSI).
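As an illustration of the Key Vault point, the sketch below builds a linked service definition in Python whose password is a reference to a Key Vault secret rather than a literal value. The SQL Server linked service, the vault linked service name, the secret name, and the connection string are all hypothetical; the sketch assumes a Key Vault linked service with that name already exists and that the factory's identity has been granted access to read secrets.

```python
import json


def key_vault_secret(vault_linked_service, secret_name):
    """Reference a secret held in Azure Key Vault instead of a literal value."""
    return {
        "type": "AzureKeyVaultSecret",
        "store": {
            "referenceName": vault_linked_service,
            "type": "LinkedServiceReference",
        },
        "secretName": secret_name,
    }


def sql_server_linked_service(name, connection_string, user_name, password_ref):
    """Build a SQL Server linked service whose password comes from Key Vault.

    This only assembles a dictionary; it does not call Azure.
    """
    return {
        "name": name,
        "properties": {
            "type": "SqlServer",
            "typeProperties": {
                "connectionString": connection_string,
                "userName": user_name,
                "password": password_ref,
            },
        },
    }


# Hypothetical values, for illustration only.
warehouse = sql_server_linked_service(
    name="OnPremWarehouse",
    connection_string="Data Source=warehouse-host;Initial Catalog=sales",
    user_name="etl_reader",
    password_ref=key_vault_secret("CompanyKeyVault", "warehouse-etl-password"),
)
print(json.dumps(warehouse, indent=2))
```

Because the definition carries only the secret's name, it can be checked into source control without exposing the credential itself.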
  • ADF inherits all other security guidelines and regulatory policies that Azure follows to function as a secured cloud entity.

How Can Logesys Help with ADF Implementation?

Logesys teams have many years of rich experience in Microsoft cloud solutions and products. We have been working with our clients to move their data to the cloud with smooth and cost-effective projects.

If you are planning to move your SSIS packages to a cloud-based solution or your on-premise data to Azure, we would be happy to help. Initiate a discussion with us here, and allow us to help you start your ADF and cloud journey.