As a Data Engineer, I build analytics solutions on cloud platforms, mostly on Azure. Microsoft Azure offers a range of analytics products: Azure Data Lake for storage, Azure Data Factory for orchestration, Azure Synapse as a data warehouse, and Power BI for reporting. The introduction of Microsoft Fabric, however, will significantly change how these Azure resources are used.
Recently, I had the privilege of conducting a Proof of Concept (POC) within my organization to explore the capabilities of Microsoft Fabric for our specific use case. In this article, I’ll share some key insights derived from my experiments.
What is Microsoft Fabric?
Fabric is Microsoft’s latest offering: an end-to-end analytics solution with full-service capabilities spanning data movement, data science, Real-Time Analytics, and business intelligence. It provides all the services required for data analytics as a single product, instead of requiring a separate resource for each service.
Components of Fabric Used
In my POC, I used the following components of Fabric:
- Lakehouse: I connected the Lakehouse to an external Azure Data Lake Storage account to ingest raw data, authenticating with the storage account’s access key.
- Notebooks: To transform the raw data, I used Notebooks running on a Spark cluster, with the transformations written in PySpark. The transformed data was stored as Delta tables.
- Data Warehouse: The Data Warehouse supports cross-database queries, so tables created in the Lakehouse can be queried directly from the Warehouse. Within the Warehouse, I wrote stored procedures that build Dimension and Fact tables from the Delta tables.
- Dataset: In essence, the Dataset mirrors the functionality of Power BI’s Dataset. I loaded both the Fact and Dimension tables into the Dataset and established relationships between them. This Dataset serves as the foundation for generating reports in Power BI.
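The Notebook step above can be sketched roughly as follows. This is an illustrative sketch rather than the exact POC code: the file path `Files/raw/sales.csv`, the table name `sales_clean`, and both function names are hypothetical, and it assumes the `spark` session that a Fabric notebook provides. The `sanitize_column` helper is included because Delta tables, by default, reject column names containing spaces or characters such as `,;{}()=`.

```python
import re

# Characters that Delta rejects in column names by default.
_INVALID_CHARS = re.compile(r"[ ,;{}()\n\t=]+")


def sanitize_column(name: str) -> str:
    """Replace runs of invalid characters with a single underscore."""
    return _INVALID_CHARS.sub("_", name.strip()).strip("_")


def transform_raw_to_delta(spark, source_path: str, table_name: str) -> None:
    """Read a raw CSV file, apply light clean-up, and save it as a Delta table."""
    # Imported lazily so the helper above can be used without a Spark install.
    from pyspark.sql import functions as F

    raw = spark.read.option("header", "true").csv(source_path)

    cleaned = (
        raw.toDF(*[sanitize_column(c) for c in raw.columns])  # safe column names
        .dropDuplicates()                                     # drop exact duplicate rows
        .withColumn("load_date", F.current_date())            # record the load date
    )

    # Overwrite the managed Delta table in the attached Lakehouse.
    cleaned.write.format("delta").mode("overwrite").saveAsTable(table_name)
```

In a notebook cell this would be called as `transform_raw_to_delta(spark, "Files/raw/sales.csv", "sales_clean")`, where both arguments are placeholders for your own path and table name.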
During the POC, I noted several limitations:
- External Azure Data Lake accounts can be connected only if they are publicly accessible. At the time of writing, Microsoft Fabric has no Virtual Network (VNet) connectivity.
- The Data Warehouse does not support temporary tables in stored procedures.
- Table and column names within Microsoft Fabric are case-sensitive, which can cause problems during migration.
- There is currently no option to alter table names when they are loaded into the Power BI Dataset.
Given that Microsoft Fabric remains in Public Preview, we anticipate that these limitations will be addressed as it progresses toward General Availability.
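Of these, the case-sensitivity issue is the easiest to check for ahead of a migration. Below is a minimal sketch, assuming you can export the column names referenced by existing queries and the column names of the new tables as plain lists. Both function names are hypothetical helpers, not part of any Fabric API.

```python
from collections import defaultdict


def find_case_collisions(names):
    """Group names that differ only by letter case, e.g. 'Sales' and 'sales'."""
    groups = defaultdict(list)
    for name in names:
        groups[name.lower()].append(name)
    return {key: variants for key, variants in groups.items() if len(set(variants)) > 1}


def find_case_mismatches(referenced, actual):
    """Map each referenced name to the actual name it matches only when case is ignored."""
    actual_set = set(actual)
    by_lower = {name.lower(): name for name in actual}
    return {
        ref: by_lower[ref.lower()]
        for ref in referenced
        if ref not in actual_set and ref.lower() in by_lower
    }


# Hypothetical example: a query refers to 'customerid', the table defines 'CustomerID'.
mismatches = find_case_mismatches(["customerid", "OrderDate"], ["CustomerID", "OrderDate"])
print(mismatches)
```

Any name reported by `find_case_mismatches` would work in a case-insensitive system but fail in a case-sensitive one, so those references need to be corrected before the migration.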
Advantages of Migrating to Fabric
- Single Resource: Microsoft Fabric consolidates all the essential components needed for building an analytics solution into a single resource, eliminating the complexity of managing multiple resources.
- Cost Savings: In my project, we estimated a 45% reduction in overall costs after migrating to Fabric. The extent of the savings will vary by project, but consolidating everything into a single resource can make operations more cost-efficient.
In conclusion, I believe Microsoft Fabric can be a game-changer in the world of data analytics. It is a one-stop shop: a highly integrated, end-to-end, easy-to-use product designed to simplify analytics.