When using Spark notebooks in Fabric, it is very easy to connect directly to a Lakehouse. From there, you can see all of the data stored in tables, and drag and drop a table into the notebook to insert a code snippet that extracts its data.

Here at Purple Frog, it is common to have projects that use Data Warehouses as their data store. There is currently little information available online about connecting Notebooks to a Warehouse. One recommended approach is to use a shortcut within a Lakehouse to clone the data from the Warehouse. This method works well, but in a large-scale solution, copying each table to a Lakehouse before modelling the data in a notebook adds extra steps and compute time. It also creates an unnecessary replica of the data.

After a bit of investigation, we found that the code snippet for extracting data from a Delta table can be used here: although it may not be obvious, you can give it the ABFSS path available from a Warehouse table. This loads the data into a Spark DataFrame that is ready to use for your data analysis in Spark.

Here is an example of the code:


delta_table_path = "<abfss path>"  # fill in your delta table path

df = spark.read.format("delta").load(delta_table_path)
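
If you need several Warehouse tables, the same read can be wrapped in a small helper. This is only a minimal sketch: it assumes `spark` is the session that a Fabric notebook already provides, and the function name `load_delta_tables` and the dictionary keys are hypothetical names chosen for illustration.

```python
def load_delta_tables(spark, table_paths):
    """Read each Delta table path into a Spark DataFrame, keyed by a friendly name.

    spark       -- the Spark session (assumed to be the one the notebook provides)
    table_paths -- dict mapping a name of your choice to that table's ABFSS path
    """
    frames = {}
    for name, path in table_paths.items():
        # Same call as in the snippet above, repeated once per table
        frames[name] = spark.read.format("delta").load(path)
    return frames


# Example usage in a notebook (fill in your own paths):
# tables = load_delta_tables(spark, {"customer": "<abfss path>", "sales": "<abfss path>"})
# tables["customer"].show(5)
```

Each value in the returned dictionary is an ordinary DataFrame, so nothing is copied into a Lakehouse along the way.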



To find the ABFSS path for a table, open the properties pane for that table. From there, you can copy the path and paste it into the code snippet above.
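
Because the path is pasted in by hand, it can help to check it before handing it to Spark. The sketch below is a hypothetical helper (the name `check_abfss_path` is ours, not part of Fabric or Spark) that only confirms the placeholder has been replaced and that the value starts with `abfss://`. It does not confirm that the table exists.

```python
def check_abfss_path(path):
    """Return the path with surrounding whitespace removed if it looks like an ABFSS URI.

    Raises ValueError if the placeholder is still present or the scheme is wrong.
    """
    cleaned = path.strip()
    if "<" in cleaned or ">" in cleaned:
        # Catches a forgotten "<abfss path>" placeholder
        raise ValueError("The placeholder has not been replaced with a real path")
    if not cleaned.startswith("abfss://"):
        raise ValueError(f"Expected a path starting with abfss://, got: {cleaned!r}")
    return cleaned


# Example usage in a notebook:
# delta_table_path = check_abfss_path("<paste the copied path here>")  # raises until a real path is pasted
```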

Let us know if this works for you, or if you have found any other ways to connect a Warehouse to your Notebooks!
