When querying data from Azure Databricks (spark) into Power BI you may encounter an error: “ODBC:ERROR [HY000] [Microsoft][Hardy] (100) The host and port specified for the connection do not seem to belong to a Spark server. Please check your configuration.“
This is usually caused by trying to connect to a ‘Standard’ Databricks instance, but Power BI (and ODBC in general) can only connect to Databricks using a ‘Premium’ pricing tier. You will need to upgrade this to be able to access Databricks/Spark from Power BI.
Another common and similar error when Power BI is querying data from Azure Databricks (spark) is: “ODBC:ERROR [HY000] [Microsoft][DriverSupport] (1170) Unexpected response received from the server. Please ensure the server host and port specified for the connection are correct.“
The most likely cause of this error is an invalid server path, as you have to modify the path that Databricks gives you before using it.
In Databricks, open the cluster, and in Advanced Options click on the JDBC/ODBC tab, and copy the JDBC URL. It will look something like this:
All of the sections in red need removing or changing, so it ends up like this:
The username needs to be “token”, and the password needs to be a token, generated from the Databricks portal. Click on the user/silhouette icon on the top right of the screen, User Settings, Generate New Token.
Also when using Databricks, watch out for the type of connection you choose in Power BI, it should be ‘Spark’ not ‘Azure HDInsight Spark’.
Hope this helps!