Hi,
I'm following the tutorial from https://docs.microsoft.com/en-Us/azure/azure-databricks/databricks-extract-load-sql-data-warehouse
but when I try to execute almost the very last piece of code:
val blobStorage = "adbtutorial.blob.core.windows.net"
val blobContainer = "adbblob"
val blobAccessKey = "**"
val tempDir = "wasbs://" + blobContainer + "@" + blobStorage + "/tempDirs"
val acntInfo = "fs.azure.account.key."+ blobStorage
sc.hadoopConfiguration.set(acntInfo, blobAccessKey)
//SQL Data Warehouse related settings
val dwDatabase = "adbtutorial"
val dwServer = "adbtutorial.database.windows.net"
val dwUser = ""
val dwPass = ""
val dwJdbcPort = "1433"
val dwJdbcExtraOptions = "encrypt=true;trustServerCertificate=true;hostNameInCertificate=*.database.windows.net;loginTimeout=30;"
val sqlDwUrl = "jdbc:sqlserver://" + dwServer + ":" + dwJdbcPort + ";database=" + dwDatabase + ";user=" + dwUser + ";password=" + dwPass + ";" + dwJdbcExtraOptions
val sqlDwUrlSmall = "jdbc:sqlserver://" + dwServer + ":" + dwJdbcPort + ";database=" + dwDatabase + ";user=" + dwUser + ";password=" + dwPass
spark.conf.set(
"spark.sql.parquet.writeLegacyFormat",
"true")
renamedColumnsDF.write
  .format("com.databricks.spark.sqldw")
  .option("url", sqlDwUrlSmall)
  .option("dbtable", "SampleTable")
  .option("forwardSparkAzureStorageCredentials", "true")
  .option("tempdir", tempDir)
  .mode("overwrite")
  .save()
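One thing I noticed while debugging: in Scala, a `$variable` inside a plain double-quoted string is never expanded (that requires the `s` prefix), so building the JDBC URL with string interpolation is less error-prone than concatenation. A minimal sketch with placeholder values, not real credentials:

```scala
// Placeholder values for illustration only
val dwServer = "adbtutorial.database.windows.net"
val dwDatabase = "adbtutorial"
val dwJdbcPort = "1433"
val dwUser = "user"
val dwPass = "pass"
val dwJdbcExtraOptions = "encrypt=true;trustServerCertificate=true;loginTimeout=30;"

// The s-interpolator expands every $variable, so the literal text
// "$dwJdbcExtraOptions" can never leak into the URL by accident.
val sqlDwUrl = s"jdbc:sqlserver://$dwServer:$dwJdbcPort;database=$dwDatabase;user=$dwUser;password=$dwPass;$dwJdbcExtraOptions"
```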
I get this exception:
com.databricks.spark.sqldw.SqlDWConnectorException: Exception encountered in SQL DW connector code.
The stack trace is:
at com.databricks.spark.sqldw.Utils$.wrapExceptions(Utils.scala:281)
at com.databricks.spark.sqldw.DefaultSource.createRelation(DefaultSource.scala:86)
at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:45)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:72)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:70)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:88)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:146)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:134)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$5.apply(SparkPlan.scala:187)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:183)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:134)
at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:114)
at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:114)
at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:690)
at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:690)
at org.apache.spark.sql.execution.SQLExecution$$anonfun$withCustomExecutionEnv$1.apply(SQLExecution.scala:99)
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:228)
at org.apache.spark.sql.execution.SQLExecution$.withCustomExecutionEnv(SQLExecution.scala:85)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:158)
at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:690)
at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:290)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:284)
at lineccb000ada34d4c84b4dc47e0722ec5e166.$read$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3733811941490348:24)
at lineccb000ada34d4c84b4dc47e0722ec5e166.$read$$iw$$iw$$iw$$iw$$iw.<init>(command-3733811941490348:88)
at lineccb000ada34d4c84b4dc47e0722ec5e166.$read$$iw$$iw$$iw$$iw.<init>(command-3733811941490348:90)
at lineccb000ada34d4c84b4dc47e0722ec5e166.$read$$iw$$iw$$iw.<init>(command-3733811941490348:92)
at lineccb000ada34d4c84b4dc47e0722ec5e166.$read$$iw$$iw.<init>(command-3733811941490348:94)
at lineccb000ada34d4c84b4dc47e0722ec5e166.$read$$iw.<init>(command-3733811941490348:96)
at lineccb000ada34d4c84b4dc47e0722ec5e166.$read.<init>(command-3733811941490348:98)
at lineccb000ada34d4c84b4dc47e0722ec5e166.$read$.<init>(command-3733811941490348:102)
at lineccb000ada34d4c84b4dc47e0722ec5e166.$read$.<clinit>(command-3733811941490348)
at lineccb000ada34d4c84b4dc47e0722ec5e166.$eval$.$print$lzycompute(<notebook>:7)
at lineccb000ada34d4c84b4dc47e0722ec5e166.$eval$.$print(<notebook>:6)
at lineccb000ada34d4c84b4dc47e0722ec5e166.$eval.$print(<notebook>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:793)
at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1054)
at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:645)
at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:644)
at scala.reflect.internal.util.ScalaClassLoader$class.asContext(ScalaClassLoader.scala:31)
at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:19)
at scala.tools.nsc.interpreter.IMain$WrappedRequest.loadAndRunReq(IMain.scala:644)
at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:576)
at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:572)
at com.databricks.backend.daemon.driver.DriverILoop.execute(DriverILoop.scala:199)
at com.databricks.backend.daemon.driver.ScalaDriverLocal$$anonfun$repl$1.apply$mcV$sp(ScalaDriverLocal.scala:190)
at com.databricks.backend.daemon.driver.ScalaDriverLocal$$anonfun$repl$1.apply(ScalaDriverLocal.scala:190)
at com.databricks.backend.daemon.driver.ScalaDriverLocal$$anonfun$repl$1.apply(ScalaDriverLocal.scala:190)
at com.databricks.backend.daemon.driver.DriverLocal$TrapExitInternal$.trapExit(DriverLocal.scala:590)
at com.databricks.backend.daemon.driver.DriverLocal$TrapExit$.apply(DriverLocal.scala:545)
at com.databricks.backend.daemon.driver.ScalaDriverLocal.repl(ScalaDriverLocal.scala:190)
at com.databricks.backend.daemon.driver.DriverLocal$$anonfun$execute$8.apply(DriverLocal.scala:323)
at com.databricks.backend.daemon.driver.DriverLocal$$anonfun$execute$8.apply(DriverLocal.scala:303)
at com.databricks.logging.UsageLogging$$anonfun$withAttributionContext$1.apply(UsageLogging.scala:235)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
at com.databricks.logging.UsageLogging$class.withAttributionContext(UsageLogging.scala:230)
at com.databricks.backend.daemon.driver.DriverLocal.withAttributionContext(DriverLocal.scala:47)
at com.databricks.logging.UsageLogging$class.withAttributionTags(UsageLogging.scala:268)
at com.databricks.backend.daemon.driver.DriverLocal.withAttributionTags(DriverLocal.scala:47)
at com.databricks.backend.daemon.driver.DriverLocal.execute(DriverLocal.scala:303)
at com.databricks.backend.daemon.driver.DriverWrapper$$anonfun$tryExecutingCommand$2.apply(DriverWrapper.scala:591)
at com.databricks.backend.daemon.driver.DriverWrapper$$anonfun$tryExecutingCommand$2.apply(DriverWrapper.scala:591)
at scala.util.Try$.apply(Try.scala:192)
at com.databricks.backend.daemon.driver.DriverWrapper.tryExecutingCommand(DriverWrapper.scala:586)
at com.databricks.backend.daemon.driver.DriverWrapper.getCommandOutputAndError(DriverWrapper.scala:477)
at com.databricks.backend.daemon.driver.DriverWrapper.executeCommand(DriverWrapper.scala:544)
at com.databricks.backend.daemon.driver.DriverWrapper.runInnerLoop(DriverWrapper.scala:383)
at com.databricks.backend.daemon.driver.DriverWrapper.runInner(DriverWrapper.scala:330)
at com.databricks.backend.daemon.driver.DriverWrapper.run(DriverWrapper.scala:216)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 64
at com.microsoft.azure.storage.core.Base64.decode(Base64.java:105)
at com.microsoft.azure.storage.StorageCredentialsAccountAndKey.<init>(StorageCredentialsAccountAndKey.java:81)
at com.databricks.spark.sqldw.Utils$.getStorageCredentials(Utils.scala:241)
at com.databricks.spark.sqldw.SqlDwWriter$$anonfun$saveToSqlDW$1.apply(SqlDwWriter.scala:92)
at com.databricks.spark.sqldw.SqlDwWriter$$anonfun$saveToSqlDW$1.apply(SqlDwWriter.scala:72)
at com.databricks.spark.sqldw.JDBCWrapper.withConnection(SqlDWJDBCWrapper.scala:281)
at com.databricks.spark.sqldw.SqlDwWriter.saveToSqlDW(SqlDwWriter.scala:72)
at com.databricks.spark.sqldw.DefaultSource$$anonfun$createRelation$3.apply(DefaultSource.scala:117)
at com.databricks.spark.sqldw.DefaultSource$$anonfun$createRelation$3.apply(DefaultSource.scala:86)
at com.databricks.spark.sqldw.Utils$.wrapExceptions(Utils.scala:249)
at com.databricks.spark.sqldw.DefaultSource.createRelation(DefaultSource.scala:86)
... 72 more (identical to the frames of the outer exception above)
@ExarKun1983 Thank you very much for bringing this to our attention. We are investigating and will update you when we have full clarity on the issue.
@ExarKun1983 This appears to be an authentication issue:
//SQL Data Warehouse related settings
val dwDatabase = "adbtutorial"
val dwServer = "adbtutorial.database.windows.net"
val dwUser = ""
val dwPass = ""
val dwJdbcPort = "1433"
val dwJdbcExtraOptions = "encrypt=true;trustServerCertificate=true;hostNameInCertificate=*.database.windows.net;loginTimeout=30;"
How is the SQL DW authentication being established? Are you expecting SQL authentication or Azure Active Directory authentication?
@ExarKun1983 In looking at this tutorial, it is broken up into smaller sections, and it may be that this section has not been completed: Load data into Azure SQL Data Warehouse (link).
If you have completed that section and are still receiving the error, then the authentication property values from my previous comment could be the issue. Or did you strip out these values for the purposes of posting the issue?
val dwUser = ""
val dwPass = ""
Additionally, is adbtutorial the name of the database, the server instance, or both?
val dwDatabase = "adbtutorial"
val dwServer = "adbtutorial.database.windows.net"
Hi Mike,
I didn't paste the values for dwUser and dwPass here, but I do pass them in my code. It turns out the problem was not with the DW connection itself but with an earlier step.
In the tutorial there is:
dbutils.fs.cp("file:///tmp/small_radio_json.json", "abfss://" + fileSystemName + "@" + storageAccount + ".dfs.core.windows.net/")
when it should be:
dbutils.fs.cp("file:///tmp/small_radio_json.json", "abfss://" + fileSystemName + "@" + storageAccountName+ ".dfs.core.windows.net/")
The issue with the tutorial is in the following section: Ingest sample data into the Azure Data Lake Storage Gen2 account (link), where the code to be entered should be:
dbutils.fs.cp("file:///tmp/small_radio_json.json", "abfss://" + fileSystemName + "@" + storageAccountName + ".dfs.core.windows.net/")
The issue is storageAccount versus storageAccountName: it should be storageAccountName.
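As a side note, the `ArrayIndexOutOfBoundsException: 64` thrown from `Base64.decode` in the Caused by section is consistent with a malformed storage account key reaching the storage SDK, since Azure storage account keys are Base64 strings. A hypothetical helper (not part of the tutorial) could catch a placeholder or truncated key before it ever reaches the connector:

```scala
import java.util.Base64

// Hypothetical sanity check: a valid storage account key decodes as Base64
// and is non-empty. A placeholder such as "**" or an empty string fails,
// which would otherwise surface later as a decode error inside the SDK.
def looksLikeBase64Key(key: String): Boolean =
  try Base64.getDecoder.decode(key).nonEmpty
  catch { case _: IllegalArgumentException => false }
```

Calling this before `sc.hadoopConfiguration.set(...)` would turn an obscure connector-side stack trace into an immediate, local failure.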
Assigning to the content author for review.
Thanks, @Mike-Ubezzi-MSFT. A PR has been submitted with the correction. #please-close