Azure-docs: SQL access using secrets

Created on 4 Oct 2018  ·  16Comments  ·  Source: MicrosoftDocs/azure-docs

Under the section Use SQL data, there is a code snippet on how to get access to a Microsoft SQL Server. The snippet calls a function to register a secret dprep.MSSQLDataSource(), and the text following the snippet tells that the secret is used to pass the password to the function.

I can't get this to work, and there are some things I find unclear about this.

First, is password=[DATABASE-PASSWORD] really correct inside the call to dprep.MSSQLDataSource()? My guess is that it should really be password=secret, right?

Second, there is no explanation to the function dprep.register_secret(). What does it really do; does it register a new secret in some vault, or does it just obtain a previously registered secret? And what should be inserted instead of "[SECRET-USERNAME]" and "[SECRET-PASSWORD]"?

Thirdly, is it necessary to connect to a Workspace before doing these steps? It seems likely that the secret should be stored in the key vault created along with the workspace. However, when I navigate to the key vault in Azure Portal, I can see that I am authorized to use the key vault, even though I was the one that created the Workspace that cause the creation of the key vault in the first place. I managed to find a guide on how to add secrets using the CLI, but I only get Access denied when I try, which seems reasonable due to the fact that I don't seem to be authorized to use the key vault.

I'm a little lost at this point. No matter on how I try to configure, I can only get so far as to get an error message saying ExecutionError: Login failed., so I would gladly take any help I can get to be able to connect to my MSSQL server.

Thanks in advance


Document Details

Do not edit this section. It is required for docs.microsoft.com ➟ GitHub issue linking.

assigned-to-author doc-bug machine-learninsvc triaged

Most helpful comment

Oh yes, you are absolutely right for 1. My mistake.

Happy to hear you are unblocked now! Will update this thread again once all the issues have been resolved on our end.

All 16 comments

@ArvidBaa Thanks for you details about the document, I have assigned the issue to the content author to investigate further and update the document as appropriate.

@cforbe Hi, can you please check this post, there are some unclear statements, can you please check it and also update the document as necessary? Thanks!

Looking into it now!

@cforbe Look forward to your response. Thanks!

Hi @ArvidBaa,

I am an engineer on the DataPrep team. These are good questions, I will try answering them:

  1. Q: “is password=[DATABASE-PASSWORD] really correct inside the call to dprep.MSSQLDataSource()? My guess is that it should really be password=secret, right?”
    A: Yes, you are correct, this is a documentation error on our end. The document should say password="[DATABASE-PASSWORD]". We will resolve this.
  2. Q: “Second, there is no explanation to the function dprep.register_secret(). What does it really do; does it register a new secret in some vault, or does it just obtain a previously registered secret? And what should be inserted instead of "[SECRET-USERNAME]" and "[SECRET-PASSWORD]"?”
    A: The documentation should say register_secret(value="[DATABASE-PASSWORD]", id="[DATABASE-USERNAME]"). SECRET-USERNAME and SECRET-PASSWORD corresponds to DATABASE-USERNAME and DATABASE-PASSWORD, this is another documentation error on our end and we will fix it. The register_secret method for now only registers the secret to our engine process and is only alive for the duration of engine process. If the engine process is restarted (e.g. restarting the notebook kernel), then all the registered secrets are gone. There is a reference notebook that explains our current secret story in more details. However, our read_sql should be calling register_secrets for you. I've filed a bug and we will fix it.
  3. Q: “Necessary to connect to workspace….”
    A: No for now. We are not using any Azure services to persist secrets. Secrets only live for the lifetime of the engine process. We are actively working on a better secrets synchronization story. I will update this thread once the feature has been shipped.

Please let us know if you have any other questions.

Hi @rongduan-zhu,

  1. You say that the document should say password="[DATABASE-PASSWORD]", but the data type for the method is not a string, but a azureml.dataprep.api.engineapi.typedefinitions.Secret object (see the documentation). So I believe the true value should be password=secret, where secret=register_secret(value="[DATABASE-PASSWORD]", id="[DATABASE-USERNAME]").

  2. Thank you, this was just the information I needed to succeed in connecting to my MSSQL server. However, this raises a few questions, mostly curious ones.
    a) Why is the database username passed twice to the function dprep.MSSQLDataSource()? Both as user_name and then inside the password object.
    b) Why is it called a secret, when by all means it is just a regular object containing two strings? Both the id and the value parameters are saved as strings, in plain sight for anyone reading the code. I have saved them as environment variables to prevent this, but it seems better to store (at least) the password in the key vault connected to the Workspace. I will follow any updates on this with great interest.

  3. I will follow the updates with great interest. Thank you.

Again, thank you for the help. As I said, I have now managed to connect to the database, and can continue to work with the data preparations.

Oh yes, you are absolutely right for 1. My mistake.

Happy to hear you are unblocked now! Will update this thread again once all the issues have been resolved on our end.

@rongduan-zhu - please update.

Have there been any updates to my follow up questions above? Especially how to handle the secrets without writing them in plain text? I am at the moment trying to implement this kind of data preparation in a pipeline for my Machine Learning service, and when doing so I don't want my database credentials to be included as plain text in the scripts. I would like to store them somewhere secure, like in a key vault in Azure Portal (for example the one that is being created along with the workspace?!), or in some other secure location.

@ArvidBaa I am currently working on MS SQL Datastore support in Data Prep and we are targeting the next release (end of Jan). However, since datastore belongs to a workspace, you will need a workspace in order to use datastores.

The experience will be like this once the feature is complete:

import azureml.dataprep as dprep
from azureml.core import Workspace, Datastore

workspace = Workspace.from_config()
datastore = Datastore(workspace, 'sql_datastore')

dflow = dprep.read_sql(datastore, query='SELECT * FROM mytable')

@rongduan-zhu Ok, that looks promising. I am looking forward to the release including that functionality.

Thank you for the update!

Best regards
Arvid Bäärnhielm

@trevorbye - can you respond here please and update the document as needed?

@rongduan-zhu is this design pattern implemented in the current SDK version? If so, I will add it into our documentation.

Yes, please refer to this notebook on how to read from SQL. It's at the bottom of the __Read data from Datastore__ section.

Hope this helps. If it does, #please-close.

In the notebook, it is not explained how the database is actually connected to the workspace Datastore in the beginning. However, I assume the instructions given in this notebook is covering that?

Is this implementation enough for me to be able to connect to my MS SQL? I am getting a little lost in all different types of SQL in Azure. The Datastore documentation tells me that there are methods for initializing two different types of SQL: Azure PostgreSQL and Azure SQL database. Is this enough?

@YutongTie-MSFT #please-close this as the remaining issue is with the notebook. I've moved this issue to that repo as referenced above. Thanks @ArvidBaa

Was this page helpful?
0 / 5 - 0 ratings