Azure-docs: Provide documentation on spark jobs on kubernetes reading data from adls

Created on 22 Jan 2019 · 5 Comments · Source: MicrosoftDocs/azure-docs

Could you expand this documentation to cover how to run Spark jobs in AKS that read data from ADLS?

https://docs.microsoft.com/en-us/azure/aks/spark-job

Thank you



Pri2 assigned-to-author container-service/svc doc-enhancement triaged

All 5 comments

@ElianoMarques can you expand a bit on your ask? What exactly are you unsure of as it relates to this doc?

In your example, you run a spark-submit job that reads a jar from Azure Blob storage, but the example doesn't read data from anywhere because it's the Pi example.

What would be very useful is documentation on running Spark jobs on Kubernetes that access data in ADLS. With out-of-the-box Spark 2.4.0, I followed all the steps: building the Docker image, adding the standard azure-data-lake and hadoop-azure-datalake jars to Spark, configuring core-site.xml, and starting Spark via spark-submit, pyspark, or sparkR. None of these options connects to ADLS. Running Spark locally with the same settings works, so maybe some extra steps are needed to configure that connectivity. It would also be nice to see an example with Jupyter in the documentation.
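To make the setup described above concrete, here is a minimal sketch of how the ADLS (Gen1) settings that would normally live in core-site.xml could instead be passed to spark-submit as `spark.hadoop.*` properties, so they travel with the job to the driver and executor pods. This is not a confirmed fix for the problem reported in this thread. The property names are the standard `fs.adl.*` keys from the hadoop-azure-datalake module; the helper names, the API server address, the image name, the application path, and the service-principal values are all hypothetical placeholders.

```python
import shlex


def adls_hadoop_conf(tenant_id, client_id, client_secret):
    """Return spark.hadoop.* properties for ADLS Gen1 service-principal auth.

    Spark copies any property prefixed with "spark.hadoop." into the Hadoop
    configuration, so these mirror what core-site.xml would contain.
    """
    return {
        "spark.hadoop.fs.adl.impl": "org.apache.hadoop.fs.adl.AdlFileSystem",
        "spark.hadoop.fs.AbstractFileSystem.adl.impl": "org.apache.hadoop.fs.adl.Adl",
        "spark.hadoop.fs.adl.oauth2.access.token.provider.type": "ClientCredential",
        "spark.hadoop.fs.adl.oauth2.client.id": client_id,
        "spark.hadoop.fs.adl.oauth2.credential": client_secret,
        "spark.hadoop.fs.adl.oauth2.refresh.url": (
            f"https://login.microsoftonline.com/{tenant_id}/oauth2/token"
        ),
    }


def build_spark_submit(master, image, app, conf):
    """Assemble a spark-submit argument list for Kubernetes cluster mode."""
    cmd = [
        "spark-submit",
        "--master", master,
        "--deploy-mode", "cluster",
        "--conf", f"spark.kubernetes.container.image={image}",
    ]
    # Sort the properties so the generated command is stable between runs.
    for key, value in sorted(conf.items()):
        cmd += ["--conf", f"{key}={value}"]
    cmd.append(app)
    return cmd


# Hypothetical values, for illustration only.
example_conf = adls_hadoop_conf("<tenant-id>", "<client-id>", "<client-secret>")
example_cmd = build_spark_submit(
    master="k8s://https://<aks-api-server>:443",
    image="<registry>/spark:2.4.0-adls",
    app="local:///opt/spark/app/read_adls.py",
    conf=example_conf,
)
# Print a shell-quoted command line that could be pasted into a terminal.
print(" ".join(shlex.quote(part) for part in example_cmd))
```

The job itself would then read from a path of the form `adl://<account>.azuredatalakestore.net/<path>`. Passing the client secret on the command line is only for illustration; in a real cluster it would more likely come from a Kubernetes secret.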

Was this helpful?

Eliano

@ElianoMarques got it. Thank you for the further explanation :)

@lenadroid can you take a look?

CC @iainfoulds

Thanks for the suggestion, @ElianoMarques. I've created a backlog work item to track this doc suggestion. I don't have an ETA on when this doc may be published.

@MicahMcKittrick-MSFT For now, #please-close

Hi, is there an update to this?
