Could you expand this documentation to cover running Spark jobs in AKS that read data from ADLS?
https://docs.microsoft.com/en-us/azure/aks/spark-job
Thank you
@ElianoMarques can you expand a bit on your ask? What exactly are you unsure of as it relates to this doc?
In your example, you run a spark-submit job that reads a jar from Azure Blob Storage, but the job itself doesn't read data from anywhere, since it's the Pi example.
What would be very useful is documentation on running Spark jobs in Kubernetes that access data in ADLS. If you take out-of-the-box Spark 2.4.0 and follow the whole process — build the Docker image, add the standard azure-data-lake and hadoop-azure-datalake jars to Spark, configure core-site.xml, and start Spark via spark-submit, pyspark, or sparkR — none of these options connect to ADLS. Running Spark locally with the same settings works, so there are presumably some extra steps needed to configure that connectivity from Kubernetes. It would also be nice to see an example in the documentation using Jupyter.
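For reference, a minimal sketch of what such a submission might look like is below. This is an assumption, not a tested recipe: all angle-bracketed names are placeholders, the image tag and jar path are hypothetical, and it assumes ADLS Gen1 with a service-principal (`ClientCredential`) login, passing the Hadoop properties via `spark.hadoop.*` so they don't have to be baked into the image's core-site.xml.

```shell
# Hypothetical sketch: spark-submit against an AKS cluster with ADLS Gen1
# (adl://) OAuth2 credentials passed as spark.hadoop.* properties.
# Everything in <angle brackets> is a placeholder, not a value from this thread.
./bin/spark-submit \
  --master k8s://https://<aks-api-server>:443 \
  --deploy-mode cluster \
  --name adls-read-example \
  --conf spark.kubernetes.container.image=<registry>/spark:2.4.0-adls \
  --conf spark.executor.instances=2 \
  --conf spark.hadoop.fs.adl.impl=org.apache.hadoop.fs.adl.AdlFileSystem \
  --conf spark.hadoop.dfs.adls.oauth2.access.token.provider.type=ClientCredential \
  --conf spark.hadoop.dfs.adls.oauth2.client.id=<service-principal-app-id> \
  --conf spark.hadoop.dfs.adls.oauth2.credential=<service-principal-secret> \
  --conf spark.hadoop.dfs.adls.oauth2.refresh.url=https://login.microsoftonline.com/<tenant-id>/oauth2/token \
  local:///opt/spark/jars/<your-app>.jar \
  adl://<account>.azuredatalakestore.net/<path-to-data>
```

Even with something like this, the missing piece the docs could clarify is whether the executor pods need anything beyond these properties (jars on the image classpath, DNS/network egress to ADLS, etc.).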
Was this helpful?
Eliano
@ElianoMarques got it. Thank you for the further explanation :)
@lenadroid can you take a look?
CC @iainfoulds
Thanks for the suggestion, @ElianoMarques. I've created a backlog work item to track this doc suggestion. I don't have an ETA on when this doc may be published.
@MicahMcKittrick-MSFT For now, #please-close
Hi, is there an update to this?