This page is very confusing. It starts off by saying that Parquet is the preferred format and sends me off to the Parquet site, which has me thinking I'm going to have to write my own serializer. (Fortunately there appears to be a NuGet library that can do the job: https://github.com/elastacloud/parquet-dotnet.) By the end it's talking about JSON, but it's still very unclear (without a thorough read) whether I'm expected to provide blobs named in the format described, or whether I can just throw any Parquet blob in and have it processed (i.e. are you telling me about the internal implementation here?). A short code or pseudo-code example showing how to get a couple of records into the system would do wonders here.
Hi @NeilMacMullen, thanks for taking the time to provide your valuable feedback! I have assigned the issue to the content author to evaluate and update as appropriate.
Thanks Alberto. Having spent a fair amount of the day reading about TSI, I think a lot of my confusion came from assuming that this page was describing how to flexibly import data from another system by, for example, writing to blobs that TSI would then read as input. It's only at the end of this article that it is mentioned that input sources "include" Event Hubs and IoT Hub, which really means that these are the only supported mechanisms! I think it would help a lot if the article _started_ by outlining that the only way to get data into TSI at present is via those two mechanisms, and then went on to make it clear that the discussion of Parquet is really describing the intermediate and output data format.
@NeilMacMullen - Thanks for the feedback - I will look into ways to clarify the introduction. Cheers!
@deepakpalled - pinged you.
@KingdomOfEnds - Let us review and restructure the content to lead the concept with the two mechanisms our customers can use to ingest data into TSI.
@NeilMacMullen - I've gone ahead and pushed the updated document. Thanks!