When building a repository that will serve up large files (read-only), globally - would you recommend Azure Cosmos DB or Azure CDN as a better option?
@TracyGH Thank you for your interest in Azure cloud services. Do you have any specific requirements that stand out? As for developer resources, do you have a developer who understands NoSQL (in case you go with CosmosDB), and do you have an idea of your anticipated throughput? How many document requests per second/minute/hour do you anticipate?
CosmosDB can be used to store document metadata where the actual document is stored in an Azure Storage account. CosmosDB can also be geo-replicated to specific regions that are closer to your client endpoints. Azure CDN works in a similar spirit: it caches commonly requested documents at edge locations for faster retrieval.
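To make the metadata pattern concrete, here is a minimal sketch in plain Python of what such a metadata record might look like. The field names (`blob_url`, `content_type`, `size_bytes`, `tags`), the `examplefiles` account name, and the helper functions are all hypothetical, chosen only for illustration. The sketch builds JSON-serializable documents and filters them in memory; it does not call any Azure SDK, and in practice the records would be written to and queried from a CosmosDB container.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class FileMetadata:
    """Hypothetical metadata record; the file itself lives in Blob Storage."""
    id: str            # unique document id
    blob_url: str      # where the actual large file is stored
    content_type: str
    size_bytes: int
    tags: list = field(default_factory=list)  # used for classification/search


def to_document(meta: FileMetadata) -> dict:
    """Convert the record to a plain dict that can be serialized as JSON."""
    return asdict(meta)


def find_by_tag(documents, tag):
    """Return blob URLs of all documents carrying the given tag."""
    return [doc["blob_url"] for doc in documents if tag in doc["tags"]]


# Assumed example data: the account and container names are made up.
documents = [
    to_document(FileMetadata(
        id="manual-001",
        blob_url="https://examplefiles.blob.core.windows.net/docs/manual-001.pdf",
        content_type="application/pdf",
        size_bytes=52_428_800,
        tags=["manual", "hardware"],
    )),
    to_document(FileMetadata(
        id="video-007",
        blob_url="https://examplefiles.blob.core.windows.net/media/video-007.mp4",
        content_type="video/mp4",
        size_bytes=1_073_741_824,
        tags=["training"],
    )),
]

if __name__ == "__main__":
    print(json.dumps(documents[0], indent=2))
    print(find_by_tag(documents, "manual"))
```

The point of the split is that the database only ever holds small, searchable records, while the large read-only files are fetched directly from storage (or a CDN in front of it) using the stored URL.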
Without more detailed requirements, Azure CDN is probably the way to go as a starting point, given the added complexity of CosmosDB. Keep in mind that this recommendation is general in nature and could change once the specific requirements are known.
Thank you for the quick feedback, @Mike-Ubezzi-MSFT!
Specifically:
You mentioned, "CosmosDB can be used to store document metadata where the actual document is stored in an Azure Storage account."
Can you elaborate on that, or point me towards a good resource to learn more about the technical implementation and pros/cons of that strategy?
Thanks again!
@TracyGH I see your options as follows:
1) Use Azure Blob Storage (RA-GRS) with Azure Search Service. The consideration here is that you will need to implement your own logic for finding the closest storage region for file access, and for document management (deleting old content, etc.).
2) Use Azure CDN + Option 1. Azure CDN adds logic for routing requests to the nearest edge location as well as caching logic, which could significantly improve the end-user experience.
3) Use CosmosDB + Option 1. CosmosDB adds indexing logic and metadata 'doc' properties for document classification. CosmosDB also adds management capabilities, such as running reports/analytics on content, and allows for overall management of all content. Application logic could be built in to access the closest storage region to improve the end-user experience.
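As a rough illustration of the "find the closest storage region" logic mentioned in options 1 and 3, here is a minimal sketch that picks an endpoint by great-circle distance from the client. Everything specific in it is an assumption: the `examplefiles` account name, the approximate region coordinates, and the `closest_endpoint` helper are hypothetical. The only Azure detail relied on is that an RA-GRS account exposes a read-only secondary endpoint with a `-secondary` suffix on the account name. A real implementation might prefer measured latency over geographic distance.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0

# Hypothetical endpoints with rough, illustrative coordinates (lat, lon)
# for the primary and secondary regions of an RA-GRS storage account.
ENDPOINTS = {
    "primary": ("https://examplefiles.blob.core.windows.net", 37.4, -79.8),
    "secondary": ("https://examplefiles-secondary.blob.core.windows.net", 37.8, -122.4),
}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def closest_endpoint(client_lat, client_lon, endpoints=ENDPOINTS):
    """Return the base URL of the endpoint nearest to the client."""
    name = min(
        endpoints,
        key=lambda n: haversine_km(client_lat, client_lon,
                                   endpoints[n][1], endpoints[n][2]),
    )
    return endpoints[name][0]


if __name__ == "__main__":
    # A client near New York and a client near San Francisco.
    print(closest_endpoint(40.7, -74.0))
    print(closest_endpoint(37.8, -122.4))
```

With option 2, this kind of routing is what the CDN handles for you, which is a large part of why it is the simpler choice for read-only content.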
Since indexing/searching content is the deal breaker, here is a good Stack Overflow post regarding Blob Storage indexing: How to Index the Blob Storage in Azure?