Azure-docs: Custom path mapping explanation is not clear.

Created on 28 Mar 2019 · 6 Comments · Source: MicrosoftDocs/azure-docs

After reading the documentation, I'm still not sure I understand custom path mapping correctly. It says I can specify a path, but that I also have to include /* as the default path. Does that mean it will index all fields anyway? If not, what's the point of keeping that default value? In a scenario where I only want to index specific properties out of, let's say, a hundred fields, how do I set that up?


cosmos-dsvc cxp in-progress product-question triaged

All 6 comments

And if I set my specific properties, what's the point of the exclusion option? By the way, I tried to remove the "etag" from the exclusions and add /*, but that didn't remove the etag from the list.

@pournejati Thanks for the valuable feedback.
@markjbrown Could you please elaborate on "When you set custom index paths, you're required to specify the default indexing rule for the entire item, denoted by the special path /*", as mentioned in the document.

If we supply custom index paths, it is because we want to index specific paths rather than all of them. But if we also add the default indexing rule, /*, as the document requires, it would seem to index every path, leaving no point in listing specific ones.

Could you please shed some light on this?

Let me give you a scenario. Let's say I have a document with the following fields (including Cosmos DB internal fields):

{
    "id": "c9ee7bdd-67ee-efd8-f9e7-79852b687400",
    "customerId": "9683b229-d2cd-4c4f-bef2-d5f94ed61abc",
    "type": "BenefitAssignments",
    "sourceId": "827",
    "commonCode": "C",
    "displayName": "Not assigned",
    "_rid": "vxxDAPayRyNNAAAAAAAAAA==",
    "_self": "dbs/vxxDAA==/colls/vxxDAPayRyM=/docs/vxxDAPayRyNNAAAAAAAAAA==/",
    "_etag": "\"0b000320-0000-0000-0000-5c9bbe710000\"",
    "_attachments": "attachments/",
    "_ts": 1553710705
}

The "customerId" is the partition key. I'd like only the following fields to be indexed: id, customerId, type, and sourceId. Any other field in the document should be excluded from indexing. What is the correct approach here? The default setting automatically excludes _etag. What if I want to exclude everything except the fields I mentioned above?

@pournejati Thank you for providing the scenario.
You can specify the fields that you want to be excluded from indexing.

"excludedPaths": [
    {
        "path": "/\"_etag\"/?"
    },
    {
        "path": "/\"commonCode\"/?"
    },
    {
        "path": "/\"displayName\"/?"
    }
]

Please let us know if you have further questions.
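
For context, the fragment above would sit inside a complete indexing policy. A minimal sketch, assuming the standard Cosmos DB indexing policy layout (`indexingMode`, `automatic`, `includedPaths`, `excludedPaths`) and reusing the paths from this thread: the root path `/*` stays in `includedPaths`, so every property is indexed except the ones listed under `excludedPaths`.

```json
{
    "indexingMode": "consistent",
    "automatic": true,
    "includedPaths": [
        { "path": "/*" }
    ],
    "excludedPaths": [
        { "path": "/\"_etag\"/?" },
        { "path": "/\"commonCode\"/?" },
        { "path": "/\"displayName\"/?" }
    ]
}
```

Here `/*` acts as the default rule for anything not matched by a more specific path. When an included and an excluded path conflict, the more precise one wins, which is why `/*` does not cancel out the exclusions.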

@pournejati We generally do not recommend customers exclude everything, then selectively add paths to include in the index. Very often customers who use this approach end up wondering why their queries do not work when their schema changes because they forgot to modify their indexing policy.

The approach we recommend is to index everything by default and selectively exclude paths. Typically this is done as a way to improve write performance, and the exclusion list generally only needs to cover paths that are deep hierarchies and/or arrays. The benefit you would get from excluding paths on a flat schema such as yours will be negligible. Plus, if you ever do want to query on one of those paths in the future, you will need to modify your indexing policy and wait for the index to update before running your queries.

Thanks.
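
For completeness, the opt-in variant asked about earlier in the thread would look roughly like the sketch below: the root path `/*` moves to `excludedPaths`, and only the listed properties are indexed. This is the approach the comment above advises against, and it rests on assumptions. The unquoted path form (`/type/?`) is taken from the general Cosmos DB indexing policy format rather than from this thread, and `id` is left out on the assumption that it is indexed automatically in consistent mode.

```json
{
    "indexingMode": "consistent",
    "automatic": true,
    "includedPaths": [
        { "path": "/customerId/?" },
        { "path": "/type/?" },
        { "path": "/sourceId/?" }
    ],
    "excludedPaths": [
        { "path": "/*" }
    ]
}
```

Either way, `/*` appears exactly once, in one list or the other. It sets the default for every path not named explicitly, so a query that filters on a property added to the schema later would not be served by the index until the policy is updated.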

Thanks for the clarification. It helped a lot.
