Recent refactoring in #2873 and #2853 implemented isdir(), isfile() and exists(). exists() checks if the given path exists using isdir as well as isfile. isfile simply checks for existence of the blob (in case of s3 and gs), whereas isdir checks if any files with the prefix exists.
.cache_exists() uses .exists(), but does not care about directory paths. So, to reduce number of ._list_paths() remote calls, it's better to refactor .cache_exists() to use isfile.
As @Suor notes here https://github.com/iterative/dvc/pull/2873#discussion_r352328162, isfile might not have been implemented in all remotes, or might be slower in some remotes.
Further infomation: https://github.com/iterative/dvc/pull/2873#discussion_r352332637
BTW, this is a bit more complicated.
.exists() being more complex and slower is true for s3 and gs, but might be not true for other remotes likes local and ssh. So we should bench things and might even require using different calls for different remotes to get top performance everywhere.
@skshetry, I was thinking about leveraging S3's CommonPrefixes to check if a file exist. That should handle it in one call, instead of the isfile or isdir check.
Most helpful comment
BTW, this is a bit more complicated.
.exists()being more complex and slower is true for s3 and gs, but might be not true for other remotes likes local and ssh. So we should bench things and might even require using different calls for different remotes to get top performance everywhere.