If I run the following Python code:
from google.cloud import storage
#gcs_project = some_project
#bucket_name = some_bucket
#gcs_path = path/to/dir/
storage_client = storage.Client(project=gcs_project)
bucket = storage_client.get_bucket(bucket_name=bucket_name)
blobs = bucket.list_blobs(prefix=gcs_path)
I end up with an empty iterator for blobs. I was running the same code two weeks ago, with the same values for gcs_project, bucket_name and gcs_path and was getting results. I have been double-checking this for days, and am really hoping it's some silly mistake I'm overlooking.
Currently running Python 3.6.5, google-cloud-storage 1.18.0.
Thanks,
Alex
Can you double-check that the user / service account you are using to authenticate with has the right IAM permissions to list objects on that bucket?
It's a service account with Storage Admin role...I'm assuming that includes listing objects? It was also working before, and I never changed any of the permissions.
Storage Admin should be sufficient. Maybe @frankyn knows of other things to check?
Hi @alexf-a,
The prefix value depends on there being objects that use it. If there are none, GCS will respond with an empty list.
I'm assuming you're iterating through each blob such as:
for blob in blobs:
    print(blob.name)
I'd verify that there are no typos in the prefix and that there are object names that are prefixed with it.
A quick debugging step is to remove "prefix=" to get a list of all the objects in the bucket.
from google.cloud import storage
storage_client = storage.Client()
# Get blobs
blobs_iterator = storage_client.list_blobs("bucket-name")
# Print the name of every blob in the bucket (no prefix filter)
for blob in blobs_iterator:
    print("{}".format(blob.name))
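To make the prefix check concrete, here is a minimal sketch that models prefix filtering as a plain, case-sensitive string match on object names. It does not call GCS; filter_by_prefix and the object names are hypothetical and only illustrate why a typo, a leading slash, or different casing could produce an empty list.

```python
def filter_by_prefix(object_names, prefix):
    """Keep only the names that start with prefix (literal, case-sensitive)."""
    return [name for name in object_names if name.startswith(prefix)]


# Hypothetical object names; in a real bucket these would come from a listing.
object_names = [
    "path/to/dir/a.csv",
    "path/to/dir/sub/b.csv",
    "path/to/other.csv",
]

# The prefix matches the start of the full object name, nested "folders" included.
matches = filter_by_prefix(object_names, "path/to/dir/")

# A leading slash or different casing matches nothing, so the result is empty.
leading_slash = filter_by_prefix(object_names, "/path/to/dir/")
wrong_case = filter_by_prefix(object_names, "Path/to/dir/")

print(matches)
print(leading_slash, wrong_case)
```

If the unfiltered listing shows your objects but the filtered one is empty, comparing the names character by character against the prefix is usually the next thing to try.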
Hey it worked!
I think the issue was the privileges the whole time, combined with the blobs.num_results property.
I added a check for an empty iterator using blobs.num_results, raising an error when blobs.num_results == 0, to stop the function early if list_blobs returned nothing.
The code below still raises the error, even when list_blobs does not return empty.
blobs = bucket.list_blobs(prefix=gcs_path)
if blobs.num_results == 0:
    raise Exception("list_blobs returned an empty iterator")
Is there any other way to quickly check if list_blobs returned an empty iterator?
Thanks for your help!
list_blobs doesn't actually make any API calls until you start iterating over it, so num_results won't be populated until you fetch at least the first page.
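As a rough illustration of that behaviour, the sketch below uses a small stand-in class. LazyResults is hypothetical and is not the library's iterator; it only mimics the idea that the counter starts at zero and grows as items are actually consumed.

```python
class LazyResults:
    """Hypothetical stand-in for a lazily evaluated listing (not the real class)."""

    def __init__(self, items):
        self._items = items
        self.num_results = 0  # nothing has been fetched yet

    def __iter__(self):
        for item in self._items:
            self.num_results += 1  # counted only as each item is handed out
            yield item


blobs = LazyResults(["a.txt", "b.txt"])

# Checking the counter before iterating always sees zero,
# even though the listing is not empty.
count_before = blobs.num_results

names = list(blobs)  # iterating is what does the work
count_after = blobs.num_results

print(count_before, count_after)
```

So a check like blobs.num_results == 0 placed right after list_blobs would be true regardless of what is in the bucket.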
@alexf-a Please follow up / reopen if @tswast's answer isn't sufficient for your needs.
list_blobs doesn't actually make any API calls until you start iterating over it, so num_results won't be populated until you fetch at least the first page.
Why? That seems like more of a bug than a feature.
It's how the core iterator works. I agree that it's confusing, but I'd consider it to be a breaking change if list_ functions started making API calls immediately.
Hmm, I see. Thank you for the clarification! It's easy enough to work around by iterating over the blobs before checking num_results, it's just not ideal:
for blob in blobs:
    pass
Just chiming in (after hitting this same issue and landing here) to say that list(blobs) works too and is much less ugly 😊
>>> blobs = list(bucket.list_blobs())
>>> len(blobs)
10 # or whatever
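One caveat: list(blobs) pulls every result into memory. If all you need is an early "is it empty?" check, a possible alternative is to peek at the first item and then chain it back on. This is only a sketch using the standard library; peek_nonempty and fake_list_blobs are hypothetical names, and the generator stands in for the iterator returned by list_blobs.

```python
import itertools


def peek_nonempty(iterable):
    """Return (has_items, iterator) without consuming more than the first item."""
    iterator = iter(iterable)
    try:
        first = next(iterator)
    except StopIteration:
        return False, iter(())
    # Put the first item back in front of the remaining ones.
    return True, itertools.chain([first], iterator)


def fake_list_blobs(names):
    """Hypothetical lazy stand-in for bucket.list_blobs(prefix=...)."""
    for name in names:
        yield name


has_blobs, blobs = peek_nonempty(
    fake_list_blobs(["path/to/dir/a.txt", "path/to/dir/b.txt"])
)
if not has_blobs:
    raise Exception("list_blobs returned an empty iterator")

names = list(blobs)  # the peeked item is still included
print(names)
```

The returned iterator is a plain chain object, so attributes of the original iterator (such as num_results) are not available on it; keep a reference to the original if you need those.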