I read a 16 GB CSV file with pandas. It used to work well, but recently I started hitting this error:
ParserError: Error tokenizing data. C error: Calling read(nbytes) on source failed. Try engine='python'.
I tried waiting a few days in case it's a limit of some sort, but it didn't help. I keep hitting this error.
It used to work before (the file was read and processed with no issues), so I would expect it to keep working the same way, as nothing has changed in either the code or the file.
Is it because of the file size? Am I now hitting some limits or what? What are the limits, if so? Or could the file get corrupted by Google Drive? Some more detailed information would be great.
My browser is: Chrome 81.0.4044.122 (Official Build) (64-bit)
There's nothing colab-specific here: that error is coming from the library you're using (I'm guessing pandas?).
I'd try asking on Stack Overflow.
This actually IS related to Google Colab, which has been acting weird lately. The error does appear when calling pandas.read_csv, but numpy.load() will also throw an Input/Output Error.
@GillesVandewiele I believe it happens when you have exceeded certain quotas. The error is just too obscure: Colab will never tell you what your quotas are, or even that you've exceeded them. Not sure why it has to be so obscure. I moved back to my local machine; Colab is only slowing me down and is not helpful at all because of these issues.
Try just waiting a day or a few days, it may start to work later (apparently when you get a fresh quota, which they won't tell you but it will happen lol).
It's quite a disgrace how they put a quota on the amount you can read from your own drive... Thanks for the tips @sapph1re!
I agree, especially considering that I may actually be paying for my drive space. I do, because I needed a lot. I guess, not going to continue, because I still can't use it properly.
I'm paying for both drive and colab pro. Colab is being super weird these days. I might quit both, super annoying.
I got this error, and before it, I was getting the input/output one.
I'm having the same issue, and it's really annoying. As others said, it seems to happen when you read more than a certain amount from Drive, but I only read a 22 MB CSV file about 10 times in 12 hours.
I tried reading the CSV file from a new account but hit the same issue there too.
The only way I could get around this was to upload the file directly to the temporary Colab Files section.
Yeah, same issue here. I tried to read an Excel file with pandas and got an error.
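A related variation on that workaround, for when the Drive mount is still readable at least once: copy the file to the runtime's local disk a single time and point pandas at the local copy, so repeated reads don't go through Drive. This is only a rough sketch. The helper name `stage_locally` and both paths in the usage comment are made up for illustration, and if Drive itself is failing the copy will fail too, in which case uploading directly as described above is the fallback.

```python
import os
import shutil


def stage_locally(src_path, local_dir):
    """Copy src_path into local_dir (if not already there) and return the local path.

    Hypothetical helper for illustration, not part of Colab or pandas.
    """
    os.makedirs(local_dir, exist_ok=True)
    dst_path = os.path.join(local_dir, os.path.basename(src_path))
    # Skip the copy when a same-sized local copy already exists.
    already_staged = (
        os.path.exists(dst_path)
        and os.path.getsize(dst_path) == os.path.getsize(src_path)
    )
    if not already_staged:
        shutil.copy(src_path, dst_path)
    return dst_path


# Hypothetical usage in a notebook (both paths are placeholders):
# import pandas as pd
# local_csv = stage_locally("/content/drive/MyDrive/data.csv", "/content/data")
# df = pd.read_csv(local_csv)
```

The size check is a cheap way to avoid re-copying on every run; it does not detect a file whose contents changed without changing size.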
For anyone seeing new errors in the last 12 hours, see: https://github.com/googlecolab/colabtools/issues/1428
(tl;dr: we're seeing errors communicating with Drive; we're looking into it.)
I was facing the same problem with my local Jupyter notebook. What I found is that a folder called .ipynb_checkpoints had been created in the folder where I keep all my CSV files, so it was throwing an error while I was looping through the folder and reading the CSV files one after another. Just sharing this here since I landed on this issue, in case it helps anyone.
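For anyone hitting the .ipynb_checkpoints case, one way to avoid it is to loop only over regular files with a .csv extension instead of everything in the folder. A minimal sketch; the helper name `list_csv_files` and the folder name in the usage comment are made up for illustration.

```python
from pathlib import Path


def list_csv_files(folder):
    """Return sorted paths of regular *.csv files directly inside folder.

    Directories such as .ipynb_checkpoints and non-CSV files are skipped.
    """
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".csv"
    )


# Hypothetical usage (folder name is a placeholder):
# import pandas as pd
# frames = [pd.read_csv(p) for p in list_csv_files("data")]
```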
Upload the dataset to Drive:
https://www.youtube.com/watch?v=Gvwuyx_F-28&t=497s
Watch this; it worked for me.