I'm in the middle of removing more python2 related stuff, but leave the less trivial refactorings to separate PRs.
This one is referred in utils.data:
https://github.com/astropy/astropy/blob/master/astropy/utils/data.py#L202
@bsipocz I would like to start contributing to the project. This issue is regarding only data.py#L202 ? Or as part of the fix I should also search in other files for io.BytesIO? Thank you.
yes, I think the first approach would be to switch here only, but I'm cc-ing the io.fits and io.votable maintainers to comment whether they think it could be changed in those modules, too. (@pllim, @MSeifert04 @saimn)
Let's leave io.votable alone for now. While there is explicit note in utils to switch away from BytesIO, it is not obviously so over in io.votable. @mdboom added that note in utils 5 years ago, so if he would like to chime in, then great; if not, my decision stands.
It could be interesting to have a look for similar cases in io.fits if you want to (there is one in file.py I think), but it's fine to start with only the current issue in utils - one thing after the other.
@bsipocz I was making the first change, starting with https://github.com/astropy/astropy/blob/master/astropy/utils/data.py#L202 and I did the following:
if not hasattr(fileobj, 'seek'):
fileobj = io.BufferedReader(fileobj)
Then I executed python3 setup.py test -P utils and got the following error:
> fileobj = io.BufferedReader(fileobj)
E AttributeError: 'FakeStream' object has no attribute 'readable'
astropy/utils/data.py:209: AttributeError
Then I checked the test file and I saw that FakeStream has a read attribute, but not a readable:
https://github.com/astropy/astropy/blob/308ea8f6cbd828325bda19a1cecf680ac9b3e567/astropy/utils/tests/test_data.py#L375-L403
So I thought I would need to change the test only. But, I decided to check how the method utils.get_readable_fileobj is used and I saw this:
https://github.com/astropy/astropy/blob/308ea8f6cbd828325bda19a1cecf680ac9b3e567/astropy/io/ascii/core.py#L289-L295
Currently, only the attribute read is checked (I didn't check other usages for utils.get_readable_fileobj), but in order to use io.BufferedReader the object should have the attribute readable, as io.FileIO objects have.
So, when the parameter name_or_obj of utils.get_readable_fileobj is an object, we need to make sure it has the readable attribute, otherwise io.BufferedReader will fail.
So I guess we would need to add the attribute readable to FakeStream (to fix the test) and ensure that the objects passed to utils.get_readable_fileobj also have this attribute, but since I am new to the project, I don't know what would be the impact of the latter in terms of changes in the rest of the codebase.
Could please tell me what to do now?
I am not experienced with file I/O in python, so if there's something I am missing, please let me know.
I'm interested in working on this issue if it isn't actively being worked on. Is it okay for me to move forward with it? Thanks!
@jairideout , I don't see why not, though I would advise submitting one PR per sub-module. That is separate out, say utils changes from io.fits and so on. I also advise that you don't start with io.fits. :smile:
Thanks for the guidance @pllim! I'll get started on this soon, with a separate PR per sub-module.
@pllim I submitted a PR (#7981) to update the utils subpackage to use BufferedReader instead of BytesIO.
I'll review when I get a chance. Thank you very much!
An attempt at this , thanks to @jairideout , has made us realized that this battle is not worth fighting. We can revisit if there is a practical need in the future.
Most helpful comment
An attempt at this , thanks to @jairideout , has made us realized that this battle is not worth fighting. We can revisit if there is a practical need in the future.