Gsutil: make gsutil work on Python 3.x

Created on 22 Feb 2013 · 44 Comments · Source: GoogleCloudPlatform/gsutil

_Original author: [email protected] (August 19, 2010 22:33:59)_

Currently if you try to run gsutil on Python v3.x it will get syntax errors at print statements, because Python 3.x made print into a function, so the old syntax:
print 'abcd'
doesn't work, and instead you need:
print('abcd')
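The difference can be checked without running gsutil at all. The snippet below is only an illustrative sketch (the `parses` helper is a hypothetical name, not part of gsutil): it asks the running interpreter whether each form compiles.

```python
import sys

OLD_STYLE = "print 'abcd'"    # Python 2 statement form
NEW_STYLE = "print('abcd')"   # function-call form


def parses(source):
    """Return True if the running interpreter can compile the source text."""
    try:
        compile(source, '<example>', 'exec')
    except SyntaxError:
        return False
    return True


# On Python 3 only the function form compiles; on Python 2 both do.
# Code that must run on 2.6+ and 3.x usually adds
# "from __future__ import print_function" at the top of each module.
print('old style parses:', parses(OLD_STYLE))
print('new style parses:', parses(NEW_STYLE))
```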

There are likely other problems beyond just this to make gsutil work on v3.x. Moreover, at present the boto library only works on Python v2.x.

At some point, when boto moves to support Python 3.x, gsutil should as well.

_Original issue: http://code.google.com/p/gsutil/issues/detail?id=29_

Priority-High Python 3 Compat imported

All 44 comments

_From [email protected] on September 10, 2010 12:02:36_
An update from the boto-users mailing list indicates this Python 3.x status is unlikely to change soon:

Mitchell Garnaat [email protected] Sep 10 06:40AM -0400

Hi -

Currently, boto works only with Python 2.x (specifically 2.4 or later). I
have done some preliminary work to see how difficult it would be to support
both 2.x and 3.x and the amount of effort was daunting.

Mitch

_From [email protected] on April 01, 2011 18:48:03_
Today Mitch Garnaat merged one boto project member's work on porting boto to Python 3.x into a branch on GitHub. It's not yet clear when that branch will be merged to master, but it means we're closer to being able to support gsutil on Python 3.x.

_From [email protected] on April 01, 2012 20:59:46_
Update: There's now a branch of boto where Python 3.0 support is actively being worked on. I believe the plan is still to support Python 2.x (they did some work in the past couple months to migrate away from 2.x-specific syntax in the master branch).

Once this work is done, it hopefully shouldn't take a huge amount of work to make gsutil run on Python 3.x.

I'm working on a set of patches here (https://github.com/kurin/gsutil/tree/py3_attempt) that, when coupled with my patches to boto here (https://github.com/kurin/boto), are at least somewhat usable (I can copy, put, list, etc.). All the unit tests pass, although not all the integration tests. You can convert gsutil to python3 with a small script:

#!/bin/sh

rm -rf convert; mkdir convert

cp -ap gsutil convert
2to3 -nw convert/gsutil >> convert/2to3.log 2>>convert/2to3.err
cp -ap gslib convert
2to3 -nw convert/gslib >> convert/2to3.log 2>>convert/2to3.err
2to3 -nw convert/gslib/tests >> convert/2to3.log 2>>convert/2to3.err
cp -ap oauth2_plugin convert
2to3 -nw convert/oauth2_plugin >> convert/2to3.log 2>>convert/2to3.err
cp -ap third_party convert
2to3 -nw convert/third_party >> convert/2to3.log 2>>convert/2to3.err

cp VERSION convert

Then just export BOTOLIB=/path/to/boto/repo and convert/gsutil [whatever, maybe test -u]

Hi Toby,

Thanks for your work on this.

I cloned your gsutil fork
https://github.com/kurin/gsutil/tree/py3_attempt and
ran the script you provided. We can take it from here, but I had a couple
questions:

  1. The converted code contains python3-specific features, such as an import
    for configparser. Is there a reasonable way to make the code work under
    both python 2.x and python 3.x, or would we basically have to maintain two
    forks going forward?
  2. When I tried running the converted gsutil under python3 it found some
    python 2-isms, such as:

File "gsutil", line 129
except getopt.GetoptError, e:

We could certainly fix these by hand, but it sounds like when you ran the
script the gsutil code it produced a mostly working gsutil; this one
obviously failed before running anything.
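For reference, both items above have forms that run unchanged on Python 2.6+ and 3.x. This is a minimal sketch rather than gsutil's actual code; the `parse_flags` helper and the `'dh'` option string are made up for illustration.

```python
import getopt

try:
    import configparser  # Python 3 module name
except ImportError:
    import ConfigParser as configparser  # Python 2 fallback, same API surface


def parse_flags(argv):
    """Return (opts, args, error); error is None when parsing succeeds."""
    try:
        # 'dh' is a hypothetical option string used only for this example.
        opts, args = getopt.getopt(argv, 'dh')
    except getopt.GetoptError as e:  # the "as" form is valid on 2.6+ and 3.x
        return [], [], str(e)
    return opts, args, None


# RawConfigParser exists under both module names.
config = configparser.RawConfigParser()
print(parse_flags(['-d', 'ls']))
print(parse_flags(['-x']))
```

The comma form (`except getopt.GetoptError, e:`) is a syntax error on Python 3, which is why the converted script fails before running anything.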

Thanks again,

Mike

On Sat, Feb 23, 2013 at 2:42 AM, Toby Burress [email protected] wrote:

I'm working on a set of patches here (https://github.com/kurin/gsutil/tree/py3_attempt) that, when coupled with my patches to boto here (https://github.com/kurin/boto), are at least somewhat useable (I can copy, put, list, etc.). All the unit tests pass, although not all the integration tests.

  1. The idea is that you'd maintain one code base, but code it in such a way that 2to3 does the right thing when run on it (such as by the script I gave). After this initial push I suspect the effort involved to do that would be pretty minimal; mostly you'd just have to be careful about whether the methods you're working with deal with _strings_ (such as for printing to the screen) or _bytes_ (such as for sending data on the wire).
  2. It looks like you were running the ./gsutil right under the root; my script creates a whole ecosystem under ./convert where a new, python3-compatible gsutil is, and that should run fine.

Note also that this isn't complete; the unit tests pass, but many of the integration tests fail. I suspect most of these are because the tests are written with the expectations that we'll get strings back from the server, and obviously in python3 we get bytes instead. So a lot of the tests have to be (very slightly) modified to account for this.
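The strings-versus-bytes point can be sketched with two small helpers. These names (`to_text`, `to_bytes`) and the UTF-8 default are assumptions made for illustration; they are not taken from gsutil or boto.

```python
def to_text(value, encoding='utf-8'):
    """Return value as text, decoding it first if it arrived as bytes."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def to_bytes(value, encoding='utf-8'):
    """Return value as bytes, encoding it first if it is text."""
    if isinstance(value, bytes):
        return value
    return value.encode(encoding)


# A test that used to compare a server response against a str literal
# can normalize the response first (the value here is a made-up example).
response_body = b'gs://bucket/obj'
assert to_text(response_body) == 'gs://bucket/obj'
```

Normalizing at the boundary (decode when data comes off the wire, encode just before it goes back) keeps the rest of the code working with one type.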

Also Mitch is being aloof about my boto patches. But when I have as many integration tests passing as I think I can get (hopefully all of them), I'll submit a pull request.

On Sat, Feb 23, 2013 at 11:23 AM, Toby Burress [email protected]:

1. The idea is that you'd maintain one code base, but code it in such a
way that 2to3 does the right thing when run on it (such as by the
script I gave). After this initial push I suspect the effort involved to do
that would be pretty minimal; mostly you'd just have to be careful about
whether the methods you're working with deal with _strings_ (such as
for printing to the screen) or _bytes_ (such as for sending data on
the wire).

So you're suggesting that we distribute both a python2 and a python3
version of the code, where the latter is generated via 2to3 from the source?

My preference would be to make the code smart enough that we can still
distribute just one version, and have conditional logic in places so that
one distro works under any of 2.6, 2.7, 3.2, or 3.3.
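A single-distribution approach of this kind usually comes down to a few version-conditional imports and aliases. The sketch below only illustrates that pattern; `encode_object_name` is a hypothetical helper, not gsutil code.

```python
import sys

PY3 = sys.version_info[0] >= 3

if PY3:
    from urllib.parse import quote
    text_type = str
else:
    from urllib import quote
    text_type = unicode  # noqa: F821 (name only exists on Python 2)


def encode_object_name(name):
    """Percent-encode a name, including '/' (hypothetical helper)."""
    return quote(name, safe='')


print(encode_object_name('dir/file name.txt'))
```

Keeping the branches in one small module means the rest of the code can import `quote` and `text_type` from there without caring which interpreter is running.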

2. It looks like you were running the ./gsutil right under the root; my
script creates a whole ecosystem under ./convert where a new,
python3-compatible gsutil is, and that should run fine.

No, I ran gsutil from the convert dir and got this failure.

Note also that this isn't complete; the unit tests pass, but many of the
integration tests fail. I suspect most of these are because the tests are
written with the expectations that we'll get strings back from the server,
and obviously in python3 we get bytes instead. So a lot of the tests have
to be (very slightly) modified to account for this.

Yes, understood.

Also Mitch is being aloof about my boto patches. But when I have as many
integration tests passing as I think I can get (hopefully all of them),
I'll submit a pull request.

Heh, I don't think he's being aloof; I think he's got too much on his
plate, and the python3 conversion is not his highest priority. I think he's
actually quite appreciative that you're doing it, and open to it being
incorporated into boto (as long as it "does no harm").

Note that I'm also a committer on boto, so I'm willing to help with reviews
etc. Getting gsutil working on Python3 has some value to Google, and
getting boto working on Python3 obviously is prereq...

Mike

Hm, well, there is also the six module, which allows one to conditionally write Python 2-isms or Python 3-isms. I haven't used it, but I think this approach would require more extensive changes to the code base (e.g. fixing every import, every print statement, etc.). Perhaps a compromise might be logic at the top of gsutil that detects on the first run whether it's in python3.2+ and, if so, automatically modifies all the files in place. Then the actual method of distribution would be identical to what I've already seen (download this tarball, run this program, follow these directions).

Ok, thanks Toby.

After some discussion, we've decided we'll wait until the boto code port to
Python 3 is done, and then we'll do a manual transformation of the gsutil
code so it supports both 2.x and 3.x.

Thanks again,

Mike

On Sat, Feb 23, 2013 at 11:16:43AM -0800, Mike Schwartz wrote:

Ok, thanks Toby.

After some discussion, we've decided we'll wait until the boto code port to
Python 3 is done, and then we'll do a manual transformation of the gsutil
code so it supports both 2.x and 3.x.

I see, so something like
https://github.com/kurin/gsutil/tree/py3_single_codebase? This is the
result of a 2to3 conversion of the source, plus some massaging to get
it to play nice with 2.6. It still passes the full test suite in python
2.6 and 2.7, and the unit tests and most of the integration tests in 3.3.

This seems to work, although it's explicitly discouraged in PEP

  1. I had to do some violence to urllib. The aftermath is in
    third_party/fix_urllib.py.

Note that it reflects the new cacert.txt that was pushed to boto/boto
earlier today or yesterday.

Seems like boto supports python3 now.

Perhaps we can revisit this now :)

Please consider removing the "Priority-Low" tag. Python 3 is fast becoming the default on many platforms.

It's been a while since the last comment here (and over 4 years since this issue was created). Is there any status update for this issue available?

This is still fairly low priority at this point, unfortunately. There are several blockers for us starting on this work, a few of them being:

  • Higher-priority tasks taking up development time
  • Not all dependencies are fully compatible with Python 3 (some are partially ported, or working but without full feature parity)
  • Gsutil is packaged with gcloud, which only supports Python 2 syntax currently -- users that use gsutil via gcloud would see no benefit from any Python 3 compatibility changes

IMHO, this issue is quite high priority for a lot of developers and newcomers to GCP. So, if I take a look at my crystal ball, I see two futures:

  • Option A: GCP soon develops an independent version of gsutil that is compatible with Py3. This version will be used as a reference for the gcloud migration to Py3. Happy developers will use this new gsutil from the beginning.
  • Option B: GCP releases a Py3-compatible gsutil after X years, together with gcloud. In the meantime, new unofficial libraries will appear and become more popular than gsutil. Developers will not be eager to change their codebases to use the new official library. Unofficial libraries will not be totally compatible with the API. Bottom line: it will be a mess.

it seems like the priority should be raised either this year or next year, https://python3statement.org/
https://pythonclock.org/
python 2 is being slowly abandoned

Additional data that may help with priority:

RHEL 8 will drop Python 2.7 as documented in the deprecation notice for RHEL 7.5

The lack of gsutil support for python3 constantly increases the complexity of migration projects I have been involved with.

Hey! If we wanted to help with this, is there any way how to do that? This is the only High-priority task in this repo, so I am surprised it's not getting any traction.

Boto3 has been supported on python3 for 3 years already.

It is getting traction :) We just haven't really been updating this thread, apologies.

@edhodapp is currently working through getting our integration tests passing for both the JSON and XML APIs. He's testing against a project that's billed to the Cloud Storage team; I wouldn't feel comfortable asking external users to work too deeply on this phase of the project, since we don't have an in-memory emulator for GCS... thus, running the integration tests requires sending real requests that you get billed real money for :(

We've created a separate py-six branch which we'll be using for the Python 3 changes, but we haven't really been keeping it in sync with Ed's branch on his fork. We plan to sync these up within the next couple of weeks.

Also, Boto3 doesn't seem particularly relevant to this thread; Boto3 includes no support for working with GCS, AFAICT. However, the older Boto library (which gsutil uses) supports Python 3, and part of Ed's work is porting our gs code tree in Boto to be Py-2-and-3-compatible.

Awesome! Thanks a lot for detailed response, greatly appreciated.

Ed's allocated time on the Python 3 project was up before we could get everything completely finished, but I checked out Ed's py-six branch from his fork and merged our master branch on top of it. I also added another commit on top of that so that tests in the current runtime (PY2) are 100% passing again. See the branch (py-six-current) and commit linked below. We'll resume work on this project after the holidays are over :)

Note: Ed's gsutil fork was also running against his boto fork (the develop branch), which had some changes to our code tree (boto/gs) to work toward making it PY2/3-compatible as well.

https://github.com/GoogleCloudPlatform/gsutil/commit/0a9fb95caf0ba3bbe6fd149cb2aa1057ce93a985

Thanks for letting me know. I don't want to be annoying, but any ETA for when this may be done?

We've got a couple folks working on this full-time for the next 2 1/2 months. Hoping to have it done by then.

That's good news. I have quite a number of small Python scripts waiting to
be ported to v3 because I also need gsutil to work with the same version

On Tue, Jan 22, 2019 at 10:18 PM Matt Houglum notifications@github.com
wrote:

We've got a couple folks working on this full-time for the next 2 1/2
months. Hoping to have it done by then.

--
Yours faithfully,
Gleb

Update concerning supported Python versions: We'd originally planned to support 3.4+, but with the announcement of Python 3.4's upcoming end of life (see [1] and [2]), we've decided to limit support to 3.5+ instead. By pure coincidence, this has the advantage of reducing the amount of remaining work in the migration. A lot of the remaining bugs and failing integration tests only occurred on Python 3.4, and this means we no longer have to focus on those work items.

[1] https://cloud.google.com/appengine/docs/deprecations/python34
[2] https://devguide.python.org/#status-of-python-branches

Sounds very reasonable to me :+1: .

Master branch should now have compatibility with Python 2.7, 3.5, 3.6, and 3.7. :slightly_smiling_face:

I will post an update here once there is a new release of gsutil containing the changes. :package:

Is there any news on when this will be released?

Hi @chris-transloc ,

I can't make any official promises, but we are currently looking at releasing gsutil v4.39 next week.

Note that once gsutil is released, if gsutil is installed through the gcloud SDK, the SDK will still launch gsutil with Python 2.x by default. In the next gcloud SDK release, the default interpreter for gsutil will be 3.x.

In the gap between the gsutil release and the gcloud SDK release, you can manually tell the SDK to use a 3.x interpreter by pointing the CLOUDSDK_PYTHON environment variable at your local Python 3.x.

I will keep this issue updated when we release gsutil, and also when the newest SDK is released that will allow gcloud installations of gsutil to run under 3.x by default.

All the best,
Lee

Hi everyone,

During pre-release testing, we found a couple Python 3 compatibility bugs that we need to address. This means the release won't be this week as mentioned in my last comment, but we are working hard to make a stable and production-ready release of gsutil as soon as possible. I'll be sure to keep you all updated as we progress.

All the best,
Lee

Hi all,

Just a heads up that we're releasing v4.39. It won't have official Python 3 support yet, since we're still ironing out some of the credential and proxy issues found during pre-release tests. I'm working hard to make sure we have full Python 3 compatibility for 4.40, which I'm hoping will be ready in the very near future!

All the best,
Lee

Just released 4.40 an hour or so ago (to PyPI and as a tarball/zip -- the gcloud install still has a few safeguards prohibiting Python 3 invocation of gsutil). This should run on Python 3.5 and up, in addition to 2.7. We can close this nearly-decade-old bug.

Despite (what I consider to be) fairly exhaustive integration testing, there will inevitably be bugs. Please don't comment here and try to necro this issue; open a new one instead :)

Hi Matt,

Did you end up using any of my changes, or did I screw it up too much?

-Ed

On Jul 1, 2019, at 13:21, Matt Houglum notifications@github.com wrote:

Just released 4.40 an hour or so ago (to PyPI and as a tarball/zip -- the gcloud install still has a few safeguards prohibiting Python 3 invocation of gsutil). This should run on Python 3.5 and up, in addition to 2.7. We can close this nearly-decade-old bug.

Despite (what I consider to be) fairly exhaustive integration testing, there will inevitably be bugs. Please don't comment here and try to necro this issue; open a new one instead :)

Haha :) We ended up grabbing most of your changes and continuing to iterate on them. I can't give you a big enough thanks for all the progress you made. Really glad I got to work with you, @walkerjoe, and @catleeball (and all the others who've submitted contributions to help toward Py3 compat!)

updated to gsutil version: 4.40, and still can't use gsutil with python3. what am I missing?

$ echo $CLOUDSDK_PYTHON
python

$ python -V
Python 3.7.3

$ gsutil version
ERROR: Python 3 and later is not compatible with the Google Cloud SDK. Please use Python version 2.7.x.

$ pyenv global 2.7.16

$ gsutil version
gsutil version: 4.40

@rednap We still need to get some changes in to the Cloud SDK code to allow gsutil to run under Python 3. They previously had some safeguards in to prevent users from accidentally trying to invoke it with Python 3 (back when it wouldn't have worked). I submitted a change today that should allow users to specify the Python interpreter used to run gsutil, but we'll need a couple more weeks to test edge cases (i.e. not allowing Python 3.0 - 3.4) and make sure it's working across all the supported OSs before publishing the recommended way to switch to Py3 when invoking gsutil through gcloud's wrapper script.
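As a rough illustration of the edge case mentioned above (accept 2.7 and 3.5+, reject 3.0 through 3.4), a version gate could look like the sketch below. This is a hypothetical helper written for this thread, not the Cloud SDK's actual check.

```python
import sys


def is_supported(version_info=None):
    """Return True for Python 2.7 and for Python 3.5 or later."""
    if version_info is None:
        version_info = sys.version_info
    major, minor = version_info[0], version_info[1]
    if major == 2:
        return minor == 7
    # Tuple comparison rejects 3.0 through 3.4 and accepts 3.5 and up.
    return (major, minor) >= (3, 5)


print(is_supported())
```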

If you'd like to try out Python 3 support in gsutil, you can install the standalone version. Or, you can bypass the gcloud wrapper script for the gcloud installation (disclaimer: this can get a bit confusing - I've seen folks do this and wonder why their gcloud settings/credentials weren't being loaded, since it's the wrapper script that does all that) by running the underlying executable at <gcloud-sdk-root>/platform/gsutil/gsutil, rather than via the usual wrapper script at <gcloud-sdk-root>/bin/gsutil.

As of the 259.0.0 gcloud release (2019-08-20), we've enabled opt-in Python 3 support for gsutil. You can set the CLOUDSDK_GSUTIL_PYTHON environment variable to the path (absolute path, or the name of an executable findable in your PATH) of your Python 3 interpreter to run it with Python 3.

$ export CLOUDSDK_GSUTIL_PYTHON=python3
$ gsutil -D version
***************************** WARNING *****************************
*** You are running gsutil with debug output enabled.
*** Be aware that debug output includes authentication credentials.
*** Make sure to remove the value of the Authorization header for
*** each HTTP request printed to the console prior to posting to
*** a public medium such as a forum post or Stack Overflow.
***************************** WARNING *****************************
gsutil version: 4.42
checksum: d901cf5a002200f4a5526a372681677c (OK)
boto version: 2.49.0
python version: 3.6.8 [...]
OS: Linux [...]
[...]

Note that this is a new env var. Some other components of the Cloud SDK still require Python 2, so we couldn't easily use the existing CLOUDSDK_PYTHON env var.
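For scripts that shell out to gsutil, the same opt-in can be applied per call instead of exporting the variable globally. A minimal sketch, assuming gsutil from a gcloud release of 259.0.0 or later is on PATH; the helper names `gsutil_env` and `gsutil_version` are made up for this example.

```python
import os
import subprocess


def gsutil_env(python_path, base_env=None):
    """Return a copy of the environment with CLOUDSDK_GSUTIL_PYTHON set."""
    env = dict(os.environ if base_env is None else base_env)
    env['CLOUDSDK_GSUTIL_PYTHON'] = python_path
    return env


def gsutil_version(python_path='python3'):
    """Run "gsutil version" under the chosen interpreter and return its output."""
    # Not called in this example: it needs a real gsutil install on PATH.
    return subprocess.check_output(
        ['gsutil', 'version'], env=gsutil_env(python_path))
```

Passing a modified copy of the environment leaves CLOUDSDK_PYTHON, and therefore the other Cloud SDK components that still need Python 2, untouched.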

I had another issue: after running setup.py for gsutil it still did not work, so at least for Python 3.7 I had to change the file "gsutil\gslib\tests\util.py" to remove errors related to the mock_storage_service lib.
As it looked like some kind of test, I commented out the import of this module and also the two classes in the script that use it: GSMockConnection(...) and GSMockBucketStorageUri(...).
It looks like it worked properly.

Would you be willing to submit a pull request with those changes?

all right, did it right now

Thanks for that! There's an additional CLA step, but once you do that I can help get it into the next release, potentially.

Just to point out, I've only tested downloading data from public Google Cloud buckets.

Isn't this resolved at this moment?

This is indeed resolved. Per my request in https://github.com/GoogleCloudPlatform/gsutil/issues/29#issuecomment-507410800, I'm going to close this issue again and lock it.

@thomasmaclean For any additional bugs mentioned here, e.g. https://github.com/GoogleCloudPlatform/gsutil/issues/29#issuecomment-542284508, feel free to open new issues and track them separately.
