Google-cloud-ruby: Docs (README) should warn customers about the implications of requiring the all-in-one google-cloud module

Created on 21 Mar 2017 · 9 Comments · Source: googleapis/google-cloud-ruby

See https://github.com/GoogleCloudPlatform/google-cloud-node/issues/2102 for the corresponding Node.js issue.

Per @lukesneeringer: "You should definitely open up an issue on Ruby; aggressive require resolution is standard practice in Sinatra and other places, and use of the meta-package should be discouraged there."

p1 acknowledged feature request

All 9 comments

Does Ruby have this problem? We limit what code is loaded until the calls are actually made, so requiring "google/cloud" does not load every gem fully into memory.

Here is where google-cloud-bigquery lazily executes the require. Other service support is similar.
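
To illustrate the lazy loading described above, here is a minimal sketch of the pattern. It is not the actual google-cloud source: the `LazyCloud` module and its `shell` accessor are hypothetical, and the standard library's `shellwords` stands in for a heavy service library such as google/cloud/bigquery.

```ruby
# Hypothetical stand-in for an umbrella library; not google-cloud's real code.
module LazyCloud
  class Client
    # The require lives inside the accessor, so the service library is
    # loaded on the first call rather than when LazyCloud itself is loaded.
    def shell
      require "shellwords" # stdlib stand-in for a heavy service library
      Shellwords
    end
  end
end

client = LazyCloud::Client.new          # no require has run yet
words  = client.shell.split("a 'b c'")  # first call triggers the require
puts words.inspect
```

Because `require` is idempotent, later calls to the accessor cost almost nothing; only the first call pays for loading the library.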

Sounds good. You've clearly covered this base already.

@bjwatson Can we close this issue?

Yep, thanks!

For the sake of history, here is the memory pressure that the google-cloud libraries add. We have tried really hard to make it so only the services that are used are loaded.

First creating a global gcloud object and loading a service from it:

# create a baseline of memory usage
baseline = Object.new #-> 43.7 MB

# load the Google Cloud library, create global object
require "google/cloud" #-> 44.3 MB
gcloud = Google::Cloud.new #-> 44.3 MB

# create BigQuery object (and load the library) from global object
bigquery = gcloud.bigquery #-> 64.9 MB

# explicitly load the BigQuery library
require "google/cloud/bigquery" #-> 64.9 MB
# create another BigQuery object using the library initializer
bigquery2 = Google::Cloud::Bigquery.new #-> 65.7 MB

Second, load google/cloud, but don't create a global object:

# create a baseline of memory usage
baseline = Object.new #-> 43.7 MB

# load the Google Cloud library
require "google/cloud" #-> 44.3 MB

# explicitly load the BigQuery library
require "google/cloud/bigquery" #-> 64.0 MB
# create BigQuery object using the library initializer
bigquery = Google::Cloud::Bigquery.new #-> 65.2 MB

# create global object
gcloud = Google::Cloud.new #-> 65.2 MB
# create another BigQuery object from global object
bigquery2 = gcloud.bigquery #-> 65.2 MB

Third, load and create a service object, then load google/cloud:

# create a baseline of memory usage
baseline = Object.new #-> 43.7 MB

# explicitly load the BigQuery library
require "google/cloud/bigquery" #-> 63.6 MB
# create BigQuery object using the library initializer
bigquery = Google::Cloud::Bigquery.new #-> 64.3 MB

# load the Google Cloud library
require "google/cloud" #-> 64.3 MB
# create global object
gcloud = Google::Cloud.new #-> 64.3 MB
# create another BigQuery object from global object
bigquery2 = gcloud.bigquery #-> 65.4 MB

As you can see, requiring google/cloud adds very little memory overhead.
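
The comment does not say how the figures above were collected. One way to take similar readings, offered as a sketch under that assumption, is to sample the process's resident set size before and after each step. The `rss_mb` and `measure` helpers below are hypothetical names, and `require "set"` is only a lightweight placeholder for the library being measured.

```ruby
# Resident set size of the current process, in MB.
# Reads /proc on Linux and falls back to `ps` elsewhere (e.g. macOS).
def rss_mb
  kb = if File.readable?("/proc/self/status")
         File.read("/proc/self/status")[/^VmRSS:\s+(\d+)/, 1].to_i
       else
         `ps -o rss= -p #{Process.pid}`.to_i
       end
  (kb / 1024.0).round(1)
end

# Run the block, print memory before and after, and return the difference.
def measure(label)
  before = rss_mb
  yield
  after = rss_mb
  puts format("%-20s %.1f MB -> %.1f MB", label, before, after)
  after - before
end

delta = measure('require "set"') { require "set" }
```

Swapping the placeholder for `require "google/cloud/bigquery"` would give a reading comparable in spirit to the ones above, though absolute numbers will vary with the Ruby version, platform, and installed gem versions.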

Most Ruby applications use a Gemfile to declare their dependencies, and some frameworks, such as Sinatra and Rails, may pre-load those libraries. We have thought this through and believe we properly control how much is loaded into memory in these circumstances. For instance, given this Gemfile:

gem "google-cloud"

And starting a Ruby process with bundle exec, the following memory is used:

# create a baseline of memory usage
baseline = Object.new #-> 54.8 MB

# call require which will load all gems in the Gemfile
Bundler.require #-> 55.2 MB

# explicitly load the BigQuery library
require "google/cloud/bigquery" #-> 69.6 MB
# create BigQuery object using the library initializer
bigquery = Google::Cloud::Bigquery.new #-> 70.6 MB

If the Gemfile is changed to include the umbrella gem as well as the service gem used:

gem "google-cloud"
gem "google-cloud-bigquery"

Then the memory usage is similar:

# create a baseline of memory usage
baseline = Object.new #-> 54.8 MB

# call require which will load all gems in the Gemfile
Bundler.require #-> 55.2 MB

# explicitly load the BigQuery library
require "google/cloud/bigquery" #-> 69.6 MB
# create BigQuery object using the library initializer
bigquery = Google::Cloud::Bigquery.new #-> 70.6 MB

Finally, here is the memory usage for loading all services into memory:

# create a baseline of memory usage
baseline = Object.new #-> 43.7 MB

# explicitly load all google-cloud gems
require "google/cloud/bigquery"
require "google/cloud/datastore"
require "google/cloud/dns"
require "google/cloud/error_reporting/middleware"
require "google/cloud/language"
require "google/cloud/logging"
require "google/cloud/monitoring"
require "google/cloud/pubsub"
require "google/cloud/resource_manager"
require "google/cloud/speech"
require "google/cloud/storage"
require "google/cloud/trace"
require "google/cloud/translate"
require "google/cloud/vision" #-> 95.4 MB

Creating object instances for each service will use additional memory.

Great! Thanks for the metrics @blowmage, and thanks to both of you for all the effort to keep the metapackage lean (i.e. only loading what's needed).
