Rails: Folder with active storage

Created on 2 May 2018 · 45 Comments · Source: rails/rails

I'm trying to use ActiveStorage to store my images. With Paperclip, our S3 bucket had paths like product/id_image/*.png. But with ActiveStorage, everything is at the root (plus the variants folder).

Are there any plans to organize this into a hierarchy with folders, at least one folder per model? Browsing over FTP with several images is very slow.

All 45 comments

No, sorry.

We hope this feature will be added to ActiveStorage. That would be helpful.

Why no?

@georgeclaghorn Thanks for all things Rails (OMG I am actually in Philly right now!). Would you mind putting a quick explanation of why "no?" If you do, I am happy to add a few sentences to the ActiveStorage Overview Rails Guide.

At this point, using ActiveStorage means using one bucket per app, rather than being able to use one bucket for a bunch of different Rails apps, with each using their own folder within the bucket.

We very much agree with what @ybakos said.

We also totally understand that this feature was not selected as a requirement for the first version of Active Storage, as it must have been a lot of work already! But we respectfully do not understand why there are no plans, nor any discussion, to support this in the future.

As people pointed out, having everything at the root could make browsing files hard. Also, in our case, we are doing a lot of micro-sites for clients (3 sites a fortnight). The corresponding assets need separation. We cannot use buckets to segregate the assets as there is a max of 100 buckets per AWS account. Also, this would mean contacting the infrastructure team every time we want to create a new micro-site. In other words, it would simply not scale for us.

Just want to reiterate it would be really helpful to maintain the structure post migration as mentioned in the original description. It would definitely be helpful if the library allows you to segregate data into a separate folder under a bucket and not dump everything under a single bucket.

@georgeclaghorn

100 buckets is the default limit on S3. You can request a limit increase from AWS support.

If you really spin up so many applications that you're frequently hitting the S3 bucket limit, maybe S3 isn't right for you. Google Cloud Storage doesn't have a bucket limit.

There are also reasons to separate data inside a single application. For example, for uploads of invoicing-related data, it is very cumbersome to create separate buckets for the PDFs of invoices, credit notes, and other related models. With Carrierwave it is no problem to separate them into different folders of one "accounting" bucket and also to specify their filename, like adding the customer number and invoice number, so that they can be identified by other systems.

@georgeclaghorn, what is the main reason for the rejection of this feature?
You would only need to accept a prefix in the has_one_attached and has_many_attached associations, and change one line in these methods: object_for(key), path_for(key), file_for(key).

class User < ApplicationRecord
  has_one_attached :avatar, prefix: 'avatars'
  has_many_attached :docs, prefix: 'docs'
end
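The proposed change is hypothetical (Active Storage has no `prefix` option), but the key computation it implies can be sketched as a standalone helper:

```ruby
# Hypothetical: the path computation a `prefix:` option would imply
# inside object_for/path_for/file_for. Not real Active Storage API.
def storage_path_for(key, prefix: nil)
  prefix ? File.join(prefix, key) : key
end

storage_path_for("abc123")                    # => "abc123"
storage_path_for("abc123", prefix: "avatars") # => "avatars/abc123"
```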

@vtm9 Send a PR?

@ybakos, making a PR is not the problem. I think @georgeclaghorn fundamentally does not want this feature.

@georgeclaghorn

If you want this to work with sub/sub/sub folders, and to be able to specify anything you desire in a :prefix as of today, just use the proactivestorage gem.

https://github.com/doberg/proactivestorage/tree/master/proactivestorage

@georgeclaghorn Is it possible to have ActiveStorage save your images on S3 with a custom name (e.g. a timestamp) followed by a file extension (jpg, etc.)?

We moved to shrine

Can you reopen this?

We can't use ActiveStorage without some form of custom directory structure.

It's unreasonable IMO to simply say "no", close the issue, give no explanation of any kind and offer no alternatives besides "increase your bucket limit" and "use a different file storage"

This is not the communication I expect from anything that is merged into rails core.

We can't use ActiveStorage for multi-tenant applications without having some control over the directory structure. ActiveStorage is a no-go for us.

Was there ever an explanation for the "no"?

The irony here is that there is really no such thing as directories in S3: the path becomes part of the key in a giant flat namespace, as in "foo/bar/baz.txt => file". Directory structures are simulated.

I am extremely surprised at the pushback on this request. Not so much the lack of eagerness to add the functionality, but rather the hard stance suggesting it's not valuable.

What do you think about this solution?
1) Create a Blob with the desired pathname structure and upload a file.
2) Pass the created Blob to the attach method _(as create_from_blob accepts a Blob)_.
StackOverflow

The OP asked a question (whether we planned to add a feature) and I answered it honestly. We do not plan to add the feature in question. The only major Active Storage features I plan to work on in the foreseeable future are validations and support for public files. We also don't take feature requests on GitHub, which I could have made more clear. I'm sorry for being so curt.

Please feel free to continue exploring the issue in the rubyonrails-core mailing list. If you'd find this feature useful, I'd love to chat with you about your needs to hone in on a compelling use case and an appropriate implementation. My email address is [email protected].

@georgeclaghorn you mentioned that you'd love to chat about a compelling use case and an appropriate implementation. I see a number of posts in this thread with compelling enough use-case examples, so why not share your thoughts on an appropriate implementation here, so that the general public can take advantage of it?

In its current implementation, ActiveStorage is not even SEO-friendly. In my excitement to use Rails 5.2, I used ActiveStorage, which I now see is perhaps not mature enough to be added to Rails core. I'm not sure how decisions to add such a gem to Rails core are made when basic SEO considerations are not being managed. With Rails core, I have become accustomed to best practices, with out-of-the-box solutions that keep security and basic standards in view (SEO certainly being one of them), and ActiveStorage seems to lack this.

Please do take time to share your thoughts as to what I might be missing or misunderstanding. Thank you for your support.

Hopefully this may help someone else with this problem

# lib/active_storage/service/better_s3_service.rb
require 'aws-sdk-s3'
require 'active_storage/service/s3_service'
require 'active_support/core_ext/numeric/bytes'

module ActiveStorage
  class Service::BetterS3Service < Service::S3Service
    attr_reader :client, :bucket, :root, :upload_options

    def initialize(bucket:, upload: {}, **options)
      @root = options.delete(:root) # custom option; not part of the stock S3Service
      super(bucket: bucket, upload: upload, **options)
    end

    private

    # Prepend the configured root (if any) so every key lands under it.
    def object_for(key)
      path = root.present? ? File.join(root, key) : key
      bucket.object(path)
    end
  end
end

Then, in storage.yml you can simply add

s3:
  service: BetterS3
  root: 'directory'
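The same idea would touch path_for in the disk service (linked later in this thread). Here is a standalone sketch of how such a root/prefix could compose with the two-character fan-out folders the disk service derives from the key; the `prefix` parameter is our own addition, not Rails API:

```ruby
# Sketch: disk-service-style path composition with an optional prefix.
# The ab/cd fan-out mirrors Active Storage's folder_for; `prefix` is ours.
def path_for(storage_root, key, prefix: nil)
  fan_out = File.join(key[0..1], key[2..3]) # e.g. "ab/cd" for key "abcdef"
  File.join(*[storage_root, prefix, fan_out, key].compact)
end

path_for("/var/storage", "abcdef")                    # => "/var/storage/ab/cd/abcdef"
path_for("/var/storage", "abcdef", prefix: "tenant1") # => "/var/storage/tenant1/ab/cd/abcdef"
```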

@yoones That has nothing to do with the original ask. Amazon is just deprecating bucket names as part of the path. Actual object keys can still have a "foo/bar/baz.png" structure.

https://groups.google.com/forum/#!searchin/rubyonrails-core/activestorage$20folders%7Csort:date/rubyonrails-core/VqCAEu3IMvE/8_nu_EDIBgAJ

Per DHH, as of October 2018, there are no roadmapped features for ActiveStorage.

https://aws.amazon.com/blogs/aws/amazon-s3-path-deprecation-plan-the-rest-of-the-story/

In regards to this conversation, the title on that article is misleading.

It's not about deprecating path based S3 keys, it's about deprecating path-based S3 bucket access.

The examples they give actually cover that:

Old Style: https://s3.amazonaws.com/jbarr-public/images/ritchie_and_thompson_pdp11.jpeg
New Style: https://jbarr-public.s3.amazonaws.com/images/ritchie_and_thompson_pdp11.jpeg

You'll note that the 'path' part of the key -- /images/ remains intact, they're just moving the bucket identifier from the path and into a subdomain. Presumably, this is to help with routing and load balancing issues.
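To make that concrete, here is a tiny sketch building both addressing styles for the example object from the article; the key, including its "images/" folder part, is identical in both:

```ruby
# Both S3 addressing styles for the same object; only the bucket moves
# from the path into the subdomain. The key itself is untouched.
bucket = "jbarr-public"
key    = "images/ritchie_and_thompson_pdp11.jpeg"

path_style   = "https://s3.amazonaws.com/#{bucket}/#{key}"   # old style (deprecated)
virtual_host = "https://#{bucket}.s3.amazonaws.com/#{key}"   # new style
```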

My use case:

Store files in the cloud given the constraint of relying on Cloud Cube S3 storage, which always shares one S3 bucket (cloud-cube) with a different path prefix per project. For example: https://cloud-cube.s3.amazonaws.com/projectx

ActiveStorage doesn't support that use case (nor, probably, other third-party S3-compatible storage providers).

In any case, I'm going to give you the benefit of the doubt. The project (Rails) or feature (ActiveStorage) will fail organically if support for commonly asked useful features falters. Also, one may always fork or create a new project and add what's needed while offering to others (kinda like ProActiveStorage).

Thanks for making ActiveStorage available, but I'm either ditching Cloud Cube S3 (if I can convince management) or ditching ActiveStorage as a result of this. Cross your fingers it's Cloud Cube S3. :P

@vtm9 re-awakening this issue, as I stumble upon it.

The above-linked discussion by DHH directly invites PRs as a way to discuss the addition.

I'd like to suggest the post-mortem on this conversation is that the mod missed the social/community context of how many people were involved / cared, here on github, and therefore didn't give enough support to move the conversation to the mailing list. Had they done that, it would have created a more supportive and welcoming environment.

Just use the proactivestorage gem.
https://github.com/doberg/proactivestorage/tree/master/proactivestorage

Hi @doberg, the complementary gem was a good idea, but the gem is not very active now, and overriding Rails is not a good idea (I'm now using Rails 6).

I take it there's no progress on this? We're building an app and would like to keep it focused on a single bucket, where each environment operates with its own IAM credentials that isolates it to a particular prefix (my-activestorage-bucket/dev, my-activestorage-bucket/prod, etc). From an AWS Policy perspective, this is all fine and easy to setup. Now ActiveStorage philosophies are getting in the way of finishing the work.

We are operating in an environment that is at max capacity buckets, just telling everyone to throw more buckets at the problem is ignoring real world use cases. We have legacy environments we're dealing with here. Not everything is green field.


You should probably check Shrine. It is well maintained and on the whole, it feels much better than active storage

Thanks @abhionlyone, I ended up creating a custom Service based on the comment from @bradcrawford above. I prefer small customizations over adding gems to solve small problems. I struggled with a few things, an hour or so of lost time, but it's operational now. If we continue to bang our heads against the wall with ActiveStorage's opinions, we'll give Shrine a shot. Thanks for the quick followup :thumbsup:

@georgeclaghorn, what is the main reason for the rejection of this feature?
You would only need to accept a prefix in the has_one_attached and has_many_attached associations, and change one line in these methods: object_for(key), path_for(key), file_for(key).

class User < ApplicationRecord
  has_one_attached :avatar, prefix: 'avatars'
  has_many_attached :docs, prefix: 'docs'
end

Any chance of news about that?

@vtm9 Hey, is there any chance of more detail about the line of code that we'd have to change to make that work? Or is it longer than "a single line of code"? Here are the methods that would need to be fixed, as you wrote:
https://github.com/rails/rails/blob/cf7c27f2ff2a38f90af29162d11c41628600999b/activestorage/lib/active_storage/service/s3_service.rb#L127
https://github.com/rails/rails/blob/352445b2b6db00d040d3fcb6c742f9b2795a68d1/activestorage/lib/active_storage/service/disk_service.rb#L101

Your pattern is elegant.

I'm right now working on a kind of patch to reach the same objective as @kaluznyo, and I'm following that pattern here. It's OK, but your proposal is really cool.
Let me know if you have more detail about your proposal; why not change it on our own?

Digital Ocean and Active Storage: dynamically creating folders and subfolders.

Setup: Rails 6, the "aws-sdk-s3" gem, and AWS credentials configured on your machine.

Just in case someone is looking for an extra solution or workaround for the missing :prefix pattern: I wrote this kind of code within a model that has one attachment.

It's not elegant yet, but it's a starting point...

Inside room.rb:

class Room < ApplicationRecord
  include RoomsUploaderHelper
  # some hidden callbacks not important for this case
  # some hidden enums not important for this case
  after_create :room_helper_method
  after_update :room_helper_method
  has_one_attached :header_image_project

  def numeric_instant
    # some hidden code not important for this case
  end

  def set_full_address_attributes
    # some hidden code not important for this case
  end

  def room_helper_method
    return unless header_image_project.attached?

    asset = header_image_project.blob
    url   = asset.service_url
    restructure_attachment(
      url,
      asset,
      {
        id: asset.id,
        key: asset.key,
        filename: asset.filename,
        content_type: asset.content_type,
        byte_size: asset.byte_size,
        checksum: asset.checksum
      },
      "new_folder"
    )
  end
end

Inside models/concerns/rooms_uploader_helper.rb.
This starting solution helped me figure out a way to protect my cloud asset setup for a future migration to another provider; we never know. These days I'm often working with Digital Ocean, and this code is based on that setup.

Objective: copy asset A into a specific folder based on my own logical structure. I'm not moving asset A from one point to another, but at least I can now organize my droplet well (and I'd then need to pass some other params to create subfolders under this one...).

Bad point: all my assets are duplicated at least twice. But, as I explained, this is my starting point.

require 'open-uri'

module RoomsUploaderHelper
  extend ActiveSupport::Concern

  # link t.ly/B0gx

  def restructure_attachment(url, full_object, active_storage_object, new_structure)
    old_key = active_storage_object[:key]
    client = Aws::S3::Client.new(
      access_key_id: ENV['access_key_id'],
      secret_access_key: ENV['secret_access_key'],
      endpoint: ENV['endpoint'],
      region: ENV['region']
    )

    # Re-upload the asset under a prefixed key; the "new_structure/" part
    # of the key acts as the folder.
    client.put_object(
      bucket: "your-bucket-name",
      key: "#{new_structure}/#{old_key}",
      body: URI.open(url),
      acl: "public-read"
    )

    # Not needed for the objective, but the documentation is very sparse
    # on this subject, so here is a download example as well (it saves
    # the asset locally to your machine):
    client.get_object(
      bucket: "your-bucket-name",
      key: old_key,
      response_target: '/tmp/local-file.jpg'
    )
  end
end

If some of you have a more elegant solution, I would be very interested to have a look at it.
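One lighter-weight variant of the download/re-upload approach above: S3 can copy server-side with copy_object, so the asset never has to round-trip through the app. The sketch below only builds the request parameters (bucket name and key layout are illustrative); you would pass the result to `client.copy_object` and, when moving rather than copying, follow with `client.delete_object` on the old key.

```ruby
# Build the parameters for a server-side S3 copy into a "folder" prefix.
# Assumes aws-sdk-s3 conventions; bucket name and layout are illustrative.
def copy_params(bucket, old_key, new_prefix)
  {
    bucket:      bucket,
    copy_source: "#{bucket}/#{old_key}",       # source is given as "bucket/key"
    key:         File.join(new_prefix, old_key),
    acl:         "public-read"
  }
end

params = copy_params("your-bucket-name", "abc123", "new_folder")
# then: client.copy_object(**params)
# and, to move instead of copy:
# client.delete_object(bucket: "your-bucket-name", key: "abc123")
```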

FYI - CHANGELOG at 2020-06-10 (https://github.com/rails/rails/blob/master/activestorage/CHANGELOG.md) for active storage mentions changes close to the feature discussed here:

You can optionally provide a custom blob key when attaching a new file:

user.avatar.attach key: "avatars/#{user.id}.jpg",
  io: io, content_type: "image/jpeg", filename: "avatar.jpg"

Active Storage will store the blob's data on the configured service at the provided key.
George Claghorn
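Given that changelog entry, one way to avoid repeating the key at every call site is to centralize its generation. The helper below is our own sketch; only `attach key:` itself is the documented 6.1 feature, and the "avatars/<id>" layout is illustrative:

```ruby
# Our own helper for generating per-user avatar keys. Only `attach key:`
# is the Rails 6.1 feature; the key layout here is illustrative.
def avatar_key(user_id, filename)
  "avatars/#{user_id}#{File.extname(filename)}"
end

avatar_key(42, "photo.jpg") # => "avatars/42.jpg"

# usage (Rails 6.1+):
# user.avatar.attach key: avatar_key(user.id, "avatar.jpg"),
#   io: io, content_type: "image/jpeg", filename: "avatar.jpg"
```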

Waiting for 6.1.
For those interested, there are also things like this in the CHANGELOG:

Permanent URLs for public storage blobs.

Services can be configured in config/storage.yml with a new key public: true | false to indicate whether a service holds public blobs or private blobs. Public services will always return a permanent URL.

Deprecates Blob#service_url in favor of Blob#url.

Peter Zhu

:smile:

Hopefully this may help someone else with this problem

(quoting the BetterS3Service snippet and storage.yml config from earlier in the thread)

While this code helped, I also need the service option for has_one_attached, which isn't available until Rails 6 (I'm on 5.2.4).

The feature suggested by OP should be the default, and I cannot understand the initial pushback. I need for certain types of user uploaded content to be in a private S3 bucket (or a directory with a specific IAM policy), but I don't need _everything_ to go there. The way things are now, there is no way to set a granular IAM policy on a single bucket with _only one folder_.

I take it there has been no movement on this? I'm still confused as to why, @georgeclaghorn, you are so opposed to this idea.

I could've sworn I read that Rails core saw the light on this, and added support for prefixing, but maybe it was just a dream.

The way things are now, there is no way to set a granular IAM policy on a single bucket with only one folder.

@birthdaycorp you should be able to set up a custom policy for a bucket that specifies a prefix, like this (just the relevant snippet from the custom policy):

"Condition": {
  "StringLike": {
    "s3:prefix": ["my-prefix/", "my-prefix/*"]
  }
}

UPDATE: It seems to be in 6.1.0.alpha; let's switch to that then :)

Great to see this in 6.1.0. (changelog). Has anyone figured out how to set the default folder, so we don't have to specify it with each upload?

@timwis obviously you can do it by passing a key to the attachment:

user.avatar.attach key: "avatars/#{user.id}.jpg",
  io: io, content_type: "image/jpeg", filename: "avatar.jpg"

I don't know if that would even be possible as a default setting, because you need to associate the key with the model somehow to do lookups; but if you figure that out, you could pass `key:` in the `upload:` options:

production:
  service: s3
  access_key_id: <%= Rails.application.credentials.dig(:aws, :access_key_id) %>
  secret_access_key: <%= Rails.application.credentials.dig(:aws, :secret_access_key) %>
  region: us-east-1
  bucket: my-bucket
  upload:
    multipart_threshold: <%= 250.megabytes %>
    key: "avatars/#{model_lookup_method}.jpg"