Post.all.order('random()')
It should silently work, invoking postgres's random()
function for determining order
I get this warning:
DEPRECATION WARNING: Dangerous query method (method whose arguments are used as raw SQL) called with non-attribute argument(s): "random()". Non-attribute arguments will be disallowed in Rails 6.0. This method should not be called with user-provided values, such as request parameters or model attributes. Known-safe values can be passed by wrapping them in Arel.sql(). (called from …)
Rails version: 5.2.0
Ruby version: 2.5.1p57
Using postgres
This is expected. It's trying to tell you to do like so: Post.all.order(Arel.sql('random()'))
.
See 310c3a8f2d043f3d00d3f703052a1e160430a2c2
I'm sorry if I'm being dense, but my reaction on this being closed was to feel dubious of what value Arel.sql('random()')
is providing me as a user in this case.
If wrapping any potentially-unsafe thing in Arel.sql(…)
is simply the equivalent of a --force
confirmation to protect users from themselves, it seems contrary to Rails' ethos of preferring terse expression even if it means trusting people with sharp tools.
If instead the wrapping Arel.sql(…)
call serves some technical purpose, couldn't the framework detect this and do it for users?
If I'm right in assuming it's the former, then this sort of change reminds me a bit of strong_params, except instead of coercing users to re-think mass data access, the most likely outcome here will be for people to get angry when Rails 6.0 breaks a bunch of queries and then blindly sprinkle in Arel.sql()
calls all over the place to make the errors go away. And in that case, wouldn't the end result will be messier codebases without making queries more secure (prompting a Rails 7 Arel.no_really_sql(Arel.sql(…))
wrapper)?
I just stumbled upon this, so it's likely I'm off-base here, but this seems like a default I'd rather be hidden behind an opt-in config option for people who want to prevent queries like this to be generated conveniently.
cc/ @mastahyeti since you implemented the original implementation what are your thoughts on this issue? It does seem a bit heavy handed for users of Rails.
Is not the deprecation message making users to re-think SQL injection by telling them to review the query to see if there is user input on it and if not call Arel.sql
?
This method should not be called with user-provided values, such as request parameters or model attributes. Known-safe values can be passed by wrapping them in Arel.sql().
That is the reason for this change. If users are blindly calling Arel.sql
, well, that is where the provide sharp tools enters, it is their call.
I don't believe this should be opt-in. I strong believe we should not allow people to opt-in
security because most likely they will not. Opt-out is an option, if you want to deal with this yourself I think you should be allowed. The good news is: it is already possible to opt-out using allow_unsafe_raw_sql
. (it is missing documentation and I'll fix this right now).
If the message is not making users to re-think SQL injection, so we should improve it.
In summary, in my opinion, the current implementation we have is the one that maps better what we need. We need people to rethink SQL injection by not allowing untrusted values in methods that are possible to inject SQL.
Hi @searls. The issue https://github.com/rails/rails/pull/27947 was intended to address was
Post.order(params[:order])
which is a common pattern in Rails app, unfortunately leading to SQL injection.
... it seems contrary to Rails' ethos of preferring terse expression even if it means trusting people with sharp tools.
I'm not sure this has been my experience of the Rails ethos when it comes to security. While Rails doesn't have the best history of being "secure by default", some effort is made to protect users from themselves. Think html_safe
.
The protections added in https://github.com/rails/rails/pull/27947 will likely add some friction for some users, but I don't think it will be so dire. The #pluck
and #order
APIs are usually passed column names. The decision was made to draw the line at allowing column_name
or table_name.column_name
, but I can see allowing some additional whitelist of "safe" functions if this becomes too much of a burden.
And in that case, wouldn't the end result will be messier codebases without making queries more secure (prompting a Rails 7 Arel.no_really_sql(Arel.sql(…)) wrapper)?
As someone who works on securing a large Rails app, I can tell you that having to audit for .order(Arel.sql(...))
calls is much easier than having to audit all .order(...)
calls. It has been my experience that developers rarely have to pass Arel.sql(...)
to these APIs because they rarely are wanting to pass anything other than a column name.
Thanks, @mastahyeti, for the added context. Knowing this applies just to a handful of Arel's methods is definitely helpful and makes this seem less dire of a concern than if it applied to any string passed into any of them.
I'm also glad to hear you'd be open to a whitelist of common functions, as I think you'd pretty quickly only see this error in pretty unusual circumstances.
[I'm curious, too, whether there's some other heuristic (a regex?) in any of these cases that could (performantly enough) establish confidence that nothing harmful could have been injected. But this is a well worn problem and I'm a security noob, so I'm sure that might be a silly suggestion.]
@searls regarding a regex, IIRC the core team's position is "don't write a SQL parser" :)
I would like to echo my pain here upgrading Discourse to Rails 5.2. which is in progress
https://github.com/discourse/discourse/commit/5373e84b9973550f96e4c1cfdf79fc1f4996739f (link will die at some point)
I would strongly prefer if we could just turn off this feature for Discourse.
Especially since I now have this kind of crazy.
sent.where("created_at BETWEEN ? AND ?", start_date, end_date)
.group("DATE(created_at)")
- .order("DATE(created_at)")
+ .order(Arel.sql "DATE(created_at)")
.count
This is very frustrating on a couple of counts
I need to now train all our devs to remember that #pluck and #order are special snowflakes
We are now allocating one more object (and one extra method call) unless we pull these Arel.sql magic things into a const or something
Would Rails core be open to making this new restriction configurable?
@rafaelfranca reading through the source I am not seeing how this is optional?
It appears allow_unsafe_raw_sql
can not be set to true for a bypass, it only checks if it is :deprecated or not
Are you open to a PR that allows this to be set to true
and then we add a bypass prior to calling enforce_raw_sql_whitelist
?
This also helps us cause we get to cut down on the whitelist match stuff for a very minor speed bump.
I think we should allow it. I'm happy with PR.
I can help with the asymmetry: group
should almost certainly apply this rule too in a future version. where
is a conscious exclusion, but others may still need consideration.
In your commit you seem to have added Arel.sql
to a number of order
calls that don't require it: what we're "--forcing" here is that a method which is generally passed a column name, is intentionally instead being passed a more complex SQL fragment. (e.g. 5 of 7 instances in post.rb
)
We originally considered allowing allow_unsafe_raw_sql
to disable the check entirely, but ultimately decided against it because the warning already provides a 'disabled' mode in the sense of "still works like 5.1"; a "works without complaining" would be unhelpful long term, because like all upgrade tools this setting will eventually go away -- we don't want to carry an opt-out conditional forever. Global settings also make life more complicated for gem authors, for example.
Shaving a flat one or two short-lived-object allocations off a full "build a query, run it, wait for the network, load the results" cycle doesn't seem performantly exciting to me. If Ruby had a way to identify a shared/static string, it'd be neat to cache their SQL-node forms, but I know of no such thing.
These methods are already pretty snowflaky: they mostly look like they take a column name, but sometimes they take an SQL fragment, but not if you want ?
placeholders, but there's actually a special spelling to do that too.
If you really want to kill it you can presumably monkey-patch enforce_raw_sql_whitelist
to a no-op, but I'm having a hard time buying a "developer ergonomics should trump a security/safety mechanism" argument the same week as "performance should trump developer ergonomics".
I'm having a hard time buying a "developer ergonomics should trump a security/safety mechanism" argument the same week as "performance should trump developer ergonomics".
This is an unfair stab, especially since I never said that, all I said is that I wish performance was somehow mentioned in the Rails doctrine, even in a "Kaizen" form where we strive to make Rails faster every release. A faster Rails encourages developers to upgrade to the latest version. It encourages people to say "yes" to features so it overall "improves developer happiness".
Back to the issue at hand, I am monkey patching out the method (I just did that in a commit to Discourse) I will consider moving to the Rails-way here if you can demonstrate in a file or 2 how this does not cause the code to suffer and become ugly. The training vector really needs consideration here especially around the disparity with other methods.
Overall a custom whitelist may be a way to unblock us here cause we could then explicitly whitelist some of this stuff. Then use something like LRURedux to bypass this evaluation for strings we know are good or something like that. Especially if stuff like order('something ?', something)
automatically whitelists. I am concerned about running every order string via 100 regexes long term but there may be optimization.
A big concern here is that the "escape hatch" proposed in the deprecation message is not ideal. You see it enough times on upgrade and you take an atomic approach to adding it even where it is not needed. My screen filled up with add Arel.sql
so I just blanket added it everywhere to make the spec suite stop nagging. If I know that date({column})
is always going to be good I should be able to just blanket allow it.
In other news Discourse does not appear to be any slower post the 5.2 upgrade (performance seems on par with the bench) so 🎊
Hello,
So the assumption here is that there's widespread use of Model.order(<unsanitised params>)
. Could we not target the input of unsanitised params directly instead of assuming any string literal passed here is unsafe?
This would allow for a more precise warning to the users, too, or even an exception, stating that this invocation creates a vulnerability and is thus disallowed (see Model.where(params)
).
Thinking about this, it leads me to a more general solution that allows param _values_ to be marked as unsanitised?
@ignisf the assumption is that it's a sufficiently widespread problem (and more generally, non-obvious danger in an API that looks like it's taking a column name) that someone made a website about it.
The stronger assumption, though, is that most calls to order
_are_ passing a simple column reference; much of the discussion in #27947 was around how to minimize the number of callers that would be affected by the warning.
It sounds like you're getting close to inventing string tainting, which has its own issues.
This case isn't really about rejecting user input, though -- it's about choosing which of two "identical"-looking APIs you intended to use.
A big concern here is that the "escape hatch" proposed in the deprecation message is not ideal. You see it enough times on upgrade and you take an atomic approach to adding it even where it is not needed.
I'm really not sure how much we can do about this. If it was as simple as making the change everywhere, we would've done it for you; the reason we're flooding your output with pointers to each individual call is that we want you to consider them individually.
If you want to alias rm="rm -f --no-preserve-root"
the first time you actually need to rm /
, you're free to do that... but I think most people will benefit from only disabling the safety when they need the unconstrained power.
Granted, Discourse has 5 instances of order-by-date... but those actually seem to have enough more in common (.where("<column> BETWEEN ? AND ?", x, y)
.group("date(<column>)")
.order("date(<column>)")
) that they'd benefit more from a method extraction than a custom regexp.
The others all seem pretty impractical to define general rules for.
The code is expected to become incrementally uglier in places where it's already uglyish (SQL inside strings); I can't offer an example where it makes things prettier. The argument is only that it happens rarely enough that the safety trade-off is worth it.
an unfair stab
Quelle horreur. 😒
The Doctrine is David's description of his design philosophy. It also doesn't say anything about fixing bugs, for example. Most importantly, while its sentiment is weighed in considering how we present things to our users, it has no influence on how our volunteers spend their uncompensated free time.
FWIW, Sequel introduced the same change in version 5.0. You now need to explicitly mark the SQL string as a "literal" (just like Arel.sql
):
Post.order(Sequel.lit("random()"))
I consider this to be a positive change, so I support the same changes in Active Record. Jeremy Evans argued that, in addition to being secure by default, this change also makes security audits for SQL injection easier, because you can grep for Sequel.lit
(in this case Arel.sql
).
Sequel also provides an extension to disable this behaviour, and from what I understand it will be supported indefinitely. It would be good if Active Record would offer the same indefinitely, though I understand the wish to remove it at some point. It could be a module (a "concern", if you will) that gets included when the user enables the old functionality; that way these conditionals will be separated from the main code (that's how Sequel does it).
I'm really not sure how much we can do about this. If it was as simple as making the change everywhere, we would've done it for you; the reason we're flooding your output with pointers to each individual call is that we want you to consider them individually.
Oh, one simple change that we could make here is only alerting in STDERR the first time the call site is detected, a very big pain here was that the same line (which was covered many times) kept on showing up over and over again on one test run.
The argument is only that it happens rarely enough that the safety trade-off is worth it.
It could be that I was a lot biased by having to through the entire app in one go, its just a lot of changes and we use SQL a lot.
So to summarize
group
, joins
and others so it correctly protectsThe idea around 3 would possibly be allowing:
.order('count(*) desc')
.safe
```
or maybe:
.unsafe_order(
```
not sure.
note: I was super careful not to mention string tainting, cause this is very much a feature I want erased from Ruby
The logspam from a single frequently-hit deprecation is a fair point, though we should address it generally in the deprecation reporter. Personally, I think it feels really good when I make a single change and a few thousand warnings go away, but to each his own :smile:
unsafe_order
; Arel.sql
felt more like HTML-escaping, and is consistent with the fact that everything ultimately passes through Arel anyway. (And correspondingly avoids us having to pass (value, safeness)
pairs around everywhere internally.)Are you trying to avoid the allocation here, or is it an aesthetic concern?
(Yeah, I only mention tainting as a self-evident explanation that that path is impractical.)
Are you trying to avoid the allocation here, or is it an aesthetic concern?
I know it is mega weird coming from the "performance guy" but yeah this is mostly and aesthetic/symmetry concern, the extra single digit RVALUEs are not the end of the world.
I still have a preference for unsafe_raw_sql_order()
. I think it's harder for a developer to type that method name without second guessing their approach.
How about allowing frozen strings/objects to the whitelist since an user input is always a mutable string?
I've proposed the PR #33330.
Override methods like unsafe_order
and unsafe_group
would be wonderful, and consistent with the security by default, but with a way around approach used with strong params and to_unsafe_h
. Or a config to disable this safety check entirely.
For those of us that build atop legacy third party databases or have a need of simple functions it seems preferable to have a rails api method for those needs rather than having to jump over to the Arel api.
Our main Rails app @chatterbugapp now has sorder
and spluck
in ApplicationRecord
since we have quite a bit of SQL inside of our order
and pluck
calls. Here's what i settled on:
class ApplicationRecord < ActiveRecord::Base
# Safe pluck! Deal with Rails 6.0 deprecations.
def self.spluck(*fields)
pluck(*fields.map { |field| has_attribute?(field) ? field : Arel.sql(field) })
end
# Safe order! Deal with Rails 6.0 deprecations.
# Note: Arrays are in charge of their own Arel.sql'ing
def self.sorder(*fields)
order(*fields.map { |field|
if field.is_a?(Array) || has_attribute?(field)
field
else
Arel.sql(field)
end
})
end
end
Overall I'm not a fan of this change and would prefer a community supported way of dealing with this instead of wrapping so many of our calls in Arel.sql
.
Naming methods safe_<dangeroug_thing>
, especially when they just bypass a security feature, seems like not a great pattern 😆
Agreed! That raises a similar point about readability of order(Arel.sql(...))
vs. unsafe_order(...)
.
The latter more clearly expresses a need for caution to future readers of the code.
it seems preferable to have a rails api method for those needs rather than having to jump over to the Arel api
That seems a bit of an arbitrary distinction: there is an API, and it happens to include the word "Arel". Could you help me understand what about that word makes it a bad name?
The latter more clearly expresses a need for caution to future readers of the code.
Disagree: the former expresses that the argument is SQL; the latter gives a vague "here be dragons".
In my experience the ActiveRecord query interface services almost every querying need. Arel is lower level api. It isn't mentioned at all in the query guide. Until now users of the ActiveRecord api could be blissfully unaware Arel. This strikes me as jumping across levels of abstraction. I readily concede that we're talking about advanced use here, so maybe that's fine. The second point seems more important to me.
I agree that the the current form of order(Arel.sql(...))
express that it is SQL, but saying what it is doesn't express that using SQL is unusual and may be dangerous. The point of the original change is to warn people of the dangers of unsafe input in SQL. As @mastahyeti says above, typing the word unsafe should make developers think twice, and having that word there is more explicit about it being unusual for future readers. This seems similar to to the conventions of method
vs method!
, if there is a !
the reader is takes a moment to consider the more dangerous form. Same in the existing api with params.to_h
vs params.to_unsafe_h
. And I believe same here with order
vs unsafe_order
. "here may be dragons" seems exactly right.
The intended analogy is to h()
(as in "HTML"). "Unsafe" is in the eye of the beholder; to me the only aim is for the caller to say "this thing I am passing, I intend to be interpreted as SQL" -- because most of the time it's just a column name, and therefore a reasonable caller could pass an unchecked value _while believing that the only thing that'll be accepted is a column name_. "You should only rarely need to pass raw SQL, maybe reconsider your need" is moving from "we need to understand/confirm the caller's intent" to "we want to influence the caller's design decisions"... we don't _not_ do that, but I think requiring explicitness is a better line of trade-off than making people lecture themselves each time they want to use the thing. (See a certain minitest method.)
to_unsafe_h
is a bit different: there, what's unsafe is the thing that you get back. It's not "unsafely give me a hash", it's "give me an unsafe hash".
If people think order_sql(...)
as an alias for order(Arel.sql(...))
would improve the ergonomics, that doesn't seem impossibly bad... but there was some reasoning behind using Arel.sql
:
1) it's the same thing you can use in other interesting places,
2) it's short [without polluting any global namespaces],
3) it's an initial tentative step towards a gentler boundary between AR and Arel,
4) because it provides an extra level of argument wrapping, it can help with parameter binding, so you in turn need to do less manual SQL manipulation,
5) while pluck is more constrained, there's a whole family of order methods that'd need _sql
variants, which seem they'd be ugly to document, learn, and switch between.
I would much prefer a order!
or pluck!
that explicitly allows unsafe SQL. Some of that "unsafe" SQL for our app is _very_ reasonable and 100% "safe".
Here's some snippets using pluck
/ order
as noted above from our codebase that do emit errors on Rails 5.2.1. All of these _ARE_ safe and not from user-input, but we'd have to wrap a lot of this code in Arel.sql
instead, making it harder to read and understand.
# complex COUNT's
.pluck("COUNT(*) FILTER (WHERE start_time > '#{28.days.ago}')")
# MAX across a join
.pluck(:exercise_id, 'MAX(chat_session_exercises.teacher_rating)')
# jsonb access
.pluck("context->'example'")
# Just a simple LOWER
.order("LOWER(word) ASC")
# Date functions...
.order("DATE_TRUNC('second',user_achievements.updated_at) ASC, user_achievements.key ASC")
# Not even a comparsion or NULLS FIRST?
.order("user_media_sources.rating <= 2 DESC NULLS FIRST")
I think the unstated goal here to push people towards writing only Ruby instead of SQL snippets interspersed with their Ruby. While that's a nice goal, I'm confident that is not how a lot of apps get written. Also IMO, writing complex SQL functions is not easy or approachable currently with Arel. ActiveRecord for years has let developers get the most of out data abstractions by keeping the "under the hood" SQL just hidden enough. This feels like a step in the opposite (or at least different) direction that directly impedes with that goal.
I can imagine the explosion of warnings from most apps that actually use SQL this way will impede upgrades to Rails 5.2 and beyond. I'd suggest the following as a course of action if Rails wants to continue down this path:
🙇
@matthewd Thank you for explaining the intent. Point taken about the unsafeness of the param vs the result and I didn't know that there was an intention to soften the boundary between ActiveRecord and Arel. I still have a preference for order_sql
and like @qrush's suggested plan.
I think the unstated goal here to push people towards writing only Ruby instead of SQL snippets interspersed with their Ruby.
Absolutely not, and given neither the PR author nor any member of the Rails team has suggested that over ~150 comments, I think it'd be easier if we discuss the actual (and conveniently, stated) goals. Being able to easily drop to SQL is a core feature of the Active Record API; having seemingly-insignificant parameters surprisingly interpreted as SQL, not so much.
Some of that "unsafe" SQL for our app is very reasonable and 100% "safe".
Yep, and that's one of the reasons I'm pushing against (and the API currently avoids) using the word "unsafe" here. It's not unsafe; _it is_ SQL.
To summarize the recent suggestions:
# now shows warning
.pluck("context->'example'")
# adding one character: acceptable
.pluck!("context->'example'")
# adding four characters: maybe
.pluck_sql("context->'example'")
# adding eight characters: so bad it prevents upgrades
.pluck(Arel.sql "context->'example'")
Another advantage of Arel.sql
is in wrappers. Given:
def my_pluck(something)
pluck(:id, something).to_h
end
With Arel.sql
, you can keep using that method, and pass it either :email
or Arel.sql("first_name || ' ' || last_name")
. With a pluck_sql
, you'd either have to change the above method to always use it, thereby exposing all of its callers to the potential danger of having a seemingly-column-name argument interpreted as SQL, or define a parallel my_pluck_sql
.
My ActiveRecord query below is triggering this deprecation warning on Rails 5.2:
INBOX = "Uncategorized"
.order ActiveRecord::Base.send(:sanitize_sql_array, ['tracker_categories.name = ? DESC, position ASC', INBOX])
which I find a bit crazy since it's clearly not a SQL injection risk, and I'm a bit lost how to go about rewriting this to make it compliant without needing to throw in horrendously ugly calls to Arel.sql
. (If there is a better Railsy way of writing this .order
query, I'm all ears.)
I'm interested in this one too. Trying to upgrade to 5.2 I noticed it, and now I've updated all our pluck
methods that use function calls to use Arel.sql('funct()')
instead.
I guess we need to do order
too? Anything else?
In particular, do we always need to do this with a string?
It's easy enough to convert simple attributes to a symbol to be sure we don't trigger it there, but what about when we're ordering by attributes in multiple tables: e.g., order('table1.col1, table2.col2')
?
Should that be done using Arel.sql as well, or only function calls?
I'm also incredibly scared to upgrade to 6.0 when it becomes available. This is a massive breaking change to simple stuff that we might not often test against. We're not typically shooting for 100% coverage in every place we might use order
or pluck
, so it's going to be very costly to upgrade to 6.0 safely.
@codeodor I would not carry some sort of fear, Discourse are going to have to solve this problem some way or other and have a public code base, at the moment we monkey patch this out. We are going to upgrade to 6.0 very shortly post release.
After reviewing each comment on this tread I don't see any reason why we should change the current behavior.
The reasoning for this change was made clear and I don't think anyone disagrees with the reasons why we require SQL literals to be wrapped on Arel.sql
calls for those methods.
We are now discussing if Arel.sql
is a "pretty" call or not. I'd say this is matter of test and while we are open to revisit this API I don't think it is productive to be bikesheding over the name given all suggestion where already considered and the core team still thinks the current API makes sense.
Others ORMs like Sequel also made similar changes with similar API so I think more people thinks this is a good idea.
About being safe to upgrade or not: this show deprecation warnings if you are not using the current API, so even if you don't have test coverage for your entire application it should be possible to use the deprecations to find where the code needs to change. This is not specific for this change though. Any API change would cause this fear, so I don't think this is an argument to remove this protection.
Thank you all for the discussion.
I'm a bit too late, but I think, that main reason - protect novices - is fail here because nothing stops juniors to mindlessly write Arel.sql
just to get things done. Moreover, they may think that that method can magically make their code safe :)
Usually that kind of checks make not ORM code, but linters/static analyzers. Also, sanitize raw SQL - it's fundamental web development knowledge, not only for rails.
About naming. order!
looks like it can throw exception (because it's Rails convention)
Would it be helpful to add documentation on Arel.sql
? Right now there's nothing to indicate what applying this is doing.
I also think that all the example (e.g. in Ruby Guides on ActiveRecord Query Interface) should be updated to reflect this, along with a discussion about what Arel.sql
is doing. (That it's marking something as reviewed-and-safe, not doing anything to make it safe.)
Allowing frozen string literal, as here ( https://github.com/kamipo/rails/commit/fe120e2c9aa3b0811309907dc76230cb45302720 ) seems to be the perfect solution to this problem, if we don't want to automatically wrap unsafe raw SQL into Arel
, which would be how users expected Rails to behave.
Another flaw with this new requirement. Gems. I'm currently using a gem that is spitting out this warning. I have no control of the code in the gem. I could fork it sure, but that path leads to dragons.
I won't be able to move to 6 unless all the gems I use upgrade.
@baash05 have you opened an issue on those gems? If you're not able to can you list them here? We (Rails) can't help if we don't have the information of what gems.
FaeCMS is one of them. But that sort of misses the point of Gems. If a change in rails version requires a portion of the gems I use, to be forked so I can use them then Uck. If by extension these gems also require subsequently forking of their "base" gems Uck*n.
I argue the change has to be of real benefit. It has to be compelling if it is going to require such a significant change to our code base.
I keep returning to this thread and leaving frustrated.
Where is the documentation for Arel? Arel got absorbed into ActiveRecord and all the documentation disappeared with it. The Rubydoc site that comes up first in google is for an old version and can't necessarily be trusted. The new one that's part of ActiveRecord is as spartan as it gets. There are posts about this on StackOverflow from 2011. There's even an issue for this very subject on this repository. Documentation on this api we're being asked to use would solve 99% of this problem and render the rest of my comments here unnecessary, but here we are.
Never mind that I am divided on the solution, this is not how a deprecation should be done regardless of the implementation. The implementers have declared they have chosen to do this and so too must we choose the same, provided no explanation as to why or documentation as to how, and the tone of this almost year-old thread is largely "Other frameworks have done it, do it or don't use our software, we do not care."
I'm going to endorse/piggyback on what @SamSaffron said here: at the very least, an allow_unsafe_raw_sql = true
option needs to happen. The current choice is between logspam or an exception and while the intent is to strongly push people in the direction of writing "secure by default" SQL, in practice what is happening in most codebases is people are blindly wrapping everything that looks like it could be problematic in Arel.sql
even when not necessary.
If nobody else is writing a PR for that, I would be happy to give it a shot.
Opening a PR for that will not solve anything. The pull request will be just rejected. But, we are open to pull requests improving the deprecation message and documenting the Arel.sql
method, and that would be very helpful.
Given that the impetus for this appears to be protecting against accidental SQL injection, it feels to me that the approach chosen is backwards.
Instead of assuming arguments to order
, etc. are insecure by default (unless told otherwise by wrapping them in Arel.sql
), wouldn't it be a simpler solution to _make them secure_ by sanitizing strings by default? The outcome would be the same (ie. prevent SQLi), but the developer would actually be protected by default, instead of being responsible for deciding whether the string is safe (as others have discussed, this just leads to wrapping everything in Arel.sql
for ease of use, which, anecdotally, I've already seen developers do).
This also seems like it would have a way simpler migration, since existing code would continue to work (especially considering the potential of having to update gems way down the dependency tree as @baash05 pointed out).
I concede I don't know the internals of AR well, so there may be a reason why this can't be done, but I think if that was the case it'd be useful to disclose why that is.
If this feature is now set in stone as-is, I would appreciate knowing why in the past months there has been an about face from being open to a PR to opt-out, to stating it would be rejected.
If this feature is now set in stone as-is, I would appreciate knowing why in the past months there has been an about face from being open to a PR to opt-out, to stating it would be rejected.
This comment explain why we would reject it https://github.com/rails/rails/issues/32995#issuecomment-394927823
Instead of assuming arguments to
order
, etc. are insecure by default (unless told otherwise by wrapping them inArel.sql
), wouldn't it be a simpler solution to _make them secure_ by sanitizing strings by default?
This solution would cause the same problem in terms of upgradability as the current solution we chose but with the difference that it would silently break applications what were relying on order
arguments not being escaped.
So look at this use-case where I want to want to filter by column.
scope :sorting, lambda { |value|
kind, dir = value.split('-')
dir = 'desc' unless %w{asc desc}.include?(dir)
column =
case kind
when "creation" then "created_at"
when "views" then "page_views_count"
when "reactions" then "positive_reactions_count"
when "comments" then "comments_count"
when "published" then "published_at"
else
'created_at'
end
order("? ?",column, dir)
}
So I thought when you pass them in as extra arguments with ?
its suppose to sanitize them and place them in the correct position. I haven't used Arel for a few years so fuzzy on its functionality. This is where I would the removal of possible injection.
There's a broader discussion here that I think obscures what the original point:
Postgres random()
and MySQL/MariaDB RAND()
should not be treated as dangerous. Treat user-provided params as dangerous, or not -- smarter people can debate that. But don't treat canonical database technologies as dangerous. Don't create arbitrary cruft around using the best technology for the job.
If it's a bad implementation to whitelist certain strings, then come up with a code path for these technologies (Model.mysql_rand
, Model.random(:mysql)
, or whatever).
When I have to include user input in my .order()
, for example, order by the relevance to the search keywords.
Article.order("ts_rank_cd(persisted_tsvector, to_tsquery('english', '#{user_input}')) desc")
how can I do it properly without the danger of SQL injection? Arel::sql
does not cover this case.
This is probably not the place to ask this kind of questions @Aetherus
But you have to filter user input. I would use Arel.
I am having issue with the deprecation warning, not 100% sure what is going on here:
.order(ActiveRecord::Base.sanitize_sql_for_order(['ABS(quantity_value - ?)', value.to_f]))
From the looks of it in the code, everything gets wrapped in Arel.sql
however the deprecation warning persists.
Am I missing something? Is the suggested action of wrapping with Arel.sql not the correct thing to do now?
Is not the deprecation message making users to re-think SQL injection by telling them to review the query to see if there is user input on it and if not call
Arel.sql
?This method should not be called with user-provided values, such as request parameters or model attributes. Known-safe values can be passed by wrapping them in Arel.sql().
That is the reason for this change. If users are blindly calling
Arel.sql
, well, that is where the provide sharp tools enters, it is their call.I don't believe this should be opt-in. I strong believe we should not allow people to
opt-in
security because most likely they will not. Opt-out is an option, if you want to deal with this yourself I think you should be allowed. The good news is: it is already possible to opt-out usingallow_unsafe_raw_sql
. (it is missing documentation and I'll fix this right now).If the message is not making users to re-think SQL injection, so we should improve it.
In summary, in my opinion, the current implementation we have is the one that maps better what we need. We need people to rethink SQL injection by not allowing untrusted values in methods that are possible to inject SQL.
The trouble is this deprecation warning is displayed even when no SQL-injection is potentially in play. I experienced this too, consider the following:
.order('extract(year from imports.end_date) ASC, reports.number_of_months_covered ASC, reports.annual_overnight_visitation_type_id ASC')
No value is being passed to this method, but somehow I get this deprecation warning over use of PGSQL extract
?
This makes me feel like the AR ORM is moving towards the way of thinking of Eloquent, where we have to hand hold the ORM each step of the way. I agree with the sharp tools, developer beware methodology; especially considering that this is a departure from the way Rails has always done this.
I am having issue with the deprecation warning, not 100% sure what is going on here:
.order(ActiveRecord::Base.sanitize_sql_for_order(['ABS(quantity_value - ?)', value.to_f]))
From the looks of it in the code, everything gets wrapped in
Arel.sql
however the deprecation warning persists.Am I missing something? Is the suggested action of wrapping with Arel.sql not the correct thing to do now?
It's true sanitize_sql_for_order
wraps the result in Arel.sql
but the deprecation warning is happening before the internal call to Arel.sql
. To resolve this you can call Arel.sql
on the string you pass in. Also you can bypass sanitize_sql_for_order
and pass the array directly to order
.
.order([Arel.sql('ABS(quantity_value - ?)'), value.to_f])
apologies if I missed something, but was a whitelist added? or was this rolled back?
The following contrived example throws a deprecation warning in active record 5, yet is fine when running active record 6.
require 'bundler'
require 'bundler/inline'
gemfile do
source 'https://rubygems.org'
gem 'activerecord', '~> 5'#toggle versions here
gem 'sqlite3'
end
require 'sqlite3'
require 'active_record'
ActiveRecord::Base.establish_connection(adapter: 'sqlite3', database: ':memory:')
ActiveRecord::Base.logger = Logger.new(STDOUT)
ActiveRecord::Schema.define do
create_table :sales do |table|
table.column :amount, :integer
table.column :year, :integer
table.column :name, :string
end
end
Sale = Class.new(ActiveRecord::Base)
Sale.create(name: 'Dwalin', year: 2019, amount: 12)
Sale.create(name: 'Dwalin', year: 2020, amount: 234)
Sale.create(name: 'Dwalin', year: 2020, amount: 332)
Sale.create(name: 'Gloin', year: 2020, amount: 123)
puts Sale.where(year: 2020).group(:name).having('SUM(sales.amount) > 0').pluck('SUM(sales.amount)')
@firien See #36448.
@matthewd / @rafaelfranca what are your thoughts on the fstring bypass proposed by @kamipo , I really like it :heart: I would certainly kill off our freedom patch if that was in place. fstrings are inherently pretty safe, it is not trivial to get stuff snuck into them?
This may help others upgrading to Rails 5.2 avoid fixing warnings for queries that Rails 6 will actually allow:
We backported the changed whitelist/allowlist from the above-mentioned https://github.com/rails/rails/pull/36448 to Rails 5.2, so it no longer warns about stuff like order("RANDOM()")
that Rails 6 will allow.
If you want to verify that this backport works, try doing SomeModel.order("RANDOM()").to_sql
in a console and check that it no longer complains. And try SomeModel.order("RANDOM() * 2").to_sql
and check that it still complains.
Stick this in config/initializers/rails6_backports.rb
:
raise "Remove no-longer-needed #{__FILE__}!" if Rails::VERSION::MAJOR >= 6
# Backport https://github.com/rails/rails/pull/36448 to avoid fixing things unnecessarily.
# https://github.com/rails/rails/issues/32995#issuecomment-685475521
module ActiveRecord
module AttributeMethods
module ClassMethods
remove_const(:COLUMN_NAME_WHITELIST)
COLUMN_NAME_WHITELIST = /
\A
(
(?:
# table_name.column_name | function(one or no argument)
((?:\w+\.)?\w+) | \w+\((?:|\g<2>)\)
)
(?:(?:\s+AS)?\s+\w+)?
)
(?:\s*,\s*\g<1>)*
\z
/ix
remove_const(:COLUMN_NAME_ORDER_WHITELIST)
COLUMN_NAME_ORDER_WHITELIST = /
\A
(
(?:
# table_name.column_name | function(one or no argument)
((?:\w+\.)?\w+) | \w+\((?:|\g<2>)\)
)
(?:\s+ASC|\s+DESC)?
(?:\s+NULLS\s+(?:FIRST|LAST))?
)
(?:\s*,\s*\g<1>)*
\z
/ix
end
end
end
Great idea, @henrik! The fact so many folks are still stumbling over this as they work in 5.2 has confused me several times as to where Rails 6 actually landed (despite it ultimately not having been a problem at all—upgrade to 6, everybody!)
If the string param that you pass to order()
, for example, doesn't contain basic string interpolation – i.e. #{params[:potentially_unsafe_and_unfiltered_user_input}
, can it not be assumed that the raw SQL string is safe from injection?
Jeffery, the interpolation happens before the parameter value gets to the
order function. If the order methods gets a string object there is no way
for it to know if the string was previously interpolated.
On Mon, Oct 19, 2020 at 1:59 PM Jeffrey Dill notifications@github.com
wrote:
If the string param that you pass to order(), for example, doesn't
contain basic string interpolation – i.e.{params[:potentially_unsafe_and_unfiltered_user_input}, can it not be
assumed that the raw SQL string is safe from injection?
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
https://github.com/rails/rails/issues/32995#issuecomment-712340850, or
unsubscribe
https://github.com/notifications/unsubscribe-auth/AAAONGJHUI6QUGNRHMN7GSDSLR5ANANCNFSM4FB2WUFQ
.
@andynu derp, of course. Nevermind. 😄
Most helpful comment
I'm sorry if I'm being dense, but my reaction on this being closed was to feel dubious of what value
Arel.sql('random()')
is providing me as a user in this case.If wrapping any potentially-unsafe thing in
Arel.sql(…)
is simply the equivalent of a--force
confirmation to protect users from themselves, it seems contrary to Rails' ethos of preferring terse expression even if it means trusting people with sharp tools.If instead the wrapping
Arel.sql(…)
call serves some technical purpose, couldn't the framework detect this and do it for users?If I'm right in assuming it's the former, then this sort of change reminds me a bit of strong_params, except instead of coercing users to re-think mass data access, the most likely outcome here will be for people to get angry when Rails 6.0 breaks a bunch of queries and then blindly sprinkle in
Arel.sql()
calls all over the place to make the errors go away. And in that case, wouldn't the end result will be messier codebases without making queries more secure (prompting a Rails 7Arel.no_really_sql(Arel.sql(…))
wrapper)?I just stumbled upon this, so it's likely I'm off-base here, but this seems like a default I'd rather be hidden behind an opt-in config option for people who want to prevent queries like this to be generated conveniently.