I've given a lot of thought to some past discussions WRT the new Event API and I think there's a solution below that will make everyone happy.
We have a bad API/user-experience problem in Logstash 5.0. event['foo']['bar'] = 'blah' and event['foo'].downcase! should either set the value (as the syntax suggests) or raise an error if they can't. This follows the principle of least astonishment, a key tool in UX design that definitely applies here. It doesn't matter which of the two it does, but it should do one of them. Setting the value is the obvious behavior; if the API can't do the obvious thing, it should raise an error.
An API where code can easily look like it does something (and in fact used to be the recommended way of doing things, going by our own code) but now does nothing, with no feedback, is a bad API.
Silent failure is the worst possible experience for a user. We should not make that choice intentionally. Even if users eventually learn and understand this API, it doesn't mean they will (or should) like it.
Regardless of how well we document it, or how many people we train not to do it, for a developer new or old that's still a bad experience. People can easily make an innocent mistake.
What's the fix?
Preserving the old behavior is a controversial topic. Doing so would require a Java event that could operate equally well on Ruby objects as on Java ones, switching between Ruby and Java internal representations depending on whether it was being used in a Java plugin.
I believe we can all agree, as a lowest common denominator, that the following would be better than what's currently in 5.0.0: we should simply invoke .freeze recursively on any Ruby objects returned. This would be only a handful of lines of code, and would clearly warn and educate users not to mutate the data.
All this means is that:
event['foo'].downcase! # doesn't work with current java event, but fails silently
would raise an exception forcing the user to rewrite it correctly as
event['foo'] = event['foo'].downcase # does the right thing
We could even add extra info to the frozen-object errors letting users know about Event immutability.
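A recursive freeze really can be a handful of lines. Here's a minimal, self-contained sketch of the idea; deep_freeze is a hypothetical helper written for illustration, not existing Logstash code:

```ruby
# Recursively freeze a value and everything reachable inside it, so any
# in-place mutation raises instead of silently doing nothing.
def deep_freeze(obj)
  case obj
  when Hash
    obj.each_value { |v| deep_freeze(v) }
  when Array
    obj.each { |v| deep_freeze(v) }
  end
  obj.freeze
end

value = deep_freeze('foo' => { 'bar' => 'baz' })
value['foo']['bar'].frozen?  # => true; upcase! on it would raise
```

Any attempt to mutate a nested value then fails loudly with a frozen-object error instead of being lost.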
Thank you all for considering this proposal! I think it may still be worth talking about a 'full' fix where nothing would change, but I'm not certain that perspective is very popular ATM, though I think it is technically achievable.
event['foo'].downcase! # doesn't work with current java event, but fails silently
Is this true? In MRI, calling String#downcase! on a frozen String raises an exception:
>> a = "hello".freeze
>> a.downcase!
RuntimeError: can't modify frozen String
from (irb):4:in `downcase!'
from (irb):4
from /home/jls/.rvm/rubies/ruby-2.2.1/bin/irb:11:in `<main>'
(I like the exception behavior here)
Assuming I'm reading you right ---
I am +1 on moving towards a smaller interface (not allowing event[key][key2], for example, if possible). I am also +1 on having these exceptions thrown when a user tries an unsupported mutation.
@jordansissel yes, that is what I'm proposing.
I think we should add a top-level rescue checking for "can't modify frozen string" and wrap the error with a new Error class saying something like:
Whoa there cowboy!
It looks like you're trying to lasso a frozen object with that ruby rope!
Try invoking .dup to let that buckin' steer out of the corral proper.
Then, see if you can get that ornery bull back in with a direct assignment.
Or we could do something more boring without the rich western imagery. But my point is that we can enhance the default error message.
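As a sketch of what that wrapping could look like (EventMutationError, the helper name, and the message text are all invented for illustration; on modern Rubies the underlying error is FrozenError):

```ruby
# Hypothetical wrapper: rescue the frozen-object error and re-raise it
# with a hint about Event value immutability.
class EventMutationError < StandardError; end

def guarding_event_access
  yield
rescue FrozenError => e
  raise EventMutationError,
        "#{e.message} -- values read from an Event are frozen; " \
        "dup the value, change the copy, then assign it back with event[...]="
end

begin
  guarding_event_access { 'FOO'.freeze.downcase! }
rescue EventMutationError => e
  puts e.message  # the enhanced, Event-aware message
end
```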
@jordansissel just to clarify, we've already moved to the smaller interface. It's just a question of whether we throw exceptions or not. Or, whether we DO continue to support the old interface (which is extra work to achieve)
In principle, I am a big +1 on throwing exceptions to users when they wrongly use an API in Ruby Filter (for 5.0). IMO fail fast is the right thing here, and should be done at the code level in addition to docs/breaking change announcements.
I think we all agree we should minimize backward incompatibilities, but having them, documented, in a major version bump is OK.
Personally, I don't see this API change as being as dramatic as @andrewvc does. As long as we properly document it and warn about the change, that should be sufficient. This change gets rid of an implementation detail, and there will certainly be more such changes where relying on preconceptions about the nature of objects returned from the Event will not be possible. I do agree we should aim for the principle of least surprise, but frankly, assuming that the returned objects are references to the Event's inner objects and are mutable in place was just plain wrong to start with; it just happened to be the case, and some (not all) developers used that _trick_.
I am +1 on finding ways to warn or prevent these mis-usages. But I don't think returning immutable objects here is necessarily the best approach. Since the returned objects are copies, you should really be free to do what you want with them.
With the new API, one way to modify an event is to do something like this:
# given this event
event["foo"]["bar"] = "baz"
# I want to populate event["foo"] using an intermediate Hash
h = event["foo"]
h["bar2"] = "baz2" # this would fail because h is immutable
h["bar3"] = "baz3" # this would fail because h is immutable
# set that back into event
event["foo"] = h
IMO this approach should be valid.
Also, recursively setting returned objects as immutable seems like a waste too.
And for the event["foo"]["bar"].downcase! situation, well, this is why these are "bang" methods: you should generally avoid using them unless you completely understand the context they are used in. Again, the following would not work either:
# given this event
event["foo"]["bar"] = "BAZ"
s = event["foo"]["bar"]
s.downcase! # this would fail because s is immutable
event["foo"]["bar"] = s
Which IMO should also be valid.
@colinsurprenant I think that I'd agree with you if this was a closed source program that only people internal to the company worked on.
However, Logstash being an open source project I think that having an easy to use API is a feature with significant value. An API that _looks_ like it is doing work that it is in fact not doing is frustrating.
I don't believe that a good API requires people to be high priests who have spent sufficient time studying the sacred Logstash user manual to understand the nuances of object references in events in order to get something useful done. This is the principle of least astonishment. We can document any behavior we wish; the ideal API is one that needs as little documentation as possible and still works well.
With regard to your example, I think it's good that the code sample doesn't work with frozen objects. The exception will encourage users to rewrite it as:
event['[foo][bar]'] = event['[foo][bar]'].downcase
which is what we want, right? We want the API to help the user understand, not to befuddle and confuse.
Perhaps we could even write a convenience method (inspired by Clojure) that makes it look like this:
event.mutate('[foo][bar]') {|v| v.downcase}
# or even more concisely
event.mutate('[foo][bar]', &:downcase)
So, setting the returned values as immutable does have the benefit of providing advance warning of _potential_ compatibility issues. Maybe instead of doing this systematically we could provide a runtime option to _turn on_ object immutability? Or maybe have it on by default in the first release, letting users turn it off, and eventually flip the default so it's off and users can turn it on for sanity checks.
Taking it one step further, we COULD expose mutable objects in the same way. We could say that mutate returns a mutable object that will be properly saved when the block exits. So:
event.mutate('[foo][bar]') do |value|
value['baz'] = 'bot'
value['blah'].downcase!
value
end
Would work correctly, and we'd have an explicit area where mutable operations could work consistently.
@andrewvc
I don't believe that a good API requires people to be high priests who have spent sufficient time studying the sacred Logstash user manual to understand the nuances of object references in events in order to get something useful done.
No, these are basic sanity principles of software engineering, and properly documenting how to use an API and what to expect from it is totally OK.
This is the principle of least astonishment.
Respecting this principle for the sake of backward compatibility is not a valid point, IMO. Major version releases give us the opportunity to break APIs and correct previous design mistakes.
@colinsurprenant well, we're definitely giving up backwards compatibility here. I'm not arguing that's relevant here. I'm saying the API should be unsurprising. The less it behaves in unanticipated ways, the less we have to document and the more enjoyable it will be to use.
WRT your position on documentation we'll just have to agree to disagree.
@andrewvc
event.mutate('[foo][bar]') do |value|
value['baz'] = 'bot'
value['blah'].downcase!
end
and
event.mutate('[foo][bar]') {|v| v.downcase}
# or even more concisely
event.mutate('[foo][bar]', &:downcase)
are completely over-designed. We don't need to go that far.
The principles are super easy to understand and respect, very healthy, and also very common: you can't assume that what an "opaque" object's methods/getters/accessors return are references you can use to modify the original object in place; that completely violates basic sanity principles. To update an object you have to use its _setter_ methods.
@andrewvc again, breaking compatibility in a major version release is OK and this is the whole idea with the semantic versioning stuff, which we don't respect to the letter but in its principles.
@colinsurprenant I want to be clear. This isn't about compatibility. This is about having an ergonomic API and a good developer experience. Even if this was v0.0.1-alpha of the logstash API I would have the same concerns.
Well, if «ergonomic API and a good developer experience» means relying on implementation-specific behaviours and bad software engineering practices, then I am not following. Quality is likely to increase as we use better practices, which in turn are easy for developers to understand.
Let me know if my suggestion of having immutability as a runtime option makes sense.
Hi,
I know this has spawned many discussions; I will try to share how I see the situation. First, I want to say that I think we're all trying to achieve the best user experience here, and all the arguments are valid and acceptable.
For me, the best path to the principle of least astonishment is to let users understand that the new way of working is better and encouraged, but since we're introducing a change we can:
To my understanding, 1. is a must, but it won't necessarily reach everyone, especially given our constant best effort not to break stuff.
To make users more aware of these probable issues we should move towards something like 2., or even better 3.; together with 1. this will make the change very explicit and will make people aware they are entering breaking-change territory.
On the other side (credit to @jsvd for this proposal), we should strongly consider an experimental feature flag; this would help educate people about breaking changes.
What do you think?
@colinsurprenant
I am just trying to understand this.
Let me know if my suggestion of having immutability as a runtime option make sense?
If we make @andrewvc's proposal a flag, turn it on by default, and have the event return immutable objects: when the user's configuration works, why would they want to turn this immutability feature off? What is the benefit of doing so? Performance?
I'll just add that the mutate function I'm proposing (which I stole from update-in, one of the most commonly used functions in Clojure) makes a ton of sense whether we use immutable data types or not. We could even use varargs for the path to make it cleaner.
Some examples
event['foo'] = event['foo'].downcase
# is less concise and requires more repetition than
event.mutate('foo', &:downcase)
tmp = event['[foo][bar]'].dup # returns a Hash
tmp['blah'] = 'bar'
tmp['bot'] = 'baz'
event['[foo][bar]'] = tmp
# is less concise than
event.mutate('foo', 'bar') do |v|
v['blah'] = 'bar'
v['bot'] = 'baz'
v
end
Additionally, mutate can be more efficient since we don't need to do the path lookup twice; we just hold on to the object reference. It's also less error-prone and more DRY, since we only specify the path once.
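A sketch of what such a mutate helper could look like. MiniEvent here is a stand-in with simplified varargs path handling, written purely for illustration; none of this is existing Logstash code:

```ruby
# Minimal stand-in for the Event class, just enough to sketch #mutate.
class MiniEvent
  def initialize(data = {})
    @data = data
  end

  # Walk the path of keys down the nested hash.
  def get(*path)
    path.reduce(@data) { |acc, key| acc && acc[key] }
  end

  # Create intermediate hashes as needed, then set the leaf value.
  def set(*path, value)
    *parents, leaf = path
    target = parents.reduce(@data) { |acc, key| acc[key] ||= {} }
    target[leaf] = value
  end

  # The proposed helper: one lookup, apply the block, write back its
  # return value (like Clojure's update-in).
  def mutate(*path, &block)
    set(*path, block.call(get(*path)))
  end
end

event = MiniEvent.new
event.set('foo', 'bar', 'BAZ')
event.mutate('foo', 'bar', &:downcase)
event.get('foo', 'bar')  # => "baz"
```

Because the block's return value is what gets written back, the same call works for in-place-style edits, type conversions, and full replacement of the value.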
@andrewvc I find this mutate method similar to what #tap does.
@ph, it is a lot like tap. Now that you mention it, it'd be good to have both! The difference is that my mutate function changes the value to the return value of the block. That's really useful because you can do things like:
event.mutate('foo','bar') {|v| v.to_s } # Change the type of the object
event.mutate('foo','bar') {|v| v.downcase } # use the return value instead of mutating the actual value
However, tap is nice because you can do
event.tap('foo','bar') {|v| v['foo'] = 'bar'}
# where with mutate you need to do
event.mutate('foo','bar') {|v| v['foo'] = 'bar'; v}
In Clojure all data-structures are immutable (in 99% of situations), so the return value is necessary. That's why it popped into my mind here. Even if we don't use immutable data-structures you still often have the case where you want to replace the data, not mutate it.
Also, thinking it through, I think Clojure has it right with update; it's a better word than mutate.
@ph because having immutable objects _can_ help identify _potential_ event assignment problems, but prevents you from assigning to an intermediate inner hash in a totally correct way, if that is what you want to do. Also, as I said:
recursively setting returned objects as immutable seems like a waste
This is an unnecessary overhead when your plugin is working correctly.
For the mutate function, I wouldn't mind introducing it as a new feature; it's not a bad idea and could be a nice addition, but it's neither necessary nor urgent at this point, because we will still support the current fieldref-style getters and setters: a = event["[foo]"] and event["[foo]"] = 1.
Let's not forget here that the vast majority of users are not impacted by any of this. Only the small subset who use Ruby filters and/or custom plugins could be impacted. Also, only a few of our ~200 plugins had to be modified to respect the new semantics.
@colinsurprenant thanks for clarifying this.
Maybe we should keep all the good Clojure ideas for when we introduce a Clojure API? I mean, if we start in that direction we will yak-shave ad nauseam defining new API paradigms that bring very little value to the table.
@colinsurprenant these are just programming ideas. Clojure is functional and Ruby is mixed functional/OO. A lot of core Clojure ideas (map/reduce, etc.) are Ruby natives, so they make sense here.
When you say that these things bring very little value, that's your opinion, not an empirical fact. I respect that, but I disagree. There's FP stuff all over Ruby (and Matz cited Lisp as a major influence on Ruby).
Good healthy discussion here, but it should have taken place in #5035.
Bold, unabashed, draw-a-line-in-the-sand statement: this relaxed approach to the Event API must end. No looking back.
Here is why.
First, some arguable assertions on the Java Event implementation.
Before reading this thread I thought about freeze too, but I rejected it on the grounds that it would need to be done on intermediate collections only, which is very opaque to users.
Mutate operations on collections have bad side effects.
Example from Ruby Event
> event = LogStash::Event.new
=> #<LogStash::Event:0x25ce435 @metadata_accessors=#<LogStash::Util::Accessors:0x7ea71fc2 @store={}, @lut={}>, @cancelled=false, @data={"@version"=>"1", "@timestamp"=>"2016-04-15T14:19:31.561Z"}, @metadata={}, @accessors=#<LogStash::Util::Accessors:0x7cd5fcf4 @store={"@version"=>"1", "@timestamp"=>"2016-04-15T14:19:31.561Z"}, @lut={}>>
> event["[a][0]"]
=> nil
> event["[a][0]"] = 42
=> 42
> event["[a]"]
=> {"0"=>42}
> event["[a]"] = [42,24]
=> [42, 24]
> event["[a]"]
=> [42, 24]
> event["[a][0]"]
=> 42
> event["[a]"] = [24,42]
=> [24, 42]
> event["[a][0]"]
=> 42
> event["[a][0]"] = {"a"=> 99, "b" => 98}
=> {"a"=>99, "b"=>98}
> event["[a][0]"]
=> {"a"=>99, "b"=>98}
> event["[a]"]
=> [24, 42]
> event["[a][0]"]
=> {"a"=>99, "b"=>98}
> event["[a][0][b]"]
TypeError: can't convert String into Integer
from org/jruby/RubyFixnum.java:1143:in `[]'
It seems to me that there is a list of troublesome operations on intermediate collections, including []= and others.
Let's remove them.
In the Java Event we control exactly which classes we convert Java Map and List to.
I propose that we convert these to EventHash and EventArray (wrappers to their Ruby equivalents)
These classes would not expose any setters or mutation methods, e.g. merge!, keep_if, []=.
Further we can introduce methods that allow some changes to occur in the intermediate collection but also invalidate the LUT and PathCache.
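A sketch of the wrapper idea. EventHash here is a hypothetical illustration built on Ruby's stdlib SimpleDelegator, not the actual proposed implementation: read methods pass through to the wrapped Hash, while the mutators named above raise:

```ruby
require 'delegate'

# Hypothetical read-only Hash wrapper: delegates reads to the wrapped
# Hash and raises on the common mutating methods.
class EventHash < SimpleDelegator
  [:[]=, :merge!, :keep_if, :delete, :clear, :update, :store].each do |name|
    define_method(name) do |*_args, &_blk|
      raise NoMethodError,
            "#{name} is not supported on Event collections; " \
            "use the Event setter instead"
    end
  end
end

h = EventHash.new('a' => 1)
h['a']       # reads still work => 1
h.key?('a')  # => true
```

The same pattern would apply to an EventArray wrapper around Array, and the override list could be extended to any method that mutates the receiver.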
When you say that these things bring very little value that's your opinion, not an empirical fact.
It brings little value because it is absolutely not required; it is perfectly possible to continue using the current API correctly. Whether that is an empirical fact or not is totally irrelevant to the discussion.
@guyboertje I think that returning different collection types makes a lot of sense. Thanks for the examples demonstrating the confusing behavior of the current impl.
Question: How would this work with values? How would this fix the downcase! problem?
It'd be useful to create a list of test cases for use of the Event API and figure out how we'd want these to work. I think we've wound up going about the process from the internals out instead of the outside in.
@andrewvc
The downcase! problem...
It depends on whether my Dual Value for leaf objects PR gets merged; the downcase! problem is not fixed without it.
If it is merged, then it works, and is fixed with or without EventHash-like wrappers.
So
Internally it is a collection of collections (Java), but each leaf is a Dual Value [Ruby value, Java value | null].
If event["[a][b][0]"] points to a leaf, then before the value is returned by the Event#[] method the JRuby extension unpacks the Ruby value from the Dual Value and returns it; this is the original Ruby value, so you can downcase! it. Internally that Ruby value still exists in the Dual Value in the collection of collections. event["[a][b]"][0].downcase! works too, because event["[a][b]"] returns a new Array that contains the original leaf Ruby values.
But actually I want to discourage this because it's a 'set up for a fall'...
Assume event["[a][b][0]"] -> 42; then event["[a][b][0]"] += 4 works, because internally Ruby uses Event#[] to get the value (i.e. the original value), adds 4, and uses Event#[]= to put it back.
However, event["[a][b]"][0] += 4 fails because Array#[]= is called on a copy.
This issue has become a melting pot of many different discussions. On my side, I will conclude with this last comment.
Setting returned objects as immutable only when a Logstash runtime/command-line option is given, to help users spot potential problems, seems reasonable, but I wouldn't enable it by default.
Other proposals for new Event API methods/paradigms/philosophies should probably be made and discussed in their own issues.
@andrewvc can we close this now in favor of #5140?
@suyograo yup! Closed.