Rspec-core: on long strings, rspec may not display the part of the string that does not match

Created on 6 Apr 2018  路  17Comments  路  Source: rspec/rspec-core

I'm trying to track down an error in a flaky test. Unfortunately the string output presented by rspec does not display the problem.

The following test should be enough to demonstrate the problem:

describe 'a long string' do
  it 'shows the right part of the diff' do
    got = "a"*100 + "Z" + "a"*100
    want = "a" * (1 + 100*2)
    expect(got).to eq(want)
  end
end

Yields the following output:

Failures:

  1) a long string shows the right part of the diff
     Failure/Error: expect(got).to eq(want)

       expected: "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa...aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
            got: "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa...aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"

       (compared using ==)
     # ./longstring_spec.rb:5:in `block (2 levels) in <top (required)>'

The strings displayed by rspec are identical; the diff is in the truncated part. If the string is any shorter, the diff appears correctly.

Running on a Mac in iTerm2.

$ rspec --version
RSpec 3.7
  - rspec-core 3.7.1
  - rspec-expectations 3.7.0
  - rspec-mocks 3.7.0
  - rspec-support 3.7.1

Most helpful comment

Ah, this is a dupe of https://github.com/rspec/rspec-expectations/issues/991.

The suggestion there is to set RSpec::Support::ObjectFormatter.default_instance.max_formatted_output_length to a high enough value.

All 17 comments

Thanks for reporting this!

It's unfortunate we're not printing the differing part of the string. I think RSpec's output could get much better with a differ designed for RSpec itself, that would give better snippets that highlights the differences. That's a feature I'd like to see us gain in the future.

In the meantime, to fix your issue, I think we can add some conditional logic to not truncate the string like that if the expected/got values are the same. Want to take a stab at that?

Want to take a stab at that?

Apologies - I am not likely to have time to investigate/test/produce a proper patch :( Just wanted to alert you to this issue as is.

Ah, this is a dupe of https://github.com/rspec/rspec-expectations/issues/991.

The suggestion there is to set RSpec::Support::ObjectFormatter.default_instance.max_formatted_output_length to a high enough value.

Otherwise it might be difficult.

        def failure_message
          "\nexpected: #{expected_formatted}\n     got: #{actual_formatted}\n\n(compared using ==)\n"
        end

Each of expected_formatted and actual_formatted formats the string independently, so we'd need to do something to check whether those strings are identical, then call the same method again with different parameters or a different method in the hopes of getting a non-identical string.

I will probably try to work on this if we are ok with the scope of the task.

What I think could be nice to display is:

Failures:

  1) a long string shows the right part of the diff
     Failure/Error: expect(got).to eq(want)

       expected: "...aaaaaaaa..."
            got: "...aaaabaaa..."

       (compared using ==)
     # ./longstring_spec.rb:5:in `block (2 levels) in <top (required)>'

But I have few issues:

  • Is it limited to strings?
  • What happens when there are multiple differences?
  • Is it an option to set?

Example

Say we have two strings of strings of length 11:

 str1 = "aaaaaaaaaaa"
 str2 = "aaaaaZaaaaa"

And the value of RSpec.Support.ObjectFormatter.max_formatted_output_length was set to 4

Current formatting

   expected: "aa...a"
        got: "aa...a"

I'm not a big fan of that.

Proposed formatting

Leading & traling ellipsis. You'd see something like this...

   expected: "...aaaZ..."
        got: "...aaaa..."

Here's where the formatting is actually done:

https://github.com/rspec/rspec-support/blob/master/lib/rspec/support/object_formatter.rb#L32

Still up to work on this even if I still have few questions maybe for @JonRowe or @myronmarston.

But I have few issues:

 Is it limited to strings?
 What happens when there are multiple differences?
 Is it an option to set?

Those are great questions.

Is it limited to strings?

Ideally, it would apply to any object that gets the truncation treatment--which I think means any object. But if it is easier to limit this improvement to strings I think that's ok.

What happens when there are multiple differences?

Can we apply the formatted string limit to each difference and show multiple ellipses as needed?

Is it an option to set?

Ideally, we can get this to display well w/o needing any user options.

| Can we apply the formatted string limit to each difference and show multiple ellipses as needed?

I think that sounds super confusing. Why not just report all the errors that show up in the max output length, and truncate the rest? It's a pretty niche setting that most people don't even know exist (I had to go digging to find it), so why add a bunch of complexity that just makes the code base harder to maintain?

I think that's all I have to say on the topic, and I don't have a strong opinion. I just happened to bump into this issue, and I found the topic interesting enough to investigate a bit :)

Why not just report all the errors that show up in the max output length, and truncate the rest?

I was thinking about this too.

str1 = "aaaaaaaaaaaaaaaaaaaaaaaaa"
str2 = "aaaaaZaaaaaaaaZaaaaZaaaaa"
````

expected: "...aaZaa...aaZaa...(truncated)"
got: "...aaaaa...aaaaa...(truncated)"
```

But it can get very confusing.

Thanks a lot, @myronmarston for answers.

I'm fine with going with a simpler solution, as long as it improves things (as I expect it to!) and we can always iterate.

Is it limited to strings?

From the differs perspective everything is strings.

What happens when there are multiple differences?

If we get it to show at least one difference thats a big improvement, I wouldn't worry so much about multiples.

Is it an option to set?

I think it should just happen

Hello

I worked on a first draft version inspired by Differ gem.

class DiffObjectFormater
  attr_accessor :diff_intro, :diff_outro, :received_diff_part, :expected_diff_part
  INTRO_OUTRO_LENGTH = 3
  ELLISPSIS_ARRAY = ['.','.','.']

  def initialize
    @diff_intro = []
    @diff_outro = []
    @received_diff_part = []
    @expected_diff_part = []
  end

  def diff(target, source)
    target = target.split('')
    source = source.split('')

    until source.empty? || target.empty? || output_limit_reached?
      compose_diff_parts(source, target)
    end

    expected = diff_intro_ellipsed + expected_diff_part + diff_outro_ellipsed
    received = diff_intro_ellipsed + received_diff_part + diff_outro_ellipsed

    [expected.join, received.join]
  end

  private

  def compose_diff_parts(source, target)
    first_source_item = source.shift
    first_target_item = target.shift

    if first_source_item == first_target_item
      if expected_diff_part.empty?
        diff_intro << first_source_item
        diff_intro.delete_at(3) if diff_intro.size > 6
      elsif diff_outro.size < 7
        diff_outro << first_source_item
      end
    elsif diff_outro.empty?
      received_diff_part << first_source_item
      expected_diff_part << first_target_item
    end
  end

  def output_limit_reached?
    INTRO_OUTRO_LENGTH + received_diff_part.size + INTRO_OUTRO_LENGTH > 15  # Set to RSpec::Support::ObjectFormatter.default_instance.max_formatted_output_length
  end

  def diff_intro_ellipsed
    return diff_intro unless diff_intro.any? && diff_intro.size > INTRO_OUTRO_LENGTH
    diff_intro[0..2] = ELLISPSIS_ARRAY
    diff_intro
  end

  def diff_outro_ellipsed
    return diff_outro unless diff_outro.any? && diff_outro.size > INTRO_OUTRO_LENGTH
    diff_outro[4..-1] = ELLISPSIS_ARRAY
    diff_outro
  end
end

begin
  require "bundler/inline"
end

gemfile(true) do
  source "https://rubygems.org"

  gem "rspec"
end
require 'rspec/autorun'

RSpec.describe 'additions' do
  describe 'middle change' do
    it 'returns ellipsis, intro, diff, outro, ellipsis' do
      expect(DiffObjectFormater.new.diff('aaaaaaaa1234567890bbbbbbbb', 'aaaaaaaa12345___90bbbbbbbb'))
        .to eq(["...34567890bb...", "...345___90bb..."])
    end
  end

  describe 'first char change' do
    it 'returns first letter as changed then outro + ellipsis' do
      expect(DiffObjectFormater.new.diff('Zaaaaaaa1234567890bbbbbbbb', 'aaaaaaaa1234567890bbbbbbbb'))
        .to eq(["Zaaaa...", "aaaaa..."])
    end
  end

  describe 'First and middle change' do
    it 'returns only first change' do
      expect(DiffObjectFormater.new.diff('Zaaaaaaa1234567890bbbbbbbb', 'aaaaaaaa12345___90bbbbbbbb'))
        .to eq(["Zaaaa...", "aaaaa..."])
    end
  end
end

The way I handle this is probably not the easiest to follow. Some complexity has been added to deal with first or last letter change in a diff and when we need to display ellipsis. Feel free to make any comments.

Also for the moment, it seems I'm quite "stuck" to implement this into existing rspec-support (https://github.com/rspec/rspec-support/blob/master/lib/rspec/support/object_formatter.rb#L32) because the formater is used separately for "expected" and "got". It doesn't know the other and in my code it needs to know the other (got VS expected).
Before going further I think I need few feedbacks and tips.

Thanks in advance. 馃帀

Sorry for the slow review here @benoittgt !

The ObjectFormatter is currently used in several ways, to format objects for descriptions etc, and to format them for expected / actual. I think this diffing formatter is worth having as an additional object (although if we could refactor it to share some code with the current formatter that'd be nice), which is then used when we know what expected / actual are. (See rspec-expectation matchers for how this is used). This could obviously then just be added as an additional peice, with tests demonstrating usage of course, and then we can switch the matchers to use this.

FWIW, I think RSpec badly needs to move away from using diff-lcs for diffing, and implement its own differ. diff-lcs works OK, but we convert everything to strings to diff, and we only get a line-by-line diff. This has several problems:

  • We aren't able to get diffs of long single-line strings
  • Non-strings (e.g. arrays, hashes, etc) are only diffable if they get formatted as multi-line strings
  • The differ is only able to work off of _string_ differences, but that doesn't align with our matching semantics. For example, since we've supported composable matchers, it's very common to have an actual value in a data structure and a matcher in the expected data structure. They have different string formatting (and thus show up as a difference) even if they match. Even putting aside composable matchers, it's not true that equal objects always have the same string output, or that unequal objects have different string output.

A differ written for RSpec, that understands the semantics of RSpec matching, would produce far better output.

I know that's a much larger task than just this issue, but it's something I've been thinking about for a while, and was thinking could be a headline feature for RSpec 4.

If someone wants to experiment with this, it might be instructive to look at the differ that is included with Elixir's ExUnit. It works quite well:

https://github.com/elixir-lang/elixir/blob/master/lib/ex_unit/lib/ex_unit/diff.ex

I totally agree on that front

I have a draft version of the "new" object formater but it needs more work. I hope to send a proper version soon. ObjectFormater does a lot I need to understand what I need to keep from the original object_formater. Also I need to DRY the code with ObjectFormater.

At the same time I'm doing a small class about computer science at Harvard (essentially to learn C) and I have a final project. I can give a try to "port" ex_unit/diff to ruby (elixir test are instructive on how it works elixir-https://github.com/elixir-lang/elixir/blob/master/lib/ex_unit/test/ex_unit/diff_test.exs). I did some Elixir last year it could be a good project to end that class and validate my diploma.

Was this page helpful?
0 / 5 - 0 ratings