Hi there,
I am currently working on some factories, and I want each of them to offer both static values and randomized values. To reuse the already defined traits, I would like to do something like:
FactoryGirl.define do
  factory :currency do
    eur

    trait :usd do
      name 'US-Dollar'
      symbol '$'
    end

    trait :eur do
      name 'Euro'
      symbol '€'
    end

    # what would be really nice..
    trait :random, traits: [
      [:eur, :usd].sample
    ]
  end
end
Could we add this feature in any future release, or is there any possibility to do this already?
@loybert You can technically do this when building factories: create(:currency, [:eur, :usd].sample)
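A rough sketch of that call-site approach, written in plain Ruby so it runs without the gem. The `random_trait` helper and the `TRAITS` constant are hypothetical names, not part of FactoryGirl; the only library calls used are Ruby's own `Array#sample` and `Random`. Drawing the trait from a seeded generator and logging the seed is one way to make a given run repeatable:

```ruby
# Hypothetical list of trait names defined on the :currency factory.
TRAITS = [:eur, :usd].freeze

# Hypothetical helper: picks one trait name using a generator built from
# the given seed, so the same seed always yields the same trait.
def random_trait(seed)
  TRAITS.sample(random: Random.new(seed))
end

seed = Random.new_seed
puts "trait seed: #{seed}" # log the seed so a failing run can be replayed

trait = random_trait(seed)
puts "picked trait: #{trait}"

# With FactoryGirl loaded, the result would be passed along as:
#   create(:currency, trait)
```

Replaying a run then only requires calling `random_trait` with the logged seed.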
That said, I'd highly encourage you _not_ to use random data in your acceptance or unit test suites - I've found time after time that introducing random values in test suites only ends in me banging my head against the wall.
?? I use random data in my test suites routinely. I've never found it to cause any problems at all, and sometimes it points out cases I didn't consider. I think fixed data is a testing smell.
@marnen I think it depends on the tests; I can definitely appreciate a fuzzing tool, like tarantula, to actively try to break areas of an application.
However, I've found random data in tests which aren't explicitly written to fuzz the application often introduces nondeterminism, which Martin Fowler has written about at length here.
The trouble with non-deterministic tests is that when they go red, you have no idea whether it's due to a bug, or just part of the non-deterministic behavior. Usually with these tests a non-deterministic failure is relatively common, so you end up shrugging your shoulders when these tests go red. Once you start ignoring a regression test failure, then that test is useless and you might as well throw it away.
If I'm testing an interface of an object where the results are the same, I'd rely on Ruby and my test framework to generate deterministic tests for multiple cases, so if the test fails, I know exactly where, how, and the state the object is in when it's failing. Any information less than that, to Martin Fowler's point, is useless because it gives no actionable feedback other than that _something_ is broken.
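As an illustration of generating deterministic checks for multiple cases, here is a minimal sketch in plain Ruby. The `uppercase_name` method and the `CASES` table are hypothetical stand-ins chosen for the example; in a real suite the same table would typically drive one example per entry in the test framework:

```ruby
# Hypothetical stand-in for the method under test.
def uppercase_name(name)
  name.upcase
end

# Fixed input => expected output pairs. Every run checks exactly these
# cases, so a failure names the input that broke.
CASES = {
  "John"      => "JOHN",
  "Billy Bob" => "BILLY BOB",
  "o'neil"    => "O'NEIL",
  ""          => ""
}.freeze

# Collect the cases whose actual output differs from the expected one.
failures = CASES.reject { |input, expected| uppercase_name(input) == expected }

failures.each do |input, expected|
  puts "FAIL: #{input.inspect} gave #{uppercase_name(input).inspect}, expected #{expected.inspect}"
end
puts "#{CASES.size - failures.size} of #{CASES.size} cases passed"
```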
@joshuaclayton I don't use a fuzz-testing tool as such, but my factories and unit tests always have random data as much as possible. And I disagree with the Fowler quote you posted: while the _data_ in my tests may be non-deterministic, the relationship between the input and the output is completely deterministic, so I've never had any trouble telling whether I have a bug or an edge case. It doesn't really matter anyway: if I'm not handling an edge case properly, that _is_ a bug, either in my test or in my implementation. I certainly don't ignore failing tests.
Not using randomized data in your tests introduces confirmation bias: you'll be tempted to only test the cases you already expect will succeed or fail, so your test won't really be very useful. Randomized data, with a well-defined deterministic relationship between input and output, keeps you honest.
Example of the way I write tests:
# models/user.rb
class User < ActiveRecord::Base
  def uppercase_name
    name.upcase
  end
end

# factories/user.rb
FactoryGirl.define do
  factory :user do
    name { Faker::Name.name }
  end
end

# spec/models/user_spec.rb
describe User, '#uppercase_name' do
  it "returns the user's name in uppercase" do
    user = FactoryGirl.create(:user)
    expect(user.uppercase_name).to be == user.name.upcase
  end
end
So the relationship between input and output is completely deterministic: "John" => "JOHN", "Billy Bob" => "BILLY BOB", and so on. But the RSpec code is more intention-revealing (because it has to describe the relationship, not a hard-coded result), and I can't break the application by writing an implementation that won't pass all cases.
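To make the "deterministic relationship, random data" idea concrete without Rails, Faker, or RSpec, here is a minimal plain-Ruby sketch. Everything in it is an assumption made for illustration: `uppercase_name` stands in for the model method, `random_name` and `LETTERS` are hypothetical, and the fixed seed is only there so the run can be repeated. It checks properties of the result rather than comparing against a second call to `upcase`:

```ruby
# Hypothetical stand-in for User#uppercase_name.
def uppercase_name(name)
  name.upcase
end

# Characters the hypothetical generator draws from (ASCII letters and space).
LETTERS = [*"a".."z", *"A".."Z", " "].freeze

# Builds a random name of 1 to 12 characters from the given generator.
def random_name(rng)
  length = rng.rand(1..12)
  Array.new(length) { LETTERS.sample(random: rng) }.join
end

seed = 1234 # fixed here for the sketch; a suite could log a fresh seed instead
rng = Random.new(seed)
names = Array.new(100) { random_name(rng) }

# The relationship that must hold for every input, whatever the data is:
# same length, no lowercase ASCII letters left, same text ignoring case.
violations = names.reject do |name|
  result = uppercase_name(name)
  result.length == name.length &&
    result !~ /[a-z]/ &&
    result.downcase == name.downcase
end

puts "seed #{seed}: #{violations.size} violations in #{names.size} names"
```

Because the names come from a seeded generator, a failing input can be regenerated from the seed alone.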