Pytest: Allow list of dictionaries for @pytest.mark.parametrize

Created on 29 Jul 2020 · 12 Comments · Source: pytest-dev/pytest

Right now, only a list of tuples can be passed to the @pytest.mark.parametrize decorator:

import pytest

from datetime import datetime, timedelta

testdata = [
    (datetime(2001, 12, 12), datetime(2001, 12, 11), timedelta(1)),
    (datetime(2001, 12, 11), datetime(2001, 12, 12), timedelta(-1)),
]


@pytest.mark.parametrize("a,b,expected", testdata)
def test_timedistance_v0(a, b, expected):
    diff = a - b
    assert diff == expected

Large test data sets are hard to manage this way, and the test data becomes less readable. To overcome this issue, I propose passing testdata as a list of dictionaries and keeping the argument names only in testdata:

import pytest

from datetime import datetime, timedelta

testdata = [
    {
      'a': datetime(2001, 12, 12),
      'b': datetime(2001, 12, 11), 
      'expected': timedelta(1),
    },
    {
      'a': datetime(2001, 12, 11),
      'b': datetime(2001, 12, 12), 
      'expected': timedelta(-1),
    },
]


@pytest.mark.parametrize(testdata)
def test_timedistance_v0(a, b, expected):
    diff = a - b
    assert diff == expected
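
The single-argument call above is a proposal and is not something pytest accepts. As a rough user-level sketch of the same idea, a list of dictionaries can be converted into the argnames and argvalues that parametrize already understands. The helper name `dicts_to_parametrize_args` is hypothetical and not part of pytest:

```python
from datetime import datetime, timedelta


def dicts_to_parametrize_args(rows):
    """Turn a list of dicts into (argnames, argvalues).

    Hypothetical helper for illustration only; not a pytest API.
    """
    if not rows:
        raise ValueError("at least one test case is required")
    argnames = list(rows[0])  # key order of the first case
    argvalues = []
    for index, row in enumerate(rows):
        # Every case must name exactly the same arguments.
        if set(row) != set(argnames):
            raise ValueError(
                f"case {index} has keys {sorted(row)}, expected {sorted(argnames)}"
            )
        argvalues.append(tuple(row[name] for name in argnames))
    return argnames, argvalues


testdata = [
    {
        "a": datetime(2001, 12, 12),
        "b": datetime(2001, 12, 11),
        "expected": timedelta(1),
    },
    {
        "a": datetime(2001, 12, 11),
        "b": datetime(2001, 12, 12),
        "expected": timedelta(-1),
    },
]

argnames, argvalues = dicts_to_parametrize_args(testdata)

# With pytest installed, the result could then be used as:
#
#     @pytest.mark.parametrize(argnames, argvalues)
#     def test_timedistance_v0(a, b, expected):
#         assert a - b == expected
```

The test function still declares its arguments, but the names are no longer repeated as a separate string next to the decorator.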

parametrize proposal

dbalabka

All 12 comments

I'd love to have something like this, but the problem is that there isn't really an agreement what dictionaries mean exactly when passing them to parametrize - for a different take (which ended up being rejected) and some previous discussion, see #5487 and #5850.

I was wondering whether it'd make more sense for pytest.param to accept keyword arguments, i.e.:

(datetime(2001, 12, 12), datetime(2001, 12, 11), timedelta(1)),  # already works
pytest.param(datetime(2001, 12, 12), datetime(2001, 12, 11), timedelta(1)),  # already works
pytest.param(a=datetime(2001, 12, 12), b=datetime(2001, 12, 11), expected=timedelta(1)),  # new and seems logical

But the problem with that is that pytest.param already takes id and marks as keyword arguments, so that's kind of problematic as well. Possible solutions:

Use dicts like in your example (but that seems too ambiguous)
Make pytest.param handle everything mentioned in the pytest.mark.parametrize argument list as arguments (so, if id or marks is listed there, that'd be effectively ignored when passed as keyword argument) - but that's a bit magic and also backwards incompatible
Just be okay with the fact that we couldn't pass id and marks as keyword arguments to a test - but that means we could not add any new keyword arguments to param in the future
Introduce a new pytest.param_args(...) or whatever, but that seems confusing together with pytest.param
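
To make the last option concrete, here is a minimal sketch of what such a helper might look like in user code. The name `param_args` comes from the suggestion above, but this signature and behaviour are assumptions for illustration and not a pytest API. It maps keyword arguments onto the existing pytest.param in argname order:

```python
import pytest


def param_args(argnames, /, *, id=None, marks=(), **values):
    """Build a pytest.param from keyword arguments.

    Hypothetical helper; the signature is an assumption, not a pytest API.
    """
    names = [name.strip() for name in argnames.split(",")]
    if set(values) != set(names):
        raise TypeError(
            f"expected arguments {sorted(names)}, got {sorted(values)}"
        )
    # Order the values the way the argnames string lists them.
    return pytest.param(*(values[name] for name in names), id=id, marks=marks)


# Keyword order does not matter; the argnames string decides the order.
case = param_args("a,b,expected", b=2, expected=3, a=1, id="simple")
```

Note that the sketch inherits the collision described above: a test argument literally named id or marks cannot be passed through it, because those keywords are consumed by the helper itself.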

The-Compiler on 29 Jul 2020

👍2

@The-Compiler thanks for the quick reply. As I understand it, the previous discussions (#5487 and #5850) are mostly about test case descriptions/ids, which are also needed to improve the developer experience (DX).

I will add more explanation to what I wrote above. Probably the most frustrating issue at the moment is that the developer has to maintain a list of arguments that duplicates the function/method parameters already declared right next to it:

@pytest.mark.parametrize("a,b,expected", testdata)
def test_timedistance_v0(a, b, expected):

It would be a great improvement to avoid writing this by extracting the meta-information from the function/method signature. IMO it is worth paying a small performance cost and including some "magic" to improve DX.

@pytest.mark.parametrize(testdata)
def test_timedistance_v0(a, b, expected):
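
As a rough sketch of how the argument names could be derived from the signature in user code today, the decorator below reads the test function's parameters with inspect.signature and forwards to the existing pytest.mark.parametrize. The name `parametrize_dicts` is hypothetical, and this only illustrates the idea rather than how pytest would implement it:

```python
import inspect

import pytest


def parametrize_dicts(rows):
    """Parametrize a test from a list of dicts (hypothetical helper)."""

    def decorator(func):
        keys = set(rows[0])
        # Keep the order in which the test function declares its parameters.
        # Parameters not present in the dicts are left alone (e.g. fixtures).
        names = [n for n in inspect.signature(func).parameters if n in keys]
        unknown = keys - set(names)
        if unknown:
            raise TypeError(
                f"{func.__name__}() has no parameters named {sorted(unknown)}"
            )
        values = [tuple(row[n] for n in names) for row in rows]
        return pytest.mark.parametrize(names, values)(func)

    return decorator


@parametrize_dicts(
    [
        {"a": 3, "b": 1, "expected": 2},
        {"a": 1, "b": 3, "expected": -2},
    ]
)
def test_difference(a, b, expected):
    assert a - b == expected
```

A typo in a dictionary key raises a TypeError at decoration time here, which is one way to keep a clear error message without an explicit argument string.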

The proposed approach (keeping parameter names in the test data as a dictionary) makes the test data readable. It would be great to enrich it with the test case id as well. So, in the end, we could come up with the following structure:

testdata = {
    'test case 1': {
      'a': datetime(2001, 12, 12),
      'b': datetime(2001, 12, 11), 
      'expected': timedelta(1),
    },
    'test case 2': {
      'a': datetime(2001, 12, 11),
      'b': datetime(2001, 12, 12), 
      'expected': timedelta(-1),
    },
}

Speaking of ids, pytest.param lets you specify a test case id, but I find it too verbose.

It is probably possible to keep using @pytest.mark.parametrize and enable dictionary support only when a single argument is passed:

@pytest.mark.parametrize(testdata)

dbalabka on 29 Jul 2020

Sometimes, explicit is better than implicit :wink: Arguments can also refer to fixtures, so having an explicit list of arguments to parametrize helps to get e.g. sensible error messages when there is a typo.

I also don't think it makes sense to have yet another "magic" data structure API for how to use parametrize, there are already various things in there (specifying a list of items for a single argument vs. a list of tuples for multiple, etc. etc.). This gets confusing fast.

Case in point: In your example above, you'd now always have to explicitly specify test IDs, which can get cumbersome fast. Perhaps you want something like pytest-cases though?

So, I'd still like to explore a more explicit solution involving pytest.param, but I'm -1 on any "dicts have some special meaning and then there's a lot of magic around it" solution because different people expect wildly different behaviors in that case, and I doubt it'd be good for pytest to guess what the user meant.

The-Compiler on 29 Jul 2020

I actually do this:

import datetime as dt

import attr
import pytest



@attr.dataclass(frozen=True)
class TimeDistanceParam:
    a: dt.datetime
    b: dt.datetime
    expected: dt.timedelta

    def pytest_id(self):
        return repr(self)  # usually something custom


@pytest.mark.parametrize(
    "p",
    [
        TimeDistanceParam(
            a=dt.datetime(2001, 12, 12),
            b=dt.datetime(2001, 12, 11),
            expected=dt.timedelta(1),
        ),
        TimeDistanceParam(
            a=dt.datetime(2001, 12, 11),
            b=dt.datetime(2001, 12, 12),
            expected=dt.timedelta(-1),
        ),
    ],
    ids=TimeDistanceParam.pytest_id,
)
def test_timedistance_v0(p):
    assert p.a - p.b == p.expected

graingert on 29 Jul 2020

Something that would help my use case is a PytestParam ABC, so I can do:

class PytestParam(metaclass=abc.ABCMeta):
    @abc.abstractmethod
    def pytest_marks(self): ...

    @abc.abstractmethod
    def pytest_id(self): ...

@pytest.PytestParam.register
@attr.dataclass(frozen=True)
class TimeDistanceParam:
    a: dt.datetime
    b: dt.datetime

    def pytest_marks(self):
        ...

    def pytest_id(self):
        ...

and pytest.mark.parametrize can special case registered instances of PytestParam

graingert on 29 Jul 2020

fwiw, I'm still -1 on this as per the previous (duplicate) discussions. parametrize is already type-complicated enough without introducing yet-another-way to do this by accepting mapping types

asottile on 29 Jul 2020

I'm -1 on this as well, as it seems fairly easy to do in user code:

import datetime as dt

import pytest


def params(d):
    return pytest.mark.parametrize(
        argnames=(argnames := sorted({k for v in d.values() for k in v.keys()})),
        argvalues=[[v.get(k) for k in argnames] for v in d.values()],
        ids=d.keys(),
    )


@params(
    {
        "test case 1": {
            "a": dt.datetime(2001, 12, 12),
            "b": dt.datetime(2001, 12, 11),
            "expected": dt.timedelta(1),
        },
        "test case 2": {
            "a": dt.datetime(2001, 12, 11),
            "b": dt.datetime(2001, 12, 12),
            "expected": dt.timedelta(-1),
        },
    }
)
def test_timedistance_v0(a, b, expected):
    assert a - b == expected

graingert on 30 Jul 2020

@The-Compiler

Sometimes, explicit is better than implicit 😉

In this case, it is too explicit because as I already said: "...developer should support a list of arguments that duplicates already declared function/method parameters method next to it..."

Arguments can also refer to fixtures, so having an explicit list of arguments to parametrize helps to get e.g. sensible error messages when there is a typo.

Unfortunately, I have forgotten to add parameters to this list multiple times and got an error that had nothing to do with broken code or incorrect test logic. So I conclude that the duplicated list of parameters does not help in most cases.
Could you please provide some typical developer mistakes that can be caught by an additional list of parameters?

I also don't think it makes sense to have yet another "magic" data structure API for how to use parametrize, there are already various things in there (specifying a list of items for a single argument vs. a list of tuples for multiple, etc. etc.). This gets confusing fast.

The proposed data structure is not an alternative to the existing one. It is a logical continuation of the list of tuples for cases where a developer wants to explicitly provide parameter names and/or ids for test cases. So we are not introducing another variant of usage.

Case in point: In your example above, you'd now always have to explicitly specify test IDs, which can get cumbersome fast. Perhaps you want something like pytest-cases though?

I may not have mentioned it, but I would like to keep backward compatibility (BC) with the old structure, so developers would not be required to provide ids if they don't want to.

Findings

After some investigation into why the parameter list is required, I found the place where the real magic lives. It is possible to stack decorators to get all possible combinations:

@pytest.mark.parametrize("x", [0, 1])
@pytest.mark.parametrize("y", [2, 3])
def test_foo(x, y):
    pass

In this case, the list of parameters is required to avoid ambiguity in how the list is used. Without an explicit list of parameters, it is impossible to say whether it is a list of values for a single parameter or a list of test cases.

IMO this is an unjustified API complication that can be replaced with a more readable variant using itertools:

@pytest.mark.parametrize("x,y", itertools.product([0,1], [2,3]))
def test_foo(x, y):
    pass

or single-parameter parametrization should live in a separate decorator.
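
For reference, a quick look at the (x, y) pairs that itertools.product yields for the values used above. This only shows the standard-library behaviour and makes no claim about the order in which pytest runs stacked decorators:

```python
import itertools

xs = [0, 1]
ys = [2, 3]

# Each tuple is one (x, y) test case; the rightmost iterable varies fastest.
combinations = list(itertools.product(xs, ys))
print(combinations)  # [(0, 2), (0, 3), (1, 2), (1, 3)]
```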

Conclusion

The parametrize API is already overcomplicated. ~~IMO we should deprecate stacked decorators and use itertools~~ IMO stacked decorators should be implemented by a separate decorator to simplify the @pytest.mark.parametrize functionality. This would make it possible to get rid of the explicitly specified parameter list and avoid ambiguity in the arguments list. Otherwise, the proposed structure improvements will increase the complexity of the implementation and will not contribute much to a better developer experience.

dbalabka on 30 Jul 2020

Stacked decorators are not the only source of parametrization; any fixture as well as custom plugins can participate.

It's a well used feature so your proposal doesn't exactly help/work for the project.

RonnyPfannschmidt on 30 Jul 2020

👍1

I appreciate everyone's input into this ticket's discussion. I'm closing this ticket because it has been discussed.

dbalabka on 30 Jul 2020

I was directed here from #7790 since I have a similar problem with readability and maintainability when using pytest.mark.parametrize.

(I hope it's okay to post here even though the thread is closed).

I use a helper function that wraps the pytest.mark.parametrize decorator:

import pytest

class Case:
    def __init__(self, **kwargs):
        self.label = None
        self.kwargs = kwargs


class LabelledCase:
    def __init__(self, label):
        self.label = label

    def __call__(self, **kwargs):
        case = Case(**kwargs)
        case.label = self.label
        return case


def nicer_parametrize(*args):
    for case in args:
        if not isinstance(case, Case):
            raise TypeError(f"{case!r} is not an instance of Case")

    first_case = next(iter(args))
    first_attrs = first_case.kwargs.keys()
    argument_string = ",".join(sorted(list(first_attrs)))

    case_list = []
    ids_list = []
    for case in args:
        case_dict = case.kwargs
        attrs = case_dict.keys()

        if attrs != first_attrs:
            raise ValueError(
                f"Inconsistent argument signature: {first_case!r}, {case!r}"
            )

        case_tuple = tuple(value for key, value in sorted(list(case_dict.items())))
        case_list.append(case_tuple)
        ids_list.append(case.label)

    return pytest.mark.parametrize(
        argnames=argument_string, argvalues=case_list, ids=ids_list
    )

It's used like this:

@nicer_parametrize(
    LabelledCase("Strategy A")(
        flavour_prices={
            "Vanilla": 1.50,
            "Strawberry": 1.80,
            "Chocolate": 1.80,
            "Caramel": 1.65,
        },
        expected_revenue=1_200_000,
    ),
    LabelledCase("Strategy B")(
        flavour_prices={
            "Vanilla": 1.25,
            "Strawberry": 1.55,
            "Chocolate": 1.65,
            "Caramel": 2.10,
        },
        expected_revenue=1_350_000,
    ),
    # if no label is wanted/needed, just use Case
    Case(
        expected_revenue=2_400_000, # order is irrelevant
        flavour_prices={
            "Vanilla": 1.40,
            "Chocolate": 1.50,
            "Strawberry": 1.85,
            "Caramel": 1.35
        }
    )
)
@nicer_parametrize(
    # we can stack to get the Cartesian product just like mark.parametrize
    LabelledCase("USA")(country="United States"),
    LabelledCase("France")(country="France"),
    LabelledCase("Japan")(country="Japan")
)
def test_ice_cream_projections(flavour_prices, country, expected_revenue):
    ...

Gives:

tests/test_wrapper.py::test_ice_cream_projections[USA-Strategy A] PASSED                    [ 11%] 
tests/test_wrapper.py::test_ice_cream_projections[USA-Strategy B] PASSED                    [ 22%] 
tests/test_wrapper.py::test_ice_cream_projections[USA-2400000-flavour_prices2] PASSED       [ 33%] 
tests/test_wrapper.py::test_ice_cream_projections[France-Strategy A] PASSED                 [ 44%] 
tests/test_wrapper.py::test_ice_cream_projections[France-Strategy B] PASSED                 [ 55%] 
tests/test_wrapper.py::test_ice_cream_projections[France-2400000-flavour_prices2] PASSED    [ 66%] 
tests/test_wrapper.py::test_ice_cream_projections[Japan-Strategy A] PASSED                  [ 77%] 
tests/test_wrapper.py::test_ice_cream_projections[Japan-Strategy B] PASSED                  [ 88%] 
tests/test_wrapper.py::test_ice_cream_projections[Japan-2400000-flavour_prices2] PASSED     [100%]

The advantages of this over plain pytest.mark.parametrize are:

One less layer of indentation, fewer lines
More readable
More explicit
Related values (argument names, values, test id) are grouped together, rather than being smeared out over two lists and a string

The advantages over other suggestions in this thread are:

Does not force you to specify a test ID, and if you do specify one, it doesn't interfere with argument names (you can have argnames called id or mark or label or whatever)
Does not require the user to make a bespoke class for each kind of argument set
Does not alter the existing behaviour of pytest.mark.parametrize
Can be stacked with other parametrize decorators
Keyword arguments feel a bit safer to use than string keys, and are visually consistent with the arguments in the test function definition

Disadvantages:

It's a bit weird to specify the test ID in that currying-esque way
Probably needs better naming
?

I think something like this would be a nice addition to pytest.

EDIT

Actually I overlooked a simpler way: we can use the positional-only argument feature of Python 3.8 to remove the need for the LabelledCase class:

class Case:
    def __init__(self, label=None, /, **kwargs):
        self.label = label
        self.kwargs = kwargs

That means we can just pass in an optional string at the beginning of the Case initialization and avoid the awkward currying:

Case(
    "Strategy B",
    flavour_prices={
        "Vanilla": 1.25,
        "Strawberry": 1.55,
        "Chocolate": 1.65,
        "Caramel": 2.10,
    },
    expected_revenue=1_350_000,
)
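
A small self-contained check of why the positional-only marker matters here (Python 3.8+): because label can only be given positionally, a test argument that happens to be called label still ends up in kwargs instead of colliding with it. The values below are made up for illustration:

```python
class Case:
    def __init__(self, label=None, /, **kwargs):
        self.label = label
        self.kwargs = kwargs


# "label" as a keyword is treated as an ordinary test argument.
labelled = Case(
    "Strategy B",
    label="printed on the tub",
    expected_revenue=1_350_000,
)

# Without a leading string, the case simply has no label.
unlabelled = Case(expected_revenue=2_400_000)
```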

ckp95 on 24 Sep 2020

Make sure you check out pytest.param which cuts out a lot of the boilerplate you've got https://docs.pytest.org/en/stable/example/parametrize.html#different-options-for-test-ids
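
For readers who have not used it, a minimal sketch of pytest.param with explicit ids, reusing the date example from the top of this thread. The id strings and the test name are chosen here for illustration:

```python
from datetime import datetime, timedelta

import pytest

testdata = [
    pytest.param(
        datetime(2001, 12, 12), datetime(2001, 12, 11), timedelta(1), id="forward"
    ),
    pytest.param(
        datetime(2001, 12, 11), datetime(2001, 12, 12), timedelta(-1), id="backward"
    ),
]


@pytest.mark.parametrize("a,b,expected", testdata)
def test_timedistance_v1(a, b, expected):
    assert a - b == expected
```

Each case keeps its values and its id together in one place, although the argument names are still listed once in the decorator.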

asottile on 24 Sep 2020

👍2
