NamedTuples are currently handled the same way as tuples when Prefect builds its context (and probably at other times), which is incorrect since a NamedTuple can't be initialized like a normal tuple. It needs one argument per defined property instead all of the arguments in a list.
Where the error happens:
https://github.com/PrefectHQ/prefect/blob/cc2f4e16227d9f40c6aeace3bae887cc1fdc90f6/src/prefect/utilities/collections.py#L145-L177
It should initialize them properly using either the splat *[] or using the ._make([]) method of the NamedTuple object, or perhaps just not modify them at all (just return them raw on L177).
I don't know why things have to be transformed into dictionaries so don't know what the right decision is.
from prefect import Flow, Parameter, task
from collections import namedtuple
@task
def printme(x):
print(x)
with Flow('Test') as flow:
myarg = Parameter('myarg')
printme(myarg)
flow.run(myarg=namedtuple('MyObj', ['x', 'y'])(1, 2))
On Line 160:
The removal of the tuple type check will resolve the issue and return the object as expected. I'm not 100% sure why it needs to be transformed to a dict here but I'm under the assumption that it stems from something regarding serialization.
FWIW I removed this check in a local branch and the tests still passed. Feel free to open a PR for the change unless someone has a reason against the change!
Yes that is correct. However, I assume there is an intention to wanting to recursively serialize all lists/sets/tuples. If we can safely remove tuple, shouldn't we be able to safely remove list/set as well? I'm thinking that it might be good to dig up the original idea with this to figure out what the best path forward is :)
Just afraid that there might be test cases missing that would have failed if we removed tuple.
No matter what, I'd happily send a PR once a decision has been made.
The reason for this line is ultimately for taking the following type of situation where elements of the tuple themselves need to be transformed into the appropriate dict class:
from prefect.utilities.collections import DotDict, as_nested_dict
d = DotDict(x=(0, DotDict(inner_dict=9)))
as_nested_dict(d, dct_class=dict) # inner_dict is now a real dict
Out of curiosity could you use Prefect's own DotDict class instead of a named tuple?
I could probably use DotDict as a work-around in the meantime, yes, but long-term I still believe this should be fixed.
Since you still allow non-serialize-able objects to be passed in (e.g random classes) I'm confused as to why you're even trying to break up/deep-serialize lists/tuples/sets to begin with? Are you trying to achieve a best-effort as-much-as-possible serialization? In that case, it seems like the solution is perhaps to do something like if type(obj) in ('list', 'tuple', 'set') instead, since that way you just won't screw with custom types that inherit from those (I'm guessing NamedTuple isn't the only one that would be problematic?). Or is there some Prefect Cloud criteria here that I'm missing?
@fgblomqvist yes, there is a Cloud criteria: currently Parameters are required to be JSON-serializable objects, so any attempt at using namedtuples in Cloud would ultimately result in the type changing to a plain list. Soon we will be relaxing this condition, so that could be the appropriate time to try and fix this as well.
That's what I suspected. In that case, I think the solution I proposed might be the right one. Nonetheless, lmk if you want help with a PR on this once you've made up your mind on how you want to fix it (whether that happens now or later).
Most helpful comment
That's what I suspected. In that case, I think the solution I proposed might be the right one. Nonetheless, lmk if you want help with a PR on this once you've made up your mind on how you want to fix it (whether that happens now or later).