Right now format_args! uses, e.g., ArgumentV1::new(&runtime_data, Debug::fmt) (for {:?}) at runtime, using up two pointers per argument instead of just one (&runtime_data).
With allow_internal_unsafe and #44240, we can place the fn pointers (e.g. Debug::fmt) in (rvalue-promoted) 'static data; the remaining hurdle is how to infer the type of the runtime data.
That is, Debug::fmt is really <_ as Debug>::fmt, and that _ is right now inferred because ArgumentV1::new's signature ties the two types together. If they're separate, we need something new.
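To make the status quo concrete, here is a simplified model of how a constructor like ArgumentV1::new ties the data type to the formatter type. It is a sketch, not the actual libcore definition: Opaque, the field names, the Display impl, and demo are assumptions made for illustration.

```rust
use std::fmt::{self, Debug, Display, Formatter};
use std::mem::{size_of, transmute};

// Hypothetical stand-in for the type-erased argument type.
struct Opaque;

// Simplified model: two pointers per argument, both built at runtime.
struct ArgumentV1<'a> {
    value: &'a Opaque,
    formatter: fn(&Opaque, &mut Formatter<'_>) -> fmt::Result,
}

impl<'a> ArgumentV1<'a> {
    // `T` appears in both parameters, so the `_` in `<_ as Debug>::fmt`
    // is inferred from the type of `x`.
    fn new<T>(x: &'a T, f: fn(&T, &mut Formatter<'_>) -> fmt::Result) -> Self {
        // SAFETY: `&T` and `&Opaque` are both thin pointers, and the
        // formatter is only ever called with the value it was paired with.
        unsafe { ArgumentV1 { value: transmute(x), formatter: transmute(f) } }
    }
}

impl Display for ArgumentV1<'_> {
    fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
        (self.formatter)(self.value, f)
    }
}

// Formats a vector through the erased (data, fn) pair.
fn demo() -> String {
    let data = vec![1, 2, 3];
    ArgumentV1::new(&data, Debug::fmt).to_string()
}

fn main() {
    // One data pointer plus one fn pointer per argument.
    assert_eq!(size_of::<ArgumentV1<'static>>(), 2 * size_of::<usize>());
    println!("{}", demo());
}
```

Once the fn pointer moves into 'static data, there is no longer a single call whose signature mentions T twice, which is why inference needs another hook.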
I propose using the HList pattern (struct HCons<H, T>(H, T); struct HNil; so for 3 elements of types A, B and C you'd have HCons<A, HCons<B, HCons<C, HNil>>>) with #[repr(C)], which would give it a deterministic layout matching that of an array. That is, these two:
&'static HCons<fn(&A), HCons<fn(&B), HCons<fn(&C), HNil>>>
&'static [unsafe fn(*const Opaque); 3]
have the same representation, and the latter can be unsized into a slice. This transformation from HList to array (and then slice) can be performed on top of a safe, rvalue-promoted HCons, which is a necessary requirement for moving the fn pointers into 'static data at all.
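The layout claim can be checked with a small experiment. This is a sketch under assumptions: the functions return usize (the issue's fn(&A) returns nothing) so that the calls are observable, and Opaque, F, ErasedFn, LIST, as_array and demo are names invented here.

```rust
use std::mem::{align_of, size_of};

#[repr(C)]
struct HCons<H, T>(H, T);
#[repr(C)]
struct HNil;

// Hypothetical stand-in for the erased argument type.
struct Opaque;

type F<T> = fn(&T) -> usize;
type ErasedFn = unsafe fn(*const Opaque) -> usize;
type List = HCons<F<u8>, HCons<F<u32>, HCons<F<String>, HNil>>>;

fn read_u8(x: &u8) -> usize { *x as usize }
fn read_u32(x: &u32) -> usize { *x as usize }
fn read_len(x: &String) -> usize { x.len() }

// A safe, fully typed list of fn pointers living in 'static data.
static LIST: List = HCons(
    read_u8 as F<u8>,
    HCons(read_u32 as F<u32>, HCons(read_len as F<String>, HNil)),
);

fn as_array(list: &'static List) -> &'static [ErasedFn; 3] {
    // SAFETY: with #[repr(C)] the three fn pointers sit at consecutive
    // offsets and the zero-sized HNil adds nothing, so the list has the
    // size and alignment of a 3-element array of fn pointers.
    unsafe { &*(list as *const List as *const [ErasedFn; 3]) }
}

// Calls each erased fn pointer with a value of its original type.
fn demo() -> [usize; 3] {
    let fns: &'static [ErasedFn] = as_array(&LIST); // array unsized to a slice
    let (a, b, c) = (7u8, 9u32, String::from("abc"));
    // SAFETY: every pointer passed has the type its function expects.
    unsafe {
        [
            fns[0](&a as *const u8 as *const Opaque),
            fns[1](&b as *const u32 as *const Opaque),
            fns[2](&c as *const String as *const Opaque),
        ]
    }
}

fn main() {
    assert_eq!(size_of::<List>(), size_of::<[ErasedFn; 3]>());
    assert_eq!(align_of::<List>(), align_of::<[ErasedFn; 3]>());
    println!("{:?}", demo());
}
```

The typed list is what keeps the construction safe; only the final reinterpretation and the erased calls need unsafe.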
For inference, we can simply insert some function calls to match up the types, e.g. to infer B we could do fmt::unify_fn_with_data((list.1).0, &b), which would make B into typeof b.
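A minimal sketch of that inference trick follows. Only the name unify_fn_with_data comes from the comment above; its signature, debug_fn, Erased, Opaque and demo are assumptions. The unify call does nothing at runtime; it exists purely so the type checker ties the formatter's type parameter to the data.

```rust
use std::fmt::{self, Debug, Display, Formatter};
use std::mem::transmute;

// Hypothetical stand-in for the erased argument type.
struct Opaque;

type TypedFn<T> = fn(&T, &mut Formatter<'_>) -> fmt::Result;
type ErasedFn = unsafe fn(*const Opaque, &mut Formatter<'_>) -> fmt::Result;

// Produces `<T as Debug>::fmt` for a `T` chosen by inference.
fn debug_fn<T: Debug>() -> TypedFn<T> {
    <T as Debug>::fmt
}

// Assumed shape: an empty function whose signature mentions `T` twice.
fn unify_fn_with_data<T>(_f: TypedFn<T>, _data: &T) {}

// Erased (fn, data) pair, standing in for the separate tables.
struct Erased {
    f: ErasedFn,
    data: *const Opaque,
}

impl Display for Erased {
    fn fmt(&self, out: &mut Formatter<'_>) -> fmt::Result {
        // SAFETY: `demo` pairs `f` only with the data it was unified with,
        // and that data is still alive when this runs.
        unsafe { (self.f)(self.data, out) }
    }
}

fn demo() -> String {
    let b = vec![1u8, 2];
    let f = debug_fn(); // `T` is still an inference variable here
    unify_fn_with_data(f, &b); // pins `T` to `Vec<u8>`
    let erased = Erased {
        f: unsafe { transmute::<TypedFn<_>, ErasedFn>(f) },
        data: &b as *const Vec<u8> as *const Opaque,
    };
    erased.to_string()
}

fn main() {
    println!("{}", demo());
}
```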
It might actually be simpler to have a completely safe "builder" interface, which combines the HList of formatters with an HList of runtime references, unifying the types, but I'm a bit worried about compile times due to all the trait dispatch. In any case, the impact should be measured.
@rustbot claim
> For inference, we can simply insert some function calls to match up the types, e.g. to infer B we could do fmt::unify_fn_with_data((list.1).0, &b), which would make B into typeof b.
Not sure what I was thinking there, it should be much easier than that!
struct ArgMetadata<T: ?Sized> {
    // Only `unsafe` because of the later cast we do from `T` to `Opaque`.
    fmt: unsafe fn(&T, &mut Formatter<'_>) -> Result,
    // ... flags, constant string fragments, etc.
}

// TODO: maybe name this something else to emphasize repr(C)?
#[repr(C)]
struct HCons<T, Rest>(T, Rest);

// This would have to be in a "sealed module" to make it impossible to implement on more types.
trait MetadataFor<D> {
    const LEN: usize;
}

impl MetadataFor<()> for () {
    const LEN: usize = 0;
}

impl<'a, T: ?Sized, D, M> MetadataFor<HCons<&'a T, D>> for HCons<ArgMetadata<T>, M>
    where M: MetadataFor<D>
{
    // One entry for this element plus the length of the rest of the list.
    const LEN: usize = 1 + M::LEN;
}

impl<'a> Arguments<'a> {
    fn new<M, D>(meta: &'a M, data: &'a D) -> Self
        where M: MetadataFor<D>
    {
        Self {
            meta: unsafe { &*(meta as *const _ as *const [ArgMetadata<Opaque>; M::LEN]) },
            data: unsafe { &*(data as *const _ as *const [&Opaque; M::LEN]) },
        }
    }
}
i.e. we build two HLists "in parallel", one with entirely constant metadata, and the other with the references to the runtime data, and then all of the type inference can come from the where clause on fmt::Arguments::new, with zero codegen cruft!
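For experimenting with the parallel-list idea on stable Rust, the following sketch swaps the [_; M::LEN] array casts for slice::from_raw_parts, restricts T to Sized so every reference is a thin pointer, and uses a safe fn pointer in ArgMetadata. Those changes, the Display impl, and demo are assumptions of this sketch, not part of the proposal.

```rust
use std::fmt::{self, Debug, Display, Formatter};
use std::slice;

struct Opaque; // hypothetical erased argument type

#[repr(C)]
struct ArgMetadata<T> {
    fmt: fn(&T, &mut Formatter<'_>) -> fmt::Result,
}

#[repr(C)]
struct HCons<T, Rest>(T, Rest);

trait MetadataFor<D> {
    const LEN: usize;
}
impl MetadataFor<()> for () {
    const LEN: usize = 0;
}
impl<'a, T, D, M: MetadataFor<D>> MetadataFor<HCons<&'a T, D>> for HCons<ArgMetadata<T>, M> {
    const LEN: usize = 1 + M::LEN;
}

struct Arguments<'a> {
    meta: &'a [ArgMetadata<Opaque>],
    data: &'a [&'a Opaque],
}

impl<'a> Arguments<'a> {
    fn new<M: MetadataFor<D>, D>(meta: &'a M, data: &'a D) -> Self {
        if M::LEN == 0 {
            // `&()` need not be pointer-aligned, so skip the cast entirely.
            return Arguments { meta: &[], data: &[] };
        }
        // SAFETY: both lists are #[repr(C)] chains of LEN pointer-sized
        // elements ending in `()`, so each is laid out like an array.
        unsafe {
            Arguments {
                meta: slice::from_raw_parts(meta as *const M as *const ArgMetadata<Opaque>, M::LEN),
                data: slice::from_raw_parts(data as *const D as *const &'a Opaque, M::LEN),
            }
        }
    }
}

impl Display for Arguments<'_> {
    fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
        for (i, (m, d)) in self.meta.iter().zip(self.data.iter()).enumerate() {
            if i > 0 {
                f.write_str(" ")?;
            }
            (m.fmt)(*d, f)?;
        }
        Ok(())
    }
}

fn demo() -> String {
    let (a, b) = (42u32, String::from("hi"));
    // No type annotations: the `MetadataFor` bound picks each `T`.
    let meta = HCons(ArgMetadata { fmt: Display::fmt }, HCons(ArgMetadata { fmt: Debug::fmt }, ()));
    let data = HCons(&a, HCons(&b, ()));
    Arguments::new(&meta, &data).to_string()
}

fn main() {
    println!("{}", demo());
}
```

Note that demo never names u32 or String when building meta; the bound on Arguments::new is the only thing connecting the two lists.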
EDIT: @m-ou-se had to remind me why I went with the explicit inference trick in the first place: random access arguments :disappointed: (maybe with enough const generics abusage we could have D: IndexHList<i, Output = T>, but that's a lot of effort)
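As a reminder of what random access means here: placeholders may name arguments by index, in any order, and the same argument may be formatted through several traits. The metadata entries therefore do not line up one-to-one with the data entries, which is what the parallel-list bound assumes. A small example using only the stable format! macro (demo is a name made up for this sketch):

```rust
fn demo() -> String {
    let (a, b) = (1u8, "x");
    // Argument 1 is used first (Debug); argument 0 is used twice,
    // once with Display and once with LowerHex.
    format!("{1:?} {0} {0:#x}", a, b)
}

fn main() {
    println!("{}", demo());
}
```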
I have a new implementation of fmt::Arguments that is only two pointers in size by using a new form of 'static metadata that contains both the string pieces and the formatting options (if any). (So it now fits in a register pair, which is really nice.) It also only requires one pointer on the stack per argument instead of two, as @eddyb apparently already suggested three years ago in this issue. ^^ (Completely missed this issue until @eddyb pointed it out yesterday. ^^')
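One way a two-pointer layout could look is sketched below. This is a guess for illustration only, not the implementation described above: Metadata, its fields, Opaque, and the choice of a thin pointer to the argument table (with the count kept in the metadata) are all assumptions.

```rust
use std::mem::size_of;

// Hypothetical erased argument type.
struct Opaque;

// Hypothetical 'static metadata: everything known at compile time.
#[allow(dead_code)]
struct Metadata {
    pieces: &'static [&'static str],
    arg_count: usize,
    // ... formatting options and fn pointers would also live here
}

// Two thin pointers: one to the metadata, one to the first of the
// per-argument data pointers on the stack.
#[allow(dead_code)]
struct Arguments<'a> {
    meta: &'static Metadata,
    args: *const &'a Opaque,
}

// Size of the sketch type, measured in pointers.
fn size_in_pointers() -> usize {
    size_of::<Arguments<'static>>() / size_of::<usize>()
}

fn main() {
    assert_eq!(size_in_pointers(), 2);
}
```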
What's still left is updating format_args!() to produce this new type instead, which will run into the problem of splitting the object pointer and function pointer (which are currently together in ArgumentV1), as the function pointer should now go in the 'static metadata instead. The suggestion in this issue looks like a good way to do that. Will try to implement that soon. Looking forward to the perf run :)