Occasionally there are questions on the forums regarding the number types for boundary_id and material_id, and ultimately these questions arise due to the choice of type for these data members. At this moment, setting the value works intuitively
cell->set_material_id(5);
but getting the expected value when outputting to screen requires some additional work
stream << cell->material_id() << std::endl; // Prints nonsense
stream << static_cast<int>(cell->material_id()) << std::endl; // Prints expected value
This little quirk can be frustrating for the novice users and its not really obvious _why_ the behaviour is as it is.
Furthermore, I personally find it quite restrictive to work within the range [0,255] for these two parameters. I typically have a zoo of boundary conditions and material id's that I work with, and I have to devise some strange "logical" categorisation to work within this restriction. I remember in my early days of working with deal.II I got burned by setting material_id to 300 or something like that.
Changing the data type from unsigned char to, for example, unsigned short (range [0,65535]) or unsigned int (like the other types), would alleviate these pains for both myself and, I'm guessing, many of the users. Since the next release is a major one, this would be the right time to make such a change.
Is there any major reason as to why we would not want to change the type for material_id and boundary_id? I understand that memory usage was likely the original reason for keeping these data types as small as possible, but given that we now also store subdomain_ids, manifold_ids, active_fe_indexes etc. associated with each cell, this is probably less of an issue nowadays.
Funny story: on my computer there is no difference in memory usage due to structure packing:
#include <iostream>
template <int vertices_per_cell, typename IDType>
struct CellData
{
unsigned int vertices[vertices_per_cell];
union
{
IDType material_id;
IDType boundary_id;
};
unsigned int manifold_id;
};
int main()
{
std::cout << "sizeof(unsigned int) = " << sizeof(unsigned int) << '\n';
std::cout << "sizeof(unsigned char) = " << sizeof(unsigned char) << '\n';
std::cout << "1D:\n";
std::cout << sizeof(CellData<2, unsigned int>) << '\n';
std::cout << sizeof(CellData<2, unsigned char>) << '\n';
std::cout << "2D:\n";
std::cout << sizeof(CellData<4, unsigned int>) << '\n';
std::cout << sizeof(CellData<4, unsigned char>) << '\n';
std::cout << "3D:\n";
std::cout << sizeof(CellData<8, unsigned int>) << '\n';
std::cout << sizeof(CellData<8, unsigned char>) << '\n';
}
output:
sizeof(unsigned int) = 4
sizeof(unsigned char) = 1
1D:
16
16
2D:
24
24
3D:
40
40
which makes sense since we don't want to put the manifold_id on something that is not already 4-byte aligned, so those bytes are currently just empty.
I recommend full speed ahead!
Update: a correction: 4 bytes, not 8
The comparison by @drwells doesn't really tell us very much other than the fact that a structure that contains a small object is padded to a size that allows for aligned allocation. But we don't store boundary_ids in types that only contain one object of this kind, but rather in arrays of these objects, so they need not be padded -- only the overall array will be padded. (This, by the way, happens in include/deal.II/grid/tria_level.h IIRC.) In other words, memory consumption will go up if we increase the size of these objects.
That said, I'm completely unconcerned about this. In fact, I'm in support of making the suggested change.
Yes. Regardless of that structure and its padding we still have
class TriaObjects
{
// ...
std::vector<BoundaryOrMaterialId> boundary_or_material_id;
// ...
};
which will certainly take up more memory if we switch but I agree that this is such an insignificant performance problem that if it prevents someone from writing one static_cast<int>(cell->material_id()) then it is worth doing.
Great, since this seems to have had a positive response I'll put together a PR in the coming week that implements this change.
I would also prefer to have those two ids in bigger sizes.
In my code I need to colourise a lot of combinations of boundary_ids (multiphysics problem with component-wise mixed boundary conditions). To get my head not in to much workload in applying boundary-colours, I wanted to write something like bnd_con1 + bnd_con2 + ... + bnd_conN.
With std::char it is just working, but any extension (which will come in the future) would need a complicated remapping to use the 255 (-2?) colours.
I do not have any crucial memory issues due to a mpi-approach, but I see the point, that this will lead to higher amount in memory usage.
Looking forward to the PR of @jppelteret, then we can see how much more this would be.
@koecher Thanks for your feedback. You can expect this change to be tackled during the course of the next week :-)