const std = @import("std");

pub fn main() void {
    const foo: [50]u32 = []u32{1} ** 50;
    const a = &foo[20];
    const b = &foo[40];
    std.debug.warn("a={}, b={}\n", a, b);
    std.debug.warn("b-a={}\n", b-a);
}
Fails with:
test.zig:8:33: error: invalid operands to binary expression: '*const u32' and '*const u32'
std.debug.warn("b-a={}\n", b-a);
^
Related to #770 ?
yes I've used that as a workaround. However #770 said that subtraction directly should be defined.
Also I saw this go past in IRC:
15:42:03 | <andrewrk> | begin[0..end - begin]
15:42:44 | <andrewrk> | did I not implement subtraction? if so then begin[0..@ptrToInt(end) - @ptrToInt(begin)]
Your example doesn't work with addition either. Pointer arithmetic only works on certain types. As *const u32 is a pointer to a single u32, it isn't one of those types.
const std = @import("std");
pub fn main() void {
    const foo: [50]u32 = []u32.{1} ** 50;
    const a = @ptrCast([*]const u32, &foo[20]);
    const b = @ptrCast([*]const u32, &foo[40]);
    std.debug.warn("a={}, b={}\n", a, b);
    std.debug.warn("b-a={}\n", b-(@ptrToInt(a)/@sizeOf(u32)));
}
You'll note a few things about this. First, you still can't directly add and subtract two pointers, only a pointer and an int. Second, when performing pointer arithmetic you're working with the size of the child type, so @ptrToInt(b-1) == @ptrToInt(b) - @sizeOf(u32).
Note that you can subtract integers from (unknown length) pointers, and you can add integers to pointers. What does not work is adding pointers to pointers or subtracting pointers from pointers.
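The scaling rule described above can be checked with a small test. This is only a sketch, written in the Zig syntax used elsewhere in this thread (it will not compile unchanged on later compiler versions); the test name is made up for illustration.

```zig
const std = @import("std");
const assert = std.debug.assert;

test "pointer minus integer moves by whole elements" {
    var foo: [50]u32 = []u32{1} ** 50;
    // Slicing and taking .ptr yields an unknown-length pointer ([*]u32),
    // which is the kind of pointer that supports + and - with integers.
    const b = foo[40..].ptr;
    // Stepping back one element moves the address by @sizeOf(u32) bytes.
    assert(@ptrToInt(b - 1) == @ptrToInt(b) - @sizeOf(u32));
    // Stepping back 20 elements lands on element 20 of the array.
    assert(@ptrToInt(b - 20) == @ptrToInt(&foo[20]));
}
```

Both operands here are a pointer and an integer; nothing in the sketch subtracts one pointer from another.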
I'm inclined to leave the behavior as status quo. I think that @ptrToInt(&b) - @ptrToInt(&a) is appropriately verbose, and answers all the questions, without requiring additional documentation, such as:
These questions all go away if we reject this proposal.
I am with @andrewrk: such operations should be explicitly verbose. @ptrToInt will optimize appropriately under the hood, so don't think of it as being much heavier than b-a.
Move to reject -- thanks for the question!
are we subtracting bytes or "objects"?
That's the issue I'm running into here: when I subtract two pointers, I expect the result to be in units of the space one object takes up in an array.
To get the array index from an item pointer and the array start, is this correct?
(@ptrToInt(&item_pointer) - @ptrToInt(&base))/@sizeOf(@typeOf(base[0]))
Are there scenarios where an item will take more than its size?
This is the code where I ran into this issue: https://github.com/daurnimator/zig-timeout-wheel/blob/5fa2c57b2b24e40d5220c1b6fe995bab92b28d8c/timeout_wheel.zig#L215
It turns out there are. We discussed this in IRC, copying it here:
@sizeOf(u24) == 3 but @alignOf(u24) == 4 on x64. This means that @sizeOf([10]u24) == 40, not the 30 that the code above assumes. Also, this violates the documented @alignOf guarantee: The result is a target-specific compile time constant. It is guaranteed to be less than or equal to @sizeOf(T).
I think that @sizeOf(u24) should be 4. Note that if you put a u24 in a struct, it becomes size 4 instead of 3:
const std = @import("std");
test "oaeu" {
    std.debug.warn("u24={}\n", usize(@sizeOf(u24)));
    std.debug.warn("struct={}\n", usize(@sizeOf(struct {
        x: u24,
    })));
}
Test 1/1 oaeu...u24=3
struct=4
OK
@sizeOf should be defined to return the number of bytes you would need to allocate for an array element of that type, for it to be properly aligned. This will give @alignOf the property of being less than or equal to @sizeOf.
As far as I'm aware, u24 (and u17... u23) is the only violation of this right now.
@sizeOf should be defined to return the number of bytes you would need to allocate for an array element of that type, for it to be properly aligned.
Why should a [2]u24 take more than 7 bytes?
e.g. imagine I want to know if I can fit in a u8 while still fitting inside an atomic instruction?
As far as I'm aware, u24 (and u17... u23) is the only violation of this right now.
I'm guessing you'll see this behavior for any type that is between 2 "native" types. For instance:
debug.warn("{},{}\n", usize(@sizeOf(u33)),usize(@alignOf(u33)));
5,8
On x64 that should mean u56 is the upper bound.
Anyway, I agree with andrewrk that @sizeOf should return the number of bytes the type will occupy in memory sans any packing. If you want to know how many bytes it is, you can use std.meta.bitCount(T) / std.meta.bitCount(u8) for primitives. For structs we can just add a meta function to total up all their members' bitCount/8. std.meta.packedSizeOf maybe.
Why should a [2]u24 take more than 7 bytes?
To make the elements aligned properly:
const std = @import("std");
test "[2]u24" {
    var array: [2]u24 = []u24{ 0xaabbcc, 0xddeeff };
    std.debug.warn("size={}\n", usize(@sizeOf([2]u24)));
    std.debug.warn("ptr of index 0 = 0x{x}\n", @ptrToInt(&array[0]));
    std.debug.warn("ptr of index 1 = 0x{x}\n", @ptrToInt(&array[1]));
    std.debug.warn("difference = {}\n", @ptrToInt(&array[1]) - @ptrToInt(&array[0]));
}
Test 1/1 [2]u24...size=8
ptr of index 0 = 0x7ffec0b5b878
ptr of index 1 = 0x7ffec0b5b87c
difference = 4
OK
The processor needs 4 byte alignment, because loads/stores/registers are not actually 24 bits, but 32 bits. If the first element of the array took up 3 bytes, then the element at index 1 would start at 0x7ffec0b5b87b, which violates the required alignment of u24.
Also here is an alternative to @tgschultz's example, without @ptrCast:
const std = @import("std");
pub fn main() void {
    const foo: [50]u32 = []u32{1} ** 50;
    const a = foo[20..].ptr;
    const b = foo[40..].ptr;
    std.debug.warn("a={}, b={}\n", a, b);
    std.debug.warn("b-a={}\n", b - a);
}
Pointer-pointer subtraction should be supported as the inverse function of pointer+scalar addition.
var array: [32]u32 = undefined;
var b: [*]u32 = array[0..].ptr; // base
var i: usize = 5; // index
var e: [*]u32 = b + i; // element address
assert(e - i == b); // inverse of e = b + i, solved for b.
assert(e - b == i); // inverse of e = b + i, solved for i
var e2: *u32 = &array[i];
assert(e2 - b == i); // this should also work
The usecase is: subtract the pointer to the base of an array from a pointer to an element in that array to get the index of the element. There are several restrictions we can assume from this usecase:
- Do the pointer types have to match? The left side (element pointer) is a *T or [*]T and the right side (array base pointer) is a [*]T. const and volatile qualifiers are ignored. align qualifiers have to match (are they even allowed on [*]T pointers?).
- Is the result usize or isize? The answer: usize, because array indexes are usize.
- What if you would get a negative number? That would indicate array[-1], which isn't allowed, or array[0xffffffffffffffff], which is almost certainly a bug.
- Are we subtracting bytes or "objects"? Objects: array[5] offsets the pointer by 5 times the object size, therefore the difference in memory addresses should be divided by the child size.
- What if the byte difference isn't a multiple of the object size? That would be like array[3.14].
There's another usecase we could consider, but I don't know if it's important. The usecase is subtracting the addresses of two elements in an array when you don't know which element has the greater index. This would cause trouble with the "What if you would get a negative number" situation above. I don't think this is a problem in practice, because this usecase is rare and is fraught with peril even if the language supported it in some capacity. I'd be happy to reconsider if shown some real code trying to do this.
Pointer-pointer subtraction should be supported as the inverse function of pointer+scalar addition.
After reconsidering and reading your counter proposal, I agree. I think the usecase of finding array index based on an element pointer and base pointer is valid, as @daurnimator demonstrated above.
Let me consider whether to allow single-item pointers at all. My intuition is that single-item pointers should be forbidden from participating in pointer addition and subtraction without a cast. I note that in the timeout wheel usecase example above, both sides of the subtraction are single-item pointers. However one could make the argument that the pending: ?*TimeoutList, should be changed to pending: ?[*]TimeoutList, since it actually represents a pointer to an element rather than a single item. And then on the other side of the subtraction, self.wheel[0][0] could be changed to self.wheel[0][0..].ptr to get a [*]T.
align qualifiers have to match (are they even allowed on [*]T pointers?).
Yes they are allowed and they work the same as single-item pointers. I think align qualifiers can be ignored when doing pointer subtraction. As a rule of thumb it should be true that anywhere a less-aligned pointer is needed, a more-aligned pointer is accepted.
Pointer math seems necessary for interfacing with existing C code and libraries, but would it be considered a best practice within a pure Zig project? If not, it might be worth requiring a builtin function to use it rather than making it possible via the + and - operators.
I was thinking more or less the same thing, this could all be easily accomplished with a few functions in std. But just because I personally haven't had any particular need for this feature doesn't mean it wouldn't be used often enough in some domain to be part of the language.
The processor needs 4 byte alignment, because loads/stores/registers are not actually 24 bits, but 32 bits.
On what cpu(s)? 24 bit architectures exist e.g. eZ80.
On what cpu(s)? 24 bit architectures exist e.g. eZ80.
On such an architecture, a u24 would have alignment 1 (or alignment 3?), and then @sizeOf(u24) would be 3, so everything works. Above I was referring to an architecture where u24 has alignment 4, such as x86_64. What I'm proposing here, in addition to @thejoshwolfe's proposal + my edits, is to add a check in the implementation of @sizeOf so that if it would be less than @alignOf then it returns @alignOf instead.
~Consider this problem, without this proposal:~
const std = @import("std");
test "@sizeOf(u24) is the store size of a u24" {
    var x: u24 = 0xaabbcc;
    const ptr = @ptrCast([*]u8, &x);
    ptr[@sizeOf(u24)] = 0x99;
    std.debug.assert(x == 0xaabbcc);
}
I was going to cite this test as a problem, but it actually passes. LLVM generates code like this:
0000000000000010 <entry>:
10: 55 push %rbp
11: 48 89 e5 mov %rsp,%rbp
14: c6 45 fe aa movb $0xaa,-0x2(%rbp)
18: 66 c7 45 fc cc bb movw $0xbbcc,-0x4(%rbp)
1e: 48 8d 45 fc lea -0x4(%rbp),%rax
22: 48 89 45 f0 mov %rax,-0x10(%rbp)
26: 48 8b 45 f0 mov -0x10(%rbp),%rax
2a: c6 40 03 99 movb $0x99,0x3(%rax)
2e: 0f b6 4d fe movzbl -0x2(%rbp),%ecx
32: c1 e1 10 shl $0x10,%ecx
35: 0f b7 55 fc movzwl -0x4(%rbp),%edx
39: 09 ca or %ecx,%edx
3b: 89 d0 mov %edx,%eax
3d: 5d pop %rbp
3e: c3 retq
You can see that the load actually only reads 3 bytes, and the store only writes 3 bytes as well.
So the size 3 only becomes a problem when it's in an array, where the actual formula is max(size_of, align_of).
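As a rough illustration of that formula, the sketch below computes max(@sizeOf(T), @alignOf(T)) and compares it with the measured distance between two neighbouring u24 elements. `elementStride` is a hypothetical helper, not something in std, and the code uses the Zig syntax of this thread. The assertion is expected to hold on x86_64, matching the "difference = 4" output shown earlier; other targets are not covered here.

```zig
const std = @import("std");
const assert = std.debug.assert;

// Hypothetical helper (not part of std): the per-element distance in an
// array, taken as max(@sizeOf(T), @alignOf(T)) as described above.
fn elementStride(comptime T: type) usize {
    const size: usize = @sizeOf(T);
    const alignment: usize = @alignOf(T);
    return if (size < alignment) alignment else size;
}

test "stride matches the distance between neighbouring elements" {
    var array: [2]u24 = []u24{ 0xaabbcc, 0xddeeff };
    // Measured byte distance between element 0 and element 1.
    const diff = @ptrToInt(&array[1]) - @ptrToInt(&array[0]);
    assert(diff == elementStride(u24));
}
```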
I can't draw a conclusion yet, but I have to go, so I'm posting this comment in its current form and I will follow up later.
On such an architecture, a u24 would have alignment 1 (or alignment 3?), and then @sizeOf(u24) would be 3, so everything works.
True. Yet something about @sizeOf() being different per architecture doesn't sit well with me.
Above I was referring to an architecture where u24 has alignment 4, such as x86_64.
Note that modern x86_64 doesn't really have any penalties for unaligned accesses. See e.g. https://stackoverflow.com/a/45116730/282536
It's not clear to me what "alignment" is meant to mean: is it meant to be access requirements? or is it meant to be the size of the accessed data? or is it register size?
Such operation should be implemented in stdlib. Possible implementation:
fn getIndex(array: [*]T, element: *T) usize {
    return (@ptrToInt(element) - @ptrToInt(array)) / @sizeOf(T);
}
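For reference, here is a self-contained variant of that idea that names the element type explicitly, since T has to be declared somewhere. It is a sketch in the Zig syntax of this thread, not an existing std function; the explicit `comptime T: type` parameter and the const-qualified pointer parameters are assumptions made for illustration.

```zig
const std = @import("std");
const assert = std.debug.assert;

// Hypothetical helper: index of `element` within the array starting at `base`.
// T is passed explicitly so that the element size is known.
fn getIndex(comptime T: type, base: [*]const T, element: *const T) usize {
    return (@ptrToInt(element) - @ptrToInt(base)) / @sizeOf(T);
}

test "getIndex recovers the array index" {
    var array: [32]u32 = undefined;
    const base = array[0..].ptr; // [*]u32, converts to [*]const u32
    assert(getIndex(u32, base, &array[5]) == 5);
    assert(getIndex(u32, base, &array[0]) == 0);
}
```

Dividing by @sizeOf(T) is only right when the size equals the distance between consecutive elements, which is true for u32; the u24 case discussed earlier in the thread is exactly where the two can differ.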
Maybe even remove pointer +/- integer and implement it in stdlib. For example:
var array: [32]u32 = undefined;
var b: [*]u32 = array[0..].ptr; // base
var i: usize = 5; // index
var e: [*]u32 = std.ptrmath.offsetForward(b, i); // element address
assert(std.ptrmath.offsetBackward(i, e) == b); // inverse of e = b + i, solved for b.
assert(std.ptrmath.getIndex(b, e) == i); // inverse of e = b + i, solved for i
var e2: *u32 = &array[i];
assert(e2 - b == i); // this should also work
Another idea is _fn getRelativeIndex(element: *T, element2: *T) isize;_
Note that modern x86_64 doesn't really have any penalties for unaligned accesses.
It'd still be an issue on architectures like ARM and PPC/POWER, where unaligned access can trap.
Before you go and try that: you might never see an actual trap; the kernel fixes them up unless you disable that with a prctl(). They're quite literally 1,000x slower than aligned accesses, however.
(PPC is interestingly quirky in that the high-end chips normally support unaligned loads and stores, whereas the low-end chips - what sits in your toaster or microwave - do not. A function of restricted die area.)
Note that modern x86_64 doesn't really have any penalties for unaligned accesses. See e.g. https://stackoverflow.com/a/45116730/282536
It's not clear to me what "alignment" is meant to mean: is it meant to be access requirements? or is it meant to be the size of the accessed data? or is it register size?
I don't think your conclusion follows from the citation you gave. Here's a breakdown of the benchmark that it cites: https://stackoverflow.com/a/45129784/432
Alignment refers to the virtual memory address where the data is stored. Alignment can be specified in Zig in two places: in variable declarations and functions, where it guarantees that the data will be stored at an address with at least the requested alignment, and in pointers, where it is a type system feature that lets code specify minimum alignment requirements of pointers.
I don't think your conclusion follows from the citation you gave. Here's a breakdown of the benchmark that it cites: https://stackoverflow.com/a/45129784/432
I did read through that. As far as I understand it there is only a penalty when you go across cache lines.