OpenBSD takes the prize for the hackiest implementation of std.fs.selfExePath, a function which is readily available on Linux, Darwin/macOS, FreeBSD, DragonFlyBSD, NetBSD, and Microsoft Windows.
https://github.com/ziglang/zig/blob/71ac5b151524288562bb78d9b0924bb3b0ba5e1c/lib/std/fs.zig#L2235-L2271
Let's put some friendly pressure on the OpenBSD project to improve this use case.
If I put on my OpenBSD core-developer hat, I would say it will be complex 😃 but I am open to discussing it.
Let me try to explain the problem from the kernel's point of view. The kernel should only provide information which is accurate; otherwise it could lead to obscure errors or eventually security issues. Having an interface (syscall or sysctl entry) for retrieving the pathname of the currently running executable which is accurate all the time is really complex, so OpenBSD prefers to provide no interface rather than an interface which could return an erroneous result.
Even Linux provides only a partial solution (to my knowledge). For example, the zig source code has a comment saying readlink(2) will return garbage if the file is deleted (and there is no code handling that case, not even a panic).
Other OS implementations behave differently in such "ill" cases, for example returning the pathname used at execve(2) time even if it now points to a different file. It is an easy footgun.
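As a minimal sketch of the Linux case mentioned above: on Linux, the link target of /proc/self/exe has the string " (deleted)" appended once the file has been unlinked, which is what a caller sees as "garbage". The helper names below are hypothetical and this is not how zig handles it; note also that a file whose real name ends in " (deleted)" cannot be told apart this way, which is part of why the result is only a hint.

```python
import os

# Suffix Linux appends to the link target when the executable was unlinked.
DELETED_SUFFIX = " (deleted)"


def classify_exe_link(target):
    """Split a /proc/self/exe link target into (path, looks_deleted)."""
    if target.endswith(DELETED_SUFFIX):
        return target[: -len(DELETED_SUFFIX)], True
    return target, False


def self_exe_path_linux():
    """Linux-only: read the link target of /proc/self/exe and classify it."""
    return classify_exe_link(os.readlink("/proc/self/exe"))
```

A caller could then refuse to use the path (or fall back to something else) when the second element is true, instead of passing the raw string on.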
This leads to a second question: what is such a path needed for? Getting a pathname is per se asking for trouble: the pathname could be out of date as soon as it is retrieved, even if the kernel takes care of all the possible problems (see TOCTOU).
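To make the TOCTOU point concrete, here is a small sketch that simulates a path being re-pointed between the moment it is retrieved and the moment it is used. It uses a temporary file as a stand-in for the executable, and the function names are made up for illustration. The check is done on the opened descriptor (device and inode), not on the name, because only the descriptor stays bound to one file.

```python
import os
import tempfile


def file_identity(path):
    """Return (device, inode) for whatever file the path names right now."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


def open_if_unchanged(path, expected_identity):
    """Open path, then verify on the descriptor that it is the expected file.

    Returns an open descriptor, or None if the path now names another file.
    """
    fd = os.open(path, os.O_RDONLY)
    st = os.fstat(fd)
    if (st.st_dev, st.st_ino) != expected_identity:
        os.close(fd)
        return None
    return fd


def demo_race():
    """Return True if a swap between check and use is detected."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "tool")
        with open(path, "w") as f:
            f.write("original")
        identity = file_identity(path)  # time of check

        replacement = os.path.join(d, "tool.new")
        with open(replacement, "w") as f:
            f.write("replacement")
        os.replace(replacement, path)  # the same name now refers to a new file

        fd = open_if_unchanged(path, identity)  # time of use
        if fd is not None:
            os.close(fd)
        return fd is None
```

The pathname string is identical before and after the swap, so any code that only carries the string around has no way to notice the change.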
To cite an example, Rust's env::current_exe() has been discussed a bit regarding this kind of problem, and several actions were taken:
- env::current_exe() which were dangerous: Don't use env::current_exe with libbacktrace
Regarding zig's standard library, selfExePath() is used for several things. Here are all the entry points that can result in a call to selfExePath() (which could return a wrong/racy result):
- selfExePath()
- openSelfExe() - on !linux and !windows platforms
- selfExeDirPath()
- DebugInfo.getModuleForAddress() (via lookupModuleDl())
- printSourceAtAddress()
- dumpStackTraceFromBase(), writeStackTrace(), ...
Which makes me think that every zig binary could potentially call such a racy function to do the complex task of parsing a binary, whereas not all OSes provide strong guarantees on the path quality. But I am unsure in which cases a stack trace is printed (instead of an error return path).
Regarding the zig compiler (the binary), it uses selfExePath() exclusively:
- zig in cmdBuild()
In both cases, the "traditional" way (when an application is compiled from source) is to use a path provided at compile time. But I agree it doesn't work for binary distributions where the installation directory isn't known at compile time.
If the returned path is wrong, it would mean building an executable with the "wrong" std, or executing the "wrong" binary. But I agree that such a problem is more theoretical than practical.
Now, returning to the original question of having such an interface in the OpenBSD kernel: I hope I have explained correctly why the kernel should not provide a possibly inaccurate pathname. Eventually, an interface which returns a descriptor (like for openSelfExe()) could be looked at, but the last time it was discussed there were problems regarding how to provide such a descriptor without letting restricted programs gain too much information (because if any program could easily read the current executable, it could gain information on possible gadgets and their relative positions, for example).
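For comparison, this is roughly what a descriptor-based approach looks like where the OS already offers one. The sketch assumes a Linux-style /proc/self/exe, where opening the entry yields the running image itself and never goes through a stored pathname; it says nothing about what an OpenBSD interface would look like, and the function names are invented for the example. Run under Python, the "running executable" is the interpreter binary.

```python
import os


def open_self_exe():
    """Return a read-only descriptor for the running executable, or None.

    Assumption: a Linux-style /proc/self/exe is available. On systems
    without it, the open fails and None is returned.
    """
    try:
        return os.open("/proc/self/exe", os.O_RDONLY)
    except OSError:
        return None


def read_self_exe_header(size=4):
    """Read the first bytes of the running executable via the descriptor."""
    fd = open_self_exe()
    if fd is None:
        return None
    try:
        return os.read(fd, size)
    finally:
        os.close(fd)
```

This also illustrates the concern raised above: once a process holds such a descriptor it can read its whole image, which is exactly the information a restricted program is not supposed to get cheaply.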
I don't understand any concerns with letting an executable read its own data.
The naive, worst-case solution could always be to embed a copy of the entire (remaining) binary executable within some read-only data segment. That burdens the linker and doubles the file size, but there is no (realistic) way of an OS ever inhibiting this.