According to #40374, some private information is leaked through the path strings embedded in compiled binaries. To protect privacy while still helping debugging, I think we can have rustc mangle these paths. Here is my solution:
Basically, we can hide the 'insignificant' part of the path (which usually contains private and/or unrelated info) and leave the 'significant' part untouched. So what is the 'significant' part of a path? Here is an example. Assume that we have a project foobar in the user's home directory (here I use a Windows path; on *nix things work similarly):
C:\Users\username\Documents\foobar
In this case, the useful part is the crate name and everything after it, i.e.
\foobar\lib.rs
Assuming we have a mod called 'somemod', after mangling the new path looks like:
[crate]\foobar\somemod\mod.rs
which not only preserves the relationship between source files, but also protects the user's privacy (since the user name and the absolute path no longer appear)!
The next question is how to process all paths under this rule. From what I know, all compiled code of a crate comes from 4 sources: crates.io, a git repository, a local path, or the crate itself.
We could specify a different root name for each of these sources to indicate its origin:
[crates.io]. Example: C:\Users\username\.cargo\registry\src\github.com-1ecc6299db9ec823\winapi-0.2.8\ will be mangled to [crates.io]\winapi-0.2.8\
[username@git server name]. Example: https://github.com/rust-lang/rust will be mangled to [rust-lang@github.com]/rust
[local]. Example: D:\workspaces\foobarng\ will be mangled to [local]\foobarng\
[crate]. Example: C:\Users\username\Documents\foobar\ will be mangled to [crate]\foobar\
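The root-name rule above can be sketched as a plain prefix substitution. This is only an illustration of the proposal, not anything rustc implements: the `Root` struct, the `mangle` function, and the idea that the caller supplies a table of known prefixes are all assumptions made for the example.

```rust
/// Hypothetical pairing of a root label (e.g. "[crate]") with the
/// absolute prefix it stands in for.
struct Root {
    label: &'static str,
    prefix: &'static str,
}

/// Replace the first matching absolute prefix with its label and keep
/// the rest of the path untouched. Paths that match no root are
/// returned unchanged.
fn mangle(path: &str, roots: &[Root]) -> String {
    for root in roots {
        if let Some(rest) = path.strip_prefix(root.prefix) {
            return format!("{}{}", root.label, rest);
        }
    }
    path.to_string()
}
```

With a root of `[crate]` for `C:\Users\username\Documents`, the path `C:\Users\username\Documents\foobar\somemod\mod.rs` comes out as `[crate]\foobar\somemod\mod.rs`, matching the example in the proposal.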
To help debuggers, paths in debugging information won't be modified. Users can therefore still locate the source while debugging, and need not worry about leaking private information (they just need to strip debugging information before packaging on *nix, or not distribute .pdb files on Windows).
Related to #38322 & #39130 (which are for debuginfo). Likely makes sense to use the same mapping mechanism for filenames in panics too.
(edit: linking to #40492 as it is the PR for #40374)
We may need to record path mappings in a file if we also want to process paths in debug information.
--remap-path-prefix will also remap panic messages.
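For context, here is a small sketch of where such path strings come from in ordinary Rust code. The function names are made up for illustration. `file!()` and `std::panic::Location` are standard, and both yield the source path as the compiler recorded it, which is the string that ends up in the binary and in panic messages.

```rust
use std::panic::Location;

/// Path of this source file as the compiler recorded it at build time.
fn compiled_in_path() -> &'static str {
    file!()
}

/// Path of the caller's source file; a panic message reports a location
/// of this kind.
#[track_caller]
fn caller_path() -> &'static str {
    Location::caller().file()
}
```

Printing either value from a binary built in a home directory shows whatever path the compiler was given. When a remapping applies, the remapped form is what gets stored instead.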
Seems like we now have a working and stable solution. Closed.
So what's the current solution to not include the path in the exe (when building with cargo)?
Try something like RUSTFLAGS=--remap-path-prefix=<your-src-dir>=src cargo build.
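As a rough mental model of what a FROM=TO mapping does, the sketch below rewrites a path whose leading components match FROM. It is not rustc's implementation. The function name is hypothetical, and letting the last matching mapping win is an assumption about how repeated options are resolved.

```rust
use std::path::{Path, PathBuf};

/// Rewrite `path` using the last mapping whose `from` prefix matches
/// (assumption: later mappings take precedence over earlier ones).
/// Matching is done on whole path components, not raw string prefixes.
fn remap_path_prefix(path: &Path, mappings: &[(PathBuf, PathBuf)]) -> PathBuf {
    for (from, to) in mappings.iter().rev() {
        if let Ok(rest) = path.strip_prefix(from) {
            // Avoid a trailing separator when the path equals the prefix.
            if rest.as_os_str().is_empty() {
                return to.clone();
            }
            return to.join(rest);
        }
    }
    // No mapping applies: the path is left as it was.
    path.to_path_buf()
}
```

For a hypothetical source directory `/home/alice/proj/src` mapped to `src`, the path `/home/alice/proj/src/main.rs` becomes `src/main.rs`. Paths outside every mapped prefix, for example dependencies under the Cargo registry, stay absolute unless they get a mapping of their own.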
Can we reopen this to track having rustc do this by default? You shouldn't have to know both that rustc does this and this obscure mechanism for changing it to protect privacy. It should just do it.
I think this would need an RFC to come up with a solid solution that doesn't break debugging (which relies on these paths being contained in debuginfo). cc @rust-lang/core
@michaelwoerister But it should be stripped automatically from --release builds!
An RFC would need to cover both debug and release builds.
@michaelwoerister This didn't work for me: a username and full path were still present, and yes, I also used the --release flag. I also second @Boscop in that this should be automatic for release builds. Why does the RFC need to include debug builds when we are talking about release specifically? @jimmycuadra is also right: normal users shouldn't have to know this exists or be expected to specify the options manually every time, since it's so obscure.
If it helps: including user IDs like this violates GDPR (https://gdpr.eu/eu-gdpr-personal-data/), so this should be addressed by the Rust team. In 2020 people care about privacy, and this can be a put-off, as in https://github.com/rust-lang/mdBook/issues/847, where people actively moved away from the project over its disrespect for user privacy.
cc @sneak & @aral who might also have some words about this
To be honest, I expected that release binaries do not contain information like this. Would this be an option to cut the string in these cases?
@dns2utf8 Not sure if part of the message got dropped. What is the option you are suggesting? The provided RUSTFLAGS don't actually work. It seems the Rust team thinks this is acceptable for release binaries for some reason. Perhaps more attention on the issue may help, given the push for privacy in 2020.
Unrelated, but this also causes an issue with reproducible builds, since the strings and usernames will differ from person to person.
It seems to me that two different people building two identical programs with identical build args on different (but same-architecture) systems should receive the same output binary regardless of the name in $USER or the string in $HOME, for all build types (but especially release).
I'm not going to weigh in on the privacy issue (I think that if you're privacy conscious, $USER should be anonymous or user or something already), just the principle of least astonishment: what I would expect, not knowing the tooling, is that the same thing built on different systems would result in the same output. Deviation from this would surprise me, given what I know about the generally excellent caliber of Rust stewardship (and the well-known gargantuan task of stripping out this unnecessarily nondeterministic stuff from other distros/packages in the pursuit of deterministic builds). Tool designers should probably not be throwing more rocks into their path.
This probably means stripping any mention of local environment, build time, and file paths before the root/prefix of the build.