udocker is a basic user tool to execute simple docker containers in user space without requiring root privileges.
It is the only means to deploy containerized jupyter+X on our supercomputers.
podman, docker and singularity cannot be used (subuid/guid issues…) on the server nodes.
Current Julia versions (including the nightly build) fail to run within a udocker container. The container OSes tested are ubuntu (18.04) and centos:latest.
Older Julia versions, such as 1.0.5 and earlier, work perfectly.
Tests with podman or docker are successful for all versions of Julia.
When I execute julia inside a udocker container I get the following error message (the same under ubuntu and centos):
./julia-1.3.1/bin/julia
ERROR: IOError: stat: permission denied (EACCES) for file "/root/julia-1.3.1/bin/../etc/julia/startup.jl"
Stacktrace:
[1] stat(::String) at ./stat.jl:69
[2] isfile at ./stat.jl:311 [inlined]
[3] load_julia_startup() at ./client.jl:314
[4] exec_options(::Base.JLOptions) at ./client.jl:258
[5] _start() at ./client.jl:460
curl https://raw.githubusercontent.com/indigo-dc/udocker/devel/udocker.py > udocker
chmod u+rx ./udocker
./udocker install
export PROOT_NO_SECCOMP=1
udocker pull ubuntu
udocker create --name=ubuntu ubuntu
udocker run --user=root --env="HOME=/root" --workdir="/root" ubuntu
Within the ubuntu container run:
apt update && apt install wget
wget https://julialang-s3.julialang.org/bin/linux/x64/1.3/julia-1.3.1-linux-x86_64.tar.gz
tar xvzf julia-1.3.1-linux-x86_64.tar.gz
./julia-1.3.1/bin/julia
What happens if you start Julia with the flag --startup-file=no?
For example: (inside the container)
./julia-1.3.1/bin/julia --startup-file=no
./julia-1.3.1/bin/julia --startup-file=no
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.3.1 (2019-12-30)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |
ERROR: IOError: stat: permission denied (EACCES) for file "/root/.julia/logs"
Stacktrace:
[1] stat(::String) at ./stat.jl:69
[2] isdir at ./stat.jl:311 [inlined]
[3] #mkpath#8(::UInt16, ::typeof(mkpath), ::String) at ./file.jl:217
[4] mkpath at ./file.jl:215 [inlined]
[5] setup_interface(::REPL.LineEditREPL, ::Bool, ::Any) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/REPL/src/REPL.jl:859
[6] #setup_interface#45(::Bool, ::Any, ::typeof(REPL.setup_interface), ::REPL.LineEditREPL) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/REPL/src/REPL.jl:769
[7] setup_interface at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/REPL/src/REPL.jl:769 [inlined]
[8] (::Pkg.var"#1#2")(::REPL.LineEditREPL) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Pkg/src/Pkg.jl:432
[9] __atreplinit(::REPL.LineEditREPL) at ./client.jl:338
[10] #invokelatest#1 at ./essentials.jl:709 [inlined]
[11] invokelatest at ./essentials.jl:708 [inlined]
[12] _atreplinit at ./client.jl:345 [inlined]
[13] (::Base.var"#770#772"{Bool,Bool,Bool,Bool})(::Module) at ./client.jl:381
[14] #invokelatest#1 at ./essentials.jl:709 [inlined]
[15] invokelatest at ./essentials.jl:708 [inlined]
[16] run_main_repl(::Bool, ::Bool, ::Bool, ::Bool, ::Bool) at ./client.jl:366
[17] exec_options(::Base.JLOptions) at ./client.jl:304
[18] _start() at ./client.jl:460
[ Info: Disabling history file for this session
Alright let’s try this: (inside the container)
./julia-1.3.1/bin/julia --startup-file=no --history-file=no
It's a bit odd for stat to throw. Usually that only happens when the permissions are really odd. Can you show the permissions, as seen inside the container, for the /root directory, the /root/.julia directory and the /root/julia-1.3.1/bin/../etc/julia directory?
./julia-1.3.1/bin/julia --startup-file=no --history-file=no
works without errors, but nothing can be installed:
julia> using Pkg
ERROR: IOError: stat: permission denied (EACCES) for file "/root/.julia/environments/v1.3"
Stacktrace:
[1] stat(::String) at ./stat.jl:69
[2] isdir at ./stat.jl:311 [inlined]
[3] load_path_expand(::String) at ./initdefs.jl:241
[4] load_path() at ./initdefs.jl:288
[5] identify_package(::String) at ./loading.jl:219
[6] identify_package(::Base.PkgId, ::String) at ./loading.jl:206
[7] identify_package at ./loading.jl:200 [inlined]
[8] require(::Module, ::Symbol) at ./loading.jl:882
No /root/.julia directory is created.
ls -la
total 180352
drwxr-xr-x 5 root root 4096 Feb 29 16:51 .
drwxr-xr-x 21 root root 4096 Feb 29 16:43 ..
-rw------- 1 root root 1460 Feb 29 16:52 .bash_history
-rw-r--r-- 1 root root 3106 Apr 9 2018 .bashrc
drwxr-xr-x 3 root root 4096 Feb 29 16:50 .local
-rw-r--r-- 1 root root 148 Aug 17 2015 .profile
-rw-r--r-- 1 root root 248 Feb 29 16:50 .wget-hsts
drwxr-xr-x 7 root root 4096 Sep 9 21:08 julia-1.0.5
-rw-r--r-- 1 root root 88706549 Sep 11 21:22 julia-1.0.5-linux-x86_64.tar.gz
drwxr-xr-x 8 root root 4096 Dec 30 22:12 julia-1.3.1
-rw-r--r-- 1 root root 95929584 Dec 31 00:20 julia-1.3.1-linux-x86_64.tar.gz
ls -la /root/julia-1.3.1/bin/../etc/julia
total 12
drwxr-xr-x 2 root root 4096 Dec 30 22:12 .
drwxr-xr-x 3 root root 4096 Dec 30 22:12 ..
-rw-r--r-- 1 root root 162 Dec 30 22:12 startup.jl
With ./julia-1.0.5/bin/julia everything works as expected.
First, try this: (inside the container)
export JULIA_DEPOT_PATH="$HOME/.julia:"
./julia-1.3.1/bin/julia --startup-file=no --history-file=no
If that does not work, then try this instead: (inside the container)
export JULIA_DEPOT_PATH="$HOME/.julia"
./julia-1.3.1/bin/julia --startup-file=no --history-file=no
I have a few additional questions:
- What are the permissions of the / directory?
- What is the output of whoami, groups, and id inside the container?
- Can you start ./julia-1.3.1/bin/julia --startup-file=no --history-file=no, and then run the following commands in the Julia REPL and post the results:
julia> @show homedir()
julia> @show Base.DEPOT_PATH
julia> versioninfo(stdout; verbose = true)
- What happens if you delete your ~/.udocker directory (on the host machine) and then try again to reproduce?
export JULIA_DEPOT_PATH="$HOME/.julia:"
./julia-1.3.1/bin/julia --startup-file=no --history-file=no
and
export JULIA_DEPOT_PATH="$HOME/.julia"
./julia-1.3.1/bin/julia --startup-file=no --history-file=no
both lead to ERROR: IOError: stat: permission denied (EACCES) for file "/root/.julia/environments/v1.3" when trying to run using Pkg.
Permissions of /
ls -la /
total 72
drwxr-xr-x 21 root root 4096 Mar 1 08:15 .
drwxr-xr-x 21 root root 4096 Mar 1 08:15 ..
drwxr-xr-x 2 root root 4096 Feb 19 01:17 bin
drwxr-xr-x 2 root root 4096 Apr 24 2018 boot
drwxr-xr-x 19 root root 4400 Mar 1 07:52 dev
drwxr-xr-x 31 root root 4096 Mar 1 08:16 etc
drwxr-xr-x 2 root root 4096 Apr 24 2018 home
drwxr-xr-x 9 root root 4096 Mar 1 08:15 lib
drwxr-xr-x 2 root root 4096 Feb 19 01:15 lib64
drwxr-xr-x 2 root root 4096 Feb 19 01:14 media
drwxr-xr-x 2 root root 4096 Feb 19 01:14 mnt
drwxr-xr-x 2 root root 4096 Feb 19 01:14 opt
dr-xr-xr-x 298 root root 0 Mar 1 07:50 proc
drwx------ 3 root root 4096 Mar 1 08:16 root
drwxr-xr-x 5 root root 4096 Feb 21 22:20 run
drwxr-xr-x 2 root root 4096 Feb 21 22:20 sbin
drwxr-xr-x 2 root root 4096 Feb 19 01:14 srv
dr-xr-xr-x 13 root root 0 Mar 1 07:50 sys
drwxr-xr-x 2 root root 4096 Mar 1 08:16 tmp
drwxr-xr-x 10 root root 4096 Feb 19 01:14 usr
drwxr-xr-x 11 root root 4096 Feb 19 01:17 var
root@desktop:~# whoami
root
root@desktop:~# groups
root adm cdrom sudo audio dip video plugdev G111 G114 G121 G127 G133 G134 G1000 G1002
root@desktop:~# id
uid=0(root) gid=0(root) groups=0(root),4(adm),24(cdrom),27(sudo),29(audio),30(dip),44(video),46(plugdev),111(G111),114(G114),121(G121),127(G127),133(G133),134(G134),1000(G1000),1002(G1002)
and finally:
julia> @show homedir()
homedir() = "/root"
"/root"
julia> @show Base.DEPOT_PATH
Base.DEPOT_PATH = ["/root/.julia", "/root/julia-1.3.1/local/share/julia", "/root/julia-1.3.1/share/julia"]
3-element Array{String,1}:
"/root/.julia"
"/root/julia-1.3.1/local/share/julia"
"/root/julia-1.3.1/share/julia"
julia> versioninfo(stdout; verbose = true)
Julia Version 1.3.1
Commit 2d5741174c (2019-12-30 21:36 UTC)
Platform Info:
OS: Linux (x86_64-pc-linux-gnu)
uname: Linux 5.4.0-16-generic #19-Ubuntu SMP Wed Feb 26 18:35:11 UTC 2020 x86_64 x86_64
CPU: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz:
speed user nice sys idle irq
#1 3800 MHz 5963 s 62 s 1828 s 220891 s 0 s
#2 3800 MHz 6187 s 56 s 2048 s 215920 s 0 s
#3 3801 MHz 6060 s 214 s 1964 s 219314 s 0 s
#4 3800 MHz 6131 s 154 s 2024 s 219160 s 0 s
#5 3808 MHz 6243 s 84 s 1872 s 220415 s 0 s
#6 3808 MHz 5916 s 3 s 1852 s 221188 s 0 s
#7 3800 MHz 5134 s 32 s 1829 s 221457 s 0 s
#8 3800 MHz 5726 s 19 s 1886 s 220930 s 0 s
Memory: 15.52651596069336 GB (10002.5703125 MB free)
Uptime: 2303.0 sec
Load Avg: 0.3193359375 0.29736328125 0.34814453125
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-6.0.1 (ORCJIT, skylake)
Environment:
JULIA_DEPOT_PATH = /root/.julia:
JULIA_DEPOT_PATH = /root/.julia:
HOME = /root
TERM = xterm-256color
PATH = /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
Can you run println(mktempdir()) in the Julia REPL and post the results?
./julia-1.3.1/bin/julia --startup-file=no --history-file=no
julia> println(mktempdir())
/tmp/jl_5ejX2C
In the Julia REPL, try the following:
```julia
julia> rm("/root/.julia"; force = true, recursive = true)
julia> mkpath("/root/.julia")
julia> open("/root/.julia/foo.txt", "w") do io
           println(io, "hello world")
       end
julia> println(read("/root/.julia/foo.txt", String))
julia> println(realpath("/root/.julia/foo.txt"))
```
Also:
- What happens if you delete your ~/.udocker directory (on the host machine) and then try again to reproduce?
Oh, and also:
julia> Base.stat("/")
julia> Base.stat("/root")
julia> Base.stat("/tmp")
Also, is there any chance that your host system is running SELinux?
julia> rm("/root/.julia"; force = true, recursive = true)
There is no .julia dir, so I can't delete it
julia> mkpath("/root/.julia")
ERROR: IOError: stat: permission denied (EACCES) for file "/root/.julia"
Stacktrace:
[1] stat(::String) at ./stat.jl:69
[2] isdir at ./stat.jl:311 [inlined]
[3] #mkpath#8(::UInt16, ::typeof(mkpath), ::String) at ./file.jl:217
[4] mkpath(::String) at ./file.jl:215
[5] top-level scope at REPL[2]:1
No ~/.julia is created.
When I manually create it with mkdir ~/.julia, file creation from Julia then works!
julia> open("/root/.julia/foo.txt", "w") do io
println(io, "hello world")
end
julia> println(read("/root/.julia/foo.txt", String))
hello world
julia> println(realpath("/root/.julia/foo.txt"))
/root/.julia/foo.txt
julia> Base.stat("/")
StatStruct(mode=0o040755, size=4096)
julia> Base.stat("/root")
StatStruct(mode=0o040700, size=4096)
julia> Base.stat("/tmp")
StatStruct(mode=0o041777, size=2957312)
I always start with a rm -rf ~/udocker ;)
SELinux…I don't know. Tested so far on vanilla Ubuntu 19.10 and 20.04 hosts.
Update:
On Red Hat Enterprise Linux Server release 7.7 host everything works as expected with ubuntu container.
> There is no .julia dir, so I can't delete it
You can run that command in Julia whether or not the file/directory exists. In Julia, if you run the rm function with force = true, it won’t throw an error if the file/directory doesn’t exist.
> SELinux…I don't know.
You can run the following commands in bash to see if SELinux is enabled:
getenforce
sestatus
This seems like a bug in one of the following:
- Base.Filesystem.mkpath in Julia
- Base.Filesystem.isdir in Julia
- Base.Filesystem.stat in Julia
- jl_stat in Julia
- stat in libuv
What happens if you go into a brand-new container, enter a Julia REPL, and run this:
julia> Base.stat("/root/.julia")
Also, on one of the hosts that does have this error, can you download and extract Julia on the host (not inside udocker, not inside a container), open a Julia REPL, and try to reproduce this error?
That will help us figure out whether the problem is with your host machine or with udocker.
Inside a brand new container:
julia> Base.stat("/root/.julia")
ERROR: IOError: stat: permission denied (EACCES) for file "/root/.julia"
Stacktrace:
[1] stat(::String) at ./stat.jl:69
[2] top-level scope at REPL[1]:1
On the host (all machines in reach), I do not get any errors. Everything works as expected.
SELinux is disabled on my Ubuntu machines and on the cluster RHEL 7.7.
Have there been substantial changes between julia 1.0.5 and the following versions with respect to directory creation/stat/security/rights management?
What about this: (inside a brand new container)
julia> run(`stat /root/.julia`)
I expect this to throw an error. The question is: will it throw an error because of stat: No such file or directory, or will it throw an error because of stat: Permission denied?
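The same distinction can also be checked from a small C program, independent of both Julia and the coreutils binary. This is only a minimal sketch, assuming a POSIX system; `stat_errno` and `report_stat` are hypothetical helper names introduced here for illustration, not part of any library.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: returns 0 if stat(2) succeeds on `path`,
 * otherwise the errno value that stat(2) reported. */
int stat_errno(const char *path)
{
    struct stat sb;
    if (stat(path, &sb) == 0)
        return 0;
    return errno;
}

/* Prints which of the two failure modes in question (if any) occurred. */
void report_stat(const char *path)
{
    int err = stat_errno(path);
    if (err == 0)
        printf("%s: exists\n", path);
    else if (err == ENOENT)
        printf("%s: no such file or directory (ENOENT)\n", path);
    else if (err == EACCES)
        printf("%s: permission denied (EACCES)\n", path);
    else
        printf("%s: %s (errno %d)\n", path, strerror(err), err);
}
```

Calling `report_stat("/root/.julia")` from a `main` function inside a fresh container would print which errno the plain stat syscall reports there.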
> Have there been substantial changes between julia 1.0.5 and the following versions with respect to directory creation/stat/security/rights management?
I'm not sure.
Is Julia 1.0.5 the most recent version that works for you? Do any of the Julia 1.1.x versions work?
In particular, if you can confirm that Julia 1.0.5 works for you, but Julia 1.1.0 gives this error, then we can try to look at the diff between those two versions.
Here's another thing to try. Inside a fresh container:
julia> Base.stat("/root/.foo")
julia> Base.stat("/root/foo")
julia> run(`stat /root/.julia`)
stat: cannot stat '/root/.julia': No such file or directory
ERROR: failed process: Process(`stat /root/.julia`, ProcessExited(1)) [1]
Stacktrace:
[1] pipeline_error at ./process.jl:525 [inlined]
[2] #run#565(::Bool, ::typeof(run), ::Cmd) at ./process.jl:440
[3] run(::Cmd) at ./process.jl:438
[4] top-level scope at REPL[5]:1
julia> Base.stat("/root/.foo")
ERROR: IOError: stat: permission denied (EACCES) for file "/root/.foo"
Stacktrace:
[1] stat(::String) at ./stat.jl:69
[2] top-level scope at REPL[6]:1
julia> Base.stat("/root/foo")
ERROR: IOError: stat: permission denied (EACCES) for file "/root/foo"
Stacktrace:
[1] stat(::String) at ./stat.jl:69
[2] top-level scope at REPL[7]:1
I tried different versions:
Can you try julia-1.3.0 also?
@Keno does the fact that run(`stat /root/.julia`) gives "no such file or directory" (the correct answer), but Base.stat("/root/.julia") gives "permission denied (EACCES)" suggest that this is a Julia bug or a libuv bug (rather than a problem with the system)?
So, if you are able to build Julia from source inside the container, the best option will be to git bisect between 1.2.0 and 1.3.0 to figure out which commit introduces this bug.
The script for the git bisect can probably be as simple as:
make
rm -rf /root/.julia
./julia -e 'Base.stat("/root/.julia")'
Pass that script to git bisect run with the tags v1.2.0 and v1.3.0 as the known good and bad commits, respectively. Then the bisect should automatically find which commit introduced this bug.
Another thing to try would be stracing Julia doing the stat call vs the native utility.
On Sun, Mar 1, 2020, 13:45 Dilum Aluthge notifications@github.com wrote:
The script for the git bisect can probably be as simple as:
make
rm -rf /root/.julia
./julia -e 'Base.stat("/root/.julia")'
> Another thing to try would be stracing Julia doing the stat call vs the native utility.
Yeah that might be the first thing to do. @dr-br inside a fresh container can you run each of the following:
strace stat /root/.julia
strace ./julia-1.3.1/bin/julia --startup-file=no --history-file=no -e 'Base.stat("/root/.julia")'
> Yeah that might be the first thing to do. @dr-br inside a fresh container can you run each of the following:
In order for this to work, you'd need to strace the children also (or just strace the utility directly).
> In order for this to work, you'd need to strace the children also (or just strace the utility directly).
Would this be better then?
strace stat /root/.julia
strace ./julia-1.3.1/bin/julia --startup-file=no --history-file=no -e 'Base.stat("/root/.julia")'
All the stats on ~/.julia will return 1, as in a new container, there is no such directory.
> All the stats on ~/.julia will return 1, as in a new container, there is no such directory.
I think @Keno wants to see the strace output, to figure out why the system utility stat works correctly (telling us that there is no such directory), but the Julia Base.stat function errors and claims that there is a permissions problem. strace will tell us about the system calls that are made in each situation.
OK.
command1: strace stat /root/.julia
command2: strace ./julia-1.3.1/bin/julia --startup-file=no --history-file=no -e 'Base.stat("/root/.julia")'
command1.txt
command2.txt
Is the difference between the following commands valuable?
strace ./julia-1.3.1/bin/julia -e 'exit()'
strace ./julia-1.0.5/bin/julia -e 'exit()'
> Is the difference between the following commands valuable?
It can't hurt to run it and save the output, just in case Keno wants to see it later!
By the way @dr-br, thanks for your patience with all of this!
@Keno look at line 80 of command1.txt (system stat utility):
lstat("/root/.julia", 0x7ffeb2f4d7a0) = -1 ENOENT (No such file or directory)
Now look at line 845 of command2.txt (Julia 1.3.1 Base.stat):
statx(AT_FDCWD, "/root/.julia", AT_STATX_SYNC_AS_STAT, STATX_ALL, 0x7fff4a035940) = -1 EACCES (Permission denied)
It seems like the system stat command is calling the lstat syscall, while Julia's libuv is calling the statx syscall?
@dr-br You said that this bug does not exist on Julia 1.2.0, right? Can you run this as well:
strace ./julia-1.2.0/bin/julia --startup-file=no --history-file=no -e 'Base.stat("/root/.julia")'
@Keno I'm wondering if maybe https://github.com/libuv/libuv/issues/2152 is the culprit?
Here the output from 1.2.0
strace_1.2.0.txt
command1.txt (system stat utility), line 80:
lstat("/root/.julia", 0x7ffeb2f4d7a0) = -1 ENOENT (No such file or directory)
command2.txt (Julia 1.3.1 Base.stat), line 845:
statx(AT_FDCWD, "/root/.julia", AT_STATX_SYNC_AS_STAT, STATX_ALL, 0x7fff4a035940) = -1 EACCES (Permission denied)
strace_1.2.0.txt (Julia 1.2.0 Base.stat), line 864:
stat("/root/.julia", 0x7ffc2c99fb00) = -1 ENOENT (No such file or directory)
It certainly does seem like statx is the culprit here.
If you look at the original bug report, Base.stat is getting called by Base.isdir. I don't think we need all of the fancy information of statx just to figure out whether or not the input is a directory, right? The info we get from stat or lstat (which presumably are equivalent if the input is not a link) should be more than sufficient.
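To illustrate that point, here is a minimal C sketch of a directory check that relies only on the classic stat syscall. It is not Julia's or libuv's actual implementation; `is_dir` is a hypothetical helper, and treating every failure as "not a directory" is a simplifying assumption of this sketch.

```c
#include <stdbool.h>
#include <sys/stat.h>

/* Hypothetical helper, not Julia's or libuv's implementation:
 * reports whether `path` names a directory using only stat(2).
 * Any failure (ENOENT, EACCES, ...) is treated as "not a directory",
 * which is a simplification made for this sketch. */
bool is_dir(const char *path)
{
    struct stat sb;
    if (stat(path, &sb) != 0)
        return false;
    /* The file type bits in st_mode are all that is needed here. */
    return S_ISDIR(sb.st_mode);
}
```

Only the `st_mode` field is consulted, and stat already provides it; none of the extra fields that statx can return are involved.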
It would be good to figure out where the EACCES comes from, though, and at least update the man page or file a bug with somebody if that's not the expected behavior.
To add the reference, as I suspected there is no documented difference between stat and statx with respect to EACCES errors: http://man7.org/linux/man-pages/man2/statx.2.html. @dr-br could you show the output of mount inside the container, so we know what kind of filesystem we're dealing with?
mount inside container:
root@myMachine:~# mount
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
udev on /dev type devtmpfs (rw,nosuid,relatime,size=5988288k,nr_inodes=1497072,mode=755)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=1206384k,mode=755)
/dev/mapper/vgubuntu-root on / type ext4 (rw,relatime,errors=remount-ro)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k)
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755)
cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)
efivarfs on /sys/firmware/efi/efivars type efivarfs (rw,nosuid,nodev,noexec,relatime)
bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct)
cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma)
mqueue on /dev/mqueue type mqueue (rw,nosuid,nodev,noexec,relatime)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M)
debugfs on /sys/kernel/debug type debugfs (rw,nosuid,nodev,noexec,relatime)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=44,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=20854)
configfs on /sys/kernel/config type configfs (rw,nosuid,nodev,noexec,relatime)
fusectl on /sys/fs/fuse/connections type fusectl (rw,nosuid,nodev,noexec,relatime)
/var/lib/snapd/snaps/telegram-desktop_1234.snap on /snap/telegram-desktop/1234 type squashfs (ro,nodev,relatime)
/var/lib/snapd/snaps/core18_1668.snap on /snap/core18/1668 type squashfs (ro,nodev,relatime)
/var/lib/snapd/snaps/core_8689.snap on /snap/core/8689 type squashfs (ro,nodev,relatime)
/var/lib/snapd/snaps/gnome-logs_81.snap on /snap/gnome-logs/81 type squashfs (ro,nodev,relatime)
/var/lib/snapd/snaps/gnome-3-28-1804_116.snap on /snap/gnome-3-28-1804/116 type squashfs (ro,nodev,relatime)
/var/lib/snapd/snaps/gtk-common-themes_1440.snap on /snap/gtk-common-themes/1440 type squashfs (ro,nodev,relatime)
/var/lib/snapd/snaps/gnome-calculator_544.snap on /snap/gnome-calculator/544 type squashfs (ro,nodev,relatime)
/var/lib/snapd/snaps/blender_36.snap on /snap/blender/36 type squashfs (ro,nodev,relatime)
/var/lib/snapd/snaps/skype_112.snap on /snap/skype/112 type squashfs (ro,nodev,relatime)
/var/lib/snapd/snaps/telegram-desktop_1244.snap on /snap/telegram-desktop/1244 type squashfs (ro,nodev,relatime)
/var/lib/snapd/snaps/chromium_1028.snap on /snap/chromium/1028 type squashfs (ro,nodev,relatime)
/var/lib/snapd/snaps/code-insiders_375.snap on /snap/code-insiders/375 type squashfs (ro,nodev,relatime)
/var/lib/snapd/snaps/gnome-characters_399.snap on /snap/gnome-characters/399 type squashfs (ro,nodev,relatime)
/var/lib/snapd/snaps/signal-desktop_299.snap on /snap/signal-desktop/299 type squashfs (ro,nodev,relatime)
/dev/sda2 on /boot type ext4 (rw,relatime)
/dev/sda1 on /boot/efi type vfat (rw,relatime,fmask=0077,dmask=0077,codepage=437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro)
me@otherMachine: on /mnt/lsdf type fuse.sshfs (rw,relatime,user_id=0,group_id=0,allow_other)
/var/lib/snapd/snaps/code-insiders_376.snap on /snap/code-insiders/376 type squashfs (ro,nodev,relatime)
tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=1206380k,mode=700,uid=1000,gid=1000)
gvfsd-fuse on /run/user/1000/gvfs type fuse.gvfsd-fuse (rw,nosuid,nodev,relatime,user_id=1000,group_id=1000)
That's inside the container? Looks more like a host system. I wonder if it's picking up the outside world instead.
To recall this:
Update:
On Red Hat Enterprise Linux Server release 7.7 host everything works as expected with ubuntu container.
The problem seems to be the combination of Ubuntu host + udocker.
So I'm not sure if further digging into this problem is worth it. However, it is strange that julia 1.2.0 and older work.
@Keno: Yes, udocker is not really what you expect, if you know docker or podman.
> However, it is strange that julia 1.2.0 and older work.
In Julia 1.2.0 and older, the version of libuv bundled with Julia did not use statx. It used stat and/or lstat instead.
Starting with Julia 1.3.0, the version of libuv bundled with Julia uses statx.
@dr-br So CentOS host and Ubuntu container is fine. But Ubuntu host and Ubuntu container has the problem.
I wonder: could you try Ubuntu host and CentOS container, and CentOS host and CentOS container?
I am curious: does this bug happen whenever the host Linux distro is the same as the container Linux distro? Or does it only happen for the very specific Ubuntu-Ubuntu combination?
So I just built 1.5.0-DEV on our RHEL 7.7 Xeon Gold 6230 cluster inside an Ubuntu 18.04 udocker container. Works as expected, no errors.
Also the binary packages (1.3.1, ...) work.
@DilumAluthge: I already tried Ubuntu host and CentOS container, it gave the same errors.
Might be a kernel bug in the Ubuntu host kernel
@Keno: I would not dare to blame the Ubuntu host kernel. I mean, udocker does so many magic tricks. I would rather suspect that udocker is not tested well enough on "consumer OSes".
Meh, we find kernel bugs about once or twice a month around these parts ;), but yes at this point this might be something to take up with the udocker developers.
It would be nice to have a MWE. Maybe a simple C program that makes both of the syscalls (stat versus statx).
@dr-br I notice that you have this in your workflow:
export PROOT_NO_SECCOMP=1
I wonder if that is relevant.
I also wonder if this issue is relevant: https://bugs.launchpad.net/ubuntu/+source/docker.io/+bug/1755250
> @dr-br I notice that you have this in your workflow:
> export PROOT_NO_SECCOMP=1
> I wonder if that is relevant.
Very likely. On the RHEL machines this is not necessary, only on the Ubuntu hosts.
So it seems that Ubuntu hosts are doing funny business with the statx syscall, as seen in this issue: https://bugs.launchpad.net/ubuntu/+source/docker.io/+bug/1755250
This issue may also be relevant: https://github.com/proot-me/proot/issues/106
Other potentially related discussions:
And even more potentially related discussions. Apparently the statx syscall inside containers (e.g. Docker containers) is a real pain.
What is the latest version of Ubuntu that you have access to?
It seems to be statx inside container.
I compiled 2 example programs, one uses stat, the other statx.
On the host, they run fine; inside the container, only stat succeeds. Even on the RHEL system, statx fails:
./statx-example .
statx(.) = -1
.: Function not implemented
The Ubuntu versions on which I ran all the above tests are 19.10 and 20.04. They behave the same.
I now wish that libuv had never started using statx. It seems like statx refuses to work correctly inside containers.
@Keno Is there any chance that in the Julia-specific fork of libuv, we could stop using statx?
Alternatively, we could ask upstream libuv to stop using statx, i.e. revert https://github.com/libuv/libuv/pull/2184
This is not unique to udocker. For example, this issue: https://github.com/docker/for-linux/issues/208
statx syscalls are only allowed in privileged containers
Funny thing: I don't even get the statx example compiled on the RHEL host, as there is currently a 3.10 kernel running ;)
I think that https://github.com/JuliaLang/libuv/pull/7 will fix this.
> I think that JuliaLang/libuv#7 will fix this.
After a discussion with my colleague: Is it possible that libuv already has a fallback if statx is not available? How else could julia run on a RHEL 7 host with a 3.10 kernel?
On an Ubuntu host, does julia/libuv inside the container detect that statx is available, while udocker does not support it?
> Is it possible, that libuv already has a fallback, if statx ist not available?
If your kernel is old and does not have statx, then the statx call will fail with ENOSYS. If libuv detects that statx returned the ENOSYS errno, it will fall back to stat. You can see this code here:
As you can see in the code, the fallback only applies when the errno returned by statx is the ENOSYS errno.
If seccomp blocks your call to statx, then it will return some errno. The specific value of that errno is user-defined. If the errno is the ENOSYS errno, then libuv will fall back to stat, as I described above. However, if the errno is not the ENOSYS errno, then libuv will return the errno, i.e. there is no fallback to stat in that case.
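The control flow described above can be sketched in a few lines of C. This is not the libuv source: the function-pointer type `stat_fn`, the function `stat_with_fallback`, and the three simulated syscalls are hypothetical names, and the convention of returning a negative errno value is an assumption made so the logic can be exercised without a real kernel.

```c
#include <errno.h>

/* Hypothetical stand-in for a syscall wrapper: returns 0 on success
 * or a negative errno value on failure (assumed convention). */
typedef int (*stat_fn)(const char *path);

/* Sketch of the fallback rule: the legacy stat result is used only
 * when statx reports ENOSYS; every other outcome is passed through. */
int stat_with_fallback(stat_fn do_statx, stat_fn do_stat, const char *path)
{
    int rc = do_statx(path);
    if (rc != -ENOSYS)
        return rc;          /* success, or an error such as -EACCES */
    return do_stat(path);   /* statx not implemented: fall back to stat */
}

/* Simulated environments for illustration only. */
int statx_old_kernel(const char *p) { (void)p; return -ENOSYS; } /* kernel without statx */
int statx_blocked(const char *p)    { (void)p; return -EACCES; } /* statx denied with EACCES */
int stat_missing(const char *p)     { (void)p; return -ENOENT; } /* path does not exist */
```

With these stand-ins, `stat_with_fallback(statx_old_kernel, stat_missing, "/root/.julia")` yields `-ENOENT`, the correct answer on an old kernel, while `stat_with_fallback(statx_blocked, stat_missing, "/root/.julia")` yields `-EACCES`, which matches the error reported in this issue.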
The errno returned by seccomp is user-defined. It not being ENOSYS or something sensible is a bug in udocker or one of its dependencies.
> The errno returned by seccomp is user-defined. It not being ENOSYS or something sensible is a bug in udocker or one of its dependencies.
I’ve corrected my answer.
> It seems to be statx inside container.
> I compiled 2 example programs, one uses stat, the other statx.
> On the host, they run fine, inside the container, only stat succeeds. Even on the RHEL system, statx fails
> ./statx-example .
> statx(.) = -1
> .: Function not implemented
> The Ubuntu versions I ran all the above tests are 19.10 and 20.04. They behave the same.
@dr-br Can you post an issue on the udocker repo (https://github.com/indigo-dc/udocker) and include the code for your example programs? And cc me in the issue? Hopefully that will help us get things moving on the udocker end.
Since this turned out not to be a julia issue, I'm gonna go ahead and close this. Discussion can continue here of course.