Dhall-haskell: PermissionDenied exception related to MoveFileEx on Windows

Created on 8 Jul 2020 · 8 comments · Source: dhall-lang/dhall-haskell

(Opening an issue here rather than keeping it on Slack.)

Windows builds of a project using Dhall (in Haskell) have been failing seemingly at random on GitHub's CI. Sometimes the tests pass, and sometimes they fail, always with the error below. I've been unable to reproduce the problem on an actual Windows computer, where the whole test suite passes.

uncaught exception: IOException of type PermissionDenied
C:\Users\runneradmin\AppData\Local\dhall-haskell\ato5C23.write: renameFile:renamePath:MoveFileEx "\\\\?\\C:\\Users\\runneradmin\\AppData\\Local\\dhall-haskell\\ato5C23.write" Just "\\\\?\\C:\\Users\\runneradmin\\AppData\\Local\\dhall-haskell\\122026a29e0113646fb623fba2a6657b31b99127b689d510ef6761df7dd49da8a5bb": permission denied (Access is denied.)

The hex values change, e.g. this was another error thrown on a different build:

uncaught exception: IOException of type PermissionDenied
C:\Users\runneradmin\AppData\Local\dhall-haskell\atoCD11.write: renameFile:renamePath:MoveFileEx "\\\\?\\C:\\Users\\runneradmin\\AppData\\Local\\dhall-haskell\\atoCD11.write" Just "\\\\?\\C:\\Users\\runneradmin\\AppData\\Local\\dhall-haskell\\12205b43b1207f0c5f69e80a94bf78d52a2b2189b2658a70652cb805001ced08b5ae": permission denied (Access is denied.)

The failing test simply imports a Dhall expression from a file (using Dhall.inputFile) and compares it to an expected Haskell value (i.e. no file manipulation occurs beyond whatever Dhall.inputFile does). It is always the very first import in a test run that fails; all the others seem to succeed.

Turning off hspec's parallel spec evaluation just for this first spec seems to fix the error, even though parallel spec evaluation can be on for any number of other specs that also include Dhall.inputFile. So this may be a non-issue, but it still seems peculiar, since it seems to be Windows-specific.

Labels: bug, caching

All 8 comments

Thanks for the report @SiriusStarr! :)

For reference, it's this code that writes files to the "semi-semantic" dhall-haskell cache:

https://github.com/dhall-lang/dhall-haskell/blob/e22ecde05f3cb858a0d5695dcd3ba3466b62cec2/dhall/src/Dhall/Import.hs#L716-L721

The use of atomic-write here was introduced in https://github.com/dhall-lang/dhall-haskell/pull/1544.

This is the atomicWriteFile function that we use:

http://hackage.haskell.org/package/atomic-write-0.2.0.7/docs/System-AtomicWrite-Writer-ByteString-Binary.html#v:atomicWriteFile

And this is the renameFile function used there:

http://hackage.haskell.org/package/directory-1.3.6.1/docs/System-Directory.html#v:renameFile

I admittedly don't have a huge amount of trust in the atomic-write code – I think it's possible that dhall is one of very few projects using atomic-write on Windows. atomic-write doesn't appear to have CI for Windows, for instance.


@jneira As our Windows expert, would you have a recommendation how to tackle this? :)

My suspicion is that the atomic-write logic might not be concurrency-safe on Windows. Based on the error message, I'm guessing that two parallel dhall interpreters were trying to write out the same cache file at the same time. If that hypothesis is true, then it should be possible to narrow this down into a minimal reproducing example on Windows without using dhall.

@SiriusStarr: Also, I suspect the reason this only affects the first test is that it's the one that populates the cache.
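
A dhall-free reproducer along those lines might look like the sketch below. It is an assumption-laden illustration, not atomic-write's actual code: writeViaRename and raceWriters are hypothetical names, and the pattern (write a temporary file next to the target, then call System.Directory.renameFile) only imitates what the error message suggests is happening. Whether it triggers the PermissionDenied error on Windows has not been verified here. It assumes the directory and filepath packages are available.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (IOException, try)
import Control.Monad (forM)
import System.Directory (createDirectoryIfMissing, renameFile)
import System.FilePath ((</>))
import System.IO (hClose, hPutStr, openBinaryTempFile)

-- Hypothetical helper: write the contents to a fresh temporary file in `dir`,
-- then rename it over `dir </> target`. This imitates the general
-- write-then-rename pattern, not atomic-write's exact implementation.
writeViaRename :: FilePath -> FilePath -> String -> IO ()
writeViaRename dir target contents = do
  (tmpPath, h) <- openBinaryTempFile dir "repro.write"
  hPutStr h contents
  hClose h
  renameFile tmpPath (dir </> target)

-- Hypothetical driver: start n threads that all rename onto the same target
-- and collect each thread's outcome (Left = the IOException it hit).
raceWriters :: Int -> FilePath -> IO [Either IOException ()]
raceWriters n dir = do
  createDirectoryIfMissing True dir
  dones <- forM [1 .. n] $ \_ -> do
    done <- newEmptyMVar
    _ <- forkIO (try (writeViaRename dir "target" "same contents") >>= putMVar done)
    pure done
  mapM takeMVar dones
```

Calling raceWriters 8 "repro-cache" from a main built with -threaded and run with +RTS -N, then printing any Left values, would show whether concurrent renames onto one target fail on a given platform. On a POSIX system every rename would be expected to succeed.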

I just noticed this bit from the atomic-write package description:

Atomically write to a file on POSIX-compliant systems while preserving permissions.

That sounds as if the package was never meant to guarantee atomicity on Windows. I must have missed that when I picked it to address #1540.

The underlying primitive that atomic-write needs to be atomic is System.Directory.renameFile.

On a POSIX system it's a wrapper around System.Posix.rename and on a Windows system it's a wrapper around System.Win32.moveFileEx. So one possible explanation is that System.Posix.rename is atomic while System.Win32.moveFileEx is not atomic.

Either way, it seems like an issue that would need to be fixed upstream, in either the directory package or the Win32 package.

I guess we should try to make a bug report then. We could also try creating a workaround in dhall, possibly by using a file lock.
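
As one possible shape for such a workaround, here is a minimal sketch that swaps the file lock for a simpler idea: tolerate the failed rename when the target already exists. This is not what dhall does; renameIntoCache is a hypothetical name, and the approach assumes that cache entries are named by content hash (as the 1220… file names in the errors suggest), so that a competing writer would have produced identical contents.

```haskell
import Control.Exception (catch, throwIO)
import System.Directory (doesFileExist, removeFile, renameFile)
import System.IO.Error (isPermissionError)

-- Hypothetical helper: rename a finished temporary file into the cache.
-- If the rename is refused with a permission error and the target already
-- exists, assume another writer stored the same entry first, discard the
-- temporary file, and carry on. Any other failure is rethrown unchanged.
renameIntoCache :: FilePath -> FilePath -> IO ()
renameIntoCache tmp target =
  renameFile tmp target `catch` \e ->
    if isPermissionError e
      then do
        exists <- doesFileExist target
        if exists
          then removeFile tmp
          else throwIO e
      else throwIO e
```

The trade-off is that this masks a genuine permission problem whenever a file with the target name happens to exist, which a real lock would not do.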

In any case, it would be good to have a proper reproducer for the issue. @SiriusStarr Could you possibly help us with that?

Okay, after fidgeting around, I've been able to reproduce it pretty simply on my own Windows box, rather than in CI.

Requirements:

  • Delete %localappdata%\dhall & %localappdata%\dhall-haskell between runs. This only occurs on the first run when the cache is populated (which is why it was showing up in CI and not on my own computer).

Minimal Main.hs (using hspec for easy parallelism):

{-# LANGUAGE OverloadedStrings #-}

module Main (main) where

import qualified Dhall as D
import Test.Hspec

main :: IO ()
main = hspec $ parallel $ do
  it "import 1" $ D.input D.auto "let B = https://prelude.dhall-lang.org/Bool/package.dhall in B.not True" `shouldReturn` False
  it "import 2" $ D.input D.auto "let B = https://prelude.dhall-lang.org/Bool/package.dhall in B.not False" `shouldReturn` True

To reproduce: stack run --compiler=ghc-8.8.2 (8.8.3 is borked on Windows due to a bug that will be fixed in 8.8.4.)

import 1
import 2 FAILED [1]

Failures:

  src\Main.hs:12:3:
  1) import 2
       uncaught exception: IOException of type PermissionDenied
       C:\Users\Username\AppData\Local\dhall-haskell\ato3EE7.write: renameFile:renamePath:MoveFileEx "\\\\?\\C:\\Users\\Username\\AppData\\Local\\dhall-haskell\\ato3EE7.write" Just "\\\\?\\C:\\Users\\Username\\AppData\\Local\\dhall-haskell\\1220262d2dcb718ae7f37b6ce6142fb0aa73b714802582809d20ad49d8e4627f35ff": permission denied (Access is denied.)

  To rerun use: --match "/import 2/"

Randomized with seed 74949068

Finished in 0.6375 seconds
2 examples, 1 failure

stack.yaml:

resolver: lts-16.0

packages:
- .

package.yaml (important: note the ghc-options; threading/parallelism must be turned on, or the error doesn't occur):

name: atomic-write-err
version: 0.1.0.0

dependencies:
  - base >= 4.7 && < 5
  - dhall >= 1.32
  - hspec >= 2.7.1

executables:
  atomic-write-err:
    main: Main.hs
    source-dirs: src
    ghc-options:
      - -threaded
      - -rtsopts
      - -with-rtsopts=-N