Provide an option to pipe GHC's output to the terminal to better keep track of progress, instead of hanging on Cabal-1.22.4.0: build for dependencies that take some time to compile.
We already have logic in place to send output to stdout under specific circumstances (namely: on the terminal, only one package left to build, and it's a "wanted" package). Adding a flag to simply override that logic with "always output to stdout" is trivial, though the result will likely not be what you're looking for unless you also use --jobs 1.
@chrisdone Thoughts?
although the result will likely not be what you're looking for unless you also use --jobs 1
As long as it gives some indication of GHC making progress, it's helpful.
Perhaps, in the multi-jobs context, output lines could be prefixed with the
name of the package generating them?
On Fri, Jun 26, 2015, 03:21 Michael Snoyman [email protected]
wrote:
We already have logic in place to send output to stdout under specific
circumstances (namely: on the terminal, only one package left to build, and
it's a "wanted" package). Adding a flag to simply override that logic with
"always output to stdout" is trivial, though the result will likely not be
what you're looking for unless you also use --jobs 1.@chrisdone https://github.com/chrisdone Thoughts?
—
Reply to this email directly or view it on GitHub
https://github.com/commercialhaskell/stack/issues/426#issuecomment-115631874
.
Yeah, outputting interleaved GHC output would be a bit messy. We could indeed prefix lines, that was discussed once. It might not be too bad.
@spinda This part of the code base is pretty isolated. You interested in giving a crack at a PR on it? I can give you pointers of where to start.
@snoyberg I'm traveling from Friday to Monday, but if the delay isn't an issue then I'd be happy to have a go at it.
Shouldn't be a problem at all. When you're ready to look into this, check out Stack.Build.Execute and look for withLogFile and console.
:+1:
This may well be an overengineered solution, but it seems like it ought to be possible to have uninterleaved GHC output if some of it is delayed. Here's how this would work:
1) The first package being built would get control of stdout, so you get to see its live progress.
2) When that package finishes building, and does so successfully, the output from the other builds that finished can be outputted in bulk.
3) Then, it'd pick the next oldest build to get control over stdout (and it'll dump what it has so far for it). This makes it likely that you'll get the progress output from the package build that's holding things up. Continue with step 2.
Stack messages and logging would get interleaved with the ghc output, but that seems rather reasonable to me, if line buffered.
@mgsloan I'm -1 on that solution, it seems like it's contrary to the stated goal of this issue: get regular output to stdout as long as a build is occurring. Delaying output from other processes would impair that. Additionally, implementing the solution you've described would be very difficult to do without introducing the possibility of significant memory usage or hurting performance in the common case.
Agreed, the complication isn't worth it.
@spinda Are you still planning on looking into this?
(Continuing conversation here from the duplicate #642.)
Regarding the complexity of the implementation -- my hope was that this policy could be completely encapsulated in a function roughly of the type Stream (Stream ByteString) -> Stream ByteString, with the simplest trivial implementation being immediately merging all substream's lines on the output.
Once the policy is factored out like that, then any of these _fancy_ policies could live in their own modules, expose that one function, and not touch anything else. (And also not be the default, probably.) Maybe with some help factoring the plumbing (to route through that single arbiter function), then subsequently myself or @spinda, or @mgsloan or whoever could easily implement and try out different policies.
A richer type than the one above will probably be necessary to have metadata for the substreams (e.g. human readable names) plus some way to implement the current policy of hiding output but still printing status messages....
I don't know conduit yet, but I'll have a look at the async/concurrent stuff out there for it and then make a specific proposal based on Conduit.
Even modules / functions that are only used in one spot add complexity - they'd also need to be wired up to the CLI.
my hope was that this policy could be completely encapsulated in a function roughly of the type Stream (Stream ByteString) -> Stream ByteString
In the other thread, @snoyberg mentioned concerns about huge logs building up in memory. For consumers of this stream that wait a while to send their data, this could easily be the case (and hard to catch when only tested on small examples). It's not clear how to resolve this while keeping the type of the function simple.
That said, I do think this output stuff could use some attention. Here are the things I'd like to see resolved:
1) If you're just building one thing, you get the output in the console, but no log file. Here's the code. To me, it'd be more consistent to have both.
2) A long running build or test can lead the user to think that it's stuck.
3) It's easier to ignore warnings when you don't see all the output. Not sure what to do about this.
One somewhat hacky approach to these last two would be to attempt to parse out the following parts of output:
Considering how hacky this is, I wouldn't want it to make it stack's canonical behavior, at least not at first. So, yeah, maybe it'd be good to have this be something where it's easy to swap in a different output behavior.
my hope was that this policy could be completely encapsulated in a function roughly of the type
Stream (Stream ByteString) -> Stream ByteString
Well it would be nice if the sticky line could report fine-grained module progress as well as coarse-grained progress. But that does require a hacky parsing of GHC output, right? (I suppose the alternative is to do like Jenkins and remember what typical amounts of time/console-output are for particular subtasks, to estimate progress based on that.)
_Regarding memory usage:_
It should be possible to have constant-memory stream transducers of the type Stream (Stream ByteString) -> Stream ByteString which are nice in practice. So, for example, even a strategy that typically passes through build-output when a subtask is complete, could have a max buffer size after which it passes through chunks of output even if not complete.
I think some experimentation could lead to a real-time behavior that feels comfortable, not leaving those "it looks stuck" windows of time, nor producing ugly interleavings.
Actually, on second thought, even without having a "denominator", just having a counter in the sticky line that tracks the total number of lines of output from all subtasks.... that would go a huge way towards showing that the thing isn't stuck.
Whatever you can do to improve the output for multiple projects would be appreciated. This is the most frustrating part of stack for me at the moment. I get a lot of spurious Travis CI build failures. Locally I'd much rather get all the output interleaved (with some prefix would be nice) than have no output until it hits an error. Thanks.
As mentioned in #896, I would like to see ghc warnings for my local packages when doing a multi-package stack build.
One work around is to pass the --pedantic flag to stack build, but there are times when I don't want the whole build to fail on a warning, I just want to see them.
Another work around is to pass a package name to stack build in order to get it to do a single-package build. When working on multiple packages simultaneously, this is kind of annoying.
What is the status on this? Is someone working on a PR or should I just try do that myself (seems pretty straightforward to implement)?
I do not believe anyone else is working on this right now.
The other day I was trying to see template haskell output by passing --ghc-options='-ddump-splices' to stack, but stack didn't print anything to stdout.
Maybe a --ghc-output` flag would solve this problem as well?
Dump output should be going into a dump file inside .stack-work
On Tue, Oct 27, 2015 at 6:15 AM, Dennis Gosnell [email protected]
wrote:
The other day I was trying to see template haskell output by passing
--ghc-options='-ddump-splices' to stack, but stack didn't print anything
to stdout.Maybe a --ghc-output` flag would solve this problem as well?
—
Reply to this email directly or view it on GitHub
https://github.com/commercialhaskell/stack/issues/426#issuecomment-151370713
.
Dump output should be going into a dump file inside .stack-work
I did find the file it was going to, but I had assumed it would work like cabal where the code generated by template haskell is output to stdout.
@cdepillabout: We pass -ddump-hi -ddump-to-file to GHC because we use the .dump-hi files to find unlisted modules and TH dependencies. But there doesn't seem to be a way to tell GHC to _only_ write the the dump-hi files but not other dumps (like -ddump-splices). Is having the splices written to files instead of to stdout causing any real problems for anyone? I usually prefer them to go to files anyway. I've added a FAQ entry, so at least it's documented now.
@borsboom Thanks for the updated documentation! Having the -ddump-splices output go to a file instead of stdout isn't causing me any _real_ problems, but I had expected it to work the same as it does with cabal. I think the updated FAQ solves most of the problem!
Should probably use a different argument name than --ghc-output, since this would also include test and benchmark progress output. Maybe --all-tasks-output?
Is having the splices written to files instead of to stdout causing any real problems for anyone?
@borsboom - I think I am getting stung by this.
EDIT: Oh wait, this was unrelated. Sorry for confusion
I've opened PR #2594 to add a --dump-logs flag and dump-logs config option.
I've filed #3225 for a flag that outputs to stdout, instead of a file that one first has to find.
IIUC --dump-logs prints logs only after a build is complete. Would it be possible to output a . for every compiled module. Use-cases:
[1] https://discuss.circleci.com/t/too-long-with-no-output-while-building-docker-image/12275
@saurabhnanda Should be possibly by parsing GHC output. How about opening a new issue about it? This one is closed and already has a ton of comments.
Relevant PR: https://github.com/commercialhaskell/stack/pull/4097
Most helpful comment
This may well be an overengineered solution, but it seems like it ought to be possible to have uninterleaved GHC output if some of it is delayed. Here's how this would work:
1) The first package being built would get control of stdout, so you get to see its live progress.
2) When that package finishes building, and does so successfully, the output from the other builds that finished can be outputted in bulk.
3) Then, it'd pick the next oldest build to get control over stdout (and it'll dump what it has so far for it). This makes it likely that you'll get the progress output from the package build that's holding things up. Continue with step 2.
Stack messages and logging would get interleaved with the ghc output, but that seems rather reasonable to me, if line buffered.