Hello,
Testing platform:
Crystal 0.20.5+65 [dff9b2e] (2017-02-10)
Scenario:
Compiled Crystal with 0.20.5 and then cleared the cache (`~/.cache/crystal`).
Subsequent builds start reusing bc+obj files, but it takes several attempts until 100% of the files are reused.
The following is the console output and the commands used after the first build:
$ rm -rf ~/.cache/crystal
$ touch src/compiler/crystal.cr
$ make stats=1
Using /usr/bin/llvm-config-3.8 [version=3.8.1]
CRYSTAL_CONFIG_PATH=`pwd`/src ./bin/crystal build --stats -o .build/crystal src/compiler/crystal.cr -D without_openssl -D without_zlib
Using compiled compiler at .build/crystal
Parse: 00:00:00.0001633 ( 0.19MB)
Semantic (top level): 00:00:00.9600884 ( 83.18MB)
Semantic (new): 00:00:00.0049334 ( 83.18MB)
Semantic (type declarations): 00:00:00.1009797 ( 91.18MB)
Semantic (abstract def check): 00:00:00.0029139 ( 91.18MB)
Semantic (cvars initializers): 00:00:00.0222071 ( 91.18MB)
Semantic (ivars initializers): 00:00:00.0852713 ( 99.18MB)
Semantic (main): 00:00:14.0652180 ( 867.25MB)
Semantic (cleanup): 00:00:00.0015672 ( 867.25MB)
Semantic (recursive struct check): 00:00:00.0019160 ( 867.25MB)
Codegen (crystal): 00:00:08.8524483 ( 876.12MB)
Codegen (bc+obj): 00:00:10.0868879 ( 876.12MB)
Codegen (linking): 00:00:04.0767550 ( 876.12MB)
Macro runs:
- /home/luis/code/crystal-lang/crystal/src/ecr/process.cr: 00:00:03.4635985
Codegen (bc+obj):
- no previous .o files were reused
As expected on the first run after a clean cache, no bc file was reused.
Second attempt:
$ touch src/compiler/crystal.cr
$ make stats=1
Using /usr/bin/llvm-config-3.8 [version=3.8.1]
CRYSTAL_CONFIG_PATH=`pwd`/src ./bin/crystal build --stats -o .build/crystal src/compiler/crystal.cr -D without_openssl -D without_zlib
Using compiled compiler at .build/crystal
Parse: 00:00:00.0001430 ( 0.19MB)
Semantic (top level): 00:00:00.9827208 ( 82.74MB)
Semantic (new): 00:00:00.0044674 ( 82.74MB)
Semantic (type declarations): 00:00:00.1067661 ( 90.74MB)
Semantic (abstract def check): 00:00:00.0030296 ( 90.74MB)
Semantic (cvars initializers): 00:00:00.0219456 ( 90.74MB)
Semantic (ivars initializers): 00:00:00.1206257 ( 90.74MB)
Semantic (main): 00:00:10.9776052 ( 826.74MB)
Semantic (cleanup): 00:00:00.0015307 ( 826.74MB)
Semantic (recursive struct check): 00:00:00.0018696 ( 826.74MB)
Codegen (crystal): 00:00:10.6871095 ( 874.93MB)
Codegen (bc+obj): 00:00:08.8601001 ( 874.93MB)
Codegen (linking): 00:00:03.7990048 ( 874.93MB)
Macro runs:
- /home/luis/code/crystal-lang/crystal/src/ecr/process.cr: reused previous compilation (00:00:00.0080372)
Codegen (bc+obj):
- 247/1186 .o files were reused
Only 247 of 1186 were reused.
Third attempt:
$ touch src/compiler/crystal.cr
$ make stats=1
Using /usr/bin/llvm-config-3.8 [version=3.8.1]
CRYSTAL_CONFIG_PATH=`pwd`/src ./bin/crystal build --stats -o .build/crystal src/compiler/crystal.cr -D without_openssl -D without_zlib
Using compiled compiler at .build/crystal
Parse: 00:00:00.0002702 ( 0.19MB)
Semantic (top level): 00:00:00.9722807 ( 82.74MB)
Semantic (new): 00:00:00.0044383 ( 82.74MB)
Semantic (type declarations): 00:00:00.1029496 ( 90.74MB)
Semantic (abstract def check): 00:00:00.0028177 ( 90.74MB)
Semantic (cvars initializers): 00:00:00.0214620 ( 90.74MB)
Semantic (ivars initializers): 00:00:00.1101186 ( 90.74MB)
Semantic (main): 00:00:10.2959104 ( 826.74MB)
Semantic (cleanup): 00:00:00.0015995 ( 826.74MB)
Semantic (recursive struct check): 00:00:00.0018579 ( 826.74MB)
Codegen (crystal): 00:00:09.0105445 ( 874.93MB)
Codegen (bc+obj): 00:00:01.5306878 ( 874.93MB)
Codegen (linking): 00:00:03.8219490 ( 874.93MB)
Macro runs:
- /home/luis/code/crystal-lang/crystal/src/ecr/process.cr: reused previous compilation (00:00:00.0082830)
Codegen (bc+obj):
- 1057/1186 .o files were reused
The number goes up, but still isn't 100%.
Fourth attempt:
$ touch src/compiler/crystal.cr
$ make stats=1
Using /usr/bin/llvm-config-3.8 [version=3.8.1]
CRYSTAL_CONFIG_PATH=`pwd`/src ./bin/crystal build --stats -o .build/crystal src/compiler/crystal.cr -D without_openssl -D without_zlib
Using compiled compiler at .build/crystal
Parse: 00:00:00.0001631 ( 0.19MB)
Semantic (top level): 00:00:01.0759624 ( 82.74MB)
Semantic (new): 00:00:00.0049699 ( 82.74MB)
Semantic (type declarations): 00:00:00.1093231 ( 90.74MB)
Semantic (abstract def check): 00:00:00.0032196 ( 90.74MB)
Semantic (cvars initializers): 00:00:00.0232578 ( 90.74MB)
Semantic (ivars initializers): 00:00:00.2513325 ( 90.80MB)
Semantic (main): 00:00:10.9166493 ( 826.80MB)
Semantic (cleanup): 00:00:00.0015945 ( 826.80MB)
Semantic (recursive struct check): 00:00:00.0020521 ( 826.80MB)
Codegen (crystal): 00:00:10.0011559 ( 875.68MB)
Codegen (bc+obj): 00:00:00.8647753 ( 875.68MB)
Codegen (linking): 00:00:03.9348918 ( 875.68MB)
Macro runs:
- /home/luis/code/crystal-lang/crystal/src/ecr/process.cr: reused previous compilation (00:00:00.0080509)
Codegen (bc+obj):
- all previous .o files were reused
This time (and on any subsequent run) 100% of the generated bc files are reused.
I've tried waiting several minutes between builds, in case it's a timing issue, but got the same result. I also considered whether the date or a macro call (like getting the git commit and version tag) might be involved, but I can't confirm that.
I was able to reproduce this consistently multiple times: after `rm -rf ~/.cache/crystal`, the first build reuses no bc+obj files.
Thanks for making the compile process fast, which helped me spot this issue :smile:
Let me know what else I should look at that might help.
Thank you :heart: :heart: :heart:
YES!! I noticed this too. It seems to happen with .o files that have the Thread type inside them. I have no idea what's going on... it would be interesting to find out :-)
I can reproduce 247, 1061, 100% here on Arch Linux (LLVM 3.9). How strange!
You can also reproduce this with an empty file. I'm getting:
- no previous .o files were reused
- 118/121 .o files were reused
- all previous .o files were reused
So using an empty file should make it easier to find out what's wrong.
@asterite maybe we can get an extended list of which bc files were reused, and then diff between builds to determine which files might be the ones changing.
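A way to run that diff without any compiler changes is to checksum the cached files before and after a rebuild. This is a sketch; a temp directory and made-up file names stand in for the real cache (`~/.cache/crystal`) and for an actual rebuild step:

```shell
# Sketch: snapshot checksums of every cached .bc, rebuild, snapshot again,
# then diff the snapshots to see which units were regenerated.
# A temp dir simulates the cache; point `cache` at ~/.cache/crystal for real use.
cache=$(mktemp -d)
printf 'unit A' > "$cache/A-rray.bc"
printf 'unit B' > "$cache/T-hread.bc"
(cd "$cache" && md5sum *.bc | sort -k2) > before.txt
# ...a rebuild would happen here (e.g. `make stats=1`); simulate one unit changing:
printf 'unit B, regenerated' > "$cache/T-hread.bc"
(cd "$cache" && md5sum *.bc | sort -k2) > after.txt
diff before.txt after.txt || true   # the lines that differ name the changed units
```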
The comment:

> I've tried waiting several minutes between builds considering it might be a timing issue...

brings to my mind a problem seen in some older versions of `make`. I have no idea what crystal is doing here, so let me phrase it in terms of `make`. Given:
```make
bthing : athing
	compile -o bthing

clib : bthing
	buildlib -o clib bthing
```
You then `touch athing`, and type `make clib`.
The problem comes up when the `compile` and `buildlib` steps can execute in less than a second. You'd end up with `bthing` and `clib` having the exact same timestamp (which was saved in seconds at the time). And this meant that the next time `make` was run, it thought that `clib` needed to be rebuilt even though it did not. Might something like that be happening with crystal's cached files?
@drosehn I don't think `make` is the issue; as @asterite mentioned, this also happens with an empty file:
$ touch empty.cr
$ rm -rf ~/.cache/crystal
$ bin/crystal build --stats empty.cr
Using compiled compiler at .build/crystal
...
Codegen (bc+obj):
- no previous .o files were reused
$ bin/crystal build --stats empty.cr
Using compiled compiler at .build/crystal
...
Codegen (bc+obj):
- 106/118 .o files were reused
$ bin/crystal build --stats empty.cr
Using compiled compiler at .build/crystal
...
Codegen (bc+obj):
- all previous .o files were reused
Also, Crystal's `Makefile` takes into consideration the timestamps of all the source files against the generated executable, including `libcrystal.a`, so that shouldn't be the issue.
I just added some code to show which units are not reused. For an empty file I'm getting:
These modules were not reused:
- Thread (T-hread.bc)
- Set(Thread) (S-et40T-hread41.bc)
- Hash(Thread, Nil) (H-ash40T-hread4432N-il41.bc)
The strange thing is that if I emit LLVM IR for these (with `--ll`) then there's no difference. Somehow a different `.bc` is generated for the same `.ll`...

I also took two different `.bc` files for the same module, turned them into `.ll` (with `llvm-dis`), and there's no difference. I think this promotes this to an LLVM bug. Or maybe there's a reason why they generate different `.bc` files for the same LLVM IR. The strange thing is that it later stabilizes, so I'm not entirely sure what I say here is correct.
I got a slightly different list for an empty file (pulled and compiled from master after your change):
Codegen (bc+obj):
- 106/118 .o files were reused
These modules were not reused:
- Thread (T-hread.bc)
- Set(Thread) (S-et40T-hread41.bc)
- Hash(Thread, Nil) (H-ash40T-hread4432N-il41.bc)
- Pointer(Hash::Entry(Thread, Nil) | Nil) (P-ointer40H-ash58-5c8fd5776cf345ec94b91bf7a5f7e50f.bc)
- Array(Int32) (A-rray40I-nt3241.bc)
- String::Builder (S-tring5858B-uilder.bc)
- String (S-tring.bc)
- GC (G-C-.bc)
- Slice(T) (S-lice40T-41.bc)
- IO::Encoder (I-O-5858E-ncoder.bc)
- Iconv (I-conv.bc)
- Math (M-ath.bc)
What is also interesting is that after removing the cache, the second build displays some other random modules that weren't reused:
Codegen (bc+obj):
- 106/118 .o files were reused
These modules were not reused:
- Thread (T-hread.bc)
- Set(Thread) (S-et40T-hread41.bc)
- Hash(Thread, Nil) (H-ash40T-hread4432N-il41.bc)
- Pointer(Hash::Entry(Thread, Nil) | Nil) (P-ointer40H-ash58-5c8fd5776cf345ec94b91bf7a5f7e50f.bc)
- Array(Int32) (A-rray40I-nt3241.bc)
- String::Builder (S-tring5858B-uilder.bc)
- String (S-tring.bc)
- GC (G-C-.bc)
- Slice(T) (S-lice40T-41.bc)
- IO::Encoder (I-O-5858E-ncoder.bc)
- Iconv (I-conv.bc)
- Math (M-ath.bc)
- Debug::DWARF::LineNumbers::Sequence (D-ebug5858D-W-A-R-F-5858L-ineN-umbers5858S-equence.bc)
- Pointer(Hash::Entry(Signal, Proc(Signal, Nil)) | Nil) (P-ointer40H-ash58-ac8d12c9042874d67c1e049ddde35350.bc)
And on another run:
Codegen (bc+obj):
- 106/118 .o files were reused
These modules were not reused:
- Thread (T-hread.bc)
- Set(Thread) (S-et40T-hread41.bc)
- Hash(Thread, Nil) (H-ash40T-hread4432N-il41.bc)
- Pointer(Hash::Entry(Thread, Nil) | Nil) (P-ointer40H-ash58-5c8fd5776cf345ec94b91bf7a5f7e50f.bc)
- Array(Int32) (A-rray40I-nt3241.bc)
- String::Builder (S-tring5858B-uilder.bc)
- String (S-tring.bc)
- GC (G-C-.bc)
- Slice(T) (S-lice40T-41.bc)
- IO::Encoder (I-O-5858E-ncoder.bc)
- Iconv (I-conv.bc)
- Math (M-ath.bc)
- StaticArray(UInt8, 1024) (S-taticA-rray40U-I-nt84432102441.bc)
- Char (C-har.bc)
- Nil (N-il.bc)
- Debug::DWARF::LineNumbers (D-ebug5858D-W-A-R-F-5858L-ineN-umbers.bc)
- Array(Debug::DWARF::LineNumbers::Row) (A-rray40D-ebug585-1d195a7f864d2d3790ebeadd84b7a4df.bc)
- Proc(Fiber, (IO::FileDescriptor | Nil)) (P-roc40F-iber4432-6eb246a0a45118d3c5507cc830b14a70.bc)
- Hash(Int32, Channel::Buffered(Process::Status | Nil)) (H-ash40I-nt324432-12de0ef574784dde6ffe5b3ae93d5bba.bc)
- Pointer(Hash::Entry(Int32, Channel::Buffered(Process::Status | Nil)) | Nil) (P-ointer40H-ash58-1ce4187043ca1e746f24acb8d0ff7a7e.bc)
- Array(Proc(Int32, Nil)) (A-rray40P-roc40I-nt324432N-il4141.bc)
- Pointer(Proc(Int32, Nil)) (P-ointer40P-roc40I-nt324432N-il4141.bc)
The count appears to be consistent, but the listed modules are not :100: :confused:
With `--no-debug` this problem disappears.
I still have no idea what's going on. If I start with an empty cache and compile an empty file, then in the cache directory there will be a `T-hread.bc` file. I do:
$ md5 T-hread.bc
MD5 (T-hread.bc) = ea315a5d222f2f73379e9378cb7fb12f
Now I compile again and repeat the check:
$ md5 T-hread.bc
MD5 (T-hread.bc) = 182b831f82b936d681b1e6ebdc4fccba
From there, subsequent compilations give the same md5 value. If I delete the cache and start over, I get the same md5 values (first the first one, then always the second one). I can't understand what we are doing differently in the compiler for LLVM to generate two different `.bc` files, and why LLVM manages to always generate the same `.bc` file starting from the second time... big mystery!
I continued investigating this. I changed `Compiler#codegen_many_units` to this:
```cr
private def codegen_many_units(program, units, target_triple)
  unit1 = units.find(&.name.==("F-iber")).not_nil!
  unit2 = units.find(&.name.==("T-hread")).not_nil!
  unit1.llvm_mod.target = target_triple
  unit2.llvm_mod.target = target_triple

  # First unit
  unit1.llvm_mod.write_bitcode_to_file(unit1.bc_name)
  if ENV["FOO"]?
    target_machine.emit_obj_to_file unit1.llvm_mod, unit1.object_name
  end

  # Second unit
  unit2.llvm_mod.write_bitcode_to_file(unit2.bc_name)
  p `md5 #{unit2.bc_name}`

  [] of String
end
```
So what I'm doing here is generating the `.bc` file for the LLVM module that corresponds to `Fiber`, and then doing the same for `Thread`. In between, I conditionally emit the `.o` for `Fiber` (by checking `ENV["FOO"]?`). What I can observe is that the second `.bc` file (the one for `Thread`) changes depending on whether an `.o` file was emitted for `Fiber` or not. So it seems `emit_obj_to_file` (a method exposed by the LLVM API) has some other side effect besides writing the object file.
Could `--no-debug` be a hint? Maybe some locations are slightly different, or maybe the generated DWARF changes a bit?
@ysbaddaden It's probably related to the debug info. However:

- if I turn two different `.bc` files for the same module into `.ll`, I get the same `.ll`
- the `.bc` changes depending on whether I emit a `.o` file for an entirely different LLVM module

Should I be looking somewhere else? You can repeat the experiment (using master) by cleaning the cache, compiling an empty file, then doing `llvm-bcanalyzer -dump ~/.cache/crystal/path/T-hread.bc > file1.txt`, then compiling the file again and repeating the above command but storing the output in a different file. I post the diff here. It seems the second time 3 more debug words (whatever that means) are generated. I still have no idea what's going on :-)
(I wanted to use the details tag, but apparently it stopped processing markdown...)
1c1
< <BITCODE_WRAPPER_HEADER Magic=0x0b17c0de Version=0x00000000 Offset=0x00000014 Size=0x00000f1c CPUType=0x01000007/>
---
> <BITCODE_WRAPPER_HEADER Magic=0x0b17c0de Version=0x00000000 Offset=0x00000014 Size=0x00000f28 CPUType=0x01000007/>
6c6
< <MODULE_BLOCK NumWords=957 BlockCodeSize=3>
---
> <MODULE_BLOCK NumWords=960 BlockCodeSize=3>
124c124
< <VSTOFFSET abbrevid=6 op0=839/>
---
> <VSTOFFSET abbrevid=6 op0=842/>
130c130
< <METADATA_KIND_BLOCK NumWords=92 BlockCodeSize=3>
---
> <METADATA_KIND_BLOCK NumWords=95 BlockCodeSize=3>
150a151
> <KIND op0=20 op1=115 op2=114 op3=99 op4=108 op5=111 op6=99/>
521c522
< <FNENTRY abbrevid=9 op0=9 op1=623 op2=42 op3=84 op4=104 op5=114 op6=101 op7=97 op8=100 op9=64 op10=82 op11=101 op12=102 op13=101 op14=114 op15=101 op16=110 op17=99 op18=101 op19=35 op20=61 op21=61 op22=60 op23=84 op24=104 op25=114 op26=101 op27=97 op28=100 op29=62 op30=58 op31=66 op32=111 op33=111 op34=108/> record string = '*Thread@Reference#==<Thread>:Bool'
---
> <FNENTRY abbrevid=9 op0=9 op1=626 op2=42 op3=84 op4=104 op5=114 op6=101 op7=97 op8=100 op9=64 op10=82 op11=101 op12=102 op13=101 op14=114 op15=101 op16=110 op17=99 op18=101 op19=35 op20=61 op21=61 op22=60 op23=84 op24=104 op25=114 op26=101 op27=97 op28=100 op29=62 op30=58 op31=66 op32=111 op33=111 op34=108/> record string = '*Thread@Reference#==<Thread>:Bool'
523c524
< <FNENTRY abbrevid=9 op0=1 op1=435 op2=42 op3=84 op4=104 op5=114 op6=101 op7=97 op8=100 op9=58 op10=58 op11=99 op12=117 op13=114 op14=114 op15=101 op16=110 op17=116 op18=58 op19=84 op20=104 op21=114 op22=101 op23=97 op24=100/> record string = '*Thread::current:Thread'
---
> <FNENTRY abbrevid=9 op0=1 op1=438 op2=42 op3=84 op4=104 op5=114 op6=101 op7=97 op8=100 op9=58 op10=58 op11=99 op12=117 op13=114 op14=114 op15=101 op16=110 op17=116 op18=58 op19=84 op20=104 op21=114 op22=101 op23=97 op24=100/> record string = '*Thread::current:Thread'
526,528c527,529
< <FNENTRY abbrevid=9 op0=18 op1=800 op2=42 op3=84 op4=104 op5=114 op6=101 op7=97 op8=100 op9=35 op10=99 op11=117 op12=114 op13=114 op14=101 op15=110 op16=116 op17=95 op18=102 op19=105 op20=98 op21=101 op22=114 op23=61 op24=60 op25=70 op26=105 op27=98 op28=101 op29=114 op30=62 op31=58 op32=70 op33=105 op34=98 op35=101 op36=114/> record string = '*Thread#current_fiber=<Fiber>:Fiber'
< <FNENTRY abbrevid=9 op0=10 op1=654 op2=42 op3=84 op4=104 op5=114 op6=101 op7=97 op8=100 op9=64 op10=82 op11=101 op12=102 op13=101 op14=114 op15=101 op16=110 op17=99 op18=101 op19=35 op20=115 op21=97 op22=109 op23=101 op24=63 op25=60 op26=84 op27=104 op28=114 op29=101 op30=97 op31=100 op32=62 op33=58 op34=66 op35=111 op36=111 op37=108/> record string = '*Thread@Reference#same?<Thread>:Bool'
< <FNENTRY abbrevid=9 op0=2 op1=464 op2=42 op3=84 op4=104 op5=114 op6=101 op7=97 op8=100 op9=58 op10=58 op11=110 op12=101 op13=119 op14=58 op15=84 op16=104 op17=114 op18=101 op19=97 op20=100/> record string = '*Thread::new:Thread'
---
> <FNENTRY abbrevid=9 op0=18 op1=803 op2=42 op3=84 op4=104 op5=114 op6=101 op7=97 op8=100 op9=35 op10=99 op11=117 op12=114 op13=114 op14=101 op15=110 op16=116 op17=95 op18=102 op19=105 op20=98 op21=101 op22=114 op23=61 op24=60 op25=70 op26=105 op27=98 op28=101 op29=114 op30=62 op31=58 op32=70 op33=105 op34=98 op35=101 op36=114/> record string = '*Thread#current_fiber=<Fiber>:Fiber'
> <FNENTRY abbrevid=9 op0=10 op1=657 op2=42 op3=84 op4=104 op5=114 op6=101 op7=97 op8=100 op9=64 op10=82 op11=101 op12=102 op13=101 op14=114 op15=101 op16=110 op17=99 op18=101 op19=35 op20=115 op21=97 op22=109 op23=101 op24=63 op25=60 op26=84 op27=104 op28=114 op29=101 op30=97 op31=100 op32=62 op33=58 op34=66 op35=111 op36=111 op37=108/> record string = '*Thread@Reference#same?<Thread>:Bool'
> <FNENTRY abbrevid=9 op0=2 op1=467 op2=42 op3=84 op4=104 op5=114 op6=101 op7=97 op8=100 op9=58 op10=58 op11=110 op12=101 op13=119 op14=58 op15=84 op16=104 op17=114 op18=101 op19=97 op20=100/> record string = '*Thread::new:Thread'
530c531
< <FNENTRY abbrevid=9 op0=5 op1=533 op2=42 op3=84 op4=104 op5=114 op6=101 op7=97 op8=100 op9=35 op10=105 op11=110 op12=105 op13=116 op14=105 op15=97 op16=108 op17=105 op18=122 op19=101 op20=58 op21=80 op22=111 op23=105 op24=110 op25=116 op26=101 op27=114 op28=40 op29=86 op30=111 op31=105 op32=100 op33=41/> record string = '*Thread#initialize:Pointer(Void)'
---
> <FNENTRY abbrevid=9 op0=5 op1=536 op2=42 op3=84 op4=104 op5=114 op6=101 op7=97 op8=100 op9=35 op10=105 op11=110 op12=105 op13=116 op14=105 op15=97 op16=108 op17=105 op18=122 op19=101 op20=58 op21=80 op22=111 op23=105 op24=110 op25=116 op26=101 op27=114 op28=40 op29=86 op30=111 op31=105 op32=100 op33=41/> record string = '*Thread#initialize:Pointer(Void)'
534,535c535,536
< <FNENTRY abbrevid=9 op0=13 op1=687 op2=42 op3=84 op4=104 op5=114 op6=101 op7=97 op8=100 op9=35 op10=102 op11=105 op12=110 op13=97 op14=108 op15=105 op16=122 op17=101 op18=58 op19=40 op20=73 op21=110 op22=116 op23=51 op24=50 op25=32 op26=124 op27=32 op28=78 op29=105 op30=108 op31=41/> record string = '*Thread#finalize:(Int32 | Nil)'
< <FNENTRY abbrevid=9 op0=8 op1=596 op2=42 op3=84 op4=104 op5=114 op6=101 op7=97 op8=100 op9=64 op10=82 op11=101 op12=102 op13=101 op14=114 op15=101 op16=110 op17=99 op18=101 op19=35 op20=104 op21=97 op22=115 op23=104 op24=58 op25=85 op26=73 op27=110 op28=116 op29=54 op30=52/> record string = '*Thread@Reference#hash:UInt64'
---
> <FNENTRY abbrevid=9 op0=13 op1=690 op2=42 op3=84 op4=104 op5=114 op6=101 op7=97 op8=100 op9=35 op10=102 op11=105 op12=110 op13=97 op14=108 op15=105 op16=122 op17=101 op18=58 op19=40 op20=73 op21=110 op22=116 op23=51 op24=50 op25=32 op26=124 op27=32 op28=78 op29=105 op30=108 op31=41/> record string = '*Thread#finalize:(Int32 | Nil)'
> <FNENTRY abbrevid=9 op0=8 op1=599 op2=42 op3=84 op4=104 op5=114 op6=101 op7=97 op8=100 op9=64 op10=82 op11=101 op12=102 op13=101 op14=114 op15=101 op16=110 op17=99 op18=101 op19=35 op20=104 op21=97 op22=115 op23=104 op24=58 op25=85 op26=73 op27=110 op28=116 op29=54 op30=52/> record string = '*Thread@Reference#hash:UInt64'
543c544
< Total size: 30944b/3868.00B/967W
---
> Total size: 31040b/3880.00B/970W
551c552
< Percent of file: 2.1717%
---
> Percent of file: 2.1649%
564c565
< Percent of file: 10.3219%
---
> Percent of file: 10.2899%
583c584
< Percent of file: 0.3781%
---
> Percent of file: 0.3769%
596c597
< Percent of file: 3.3771%
---
> Percent of file: 3.3666%
609c610
< Percent of file: 3.5968%
---
> Percent of file: 3.5857%
627c628
< Percent of file: 20.0814%
---
> Percent of file: 20.0193%
655c656
< Percent of file: 0.6916%
---
> Percent of file: 0.6894%
669c670
< Percent of file: 19.7227%
---
> Percent of file: 19.6617%
685c686
< Percent of file: 15.6056%
---
> Percent of file: 15.5573%
708c709
< Percent of file: 2.1717%
---
> Percent of file: 2.1649%
722c723
< Percent of file: 10.7194%
---
> Percent of file: 10.6862%
744c745
< Percent of file: 1.3088%
---
> Percent of file: 1.3048%
756,757c757,758
< Total Size: 2997b/374.62B/93W
< Percent of file: 9.6852%
---
> Total Size: 3093b/386.62B/96W
> Percent of file: 9.9646%
760c761
< Num Records: 20
---
> Num Records: 21
765c766
< 20 2928 146.4 KIND
---
> 21 3021 143.9 KIND
OK, at this point I'm 99% sure it's a bug in LLVM. Here's some code to reproduce the issue very easily:
```cr
require "llvm"

lib LibLLVM
  fun create_memory_buffer_with_contents_of_file = LLVMCreateMemoryBufferWithContentsOfFile(path : UInt8*, out_mem_buf : MemoryBufferRef*, out_message : UInt8**) : Int32
  fun parse_ir_in_context = LLVMParseIRInContext(context : ContextRef, mem_buf : MemoryBufferRef, out_m : ModuleRef*, out_message : UInt8**) : Int32
end

module LLVM
  class MemoryBuffer
    def self.from_file(filename : String)
      ret = LibLLVM.create_memory_buffer_with_contents_of_file(filename, out mem_buf, out msg)
      if ret != 0 && msg
        raise LLVM.string_and_dispose(msg)
      end
      new(mem_buf)
    end
  end

  class Context
    def initialize
      @unwrap = LibLLVM.create_context
    end

    def parse_ir(buf : MemoryBuffer)
      ret = LibLLVM.parse_ir_in_context(self, buf, out mod, out msg)
      if ret != 0 && msg
        raise LLVM.string_and_dispose(msg)
      end
      Module.new(mod)
    end
  end

  class Module
    def initialize(@unwrap : LibLLVM::ModuleRef)
      @name = ""
      @owned = false
    end
  end
end

LLVM.init_x86

target_triple = LLVM.default_target_triple
target = LLVM::Target.from_triple(target_triple)
target_machine = target.create_target_machine(target_triple).not_nil!

ctx = LLVM::Context.global

fiber_buf = LLVM::MemoryBuffer.from_file("./F-iber.ll")
fiber_mod = ctx.parse_ir(fiber_buf)

thread_buf = LLVM::MemoryBuffer.from_file("./T-hread.ll")
thread_mod = ctx.parse_ir(thread_buf)

target_machine.emit_obj_to_file fiber_mod, "./Fiber.o"
thread_mod.write_bitcode_to_file "./Thread.bc"
```
What I'm doing here: running `crystal build foo.cr --ll` with an empty `foo.cr` file, then finding these `.ll` files inside `~/.crystal/cache`. The `.bc` changes depending on whether I emit a `.o` file for `F-iber.ll` or not.
What I found out is that if I use two different LLVM::Context instances, the problem goes away:
```cr
require "llvm"

lib LibLLVM
  fun create_memory_buffer_with_contents_of_file = LLVMCreateMemoryBufferWithContentsOfFile(path : UInt8*, out_mem_buf : MemoryBufferRef*, out_message : UInt8**) : Int32
  fun parse_ir_in_context = LLVMParseIRInContext(context : ContextRef, mem_buf : MemoryBufferRef, out_m : ModuleRef*, out_message : UInt8**) : Int32
end

module LLVM
  class MemoryBuffer
    def self.from_file(filename : String)
      ret = LibLLVM.create_memory_buffer_with_contents_of_file(filename, out mem_buf, out msg)
      if ret != 0 && msg
        raise LLVM.string_and_dispose(msg)
      end
      new(mem_buf)
    end
  end

  class Context
    def initialize
      @unwrap = LibLLVM.create_context
    end

    def parse_ir(buf : MemoryBuffer)
      ret = LibLLVM.parse_ir_in_context(self, buf, out mod, out msg)
      if ret != 0 && msg
        raise LLVM.string_and_dispose(msg)
      end
      Module.new(mod)
    end
  end

  class Module
    def initialize(@unwrap : LibLLVM::ModuleRef)
      @name = ""
      @owned = false
    end
  end
end

LLVM.init_x86

target_triple = LLVM.default_target_triple
target = LLVM::Target.from_triple(target_triple)
target_machine = target.create_target_machine(target_triple).not_nil!

ctx1 = LLVM::Context.new
ctx2 = LLVM::Context.new

fiber_buf = LLVM::MemoryBuffer.from_file("./F-iber.ll")
fiber_mod = ctx1.parse_ir(fiber_buf)

thread_buf = LLVM::MemoryBuffer.from_file("./T-hread.ll")
thread_mod = ctx2.parse_ir(thread_buf)

target_machine.emit_obj_to_file fiber_mod, "./Fiber.o"
thread_mod.write_bitcode_to_file "./Thread.bc"
```
I still consider the above to be a bug in LLVM: there's no reason why a `.bc` file should change for one module depending on whether I emitted a `.o` file for another module in the same context. I'm pretty sure that LLVM function somehow mutates the `LLVM::Context`.
In any case, to support parallel compilation without `fork`, like what we do right now (`fork` will most likely disappear from the language), we'll need to use different LLVM contexts, so I guess it's time to start working on that :-)
If I remember correctly, there is a global context in the LLVM C API, though the C++ API no longer has one; it changed in LLVM 3.7 or 3.8.
Some progress on this: I managed to refactor the compiler's codegen to use a different `LLVM::Context` instance for each module. This opens the door to generating bc/obj files in multiple threads, and also partially fixes the problem described in this issue. Now with an empty file, all `.bc` files are reused on the second compilation. But with bigger files, like the http server example or the compiler itself, there are still some files that are generated differently. I'm pretty sure at this point that this is an issue in LLVM itself. One way to "fix" this would be to first generate all `.bc` files and then all `.o` files, but that might slow things down a bit... we can keep this issue open, but it's really minor, as eventually all `.bc` files end up being reused.
Looking into the changed binary cache files provides further clues, since various debug info is legible.
I'm guessing _H-ash40T-hread4432N-il41.bc_ changes because debug information is referenced via hash tables, and initialising these hash tables (on the initial build) is treated differently than on subsequent builds; thus _H-ash40T-hread4432N-il41.bc_ only changes once.
The files _S-et40T-hread41.bc_ and _T-hread.bc_ appear to change as a side effect of _H-ash40T-hread4432N-il41.bc_ changing.
Some useful info is here: http://llvm.org/docs/SourceLevelDebugging.html
This is largely speculation, but it looks like a debug trade-off against cache reuse; `--no-debug` or release builds negate the trade-off.
https://gist.github.com/mjago/1174481762671c8293ab8a0fc94b78e0
I believe this issue should be closed, because it was fixed along with issue #4336.
On master branch (compiler built on master), second compilation (after cache cleanup):
➜ crystal git:(Float64.hash-inspired-by-_Py_HashDouble) ✗ bin/crystal build --stats empty.cr
Using compiled compiler at `.build/crystal'
...
Codegen (bc+obj):
- all previous .o files were reused
@akzhan looks like it's working, closing it.
Thank you for the follow-up.
Actually, reopening this based on the last comment from @akzhan in #4336 😢
This is just a compiler detail. Not a bug. I don't think we can improve this because LLVM doesn't guarantee we can reuse modules.