Julia: log_record_id runtime increases quadratically with duplicate logs

Created on 2 Aug 2018  Β·  4Comments  Β·  Source: JuliaLang/julia

I've copied the code in question from logging.jl here:

_log_record_ids = Set{Symbol}()
# Generate a unique, stable, short, human readable identifier for a logging
# statement.  The idea here is to have a key against which log records can be
# filtered and otherwise manipulated. The key should uniquely identify the
# source location in the originating module, but should be stable across
# versions of the originating module, provided the log generating statement
# itself doesn't change.
function log_record_id(_module, level, message_ex)
    modname = _module === nothing ?  "" : join(fullname(_module), "_")
    # Use (1<<31) to fit well within an (arbitriraly chosen) eight hex digits,
    # as we increment h to resolve any collisions.
    h = hash(string(modname, level, message_ex)) % (1<<31)
    while true
        id = Symbol(modname, '_', string(h, base = 16, pad = 8))
        # _log_record_ids is a registry of log record ids for use during
        # compilation, to ensure uniqueness of ids.  Note that this state will
        # only persist during module compilation so it will be empty when a
        # precompiled module is loaded.
        if !(id in _log_record_ids)
            push!(_log_record_ids, id)
            return id
        end
        h += 1
    end
end

Note that it will loop the number of times the log has previously been logged. This makes benchmarking logging very difficult, as this function takes up almost the entire runtime.

julia> function xparse4(str)
           @debug "xparse"
           Base.parse(Int, str)
       end

julia> @benchmark @suppress xparse4("0")
BenchmarkTools.Trial:
  memory estimate:  1.20 MiB
  allocs estimate:  22514
  --------------
  minimum time:     2.856 ms (0.00% GC)
  median time:      3.752 ms (0.00% GC)
  mean time:        3.979 ms (4.19% GC)
  maximum time:     51.694 ms (92.27% GC)
  --------------
  samples:          1255
  evals/sample:     1
julia> results = @benchmark Base.CoreLogging.log_record_id(@__MODULE__, :Debug, :("xparse"))
BenchmarkTools.Trial:
  memory estimate:  1.04 MiB
  allocs estimate:  24700
  --------------
  minimum time:     2.852 ms (0.00% GC)
  median time:      4.141 ms (0.00% GC)
  mean time:        4.366 ms (2.64% GC)
  maximum time:     46.329 ms (90.08% GC)
  --------------
  samples:          1143
  evals/sample:     1



md5-44b8c16115ee11aae990195b0ad8d1b5



julia> lineplot(StatsBase.trim(results.times; count=10))
           β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
   8000000 β”‚β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β”‚
           │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⑆⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│
           │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⑇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│
           │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⑇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│
           │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⒀⑇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│
           β”‚β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β£Έβ €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β”‚
           │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⑇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│
           │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⑼⠁⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│
           β”‚β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β’€β‘žβ β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β”‚
           β”‚β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β£ β žβ €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β”‚
           β”‚β €β €β €β €β €β €β €β €β €β €β €β €β €β €β£ β ΄β šβ β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β”‚
           β”‚β €β €β €β €β €β €β €β €β €β£€β‘€β ΄β šβ ‰β β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β”‚
           │⠀⠀⠀⠀⣀⑀⠴⠒⠋⠁⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│
           │⒀⑀⠖⠋⠁⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│
   3000000 β”‚β ‹β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β”‚
           β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
           0                                     2000

Do we even need this function? Could we just have logs not have ids unless they're explicitly provided? Where are these ids being used?

logging performance

Most helpful comment

Root cause is the same as #28786 (bug introduced in #28017)

Analysis: https://github.com/JuliaLang/julia/pull/29355#issuecomment-431543734

It's concerning that nobody seems to know what id is for, but reading back on https://docs.julialang.org/en/v1/stdlib/Logging/index.html#Log-event-structure-1 I guess I can see why; the docs don't explain in depth.

All 4 comments

@StefanKarpinski I don't think you're going to get any response from Chris, he hasn't been active for half a year now

Fair enough. I’m still giving the responsibility to him :)

Root cause is the same as #28786 (bug introduced in #28017)

Analysis: https://github.com/JuliaLang/julia/pull/29355#issuecomment-431543734

It's concerning that nobody seems to know what id is for, but reading back on https://docs.julialang.org/en/v1/stdlib/Logging/index.html#Log-event-structure-1 I guess I can see why; the docs don't explain in depth.

Log id's are useful. They are used in OhMyREPL's logger to allow repeated log messages e.g. in s loop to be overwritten.

With that said because of #28786 OhMyREPL's logger also reassigned all ids to be concatenation of file name and line number.

But that is fixed now.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

i-apellaniz picture i-apellaniz  Β·  3Comments

ararslan picture ararslan  Β·  3Comments

yurivish picture yurivish  Β·  3Comments

omus picture omus  Β·  3Comments

omus picture omus  Β·  3Comments