The following metadata should just work:
:name (source). This option appears optionally in :registered_name in proc_lib crash reports and it may be present in behaviours as an atom or PIDThe following metadata requires new formatter rules:
:initial_call (source) - available only in proc_lib crash reports.The following metadata is always ignored unless explicitly fetched by the logger backend:
We can also use the first stacktrace entry to fill in the module and function, although we can get false positives. Alternatively we can get the module from initial_call.
happy to take this if nobody is actively working on it. would be able to get started on it this week 馃檪
@mitchellhenke I am already working on it :) but reviews will be really appreciated when I send a PR. :)
would be glad to!
Actually, one question. error_logger can receive events like the following, which includes a good amount of information like process dictionary, reductions, etc.:
{:error_report, #PID<0.50.0>,
{#PID<0.219.0>, :crash_report,
[
[
initial_call: {MyApp.ErrorLoggerTest,
:"-test logs errors from crashed processes/1-fun-0-", []},
pid: #PID<0.219.0>,
registered_name: [],
error_info: {:exit,
{%RuntimeError{message: "MyError"},
[
{MyApp.ErrorLoggerTest,
:"-test logs errors from crashed processes/1-fun-0-", 0,
[
file: 'test/lib/my_app/error_logger_test.exs',
line: 20
]},
{Task.Supervised, :do_apply, 2,
[file: 'lib/task/supervised.ex', line: 88]},
{:proc_lib, :init_p_do_apply, 3, [file: 'proc_lib.erl', line: 247]}
]},
[
{Task.Supervised, :exit, 4,
[file: 'lib/task/supervised.ex', line: 121]},
{:proc_lib, :init_p_do_apply, 3, [file: 'proc_lib.erl', line: 247]}
]},
ancestors: [#PID<0.217.0>],
message_queue_len: 0,
messages: [],
links: [],
dictionary: [
my_process_dictionary_key: "value"
],
trap_exit: false,
status: :running,
heap_size: 610,
stack_size: 27,
reductions: 345
],
[]
]}}
Would it be reasonable to include that information as well?
comment updated
We want to include some of it (pid, registered_name, error_info, etc) but not all of it. It should be easy to add new information at any time though. Is there anything in particular you would like to add?
dictionary is particularly useful I think. It would include logger_metadata which could even be pulled out and passed to the eventual Logger call as well?
@mitchellhenke oh, genius. We should extract the logger_metadata from the dictionary. I wouldn't include other dictionary keys though. Is that ok?
@josevalim It's a grey area to me. My argument in favor would be that it's information included in a log event from Erlang, so it could be passed along to all of the Logger backends to do with what they like.
I also understand not wanting to pass around the whole thing and having to deal with issues that could arise from that.
If other process dictionary keys weren't included, my use case would have to come up with a process-based dictionary somewhere or make use ofLogger.metadata to store/access the same information.
The problem is that anything can store something in the dictionary, such as the rand module. I think we can be conservative here. In the worst scenario, you can provide your own translation module and extract metadata before Logger puts its hands on it. :)
I hadn't thought of a custom translator, but that would work as well. My preference would be to allow Logger backends to receive all of that information, but I understand being conservative in handling it.
This discussion has been really helpful, and I appreciate you taking the time to explain things and answer my questions. Looking forward to the PR! 馃檪
Apologies but I had to focus on other stuff and I was not able to progress on this.
@mitchellhenke I do have one question though. I see that you also have a plug: https://github.com/getsentry/sentry-elixir/blob/747964b81dab4a3f79a154d3f918ba3c9fd95c2a/lib/sentry/plug.ex#L1
How do you avoid reporting the errors twice? One from plug and another from the process crash?
@josevalim No worries about having other stuff to focus on, completely understand!
This is something that's currently not ideal, but Sentry's current error_logger handler doesn't handle the message sent on a Plug request process crash.
I've considered expanding what events the handler generates reports from to minimize other configuration (like the Plug) that is necessary, but stacktrace information is generally taken from the arguments to the formatted message at the moment. My understanding is that if the format of the messages were to change it may cause issues with fetching that information?
For example, I see that when a Plug request crashes, error_logger passes along the message Ranch listener ~p had connection process started with ~p:start_link/4 at ~p exit with reason: ~999999p~n'. It's a list of 4 pieces of data, the final one which contains the structured stacktrace. That last one is a two-element tuple in which I could match on {{error, stack}, _} and pull out the stacktrace.
I wasn't sure how often the format may change, so I've avoided depending on it in favor of solutions like the Plug.
If your time is limited because of other things, I'd be glad to take this on and take it in whichever direction you think is best in solving the original issue 馃檪
@mitchellhenke the PR #7649 adds support for returning metadata in the translator. We should merge it tomorrow. If you would like to go ahead and return the relevant metadata after it is merged a PR will be very welcome!
Sorry for the slow replies but some of the decision around this area was pending on OTP 21 but we should be good to go now.
@mitchellhenke also please note that Elixir v1.7 and Erlang/OTP 21 will automatically read the Logger metadata on log messages. So this particular issue has already been solved. We can now focus on extracting the metadata from the error reports themselves.
@josevalim awesome, thank you!
I do have a couple questions if you don't mind. What do you mean by Logger metadata will automatically be read? Is there somewhere to read what is all changing for logging in OTP 21?
Thanks again 馃檪
@mitchellhenke you can read the docs in master: https://github.com/erlang/otp/blob/master/lib/kernel/doc/src/logger.xml
But in a nutshell, the old gen_event error_logger will be deprecated in favor of a new logger mechanism that works on the caller process. This gives us the opportunity to hook into the logger and make it behave as if it was calling our own logger, including in regards to the whole back pressure mechanism.
@josevalim thanks for the link! I am hoping to start taking a look into it this week.
@josevalim took longer than I hoped to mess around with it, but I put together a small repo here with a sample translator that matches on a specific message format received from error_logger. It then grabs the error and stacktrace and returns it in the metadata.
One potential issue is being able to return just metadata and skip attempting to translate the message as the above example ends up returning an empty message. Would it be possible to support something like returning {:ok, :skip, new_metadata}?
The big issue is there's no guarantees about the shape of messages error_logger will generate, and whether they may conflict and not have the expected data in the formatted arguments. Is having translations like the sample above something that would potentially be included in the default translator?
@mitchellhenke we could allow {:none, metadata} but in this case you could format the message as well by calling :io_lib.format(msg, args). So I would prefer to go the :io_lib.format/2 way.
Ah ok, I was hesitant to do any message formatting as I wasn't sure if the translator should be strictly concerned with metadata.
Do you think that the structure in the example is workable for including metadata in Logger events from error_logger, and is it something that you think would be included in core?
@mitchellhenke I believe Elixir should provide some metadata by default, at least the ones defined in the issue topic.
Hi @mitchellhenke! This week we are closing in on the issues tracker to do a release candidate for v1.7.0. Did you move forward on this front? If you haven't finished yet but already has something working, please let us know.
@josevalim yeah! I think I've got most of the structure in place, but ran out of time this week to add it across all of the translator tests. I'll be able to finish it up tomorrow if that's alright?
Current progress can be seen here (happy to receive feedback): https://github.com/elixir-lang/elixir/compare/master...mitchellhenke:logger-metadata?expand=1
@mitchellhenke looks excellent to me! :D
I think we could change the translator tests in their entirety to use the new backend but we should do it later. :)
I have played a bit with initial_call and unfortunately it is very inconsistent. For example, for supervisors, it returns {supervisor, Module, 1}. My suggestion is that, instead of including initial_call, we should include initial_module or proc_module, that only includes the module name and does not need extra rules.
I think the weird supervisor name is used by OTP to make it obvious that a process is a supervisor in observer and similar tooling. As far as I know, it's not used for any functionality at the moment.
But after your change for supervisors to return proper metadata that they are supervisors from format_status, I think it should be reasonable to make OTP and the tooling to use that and have normal initial_call for supervisors.
@michalmuskala good point. I think for now we can go ahead and change the supervisor initial_call to a proper MFA and one day we may no longer need to do it.
@mitchellhenke since we are closing on a release candidate for v1.7, i went ahead and implemented the remaining initial_call and registered_name metadata. Thanks for the great test logger and the crash report implementation. :heart: :green_heart: :blue_heart: :yellow_heart: :purple_heart:
Closing in favor of #7844.
@josevalim sounds good! Happy to have been able to contribute a little bit 馃檪
I'll definitely play around with the new translators/metadata when the release candidate comes out.