Enhancements: Structured logging

Created on 6 Mar 2020  ·  73Comments  ·  Source: kubernetes/enhancements

Enhancement Description

Define standard structure for Kubernetes log messages, add methods to klog to enforce this structure, add ability to configure Kubernetes components to produce logs in JSON format and initiate migration to structured logging.

  • Kubernetes Enhancement Proposal: Structured Logging KEP
  • Primary contact (assignee): @serathius
  • Responsible SIGs: sig-instrumentation
  • Enhancement target (which target equals to which milestone):

    • Alpha release target 1.19

Alpha

  • [x] Implement json log format (@yuzhiquan, https://github.com/kubernetes/kubernetes/issues/91490)

  • [x] Update flag help to list both format options (@serathius, https://github.com/kubernetes/kubernetes/pull/92177)
  • [x] Add --logging-format flag to kubernetes components

    List of components

    • [x] kube-apiserver (https://github.com/kubernetes/kubernetes/pull/91501, @tahsinrahman)
    • [x] kube-controller-manager (https://github.com/kubernetes/kubernetes/pull/91521, @SataQiu)
    • [x] kubelet (https://github.com/kubernetes/kubernetes/pull/91532, @afrouzMashaykhi)
    • [x] kube-scheduler (https://github.com/kubernetes/kubernetes/pull/91522, @SataQiu)
  • [x] Write documentation for new logging format (https://github.com/kubernetes/website/pull/21202, @serathius)
  • Follow-up from main tasks:

    • [x] https://github.com/kubernetes/klog/issues/153 (@tahsinrahman)
    • [x] https://github.com/kubernetes/klog/issues/156 (@yuzhiquan)
    • [x] https://github.com/kubernetes/klog/issues/158 (@tahsinrahman)
    • [x] Update klog in k/k to collect bug fixes above (@serathius, https://github.com/kubernetes/kubernetes/pull/91792)
    • [x] https://github.com/kubernetes/klog/issues/165 (@physcat)
    • [x] Update klog to collect bug fixes above (@serathius, https://github.com/kubernetes/kubernetes/pull/92554)
    • [x] Remove locking from logFormatRegistry as it's not needed (@rahulchheda, https://github.com/kubernetes/kubernetes/pull/92347)
    • [x] Validate if incompatible klog flags were used (@rahulchheda, https://github.com/kubernetes/kubernetes/pull/92394)
    • [x] Fix JSON logger verbosity (@yuzhiquan, https://github.com/kubernetes/kubernetes/pull/92788)

    Moved to Beta

    • [ ] Migrate Trace Logs to Structured Logging (trace.go:116) (@hase1128, https://github.com/kubernetes/utils/pull/171)
    • [ ] Move klog flags to Logs FlagSet (https://github.com/kubernetes/kubernetes/pull/92707)
    • [ ] Refactor Options to use LoggingConfiguration (https://github.com/kubernetes/kubernetes/pull/92732)

    Please let me know if someone is interested in working on one of the tasks.

    siinstrumentation stagbeta trackeno trackeyes

    Most helpful comment

    Congrats and thanks to everyone who contributed to Structured Logging. Hope we will have more opportunities to work together.

    All 73 comments

    /sig instrumentation

    /help

    @serathius:
    This request has been marked as needing help from a contributor.

    Please ensure the request meets the requirements listed here.

    If this request no longer meets these requirements, the label can be removed
    by commenting with the /remove-help command.

    In response to this:

    /help

    Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

    interested in

    /cc

    /milestone v1.19

    @serathius: You must be a member of the kubernetes/milestone-maintainers GitHub team to set the milestone. If you believe you should be able to issue the /milestone command, please contact your and have them propose you as an additional delegate for this responsibility.

    In response to this:

    /milestone v1.19

    Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

    /milestone v1.19

    /cc

    KEP was marked as "implementable", so we can start working on it.
    I have created first two subtasks within specific repositories. I will add more as work progresses and new tasks are unblocked.

    If your interested in contributing please volunteer directly on subtasks or ask for specific task here.

    /cc

    After discussing offline with @dims we concluded that upgrading klog will be non-trivial as we need to upgrade all references at once. Upgrade cannot be done partially as it would break klog configuration.

    We considered two options:

    • [Preferred] Properly update of all dependencies
    • Aliasing in go modules

    As we have a lot of time before 1.19 cut we would like to do it properly. I will create subtask with upgrade plan.

    /cc

    @serathius Enhancements shadow for v1.19 here. Do you see this as moving to alpha in 1.19?

    Yes, I think we are on track. We are finishing klog upgrade, after that next milestones will be really fast to implement.

    Thanks, I'll add it to the tracking sheet.

    /stage alpha

    Hi @serathius, I was looking over the KEP and noticed that there is no test plan defined. Please add a bit about the test plan in order to target to 1.19. _Enhancements freeze is next week, May 19._

    Also, thanks for adding the PRR questions from the previous pilot. We have now updated those and integrated that with the updated KEP format. If you can, please take this opportunity to update the KEP to the new format and answer any additional questions. Thanks!

    Thanks @serathius for answering the updated questions. When you have a chance please also re-format to the new format. You do not need to do that before the freeze tomorrow though.

    Hi @serathius - I am Savitha,1.19 Docs lead. Does this enhancement work planned for 1.19 require any new docs (or modifications to existing docs)? If not, can you please update the 1.19 Enhancement Tracker Sheet, or let me know, I can do it for you :)
    If docs are required, just a friendly reminder that we are looking for a PR against k/website (branch dev-1.19) due by Friday, June 12, it can just be a placeholder PR at this time. Let me know if you have any questions!

    We should document structured logging format I will create a placeholder PR

    With --logging-format flag implemented and migration instructions draft, there are lot's of tasks that were unblocked.
    Some easy tasks:

    • Adding logging flag to k8s component (Kubelet little harder, discussed below)
    • Migrating logging call

    Harder tasks:

    • Writing documentation
    • Implementing json format (https://github.com/kubernetes/kubernetes/issues/91490)

    I notice that existing flags about log mostly in Global flags, and implement in klog or component/logs, such like:

    Global flags:
    
          --alsologtostderr                                                                                                                                                               
                    log to standard error as well as files
      -h, --help                                                                                                                                                                          
                    help for kube-scheduler
          --log-backtrace-at traceLocation                                                                                                                                                
                    when logging hits line file:N, emit a stack trace (default :0)
          --log-dir string                                                                                                                                                                
                    If non-empty, write log files in this directory
          --log-file string                                                                                                                                                               
                    If non-empty, use this log file
          --log-file-max-size uint                                                                                                                                                        
                    Defines the maximum size a log file can grow to. Unit is megabytes. If the value is 0, the maximum file size is unlimited. (default 1800)
          --log-flush-frequency duration                                                                                                                                                  
                    Maximum number of seconds between log flushes (default 5s)
          --logtostderr                                                                                                                                                                   
                    log to standard error instead of files (default true)
    
    

    kube-scheduler or apiserver does not impelment them in component code.
    So i don't know what's the propose of this subtask adding logging flag to k8s component?

    Your right that currently there are a lot of global flags about logging. Those flags belong to klog which was introduced early in k8s development (before forking klog we used glog). With growth of k8s repo global flags started to be problematic (in k/k repo there are over 20 binaries which not always want to share command flags). Now it's an anti pattern to use global flags. I expect that in future when replacing klog, global flags will be removed.

    With your change kube-scheduler or kube-apiserver do not implement this flag it requires another step which is the subtask adding logging flag to k8s component. https://github.com/kubernetes/kubernetes/pull/89683 implements special set of functions AddFlags, Validate and Apply that are meant to be invoked within component setup itself allowing picking which flag are needed and easier use outside of kubernetes/kubernetes repo.

    Here is a PR that shows how to register additional flag in components https://github.com/kubernetes/kubernetes/pull/85266/files
    In this effort I decided to split task for flag "implementation" and flag "installation" to speed collection of approvals as different components are managed by different SIGs.

    Hope this helps to explain reason and how to add flags to components. Please let me know if you have other questions.

    OK, that's explained, thanks!

    Created subtask for json https://github.com/kubernetes/kubernetes/issues/91490
    For other tasks feel free to grab them. Please remember to request here them to make sure we don't duplicate work.

    I try to do pkg/kubelet task in Migrate selected log messages to klog.InfoS

    /cc

    Hey @hase1128, Adding flag to Kubelet is little harder as it requires code to support dynamic configuration. This PR shows additional places that need to be changed https://github.com/kubernetes/kubernetes/pull/85282/files

    @serathius
    Thanks the information, but I was going to take a task which migrate to infoS(not the task which adds flag).
    I try to see also task of flag. if it is difficult for me, I inform you.

    @hase1128 Oh, sorry. Your right, I read your commend incorrectly.

    @serathius : can I take up migrate to InfoS for Apiserver?

    Sure

    Small announcement.
    One of the approvers requested that json format is implemented before we introduce --logging-format flag as it's not very useful without it. We are prioritizing json implementation to make sure we are code complete before freeze.

    In mean time feel free to work on migrating log calls based on draft of instructions.
    https://github.com/kubernetes/community/pull/4793/files?short_path=af969dd#diff-af969ddd1593b66ad7330155ce792feb

    @serathius can i work on migrate log calls for pkg/controller?

    Ofc

    @serathius - I can take /plugin/pkg/auth/authorizer (node_authorizer.go:197) if it hasn't already been claimed OOB. I am most involved in sig-auth so seems like a reasonable fit.

    assigned

    I take pkg/scheduler for Migrate

    Can I try to migrate staging/src/k8s.io/client-go (event.go:278)?
    I assume that this migration is below:

    • StartLogging(klog.Infof) -> StartLogging(klog.InfoS)
    • change the variable in staging/src/k8s.io/client-go (event.go:278).

    @KobayashiD27 assigned you to staging/src/k8s.io/client-go
    Please follow instructions here https://github.com/kubernetes/community/pull/4793/files?short_path=af969dd#diff-af969ddd1593b66ad7330155ce792feb

    I take vendor/k8s.io/utils for Migrate.

    Congrats to @hase1128 for getting first migration PR merged.

    Awesome!

    Hi @serathius , any other tasks left to be done on this issue?

    @rahulchheda, currently all coding tasks are taken. There is one tasks for writing documentation, but I expect it would be pretty hard and not very interesting for new contributors. Still please let me know if your interested.

    Logging format was implemented, now we can move forward with PRs adding flags to components!
    Great work @yuzhiquan on JSON!!

    Hi @serathius !

    To follow-up on the email sent to k-dev on Monday, I wanted to let you know that Code Freeze has been extended to Thursday, July 9th. You can see the revised schedule here: https://github.com/kubernetes/sig-release/tree/master/releases/release-1.19
    We expect all PRs to be merged by that time. Please let me know if you have any questions. 😄

    Best,

    Kirsten

    Thanks for info @kikisdeliveryservice

    Added two new tasks for grabs as followups from https://github.com/kubernetes/kubernetes/pull/92177

    @rahulchheda please let me know if your still interested

    Yes @serathius, I would like to take them up

    Hi @serathius :wave:, I noticed that there are some pending action items in this issue description. Also, I see https://github.com/kubernetes/kubernetes/pull/92788 is still in progress. Do you think all of the implementation PRs would be merged by Thursday, July 9th?

    Thank you. :slightly_smiling_face:


    Code Freeze begins on Thursday, July 9th EOD PST

    @palnabarun Yes, for Alpha (planned for 1.19) we are left with only manual test and 1 small bugfix. I will work with @yuzhiquan

    @serathius -- Thanks for the update. :+1:

    Please let us know if the release team can help you in any way getting the pending work completed before the freeze.

    Hey @palnabarun looks like fix https://github.com/kubernetes/kubernetes/pull/92788 has been stuck in merge queue for last 24h (retested >3 times with 0 test failures). I'm worried it will not get merged.

    @serathius -- Let me have a look and take it up with the appropriate teams. Thanks for pointing it out. :+1:

    Hi @serathius kubernetes/kubernetes#92788 finally merged I will update the enhancements sheet accordingly. Congrats!

    Congrats and thanks to everyone who contributed to Structured Logging. Hope we will have more opportunities to work together.

    w00t! @serathius

    /milestone clear

    (removing this enhancement issue from the v1.19 milestone as the milestone is complete)

    Hi @serathius

    Enhancements Lead here. Any plans for this to go beta in 1.20?

    Thanks!
    Kirsten

    Beta work on this is a maybe for 1.20 right now -- TBD based on people's availability

    Hey @serathius @ehashman -- 1.20 Enhancements Shadow here 👋 friendly reminder that Enhancements Freeze deadline is October 6th.

    Do we have an update on going beta in 1.20?

    I've also noticed that some of the graduation criteria for Alpha is still being worked on, unless I am mis-interpreting the context of the PRs 😅

    For example, the Alpha graduation criteria was:

    Introduce structured logging and JSON format:
    
    Most important logs are migrated to structured methods.
    Flag for selecting logging format is implemented
    JSON format is implemented
    

    The Beta graduation criteria is:

    Adding guarding against regression and update user guideline:
    
    Static analysis protects important log lines from reverting to string formating
    Kubernetes Logging convention document is updated
    

    But, I've noticed that despite clearing the 1.19 milestone, we are still working on:

    Can we please verify and update the graduation criteria for the 1.20 release and future releases?

    Appreciate the work everyone!

    Regards,
    Jeremy

    /remove-milestone v1.20

    Don't currently have the capacity to push this to beta in the 1.20 release, will revisit once log sanitization work is complete.

    /milestone clear

    @serathius are we going to add a WarningS or similar function in the beta? @endocrimes pointed out that while we currently have Info and Error implementations, Warning is currently missing.

    Happy to open a PR if we want to include it - mostly spotted this while trying to cleanup some logs in kubelet.

    @endocrimes go for it! let's debate on the PR :)

    Structured Logging has different semantic then original klog methods. Structured logging interface is based on https://github.com/go-logr/logr which doesn't use Warning. From documentation

    Verbosity-levels on info logs. This gives developers a chance to indicate arbitrary grades of importance for info logs, without assigning names with semantic meaning such as "warning", "trace", and "debug". Superficially this may feel very similar, but the primary difference is the lack of semantics. Because verbosity is a numerical value, it's safe to assume that an app running with higher verbosity means more (and less important) logs will be generated.
    

    https://github.com/kubernetes/klog/pull/184#issuecomment-716976865 has some good info that currently only lives in the migration plan but I think would be valuable to put in the KEP if this is the direction we're moving.

    Was this page helpful?
    0 / 5 - 0 ratings

    Related issues

    euank picture euank  ·  13Comments

    wlan0 picture wlan0  ·  9Comments

    majgis picture majgis  ·  5Comments

    msau42 picture msau42  ·  13Comments

    xing-yang picture xing-yang  ·  13Comments