Enhancements: Extending Hugepage Feature

Created on 5 Feb 2020 · 19 Comments · Source: kubernetes/enhancements

Enhancement Description

  • One-line enhancement description:

    • Extend the hugepages feature to overcome its limitations. This enhancement consists of 1) container isolation of hugepages and 2) support for multiple sizes of hugepages.
  • Kubernetes Enhancement Proposal: https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/20190129-hugepages.md

  • Primary contact (assignee): @bg-chun

  • Responsible SIGs: sig-node

  • Enhancement target (which target equals to which milestone):

    • Alpha release target (1.18) // This enhancement extends a feature that is already at the GA stage.
    • Beta release target (x.y)
    • Stable release target (x.y)

PR Tracker

The hugepages KEP has been updated for several enhancements.

  • Support container isolation of hugepages / KEP Update1(merged)
  • Support multi size hugepages at host level / KEP Update2(merged)
  • Support multi size hugepages at container level / KEP Update2(merged)
  • Support hugepage reservation for system-level service / part of original KEP

PRs of container isolation of hugepages

Kubernetes Side

PR | Description | Status | Target | Owner
-- | -- | -- | -- | --
https://github.com/kubernetes/kubernetes/pull/83614 | Update CRI to support hugepages | merged | 1.18 | @bg-chun
https://github.com/kubernetes/kubernetes/pull/84154 | Support for setting hugepages limit during container creation | merged | 1.18 | @ohsewon
https://github.com/kubernetes/kubernetes/pull/87118 | e2e_node test for container isolation of hugepage | need review | 1.19 | @ohsewon

CRI Runtime Side

PR | Description | Status | Target | Owner
-- | -- | -- | -- | --
https://github.com/kubernetes/kubernetes/pull/84701 | Update Dockershim | WIP | 1.19 | @admanV
https://github.com/moby/moby/pull/40160 | Add hugepages field to resource(moby) | Approved | 1.18 | @bg-chun
https://github.com/cri-o/cri-o/pull/2940 | Update Container runtimes(cri-o) | Merged | 1.18 | @bg-chun
https://github.com/containerd/cri/pull/1332 | Update Container runtimes(containerd) | Merged | 1.18 | @bg-chun

PRs of support multiple sizes of hugepages

PR | Description | Status | Target | Owner
-- | -- | -- | -- | --
https://github.com/kubernetes/kubernetes/pull/82820 | Support for pre-allocated hugepages with 2+ sizes(for host) | Merged | 1.18 | @odinuge
https://github.com/kubernetes/kubernetes/pull/84051 | Support for multiple sizes huge pages(for pod) | Merged | 1.18 | @bart0sh

Hugepages feature-related PRs (out of scope of KEP updates)

PR | Description | Status | Target | Owner
-- | -- | -- | -- | --
80831 | Add support for removing unsupported huge page sizes | got lgtm | | @odinuge
83541 | Support for reserving hugepages for system and kubelet | need 2nd review | | @odinuge
80605 | Add huge page usage stats to kubectl describe node | need CLI review | | @odinuge
81774 | Bugfix: Kubelet doesn't update /sys/fs/cgroup/hugetlb/kubepods/hugetlb.2MB.limit_in_bytes upon Node Status Update | got lgtm | | @rojkov

Labels: kind/feature, lifecycle/stale, sig/node, tracked/no

All 19 comments

/milestone v1.18

@bg-chun: You must be a member of the kubernetes/milestone-maintainers GitHub team to set the milestone. If you believe you should be able to issue the /milestone command, please contact your and have them propose you as an additional delegate for this responsibility.

In response to this:

/milestone v1.18

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

/sig node
/kind feature

/assign @bg-chun

Adding the related Slack conversation with @justaugustus and @jeremyrickard:
https://kubernetes.slack.com/archives/C2C40FMNF/p1580827241393800

justaugustus:kubecat:  11:40 PM
So the biggest thing I see is that this is missing a way for the Release Team to track it  
The tracking issue that you have open should really be in the enhancements repo  
The KEP is missing the Release Team checklist, which might have been implemented after the enhancement went GA  
but given that this is a revisit of the KEP, I'd suggest opening another enhancements issue and following the template.  
From there, you'll need to add the Release Team Checklist: https://github.com/kubernetes/enhancements/blob/master/keps/YYYYMMDD-kep-template.md#release-signoff-checklist

https://kubernetes.slack.com/archives/C2C40FMNF/p1580827722403800

jerickar  11:48 PM
Was just about to write that :) @bg.chun the enhancement freeze for 1.18 was last week.  
You'd need to get the Issue created and the KEP updated ASAP and we would need to grant you an exception to get into the release.  
You should for sure do what @justaugustus has called out above and file the exception request : https://github.com/kubernetes/sig-release/blob/master/releases/EXCEPTIONS.md  
The new issue and the KEP will need to happen regardless of release though, so even if we can't grant an exception for 1.18 you will need those for 1.19

/cc @justaugustus, @jeremyrickard @derekwaynecarr, @bart0sh ,@odinuge, @kad
sig-release: @justaugustus, @jeremyrickard
sig-node: @derekwaynecarr, @bart0sh ,@odinuge, @kad , @bg-chun

Following sig-release's guidance, I created this issue for release tracking and added the checklist to it.

I have some questions about the release checklist.
We are extending the hugepages feature, which is already implemented and at the GA stage.

So it is hard to satisfy the checklist right now.
Here is my list of questions.

1) KEP approvers have set the KEP status to implementable
The KEP is already implemented/GA.
What status should the KEP have in this case?
Should we change the status and then start alphav2/betav2/GAv2?

2) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input
I opened a PR to update the KEP to have a test plan. Is it sufficient?
https://github.com/kubernetes/enhancements/pull/1540
=> Done

3) "Implementation History" section is up-to-date for milestone
I opened a PR to update the KEP's implementation history. Is it sufficient?
https://github.com/kubernetes/enhancements/pull/1540
=> Done

4) User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
https://github.com/kubernetes/website/pull/19008
=> Done

/milestone v1.18

Hello @bg-chun, I'm the 1.18 docs lead.
Does this enhancement work planned for 1.18 require any new docs (or modifications to existing docs)? If not, can you please update the 1.18 Enhancement Tracker Sheet (or let me know and I'll do so)
If so, just a friendly reminder we're looking for a PR against k/website (branch dev-1.18) due by Friday, Feb 28th, it can just be a placeholder PR at this time. Let me know if you have any questions!

@VineethReddy02 Yes, this enhancement requires documentation update. Here is a PR for it: https://github.com/kubernetes/website/pull/19008

Hey @bg-chun @bart0sh,

Thanks so much for all the effort in getting this through! Just a friendly reminder that code freeze for 1.18 is March 05, 2020. Is there anything we should track, aside from your very helpful PR tracker up at the top of the issue?

Is there anything we should track
=> I think so. We have one unmerged PR (https://github.com/kubernetes/kubernetes/pull/84051),
and @liggitt requested a change to the validation logic.

@bg-chun thanks for getting that PR merged. You mentioned @liggitt suggested a change for the validation logic, do you have a PR for that?

@jeremyrickard The validation logic change is included in that PR; no other changes have been requested.

The only 2 PRs that still need to be reviewed and merged are:

I've asked the sig-node maintainers to review and merge them at the last two sig-node meetings.

/milestone clear

(removing this enhancement issue from the v1.18 milestone as the milestone is complete)

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

/remove-lifecycle stale

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
