Kops: Spot request for spot instance group expires after 1 week

Created on 12 Feb 2018 · 17 Comments · Source: kubernetes/kops

Kops version: 1.7.0
Kube version: 1.6.2
Cloud provider: AWS

I successfully created a spot instance group ~1 month ago. Today I noticed that the spot instance requests were in state request-canceled-and-instance-running.

After a quick investigation, it seems that kops sets the Valid Until date to today + 7 days, so the spot instance request is cancelled after that. Is there any way I can change this when creating the instance group? I did not find anything about it in the docs.

lifecycle/rotten

All 17 comments

Found in AWS CLI docs:

--valid-until (timestamp)
The end date of the request. If this is a one-time request, the request remains active until all instances launch, the request is canceled, or this date is reached. If the request is persistent, it remains active until it is canceled or this date is reached. The default end date is 7 days from the current date.

So kops probably does not set it, which leaves the 7-day default in effect.
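To make the 7-day default concrete, here is a minimal sketch in Python of when a request would expire if no end date is supplied. The function names and the example creation date are hypothetical and used only for illustration; the 7-day figure comes from the CLI documentation quoted above.

```python
from datetime import datetime, timedelta, timezone

# Default lifetime of a spot request when --valid-until is omitted,
# per the AWS CLI documentation quoted above.
DEFAULT_VALIDITY = timedelta(days=7)


def default_valid_until(created_at: datetime) -> datetime:
    """Return the default expiry date for a request created at `created_at`."""
    return created_at + DEFAULT_VALIDITY


def is_expired(created_at: datetime, now: datetime) -> bool:
    """Return True once the default validity window has passed."""
    return now >= default_valid_until(created_at)


# Hypothetical example: a request created about one month before 12 Feb 2018.
created = datetime(2018, 1, 12, tzinfo=timezone.utc)
today = datetime(2018, 2, 12, tzinfo=timezone.utc)
print(default_valid_until(created).isoformat())
print(is_expired(created, today))
```

A request that is a month old is therefore well past its default end date, which matches what was observed in the console.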

With the AWS API call that we are using, it does not appear that the option to set the valid-until date is exposed. I am guessing that we would need to implement the use of spot fleets, but that is only an educated guess.

We are using CreateLaunchConfigurationInput, which has the struct member SpotPrice but no valid-until date.

Thanks for the reply. I see; I think you are right about spot fleets. AWS does not seem to expose Valid Until in any SDK (out of curiosity I checked Java and JS).

Does anyone know of another way to get spot instances with kops, even if it requires a manual tweak (i.e. changing something in AWS that is not reflected in the state store)? As far as I know, there is no way to modify an active spot instance request, which would otherwise be the easiest way to simply extend Valid Until.

UPDATE: My quick fix was to add a scheduled action on the ASG that scales down and then back up every day, with OldestInstance as the termination policy, in order to rotate the nodes and hence get new spot instance requests. This works for us because it is a test environment and we just wanted a quick way to make it more cost efficient.
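The daily rotation described in the update could be sketched roughly as follows, in Python. This is not the commenter's actual setup: the ASG name, the group size, the schedule times, and the helper names are all assumptions. The parameter keys follow the boto3 Auto Scaling calls `put_scheduled_update_group_action` and `update_auto_scaling_group`; boto3 is a third-party package and is imported only inside `apply_rotation`, so the builder can be inspected without AWS access.

```python
def rotation_actions(asg_name, normal_size, hour_utc=3):
    """Build parameters for two daily scheduled actions: one down, one back up."""
    if normal_size < 1:
        raise ValueError("normal_size must be at least 1")
    down = {
        "AutoScalingGroupName": asg_name,
        "ScheduledActionName": asg_name + "-rotate-down",
        "Recurrence": "0 %d * * *" % hour_utc,   # cron format, evaluated in UTC
        "MinSize": normal_size - 1,
        "DesiredCapacity": normal_size - 1,
    }
    up = {
        "AutoScalingGroupName": asg_name,
        "ScheduledActionName": asg_name + "-rotate-up",
        "Recurrence": "15 %d * * *" % hour_utc,  # 15 minutes after the scale-down
        "MinSize": normal_size,
        "DesiredCapacity": normal_size,
    }
    return [down, up]


def apply_rotation(asg_name, normal_size):
    """Apply the termination policy and the scheduled actions (needs AWS credentials)."""
    import boto3  # third-party; imported lazily so rotation_actions runs without it

    client = boto3.client("autoscaling")
    # Terminate the oldest instance first when the group shrinks.
    client.update_auto_scaling_group(
        AutoScalingGroupName=asg_name,
        TerminationPolicies=["OldestInstance"],
    )
    for params in rotation_actions(asg_name, normal_size):
        client.put_scheduled_update_group_action(**params)


# Hypothetical ASG name and size, printed for inspection only.
for action in rotation_actions("nodes.example.k8s.local", 3):
    print(action["ScheduledActionName"], action["Recurrence"], action["DesiredCapacity"])
```

Each cycle replaces only the oldest node, so a full rotation of a group of N nodes takes N days. As noted in the comment, this is a manual change in AWS that is not reflected in the kops state store.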

We've also unknowingly been suffering from this problem. I was surprised to find this out today.

For now we're going to implement some tooling to bring down each node (or cancel their spot instance request) every week in order to actually take advantage of spot pricing.

Keen to see work on the implementation of spot fleet.

I don't think this is exactly an issue if you read the Amazon documentation. I also just tested this myself: we had kops create an ig with spot instances, and their spot requests have now expired after 7 days, but our instances are still running as spot instances.

The answer from the Amazon employee explains it well and links to the relevant Amazon documentation.

https://forums.aws.amazon.com/thread.jspa?messageID=827030

To add a little more: since the instances stay running, everything is fine until the price goes above your max bid, at which point the spot instance is terminated. However, kops specifies on the launch configuration attached to the autoscaling group that it should launch spot instances, so when the ASG drops below its node count it will create a new spot request and hence a new spot instance.

I don't think this is exactly an issue if you read the Amazon documentation. I also just tested this myself: we had kops create an ig with spot instances, and their spot requests have now expired after 7 days, but our instances are still running as spot instances.

It is an issue if you want long-lived spot instances, i.e. ones that last longer than a week. People who use this feature are trying to keep costs as low as possible with spot instances whenever the bid is acceptable.

The issue here isn't that this happens, but that it would be desirable to choose a longer period so that after one week your costs don't suddenly spike 📈 ... which is what happened to us.

Per https://forums.aws.amazon.com/thread.jspa?messageID=827030, "request-canceled-and-instance-running" is the state the spot request is put into when it expires, so you are still billed the spot price. In that state it will not automatically request a new spot instance, because the request is canceled. In this case that is OK: when the number of nodes drops, the autoscaling group creates a new spot request, which starts in the open state and tries to be fulfilled for 7 days.

Hello,

The status 'request-canceled-and-instance-running' is shown when you cancel the Spot Request while the Spot Instances are still running. The request is cancelled, but the instances remain running [1]. If your Spot request is active and has an associated running Spot Instance, canceling the request does not terminate the instance; you must terminate the running Spot Instance manually [2] (should you require). 

Regarding the pricing, your instance will run at the current spot price as long as the Spot-Price is below your Maximum Bid. If you set it at the On-Demand price, or if you didn't set a price for the Maximum Bid, it will run until you terminate it.

With spot instances, autoscaling Instance Protection won't prevent a spot instance from being terminated - any spot interruption will cause the instance to be terminated [3].

Links:

[1] https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-bid-status.html#spot-instance-bid-status-understand
[2] https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-requests.html#using-spot-instances-cancel
[3] https://docs.aws.amazon.com/autoscaling/ec2/userguide/as-instance-termination.html#instance-protection

Best Regards,

Jay F.

Another thing to note: spot instances whose spot request has been put into the state 'request-canceled-and-instance-running' still keep the lifecycle "spot". You can see that in the details panel of the EC2 console.
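The same check can be done outside the console. The EC2 DescribeInstances response carries an InstanceLifecycle field that is "spot" for spot instances and absent for on-demand ones. Below is a minimal Python sketch that filters a response of that shape; the helper name and the sample IDs are made up for illustration, and the sample dictionary stands in for a real API response.

```python
def spot_instance_ids(describe_instances_response):
    """Return the IDs of instances whose lifecycle is reported as 'spot'."""
    ids = []
    for reservation in describe_instances_response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            # On-demand instances have no InstanceLifecycle key at all.
            if instance.get("InstanceLifecycle") == "spot":
                ids.append(instance["InstanceId"])
    return ids


# Hypothetical sample shaped like a DescribeInstances response.
sample = {
    "Reservations": [
        {
            "Instances": [
                {
                    "InstanceId": "i-0aaa",
                    "InstanceLifecycle": "spot",
                    "SpotInstanceRequestId": "sir-example",
                },
                {"InstanceId": "i-0bbb"},  # on-demand: no lifecycle field
            ]
        }
    ]
}
print(spot_instance_ids(sample))
```

With real data, the response would come from something like boto3's `ec2.describe_instances()`, which is a third-party dependency and is left out here so the sketch runs on its own.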

Regarding the pricing, your instance will run at the current spot price as long as the Spot-Price is below your Maximum Bid. If you set it at the On-Demand price, or if you didn't set a price for the Maximum Bid, it will run until you terminate it.

@zachaller Interesting! I understand what you were saying more clearly now. Thanks!

So is it safe to say this is a non-issue?

@duro Yes, I would say it's a non-issue. From my understanding of the docs and from testing the behavior, you should not get charged extra or see any undesirable effects from the way kops uses spot instances.

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

It's a non-issue and can be closed.

/close

@itskingori: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
