Distributions: rpm nodesource doesn't work

Created on 22 Aug 2016 · 9 Comments · Source: nodesource/distributions

https://rpm.nodesource.com/

==> default: Error: wget --quiet --tries=5 --connect-timeout=10 -O '/.puphpet-stuff/nodesource_pkg' https://rpm.nodesource.com/pub_0.12/el/6/x86_64/nodejs-0.12.9-1nodesource.el6.x86_64.rpm returned 4 instead of one of [0]
==> default: Error: /Stage[main]/Puphpet_nodejs/Exec[add nodejs rpm]/returns: change from notrun to 0 failed: wget --quiet --tries=5 --connect-timeout=10 -O '/.puphpet-stuff/nodesource_pkg' https://rpm.nodesource.com/pub_0.12/el/6/x86_64/nodejs-0.12.9-1nodesource.el6.x86_64.rpm returned 4 instead of one of [0]

All 9 comments

rpm.nodesource.com is down for us as well.

me too
this no longer works
curl --silent --location https://rpm.nodesource.com/setup_4.x | bash -

it works for now!

Thank you for bringing this to our attention. We've migrated the repository away from the failing host and service has been restored. Apologies for the outage - I'll update this issue with a full postmortem report in the next few hours.

rpm.nodesource.com unavailable

Report Status

Service restored

Executive Summary

The AWS instance hosting https://rpm.nodesource.com/ became unavailable, which prevented anyone from installing packages from this repository.

Outage Description

On Monday, August 22, 2016, at approximately 03:00:00 AM PDT, we received reports that https://rpm.nodesource.com/ was unavailable. The host had become unresponsive and did not recover after either a soft or a hard reboot via the AWS Console. We snapshotted the instance and provisioned a new instance using a volume created from that snapshot. DNS was updated to point to the new instance and service was restored.

Affected users

Users in the community who used RPM-based distributions and tried to install or update to the 4.5.0 or 6.4.0 releases.

Start Date/Time

Monday, August 22, 2016 at 01:00:00 AM PDT (approximately)

End Date/Time

Monday, August 22, 2016 at 06:00:00 AM PDT (approximately)

Duration

3 hours

Timeline

The new server for rpm.nodesource.com went online at approximately 5am PDT on 2016-08-22. It took approximately 45 minutes to snapshot, migrate, configure, and propagate updates to the new host.

Contributing Conditions Analysis

AWS marked this EC2 instance for retirement in early September, which suggests a potential hardware issue.

NodeSource became aware of this at approximately 3am PDT and it was handled by @rvagg in Australia. Restoring service took longer than acceptable because of the manual configuration steps that had been performed to address https://github.com/nodesource/distributions/issues/344.

Recommendations

We have a plan in place for an end-to-end pipeline for automated testing of the Node.js packages. Once this is in place, it is far less likely that this type of issue would interrupt service. Even though we are currently resource-constrained, we plan on raising the priority of this effort to ensure that this issue does not occur again.

Specific recommendations from this issue:

  • Be more aggressive in actively decommissioning instances marked for EC2 retirement
  • Provision HA configurations to prevent similar issues going forward
  • Tighten monitoring and alerting feedback loops to minimize MTTR
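
As an illustration of the monitoring recommendation, here is a minimal sketch of an availability probe with retries. It is not NodeSource's actual tooling: the function names, the retry count, the delay, and the choice of a HEAD request are all assumptions made for the example.

```python
import time
import urllib.request


def http_status(url, timeout=10.0):
    """Send a HEAD request to `url` and return the HTTP status code."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


def is_healthy(url, fetch=http_status, attempts=3, delay=5.0, sleep=time.sleep):
    """Return True as soon as one attempt answers with a 2xx status.

    `fetch` and `sleep` are injectable so the retry logic can be exercised
    without touching the network (hypothetical design choice).
    """
    for attempt in range(attempts):
        try:
            if 200 <= fetch(url) < 300:
                return True
        except OSError:
            # URLError, HTTPError, and socket timeouts are all OSError subclasses.
            pass
        if attempt < attempts - 1:
            sleep(delay)  # wait before the next attempt
    return False


if __name__ == "__main__":
    # An alerting hook would go here; printing stands in for it.
    url = "https://rpm.nodesource.com/"
    print(url, "is up" if is_healthy(url) else "is DOWN")
```

Run on a short interval from a host outside the affected infrastructure, a probe like this would shorten the gap between the start of an outage and the first alert.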

@mweagle have you guys thought about setting up an autoscaling group with health checks on AWS? Even if you set the min and max size of the group to 1 instance it would prevent long downtimes like you experienced today.

Thanks for sorting this.

@fewstera - Completely agree; that's exactly the type of configuration we're currently moving towards.
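
For readers wondering what the single-instance autoscaling group suggested above might look like, here is a rough sketch using boto3. The group name, launch template name, and subnet IDs are placeholders, and nothing here reflects NodeSource's real configuration. With min and max both set to 1, the group replaces an instance that fails its EC2 status checks instead of waiting for a manual rebuild.

```python
def single_instance_asg_params(name, launch_template, subnet_ids, grace_seconds=300):
    """Build keyword arguments for boto3's create_auto_scaling_group.

    All argument values are supplied by the caller; the names used in the
    example below are hypothetical.
    """
    return {
        "AutoScalingGroupName": name,
        "LaunchTemplate": {
            "LaunchTemplateName": launch_template,
            "Version": "$Latest",
        },
        # Pin the group to exactly one instance.
        "MinSize": 1,
        "MaxSize": 1,
        "DesiredCapacity": 1,
        # Replace the instance when EC2 status checks report it unhealthy.
        "HealthCheckType": "EC2",
        "HealthCheckGracePeriod": grace_seconds,
        # boto3 expects a comma-separated string of subnet IDs.
        "VPCZoneIdentifier": ",".join(subnet_ids),
    }


if __name__ == "__main__":
    import boto3  # third-party; requires AWS credentials to be configured

    params = single_instance_asg_params(
        name="rpm-repo-asg",              # placeholder
        launch_template="rpm-repo-host",  # placeholder
        subnet_ids=["subnet-aaaa1111", "subnet-bbbb2222"],  # placeholders
    )
    boto3.client("autoscaling").create_auto_scaling_group(**params)
```

The replacement instance still has to come up fully configured, so this only helps if the launch template (or its user data) captures the setup steps that were performed manually during this outage.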

To follow up, we have completed our migration to CloudFront and you should be seeing increased availability and security.

Ref: https://github.com/nodesource/distributions/issues/353#issuecomment-245766143

Closing.
