freeCodeCamp: New users' progress can be wiped unintentionally

Created on 5 Dec 2017  ·  20 Comments  ·  Source: freeCodeCamp/freeCodeCamp

Issue Description


Many campers are reporting to the forum that all their progress has been wiped: https://forum.freecodecamp.org/search?q=lost%20progress

Steps to replicate:

  1. create a new account with email sign-up
  2. do not verify your email address (may not be the issue)
  3. You'll need to sign in again as a returning camper to see challenges
  4. complete a number of challenges - two is fine
  5. sign out
  6. clear local storage and cookies
  7. sign back in

After that the progress should be lost. It doesn't happen again, as far as I can tell, but can be consistently replicated on a fresh account.

Confirmed by bengitter: https://forum.freecodecamp.org/t/i-think-i-can-replicate-the-lost-all-progress-bug/162323/2?u=jacksonbates

Browser Information

  • Browser Name, Version: Firefox 57.0.1 (but also reported on Chrome)
  • Operating System: Ubuntu 16.04
  • Mobile, Desktop, or Tablet: Desktop
accounts

All 20 comments

x-post from forum discussion:

Update:

I have tried a few more times and not been able to replicate it, but I noticed a weird thing.

I caught one attempt on video that worked and I noticed that the default user avatar was broken when my progress got lost. When I tried later and the progress wasn't lost, the user avatar was also not broken. I wonder if this means there is an issue with the s3 bucket dropping out occasionally? Is user progress possibly also saved to the same s3 AWS location?

Here is the screenshot of the broken avatar, top right corner:

image

I would upload the whole video, but my upload speeds are atrocious and it would take all night!

Hi @JacksonBates, thanks for the report.

I wonder if this means there is an issue with the s3 bucket dropping out occasionally?

Yup, possibly just an error with the network.

Is user progress possibly also saved to the same s3 AWS location?

Nope, they are stored on the User object in the DB, which is a dynamic instance of MongoDB.

From the threads in the forum, this looks like it's sporadic? It's not reproducible locally, though.

@JacksonBates This gives me a starting point to look then.

From your steps, I see two possible entry points to investigate. The signout and signin. But I have my doubts about this as it would mean most campers would be losing their progress, not a few. So it must be some combination that we are overlooking.

Was anyone able to reproduce locally?

@raisedadead I just reproduced this in production on my first try.

@BerkeleyTrue This could be the main reason (hopefully only reason) campers are losing their progress.

Hi @BerkeleyTrue

I have managed to reproduce this locally, using the NODE_ENV=production.
Spoke too soon. Can't any more.

Looking deeper.

I'd like to confirm this bug.
I used hidden as the e-mail address.
The browser was Firefox Developer Edition: 58.0(64-bit)
The challenges I used as sample challenges: Reverse a String and Factorialize a Number.
Moreover, the default user avatar was also broken as @JacksonBates mentioned.
Also, when I signed back in and lost my progress, the avatar was restored to its default image.

Update: Email hidden by mod.

So, on local with a pristine DB and NODE_ENV=production
I get this:

200 GET 114.166 ms - /signin
404 GET 8.439 ms - /api/flyers/findOne?filter%5Border%5D=id+DESC
200 GET 119.523 ms - /
404 GET 130.124 ms - /api/flyers/findOne?filter%5Border%5D=id+DESC
200 GET 76.825 ms - /signin
404 GET 9.374 ms - /api/flyers/findOne?filter%5Border%5D=id+DESC
200 GET 99.159 ms - /email-signin
200 GET 141.373 ms - /email-signin
404 GET 70.939 ms - /api/flyers/findOne?filter%5Border%5D=id+DESC
404 GET 12.118 ms - /api/flyers/findOne?filter%5Border%5D=id+DESC
  fcc:user:remote setting cookies +37s
  fcc:user:remote user logged in +1ms
302 POST 20.328 ms - /api/users/login
  fcc:user:remote setting cookies +22ms
  fcc:user:remote user logged in +0ms
302 POST 15.733 ms - /api/users/login
302 GET 23.117 ms - /
  fcc:challenges looking for /headline with the h2 element/i +27ms
200 GET 205.760 ms - /challenges/headline-with-the-h2-element
302 GET 288.362 ms - /challenges/current-challenge

I followed steps from @QuincyLarson:

Here are the steps I took:

1) Created an account
2) Signed into that account (without doing the email verification step)
3) Completed a challenge
4) Signed out
5) Opened up localhost:3000 in an incognito tab and signed into the same account

The progress should be wiped. And if you go back to your non-incognito tab and log in again, you'll notice the progress is wiped there, too.

This seems to be an issue with the account verification process, and that makes sense since our reports of account progress resets suddenly started appearing shortly after we deployed the account verification feature.

Unsure of the cause here, but this looks like duplicate requests to the same endpoints.

@raisedadead OK - that's a good start. Please keep investigating this, and let me know if I can do anything to help. This is our absolute top priority right now - even higher than beta (though I think we can work on the two in tandem).

@raisedadead Are you now able to reproduce?

@BerkeleyTrue Yes, but not consistently enough. If I purge my DB completely and start afresh, I can get this with the steps from Quincy that I mentioned above.

But with every other account after that, the behaviour is inconsistent.

Also, I had to set NODE_ENV=production.

What my initial investigations reveal is that there are duplicate requests for every action (do you remember the Chrome caching AJAX calls issue from way back? It's a similar kind of behaviour).

I'll post more updates on my progress as soon as I have them.

Also, note that the relevant field for challenges and their solutions, challengeMap, is missing from the affected user.

I think there is a rogue UPSERT or similar query somewhere that we are missing.

Could this be an old migration script?

This is the only one running, but it uses an update, which should prevent it from destroying data.
https://github.com/freeCodeCamp/freeCodeCamp/blob/staging/server/middlewares/migrate-completed-challenges.js#L87

I do notice that the debug function is using the old namespace

I am not sure, but this potentially affects only new accounts (which are unverified). As an emergency measure, I highly recommend we add a notice on the welcome page after the signup step:

"We are currently investigating an issue with accounts, please verify your email, before starting"

or similar.

[deleted]

@raisedadead I don't think we have enough data to point to email verification. Historically we haven't used verification

Yes, that totally makes sense; the verification step has never been an issue.

I am fairly confident that the create and the verify calls (yeah, loopback's implementation 😞) in the same execution path on the User object are the trouble here.

I think this is the point where the User object is not updated correctly (and potentially malformed).

We should try going back to sending the email via a direct call to the send$ api available on the email object instead. I chose to do that for passwordless in staging, for instance.

And create/update user object in its regular path.
But this is just a theory right now.

I'll confirm and update after I debug.

P.S.: I'll get back to this after I catch some sleep 😴 😆!

I believe I've tracked down the cause.

It is here: https://github.com/freeCodeCamp/freeCodeCamp/blob/backup/staging/server/boot/home.js#L17

To replicate:

Create a new account: you should have a broken image. This is because we do not add a user.picture on account creation.

Navigate directly to '/': since you do not have an image, a default one will be added and then the user.save method will be called.

At this point in the code execution your user object will not have the challengeMap field at all (reasons explained below). The user.save method will send the user object as it currently exists in memory to the database, which will in turn use it to replace the current version in the database.

You should see in the db that your user document has no challengeMap field. If you had any progress up till now it would have been wiped out.
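
The mechanism described above can be sketched with a plain in-memory stand-in for the database. This is not freeCodeCamp's or LoopBack's actual code: `db`, `loadUser`, `saveInstance`, the sample email, and the default avatar path are all hypothetical, and the sketch assumes only what is stated here, namely that an instance save replaces the stored document with whatever the in-memory object currently holds.

```javascript
// In-memory stand-in for the users collection (hypothetical sample data).
const db = new Map();
db.set('u1', {
  id: 'u1',
  email: 'camper@example.com',
  challengeMap: { 'reverse-a-string': { completedDate: 1512432000000 } }
});

// Deserialize a user while omitting the listed fields, mimicking the
// optimization that leaves challengeMap out of the initial load.
function loadUser(id, omit = []) {
  const copy = { ...db.get(id) };
  for (const field of omit) delete copy[field];
  return copy;
}

// Instance-style save (assumed behaviour): the stored document is
// replaced wholesale by the in-memory object.
function saveInstance(user) {
  db.set(user.id, { ...user });
}

// Home-page middleware path: the user was loaded without challengeMap,
// has no picture, gets a default one, and is then saved as a whole.
const user = loadUser('u1', ['challengeMap']);
if (!user.picture) {
  user.picture = '/images/default-avatar.png'; // hypothetical default path
  saveInstance(user);
}

const stored = db.get('u1');
console.log('challengeMap' in stored); // false: the progress is gone
```

The save itself is not faulty here; the data loss comes from combining a partial load with a full-document write.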

Now that the how has been found, the why:

There were two changes that led to this, both made to fix slowdowns we were experiencing about two years ago. The fix at the time was to reduce the amount of data going back and forth between the server and the db.

The first part of this fix was to remove the challengeMap from the initial user deserialization as this is only used if the user is navigating to a /challenges subroute and would reduce the amount of request data substantially.

The second part required a refactor of the codebase from using user.save (an instance save) to User.update. This meant we were only sending the data required to update the user object. The addDefaultImage middleware slipped through my initial refactor. Due to the first fix, any user without a proper image link in their user data who navigated directly to the home page would hit this middleware and lose all their progress.
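
As a rough illustration of why a targeted update is safe where a full instance save is not, here is a minimal in-memory sketch. It does not use LoopBack's real API: `users`, `findUser`, `updateFields`, the sample email, and the default avatar path are hypothetical names chosen for the example, and the merge behaviour of `updateFields` is an assumption standing in for an update that only sends the changed fields.

```javascript
// In-memory stand-in for the users collection (hypothetical sample data).
const users = new Map();
users.set('u1', {
  id: 'u1',
  email: 'camper@example.com',
  challengeMap: { 'reverse-a-string': { completedDate: 1512432000000 } }
});

// Load a user without the listed fields, as the deserialization
// optimization does for challengeMap.
function findUser(id, omit = []) {
  const copy = { ...users.get(id) };
  for (const field of omit) delete copy[field];
  return copy;
}

// Targeted update (assumed behaviour): only the supplied fields are
// written; every other stored field is left untouched.
function updateFields(id, changes) {
  users.set(id, { ...users.get(id), ...changes });
}

// Same default-image path as before, but writing only the picture field.
const partial = findUser('u1', ['challengeMap']);
if (!partial.picture) {
  updateFields(partial.id, { picture: '/images/default-avatar.png' });
}

const after = users.get('u1');
console.log(Object.keys(after.challengeMap)); // [ 'reverse-a-string' ]
```

Because the write names only the field it changes, it no longer matters which fields were left out when the user was loaded.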

Because of the nature of this bug, only those who solely used email to sign in were affected. Those who used OAuth sign-ins would have their image brought in from the third party.

I'm working on a fix now.

Fixed!
