Freecodecamp: Gitter chat rooms archiving

Created on 8 Dec 2017 · 7 Comments · Source: freeCodeCamp/freeCodeCamp

Moved from https://github.com/freeCodeCamp/freeCodeCamp/issues/8418 owing to its freshness

As mentioned by @evaristoc


New rooms scheduled for archiving:

  • FreeCodeCamp/NewYorkCity (id: 5593982115522ed4b3e3263f)
  • FreeCodeCamp/CoreTeam

Currently exploring @abhisekp's approach to archiving before starting to download the data:

https://github.com/freeCodeCamp/freeCodeCamp/issues/8418#issuecomment-238083019
https://github.com/freeCodeCamp/freeCodeCamp/issues/8418#issuecomment-269025399
https://github.com/freeCodeCamp/freeCodeCamp/issues/8418#issuecomment-258058263


@QuincyLarson :

I was trying to use the great package made by @abhisekp: npmjs.com/package/gitter-archive-cli but unfortunately it didn't work on my computer. It is giving a 404 error that I am finding hard to debug.

I will likely do this in Python. My current code seems to be outdated, though. Apparently I am also hitting the rate limits, with a 459 error. That didn't happen before: in March 2017 I managed to download messages over the limit with simpler code.

If it works, I will make my Python code available, hoping it will help establish a standard approach to chat room archiving in the future.
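
For reference, here is a minimal sketch of what such a Python download loop could look like. It pages backwards through a room's history with a `beforeId`-style cursor, in the manner of the Gitter REST API's `chatMessages` endpoint. This is not the code mentioned above: the `fetch_page` callable, the function name, the page size, and the assumption that each page arrives oldest-first are all illustrative. A real `fetch_page` would wrap an authenticated HTTP request.

```python
from typing import Callable, Dict, Iterator, List, Optional

Message = Dict[str, object]
# Hypothetical signature: fetch_page(before_id, limit) -> messages, oldest first.
FetchPage = Callable[[Optional[str], int], List[Message]]


def iter_room_history(fetch_page: FetchPage, page_size: int = 100) -> Iterator[Message]:
    """Yield every message in a room, newest first, one page at a time."""
    before_id: Optional[str] = None  # None means "start from the latest message"
    while True:
        page = fetch_page(before_id, page_size)
        if not page:
            return  # no older messages left, so the loop always terminates
        # Pages are assumed to arrive oldest-first, so walk them in reverse.
        for message in reversed(page):
            yield message
        # The oldest message of this page becomes the cursor for the next request.
        before_id = str(page[0]["id"])
```

Passing the fetch function in as a parameter keeps the paging logic testable without network access or an API token.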


Another, simpler option by @ladybugtju :
https://github.com/ladybugtju/ffcGitterData/blob/master/fccGitterData.js


This is a previous message by @abhisekp to be kept here as reference: https://github.com/freeCodeCamp/freeCodeCamp/issues/8418#issuecomment-252860430

All 7 comments

/cc @evaristoc @QuincyLarson @abhisekp @ladybugtju

Thanks, @raisedadead!

@raisedadead @evaristoc Awesome - sounds good. It's a shame @abhisekp's tool doesn't seem to work at the moment, because I could imagine a lot of communities being able to use that. But if it's faster to just download the data using Python, that works too :) Thanks for digging into this!

@QuincyLarson

No worries. My code for dealing with chatrooms is mainly in Python, which I know much better than node.js. I might work on a solution in node.js eventually to have both, or ask someone to improve the existing node.js solution by @ladybugtju .

I will list all the existing code as part of the Open Data project for other people to follow up.

Sorry for being so late, though. I made a wrong assumption about the duration of the download process, which was worsened by a line in my code. The fact is that I entered a loop that increased the download time EXPONENTIALLY (0o0) at each iteration when downloading data from the Casual room recently. THREE DAYS DOWNLOADING DATA until today!! A first for me: it never happened before when downloading a similar amount of data from the same room. But I'm glad it happened: it makes me a better coder :) .

I made the corresponding amendments to the error handling section and, although still not perfect, it is much better now.
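
For anyone hitting a similar problem, one common way to keep retry waits from growing without bound is to cap the backoff delay and to limit the number of retries. The sketch below is not the code described above. `RateLimited`, `call_with_backoff`, and the delay values are hypothetical names and numbers chosen for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error raised by the caller's request function when throttled."""


def call_with_backoff(
    request: Callable[[], T],
    max_retries: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request(); on throttling, wait and retry. The wait doubles but never exceeds max_delay."""
    delay = base_delay  # local variable, so every call starts again from base_delay
    for attempt in range(max_retries + 1):
        try:
            return request()
        except RateLimited:
            if attempt == max_retries:
                raise  # give up instead of looping forever
            sleep(delay)
            delay = min(delay * 2, max_delay)
    raise AssertionError("unreachable")
```

Because the delay lives inside the function, a long run of throttled requests early in a download cannot inflate the waiting time of later requests.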

FreeCodeCamp/NewYorkCity (id: 5593982115522ed4b3e3263f):

  • [x] Download (I have a copy)
  • [ ] ~Compression~ Modification (into the format suggested, *.tsv)
  • [ ] Pull Request

FreeCodeCamp/CoreTeam:

  • [ ] Download
  • [ ] ~Compression~ Modification (into the format suggested, *.tsv)
  • [ ] Pull Request
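
The "Modification" step in the checklists above could be sketched roughly as follows. The suggested *.tsv layout is not spelled out in this thread, so the column choice and the function name `messages_to_tsv` are assumptions. The field names (`id`, `sent`, `text`, `fromUser.username`) follow the general shape of Gitter message objects.

```python
import csv
import io
from typing import Dict, Iterable, TextIO

COLUMNS = ["id", "sent", "username", "text"]  # assumed layout, not an official one


def messages_to_tsv(messages: Iterable[Dict], out: TextIO) -> None:
    """Write a header row, then one tab-separated row per message."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(COLUMNS)
    for msg in messages:
        text = str(msg.get("text", ""))
        # Escape backslashes, tabs, and line breaks so each message stays on one row.
        text = (
            text.replace("\\", "\\\\")
            .replace("\t", "\\t")
            .replace("\n", "\\n")
            .replace("\r", "\\r")
        )
        user = msg.get("fromUser") or {}
        writer.writerow([msg.get("id", ""), msg.get("sent", ""), user.get("username", ""), text])


if __name__ == "__main__":
    # Made-up sample message, only to show the call.
    sample = [{"id": "a1", "sent": "2017-12-09T10:00:00Z",
               "fromUser": {"username": "camper"}, "text": "hello\nworld"}]
    buffer = io.StringIO()
    messages_to_tsv(sample, buffer)
    print(buffer.getvalue(), end="")
```

Escaping line breaks inside the message text matters here because chat messages often span several lines, and an unescaped newline would split one message across several rows.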

@evaristoc Thrilled to see this backup underway! Let me know if I can do anything to help :)

ROOMS

Update:

FreeCodeCamp/FreeCodeCamp

  • [x] a download that included data until 9 Dec 2017 was made available on Kaggle

FreeCodeCamp/python:

  • [ ] Download
  • [ ] ~Compression~ Modification (into the format suggested, *.tsv)
  • [ ] Pull Request

There are other rooms currently on the list. I will post updates in this thread for the rooms that have been downloaded.

I'm closing this issue as stale since it hasn't been active lately. If you think this is still relevant to the newly updated platform, please explain why, then reopen it.
