Respec: Centralize github fetching

Created on 24 Apr 2018  路  13Comments  路  Source: w3c/respec

Fetching from github can require credentials and can be rate limited. It would be good to centralize fetching of github data through the github module.

refactoring

All 13 comments

I want to take this up. A bit more details on the centralising part would be nice :)

@devanshbatra04 take as look at core/github and core/issues-notes. They both make use of the GitHub API... the idea would be to find a way to harmonize both of those so fetching, caching, and rate limiting is handled better (i.e., a centralized "GitHub-api" module).

See:
https://github.com/w3c/respec/blob/develop/src/core/github.js#L20

And:
https://github.com/w3c/respec/blob/develop/src/core/issues-notes.js#L188

This requires quite a bit of refactoring... so will be quite challenging. I'd suggest getting a bit more familiar with the code base before attempting this.

This requires quite a bit of refactoring... so will be quite challenging. I'd suggest getting a bit more familiar with the code base before attempting this.

Sure, I will continue looking at some other issues for now. Thanks!

@marcoscaceres I would like to work on this issue. Although I understand this is quite challenging, I want to apply for this project in Google Summer of Code 2019, so I want to start learning and contributing more to the codebase.

Hi, I already did some work on this issue before my injury. I have recently been discharged and will send a PR next week when I get back to normal work. Really sorry for the delay for this and other issues I picked up.

@devanshbatra04 OK. But please do tell if you leave this issue for some reason.

@Swapnilr1 you can work together with @devanshbatra04 on it - see the feedback I gave in the pull request. If you guys want to split the task up, that would be great. There is a ton to do. But work out the design of the module first... what's it going to export?, how is it going to fetch stuff?, and so on...

@marcoscaceres Sure. @devanshbatra04 I hope it is OK with you if we can split this?

Yes, sure. Here are all the corrections (as of now) in the PR that should be made, other than the module itself

  • [ ] Shift the renamed getHeaders function to core/github-api, and also export it. We also need to make these changes.
  • [ ] Add token only auth to getHeaders, and then refactor https://github.com/w3c/respec/pull/2099/files#diff-fd8e14e6c335ae838f06d979287518d8R27
  • [ ] Use the new github api to handle rate limiting instead of hard-coding it (https://developer.github.com/v3/rate_limit/)
  • [ ] Shift this back to core/contrib https://github.com/w3c/respec/pull/2099/files#diff-fd8e14e6c335ae838f06d979287518d8R105
  • [ ] Some silly function names to be renamed, other tiny issues.

As for the module itself, I though of the following solution:

  • Have it export promises, different for issues, contributors and comments. We use the rate limit as returned by the above mentioned API. Although it will be interesting to think if we need to make these rate limits on every promise call. A solution would be to instead return a promise that resolves this array of promises itself and makes an API call before resolving them. (I really need to explain things better)
  • Basically functions like -
export async function FetchIssues(IssueNumbers) {
   const remainingRequests = 0;
   fetch("https://api.github.com/users/octocat").then(resp => {
       const remainingRequests = resp.header
   }) /* Call to github rate limits api, probably another function */ 
   const IssuesList = [];
   if (remainingRequests > IssueNumbers.length) {
       IssueList = new Promise.all(Array.from(IssueNumbers.map(number => originalIssuesFetch(number)));
       return IssueList;
   }
   // else some complex rate handling code
}

There would be three such functions, for issues, contributors and comments. I might be wrong somewhere in the code, hope it conveys the rationale though.

  • Also, if you see in my PR, the core/github-config essentially just rewrites the conf to include the repo URLS to fetch. It only has a run(conf) function which should (in my opinion) be merged into core/github-api and we end up with only core/github-api. It looks like it will be the only run(conf), once we shift the current one back to core/contrib

Thoughts? @marcoscaceres @Swapnilr1

@devanshbatra04 I am looking into it, sorry for the delay.

@devanshbatra04 Sorry for this trivial doubt: Since it is your PR and it is not merged, how do I make changes to your implementation? Would it be fine if I forked your fork and submitted changes to it? Or should we split the refactoring to different PRs (I think it might make a mess of merge conflicts)?

I have given you collaborator access to my fork. Feel free to make changes on the branch. I will get back to this after my exams end this week. Also it would be great if you can ideate a bit about the github-api module by then.

@devanshbatra04,

Yes, I think we can merge core/github-config into core/github-api.

And about the module itself, yes I agree with your design, core/github-api should contain 3 functions to fetch issues, comments and contributors. And in each function call, we can see if remaining requests is greater than number of issues/contributors/comments requested, resolving the array of promises if that is the case. And in case there are insufficient requests remaining, then resolve as many as requests we have available, and reject others with a warning/error message.

@marcoscaceres Any comments on the module implementation?

Was this page helpful?
0 / 5 - 0 ratings

Related issues

saschanaz picture saschanaz  路  6Comments

jnurthen picture jnurthen  路  6Comments

saschanaz picture saschanaz  路  3Comments

sidvishnoi picture sidvishnoi  路  4Comments

saschanaz picture saschanaz  路  3Comments