Gensim: Set up Azure pipelines for gensim

Created on 1 May 2020 · 34 Comments · Source: RaRe-Technologies/gensim

@piskvorky basic config is ready: https://github.com/menshikh-iv/gensim/pull/2. Waiting for you to activate Azure for gensim (please make sure that the pipeline runs in a separate PR, after that I'll continue)

_Originally posted by @menshikh-iv in https://github.com/RaRe-Technologies/gensim/pull/2814#issuecomment-621950805_

housekeeping

All 34 comments

I tried setting up Azure Pipelines and got as far as here:

Screenshot (2)

Do I understand correctly the app wants read+write access to all my repos, including private ones? If so, that's a show stopper – at least for OAuth from my own piskvorky account.

I thought Azure Pipelines could be enabled for selected repos only.

I think I figured it out. Instead of logging in to the Azure Pipelines website and then creating a pipeline from there (= OAuth above = no go), I went to the GitHub Marketplace and installed the Azure Pipelines app from there. Instructions.

That allowed me to select which Github repositories I want. I selected all our open source repos.

@menshikh-iv Azure Pipelines is now active, I added both you and Misha as admins. I have no idea what you mean by "please make sure that the pipeline runs in a separate PR, after that I'll continue".

@piskvorky azure pipelines aren't really active, see https://github.com/RaRe-Technologies/gensim/pull/2823 (no azure job spawned)

By "please make sure that the pipeline runs in a separate PR, after that I'll continue" I mean that you should test it first (run any build from the gensim repository). So please fix that and make sure that my PR triggers an Azure build; after that I'll continue.

If I understand correctly you want me to:

  1. create a new github branch and add that azure yml file to it
  2. create a new azure pipeline at https://dev.azure.com/rare-technologies/gensim-ci by selecting that github branch
  3. ?do something about your PR?

I should get to it tonight. You and Misha are admins, so feel free to add the pipeline yourself if you like.

I want you to make my PR trigger the Azure pipeline (no need to create a new one, just edit my PR).
I have no access to https://dev.azure.com/rare-technologies/gensim-ci, only you have access.

UPD: something weird with the Microsoft auth system & permissions, let me dig in ...

That's weird. This is how it's set up:

azureadmin

I can open your link only if I'm not logged in (and then I have no settings controls).

[image]

Looks like you should add us to the organization, @piskvorky (my organization list is empty).

I already added you to the org – otherwise I couldn't have added you to the project:

Screen Shot 2020-05-04 at 10 40 44

Did you receive an email with the invite? Maybe in spam?

@piskvorky no emails (even in spam), still no access :(

Hm. I'll try to see if there's any help/support in Azure DevOps.

NLP <3

[image]

LOL you should know better than to expect much from "NLP chatbots".

Here: https://developercommunity.visualstudio.com/content/problem/1016416/users-added-but-have-no-access.html

@piskvorky I'm digging into the documentation and have an idea: can you try to fully delete us from Azure and add us through GitHub usernames (instead of emails)?

OK. The Azure support team responded, I'll try that first.

That's what I did before. I re-did it again now – did you receive an email from [email protected] now?

@mpenkov how about you? Did you receive anything?

No. I changed the subscription setting in my profile to "get all emails"; can you try to resend?

@menshikh-iv Invite resent:

Screen Shot 2020-05-04 at 14 09 59

@piskvorky got them, I have access

Good. Because adding Github usernames (instead of emails) is not possible. I tried and adding a new user requires their email.

@piskvorky still can't run a build: looks like a lack of permissions for the RaRe organization. Can you create a default Python pipeline through https://dev.azure.com/rare-technologies/gensim-ci/_build yourself?

Unfortunately not; it's asking for those crazy OAuth permissions on piskvorky again.

What Azure permissions do you need? You should have the same as me (admin) = full access.

What permissions do you need?

No idea, I'm getting the same OAuth (without options) as you

I made you an admin for https://github.com/RaRe-Technologies/gensim, as well as admin for https://dev.azure.com/rare-technologies/gensim-ci. I don't see any higher permissions than that.

[image]

No idea what's wrong here =/

[image]

Moving forward (thanks to the https://developercommunity.visualstudio.com/content/problem/642369/pipelines-creation-falls-back-to-github-oauth-auth.html thread), but that's not the end.

Finally, it's running

[image]

So, Travis & Azure differ by around 1 minute in build time; no more "slowdown" from AppVeyor :)

[image]
[image]

Possible next step - make a similar replacement in gensim-wheels

I think a side-effect of this switchover is causing me major git headaches on both Ubuntu & MacOS. I suspect especially this git config change:

https://github.com/RaRe-Technologies/gensim/commit/93385d331935fd8eeedb3f52b4dd8883dc86695c#diff-fc723d30b02a4cca7a534518111c1a66

I'm getting lots of spurious indications that files are modified in ways that git can't reverse in the usual ways, blocking usual operations – and possibly even mangling of the contents of binary files. They're easy to reproduce even in a fresh, vanilla clone. I've posted more details on SO in the hopes of getting some git expert advice or a canonical fix:

https://stackoverflow.com/questions/61720528/simple-git-clone-mangles-many-files-with-phantom-un-resettable-modified-state

Even if this can be worked-around with some local git config change by every dev, it seems something to fix in the repo if at all possible.

Separately, can these new Windows-only Azure builds be listed after the faster/more-central CircleCI/Travis builds? (They don't seem to show as much info, while in progress, as the others.)

The 1st comment response at SO suggests the .gitattributes set up to help this Azure process is overbroad, and may inadvertently modify non-text files: https://stackoverflow.com/questions/61720528/simple-git-clone-mangles-many-files-with-phantom-un-resettable-modified-state?noredirect=1#comment109174230_61720528

On the good chance that deleting this file is a step towards a solution, I've started #2836 to do that. Cloning that branch (git clone -b del-gitattributes-text-eol-lf [email protected]:gojomo/gensim.git) yields a usable working-directory without the problem.
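To illustrate the overbroad-rule concern, here is a minimal Python sketch; it is not taken from the gensim repo. It assumes the rule was a blanket one along the lines of `* text eol=lf` (the branch name above suggests this, but it is not confirmed here) and lists files that look binary yet contain CRLF byte pairs, which such a rule would try to normalize. The function names and the 8000-byte probe window are illustrative choices.

```python
from pathlib import Path
from typing import Iterator


def looks_binary(data: bytes, probe: int = 8000) -> bool:
    """Heuristic only: a NUL byte near the start suggests binary content."""
    return b"\x00" in data[:probe]


def at_risk_under_forced_text(data: bytes) -> bool:
    """True for binary-looking content that contains a CRLF byte pair.

    Assumption: a blanket `text eol=lf` attribute makes git treat every
    file as text, so these are the files it would want to rewrite.
    """
    return looks_binary(data) and b"\r\n" in data


def find_at_risk(root: str) -> Iterator[Path]:
    """Yield files under `root` (skipping .git) flagged by the check above."""
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and ".git" not in path.parts:
            if at_risk_under_forced_text(path.read_bytes()):
                yield path
```

Any PNG would be flagged by this check, since the PNG signature itself contains the bytes `0D 0A`. Running `list(find_at_risk("docs"))` in a clone is one way to gauge how many files are affected.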

… yields a usable working-directory without the problem

Are the files already mangled, or is it safe to continue from such branch?

In the clone from the branch-without-.gitattributes, I believe they're unchanged. The phantom-change indication seems to be a warning of imminent automatic rewriting on next commit.

Indeed, take a look at PR #2837 - it's what happens if one of the 'modified' files is committed. It's definitely been changed in the repo at that point. However, while preparing the commit (at the add step), git gave this warning:

warning: CRLF will be replaced by LF in docs/notebooks/index.
The file will have its original line endings in your working directory.

And indeed, the local file hasn't changed - its sha1sum is the same as before, and the same as a 'raw' download of the changed-years-ago PNG from GitHub:

$ sha1sum jupyter_home.png 
4832400ee0650c5140ce1261c57106d9fdd1df05  jupyter_home.png

Given this observation, if any files have been corrupted, I'd expect it to be in something that was committed from after the .gitattributes was put in place. So far in develop that's only CHANGELOG.md, which could likely survive any temporary CR/LF strangeness.
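The checksum comparison above can also be scripted. Below is a rough Python sketch using the standard `hashlib` module; the helper names are hypothetical, and the expected digest would come from a known-good copy such as a raw download. The last helper shows why converting CRLF to LF inside a binary file would change its digest.

```python
import hashlib


def sha1_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA-1 of a file, read in chunks (like `sha1sum`)."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path: str, expected_sha1: str) -> bool:
    """Compare a working-copy file against a known-good digest."""
    return sha1_of_file(path) == expected_sha1.strip().lower()


def digest_survives_eol_rewrite(data: bytes) -> bool:
    """True if replacing CRLF with LF would leave the SHA-1 untouched."""
    rewritten = data.replace(b"\r\n", b"\n")
    return hashlib.sha1(rewritten).digest() == hashlib.sha1(data).digest()
```

For the file discussed above, a call such as `is_unchanged("jupyter_home.png", "4832400ee0650c5140ce1261c57106d9fdd1df05")` would repeat the manual check, using the digest quoted in the comment.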
