Crystal: #5215 [RFS] (Request For Shrinkage) aka Operation Ice Shorts

Created on 1 Jul 2019  路  45Comments  路  Source: crystal-lang/crystal

I tried handling this privately but didn't receive a response in 5 days. Maybe I sent it to the wrong place so I'll retry here.

Recently I registered a a github organization crystal-stdlib perhaps too early. My intent was to register the name, then post a RFC after registering in case there are registration bots scanning for keywords like they do for DNS or someone got a funny idea to squat. The RFC would contain a more detailed plan possibly with a pull request to get started on #5215 as the shard documentation addition seems to be going forward independently.

To me 4 thumbs up was encouraging on a post over a year old. After registering I noticed the 1 dissenting thumbs down and started to look at who gave what indication. In hindsight I should have researched who is who before going forward as RX14 was the only project member and he gave the thumbs down.

So now I have this organization with default settings. I can either:

  1. Add members.
  2. Transfer ownership to whomever is responsible for crystal organizations.
  3. Delete it. If deleted can a crystal organization member register it again?
  4. Wait until a better shrinkage plan is put in place.

In short I registered the name but don鈥檛 want you to think I鈥檓 poaching it.
My bad.

If I鈥檝e sent this to the wrong person please let me know. And delete any nudes.

Big
Bigger
Biggerrrrrrr

Most helpful comment

We can easily create teams in an organization. I don't see the need or benefit in having yet another organization. Stdlib should live near the compiler; they're distributed along the compiler and endorsed by the core team, unlike crystal community shards that are independent (and great) shards.

All 45 comments

The plan for an organization split was quite simple and was intended to help the organization grow with fewer impediments:

  • crystal-lang

    • Projects include the compiler. Maybe more.

    • Members include the core team.

  • crystal-stdlib

    • Projects include stdlib either as a single package or individual packages. TBD.

    • Members include the core team plus an expanded list of trusted community members.

  • crystal-community

    • Projects include a well curated list of crystal shards or other crystal related projects mostly handled by community members.

    • Members include any core members that want to be there and an expanded list of contributors that may or may not overlap with stdlib or core.

This suggested split was based on reviewing the crystal-lang vs crystal-community members. It seems there is little overlap. Crystal-stdlib is intended to bridge the gap between core and community allowing both to contribute at their own pace while keeping crystal-lang to core and the existing crystal-community as is or expanding.

Also, Gwen Stephani and that Lanister drunk changed 1 word and stole my song.

We can easily create teams in an organization. I don't see the need or benefit in having yet another organization. Stdlib should live near the compiler; they're distributed along the compiler and endorsed by the core team, unlike crystal community shards that are independent (and great) shards.

Splitting up a relatively small team is a bad idea.

My proposal was partially based on lack of understanding of github teams but also the interface shown when viewing https://github.com/[name] with a large number of repos. If stdlib splits in to multiple packages a separate organization seemed like a way to organize the repos without clutter.

Either way, how can we get more people working on stdlib?

Either way, how can we get more people working on stdlib?

Crystal's popularity needs to grow. We developers need to create something amazing and just continue to spread the word. This will eventually lead to more contributors

I'm attempting to contribute to stdlib. You can view my PR's here. My PR's are based on (attempted) ports of ruby security software to crystal but they won't function as the necessary API's only partially exist or don't.

@girng How can I make software in crystal if the API's lack basic functions like getting or setting the uid of the process?

Working on #5215 seems like a way to get the ball rolling for more developers to work on stdlib so more crystal application can be developed.

Can't you just make a PR to implement the API to set/get the uid of the process? Have you tried that, what did the lead devs say?

I did a month ago: Add POSIX (get|set)(uid|gid) functionality. #7822

Which is waiting on a design doc #7829 about users and groups last commented by me on May 29 about why a single switch away from root API doesn't handle application needs.

@chris-huxtable first attempted this in Jan 2018 as a large patch including user/group and id switching. See #5615 and #5627 for further information. I'm not sure exactly why it didn't go through as there are hundreds of comments.

Somewhere I read that it was too large and should start over so I implemented only the id switching portion without users and groups.

Which circles back to my PR #7822 last month.

The API is only

Process.uid # Always necessary
Process.gid # Always necessary
Process.euid # Necessary for setuid or switching uid applications.
Process.egid # Necessary for setuid or switching uid applications.
Process.setresuid # Necessary for dropping privs, or resuming them.
Process.setresuid # Necessary for dropping privs, or resuming them.
Process.euid = Int # Simplified wrapper for only dropping privs.

No one has suggested improvements, design deficiencies or coding style changes so I'm not sure what it's waiting on. Regardless of what #7829 comes up with everything except Process.euid = is still necessary for secure application development.

Right now I'm writing applications with my own fork of crystal that contains all my PR's just to get basic functionality available in ruby or other programming languages.

Creating a symlink is available in crystal.
Seeing what a link points to? Not possible.

So I made #7858 "Adds File.readlink to match File.symlink" 27 days ago.
It's 1 added method. No reviewers are listed and there are no comments. I don't know what it's waiting on.

I'm starting to get the idea the @asterite works on the compiler and that's mostly what he wants to do based on a comment he made in #5627 and possibly confirmed in #5625 where he suggests splitting stdlib from the compiler. I don't know where the other core developers interests lie. It's possible none of them have interest in POSIXy things even though symlinks exist on windows and the API is windows compatible. (But not implemented on windows as I don't have it)

So I started writing and attempting to create PR's here to help split stdlib development from core as suggested by @asterite so that maybe these issues could be resolved a bit quicker by giving more people access to stdlib without giving them access to core.

@didactic-drunk I work on these things:

  1. compiler bugs
  2. std stuff that I understand or I have fun working on

Part of the core team, which is at Manas, is working on parallelism right now. I think that's fine and super important but the downside is that it's a long work and there won't be results until it's done.

Other part of the core team probably does 2, like me.

The main issue with the System users/groups API is that none of us seem to know how it should look like, so we can't just merge a PR that looks fine. I particularly don't know anything about unix users and groups.

Maybe there should be a way to decide what the core team should work on next, with the possibility to change the direction every month or every couple of months. But right now the focus seems to be parallelism, based on user feedback.

There are more options than work on one thing or switch directions.

  1. More community involvement.
  2. Let experts review within their fields.
  3. Let experts contribute within their fields.
  4. Split stdlib and let others work on it.
  5. Split each lib in to it's own sub shard and let them have their own reviewers/committers.
  6. Take a more hands off approach rather than controlling everything.
  7. More community involvement.

Based on the commit stats it looks like Manas is handling most of the work considering the #1 and #3 spots are taken by @asterite.

| name | loc | commits | files | distribution (%) |
|-------------------------------|---------|---------|-------|--------------------|
| Ary Borenszweig | 154,282 | 6,711 | 854 | 50.7 / 62.3 / 57.8 |
| Johannes M眉ller | 23,570 | 198 | 350 | 7.8 / 1.8 / 23.7 |
| asterite | 16,995 | 12 | 391 | 5.6 / 0.1 / 26.5 |
| Julien Portalier | 15,948 | 258 | 444 | 5.2 / 2.4 / 30.0 |
| Brian J. Cardiff | 13,711 | 469 | 329 | 4.5 / 4.4 / 22.3 |
| Sijawusz Pur Rahnama | 13,637 | 128 | 367 | 4.5 / 1.2 / 24.8 |
| TSUYUSATO Kitsune | 7,554 | 301 | 265 | 2.5 / 2.8 / 17.9 |
| RX14 | 7,064 | 75 | 245 | 2.3 / 0.7 / 16.6 |
| O_o | 5,103 | 8 | 68 | 1.7 / 0.1 / 4.6 |
| Jonne Ha脽 | 4,421 | 414 | 178 | 1.5 / 3.8 / 12.0 |

If you spell your name different you can probably have #1, #2 and #3 within a year.

Joking aside do you want additional contributors especially for fields where you don't have expertise?

It normal for a large software project to not have a single person that's an expert in every field. The question then is how to handle the parts you aren't familiar with. Letting others take on those roles would help crystal grow and have more application developed bringing more people, jobs and money.

The work completed by @chris-huxtable was solid. (I reviewed his shard, not the PR) There was only 1 missing piece for users/groups which is rectified by adding two new methods. He would probably make a great reviewer for POSIXy code if he hasn't left frustrated by lack of movement on his PR in over a year.

Are you willing to transition to more development outside Manas so the project can grow? If so some of us are trying to help but can't seem to figure out how.

So, to get the ball rolling I've started writing code to split stdlib.

  • Nothing is permanent.
  • The names and file filters are changeable.
  • Seeking feedback. Not for use.
  • If you don't like the direction it's going in say so and I'll switch.

TODO:
[x] Splits a list of files in to a shard preserving history. Example repo structure.
[x] Generates shard.yml with the name and version.
[x] Generates meta project shard.yml
[ ] CI integration.
[ ] Multi version CI integration.
[ ] Combined documentation generation.
[ ] Shards integration.

Questions remain:

  • How do you want to handle versions?

    • What list of files would you like to the preserve commit history for with each shard?

    • Are there any other new files for each shard you would like to add. Such as customized README.md's?

    • What should the initial version be? 0.29?

    • With each crystal release does each every package get a version bump regardless of changes or should the meta package reference the current version of each package.

    • What about security related fixes. Does the individual package and it's meta package increase in version or does crystal make a new point release? Probably the 2nd for OS package managers correct?

    • Do you want to start with a single test shard while working out the bugs and infrastructure or do everything at once?

Colorize was chosen semi arbitrarily as a test case. Feel free to choose any other set of potential shards.

The current structure is a potential hybrid between hosting on crystal-stdlib or crystal-lang. Do you want stdlib- prefixes on the stdlib repos so they stand out when hosted on crystal-lang? This form of naming is somewhat copied from rubysl. If it's under crystal-stdlib prefixes are redundant and the organization repo list only lists stdlib components in a single place. Clarity of which packages are part of stdlib without the ~20 other packages under crystal-lang is one reason crystal-stdlib was created that I failed to mention earlier. Use it or don't. Right now it's a testing place or example.

I don't see how splitting stdlib solves anything whatsoever. Do you want it to be more permissive?
I, for example, want contributions to it receive the same scrutiny as the compiler (if not more so).
On the other end I see "wild west" with all the garbage from Ruby being added blindly.

Your complaint is that there are no approvals on your pull requests, so instead you work on some split that nobody ever agreed on?

One thing I've noticed over the years in all the repos I've participated in, some PRs are specific use-cases where the lead devs will not merge them. And instead, an addon is recommended. In this case, a shard could be created?

@oprypin Go read asterite's post on reasons for splitting stdlib. Debate that with him.

I'm providing example code of how it could be done and not relying on one person to do everything.

@girng I'm not only talking about my PR's. Those happen to be simple examples that I'm familiar with. I won't debate a problem I'm loosely familiar with but If you want a more complicated example that I am see @chris-huxtable's attempts to get users and groups (basic standard functions) approved in crystal in 2018.

My personal PR's add missing pieces to the standard library. Some of the API's are already half implemented and accepted with me providing the missing pieces.

Others correct security problems and provide developers with functions they actually want. #7768 dealt with accidental expansion of filenames breaking his application. He asked for a way to not expand ~'s which is what most people want for non-configuration files. I don't think anyone addressing that PR realized the repercussions of the expand problem when handling attacker controlled filenames. My PR #7903 is an attempt to close a security hole and give developers what they actually want. It also happens to match ruby/go and many other languages as they've dealt with the same problem.

Why is it stalled? I'm not sure but I can guess from @asterite they either don't have the expertise or the time to fix security related problems. If they don't have that then they probably don't have the time to work on stdlib. Hence my attempt to provide a way to get more people involved on improving crystal.

I'm not doing anything else until I get some feedback from the core team. This is the point where they can say what changes they'd like to see for example @ysbaddaden can give direction on CI or @asterite can give naming standards, say do include history or it's not necessary just copy the files.

If this doesn't work for them then the PR gets ignored, closed or whatever. At least I tried to give them more bandwidth to work on the compiler and have stdlib polished.

While speaking to my non-technical boning partner about this thread I happened upon the actual scope of the problem. (Or at least a way to explain it in a concise manner)

  • ~70% of the code is written by @asterite
  • @asterite doesn't approve code he's not familiar with.
  • Are there sufficient resources to approve code @asterite isn't familiar with?
  • Is that a healthy project structure?

Top 4 commits.

| name | loc | commits | files | distribution (%) |
|-------------------------------|---------|---------|-------|--------------------|
| Ary Borenszweig | 154,282 | 6,711 | 854 | 50.7 / 62.3 / 57.8 |
| Johannes M眉ller | 23,570 | 198 | 350 | 7.8 / 1.8 / 23.7 |
| asterite | 16,995 | 12 | 391 | 5.6 / 0.1 / 26.5 |
| Julien Portalier | 15,948 | 258 | 444 | 5.2 / 2.4 / 30.0 |

@didactic-drunk , OK, went and argued with @asterite. I should also point out that the problems you're trying to solve are a bit different and have even less to do with splitting the repo, which, in my view, cannot possibly make things easier; it's only one of the crudest ways to manage permissions.

I admit there are multiple goals, one of which is increased access for stdlib contribution. Another is splitting stdlib from core similar to what rust does or what rubysl attempted to do for better long term support.

The people you choose as code owners must have write permissions for the repository.

Will Manas give others write access to crystal-lang/crystal to help review and maintain stdlib?

Your complaint is that there are no approvals on your pull requests, so instead you work on some split that nobody ever agreed on?

No.

  • I tried writing PR's #7903 #7858 #7845 #7822. Most are simple 1-2 method additions. Some have multiple people wanting the same features. They're waiting on ?
  • I tried writing [RFC]'s before writing PR's #7846 #7904 #7857. Some have existing issues open by other people with identical requests. They're also waiting on ?
  • I tried looking at the Projects. Multithreading and windows support are outside of my abilities. They're also what Manas is working on.
  • I tried asking on the forum "How can I help". I didn't get a response. The prior responses didn't list any way to contribute to core or stdlib within my means which is what I find lacking to create shards.
  • I tried taking over an abandoned shard. You can compare where it came from by reading the project history at the bottom. So far It has 2 stars and no users.
  • I tried commenting on other RFC's, taking everyone's concerns in to account and coming up with a plan like in #5215. @ysbaddaden's concerns were about integration confidence. So I started writing this PR to get to the point where I can tackle the integration confidence.

It's starting to look like there's not a meaningful way to contribute to stdlib.

I mean, it's the same thing, just phrased more nicely.

I think the community could look into these problems directly; I don't see splitting repositories as the best solution to this, and let's not look at it as the only solution.

Suppose we went with your suggestion of using code owners. Maybe my immediate problems would go away, because in theory I could work on other things. I'd work on more pressing matters like stdlib if I could find a way to get contributions through.

Eventually when I was finished with stdlib I'd come back to this PR and start working on a split if it would be accepted solely for long term support and incremental upgrades. It's a real problem I've been through too many times. There's an opportunity with crystal to get it right before v1 which means the infrastructure is in place for any apps developed on crystal v1 to sit rotting for a few years before incrementally upgrading to vX.

I'm closing: the discussion goes in too many directions, and this isn't a forum but an issue tracker.

There are pros to extracting std libraries into their own repositories:

  • grouping issues and pull requests per library (possibly cleaning the compiler/corelib bug tracker);
  • simplified compiler/corelib test suites;
  • smaller, faster and extended test suites for each library (e.g. testing OpenSSL against different OpenSSL and LibreSSL versions);
  • ability to override an stdlib, without waiting/upgrading to a new Crystal release;
  • keeping a clear separation between core/std libraries.

But there are also cons:

  • lost confidence when merging a change in the compiler/corelib that breaks stdlib but when unnoticed until some CRON job is running, leading to fixup merges (or worse: reverting) _or_ it requires to complexify the test suite to also clone and test every std library;
  • contributing to a std library is more complex: master should run against crystal master, which requires to install/build crystal master and the libraries we want to contribute to;
  • std libraries must be compatible with both crystal master and the last crystal release, possibly leading to keeping and maintaining 2 active branches at the same time (on crystal breaking changes), or it's decided that std libraries are only compatible with crystal master (oops).

These cons, I believe, lead to contributions becoming more complex.

Last but not least: the core team will likely be the one that will review and merge stdlib pull requests, which won't make features be merged quicker or easier, quite the opposite, since it's now scattered and not centralized anymore.

There is value in extracting std libraries from the main repository, and I'd love it. Yet, this will be tackled by the core team when we feel that we're ready to achieve it.

Please don't attempt to tackle these large issues and refactors without good communication with the core team, it ends up more work and pain for everyone.

@didactic-drunk I'm sorry but right now I don't think there's anything we can do about it. Crystal is mainly run by volunteers who do things which are a result of their free time, what they like and what they know, and apparently the intersection of that with your PRs is empty. Then there's Manas, but they are working on parallelism, not on these issues.

Which one is more urgent or should be tackled first? I don't know. I think it should be a mix (spend some time on parallelism, spend some time tackling issues, spend some time reviewing PRs), but right now that's not the case.

How to change this? I don't know.

By the way, this also happens to my PRs. I opened https://github.com/crystal-lang/crystal/pull/7944 a couple of days ago, still no review.

My personal opinion is that it was a mistake to make things more strict. Previously I used to commit without anyone reviewing the code and the project progressed much faster. Yes, there were mistakes and some things needed clean ups, but with an extensive test suite this is less of an issue. See Ruby: any core committer commits, no problem, no reviews needed. Every day you see commits. People trust each other.

Right now Crystal doesn't have enough collaborators to merit this strict way of developing. I'd really like to push commits whenever I want. I'll probably still send them in the form of PRs, but I'll merge them once CI is green.

After all, we are not 1.0 yet.

Your answers seem conflicting.

  • You say you don't have enough contributors to do the split or review PR's yet there are people willing to take on those roles.
  • You claim crystal is run by volunteers who don't have time or expertise in various areas but won't accept more volunteers with time and expertise in areas you lack.

After all, we are not 1.0 yet.

More volunteers help you reach 1.0. You seem to imply the opposite.

How to change this? I don't know.

The answers are right in front of you.

Sorry, I don't have time to continue discussing :-(

Now there are stdlib forks. Exactly the situation @asterite claimed he wanted to avoid. This isn't mine.
https://github.com/73616c74/stdlib.cr

I think that's great. If the community likes it and starts using it and they find it useful, we can backport it to the standard library.

In fact, I'd like the above to happen more.

Are you suggesting the primary way of contributing to stdlib should be forking as crystal core doesn't have time to address PR's outside of their interests or expertise?

I'm very confused about the process. This seems to work differently than any other open source project.

No, forks are the primary way of disagreeing with the way an open source project is run. If one gets traction then that's a sign to the core team to change and reflect.

It's not a primary contribution workflow.

@RX14 asterite claimed he wanted to see more forks like that. You seem to be disagreeing.
I see you are a member. Is that an official project related statement or your personal opinion? If this comes off as snarky it's not. I don't know who is who or what member means in the context of crystal.

I have my own personal fork of crystal labeled "mine" which was private until 5 mins ago. I only intended to use it as a temporary integration branch for my PR's until they are accepted so I can get my current application ported from ruby.

Now I'm wondering what to do based on @asterite's response. Should I:

  1. Publish my branch and merge the upstream changes periodically with messy merge commits.
  2. Make a shard with monkey patches to stdlib. This would be harder to merge as 1-2 internal functions may be touched per file but not the whole file. See my OptionParser diff for an example.
  3. Do what I originally intended. This is all temporary, it just takes longer than expected to get PR's approved.
  4. Other.

If you suggest making a shard without monkey patching that's not possible for some of my changes. They modify stdlib classes that other stdlib classes take as arguments. One PR changes Path. I can't make a Path2 without monkey patching File.everything that takes a Path as the type is restricted. Not to mention it's a security vulnerability so any shards would need hand auditing to make sure they don't use the old Path.

I meant: make a shard, either with money patches or not. If it becomes popular and people think it's good we can then merge it back to the standard library so people don't have to use a shard, specially for cross platform stuff like process, users, etc.

Note for one of your PRs: never expose libc types in public Crystal APIs.

The idea that the core team needs to have a consistent stance is not one I agree with, unless stated otherwise then when I say something it's my opinion. I don't run my github posts past the rest of the core team before I post them.

If we've debated about something internally and reached a conclusion, we'll say so.

Note for one of your PRs: never expose libc types in public Crystal APIs.

So I should .to_u32 the return values?

@RX14 Part of my question was about the organization. Does member imply core team or do different members have different levels authority in decision making and when there is a difference of opinions between members which should I listen to?

@didactic-drunk Everyone in the crystal-lang github organization is on the core team. There's no levels, just people who can merge and people who can't. When there's a difference of opinions between anyone in this community you should listen to the points being made, not who's making them.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

will picture will  路  3Comments

costajob picture costajob  路  3Comments

asterite picture asterite  路  3Comments

relonger picture relonger  路  3Comments

ArthurZ picture ArthurZ  路  3Comments