This issue is very controversial, but it was never really discussed in the past.
I've been thinking more and more about moving out pieces of the standard library to shards. Examples of these are the `YAML` and `XML` modules, though we could include more modules (`CSV`, `Zip`, `Big*`, maybe even `HTTP`).
These shards will most probably live under the `crystal-lang` organization, so they will be kind of "blessed".
The pros of doing this are:
- If `YAML` had been a shard, we could already have the new functionality without having to wait for the next compiler release.
- For example, we lost a lot of time in the past release (0.24.0) because the version of `libxml2` that comes with Debian 7 doesn't work in Crystal. Specs weren't (and aren't) passing there. But nobody complains, so maybe nobody uses Debian 7 anymore, or maybe nobody uses Crystal+XML in Debian 7. If XML were a shard, we wouldn't have to worry about all of this (well, yes, but not in each release; we could solve this faster and better), and we would have had 0.24.0 some weeks ago.

Of course there are cons too. These are:
- Instead of just `require "xml"`, we now have to write a `shard.yml`, add the xml GitHub repo, run `shards`, etc.

I have replies for these 3 points.
- Under the `crystal-lang` organization, they are easy to find (if we put them somewhere in the docs saying these are "blessed"/recommended shards). The only inconvenient part is writing the dependencies in `shard.yml`, but I wouldn't use this as the only reason not to go with this.
- If the `crystal-lang` organization hosts "blessed", well-maintained, up-to-date shards, then hopefully people will look there before looking elsewhere (stars in a GitHub repo might do the trick as well for internet searches).
- Even with `HTTP::Client` in the standard library, people go and create others, like cossack and crest. Of course `HTTP::Client` is missing fundamental stuff like redirections, but there's not enough manpower to tackle this. Moving it to an external shard would make collaboration much easier, as more people could join, and it could evolve faster.

Basically, I'd like to move almost everything that the compiler doesn't use to a shard.
If we agree on this, we could discuss doing so for individual modules.
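To put the "writing the dependencies" cost in perspective, this is roughly what a project would add to its `shard.yml`. It is only a sketch: the `crystal-lang/xml` and `crystal-lang/yaml` repositories and the version constraint are hypothetical, since no such shards exist yet.

```yaml
# shard.yml (sketch; the xml and yaml repositories below are hypothetical)
name: my_app
version: 0.1.0

dependencies:
  xml:
    github: crystal-lang/xml   # assumed location of a "blessed" XML shard
    version: ~> 0.1.0          # assumed version constraint
  yaml:
    github: crystal-lang/yaml  # assumed location of a "blessed" YAML shard
```

After that, running `shards install` fetches the dependencies, and `require "xml"` works as before.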
I believe Crystal should come with a standard library that mostly doesn't depend on external C libraries (it's a bit impossible, but the less, the better), and one that provides a foundation for other libraries to build on top of it. Examples of that are `IO`, which is evented by default; the runtime, with its concurrency model, `spawn`, `Channel`, etc.; `Array`, `Hash` and `Regex`, which are a core part of the language, mainly because they are tied to literals; and probably other types for similar reasons.
I still don't have a strong opinion about this issue. Just wanted to add something we (Ary & me) discussed last week: we could help the adoption of these blessed shards by making `crystal init` generate a `shard.yml` that _already includes_ those dependencies.
So they are _almost_ part of the stdlib (i.e., most projects will probably "just include them") but they aren't coupled to the compiler's release process.
Oh, yes, that's a great idea. In fact it could ask you what you want/need:
- Do you need YAML? (yes/no)
- Do you need XML? (yes/no)
I think the right boundary for usefulness lies somewhere between JSON/HTTP (which should be in the stdlib) and YAML/XML (which should not). JSON is simple, and it's fairly common. HTTP (at least version 1) is simple, and very common. YAML is a much more complex standard, and is much less common than JSON. XML is very common (although not as much as JSON), yet is extremely complex, to the point that we really need a prebuilt library to parse it.
Here's a list of top-level modules I (personally) don't think should be in the stdlib:
- `Adler32` - never heard of it, seems to be there to satisfy the compiler only
- `Big*` - not so sure on this one, but they're really quite slow, not really high quality, and not well used
- `BitArray` - I'm 50/50 since it's so simple (but not well used). This is a very borderline case to me
- `Complex` - should we include matrices in the stdlib? Is complex truly special among mathematical constructs to want to be in the stdlib?
- `CRC32` - see `Adler32`. Perhaps this one is common enough to live in the stdlib under `Digest`
- `Debug` - do we want a fully featured DWARF parser in the stdlib or just enough to get us debug info on stacktraces? I'd say this should really be private.
- `DL` - well, this one should just be removed because it's fairly useless (what's its use case?)
- `INI` - seems much less commonly used
- `Levenshtein` - only here for the compiler (should be a shard which the compiler depends on)
- `LLVM` - only here for the compiler (should be a shard which the compiler depends on)
- `Markdown` - ditto
- `OAuth` - complex and specific, not sure why it's here
- `OpenSSL` - the only crypto primitives we should provide, IMO, are a few `Digest`s, `HMAC` and `TLS`.
- `OptionParser` - somewhat opinionated and only here because of the compiler
- `Readline` - named after a library instead of a concept (like `OpenSSL`, but even less popular/useful)
- `Termios` - no docs, idek what it does lol
- `XML`
- `YAML`

Perhaps the compression stuff (`Zip`, etc.) should be rethought a bit too. I think we should provide a few decompressors, but move them into their own module, and perhaps think about removing the file formats, and only keeping the raw compression formats.
I am not a crystal user, only a ruby user, so please feel free to ignore me completely. I am sure that whatever way you pick will be the one that will be right.
I am a bit confused as to why json is considered superior to yaml but since it also does not affect me,
I'll not comment on this further.
But one thing about distribution packaging, e.g. Debian:

> For example we lost a lot of time in the past release (0.24.0) because the version of `libxml2` that comes with Debian 7 doesn't work in Crystal. Specs weren't (and aren't) passing there. But nobody complains, so maybe nobody uses Debian 7 anymore, or maybe nobody uses Crystal+XML in Debian 7.

I would not worry about this. Distributions such as debian will already package things up
anyway, and it should be THEIR responsibility to do so. If you depend on them and then
wait on this or that distribution, then in my opinion this leads to frustration only.
Ruby has the xmas release and it does not matter what any distributions do in regards to
xmas - xmas gifts are coming that day.
If you were to depend on a distribution, then that distribution has a crippling effect on
upstream (in this case, you, as programming language developers), and I feel that
this is a very ... awkward constraint to be had.
I would love to see a stdlib shard, which would in turn depend on shards of these 'blessed' "pieces of the standard library", and likewise for other high-level groups of libraries (ui?, web?, gl?, etc). Smaller chunks [shards]; easier for community involvement and more optimizable code. [Kinda makes me think of MRuby's granularly configurable dependencies.]
@drhuffman12 this just feels right to me. Having a shard (or group of shards) like this allows for the standard library to be even more conservative about what is considered a de-facto component of the language. If this would (as @asterite suggests) make version updates more nimble, then I'm all for it!
I really do agree with shrinking the stdlib. Keeping the core smaller will lead to better maintenance. Rust has a small stdlib and no one seems to complain, so I don't think it'll be a problem for Crystal either. It would be nice, though, to keep at least HTTP/1.0 as part of the core lib.
> Basically, I'd like to move almost everything that the compiler doesn't use to a shard.

Is it a problem if the compiler itself ends up depending on (blessed) shards?
I disagree on the principle of moving everything the compiler doesn't use to a shard. I believe there are many things that the compiler does depend on that should be in a shard, and many things the compiler does not depend on which should be in core.
For example I think that a common batteries-included HTTP abstraction is essential for HTTP libraries and frameworks to play together. And the correct way to ensure this is to put HTTP in the stdlib. The compiler doesn't (apart from play) use this. But I still think it should be in the stdlib.
@RX14 I agree. I think LLVM could perfectly be a shard on which the compiler depends. It makes very little sense to have it in the standard library unless you are going to develop a compiler or interpreter, or contribute with the compiler itself.
I think in the Markdown discussion we concluded that it would make distributing Crystal a bit harder, but I'm not so sure about that.
I also kind of agree regarding a common HTTP server base. I don't know about `HTTP::Client`, maybe that applies too because `OAuth` can easily integrate with it thanks to `before_request`. I guess eventually `HTTP::Client` will have all of the currently missing features.
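As a rough illustration of the `before_request` hook mentioned above, here is a minimal sketch of how a library could attach credentials to every outgoing request. The host and the token are placeholders chosen for this example, not anything taken from the `OAuth` module itself.

```crystal
require "http/client"

# Placeholder values, assumed for illustration only.
token = "secret-token"
client = HTTP::Client.new("example.com")

# The block runs before each request is sent, so a library can
# decorate the request without the caller doing anything special.
client.before_request do |request|
  request.headers["Authorization"] = "Bearer #{token}"
end

# Any later call such as client.get("/resource") would carry the header.
```

Creating the client does not open a connection, so nothing is sent until a request is actually made.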
For now, we can discuss the YAML and XML modules.
Regarding XML, I think it's always required by `spec` to provide the JUnit formatter output, which is based on XML. That also is a bit strange: if you don't have `libxml2` in your system, you can't run specs, even if you don't want to use XML. That formatter should be a shard which enhances the `spec` library. Then we could extract XML to a shard.
Regarding YAML, it seems the specs for `crystal tool init` use it. But this could easily be tested in a different way without requiring YAML.
We could then open separate issues to discuss extracting other stuff, like `Big*`, to separate shards. Maybe if this idea is liked I could close this issue and open two separate issues, one for YAML and another for XML, so we can weigh the pros and cons of each.
I think that if common libraries like YAML are going to get moved out of the standard library, there should be some first-party way of maintaining local copies of shards, much like how Bundler installs gems locally.
A common use case for me is working on a plane or a train, where I do not have access to the internet, or it is spotty/slow at best. If I wanted to start a project while in transit, I'd be sad to find out that a common library I want/need to use isn't available because I can't download it in the moment, and won't be able to for another 5-12 hours.
I can't say I'm necessarily a big fan of Bundler because it ends up keeping various versions of everything around and you end up being forced to `bundle exec` everything, but being able to at least "cache" a more fleshed out (potentially customized?) "stdlib" would be really nice.
As a bit of a side note: seeing how `shard.yml` is (obviously) a YAML file, doesn't that mean that it needs YAML to run? I'm not entirely sure how it all works, but it seems a little odd that the system for installing dependencies would need to install dependencies to run. But I guess that's essentially what bootstrapping is....
@faultyserver `shards` is compiled, it doesn't need the yaml shard, or any source code, to run.
It does depend on libyaml though.
I also think that if we include an HTTP server we should have an HTTP client. I don't think the current HTTP module is well designed at all, though. `HTTP::Server` is usable, but `HTTP::Client` is a mess, and I don't like how client and server share `Request` but not `Response` classes. `HTTP::Client` needs middleware to allow shards to extend it, or it'll be replaced by something which actually does what people want (redirects, proxy, etc.).
I just think that sharding should come later. Please don't touch it before 1.0; it's simple as it is.
Multiprocessing is the feature I want :)
Can the parser/compiler be used to generate a dependency tree? If so, maybe such a tree could be used to help note the 'from', to help clarify what might be easiest/harder to move to shards, and to help clarify what might need to be adjusted/de-coupled/etc in order to be moved to a shard?
@akzhan, sounds good to me! Saving the sharding for 2.0 [or 1.x] would be fine with me. [Maybe sharding would help Crystal get to 1.0 sooner and I would love the pros; but I'd rather err on the side of less scope creep.]
@akzhan It's actually simpler if we move stuff out of the standard library to shards. Right now we have to tackle YAML, XML, etc., plus parallelism. If YAML, XML, etc., were shards, maintained by others (with the core team keeping an eye on them, but not necessarily having to worry too much because it's not in the standard library) it will give the core team more time to fiddle with parallelism and other more important stuff.
I actually started implementing tar some time ago, but never pushed it because it wasn't feature-complete. Now I'm thinking that I could have created a shard and others could have probably helped me fill the gaps. It doesn't matter if it's not feature-complete or if it's buggy because it's not in the standard library, and can evolve easier and separate from the language/compiler.
Another way to say it: a module that's in the standard library has to be well thought, documented and complete, and will be frozen once we reach 1.0. That means the core team has to spend more time on this.
Just look at how fast Elixir evolved. Their standard library is pretty small, but it has the core stuff in there. In fact, the tool for generating docs is not even included in Elixir, it's a separate tool. That lets the core developers focus on the language and a small core subset. Same goes with Rust, which already reached 1.0. Go is a different story because they had a lot more time and resources than those projects.
I agree, a smaller stdlib will be easier to maintain and will free up time to focus on parallelism or even Windows support.
I also think the documentation generator should be moved to a separate shard.
Maybe we can move some `crystal tool` commands too, like formatter, implementation or expand, benefiting the creation of tools like Scry and making it easier to craft new tools like rename or refactoring, without waiting for a new version of Crystal.
I agree with @faultyserver, the impossibility of creating anything useful without internet access seems sad. So it would be great to have an offline way to install shards, e.g. all `crystal-lang` shards could be installed by default when installing Crystal itself. On the other hand, right now it is possible to just create an app once with all the needed shards and then copy them from the `lib` folder, so it is not a critical inconvenience, just a minor issue.
Removing many libs from the stdlib decreases the possibility of using Crystal for scripting, when you need to write a 10-line script and you don't want to create a project and install shards.
> the impossibility of creating anything useful without internet access seems sad.

`yarn` for node (and I think `npm` does the same as well) creates a cache folder where it stores every installed package. When adding a package, it first checks the cache instead of downloading directly. This way you can install both faster and without Internet.
The omnibus release could prefill this "cache" with some of the blessed shards so that no internet is needed after Crystal itself is installed.
> Removing many libs from the stdlib decreases the possibility of using Crystal for scripting

We could have a `shard` macro to automatically install the shard into some standard location and include that location in the import path. Something like this:

    shard "github/mamantoha/crest", ">0.9" # optional version restriction
    require "crest"
    pp Crest.get("http://example.com/resource")

This can even be used for bigger projects (actually, I think this is better than having a yml file, but that's another story).
> Maybe we can move some crystal tool too

The compiler can import some tool shards and expose them on the command line. This way they are both included in the compiler (transparent for the user) and easier to maintain for the project.
@lbguilherme Yeah, caching shards in some common place like node would be very useful, see https://github.com/crystal-lang/shards/issues/180
> Removing many libs from the stdlib decreases the possibility of using Crystal for scripting

@kostya Maybe we should implement `shards install foo`, allowing us to require commonly installed shards without using a macro; see https://github.com/crystal-lang/shards/issues/144
I'm not sure I like the idea of removing doc and the formatter from the compiler. At least these tools should always be installed. They're also tied into the compiler internals, which means that refactors to the compiler will affect these tools, which makes putting them in a separate repo rather hard with a changing master. Also, more tools means more binaries we have to distribute.
Regarding offline usage of Crystal, I believe the problem will always be there. For example if I want to use MessagePack, well, it's not in the standard library, and I won't get it on a plane.
One can always download the shards in a local directory and then reference them with `path`.
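A minimal sketch of that approach, assuming the shard was cloned earlier into a nearby directory (the `msgpack` name and the `../vendor/msgpack` location are hypothetical):

```yaml
# shard.yml (sketch; name and path are assumptions for illustration)
dependencies:
  msgpack:
    path: ../vendor/msgpack  # local checkout, no network access needed
```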
I do agree about Crystal becoming less suitable for scripts. But I think scripts mostly use Array, Hash, File, IO, etc. Well, maybe JSON and CSV. We could keep those in the standard library because they have no dependencies, plus they are relatively simple.
But if scripts become a bit harder to write, I wouldn't mind. In any case I don't see the point in writing a small file, saving it in `/usr/bin` and compiling and executing it every time you want to run it. It's probably better to create a project, compile it with `--release` and put it in `/usr/bin`. I hope Crystal is more suitable for medium/large projects than for small scripts :-)
(we never said Crystal is a scripting language)
@faustinoaq Small side note, but doing a refactoring/renaming tool in Crystal is plain impossible, so I wouldn't spend time on that.
> I hope Crystal is more suitable for medium/large projects than for small scripts :-)

Well said! 👍 :)
> doing a refactoring/renaming tool in Crystal is plain impossible

@asterite Maybe because Crystal has the same issue as C++ with refactoring tools?
@asterite why are refactoring/renaming tools impossible? If a human can do it, surely the compiler can.
@RX14 @faustinoaq Let's continue that discussion in this gist
Personally, I was really positively surprised by the amount of useful features included in Crystal's stdlib. That's really nice, especially for beginners, to have so much readily available.
But sure, it doesn't have to be that way. And there is certainly some benefit from outsourcing complex libraries into shards.
However, the pro argument that shards can evolve faster is only relevant until a library is more or less feature complete. For example data formats like JSON or YAML are not to be expected to change much in the future. Once the Crystal implementation is mature enough, there shouldn't be much to evolve anymore.
I like MRuby's extremely modular/configurable approach with 'blessed' mgems available w/in the mruby repo, but only used/compiled into your app based on what you specify/config for your app; even the compiler is an mgem, available to your app. I'm not sure how much of that could map to a more sharded Crystal.
We can definitely trim some libraries off the monolith repository. I believe most libraries can be extracted and be of enough value to become shards (e.g. `LLVM`, `XML`, `Markdown`, `HTTP`, `OpenSSL`, ...). Some libraries need debate (e.g. `Big`); some libraries don't need much (e.g. `OAuth`).
I believe we should reason in terms of: what is usually useful to build libraries or applications in Crystal? `HTTP`, `JSON` and `SSL` are useful whenever you need Internet connectivity. Some serializers (`YAML`, `XML`), digests (`SHA1`) and PRNG are nice tools to have. `Markdown` is nice but somewhat specific. `LLVM` is very specific. Etc.
That being said, most could still be distributed as part of the stdlib (e.g. `YAML`, `JSON`, `OpenSSL`), and be easily overridden in `shard.yml` to get the latest version without waiting for the next compiler release, while still being available by default. The con is that bundled libraries have to follow the compiler's stability commitment, or keep maintaining a previous, compatible version, which isn't necessarily a bad idea.
Crystal is perfectly fine to be used as a "scripting language" as part of Crystal projects. I do so in my projects more and more, for a simple reason: I know `crystal` is installed. And thus I know the script will work on anything that Crystal runs on. I'd like to have this feature in the future without adding further hacks.
I'd also like the stdlib to stay "batteries included", but not by hosting everything in the compiler sources, but by having default-available shards.
But what if the stdlib isn't the issue (well, except for `OAuth`), but the compiler is? Maybe most issues would be solved already if the compiler were split into its own shard?
The Compiler team could then build a El Neato compiler, and the stdlib group a consistent stdlib.
I think the compiler could be moved from `src/crystal` to `compiler`, then it can have its own `shard.yml`. The problem is that if we don't vendor the dependencies into the repo to make them modifiable, we'd have loads of work coordinating the shards the compiler depends on to handle breaking changes.
That is of course, assuming we don't compile the compiler with the old stdlib. I'm not sure what I think about that.
Just wanted to add my $0.02.
I explored Crystal for a large project a little while ago. Part of this resulted in a few commits to the crystal project. I wasn't able to move forward with Crystal, the two deal breakers at the time being various bugs in generics and lack of multi-threading.
But there was a third issue that was gnawing at me, and that was the kitchen-sink approach the stdlib takes. I feel that it's huge baggage.
The one example that stood out (though not the only one) was `Hash(K, V)`. This was actually a bottleneck for our projects, so I wrote an alternative. I thought, "geeee, maybe I can contribute this to the project, let's see what behaviour I'd have to implement." Hopefully you know why I quickly abandoned that goal.
At one point, Go's map returned sorted keys. This was simply a consequence of the implementation, not something provided for in the spec. You know what they did? They changed the implementation for the sole purpose of removing that behaviour. Elixir is equally thoughtful... being functional, it leverages functional composition. I think Crystal could do some of that. Instead of `my_hash.keys()` it should be `Hash.keys(my_hash)`, which works against some type of interface (ideally implicit, like Go).
I realize the type of change I'm suggesting is no longer realistic, but I still wanted to share my thoughts. I enjoyed the time I spent working with Crystal. I knew the technical issues I encountered would eventually be fixed, but the state of the standard library is an issue (imo).
I disagree on making the collections we already have less featureful. I also don't see the point in using functional programming patterns in Crystal, which is clearly an OO language. `Hash.keys` makes little sense over using `Hash#keys`.
@karlseguin I don't get why the stdlib having `Hash(K, V)` is such an issue for you. If `Hash`'s implementation is too general for your purpose, to the point that it proves to be a bottleneck, it should be easily replaceable by a custom data structure.
If you had any trouble with that, could you explain what was the issue?
I don't think `Hash.keys` is in any way superior to `Hash#keys` or vice versa. They're both just different ways of expressing the same concept. Crystal is an OOP language with many ideas borrowed from Ruby & Co., so for Crystal, `Hash#keys` is the more reasonable choice.
@straight-shoota
What I'm trying to say is that the stdlib over-specifies, which makes it difficult to make changes in the future. Again, in Go 1.0 (and since then), they've actively worked at breaking ordering because that isn't a behaviour they want to maintain. Iteration order is just an example.
As for `Hash.keys` vs `Hash#keys`, it was just an example. But I will be pedantic and point out that this has nothing to do with OOP. When Alan Kay coined the term, he was largely talking about message passing and protected state. One version is not any more or less OOP than the other. I didn't mean to attack `Hash.keys`; it's the stuff like `shift` and `compact` that constrains the implementation. That said, the advantage of `Hash.keys` is that it could operate on a more generalized interface, which means that you could have a Set, SortedHash and Hash all implement some interface (say, `Enumerable(K, V)`) without having to specifically implement `.keys` and `.first` and on and on.
@karlseguin I don't get it. We happen to like Hash retaining its insertion order. If that's one of your reasons for not using Crystal, I can't help you.
Also, regarding "sharing implementations of common functionality": We already do that with modules. So what's your point?
A `Set` doesn't have a key. So what good would it do if I could call `Hash.keys(some_set)`? Hash, Set and Array can be iterated, though, hence they all have an `#each` implementation and benefit from the ton of functionality in `Enumerable`.
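To make the point about modules concrete, here is a small sketch of a custom collection that only defines `each` and gets the rest from `Enumerable`. The `Countdown` type is invented for this example and is not part of the stdlib.

```crystal
# Hypothetical collection, invented for illustration.
struct Countdown
  include Enumerable(Int32)

  def initialize(@from : Int32)
  end

  # The only method Enumerable requires: yield each element in turn.
  def each(&)
    @from.downto(1) { |i| yield i }
  end
end

countdown = Countdown.new(3)
puts countdown.to_a           # elements collected by Enumerable#to_a
puts countdown.select(&.odd?) # filtering comes from Enumerable as well
puts countdown.sum            # so does summing
```

Any type that can be iterated gets the shared functionality this way, without each collection re-implementing it.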
In the 90s, everything had to be OOP (for no good reason). A more recent war was "Death to relational databases!" and everything just had to use document-oriented databases. And now, everything has to be `FP` style. Just why.
Another Crystal built-in feature we can ship as a blessed shard is `crystal play`, WDYT?
I just remembered this while writing a comment here https://github.com/amberframework/amber/issues/352#issuecomment-342994862
@faustinoaq Playground is currently a part of the compiler, not stdlib. This issue is about shrinking stdlib, not splitting out compiler features.
Indeed, please let's keep the compiler interface for another issue.
I don't know English very well, so this text was translated/corrected with Google Translate. (Sorry for this.)
I like the idea of shrinking the library into shards and stdlib.
In the main library there should not be such things as XML/CSV/HTTP/HTML/JSON/ZIP, etc.
The best place for them is here https://github.com/crystal-community
It is better to have a reliable, well-tested, documented small stdlib.
Such things as XML/CSV/HTTP/etc come and go, (HTTP -> HTTP2 & WebSocket), (XML -> JSON || YAML || MsgPack || Protocol Buffers), etc.
Not everyone needs XML, and not everyone needs Json or Protobuf.
Not everyone needs HTTP, but somebody needs HTTP2 (which is still not fully implemented, but which is very actively used) or UDP/TCP.
It is better to follow the Unix way ("Write programs that do one thing and do it well.").
You cannot shove all the implementations of existing protocols and bindings to libraries into the stdlib.
Support for multithreading, Windows, etc. is more important than having a lot of batteries in the stdlib.
It is much easier to organize the distribution of shards than to maintain many components of the stdlib.
In addition, I do not believe that crystal is only a web-based language.
Therefore, the presence of such things as HTTP is less important than working with files/directories/sockets/etc.
Some technologies can become legacy (for example, HTTP 1.X), but because of the presence of HTTP 1.X in the stdlib, this code will have to be maintained for backward compatibility.
In general, I consider separation a more wise act, than trying to fit everything into stdlib.
Thanks for Crystal.
My 2 cents on this topic.
Batteries included is one of the primary reasons for the success of a language. The moment functionality is stripped and moved to separately maintained shards, there are major issues:
Maintenance:
It's not uncommon for shards, even official ones, to get less attention, to the point that people simply do not use them any more because of the lack of support.
If for instance the core language is lightweight, it does allow for faster releases. The con side is that you may start to see Crystal release 0.55 but Xml has feature / requirements that break with 0.55. Maybe nobody complains very fast and then people who upgrade to 0.55, need to deal with the broken shards. That becomes frustration and more pressure again on the developers as the same whining will happen in the core issue tracker and the shard issue tracker.
My case in point: take a look at the D language. Non-stop releases, but also non-stop breaking dubs (shards). As an end result it even affects code editors like VSC, where the coding plugins use dub projects and break. Do not ask me how much time got wasted because of breaking features.
I already see the exact comment in this topic that boils down to: let somebody else deal with that extended functionality so we, the core developers, can deal with the more important stuff. And then the above-mentioned happens, where packages do not get maintained and more discussions follow. Please look at some of the other languages to see the same issue.
Sure, it helped to get faster releases out because the core developers can play more with the new stuff they want to implement, but it also became a hot mess as the packages got ignored and maintainers left. And as a result nobody wanted to fix the hot steaming mess and ... the core language keeps evolving and breaking more and more because of the "hey, it's not our issue to fix packages when we introduce changes" attitude.
Documentation:
There is nothing worse than having documentation scattered over multiple projects, let alone dozens of shards.
Why is PHP as a language so popular? It's not only because you can run PHP on any webhost, but because it has the kitchen sink included. And that same kitchen sink is also present in its standard documentation. While some modules are "external" and can be looked upon as "shards" that need separate installation (GD, Zip, ... lib for instance), the fact that it's all in the standard documentation is a blessing.
Differentiation:
Another issue that happens over time is that people cannot see the trees inside the forest. As more and more shards by third-party developers appear, they have the habit of drowning out official shards.
LTS:
As a developer, I always prefer the official std library, as I consider it LTS (Long Term Support). You have confidence that the developers will maintain the functionality to stay up to date with the core language. The same cannot be said when shards get more and more split away.
I have seen several other languages where maintenance gets so split among maintainers that bug fixes or other patches stay "stuck" for ages, as the maintainer is too busy with other projects.
The only language where I have seen a different attitude is Rust, where the core developers actively check every "crate" (by building them against the latest release candidate) to see if a new release will break third-party crates, and then notify the maintainers.
Visibility decreases:
A perfect example can be seen by @MrSorcus pointing to https://github.com/crystal-community ... Try finding something in there, if there are dozens, hundreds of shards in the future.
It's not just about splitting a few pieces out now; it's about the future, as the language grows with more possibilities.
Complexity:
It also means that people are forced to use "shards", whereas currently one can do almost anything without requiring a shard. It's very flexible and user friendly. The moment people need to start creating sub projects (aka sharded projects) to try out even basic functionality, ...
Let alone all the extra files scattered over projects: whereas they now use one main std library, they would have the same shard in maybe project a, b, d, g, but maybe with different versions that then also need updates.
It forces the requirement of git for even small test projects. See the above mentioned extra requirements for shards.
One of the main features that I personally like about Crystal is that it's user friendly as a language to get going. Shards have always been an extra layer you only want to be concerned with after you get more into the language.
Conclusion:
I am personally against any half-baked sharding out of the standard library. Let's not call them shards but "official extensions".
It can only be done if:
The whole sharding away from currently official Std library functionality leaves a bad impression. I understand the needs for separating but simply sharding them away is asking for trouble in my book.
There is nothing wrong with releases that take more time, because one wants to be sure that everything works. PHP only has a major release every year. Go has one every 6 months. Developers prefer long term stability, where everything simply works after an update without breaking official plugins. People who want to live on the edge of advanced features can always manually compile or run beta versions.
Based on the feedback here, I'd like to revise down the list of things I want removed:
- `Adler32` to `Digest` or remove it.
- `CRC32` to `Digest`.
- `Debug` private.
- `DL`
- `Levenshtein` - move it into the compiler
- `LLVM` - `LLVM` is going to keep on making breaking changes after our `1.0` whether we're ready or not.
- `Markdown` - It's very opinionated and quite complex, I think this should go.
- `OAuth` - complex and specific, not sure why it's here
- `OpenSSL` should be refactored. The only crypto primitives we should provide is `Digest`, `HMAC` and `TLS`. Named after a library not a concept.
- `Readline` - named after a library, instead of a concept (like `OpenSSL` but even less popular/useful)
- `Termios` - no docs, idek what it does lol
- `XML` i'm 50/50 about but leaning on it should stay. It's been fairly stable for a while now.
Readline is a must for pretty much any time when you're reading input, and OAuth is useful for writing web apps (Crystal already has an HTTP server.)
Idk what Termios does, but I'm guessing it's a wrapper over Posix terminal manipulation primitives.
@kirbyfan64 I've never used `Readline` so it's hardly a must. The number of applications that really need advanced line editing in the terminal is pretty small. `OAuth` would be fine as a shard, it's not used in every webapp. If nobody knows what `Termios` does it should clearly be removed.
Good words from @Wulfklaue :+1:
It is a tough question to find the right balance between things to include and those better living outside the standard library. And we shouldn't shy away from including a lot, because many things are just really useful to have in stdlib.
The list compiled by @RX14 looks very reasonable to me.
I'd argue for XML to stay. It's a very common data format, similar to JSON and YAML, just not as hyped anymore. It is a proven standard and while it has been replaced by the former in some applications, it's still very popular and won't lose importance anytime soon. LibXML is also a stable library to bind to.
OAuth is useful, no argument there, but it's just one of many protocols for HTTP-based authentication (which is a feature not as widely used overall for Crystal applications). It's great to have a basic HTTP server and client in the standard library, but specific protocols like OAuth can easily live in a separate shard and integrate with the stdlib HTTP component.
In my endeavors into RPC in Go I found its net/rpc stdlib package, which I thought was really useful and fit my use-case very well. Then I read the docs and almost missed the text at the bottom that said "The net/rpc package is frozen and is not accepting new features.", and I wanted to know more and found this issue which I believe is a prime example of what could happen with modules in the stdlib that probably would be healthier as a shard.
In case you do move some parts of the stdlib into shards, will they be left to the community to develop or will they still be supported by the core developers? In case of the latter, they could still be included in the official documentation but separate from the standard library. That way, newcomers won't be discouraged from a lacking standard library when they see commonly used shards that are "officially supported" directly on the official docs page.
I think that many developers will feel reassured when something is officially supported. I think that is one of the main reasons why the net/http stdlib package in Go is preferred over popular and more efficient community-maintained ones like fasthttp, because net/http is officially supported and will never become "abandonware".
Edit:
To illustrate, I'm thinking something like this or a separate tab:
It's a minor detail better discussed in another issue, but I wanted to illustrate my point that they shouldn't just disappear and be hard to find for newcomers.
@RX14 you can also include `FileUtils`, that is basically a shim that mimics standard UN*X commands.
My opinion is to keep well-used libraries that are considered standard and that can have one best generic implementation. Others that can have multiple opinionated implementations or fill a specific use case should be moved out.
If we follow this logic: `YAML`, `JSON`, `XML` and `CSV` can be kept as official extensions (I agree with @Wulfklaue) because they are widely used, and they can be implemented in one best way (but maybe I'm wrong). Same for the `HTTP` libs, but they should do the bare minimum, so other higher level shards can use them as a base.
Why not have a shards command like `shards install official-extension` that installs the official extension and adds it to the `shard.yml`? We can also imagine some magic around installing it, based only on a `require "official-extension"`, or the compiler suggesting that you should do a `shards install official-extension` because it's missing.
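For illustration, this is roughly the manual step such a command would have to write for you. It is only a sketch: the `crystal-lang/xml` repository and the `my_app` project name are assumptions, nothing in this thread confirms that such a shard exists.

```yaml
# shard.yml (sketch; project and repository names are hypothetical)
name: my_app
version: 0.1.0

dependencies:
  xml:
    github: crystal-lang/xml
```

After running `shards install`, `require "xml"` would resolve to the installed shard rather than to the standard library.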
After using Crystal's http/xml for some time, I see that both are not fully complete.
If they are in the stdlib, you might think that the whole stdlib is in this state.
And while json or yaml can (maybe) be fully implemented in accordance with the RFC, the others cannot.
There is no need to overshadow (discredit) the stdlib with incomplete implementations.
This is tangentially related but I think it'd make sense for the functionality in `Colorize` to be part of a bigger ANSI package. Maybe this is the wrong place to bring this up since I'm advocating for making the standard lib bigger, but if Crystal is going to have methods that basically just wrap strings in ANSI color codes, it makes sense to have the ability to use the positioning and many other codes too.
Here are some good resources on these:
http://invisible-island.net/xterm/ctlseqs/ctlseqs.html
https://en.wikipedia.org/wiki/C0_and_C1_control_codes
https://vt100.net/
Supporting all of these is absurd, because very few applications use most of those VT100 codes, but the major ones like moving the cursor, saving and resetting the screen, etc., I think could all live together with the color codes in one ANSI library. I know some people probably don't like colorize and feel it too should be separated from the std lib, but I think it's great to be able to write nice terminal applications right out of the box in Crystal without having to worry about curses. (also why I think Readline should definitely stay. Having readline is very helpful when you just want to knock out a quick text prompt ui for a terminal application.)
@Sevensidedmarble you're thinking of something like Nim's Terminal library right? Which is part of the stdlib. I actually do like that idea, but it would be nice if the colorize functionality remains the same at least (ie. Being able to actually colorize strings, rather than setting a color for all text written to the terminal).
My two cents about the whole issue: I like the idea of splitting things up, mainly on the grounds of maintainability. As has been discussed though, maintenance could also suffer if the shards aren't taken care of. I feel like that's better than having an API be completely locked down after v1 though.
At the very least shards like YAML, JSON, and XML belong in their own shards. I'd say you could also expand that to OpenSSL, HTTP, and Big.
I know it's a separate topic, but having shards be globally available (as with Ruby gems) could fix the problem of having to specify a shard.yml any time you want to include one of those shards. We'd just need a `shard install` command that globally installs a shard, and then if the project doesn't have a shard.yml the compiler could just get that shard from the global cache. If multiple versions of that shard are installed it could just get the most recent one.
I'm sure this has downsides as well but it would solve some issues.
> but it would be nice if the colorize functionality remains the same at least (ie. Being able to actually colorize strings, rather than setting a color for all text written to the terminal).
The thing is @watzon what the Colorize lib does is put the terminal code to set a given color at the beginning of your string, and at the end it puts the code to reset the color to what it was before. So you're telling the terminal, print green from now on, now print my string "hello world", now reset the color state. So those operations are already sort of masking the concept that the terminal has state from you. If you never sent the final code it would print green forever from that point on. The terminal has no capability to say 'print the next string in green' or something, it only has this one sense of state. So we're already dealing with these codes, we might as well be able to position the cursor too.
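A minimal sketch of what that looks like at the byte level, written against raw escape sequences rather than the `Colorize` internals. The helper names `green` and `move_to` are made up for illustration; the sequences themselves (`ESC[32m`, `ESC[0m`, `ESC[row;colH`) are standard ANSI/xterm control codes.

```crystal
# Wrap text in "set foreground to green" and "reset attributes" codes.
# The terminal keeps state: without the trailing reset, everything
# printed afterwards would stay green.
def green(text : String) : String
  "\e[32m#{text}\e[0m"
end

# Build the sequence that moves the cursor to a 1-based row and column.
def move_to(row : Int32, col : Int32) : String
  "\e[#{row};#{col}H"
end

print move_to(2, 5)
puts green("hello world")
```

Both helpers just build strings, which is why coloring and cursor positioning could plausibly share one small library.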
And yeah I worry very much about these libraries being relegated to shards and forgotten. I myself find the most joy in writing terminal applications so I'd be very sad to see the terminal libraries decaying if they leave the std lib.
Should probably remove `ARGF` too. Does anyone here use it?
I think `JSON::Serializable` and `YAML::Serializable` would better live in an organization/project outside the stdlib (like serde). Rust and Go have only basic encoders/decoders too; this complexity can be delegated to external shards.
- `Serializable` implementations for formats not implemented in the stdlib like TOML, XML, CON.
- `Serializable` features, may even be part of this organization
- `Serializable` moved out, there will be still one way to serialize with the stdlib: `JSON/YAML.mapping`
> Should probably remove `ARGF` too. Does anyone here use it?
Yes, pretty much always.
- `crystal eval` works without requiring explicit shards (batteries included).
- `crystal init app` adds a dependency for crystal-community/stdlib with an advisory version equal to the stdlib version used with the compiler.
That's the basic idea. To most developers all the commands stay the same and 2 lines are added to new shard.yml files by default. Documentation is identical. Library availability online or off is identical. `crystal` and `shard` commands are identical. Project management is improved. Upgradability is improved. Long term support is improved. Documentation generation is improved (see below).
The remainder of changes are for the shards command, documentation generator, or other plumbing that doesn't concern the average user or usability of crystal.
Minor changes are made to the documentation generator to include stdlib dependencies in the official documentation and mark them as such.
Changes to the documentation generator to include shards and annotate which shard the API is from (if it doesn't already) would have other benefits. Now documentation of each project can be generated with all available APIs for the specific versions used in your projects. Going to work on a ~4yr old project would be much better with its own documentation rather than relying on Google. If you tried to search for crystal 0.x.y, it wouldn't give you documentation for libfoo 0.y.z all in one place.
Within this proposal no project needs to require stdlib. It can be removed by deleting the dependency in shard.yml and adding dependencies to the individual libraries used.
In addition more focused stdlibs could be published by anyone such as stdlib-http which includes dependencies for at least [http-client, openssl, json] that are validated and tested together against various compiler versions.
It almost doesn't matter. Everything not required by the compiler (@asterite's preference) can be put in its own library and still be available for use and documented as it is right now.
Automated testing:
- `shard` command tell you that it's nonfunctional. Almost all of your concerns are addressed.
App "foo" is using stdlib v1 but wants to use xml v2. If stdlib is locked to v1 then it's not possible to upgrade shards a piece at a time which is often necessary for old applications.
Perhaps a better name would be "default locked version". Locked to a specific version unless specified elsewhere.
Needs to track shard "foo" tested "ok" or "fail" against various compilers. This seems like a general purpose change not limited to stdlib.
- `shard upgrade` on a new compiler uses the new compiler defaults.
- `shard upgrade` with a version specified uses the version string as normal.
The code required to implement (most) of this proposal is minimal. A handful of defaults and additional search paths. Additional glue is necessary in shards to check compiler versions but that's probably necessary over the long term anyway for all libraries not just stdlib.
Should I have put this here or in a new PR?
@RX14 Was there something I didn't address? Or some part that can be improved?
Something that would be actionable in an iterative way and beneficial whether the std-lib is split or not is to separate the prelude from core.
Letting core be a minimalistic prelude alternative will:
Currently, low-level stuff like `Pointer` ends up using modules, macros, and classes that are convenient for a full batteries experience but are not required in a minimalistic environment: nice to_s, conversions to other types, non-essential operations, etc.
Another example is `Exception`: extracting the backtrace depends on lots of things. For minimal behaviour, we could have no backtraces but still have the begin/rescue semantics.
Due to refactors it is easy to end up with circular dependencies across types in the prelude. Having a core will help to put some strict boundaries there. Right now the prelude has some first modules included in a specific order, but leaving only those does not compile.
Core should have the minimal runtime, most primitives (but not necessarily all), and some classes/operations but not things like map/filter/etc.
I'm not sure the std should/can be split in the near future, but even so, I think that what I am proposing is still a needed step.
I think this would introduce massive breaking-changes for current Crystal projects. Which will cause Crystal devs a lot of pain that is not really necessary. Especially considering the state Crystal is in now.
Shards exist for this very reason. Instead of all the shards being added to the core, they are... shards. This is already sufficient. Crystal already has great modularity, and the current codebase is in no way bloated. I'm sorry, I just don't see any justification in doing this. In fact, I beg asterite, bcardiff, and the other lead devs to never even consider doing something like this.
The lead devs decide on what is added to the core language. They've already merged what we have, as they thought it would be good for the core at the time. There is no reason to backtrack.
Also, just because someone personally doesn't find something useful, that can't be used as an argument to get it removed, because you don't know whether another Crystal dev is already utilizing it.
@didactic-drunk an actionable plan is the easy part: create repositories, move files there, use submodules or a shard.yml with loose patch level constraints (`~> x.y.z`) to make it easy to install, ..
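As a sketch of that constraint style: in a `shard.yml`, `~> 1.2.3` accepts 1.2.3 and later patch releases but not 1.3.0. The shard name, repository and version numbers below are placeholders, not an existing shard.

```yaml
dependencies:
  openssl:
    github: crystal-lang/openssl  # hypothetical repository
    version: ~> 1.2.3             # >= 1.2.3 and < 1.3.0
```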
There is more to the problem: loss of confidence when merging changes and/or a more complex crystal test suite, interdependencies, ...
It would be great to see some libs like LLVM or OpenSSL extracted and each tested against different versions, to be able to install patches quicker for them, but it's not _that_ easy to put up.
> I think this would introduce massive breaking-changes for current Crystal projects.
None. No breaking changes. That's part of "Requirements 1." implemented as "minor changes 1.".
@bcardiff What you propose sounds entirely different from splitting stdlib and core. I think you're talking about minimizing dependencies within core. I'm not up on compiler internals but stdlib is right up my alley.
@girng Rust does something similar. The compiler and stdlib can run at different versions. The proposal I listed allows the same mixing and matching of different versions for long term support in a way that's seamless. You won't know about it unless you try to change the stdlib version manually.
Here's an example of how this benefits long term support:
With core + stdlib as a single package you can:
Or with stdlib split from core:
stdlib-http ~>= 3.0
It's way less work and way less breakage.
@ysbaddaden If crystal provided nightly CI builds to test against, would this address the confidence issue? Could tests for stdlib-a run both with and without the current stdlib to check for integration issues? Running without stdlib would only add its listed dependencies to the require path. With stdlib (the default) all of stdlib is available in the require path.
Would a travis image with all supported crystal versions preinstalled help? Tests could run against all available or a listed set.
Should there be tooling to aggregate the test results for all packages in a single place or will the existing ci infrastructure work?
> If YAML would have been a shard, we could already have the new functionality without having to wait for a next compiler release.
True.
> Because they are less critical, we could accept more contributors.
This can be done without splitting the repo and the corresponding downsides. Can also be done more granularly.
> These shards don't need to freeze their API in a backwards-compatible way after 1.0, unlike the standard library
And is that a good thing? If there is no HTTP client that maintains its API, what are people supposed to use?
> It simplifies the maintenance work for the crystal core developers
If you mean that ignoring problems is a simplification, sure. In fact, with how PRs don't get merged, it's already pretty simple :)
> It simplifies the release process.
Maybe the particular reason you provided is true, but I'm pretty sure that overall it will complicate things. Even starting with things like backwards-incompatible changes, though thankfully those are handled decently now.
But surely 1 repo is easier than N repos?
> These shards could provide more features than what they provide now. We tend to be very conservative with the standard library because every public method is something that needs to be maintained and kept for backwards compatibility.
True, but moving code to a different git repository shouldn't completely correlate with dropping all conservatism and is also not necessary for it.
Do you have a complete solution for long term maintenance and incremental upgrades? How do you upgrade only `YAML` in a 1 year old application? Or upgrade `YAML` as far as it will go, including its dependencies? If stdlib is part of crystal you can't, except by copying files yourself and tracking dependencies manually. If it's split, cross-version testing will tell shards what works and what doesn't, saving everyone who has to upgrade an application lots of time.
There's also the question of whether Manas will give more people write access to crystal-lang/crystal. I suspect they don't want to which may require a split to get more contributors.
> These shards don't need to freeze their API in a backwards-compatible way after 1.0, unlike the standard library
> And is that a good thing? If there is no HTTP client that maintains its API, what are people supposed to use?
With split stdlib if you really need old HTTP functionality you can have it by setting the version. There's additional flexibility for forward or backward versions. With that flexibility APIs can evolve without pissing off as many people.
Most of the complexity you worry about can be automated for crystal core developers. It doesn't concern end user developers. (Is that a term?) They still install crystal the same way they did before mostly via package managers.
> It simplifies the release process.
> Maybe the particular reason you provided is true, but I'm pretty sure that overall it will complicate things. Even starting with things like backwards-incompatible changes, though thankfully those are handled decently now.
> But surely 1 repo is easier than N repos?
I assume the current release process is:
The shell script will get more complex to handle the split but the release process will probably stay the same.
This is an issue about removing modules from the standard library, not splitting it up into multiple git repositories.
Please create another issue about that as they are different problems.
But please be aware that it'll never happen, as it creates a bunch of migratory work for the core team for almost no benefit over the code maintainers model oprypin mentioned. I don't even think crystal is large enough for that model yet.
The problem is a social, organisational problem with crystal, its community, and mostly the core team. Treating it as a technical problem won't give the correct solution.
I would love to see a stdlib shard, which would in turn depend on shards of these 'blessed' "pieces of the standard library", and likewise for other high-level groups of libraries (ui?, web?, gl?, etc). Smaller chunks [shards]; easier for community involvement and more optimizable code. [Kinda makes me think of MRuby's granularly configurable dependencies.]