Crystal: [RFC] Remove recursive aliases

Created on 20 Oct 2017  ยท  60Comments  ยท  Source: crystal-lang/crystal

Recursive aliases were introduced when we were implementing JSON parsing. We needed a type that could be Nil, Bool, Int64, Float64, String or an array of those same things, recursively, or a hash with values of those same things, recursively. So we decided that having a type that could reference itself in its definition was the way to go.

That works, but it has two disadvantages:

  1. We can't define methods on this new type, because we can't define methods on aliases
  2. The language already has a way to express recursive types!

For example, the current definition of JSON::Type is:

module JSON
  alias Type = Nil | Bool | Int64 | Float64 | String | Array(Type) | Hash(String, Type)
end

We can express that same thing using a struct:

module JSON
  struct Type
    @value : Nil | Bool | Int64 | Float64 | String | Array(Type) | Hash(String, Type)

    def initialize(@value)
    end
  end
end

Or simpler using the record macro:

module JSON
  record Type, value : Nil | Bool | Int64 | Float64 | String | Array(Type) | Hash(String, Type)
end

In terms of memory representation and performance it's exactly the same: a recursive alias is a union, and as such it's represented as a struct.

Now, when working with a recursive alias such as JSON::Type we can simply cast a value to an integer using json.as(Int64). With the struct approach we have to do json.value.as(Int64). But since we don't like those casts, we have JSON::Any in the standard library which wraps the JSON::Type alias. Sounds familiar? We could have simply never used a recursive alias in the first place! Just make JSON.parse return a JSON::Type that is already a struct whose value is the union of possible types (recursively). Want to get the raw value? Call value. Similar to JSON::Any.

That said, I believe we can safely remove recursive aliases from the language. They are not an essential feature, and in fact using a struct is better because we can add methods to them. And removing them will also simplify the compiler's code.

breaking-change refactor draft compiler

Most helpful comment

Without union types, I wouldn't be using Crystal at this point. It also is the most popular (most liked) feature when the question "What do you like the most in Crystal?" comes up every other week in the chat channel.

Functional programming is nice. But not that nice. Ruby showed that copying the useful parts from FP while maintaining a OOP language is not only feasible, but actually good. Crystal simply improves upon this, and for good reason.

All 60 comments

And in fact, this also solves another problem that's a source of confusion and complaint with JSON::Any. Right now we have: JSON::Any#to_a which assumes the raw value is an Array and returns it as such. But this Array is of JSON::Type, not of JSON::Any, so we can't invoke to_a on them.

Using the struct approach, the elements are themselves JSON::Type which will provide these methods. So much simpler and powerful!

Nice thinking. This looks great

At first I immediately didn't like the proposal because I use recursive type aliases a lot in a personal toy project (https://github.com/lbguilherme/rethinkdb-lite) and it will break badly. But looking carefully, this looks great. I wasn't aware that an struct could have a self-reference like this, nice!

One issue I see a lot is taking an Array(String) and turning this into, say, JSON::Type. This is what you need to do now:

x = ["a", "b"]
x.map(&.as(JSON::Type)).as(JSON::Type)

And for the future:

x = ["a", "b"]
JSON::Type.new(x.map {|e| JSON::Type.new(e) })

But JSON::Type could have a constructor taking any Array and doing the map in the constructor, so this would be reduced to just JSON::Type.new(x), which I like a lot.

+1 for this change. The refactoring will result in better code after it.

@Papierkorb @oprypin Care to share why you ๐Ÿ‘Ž ? I'm honestly interested in what you have to say about this (I only used recursive aliases in JSON and YAML and in these cases I think they are not very useful, but maybe there are other use cases I'm missing).

In the meantime, I will try to refactor JSON (and probably YAML) to use this approach and see how it goes, if the API ends up being nicer.

I wasn't aware that an struct could have a self-reference like this, nice!

This is kind of guesswork, but I think the key here is in the union; IIRC without it, recursive structs wouldn't be supported, since they would take infinite memory... but it works here because the union adds a pointer indirection.

kirbyfan64: It works here because the struct isn't directly self-recursive, it contains itself only by way of Array or Hash intermediaries, both which of do add indirection.

Comments on the proposal text

in fact using a struct is better because we can add methods to them

I don't see this as advantage, actually it's a disadvantage:

I can add methods to them. I don't have to. The issue is that I as reader of source can't know this, and have to subsequently look it up. Reading an alias is fast, it's only a single line with a restrictive syntax. Reading a struct is by very definition more complex. Thus, not being able to add a method if not needed is an advantage.

And removing them will also simplify the compiler's code.

The language serves the usual programmer, not the compiler developer. A compiler is not the common program. I think the presented reasoning is harmful in that it might set a dangerous precedence of throwing stuff out that doesn't make sense for only the compiler, while everyone benefits from it.

I'm also missing discussion of the bigger picture in this RFC.

The purpose of recursive union types

I think we should not discuss throwing out features at random without actually discussing what we actually were trying to achieve in the first place. It is correct that I wasn't there right from the start, so I have to do educated guesses.

What we want is a way to store data of an union type inside a container for that very union type itself. It's basically what JSON::Type is.

Issues with the proposed syntax

The biggest issue is that your proposal changes the whole usage semantics of recursive aliases. The issue isn't even the struct (I regard that as implementation detail). But that I now have to use #value to get the data out of it. This in turn clutters the users code with .value for no gain on their end.

Possible solutions

The current solution is using an recursive alias. You propose making the user manually write a struct instead.

Without knowing the compiler source, I see a few other solutions countering the two main issues: Compiler complexity first, then the verbose struct-as-alisa syntax second:

  1. Allow for recursive structures using the current syntax. The compiler can detect if it's recursive, and if so, rewrite it to a struct internally (Also with automatic #value unpacking). Then the compiler could rely on the existing, reliable struct code emission. Hopefully, this would reduce overall complexity, focusing this language feature in a single place instead of all-over-the-place.
  2. If my assumption in 1. is wrong, at the very least offer a recursive-alias-struct macro, and market it as such.
  3. Or of course, also an option, fix the existing code.

A short use-case study

So what's the use of recursive aliases? At least for me, the following is a common pattern:

def visit_recursively(value : JSON::Type)
  case value
  when String then visit_string(value)
  when Array(JSON::Type) then visit_array(value)
  # And so on ...
end

Observations:

  1. I assume that for many users, value is a popular (as fitting) name for the variable. Having to write case value.value is more confusing than simple case value.
  2. With the struct, you'd actually have to write: case v = value.value. How ugly is this?
  3. A struct isn't a union type of course. So sometimes, case f would be fine, other times case g = f.value is required. Confusing?

Another possible solution

I see one possibility of using the struct solution, while getting rid of the #value spam. However, this would require changes to a different part of the language:

  1. Make Object#as(T) overridable.
  2. Make Object#===(T : Class) overridable.
  3. No change to Object#is_a?.
  4. Alternatively, add Object#like_a?(T), checking if #as(T) exists and if #===(T) would return true. Does appropriate type restrictions in the following body, just like #is_a?.

The algorithm when encountering an (implicit) value type check would now be as follows:

  1. Encounter a case condition checking for a type
  2. Check if Object#=== returns true, and that Object#as(T) exists
  3. Implicitly call #as(T) on the value and retype its type restriction in the inner block as is done today
  4. Else, do all other already existing checks

If #like_a? is accepted, the algorithm would be:

  1. Encounter a case condition checking for a type
  2. Check if #like_a?(T) is true
  3. Do the re-typing

This would allow the user (the stdlib) to write recursive alias-structs without spamming other code with it. This also elevates user types to first class citizens for the compilers type deduction.

Conclusion

This is a hard topic. "Just throw it out" is in my opinion too short-sighted and harmful. In the end, you have to pick your poison: To diverge from what made Crystal great by keeping boilerplate to the absolute minimum, or to make the language more powerful.

@Papierkorb I understand your reasoning.

When I say "it will simplify the compiler" I also meant "it will simplify the language". Removing a feature that is essentially a duplicate of another feature (the way I see it) is better: less things to learn, less ways to do a same thing.

As far as I know, no such feature exists in other programming languages. I'm pretty sure that's because it's not needed.

I personally don't see writing json.value as something that's so terrible. Plus you can have methods to extract the value from it (in fact JSON::Any has this right now) so you can do:

if string = json.as_s?
  # string
elsif array = json.as_a?
  # array
end

and of course you can also do this, as you say:

case value = json.value
when String
  # ...
when Array
 # ...
end

Comparing having to write value = json.value vs. adding a feature that's complex to implement and understand (maybe not for you or for me, but it is tricky), that's bug prone (for example you can define a recursive without a base type and that crashes the compiler, and I don't know yet how to detect that, plus having to repeat the logic of detecting an infinitely expanded recursive alias as a struct) and that simplifies the language (less concepts to learn), I prefer having to write that. I mean, if we remove the feature you will have to write maybe 10 more chars in an entire program. I don't think that's a reason to keep a feature.

In any case, as I said, I'll try to refactor the std to not use recursive aliases at all and see how it looks and feels like. Maybe then I'll have a better opinion on this.

As far as I know, no such feature exists in other programming languages. I'm pretty sure that's because it's not needed.

No, it's because it's not possible. If you removed union types, on the other hand, then it would not be needed. And oh boy how much simpler the compiler would become!

As far as I know, no such feature exists in other programming languages. I'm pretty sure that's because it's not needed.

I don't know if any other static-typed language has this feature, or a feature like this. I just want to point out that this may be a fallacy. I think the type system of Crystal is kinda unique, and with that, unique solutions to problems can (and should!) evolve.

@oprypin I actually believe that removing union types from the languages would be something good. I know it's one of the things that make Crystal unique, but I now think algebraic data types are better, and probably just nullable types (not union of X and Nil). But it's probably too late to change all of that now.

Without union types, I wouldn't be using Crystal at this point. It also is the most popular (most liked) feature when the question "What do you like the most in Crystal?" comes up every other week in the chat channel.

Functional programming is nice. But not that nice. Ruby showed that copying the useful parts from FP while maintaining a OOP language is not only feasible, but actually good. Crystal simply improves upon this, and for good reason.

As far as I know, no such feature exists in other programming languages. I'm pretty sure that's because it's not needed.

Well, one alternative explanation is that it is hitting the boundaries on where the research frontier is.

http://www.cl.cam.ac.uk/%7Esd601/papers/mlsub-preprint.pdf made a fairly large splash when it came earlier this year and is an example of a paper that is investigating the general area of recursive typing and hindley-milner. It has an example implementation at https://github.com/sweirich/hs-inferno - see the tests for examples.

I think without recursive alias y-combinator looks quite difficult to achieve:

alias T = Int32
alias Func = T -> T
alias FuncFunc = Func -> Func
alias RecursiveFunction = RecursiveFunction -> Func

fact_improver = ->(partial : Func) {
  ->(n : T) { n.zero? ? 1 : n * partial.call(n - 1) }
}

y = ->(f : FuncFunc) {
  g = ->(r : RecursiveFunction) { f.call(->(x : T) { r.call(r).call(x) }) }
  g.call(g)
}

fact = y.call(fact_improver)
fact = fact_improver.call(fact)
pp fact.call(5) # => 120

https://carc.in/#/r/2xpy

https://stackoverflow.com/questions/45237446/recursive-proc-in-crystal

I almost got it but failed ๐Ÿ˜…

record T, value : Int32 do
  def *(other)
    self.value * other.value
  end
end
record Func, value : T -> T
record FuncFunc, value : Func -> Func
record RecursiveFunction, value : RecursiveFunction -> Func

fact_improver = ->(partial : Func) {
  ->(n : T) { n.value.zero? ? 1 : T.new(n.value) * partial.value.call(T.new(n.value - 1)) }
}

y = ->(f : FuncFunc) {
  g = ->(r : RecursiveFunction) { f.value.call(->(x : T) { r.value.call(r.value).value.call(x.value) }) }
  g.value.call(g)
}

fact = y.value.call(FuncFunc.new(fact_improver))
fact = fact_improver.value.call(Func.new(fact))
pp fact.value.call(T.new(5)) # => 120

https://carc.in/#/r/2xq3

WDYT @veelenga ?

the Y combinator can be used to formally define recursive functions in a programming language that doesn't support recursion.

So why would you write such code in crystal?

So why would you write such code in crystal?

@monouser7dig Not use case at all, just wanted to make a y-combinator ๐Ÿ˜… and @veelenga gave me a brilliant solution using recursive alias ๐Ÿ˜‰

@faustinoaq ok I thought this was supposed to be an argument for keeping rekursive aliases and that surprised me

I have no strong opinion on the recursiveness, I'd just like to have usable typeoptions instead of alias

Thinking about this, as long as performance is the same (it is in this case), I mostly don't care what happens behind the scenes. All I care about is the syntax.

I'm okay with this change:

#before
alias Type = Nil | Bool | Int64 | Float64 | String | Array(Type) | Hash(String, Type)

#after
record Type, value : Nil | Bool | Int64 | Float64 | String | Array(Type) | Hash(String, Type)

But I do not like this change:

#before
def foo(thing : Type)
  case thing
  when Float
    #do stuff
  when Bool
    #do stuff
  #etc..
  end
end
foo(123)

#after
def foo(thing : Type)
  case (t = thing.value)
  when Float
    #do stuff
  when Bool
    #do stuff
  #etc..
  end
end
foo(Type.new(123))

Is there no way to keep the same syntax while changing things behind the scenes?

Should we go forward with this? The standard library doesn't use recursive aliases anymore, and it seems others are finding wrappers like JSON::Any more useful than recursive aliases (cc @straight-shoota).

I can send a PR to remove this feature from the language.

After that, I can probably implement generic aliases (it's much easier once recursive aliases aren't possible).

If removing recursive aliases gets us generic aliases, i'm all for it.

Anything for better generics :+1: Let's go for it @asterite :100:

I hope recursive alias would come back eventually. I think they are easier to think about them than the indirection of a wrapper struct.

But I understand there is no clean solution right now with recursive aliases and nested values (the now old JSON::Type vs JSON::Any in constructor issue). Since I don't see an easy way to solve that for the time being, I accept dropping recursive aliases.

Yes, actually, maybe we could keep recursive aliases but remove the JSON:Any wrapper at all. You'll have to cast the value to use it, but maybe that's more intuitive than a wrapper.

It will seem like we are going forward and back with this, but given that the concept doesn't exist in other languages, and that Crystal is an experiment, and we are not 1.0 yet, it might be acceptable.

Most strongly-typed languages I know only have a pull parser and a mapping equivalent. I think JSON::Any is fine, but it should be perhaps demoted in the docs to a footnote. And use JSON::Any.from_json instead of JSON.parse. The first example in the JSON docs is for JSON.parse, which I think leads many people astray. I think JSON::Any is an advanced topic, and people's first interaction should be with JSON::Serializable.

I have a love-hate relationship with recursive aliases. They are actually quite elegant for parsing not-well-known JSON (YAML, etc.) structures without introducing intermediate types. So, originally we had:

module JSON
  alias Type = Bool | Int64 | Float64 | String | Array(Type) | Hash(String, Type)

  def parse(string_or_io) : Type
    # ...
  end
end

That way you could parse something and immediately get a union of all possible values. Of course you can't do much with that: if you want to work with it you probably will have to cast it. For example:

value = JSON.parse("...")
array = value.as(Array) # assert it's an array
array.each do |elem|
  # ...
end

Actually, there are a few things you can do generically with the value: compare it using == and ===:

value = JSON.parse("...")
case value
when "string" # works because of String#===(String)
end

With JSON::Any this isn't as nice because we have to reopen add String#===(JSON::Any) for that to work (remember that case exp; when value is translated to value === exp).

Now, let's say we remove JSON::Any and go back to using recursive aliases... well, actually, we can't get rid of JSON::Any, because if you want to do something like:

class MyObject
  property number : Int32
  property any : JSON::Any
end

with an any that could have any of the JSON types, we need to define any with a type that has a class method from_json: and recursive aliases can't have methods, because they are just a union type (kind of). Or, well, there's simply no syntax to reopen an alias and define methods on them (it would be a bit weird, because we could only define class methods).

So... JSON::Any is needed (at least for JSON, but I'd expect the same need for other libraries that need recursive types) , and now we have two ways to represent almost the same thing: the (recursive type) union of all JSON types, and the JSON::Any wrapper.

Thinking a bit more about the == and === case above, it might be a bit "dangerous" to compare a JSON::Any to a string. What if it's not a String at all? The comparison will silently fail, and if we don't handle it, well, it might lead to a bug. If we remove == and === from JSON::Any, the user will probably have to get the value and cast it to a String first, using, say, as_s (or, well, it could get the raw value and have the same problem, so maybe it's still "dangerous").

Another nice thing about recursive aliases is that we can do:

value = JSON.parse("...")
value.as(Array) << 1 # works

With JSON::Any we have to do:

value = JSON.parse("...")
value.as_a << JSON::Any(1) # Ugh

It's a bit tedious, but maybe this isn't needed very often. Or maybe it's good, because it reminds us we are dealing with a JSON::Any and not any kind of array?

So, I'm not sure which way is winning yet... Is there anything nice about JSON::Any? Yes! We can define [] on it, and internally check if it's an Array or Hash, so we can do this:

value = JSON.parse("...")

# We are hitting an API and we know the shape of the response, so...
value["data"]["users"][0].as_h

With just a recursive alias we'd have to do:

value = JSON.parse("...")

# We are hitting an API and we know the shape of the response, so...
value.as(Hash)["data"].as(Hash)["users"].as(Array)[0].as(Hash)

Super verbose! And I think this case is much more frequently needed then modifying the parsed JSON, because we usually parse JSON from an API or as read-only access.

...

Actually, thinking it a bit more, JSON::Any might not be needed: we can have JSON::Any.new(pull_parser) to parse and return a value whose type is the recursive alias type, not JSON::Any. But you'd have to do it like:

class MyObject
  @[JSON::Field(converter: JSON::Any)]
  property x : JSON::Type
end

Maybe that's acceptable. Then we won't have the wrapper any more, just the recursive alias... and we'd try to fix the bug surrounding it. But of course you'd have to use casting, a lot, to use it.

So... I have a local stash with changes that remove recursive aliases from the language, but I'm not sure it's totally good to remove them. We should evaluate the pros and cons, and if we decide they stay, we should remove the JSON::Any wrapper too.

I think you perfectly explained the love-hate relationship. I share it.

I prefer to use PullParser manually or automated (Serialization).

But being capable to do value["data"]["users"][0] is invaluable when overwhelmed with potential schemas (see selenium webdriver) that make creating structs or classes a very tedious task (and could slow down compilation, introducing so many types).

But then, .as_s or .as_h are meh, at best. Having [] also calls for .each, which calls for including Enumerable, adding == and === for supported types... and ugh.

I have a project at hand with aliases. I try to avoid recursive aliases, it feels kinda limiting, but it's not that bad. I think I could live without recursive aliases, thought they're nice.

About your last snippet, we could just have JSON.new(pull) : JSON::Type and totally get rid of Any... but then we lose Any[]... I guess I could live without it, too.

class method from_json: and recursive aliases can't have methods, because they are just a union type (kind of). Or, well, there's simply no syntax to reopen an alias and define methods on them (it would be a bit weird, because we could only define class methods).

But we already have Union.from_json: https://github.com/crystal-lang/crystal/blob/9ba6ba93e3cd1d53a5df0131c8aa4d272a56a6c3/src/json/from_json.cr#L192

Wouldn't JSON::Type.from_json work "out of the box" if we fixed class methods on recursive aliases (but slow because of raise, maybe we could speed it up by trying Hash or Array first depending on token type)?

But that's missing the point: the point of JSON::Any isn't to implement from_json, it's to avoid the .as ugliness by overloading []. You can't do that without a wrapper. But then again, Go makes people just deal with the cast pain and perhaps thats a good thing. Arbitrary JSON without knowing the structure is a pain in typed languages and maybe we shouldn't hide it.


As a footnote, == and === and <=> etc. are commutative operators, yet the language doesn't treat them as such. It forces you to manually implement the commutativity by reopening classes. Perhaps thats a problem to think of a better solution to. Perhaps the compiler could automatically commute the arguments to get the most precise overload. Then again, complexity, magic, etc etc etc...

Yeah, implementing commutative operators would be nice. But at least === is not commutative.

<=> is about as far from commutative as you get. It literally is the case that x <=> y is -(y <=> x)

@yxhuvud yeah sorry, my bad.

Maybe we could just have macros do this, unfortunately it's not possible for a macro to reopen a class unless it's invoked at the top-level.

Regarding https://github.com/crystal-lang/crystal/issues/5155#issuecomment-403330684

What if it's not a String at all? The comparison will silently fail, and if we don't handle it, well, it might lead to a bug

This is the same issue why == and === can be applied in all objects and why some containers methods do not restrict arguments to the type parameter: In order to get a less pedantic and easier to be used API.

I already say I like recursive aliases, I found them intuitive to express some structures. I only acknowledge to remove them in the light of generic aliases hoping that recursive aliases will come back later.

Regarding the JSON::Any, maybe another way to go is to use recursive alias as the structure but let a macro do the navigation. It would be a bit more verbose that value["foo"][0]["bar"], but a JSON.get(value, "foo", 0, "bar") and a JSON.get? macros could allow us to keep using the values defined as recursive alias directly. At the end of the day, JSON::Any is for traversing easily the nested values.

We could possibly implement (Array | Hash).dig from ruby to avoid the casts. Then remove JSON::Any and be done with it.

Actually, I'm thinking that that's probably by far the most sane idea.

But there's no dig for int, bool or string.

Of course there isn't, that wouldn't make sense and isn't needed.

How will #.dig help me recursively parse an entire json that I have no knowledge of the schema beforehand?

.dig is just turning value.as(Hash)["data"].as(Hash)["users"].as(Array)[0].as(Hash) into value.dig("data", "users", 0).as(Hash). The only reason we introduced JSON::Any was to turn the casting mess into value["data"]["users"][0].as_h, so if we've solved that problem and solved recursive aliases, then there's no need for JSON::Any.

How would you call value.dig if it's not defined on every type in the union?

@straight-shoota exactly!

@RX14 the json any recursive alias also has bool, int and string in it, and there would be no dig for those types.

I guess you could start from the hash (as it's the most common structure returned anyway) and use #dig from there (value.as(Hash).dig("data", "users", 0))?

Probably.

I'm not sure what we are trying to solve with dig...

@asterite the same way it's implemented right now in the already-existing PR: responds_to?

And what are we trying to solve? removing the casts with recursive JSON::Type. I just said it in my last comment.

@asterite I agree, I don't see how the following:

value.as(Hash).dig("data", "users).as(Array)

would trump:

value["data"]["users"].as_a

@RX14 the problem isn't sub elements, but the very first one: you must cast it to Hash/Array to be able to call dig on it, since not every type in the JSON::Type union type implements dig.

@ysbaddaden because the first one is possible with JSON::Type and the latter syntax needs a wrapper struct JSON::Any.

Forgive me for thinking that we were discussing how to remove JSON::Any after everyone else was dicussing how to remove JSON::Any.

The thread is about removing recursive aliases, and to weight alternatives to keep it or not in the light of the principal use case: deserialization. I believe an alternative shouldn't just work, but be as nice as, or better than Any. So far, proposals seem to fail that.

Sadly, because I'm not fond of Any myself, yet I struggle to see a better alternative :disappointed:

The problem with dig as an alternative to JSON::Any (which I didn't catch on to you explaining, sorry about that) is that JSON.parse returns JSON::Type instead of Hash(String, JSON::Type) | Array(JSON::Type). Changing JSON.parse to fail when parsing a string, bool, or nil at the top level wouldn't be a huge deal. Several existing libraries do this already, and not having an array of object at the top level is rare. We could provide another method which returns JSON::Type.

For me, dig is the only concept I can think of which comes close to replacing Any#[]. So I'm trying to think up how it could work to come up with a solution and syntax which can be pitted against Any, if we all agree that we don't like Any.

@RX14 The RFCs actually explicitly allow the other types at the top, nowadays. See https://stackoverflow.com/questions/17220924/does-the-top-level-value-in-json-have-to-be-an-array-or-object

This is something that has changed over time - the first RFC on it only allowed Objects and Arrays.

I actually like the concept of JSON::Any. It wraps a value with a dynamic data type and has methods for accessing it.
While you can retrieve nested values using a series of [] calls, it would actually make sense to define a #dig method in order to avoid raising when the data is incompatible.

value["data"]["users"] # raises if value or `value["data"]` don't wrap a Hash
value.dig("data", "users") # returns `nil` if value or `value["data"]` don't wrap a Hash
# without `dig` this is ugly:
value.as_h? && (data = value["data"]?) && data.as_h? && data["users"]?

After recursive aliases where removed with JSON::Any, I also refactored Crinja::Value with the same concept. The diff (mostly in https://github.com/straight-shoota/crinja/commit/57b2fcd43dbd406fb2e790141d8fd539bc4a085c) gets rid of ugly .as(Crinja::Type) casts and in my opinion improved the code dealing with such values in many places. This wrapper type also defines a lot of helper methods for abstracting specifics of the underlying data values. Many of them are particularly useful from within the Crinja runtime, but also for external use. JSON::Any already has similar methods #[], #size and it wouldn't hurt to expand on that. #dig, #each, #first, #last could make it easier to work with JSON::Any.

I've done some experiments about finding an alternative to JSON::Any.
A draft implementation is in this gist - tell me what you think :smile:

Edit: the main advantage is to be able to get/modify/delete values of keys dynamically - the path is set with an Array
You can for example do

JSON.parse File.read("./some.json")
puts "Enter the path, with keys separated with dots, to retrieve its value"
if path = gets
  puts json[path.split('.')].to_type
end

I'm starting to think that all this wrappers around YAML and JSON's PullParser will better live outside the stdlib in their shards.
The issues are:

  • opinionated: can be lots of different implementations - may I use Any, Serializable, PullParser or others?
  • safety: can't safely pin a version and use it with the X Crystal version
  • convenience: possible refactoring with each Crystal release (because of implementation's changes)

@j8r Your implementation uses a completely different approach of parsing JSON structures on demand. In contrast, JSON.parse currently parses an entire JSON document and it can be accessed in memory. Both have it's pros and cons, but for a general implementation, it doesn't make much sense to re-parse the JSON document everytime you try to access a property.

@straight-shoota But in the general case either with this implementation ot any other, it's better to store a property if we want to re-access to it, because more efficient.
This implementation is about this - parse only what you need, and do what you have to do :smile:

I'm with @straight-shoota on this, in our code base the change from JSON::Type to JSON::Any improved the code, the readability and the way we loop around it and parse.

Let's not take a step back, and instead improve upon JSON::Any

What do you think of this gist?
This merges duplicate code of JSON::Any and YAML::Any by using a macro, and add the ability to search keys:

js = JSON.parse %({"a": {"b": "c"}, "e": [{"o": "p"}, 12, 2]})

p js[["a", "r"]]?    #=> nil
puts js[["e", 0, "o"]] = JSON::Any.new "test"
puts js        #=> {"a" => {"b" => "c"}, "e" => [{"o" => "test"}, 12_i64, 2_i64]}
js.delete ["e",0, "o"]
puts js          #=> {"a" => {"b" => "c"}, "e" => [{}, 12_i64, 2_i64]}

If the community like this, I will continue for []= and delete, and then creating a PR.
I really need it. If it can't be in the stdlib it will be in a shard.
Work also for yaml, and #[Enumerable], #[Enumerable]= and #delete(Enumerable) are implemented.

Of course this lead to less efficient code, but we have no ther choice when dealing with dynamic documents - jq is a perfect example.

Recursive aliases continue to introduce bugs into Crystal code (latest example: #7567). They're buggy and should better not be used at all. We've gotten rid of using them in the stdlib already.

I'd suggest to remove them from the language. It's either that or fixing them.
If we want to try to implement JSON & co deserialization based on recursive aliases (and I don't see we're getting there), it would require fixing recursive aliases first anyway. So we don't really lose anything if we remove them now and maybe decide adding a complete implementation at some point in the future. But the immediate benefits would be a simplification of the language and compiler and keeping people from running into bugs caused by recursive aliases.

I'm all for this and I could do it in under one day, after all it's just removing code. But I'd like @bcardiff and @RX14 to agree about this too.

I agree strongly with @Papierkorb's comment here and @shelvacu's comment here in favor of keeping recursive aliases and fixing their implementation. If that's not feasible, I'm with @bcardiff here in that, if they _are_ removed, I hope we can bring them back at some point because of how much easier they make thinking about my program's types.

The struct indirection certainly has a lot of value, which has been shown in this thread, but I don't feel like the recursive alias is just a different way to do the same thing, but more that they are both great for different use cases. The struct is awkward to use in some places where a recursive type does well โ€” this has also been shown in this thread. One thing I've noticed in building apps with Crystal is that if something feels wonky, there's probably a better abstraction you could be using.

@jgaskins Yes, the only reason to remove recursive aliases is because they're broken and there is no perspective to fix them anytime soon. This is not meant as a decision against having recursive aliases in the language.

The @asterite method using Struct doesn't work in generics. I can only get it to work in macros. This is as good as I could do this evening.

# Implementation of a pseudo-generic that contains itself without use of
# recursively-defined aliases, which are problematical and/or broken
# in the compiler.
macro recursive_hash(name, keytype, valuetype)
  struct Type_%a
    property value : {{valuetype.id}}
    def initialize(@value)
    end
  end

  class {{name.id}}
    @h = Hash({{keytype.id}}, Type_%a).new

    def []=(key, value)
      @h[key] = Type_%a.new(value)
    end
  end
end

recursive_hash(MyHash, Symbol, String|MyHash|Array(MyHash))

h = MyHash.new
h[:itself] = h
h[:array] = [MyHash.new]
p h.inspect

@BrucePerens What's your use case?

The initial implementation of a shard for generics that can contain themselves is at https://github.com/BrucePerens/recursive_generic .
It's not ready for your use yet.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

asterite picture asterite  ยท  3Comments

Papierkorb picture Papierkorb  ยท  3Comments

lbguilherme picture lbguilherme  ยท  3Comments

jhass picture jhass  ยท  3Comments

ArthurZ picture ArthurZ  ยท  3Comments