Chapel: Using module in a function doesn't find symbols

Created on 25 Jun 2020  路  25Comments  路  Source: chapel-lang/chapel

The test test/library/packages/HDF5/parallelReadMultitype.chpl fails if the use BlockDist at the top of the HDF5 module is commented out. That test calls HDF5.readAllHDF5Files, which calls HDF5.readAllNamedHDF5Files. HDF5.readAllNamedHDF5Files uses BlockDist and then declares a Block distributed array.

    use BlockDist;

    const Space = filenames.domain;
    const BlockSpace = Space dmapped Block(Space, locs,
                                           dataParTasksPerLocale=1);
    var files: [BlockSpace] ArrayWrapper(eltType, rank);

The test gives the error:

parallelReadMultitype.chpl:10: In function 'main':
parallelReadMultitype.chpl:19: error: unresolved call 'unmanaged BlockDom(1,int(64),false,unmanaged DefaultDist).dsiBuildArray(type ArrayWrapper(int(64),2), 0)'
$CHPL_HOME/modules/internal/DefaultRectangular.chpl:683: note: this candidate did not match: DefaultRectangularDom.dsiBuildArray(type eltType, param initElts: bool)
... other non-matching functions...

BlockDist.dsiBuildArray is not included in the list.

This issue is visible by compiling the test on OSx.

% chpl --version
chpl version 1.23.0 pre-release (7d6dd40782)
Copyright 2020 Hewlett Packard Enterprise Development LP
Copyright 2004-2019 Cray Inc.
(See LICENSE file for more details)
Compiler Bug

All 25 comments

Okay, this is maybe a bug with initCopy's insertion, or modification of the value returned from a function or something along those lines. It looks like the initCopy on the result of readAllHDF5Files into the files value is resolving as instantiated from the user code rather than readAllHDF5Files or further down its call chain, and it calls chpl_coerceMove, which calls buildArray (which is what looks for dsiBuildArray). So the reason the use of BlockDist at the top of the HDF5 module fixes things is because we are reaching it through the use of the HDF5 module on line 11 of the program.

I suspect this issue has been around for a while. I'd be curious on @mppf and @vasslitvinov's thoughts as to how to resolve this

I think this is an approximation of what is going on in user code:

module One {
  class Parent {
    type t;

    proc one(x) {
      two(x); // This call fails through the initCopy
    }
  }

  pragma "init copy fn"
  proc chpl__initCopy(const ref rhs: Parent) {
    rhs.one(2); // This is a simplification of what is going on in chpl_coerceMove from ChapelArray
    pragma "no copy"
    var lhs = new Parent(rhs.t);
    return lhs;
  }

}
module Two {
  public use One;

  proc Parent.two(x) {
    writeln("In Parent.two with value ", x);
  }
}

module Three {
  proc blah() {
    use Two;
    return new unmanaged Parent(int);
  }
}
module User {
  use One, Three;

  proc main() {
    var x = blah();
    writeln(x);
    delete x;
  }
}

The call chain seems less well motivated in this toy example than it obviously is in ChapelArray. I'm not sure what that implies, though

Ah, frak, I meant to have Two contain a subclass of Parent rather than a method on Parent itself. Standby

I think this is a more accurate representation of the situation, based on reexamining the original motivating code.

module One {
  class OuterClass {
    var field;

    proc one(x) {
      field.two(x); // Expects to know enough about field's type to find it's `two` method
    }
  }

  pragma "init copy fn"
  proc chpl__initCopy(const ref rhs: OuterClass) {
    rhs.one(2);
    pragma "no copy"
    var lhs = new OuterClass(rhs.field);
    return lhs;
  }

}
module Two {
  public use One;

  class Stored {

    proc two(x) {
      writeln("In Stored.two with value ", x);
    }
  }
}

module Three {
  proc blah() {
    use Two; // This use is fine when creating the field, but we have no way back to it
    return new unmanaged OuterClass(new owned Stored());
  }
}
module User {
  use One, Three;

  proc main() {
    var x = blah(); // Call to initCopy is inserted on the result of this call
    // But because we aren't in the scope with the use of Two any more, we can't find the
    // method via the point of instantiation either.
    writeln(x);
    delete x;
  }
}

Key differences:

  • Parent is instead a class that stores another class
  • two (the method that is not resolved) is a method on the stored class

This more accurately reflects the situation in ChapelArray, where dsiBuildArray was a method on the dom field rather than on the array itself as I originally assumed.

(Restating in other words so to help myself understand). So here the heart of the matter is var x = blah(); where blah returns a type that isn't visible through normal means. This statement then runs initCopy, which is found, because of the special handling for method-like functions from PR #15003. But, then, when the initCopy is instantiated, its point of instantiation is in User.main which does not have visibility into Two which is where the class Stored and its two method is defined.

I would suppose that there is an adjustment that needs to occur to the point of instantiation for method-like functions (exactly those handled by the method resolution strategy in PR #15003). In particular, we should also adjust the point of instantiation to be the point of instantiation for a method-like function to be the point of instantiation for the type on which it applies. (Here I am assuming that, method-like functions have a straightforward relationship with that type - say, it is the type of the 1st argument).

If we did that, the point of instantiation for initCopy would be the point of instantiation for OuterClass(owned Stored) which is the body of blah. As a result, initCopy would have visibility to see the use Two and two through its point of instantiation.

Lazy questions: How specific is this issue to initCopy()? What's currently standing in the way of our retiring initCopy()?

How specific is this issue to initCopy()?

I think it'd be possible to make a version that doesn't use initCopy at all.

What's currently standing in the way of our retiring initCopy()?

Support for runtime types in initializers.

How specific is this issue to initCopy()?

I think it'd be possible to make a version that doesn't use initCopy at all.

I agree. After looking at the problem for a while, I think it has the most to do with relying on a method of a field when we don't know the field's type at the point where the method is called. With that in mind, maybe what Michael is proposing doesn't go far enough? What if we got the value for the field from another function that contained a use? E.g.:

...
proc blah() {
  var fieldVal = foo();
  return new unmanaged OuterClass(fieldVal);
}
proc foo() {
  use Two;
  return new owned Stored();
}
...

Would it be reasonable to say that interfaces would help in this situation?

What if we got the value for the field from another function that contained a use

I'm having trouble seeing it without a complete example.

Would it be reasonable to say that interfaces would help in this situation?

They sure would, but I think the point-of-instantiation adjustment I proposed in https://github.com/chapel-lang/chapel/issues/15946#issuecomment-656908837 would still be appropriate.

I'm having trouble seeing it without a complete example.

Sorry, I went two steps ahead instead of just one.

Here's code that fails in the same way without an initCopy - the method one on the outer type is called directly:

module One {
  class OuterClass {
    var field;

    proc one(x) {
      field.two(x); // This is the call that fails to resolve
    }
  }

}
module Two {
  public use One;

  class Stored {

    proc two(x) {
      writeln("In Stored.two with value ", x);
    }
  }
}

module Three {
  proc blah() {
    use Two; // If this use is at module scope, this code compiles due to the P.O.I.
    return new unmanaged OuterClass(new owned Stored());
  }
}
module User {
  use One, Three;

  proc main() {
    var x = blah();
    x.one(2);  // Calls the `one` method directly
    delete x;
  }
}

Here's code that also would fail, and I think wouldn't be resolved by having the point of instantiation be where the returned OuterClass instance was created (because we got the type from a different scope):

module One {
  class OuterClass {
    var field;

    proc one(x) {
      field.two(x); // This is the call that fails to compile
    }
  }

}
module Two {
  public use One;

  class Stored {

    proc two(x) {
      writeln("In Stored.two with value ", x);
    }
  }
}

module Three {
  proc blah() {
    use One;
    var fieldVal = foo(); // This is the type instance that is problematic
    return new unmanaged OuterClass(fieldVal); // By putting it in this instance, we extend its scope
  }

  proc foo() {
    use Two; // But with the use only within here, we can't find it again.
    return new owned Stored();
  }
}
module User {
  use One, Three;

  proc main() {
    var x = blah();
    x.one(2); // Calls the `one` method directly
    delete x;
  }
}

Here's code that also would fail, and I think wouldn't be resolved by having the point of instantiation be where the returned OuterClass instance was created (because we got the type from a different scope):

I think I agree- but where I am confused has to do with PR #15003. I think I might be imagining that PR does something it does not do. If a method on a type is defined in the same module as wherever the type is defined, then I think that method should be available to be a called at any point where the type is. Even if there is no path to that module via use or import.

I think that such a rule would allow both of the examples in your previous comment to compile.

I have to confess that I still am not sufficiently caught up on this issue to understand it deeply, but this statement:

If a method on a type is defined in the same module as wherever the type is defined, then I think that method should be available to be a called at any point where the type is. Even if there is no path to that module via use or import.

resonates with me and feels like the type-oriented equivalent of the "I want to call methods on an object that was returned to me whose type I can't 'name' through use or import statements" theme that Lydia's improved a few times in recent months.

I think I might be imagining that PR does something it does not do.

Yeah, #15003 didn't modify the scope of any use or import statements, it just chose to follow private ones that shouldn't be traversed when looking for non-methods.

I suspect #15803 is more along the lines of what could be done, but we did explicitly choose not to follow the strategy that would have solved this when there. From that PR, here's the reason:

Traversing the module's used modules list [would allow] inner scopes with such references
to impact the symbols available throughout the entire module.

Basically, if we blindly expand the visibility of a use or import used within a function, it could cause things to be available that the user was explicitly trying to avoid by making the scope of the use or import more limited. Here's an example:

module A {
   proc foo() {
     use One;      // One defines a type, RecFromOne
     return new RecFromOne();
  }
  proc bar(): int {
    use Two;      // Two provides a function we want to use, funcFromTwo, which returns an int
    // However, it also defines a secondary method on RecFromOne, methodFromTwo.
    return funcFromTwo();
  }
}

If the user calls foo to obtain an instance of RecFromOne, should that instance be able to call methodFromTwo?

use A;
var x = foo();
x.methodFromTwo();  // Should this work?

Should that depend on if bar was called in that same scope?

use A;
var x = foo();
var y: int = bar();
x.methodFromTwo(); // Should this now work because of the previous line?

I would argue no - bar limited the scope of its use of Two, and nothing about its use of Two involves the secondary method. It would be confusing for a function that is never called in the user's scope, or is called in an unrelated circumstance to impact the behavior of the user code.

We could adjust whether it impacts the code based on if the particular type is returned from the function, but that will require adjusting based on something that is known at function resolution which brings up some order of operations questions and seems like kinda a mess to me.

@lydia-duncan - I am not sure I am following everything correctly but isn't this issue about secondary methods defined in non-visible modules solved by the part of the proposal I've emphasized below:

If a method on a type is defined in the same module as wherever the type is defined, then I think that method should be available to be a called at any point where the type is. Even if there is no path to that module via use or import.

?

I don't think that's a limitation that will stand and so I believe it's important to consider the consequences of it not holding on our particular implementation choice. Someone is definitely going to come along and request that this works:

module A {
  proc foo() {
    use DefinesRec;                // Defines rec
    use DefinesRecMethods;  // Defines rec.secondaryMethod
    return new rec();
  }
}
module User {
  use A;
  var x = foo();
  x.secondaryMethod();  // Defined by DefinesRecMethods
}

And distinguishing between those that case and the one from my most recent comment is going to be hairy. It's even more obvious someone will ask if they leave off the use of DefinesRec, since DefinesRecMethods should probably have its own public use of the module. From a user perspective, why would something be available from a further module but not the closer one?

Someone is definitely going to come along and request that this works

Yes, but it is my opinion that it is OK if that case doesn't work, provided that there is a way of writing it that does work.

From a user perspective, why would something be available from a further module but not the closer one?

I think it'd be OK. Point-of-instantiation is pretty weird as it is; we'd just have another part to the rule that helps some cases (but maybe not all cases).

I don't think that's a limitation that will stand and so I believe it's important to consider the consequences of it not holding on our particular implementation choice.

Why not allow all methods?

it could cause things to be available that the user was explicitly trying to avoid by making the scope of the use or import more limited

Maybe this is a pattern that we should decide not to support? Maybe private methods are a better way to handle it?

Maybe private methods are a better way to handle it?

Private methods are a different sort of solution. Think of it this way - you have three people involved: Person 1 (defined the type and maybe the secondary method), Person 2 (defined the function that is returning the type), and Person 3 (defined the code that called the function).

Making the methods private is a Person 1 decision. It's not something Person 2 has control over. Person 2 has control over the visibility of the use and import statements used to obtain the type that is being returned. Because we allow the compiler to follow private uses when looking for methods, this is their only strategy of controlling what is visible as part of their implementation. If we go into these methods, Person 2 has no way of limiting what is seen. While it's perfectly reasonable for Person 3 to want access to this method, they can get it for themselves if they need to.

If we go into these methods, Person 2 has no way of limiting what is seen.

I'm not really convinced that Person 2 should be able to control what methods Person 3 can call.

I'm not really convinced that Person 2 should be able to control what methods Person 3 can call.

The thing is, they're not preventing methods from being capable of being referenced, they're simply not being the reason they are referenceable. Person 3 should actively use the scope to access the method, it makes the code more understandable. It's not Person 2's responsibility.

As mentioned above, to me, what Michael's proposing sounds very equivalent to the capability we already have that says "If I can refer to an object, I should be able to call its methods on it even if the object's type name isn't visible to my code" (for those methods defined in the same module as the object's type, at least). Where here we're talking about calling type methods on types rather than normal methods on objects. Is that correct?

That is, the code I'm imagining we're discussing the legality of is something like:

module M {
  record t {
    proc type foo() {
      ...
    }
  }  // or class t or type t or ...

  proc type t.bar() {
    ...
  }

  proc getType() type {
    return t;
  }
}

module N {
  use M only getType;

  type u = getType();
  u.foo();
  u.bar();
}

where I'm calling foo() and bar() from N even though I can't refer to t directly in that module. Is that right? If so, I think this should be allowed for the same reasons as the similar issues on object values whose types aren't visible.

Where here we're talking about calling type methods on types rather than normal methods on objects. Is that correct?

Nah. I think what's more being discussed is the impact that the general position of "if I have a type, I should be able to call a method on it" should have on the implementation, in regards to situations where we obtained the type in a very indirect manner. Our current strategy of how we get back to where the type is defined relies on use and import statements because those are the ways we have traditionally brought in another scope, but when the scope of the use or import is limited (because it is inside a function rather than at module scope), should that limitation continue to impact our ability to find the type's methods? Should that limitation vary depending on if the method was defined in the same scope as the type or if it was a secondary method in another module? Would the answer to that most recent question be confusing for users?

I also am internally wrestling with the meta-worry that maybe we shouldn't allow types to be returned to scopes where they aren't visible. It was my original reaction to the idea, and it always occurs to me when we encounter an additional situation that wasn't accounted for by that rule. Each subsequent edge case that increases the hoops we have to jump through to enable that support makes me worry we've made the wrong decision here.

I think I may have misunderstood Michael's original proposal. I believe I was interpreting it to mean this:

module One {
  class OuterClass {
    var field;

    proc one(x) {
      field.two(x); // Inner call to resolve (the problematic one)
    }
  }

}
module Two {
  public use One;

  class Stored {

    proc two(x) {
      writeln("In Stored.two with value ", x);
    }
  }
}

module Three {
  proc blah() {
    use Two;
    return new unmanaged OuterClass(new owned Stored()); // Look here because this is where the instance was created 
  }
}
module User {
  use One, Three;

  proc main() {
    var x = blah();
    x.one(2); // Outer call to resolve
    delete x;
  }
}

which seems like it would be complex to track, since we could have gotten the value to use for the field from a variety of places.

But upon rereading, I think what he was actually suggesting was looking at the type's scope when resolving methods on it (since we should theoretically know that we are making a method call), and that seems more doable, as long as we don't only look there (since secondary methods could be visible). I'm kinda surprised we don't do this already - we do jump to the type's scope when we're already in a method during scopeResolve, but we don't seem to jump that way in function resolution and certainly don't react to resolving a method call by doing so.

I think the only open question that would leave in my mind is:
Does this new strategy seem like a replacement for following private uses and imports? Or an addition to it?

Argument in favor of replacement:

  • Looking at the type's scope may be sufficient for the cases that following private uses and imports handled
  • We'd arguably be more honest about what "private" means in that case
  • Less special rules to keep track of when determining if a secondary method can be found

Argument in favor of "in addition":

  • Jumping to the type's scope only helps with primary methods and fields. If someone is relying on private uses and imports being followed for method resolution (which could be, since that change was in the last release), removing that traversal could break their code.
  • Theoretically less likely we'll have another case of a user expecting to find a symbol from a scope that isn't visible in a few months

I wouldn't necessarily hold up implementing the proposal from Michael as a result of this question, as it could happen in a separate PR, but it's something I'd like an answer to.

The replacement/addition question seems to really be about what we support for secondary methods defined in other modules (i.e. not in the module defining the type). From a language design viewpoint, I'd think that a secondary method defined in another module that is private used should not be visible. (In other words that would lead us to "replacement"). However I worry that someone might run into a pattern where defining the secondary method in another module is needed for some reason.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

buddha314 picture buddha314  路  3Comments

Aniket21mathur picture Aniket21mathur  路  3Comments

ben-albrecht picture ben-albrecht  路  3Comments

BryantLam picture BryantLam  路  3Comments

buddha314 picture buddha314  路  3Comments