Chapel: bug: cannot use union and filter on Series in same program

Created on 1 May 2018 · 12Comments · Source: chapel-lang/chapel

Summary of Problem

If a program tries to use both a union and a filtering operation on a Series (i.e. A + B, A[A < 3]), the Chapel compiler incorrectly raises a type error.

Steps to Reproduce

A branch with the reproducer can be found here: https://github.com/chapel-lang/chapel/compare/master...psahabu:series-dispatch-bug

Source Code:

use Dataframes;

var I = new TypedIndex(["A", "B", "C", "D", "E"]);

var oneDigit = new TypedSeries([1, 2, 3, 4, 5], I);
var twoDigit = new TypedSeries([10, 20, 30, 40, 50], I);

writeln("oneDigit:");
writeln(oneDigit);
writeln("\ntwoDigit:");
writeln(twoDigit);

writeln("\noneDigit + twoDigit:");
writeln(oneDigit + twoDigit);

writeln("\noneDigit < 3");
writeln(oneDigit < 3);

Console output:

Dataframes.chpl:446: In function 'f':
Dataframes.chpl:447: error: type mismatch in assignment from int(64) to bool
Dataframes.chpl:447: note: unresolved call had id 1879013

Compile command:
chpl MixedOperateSeries.chpl -M <Dataframes.chpl location>

Execution command:
./MixedOperateSeries

Configuration Information

Output of chpl --version:
chpl version 1.18.0 pre-release (2c2de20adf)
Output of $CHPL_HOME/util/printchplenv --anonymize:

CHPL_TARGET_PLATFORM: darwin
CHPL_TARGET_COMPILER: clang
CHPL_TARGET_ARCH: native
CHPL_LOCALE_MODEL: flat
CHPL_COMM: none
CHPL_TASKS: qthreads
CHPL_LAUNCHER: none
CHPL_TIMERS: generic
CHPL_UNWIND: none
CHPL_MEM: jemalloc
CHPL_ATOMICS: intrinsics
CHPL_GMP: gmp
CHPL_HWLOC: hwloc
CHPL_REGEXP: re2
CHPL_AUX_FILESYS: none

Back-end compiler and version, e.g. gcc --version or clang --version:

Apple LLVM version 9.0.0 (clang-900.0.39.2)
Target: x86_64-apple-darwin16.7.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin

Bug

Source

psahabu

All 12 comments

@benharsh: Sadly the dynamic dispatch did not work as well as we hoped :(

Any ideas on what could be going on here?

psahabu on 1 May 2018

What line in the user code is it failing on?

benharsh on 1 May 2018

I believe it's the oneDigit + twoDigit line.

psahabu on 1 May 2018

Can you provide some kind of callstack showing the various methods called? Does something attempt to add the result of a filter operation?

Regardless I think that the addition step needs to handle invalid operands, because the types are not known at runtime.

benharsh on 1 May 2018

--print-callstack-on-error may be your friend here.

bradcray on 1 May 2018

The failing line is the return here:

  class SeriesAdd : SeriesUnifier {
    proc f(lhs: eltType, rhs: eltType): eltType {
      return lhs + rhs;
    }
  }

The only thing that makes sense to me is that this is being
instantiated for eltType == bool. The lhs + rhs coerces both
args to int and adds them, but can't coerce back.
(writeln(true+true) prints 2.)

Changing the : eltType to : int results in an error that shows
that the SeriesAdd's SeriesUnifier's eltType is bool:

test/library/standard/DataFrames/
DataFrames.chpl:539: error: conflicting return type specified for 'f(_mt: _MT, this: SeriesUnifier(bool), lhs: bool, rhs: bool) [1894978]: bool'
DataFrames.chpl:554: error:   overridden by 'f(_mt: _MT, this: SeriesAdd(bool), lhs: bool, rhs: bool) [1912122]: int(64)'

But why is it instantiating a SeriesAdd(bool)? As suggested by
the summary, removing the oneDigit < 3 or the oneDigit + twoDigit allows it to compile with the : int return type for f().

It's like the oneDigit < 3 somehow gives the compiler the notion
of a SeriesUnifier(bool), but it needs the oneDigit + twoDigit
to look closely enough at it to become upset about the types?

It compiles and runs with both expressions with this change. I don't
know if this is even a workaround:

  class SeriesAdd : SeriesUnifier {
    proc f(lhs: eltType, rhs: eltType): eltType where isNumeric(eltType) {
      return lhs + rhs;
    }
  }

That wouldn't allow adding two TypedSeries where the type is a
record with + defined.

cassella on 4 May 2018

👍1

Changing the test case's oneDigit + twoDigit to
oneDigit.add(twoDigit) allows it to compile, and produce what
looks like the correct result. The definition of+ looks to me like the same thing :

  proc +(lhs: Series, rhs: Series) {
    return lhs.add(rhs);
  }

Changing only the < 3 to .lt_scalar(3) does not affect the error.

cassella on 4 May 2018

👍1

@cassella Thanks for doing some investigation! I suspect that SeriesAdd(bool) is being instantiated when a TypedSeries(bool) exists in the program. The filtering done by oneDigit < 3 returns a TypedSeries(bool), which might be sending the compiler down the dark operator corner.

Gating all the arithmetic operations with isNumeric() seems reasonable enough, though I also want isString() or something similar for +. Thanks again!

psahabu on 5 May 2018

I don't know if it's related, but if I give the untyped Index
class's stub map() method the same type-query constraints as
TypedIndex's, I get a suspicious error:

    proc map(s: TypedSeries(?T), mapper: SeriesMapper(T,?R)): Series {

fortytwo@magrathea:~/src/chapel (master)$ chpl -M test/library/standard/DataFrames issue-9424.chpl 
test/library/standard/DataFrames/DataFrames.chpl
:457: In function 'map':
:459: error: unresolved call 'Index.map(TypedSeries(bool), SeriesMapper(int(64),bool))'
:459: note:   instantiated from map
:499: note:   instantiated from lt_scalar
:37: note: candidates are: Index.map(s: TypedSeries(?T), mapper: SeriesMapper(T, ?R)) [191607]
:141: note:                 TypedIndex.map(s: TypedSeries(?T), mapper: SeriesMapper(T, ?R)) [192524]

which suggests that it's trying to apply a
SeriesMapper(int(64),bool) over a TypedSeries(bool), which
looks backwards. And that with the less constrained Index.map, it
is able to resolve it.

Maybe this just points at the same thing as the PR you just opened,
that the creation of the TypedSeries(bool) in the presence of a
+ operation on TypedSeries(int)s is causing it to resolve
things on TypedSeries(bool).

cassella on 5 May 2018

Given the code as-is in master, does it make sense that it's
attempting to instantiate SeriesAdd(bool), or should
DataFrames.chpl compile without causing that instantiation without
the where clauses?

I have a stripped-down version of DataFrames.chpl that's just over
100 lines and hits the problem. Is that interesting, or is it the
case that the arithmetic operations are just going to need where
clauses, and that's that?

cassella on 5 May 2018

Should the test have compiled with DataFrames.chpl as-was?

I can't tell if it should have worked since there was no attempt to
invoke + on a TypedSeries(bool), or if the language definition
requires it not to compile.

The fixing commit describes the where clauses as a workaround.

I've been able to reduce the test case to 60 lines, primarily by
simplifying the "Series" to be a singleton with no more Indices,
removing all the operations but +, and flattening the Unifier
hierarchy down to one SeriesUnifier which hardcodes addition.

I think that's short enough I'll just include it here. I tried to
leave function and class names the same to make it easier to track
back to the original code, even when the names are now inaccurate.

// Renamed to disambiguate now that it's not explicitly invoked as a method.
proc guni(lhs: TypedSeries(?lhsType), rhs: TypedSeries(?rhsType),
      unifier: SeriesUnifier(lhsType)): TypedSeries(lhsType)
     where lhsType == rhsType {

  defer { delete unifier; }
  return new TypedSeries(unifier.f(lhs.data, rhs.data));
}

class Series {
  proc uni(lhs: TypedSeries, unifier: SeriesUnifier) {
    halt("generic Series cannot be unioned");
    return this;
  }

  proc add(rhs) {
    halt("generic Series cannot be added");
    return this;
  }
}

class TypedSeries : Series {
  type eltType;
  var data: eltType;

  proc init(data: ?T) {
    super.init();
    eltType = T;

    this.data = data;
  }

  proc uni(lhs: TypedSeries(eltType), unifier: SeriesUnifier(eltType)): TypedSeries(eltType) {
    return guni(lhs, this, unifier):TypedSeries(eltType);
  }

  proc add(rhs): Series {
    return rhs.uni(this, new SeriesUnifier(eltType));
  }
}

class SeriesUnifier {
  type eltType;

  proc f(lhs: eltType, rhs: eltType): eltType {
    return lhs + rhs;
  }
}

proc +(lhs: Series, rhs: Series) {
  return lhs.add(rhs);
}


var oneDigit = new TypedSeries(1);
writeln("oneDigit: ", oneDigit);
writeln("oneDigit + oneDigit: ", oneDigit + oneDigit);

var bools = new TypedSeries(true);
writeln("bools: ", bools);

fortytwo@magrathea:~/src/chpl/dataframes (master)$ chpl DataFrames.chpl 
DataFrames.chpl:45: In function 'f':
DataFrames.chpl:46: error: type mismatch in assignment from int(64) to bool
DataFrames.chpl:46: note: unresolved call had id 1362421

Without the oneDigit stanza, or without the bools stanza, it still produces the expected output.

cassella on 18 May 2018

My understanding is that the language definition says this shouldn't compile, because bool cannot be added to another bool. It's described as a workaround because there may be a more elegant solution, but this one works for now.

psahabu on 18 May 2018

Was this page helpful?

0 / 5 - 0 ratings