Chapel: parallel block scan results are incorrect

Created on 28 Jun 2019  ·  3 Comments  ·  Source: chapel-lang/chapel

Summary of Problem

Bug. High priority.

Two issues:

  1. Parallel block-distributed scan results are incorrect for both min scan and max scan, while serial scan results are correct. The issue reproduces only with -senableParScan (or with a sufficiently recent master, where that config var is on by default).

  2. Results are also wrong, though wrong in a different way, for element type real. For real, it does not look as though the parallel scan had any effect.

Steps to Reproduce

Source Code:

use BlockDist;
var D = {0..#10};
var DD: domain(1) dmapped Block(boundingBox=D) = D;
var A1: [D] int = [5, 4, 3, 2, 1, 0, 1, 2, 3, 4];
var A2: [DD] int = A1;

{
  var expected = [5, 4, 3, 2, 1, 0, 0, 0, 0, 0];
  writeln("Output (A1):  ", min scan A1);
  writeln("Output (A2):  ", min scan A2);
  writeln("Expect:       ", expected);
  writeln();
}
{
  var expected = [5, 5, 5, 5, 5, 5, 5, 5, 5, 5];
  writeln("Output (A1):  ", max scan A1);
  writeln("Output (A2):  ", max scan A2);
  writeln("Expect:       ", expected);
  writeln();
}

Compile command:
chpl -senableParScan --fast
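
Rather than comparing the printed arrays by eye, the comparison can be automated. The following is only a minimal sketch: scansAgree is a hypothetical helper name, not part of any Chapel module, and the second array here is a plain local copy so that the snippet runs without BlockDist. Passing the reproducer's A1 and A2 instead is what would exercise the block-distributed path.

```chapel
// scansAgree is a hypothetical helper (not part of any Chapel module).
// It returns true when min scan and max scan produce the same results
// on both arrays, element by element.
proc scansAgree(X: [] ?t, Y: [] t): bool {
  const minOk = && reduce ((min scan X) == (min scan Y));
  const maxOk = && reduce ((max scan X) == (max scan Y));
  return minOk && maxOk;
}

const D = {0..#10};
var A1: [D] int = [5, 4, 3, 2, 1, 0, 1, 2, 3, 4];
var B1: [D] int = A1;  // local stand-in; the reproducer's A2 would go here

writeln("scans agree: ", scansAgree(A1, B1));
```

Because the helper is generic over the element type, the same call should also cover the real variant of the arrays.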

Configuration Information

chpl version 1.20.0 pre-release circa June 20.

Libraries / Modules, gating issue, Bug

All 3 comments

Here's the output I'm seeing for the example above:

Output (A1):  5 4 3 2 1 0 0 0 0 0
Output (A2):  -2 -2 -2 -9223372036854775806 -9223372036854775806 -9223372036854775808 -9223372036854775808 -9223372036854775808 3 3
Expect:       5 4 3 2 1 0 0 0 0 0

Output (A1):  5 5 5 5 5 5 5 5 5 5
Output (A2):  5 5 5 2 2 0 1 2 3 4
Expect:       5 5 5 5 5 5 5 5 5 5

and here's the real counterpart:

use BlockDist;
var D = {0..#10};
var DD: domain(1) dmapped Block(boundingBox=D) = D;
var A1: [D] real = [5.0, 4.0, 3.0, 2.0, 1.0, 0.0, 1.0, 2.0, 3.0, 4.0];
var A2: [DD] real = A1;

{
  var expected = [5, 4, 3, 2, 1, 0, 0, 0, 0, 0];
  writeln("Output (A1):  ", min scan A1);
  writeln("Output (A2):  ", min scan A2);
  writeln("Expect:       ", expected);
  writeln();
}
{
  var expected = [5, 5, 5, 5, 5, 5, 5, 5, 5, 5];
  writeln("Output (A1):  ", max scan A1);
  writeln("Output (A2):  ", max scan A2);
  writeln("Expect:       ", expected);
  writeln();
}

Output:
Output (A1):  5.0 4.0 3.0 2.0 1.0 0.0 0.0 0.0 0.0 0.0
Output (A2):  5.0 4.0 3.0 2.0 1.0 0.0 0.0 0.0 3.0 3.0
Expect:       5 4 3 2 1 0 0 0 0 0

Output (A1):  5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0
Output (A2):  5.0 5.0 5.0 2.0 2.0 0.0 1.0 2.0 3.0 4.0
Expect:       5 5 5 5 5 5 5 5 5 5

I think I have a lead on the problem:

https://github.com/chapel-lang/chapel/blob/2471cdc24e4de738c6352556139927bf4296663e/modules/dists/BlockDist.chpl#L1605-L1607

I noticed that the results for + scan seemed to be correct, and wondered if we were mistakenly using '+' or '+=' manually somewhere when we should be using the provided operator's methods to accumulate things. If I change that line from s += myadjust to op.accumulateOntoState(s, myadjust) then I think I'm seeing correct results. I need to do more testing, but wanted to let you know I'm investigating this issue.
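
To illustrate why a hard-coded '+=' would leave + scan correct while breaking min scan and max scan, here is a simplified, serial model of a chunked scan. It is a sketch under stated assumptions and not the BlockDist implementation: chunkedMinScan, chunkSize, and useOperator are hypothetical names, and the chunks are processed one after another rather than in parallel. Each chunk is scanned locally, and the result of all earlier chunks is then folded into it, either with min or with '+='.

```chapel
// Simplified serial model of a chunked min scan. This is NOT the BlockDist
// code; chunkedMinScan, chunkSize and useOperator are hypothetical names.
config const chunkSize = 4;

proc chunkedMinScan(A: [?Dom] int, useOperator: bool) {
  var R: [Dom] int = A;
  var carry = 0;          // min over all earlier chunks, once haveCarry is set
  var haveCarry = false;

  for lo in Dom.low..Dom.high by chunkSize {
    const hi = min(lo + chunkSize - 1, Dom.high);

    // Step 1: inclusive min scan within this chunk only.
    for i in lo+1..hi do
      R[i] = min(R[i-1], R[i]);
    const chunkMin = R[hi];

    // Step 2: fold the result of the earlier chunks into this chunk.
    if haveCarry {
      for i in lo..hi {
        if useOperator then
          R[i] = min(carry, R[i]);  // combine with the scan's own operator
        else
          R[i] += carry;            // hard-coded '+': only right for + scan
      }
    }

    carry = if haveCarry then min(carry, chunkMin) else chunkMin;
    haveCarry = true;
  }
  return R;
}

var A: [0..9] int = [5, 4, 3, 2, 1, 0, 1, 2, 3, 4];
writeln("serial:          ", min scan A);
writeln("model with min:  ", chunkedMinScan(A, true));
writeln("model with '+=': ", chunkedMinScan(A, false));
```

In this model only the variant that combines with min agrees with the serial scan. The '+=' variant diverges as soon as a second chunk is adjusted, which is consistent with the observation that + scan alone looked correct.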

Sheesh, I'm really embarrassed about this one... Thanks very much for handling it in my absence @ben-albrecht and @benharsh!
