Chapel: Build a library of iterator tools

Created on 21 Mar 2019 · 41Comments · Source: chapel-lang/chapel

Libraries / Modules Feature Request

Source

ben-albrecht

Most helpful comment

@akshansh2000 You may want to make this into its own Pull Request. (also side note: If you add write the code block as ```chpl you get syntax highlighting)

iter repeat (arg, times = 0) {
    if times == 0 then
        for count in 1.. do yield arg;
    else
        for count in 1..#times do yield arg;
}

You also will want a parallel iterator. I'll do this one for you so you get the gist of how to do so in Chapel...

iter repeat (arg, times = 0, param tag : iterKind) where tag == iterKind.standalone {
    if times == 0 then
        forall count in 1.. do yield arg;
    else
        forall count in 1..#times do yield arg;
}

You'll also definitely want a leader-follower version so you can do something like this...

forall (ix, a) in zip(repeat(fixedIdx), arr) {
   a = ix;
}

In the above example, you zip over an infinite stream repeat(fixedIdx) and some finite container.

LouisJenkinsCS on 23 Mar 2019

👍2

All 41 comments

This is up for grabs as a GSoC 2019 project. Some suggestions to GSoC applicants:

Determining the subset of iterators to implement and which ones are parellizable/distributable is left as an exercise for the applicants, and can be included as part of the application project plan / timeline.
Implementing one of the iterator functions, with tests, documentation, and
performance testing in advance would be helpful in several ways:
- It will give you an idea of how long this process takes and should give
  
  you a more accurate estimate for your timeline.
- It will also show that are ramped up on interacting with Chapel's
  
  testing system (start_test) and chpldoc.
I would expect that handling some parallel and/or distributed iterators would be part of the GSoC timeline.
- This may limit how many iterators get implemented in the summer, and that's fine. It's OK to leave some (or many) iterators out of the timeline or push them into stretch goals section.

ben-albrecht on 21 Mar 2019

Though not explicitly listed, this may be considered part of https://github.com/chapel-lang/chapel/issues/6329

ben-albrecht on 21 Mar 2019

I want to take this up as a GSoC 2019 project. I was just wondering if beginning with implementing the repeat and product itertools of python would be a good way to start. As far as I know, these tools are not available in Chapel as of now. Please let me know if I missed something, and whether to go ahead with this.

For example, the basic implementation of product for integer ranges could be given as:

iter product(x: range, y: range) {
    for i in x do for j in y do yield (i, j);
}

writeln(product(1..5, 3..5));

// (1, 3) (1, 4) (1, 5) (2, 3) (2, 4) (2, 5) (3, 3) (3, 4) (3, 5) (4, 3) (4, 4) (4, 5) (5, 3) (5, 4) (5, 5)

So what I need to do is write the tests and documentation, work on determining whether it can be made parallelizable or not, and make this process fast by reducing communication overheads, right?

Please let me know if I'm missing something.
Thank You!

akshansh2000 on 22 Mar 2019

Hi @akshansh2000, repeat and product seem like good starts. I expect repeat to be relatively simple, but product will be a little more challenging.

Regarding your product implementation, there are a few design goals to consider:

We'd want to support variadic arguments
We'd want our arguments to be generic so that we don't have to write an overload for every iterable type
We'd want to support a repeat argument similar to itertools.product
Note that we won't be able to support heterogenous types for our arguments, due to being unable to iterate over or index into a tuple of heterogenous types in chapel.
- See https://github.com/chapel-lang/chapel/issues/9607 for why this is, if curious.

Beyond that, documentation, evaluating performance, and determining if the implementation should/can be parallelized would be your next steps (in any order you prefer).

ben-albrecht on 22 Mar 2019

👍1

Thank you, @ben-albrecht. I'm working on it. I'll keep you updated.

akshansh2000 on 23 Mar 2019

The basic implementation for the repeat tool is given:

iter repeat (arg, times = 0) {
    if times == 0 then
        for count in 1.. do yield arg;
    else
        for count in 1..#times do yield arg;
}

// For integer/real/complex etc..

writeln(repeat(10, 15));

// For arrays/strings

writeln(repeat([1, 4, 6, 7], 4));
writeln(repeat("Testing", 5));

// For dictionaries

var dictDomain: domain(string);
var dict: [dictDomain] int;

dict["one"] = 1;
dict["two"] = 2;
dict["three"] = 3;

writeln(repeat((dictDomain, dict), 3));

// For infinite repetition

write(repeat("infinite"));

Output:

10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
1 4 6 7 1 4 6 7 1 4 6 7 1 4 6 7
Testing Testing Testing Testing Testing
({two, three, one}, 2 3 1) ({two, three, one}, 2 3 1) ({two, three, one}, 2 3 1)
infinite infinite infinite infinite .....

Please let me know what all I need to change in this. This sure could be parallelized (only in the case of a bounded range), and I'll work on that next. The documentation and performace evaluation is further in the queue.

akshansh2000 on 23 Mar 2019

@akshansh2000 You may want to make this into its own Pull Request. (also side note: If you add write the code block as ```chpl you get syntax highlighting)

iter repeat (arg, times = 0) {
    if times == 0 then
        for count in 1.. do yield arg;
    else
        for count in 1..#times do yield arg;
}

You also will want a parallel iterator. I'll do this one for you so you get the gist of how to do so in Chapel...

iter repeat (arg, times = 0, param tag : iterKind) where tag == iterKind.standalone {
    if times == 0 then
        forall count in 1.. do yield arg;
    else
        forall count in 1..#times do yield arg;
}

You'll also definitely want a leader-follower version so you can do something like this...

forall (ix, a) in zip(repeat(fixedIdx), arr) {
   a = ix;
}

In the above example, you zip over an infinite stream repeat(fixedIdx) and some finite container.

LouisJenkinsCS on 23 Mar 2019

👍2

Thanks for the clarification on the parallel iterator, @LouisJenkinsCS! I'll work on the leader-follower part and share my progress.

(Thanks for the syntax highlighting part as well hahaha)

Also, as for the PR, I'm supposed to make a Mason package and make a PR to the mason-registry repository, right?

akshansh2000 on 23 Mar 2019

Hm... I'm not sure, honestly. Previously students just submit a PR to the repository and it gets stuck somewhere (likely labeled as some kind of prototype/test), but I suppose it makes sense if @ben-albrecht would want you to implement it as a Mason package.

LouisJenkinsCS on 23 Mar 2019

Alright, I'll wait for @ben-albrecht sir to reply, and work on my code until then.

Thank you!

akshansh2000 on 23 Mar 2019

Mason package vs package module is still an open question at this point. Though, I'm leaning towards a package module right now so that we can utilize the nightly performance testing. Once mason packages have this functionality, we could migrate this to a mason package, like we plan to do for other existing package modules (https://github.com/chapel-lang/chapel/issues/10713).

ben-albrecht on 24 Mar 2019

👍1

The repeat iterator is almost complete, I believe.
Please take a look at my implementation and let me know if I need to change something.

Also, I'm a bit confused here. How do I implement infinite repetition for a parallel loop?
In the case of a forall loop, it needs to know to the total number of tasks to divide the repeat task into different tasks, while in the case of a coforall loop, the program terminates at runtime due to excessive memory usage. Please help me out with this, I'm a bit lost.

Here's the code; I've commented the non working code below.

// Serial Iterator

iter repeat (arg, times = 0) {
    if times == 0 then
        for count in 1.. do yield arg;
    else
        for count in 1..#times do yield arg;
}

// Standalone Parallel Iterator

iter repeat (param tag : iterKind, arg, times = 0) where tag == iterKind.standalone {
    if times == 0 then
        coforall count in 1.. do yield arg; // parallel infinite loop creates problem
    else
        coforall count in 1..#times do yield arg;
}

// Procedure to compute chunks which are to be iterated through

proc computeChunk(r : range, myChunk, numChunks) where r.stridable == false {
    const numElems = r.length;
    const elemsPerChunk = numElems / numChunks;
    const mylow = r.low + (elemsPerChunk * myChunk);

    if (myChunk != numChunks - 1) then
        return mylow..#elemsPerChunk;
    else
        return mylow..r.high;
}

const numTasks = here.maxTaskPar; // max number of tasks supported by locale

// Leader

iter repeat (param tag : iterKind, arg, times = 0) where tag == iterKind.leader {
    if times == 0 then
        coforall task_id in 0.. do yield(0..1,); // parallel infinite loop creates problem
    else
        coforall task_id in 0..#numTasks {
            const working_iters = computeChunk(0..#times, task_id, numTasks);
            yield(working_iters,);
        }
}

// Follower

iter repeat (param tag : iterKind, arg, times = 0, followThis)
        where tag == iterKind.follower && followThis.size == 1 {
    const working_iters = followThis(1);

    for idx in working_iters do yield arg;
}

for element in repeat(1, 10) do write(element, ' '); // works - 1 1 1 1 1 1 1 1 1 1
writeln();

forall element in repeat('abc', 4) do write(element, ' '); // works - abc abc abc abc
writeln();

forall element in repeat(123) do write(element, ' '); // doesn't work, parallel infinite loop
writeln();

var arr: [1..5] int;

forall (idx, element) in zip(repeat(3, 5), arr) do element = idx; // works
writeln(arr); // - 3 3 3 3 3

forall (idx, element) in zip(repeat(2), arr) do element = idx; // doesn't work, parallel infinite loop
writeln(arr);

Thank you!

akshansh2000 on 25 Mar 2019

Also, I'm a bit confused here. How do I implement infinite repetition for a parallel loop?

This is not supported today in Chapel. As a result, we will not be able to pursue parallel implementations of any infinite iterators for this project.

ben-albrecht on 25 Mar 2019

👍1

Alright, so should I put the rest of the code into a module and create a PR?

akshansh2000 on 25 Mar 2019

Oh yeah, I forgot that you can't break from a forall loop. I'm wondering if its wise to return a 'killswitch' which the leader and follower will check to see if it should continue.

LouisJenkinsCS on 25 Mar 2019

Note that if I make the array the 'leader', it works fine. It has to do with the leader not knowing when to stop, which is definitely a problem. TIO

LouisJenkinsCS on 25 Mar 2019

akshansh2000 on 25 Mar 2019

LouisJenkinsCS on 25 Mar 2019

👍1

ben-albrecht on 25 Mar 2019

👍1

akshansh2000 on 12 Apr 2019

ben-albrecht on 13 Apr 2019

👍1

akshansh2000 on 20 Apr 2019

LouisJenkinsCS on 21 Apr 2019

akshansh2000 on 21 Apr 2019

LouisJenkinsCS on 21 Apr 2019

akshansh2000 on 21 Apr 2019

LouisJenkinsCS on 21 Apr 2019

akshansh2000 on 21 Apr 2019

ben-albrecht on 22 Apr 2019

👍1

akshansh2000 on 27 Apr 2019

bradcray on 29 Apr 2019

👍1

akshansh2000 on 29 Apr 2019

bradcray on 30 Apr 2019

👍1

bradcray on 2 May 2019

akshansh2000 on 2 May 2019

bradcray on 2 May 2019

👍1

akshansh2000 on 2 May 2019

bradcray on 3 May 2019

👍1

akshansh2000 on 16 May 2019

ben-albrecht on 21 Jan 2020

Was this page helpful?

0 / 5 - 0 ratings

Related issues

Should enums support `.indices` and indexing?

bradcray · 3Comments

How do you add a package to Mason that requires changes to the Chapel run-time/compiler?

LouisJenkinsCS · 3Comments

Should "new initializers" support field initialization prior to super.init()?

bradcray · 3Comments

CG: when to resolve an interface declaration?

vasslitvinov · 3Comments

Reading previously serialized domains fails

astatide · 3Comments