Problem-specifications: pangram and isogram are extremely similar

Created on 19 Apr 2018  Â·  16Comments  Â·  Source: exercism/problem-specifications

Not clear to me that both pangram and isogram should be here. They are more or less the same exercise as far as I can tell.

Most helpful comment

Conclusion:

  • [ ] Isogram and Pangram can be similar, or different; mostly dependent on the language
  • [ ] Tracks where exercises are very similar can choose to deprecate one of them in the track config.json

Can we close this issue with this conclusion?

All 16 comments

They are completely different.

Pangram is about checking if letters are in a sentence at least once while isogram checks if letters are contained at most once.

Maybe it's the other way round, I'm not sure about the exact definition right now.

I can demonstrate by example that there is at least one difference in both directions:

  • a is an isogram but not a pangram.
  • aabcdefghijklmnopqrstuvwxyz is a pangram but not an isogram

The way of solving the two problems is quite different too, https://github.com/petertseng/exercism-problem-specifications/blob/verify/exercises/pangram/verify.rb#L11-L26 vs https://github.com/petertseng/exercism-problem-specifications/blob/verify/exercises/isogram/verify.rb#L7

If I should change my interpretation and/or thinking to instead understand that they are, in fact, extremely similar, let me know how.

Those are both the same sequence of steps:

  1. Form a map from letters -> count
  2. Iterate over the values of that map

Pangram is true if for all letters, the count >= 1. Isogram is true if for
all letters, the count <= 1.

Flipping the sign of the inequality is technically enough to call the
exercise completely different. At the same time, it's reasonable to say
that the two are extremely similar, and there may be benefit in eliminating
one.

Pangram's rules require case-folding. Isogram's rules involve filtering
spaces and hyphens from the input. In both cases, I believe it fair to call
those rules chaff, not the point of the exercise, and not worthy of
forming exercises of their own.

I don't feel strongly one way or the other about whether we should
eliminate one of these exercises, but I think it would be unfair to dismiss
the issue out of hand.

On Thu, Apr 19, 2018 at 6:58 PM, Norbert Melzer notifications@github.com
wrote:

They are completely different.

Pangram is about checking if letters are in a sentence at least once
while isogram checks if letters are contained at most once.

—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
https://github.com/exercism/problem-specifications/issues/1228#issuecomment-382807972,
or mute the thread
https://github.com/notifications/unsubscribe-auth/AHdeTn4urmvLwqYiXxQTZGA3NZv8V854ks5tqMIugaJpZM4TcJNd
.

I solved the one by basically copy-pasting my solution to the other and then doing slight edits. Honestly, I think that should work pretty much whatever solution strategy you are using. It's a matter of taste, definitely, but they feel pretty redundant to me in terms of what you learn by solving them.

the same sequence of steps

Now I see that this is a true statement! Your next paragraph explains the reason why I didn't see this.

Flipping the sign of the inequality is technically enough to call the exercise completely different

Yes, this caused me to miss the similarity; the sign flip means I chose not to form a map of letter -> count.

And that revealed an interesting difference in early termination. Say we're reading left to right:

  • isogram: can terminate early in the negative but never in the affirmative.

    • aa plus 1 million bytes only requires reading the first 2 to arrive at negative answer

    • a..z requires reading the entire thing to arrive at positive answer. Though I suppose the impact of this is bounded, since a positive answer can never require reading more than 26 letters. I guess since the rules of the exercise allow spaces, I guess a..z followed by a million spaces will require reading the entire string.

  • pangram: can terminate early in the affirmative but probably not the negative

    • a..z plus 1 million bytes only requires reading the first 26 to arrive at affirmative answer

    • a..y repeated 1 million times requires reading the entire thing to make sure there was no z to be able to arrive at negative answer

    • I guess you can terminate immediately in the negative if you see that the input size < the alphabet size

Pangram's rules require case-folding. Isogram's rules involve filtering spaces and hyphens from the input. In both cases, I believe it fair to call those rules chaff, not the point of the exercise

agree

feel pretty redundant to me in terms of what you learn by solving them

I feel the same way.

I usually choose to solve neither exercise because I do not feel I will learn meaningfully from solving either.

My are different in both cases, as I made different optimisations, especially for early abort.

I guess you can terminate immediately in the negative if you see that the input size < the alphabet size

Technically true, but depending on the input encoding, determining the length of the input in caharacters might already iterate it once.

The input might be encoded as a list of characters (Erlang, Haskell), UTF-8 encoded (Elixir), UTF16 (Java, .Net), in all of those len is O(n).

So in fact, depending on the language, its a place to learn if such a shortcut is actually a good thing or not…

But yeah, after thinking more about it.

Its just the early exit what makes the difference.

Although it was easy for me to implement those exercises in many languages without much thinking, rather mechanically, I learnt at least what pangrams and isograms are ;)

While implementation in many languages can be almost identical, in some other languages they will be handled wildly differently.

I Bash for example, the best way to check a pangram is to send the whole alphabet through a command that deletes any letters from the pangram candidate. If there are any characters left over then it is not a pangram. This is actually a really good concept for Bash students to learn.

For isogram in bash there are two main methods:

  • one which maps the letters as keys in an associative array.
  • one which fills an array which the letters, but checks if the letter is already in the array first.

The methodology for checking an isogram is also good for students to learn, but it is wildly inefficient when applied to pangrams.

Nerdsniped. Here's my Pangram code in Bash:

#!/usr/bin/bash
# Copyright © 2018 Bart Massey
# Exercism Pangram solution in Bash

echo "abcdefghijklmnopqrstuvwxyz$1" |
tr -d '\n' |
tr 'A-Z' 'a-z' |    
sed -e 's/[^a-z]//g' -e 's/\(.\)/\1\n/g' |
sort |
uniq -c |
while read COUNT CHAR
do
    if [ $COUNT -lt 2 ]
    then
        exit 1
    fi
done
case $? in
    0) echo true;;
    1) echo false;;
esac
exit 0

You can turn it into Isogram by changing the -lt in the if test to -gt.

This code is not "wildly inefficient" in either case, being O(n lg n) with reasonable constants. If you want to go faster, you can roll your own uniq -c that counts an unsorted input, but honestly who would ever care?

Perhaps 'wildly inefficient' was too strong. I struggle with hyperbole.

There is a bash solution that uses one pipe and no loops. It is very well optimized relative to your example.

I'm on mobile now, but I can uplaod an example later.

Those are both the same sequence of steps:

  1. Form a map from letters -> count
  2. Iterate over the values of that map

It really bears repeating that the best way to address these is highly language dependent. In some languages, what appears like a very slight difference in the problem domain results in fairly dramatic differences in effective implementation. That's what makes the two problems useful.

The beauty of exercism, however, is that no language track is bound to implement any one problem. So if the problems are, in fact, too similar to be useful in some languages, those tracks are free to omit one or both of them.

Here is an example of a very nice way to do pangram in bash without restricting oneself to built-ins:

#!/usr/bin/env bash

alphabet="abcdefghijklmnopqrstuvwxyz"

pangram_candidate="${1,,}"

zero_check="$(echo "$alphabet" | tr -d "$pangram_candidate")"

[[ -z $zero_check ]] && echo 'true' || echo 'false'

And in pure bash with only built-ins:

alphabet="abcdefghijklmnopqrstuvwxyz"

pangram_candidate="${1,,}"

zero_check="${alphabet//[$pangram_candidate]/}"

[[ -z zero_check ]] && echo 'true' || echo 'false'

Like I said at first. This exercise for other languages likely has really similar implementations to isogram, but in Bash this is optimal for pangram, but is not as easily convertible for isogram.

I think @hilary makes a great point that language tracks don't have to implement an exercise if they don't want to. Personally I just don't want to see either one deprecated, because for some tracks they provide different challenges with a similar subject.

We don't have the isogram exercise implemented on the bash track currently, but it is on my todo list.
So I had to make and test a script:

#!/usr/bin/env bash

LOWERCASE_INPUT="${1,,}" # get lowercase

isogram_candidate="${LOWERCASE_INPUT//[![:lower:]]}" # get rid of all but lowercase alpha

# can't do a mass replacement because we only want to remove one of each character
# this forces the loop and the single '/' in the replacement syntax
for letter in {a..z}; do 
  isogram_candidate="${isogram_candidate/$letter/}" # replace only first occurrence of character
done

[[ -z $isogram_candidate ]] && echo 'true' || echo 'false' # this logic is almost identical

I tried to make these as similar as possible, and there is no doubt that they are. For Bash, at least, I think there is a lot of related but different material to learn from each of these.

  • For example: the difference between ${myvar//pattern/substitute} and ${myvar/pattern/substitute} and how to use [pattern] in that syntax to ignore order within the pattern.
  • Also, learning to tell when there is no way around a loop, and why non-recursive replacement requires a loop in cases like this.

Thank you for letting me present this evidence

  • I only want for this information to be kept in mind while the community considers deprecating one of these two exercises.

Conclusion:

  • [ ] Isogram and Pangram can be similar, or different; mostly dependent on the language
  • [ ] Tracks where exercises are very similar can choose to deprecate one of them in the track config.json

Can we close this issue with this conclusion?

Was this page helpful?
0 / 5 - 0 ratings

Related issues

kytrinyx picture kytrinyx  Â·  4Comments

kytrinyx picture kytrinyx  Â·  3Comments

wolf99 picture wolf99  Â·  4Comments

budmc29 picture budmc29  Â·  3Comments

wolf99 picture wolf99  Â·  5Comments