Scryer-prolog: Partial strings: not representable characters

Created on 21 Feb 2020  ·  19 Comments  ·  Source: mthom/scryer-prolog

?- char_code(C,0), partial_string([C],Xs0,Xs).
caught: error(type_error(character,'\x0\'),partial_string/3)

I believe this should succeed, even if Xs0 will not be a partial string.

And the error should be rather a representation error.

bug

All 19 comments

Not sure, I think it should rather succeed with Xs0 = .('\0\',Xs).

In the meantime it's:

?- char_code(C,0), partial_string([C],Xs0,[]).
   C = '\x0\', Xs0 = [].

Expected: Xs0 = "\0\" which is the same as Xs0 = ['\0\']

Further:

?- char_code(C,0), partial_string([C,b,c],Xs0,[]).
   C = '\x0\', Xs0 = "bc".

Expected: Xs0 = "\0\bc"

@mthom : [bug] since it must be true

for all Xs being a list of chars: partial_string(Xs, Xs0, []), Xs == Xs0.

Counterexample: Xs = "a\0\b".

But also

?- Xs = "a\0\b".
   Xs = "ab".

is incorrect.

@mthom , what is a terminator for PartialString?

I thought it was '\0', the 0 byte.

But "a\x0\b" contains a '\0'. Wouldn't that cause an issue?

That's what I thought! I'm not sure how to reconcile the two.

Not sure if it is possible; C strings have the same issue. The terminator is the problem. Right now:

pstr.append_chars("\x00ab") == None

Why use a terminator?

I think the general idea is that \0 still causes partial strings to split, as it originally did. "a\x0\b" will write "a\x0" to the first string/segment, and "\x0b\x0" to the second string, i.e. the first string's tail. See #95 for how the terminators are intended to work.

The catch is that no partial string (at least, none represented by HeapCellValue::PartialString) can be entirely empty, ie. each must contain at least one character. We have [] for empty strings.

> Why use a terminator?

Eventually strings will be stored directly in the heap. The WAM needs to know when they terminate by scanning them since the length of the string won't be stored anywhere. That's not currently the case. Strings are currently stored to dedicated buffers pointed to from within the heap, and are deallocated via Rust's RAII.

This comment seems to state that it isn't possible.

If the length can't be stored then the terminator is required, but it doesn't seem possible to distinguish a terminator from the character '\0'.
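The concern can be reproduced with a small model. The following Rust sketch is illustrative only: `store` and `load` are hypothetical helpers, not part of Scryer Prolog, and the flat `Vec<u8>` merely stands in for a heap that keeps no length information and relies on scanning for a 0 byte.

```rust
/// Hypothetical helper: append `s` to a flat buffer, followed by a 0 terminator.
/// Returns the offset at which the string starts.
fn store(heap: &mut Vec<u8>, s: &str) -> usize {
    let start = heap.len();
    heap.extend_from_slice(s.as_bytes());
    heap.push(0); // terminator; no length is recorded anywhere
    start
}

/// Hypothetical helper: recover the string at `start` by scanning
/// forward until the first 0 byte (or the end of the buffer).
fn load(heap: &[u8], start: usize) -> &str {
    let rest = &heap[start..];
    let len = rest.iter().position(|&b| b == 0).unwrap_or(rest.len());
    std::str::from_utf8(&rest[..len]).expect("buffer holds valid UTF-8")
}
```

With this naive scheme, storing "abc" and loading it again works, but storing "a\0b" and loading it yields only "a": the scan cannot tell the embedded character from the terminator.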

When allocate_pstr is called with "\x00ab" is the allocation done for "\x0\ab"?

It's written to a second string, the tail of "\x0".

write_pstr is returning None for "\x00\x00". Will do some tests later.

OK, I have it done, according to the above interpretation. That is, a '\x0\' can occur as the first character of a partial string segment, where it will be interpreted as just another character. Anywhere else, it will be interpreted as a null terminator. This is to say that no partial string segment may be empty. This query still succeeds as expected however:
```
?- partial_string("", Xs, Xs0).
Xs = Xs0.
```

I will commit the change and we can hopefully close the issue.
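The interpretation above can be modelled outside the WAM. The following Rust sketch is an assumption-laden illustration, not the actual Scryer Prolog code: `encode_segments` and `decode_segments` are hypothetical names, and each segment is modelled as its own `Vec<u8>` rather than a heap cell. A 0 byte in first position is treated as an ordinary character; a 0 byte anywhere else ends the segment.

```rust
/// Hypothetical model: split `s` into 0-terminated segments.
/// An embedded 0 byte closes the current segment and becomes the
/// first character of the next one, so no segment is ever empty.
fn encode_segments(s: &str) -> Vec<Vec<u8>> {
    let mut segments = Vec::new();
    let mut current: Vec<u8> = Vec::new();
    for &byte in s.as_bytes() {
        if byte == 0 && !current.is_empty() {
            current.push(0); // terminator of the segment built so far
            segments.push(std::mem::take(&mut current));
        }
        current.push(byte); // a leading 0 is stored as a plain character
    }
    if !current.is_empty() {
        current.push(0);
        segments.push(current);
    }
    segments
}

/// Hypothetical model: rebuild the original string from its segments.
fn decode_segments(segments: &[Vec<u8>]) -> String {
    let mut bytes = Vec::new();
    for seg in segments {
        if let Some((&first, rest)) = seg.split_first() {
            bytes.push(first); // first byte is always a character, even if 0
            let end = rest.iter().position(|&b| b == 0).unwrap_or(rest.len());
            bytes.extend_from_slice(&rest[..end]);
        }
    }
    String::from_utf8(bytes).expect("segments were built from valid UTF-8")
}
```

Under these assumptions "a\0b" encodes to the two segments "a\0" and "\0b\0", matching the split described earlier in the thread, and the empty string produces no segments at all, which corresponds to representing it as [].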

The test cases work perfectly now, thank you a lot!

However, I now incorrectly get:

?- Ls = "\x2124\".
   Ls = "\2124\".

Expected answer: Ls = "\x2124\". Note the x.

'\x0' instead of '\0' is acceptable, right?

I suppose you mean '\x0\' and '\0\', i.e., with the trailing \?

Yes, absolutely!

As I wanted to try the equivalence with GNU Prolog, I got:

| ?- X = '\0\'.
uncaught exception: error(syntax_error('user_input:9 (char:34) invalid character code in \\constant\\ sequence'),read_term/3)

So, this seems to be a shortcoming in GNU Prolog...

Definitely '\x0\' is acceptable to denote the character with code 0.

> So, this seems to be a shortcoming in GNU Prolog...

Not sure what you mean by shortcoming, but there is no requirement in 13211-1 that '\0\' is a character of the Processor character set (PCS, 6.5). Including '\0\' means that the implementation defined PCS contains an extended character. In GNU it is not part of the PCS, thus either syntax errors or representation errors occur:

| ?- char_code(C,0).
uncaught exception: error(representation_error(character_code),char_code/2)