I'm using the npm cassandra client to make a select query using the stream method but the end event is never being triggered. This only happens in large collections where pagination is actually necessary. You won't see this behaviour for small sets.
I'm running scylla locally in docker with
docker run --name scylla -d scylladb/scylla
Scylla version (or git commit hash): 3.0.5-0.20190414.403f66eca
CREATE KEYSPACE my_keyspace WITH replication = { 'class': 'SimpleStrategy', 'replication_factor' : 1 };
USE my_keyspace;
CREATE TABLE events
(
eventId UUID,
userId INT,
payload TEXT,
ackeddevices SET<TEXT>,
PRIMARY KEY(eventId)
);
CREATE INDEX ON events(userId);
const stream = scylladb.stream('SELECT payload, ackeddevices FROM events WHERE userId=?', [userId], { prepare: true });
stream
.on('readable', function () {
let data;
while (data = this.read()) {
// Doing something with data here
}
})
.on('end', () => console.log('Ended'))
.on('error', (err) => console.log(`Some error: ${err}`));
I've confirmed:
docker run cassandra:latest)stream.readable property is never set to false with Scylla, even seconds after the query is finishedend event is not being triggered with the Scylla imageI can provide a dml.cql with a lot of inserts if it helps.
@denesb please comment
@thales-gaddini please do provide the dml.cql
It will also help if you can provide a tcp dump for the interaction with cassandra and the interaction with scylla capturing port 9042
(the info for no next page should be visible on the last response - and it will help to validate that on the wire its marked correctly flag: Has_more_pages https://github.com/apache/cassandra/blob/trunk/doc/native_protocol_v4.spec#L552)
Hi @slivne , the dml is attached as a zip (just a really big file with a lot of inserts).
I've also implemented the paging with the eachRow method as describe here and noticed that the nextPage and pageState from the result are never null.
This is an example of the ResultSet returned in the eachRow callback with Cassandra when there are still pages to be retrieved:
{
result: ResultSet {
info: {
queriedHost: '172.17.0.2:9042',
triedHosts: [Object],
speculativeExecutions: 0,
achievedConsistency: 10,
traceId: undefined,
warnings: undefined,
customPayload: undefined,
isSchemaInAgreement: true
},
rows: undefined,
rowLength: 20000,
columns: [ [Object], [Object] ],
pageState: '10ee2f4e7f73e043bea633a78769f5dfe300f07fffb1df00',
nextPage: [Function: nextPage]
}
}
This is an example of the ResultSet returned in the eachRow callback result with Cassandra when there are no more pages to be retrieved:
{
result: ResultSet {
info: {
queriedHost: '172.17.0.2:9042',
triedHosts: [Object],
speculativeExecutions: 0,
achievedConsistency: 10,
traceId: undefined,
warnings: undefined,
customPayload: undefined,
isSchemaInAgreement: true
},
rows: [],
rowLength: 20000,
columns: [ [Object], [Object] ],
pageState: null,
nextPage: undefined
}
}
The ResultSet with Scylla never reaches the pageState: null and nextPage: undefined status, making infinite requests. It makes sense to me that the stream uses the same ResultSet under the hood and that's why it never ends. Two consecutive ResultSets when there where no more rows to be retrieved from Scylla:
{
result: ResultSet {
info: {
queriedHost: '172.17.0.2:9042',
triedHosts: [Object],
speculativeExecutions: 0,
achievedConsistency: 10,
traceId: undefined,
warnings: undefined,
customPayload: undefined,
isSchemaInAgreement: true
},
rows: [],
rowLength: 20000,
columns: [ [Object], [Object] ],
pageState: '00000000a40000001000000001000000040000000000271001280000000200000008000000fffc0480a1e1a69410000000c50f3598f6304f7baa73526537dc724cdfb1ff7fee47a20ac473782b0f0c912bde39858e010000003900000001190000001400000001000000080000003aaa8898163098470101190000001400000001000000080000003aaa8898163098470101010000006d41db007affc3b5d4340de9fa2b2a940100',
nextPage: [Function: nextPage]
}
}
{
result: ResultSet {
info: {
queriedHost: '172.17.0.2:9042',
triedHosts: [Object],
speculativeExecutions: 0,
achievedConsistency: 10,
traceId: undefined,
warnings: undefined,
customPayload: undefined,
isSchemaInAgreement: true
},
rows: [],
rowLength: 20000,
columns: [ [Object], [Object] ],
pageState: '00000000a40000001000000001000000040000000000271001280000000200000008000000fffc0480a1e1a69410000000c50f3598f6304f7baa73526537dc724cdfb1ff7fee47a20ac473782b0f0c912bde39858e010000003900000001190000001400000001000000080000003aaa8898163098470101190000001400000001000000080000003aaa8898163098470101010000006d41db007affc3b5d4340de9fa2b2a940100',
nextPage: [Function: nextPage]
}
}
md5-da83befbfbb41cb36266ab50f3929bb4
sudo tcpdump port 9042 -w scylla.pcap
sudo tcpdump port 10000 -w scylla.pcap
sudo tcpdump host 172.17.0.2 and port 9042 -w scylla.pcap
sudo tcpdump host 172.17.0.2 and port 10000 -w scylla.pcap
But none of them had any output. I've also tried to just tcpdump everything (without specifying a host or port), but nothing related to Scylla was logged. Should I tcpdump inside the Docker container?
tcpdumping from docker worked. Here is a zip with the pcap files from both Scylla and Cassandra.
As you said, the Has_more_pages flag is true in Scylla even though there are no more records (with a lot of the same results), as can be seen below (one of the many results )


Has_more_pages is false in the final result in the Cassandra dump.

@thales-gaddini thanks (and extra points for doing the initial analysis - nice going)
There is one difference between cassandra and scylla when it comes to paging.
I am not sure the paging state is changing correctly between pages - yet will dig further after trying to reproduce this using the sample you shared.
So trying your schema + dml.cql on a clean 3.0.5 docker container with python I can't reproduce
from cassandra.cluster import Cluster
cluster = Cluster(["172.17.0.3"])
session = cluster.connect()
p = session.prepare('SELECT payload, ackeddevices from my_keyspace.events where userid=?')
rows = session.execute(p,[10000])
for user_row in rows:
print ".",
next the dump :)
Yet a closer look it does reproduce the issue

the last page has the flag Has_next_page set to True
Is the paging state changing between different iterations?
The driver should have continued fetching. I don't understand @slivne's capture.
The paging state does not change
frame 17623 result paging state
frame_17623.paging_state.txt
frame 17624 next page request (passes the paging info)
frame_17624.e_paging_state.txt
frame 17625 result paging state - the same as 17623
frame_17625.paging_state.txt
@avikivity I don;t understand the driver as well not fetching more pages - the program is the one I provided above
so the query returns a total of 20000 rows split into 5000 per page
I changed the pagesize from being 5000 to being 4999
So scylla can return 4999 and the last page not a full page (4 results)
in this case the last page Has_more_pages is marked as FALSE

So it seems the issue is with the last page and the next query that returns no results returns back the paging information while it shouldn't
another oddity using the 5000 row page size , when I change the code and not use prepared statements it does not reproduce as well
on docker 3.0.7
this has the problem e.g. the last page is an empty page and has Has_more_pages : True
from cassandra.cluster import Cluster
cluster = Cluster(["172.17.0.4"])
session = cluster.connect()
p = session.prepare('SELECT payload, ackeddevices from my_keyspace.events where userid=?')
rows = session.execute(p,[10000])
for user_row in rows:
print ".",
this does not e.g. the last page is an empty page with Has_more_pages set to False
from cassandra.cluster import Cluster
cluster = Cluster(["172.17.0.4"])
session = cluster.connect()
rows = session.execute("SELECT payload, ackeddevices from my_keyspace.events where userid=10000")
for user_row in rows:
print ".",
Hey guys, thanks for the fix. When does it go from next to master?
@thales-gaddini before being merged into master, it needs to pass the gating tests. If everything goes smoothly, then it may be a couple of hours, if not, then until I fix the bugs and resend :)
Backported to 3.0 and 3.1, removing from backport candidates.
Most helpful comment
@thales-gaddini before being merged into master, it needs to pass the gating tests. If everything goes smoothly, then it may be a couple of hours, if not, then until I fix the bugs and resend :)