Clickhouse: JOIN on BETWEEN or a range?

Created on 20 Nov 2018  路  4Comments  路  Source: ClickHouse/ClickHouse

Have a lookup table with IP address ranges as start to end IP, which can also be start and end number using IPv4StrToNum. Need a way to lookup rows in that table from a larger table.

Something that would accomplish this example in SQL:

SELECT d.*, r.id
FROM data AS d
LEFT JOIN ranges AS r ON IPv4StrToNum(d.address) BETWEEN r.start_num AND r.end_num;

Breaking out my lookup table into one row per element in each range for a JOIN ... USING explodes the lookup table significantly. Is there any form in ClickHouse to join on a range?

Most helpful comment

How many rows in ranges ?

Check ip_trie dict
https://clickhouse.yandex/docs/en/query_language/dicts/external_dicts_dict_layout/#ip_trie

Your query will work much faster
SELECT d.*, dictGetT('dict_name', 'id', tuple(address)) as id
FROM data AS d

All 4 comments

How many rows in ranges ?

Check ip_trie dict
https://clickhouse.yandex/docs/en/query_language/dicts/external_dicts_dict_layout/#ip_trie

Your query will work much faster
SELECT d.*, dictGetT('dict_name', 'id', tuple(address)) as id
FROM data AS d

Is dictGetT still a valid function? dictGetString seems to work in this case.

The ip_trie docs mention dictGetT, but its not listed in https://clickhouse.yandex/docs/en/query_language/functions/ext_dict_functions/

DB::Exception: Unknown function dictGetT

Is dictGetT still a valid function?

No, T means Type dictGet<T>
Replace with a type you need. String/Int64/UInt....

For matching the most specific subnet there is also ip_trie dictionaries (handy for geoip or similar tasks).

Was this page helpful?
0 / 5 - 0 ratings

Related issues

hatarist picture hatarist  路  3Comments

fizerkhan picture fizerkhan  路  3Comments

atk91 picture atk91  路  3Comments

innerr picture innerr  路  3Comments

opavader picture opavader  路  3Comments