Have a lookup table with IP address ranges as start to end IP, which can also be start and end number using IPv4StrToNum. Need a way to lookup rows in that table from a larger table.
Something that would accomplish this example in SQL:
SELECT d.*, r.id
FROM data AS d
LEFT JOIN ranges AS r ON IPv4StrToNum(d.address) BETWEEN r.start_num AND r.end_num;
Breaking out my lookup table into one row per element in each range for a JOIN ... USING explodes the lookup table significantly. Is there any form in ClickHouse to join on a range?
How many rows in ranges ?
Check ip_trie dict
https://clickhouse.yandex/docs/en/query_language/dicts/external_dicts_dict_layout/#ip_trie
Your query will work much faster
SELECT d.*, dictGetT('dict_name', 'id', tuple(address)) as id
FROM data AS d
Is dictGetT still a valid function? dictGetString seems to work in this case.
The ip_trie docs mention dictGetT, but its not listed in https://clickhouse.yandex/docs/en/query_language/functions/ext_dict_functions/
DB::Exception: Unknown function dictGetT
Is dictGetT still a valid function?
No, T means Type dictGet<T>
Replace with a type you need. String/Int64/UInt....
For matching the most specific subnet there is also ip_trie dictionaries (handy for geoip or similar tasks).
Most helpful comment
How many rows in
ranges?Check ip_trie dict
https://clickhouse.yandex/docs/en/query_language/dicts/external_dicts_dict_layout/#ip_trie
Your query will work much faster
SELECT d.*, dictGetT('dict_name', 'id', tuple(address)) as id
FROM data AS d