Support for ASN?
https://dev.maxmind.com/geoip/geoip2/geolite2-asn-csv-database/
Get the ISP and network of an IP.
Hey @mhf-ir, is there an advantage to using the CSV database rather than the binary city database that the geoip transform (https://vector.dev/docs/reference/transforms/geoip) currently targets?
@Jeffail sorry, my bad for sharing that link. My point is about the ASN DB.
To get the ISP and network data for an IP:
https://geolite.maxmind.com/download/geoip/database/GeoLite2-ASN.tar.gz
I'm not suggesting using CSV instead of the binary DB file.

Understood, we can possibly add support for the ASN DB as an optional extra. So with a config like:
[transforms.my_transform_id]
type = "geoip"
inputs = ["my-source-id"]
database = "/path/to/GeoLite2-City.mmdb"
asn_database = "/path/to/GeoLite2-ASN.csv"
source = "ip_address"
target = "geoip"
With the asn_database field present, the DB would be parsed and the data enriched with the extra information. If this functionality is added to geoip, we would also need to make the database field optional: omitting either database or asn_database is accepted, but omitting both would be an error (the transform would otherwise be a no-op).
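The validation rule described above (either field may be omitted, but not both) could look roughly like the following. This is only a sketch: the GeoipConfig struct and the databases_to_load function are hypothetical names made up for illustration, not Vector's actual code.

```rust
/// Hypothetical config struct for illustration only. The field names follow
/// the proposal in this thread, not Vector's real implementation.
#[derive(Debug, Default)]
struct GeoipConfig {
    database: Option<String>,
    asn_database: Option<String>,
}

/// Returns the database paths that should be loaded, or an error when
/// neither field is set (the transform would have nothing to enrich with).
fn databases_to_load(config: &GeoipConfig) -> Result<Vec<&str>, String> {
    let mut paths = Vec::new();
    if let Some(path) = config.database.as_deref() {
        paths.push(path);
    }
    if let Some(path) = config.asn_database.as_deref() {
        paths.push(path);
    }
    if paths.is_empty() {
        Err("geoip: at least one of `database` or `asn_database` must be set".to_string())
    } else {
        Ok(paths)
    }
}
```

Rejecting the empty case at config time, rather than silently passing events through, makes a misconfigured transform visible at startup.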
The ASN database is also an mmdb file (GeoLite2-ASN.mmdb), not CSV.
Something like this:
[transforms.my_transform_id]
type = "geoip"
inputs = ["my-source-id"]
database = "/path/to/GeoLite2-City.mmdb"
asn_database = "/path/to/GeoLite2-ASN.mmdb" # HERE !!!
source = "ip_address"
target = "geoip"
Thank you for the awesome project, by the way. The feature for extracting metrics from logs is especially great.
Thanks @mhf-ir. If we add this option, I'd prefer that we deprecate the database field in favor of city_database and so on.
I thought I’d take a crack at implementing this since I need it :)
Before I noticed this issue I had planned a database_type field that would indicate the type of database file provided. It would default to "city", but also accept "asn". If you wanted both, you'd use two geoip transforms.
Happy to do the "city_database" and "asn_database" approach too, if you prefer.
Thanks @markonen, is there any reason we wouldn't just have two options? city_database and asn_database? Just trying to understand so we can prevent rework.
A database_type field is a smaller change to geoip.rs because it doesn't imply support for dealing with multiple databases in one go. That's the main reason it would have been my design.
There's also a bunch of different MaxMind databases, so in case you'd like to support more in the future, it feels neater to me to add new supported values to an enum rather than all-new fields.
But neither of these is a big deal, happy to go either way.
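For comparison, the database_type idea could be modelled as an enum that defaults to "city" and also accepts "asn". This is a hedged sketch only: DatabaseType and its variants are hypothetical names, and a real config would more likely be deserialized by the config layer than parsed by hand.

```rust
use std::str::FromStr;

/// Hypothetical enum backing a `database_type` config field.
/// Supporting another MaxMind database would mean adding a variant here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DatabaseType {
    City,
    Asn,
}

impl Default for DatabaseType {
    /// Defaulting to `City` keeps existing configs working unchanged.
    fn default() -> Self {
        DatabaseType::City
    }
}

impl FromStr for DatabaseType {
    type Err = String;

    /// Parses the lowercase config value; anything else is rejected.
    fn from_str(value: &str) -> Result<Self, Self::Err> {
        match value {
            "city" => Ok(DatabaseType::City),
            "asn" => Ok(DatabaseType::Asn),
            other => Err(format!("unsupported database_type: {}", other)),
        }
    }
}
```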
I see. I'm mostly wondering how we would document fields for each database, like we do here. I don't fully understand the difference between the two, which is part of the problem. Is there overlap in the fields that would be produced, or are they entirely separate?
The GeoLite2 ASN database doesn’t have overlapping fields with the City database, but the City and Country databases and their free/commercial variants have a lot of commonality.
One possibly convenient config API would be to allow the user to simply specify the set of fields to extract (and default to the current City-specific set) without caring about the database type at all. Especially since you can also create custom databases in the MaxMind format with whatever fields you want.
But I’m not sure how convenient that would be to implement with the maxminddb crate; the API seems to be geared towards going through its pre-defined structs (AnonymousIp, City, ConnectionType, Country and Isp).
ASN data helps network maintainers determine, for example, which ISPs do not care that their clients send spam or attacks, so that you can block or show a captcha based on the ISP on each server. Monitoring the behaviour of an ASN could help with these use cases.
An ASN/ISP can buy new IP ranges and change them, but the ASN assigned at registration is unique, so you can block new IP ranges that belong to the same weak ASN/ISP.
This could be very helpful.
Thanks, I appreciate the help/input here. Last question 😄...
> Especially since you can also create custom databases in the MaxMind format with whatever fields you want.
I wasn't aware of this. Given that it is possible to create custom databases, could we autodetect the fields present? Is there not metadata, such as a header row, that would tell us this? Then we could accept an array of databases:
[transforms.my_transform_id]
type = "geoip"
inputs = ["my-source-id"]
- database = "/path/to/GeoLite2-City.mmdb"
+ databases = [
+ "/path/to/GeoLite2-City.mmdb",
+ "/path/to/GeoLite2-ASN.mmdb"
+ ]
source = "ip_address"
target = "geoip"
Any thoughts on that approach?
I looked at the format spec and unfortunately the database file metadata does not contain the field names (only the data types). So the user will need to specify the expected structure one way or another. @binarylogic you seem to prefer approaches where the user can process more than one database in one go. Is that right?
Not necessarily. We care deeply about the UX of Vector, and I'm trying to figure out what the best UX is here. In general, Vector leans towards reducing the number of decisions a user has to make. I want to preserve the automatic behavior as it exists now without limiting the use of other databases. Since we can't detect fields from the file itself, it seems like we have no choice but to determine that from the config.
With that said, I lean towards:
[transforms.my_transform_id]
type = "geoip"
inputs = ["my-source-id"]
- database = "/path/to/GeoLite2-City.mmdb"
+ city_database = "/path/to/GeoLite2-City.mmdb"
+ asn_database = "/path/to/GeoLite2-ASN.mmdb"
source = "ip_address"
target = "geoip"
Given that we expect certain files for the city_database and asn_database, we should be able to extract the data and add them into preconfigured fields, right?
With regard to custom databases, unless you have an immediate need for that, we can handle that separately via a custom_database field with an additionally required custom_fields array. That is something we could do in the future.
What do you think?
Sounds good! I’ll get back to you with a PR :)
Soooo… ended up with a different design once I realized that I could autodetect the ISP/ASN database by the database_type metadata field. Have a look when you have a moment!
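The autodetection mentioned here could be sketched as a match on the database_type string from the file's metadata. Everything below is an assumption made for illustration: DatabaseKind and detect_kind are hypothetical names, the two type strings are assumed values for MaxMind's ASN and ISP databases, and falling back to the City behaviour for anything else is one possible choice, not necessarily what the PR does.

```rust
/// Hypothetical classification of a loaded database.
#[derive(Debug, PartialEq, Eq)]
enum DatabaseKind {
    City,
    Isp,
}

/// Classifies a database from the `database_type` string found in its
/// metadata. The string values matched here are assumptions; any other
/// value falls back to the existing City handling.
fn detect_kind(database_type: &str) -> DatabaseKind {
    match database_type {
        "GeoLite2-ASN" | "GeoIP2-ISP" => DatabaseKind::Isp,
        _ => DatabaseKind::City,
    }
}
```

With detection like this, the config can keep a single database field and the transform picks the field set from the file itself.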