Create a normal prefix, then run this code against the API (replace $token, $prefixid and $host):
for i in $(seq 16); do
  curl -X POST -s -o /dev/null \
    -H "Authorization: Token $token" \
    -H "Accept: application/json; indent=4" \
    https://$host/api/ipam/prefixes/$prefixid/available-ips/ &
done
This requests 16 IP addresses from NetBox simultaneously (each curl runs in the background), triggering the race condition bug.
Expected: NetBox assigns IP addresses from .1 up to .16: Example screenshot
Actual: NetBox assigned IP addresses from .1 up to .5 only, assigning some of the IPs multiple times: Example screenshot
What's your setting for 'ENFORCE_GLOBAL_UNIQUE'?
@sdktr It was set to False. Setting it to True and restarting still allows the command above to create duplicate addresses. To confirm the option was set correctly: manually creating a duplicate address in the web interface causes the error "Duplicate IP address found in global table: 100.64.8.1/22".
I don't know that there's a feasible way to prevent this given the nature of the function. However, you can easily work around it by sending a single request for multiple addresses. For example, POSTing a list of five empty objects will return the first five available IPs.
curl -X POST \
-H "Authorization: Token $token" \
-H "Content-Type: application/json" \
-H "Accept: application/json; indent=4" \
https://$host/api/ipam/prefixes/$prefixid/available-ips/ \
--data '[{}, {}, {}, {}, {}]'
(Each object can optionally specify a VRF, which is why NetBox expects a list of objects rather than simply a count of IPs to create.)
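For scripted clients, the payload is simply a JSON list with one object per requested IP. A minimal sketch of building it in Python (the `vrf` field name is an assumption here; check your NetBox version's serializer for the exact field):

```python
import json

def build_available_ip_payload(count, vrf_id=None):
    """Build the request body for .../available-ips/: a list of
    objects, one per IP to allocate. Each object may optionally
    pin the allocation to a VRF ("vrf" is an assumed field name)."""
    obj = {"vrf": vrf_id} if vrf_id is not None else {}
    return json.dumps([dict(obj) for _ in range(count)])

# Five IPs from the prefix's global table:
print(build_available_ip_payload(5))  # -> [{}, {}, {}, {}, {}]
```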
@jeremystretch We have multiple VM hypervisors accessing phpIPAM for our IP management. We wanted to give NetBox a try for DCIM and IPAM, but duplicate IP assignments are a no-go.
Our current workaround is to add a unique constraint to the ipam_ipaddress table:
ALTER TABLE ONLY ipam_ipaddress ADD CONSTRAINT ipam_ipaddress_ukey UNIQUE (vrf_id, address);
This makes NetBox throw an error whenever it tries to create a duplicate IP entry. If we see an error related to the PostgreSQL unique constraint, or {"non_field_errors":["Duplicate IP address found in VRF testing (testing): 100.64.8.3/22"]} (which does happen sometimes when using VRFs), we back off for a few seconds and then retry.
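The back-off-and-retry loop described above can be sketched as follows. This is a hypothetical client-side helper, not part of NetBox or any real library: DuplicateIPError and the allocate callable are stand-ins for however your client surfaces the unique-constraint violation.

```python
import time

class DuplicateIPError(Exception):
    """Stand-in for the error raised when the unique constraint fires."""

def allocate_with_retry(allocate, retries=5, backoff=2.0):
    """Call allocate() and retry with a linearly growing delay when a
    duplicate-IP error is raised; give up after 'retries' attempts."""
    for attempt in range(retries):
        try:
            return allocate()
        except DuplicateIPError:
            time.sleep(backoff * (attempt + 1))  # linear back-off
    raise RuntimeError("gave up after %d duplicate-IP errors" % retries)
```

A caller would wrap its POST to available-ips/ in a function that raises DuplicateIPError on the constraint violation and pass that function in as allocate.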
I don't understand why this was closed. The API is meant for automation, but it can't be used for that because it is implemented incorrectly. What about the "nature of the function" prevents a fix? Why can't it be protected with a key constraint, like what Fusl did above?
I understand you're just one person and there's only so much time, but how can anyone else consider making a pull request for issues that are just shut down with the assertion that a fix is impossible? It's not always possible to batch requests like that; you can't just change the way your automation works to account for a broken API. I'm trying to write a Terraform provider, where resources are handled individually and cannot be batched into a single API call.
I believe this is the proper fix given how Django models work:
https://github.com/netbox-community/netbox/compare/develop...mattolenik:available-api-race
Can this be reopened so a PR can be submitted?
This was closed because no acceptable modification was proposed, and a workaround was provided.
I believe this is the proper fix given how Django models work
This does not address the underlying issue:
prefix = Prefix.objects.select_for_update().get(pk=pk)
That locks _the parent prefix_ being queried; it does not prevent duplicate child prefixes or IP addresses from being created.
It doesn't prevent the insertion of another row with the same address, but it prevents this method from running concurrently. This most certainly does fix the problem with concurrent API calls. I can provide test cases that prove this, but not if it's just going to be shot down. I don't understand the attitude of not wanting to fix a serious race condition. This is a very serious bug and shouldn't be closed regardless of whether or not a fix was proposed; what you're telling people is that they can never attempt to fix this, because you'll never accept a fix for it.
Please, I'm literally trying to write an open source provider that people can use to help them use NetBox. Why shoot this down?
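As an illustration of the argument above (a toy in-process model, not NetBox code): serializing the whole read-pick-insert sequence per prefix, which is what a row lock on the parent Prefix achieves in PostgreSQL via select_for_update(), makes concurrent allocations safe even though the lock itself never inspects the child rows.

```python
import threading

# Many workers each read the set of assigned IPs, pick the first
# free one, and insert it. Guarding the whole sequence with one
# lock per prefix serializes the allocations and prevents
# duplicates, without any uniqueness check on the rows themselves.
assigned = set()
prefix_lock = threading.Lock()

def allocate_ip(pool):
    with prefix_lock:  # serialize the read-pick-insert critical section
        free = next(ip for ip in pool if ip not in assigned)
        assigned.add(free)
        return free

pool = ["100.64.8.%d/22" % i for i in range(1, 33)]
results = []
threads = [threading.Thread(target=lambda: results.append(allocate_ip(pool)))
           for _ in range(16)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert len(set(results)) == 16  # 16 allocations, no duplicates
```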
I see this as a critical issue. The data model should never allow duplicate IPs to be inserted.
@jeremystretch Do you have a recommendation on an approach that would fix the issue more completely? Can we also reopen this issue?
Full disclosure: I work with @mattolenik, and this issue is a blocker for us rolling out NetBox as our central IPAM solution. We want to do everything we can to avoid maintaining an internal fork and will gladly devote some dev time to fixing this properly.
@mattolenik @nicpar is one of you volunteering to own the fix, as well as any follow-on issues from implementing the change? I am focused on v2.7 work for the near future and don't have any cycles for this. (Maybe one of the other maintainers can assist.)
@jeremystretch sure, I can do it! I'll start working on tests :)