Openfoodnetwork: Cannot create OC on USA instance

Created on 11 Sep 2018  Â·  33Comments  Â·  Source: openfoodfoundation/openfoodnetwork

Trying to create an OC in USA instance for Ms Future Farms (producer only) and getting "failed to create Order Cycle"

image.png

There are valid Shipping and Payment methods
image.png

image.png

Severity S1 - critical function and means effectively the platform doesn't work

See initial support request at https://community.openfoodnetwork.org/t/failed-to-create-order-cycle/1422/5

Contact @LXW and @Brandy on Slack

bug-s2

Most helpful comment

@sauloperez, come volete, il mio amico, come volete 😁

All 33 comments

ping @luisramos0 @sauloperez @mkllnk @Matt-Yorkley

awesome, with dev tools screenshot :smile:

@kirstenalarsen can you click on that order_cycles.json, there could be more details there.
Otherwise, we need someone with access to the server to see the logs.

I have just checked and there seems to also be a problem with normal order cycles (not 'producer only). I think that I have filled out everything I should have, and the hub has valid shipping and payment methods. hmmmm. It is not even giving me an active 'Create' button . .

image

image

Worth checking the state of delayed_job? If it's running at all?

clicked on order_cycles.json @luisramos0

image

image

image

I have sysadmin access through the server but won’t be home where I can access it until 5pm central. Maikel also has access.

So after looking at the logs it appears skylight is failing to start because it does not have the information it needs.
Unable to start instrumenter; msg=authentication token required; class=skylight::configuration error

image

skylight is a relatively new thing - it is quite possible that setting it up properly is missed in the installation / configuration process? @sauloperez @mkllnk ?

7d4041a946260b657be49cfd97e8db9d8a61b095

There is no skylight api key in the secrets file. I just found the link in the secrets file on how to apply for a skylight key. so I will apply and update the key and then scp the file onto the server. that should fix the problem with skylight. Can someone tell me if skylight is associated with making/tracking order cycles?

hey @brandyarmstrong and others - when you get to the bottom of this can we make sure ofn-install and/or wikis are updated? thanks :)

Hey, I dont think skylight is related to the problem. Even if it's not working properly.
What you need to look for is any error message in the log. Another place to look for the 500 entry would be on the nginx access.log.

@kirstenalarsen on your screenshot, the tab response can have more info.

I can help if you give me access to the app backoffice (so I can replicate) and/or server (so I can debug).

The stacktrace from ~/apps/openfoodnetwork/current/log/production.log:

Started POST "/admin/order_cycles.json" for 182.239.162.19 at 2018-09-12 09:03:25 +0000
Processing by Admin::OrderCyclesController#create as JSON
  Parameters: {"order_cycle"=>{"incoming_exchanges"=>[{"enterprise_id"=>54, "incoming"=>true, "active"=>true, "variants"=>{"188"=>true}, "enterprise_fee_ids"=>nil}], "outgoing_exchanges"=>[{"enterprise_id"=>54, "incoming"=>false, "active"=>true, "variants"=>{"188"=>true, "undefined"=>false}, "pickup_time"=>"testing", "enterprise_fee_ids"=>nil}], "name"=>"Maikel's Test", "orders_open_at"=>"2018-09-12 00:00:00", "orders_close_at"=>"2018-09-13 00:00:00", "coordinator_id"=>54, "schedule_ids"=>nil, "coordinator_fee_ids"=>nil}}
Completed 500 Internal Server Error in 96.0ms (ActiveRecord: 25.5ms)

Delayed::DeserializationError (ActiveRecord::RecordNotFound, class: Spree::User, primary key: 66 (Couldn't find Spree::User with id=66)):
  lib/open_food_network/products_cache_refreshment.rb:35:in `pending_job?'
  lib/open_food_network/products_cache_refreshment.rb:20:in `refresh'
  lib/open_food_network/products_cache.rb:169:in `refresh_cache'
  lib/open_food_network/products_cache.rb:128:in `refresh_outgoing_exchange'
  lib/open_food_network/products_cache.rb:76:in `exchange_changed'
  app/models/exchange.rb:91:in `refresh_products_cache'
  lib/open_food_network/order_cycle_form_applicator.rb:73:in `add_exchange'
  lib/open_food_network/order_cycle_form_applicator.rb:47:in `block in go!'
  lib/open_food_network/order_cycle_form_applicator.rb:35:in `each'
  lib/open_food_network/order_cycle_form_applicator.rb:35:in `go!'
  app/services/order_cycle_form.rb:33:in `apply_exchange_changes'
  app/services/order_cycle_form.rb:19:in `block in save'
  app/services/order_cycle_form.rb:17:in `save'
  app/controllers/admin/order_cycles_controller.rb:41:in `create'

As far as I can tell:

  • Creating an order cycle triggers the products cache to be refreshed.
  • The cache refresh job doesn't want to run multiple times, because it's quite expensive.
  • The job checks all other jobs to find duplicates of itself.
  • One of the jobs fails to load, because it has a reference to a user that doesn't exist any more.
  • This triggers an exception that breaks creating order cycles.

The bugs are:

  • Bad, expensive design of the cache refresh job checking all other jobs and not rescuing from any errors.
  • A job with a reference to a user object instead of just a user id. That means that loading the job fails already without any chance to rescue from that error.

Quick fix: remove that job referencing a deleted user.
Workaround: don't delete any users.
Solution: fix the mentioned bugs.

One offending job:

<Delayed::Backend::ActiveRecord::Job id: 580, priority: 0, attempts: 25, handler: "--- !ruby/struct:ConfirmSignupJob\nuser_id: 66\n", last_error: "Couldn't find Spree::User with id=66\n/home/openfood...", run_at: "2018-06-25 08:53:46", locked_at: nil, failed_at: "2018-06-25 08:53:47", locked_by: nil, queue: nil, created_at: "2018-06-04 23:07:04", updated_at: "2018-06-25 08:53:47">

I identified two types of jobs that are a problem here. In order to find a quick solution, I decided to delete all jobs of that category. Any sign-up emails can be triggered again if needed:

Delayed::Job.where(locked_at: nil).where("handler like '%ConfirmSignupJob%'").destroy_all
Delayed::Job.where(locked_at: nil).where("handler like '%send_on_create_confirmation_instructions_without_delay%'").destroy_all

The quick fix is applied. A workaround exits. I'm degrading this to an s2 bug and we should create issues for the two things that can lead to this.

I wonder what are the implications of

Workaround: don't delete any users.

are we sure that's affordable from a legal stand-point? I guess we must give users the possibility to remove their accounts? Soft-deleting and possibly anonymize them could be a solution.

However, this sounds like a big feature to implement. Can't we just rescue from this exceptions?

Sorry, that was just meant as a short term solution (a week or two) until the underlying issues are solved. I'm only recommending that admins should not delete any users while this is not fixed. And yes, rescuing from the exception is the easiest fix. We should also fix the delayed job to only contain a user id instead of a user.

thanks people. Couple of short/medium term action items and apologies if these have happened already

  • @mkllnk could you create issues for the rescue exemptions thing and also possibly the cache thing? (@myriamboure @daniellemoorhead what do we do with 'tech debt' wishlist items?)
  • can someone bring @luisramos0 into the core sysadmins access bizzo so that he can access instances when needed - thanks!!!!

I created the two follow up issues which replace this one. Closing.

(@myriamboure @daniellemoorhead what do we do with 'tech debt' wishlist items?)

At this present moment in time @kirstenalarsen they get labelled 'tech debt' and left with the "All the things" status. There have been a few suggestions for the devs to create and manage a tech debt project board similar to the bugs management board, but this hasn't been picked up so far.

imo, I think we are ok with a random process on tech debt. If you see @kristinalim tends to the ui (/pages), @sauloperez cares about rails conventions (#send), Ive been on the engines front (cookies poc), @Matt-Yorkley just rebuilt a view without deface, @mkllnk has been taking good care of the dependabot issues, etc, etc. We just need to be reminded if there are any major techdebt items being ignored by everyone. BUT, if you guys think something needs to be done from your perspective, you are probably talking about something that deserves normal prioritization.

The last thing we should do is create more process / things that aren’t needed, for sure!

There are two areas that we have to cover off -

  1. We closed all “improvements” type issues and have only left bugs in GH. Even then, the inventory of open issues we have is constantly growing and will eventually need to be culled (not a job anyone enjoys). I wonder how tech debt issues get managed if they’re not picked up and done as you lay out above, or aren’t considered important to do yet? Will they accrue till the gather dust, or do they get managed somehow?

  2. Larger pieces of tech debt that aren’t on the feature backlog and aren’t considered at the moment in the prioritization process (unless it gets to the edge of catastrophy like spree). We don’t have a view of these things at all right now, to be able to include it in quarterly planning as effectively as we can features.

On 14 Sep 2018, at 7:31 pm, Luis Ramos notifications@github.com wrote:

imo, I think we are ok with a random process on tech debt. If you see @kristinalim tends to the ui (/pages), @sauloperez cares about rails conventions (#send), Ive been on the engines front (cookies poc), @mkllnk has been taking good care of the dependabot issues, etc, etc. We just need to be reminded if there are any major techdebt items being ignored by everyone. BUT, if you guys think something needs to be done from your perspective, you are probably talking about something that deserves normal prioritization.

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or mute the thread.

I see.

  1. imo, dust is fine. This issue for example is ok: https://github.com/openfoodfoundation/openfoodnetwork/issues/287
  2. it's still applicable (I believe)
  3. we need to do it
  4. it's not the most important thing
  5. 4 years old

  6. We can create tech debt epics to be managed/planned.
    Apart from spree upgrade, Id list 2 more:

  7. epic above this: https://github.com/openfoodfoundation/openfoodnetwork/issues/2524
  8. epic above this: https://github.com/openfoodfoundation/openfoodnetwork/pull/2521
    we could also create epics for dependabot issues, and the forget-deface project.

Go ahead and create those 2 epics you suggest @luisramos0 I fully support that.
Dependabot I'm not sure we need because there's a max 5 simultaneous PRs from dependabot, they bump into Code Review directly and we follow them quite easily. (dependabots that jump out of this like #2619 would justify an epic though).

For the deface one we HAVE to call it Defacepocalypse :D

I find it quite satisfying to solve tech debt issues. I just have to hold myself back, otherwise nothing else would get done. ;-)

There is always the stuff that would be nice, but is not really needed. And then there is the stuff that will bite you later.

cost of bugs curve

so what, do we create and prioritize epics for the biggest tech-debt items?

yes, I think that's the objective @sauloperez
also, it gives more visibility/clarity over what's going on the tech debt world for non tech people.

@sauloperez, come volete, il mio amico, come volete 😁

I see this discussion is happening here now (vs discourse). Maybe I'm late to the game here and you've already solved this issue - so then ignore this.
I tested the US order cycle (I used Locovore Farms where the problem was). It works fine for me. One thing to note - both the 'ready for date' and the 'pick up details' MUST have content entered if the OC is to work. This happens on the Can instance too. And there is no error message (be good if a pop up happened to tell you to enter these details actually). It too me a while to figure it out. The OC creation just kind of stalls on you if one of these fields is left blank. (Be good to move that info into the user guide?) Here's my screenshot of the OC I created - and the shop front works...... (Again - sorry if you already solved this)
screenshot - test of us order cycle

Given what @tschumilas says above, is this no longer a problem @kirstenalarsen?

I now have this issue on the german instance (I think) - order cycle failing to be created for no other apparent reason.

I realise that there are follow up issues created to fix this properly, but for now i need the quickfix applied asap - have created issue in ofn-install here

which brings attention to the fact that I AGAIN have no dev around with sys admin access to the instance that has the problem. this is insane and is going to make me cry. @daniellemoorhead @myriamboure @sauloperez @mkllnk @luisramos0 @sigmundpetersen

Was this page helpful?
0 / 5 - 0 ratings