View Revisions: Issue #36033
Summary 0036033: Prevent duplicate/parallel processing of the same message from webpos if it gets retried due to long processing times
Revision 2017-06-15 14:23 by mtaal
Description
THE PROBLEM TODAY

It seems that today, in the customer environment, a process that continuously creates records in a specific table was run. This process significantly slowed down an EventHandler that the integrations team created and that is executed every time an order is created. As a result, order creation messages suddenly took a very long time to execute (more than 30 seconds in some cases).

The Web POS is running in synchronized mode. When an order is completed, it is sent to the backend, and the client waits for the backend to confirm that the order was processed correctly before it closes the popup and lets the cashier continue working. There is also a retry mechanism that sends the message again if the defined timeout is reached.

This means that if the OrderLoader takes a very long time to execute for some reason, the client may retry while the first request is still being processed, and in that case we may get a duplicate primary key error.

It seems this is exactly what happened today in BUT. Because the OrderLoader took so long, the Web POS sent a retry, and that retry was executed while the initial OrderLoader call was still running, so it eventually failed with a primary key error.
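
The overlap described above can be illustrated with a small timeline calculation. This is only a sketch: the class name, the 20-second timeout and the retry count are hypothetical values chosen for illustration, not the actual Web POS configuration. It lists the retries that would be sent while the first request is still being processed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not Openbravo code.
class RetryTimeline {

    /**
     * Returns the send times (in ms after the first request) of every retry
     * that is sent while the first request is still being processed.
     * Assumes the client retries once per elapsed timeout, up to maxRetries.
     */
    static List<Long> overlappingRetries(long processingMillis, long timeoutMillis, int maxRetries) {
        List<Long> overlapping = new ArrayList<>();
        for (int attempt = 1; attempt <= maxRetries; attempt++) {
            long sentAt = attempt * timeoutMillis;
            if (sentAt < processingMillis) {
                // The backend has not finished yet, so this retry runs in parallel.
                overlapping.add(sentAt);
            }
        }
        return overlapping;
    }

    public static void main(String[] args) {
        // A slow OrderLoader run (30 s) against an assumed 20 s client timeout.
        System.out.println(overlappingRetries(30_000, 20_000, 3));
        // A normal run (5 s) never overlaps with a retry.
        System.out.println(overlappingRetries(5_000, 20_000, 3));
    }
}
```

With these assumed numbers, any processing time longer than the timeout produces at least one retry that runs in parallel with the original request.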

A POSSIBLE SOLUTION

Initially, I thought that the fix for the following issue:

https://issues.openbravo.com/view.php?id=35289

would solve the problem. However, I now think it would not help in this case. The reason is that we create the ImportEntryArchive record only once we have confirmation that the order was created correctly. This makes sense, because Synchronized Mode requires that we only tell the client that everything is fine when it really is. In this particular case, though, it means that the second request arrives while the ImportEntryArchive record still does not exist, so the second request triggers a duplicate call to OrderLoader.
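
The gap can be replayed step by step in a small simulation. This is a sketch under stated assumptions, not the real implementation: the class and method names are hypothetical, and two in-memory sets stand in for the ImportEntryArchive records and for the order table with its primary key.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration, not Openbravo code.
class ArchiveCheckSketch {
    // Stand-in for ImportEntryArchive: written only after successful processing.
    private final Set<String> archive = new HashSet<>();
    // Stand-in for the order table: one row per message id (primary key).
    private final Set<String> orders = new HashSet<>();

    /** The duplicate check: a message is skipped only if it is already archived. */
    boolean shouldProcess(String messageId) {
        return !archive.contains(messageId);
    }

    /** End of processing: insert the order, then write the archive record. */
    void finishProcessing(String messageId) {
        if (!orders.add(messageId)) {
            throw new IllegalStateException("duplicate primary key: " + messageId);
        }
        archive.add(messageId);
    }

    /** Replays the interleaving from the description in a fixed order. */
    static String replay() {
        ArchiveCheckSketch backend = new ArchiveCheckSketch();
        boolean firstAccepted = backend.shouldProcess("order-1");
        // The retry arrives before the first request has finished,
        // so no archive record exists yet and the check passes again.
        boolean retryAccepted = backend.shouldProcess("order-1");
        if (!(firstAccepted && retryAccepted)) {
            return "retry was filtered out";
        }
        backend.finishProcessing("order-1"); // first request completes
        try {
            backend.finishProcessing("order-1"); // retry completes
            return "no error";
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(replay());
    }
}
```

In this simulation both requests pass the archive check because the record is only written at the very end, which is the window the description refers to.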

Revision 2017-06-08 13:46 by mtaal
Description: same text as the 2017-06-15 revision above.


