This document contains three sections:
- other changes, based on topics discussed separately
- changes made in reaction to specific code review topics, listed below
- code review topics which have not yet been implemented

>>>>> OTHER changes based on discussion <<<<<<<<

Other changes:
- Added a role column to the import entry table. It is set when an import entry is created and is then used to create the OBContext when processing the entry (see the first sketch below).
- Replaced a call to list by a call to scroll, and set queue limits to prevent tasks from getting overly long lists of entries to handle, which could result in OOM (the scroll change is sketched after the implemented code review comments below; the queue limit shows up in the executor sketch below).
- Made several properties configurable through Openbravo.properties [TODO doc link].
- Moved to a global executor service with a global setting for the number of threads (see the second sketch below).
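To make the role-based context creation concrete, here is a minimal sketch of how the stored role could be used when an entry is picked up for processing. It assumes Openbravo's OBContext.setOBContext(userId, roleId, clientId, orgId) overload, and accessor names such as getRole() on the generated import entry entity are illustrative, not confirmed.

import org.openbravo.dal.core.OBContext;
import org.openbravo.model.ad.access.User;

// Illustrative only: 'ImportEntry' stands for the generated entity of the
// C_ImportEntry table, and getRole() for the accessor of the new role column.
public class ImportEntryContextUtil {

  public static void setContextForEntry(ImportEntry entry) {
    final User user = entry.getCreatedBy();
    // Process the entry with the same role the originating user had when
    // the entry was created, instead of the user's default role.
    OBContext.setOBContext(user.getId(), entry.getRole().getId(),
        entry.getClient().getId(), entry.getOrganization().getId());
  }
}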
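The global executor service change can be sketched as follows. The property keys used here are placeholders (the real keys may differ), and the bounded queue is what implements the queue limit mentioned above.

import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

import org.openbravo.base.session.OBPropertiesProvider;

// One executor service shared by all ImportEntryProcessor instances, sized
// through Openbravo.properties. The property keys are placeholders.
public class GlobalImportExecutor {
  private static ExecutorService executorService;

  public static synchronized ExecutorService getInstance() {
    if (executorService == null) {
      final Properties props = OBPropertiesProvider.getInstance().getOpenbravoProperties();
      final int threads = Integer.parseInt(props.getProperty("import.number.of.threads", "10"));
      final int maxQueued = Integer.parseInt(props.getProperty("import.max.queue.size", "1000"));
      // The bounded queue caps the backlog of queued tasks; when it is full,
      // the default AbortPolicy rejects new submissions instead of letting
      // the queue grow without limit (and eventually run out of memory).
      executorService = new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS,
          new LinkedBlockingQueue<Runnable>(maxQueued));
    }
    return executorService;
  }
}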
>>>>> IMPLEMENTED CODE REVIEW COMMENTS <<<<<<<<

* In the C_ImportEntry table, there are some columns which maybe are not needed:
- Why is the Stored column different from the Creation Date column?
- Why is the Imported column different from the Updated column?
Just to think about it. Maybe they are indeed different, and therefore they are needed, but the smaller this table is, the better for the general performance of the solution.
MT>> Removed the stored column, using created now. Imported can differ from updated, as updated can be accidentally set by changing something in the window and pressing save, so I kept that one. Also, the idea is to keep the importentry table small by moving data to an archive table.

* There is a list reference which contains the types of data that we handle. However, I feel like all of the entries there should go inside the Web POS module, and currently only the cash up one is there. Is there a reason for them to be in the ERP?
MT>> I checked, and I was forced to do this because the search key of the list reference needs the module prefix, so I can only keep them as 'simple' values if I put them in core. So I left this as it is for now.

* The ImportEntryProcessRunnable.run() method does a rollbackAndClose() call inside the loop which goes through all the records which are pending to be processed. Is this really needed? It should have a performance impact, and afaik our tasks already manage the transaction, or at least they are responsible for doing it (so they commit every record they process, or they roll back every time there is a problem). Maybe we should just delegate the management of the transaction to the underlying synchronization tasks? Or maybe there was some hidden problem which I'm missing...
MT>> Normally there should not be a live session anymore, and in that case rollbackAndClose does not do anything. I changed the code a bit to really log a debug message and close only when necessary (see the first sketch after this section). This should eventually be changed to log a warning, which can only be done once the rest of the code has been changed to really close the session.

* Also, in this table basically there is just one index, which has four columns (importstatus, stored, typeofdata and ad_org_id). However, the main query which seems to read this table is the one in the run() method of the ImportEntryManagerThread class, which afaics only filters by importstatus and orders by stored. Maybe we could remove the last two columns from the index? This would be nice as it would make it smaller and, I guess, more efficient, because it can be more easily cached by the database/operating system.
MT>> Done

* Currently, in the old sync tasks, in case we see a duplicate, we do nothing, but we leave a message in the Tomcat log. I think it makes sense to keep doing this, mainly because it is sometimes useful when reviewing a customer's instance.
MT>> Done

* In the importEntryManager, in the ImportStatistics there is a log() method which does System.err.println. This should be replaced by log.error. By the way, it's really a very neat idea to have these statistics!
MT>> Removed the System.err call.

* Speaking of the ImportEntryManagerThread.run() method, the .list() in the query should be replaced by a scroll. The number of records pending to be synced should be small most of the time, but if there is a sudden rush of data sent to the backend then this could grow quite a bit and .list() won't work (see the second sketch after this section).
MT>> Done

* One thing I'm not sure about is how we handle errors in the ImportEntries table which happen due to weird infrastructure problems. I guess these shouldn't happen in general, but if they do, what should the user do? Is there some kind of way for them to force a retry of the record, or something?
MT>> Will add a 'process now' function. There is also a description at the end of this section: http://wiki.openbravo.com/wiki/Retail:Web_POS_User_Guide#Backend_Data_Processing:_Data_Import_Entries

* In some cases, it's possible that the Web POS sends an array with several orders instead of just one, for example. Just for me to understand: I think what we are doing is saving this record as a single entry in the ImportEntries table, and then we delegate to the OrderLoader to manage this array as it always has. I think this is the case, is it?
MT>> Added a test case for this with messages of 10 MB; this works.

* In the ImportEntryProcessor class, in the handleImportEntry method there is a call to initialize(). This method is synchronized, and is called every time a new importEntry is going to be processed. I understand that the main reason it's synchronized is that it's used to ensure that the executorService is created and running. Maybe we could move this logic to the constructor of the ImportEntryProcessor, or to a method which is called when the Processor is created; this would allow us to avoid this initialize call every time handleImportEntry is called, which should improve performance.
MT>> This code has been removed, as we now have a global executor service.
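For the rollbackAndClose() point, here is a minimal sketch of the "log a debug message and close only when necessary" behavior, under the assumption that SessionHandler.isSessionHandlerPresent() reports whether the current thread still has a live DAL session.

import org.apache.log4j.Logger;

import org.openbravo.dal.core.SessionHandler;
import org.openbravo.dal.service.OBDal;

// Sketch of the loop cleanup inside ImportEntryProcessRunnable: only touch
// the session when one was actually left open by the import task.
public class SessionCleanupSketch {
  private static final Logger log = Logger.getLogger(SessionCleanupSketch.class);

  public static void cleanUpIfNeeded() {
    if (SessionHandler.isSessionHandlerPresent()) {
      // Import tasks are expected to commit/rollback themselves; a live
      // session here points at a code path that should eventually be fixed,
      // hence the (for now) debug-level message.
      log.debug("Session still open after processing an import entry, closing it");
      OBDal.getInstance().rollbackAndClose();
    }
  }
}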
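The .list()-to-scroll change in ImportEntryManagerThread.run() can be sketched with plain Hibernate as below. The HQL, entity name, and status value are illustrative, and handleImportEntry() is a hypothetical stand-in for whatever hands the entry to its processor.

import org.hibernate.ScrollMode;
import org.hibernate.ScrollableResults;
import org.hibernate.Session;

import org.openbravo.dal.service.OBDal;

// Sketch of scrolling over pending entries instead of materializing them all
// at once with .list(). Entity and property names are illustrative.
public class PendingEntriesScroller {

  public void processPendingEntries() {
    final Session session = OBDal.getInstance().getSession();
    final ScrollableResults entries = session
        .createQuery("from ImportEntry where importStatus = 'Initial' order by created")
        .scroll(ScrollMode.FORWARD_ONLY);
    try {
      int handled = 0;
      while (entries.next()) {
        final ImportEntry entry = (ImportEntry) entries.get(0);
        handleImportEntry(entry);
        if (++handled % 100 == 0) {
          // Prevent scrolled entities from piling up in the first-level cache.
          session.clear();
        }
      }
    } finally {
      entries.close();
    }
  }

  private void handleImportEntry(ImportEntry entry) {
    // hypothetical: queue the entry on the global executor service (omitted)
  }
}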
>>>>> NOT IMPLEMENTED CODE REVIEW COMMENTS <<<<<<<<

* Two topics about the Archiving process:
- Why does it exist as a separate process? Right now what we do is:
* Save the record in the ImportEntry table
* Read the record in the ImportEntry table and process it. Mark it as processed in the table whenever it's done.
* Then, in a separate process, read the processed records, delete them from the main table, and save them in the archiving table.
Why do we have this separate process? Couldn't we just save the record in the archiving table when we have finished processing it, and delete it right away? This would save us both an update of the main table and a query on the main table, and we could also get rid of the archiving process infrastructure, unless that's needed for a different reason.
MT>> The reason to have a separate process is to make the processing of the import as lightweight as possible; as the archive table can get very big, it is not certain how fast creating records in there is. Hence the separate process. Anyway, if we finally decide that we need the process, then maybe we could change it a bit to move records in a more efficient way.
- In the OrderGrouping we recently made a change to create invoices in "bulk" by doing an INSERT...SELECT statement. I think what we do in the archiving process is basically query for records, then go one by one creating the corresponding record in the archiving table and deleting the original one, and this is slow. As this is supposed to be basically a "non-smart" process which just moves data from one table to the other, maybe we could just do this simple INSERT...SELECT plus a bulk DELETE FROM statement to clear the data afterwards? This should be more efficient in general (see the sketch below).
MT>> The process is slightly smarter than that, as it allows modules to add columns to the import entry table; for POS retail this is, for example, the posterminal column. Therefore an INSERT...SELECT is not directly trivial. As the archive process is something separate anyway, I would not change it; the current approach should perform well enough for periodic running.
MT>> The idea is to have a separate scheduled process which does the archiving. This will probably be taken care of in Q4.
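If the bulk-move approach were ever taken, it could look like the sketch below: one native INSERT...SELECT followed by a bulk DELETE. The table and column names are hard-coded assumptions for illustration; as MT notes above, a real version would have to build the column list dynamically, because modules can add columns (e.g. the POS terminal).

import org.hibernate.Session;

import org.openbravo.dal.service.OBDal;

// Sketch of moving processed entries to the archive table in two bulk
// statements. Table and column names are illustrative, not confirmed.
public class BulkArchiveSketch {

  public void archiveProcessedEntries() {
    final Session session = OBDal.getInstance().getSession();
    // Copy all processed entries into the archive table in one statement.
    session.createSQLQuery(
        "insert into c_import_entry_archive (c_import_entry_archive_id, ad_client_id,"
            + " ad_org_id, created, createdby, typeofdata, json_info)"
            + " select c_import_entry_id, ad_client_id, ad_org_id, created, createdby,"
            + " typeofdata, json_info"
            + " from c_import_entry where importstatus = 'Processed'")
        .executeUpdate();
    // Then remove them from the main table in bulk, in the same transaction.
    session.createSQLQuery(
        "delete from c_import_entry where importstatus = 'Processed'")
        .executeUpdate();
  }
}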