Openbravo Issue Tracking System - Retail Modules
View Issue Details
0041542Retail ModulesWeb POSpublic2019-08-05 10:332022-05-02 14:41
aaroncalero 
Retail 
normalmajorN/A
closedno change required 
5
 
 
No
0041542: Full Master Data Load can be improved by inserting data in batches
Full Master Data Load can be improved by inserting data in batches
Check the code of the OB.Dal.insertData function (in ob-dal.js). In a full refresh (incremental = false) the insert statements are executed one per record.
In environments with big amounts of data, the can mean queueing thousands of insert statements, which can slow the system.
SQLite (the db engine behind WebSQL) supports batched inserts (a single insert statement inserting several records). This way of inserting data is faster than doing single inserts, while also reducing the amount of statements queued at the same time.

[1] https://medium.com/@JasonWyatt/squeezing-performance-from-sqlite-insertions-971aff98eef2 [^]


Another improvement that can be added (not implemented in the current diff):
Create all indexes after the full refresh has been done. In SQLite, it is faster to do inserts then create the index, than creating the index and then inserting data [2].
In a test environment with 250K products and 250K productprices (using multi price lists) login time can be reduced by 20-25 seconds removing indexes (from 300 seconds to around 280), and it only takes 2 seconds to create all indexes them at the end.

[2] https://stackoverflow.com/questions/15778716/sqlite-insert-speed-slows-as-number-of-records-increases-due-to-an-index [^]
Performance
diff Issue41542.diff (9,559) 2019-08-05 10:48
https://issues.openbravo.com/file_download.php?file_id=13211&type=bug
diff Issue41542_part2.diff (2,891) 2019-08-09 13:21
https://issues.openbravo.com/file_download.php?file_id=13221&type=bug
log TimeMeasurement_split.log (8,731) 2019-08-09 13:45
https://issues.openbravo.com/file_download.php?file_id=13223&type=bug
Issue History
2019-08-05 10:33aaroncaleroNew Issue
2019-08-05 10:33aaroncaleroAssigned To => Retail
2019-08-05 10:33aaroncaleroTriggers an Emergency Pack => No
2019-08-05 10:48aaroncaleroFile Added: Issue41542.diff
2019-08-05 10:49aaroncaleroNote Added: 0113807
2019-08-06 08:48aaroncaleroNote Added: 0113819
2019-08-06 11:04aaroncaleroProposed Solution updated
2019-08-06 11:07aaroncaleroProposed Solution updated
2019-08-07 07:53guillermogilTag Attached: Performance
2019-08-09 13:21aaroncaleroFile Added: Issue41542_part2.diff
2019-08-09 13:22aaroncaleroNote Added: 0113928
2019-08-09 13:39aaroncaleroNote Added: 0113929
2019-08-09 13:40aaroncaleroNote Edited: 0113929bug_revision_view_page.php?bugnote_id=0113929#r19292
2019-08-09 13:45aaroncaleroNote Added: 0113930
2019-08-09 13:45aaroncaleroFile Added: TimeMeasurement_split.log
2022-05-02 14:41aaroncaleroNote Added: 0136886
2022-05-02 14:41aaroncaleroStatusnew => closed
2022-05-02 14:41aaroncaleroResolutionopen => no change required

Notes
(0113807)
aaroncalero   
2019-08-05 10:49   
Attached a diff with a possible implementation (based on code from 19Q1)
(0113819)
aaroncalero   
2019-08-06 08:48   
Several numbers:
Tried loading 250K products in batches of 40K (7 batches).

Load times with the old code (in milliseconds):
##### TRY 1 ####
Master Data Insert: Product: 6468.073974609375ms
Master Data Insert: Product: 6509.407958984375ms
Master Data Insert: Product: 6336.663818359375ms
Master Data Insert: Product: 6190.18798828125ms
Master Data Insert: Product: 6150.75390625ms
Master Data Insert: Product: 6520.14111328125ms
Master Data Insert: Product: 1866.98095703125ms
Total time: 40042.2097167969ms

##### TRY 2 ####
Master Data Insert: Product: 6645.43408203125ms
Master Data Insert: Product: 6691.505859375ms
Master Data Insert: Product: 6575.923095703125ms
Master Data Insert: Product: 6298.0087890625ms
Master Data Insert: Product: 6501.262939453125ms
Master Data Insert: Product: 6529.31298828125ms
Master Data Insert: Product: 1810.626953125ms
Total time: 41052.0747070313ms

##### TRY 3 ####
Master Data Insert: Product: 6645.43408203125ms
Master Data Insert: Product: 6691.505859375ms
Master Data Insert: Product: 6575.923095703125ms
Master Data Insert: Product: 6298.0087890625ms
Master Data Insert: Product: 6501.262939453125ms
Master Data Insert: Product: 6529.31298828125ms
Master Data Insert: Product: 1810.626953125ms
Total time: 41052.0747070313ms

Load times with the new code (in milliseconds):
##### TRY 1 ####
Master Data Insert: Product: 5413.219970703125ms
Master Data Insert: Product: 5502.469970703125ms
Master Data Insert: Product: 5310.0419921875ms
Master Data Insert: Product: 5191.815185546875ms
Master Data Insert: Product: 5183.9140625ms
Master Data Insert: Product: 6352.371337890625ms
Master Data Insert: Product: 1519.47900390625ms
Total time: 34473.3115234375ms

##### TRY 2 ####
Master Data Insert: Product: 5418.7470703125ms
Master Data Insert: Product: 5495.56591796875ms
Master Data Insert: Product: 5040.642822265625ms
Master Data Insert: Product: 5010.64599609375ms
Master Data Insert: Product: 5376.614990234375ms
Master Data Insert: Product: 5321.185302734375ms
Master Data Insert: Product: 1519.27880859375ms
Total time: 33182.6809082031ms

##### TRY 3 ####
Master Data Insert: Product: 5736.070068359375ms
Master Data Insert: Product: 5904.421875ms
Master Data Insert: Product: 5103.2158203125ms
Master Data Insert: Product: 5036.76513671875ms
Master Data Insert: Product: 5048.485107421875ms
Master Data Insert: Product: 5121.67822265625ms
Master Data Insert: Product: 1452.478759765625ms
Total time: 33403.1149902344ms
(0113928)
aaroncalero   
2019-08-09 13:22   
Attached a second diff with a proposal to create indexes only after data has been inserted in a full refresh. Currently this approach needs to drop indexes and recreate them; if possible, this should be refactored to only create the indexes once.
(0113929)
aaroncalero   
2019-08-09 13:39   
(edited on: 2019-08-09 13:40)
Measurements of the login time improvements (old code vs first patch):
### Old Code, Try 1 ###
Time spent inserting data (all models): 50655.7094726563ms
Total login time: 128.538 seconds

### Old Code, Try 2 ###
Time spent inserting data (all models): 53410.6020507813ms
Total login time: 129.466 seconds

### NEW Code, Try 1 ###
Time spent inserting data (all models): 44221.5944824219ms
Total login time: 120.93 seconds
Insert time improvement ~13% vs Old Code Try1
Login time improvement ~6% vs Old Code Try1

### NEW Code, Try 2 ###
Time spent inserting data (all models): 44153.3312988281ms
Total login time: 119.945 seconds
Insert time improvement ~ 17% vs Old Code Try2
Login time improvement ~7% vs Old Code Try2


I haven't been able to measure improved times with the "delayed index" approach testing only with the Product model. However in earlier tests with 250K records in ProductPrice, insert times with normal indexes went increasing from 1.5s on the first batch to 3s on the last batch, while it was staying in a stable 800ms per batch with the delayed indexes approach.

(0113930)
aaroncalero   
2019-08-09 13:45   
Attached a log file with the times measured in the previous note, split by models.
(0136886)
aaroncalero   
2022-05-02 14:41   
WebPOS uses IndexedDB for web storage now, so this FR is no longer valid.