Kirill Izotov, Managing Director, IT, has provided comments on Mandeep's article, and I thought they deserved publishing as a separate post. Please find Kirill's original comment below. This is an interesting discussion, and I would like to offer some additions to the recommendations kindly published by Mandeep Grewal.
1. The batch size should be considered very carefully depending on the state of the system EIM is running on: "hot" or "cold", i.e. whether users and other processes are actively performing insert/update/delete operations ("hot") or the system is provided exclusively for EIM processing ("cold").
2. Look at the isolation level. Locking, for example on SQL Server, may cause lock waits and even chained lock waits, and insert/update/delete operations on certain systems may cause real operational problems if the batch size is larger than, in some cases, even 2000 records. For merge operations the batch size should usually be even smaller, in proportion to the number of tables involved in the processing.
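On SQL Server, a quick way to see whether your batch size is producing lock waits is to query the dynamic management views while an EIM task runs. A minimal sketch (standard SQL Server 2005+ DMVs; the ordering and filter are illustrative):

```sql
-- Show sessions currently waiting on locks, with the blocking session
-- and the SQL text being executed.
SELECT r.session_id,
       r.blocking_session_id,
       r.wait_type,
       r.wait_time AS wait_ms,
       t.text     AS running_sql
FROM sys.dm_exec_requests r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) t
WHERE r.wait_type LIKE 'LCK%'   -- lock waits only
ORDER BY r.wait_time DESC;
```

If chained waits appear (session A blocked by B, B blocked by C), that is usually the signal to shrink the batch.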
3. If the system is "cold", run the EIM processes in parallel (from EIM components residing on different Siebel servers if you need to run many tasks; this depends on the capacity of your EIM component's Siebel server) and turn on row-level locks, disabling all higher-level locking. This helps avoid processes waiting on each other for lock releases. Run batches as large as you see fit, knowing your system (start with 30K rows and go up and down in increments of 10K, then 2K, until the optimal batch size is found). Keep in mind that for most systems, to avoid bottlenecks on I/O, CPU and memory, 20K-40K batches may be optimal. Hint: the bottleneck usually resides with I/O, so plan in advance with your DBAs which file groups/files you place the EIM tables and the EIM table indexes in, and place those onto separate sets of spindles (almost irrelevant in the case of solid-state drives or advanced storage systems). Partition the EIM tables based on batch numbers if you need to perform a large load; create partitions of 200-300K rows. Implement alerting on the DB server for any lock waits longer than 60 seconds; it comes in handy.
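The row-level-locking and partitioning advice above can be sketched on SQL Server roughly as follows (the table name EIM_ACCOUNT, the partition boundaries, and the column type are illustrative; LOCK_ESCALATION requires SQL Server 2008+; IF_ROW_BATCH_NUM is the standard EIM batch-number column):

```sql
-- Force row-level locking on an EIM table:
-- disallow page locks on its indexes and disable lock escalation.
ALTER INDEX ALL ON dbo.EIM_ACCOUNT SET (ALLOW_PAGE_LOCKS = OFF);
ALTER TABLE dbo.EIM_ACCOUNT SET (LOCK_ESCALATION = DISABLE);

-- Partition the EIM table by batch number, sizing each partition
-- to hold roughly 200-300K rows' worth of batches.
CREATE PARTITION FUNCTION pf_eim_batch (numeric(15,0))
    AS RANGE LEFT FOR VALUES (100, 200, 300);  -- batch-number boundaries, illustrative
```

The boundary values should be chosen from your actual batch numbering so that each partition stays in the 200-300K row range mentioned above.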
4. It is worth putting batches into a minimum number of processing sequences; that eliminates the need for the EIM task to load the dictionary from the repository for each batch, since it is done once for all batches in a process.
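A minimal .ifb sketch of this: one process section naming a batch range, so the dictionary load happens once for all thirty batches instead of once per task (login values, section name and table names are illustrative):

```
[Siebel Interface Manager]
LOGIN USER     = SADMIN
LOGIN PASSWORD = *****
PROCESS        = Import Accounts

[Import Accounts]
TYPE  = IMPORT
BATCH = 1-30            ; single range instead of 30 separate tasks
TABLE = EIM_ACCOUNT
ONLY BASE TABLES = S_ORG_EXT
```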
5. "Regular maintenance" is a general rule for any table in a DB that is undergoing changes. In the case of EIM, when any significant number of records is processed, defragmentation and statistics updates should be performed on the EIM table(s) right after populating them and prior to processing, and possibly in parallel with processing (depending on how large the load is and the degree of index fragmentation you observe during routine checks while processing). This holds true for inserts, updates, deletes, exports and merges. For DB2 you can enjoy the UPDATE STATISTICS parameter in the .ifb file, but defragmentation is still needed unless you are running on some very advanced storage system like EMC Symmetrix DMX.
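On SQL Server the post-populate maintenance step might look like this sketch (the table name is illustrative; choose REORGANIZE instead of REBUILD for light fragmentation):

```sql
-- Defragment all indexes on the EIM table after loading it,
-- then refresh optimizer statistics before starting the EIM task.
ALTER INDEX ALL ON dbo.EIM_ACCOUNT REBUILD;
UPDATE STATISTICS dbo.EIM_ACCOUNT WITH FULLSCAN;
```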
6. When working on a "cold" system (when the system is dedicated to the load), interesting results in terms of speeding up large loads (3 to 15,000 times, per my experience) can be obtained if you drop all indexes from the target base tables (leave only the clustered and user key/unique indexes on the base tables), and drop ALL indexes on the EIM table, then create a clustered index with a "covering" key based on the columns of the indexes you see in the trace, minus the columns you are not populating in the EIM table, plus ROW_ID for uniqueness (yes, this contradicts best practices for clustered indexes, but in this case it works with a multi-column unique key like a charm). Of course, if you use this approach, setting the index hints and essential index hints parameters to FALSE in the .ifb file is necessary (in some cases you will still see hints in traces; use trace flag 8602 on SQL Server to disable hints at the DB server level, and I cannot recall the Oracle equivalent off the top of my head). Once you are done with EIM processing, reapply all indexes as they were defined before you started the exercise, and remove the trace flag.
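The .ifb side of this might be sketched as follows (parameter names per the Siebel EIM documentation; section and table names are illustrative):

```
[Import Accounts]
TYPE  = IMPORT
BATCH = 1-30
TABLE = EIM_ACCOUNT
USING INDEX HINTS = FALSE           ; suppress generated index hints
USING ESSENTIAL INDEX HINTS = FALSE ; suppress even the "essential" hints
```

If hints still show up in traces, `DBCC TRACEON (8602, -1)` makes SQL Server ignore index hints server-wide; remember `DBCC TRACEOFF (8602, -1)` once the load is finished, as the item above says.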
7. After you are done with the initial planning/testing/tuning of a large load, disable all event logs (set them to 0 or 1 depending on your version of Siebel) and traces on the application servers' EIM components. Not a large win, but when we are talking of hundreds of millions of records to process, even a small percentage may give you extra time for your precious rest. The same applies to dropping triggers, disabling transaction logging, etc.
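At the srvrmgr level, muting an EIM component's logging might look like this sketch (event-type aliases vary by Siebel version, so check what `list evtloglvl for comp EIM` reports on your system before changing anything):

```
srvrmgr> list evtloglvl for comp EIM
srvrmgr> change evtloglvl GenericLog=0 for comp EIM
srvrmgr> change evtloglvl SQLParseAndExecute=0 for comp EIM
```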