Friday, April 29, 2011

Slow bulk insert for table with many indexes

I am trying to insert millions of records into a table that has more than 20 indexes.

In the last run it took more than 4 hours per 100,000 rows, and the query was cancelled after 3½ days...

Do you have any suggestions about how to speed this up?

(I suspect the many indexes are the cause. If you agree, how can I automatically drop the indexes before the operation and then recreate the same indexes afterwards?)

Extra info:

  • The space used by the indexes is about 4 times the space used by the data alone.
  • The inserts are wrapped in a transaction per 100,000 rows.
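
For reference, the data-versus-index space figure above can be checked with the sp_spaceused system procedure, which reports the data and index sizes for a single table. The table name Fact below is just a placeholder for your own table:

    -- Report reserved, data, and index sizes for one table.
    -- 'Fact' is a placeholder table name.
    EXEC sp_spaceused 'Fact';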


Update on status:

The accepted answer helped me make it much faster.

From Stack Overflow:
  • You can disable and enable the indexes. Note that disabling them can have unwanted side effects (such as duplicate primary keys or violated unique constraints) which will only be discovered when the indexes are re-enabled.

    --Disable Index
    ALTER INDEX [IXYourIndex] ON YourTable DISABLE
    GO
    
    --Enable Index
    ALTER INDEX [IXYourIndex] ON YourTable REBUILD
    GO
    
    Richard : Probably want to not disable the PK index (and definitely not if clustered).
    Lucero : Yes, you're right. Thanks for pointing that out.
    Ole Lynge : @Lucero: Thanks for this answer. A suggestion: Maybe update your answer according to Richard's comment?
  • This sounds like a data warehouse operation. It would be normal to drop the indexes before the insert and rebuild them afterwards.

    When you rebuild the indexes, build the clustered index first, and conversely drop it last. They should all have fillfactor 100%.

    Code should be something like this

    if object_id('IndexList') is not null drop table IndexList
    select name into IndexList from dbo.sysindexes where id = object_id('Fact')
    
    if exists (select name from IndexList where name = 'id1') drop index Fact.id1
    if exists (select name from IndexList where name = 'id2') drop index Fact.id2
    if exists (select name from IndexList where name = 'id3') drop index Fact.id3
    ...
    BIG INSERT
    
    RECREATE THE INDEXES
    
  • As noted in another answer, disabling the indexes is a very good start.

    4 hours per 100,000 rows [...] The inserts are wrapped in a transaction per 100,000 rows.

    You should look at reducing that number: the server has to maintain a huge amount of state while a transaction is open (so that it can be rolled back), which, along with the indexes, makes adding data very hard work.

    Why not wrap each insert statement in its own transaction?

    Also look at the nature of the SQL you are using: are you adding one row per statement (and per network round trip), or many rows at once?

    Ole Lynge : Thanks for the answer, and for the additional questions. The bulk insert is happening with one single call to a stored procedure.
    Ole Lynge : I think a link to the other answer using just href="#751062" avoids the reload of the page.
    Richard : @Ole: Thanks for the link idea (obvious in retrospect :-)).
    Brian : Wrapping each insert statement in a transaction is much slower than doing a bulk insert.
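
Putting the answers together, a minimal sketch of the whole operation might look like the following. This is an assumption-laden outline, not the exact script used: it assumes SQL Server 2005 or later (using the sys.indexes catalog view rather than dbo.sysindexes), dbo.Fact and the insert step are placeholders, and the clustered index is deliberately left enabled, since disabling it would make the table inaccessible:

    -- Disable every nonclustered index on the target table.
    -- (The clustered index must stay enabled, or inserts will fail.)
    DECLARE @sql nvarchar(max);
    SET @sql = N'';
    SELECT @sql = @sql + N'ALTER INDEX ' + QUOTENAME(i.name)
                + N' ON dbo.Fact DISABLE;' + CHAR(10)
    FROM sys.indexes AS i
    WHERE i.object_id = OBJECT_ID(N'dbo.Fact')
      AND i.type_desc = N'NONCLUSTERED';
    EXEC sp_executesql @sql;
    
    -- ... BIG INSERT here, in moderately sized batches ...
    
    -- Rebuild all indexes on the table, re-enabling the disabled ones.
    ALTER INDEX ALL ON dbo.Fact REBUILD WITH (FILLFACTOR = 100);

Rebuilding with ALTER INDEX ALL ... REBUILD re-enables the disabled nonclustered indexes and rebuilds the clustered index as well, which matches the advice above to rebuild everything at 100% fillfactor.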
