I am trying to insert millions of records into a table that has more than 20 indexes.
In the last run it took more than 4 hours per 100,000 rows, and the query was cancelled after 3½ days...
Do you have any suggestions about how to speed this up?
(I suspect the many indexes to be the cause. If you also think so, how can I automatically drop the indexes before the operation, and then recreate the same indexes afterwards?)
Extra info:
- The space used by the indexes is about 4 times the space used by the data alone.
- The inserts are wrapped in one transaction per 100,000 rows.
Update on status:
The accepted answer helped me make it much faster.
-
You can disable and enable the indexes. Note that disabling them can have unwanted side effects (such as duplicate primary keys or violated unique constraints) which will only be discovered when re-enabling the indexes.
```sql
-- Disable index
ALTER INDEX [IXYourIndex] ON YourTable DISABLE
GO

-- Enable index
ALTER INDEX [IXYourIndex] ON YourTable REBUILD
GO
```
Richard: Probably want to not disable the PK index (and definitely not if clustered).
Lucero: Yes, you're right. Thanks for pointing that out.
Ole Lynge: @Lucero: Thanks for this answer. A suggestion: maybe update your answer according to Richard's comment?
-
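Building on Richard's comment, here is a sketch of how the disabling could be automated: it generates `ALTER INDEX ... DISABLE` statements from `sys.indexes` for the nonclustered indexes only, so the clustered/PK index stays intact. `YourTable` is a placeholder name, and this assumes SQL Server 2008 or later.

```sql
-- Generate and run DISABLE statements for every nonclustered index
-- on YourTable (placeholder name); the clustered index is left alone.
DECLARE @sql nvarchar(max) = N'';

SELECT @sql = @sql
    + N'ALTER INDEX ' + QUOTENAME(name)
    + N' ON YourTable DISABLE;' + CHAR(10)
FROM sys.indexes
WHERE object_id = OBJECT_ID('YourTable')
  AND type_desc = 'NONCLUSTERED';

EXEC sp_executesql @sql;

-- ... bulk insert here ...

-- Afterwards, a single statement re-enables everything:
ALTER INDEX ALL ON YourTable REBUILD;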
This sounds like a data warehouse operation. It would be normal to drop the indexes before the insert and rebuild them afterwards.
When you rebuild the indexes, build the clustered index first, and conversely drop it last. They should all have fillfactor 100%.
Code should be something like this
```sql
if object_id('IndexList') is not null
    drop table IndexList

select name
into IndexList
from dbo.sysindexes
where id = object_id('Fact')

if exists (select name from IndexList where name = 'id1')
    drop index Fact.id1
if exists (select name from IndexList where name = 'id2')
    drop index Fact.id2
if exists (select name from IndexList where name = 'id3')
    drop index Fact.id3

-- ...

-- BIG INSERT

-- RECREATE THE INDEXES
```
-
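The recreate step might look like the following sketch, which keeps the ordering advice from above: the clustered index is built first, and every index uses fillfactor 100. The index names match the example, but the column names (`FactKey`, `DimKey1`, `DimKey2`) are placeholders.

```sql
-- Clustered index first, so the nonclustered indexes are built on top of it
CREATE CLUSTERED INDEX id1 ON Fact (FactKey) WITH (FILLFACTOR = 100);

CREATE NONCLUSTERED INDEX id2 ON Fact (DimKey1) WITH (FILLFACTOR = 100);
CREATE NONCLUSTERED INDEX id3 ON Fact (DimKey2) WITH (FILLFACTOR = 100);
```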
As noted in another answer, disabling the indexes will be a very good start.
> 4 hours per 100,000 rows [...] The inserts are wrapped in a transaction per 100,000 rows.
You should look at reducing that number. The server has to maintain a huge amount of state while inside a transaction (so that it can be rolled back), and this, along with the indexes, makes adding data very hard work.
Why not wrap each insert statement in its own transaction?
Also look at the nature of the SQL you are using, are you adding one row per statement (and network roundtrip), or adding many?
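One way to keep the per-transaction state small is to commit in batches, moving rows from a staging table in chunks. The sketch below assumes a hypothetical staging table `Staging` whose columns match `Fact`; the batch size of 10,000 is illustrative.

```sql
-- Move rows from Staging into Fact in batches of 10,000, committing
-- each batch so the server only tracks one batch of rollback state.
DECLARE @rows int = 1;

WHILE @rows > 0
BEGIN
    BEGIN TRANSACTION;

    DELETE TOP (10000) FROM Staging
    OUTPUT deleted.* INTO Fact;

    SET @rows = @@ROWCOUNT;

    COMMIT TRANSACTION;
END
```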
Ole Lynge: Thanks for the answer, and for the additional questions. The bulk insert is happening with one single call to a stored procedure.
Ole Lynge: I think a link to the other answer using just href="#751062" avoids the reload of the page.
Richard: @Ole: Thanks for the idea on the link (obvious in retrospect :-)).
Brian: Wrapping each insert statement in its own transaction is much slower than doing a bulk insert.