Recently I’ve been using LogParser a lot to import log files, such as W3C logs from an IIS web server. The distasteful format of these files is made palatable by the awesomeness that is LogParser; it just takes care fo it for you. (Check out this SQLBits session on LogParser by Jonathan Allen (Twitter | Blog) for a good intro to LogParser)

However I’ve been suffering with very poor performance with large files, so started to investigate further.

Exporting a 5Mb IIS log to a csv file was taking 0.4 seconds, whereas exporting the same log to SQL Server was taking over 15 times longer. This isn’t a problem for a small file, but processing a 0.5Gb log file is somewhat of a different matter!

Performing a trace on SQL Server during an import was revealing that each record being inserted was wrapped up in its own transaction. Importing 100,000 records into SQL was generating 100,000 insert statements, and 100,000 transactions. In the trace below you can see the transactions either side of the insert (using exec sp_cursor…).

Log Parser Transactions

This was executed using the following command

  • LogParser.exe “SELECT * INTO staging.test FROM Test.log” -i:IISW3C -o:sql -server:localhost -database:Staging -cleartable:ON

 

I have to admit that I’m surprised by this – I assumed that LogParser would open up a single transaction for the bulk insert. It turns out however that you can control this behaviour using LogParser’s transactionRowCount parameter.

The ‘-transactionRowCount’ parameter defaults to 0, which auto commits every single row. It can be changed to any number you like, or to -1, which will commit all rows in a single transaction. The choice is yours.

By changing this parameter, the performance difference can be huge on a large file.

The following is an extract of the trace on the same file when executed using the following command

  • LogParser.exe “SELECT * INTO staging.test FROM Test.log” -i:IISW3C -o:sql -server:localhost -database:Staging -cleartable:ON -transactionRowCount:-1

Log Parser Single Transaction

You can clearly see that only a single transaction is opened, with all inserts batched together.

For processing large files this can make a very significant different. For an 11Mb IIS log with just over 18k records on one of my test servers:

  • Default behaviour: 38.26 seconds
  • Single transaction: 2.92 seconds

This is a performance improvement of 92% – which you can’t grumble about.

If you’re calling LogParser from C# script within SSIS (as I usually do) then you can set this same behaviour using the transactionRowCount property of the LogQueryClassClass object (before you call ExecuteBatch!).

 

Frog-Blog-Out

© Alex Whittles, Purple Frog Systems Ltd

6 Responses to “LogParser Performance Inserting into SQL Server”

  • Yogesh:

    Hii, I am also using log parser to parse the event logs of a remote machine but transactionRowCount flag is not helping me in reducing the time. Using it or not is not making any difference. Can you help me out??

    • Alex:

      Hi Yogesh
      Have you run a profile trace on the database to find out how many rows are being inserted per transaction?
      This setting will help if it’s only inserting one row per transaction. So compare the trace with and without the setting and compare the results.
      Regards
      Alex

  • Yogesh:

    Yes, I have run the profile trace on both the settings and I am getting the same traces as shown in your 2nd screen shot. The only difference was SQL BatchStarting and SQL BatchCompleted trace on the both ends while using -transactionRowCount:-1

    Adding that I am fetching the logs from a remote machine.
    And 2 weeks ago, same query without transactionRowCount was taking approx. 7 minutes and with transactionRowCount:-1 approx. 48sec. But now both the queries are taking the same time 7 minutes.

    Do you have any idea where am I making the mistake.

    Query: logparser -i:EVT “select * into tableName from \\serverNameapplication” -o:SQL -driver:”SQL Server” -server:serverName1 -database:databaseName

  • Yogesh:

    TransactionRowCount flag works when I fetch the logs from my local machine. It reduces my time from 12 secs to 4 secs.
    But still not getting why this flag is not helpful in remote fetching of logs.
    Help me out. Thanks.!

    • Alex:

      I can’t think of any reason why this would cause you a problem, other than the raw throughput of data from your source.
      The TransactionRowCount flag affects the insert into the destination, not the speed at which data can be received from the source.
      So I’d recommend running some tests on the bandwidth or throughput, outside of LogParser.
      Regards
      Alex

  • Yogesh:

    Thanks for your help..!

Leave a Reply

The Frog Blog

I'm Alex Whittles.

I specialise in designing and implementing SQL Server business intelligence solutions, and this is my blog! Just a collection of thoughts, techniques and ramblings on SQL Server, Cubes, Data Warehouses, MDX, DAX and whatever else comes to mind.

Organising SQLRelay

Submit a session for SQLBits

Frog Blog Out
twitter
rssicon