Performance Testing: Quick Tip: Handling very large log files

Logfiles tend to grow and grow and grow. The other day I had a log file about 1.3 GB large.

The relevant logging was only in one part of the file. Handling such a big file is a nightmare using most editors. I could open it but the editor would crash as soon as I tried to delete some parts of the file.

What I needed was the part of the file which contains the log entries for the load test, i.e. split the file into smaller parts - or - even better, cut the parts off that I would not need for my analysis.

As expected Unix-based operating systems have some built-in tools for this task:

split : Splits a file into equal-sized parts.

csplit : Splits a file into context-based parts.

Examples:

Create a new file named ‘output.log.00’ . It should contain only the part of the file input.log beginning with the line that matches the regular expression ‘\^17:15:’ . I.e. give me everything from 17:15 until the end of the file:

csplit -f output.log. input.log '%^17:15:%'

Now create two files out of ‘output.log.00’: One containing everything before 18:00 (6pm) and one for the remaining part. I.e. I want a file named ‘log.17.00’ which contains all log entries from 17:15 until 18:00 .

csplit -f log.17. output.log.00 '/^18:/'

Using Windows? I’d recommend to install cygwin which provides split/csplit and many other useful Unix tools. In my opinion a must-have for every serious software developer working on this platform.

Martin Gross

Performance Testing: Quick Tip: Handling very large log files

You might also enjoy (View all posts)