
I mean 100+ MB big; such text files can push the envelope of editors.

I need to look through a large XML file, but cannot if the editor is buggy.

Any suggestions?

  • Actually, text files of 100+ MB or even 1+ GB are not as uncommon as you may think (e.g., log files from busy servers). Dec 19, 2008 at 19:18
  • Sneakyness: And not exactly text. I think the requirements of reading text files and reading binary files differ somewhat. You might pass it through base64 or uuencode, though.
    – Joey
    Aug 16, 2009 at 10:24
  • This should at least be listed as a similar question, or even linked, as it was asked 18 months prior... stackoverflow.com/questions/102829/…
    – ONDEV
    Jan 19, 2012 at 0:49
  • I was also looking for the answer to this exact question in order to read some huge log files that I've generated! Jul 20, 2012 at 16:19
  • @BlairHippo I feel the same way; I'm almost nervous when asking a question because chances are high that someone will say "Close this, it should go in WhateverExchange instead."
    – Rodolfo
    Dec 17, 2013 at 18:04

2 Answers


Free read-only viewers:

  • Large Text File Viewer (Windows) – Fully customizable theming (colors, fonts, word wrap, tab size). Supports horizontal and vertical split views, file following, and regex search. Very fast and simple, with a small executable.
  • klogg (Windows, macOS, Linux) – A maintained fork of glogg. Its main feature is regular expression search. It supports monitoring file changes (like tail), bookmarks, highlighting patterns using different colors, and has serious optimizations built in. But from a UI standpoint, it's rather minimal.
  • LogExpert (Windows) – "A GUI replacement for tail." It's really a log file analyzer, not a large file viewer, and in one test it required 10 seconds and 700 MB of RAM to load a 250 MB file. But its killer features are the columnizer (parse logs that are in CSV, JSONL, etc. and display in a spreadsheet format) and the highlighter (show lines with certain words in certain colors). Also supports file following, tabs, multifiles, bookmarks, search, plugins, and external tools.
  • Lister (Windows) – Very small and minimalist. It's one executable, barely 500 KB, but it still supports searching (with regexes), printing, a hex editor mode, and settings.

Free editors:

  • Your regular editor or IDE. Modern editors can handle surprisingly large files. In particular, Vim (Windows, macOS, Linux), Emacs (Windows, macOS, Linux), Notepad++ (Windows), Sublime Text (Windows, macOS, Linux), and VS Code (Windows, macOS, Linux) support large (~4 GB) files, assuming you have the RAM (see the Vim sketch after this list).
  • Large File Editor (Windows) – Opens and edits TB+ files, supports Unicode, uses little memory, has XML-specific features, and includes a binary mode.
  • GigaEdit (Windows) – Supports searching, character statistics, and font customization. But it's buggy – with large files, it only allows overwriting characters, not inserting them; it doesn't respect LF as a line terminator, only CRLF; and it's slow.
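
A concrete sketch for the Vim case above: starting Vim with plugins and the swap file disabled removes most of the overhead on a huge file. These are standard Vim flags; big.xml is just a placeholder name:

$ vim -u NONE -n big.xml

Here -u NONE skips the vimrc and all plugins (which also leaves syntax highlighting off), and -n disables the swap file. In an already-running session, :set noswapfile and :syntax off achieve much of the same.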

Built-in programs (no installation required):

  • less (macOS, Linux) – The traditional Unix command-line pager tool. Lets you view text files of practically any size. Can be installed on Windows, too.
  • Notepad (Windows) – Decent with large files, especially with word wrap turned off.
  • MORE (Windows) – This refers to the Windows MORE, not the Unix more. A console program that allows you to view a file, one screen at a time (see the example after this list).
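
To make the MORE entry concrete: +n and /S are documented switches of the Windows MORE command; file.log is a placeholder name:

C:\>more +1000 file.log
C:\>type file.log | more /S

The first starts displaying at line 1000; the second squeezes runs of blank lines down to one while paging.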

Paid editors/viewers:

  • 010 Editor (Windows, macOS, Linux) – Opens giant (as large as 50 GB) files.
  • SlickEdit (Windows, macOS, Linux) – Opens large files.
  • UltraEdit (Windows, macOS, Linux) – Opens files of more than 6 GB, but the configuration must be changed for this to be practical: Menu » Advanced » Configuration » File Handling » Temporary Files » Open file without temp file...
  • EmEditor (Windows) – Handles very large text files nicely (officially up to 16 TB). Search and replace are very fast. A free version is available for personal use.
  • BssEditor (Windows) – Handles large files and very long lines. Doesn't require installation. Free for non-commercial use.
  • loxx (Windows) – Supports file following, highlighting, line numbers, huge files, regex, multiple files and views, and much more. The free version cannot process regexes, filter files, synchronize timestamps, or save changed files.
  • Vim or Emacs... pick your poison; both will handle any file you throw at them. I personally prefer Emacs, but both will beat Notepad without so much as a hiccup.
    – Mike Stone
    Oct 2, 2008 at 8:46
  • Emacs has a maximum buffer size, dependent on the underlying architecture (32- or 64-bit). I think that on 32-bit systems you get a "maximum buffer size exceeded" error on files larger than 128 MB. May 8, 2009 at 13:45
  • I just tried Notepad++ with a 561 MB log file and it said it was too big.
    – barfoon
    Jun 2, 2009 at 14:12
  • @Rafal Interesting! Looks like on 64-bit it is ~1024 petabytes. The reason has to do with the fact that Emacs has to track buffer positions (such as the point).
    – baudtack
    Jul 1, 2009 at 23:31
  • But be careful: Vim will only work well as long as the files in question have enough line breaks. I once had to edit a ca. 150 MB file without any line breaks, and had to resort to gedit because Vim couldn't handle it.
    – Benno
    Jan 29, 2010 at 16:47

Tips and tricks

less

Why are you using editors to just look at a (large) file?

Under *nix or Cygwin, just use less. (There is a famous saying – "less is more, more or less" – because "less" replaced the earlier Unix command "more", with the addition that you could scroll back up.) Searching and navigation under less are very similar to Vim's, but there is no swap file and little RAM is used.

There is a Win32 port of GNU less. See the "less" section of the answer above.
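
To make this concrete, here are a few standard less flags that matter on very large files (humongo.txt is just a placeholder name):

$ less -S humongo.txt    # chop long lines instead of wrapping them
$ less -n humongo.txt    # suppress line counting, which speeds up huge files
$ less +F humongo.log    # start in follow mode, like tail -f

Once inside, / searches forward, a number followed by g jumps to that line (e.g., 1000000g), and G jumps to the end of the file.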

Perl

Perl is good for quick scripts, and its .. (range flip-flop) operator makes for a nice selection mechanism to limit the crud you have to wade through.

For example:

$ perl -n -e 'print if ( 1000000 .. 2000000)' humongo.txt | less

This will extract everything from line 1 million to line 2 million, and allow you to sift the output manually in less.

Another example:

$ perl -n -e 'print if ( /regex one/ .. /regex two/)' humongo.txt | less

This starts printing when the first regex finds a match, and stops when the second regex finds the end of an interesting block. It may find multiple blocks. Sift the output...
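
Since the question mentions an XML file, the same flip-flop trick can pull out a single element by its opening and closing tags. The tag name and id here are hypothetical, and this naive pattern assumes the tags sit on their own lines:

$ perl -n -e 'print if ( /<order id="12345">/ .. /<\/order>/ )' humongo.xml | less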

logparser

This is another useful tool you can use. To quote the Wikipedia article:

logparser is a flexible command line utility that was initially written by Gabriele Giuseppini, a Microsoft employee, to automate tests for IIS logging. It was intended for use with the Windows operating system, and was included with the IIS 6.0 Resource Kit Tools. The default behavior of logparser works like a "data processing pipeline", by taking an SQL expression on the command line, and outputting the lines containing matches for the SQL expression.

Microsoft describes Logparser as a powerful, versatile tool that provides universal query access to text-based data such as log files, XML files and CSV files, as well as key data sources on the Windows operating system such as the Event Log, the Registry, the file system, and Active Directory. The results of the input query can be custom-formatted in text based output, or they can be persisted to more specialty targets like SQL, SYSLOG, or a chart.

Example usage:

C:\>logparser.exe -i:textline -o:tsv "select Index, Text from 'c:\path\to\file.log' where Index > 1000 and Index < 2000"
C:\>logparser.exe -i:textline -o:tsv "select Index, Text from 'c:\path\to\file.log' where Text like '%pattern%'"
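
As a further sketch, logparser can aggregate instead of just filtering, so you can count matches without paging through them. This assumes the TEXTLINE input format exposes the same Index and Text fields used in the examples above:

C:\>logparser.exe -i:textline "select count(*) from 'c:\path\to\file.log' where Text like '%ERROR%'"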

The relativity of sizes

100 MB isn't too big. 3 GB is getting kind of big. I used to work at a print & mail facility that created about 2% of U.S. first class mail. One of the systems for which I was the tech lead accounted for about 15+% of the pieces of mail. We had some big files to debug here and there.

And more...

Feel free to add more tools and information here. This answer is community wiki for a reason! We all need more advice on dealing with large amounts of data...

  • +1, I recently had some really huge XML files (1+ GB) that I needed to look at. I'm on Windows, and Vim, Emacs, Notepad++, and several other editors completely choked on the file, to the point where my system almost became unusable when trying to open it. After a while I realized how unnecessary it was to actually attempt to open the file in an -editor- when I just needed to -view- it. Using Cygwin (and some clever grep/less/sed magic) I easily found the part I was interested in and could read it without any hassle.
    – wasatz
    Apr 23, 2010 at 11:56
  • You don't need Cygwin for less; you can also use it under Windows: gnuwin32.sourceforge.net/packages/less.htm
    – ChristophK
    Nov 2, 2011 at 9:33
  • This XML editor also has a large file viewer component and provides syntax coloring even for huge files. The files are not loaded completely into memory, so a multi-GB document shouldn't be a problem. In addition, this tool can also validate those big XML documents... In my opinion, one of the best approaches to working with huge XML data. Apr 21, 2013 at 12:38
  • OK, so I just fixed my own issue: less with word wrap is slow. less -S without word wrap is lightning fast, even on large lines. I'm happy again!
    – Andy Brown
    Jul 20, 2015 at 9:41
  • Great answer. I want to note that if you have Git for Windows installed, you probably have Git Bash as well, which includes less. Jun 24, 2016 at 12:24
