Effortless Ways to Read Large Files on Linux

2024/09/27

Need to analyze a huge log file, text file, or dataset? You’re definitely not alone. Handling large files on Linux can be daunting when you’re trying to view or manage content while keeping an eye on your system resources. Fortunately, various methods exist for reading large files on Linux, including utilities like the less command, Vim, and techniques for splitting the document into smaller sections.

This article will guide you through different techniques to read large files or extract specific information on Linux using an array of tools.

Using the Less Command

Looking for a lightweight tool to display the contents of large text files or perform quick searches? The less command is your go-to solution.

One of my favorite aspects of this utility is that, unlike standard text editors, it lets you view a file one page at a time without loading the entire file into memory. This makes it noticeably faster on large files and makes reviewing long documents or logs much easier.

To use it, simply type less followed by the filename:
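As a minimal sketch, assuming a hypothetical file named samplefile.txt (generated here so the example is self-contained), the invocation could look like this. Because less is interactive, the sketch only launches it when run from a terminal:

```shell
# Generate a hypothetical 1,000-line sample file in a temporary directory.
tmpdir=$(mktemp -d)
seq 1 1000 | sed 's/^/log line /' > "$tmpdir/samplefile.txt"

# Open the file in less (press q to quit). The terminal check keeps the
# sketch from blocking when it is run from a script.
if [ -t 0 ] && [ -t 1 ]; then
    less "$tmpdir/samplefile.txt"
fi
```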

This command will open the less interface, allowing you to scroll through the document line by line using the arrow keys or search for specific terms by pressing / followed by your search term.

You can also use less to read the output of other commands by piping it with the | operator. For instance, to page through the output of the ls command, you’d use:
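A short sketch of such a pipeline follows. The directory /usr/bin is only an assumed example of a long listing; any command's output can be piped the same way:

```shell
# Page through a long directory listing instead of letting it scroll past.
# less is interactive, so only start it when attached to a terminal.
if [ -t 0 ] && [ -t 1 ]; then
    ls -l /usr/bin | less
fi
```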

[Screenshot: reading ls command output in a terminal window.]

The less command also offers multiple options to customize its functionality. You can combine these options to tailor how less works to suit your needs.

For example, you can use the -p option with less to search for a specific term:
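One way this might look, assuming a small hypothetical samplefile.txt that contains the word sample on its third line:

```shell
# Build a tiny hypothetical file; "sample" first appears on line 3.
tmpdir=$(mktemp -d)
printf 'first line\nsecond line\nthis is a sample entry\nlast line\n' \
    > "$tmpdir/samplefile.txt"

# Open the file positioned at the first line matching "sample".
if [ -t 0 ] && [ -t 1 ]; then
    less -p sample "$tmpdir/samplefile.txt"
fi

# Non-interactive cross-check: print the line number of the first match.
grep -n -m 1 sample "$tmpdir/samplefile.txt"
```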

[Screenshot: finding a specific pattern in a file on Linux.]

This command opens the file and jumps directly to the first occurrence of the word sample.

You can also display line numbers alongside the file content using the -N option:
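A minimal sketch, again assuming a generated, hypothetical samplefile.txt:

```shell
# Generate a hypothetical 200-line file.
tmpdir=$(mktemp -d)
seq 1 200 | sed 's/^/entry /' > "$tmpdir/samplefile.txt"

# -N prefixes every displayed line with its line number.
if [ -t 0 ] && [ -t 1 ]; then
    less -N "$tmpdir/samplefile.txt"
fi
```

Options can be combined, so less -N -p sample would show line numbers and start at the first match in one go.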

[Screenshot: numbered output in the less interface.]

Splitting Files Using the Split Command

Sometimes, the best way to handle a large file is to break it into smaller, more manageable pieces, particularly when you want to read or process the file in sections. For instance, I tend to split files when they exceed 1 GB or contain more than 100 million lines.

You can divide files by either size or by line count. For text files, it’s generally best to split them by lines to avoid cutting words or lines in half.

For example, to split a file based on a specific number of lines, execute this command:
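The following sketch assumes a hypothetical 25,000-line samplefile.txt, generated in a temporary directory, and the output prefix part_:

```shell
# Generate a hypothetical 25,000-line file to split.
tmpdir=$(mktemp -d)
seq 1 25000 > "$tmpdir/samplefile.txt"

# Split into chunks of 10,000 lines each; output files get the prefix part_.
split -l 10000 "$tmpdir/samplefile.txt" "$tmpdir/part_"

# Show how many lines ended up in each chunk.
wc -l "$tmpdir"/part_*
```

With this input, the last chunk holds the remaining 5,000 lines.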

[Screenshot: splitting a file into smaller chunks using the split command.]

In this case, the file “samplefile.txt” is divided into multiple parts, each consisting of 10,000 lines. The resulting files will be named “part_aa”, “part_ab”, and so on. You can now open and examine smaller segments of the file without worrying about excessive memory usage or sluggish system performance.

If you want to split large files based on size, such as 100 MB, you can execute this command:
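For a real multi-gigabyte file the size argument would be -b 100M. The sketch below uses a hypothetical 1 MiB file and 256K chunks instead, purely to keep the demonstration small:

```shell
# Create a hypothetical 1 MiB file (1,048,576 bytes).
tmpdir=$(mktemp -d)
head -c 1048576 /dev/zero > "$tmpdir/samplefile.bin"

# Split by size. Replace 256K with 100M for 100 MB chunks of a large file.
split -b 256K "$tmpdir/samplefile.bin" "$tmpdir/part_"

# List the resulting chunks with their sizes.
ls -l "$tmpdir"/part_*
```

Splitting by bytes can cut a line in half at a chunk boundary. GNU split also offers -C (for example, -C 100M), which caps the chunk size while keeping whole lines together.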

Using Midnight Commander

Midnight Commander (MC) is a dual-panel, text-based file manager that offers a user-friendly visual interface for navigating files and directories.

MC enables you to view files directly in its interface, which allows you to swiftly scroll through large logs or datasets without loading the entire document into memory. I appreciate how smoothly and efficiently MC lets you traverse large files.

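To install MC on Debian or Ubuntu, run sudo apt install mc. Other distributions provide the same mc package through their own package managers.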

Launch it by running mc in the terminal. Once inside, navigate to the large file you wish to read and press F3 to open it in the built-in viewer.

[Screenshot: viewing a large file in Midnight Commander.]

Using Klogg

Klogg is a fast, open-source GUI log viewer that can efficiently process large files. Unlike standard editors that load the whole file into memory, Klogg only reads parts of the file as required, minimizing memory usage.

Klogg also provides real-time filtering and searching capabilities, making it simple to find specific content without extensive scrolling.

Before installing Klogg, you need to create the “/etc/apt/keyrings” directory and add the GPG key to it. The GPG key is essential for validating the Klogg package repository.

To create the directory, execute sudo mkdir -p /etc/apt/keyrings.

Add the GPG key with the following command:

[Screenshot: adding the GPG key to the /etc/apt/keyrings directory.]

Next, add the Klogg repository to your system with the curl command:

Update your package list with sudo apt update.

Finally, you can install Klogg with sudo apt install klogg.

Once installed, simply open the file through Klogg’s interface and utilize its built-in features for searching and navigating through the file content.

Reading Large Files Using a Text Editor

While many text editors struggle with large files, editors like Vim and Emacs manage larger files more effectively than standard editors such as Nano or Gedit.

For instance, Vim lets you navigate through a file and search for terms quickly once the file is open. Note, however, that Vim reads the whole file into its buffer when opening it, so a multi-gigabyte file can take a while to load and uses a corresponding amount of memory or swap-file space. For read-only viewing of very large files, less is usually the lighter option.

To open a file in Vim, run:
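A minimal sketch, assuming a hypothetical samplefile.txt. The -R flag (read-only mode) is an addition here, handy when you only want to look at the file:

```shell
# Generate a hypothetical 500-line file.
tmpdir=$(mktemp -d)
seq 1 500 | sed 's/^/row /' > "$tmpdir/samplefile.txt"

# Open it in Vim, read-only. Vim needs a terminal, so skip it otherwise.
if [ -t 0 ] && [ -t 1 ] && command -v vim >/dev/null 2>&1; then
    vim -R "$tmpdir/samplefile.txt"
fi
```

Inside Vim, /term searches forward for term, G jumps to the last line, and :q quits.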

Searching Through a File with the Grep Command

If you’re looking to find specific information within a large file, use the grep command. This powerful tool enables you to search through files and display only the lines that match your query.

When you pipe command output into less, remember that it is only temporary: the output is lost once you exit less. To keep it for later viewing, redirect it to a file and open that file with a command-line tool such as less.

For instance, to filter every line containing the word “ERROR” from a large file, you would execute:
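The sketch below assumes a hypothetical app.log with a few made-up entries; with a real log you would pass its path instead:

```shell
# Create a small hypothetical log file.
tmpdir=$(mktemp -d)
cat > "$tmpdir/app.log" <<'EOF'
2024-09-27 10:00:01 INFO service started
2024-09-27 10:00:05 ERROR cannot open config
2024-09-27 10:00:09 INFO retrying
2024-09-27 10:00:12 ERROR connection refused
EOF

# Print only the lines that contain ERROR.
grep "ERROR" "$tmpdir/app.log"
```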

[Screenshot: searching a file with the grep command.]

You can refine your search with additional options, such as ignoring case (grep -i) or matching whole words only (grep -w).
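To illustrate how these two options change the result, here is a small sketch on a hypothetical four-line file:

```shell
# Hypothetical input: the same word in different cases and forms.
tmpdir=$(mktemp -d)
cat > "$tmpdir/app.log" <<'EOF'
ERROR disk full
error: minor glitch
ERRORS found in batch
all good
EOF

# Plain search is case-sensitive and matches substrings (lines 1 and 3).
grep "ERROR" "$tmpdir/app.log"

# -i ignores case, so the lowercase "error:" line matches too (lines 1-3).
grep -i "error" "$tmpdir/app.log"

# -w matches whole words only, so "ERRORS" is excluded (line 1 only).
grep -w "ERROR" "$tmpdir/app.log"
```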

Redirecting Output to a File

If you wish to save specific search results for later review, you can redirect the output of your commands to a new file.

For example, search for lines using the grep command and save them to a new file:

The > operator creates a new file or overwrites an existing one each time. To append data to an already-existing file, use >> instead of >.
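Both operators can be sketched as follows, assuming a hypothetical app.log and an output file named errors.txt:

```shell
# Create a small hypothetical log file.
tmpdir=$(mktemp -d)
printf 'INFO start\nERROR one\nWARN slow\nERROR two\n' > "$tmpdir/app.log"

# > creates errors.txt (or overwrites it) with the matching lines.
grep "ERROR" "$tmpdir/app.log" > "$tmpdir/errors.txt"

# >> appends further results to the same file instead of replacing it.
grep "WARN" "$tmpdir/app.log" >> "$tmpdir/errors.txt"

# The file now holds the two ERROR lines followed by the WARN line.
cat "$tmpdir/errors.txt"
```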

Using the Head and Tail Commands

When working with large files on Linux, you might only need to view the beginning or the end of a file. This is where the head and tail commands come into play.

To display the first 20 lines of a file, use:
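A minimal sketch, assuming a hypothetical 100-line samplefile.txt:

```shell
# Generate a hypothetical 100-line file.
tmpdir=$(mktemp -d)
seq 1 100 | sed 's/^/line /' > "$tmpdir/samplefile.txt"

# Print the first 20 lines ("line 1" through "line 20").
head -n 20 "$tmpdir/samplefile.txt"
```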

[Screenshot: displaying the first twenty lines of a large file on Linux.]

Likewise, to display the last 20 lines, execute:
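The counterpart for the end of the file, under the same assumption of a hypothetical 100-line samplefile.txt:

```shell
# Generate a hypothetical 100-line file.
tmpdir=$(mktemp -d)
seq 1 100 | sed 's/^/line /' > "$tmpdir/samplefile.txt"

# Print the last 20 lines ("line 81" through "line 100").
tail -n 20 "$tmpdir/samplefile.txt"
```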

In both cases, -n 20 limits the output to the first or last 20 lines. You can adjust this number as needed; without -n, both commands show 10 lines by default.

Combining tail and head lets you view a specific section of a file. To view lines 10-14 of a 100-line file, first work out how many lines to take from the end: subtract the starting line minus one from the total line count (100 - 9 = 91). The last 91 lines begin at line 10, so keeping only the first five of them leaves lines 10-14. Execute the following command:
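A minimal sketch of that calculation, assuming a hypothetical 100-line samplefile.txt generated for the example. The second pipeline uses tail -n +10, which means "start at line 10", as an alternative that avoids the arithmetic:

```shell
# Generate a hypothetical 100-line file.
tmpdir=$(mktemp -d)
seq 1 100 | sed 's/^/line /' > "$tmpdir/samplefile.txt"

start=10   # first line wanted
count=5    # number of lines wanted (10 through 14)

# Lines to take from the end: total - (start - 1) = 100 - 9 = 91.
total=$(wc -l < "$tmpdir/samplefile.txt")
from_end=$((total - start + 1))

# Last 91 lines, then the first 5 of those.
tail -n "$from_end" "$tmpdir/samplefile.txt" | head -n "$count"

# Alternative: begin output at line 10 directly, then keep 5 lines.
tail -n +"$start" "$tmpdir/samplefile.txt" | head -n "$count"
```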

This will display lines 10-14. You can verify the output against what you see with less.

For continuously updating files, such as log files, you can use tail -f to monitor changes in real-time.
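Normally tail -f runs until you press Ctrl+C. The sketch below is a self-contained demonstration under stated assumptions: a hypothetical background loop stands in for an application writing to app.log, and timeout stops tail after two seconds so the example ends on its own:

```shell
# Start with an empty hypothetical log file.
tmpdir=$(mktemp -d)
log="$tmpdir/app.log"
: > "$log"

# Stand-in for a running application: append three lines in the background.
( for i in 1 2 3; do echo "event $i" >> "$log"; sleep 0.2; done ) &

# Follow the file for 2 seconds and capture what tail reports.
# timeout exits non-zero when it stops tail, hence the "|| true".
timeout 2 tail -f "$log" > "$tmpdir/followed.txt" || true
wait

# Show the lines that were picked up while following.
cat "$tmpdir/followed.txt"
```

On a real system you would simply run tail -f on the log file and watch new entries appear as they are written.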

Conclusion

Whether you’re analyzing logs, working with datasets, or simply reading a large text file, these strategies will help simplify the process significantly. You can also explore additional techniques to find large files on Linux and transfer them via the terminal.

Image credit: Unsplash. All alterations and screenshots by Haroon Javed.
