Comprehensive Text File Processing in the Command Line

Apr 11, 2024

A comprehensive guide to processing text files using command-line tools for viewing, searching, editing, and analyzing content.

1. Viewing Text Files

Before processing a text file, examine its contents with one of these viewing commands:

cat filename.txt: Displays the entire file contents at once
less filename.txt: Provides scrollable viewing with search capabilities (press / to search)
more filename.txt: Similar to less but with fewer features
head -n 10 filename.txt: Shows the first 10 lines
tail -n 10 filename.txt: Shows the last 10 lines
tail -f filename.txt: Follows the file for real-time updates (useful for logs)
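
As a minimal sketch of how head and tail combine, assuming a hypothetical 20-line sample.txt generated with seq in a temporary directory (seq ships with GNU and BSD systems but is not part of POSIX):

```shell
# Build a hypothetical 20-line sample file to experiment with.
workdir=$(mktemp -d)
seq 1 20 > "$workdir/sample.txt"

head -n 3 "$workdir/sample.txt"    # prints lines 1, 2, 3
tail -n 2 "$workdir/sample.txt"    # prints lines 19, 20

# Chain the two to view a slice from the middle: lines 5 through 7.
head -n 7 "$workdir/sample.txt" | tail -n 3
```

Chaining head into tail is one common way to view a slice of a file without opening a pager.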

The grep command is your primary tool for searching text patterns:

Basic searches:
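
A few common basic searches might look like the following sketch. The file app.log and its contents are hypothetical, created only for illustration:

```shell
# Hypothetical log file for the examples below.
workdir=$(mktemp -d)
printf '%s\n' 'Error: disk full' 'warning: low memory' 'error: timeout' 'ok' > "$workdir/app.log"

grep "error" "$workdir/app.log"       # case-sensitive: matches only "error: timeout"
grep -i "error" "$workdir/app.log"    # -i ignores case: matches lines 1 and 3
grep -n "error" "$workdir/app.log"    # -n prefixes the line number: "3:error: timeout"
grep -v "error" "$workdir/app.log"    # -v inverts the match: prints the other three lines
```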

Advanced searches:

grep -r "search_string" /path/: Recursively search all files in a directory
grep -E "pattern1|pattern2" filename.txt: Match lines containing either pattern (-E enables extended regular expressions)
grep -c "search_string" filename.txt: Count matching lines
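
The options above can be tried on a small directory tree. This is a sketch under assumed names: the logs directory and both files are hypothetical.

```shell
# Hypothetical directory with two small log files.
workdir=$(mktemp -d)
mkdir "$workdir/logs"
printf '%s\n' 'error: timeout' 'ok' 'warning: retry' > "$workdir/logs/a.log"
printf '%s\n' 'ok' 'error: refused' > "$workdir/logs/b.log"

grep -r "error" "$workdir/logs"                  # each match is prefixed with its file path
grep -rl "error" "$workdir/logs"                 # -l lists only the names of matching files
grep -E "error|warning" "$workdir/logs/a.log"    # lines matching either alternative
grep -c "ok" "$workdir/logs/a.log"               # prints 1
```

Note that -c counts matching lines, not individual matches, so a line containing the pattern twice still counts once.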

Use sed for powerful find-and-replace operations:

Basic substitution:

sed 's/old/new/g' filename.txt: Replace all occurrences of ‘old’ with ‘new’
sed 's/old/new/' filename.txt: Replace only the first occurrence per line
sed -i 's/old/new/g' filename.txt: Edit the file in-place (GNU sed syntax; BSD/macOS sed requires a backup-suffix argument, e.g. -i '')
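
The difference between the first two forms is easiest to see on a line that contains the word twice. The sketch below uses a hypothetical notes.txt and, instead of -i, shows a temp-file-and-move pattern that should behave the same on GNU and BSD systems:

```shell
# Hypothetical input: the first line contains "old" twice.
workdir=$(mktemp -d)
printf '%s\n' 'old old' 'nothing here' > "$workdir/notes.txt"

sed 's/old/new/'  "$workdir/notes.txt"    # first line becomes "new old"
sed 's/old/new/g' "$workdir/notes.txt"    # first line becomes "new new"

# Portable alternative to -i: write to a temporary file, then move it into place.
sed 's/old/new/g' "$workdir/notes.txt" > "$workdir/notes.tmp" &&
  mv "$workdir/notes.tmp" "$workdir/notes.txt"
cat "$workdir/notes.txt"
```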

Advanced sed operations:
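
Beyond substitution, sed can select or delete lines by number or pattern. A minimal sketch, assuming a hypothetical conf.txt with a comment line and a blank line:

```shell
# Hypothetical config-like file: a comment, a blank line, and three values.
workdir=$(mktemp -d)
printf '%s\n' '# comment' 'alpha' '' 'beta' 'gamma' > "$workdir/conf.txt"

sed -n '2,4p' "$workdir/conf.txt"                # print only lines 2 through 4
sed '/^#/d' "$workdir/conf.txt"                  # delete lines starting with '#'
sed '/^$/d' "$workdir/conf.txt"                  # delete empty lines
sed '2s/alpha/ALPHA/' "$workdir/conf.txt"        # substitute on line 2 only
sed -e '/^#/d' -e '/^$/d' "$workdir/conf.txt"    # chain expressions: alpha, beta, gamma
```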

Sort content using various criteria:
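
Typical sort criteria include alphabetical, reverse, and numeric-by-field ordering. The sketch below assumes a hypothetical scores.txt with a name and a number on each line:

```shell
# Hypothetical two-column file: name and score.
workdir=$(mktemp -d)
printf '%s\n' 'carol 7' 'alice 30' 'bob 100' > "$workdir/scores.txt"

sort "$workdir/scores.txt"            # alphabetical by whole line: alice, bob, carol
sort -r "$workdir/scores.txt"         # reverse order: carol, bob, alice
sort -k2,2n "$workdir/scores.txt"     # numeric sort on field 2: 7, 30, 100
sort -k2,2nr "$workdir/scores.txt"    # same key, descending: 100, 30, 7
```

Without the n modifier the second field is compared as text, which is why numeric columns usually need -n or a key such as -k2,2n.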

Use uniq to work with duplicate content:

sort filename.txt | uniq: Remove duplicate lines (uniq only collapses adjacent duplicates, so the input is sorted first)
sort filename.txt | uniq -c: Count occurrences of each line
sort filename.txt | uniq -u: Show only lines that appear exactly once
sort filename.txt | uniq -d: Show only lines that appear multiple times
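
To see why sorting matters, the following sketch runs uniq on a hypothetical fruit.txt in which the repeated lines are not adjacent:

```shell
# Hypothetical list where "apple" repeats but never on consecutive lines.
workdir=$(mktemp -d)
printf '%s\n' 'apple' 'pear' 'apple' 'fig' 'apple' > "$workdir/fruit.txt"

uniq "$workdir/fruit.txt"               # unsorted: no adjacent repeats, so all 5 lines remain
sort "$workdir/fruit.txt" | uniq        # apple, fig, pear
sort "$workdir/fruit.txt" | uniq -u     # fig, pear
sort "$workdir/fruit.txt" | uniq -d     # apple

# Frequency table, most common line first.
sort "$workdir/fruit.txt" | uniq -c | sort -rn
```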

Get file statistics with the wc command:
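
A short sketch of the usual wc options, assuming a hypothetical two-line words.txt:

```shell
# Hypothetical file: 2 lines, 5 words, 24 bytes (newlines included).
workdir=$(mktemp -d)
printf '%s\n' 'one two three' 'four five' > "$workdir/words.txt"

wc "$workdir/words.txt"         # lines, words, and bytes, followed by the file name
wc -l "$workdir/words.txt"      # line count
wc -w "$workdir/words.txt"      # word count
wc -c "$workdir/words.txt"      # byte count
wc -l < "$workdir/words.txt"    # reading from stdin omits the file name
```

Reading from stdin is handy in scripts, since the output is then just the number (possibly padded with spaces on some systems).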

Use cut and awk for column-based data extraction:

Using cut:
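
cut selects fields by delimiter or characters by position. A minimal sketch on a hypothetical people.csv:

```shell
# Hypothetical comma-separated file: name, age, city.
workdir=$(mktemp -d)
printf '%s\n' 'alice,30,paris' 'bob,25,oslo' > "$workdir/people.csv"

cut -d ',' -f 1 "$workdir/people.csv"      # first field: alice, bob
cut -d ',' -f 1,3 "$workdir/people.csv"    # fields 1 and 3: alice,paris and bob,oslo
cut -c 1-3 "$workdir/people.csv"           # characters 1 through 3: ali, bob
```

cut splits on a single literal delimiter and knows nothing about quoting, so it suits simple delimited data rather than full CSV.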

Using awk (more powerful):
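
awk splits each line into fields ($1, $2, ...) and can filter and compute as well as extract. The sketch below assumes a hypothetical whitespace-separated people.txt:

```shell
# Hypothetical whitespace-separated file: name, age, city.
workdir=$(mktemp -d)
printf '%s\n' 'alice 30 paris' 'bob 25 oslo' 'carol 41 rome' > "$workdir/people.txt"

awk '{ print $1 }' "$workdir/people.txt"            # first field of every line
awk '{ print $3, $1 }' "$workdir/people.txt"        # pick and reorder fields
awk '$2 > 28 { print $1 }' "$workdir/people.txt"    # rows where field 2 exceeds 28: alice, carol
awk '{ total += $2 } END { print total }' "$workdir/people.txt"    # column sum: 96
awk 'END { print NR }' "$workdir/people.txt"        # number of records: 3
echo 'a,b,c' | awk -F ',' '{ print $2 }'            # -F sets the field separator: b
```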

Combine multiple files:
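
Files can be combined end to end with cat or side by side with paste. A sketch with two hypothetical part files:

```shell
# Two hypothetical input files of two lines each.
workdir=$(mktemp -d)
printf '%s\n' 'a' 'b' > "$workdir/part1.txt"
printf '%s\n' 'c' 'd' > "$workdir/part2.txt"

# Concatenate end to end into a new file.
cat "$workdir/part1.txt" "$workdir/part2.txt" > "$workdir/combined.txt"

# Append another file to an existing one instead of overwriting it.
cat "$workdir/part1.txt" >> "$workdir/combined.txt"

# Join line by line, separated by a tab by default.
paste "$workdir/part1.txt" "$workdir/part2.txt"
```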

Split large files:
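
One way to do this is the split command. The sketch below assumes a hypothetical 10-line big.txt and an arbitrary output prefix chunk_:

```shell
# Hypothetical 10-line file standing in for a large one.
workdir=$(mktemp -d)
seq 1 10 > "$workdir/big.txt"

# Split into pieces of at most 4 lines each: chunk_aa, chunk_ab, chunk_ac.
split -l 4 "$workdir/big.txt" "$workdir/chunk_"
ls "$workdir"

# Concatenating the pieces in name order reproduces the original.
cat "$workdir"/chunk_* | cmp - "$workdir/big.txt" && echo "pieces match the original"
```

split also accepts -b to cut by size in bytes rather than by line count.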

Use tr for character-level transformations:
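
tr reads standard input only, so it is normally fed through a pipe or a redirect. A few representative transformations, using made-up input strings:

```shell
echo 'Hello World' | tr '[:lower:]' '[:upper:]'    # HELLO WORLD
echo 'a,b,c' | tr ',' '\n'                         # one item per line: a, b, c
echo 'a   b    c' | tr -s ' '                      # squeeze repeated spaces: a b c
echo 'phone: 555-1234' | tr -d '0-9'               # delete digits: "phone: -"
echo 'phone: 555-1234' | tr -cd '0-9\n'            # keep only digits and the newline: 5551234
```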

Understanding redirection is crucial for text processing:
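
The main operators can be sketched as follows. All file names here are hypothetical and live in a temporary directory:

```shell
workdir=$(mktemp -d)

echo 'first'  > "$workdir/out.txt"     # > creates the file or overwrites it
echo 'second' >> "$workdir/out.txt"    # >> appends to it
sort -r < "$workdir/out.txt"           # < feeds the file to stdin: second, first

# 2> captures error messages only; "|| true" just tolerates the expected failure.
ls "$workdir/no_such_file" 2> "$workdir/err.txt" || true

# 2>&1 sends stderr to wherever stdout currently points, so it goes after the > redirect.
ls "$workdir/no_such_file" > "$workdir/all.txt" 2>&1 || true

# A pipe connects one command's stdout to the next command's stdin;
# tee writes its input to a file and also passes it along.
grep -c 'first' "$workdir/out.txt" | tee "$workdir/count.txt"    # prints 1
```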