Before processing text files, you might need to view their contents. You can use commands like cat, less, more, and tail:
cat filename.txt: Displays the entire contents of the file.
less filename.txt: Allows for scrollable viewing of the file contents.
more filename.txt: Similar to less, but with fewer features.
tail -n 10 filename.txt: Shows the last 10 lines of the file.
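As a quick illustration of the non-interactive viewers above, here is a minimal sketch. It assumes a throwaway 20-line file named sample.txt in a temporary directory; the file name and its contents are made up for the example.

```shell
# Build a hypothetical 20-line sample file in a temporary directory.
tmpdir=$(mktemp -d)
seq 1 20 > "$tmpdir/sample.txt"

# cat prints everything; tail -n 3 prints only the last three lines.
all=$(cat "$tmpdir/sample.txt")
last=$(tail -n 3 "$tmpdir/sample.txt")
printf '%s\n' "$last"

# Clean up the temporary directory.
rm -r "$tmpdir"
```

less and more are interactive pagers, so they are left out of this scripted sketch.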
To search within a text file, grep is incredibly useful:
grep "search_string" filename.txt: Prints lines containing the search string.
grep -i "search_string" filename.txt: Case-insensitive search.
grep -r "search_string" /path/: Recursively searches all files under the specified directory.
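The grep variants above can be tried on a small file. This is a sketch under assumptions: app.log and its three lines are invented sample data, and the -l flag (print only the names of matching files) is an addition not covered in the list above.

```shell
# Hypothetical log file with mixed-case entries.
tmpdir=$(mktemp -d)
printf 'Error: disk full\nwarning: low memory\nerror: timeout\n' > "$tmpdir/app.log"

# Case-sensitive: only the line with lowercase "error" matches.
exact=$(grep 'error' "$tmpdir/app.log")

# Case-insensitive: both "Error" and "error" lines match.
any_case=$(grep -i 'error' "$tmpdir/app.log")

# Recursive search; -l prints matching file names instead of lines.
files=$(grep -rl 'timeout' "$tmpdir")

printf '%s\n' "$exact" "$any_case" "$files"
rm -r "$tmpdir"
```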
While Bash isn’t typically used for interactive editing, you can use sed for stream editing:
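sed 's/old/new/g' filename.txt: Replaces all occurrences of 'old' with 'new' and prints the result. The file itself is not modified.
To save the result, redirect the output to a different file: sed 's/old/new/g' filename.txt > modified_filename.txt. Do not redirect to the file you are reading from, because the shell truncates it before sed starts.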
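One way to apply a substitution back to the original file is to write to a temporary file and then move it into place. The sketch below assumes a made-up notes.txt; the file names are illustrative only.

```shell
# Hypothetical input file.
tmpdir=$(mktemp -d)
printf 'old cat, old dog\nnothing here\n' > "$tmpdir/notes.txt"

# Preview the substitution; notes.txt is left untouched.
preview=$(sed 's/old/new/g' "$tmpdir/notes.txt")

# Write to a temporary file first, then replace the original.
sed 's/old/new/g' "$tmpdir/notes.txt" > "$tmpdir/notes.tmp" \
  && mv "$tmpdir/notes.tmp" "$tmpdir/notes.txt"
result=$(cat "$tmpdir/notes.txt")

printf '%s\n' "$result"
rm -r "$tmpdir"
```

Many sed implementations also offer an in-place option (-i), but its argument syntax differs between GNU and BSD versions, so the temporary-file approach is the more portable choice.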
Sorting content is another common requirement:
sort filename.txt: Sorts lines alphabetically.
sort -r filename.txt: Sorts lines in reverse order.
sort -n filename.txt: Sorts lines numerically.
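The difference between alphabetical and numeric sorting is easiest to see on numbers of different lengths. This sketch assumes a made-up nums.txt and sets LC_ALL=C for the alphabetical case so the byte-wise ordering is predictable across systems.

```shell
# Hypothetical file of numbers, one per line.
tmpdir=$(mktemp -d)
printf '10\n9\n100\n' > "$tmpdir/nums.txt"

# Alphabetical (byte-wise) order compares character by character.
lexical=$(LC_ALL=C sort "$tmpdir/nums.txt")

# Numeric order compares the values themselves.
numeric=$(sort -n "$tmpdir/nums.txt")

# Flags can be combined: numeric and reversed.
reverse_numeric=$(sort -rn "$tmpdir/nums.txt")

printf '%s\n' "$lexical" "$numeric" "$reverse_numeric"
rm -r "$tmpdir"
```

Without -n, "100" sorts before "9" because the comparison starts with the first character.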
To find or filter unique lines, use uniq:
sort filename.txt | uniq: Removes duplicate lines (uniq only compares adjacent lines, so the input must be sorted first).
sort filename.txt | uniq -u: Displays only the lines that appear exactly once.
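To make the difference between uniq and uniq -u concrete, here is a small sketch. The file fruits.txt is invented sample data, and the -d flag (show only repeated lines) is an addition beyond the two forms listed above.

```shell
# Hypothetical list with some repeated entries.
tmpdir=$(mktemp -d)
printf 'apple\nbanana\napple\ncherry\nbanana\n' > "$tmpdir/fruits.txt"

# One copy of every distinct line.
deduped=$(sort "$tmpdir/fruits.txt" | uniq)

# Only lines that occur exactly once.
singles=$(sort "$tmpdir/fruits.txt" | uniq -u)

# Only lines that occur more than once (one copy each).
repeated=$(sort "$tmpdir/fruits.txt" | uniq -d)

printf '%s\n' "$deduped" "$singles" "$repeated"
rm -r "$tmpdir"
```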
The wc command is useful for getting basic statistics:
wc filename.txt: Displays the line, word, and byte counts.
wc -l filename.txt: Counts the number of lines in the file.
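When a script needs just the number, reading the file from standard input keeps the file name out of wc's output. A minimal sketch, assuming a made-up two-line file called words.txt:

```shell
# Hypothetical two-line file.
tmpdir=$(mktemp -d)
printf 'one two\nthree\n' > "$tmpdir/words.txt"

# Reading from stdin makes wc print only the count.
lines=$(wc -l < "$tmpdir/words.txt")
words=$(wc -w < "$tmpdir/words.txt")
bytes=$(wc -c < "$tmpdir/words.txt")

# Some implementations pad the number with spaces; arithmetic expansion strips them.
echo "lines=$((lines)) words=$((words)) bytes=$((bytes))"
rm -r "$tmpdir"
```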
The cut command is handy when dealing with delimited data:
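As a hedged illustration of cut on delimited data, the sketch below assumes a comma-separated file named people.csv with name, age, and city columns. The file and its layout are invented for the example.

```shell
# Hypothetical comma-separated file: name,age,city
tmpdir=$(mktemp -d)
printf 'alice,30,london\nbob,25,paris\n' > "$tmpdir/people.csv"

# -d sets the delimiter, -f selects fields (numbered from 1).
names=$(cut -d ',' -f 1 "$tmpdir/people.csv")

# Several fields can be selected at once.
name_city=$(cut -d ',' -f 1,3 "$tmpdir/people.csv")

# -c selects character positions instead of fields.
prefix=$(cut -c 1-3 "$tmpdir/people.csv")

printf '%s\n' "$names" "$name_city" "$prefix"
rm -r "$tmpdir"
```

cut splits on a single delimiter character and does not understand quoted fields, so it suits simple delimited files rather than full CSV.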
To combine the contents of multiple text files, use cat:
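A minimal sketch of concatenation, assuming two made-up files, part1.txt and part2.txt, that are joined into combined.txt:

```shell
# Two hypothetical input files.
tmpdir=$(mktemp -d)
printf 'first\n' > "$tmpdir/part1.txt"
printf 'second\n' > "$tmpdir/part2.txt"

# cat writes the files in the order given; redirect to save the result.
cat "$tmpdir/part1.txt" "$tmpdir/part2.txt" > "$tmpdir/combined.txt"
combined=$(cat "$tmpdir/combined.txt")

printf '%s\n' "$combined"
rm -r "$tmpdir"
```

The output file should not be one of the inputs, since the shell truncates it before cat reads anything.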
You can use tr to translate or delete characters:
tr '[:lower:]' '[:upper:]' < filename.txt: Converts all lowercase letters to uppercase.
tr -s ' ' < filename.txt: Squeezes multiple spaces into a single space.
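The tr operations can be tried directly on piped strings. This sketch uses made-up input text and also shows -d for deleting characters, which the list above mentions but does not demonstrate.

```shell
# Translate lowercase letters to uppercase.
upper=$(printf 'hello world\n' | tr '[:lower:]' '[:upper:]')

# Squeeze runs of spaces into a single space.
squeezed=$(printf 'a   b    c\n' | tr -s ' ')

# Delete every digit.
no_digits=$(printf 'a1b2c3\n' | tr -d '0-9')

printf '%s\n' "$upper" "$squeezed" "$no_digits"
```

tr reads only from standard input, which is why the examples pipe or redirect data into it.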
Understanding redirection is crucial:
command > file.txt: Redirects the output of command to file.txt, overwriting it.
command >> file.txt: Appends the output of command to file.txt.
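The difference between overwriting and appending shows up when the same file is written several times. A small sketch, assuming a made-up out.log in a temporary directory:

```shell
tmpdir=$(mktemp -d)

# > creates the file (or empties it) before writing.
echo 'first' > "$tmpdir/out.log"

# >> adds to the end and keeps existing content.
echo 'second' >> "$tmpdir/out.log"
after_append=$(cat "$tmpdir/out.log")

# Using > again discards everything written so far.
echo 'third' > "$tmpdir/out.log"
after_overwrite=$(cat "$tmpdir/out.log")

printf '%s\n' "$after_append" "$after_overwrite"
rm -r "$tmpdir"
```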
Sort a file's lines and remove duplicates:
sort file.txt | uniq
Use awk to remove duplicate lines while keeping the original order:
awk '!seen[$0]++' file.txt
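The awk one-liner works because seen[$0]++ evaluates to 0 (false) the first time a line is encountered and to a positive count afterwards, so the negation prints each line only on its first appearance. The sketch below compares it with sort -u on a made-up file named items.txt.

```shell
# Hypothetical unsorted file with duplicates.
tmpdir=$(mktemp -d)
printf 'b\na\nb\nc\na\n' > "$tmpdir/items.txt"

# awk keeps the first occurrence of each line, in input order.
awk_result=$(awk '!seen[$0]++' "$tmpdir/items.txt")

# sort -u also removes duplicates, but reorders the lines.
sort_result=$(sort -u "$tmpdir/items.txt")

printf '%s\n' "$awk_result" "$sort_result"
rm -r "$tmpdir"
```

The awk version holds every distinct line in memory, which may matter for very large files. The sort-based version does not preserve the original order.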