List of useful text transformation commands

Commands useful for transforming text files in bash

Convert encoding:

iconv -f CP1252 -t UTF-8 example.txt > example-utf-8.txt

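Convert Xrumer files from Windows encoding (windows-1251) to UTF-8; //IGNORE drops characters that cannot be converted:

iconv -f windows-1251 -t UTF-8//IGNORE example.txt > example-utf-8.txt
iconv -f windows-1251 -t UTF-8//IGNORE 4Success.txt > utf8/4Success.txt
  • the utf8/ directory must already exist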
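
To convert a whole directory rather than one file at a time, a loop along these lines may help. This is only a sketch: src_dir, out_dir and sample.txt are made-up names, and the sample file is generated on the spot so the snippet can be run as-is. //IGNORE is left out here so that unconvertible bytes show up as errors; add it back to drop them silently.

```shell
# Hypothetical directories for this example
src_dir=$(mktemp -d)
out_dir="$src_dir/utf8"
mkdir -p "$out_dir"          # the output directory has to exist before redirecting into it

# Sample input: the windows-1251 bytes for the Russian word "Привет"
printf '\317\360\350\342\345\362\n' > "$src_dir/sample.txt"

# Convert every .txt file, keeping the original file name
for f in "$src_dir"/*.txt; do
    iconv -f windows-1251 -t UTF-8 "$f" > "$out_dir/$(basename "$f")"
done

cat "$out_dir/sample.txt"
```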

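Text files cleanup, remove duplicates and empty lines:

awk 'NF && !seen[$0]++' inputfile.txt > outputfile.txt
grep -v "^[[:space:]]*$" in.txt | sort -u
  • awk keeps the original line order; uniq on its own only removes adjacent duplicates, so sort -u is used instead, which also sorts the output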
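
The difference between the deduplication approaches is easiest to see on a tiny throwaway file. The sketch below uses a temporary file and made-up variable names, and compares awk, sort -u and plain uniq on the same input.

```shell
tmp=$(mktemp)
printf 'b\na\n\nb\n   \na\n' > "$tmp"   # duplicates are not adjacent; two blank-ish lines

# awk: skips lines with no fields (empty or whitespace-only), keeps first occurrences in order
awk_out=$(awk 'NF && !seen[$0]++' "$tmp")                      # b, a

# grep + sort -u: removes all duplicates, but the result is sorted
sort_out=$(grep -v '^[[:space:]]*$' "$tmp" | sort -u)          # a, b

# grep + uniq: only adjacent repeats are collapsed, so nothing is removed here
uniq_out=$(grep -v '^[[:space:]]*$' "$tmp" | uniq)             # b, a, b, a

printf 'awk:\n%s\n\nsort -u:\n%s\n\nuniq:\n%s\n' "$awk_out" "$sort_out" "$uniq_out"
rm -f "$tmp"
```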

Count number of lines in file:

wc -l file.txt
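
Two details are worth knowing when the count is used in a script. Reading from standard input makes wc print only the number, without the file name. Also, wc -l counts newline characters, so a last line without a trailing newline is not counted. A small sketch on a temporary file (variable names are made up):

```shell
tmp=$(mktemp)
printf 'one\ntwo\nthree' > "$tmp"        # three lines, the last one has no trailing newline

n_wc=$(wc -l < "$tmp")                   # counts newline characters: 2
n_awk=$(awk 'END { print NR }' "$tmp")   # counts records, including the unterminated one: 3

echo "wc -l: $n_wc, awk: $n_awk"
```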

Split files by lines:

split -l <number-of-lines-per-file> file.txt
  • generates files xaa, xab, ... in the current directory, each with at most that many lines (the last one may be shorter)
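
A quick way to check what split produces is to run it on a generated file. In this sketch the working directory and the part_ prefix are choices made for the example; without a prefix argument split writes xaa, xab, ... into the current directory.

```shell
work=$(mktemp -d)
seq 1 10 > "$work/file.txt"                     # 10 lines: 1..10

# 4 lines per piece; the optional last argument sets the output name prefix
split -l 4 "$work/file.txt" "$work/part_"

wc -l "$work"/part_*                            # part_aa: 4, part_ab: 4, part_ac: 2

# Concatenating the pieces in name order gives back the original file
cat "$work"/part_* | cmp - "$work/file.txt" && echo "pieces match original"
```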

Find and remove files:

find . -name "Success.txt" -exec rm -rf {} \;

find . -name "Success.txt" -print0 | xargs -0 rm -rf
  • -print0 and -0 keep paths containing spaces or newlines intact
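
Since deletion cannot be undone, it may be worth listing the matches first and deleting only afterwards. The sketch below works in a temporary directory with made-up file names, including a path with spaces, and assumes a find/xargs that supports -print0 and -0 (GNU and BSD versions do).

```shell
root=$(mktemp -d)
mkdir -p "$root/dir with space" "$root/b"
touch "$root/dir with space/Success.txt" "$root/b/Success.txt" "$root/b/keep.txt"

# Dry run: print what would be removed
find "$root" -type f -name 'Success.txt' -print

# Actual removal; NUL-separated names survive spaces in paths
find "$root" -type f -name 'Success.txt' -print0 | xargs -0 rm -f

remaining=$(find "$root" -type f | wc -l)       # only keep.txt is left
echo "files left: $remaining"
```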