List of useful text transformation commands
Commands for transforming text files in bash
Convert encoding:
iconv -f CP1252 -t UTF-8 example.txt > example-utf-8.txt
Convert Xrumer files from Windows-1251 encoding to UTF-8 (//IGNORE drops characters that cannot be converted; the utf8/ directory must already exist):
iconv -f windows-1251 -t UTF-8//IGNORE example.txt > example-utf-8.txt
iconv -f windows-1251 -t UTF-8//IGNORE 4Success.txt > utf8/4Success.txt
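To convert many files at once, the same iconv call can be wrapped in a loop. The following is a minimal sketch, assuming the input files are Windows-1251 encoded and end in .txt; the temporary directory and the sample.txt file are made up for the demonstration.

```shell
# Work in a throwaway directory so the demo does not touch real files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample: the Russian word "Привет" written as Windows-1251 bytes.
printf '\317\360\350\342\345\362\n' > sample.txt

# Create the output directory first; the redirect fails if it is missing.
mkdir -p utf8

# Convert every .txt file, keeping the original file names.
for f in *.txt; do
  iconv -f WINDOWS-1251 -t UTF-8//IGNORE "$f" > "utf8/$f"
done

cat utf8/sample.txt
```

Quoting "$f" keeps file names with spaces intact.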
Text files cleanup: remove duplicates and empty lines:
awk 'NF && !seen[$0]++' inputfile.txt > outputfile.txt
grep -v "^[[:space:]]*$" in.txt | sort -u   # sorts the output; uniq alone only drops adjacent duplicates
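A small demonstration may help when choosing between the two approaches. The awk form keeps the first occurrence of each line in its original order, while uniq on its own only collapses duplicates that sit next to each other. The sample input below is invented for illustration.

```shell
tmp=$(mktemp)
# Sample input with repeated lines, one empty line and one whitespace-only line.
printf 'b\na\n\nb\n   \na\nc\n' > "$tmp"

# NF is 0 for blank lines; seen[$0]++ is 0 only the first time a line appears.
cleaned=$(awk 'NF && !seen[$0]++' "$tmp")
printf '%s\n' "$cleaned"

# Without sorting first, uniq leaves the non-adjacent duplicates in place.
uniq_only=$(grep -v '^[[:space:]]*$' "$tmp" | uniq)
printf '%s\n' "$uniq_only"
```

The first command prints b, a, c; the second still prints b, a, b, a, c.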
Count number of lines in file:
wc -l file.txt
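When the count is needed in a script, the file name that wc prints after the number gets in the way. A minimal sketch, using a made-up three-line file: reading from standard input makes wc print only the number.

```shell
tmp=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmp"

# With input redirection there is no file name in the output.
count=$(wc -l < "$tmp")
# Some wc implementations pad the number with spaces; arithmetic expansion strips them.
count=$((count))
echo "$count"
```

Note that wc -l counts newline characters, so a last line without a trailing newline is not counted.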
Split files by lines:
split -l <lines-per-file> file.txt
- files named xaa, xab, ... (xa*) will be generated, each with at most <lines-per-file> lines
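split also accepts an output prefix as its last argument, which is easier to track than the default x prefix. A short sketch with an invented five-line file and a hypothetical part_ prefix:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '1\n2\n3\n4\n5\n' > file.txt

# Two lines per chunk; output files are part_aa, part_ab, part_ac.
split -l 2 file.txt part_
ls part_*

# Concatenating the chunks in name order reproduces the original file.
cat part_* > rejoined.txt
```

The last chunk holds whatever lines remain, here a single line.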
Find and remove files:
find . -name "Success.txt" -exec rm -rf {} \;
find . -name "Success.txt" -print0 | xargs -0 rm -rf
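Because deletion cannot be undone, it may be worth listing the matches before removing them. The sketch below runs against a throwaway directory tree created only for the demonstration; the directory and file names are made up.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/a" "$workdir/b c"
touch "$workdir/a/Success.txt" "$workdir/b c/Success.txt" "$workdir/a/keep.txt"

# Step 1: preview what would be removed.
find "$workdir" -type f -name "Success.txt" -print

# Step 2: remove the matches; {} + passes many paths to a single rm call
# and copes with spaces in path names.
find "$workdir" -type f -name "Success.txt" -exec rm -f {} +

remaining=$(find "$workdir" -type f | wc -l)
echo "$((remaining))"
```

Restricting the match with -type f avoids removing a directory that happens to share the name.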