Shell awk useful commands

A few awk commands that I use quite often.

Column sum and conditional selection

awk '$2 > 0 {sum += $2} END {print sum}' *2827_2.txt
awk '{for (i=2; i<=20; i++) if ($i >=100) {print $0; next}}' INPUTFILE
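
A quick way to check the conditional column sum is to run it on a few made-up rows. This is only a sketch: the sample data and the `result` variable are my own, not from the original files.

```shell
# Sum the second column, counting only rows where it is positive.
# The three sample rows are made up for illustration.
result=$(printf 'a 5\nb -3\nc 7\n' | awk '$2 > 0 {sum += $2} END {print sum}')
echo "$result"   # prints 12
```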

Print the lines between two patterns (inclusive):

awk '/START-WORD/, /END-WORD/' input > output
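
A small sketch of how the range pattern behaves, using made-up input (the `START` and `END` markers and the `range` variable are assumptions for illustration). Both matching lines are included in the output, and if the end pattern never appears, awk prints through to the end of the input.

```shell
# Print from the first line matching START through the next line matching END.
range=$(printf 'x\nSTART\nmid\nEND\ny\n' | awk '/START/, /END/')
echo "$range"
```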

Combine multiple rows based on the same column value

awk '$1!=p{if(p)print s; p=$1; s=$0; next}{sub(p,x); s=s $0} END{print s}'
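
To see what this one-liner does, here it is on three made-up rows (the data and the `merged` variable are assumptions). It expects rows with the same key to be adjacent, for example in sorted input. Because `sub()` treats the key as a regular expression, it is safest with keys that contain no regex metacharacters.

```shell
# Rows sharing the same first column are merged onto one line.
merged=$(printf 'a 1\na 2\nb 3\n' |
  awk '$1!=p{if(p)print s; p=$1; s=$0; next}{sub(p,x); s=s $0} END{print s}')
echo "$merged"
```

With this input the two `a` rows collapse into `a 1 2` and the `b` row is printed unchanged.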

Split multiple columns into two columns with common first column

awk -v OFS='\t' '{for (i=2;i<=NF;i++) print $1,$i}'
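
As a sketch on one made-up row (the sample data and the `long` variable are assumptions), the command turns a wide row into one tab-separated pair per value:

```shell
# One input row with key g1 and three values becomes three key/value rows.
long=$(printf 'g1 a b c\n' | awk -v OFS='\t' '{for (i=2;i<=NF;i++) print $1,$i}')
echo "$long"
```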

Are the sub-strings the same?

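awk 'substr($2,1,10)!=substr($4,1,10)' # prints lines where the first 10 characters of $2 and $4 differ
awk '$2/$4>0.9 && $2/$4<1.1' | tr -d \% | awk '$3>0.90' | awk '{print $1 "\t" $3}' # ratio of $2 to $4 between 0.9 and 1.1, then $3 above 0.90 once "%" is stripped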

Show content at a specific line and specific position

awk '(NR==2){print $1}' INPUTFILE

Print the lines of a tab-delimited file where the 8th field is 10090:

awk -F "\t" '$8 == 10090 { print $0 }' myFile

Print fields 1, 2, 3 from a tab-delimited file where the 4th field contains a ‘99’:

awk -F "\t" '$4 ~ /99/ {print $1"\t"$2"\t"$3}' myFile
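
Note that `==` tests for an exact value, while `~ /99/` matches any field that contains 99 anywhere. A minimal sketch with made-up rows (the data and the `hits` variable are assumptions):

```shell
# Only the first row has a 4th field containing "99" (1990), so only it is printed.
hits=$(printf 'a\tb\tc\t1990\nd\te\tf\t2001\n' |
  awk -F '\t' '$4 ~ /99/ {print $1"\t"$2"\t"$3}')
echo "$hits"
```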

Swap two columns

awk '{ t=$2; $2=$14; $14=t; print }' INPUTFILE.TXT # swap $2 and $14 and print out
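
One thing to watch: assigning to a field makes awk rebuild the line using OFS, which is a single space by default, so a tab-delimited file loses its tabs. A minimal sketch that keeps them, assuming tab-delimited input (the sample row and the `swapped` variable are made up):

```shell
# Set both FS and OFS to a tab so the rebuilt line stays tab-delimited.
swapped=$(printf 'a\tb\tc\n' |
  awk 'BEGIN {FS = OFS = "\t"} {t = $1; $1 = $3; $3 = t; print}')
echo "$swapped"
```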

Print 1st and last columns

awk '{print $1 " " $NF}' myFile

Print A to B columns

awk -v A=2 -v B=4 '{ for (i=A; i<B; i++) printf "%s%s", $i, OFS; print $B }' # A and B are example bounds (here columns 2 to 4)
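
A minimal sketch of printing a column range, with the bounds passed in through `-v` (the values 2 and 4, the sample row and the `cols` variable are assumptions for illustration):

```shell
# Print columns A through B, separated by OFS, with a newline after column B.
cols=$(printf 'c1 c2 c3 c4 c5\n' | awk -v A=2 -v B=4 '{
    for (i = A; i <= B; i++)
        printf "%s%s", $i, (i < B ? OFS : ORS)
}')
echo "$cols"   # prints c2 c3 c4
```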

Define delimiter and condition

awk 'BEGIN {FS="\t"}; $7<0.05' FILEIN

Use a shell loop variable in an awk condition

for i in 10 12 16 20; do awk -v i="$i" '$2==i {print $1}' malePhases > "$i.genes"; done

For each row, print number of columns matching the condition:

cat male-heatmap-5nodes.csv | sort | uniq |sed 's/,/ /g'| awk '{for(j=i=2;i<=NF;++i){if($i<0.05) j++}print $1 " " j-2}' # note {}
The same awk program, expanded for readability:

awk '{ for (j=i=2; i<=NF; i++) {
              if ($i < 0.05)
                      j++ # count the columns matching the condition
              }
      print $1 " " j-2
}' INPUTFILE
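
A slightly simpler way to write the same count is to start a counter at zero for each row instead of offsetting `j` by 2. This is only a sketch on made-up rows (the data, the counter `n` and the `counts` variable are assumptions):

```shell
# For each row, count the columns from the 2nd onward that are below 0.05.
counts=$(printf 'g1 0.01 0.2 0.03\ng2 0.5 0.6 0.7\n' |
  awk '{n = 0; for (i = 2; i <= NF; i++) if ($i < 0.05) n++; print $1, n}')
echo "$counts"
```

Here `g1` has two values below 0.05 and `g2` has none, so the output is `g1 2` followed by `g2 0`.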
Z. Lu
Data scientist, bioinformatician, retro fan and web lover.