awk - Print characters before matching a pattern
I have the following combination of awk commands:
awk '{if ($0~/>/) {head=$0; getline} if ($0~/pattern/) print head"\n"$0}' filename1 | awk 'BEGIN {pos=0; char=""} {if ($0~/>/) {head=$0; getline} pos=0; if ($0~/pattern/) {pos=match($0,/pattern/); char=substr($0,pos,55)} print head"\n"char}'
The above works great when I want to capture 55 characters once the pattern "aattggcc" has been identified. The problem is: how can I get the 55 characters that prefix the matching pattern (that is, the 55 characters before the match)? Yes, I can write the whole thing in Perl, but since I already have the above in awk, I am wondering if I can modify it somehow.
Thanks.
It is a bit brute force, but you could use a pattern that has 55 periods before aattggcc, for instance:
match($0, /.......................................................aattggcc/) {print substr($0,RSTART,55)}
That should do the trick. It would be better to check whether awk regular expressions support subexpressions.
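On the subexpression question: POSIX awk's match() does not return captured groups (gawk's match() accepts an optional third array argument for that), but it does set RSTART to the position where the match begins, which is enough here. The following is only a sketch under a few assumptions: the function name prefix_before_motif is made up for illustration, header lines are assumed to start with ">", each sequence is assumed to sit on a single line, and a match with fewer than 55 characters in front of it prints whatever prefix is available.

```shell
# Hypothetical helper (name chosen for illustration): for every sequence line
# containing the motif, print the preceding header and up to 55 characters
# that come immediately before the first occurrence of the motif.
prefix_before_motif() {
  awk -v motif="aattggcc" -v n=55 '
    /^>/ { head = $0; next }          # assumption: headers start with ">"
    match($0, motif) {                # match() sets RSTART to the match start
      start = RSTART - n              # where the n-character prefix begins
      if (start < 1) start = 1        # fewer than n characters available
      print head
      print substr($0, start, RSTART - start)
    }
  ' "$@"
}
```

It could be called as `prefix_before_motif filename1`, or fed from a pipe. Clamping start to 1 is a design choice; to skip short prefixes instead, the rule condition could be tightened to `match($0, motif) && RSTART > n`.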
But it might be best to use Python and the pygep library, because Python is used a lot in bioinformatics.