Why sed rules
Posted on 2003-10-22 by ivo :: /programming
Or: why you should use perl when you notice that your sed expression is becoming far too complicated.
sed -e 's/^\([0-9]\+\);--;\([0-9]\+\);\([0-9]\+\);;\([0-9]\+\);--;\([0-9]\+\);\([0-9]\+\);;\([0-9]\+\);--;\([0-9]\+\);\([0-9]\+\);;/pa=\1-\2\&za=\3\&pb=\4-\5\&zb=\6\&pc=\7-\8\&zc=\9;/g' -e 's/zc=\([0-9]\+\);\([0-9]*\);-\?-\?;\([0-9]*\);\([0-9]*\)/zc=\1\&pd=\2-\3\&zd=\4/g' | sed -e '=' | sed -e 's/^/+/;N;s/^+\([0-9]\+\)\n/\1 /' | sed -e 's/^\([0-9]\+\) pa=\([0-9]\+-[0-9]\+\)&za=\([0-9]\+\)&pb=\([0-9]\+-[0-9]\+\)&zb=\([0-9]\+\)&pc=\([0-9]\+-[0-9]\+\)&zc=\([0-9]\+\)&pd=\([0-9]*-[0-9]*\)&zd=\([0-9]*\)$/pa\1=\2\&za\1=\3\&pb\1=\4\&zb\1=\5\&pc\1=\6\&zc\1=\7\&pd\1=\8\&zd\1=\9/g' | tr '&' '\n'
The first thing I ran into is that sed only handles nine backreferences. I should have switched then, but I was stubborn and managed to do it anyway using the trick of running sed twice on the same line.
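For comparison, here is a rough sketch of the same first step in a language whose regular expressions are not capped at nine groups. It is written in Python rather than perl, purely as a stand-in. The field layout is only inferred from the sed expression above (four "from;--;to;count" groups separated by ";;", the last one possibly empty), and the name convert and the sample line are made up for illustration.

```python
import re

# Field layout inferred from the sed expression: three mandatory groups
# "from;--;to;count" terminated by ";;", then a fourth group whose digits
# (and the "--" marker) may be missing. Twelve capture groups in one pattern.
LINE_RE = re.compile(
    r"^([0-9]+);--;([0-9]+);([0-9]+);;"
    r"([0-9]+);--;([0-9]+);([0-9]+);;"
    r"([0-9]+);--;([0-9]+);([0-9]+);;"
    r"([0-9]*);-?-?;([0-9]*);([0-9]*)$"
)

def convert(line):
    """Turn one input line into 'pa=..&za=..&pb=..' form (hypothetical helper)."""
    m = LINE_RE.match(line)
    if m is None:
        return None  # line does not have the assumed layout
    g = m.groups()
    pairs = []
    for i, letter in enumerate("abcd"):
        lo, hi, count = g[3 * i:3 * i + 3]
        pairs.append(f"p{letter}={lo}-{hi}")
        pairs.append(f"z{letter}={count}")
    return "&".join(pairs)

# Made-up sample line, not real input data.
print(convert("1;--;2;3;;4;--;5;6;;7;--;8;9;;10;--;11;12"))
# pa=1-2&za=3&pb=4-5&zb=6&pc=7-8&zc=9&pd=10-11&zd=12
```

All twelve fields come out of a single match, so the second substitution that sed needed to get past \9 simply disappears.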
At that point I really should have switched to perl or python or whatever else, but I almost had it working... until line numbers had to be added. I found an example in the info page, using the = and N commands. It worked, but since the line number had to be inserted into every line of the output, another nasty regular expression emerged.
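The numbering step is where a general-purpose language pays off most. A minimal sketch, again in Python as a stand-in for perl, assuming the records are already in the "key=value&key=value" form produced by the first substitution; number_records and the sample records are hypothetical:

```python
def number_records(records):
    """Append the 1-based record number to every key and emit one
    key=value pair per output line (hypothetical helper)."""
    out = []
    for n, record in enumerate(records, start=1):
        for pair in record.split("&"):
            key, _, value = pair.partition("=")
            out.append(f"{key}{n}={value}")
    return out

# Made-up sample records, not real input data.
for line in number_records(["pa=1-2&za=3", "pa=4-5&za=6"]):
    print(line)
# pa1=1-2
# za1=3
# pa2=4-5
# za2=6
```

The counter from enumerate replaces the = and N commands, and splitting on "&" replaces both the last regular expression and the trailing tr.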
It worked, and luckily the input wasn't too big, but I really should have done this in perl right from the start, like I usually do…
