Mixing whitespace/programming

Posted on 2005-02-18 by ivo :: /programming :: link

The following emacs lisp code visually marks all initial whitespace that mixes both tabs and spaces on the same line:

;; Highlight dangerous whitespace mixing
(defface invalid-whitespace-face
  '((t (:background "red")))
  "Used in programming modes for marking mixed tabs
and spaces.")

(mapcar (lambda (mode)
          (font-lock-add-keywords
           mode
           '(("^\\(\t+ \\| +\t\\)\\s-*" 0
              'invalid-whitespace-face))))
        '(c-mode python-mode ...))

Just add a list of modes for which you want to activate this warning.
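
The same check can be tried outside Emacs. What follows is only a sketch in Python, not part of the Emacs setup above: the regular expression is a translation of the one in the font-lock keyword (with `\s-*` approximated as `[ \t]*`), and the function name `mixed_indent_lines` is made up for illustration.

```python
import re

# Tabs followed by a space, or spaces followed by a tab, at the start of a line.
MIXED_INDENT = re.compile(r"^(?:\t+ | +\t)[ \t]*")


def mixed_indent_lines(text):
    """Return the 1-based numbers of lines whose indentation mixes tabs and spaces."""
    return [
        number
        for number, line in enumerate(text.splitlines(), 1)
        if MIXED_INDENT.match(line)
    ]


if __name__ == "__main__":
    sample = "ok\n\t x = 1\n  \ty = 2\n    z = 3\n\tw = 4"
    print(mixed_indent_lines(sample))
```

Lines indented with only tabs or only spaces are not reported; only the mixed ones are.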

Python decorators/programming

Posted on 2004-11-26 by ivo :: /programming :: link

As another example of what you could do with python 2.4 decorators, I tried to wrap class methods in a database transaction, and this is what it became:

def intransaction(method):
    def wraptransaction(self, *args, **kwargs):
        try:
            self.conn.beginTransaction()
            rv = method(self, *args, **kwargs)
        except:
            self.conn.rollbackTransaction()
            raise
        else:
            self.conn.commitTransaction()
        return rv
    return wraptransaction

class DatabaseInterface(object):
    def __init__(self, **kwargs):
        self.conn = DatabaseConnection(**kwargs)

    @intransaction
    def getSomething(self, id):
        return self.conn.select(id)[0]
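
To see the decorator do its work without a real database, here is a small sketch in current Python 3 syntax. `FakeConnection` is a hypothetical stand-in that only records which transaction calls were made, and the use of `functools.wraps` is an addition that the code above does not have.

```python
import functools


def intransaction(method):
    """Run the wrapped method inside a transaction on self.conn."""
    @functools.wraps(method)  # keep the wrapped method's name and docstring
    def wraptransaction(self, *args, **kwargs):
        try:
            self.conn.beginTransaction()
            rv = method(self, *args, **kwargs)
        except BaseException:
            self.conn.rollbackTransaction()
            raise
        else:
            self.conn.commitTransaction()
        return rv
    return wraptransaction


class FakeConnection:
    """Hypothetical connection that logs transaction calls instead of doing I/O."""

    def __init__(self):
        self.log = []

    def beginTransaction(self):
        self.log.append("begin")

    def commitTransaction(self):
        self.log.append("commit")

    def rollbackTransaction(self):
        self.log.append("rollback")

    def select(self, id):
        if id < 0:
            raise KeyError(id)  # simulate a failing query
        return ["row %d" % id]


class DatabaseInterface:
    def __init__(self):
        self.conn = FakeConnection()

    @intransaction
    def getSomething(self, id):
        return self.conn.select(id)[0]
```

A successful call leaves `begin` and `commit` in the log; a call that raises leaves `begin` and `rollback`, and the exception still reaches the caller.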

All applications suck/programming

Posted on 2004-07-21 by ivo :: /programming :: link

Seriously, I don't know what is wrong with me. I just can't seem to find any usable software package lately. Either they all suck, or I'm being too demanding or impatient. Probably both: there's a lot of crappy software out there, and that includes my own. Most software depends on a specific database system or a specific programming language, or has far too many dependencies (have a look at slash or scoop for example, gah).

Last Monday I ran into a fixed limit in the X protocol. I had opened roughly 220 X terminals, at which point I couldn't open any more because the X protocol somehow allows only 240 client connections. (Yeah I know, I have a habit of not closing them when I've done something; have a look at this screenshot to get an idea of what my desktop usually looks like.) Why is there even a limit like this? It probably has something to do with the X protocol being invented when computers were slow, network bandwidth was limited and terminals seemed like a good idea, so fixed field lengths were used. There's Fresco, but I have never looked at it long enough to tell whether or not it fixes these things.

Isn't there some kind of way to make these things more generic? I don't want to install php if I have perl available, I don't want to install mysql if I already have postgresql installed. Hell, if I wanted to use XML files or Oracle, why not use those? And I certainly don't like to be forced to use a certain desktop environment—if you're bored and have some internet bandwidth to waste, try to install konqueror on an otherwise pure GNOME-system.

I often threaten to write my own programming language, my own operating system, and given the time and motivation I'd probably rewrite everything in existence to fix limitations like these.

The latest project I'm working on (kansloos.it) involves selecting a news or weblog package, something that allows me to post short stories, lets users submit stories, and some level of integration with several forums on the same site. I've been increasingly frustrated by blosxom, and from talking on IRC I've heard a lot of similar sentiments about other weblog packages. Now it tends to go in the direction of writing yet another weblog implementation, which probably also has its own constraints and weird limitations, and it won't be finished for a while either.

So what am I going to do? I just wrote “write weblog system” on my todo-list…

Configuring exim4/programming

Posted on 2004-06-30 by ivo :: /programming :: link

 ┌──────────────────┤ Configuring Exim v4 (exim4-config) ├───────────────────┐
 │                                                                           │
 │ The headers of outgoing mail can be rewritten to make it appear to have   │
 │ been generated on a different system, replacing                           │
 │ "phoenix.office.next-element.nl" "localhost" and "" in From, Reply-To,    │
 │ Sender and Return-Path.                                                   │
 │                                                                           │
 │ Hide local mail name in outgoing mail?                                    │
 │                                                                           │
 │                    <Yes>                       <No>                       │
 │                                                                           │
 └───────────────────────────────────────────────────────────────────────────┘

So, what should I choose if I want to allow overriding the From-header?

Abstract methods in python (4)/programming

Posted on 2004-01-26 by ivo :: /programming :: link

Well, that was interesting. Rigel pushed me to submit the code to the ASPN cookbook, which I did. Then he told me that my entry was included in today's issue of daily python (apparently they do that to all entries from the ASPN cookbook). Incidentally I fixed a few minor bugs in the code. The complete code is downloadable here: abstractmethods.py.

Abstract methods in python (3)/programming

Posted on 2004-01-22 by ivo :: /programming :: link

Ok, here is the same code as in the last two articles, this time with more explanation. The point is that python doesn't have a notion of “abstract methods.” Abstract methods are part of a base class that defines an interface, without any code. Abstract methods can't be called directly, because they don't contain any code in their definition.

In the definition of the base class, you may want to include a specific method that is part of the interface, but the specific implementation is still unknown. A popular example seems to be the drawing of a point or a line in a graphical application.

The classes Point and Line share several implementation details, but differ on others. In particular, the way they are drawn is completely different (you will want to optimize the drawing of a line). Suppose these two classes are derived from the same class, Object. It is possible to separate the implementation of the method draw of these two classes, while draw can still be called from the base class Object.

The text below will introduce some utility classes that make this possible.

The goal of this article is to find a way to define classes such as the following (not yet paying attention to the proper syntax):

class Object (object):
    abstract draw()

    def update(self):
        self.draw()

class Point (Object):
    def draw(self):
        ...
    ...

class Line (Object):
    def draw(self):
        ...
    ...

The method draw of the class Object cannot be implemented, because the concept of ‘drawing’ a generic object is undefined. Other methods, such as the update above may want to use draw anyway, because it is part of the specification for the Object class and its descendants.

The implementation in python consists of two parts:

  1. The definition of a way to declare abstract methods, and
  2. a way to restrict the creation/usage of these abstract classes.

First the declaration part. To declare an abstract method, we can use callable class variables:

class Object (object):
    draw = AbstractMethod()

When somebody tries to call Object.draw(), an exception will be raised. But as long as methods in Object use self.draw(), they will actually use Point.draw(), because self will be of type Point.

If AbstractMethod is a class, draw will be an instance of this class, so we can make draw callable, and raise a proper exception (TypeError or NotImplementedError for example) if it is called instead of an implementation in one of the descendant classes.

class AbstractMethod (object):
    def __init__(self, func):
        self._function = func

    def __get__(self, obj, type):
        return self.AbstractMethodHelper(self._function, type)

    class AbstractMethodHelper (object):
        def __init__(self, func, cls):
            self._function = func
            self._class = cls

        def __call__(self, *args, **kwargs):
            raise TypeError('Abstract method `' + self._class.__name__ \
                            + '.' + self._function + '\' called')

So now we can declare Object as follows:

class Object (object):
    draw = AbstractMethod('draw')
    def update(self):
        self.draw()

If we try to call Object().draw() directly, we get an exception:

>>> Object().draw()
TypeError: Abstract method `Object.draw' called

The same happens with Object().update():

>>> Object().update()
TypeError: Abstract method `Object.draw' called

If we implement a descendant class which implements draw, there is no error.

class Point (Object):
    def draw(self):
        print 'Point.draw called'

(Note that there is no definition for update in Point, it uses the implementation inherited from Object.)

>>> Point().update()
Point.draw called

Of course, we shouldn't be getting an exception at all if we try to call an abstract function. It should be impossible to create an instance of a class that has one or more abstract methods in its definition (either declared directly in the class definition, or implicitly via inheritance without overriding it with a real method). We can solve this pretty easily by declaring a metaclass that checks whether there are any abstract methods in a class definition, and raises an exception on instantiation if there are.

class Metaclass (type):
    def __init__(cls, name, bases, *args, **kwargs):
        type.__init__(cls, name, bases, *args, **kwargs)
        cls.__new__ = staticmethod(cls.new)

        abstractmethods = []  # Names of methods that are still abstract
        ancestors = list(cls.__mro__)
        ancestors.reverse()  # Start with __builtin__.object
        for ancestor in ancestors:
            for clsname, clst in ancestor.__dict__.items():
                if isinstance(clst, AbstractMethod):
                    abstractmethods.append(clsname)
                else:
                    if clsname in abstractmethods:
                        abstractmethods.remove(clsname)

        abstractmethods.sort()
        setattr(cls, '__abstractmethods__', abstractmethods)

    def new(self, cls):
        if len(cls.__abstractmethods__):
            raise NotImplementedError('Can\'t instantiate class `' + \
                                      cls.__name__ + '\';\n' + \
                                      'Abstract methods: ' + \
                                      ", ".join(cls.__abstractmethods__))

        return object.__new__(self)

The definition of Object becomes:

class Object (object):
    __metaclass__ = Metaclass
    draw = AbstractMethod('draw')

This has the final result:

>>> Point().update()
Point.draw called
>>> Object().update()
NotImplementedError: Can't instantiate class `Object';
Abstract methods: draw

Because the exception is now raised when the class is instantiated, the error is caught much earlier.

There is one remaining point: descendant classes of Object that don't implement all the abstract methods defined in Object cannot be instantiated either:

>>> class FooClass (Object):
...     pass
>>> FooClass()
NotImplementedError: Can't instantiate class `FooClass';
Abstract methods: draw

The code in the last article didn't do this, but the code in this article checks all ancestors for any abstract methods that haven't been implemented.
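
The listings in this article use Python 2 syntax (`__metaclass__`, the `print` statement). As a rough sketch of how the same idea might look in Python 3, under a few assumptions: the attribute name `_abstract_names` is made up here (the name `__abstractmethods__` is avoided because Python 3's own `abc` machinery gives it a special meaning), the methods return strings instead of printing, and the instantiation check lives in the metaclass `__call__` rather than in a replaced `__new__`, which is a different technique from the one shown above.

```python
class AbstractMethod:
    """Descriptor marking an attribute name as an abstract method."""

    def __init__(self, name):
        self._name = name

    def __get__(self, obj, owner):
        def raiser(*args, **kwargs):
            raise TypeError("Abstract method `%s.%s' called"
                            % (owner.__name__, self._name))
        return raiser


class Metaclass(type):
    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        abstract = set()
        # Walk from object down to cls, so that an override in a descendant
        # removes a name that an ancestor declared abstract.
        for ancestor in reversed(cls.__mro__):
            for attr, value in vars(ancestor).items():
                if isinstance(value, AbstractMethod):
                    abstract.add(attr)
                else:
                    abstract.discard(attr)
        cls._abstract_names = sorted(abstract)  # hypothetical attribute name

    def __call__(cls, *args, **kwargs):
        # Refuse to build an instance while abstract methods remain.
        if cls._abstract_names:
            raise NotImplementedError(
                "Can't instantiate class `%s';\nAbstract methods: %s"
                % (cls.__name__, ", ".join(cls._abstract_names)))
        return super().__call__(*args, **kwargs)


class Object(metaclass=Metaclass):
    draw = AbstractMethod("draw")

    def update(self):
        return self.draw()


class Point(Object):
    def draw(self):
        return "Point.draw called"


class FooClass(Object):
    pass
```

With these definitions `Point().update()` reaches `Point.draw`, while both `Object()` and `FooClass()` raise `NotImplementedError`. Current Python versions also ship an `abc` module in the standard library that covers this ground.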

Abstract methods in python (2)/programming

Posted on 2004-01-22 by ivo :: /programming :: link

The fun never stops!

class Object (object):
    __metaclass__ = Metaclass

class Metaclass (type):
    def __init__(cls, name, bases, *args, **kwargs):
        type.__init__(cls, name, bases, *args, **kwargs)
        cls.__new__ = staticmethod(cls.new)

        abstractmethods = []
        for clsname, clst in cls.__dict__.items():
            if isinstance(clst, AbstractMethod):
                abstractmethods.append(clsname)

        abstractmethods.sort()
        setattr(cls, '__abstractmethods__', abstractmethods)

    def new(self, cls):
        if len(cls.__abstractmethods__):
            raise NotImplementedError('Can\'t instantiate class `' + \
                                      cls.__name__ + '\';\n' + \
                                      'Abstract methods: ' + \
                                      ", ".join(cls.__abstractmethods__))

        return object.__new__(self)

class MyAbstractObject (Object):
    foo = AbstractMethod('foo')

class MyObject (MyAbstractObject):
    def foo(self):
        print 'foo'

def main():
    a = MyObject()
    a.foo()
    b = MyAbstractObject()
    b.foo()

> python test.py
foo
Traceback (most recent call last):
  File "test.py", line 25, in ?
    main()
  File "test.py", line 21, in main
    b = MyAbstractObject()
  File "/home/ivo/p/python/abstract-classes/Metaclass.py", line 29, in new
    raise NotImplementedError('Can\'t instantiate class `' + \
NotImplementedError: Can't instantiate class `MyAbstractObject';
Abstract methods: foo

Abstract methods in python/programming

Posted on 2004-01-22 by ivo :: /programming :: link

Classes are fun!

class AbstractMethod (object):
    def __init__(self, func):
        self._function = func

    def __get__(self, obj, type):
        return self.AbstractMethodHelper(self._function, type)

    class AbstractMethodHelper (object):
        def __init__(self, func, cls):
            self._function = func
            self._class = cls

        def __call__(self, *args, **kwargs):
            raise TypeError('Abstract method `' + self._class.__name__ \
                            + '.' + self._function + '\' called')

class MyAbstractObject (object):
    foo = AbstractMethod('foo')

class MyObject (MyAbstractObject):
    def foo(self):
        print 'foo'

def main():
    a = MyObject()
    a.foo()
    b = MyAbstractObject()
    b.foo()

> python test.py
foo
Traceback (most recent call last):
  File "test.py", line 25, in ?
    main()
  File "test.py", line 22, in main
    b.foo()
  File "/home/ivo/p/python/abstract-classes/AbstractMethod.py", line 19, in __call__
    raise TypeError('Abstract method `' + self._class.__name__ \
TypeError: Abstract method `MyAbstractObject.foo' called

A truly free mind/programming

Posted on 2004-01-15 by ivo :: /programming :: link

When complaining on IRC about the time I spent on writing articles for this weblog, someone jokingly said that I should put a PayPal banner on my website. But I tend to hate paypal.

But another thought struck me then, which is that I don't want to receive any money for the work that I do for this site. All the texts on this site are licensed for redistribution, as long as you give me credit, and keep the copyright notices intact (details). I don't do this because I'm so philanthropic; I just want anyone to be able to take my texts or code or images or anything else, modify them for their needs and redistribute the result.

Basically this started long before I came into contact with the free software movement. I had written some code that I wanted to give away to friends. I saw that they were using it, which felt good. But I was young and naive at that time.

Later I found out about the existence of Linux, and the idea of choice appealed to me. I installed it, and gradually became aware of all the projects that surrounded it: the GNU Project, the Free Software Foundation (FSF), the League for Programming Freedom. As time went on, I came into contact with GIMP. To install it, you needed lesstif. I had heard of Motif before, and I thought it was good that there was a project that wanted to provide a free version of it. That lesstif has been licensed under a non-free license for a long time didn't bother me then. It was better than nothing.

Gradually I became aware of the real reason for the existence of organizations such as the FSF. I started releasing my own code under licenses such as the GNU General Public License (GNU GPL).

As I started to become more involved in the free software movement, starting with the GNU Translation Project, I came into contact with the real values of licensing code under free licenses, and I learned to understand not only the virtues, but also the responsibilities that come with code released under the GPL. Being a user of (almost) only free software, I was already aware of the expectations of other programmers, and I learned to apply those feelings to my own software.

My code became free, and my mind followed. It was a gradual process, but I can't say I'm sorry it happened. I'm glad. But paypal still sucks.

Blosxom plugin: cvs/programming/blosxom

Posted on 2004-01-12 by ivo :: /programming/blosxom :: link

I was looking for a way to have the $Revision: 1.1 $ that CVS inserts in the theme file recognized and/or ignored by blosxom, so I could show the latest version of the HTML template in the page. Because the only plugin on the blosxom website that appears to do something that I think I wanted—cvsinfo—is unavailable, I wrote my own plugin to do it. You can download it here.
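
The substitution described in the plugin's synopsis can be illustrated with a few lines of Python. This is only a sketch of the idea, not the plugin's code (which is a Blosxom plugin, so Perl); the regular expression and the function name `strip_cvs_keywords` are assumptions made for the example.

```python
import re

# An expanded CVS keyword looks like "$Keyword: value $".
# Unexpanded keywords such as "$Revision$" have no colon and are left alone.
CVS_KEYWORD = re.compile(r"\$[A-Za-z]+:\s*([^$\n]*?)\s*\$")


def strip_cvs_keywords(text):
    """Replace every expanded CVS keyword with the value after the colon."""
    return CVS_KEYWORD.sub(r"\1", text)


if __name__ == "__main__":
    print(strip_cvs_keywords("Template version $Revision: 1.1 $"))
```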

Documentation

NAME

Blosxom Plug-in: cvs

SYNOPSIS

Replaces CVS keywords (such as $Id: cvs,v 1.2 2004/01/12 12:59:43 ivo Exp $) with the part after the :.

INSTALLATION

Drop the cvs plug-in into your Blosxom plugins folder.

CONFIGURATION

None necessary.

VERSION

1.2

AUTHOR

Ivo Timmermans <ivo@o2w.nl>, http://www.lychnis.net/

LICENSE

cvs Blosxom Plug-in
Copyright 2004, Ivo Timmermans <ivo@o2w.nl>

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Blosxom theme: lychnis/programming/blosxom

Posted on 2004-01-12 by ivo :: /programming/blosxom :: link

I finally made the theme for this website (http://www.lychnis.net/) available for download: lychnis 1.5.

The theme needs at least these plugins: better_title, cvs, headlines, sitelinks, theme, writeback, xhtml. Some of these plugins required tweaking to make them work; sometimes I had to correct the HTML code in them to be valid XHTML. I'll publish these changes once I've cleaned them up, or maybe I will submit them to the original author(s) of the plugins.

The original idea for the layout is copied from a design on OSWD, called libra. Among other things I changed it to be XHTML 1.1 compliant and to make more use of CSS.

Coding style guide for C code/programming

Posted on 2004-01-12 by ivo :: /programming :: link

After reading Fruit's Source Code Style, I have taken the document and modified it to match my own coding style. The result is this Coding style guide. Thanks to warp and Garion for their comments. More comments are always welcome.

It basically documents the style that I like to program in. This will differ from other people's. It is certainly not prescriptive, only indicative of what I think yields readable source code for any project.

Blosxom theme: scratchpad/programming/blosxom

Posted on 2004-01-09 by ivo :: /programming/blosxom :: link

I made the theme that I'm using for this page available here: version 0.1. It's pretty easily customizable with CSS. If your browser offers the possibility to select alternative stylesheets when viewing this page you can try it now to get an idea of what would be possible. The theme itself is loosely based on the iztsu theme.

The theme needs at least these plugins: archives, bloglinks, breadcrumbs, categories, find, htmllinks, readme, sitelinks, theme, writeback, xhtml. Some of these plugins required tweaking to make them work; sometimes I had to correct the HTML code in them to be valid XHTML. I'll publish these changes once I've cleaned them up, or maybe I will submit them to the original author(s) of the plugins.

Blosxom plugin: sitelinks/programming/blosxom

Posted on 2004-01-08 by ivo :: /programming/blosxom :: link

In the header of the page you'll see a new bar, with a list of shortcuts to more information on my website. To do this, I took the bloglinks plugin and changed it to a sitelinks plugin. The change to the code is minimal: the only thing that's really different is that it expects the contents to be aligned next to each other instead of in a list structure. You can download the code here.

Lython/programming

Posted on 2004-01-07 by ivo :: /programming :: link

What's that, did Hell just freeze over? I didn't see this one coming, but now it's possible to compile lisp code to python bytecode with lython.

On the website it claims that the lisp code “resembles common lisp.” I took a look at the example code fragments that are distributed with the code. It looks as if lython is just a wrapper for python, that takes lisp syntax and translates it to python syntax. And it's only the lisp syntax that has been implemented so far, no native lisp functions (all function calls are directly translated to python function calls). But Common Lisp is so much more than just syntax, it's a massive amount of function calls. These will have to be supported before lython becomes a useful program. Until that time, I fear that lython will be nothing more than a toy to scare your friends with on a cold winter night.

Design Patterns in Python/programming

Posted on 2003-12-18 by ivo :: /programming :: link

While searching for an algorithm to sort a graph topologically, I found an online version of the book Data Structures and Algorithms with Object-Oriented Design Patterns in Python. This should be mandatory reading for anyone wishing to do something a little bit more complicated in python. (By the way, the website has implementations in Java, C++ and C# as well.)

The author was nice enough to include a link to a fully working python package, in which the classes from the book have been completed and extended to a very nice and easy to use library. Unfortunately it's unusable in any project, because it has no license. There's a copyright notice in the package, and a little notice about the copyright on the texts on his website, but no license.

I have mailed the author, asking for an explanation. Let's see if he answers, and what he has to say…

Released python-gnutls 0.2/programming/python-gnutls

Posted on 2003-12-01 by ivo :: /programming/python-gnutls :: link

I have just released version 0.2 of my python wrapper for gnutls. The project homepage for python-gnutls is http://home.o2w.net/~ivo/python-gnutls/. The released files are here.

The changes since version 0.1 include:

  • New classes 'server' and 'conn' have been added. The classes 'client' and 'server' are derived from 'conn', and most methods from 'client' have been moved to 'conn'.
  • A method handshake() has been added. The handshake is no longer done implicitly in the gnutls.client constructor.
  • New methods:
    • Class conn: cipher_get, cipher_set_priority, compression_get, compression_set_priority, kx_get, kx_set_priority, mac_get, mac_set_priority
    • Class server: generate_dh_params
  • New constants defining various gnutls functions. They are named exactly like their counterparts in gnutls/gnutls.h, but without the GNUTLS_ prefix.

Reporting bugs in software/programming

Posted on 2003-11-29 by ivo :: /programming :: link

For free software to work, it is essential that people report any bugs they may find to the authors. When the feedback is accurate and correct enough, they can then fix their software.

However, when you've finally tracked down a bug in a software package, it's sometimes a lot of very frustrating work to find out what exactly is causing the software to break. This is essential information to the programmers, without this they usually wouldn't know where to start looking for the cause of this problem (there are exceptions of course).

As a Debian developer, I know that having incomplete information in a report for a bug that doesn't manifest itself on my system can be extremely frustrating and annoying. Asking the user for more information can help, maybe he needs to be guided a little, for example by providing a way to get a gdb backtrace.

Here's the tale of my latest adventure in this area, as a user. About two weeks ago, a CGI script written in Perl was failing mysteriously on a production server. I had checked everything, even changed the locale the script was running in. The code was pretty simple:

if ($value =~ /^$allowed$/m) {

$allowed is .*, and $value contained some UTF-8 text, with an ä in it. Nothing out of the ordinary, but it wouldn't work. The match statement would always be false. After hours of debugging, it turned out that when $allowed was compared to .*, it wasn't equal. We could set it to .*, in which case the expression was true. (Using $allowed = join('', split('', $allowed)) didn't help either, but maybe perl optimized that a bit.)

So, we looked at the perl bugs. The system I was developing this on was running Debian stable (woody), which has perl 5.6. So you have to look at bugs in perl 5.6. Or Debian bugs for the stable release. The bug under examination may or may not have been fixed already, either in new releases, CVS code, Debian patches, mailing list posts, or somewhere else entirely.

On the other hand, the bug might not even be in perl. The value of $allowed is passed on via a complicated structure of hashrefs and arrays from a parsed XML file, using XML::Simple. So maybe that module is at fault. XML::Simple gets its data from expat, so it might even be expat.

All these little bits of information make it pretty hard to find out if a bug has been fixed or not. I have been trying to see if the bug exists in more recent versions of perl, but so far I haven't been able to reproduce the situation well enough. For one thing, I would have to set up a system that is exactly the same as the production platform, which may take up quite a bit of time. And of course, time is money.

So, what do you do, report the bug or not, knowing that the information you have is incomplete, probably inconsistent and maybe even incorrect; knowing that the developers may ignore or flame you for your report?

I didn't. Hacking around the problem by replacing ä with &auml; was much easier.

Python wrapper for gnutls/programming/python-gnutls

Posted on 2003-11-05 by ivo :: /programming/python-gnutls :: link

I've re-started on my little project to create a simple python wrapper for the GNU TLS library (gnutls). The code is available from CVS only for now. I'll create a more permanent website for it in the wiki, under PythonGnutls.

This time I'm not using SWIG, mostly because I couldn't find out how to create a custom class, without resorting to C++; and I wanted to understand better what SWIG is trying to do for me. Maybe I'll switch back at some time in the future, when the first issue is solved.

Dutch programming contest/programming

Posted on 2003-10-27 by ivo :: /programming :: link

Last weekend, I participated in the Dutch rounds of the InterCollegiate Programming Contest (ICPC). Since we finished highest of all teams from Delft, we can continue to the Northwest European finals (NWERC), in Lund, Sweden. The final score list puts our team (ECFh) in sixth place, but Quintiq and ASML are companies, so they don't count on the score list for students.

While this is good news in general, it is weird. Instead of sending the top-10 teams to the NWERC, the top-x teams from each university are admitted. I'm not fully familiar with the rules and regulations of the admission policy, but this strategy seems flawed.

The problem set was horrible. The problems were written very badly, with clear errors and very vague wording. The examples weren't really supporting the text, and sometimes a restriction was only given in the explanation for the example input/output.

I realize it's not easy to write a clear, challenging problem set, that still leaves enough pitfalls to make it interesting. But please, don't clutter the goal of these contests with weird requirements. For example, there was a problem in which the input was given in Roman numerals. The problem was hard enough to do in decimal numbers, the Roman numerals just make it harder to verify input and output. I don't think that this added value to that particular problem.

Commonly confused characters/programming

Posted on 2003-10-22 by ivo :: /programming :: link

When looking for the html entity for an ellipsis (&hellip;), I came across this page. It shows the difference between the different apostrophes, double quotes, dashes and spaces. For each character, the author lists how to create it in UTF-8, HTML and LaTeX.

The page is a good read, even if you already know the difference.

Why sed rules/programming

Posted on 2003-10-22 by ivo :: /programming :: link

Or: why you should use perl when you notice that your sed expression is becoming far too complicated.

sed -e 's/^\([0-9]\+\);--;\([0-9]\+\);\([0-9]\+\);;\([0-9]\+\);--;\([0-9
]\+\);\([0-9]\+\);;\([0-9]\+\);--;\([0-9]\+\);\([0-9]\+\);;/pa=\1-\2\&za
=\3\&pb=\4-\5\&zb=\6\&pc=\7-\8\&zc=\9;/g' -e 's/zc=\([0-9]\+\);\([0-9]*\
);-\?-\?;\([0-9]*\);\([0-9]*\)/zc=\1\&pd=\2-\3\&zd=\4/g' | sed -e '=' |
sed -e 's/^/+/;N;s/^+\([0-9]\+\)\n/\1 /' | sed -e 's/^\([0-9]\+\) pa=\([
0-9]\+-[0-9]\+\)&za=\([0-9]\+\)&pb=\([0-9]\+-[0-9]\+\)&zb=\([0-9]\+\)&pc
=\([0-9]\+-[0-9]\+\)&zc=\([0-9]\+\)&pd=\([0-9]*-[0-9]*\)&zd=\([0-9]*\)$/
pa\1=\2\&za\1=\3\&pb\1=\4\&zb\1=\5\&pc\1=\6\&zc\1=\7\&pd\1=\8\&zd\1=\9/g
' | tr '&' '\n'

The first thing I ran into is that sed only handles nine backreferences. I should have switched then, but I was stubborn and managed to do it anyway using the trick of running sed twice on the same line.

I should have switched to perl or python or whatever else, but I almost had it working... until line numbers had to be added. I found an example in the info page, using the = and N commands. It worked, but since they had to be inserted in each line in the output, another nasty regular expression emerged.

It worked, and luckily the input wasn't too big, but I really should have done this in perl right from the start, like I usually do…
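
For comparison, here is a sketch of what the pipeline above appears to do, written in Python instead of perl. The input format is inferred from the sed expressions alone (four groups of three numbers per line, the last group with optional digits), so treat it as an assumption; the function name `convert` is made up, and lines that do not match are simply skipped here, which is not what sed would do.

```python
import re

# Twelve capture groups in one expression: no nine-backreference limit here.
LINE = re.compile(
    r"^(\d+);--;(\d+);(\d+);;"   # pa (two numbers) and za
    r"(\d+);--;(\d+);(\d+);;"    # pb and zb
    r"(\d+);--;(\d+);(\d+);;"    # pc and zc
    r"(\d*);-?-?;(\d*);(\d*)$"   # pd and zd, digits optional
)


def convert(lines):
    """Turn each matching input line into eight numbered key=value lines."""
    out = []
    for number, line in enumerate(lines, 1):
        match = LINE.match(line.rstrip("\n"))
        if match is None:
            continue  # assumption: ignore lines in an unexpected format
        groups = match.groups()
        for index, letter in enumerate("abcd"):
            first, second, z = groups[3 * index:3 * index + 3]
            out.append("p%s%d=%s-%s" % (letter, number, first, second))
            out.append("z%s%d=%s" % (letter, number, z))
    return out


if __name__ == "__main__":
    for item in convert(["1;--;2;3;;4;--;5;6;;7;--;8;9;;10;--;11;12"]):
        print(item)
```

The line number comes from `enumerate`, so the extra passes with `=` and `N` are not needed.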