Comments and notes about programming projects and general Linux or Open Source things.

09 November 2008

9:44 PM Term::VT102 0.91
It's been a while, but now there's a new version of Term::VT102. A few people have contacted me about the module over the past few weeks, and then Jörg Walter sent a patch to fix Unicode handling, which resurrected my interest in clearing a few of the TODOs from the list.

So, I cleaned it up a bit and extended the example scripts enough that I could effectively use Term::VT102 as a terminal emulator, then ran things like top and mutt within it to see how it coped. As a result I've fixed a few bugs in escape sequence handling and line wrapping, and added tab stop support and callbacks for title changes and other private message strings.

There is also now an example script to show scrollback buffer processing for things like converting script logs or screen history into a flat file you can read with less without all the cursor positioning stuff getting in the way.

25 June 2008

11:47 PM Server move and upgrade
Recently I moved this web server's services from London to Dallas, which meant building a new installation pretty much from scratch. So instead of being based on a very creaky initial base of Red Hat 7.3, customised and running under UML, it's all now running on CentOS 5 under Xen.

Last night I upgraded the virtual hosts to CentOS 5.2, which went reasonably smoothly, so tonight I went ahead and upgraded the "real" host as well. That didn't go so well. On rebooting, everything came back up, but I couldn't route to any of the virtual hosts any more.

It seems that the updated version of Xen had modified some scripts, which meant I ended up with two bridge devices - my old one, virbr0, containing all of my virtual hosts and an alias for the real host, and a new one, xenbr0, containing a renamed version of the raw Ethernet device plus one more interface I've blotted from my memory. For some reason this stopped all of the iptables DNAT rules from working. SNAT / masquerading for outbound connections worked fine, but inbound data would only go in; the responses wouldn't go back out.

Anyway, if you are trying to get Xen working again after upgrading and are seeing mysterious DNAT failures, try applying these two patches:

--- /etc/xen/scripts/network-bridge.rpmnew 2008-06-21 23:09:32.000000000 +0100
+++ /etc/xen/scripts/network-bridge 2008-05-20 21:14:32.000000000 +0100
@@ -110,8 +110,7 @@
ip addr show dev ${src} | egrep '^ *inet ' | sed -e "
s/inet/ip addr add/
s@\([0-9]\+\.[0-9]\+\.[0-9]\+\.[0-9]\+/[0-9]\+\)@\1@
-s/${src}/dev ${dst} label ${dst}/
-s/secondary//
+s/${src}/dev ${dst}/
" | sh -e
# Remove automatic routes on destination device
ip route list | sed -ne "

--- /etc/xen/scripts/xen-network-common.sh.rpmnew 2008-06-21 23:09:32.000000000 +0100
+++ /etc/xen/scripts/xen-network-common.sh 2008-05-20 21:14:32.000000000 +0100
@@ -120,12 +120,7 @@
ip link set ${bridge} arp off
ip link set ${bridge} multicast off
fi
-
- # A small MTU disables IPv6 (and therefore IPv6 addrconf).
- mtu=$(ip link show ${bridge} | sed -n 's/.* mtu \([0-9]\+\).*/\1/p')
- ip link set ${bridge} mtu 68
ip link set ${bridge} up
- ip link set ${bridge} mtu ${mtu:-1500}
}

# Usage: add_to_bridge bridge dev

I've not looked into why it works; the above is just a reversion to the scripts as they were before upgrading to xen-3.0.3-64.el5_2.1, and it works for me, so I'm happy.
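Diffs like the two above, comparing a packaged *.rpmnew file against the live script beside it, could be produced with something along these lines. This is only a sketch: show_rpmnew_diffs is a made-up helper name, and the directory is whatever you pass in (on a host like the one described it would presumably be /etc/xen/scripts).

```shell
#!/bin/sh
# show_rpmnew_diffs is a hypothetical helper, not part of Xen or RPM.
# For every *.rpmnew file under the given directory, print a unified diff
# from the packaged (.rpmnew) version to the live file next to it.
show_rpmnew_diffs() {
    dir=$1
    find "$dir" -type f -name '*.rpmnew' | sort | while IFS= read -r new; do
        live=${new%.rpmnew}
        # Skip .rpmnew files that have no live counterpart.
        [ -f "$live" ] || continue
        # diff exits with status 1 when the files differ; that is expected.
        diff -u "$new" "$live" || true
    done
}
```

Running show_rpmnew_diffs /etc/xen/scripts and reading the output before touching anything is probably wise, since a reversion like this throws away whatever the update was trying to change.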

06 March 2008

7:30 AM PV 1.1.4
I've finally got around to releasing version 1.1.4 of PV. Elias Pipping and Patrick Collison have been sending patches to improve compilation on Mac OS X, and there are a couple of minor cleanups: left-over IPC resources are cleaned up on termination thanks to Laszlo Ersek, and if you supply a non-numeric argument to an option that needs a number, you now get an error thanks to Boris Lohner.

Incidentally I did finish that toy garage in time, just forgot to update here. The lift needs a bit of work still - some sort of ratchet is needed, as it just slips at the moment. But the rest of it is in active service.

04 September 2007

7:30 AM RHEL 5 intermittent segfaults
For the past couple of months, on 12 servers, I have been seeing intermittent segmentation faults happening with the ssh, scp, and ntpstat commands. Those servers that weren't brand new had not exhibited that behaviour with RHEL 4 in the past; it was only when Red Hat Enterprise Linux 5 was installed that it began.

Two additional servers running RHEL 5 were not showing the same fault, but they weren't of the same type - all affected servers were IBM xSeries or System x with multiple processors and various model numbers, and all had ServeRAID cards.

I couldn't find any mention of such a fault anywhere except on a CentOS bug tracker, as bug ID 0002241.

After a few tests, it turned out that ntpstat would fail to run about 10 times in every 50000, or 0.02% of the time. Each failure, according to strace, was not actually with the program itself but with the attempt to run it - the execve call, which causes the program to be executed, was failing with an EINVAL error code, indicating some sort of problem to do with the ELF interpreter.
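A test of that kind can be sketched as a simple loop. This is not the script used for the figures here, just an illustration: count_exec_failures and success_percent are hypothetical helper names, and treating any exit status of 126 or above as "the shell couldn't run it, or it died from a signal" is an assumption about how the failure shows up.

```shell
#!/bin/sh
# Hypothetical helpers, not the original test script.

# Run a command the given number of times and print how many runs either
# could not be executed or were killed by a signal.
count_exec_failures() {
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        "$@" >/dev/null 2>&1
        status=$?
        # Assumption: 126/127 mean the shell could not run the command, and
        # anything above 128 means death by signal (139 is SIGSEGV on Linux).
        # Ordinary non-zero exits, such as ntpstat's 1 or 2, are not counted.
        if [ "$status" -ge 126 ]; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
}

# Print the success rate as a percentage with two decimal places.
success_percent() {
    awk -v r="$1" -v f="$2" 'BEGIN { printf "%.2f\n", (r - f) * 100 / r }'
}
```

For example, count_exec_failures 50000 ntpstat would print the number of failed runs, and success_percent 50000 10 would turn ten failures into the matching percentage.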

The only thing I could think of that would modify that sort of thing, and which would be nullified by the "replace RPMs with new ones, then replace new RPMs with old ones again" fix that the reporter described in the CentOS bug report, was prelink.

So, I turned it off, by setting PRELINKING=no in /etc/sysconfig/prelink and by running prelink -au. Immediately after doing that, ntpstat worked 100% of the time instead of 99.98%.

I'm presuming that something to do with prelink's address space randomisation was breaking stuff on the servers I'm using, but I am not in a position to test that or to try to find a proper fix, so for now it remains disabled.

In summary then, if you're having weird random segmentation faults and you're sure it's not a fault with your RAM (having tried Memtest86 and Lucifer to check), then run:
prelink -au
sed -i 's/^PRELINKING=yes/PRELINKING=no/' /etc/sysconfig/prelink
...and see if your problems disappear.



Update: I now have the results of testing with different parameters to prelink:

Options to prelink   Test results   Success
-au                  50000/50000    100.00%
-amR                 49988/50000    99.98%
-aR                  49986/50000    99.97%
-am                  50000/50000    100.00%

Each test run had prelink -au run after it followed by another test to make sure success went back to 100%.

Basically, the -R option to prelink seems to be the one that's cocking everything up.



Update: Kernel 2.6.18-53.1.4.el5 appears to fix this problem.

30 August 2007

7:30 AM PV 1.1.0
Version 1.1.0 of PV has been released. This release incorporates some fixes for Mac OS X, a couple of packaging cleanups, a dramatic improvement in the resource usage of the --rate-limit (-L) option, and two new features.

The first new feature, --line-mode (-l), was a Debian wishlist request. This causes PV to count lines instead of bytes. While it's not something I have ever particularly wanted myself, it does sound like it might come in handy occasionally (and, more importantly, it didn't require much to be added to the code to make it work).

The second was one that I have occasionally found myself wanting, particularly during long network data transfers. The --remote (-R) option allows the settings of an already-running PV to be altered. This can be used to change the rate limit while a transfer is in progress, for example, or set PV's idea of the total size of all data to something different.
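As a rough illustration of how these options fit together, here is a sketch using small wrapper functions. The wrapper names (pv_lines, pv_limited, pv_set_rate) are invented for the example; only the pv options themselves come from the release notes above.

```shell
#!/bin/sh
# Hypothetical wrappers around pv; the function names are made up.

# Pass stdin through to stdout, with pv counting lines instead of bytes.
pv_lines() {
    pv --line-mode
}

# Pass stdin through to stdout at no more than the given rate, e.g. 100k.
pv_limited() {
    pv --rate-limit "$1"
}

# Change the rate limit of an already-running pv.
# $1 is that pv's process ID, $2 is the new rate.
pv_set_rate() {
    pv --remote "$1" --rate-limit "$2"
}
```

So a long transfer might be started with something like "pv_limited 100k < bigfile | ssh somehost 'cat > bigfile'" (host and file names are placeholders), and later sped up from another terminal with pv_set_rate followed by the PID of the running pv and a new rate such as 500k.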

28 August 2007

7:30 AM QSF 1.2.7
Version 1.2.7 of QSF has been released. Like the recent PV release, this was prompted by inclusion in the Fedora Project and the resultant need to change the license to Artistic 2.0.

QSF's development is, again like PV, moving from SourceForge to Google Code.

07 August 2007

7:49 PM PV 1.0.1
Version 1.0.1 of PV has been released. This is a code cleanup release, prompted by the discovery that PV has been included in the Fedora Project - version 0.9.9 is available now in FC7 and as an "extra" package in FC6.

It can be interesting to go back to old code and see how the style has changed over time. With a fresh perspective, a few oddities were more obvious, so the occasional untidy section was rewritten and a few more comment blocks were added. The organisation of the functions was changed a bit so that the "command-line program" part is now distinct from the "PV functionality" part, which means if I decide in future to create a library to add progress indicators to other command-line programs it will be significantly easier. Not that it's likely, but it seemed to make things neater.