October 2011 Archives


This is one of those problems for which an Internet search turns up zero results, so I thought I'd make sure that in two years, when it happens to one other person in the world, they can find the fix and save themselves a couple of weeks of head-scratching.

A virtual server I maintain runs Red Hat Enterprise Linux 4 (as a paravirtualised Xen guest), and is used to develop a COBOL application central to business operations. For various reasons, the version of COBOL is ACUCOBOL 6.2.

We'd acquired several new servers to be used as Xen hosts, all much newer and more powerful than the original host, so I decided to move the development VM over to one of them.

50GB+ of data transfer later, I fired up the domU in its new home, and it booted fine. I ran the application's main menu to test it, and... nothing. All I got was:

Memory access violation
COBOL error at 000000 in [[program]]

On investigation with strace I found that the runtime interpreter was segfaulting immediately after loading the program, before even trying to run any COBOL code at all.

Running in HVM mode instead of PV, with a regular kernel in the domU instead of a Xen kernel, caused the problem to go away - but at the cost of terrible performance and only 1 CPU available. However, running an SMP or PAE kernel in HVM mode made the error come back.

It turns out that the problem was caused by the non-executable pages (NX) feature of the newer Xen hosts' CPUs, which is enabled automatically by PAE and Xen kernels (and presumably by the SMP kernel for RHEL4). The old host's CPU lacked the feature, which is why the same VM had run there without complaint.
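If you want to check whether a given machine is affected, a quick way is to look for the `nx` flag in `/proc/cpuinfo`. The following is only a minimal sketch: the function name `cpu_has_nx` is my own, and inside a Xen guest the flags shown are whatever the hypervisor chooses to expose, so treat the result as a hint rather than proof.

```python
def cpu_has_nx(cpuinfo_text: str) -> bool:
    """Return True if a 'flags' line in /proc/cpuinfo-style text lists 'nx'."""
    for line in cpuinfo_text.splitlines():
        # Each line looks like "flags<TAB><TAB>: fpu vme ... nx ..."
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags" and "nx" in value.split():
            return True
    return False
```

Feed it the contents of `/proc/cpuinfo`, e.g. `cpu_has_nx(open("/proc/cpuinfo").read())`. If it returns `True` on the new host and `False` on the old one, you are probably looking at the same problem.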

To fix it, even in PV mode with a Xen kernel, just add this to the domU's kernel command line in GRUB and reboot:


This disables the feature, enabling ACUCOBOL to run.
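After rebooting, it is worth confirming that the option actually reached the kernel by looking at `/proc/cmdline`. The sketch below assumes the option is the standard Linux `noexec=off` boot parameter; that is an assumption on my part for the example, and the helper name `nx_disabled_on_cmdline` is made up for illustration.

```python
def nx_disabled_on_cmdline(cmdline: str) -> bool:
    """Return True if the last noexec= option on the command line is 'off'.

    Assumes the relevant parameter is 'noexec=off'; adjust the prefix
    if your kernel uses a different option.
    """
    setting = None
    for token in cmdline.split():
        if token.startswith("noexec="):
            # Keep the last occurrence if the option appears more than once
            setting = token.split("=", 1)[1]
    return setting == "off"
```

On the guest, call it with the contents of `/proc/cmdline`, e.g. `nx_disabled_on_cmdline(open("/proc/cmdline").read())`. A `False` result after the reboot usually means the GRUB entry that was edited is not the one being booted.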