Dvbmonkey’s Blog

February 27, 2009

Getting started with Open MPI on Fedora

Filed under: linux — dvbmonkey @ 4:36 pm

I recently rediscovered the world of parallel computing after wondering what to do with a bunch of mostly idle Linux boxes, all running various versions of Fedora Core Linux. I found this guide particularly useful and decided to elaborate on the subject here.

Background

Open MPI is an open-source implementation of the Message Passing Interface, which allows programmers to write software that runs on several machines simultaneously. Furthermore, it allows these copies of the program to communicate and cooperate with each other to, say, share the load of an intensive calculation or daisy-chain the results from one ‘node’ to another. This is not new: it’s been around for decades, and today it is one of the main techniques used on supercomputing platforms.

You need two things: firstly, an MPI development suite to build your MPI-capable applications (e.g. Open MPI), and secondly, a client/server queue manager to distribute the programs to remote computers and return the results (e.g. TORQUE). Both components are packaged by the Fedora Project and are readily available.

Setting up the TORQUE server

Firstly, you will need to doctor the /etc/hosts file, placing your preferred hostname in front of “localhost” on the “127.0.0.1” line, for example:

127.0.0.1 mpimaster localhost.localdomain localhost

Now install the following packages using something like YUM. Note that the package torque-client requires some GUI-related libraries (freetype, libX*, tcl, tk etc.) even if you’re not using X on the torque server.

$ sudo yum install torque torque-client torque-server torque-mom libtorque

Next, create the server database and run the setup script. If you get a warning that pbs_server is already running, do a /etc/init.d/pbs_server stop first:

$ sudo /usr/sbin/pbs_server -t create
$ sudo /usr/share/doc/torque-2.1.10/torque.setup root

Now create the following file and put the hostname of this server in it.

/var/torque/mom_priv/config:
$pbsserver mpimaster

Create another file, this will contain a list of all the nodes/clients we’re going to be using. The parameter “np=4” describes the number of processors (or cores) available on this node, in both cases below the client will be a QuadCore processor so I have set “np=4”. If you need to add more nodes to your MPI cluster at a later time, this is where you configure them.

/var/torque/server_priv/nodes:
mpinode01 np=4
mpinode02 np=4

We create another config file, this time just containing the hostname of the server machine.

/var/torque/server_name:
mpimaster

Now we update IPTables to allow incoming connections to the server. Below is an example of my own configuration; the two additional lines are the ones opening up tcp/udp ports 15000 to 15004. Once done, run $ sudo /etc/init.d/iptables restart to pick up the new settings.

/etc/sysconfig/iptables:
# Firewall configuration written by system-config-firewall
# Manual customization of this file is not recommended.
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
-A INPUT -p icmp -j ACCEPT
-A INPUT -i lo -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 22 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 15000:15004 -j ACCEPT
-A INPUT -p udp -m udp --dport 15000:15004 -j ACCEPT

-A INPUT -j REJECT --reject-with icmp-host-prohibited
-A FORWARD -j REJECT --reject-with icmp-host-prohibited
COMMIT

IMPORTANT: Commands are sent to the client nodes over RSH/SSH, so in order to make this all work it’s assumed you’ve set up key-based SSH from the server to each of the client nodes.

All done. A quick restart of the TORQUE server and we’re on to setting up our client nodes.

$ sudo /etc/init.d/pbs_server restart
$ sudo /etc/init.d/pbs_mom restart

Setting up the TORQUE client nodes

Going for speed/efficiency, I devised a one-line shell command to install and configure each of the clients, assuming you are logged on as root:

# yum -y install openmpi torque-client torque-mom && echo -e "192.168.0.240\tmpimaster" >> /etc/hosts && echo "mpimaster" >> /var/torque/server_name && echo "\$pbsserver mpimaster" >> /var/torque/mom_priv/config && /etc/init.d/pbs_mom start

But basically it breaks down into the following:

* Install the client software
$ sudo yum install openmpi torque-client torque-mom

* Add the server’s hostname and address to the /etc/hosts file
# echo -e "192.168.0.240\tmpimaster" >> /etc/hosts

* Set the server’s hostname in the config file(s)
# echo "mpimaster" >> /var/torque/server_name
# echo "\$pbsserver mpimaster" >> /var/torque/mom_priv/config

* Start the service
# /etc/init.d/pbs_mom start

Testing it out

From the ‘mpimaster’ machine, you should be able to issue the command pbsnodes -a and see the client machines connected, e.g.

$ pbsnodes -a
mpinode01
state = free
np = 4
ntype = cluster
status = opsys=linux,uname=Linux pepe 2.6.27.12-170.2.5.fc10.i686.PAE #1 SMP Wed Jan 21 01:54:56 EST 2009 i686,sessions=? 0,nsessions=? 0,nusers=0,idletime=861421,
totmem=5359032kb,availmem=5277996kb,physmem=4146624kb,ncpus=4,loadave=0.00,netload=104310870,state=free,jobs=? 0,rectime=1235751237

mpinode02
state = free
np = 4
ntype = cluster
status = opsys=linux,uname=Linux taz 2.6.27.12-170.2.5.fc10.i686.PAE #1 SMP Wed Jan 21 01:54:56 EST 2009 i686,sessions=? 0,nsessions=? 0,nusers=0,idletime=366959,
totmem=5359048kb,availmem=5277268kb,physmem=4146640kb,ncpus=4,loadave=0.00,netload=46008061,state=free,jobs=? 0,rectime=1235751223

If you see this, congratulations, you are ready to rock! If your client nodes are not connected, check the configuration and network connectivity, and lastly check that the ‘pbs_mom’ service is running on each client; optionally try restarting the ‘pbs_mom’ service.

MPI Development

You’ll need to install a couple of additional packages on your development machine:

$ sudo yum install openmpi openmpi-devel openmpi-libs

Now let’s start with the inevitable ‘Hello World!’ example:

hello.c:

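#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
   int numprocs, rank, namelen;
   char processor_name[MPI_MAX_PROCESSOR_NAME];

   /* Start up the MPI environment */
   MPI_Init(&argc, &argv);
   /* Total number of processes in this job */
   MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
   /* This process's index, from 0 to numprocs - 1 */
   MPI_Comm_rank(MPI_COMM_WORLD, &rank);
   /* Name of the machine this process is running on */
   MPI_Get_processor_name(processor_name, &namelen);
   printf("Hello World! from process %d out of %d on %s\n", rank, numprocs, processor_name);
   /* Shut MPI down cleanly before exiting */
   MPI_Finalize();
   return 0;
}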

Normally we’d just use gcc to build this, but for convenience Open MPI provides an mpicc wrapper which handles the include and library paths for you.

$ mpicc hello.c -o hello

In order to tell Open MPI / Torque where to run your application we must provide it with a “hostfile”, similar to the file /var/torque/server_priv/nodes we made earlier:

./myhostfile:

mpinode01 slots=4
mpinode02 slots=4

Now we’re ready to run it for the first time. Note that in this example I did my development work on the machine acting as the ‘mpimaster’; if you try submitting an MPI job from another machine you might need a slightly different configuration.

$ mpirun --hostfile myhostfile hello
Hello World! from process 0 out of 8 on mpinode01
Hello World! from process 1 out of 8 on mpinode01
Hello World! from process 2 out of 8 on mpinode01
Hello World! from process 3 out of 8 on mpinode01
Hello World! from process 4 out of 8 on mpinode02
Hello World! from process 5 out of 8 on mpinode02
Hello World! from process 6 out of 8 on mpinode02
Hello World! from process 7 out of 8 on mpinode02

Voilà, you have just submitted an MPI task and had it execute on a number of your processors.

MPI makes distribution and communication between copies of your program easy; however, it’s up to you to use this potential to provide a real speed-up in a real-world application. A really simple example is a program that operates on a set of 8 large files. Normally, running on a single processor, you would process these files sequentially. Using MPI you could load 8 copies of your program on 8 processing nodes and have each node process a different file, giving you up to an 8-times speed-up compared to running it on a single processor (assuming the files take a similar time to process and disk or network I/O is not the bottleneck).
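
The file-per-process idea can be sketched in C roughly as follows. This is only a minimal sketch under my own assumptions, not code from the setup described here: the helper name files_for_rank and the round-robin assignment are hypothetical, chosen for illustration. In a real program, rank and numprocs would come from MPI_Comm_rank and MPI_Comm_size exactly as in hello.c, and each process would then open and process only the file indices it is given.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of MPI): decide which of `nfiles` input
 * files the process with the given rank should handle. Files are assigned
 * round-robin, so file i goes to rank (i % numprocs).
 *
 * Writes the selected file indices into `out` (capacity `cap`) and
 * returns how many indices were written.
 */
size_t files_for_rank(int rank, int numprocs, size_t nfiles,
                      size_t *out, size_t cap)
{
    size_t count = 0;

    /* Reject nonsensical rank / process-count combinations */
    if (rank < 0 || numprocs <= 0 || rank >= numprocs)
        return 0;

    /* Start at this rank's first file, then step by the process count */
    for (size_t i = (size_t)rank; i < nfiles && count < cap;
         i += (size_t)numprocs)
        out[count++] = i;

    return count;
}
```

With 8 files and 8 processes every rank gets exactly one file; with only 4 processes each rank gets two, so the best-case speed-up is bounded by the number of processes rather than the number of files.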

I’ve loosely tested the approach described here on different systems running Fedora Core Linux versions 8, 9 & 10. Any questions / comments welcomed!

Troubleshooting
Firstly, try the Open MPI FAQ. Personally, I encountered the following problems:

  • mpirun appears to ‘hang’: caused by iptables; I just shut down iptables to resolve the issue.
  • Fedora Core 7: the package sets the wrong library path in /etc/ld.so.conf
  • Fedora Core 7: the package included with the distribution ‘doesn’t work’, library issues

Updated: 2nd March 2009
Oops! As Jeff Squyres pointed out in his comment below, the way I configured things in the original post meant that “mpirun” just spawned 8 processes on my localhost, not the remote nodes. I’ve reworked the configuration to account for this. Many thanks Jeff!


February 18, 2009

Hauppauge WinTV-NOVA-HD-S2 working on Fedora Rawhide

Filed under: dvb,linux — dvbmonkey @ 12:32 pm

Long story short, I installed Fedora Core 10, enabled fedora-rawhide.repo and upgraded to Rawhide (17th Feb ’09), giving the following package list post install/upgrade:

acl-2.2.47-3.fc10.i386
apr-1.3.3-3.fc11.i386
apr-util-1.3.4-2.fc11.i386
apr-util-ldap-1.3.4-2.fc11.i386
attr-2.4.43-1.fc10.i386
audit-libs-1.7.11-2.fc11.i386
audit-libs-python-1.7.11-2.fc11.i386
authconfig-5.4.7-1.fc11.i386
basesystem-10.0-1.noarch
bash-4.0-0.4.rc1.fc11.i386
binutils-2.19.51.0.2-12.fc11.i386
bzip2-1.0.5-3.fc10.i386
bzip2-libs-1.0.5-3.fc10.i386
ca-certificates-2008-7.noarch
checkpolicy-2.0.16-3.fc10.i386
chkconfig-1.3.41-1.i386
compat-db45-4.5.20-5.fc10.i386
ConsoleKit-libs-0.3.0-3.fc11.i386
coreutils-7.0-7.fc11.i386
cpio-2.9.90-3.fc11.i386
cpp-4.4.0-0.19.i386
cracklib-2.8.13-2.i386
cracklib-dicts-2.8.13-2.i386
cronie-1.2-7.fc10.i386
crontabs-1.10-28.fc11.noarch
curl-7.18.2-9.fc11.i386
cyrus-sasl-lib-2.1.22-21.fc11.i386
db4-4.7.25-9.fc11.i386
db4-utils-4.7.25-9.fc11.i386
dbus-1.2.4.4permissive-1.fc11.i386
dbus-glib-0.80-1.fc11.i386
dbus-libs-1.2.4.4permissive-1.fc11.i386
dbus-python-0.83.0-4.fc11.i386
device-mapper-1.02.30-1.fc11.i386
device-mapper-libs-1.02.30-1.fc11.i386
dhclient-4.1.0-5.fc11.i386
diffutils-2.8.1-22.fc11.i386
dirmngr-1.0.2-1.fc10.i386
dmraid-1.0.0.rc15-5.fc11.i386
dvb-apps-1.1.1-12.fc10.i386
dvbsnoop-1.4.50-99.fc9.i386
e2fsprogs-1.41.4-2.fc11.i386
e2fsprogs-libs-1.41.4-2.fc11.i386
ed-1.1-1.fc10.i386
efibootmgr-0.5.4-4.fc9.i386
elfutils-0.140-1.fc11.i386
elfutils-libelf-0.140-1.fc11.i386
elfutils-libs-0.140-1.fc11.i386
ethtool-6-2.20090115git.fc11.i386
exim-4.69-9.fc11.i386
expat-2.0.1-5.i386
fedora-logos-10.0.1-4.fc11.noarch
fedora-release-10.91-1.noarch
fedora-release-notes-10.0.0-1.noarch
file-5.00-2.fc11.i386
file-libs-5.00-2.fc11.i386
filesystem-2.4.19-1.fc10.i386
findutils-4.4.0-1.fc10.i386
fipscheck-1.0.4-1.fc11.i386
gamin-0.1.10-3.fc11.i386
gawk-3.1.6-4.fc11.i386
gcc-4.4.0-0.19.i386
gdbm-1.8.0-29.fc10.i386
glib2-2.19.7-1.fc11.i586
glibc-2.9.90-3.i686
glibc-common-2.9.90-3.i386
glibc-devel-2.9.90-3.i386
glibc-headers-2.9.90-3.i386
gmp-4.2.4-4.fc11.i386
gnupg2-2.0.10-1.fc11.i386
gpgme-1.1.7-1.fc10.i386
gpg-pubkey-4ebfc273-48b5dbf3
grep-2.5.3-3.fc11.i386
grub-0.97-38.fc10.i386
grubby-6.0.77-1.fc11.i386
gzip-1.3.12-7.fc10.i386
hdparm-9.8-1.fc11.i386
httpd-2.2.11-6.i386
httpd-tools-2.2.11-6.i386
hwdata-0.222-1.fc11.noarch
info-4.13a-1.fc11.i386
initscripts-8.89-1.i386
iproute-2.6.28-2.fc11.i386
iptables-1.4.1.1-2.fc10.i386
iptables-ipv6-1.4.1.1-2.fc10.i386
iputils-20071127-6.fc10.i386
isomd5sum-1.0.5-1.fc11.i386
kbd-1.15-4.fc11.i386
kernel-2.6.27.5-117.fc10.i686
kernel-2.6.29-0.124.rc5.fc11.i586
kernel-firmware-2.6.29-0.124.rc5.fc11.noarch
kernel-headers-2.6.29-0.124.rc5.fc11.i586
keyutils-libs-1.2-3.fc9.i386
kpartx-0.4.8-7.fc10.i386
krb5-libs-1.6.3-17.fc11.i386
kudzu-1.2.85-2.i386
less-424-1.fc10.i386
libacl-2.2.47-3.fc10.i386
libattr-2.4.43-1.fc10.i386
libcap-2.10-2.fc10.i386
libcurl-7.18.2-9.fc11.i386
libdhcp-1.99.8-1.fc10.i386
libdhcp4client-4.0.0-33.fc10.i386
libdhcp6client-1.0.22-1.fc10.i386
libgcc-4.4.0-0.19.i386
libgcrypt-1.4.4-1.fc11.i386
libgomp-4.4.0-0.19.i386
libgpg-error-1.6-2.i386
libidn-0.6.14-9.i386
libksba-1.0.5-1.fc11.i386
libnl-1.1-5.fc10.i386
libpng-1.2.34-1.fc11.i386
libselinux-2.0.77-3.fc11.i386
libselinux-python-2.0.77-3.fc11.i386
libselinux-utils-2.0.77-3.fc11.i386
libsemanage-2.0.31-2.fc11.i386
libsemanage-python-2.0.31-2.fc11.i386
libsepol-2.0.34-1.fc11.i386
libssh2-1.0-1.fc11.i586
libstdc++-4.4.0-0.19.i386
libusb-0.1.12-20.fc10.i386
libuser-0.56.9-2.i386
libutempter-1.1.5-2.fc9.i386
libvolume_id-137-4.fc11.i386
libxml2-2.7.3-1.fc11.i386
linux-atm-libs-2.5.0-5.i386
logrotate-3.7.8-1.fc11.i386
lsof-4.81-2.fc11.i386
lua-5.1.4-1.fc10.i386
lvm2-2.02.44-1.fc11.i386
lzma-4.32.7-1.fc10.i386
lzma-libs-4.32.7-1.fc10.i386
m4-1.4.12-1.fc11.i386
mailcap-2.1.29-1.fc11.noarch
make-3.81-14.fc10.i386
MAKEDEV-3.24-1.i386
mdadm-3.0-0.devel2.1.fc11.i386
mercurial-1.1.2-3.fc11.i386
mingetty-1.08-2.fc9.i386
mkinitrd-6.0.77-1.fc11.i386
module-init-tools-3.7-1.fc11.i386
mpfr-2.4.0-1.fc11.i386
nash-6.0.77-1.fc11.i386
ncurses-5.7-1.20090207.fc11.i386
ncurses-base-5.7-1.20090207.fc11.i386
ncurses-libs-5.7-1.20090207.fc11.i386
net-tools-1.60-91.fc10.i386
newt-0.52.10-2.fc11.i386
newt-python-0.52.10-2.fc11.i386
nspr-4.7.3-3.fc11.i386
nss-3.12.2.0-4.fc10.i386
openldap-2.4.12-3.fc11.i386
openssh-5.1p1-7.fc11.i386
openssh-server-5.1p1-7.fc11.i386
openssl-0.9.8j-7.fc11.i686
pam-1.0.90-2.fc11.i386
parted-1.8.8-12.fc11.i386
passwd-0.76-1.fc11.i386
patch-2.5.4-36.fc11.i386
pciutils-3.1.1-1.fc11.i386
pciutils-libs-3.1.1-1.fc11.i386
pcre-7.8-1.fc10.i386
perl-5.10.0-58.fc11.i386
perl-libs-5.10.0-58.fc11.i386
perl-Module-Pluggable-3.60-58.fc11.i386
perl-Pod-Escapes-1.04-58.fc11.i386
perl-Pod-Simple-3.07-58.fc11.i386
perl-version-0.74-58.fc11.i386
pinentry-0.7.4-5.fc9.i386
pkgconfig-0.23-7.fc11.i386
plymouth-0.6.0-2.fc11.i386
plymouth-libs-0.6.0-2.fc11.i386
plymouth-scripts-0.6.0-2.fc11.i386
policycoreutils-2.0.61-10.fc11.i386
policycoreutils-python-2.0.61-10.fc11.i386
popt-1.13-4.fc11.i386
ppl-0.10-6.fc11.i386
prelink-0.4.0-3.i386
procps-3.2.7-25.fc11.i386
psmisc-22.6-8.fc10.i386
pth-2.0.7-7.i386
pygpgme-0.1-11.20090121bzr54.fc11.i386
python-2.6-4.fc11.i386
python-iniparse-0.2.4-1.fc11.noarch
python-libs-2.6-4.fc11.i386
python-urlgrabber-3.0.0-11.fc11.noarch
readline-5.2-13.fc9.i386
redhat-rpm-config-9.0.3-5.fc11.noarch
rhpl-0.219-1.i386
rootfiles-8.1-2.fc11.noarch
rpm-4.6.0-4.fc11.i386
rpm-build-4.6.0-4.fc11.i386
rpm-libs-4.6.0-4.fc11.i386
rpm-python-4.6.0-4.fc11.i386
rsyslog-3.21.10-1.fc11.i386
screen-4.0.3-12.fc10.i386
sed-4.1.5-11.fc11.i386
selinux-policy-3.6.6-1.fc11.noarch
selinux-policy-targeted-3.6.6-1.fc11.noarch
setserial-2.17-22.fc9.i386
setup-2.7.7-4.fc11.noarch
shadow-utils-4.1.2-11.fc11.i386
slang-2.1.4-2.fc11.i386
sqlite-3.6.10-3.fc11.i386
system-config-firewall-tui-1.2.13-3.fc11.noarch
system-config-network-tui-1.5.95-1.fc11.noarch
sysvinit-tools-2.86-26.i386
tar-1.21-1.fc11.i386
tcp_wrappers-libs-7.6-53.fc10.i386
tzdata-2009a-1.fc11.noarch
udev-137-4.fc11.i386
unzip-5.52-9.fc9.i386
upstart-0.3.9-19.fc10.i386
usermode-1.99-2.i386
ustr-1.0.4-7.fc10.i386
util-linux-ng-2.14.2-2.fc11.i386
vim-minimal-7.2.088-1.fc11.i386
wget-1.11.4-2.fc11.i386
wireless-tools-29-2.fc9.i386
yum-3.2.21-9.fc11.noarch
yum-metadata-parser-1.1.2-11.fc11.i386
zlib-1.2.3-19.fc11.i386

I grabbed kernel-2.6.29-0.119.rc5.fc11.src.rpm from my friendly local mirror and hacked in an additional patch I discovered over at http://patchwork.kernel.org/patch/4637/.

diff -r 4086371cea7b -r 3542d1c1e03a linux/drivers/media/dvb/frontends/cx24116.c
--- a/drivers/media/dvb/frontends/cx24116.c	Sat Jan 17 17:23:31 2009 +0200
+++ b/drivers/media/dvb/frontends/cx24116.c	Thu Jan 29 20:21:07 2009 +0200
@@ -1184,7 +1184,12 @@ 
 	if (ret != 0)
 		return ret;
 
-	return cx24116_diseqc_init(fe);
+	ret = cx24116_diseqc_init(fe);
+	if (ret != 0)
+		return ret;
+
+	/* HVR-4000 needs this */
+	return cx24116_set_voltage(fe, SEC_VOLTAGE_13);
 }
 
 /*

A quick(ish) rpmbuild -ba SPECS/kernel.spec later and I had a shiny new kernel-2.6.29-0.119.rc5.fc11.i586 kernel to rpm -i. I rebooted and tried tuning: no joy, it complained about a missing dvb-fe-cx24116.fw firmware file. I obtained the file by dd’ing it out of the Windows drivers provided with the card and dropped it into /lib/firmware.
The card started working, but only on DVB-S transponders. A bit more snooping around and I discovered szap-s2:

hg clone http://mercurial.intuxication.org/hg/szap-s2
cd szap-s2
make

A bit of head-scratching looking at the new options compared to the old szap then finally:

# ./szap-s2 -a 0 -S 1 -C 34 11798h
reading channels from file '/root/.szap/channels.conf'
zapping to 27 '11798h':
delivery DVB-S2, modulation QPSK
sat 1, frequency 11798 MHz H, symbolrate 29500000, coderate 3/4, rolloff 0.35
vpid 0x0201, apid 0x0281, sid 0x0002
using '/dev/dvb/adapter0/frontend0' and '/dev/dvb/adapter0/demux0'
status 1f | signal d0c0 | snr e667 | ber 00000000 | unc 00000000 | FE_HAS_LOCK
status 1f | signal d0c0 | snr e800 | ber 00000000 | unc 00000000 | FE_HAS_LOCK
status 1f | signal d0c0 | snr e800 | ber 00000000 | unc 00000000 | FE_HAS_LOCK
status 1f | signal d0c0 | snr e800 | ber 00000000 | unc 00000000 | FE_HAS_LOCK
status 1f | signal d0c0 | snr e800 | ber 00000000 | unc 00000000 | FE_HAS_LOCK
status 1f | signal d0c0 | snr e800 | ber 00000000 | unc 00000000 | FE_HAS_LOCK
status 1f | signal d0c0 | snr e99a | ber 00000000 | unc 00000000 | FE_HAS_LOCK
status 1f | signal d0c0 | snr e800 | ber 00000000 | unc 00000000 | FE_HAS_LOCK
status 1f | signal d0c0 | snr e800 | ber 00000000 | unc 00000000 | FE_HAS_LOCK
status 1f | signal d0c0 | snr e800 | ber 00000000 | unc 00000000 | FE_HAS_LOCK
status 1f | signal d0c0 | snr e800 | ber 00000000 | unc 00000000 | FE_HAS_LOCK
status 1f | signal d0c0 | snr e99a | ber 00000000 | unc 00000000 | FE_HAS_LOCK
status 1f | signal d0c0 | snr e800 | ber 00000000 | unc 00000000 | FE_HAS_LOCK
status 1f | signal d0c0 | snr e800 | ber 00000000 | unc 00000000 | FE_HAS_LOCK

I captured the raw_ts using dvbsnoop and all appears to be good!
*phew*
