[Netkit.users] RE: tap problems - Netkit bug

Massimo Rimondini rimondin at dia.uniroma3.it
Sun Feb 17 21:19:21 CET 2008


Hi Pablo,

ok, I've got an explanation for the problem you are experiencing.
This is actually a bug, and that's the reason why I am posting this
conversation on the mailing list.

The bug affects the setup of virtual machines that use TAP interfaces.
At least one of the following two behaviors may be observed:
- A virtual machine with multiple network interfaces having a TAP interface
that is not numbered lowest (e.g., --eth0=A --eth1=tap,10.0.0.1,10.0.0.2)
does not get properly configured.
- A virtual machine with multiple network interfaces having a TAP interface
that is not numbered highest (e.g., --eth0=tap,10.0.0.1,10.0.0.2 --eth1=A)
does not start at all and ends up with a TUNSETIFF error (this applies to
your case).

The bug is confirmed for Netkit version 2.6, but is likely to apply to
previous releases as well.
The problem should be solved by applying the patch I have sent in this
reply:
http://list.dia.uniroma3.it/pipermail/netkit.users/2008-February/000342.html
For your convenience, I attach the same patch once more below.
To apply it, enter the $NETKIT_HOME directory and run
'patch -p1 < patch_file_name'.

Sorry for not figuring this out before, but I had inadvertently performed
the tests on our development version, which is not affected by this bug.

Regards,
Massimo.


========================================================
diff -Naur netkit-old/bin/script_utils netkit-new/bin/script_utils
--- netkit-old/bin/script_utils	2007-12-19 10:55:58.000000000 +0100
+++ netkit-new/bin/script_utils	2008-02-02 12:48:46.000000000 +0100
@@ -317,8 +317,8 @@
 # This function starts all the hubs inside a given list
 runHubs() {
    local HUB_NAME BASE_HUB_NAME ACTUAL_HUB_NAME TAP_ADDRESS GUEST_ADDRESS
-   HUB_NAME="$1"
    while [ $# -gt 0 ]; do
+      HUB_NAME="$1"
       BASE_HUB_NAME="`varReplace HUB_NAME \".*_\" \"\"`"
       if [ "${BASE_HUB_NAME#tap${HUB_SOCKET_EXTENSION},}" != "$BASE_HUB_NAME" ]; then
          # This is an Internet connected hub
@@ -328,7 +328,7 @@
          startInetHub "$ACTUAL_HUB_NAME" "$TAP_ADDRESS" "$GUEST_ADDRESS"
       else
          # This is a normal hub
-         startHub "$1"
+         startHub "$HUB_NAME"
       fi
       shift
    done
========================================================
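
To see why moving the assignment inside the loop matters, here is a
minimal sketch of the loop logic. It is not the actual Netkit code:
is_tap, run_hubs_old and run_hubs_new are hypothetical names, the hub
names are simplified (no vhub_ prefix), and the echo lines stand in for
the real startInetHub and startHub calls.

```shell
#!/bin/sh
# Simplified sketch of runHubs; NOT the actual Netkit code.
# is_tap, run_hubs_old and run_hubs_new are hypothetical names, and the
# echo lines stand in for the real startInetHub / startHub calls.

# Succeeds if the hub name looks like a TAP hub ("tap,<addr>,<addr>").
is_tap() {
    case "$1" in
        tap,*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Unpatched logic: HUB_NAME is set once before the loop, so every
# iteration tests the FIRST hub name, while the normal-hub branch
# starts whatever "$1" currently is.
run_hubs_old() {
    HUB_NAME="$1"
    while [ $# -gt 0 ]; do
        if is_tap "$HUB_NAME"; then
            echo "inet-hub $HUB_NAME"
        else
            echo "hub $1"
        fi
        shift
    done
}

# Patched logic: HUB_NAME is refreshed on every iteration and used
# consistently in both branches.
run_hubs_new() {
    while [ $# -gt 0 ]; do
        HUB_NAME="$1"
        if is_tap "$HUB_NAME"; then
            echo "inet-hub $HUB_NAME"
        else
            echo "hub $HUB_NAME"
        fi
        shift
    done
}

echo "old, tap first:"; run_hubs_old "tap,10.0.0.1,10.0.0.2" "A"
echo "old, tap last:";  run_hubs_old "A" "tap,10.0.0.1,10.0.0.2"
echo "new, tap first:"; run_hubs_new "tap,10.0.0.1,10.0.0.2" "A"
echo "new, tap last:";  run_hubs_new "A" "tap,10.0.0.1,10.0.0.2"
```

In this sketch, with the TAP hub listed first the old loop tries to set
up the TAP hub twice, which is consistent with the TUNSETIFF error; with
the TAP hub listed last, the old loop treats it as a normal hub, which
is consistent with the interface not being properly configured.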



-----Original Message-----
From: Pablo Alonso [mailto:pabloalonsop a gmail.com] 
Sent: Wednesday, February 13, 2008 12:41 PM
To: Massimo Rimondini
Subject: Re: tap problems


Dear Massimo,

I ran those commands after starting the lab (rete4p2, the one that doesn't
work), and this is the output:

Password:
TUNSETIFF: Device or resource busy
Error while configuring the tunnel.
Error while starting virtual machine "N3".
alonso@alonso:~/netkit/rete4p2$ [ -e ~/.netkit/hubs/vhub_${USER}_tap.cnct ] && echo ok
ok
alonso@alonso:~/netkit/rete4p2$ lsof -f -- ~/.netkit/hubs/vhub_${USER}_tap.cnct
COMMAND    PID   USER   FD   TYPE     DEVICE SIZE  NODE NAME
uml_switc 8756 alonso    3u  unix 0xd31a0580      24721 /home/alonso/.netkit/hubs/vhub_alonso_tap.cnct
uml_switc 8756 alonso    6u  unix 0xcbf23d00      25317 /home/alonso/.netkit/hubs/vhub_alonso_tap.cnct
alonso@alonso:~/netkit/rete4p2$

Thanks again,

Pablo


2008/2/12, Massimo Rimondini <rimondin a dia.uniroma3.it>:
Dear Pablo,

the problem lies in Netkit attempting to set up a TAP interface multiple
times, whereas this should happen only once. As further confirmation,
see the output of lstart on my machine below:

> max@gabbiano:~/TEMP$ lstart
>
> ======================== Starting lab ===========================
> Lab directory: /home/max/TEMP
> Version:       0.1
> Author:        Massimo Rimondini
> Email:         contact a netkit.org
> Web:           http://www.netkit.org/
> Description:
> Sample lab for testing purposes
> =================================================================
> Starting "N1" with options "-q --eth0 A --hostlab=/home/max/TEMP
> --hostwd=/home/max/TEMP"...
> Starting "N2" with options "-q --eth0=tap,10.0.0.11,10.0.0.2 --eth1 A
> --hostlab=/home/max/TEMP --hostwd=/home/max/TEMP"...
> Password:
> Starting "N3" with options "-q --eth0=tap,10.0.0.1,10.0.0.3 --eth1 B
> --hostlab=/home/max/TEMP --hostwd=/home/max/TEMP"...
> Starting "N4" with options "-q --eth0 B --hostlab=/home/max/TEMP
> --hostwd=/home/max/TEMP"...
>
> The lab has been started.
> =================================================================


This behavior is likely to be due to Netkit being unable to detect that
a virtual hub is already running for the TAP interface. Could you please
provide the output of the following two commands, executed after
attempting to launch one of the faulty labs?

[ -e ~/.netkit/hubs/vhub_${USER}_tap.cnct ] && echo ok
lsof -f -- ~/.netkit/hubs/vhub_${USER}_tap.cnct

I suspect the second one might somehow fail.
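
The two checks above can be combined into one small helper. This is
only a sketch: check_hub_socket is a hypothetical name, not part of
Netkit, and it assumes that lsof returns a non-zero status when no
process holds the socket open.

```shell
#!/bin/sh
# Hypothetical helper, not part of Netkit: reports the state of a hub
# socket file using the same two checks as the commands above.
check_hub_socket() {
    sock="$1"
    if [ ! -e "$sock" ]; then
        # The socket file does not exist at all.
        echo "missing"
    elif ! command -v lsof >/dev/null 2>&1; then
        # lsof is not installed, so we cannot tell who holds the socket.
        echo "unknown"
    elif lsof -f -- "$sock" >/dev/null 2>&1; then
        # At least one process (e.g. a virtual hub) has the socket open.
        echo "in-use"
    else
        # Assumption: a failing lsof means no process holds the socket.
        echo "stale"
    fi
}

check_hub_socket "$HOME/.netkit/hubs/vhub_${USER}_tap.cnct"
```

A result of "in-use" would mean a virtual hub is already running for
the TAP interface.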

Thank you and regards,
Massimo.



Pablo Alonso wrote:
> Hi Massimo,
>
> I've tried again this morning after running vclean -T, but nothing
> changes: the same error always appears.
> I am working with Kubuntu. I am sending you three different
> configurations: the first one works perfectly (four nodes with only one
> tap interface, and another node with 3 interfaces); the second one, which
> uses two taps in two different nodes, doesn't work (the system asks for
> the password twice and then an error message appears); finally, the
> third one has a node with three interfaces, one of which is a tap, and
> it doesn't work either (the system also asks me for the password twice
> and then the error message appears).
> The same message always appears (in both cases, 2 and 3):
>
> Password:
> TUNSETIFF: Device or resource busy
> Error while configuring the tunnel.
> Error while starting virtual machine "N3".
>
> Thank you very much.
> (In all cases I am working in the same subnet with different collision
> domains, but this shouldn't interfere, should it?)
>
> Pablo
>
>
>
> 2008/2/6, Massimo Rimondini <rimondin a dia.uniroma3.it>:
>
>>  Hi Pablo,
>>
>>  at my site everything seems to work fine: most likely, I am not properly
>> reproducing the scenario in which you experience the fault.
>>  Could you please provide a concrete example, e.g., lab.conf, sequence of
>> vstarts, etc.? Please also mention the Linux distribution you are
>> using: that might help determine whether default settings play a role
>> in this problem.
>>
>>  Also, in the meantime, try again after running 'vclean -T'. This gets
>> rid of any stale tap configurations.
>>
>>  Thank you,
>>  Massimo.
>>
>>
>>
>>  Pablo Alonso wrote:
>> Hi!
>>
>>  I am starting to work with netkit and I have a problem with this
>> command: eth0=tap,10.0.0.1,10.0.0.2.
>>  Working in a lab context, I can use it without problems if I use it
>> only once, but I can't use it in two different virtual machines. Nor
>> can I build a virtual machine with three interfaces (one of which is
>> tap). In both cases a system error always appears:
>>
>>             TUNSETIFF: Device or resource busy
>>             Error while configuring the tunnel.
>>             Error while starting virtual machine "N2".
>>
>>  If you could suggest a possible solution, I would be really grateful.
>>
>>  Thanks.
>>
>>  Pablo Alonso.
>>
>>



