[Netkit.users] Issue with vlan/mtu

Csaba Kiraly kiraly at dit.unitn.it
Fri 23 Nov 2007 09:43:56 CET

Hello Cedric, Saverio,

I don't really know 802.1Q, but I'll give it a try.

To me it seems that your net is not working at L2, and you can't really 
expect L3 to solve it. In more detail:

An ICMP "fragmentation needed" can only be sent by an L3 router that 
drops the packet because L3 fragmentation is not allowed and the packet 
can't fit on the next L2 link. Your bridge is definitely not one of these. 
Instead, you have an L2 network in which one side (PC1) can't send frames 
to the other side (FW), even if the size of the frame is within the 
allowed limits, at least according to PC1. I don't know of any 
automatic mechanism at L2 that would discover the right max frame size, 
so you are stuck.
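
A minimal sketch of the difference described above, in Python. This is 
only a model of the decision logic, not code from any real router or 
bridge; the names Packet, router_forward and bridge_forward are made up 
for illustration. The ICMP type/code values (3/4) are the standard ones 
for "fragmentation needed and DF set".

```python
from dataclasses import dataclass

ICMP_DEST_UNREACH = 3   # ICMP type 3: Destination Unreachable
ICMP_FRAG_NEEDED = 4    # ICMP code 4: fragmentation needed and DF set


@dataclass
class Packet:
    length: int           # total IP packet length in bytes
    dont_fragment: bool   # state of the DF bit in the IP header


def router_forward(packet: Packet, next_link_mtu: int) -> str:
    """Model of what an L3 router does with a packet for the next link."""
    if packet.length <= next_link_mtu:
        return "forward"
    if packet.dont_fragment:
        # Only here does the sender get feedback about the smaller MTU.
        return f"drop, send ICMP type {ICMP_DEST_UNREACH} code {ICMP_FRAG_NEEDED}"
    return "fragment and forward"


def bridge_forward(payload_length: int, egress_limit: int) -> str:
    """Model of a plain L2 bridge: it never looks at the IP header,
    so it can neither fragment nor send ICMP back (hypothetical model)."""
    if payload_length <= egress_limit:
        return "forward"
    return "drop silently"
```

With a 1500-byte packet and a 1496-byte limit on the next link, the 
router model either fragments or reports back to the sender, while the 
bridge model just loses the frame, which matches what PC1 sees.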

This was the theory, as I see it. Now, what can be done:
- Making the "bridge" more intelligent: let's call it an L3 switch 
(it was at least a switch already anyway). I suppose some of 
those boxes that are called L3 switches can send back an ICMP 
"fragmentation needed".
- Reducing MTU on the PC1 side: seems to work
- Increasing MTU on the other side: did you find out why it doesn't 
work? If frames are not going out, you can still easily debug the whole 
kernel of your bridge virtual machine. See later ...
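
The arithmetic behind the two MTU options can be sketched as follows. 
This assumes that the 4-byte 802.1Q tag counts against the configured 
limit on the tagged link, which is what the 1496/1504 numbers in this 
thread suggest; real drivers may account for the tag differently. The 
function names are made up for illustration.

```python
VLAN_TAG_BYTES = 4  # size of an 802.1Q tag inserted into the Ethernet header


def max_ip_packet(link_mtu: int, tagged: bool) -> int:
    """Largest IP packet that fits on a link, ASSUMING the VLAN tag
    is counted against the configured MTU of that link."""
    return link_mtu - VLAN_TAG_BYTES if tagged else link_mtu


def fits(ip_packet_len: int, link_mtu: int, tagged: bool) -> bool:
    """True if a packet of this length can cross the link unfragmented."""
    return ip_packet_len <= max_ip_packet(link_mtu, tagged)


if __name__ == "__main__":
    # PC1 sends full-size packets over the tagged link at MTU 1500.
    print(fits(1500, 1500, tagged=True))
    # Option: reduce the MTU on PC1 to 1496.
    print(fits(1496, 1500, tagged=True))
    # Option: raise the MTU on the tagged link to 1504.
    print(fits(1500, 1504, tagged=True))
```

Under that assumption, either lowering PC1 to 1496 or raising both ends 
of the tagged link to 1504 should let full-size packets through; if the 
second option still fails, the tag is probably being handled somewhere 
else (for example in the driver).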

Regarding the idea of some card drivers handling the MTU wrongly:
We have seen something similar: we were adding our own protocol header, 
and from that point on we had problems with some cards, but not with 
others. Unfortunately I did not manage to reproduce it on my PCs, but 
my colleagues traced it back to some card-dependent kernel code at quite 
a low level.
In the end, we solved the issue not by fixing the driver code, but by 
fixing the reported header size of our protocol ...

If you want to look at what is happening with your frames inside the 
kernel, you can debug the whole kernel with Netkit quite easily. Here is 
a tutorial:


In short:
- recompile your kernel with the SKAS3 patch (this might not be necessary, 
I think you can debug without it as well)
- compile a kernel for Netkit with ARCH=um and a suitable .config file
- make Netkit use your kernel
- run your lab, fire up DDD (or gdb) with the kernel file, make it 
attach to the kernel of your "bridge" VM, place your breakpoint, and see 
what's happening.


BTW, an idea for the Netkit maintainers: it would be nice to have a 
debuggable version of the Netkit kernel as a downloadable package as well.

Cedric Foll wrote:
>> Hold on, you mean that the Linux box is VLAN unaware, and the IP
>> packet transits towards a VLAN aware segment ??
> PC1, right. But it's not his problem. The link between PC1 and Bridge
> isn't a 802.1q
> I have this:
> PC1-------Bridge------(802.1q)----FW
> PC2--------|
> If i force PC1 with a mtu of 1496 it works but i don't want to solve
> the problem there.
>> In this case OF COURSE the first switch at the edge of the VLAN aware
>> segment of the Network will fragment the packet if HIS MTU is not
>> bigger than 1500. This is the normal behaviour as far as I know.
> But bridge doesn't fragment anything (maybe i have to do a echo 1 >
> somewhere in /proc...)
>> The REAL solution is having a MTU of  at least 1504 bytes in the VLAN
>> aware network
> I've already tried it:
> ifconfig eth2 mtu 1504 on Bridge
> ifconfig eth0 mtu 1504 on FW but it doesn't solve anything.
>> There is no ICMP message for indicating that IP packets are being
>> fragmented. The only case is when the Don't Fragment bit is set, and
>> the packet is dropped; in that case you get back an ICMP
> Right. Is there an option in /proc which forces this bit on all packets?
> Thanks for your help.
> _______________________________________________
> Netkit.users mailing list
> Netkit.users at list.dia.uniroma3.it
> http://list.dia.uniroma3.it/mailman/listinfo/netkit.users
