With batman-adv 2020.4 and the according backports to batman-adv v2019.2
several more bugs were found and fixed regarding the batman-adv
multicast optimizations feature.
Also a "wakeup-call" feature was added to the Linux bridge IGMP/MLD
snooping code in Gluon to work around issues with Android devices.
With batman-adv now at v2019.2, multicast-to-multi-unicasts conversion
is supported, too. Which means that even if there are a few outdated nodes
these and all other recipients will be served multicast packets via unicast,
too, as long as the sum of receiving nodes does not exceed the multicast
fanout setting (default: 16). If is exceeded, then batman-adv will revert
back to broadcast flooding automatically.
Long story short, with all these extra measures in place, let's reenable
the batman-adv multicast optimizations to reduce the layer 2 overhead
and in preparation for multicast applications in the future.
The default is enabled for this feature anyway, so removing the
"batctl multicast_mode 0" overwrite is sufficient.
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
With very bad timing, it is possible that the teardown script of a
gluon_mesh interface runs when bat0 was just created, but primary0 is not
yet added to it. Although there is no hardif to remove in this case,
bat0 will still be deleted, because there is no hardif in bat0.
Disable the interface removal logic by passing `-M` to `batctl interface`.
The new routing_algo site.conf value BATMAN_IV_LEGACY is introduced. With
these changes, the routing_algo setting becomes mandatory.
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
The commit b3762fc61c ("gluon-client-bridge: move IPv4 local subnet route
to br-client (#1312)") moves the IPv4 prefix from the local-port interface
to br-client. A client requesting an IPv4 connection to the IPv4 anycast
address of the node (the device running gluon) will create following
packets:
1. ARP packet from client to get the MAC of the mac address of the anycast
IPv4 address
2. ARP reply from node to client with the anycast MAC address for the IPv4
anycast address
3. IPv4 packet from client which requires reply (for example ICMP echo
request)
4. ARP request for the client MAC address for its IPv4 address in prefix4
(done with the mac address of br-client and transmitted over br-client)
5. IPv4 packet from node (transmitted over br-client with br-client MAC
address) as reply for the client IPv4 packet (for example ICMP echo
reply)
The step 4 is extremely problematic here. ARP replies with the anycast IPv4
address must not be submitted or received via bat0 - expecially not when it
contains an node specific MAC address as source. When it is still done then
the wrong MAC address is stored in the batadv DAT cache and ARP packet is
maybe even forwarded to clients. This latter is especially true for ARP
requests which are broadcast and will be flooded to the complete mesh.
Clients will see these ARP packets and change their own neighbor IP
(translation) table. They will then try to submit the packets for IPv4
anycast addresses to the complete wrong device in the mesh. This will for
example break the access to the status page to the connected device or the
anycast DNS forwarder implementation. Especially the latter causes extreme
latency when clients try to connect to server using a domain name or even
breaks the connection setup process completely. Both are caused by the
unanswered DNS requests which at first glance look like packet loss.
An node must therefore take care of:
* not transmitting ARP packets related to the anycast IPv4 address over
bat0
* drop ARP packets related to the anycast IPv4 when they are received on
bat0 from a still broken node
* don't accept ARP packets related to the anycast IPv4 replies on local
node when it comes from bat0
Fixes: b3762fc61c ("gluon-client-bridge: move IPv4 local subnet route to br-client (#1312)")
net.ipv6.conf.br-client.forwarding is moved from gluon-client-bridge to
gluon-mesh-batman-adv, as the setting is not useful with non-bridged
protocols.
Filtering by MAC address won't filter out multicast packages like router
solicitations, causing uradvd to send out router advertisements with
maximum frequency (every 3 seconds) in active meshes, even when no local
client is actually interested in the advertisements.
Fixes#1230
We now create bat0 and primary0 independently of the lower mesh interfaces,
making the whole setup a lot more robust. In particular:
- we can't accidentially destroy primary0 because of concurrent setup and
teardown runs of different interfaces
- bat0 will always exist, even when no mesh interfaces are up (e.g. no link
on wired mesh)
- interfaces going down and up again will never tear down the whole of
batman-adv
- we can enable and disable bat0 independently of the lower interface
states