The amount of local wifi clients is currently counted by two different
ways:
* asking the kernel wifi layer for the number of of clients on 2.4GHz and
5GHz band
* asking batman-adv for the number of non-timed out entries in the local
translation table with WiFi flag
The number of wifi24+wifi5 and the number of TT wifi client counts are
reported via respondd to various consumers. The ffrgb meshviewer is
displaying these values as:
* 2,4 GHz: wifi24
* 5 GHz: wifi5
* other: (TT local wifi+non-wifi clients) - (wifi24 + wifi5)
But the local translation table is holding entries much longer than the
wifi layer. It can therefore easily happen that a wifi client disappears in
the kernel wifi layer and batman-adv still has the entry stored in the
local TT.
The ffrgb meshviewer would then show this count in the category "other".
This often results in confusions because "other" is usually for ethernet
clients. And nodes with a frequently disappearing larger group of clients
(near bus stations or larger intersections) often show most clients under
the group "other" even when this devices doesn't have a LAN ethernet port.
It is better for presentation to calculate the number of total wifi clients
by summing up wifi24 + wifi5. And getting the number of total clients (non
wifi + wifi) by adding the result of the previous calculation to the sum of
non-wifi client in the local batman-adv translation table.
Fixes: 89a9d8138c ("gluon-mesh-batman-adv-core: Announce client count by frequency")
Reported-by: Pascal Wettin <p.wettin@gmx.de>
(cherry picked from commit b850fff7e4)
The commit b3762fc61c ("gluon-client-bridge: move IPv4 local subnet route
to br-client (#1312)") moves the IPv4 prefix from the local-port interface
to br-client. A client requesting an IPv4 connection to the IPv4 anycast
address of the node (the device running gluon) will create following
packets:
1. ARP packet from client to get the MAC of the mac address of the anycast
IPv4 address
2. ARP reply from node to client with the anycast MAC address for the IPv4
anycast address
3. IPv4 packet from client which requires reply (for example ICMP echo
request)
4. ARP request for the client MAC address for its IPv4 address in prefix4
(done with the mac address of br-client and transmitted over br-client)
5. IPv4 packet from node (transmitted over br-client with br-client MAC
address) as reply for the client IPv4 packet (for example ICMP echo
reply)
The step 4 and 5 are problematic here because packets use the node specific
MAC addresses from br-client instead of the anycast MAC address. The client
will receive the ARP packet with the node specific MAC address and change
their own neighbor IP (translation) table. This will for example break the
access to the status page to the connected device or the anycast DNS
forwarder implementation when the client roams to a different node.
This reverts commit b3762fc61c and adds an
upgrade code to remove local_node_route on on existing installations.
The commit b3762fc61c ("gluon-client-bridge: move IPv4 local subnet route
to br-client (#1312)") moves the IPv4 prefix from the local-port interface
to br-client. A client requesting an IPv4 connection to the IPv4 anycast
address of the node (the device running gluon) will create following
packets:
1. ARP packet from client to get the MAC of the mac address of the anycast
IPv4 address
2. ARP reply from node to client with the anycast MAC address for the IPv4
anycast address
3. IPv4 packet from client which requires reply (for example ICMP echo
request)
4. ARP request for the client MAC address for its IPv4 address in prefix4
(done with the mac address of br-client and transmitted over br-client)
5. IPv4 packet from node (transmitted over br-client with br-client MAC
address) as reply for the client IPv4 packet (for example ICMP echo
reply)
The step 4 is extremely problematic here. ARP replies with the anycast IPv4
address must not be submitted or received via bat0 - expecially not when it
contains an node specific MAC address as source. When it is still done then
the wrong MAC address is stored in the batadv DAT cache and ARP packet is
maybe even forwarded to clients. This latter is especially true for ARP
requests which are broadcast and will be flooded to the complete mesh.
Clients will see these ARP packets and change their own neighbor IP
(translation) table. They will then try to submit the packets for IPv4
anycast addresses to the complete wrong device in the mesh. This will for
example break the access to the status page to the connected device or the
anycast DNS forwarder implementation. Especially the latter causes extreme
latency when clients try to connect to server using a domain name or even
breaks the connection setup process completely. Both are caused by the
unanswered DNS requests which at first glance look like packet loss.
An node must therefore take care of:
* not transmitting ARP packets related to the anycast IPv4 address over
bat0
* drop ARP packets related to the anycast IPv4 when they are received on
bat0 from a still broken node
* don't accept ARP packets related to the anycast IPv4 replies on local
node when it comes from bat0
Fixes: b3762fc61c ("gluon-client-bridge: move IPv4 local subnet route to br-client (#1312)")
In multidomain setups, VXLAN is enabled by default, but can be disabled in
domain configs using the mesh/vxlan option. In single domain setups, the
mesh/vxlan option is mandatory.
The UCI option for legacy mode is removed.
Fixes#1364
net.ipv6.conf.br-client.forwarding is moved from gluon-client-bridge to
gluon-mesh-batman-adv, as the setting is not useful with non-bridged
protocols.
The RFC standard multicast querier interval is 120s. Our querier uses in
interval of 20s for better support of roaming clients, but our robustness
setting of 3 leads to external queriers using the standard interval to be
timeout after only 60s, leading to frequent "querier appeared/disappeared"
messages. Increase robustness so that external queriers with any interval
<180s are supported.
This patch moves the prefix4 subnet route from the local-node veth
device to br-client (while keeping the next node ipv4 address on the
local node device).
This is in preparation to allow routing over the br-client interface
later.
In addition to significant internal differences in check_site_lib.lua (in
particular unifying error handling to a single place for the upcoming
multi-domain support), this changes the way fields are addressed in site
check scripts: rather than providing a string like 'next_node.ip6', the
path is passed as an array {'next_node', 'ip6'}.
Other changes in site check scripts:
* need_array and need_table now pass the full path to the sub fields to the
subcheck instead of the key and value
* Any check referring to a field inside a table implies that all higher
levels must be tables if they exist: a check for {'next_node', 'ip6'} adds
an implicit (optional) check for {'next_node'}, which allows to remove many
explicit checks for such tables
* gluon-core, gluon-client-bridge: introduce new firewall zone: local_client
* gluon-core: put clients in local_client zone, introduce drop-zone,
set dns-rules and zones
* gluon-respondd: allow respondd on mesh
* gluon-status-page-api: allow http input on mesh and client
Filtering by MAC address won't filter out multicast packages like router
solicitations, causing uradvd to send out router advertisements with
maximum frequency (every 3 seconds) in active meshes, even when no local
client is actually interested in the advertisements.
Fixes#1230
When a Gluon node is used to connect to an uplink router/DHCP server (for
example in deployments without VPN tunnels), the gw_mode must be set to
server; this should be preserved on upgrades.
Fixes#1196
We now create bat0 and primary0 independently of the lower mesh interfaces,
making the whole setup a lot more robust. In particular:
- we can't accidentially destroy primary0 because of concurrent setup and
teardown runs of different interfaces
- bat0 will always exist, even when no mesh interfaces are up (e.g. no link
on wired mesh)
- interfaces going down and up again will never tear down the whole of
batman-adv
- we can enable and disable bat0 independently of the lower interface
states