Gluon: Migrate ar71xx target to ath79

Created on 16 Nov 2018 · 22 Comments · Source: freifunk-gluon/gluon

Upstream is planning to drop the ar71xx target after the 19.01 release. To allow us to upgrade smoothly to OpenWrt releases after that, we need to work on a migration path for existing devices.

To upgrade an ar71xx based device to ath79, the following adjustments have to be made:

  • Change WiSoC device path
  • Reinitialize LED configuration
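
As a rough illustration of the first adjustment, a migration step could map the legacy ar71xx radio path to its ath79 counterpart. This is only a minimal sketch, not actual Gluon or OpenWrt code: the function name `map_wmac_path` is hypothetical, and the path strings are assumptions based on commonly seen values that would have to be verified per device.

```shell
#!/bin/sh
# Hypothetical helper: translate an ar71xx wifi-device path to the ath79 one.
# The concrete path strings are assumptions; verify them on real hardware.
map_wmac_path() {
	case "$1" in
		platform/ar933x_wmac|platform/ar934x_wmac|platform/qca953x_wmac|platform/qca955x_wmac|platform/qca956x_wmac)
			# The SoC-integrated radio is assumed to sit below the AHB bus node on ath79.
			echo "platform/ahb/18100000.wmac"
			;;
		*)
			# Unknown or already migrated paths (e.g. PCIe radios) stay untouched.
			echo "$1"
			;;
	esac
}

map_wmac_path "platform/qca955x_wmac"
```

Anything the helper does not recognize is passed through unchanged, so running it twice on the same configuration should be harmless.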

Additionally, all the downstream integration needs to be reverified, especially with regard to

  • Errors (LEDs, Buttons, Network config)
  • Model naming changes (Autoupdater filename)
  • Primary MAC selection

This process has to be repeated for every device/revision individually. We basically have to treat each device as if it were a new integration. With each error we risk breaking already deployed devices.

However, not every device that is currently supported in Gluon will be ported over to ath79. Devices which rely on custom initialization code implemented in the machine-file are among those that are most likely not going to be ported.

Examples are:

  • Meraki MR18

I think we should focus our efforts on widely used and popular devices, among these:

AVM

  • FRITZ!Box 4020
  • FRITZ!WLAN Repeater 450E

TP-Link

  • Archer C5 v1
  • Archer C59 v1
  • Archer C7 v2
  • Archer C7 v4
  • Archer C7 v5
  • CPE210 v1
  • CPE210 v2
  • CPE510 v1
  • TL-WDR3600
  • TL-WDR4300
  • TL-WR842N v3
  • TL-WR1043ND v2
  • TL-WR1043ND v3
  • TL-WR1043ND v4
  • TL-WR1043N v5

Ubiquiti

  • UniFi AC Lite
  • UniFi AC Pro
  • UniFi AC Mesh
  • UniFi AC Mesh Pro

All 22 comments

What about WR841N(D)s and WR940Ns? They are about 98% of our deployed devices...

@CodeFetch There are already discussions about that, but speaking of 4/32 devices: some months ago, on OpenWrt master, I had a hard time getting my test device (TL-WR740N v1) to work reliably on the ath79 target for Gluon (a sloppy config mode was the smallest problem).

As this change is going to hit Gluon probably somewhere in 2020 (19.01 supports ath79 and ar71xx in parallel) it's likely 4/32 devices have a hard time running Gluon by then. Also see this current discussion about this topic upstream: https://forum.openwrt.org/t/4-32-devices-vs-17-01-18-06/25203

And to not start the whole discussion again: This is not about dropping support for 4/32. I think we should focus our effort on devices which have a future in the mid- to long-run. We can work with tiny devices in parallel or after them.

The parallel support of ar71xx and ath79 enables us to migrate in two waves. I would not go for migrating tiny devices in this first wave.

I wrote an (admittedly hacky) script to migrate the WMAC path and LEDs.

https://github.com/blocktrron/gluon/commit/1da39e7e4741f5743dd594ee40e6e7be6691f51b

It does its job on a FRITZ!Box 4020, an Ocedo Koala and a WDR3600.

I encountered some pitfalls while doing so:

  • 5GHz LED is currently broken with kernel 4.14 for devices controlling it via the ath9k card (https://github.com/openwrt/openwrt/commit/ccab68f2d399d2395a18d1fb3495fe7e048fe054)
  • A lot of devices are missing their correct ar71xx-boardname for the ath79 image. sysupgrade then only works with '-F' flag. We either have to get this fixed upstream or maybe carry a patch for a limited time, allowing the migration.
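
To illustrate the board name pitfall: sysupgrade compares the name of the running board with the list of supported devices carried in the image metadata and refuses the image without '-F' if there is no match. The following is a minimal sketch of that comparison only; the function name `image_supports_board` and the example device names are hypothetical and not taken from any real image.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the running board name (first argument)
# appears in the image's list of supported devices (remaining arguments).
image_supports_board() {
	board="$1"
	shift
	for dev in "$@"; do
		[ "$dev" = "$board" ] && return 0
	done
	return 1
}

# Illustrative names: an ath79 image that also lists the legacy ar71xx
# board name is accepted on an ar71xx installation.
if image_supports_board "tl-wdr3600" "tplink,tl-wdr3600-v1" "tl-wdr3600"; then
	echo "image accepted"
else
	echo "image rejected, sysupgrade would need -F"
fi
```

The point of the sketch is that the legacy name has to be listed by the new image; whether that is fixed upstream or carried as a temporary patch does not change the check itself.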

Things we have to check that came to my mind:

  • Check if PoE pass-through gets migrated correctly (didn't look up anything about this topic yet)

Did someone already create a branch with the ath79 target, so we could build some test firmware for some new models?

For example, the devolo WiFi pro 1750e is on sale at the moment for <40€.

Be my guest: https://github.com/blocktrron/gluon/tree/devolo

There is no working autoupdater nor web-sysupgrade for this. You have been warned.

The complexity of what the routers do has not increased. The 2.4 GHz chips have not evolved, nor has the power consumption decreased. Dropping support for 4/32MB devices is not the same as dropping support for the WRT54. Using DTS, decreasing the NAPI weight, segmenting the Batman networks, using Babel, using WireGuard, not using the status page etc. decreases the memory consumption far enough to let those devices run with Gluon for some years. At the moment we still have nothing as low-priced as a WR841N. Thus people still buy these devices.
The huge number of deployed 4/32MB devices justifies porting them to ath79. The forum discussion is not relevant anymore. In the mailing list discussion it became clear that OpenWrt will not drop support for old 4/32MB devices as long as someone ports them to ath79, but OpenWrt only turns off image generation for these devices. The reason is to allow projects like LibreMesh and Freifunk to keep their deployed devices running.

not using the status page

Is in my opinion a non-starter upstream at Gluon, since it breaks the principle of least surprise … router is up, yet status-page does not load - and we do actively link to them from our map implementations.

@mweinelt That was meant only as an example. It's up to the community whether they want to support tiny devices and how they get them running. Gluon should inform people that tiny devices might not be supported in the future, but unless OpenWrt drops support for these devices, we should not either. Maybe it would make sense to introduce a flag similar to "BROKEN", e.g. "DEPRECATED", for devices with 4MB flash or 32MB RAM. It would be nice if someone experienced in doing so could still create DTS support for the most used devices for which it is missing, e.g. the WR940N v3/v4. This should be straightforward. I've done it, but haven't tested it yet and am uncertain what the right way is. If we drop support for 4/32MB devices, we kill more than half of most Freifunk communities' networks. That is out of the question for me. Freifunkers put much effort into building these networks. I understand it is ridiculous that people buy a new mobile phone every year but are not willing to buy a new Freifunk router every 5 years, but that's reality. They will be angry if their routers stop working, and they will blame the communities and they will blame us, or worse, they will keep using old versions which can potentially be exploited and that would lead to really bad publicity.

not using the status page

Is in my opinion a non-starter upstream at Gluon, since it breaks the principle of least surprise … router is up, yet status-page does not load - and we do actively link to them from our map implementations.

You could still deliver something rudimentary like the old text-only status page, (ab)using netcat as httpd, or something like that.

That's just a matter of communication.

  • Stop promoting devices you know are not future-proof

    • Stop offering the factory install image for these devices, preventing new users from falling for that trap

  • Advertise the fact that support for these devices can only be guaranteed for a limited amount of time/releases, after which the situation will need to be reevaluated, clearly mentioning their shortcomings

If I did all these things with 1-2y lead time people can't really be mad at me, because I acted within the limits of my knowledge and kept everyone informed, see https://darmstadt.freifunk.net/news/2018/05/16/eol-devices.html.

If you did neither of these things you probably knowingly led your users into this situation - and that's a problem you have to sort out with yourself.

@CodeFetch Everyone is welcome to port the devices to ath79. However, it is required to go through the device checklist AND to test the migration path. It was just an "I would not go for tiny in the first wave".

So the general 4/32 discussion here is off-topic.

I will give you the point of

.. they will keep using old versions which can potentially be exploited and that would lead to really bad publicity.

This is already an issue. Many smaller communities do not seem to be up to the task of keeping their firmware base halfway recent. 2016.x is already our "Windows XP" and we should try to avoid a "Windows 7" occurrence.

Be my guest: https://github.com/blocktrron/gluon/tree/devolo
There is no working autoupdater nor web-sysupgrade for this. You have been warned.

tried to get it working...
git clone https://github.com/blocktrron/gluon ffnh_devolo -b devolo
but:
make GLUON_TARGET=ath79-generic GLUON_BRANCH=stable V=s
awk: fatal: cannot open file `openwrt/feeds.conf.default' for reading (No such file or directory)
make[1]: Entering directory '/home/freifunker/space2/ffnh_devolo/openwrt'
make[1]: *** No rule to make target 'defconfig'.  Stop.
make[1]: Leaving directory '/home/freifunker/space2/ffnh_devolo/openwrt'
Makefile:124: recipe for target 'config' failed
make: *** [config] Error 2

I read your comment about things being broken... but is there a way to get the build done?

@heini66 It seems you didn't prepare the OpenWrt tree.
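
In Gluon the OpenWrt tree is populated by running `make update` before the actual build, which is why `openwrt/feeds.conf.default` is missing in the log above. A small pre-flight check along these lines can catch that early. It is only a sketch: the function name `tree_prepared` and the `GLUON_DIR` variable are made up for this example.

```shell
#!/bin/sh
# Hypothetical pre-flight check: the OpenWrt tree inside a Gluon checkout is
# only populated after "make update" has been run.
tree_prepared() {
	[ -f "$1/openwrt/feeds.conf.default" ]
}

# GLUON_DIR is an assumed variable name; it defaults to the current directory.
gluon_dir="${GLUON_DIR:-.}"
if tree_prepared "$gluon_dir"; then
	echo "OpenWrt tree found, ready to build"
else
	echo "OpenWrt tree missing, run 'make update' first"
fi
```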

(This is also not really on topic.)

I did a git pull a couple of minutes ago and now it's working ...

For the primary MAC selection maybe have a look at:
https://github.com/freifunk-gluon/gluon/issues/1767

This won't help you immediately for the conversion, but might come in handy in the long run.

@adrianschmutzler I think there was some effort for a migration script in OpenWrt. Do you know about it?

A migration script doesn't replace testing though.

The issues which can occur when upgrading from ar71xx to ath79 are well-known.
Only the phy-names can be swapped and the switch configuration can change. For standard Gluon configurations we don't need to test every device. We can see the differences once the devices are supported by OpenWrt's ath79. Even if a migration script exists for OpenWrt, I doubt that it's really usable for Gluon. We will likely need to write our own migration script nonetheless, but treating every ath79 device in Gluon as if it were new is overkill.

  • Change WiSoC device path

Manageable. All of our currently supported devices have at most 2 radios of which one is 2.4 GHz and the other 5 GHz. So we can decide based on that information.
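
A minimal sketch of that band-based decision, assuming at most one radio per band as stated above. The helper name `radio_for_band`, the "name:band" notation and the sample data are invented for illustration; a real script would read this information from the wireless configuration instead.

```shell
#!/bin/sh
# Hypothetical helper: find the radio in a list of "name:band" items that
# uses the wanted band. Only valid with at most one radio per band.
radio_for_band() {
	want="$1"
	shift
	for item in "$@"; do
		if [ "${item#*:}" = "$want" ]; then
			echo "${item%%:*}"
			return 0
		fi
	done
	return 1
}

# Illustrative data: the radio order differs between old and new target.
old_radios="radio0:5g radio1:2g"
new_radios="radio0:2g radio1:5g"

for entry in $old_radios; do
	old_name="${entry%%:*}"
	band="${entry#*:}"
	# $new_radios is left unquoted on purpose so it splits into list items.
	new_name="$(radio_for_band "$band" $new_radios)"
	echo "$old_name ($band) -> $new_name"
done
```

With the sample data the old 5 GHz section maps to the new radio1 and the old 2.4 GHz section to the new radio0, regardless of how the device paths changed.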

  • Reinitialize LED configuration

Manageable. We can just get rid of the old config after having checked that the GPIO configuration is equivalent to the one of ath79.
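
Getting rid of the old LED config could look roughly like the filter below, which prints a UCI-style config without its `config led` sections so that they can be regenerated on the new target. This is a sketch under assumptions: `strip_led_sections` is a hypothetical name, the sample config is made up, and on a real device the `uci` tool would be the more natural way to delete the sections.

```shell
#!/bin/sh
# Hypothetical helper: copy a UCI-style config from stdin to stdout and
# leave out every "config led" section including its options.
strip_led_sections() {
	awk '
		# A line starting with "config" opens a new section.
		$1 == "config" { skip = ($2 == "led") }
		!skip { print }
	'
}

# Illustrative input only; real files live in /etc/config/system.
printf 'config system\n\toption hostname node\nconfig led led_wlan\n\toption name WLAN\nconfig timeserver ntp\n' \
	| strip_led_sections
```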

  • Errors (LEDs)

Covered by point above

Buttons

Can be checked/should not be an issue at all unless there is an error in the DTS

Network config

We need to check if the phy-names were swapped and if the switch configuration changed (e.g. eth0.1 instead of eth0)

  • Model naming changes (Autoupdater filename)

Can be checked from the code

  • Primary MAC selection

Covered by adrianschmutzler's PR

So I don't see any issue which would force us to test every device again.

There are some trivial migration scripts I know of:
https://github.com/openwrt/openwrt/blob/master/target/linux/ath79/base-files/etc/uci-defaults/04_led_migration
https://github.com/openwrt/openwrt/blob/master/target/linux/ath79/base-files/etc/hotplug.d/ieee80211/00-wifi-migration

eth0/eth1 swap is something you actually have to check.
So far I know of gl-ar150 and the qca9561 archer devices being swapped at the moment.
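
For boards where the interfaces really are swapped, the rename itself is simple; the hard part is knowing which boards need it. A minimal sketch of the rename, assuming the only change is eth0 and eth1 trading places. The function name `swap_eth` is hypothetical, and deciding per board whether to call it is left out on purpose.

```shell
#!/bin/sh
# Hypothetical helper: exchange eth0 and eth1 in an interface name while
# keeping a VLAN suffix such as ".1". Other names are returned unchanged.
swap_eth() {
	case "$1" in
		eth0|eth0.*) echo "eth1${1#eth0}" ;;
		eth1|eth1.*) echo "eth0${1#eth1}" ;;
		*) echo "$1" ;;
	esac
}

swap_eth "eth0.1"
swap_eth "eth1"
```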

Before migration to ath79 is possible, nodes need to be updated to 19.07 ar71xx, because the image metadata format will likely change in ath79. Thus ath79 images will be rejected by older firmware. I'll dig out the mailing list thread when I find it. Maybe there will still be a solution...
