Commit Graph

322 Commits

Author SHA1 Message Date
adrian
839ab24e1e Fix missing path conversion from the previous commit to shuffle mdio around.
It turns out the recent work to cut down the number of atheros kernels built
didnt include one with ARGE_MDIO defined..
2015-12-27 07:39:44 +00:00
adrian
7ef1ee2112 [qca953x] remove unneeded initialisation.
This was copied from another chip file and it's not required on Honeybee.

Tested:

* AP143, QCA9531 SoC.

Obtained from: OpenWRT
2015-12-15 04:45:00 +00:00
adrian
795576f81b [ar71xx] always count interrupts, spurious or otherwise.
This aids in debugging.
2015-12-15 04:44:06 +00:00
adrian
9c16fc858e [arge] add a comment about needing mdio busses in order to use the interface.
This is a holdover from how reset is handled in the ARGE_MDIO world.
You need to define the mdio bus device if you want to use the ethernet
device or the arge setup path doesn't bring the MAC out of reset.
2015-12-15 04:43:28 +00:00
adrian
d03d60bd6b Add QCA9533 to the list of SoCs that require IRQ's be ACKed. 2015-11-16 06:15:01 +00:00
adrian
83e37c2104 Add initial support for the QCA953x ("Honeybee") from Qualcomm Atheros.
The QCA953x SoC is an integrated 2x2 2GHz 11n + MIPS24k core, with
a 5 port FE switch, gige WAN port, and all the same stuff you'd find on
its predecessor - the AR9331.

However, buried deep in here somewhere is also a PCIe EP/RC for various
applications and some other weird bits I don't yet know about.

This is enough to get the reference board up and booting.  I haven't yet
had it pass lots of packets - I need to finalise the ethernet switch
bits and the GMAC configuration (ie, how the ethernet ports and switch
are wired up) and I'll bring that in when I commit the base configuration
files to use the thing.

The wifi stuff will come much later.  I have to port that support from
Linux ath9k and extend our vendor HAL to support it.

The reference board (AP143) comes with 32MB RAM and 4MB flash, so in order
to use it I need to get USB working fully so I can run root from there.

Thankyou to Qualcomm Atheros for access to the reference design board.

Details:

* Add register definitions from openwrt;
* It looks like a QCA955x but shrunk down to a QCA933x footprint, so
  use the QCA955x bits and fix up the clock detection code to do the
  QCA953x bits (they're very subtly different);
* Teach GPIO about it;
* Teach EHCI about it;
* Teach if_arge about it;
* Teach the CPU detection code about it.

Tested:

* AP143, QCA9533v2 SoC

Obtained from:	Linux, Linux OpenWRT
2015-11-16 04:28:00 +00:00
adrian
ab6e31c922 Remove this; it's also in sys/conf/files.mips. 2015-11-03 21:03:26 +00:00
adrian
be517d7ef4 arge_mdio: fix barriers; correctly check MII indicator register.
* use barriers in a slightly better fashion.  You can blame this
  glass of whiskey on putting barriers in the wrong spot.  Grr adrian.

* steal/rewrite the mdio busy check from ag7100 from openwrt and
  refactor the existing code out.  This is .. more correct.

This seems to fix the boot-to-boot variation that I've been seeing
and it quietens the switch port status flapping.

Tested:

* QCA9558 SoC (AP135.)

Obtained from:	Linux OpenWRT
2015-10-30 23:59:52 +00:00
adrian
722c2df320 arge: fix barrier macro. 2015-10-30 23:57:20 +00:00
adrian
6b197b2c77 arge: attempt to close a transmit race by only enabling the descriptor at the end of setup.
This driver and the linux ag71xx driver both treat the transmit ring
as a circular linked list of descriptors.  There's no "end" pointer
that is ever NULL - instead, it expects the MAC to hit a finished
descriptor (ARGE_DESC_EMPTY) and stop.

Now, since it's a circular buffer, we may end up with the hardware
hitting the beginning of our multi-descriptor frame before we've finished
setting it up. It then DMA's it in, starts sending it, and we finish
writing out the new descriptor.  The hardware may then write its
completion for the next descriptor out; then we do, and when we next
read it it'll show up as "not done" and transmit completion stops.

This unfortunately manifests itself as the transmit queue always
being active and a massive TX interrupt storm.  We need to actively
ACK packets back from the transmit engine and if we don't (eg because
we think the transmit isn't finished but it is) then the unit will
just keep generating interrupts.

I hit this finally with the below testing setup.  This fixed it for me.

Strictly speaking I should put in a sync in between writing out all of
the descriptors and writing out that final descriptor.

Tested:

* QCA9558 SoC (AP135 reference board) w/ arge1 + vlans acting as a
  router, and iperf -d (tcp, bidirectional traffic.)

Obtained from:	Linux OpenWRT (ag71xx_main.c.)
2015-10-30 23:18:02 +00:00
adrian
1c25d2a759 arge: just use 1U since it's a 32 bit unsigned destination value. 2015-10-30 23:09:08 +00:00
adrian
305a5a647d arge: do an explicit flush between updating the TX ring and starting transmit.
The MIPS busdma sync operations currently are a big no-op on coherent memory.
This isn't strictly correct behaviour as we need a SYNC in here to ensure that
the writes have finished and are visible in main memory before the MMIO accesses
occur.  This will have to be addressed in a later commit.

But, before that happens, let's at least do a flush here to make things
more "correct".

This is required for even remotely sensible behaviour on mips74k with
write-through memory enabled.
2015-10-30 23:07:32 +00:00
adrian
401d1e9afa arge_mdio: add explicit read barriers for MDIO_READs.
The mips74k programmers guide notes that reads can be re-ordered, even
uncached ones, so we need an explicit SYNC between them.

Yes, this is a case of a driver author actively doing a bus barrier
operation.

This ends up being necessary when the mips74k core is run in write-back
mode rather than write-through mode.  That's coming in an upcoming
commit.

Tested:

* mips74k, QCA9558 SoC (AP135 reference board), arge<->arge interface
  routing traffic tests.
2015-10-30 23:00:47 +00:00
adrian
ea415f2530 arge: ensure there's enough space in the TX ring before attempting to
send frames.

This matches the other check for space.

"enough" is a misnomer, for "reasons".  The biggest reason is that
the TX ring is actually a circular linked list, with no head/tail pointers.
This is just a bit more headroom between head/tail so we have time to
schedule frames before we hit where the hardware is at.

Ideally this would be tunable and a little larger.
2015-10-30 22:55:41 +00:00
adrian
22eed60ca3 arge: do a read-after-write on all arge register writes, not just MDIO writes.
This flushes out the write to the system before anything continues.

The mips74k guide, chapter 3.3.3 (write gathering) notes that writes
can be buffered in FIFOs - even uncached ones - so we can't guarantee
the device has felt its effects.  Now, since we're all lazy driver
authors and don't pepper read/write barriers everywhere, fake it here.

tested:

* mips74k - QCA9558 SoC (AP135 reference board)
2015-10-30 22:53:30 +00:00
adrian
48944f6b39 Oops - use the wrong array offset. 2015-10-28 23:39:33 +00:00
adrian
39fb527bf9 Add some debugging code (under ARGE_DEBUG) that counts each interrupt source.
This should make it easier to track down interrupt storms from arge.

Tested:

* AP135 (QCA955x) SoC - defaults to ARGE_DEBUG enabled
* Carambola2 (AR9331 SoC) - defaults to ARGE_DEBUG disabled
2015-10-28 05:11:06 +00:00
adrian
e35de425dc arge(4): flip this on for AR9344 SoCs.
I couldn't test arge0->arge1 bridging, only arge0 VLAN bridging.
The DIR-825C1 only hooks up arge0 to the switch GMAC0 and so
you need to abuse VLANs to test.

Tested:

* DIR-825C1 (AR9344)
2015-10-24 22:37:59 +00:00
adrian
05283b18cc arge: use 1-byte TX and RX alignment for AR9330/AR9331.
This part seems to work bug-free with single byte TX/RX buffer alignment.

This drops the CPU requirement to bridge 100mbit iperf from 100% CPU
to ~ 50% CPU.

Tested:

* AP121 (AR9330) SoC, highly magic netbooted kernel + USB rootfs
  due to 4mb flash, 16mb RAM; doing bridging between arge0 and arge1.

Notes:

* Yes, I likely can also turn this on for the AR934x SoC family now.

  But since hardware design apparently follows similar branching
  strategies to software design, I'll go and make sure all the AR934x's
  that made it out into shipping products work before I flip it on.
2015-10-22 08:02:27 +00:00
adrian
67be823944 arge: Remove the debugging printf that snuck in.
This was triggering when using it as an AP bridge rather than an ethernet
bridge.

The code is unclear but it works; I'll fix it to be clearer and test
performance at a later stage.
2015-10-21 05:52:04 +00:00
adrian
3ec63ec821 arge: don't do the rx fixup copy and just offset the mbuf by 2 bytes
The existing code meets the "alignment" requirement for the l3 payload
by offsetting the mbuf by uint64_t and then calling an rx fixup routine
to copy the frame backwards by 2 bytes.  This DWORD aligns the
L3 payload so tcp, etc doesn't panic on unaligned access.

This is .. slow.

For arge MACs that support 1 byte TX/RX address alignment, we can do
the "other" hack: offset the RX address of the mbuf so the L3 payload
again is hopefully DWORD aligned.

This is much cheaper - since TX/RX is both 1 byte align ready (thanks
to the previous commit) there's no bounce buffering going on and there
is no rx fixup copying.

This gets bridging performance up from 180mbit/sec -> 410mbit/sec.
There's around 10% of CPU cycles spent in _bus_dmamap_sync(); I'll
investigate that later.

Tested:

* QCA955x SoC (AP135 reference board), bridging arge0/arge1
  by programming the switch to have two vlangroups in dot1q mode:

# ifconfig bridge0 inet 192.168.2.20/24
# etherswitchcfg config vlan_mode dot1q
# etherswitchcfg vlangroup0 members 0,1,2,3,4
# etherswitchcfg vlangroup1 vlan 2 members 5,6
# etherswitchcfg port5 pvid 2
# etherswitchcfg port6 pvid 2
# ifconfig arge1 up
# ifconfig bridge0 addm arge1
2015-10-21 01:41:18 +00:00
adrian
88a4a1403b if_arge: fix up TX workaround; add TX/RX requirements for busdma; add stats
The early ethernet MACs (I think AR71xx and AR913x) require that both
TX and RX require 4-byte alignment for all packets.

The later MACs have started relaxing the requirements.

For now, the 1-byte TX and 1-byte RX alignment requirements are only for
the QCA955x SoCs.  I'll add in the relaxed requirements as I review the
datasheets and do testing.

* Add a hardware flags field and 1-byte / 4-byte TX/RX alignment.
* .. defaulting to 4-byte TX and 4-byte RX alignment.
* Only enforce the TX alignment fixup if the hardware requires a 4-byte
  TX alignment.  This avoids a call to m_defrag().
* Add counters for various situations for further debugging.
* Set the 1-byte and 4-byte busdma alignment requirement when
  the tag is created.

This improves the straight bridging performance from 130mbit/sec
to 180mbit/sec, purely by removing the need for TX path bounce buffers.

The main performance issue is the RX alignment requirement and any RX
bounce buffering that's occuring.  (In a local test, removing the RX
fixup path and just aligning buffers raises the performance to above
400mbit/sec.

In theory it's a no-op for SoCs before the QCA955x.

Tested:

* QCA9558 SoC in AP135 board, using software bridging between arge0/arge1.
2015-10-18 00:59:28 +00:00
bz
2fb4661e45 Remove more unused variables leading to compile time errors. 2015-09-17 12:04:41 +00:00
bz
ca76185f1a Remove unused variable leading to compile errors. 2015-09-17 06:07:49 +00:00
zbb
95f13176f5 Add domain support to PCI bus allocation
When the system has more than a single PCI domain, the bus numbers
are not unique, thus they cannot be used for "pci" device numbering.
Change bus numbers to -1 (i.e. to-be-determined automatically)
wherever the code did not care about domains.

Reviewed by:   jhb
Obtained from: Semihalf
Sponsored by:  The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D3406
2015-09-16 23:34:51 +00:00
adrian
af6717c229 Populate hw.model with the CPU model information.
Now you see something like:

# sysctl hw.model
hw.model: Atheros AR9330 rev 1

Tested:

* Carambola 2, AR9331 SoC
2015-07-14 05:14:10 +00:00
adrian
e44e1c10f7 Reshuffle all of the DDR flush operations into a single switch/mux,
and start teaching subsystems about it.

The Atheros MIPS platforms don't guarantee any kind of FIFO consistency
with interrupts in hardware.  So software needs to do a flush when it
receives an interrupt and before it calls the interrupt handler.

There are new ones for the QCA934x and QCA955x, so do a few things:

* Get rid of the individual ones (for ethernet and IP2);
* Create a mux and enum listing all the variations on DDR flushes;
* replace the uses of IP2 with the relevant one (which will typically
  be "PCI" here);
* call the USB DDR flush before calling the real USB interrupt handlers;
* call the ethernet one upon receiving an interrupt that's for us,
  rather than never calling it during operation.

Tested:

* QCA9558 (TP-Link archer c7 v2)
* AR9331 (Carambola 2)

TODO:

* PCI, USB, ethernet, etc need to do a double-check to see if the
  interrupt was truely for them before doing the DDR.  For now I
  prefer "correct" over "fast".
2015-07-04 03:05:57 +00:00
adrian
bf0aa35100 Oops - fix typo. 2015-07-03 07:00:24 +00:00
adrian
3c153beb7c Enable setting the QCA955x GPIO output mux configuration.
It's not used by any boards yet, but it's going to creep up soon
as more boards show up.
2015-07-03 03:34:21 +00:00
adrian
d584d71770 Add register defines for the QCA955x DDR flush and GPIO control. 2015-07-03 03:32:54 +00:00
adrian
a267e1e605 Add initial support for the QCA955x PCIe host controller.
The QCA955x looks a lot like the AR724x PCIe controller, except it
supports two root complexes.  Unfortunately I only have one, so
although this code has started down the path of supporting more than
one, it's definitely not yet ready.

Tested:

* AP135 board (QCA9558 SoC), with the 11ac NIC swapped for an AR9380
  PCIe NIC.

Notes:

* Yes, this driver isn't very pretty.  I decided to commit what I have
  versus holding onto something that isn't yet finished.  It is enough
  to bring up the above NIC and interrupt routing works, so it's a good
  start.

* However, yes, the DDR flush routine hooks need to be fixed up.
  I don't think I'm firing the right one at the moment.
2015-05-19 05:31:58 +00:00
andrew
e2a65d5cfa Add support for the uart classes to set their default register shift value.
This is needed with the pl011 driver. Before this change it would default
to a shift of 0, however the hardware places the registers at 4-byte
addresses meaning the value should be 2.

This patch fixes this for the pl011 when configured using the fdt. The
other drivers have a default value of 0 to keep this a no-op.

MFC after:	1 week
2015-04-11 17:16:23 +00:00
adrian
11776dfc59 Begin moving support for board MAC addresses over to being explicitly defined.
A lot of these dinky atheros based MIPS boards don't have a nice, well,
anything consistent defining their MAC addresses for things.

The Atheros reference design boards will happily put MAC addresses
into the wifi module calibration data like they should, and individual
ethernet MAC addresses into the calibration area in flash.
That makes my life easy - "hint.arge.X.eeprommac=<addr>" reads from
that flash address to extract a MAC, and everything works fine.

However, aside from some very well behaved vendors (eg the Carambola 2
board), everyone else does something odd.

eg:

* a MAC address in the environment (eg ubiquiti routerstation/RSPRO)
   that you derive arge0/arge1 MAC addresses from.
* a MAC address in flash that you derive arge0/arge1 MAC addresses from.
* The wifi devices having their own MAC addresses in calibration data,
  like normal.
* The wifi devices having a fixed, default or garbage value for a MAC
  address in calibration data, and it has to be derived from the
  system MAC.

So to support this complete nonsense of a situation, there needs to be
a few hacks:

* The "board" MAC address needs to be derived from somewhere and squirreled
  away.  For now it's either redboot or a MAC address stored in calibration
  flash.

* Then, a "map" set of hints to populate kenv with some MAC addresses
  that are derived/local, based on the board address.  Each board has
  a totally different idea of what you do to derive things, so each
  map entry has an "offset" (+ve or -ve) that's added to the board
  MAC address.

* Then if_arge (and later, if_ath) should check kenv for said hint and
  if it's found, use that rather than the EEPROM MAC address - which may
  be totally garbage and not actually work right.

In order to do this, I've undone some of the custom redboot expecting
hacks in if_arge and the stuff that magically adds one to the MAC
address supplied by the board - instead, as I continue to test this
out on more hardware, I'll update the hints file with a map explaining
(a) where the board MAC should come from, and (b) what offsets to use
for each device.

The aim is to have all of the tplink, dlink and other random hardware
we run on have valid MAC addresses at boot, so (a) people don't get
random B:S:Dx:x ethernet MACs, and (b) the wifi MAC is valid
so it works rather than trying to use an invalid address that
actually upsets systems (think: multicast bit set in BSSID.)

Tested:

* TP-Link TL_WDR3600 - subsequent commits will add the hints map
  and the if_ath support.

TODO:

* Since this is -HEAD, and I'm all for debugging, there's a lot of
  printf()s in here.  They'll eventually go under bootverbose.
* I'd like to turn the macaddr routines into something available
  to all drivers - too many places hand-roll random MAC addresses
  and parser stuff.  I'd rather it just be shared code.
  However, that'll require more formal review.
* More boards.
2015-03-28 23:40:29 +00:00
adrian
0e84465cb2 Add GPIO function mux configuration for AR934x SoCs.
The AR934x (and maybe others in this family) have a more complicated
GPIO mux.  The AR71xx just has a single function register for a handful
of "GPIO or X" options, however the AR934x allows for one of roughly
100 behaviours for each GPIO pin.

So, this adds a quick hints based mechanism to configure the output
functions, which is required for some of the more interesting board
configurations.  Specifically, some use external LNAs to improve
RX, and without the MUX/output configured right, the 2GHz RX side
will be plain terrible.

It doesn't yet configure the "input" side yet; I'll add that if
it's required.

Tested:

* TP-Link TL-WDR3600, testing 2GHz STA/AP modes, checking some
  basic RX sensitivity things (ie, "can I see the AP on the other
  side of the apartment that intentionally has poor signal reception
  from where I am right now.")

Whilst here, fix a silly bug in the maxpin routine; I was missing
a break.
2015-03-21 06:08:35 +00:00
adrian
f5a9460e2b add QCA955x PCIe configuration registers.
These are /not/ absolute addresses, as the QCA955x SoC has 2 PCIe RC's
(and 1 PCIe EP.)
2015-03-21 06:00:46 +00:00
adrian
9ce94e7c25 Note that the AR724x PCIe registers are actually from the PCI_CTRL
register range.
2015-03-21 05:59:45 +00:00
adrian
af70fecfb9 Use ar71xx_mac_addr_random_init() instead of a hand-rolled random
MAC address.
2015-03-15 21:56:41 +00:00
adrian
74b6ad6fe4 Start fleshing out some MAC address helper functions.
A lot of these embedded boards don't have a unique MAC address per
device stored somewhere unique - sometimes they'll have one MAC
for both arge NICs; someties they'll have one MAC for both arge NICs
/and/ the ath NICs.  In these instances, we need to derive device
specific MAC addresses from the base MAC address.

These functions will be used by some follow-up code that'll slot
into if_arge and if_ath.
2015-03-15 21:56:12 +00:00
adrian
92c375f66c Modify the if_arge code to use a /fixed/ media mode when it's configured.
Otherwise, the initial media speed would change if a PHY is hooked up,
sending PHY speed notifications.  For the AP135 at least, the RGMII
PHY has a static speed/duplex configured and if the PHY plumbing
attaches the PHY to the if_arge interface, the first link speed change
from 1000/full will set the MAC to something that isn't useful.

This shouldn't affect any other platforms - everything I looked at is
using hard-coded speed/duplex as static, as they're facing a switch
with no PHY attached.
2015-03-08 22:03:54 +00:00
adrian
8f1977d6ce Add ethernet MAC DDR flush hookups for QCA955x.
Tested:

* AP135
2015-03-04 03:52:50 +00:00
adrian
bc2ba88d1c Add DDR flush registers for QCA955x. 2015-03-04 03:51:54 +00:00
adrian
b865675364 [QCA955x] make the USB EHCI interrupts shareable.
There's two EHCI controllers in the QCA955x SoCs - they have different
interrupts available via various demux registers, but they both tie to
IP3.

So for now, allow them to be sharable so they can hang off of IP3.
2015-03-02 02:08:43 +00:00
adrian
1ee8c32e32 Add initial QCA955x support to if_arge.c.
Tested:

* AP135 development board, QCA9558 SoC.
2015-03-02 01:53:47 +00:00
adrian
b5a61383ac Add a MII mode for SGMII.
This appears on the AR934x and later chips, although it's not
something that's programmed via the arge0/arge1 register space.
It's just cosmetic.
2015-03-02 01:23:59 +00:00
adrian
a065c9d2dc Add very initial QCA955x awareness to the GPIO code.
There's a lot more to come - the QCA955x has a bunch more GPIO MUX
configuration, reminiscent of what the ARM chips let you do - but
it'll have to come later.
2015-03-01 07:00:34 +00:00
adrian
90aba284a2 Flesh out some more QCA955x ethernet PLL setup. 2015-03-01 06:59:32 +00:00
adrian
3664d4bf70 Add Ethernet PLL values for the QCA955x.
These are the same as the AR934x.

Obtained from:	Linux openwrt
2015-03-01 06:54:59 +00:00
adrian
0525b8f3b9 Make QCA955X_GMAC_REG_ETH_CFG defined like most other registers like this. 2015-03-01 06:52:23 +00:00
adrian
9709e28ab3 Add QCA955x support to the EHCI setup path.
Tested:

* QCA AP135 development board, USB rootfs.
2015-03-01 06:05:01 +00:00
sbruno
fce5f51901 The linux driver code for the MDIO bus does a read-after-write
which seems to be required on MIPS74k platforms for correct
behaviour.

Reviewed by:	adrian
2015-02-02 17:33:00 +00:00