1893 Commits

Author SHA1 Message Date
Adrian Chadd
cd15736797 Add support for the integrated wifi for the QCA953x base config and
AP143.

Tested:

* AP143 reference design board
2015-11-29 05:49:49 +00:00
Konstantin Belousov
724f4b62b0 Remove sv_prepsyscall, sv_sigsize and sv_sigtbl members of the struct
sysent.

sv_prepsyscall is unused.

sv_sigsize and sv_sigtbl translate signal number from the FreeBSD
namespace into the ABI domain.  It is only utilized on i386 for iBCS2
binaries.  The issue with this approach is that signals for iBCS2 were
delivered with the FreeBSD signal frame layout, which does not follow
iBCS2.  The same note is true for any other potential user of
sv_sigtbl.  In other words, if ABI needs signal number translation, it
really needs custom sv_sendsig method instead.

Sponsored by:	The FreeBSD Foundation
2015-11-28 08:49:07 +00:00
Svatopluk Kraus
eae22c4430 Revert r291142.
The not quite consistent logic for bounce pages allocation is utilized
by re(4) interface which can hang now.

Approved by:	kib (mentor)
2015-11-23 11:19:00 +00:00
Adrian Chadd
181f3573ee [mips]: Don't hard-code PHYS_AVAIL_ENTRIES.
2015-11-22 02:40:19 +00:00
Svatopluk Kraus
6fa7734d6f Fix BUS_DMA_MIN_ALLOC_COMP flag logic. When bus_dmamap_t map is being
created for bus_dma_tag_t tag, bounce pages should be allocated
only if needed.

Before the fix, they were allocated always if BUS_DMA_COULD_BOUNCE flag
was set but BUS_DMA_MIN_ALLOC_COMP not. As bounce pages are never freed,
it could cause memory exhaustion when a lot of such tags together with
their maps were created.

Note that there could be more maps in one tag by current design.
However BUS_DMA_MIN_ALLOC_COMP flag is tag's flag. It's set after
bounce pages are allocated. Thus, they are allocated only for first
tag's map which needs them.

Approved by:	kib (mentor)
2015-11-21 19:55:01 +00:00
Adrian Chadd
d6ebaf0a5e mips: teach the malta platform about extended memory.
Extended memory here is "physical memory above 256MB".
"memsize" in the environment only grows to 256MB; "ememsize" is the entire
memory range.  Extended memory shows up at physical address 0x90000000.

This allows for malta64 VMs to be created with > 256MB RAM, all the way
up to 2GB RAM.

Tested:

* qemu-devel package; qemu-system-mips64 -m 2048 (and -m 256 to test the
  no-ememsize case.)

TODO:

* testing mips32 with > 256MB RAM.

Reviewed by:	imp
2015-11-21 00:22:47 +00:00
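The memsize/ememsize split described in d6ebaf0a5e can be sketched as follows. This is a hypothetical Python model, not the kernel code: the function name `malta_phys_ranges` is invented here, and it assumes that everything above the first 256MB of "ememsize" appears contiguously at 0x90000000.

```python
MB = 1024 * 1024
LOW_MEM_LIMIT = 256 * MB      # "memsize" only grows to this value
EXT_MEM_BASE = 0x90000000     # physical address of extended memory (from the commit message)


def malta_phys_ranges(memsize, ememsize=None):
    """Return a list of (start, end) physical ranges, end exclusive.

    Assumption: the extended region holds exactly ememsize - 256MB bytes,
    laid out contiguously starting at EXT_MEM_BASE.
    """
    ranges = [(0, min(memsize, LOW_MEM_LIMIT))]
    if ememsize is not None and ememsize > LOW_MEM_LIMIT:
        ext_len = ememsize - LOW_MEM_LIMIT
        ranges.append((EXT_MEM_BASE, EXT_MEM_BASE + ext_len))
    return ranges
```

Calling `malta_phys_ranges(256 * MB)` models the no-ememsize case (-m 256), while `malta_phys_ranges(256 * MB, 2048 * MB)` models the -m 2048 case with a second range above 0x90000000.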
Warner Losh
29d1144aeb Mark the mostly redundant kernels that just pull
in something from _BASE as NO_UNIVERSE

Differential Revision: https://reviews.freebsd.org/D4200
2015-11-19 01:58:12 +00:00
Adrian Chadd
e69ad1c9b3 Add the QCA9533 base configuration file and an example configuration
for the AP143.

Wifi doesn't work on the QCA9533 board, but basic ethernet/ethernet
and ethernet switch support does work.

The AP143 has 32MB RAM and 4MB flash, so this was tested with a USB
rootfs.

Tested:

* QCA9533v2, AP143 reference design board.
2015-11-18 06:25:25 +00:00
Allan Jude
a065797aa4 Add a kernel config for the Onion Omega
Small $25 IoT device, 400mhz Atheros cpu, Atheros WiFi and Ethernet
18 GPIOs, and support for Relay, Servo, and OLED expansion
https://onion.io/omega/

Reviewed by:	adrian
Approved by:	bapt (mentor)
Relnotes:	yes
Sponsored by:	ScaleEngine Inc.
Differential Revision:	https://reviews.freebsd.org/D4188
2015-11-17 21:02:27 +00:00
Adrian Chadd
27ddeed4a3 Add QCA9533 to the list of SoCs that require IRQ's be ACKed.
2015-11-16 06:15:01 +00:00
Adrian Chadd
d6141d33bc Add initial support for the QCA953x ("Honeybee") from Qualcomm Atheros.
The QCA953x SoC is an integrated 2x2 2GHz 11n + MIPS24k core, with
a 5 port FE switch, gige WAN port, and all the same stuff you'd find on
its predecessor - the AR9331.

However, buried deep in here somewhere is also a PCIe EP/RC for various
applications and some other weird bits I don't yet know about.

This is enough to get the reference board up and booting.  I haven't yet
had it pass lots of packets - I need to finalise the ethernet switch
bits and the GMAC configuration (ie, how the ethernet ports and switch
are wired up) and I'll bring that in when I commit the base configuration
files to use the thing.

The wifi stuff will come much later.  I have to port that support from
Linux ath9k and extend our vendor HAL to support it.

The reference board (AP143) comes with 32MB RAM and 4MB flash, so in order
to use it I need to get USB working fully so I can run root from there.

Thank you to Qualcomm Atheros for access to the reference design board.

Details:

* Add register definitions from openwrt;
* It looks like a QCA955x but shrunk down to a QCA933x footprint, so
  use the QCA955x bits and fix up the clock detection code to do the
  QCA953x bits (they're very subtly different);
* Teach GPIO about it;
* Teach EHCI about it;
* Teach if_arge about it;
* Teach the CPU detection code about it.

Tested:

* AP143, QCA9533v2 SoC

Obtained from:	Linux, Linux OpenWRT
2015-11-16 04:28:00 +00:00
Adrian Chadd
3036d0128e Remove this; it's also in sys/conf/files.mips.
2015-11-03 21:03:26 +00:00
Adrian Chadd
d24766cc51 mips: rate limit the trap handler output; add pid/tid/program name.
I discovered that we're logging each trap, which gets pretty spendy;
and there wasn't any further information on the pid/tid/progname involved.

I originally noticed this because I don't attach anything to /dev/log and so
the log() output stays going to the kernel.  That's an oops on my part, but
I'm glad I did it.

This commit adds the following:

* a rate limiter, which could do with some eyeballs/ideas on how to
  make it more predictable on SMP;
* log pid, tid, progname (comm) as part of the output.

I now get output like this:

Unaligned Load Word: pid=621 (pmcstat), tid=100060, pc=0xffffffff803ae898, badvaddr=0x40a10055
Unaligned Load Word: pid=621 (pmcstat), tid=100060, pc=0xffffffff803ae898, badvaddr=0x40a10051
Unaligned Load Word: pid=621 (pmcstat), tid=100060, pc=0xffffffff803ae898, badvaddr=0x40a1004d
Unaligned Load Word: pid=602 (login), tid=100042, pc=0xffffffff803ae898, badvaddr=0x401159
Unaligned Load Word: pid=602 (login), tid=100042, pc=0xffffffff803ae898, badvaddr=0x401155
Unaligned Load Word: pid=602 (login), tid=100042, pc=0xffffffff803ae898, badvaddr=0x401151

.. which makes it much easier to start figuring out what/where to fix.

The pc looks suss (it looks like it's in kernel space); I'll dig into that one next.

Tested:

* AR9331 SoC (Carambola2)
2015-11-02 03:36:15 +00:00
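The two additions in d24766cc51 (a rate limiter and the pid/tid/progname fields) can be illustrated with a small model. This is only a sketch under stated assumptions: the class `TrapRateLimiter`, the fixed one-second window, and the helper `format_trap` are hypothetical and not taken from the kernel change. It is single-threaded, so it says nothing about the SMP predictability question raised in the commit message.

```python
class TrapRateLimiter:
    """Allow at most max_per_sec messages per one-second window (assumed policy)."""

    def __init__(self, max_per_sec):
        self.max_per_sec = max_per_sec
        self.window_start = None
        self.count = 0

    def allow(self, now):
        # Start a new window once a full second has elapsed.
        if self.window_start is None or now - self.window_start >= 1.0:
            self.window_start = now
            self.count = 0
        self.count += 1
        return self.count <= self.max_per_sec


def format_trap(kind, pid, comm, tid, pc, badvaddr):
    """Build a line in the same layout as the sample output in the commit message."""
    return "%s: pid=%d (%s), tid=%d, pc=0x%x, badvaddr=0x%x" % (
        kind, pid, comm, tid, pc, badvaddr)
```

A caller would check `allow(now)` before emitting each `format_trap(...)` line, so a burst of unaligned accesses from one process produces only the first few messages in each window.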
Adrian Chadd
f2c42f690f mips: do mips_sync() on sync operations to uncachable memory.
mips24k/mips74k document that we need an explicit SYNC to order
things correctly, even with access to uncachable memory.
We were doing calls to SYNC in the cache ops (inv, wbinv) but we
weren't doing it for uncachable memory.
2015-10-31 00:29:26 +00:00
Adrian Chadd
941f53b9a9 mips74k: use cache-writeback for memory, not writethrough.
When I ported this code from netbsd I was .. slightly mips74k greener.
I used writethrough because (a) it's what netbsd did, and (b) if I used
writethrough then things "didn't work."

Fast-forward a couple years, more MIPS hacking and a whole lot more
understanding of the bus APIs (the last few commits notwithstanding;
it's been a long week, ok?) and I have this working for arge,
argemdio, spi and ath.  Hans has it working for USB.  The ath barrier
code will come in a later commit.

This gets the routing throughput up from 220mbit -> 337mbit.
I'm sure the bridging throughput will be similarly improved.

Tested:

* QCA955x SoC, routing workload.
2015-10-31 00:04:44 +00:00
Adrian Chadd
f17acb5fbe arge_mdio: fix barriers; correctly check MII indicator register.
* use barriers in a slightly better fashion.  You can blame this
  glass of whiskey on putting barriers in the wrong spot.  Grr adrian.

* steal/rewrite the mdio busy check from ag7100 from openwrt and
  refactor the existing code out.  This is .. more correct.

This seems to fix the boot-to-boot variation that I've been seeing
and it quietens the switch port status flapping.

Tested:

* QCA9558 SoC (AP135.)

Obtained from:	Linux OpenWRT
2015-10-30 23:59:52 +00:00
Adrian Chadd
78e1370bbc arge: fix barrier macro.
2015-10-30 23:57:20 +00:00
Adrian Chadd
29f88ae706 arge: attempt to close a transmit race by only enabling the descriptor at the end of setup.
This driver and the linux ag71xx driver both treat the transmit ring
as a circular linked list of descriptors.  There's no "end" pointer
that is ever NULL - instead, it expects the MAC to hit a finished
descriptor (ARGE_DESC_EMPTY) and stop.

Now, since it's a circular buffer, we may end up with the hardware
hitting the beginning of our multi-descriptor frame before we've finished
setting it up. It then DMA's it in, starts sending it, and we finish
writing out the new descriptor.  The hardware may then write its
completion for the next descriptor out; then we do, and when we next
read it it'll show up as "not done" and transmit completion stops.

This unfortunately manifests itself as the transmit queue always
being active and a massive TX interrupt storm.  We need to actively
ACK packets back from the transmit engine and if we don't (eg because
we think the transmit isn't finished but it is) then the unit will
just keep generating interrupts.

I hit this finally with the below testing setup.  This fixed it for me.

Strictly speaking I should put in a sync in between writing out all of
the descriptors and writing out that final descriptor.

Tested:

* QCA9558 SoC (AP135 reference board) w/ arge1 + vlans acting as a
  router, and iperf -d (tcp, bidirectional traffic.)

Obtained from:	Linux OpenWRT (ag71xx_main.c.)
2015-10-30 23:18:02 +00:00
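The ordering idea in 29f88ae706 (only enable the first descriptor once the whole frame is written) can be modelled roughly as below. This is a hedged sketch, not the driver code: the ring is a plain Python list of control words, `enqueue_frame` is a hypothetical helper, and the bit position chosen for ARGE_DESC_EMPTY is an assumption. It also omits the sync between writing the descriptors and the final write that the commit message says is strictly needed.

```python
ARGE_DESC_EMPTY = 1 << 31   # flag name from the commit message; bit position assumed


def enqueue_frame(ring, prod, seg_lengths):
    """Write one multi-segment frame into the circular ring starting at prod.

    Returns the next producer index.
    """
    n = len(ring)
    first = prod
    for i, length in enumerate(seg_lengths):
        idx = (first + i) % n
        ctl = length
        if i == 0:
            # Keep the first descriptor marked empty so the MAC stops here
            # while the remaining descriptors are still being written.
            ctl |= ARGE_DESC_EMPTY
        ring[idx] = ctl
    # Last step: clear the flag on the first descriptor, handing the
    # complete frame to the hardware in one go.
    ring[first] &= ~ARGE_DESC_EMPTY
    return (first + len(seg_lengths)) % n
```

Because the hardware treats an ARGE_DESC_EMPTY descriptor as the stopping point, it cannot start on a half-written frame in this model; it either sees the old "empty" state or the fully populated chain.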
Adrian Chadd
70487bd29b arge: just use 1U since it's a 32 bit unsigned destination value.
2015-10-30 23:09:08 +00:00
Adrian Chadd
a73d5cc09f arge: do an explicit flush between updating the TX ring and starting transmit.
The MIPS busdma sync operations currently are a big no-op on coherent memory.
This isn't strictly correct behaviour as we need a SYNC in here to ensure that
the writes have finished and are visible in main memory before the MMIO accesses
occur.  This will have to be addressed in a later commit.

But, before that happens, let's at least do a flush here to make things
more "correct".

This is required for even remotely sensible behaviour on mips74k with
write-through memory enabled.
2015-10-30 23:07:32 +00:00
Adrian Chadd
ab2477c2c1 arge_mdio: add explicit read barriers for MDIO_READs.
The mips74k programmers guide notes that reads can be re-ordered, even
uncached ones, so we need an explicit SYNC between them.

Yes, this is a case of a driver author actively doing a bus barrier
operation.

This ends up being necessary when the mips74k core is run in write-back
mode rather than write-through mode.  That's coming in an upcoming
commit.

Tested:

* mips74k, QCA9558 SoC (AP135 reference board), arge<->arge interface
  routing traffic tests.
2015-10-30 23:00:47 +00:00
Adrian Chadd
47ed24efe2 arge: ensure there's enough space in the TX ring before attempting to
send frames.

This matches the other check for space.

"enough" is a misnomer, for "reasons".  The biggest reason is that
the TX ring is actually a circular linked list, with no head/tail pointers.
This is just a bit more headroom between head/tail so we have time to
schedule frames before we hit where the hardware is at.

Ideally this would be tunable and a little larger.
2015-10-30 22:55:41 +00:00
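The headroom check described in 47ed24efe2 amounts to a comparison like the one below. This is a minimal sketch: the function name, the parameter names, and the default headroom of 4 descriptors are all hypothetical, since the commit message does not give the actual values.

```python
def tx_ring_has_space(ring_size, in_flight, nsegs, headroom=4):
    """Return True if nsegs more descriptors fit while keeping headroom free.

    headroom is the extra gap kept between where software writes and where
    the hardware is reading; its value here is an assumption.
    """
    free = ring_size - in_flight
    return free >= nsegs + headroom
```

Making `headroom` a parameter reflects the remark that this value would ideally be tunable and a little larger.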
Adrian Chadd
3b8a3b85eb arge: do a read-after-write on all arge register writes, not just MDIO writes.
This flushes out the write to the system before anything continues.

The mips74k guide, chapter 3.3.3 (write gathering) notes that writes
can be buffered in FIFOs - even uncached ones - so we can't guarantee
the device has felt its effects.  Now, since we're all lazy driver
authors and don't pepper read/write barriers everywhere, fake it here.

Tested:

* mips74k - QCA9558 SoC (AP135 reference board)
2015-10-30 22:53:30 +00:00
Adrian Chadd
948457f1be Oops - use the wrong array offset.
2015-10-28 23:39:33 +00:00
Adrian Chadd
3ea1870967 Add some debugging code (under ARGE_DEBUG) that counts each interrupt source.
This should make it easier to track down interrupt storms from arge.

Tested:

* AP135 (QCA955x) SoC - defaults to ARGE_DEBUG enabled
* Carambola2 (AR9331 SoC) - defaults to ARGE_DEBUG disabled
2015-10-28 05:11:06 +00:00
Adrian Chadd
87af896340 mips: use the correct va for wbinv flushing.
arge doesn't trigger this, but ath(4) does.

Tested:

* AR9331 SoC (Carambola2); ath(4) hostap

Submitted by:	ian
2015-10-27 23:11:22 +00:00
Adrian Chadd
141a008498 arge(4): flip this on for AR9344 SoCs.
I couldn't test arge0->arge1 bridging, only arge0 VLAN bridging.
The DIR-825C1 only hooks up arge0 to the switch GMAC0 and so
you need to abuse VLANs to test.

Tested:

* DIR-825C1 (AR9344)
2015-10-24 22:37:59 +00:00
Adrian Chadd
bd1df7e776 Commit the right board file - use the right name + hints.
2015-10-22 15:15:45 +00:00
Adrian Chadd
bb5c955e8d Add support for the TP-Link TL-WR740N v4.
This is an AR9331 part based on the AP121 reference design but with
32MB RAM.  Yes, it has 4MB flash and it has no USB, so clever hacks
are required to get it up and working.

But boot/work it does.
2015-10-22 08:08:06 +00:00
Adrian Chadd
73f96038d2 arge: use 1-byte TX and RX alignment for AR9330/AR9331.
This part seems to work bug-free with single byte TX/RX buffer alignment.

This drops the CPU requirement to bridge 100mbit iperf from 100% CPU
to ~ 50% CPU.

Tested:

* AP121 (AR9330) SoC, highly magic netbooted kernel + USB rootfs
  due to 4mb flash, 16mb RAM; doing bridging between arge0 and arge1.

Notes:

* Yes, I likely can also turn this on for the AR934x SoC family now.

  But since hardware design apparently follows similar branching
  strategies to software design, I'll go and make sure all the AR934x's
  that made it out into shipping products work before I flip it on.
2015-10-22 08:02:27 +00:00
Ian Lepore
2bd58a9fa5 Treat mbufs as cacheline-aligned. Even when the transfer begins at an
offset within the buffer to align the L3 headers we know the buffer itself
was allocated and sized on cacheline boundaries and we don't need to
preserve partial cachelines at the start and end of the buffer when
doing busdma sync operations.
2015-10-21 19:24:20 +00:00
Ian Lepore
2fca9311fc Free memory back into the categories it was allocated from.
Noticed by: sbruno
Pointy hat: ian
2015-10-21 17:41:20 +00:00
Ian Lepore
f9a5123470 Switch mips busdma to using the common busdma_buffalloc code. This amounts
to copying in some code from the armv4 busdma, and adapting a few variable
and flag names to match the surrounding mips code.

Instead of keeping a local cache of prealloced busdma_map structs on a
mutex-protected list, set up an uma zone to cache them.

Instead of all memory allocations using M_DEVBUF, use new categories
M_BUSDMA for allocations of metadata (tags, maps, segment tracking lists),
and M_BOUNCE for bounce pages.

When buffers are allocated out of the busdma_bufalloc zones the alignment
and size of the buffers is known, and the code can skip doing any "partial
cacheline flush" logic to preserve data that may be adjacent to the DMA
buffer but contain non-DMA data.

Reviewed by:	adrian, imp
2015-10-21 15:06:48 +00:00
Ian Lepore
f4110e9110 Switch from a stub to a real implementation of pmap_page_set_attr() for mips,
and implement support for VM_MEMATTR_UNCACHEABLE.  This will be used in
upcoming changes to support BUS_DMA_COHERENT in bus_dmamem_alloc().

Reviewed by:	adrian, imp
2015-10-21 14:57:59 +00:00
Adrian Chadd
c358c04640 arge: Remove the debugging printf that snuck in.
This was triggering when using it as an AP bridge rather than an ethernet
bridge.

The code is unclear but it works; I'll fix it to be clearer and test
performance at a later stage.
2015-10-21 05:52:04 +00:00
Adrian Chadd
240de6998b arge: don't do the rx fixup copy and just offset the mbuf by 2 bytes
The existing code meets the "alignment" requirement for the l3 payload
by offsetting the mbuf by uint64_t and then calling an rx fixup routine
to copy the frame backwards by 2 bytes.  This DWORD aligns the
L3 payload so tcp, etc doesn't panic on unaligned access.

This is .. slow.

For arge MACs that support 1 byte TX/RX address alignment, we can do
the "other" hack: offset the RX address of the mbuf so the L3 payload
again is hopefully DWORD aligned.

This is much cheaper - since TX/RX is both 1 byte align ready (thanks
to the previous commit) there's no bounce buffering going on and there
is no rx fixup copying.

This gets bridging performance up from 180mbit/sec -> 410mbit/sec.
There's around 10% of CPU cycles spent in _bus_dmamap_sync(); I'll
investigate that later.

Tested:

* QCA955x SoC (AP135 reference board), bridging arge0/arge1
  by programming the switch to have two vlangroups in dot1q mode:

# ifconfig bridge0 inet 192.168.2.20/24
# etherswitchcfg config vlan_mode dot1q
# etherswitchcfg vlangroup0 members 0,1,2,3,4
# etherswitchcfg vlangroup1 vlan 2 members 5,6
# etherswitchcfg port5 pvid 2
# etherswitchcfg port6 pvid 2
# ifconfig arge1 up
# ifconfig bridge0 addm arge1
2015-10-21 01:41:18 +00:00
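The arithmetic behind the 2-byte offset in 240de6998b is easy to check. An Ethernet header is 14 bytes, so a frame received at a 4-byte-aligned address leaves the L3 header 2 bytes off; starting the frame 2 bytes in puts it at offset 16. The sketch below is illustrative only: the helper names are hypothetical, and `ETHER_ALIGN` is used here simply as a label for the 2-byte offset.

```python
ETHER_HDR_LEN = 14   # destination MAC + source MAC + EtherType
ETHER_ALIGN = 2      # offset applied to the RX buffer start


def l3_addr(buf_addr, rx_offset):
    """Address of the L3 header when the frame is received at buf_addr + rx_offset."""
    return buf_addr + rx_offset + ETHER_HDR_LEN


def is_aligned(addr, alignment=4):
    return addr % alignment == 0
```

With `buf_addr = 0x1000`, `l3_addr(0x1000, 0)` is misaligned while `l3_addr(0x1000, ETHER_ALIGN)` is aligned. The older path described above (offset by 8 bytes, then copy the frame back by 2) reaches the same alignment, `l3_addr(0x1000, 8) - 2`, but pays for a copy of every frame.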
Sean Bruno
205bb74daa Disable SWAPPING as we don't do it on this board.
2015-10-20 19:32:26 +00:00
Sean Bruno
3675892f65 Remove geom_uncompress from TP-MR3020 config. It's now using root on USB
and there's no need for it now.
2015-10-18 18:41:30 +00:00
Sean Bruno
dd9f3185c9 Add VM_KMEM_SIZE_SCALE=1 as these systems are going to have super small
amount of RAM, e.g. 16M or 32M

Reviewed by:	adrian
2015-10-18 18:40:11 +00:00
Sean Bruno
a53f1fce3b Correctly use the default values for location of MAC addrs of arge0,
arge1, ath0.  woo!

Reviewed by:	adrian
2015-10-18 04:50:51 +00:00
Adrian Chadd
9919dec83c if_arge: fix up TX workaround; add TX/RX requirements for busdma; add stats
The early ethernet MACs (I think AR71xx and AR913x) require 4-byte
alignment for both TX and RX for all packets.

The later MACs have started relaxing the requirements.

For now, the 1-byte TX and 1-byte RX alignment requirements are only for
the QCA955x SoCs.  I'll add in the relaxed requirements as I review the
datasheets and do testing.

* Add a hardware flags field and 1-byte / 4-byte TX/RX alignment.
* .. defaulting to 4-byte TX and 4-byte RX alignment.
* Only enforce the TX alignment fixup if the hardware requires a 4-byte
  TX alignment.  This avoids a call to m_defrag().
* Add counters for various situations for further debugging.
* Set the 1-byte and 4-byte busdma alignment requirement when
  the tag is created.

This improves the straight bridging performance from 130mbit/sec
to 180mbit/sec, purely by removing the need for TX path bounce buffers.

The main performance issue is the RX alignment requirement and any RX
bounce buffering that's occurring.  (In a local test, removing the RX
fixup path and just aligning buffers raises the performance to above
400mbit/sec.)

In theory it's a no-op for SoCs before the QCA955x.

Tested:

* QCA9558 SoC in AP135 board, using software bridging between arge0/arge1.
2015-10-18 00:59:28 +00:00
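The hardware-flag logic listed in 9919dec83c can be sketched as a simple predicate. This is a simplified, hypothetical model: the flag names and values are invented for illustration, and the check looks only at segment start addresses, which may be narrower than what the driver actually tests.

```python
HW_TX_ALIGN_4BYTE = 0x1   # hypothetical flag values, not the driver's
HW_TX_ALIGN_1BYTE = 0x2


def tx_needs_fixup(hw_flags, segments):
    """Decide whether a frame needs the TX alignment fixup.

    segments is a list of (address, length) pairs for one outgoing frame.
    """
    if hw_flags & HW_TX_ALIGN_1BYTE:
        # Relaxed hardware: any start address is acceptable.
        return False
    # Default 4-byte hardware: every segment must start on a 4-byte boundary.
    return any(addr % 4 != 0 for addr, _length in segments)
```

On hardware flagged for 1-byte alignment the predicate is always false, which is how the fixup (and the m_defrag() call behind it) is avoided on the QCA955x.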
Ed Maste
42d17d369b Add Ubiquiti EdgeRouter Lite (ERL) kernel config file
The ERL is a fairly cheap (~$100 USD) and readily available dual core
MIPS64 device so it makes a useful MIPS reference platform.

This is based in part on the kernel config generated by the mkerlimage
script from http://rtfm.net/FreeBSD/ERL/.

Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D3884
2015-10-14 21:10:05 +00:00
Sean Bruno
88a1a27261 Correct flash layout (this is a 4M flash unit).
Remove "rootfs" entry and assign the 800K or so to the kernel
partition as this unit boots from usb mass storage.
2015-10-11 18:37:29 +00:00
Alexander Motin
4a3760bae6 Remove compatibility shims for legacy ATA device names.
We got new ATA stack in FreeBSD 8.x, switched to it at 9.x, completely
removed old stack at 10.x, so at 11.x it is time to remove compat shims.
2015-10-11 13:01:51 +00:00
Sean Bruno
fc28939612 Use machine specific values gleaned from openwrt for the mac address
location on the TP link mr3020
2015-10-11 03:31:11 +00:00
Sean Bruno
6b79bfd0de There's no way a fbsd install + kernel will fit into 4MB of flash.
Assume and enforce the fact that this will always boot a rootfs from
usb.
2015-10-10 19:08:34 +00:00
Adrian Chadd
27b3a39a8f Update the AP135 reference design flash layout to be more useful.
* Shuffle the kernel to be at the beginning
* Give the kernel 2mb, the rootfs 6mb, and 'mib0' the rest
* put the cfg partition just before the ART calibration data for the
  wifi part in the SoC
* .. and make sure ART points to the right 64k region.

I've updated the freebsd-wifi-build wiki with instructions on using this.

If someone has an AP135 with 8MB SPI flash then this won't work; everything
minus the big mib0 partition is just a bit over 8MB.  Come see me if this
ever happens (you'll likely just have to shrink the rootfs and the kernel
a little in order to make it fit.)

Tested:

* AP135 reference board.
2015-10-10 05:00:18 +00:00
Sean Bruno
2bfaedae36 Set correct argemdio addr, comment out arge1 as it's not physically
connected to anything.  Move a couple of devices out of the kernel
and into modules.
2015-10-04 22:50:37 +00:00
Alan Cox
9f86aba61c Exploit r288122 to address a cosmetic issue. Since PV chunk pages don't
belong to a vm object, they can't be paged out.  Since they can't be paged
out, they are never enqueued in a paging queue.  Nonetheless, passing
PQ_INACTIVE to vm_page_unwire() creates the appearance that these pages
are being enqueued in the inactive queue.  As of r288122, we can avoid
this false impression by passing PQ_NONE.

Submitted by:	kmacy (an earlier version)
Differential Revision:	https://reviews.freebsd.org/D1674
2015-09-26 07:18:05 +00:00
Konstantin Belousov
cff8c6f2d1 Add support for weak symbols to the kernel linkers. It means that
linkers no longer raise an error when undefined weak symbols are
found, but relocate as if the symbol value was 0.  Note that we do not
repeat the mistake of userspace dynamic linker of making the symbol
lookup prefer non-weak symbol definition over the weak one, if both
are available.  In fact, kernel linker uses the first definition
found, and ignores duplicates.

Signature of the elf_lookup() and elf_obj_lookup() functions changed
to split result/error code and the symbol address returned.
Otherwise, it is impossible to return zero address as the symbol
value, to MD relocation code.  This explains the mechanical changes in
elf_machdep.c sources.

The powerpc64 R_PPC_JMP_SLOT handler did not check the error from the
lookup() call; the patch leaves the code as is (untested).

Reported by:	glebius
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-09-20 01:27:59 +00:00
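The lookup semantics described in cff8c6f2d1 can be modelled in a few lines. This is a hypothetical Python sketch, not the kernel linker: `lookup_symbol`, its arguments, and the use of ENOENT as the error code are assumptions made for illustration. It shows why the result is split into an error code and an address: a successful lookup may legitimately return address 0.

```python
import errno


def lookup_symbol(name, definitions, weak_refs):
    """Resolve name to an (error, address) pair.

    definitions: list of (name, address) pairs in search order.
    weak_refs: set of names that are referenced as weak symbols.
    """
    for def_name, addr in definitions:
        if def_name == name:
            # The first definition found wins; later duplicates are ignored.
            return 0, addr
    if name in weak_refs:
        # Undefined weak symbol: not an error, relocate as if the value was 0.
        return 0, 0
    # Undefined non-weak symbol is still an error (error code assumed).
    return errno.ENOENT, 0
```

With a single return value, the undefined-weak case and the failure case would both come back as 0, which is the ambiguity the signature change removes.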