freebsd-skq

Author	SHA1	Message	Date
ian	a354169953	Add a mips implementation of OF_decode_addr().	2015-12-21 18:19:14 +00:00
adrian	b6b0db3869	[mips] print out l2 cache configuration if it exists. The Ingenic JZ4780 SoC that is on the Imagination Technologies CI20 board has an L2 cache: Cache info: picache_stride = 4096 picache_loopcount = 8 pdcache_stride = 4096 pdcache_loopcount = 8 cpu0: Ingenic Xburst processor v79.2 MMU: Standard TLB, 32 entries L1 i-cache: 8 ways of 128 sets, 32 bytes per line L1 d-cache: 8 ways of 128 sets, 32 bytes per line L2 cache: 8 ways of 256 sets, 128 bytes per line, 256 KiB total size Config1=0xbe67338b<WatchRegs,EJTAG,FPU> Config2=0x80000267 Config3=0x20	2015-12-21 01:48:16 +00:00
ian	d49c59ad67	Tidy up mips ofw_machdep.h. Don't include openfirm.h because openfirm.h is what includes machine/ofw_machdep.h. Don't declare OF_decode_addr(); it isn't implemented yet on mips and the declaration for it is about to be commonized into openfirm.h.	2015-12-20 19:09:12 +00:00
alc	8343c406db	Introduce a new mechanism for relocating virtual pages to a new physical address and use this mechanism when: 1. kmem_alloc_{attr,contig}() can't find suitable free pages in the physical memory allocator's free page lists. This replaces the long-standing approach of scanning the active and inactive queues, converting clean pages into PG_CACHED pages and laundering dirty pages. In contrast, the new mechanism does not use PG_CACHED pages nor does it trigger a large number of I/O operations. 2. on 32-bit MIPS processors, uma_small_alloc() and the pmap can't find free pages in the physical memory allocator's free page lists that are covered by the direct map. Tested by: adrian 3. ttm_bo_global_init() and ttm_vm_page_alloc_dma32() can't find suitable free pages in the physical memory allocator's free page lists. In the coming months, I expect that this new mechanism will be applied in other places. For example, balloon drivers should use relocation to minimize fragmentation of the guest physical address space. Make vm_phys_alloc_contig() a little smarter (and more efficient in some cases). Specifically, use vm_phys_segs[] earlier to avoid scanning free page lists that can't possibly contain suitable pages. Reviewed by: kib, markj Glanced at: jhb Discussed with: jeff Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D4444	2015-12-19 18:42:50 +00:00
adrian	7ef1ee2112	[qca953x] remove unneeded initialisation. This was copied from another chip file and it's not required on Honeybee. Tested: * AP143, QCA9531 SoC. Obtained from: OpenWRT	2015-12-15 04:45:00 +00:00
adrian	795576f81b	[ar71xx] always count interrupts, spurious or otherwise. This aids in debugging.	2015-12-15 04:44:06 +00:00
adrian	9c16fc858e	[arge] add a comment about needing mdio busses in order to use the interface. This is a holdover from how reset is handled in the ARGE_MDIO world. You need to define the mdio bus device if you want to use the ethernet device or the arge setup path doesn't bring the MAC out of reset.	2015-12-15 04:43:28 +00:00
imp	69e096c120	Correct the CONFIG0_VI value. According to http://www.t-es-t.hu/download/mips/md00090c.pdf this is bit 3 of the config0 word, not bit 2. This should fix virtually indexed caches (relatively new in the MIPS world, so no current platforms used this and current code just uses it as an optimization). It was causing false positives on newer platforms that default to large values for the kseg0 cache coherency attribute. Submitted by: Stanislav Galabov PR: 205249	2015-12-11 16:51:04 +00:00
markj	f734f97f4e	Add helper functions proc_readmem() and proc_writemem(). These helper functions can be used to read in or write a buffer from or to an arbitrary process' address space. Without them, this can only be done using proc_rwmem(), which requires the caller to fill out a uio. This is onerous and results in code duplication; the new functions provide a simpler interface which is sufficient for most existing callers of proc_rwmem(). This change also adds a manual page for proc_rwmem() and the new functions. Reviewed by: jhb, kib Differential Revision: https://reviews.freebsd.org/D4245	2015-12-07 21:33:15 +00:00
adrian	0a5b508e93	Add support for the integrated wifi for the QCA953x base config and AP143. Tested: * AP143 reference design board	2015-11-29 05:49:49 +00:00
kib	ee461b4bba	Remove sv_prepsyscall, sv_sigsize and sv_sigtbl members of the struct sysent. sv_prepsyscall is unused. sv_sigsize and sv_sigtbl translate signal numbers from the FreeBSD namespace into the ABI domain. They are only used on i386 for iBCS2 binaries. The issue with this approach is that signals for iBCS2 were delivered with the FreeBSD signal frame layout, which does not follow iBCS2. The same note is true for any other potential user of sv_sigtbl. In other words, if an ABI needs signal number translation, it really needs a custom sv_sendsig method instead. Sponsored by: The FreeBSD Foundation	2015-11-28 08:49:07 +00:00
skra	40737e57a9	Revert r291142. The not quite consistent logic for bounce page allocation is relied upon by the re(4) interface, which can now hang. Approved by: kib (mentor)	2015-11-23 11:19:00 +00:00
adrian	8de7408125	[mips]: Don't hard-code PHYS_AVAIL_ENTRIES.	2015-11-22 02:40:19 +00:00
skra	878d380e47	Fix BUS_DMA_MIN_ALLOC_COMP flag logic. When bus_dmamap_t map is being created for bus_dma_tag_t tag, bounce pages should be allocated only if needed. Before the fix, they were allocated always if BUS_DMA_COULD_BOUNCE flag was set but BUS_DMA_MIN_ALLOC_COMP not. As bounce pages are never freed, it could cause memory exhaustion when a lot of such tags together with their maps were created. Note that there could be more maps in one tag by current design. However BUS_DMA_MIN_ALLOC_COMP flag is tag's flag. It's set after bounce pages are allocated. Thus, they are allocated only for first tag's map which needs them. Approved by: kib (mentor)	2015-11-21 19:55:01 +00:00
adrian	b5c03ef320	mips: teach the malta platform about extended memory. Extended memory here is "physical memory above 256MB". "memsize" in the environment only grows to 256MB; "ememsize" is the entire memory range. Extended memory shows up at physical address 0x90000000. This allows for malta64 VMs to be created with > 256MB RAM, all the way up to 2GB RAM. Tested: * qemu-devel package; qemu-system-mips64 -m 2048 (and -m 256 to test the no-ememsize case.) TODO: * testing mips32 with > 256MB RAM. Reviewed by: imp	2015-11-21 00:22:47 +00:00
imp	2df6cd56e8	Mark the mostly redundant kernels that just pull in something from _BASE as NO_UNIVERSE Differential Revision: https://reviews.freebsd.org/D4200	2015-11-19 01:58:12 +00:00
adrian	7100cc07d3	Add the QCA9533 base configuration file and an example configuration for the AP143. Wifi doesn't work on the QCA9533 board, but basic ethernet/ethernet and ethernet switch support does work. The AP143 has 32MB RAM and 4MB flash, so this was tested with a USB rootfs. Tested: * QCA9533v2, AP143 reference design board.	2015-11-18 06:25:25 +00:00
allanjude	2832a618fc	Add a kernel config for the Onion Omega Small $25 IoT device, 400MHz Atheros CPU, Atheros WiFi and Ethernet, 18 GPIOs, and support for Relay, Servo, and OLED expansion https://onion.io/omega/ Reviewed by: adrian Approved by: bapt (mentor) Relnotes: yes Sponsored by: ScaleEngine Inc. Differential Revision: https://reviews.freebsd.org/D4188	2015-11-17 21:02:27 +00:00
adrian	d03d60bd6b	Add QCA9533 to the list of SoCs that require IRQ's be ACKed.	2015-11-16 06:15:01 +00:00
adrian	83e37c2104	Add initial support for the QCA953x ("Honeybee") from Qualcomm Atheros. The QCA953x SoC is an integrated 2x2 2GHz 11n + MIPS24k core, with a 5 port FE switch, gige WAN port, and all the same stuff you'd find on its predecessor - the AR9331. However, buried deep in here somewhere is also a PCIe EP/RC for various applications and some other weird bits I don't yet know about. This is enough to get the reference board up and booting. I haven't yet had it pass lots of packets - I need to finalise the ethernet switch bits and the GMAC configuration (ie, how the ethernet ports and switch are wired up) and I'll bring that in when I commit the base configuration files to use the thing. The wifi stuff will come much later. I have to port that support from Linux ath9k and extend our vendor HAL to support it. The reference board (AP143) comes with 32MB RAM and 4MB flash, so in order to use it I need to get USB working fully so I can run root from there. Thank you to Qualcomm Atheros for access to the reference design board. Details: * Add register definitions from openwrt; * It looks like a QCA955x but shrunk down to a QCA933x footprint, so use the QCA955x bits and fix up the clock detection code to do the QCA953x bits (they're very subtly different); * Teach GPIO about it; * Teach EHCI about it; * Teach if_arge about it; * Teach the CPU detection code about it. Tested: * AP143, QCA9533v2 SoC Obtained from: Linux, Linux OpenWRT	2015-11-16 04:28:00 +00:00
adrian	ab6e31c922	Remove this; it's also in sys/conf/files.mips.	2015-11-03 21:03:26 +00:00
adrian	4ee0fbe912	mips: rate limit the trap handler output; add pid/tid/program name. I discovered that we're logging each trap, which gets pretty spendy; and there wasn't any further information on the pid/tid/progname involved. I originally noticed this because I don't attach anything to /dev/log and so the log() output stays going to the kernel. That's an oops on my part, but I'm glad I did it. This commit adds the following: * a rate limiter, which could do with some eyeballs/ideas on how to make it more predictable on SMP; * log pid, tid, progname (comm) as part of the output. I now get output like this: Unaligned Load Word: pid=621 (pmcstat), tid=100060, pc=0xffffffff803ae898, badvaddr=0x40a10055 Unaligned Load Word: pid=621 (pmcstat), tid=100060, pc=0xffffffff803ae898, badvaddr=0x40a10051 Unaligned Load Word: pid=621 (pmcstat), tid=100060, pc=0xffffffff803ae898, badvaddr=0x40a1004d Unaligned Load Word: pid=602 (login), tid=100042, pc=0xffffffff803ae898, badvaddr=0x401159 Unaligned Load Word: pid=602 (login), tid=100042, pc=0xffffffff803ae898, badvaddr=0x401155 Unaligned Load Word: pid=602 (login), tid=100042, pc=0xffffffff803ae898, badvaddr=0x401151 .. which makes it much easier to start figuring out what/where to fix. The pc looks suss (it looks like it's in kernel space); I'll dig into that one next. Tested: * AR9331 SoC (Carambola2)	2015-11-02 03:36:15 +00:00
adrian	943c3df264	mips: do mips_sync() on sync operations to uncachable memory. mips24k/mips74k document that we need an explicit SYNC to order things correctly, even with access to uncachable memory. We were doing calls to SYNC in the cache ops (inv, wbinv) but we weren't doing it for uncachable memory.	2015-10-31 00:29:26 +00:00
adrian	f8c53f2a34	mips74k: use cache-writeback for memory, not writethrough. When I ported this code from netbsd I was .. slightly mips74k greener. I used writethrough because (a) it's what netbsd did, and (b) if I used writethrough then things "didn't work." Fast-forward a couple years, more MIPS hacking and a whole lot more understanding of the bus APIs (the last few commits notwithstanding; it's been a long week, ok?) and I have this working for arge, argemdio, spi and ath. Hans has it working for USB. The ath barrier code will come in a later commit. This gets the routing throughput up from 220mbit -> 337mbit. I'm sure the bridging throughput will be similarly improved. Tested: * QCA955x SoC, routing workload.	2015-10-31 00:04:44 +00:00
adrian	be517d7ef4	arge_mdio: fix barriers; correctly check MII indicator register. * use barriers in a slightly better fashion. You can blame this glass of whiskey on putting barriers in the wrong spot. Grr adrian. * steal/rewrite the mdio busy check from ag7100 from openwrt and refactor the existing code out. This is .. more correct. This seems to fix the boot-to-boot variation that I've been seeing and it quietens the switch port status flapping. Tested: * QCA9558 SoC (AP135.) Obtained from: Linux OpenWRT	2015-10-30 23:59:52 +00:00
adrian	722c2df320	arge: fix barrier macro.	2015-10-30 23:57:20 +00:00
adrian	6b197b2c77	arge: attempt to close a transmit race by only enabling the descriptor at the end of setup. This driver and the linux ag71xx driver both treat the transmit ring as a circular linked list of descriptors. There's no "end" pointer that is ever NULL - instead, it expects the MAC to hit a finished descriptor (ARGE_DESC_EMPTY) and stop. Now, since it's a circular buffer, we may end up with the hardware hitting the beginning of our multi-descriptor frame before we've finished setting it up. It then DMA's it in, starts sending it, and we finish writing out the new descriptor. The hardware may then write its completion for the next descriptor out; then we do, and when we next read it it'll show up as "not done" and transmit completion stops. This unfortunately manifests itself as the transmit queue always being active and a massive TX interrupt storm. We need to actively ACK packets back from the transmit engine and if we don't (eg because we think the transmit isn't finished but it is) then the unit will just keep generating interrupts. I hit this finally with the below testing setup. This fixed it for me. Strictly speaking I should put in a sync in between writing out all of the descriptors and writing out that final descriptor. Tested: * QCA9558 SoC (AP135 reference board) w/ arge1 + vlans acting as a router, and iperf -d (tcp, bidirectional traffic.) Obtained from: Linux OpenWRT (ag71xx_main.c.)	2015-10-30 23:18:02 +00:00
adrian	1c25d2a759	arge: just use 1U since it's a 32 bit unsigned destination value.	2015-10-30 23:09:08 +00:00
adrian	305a5a647d	arge: do an explicit flush between updating the TX ring and starting transmit. The MIPS busdma sync operations currently are a big no-op on coherent memory. This isn't strictly correct behaviour as we need a SYNC in here to ensure that the writes have finished and are visible in main memory before the MMIO accesses occur. This will have to be addressed in a later commit. But, before that happens, let's at least do a flush here to make things more "correct". This is required for even remotely sensible behaviour on mips74k with write-through memory enabled.	2015-10-30 23:07:32 +00:00
adrian	401d1e9afa	arge_mdio: add explicit read barriers for MDIO_READs. The mips74k programmers guide notes that reads can be re-ordered, even uncached ones, so we need an explicit SYNC between them. Yes, this is a case of a driver author actively doing a bus barrier operation. This ends up being necessary when the mips74k core is run in write-back mode rather than write-through mode. That's coming in an upcoming commit. Tested: * mips74k, QCA9558 SoC (AP135 reference board), arge<->arge interface routing traffic tests.	2015-10-30 23:00:47 +00:00
adrian	ea415f2530	arge: ensure there's enough space in the TX ring before attempting to send frames. This matches the other check for space. "enough" is a misnomer, for "reasons". The biggest reason is that the TX ring is actually a circular linked list, with no head/tail pointers. This is just a bit more headroom between head/tail so we have time to schedule frames before we hit where the hardware is at. Ideally this would be tunable and a little larger.	2015-10-30 22:55:41 +00:00
adrian	22eed60ca3	arge: do a read-after-write on all arge register writes, not just MDIO writes. This flushes out the write to the system before anything continues. The mips74k guide, chapter 3.3.3 (write gathering) notes that writes can be buffered in FIFOs - even uncached ones - so we can't guarantee the device has felt its effects. Now, since we're all lazy driver authors and don't pepper read/write barriers everywhere, fake it here. tested: * mips74k - QCA9558 SoC (AP135 reference board)	2015-10-30 22:53:30 +00:00
adrian	48944f6b39	Oops - use the wrong array offset.	2015-10-28 23:39:33 +00:00
adrian	39fb527bf9	Add some debugging code (under ARGE_DEBUG) that counts each interrupt source. This should make it easier to track down interrupt storms from arge. Tested: * AP135 (QCA955x) SoC - defaults to ARGE_DEBUG enabled * Carambola2 (AR9331 SoC) - defaults to ARGE_DEBUG disabled	2015-10-28 05:11:06 +00:00
adrian	99f10c926a	mips: use the correct va for wbinv flushing. arge doesn't trigger this, but ath(4) does. Tested: * AR9331 SoC (Carambola2); ath(4) hostap Submitted by: ian	2015-10-27 23:11:22 +00:00
adrian	e35de425dc	arge(4): flip this on for AR9344 SoCs. I couldn't test arge0->arge1 bridging, only arge0 VLAN bridging. The DIR-825C1 only hooks up arge0 to the switch GMAC0 and so you need to abuse VLANs to test. Tested: * DIR-825C1 (AR9344)	2015-10-24 22:37:59 +00:00
adrian	3a1b629a4a	Commit the right board file - use the right name + hints.	2015-10-22 15:15:45 +00:00
adrian	685decc3b5	Add support for the TP-Link TL-WR740N v4. This is an AR9331 part based on the AP121 reference design but with 32MB RAM. Yes, it has 4MB flash and it has no USB, so clever hacks are required to get it up and working. But boot/work it does.	2015-10-22 08:08:06 +00:00
adrian	05283b18cc	arge: use 1-byte TX and RX alignment for AR9330/AR9331. This part seems to work bug-free with single byte TX/RX buffer alignment. This drops the CPU requirement to bridge 100mbit iperf from 100% CPU to ~ 50% CPU. Tested: * AP121 (AR9330) SoC, highly magic netbooted kernel + USB rootfs due to 4mb flash, 16mb RAM; doing bridging between arge0 and arge1. Notes: * Yes, I likely can also turn this on for the AR934x SoC family now. But since hardware design apparently follows similar branching strategies to software design, I'll go and make sure all the AR934x's that made it out into shipping products work before I flip it on.	2015-10-22 08:02:27 +00:00
ian	492b716bf0	Treat mbufs as cacheline-aligned. Even when the transfer begins at an offset within the buffer to align the L3 headers we know the buffer itself was allocated and sized on cacheline boundaries and we don't need to preserve partial cachelines at the start and end of the buffer when doing busdma sync operations.	2015-10-21 19:24:20 +00:00
ian	8703e1c477	Free memory back into the categories it was allocated from. Noticed by: sbruno Pointy hat: ian	2015-10-21 17:41:20 +00:00
ian	b17e7488f4	Switch mips busdma to using the common busdma_bufalloc code. This amounts to copying in some code from the armv4 busdma, and adapting a few variable and flag names to match the surrounding mips code. Instead of keeping a local cache of prealloced busdma_map structs on a mutex-protected list, set up an uma zone to cache them. Instead of all memory allocations using M_DEVBUF, use new categories M_BUSDMA for allocations of metadata (tags, maps, segment tracking lists), and M_BOUNCE for bounce pages. When buffers are allocated out of the busdma_bufalloc zones the alignment and size of the buffers is known, and the code can skip doing any "partial cacheline flush" logic to preserve data that may be adjacent to the DMA buffer but contain non-DMA data. Reviewed by: adrian, imp	2015-10-21 15:06:48 +00:00
ian	fecc322187	Switch from a stub to a real implementation of pmap_page_set_attr() for mips, and implement support for VM_MEMATTR_UNCACHEABLE. This will be used in upcoming changes to support BUS_DMA_COHERENT in bus_dmamem_alloc(). Reviewed by: adrian, imp	2015-10-21 14:57:59 +00:00
adrian	67be823944	arge: Remove the debugging printf that snuck in. This was triggering when using it as an AP bridge rather than an ethernet bridge. The code is unclear but it works; I'll fix it to be clearer and test performance at a later stage.	2015-10-21 05:52:04 +00:00
adrian	3ec63ec821	arge: don't do the rx fixup copy and just offset the mbuf by 2 bytes The existing code meets the "alignment" requirement for the l3 payload by offsetting the mbuf by uint64_t and then calling an rx fixup routine to copy the frame backwards by 2 bytes. This DWORD aligns the L3 payload so tcp, etc doesn't panic on unaligned access. This is .. slow. For arge MACs that support 1 byte TX/RX address alignment, we can do the "other" hack: offset the RX address of the mbuf so the L3 payload again is hopefully DWORD aligned. This is much cheaper - since TX/RX is both 1 byte align ready (thanks to the previous commit) there's no bounce buffering going on and there is no rx fixup copying. This gets bridging performance up from 180mbit/sec -> 410mbit/sec. There's around 10% of CPU cycles spent in _bus_dmamap_sync(); I'll investigate that later. Tested: * QCA955x SoC (AP135 reference board), bridging arge0/arge1 by programming the switch to have two vlangroups in dot1q mode: # ifconfig bridge0 inet 192.168.2.20/24 # etherswitchcfg config vlan_mode dot1q # etherswitchcfg vlangroup0 members 0,1,2,3,4 # etherswitchcfg vlangroup1 vlan 2 members 5,6 # etherswitchcfg port5 pvid 2 # etherswitchcfg port6 pvid 2 # ifconfig arge1 up # ifconfig bridge0 addm arge1	2015-10-21 01:41:18 +00:00
sbruno	ef10d126bf	Disable SWAPPING as we don't do it on this board.	2015-10-20 19:32:26 +00:00
sbruno	467d89cb6f	Remove geom_uncompress from TP-MR3020 config. It's now using root on USB and there's no need for it now.	2015-10-18 18:41:30 +00:00
sbruno	d057275ef4	Add VM_KMEM_SIZE_SCALE=1 as these systems are going to have super small amount of RAM, e.g. 16M or 32M Reviewed by: adrian	2015-10-18 18:40:11 +00:00
sbruno	5145e8ee6b	Correctly use the default values for location of MAC addrs of arge0, arge1, ath0. woo! Reviewed by: adrian	2015-10-18 04:50:51 +00:00
adrian	88a4a1403b	if_arge: fix up TX workaround; add TX/RX requirements for busdma; add stats The early ethernet MACs (I think AR71xx and AR913x) require 4-byte alignment for both TX and RX on all packets. The later MACs have started relaxing the requirements. For now, the 1-byte TX and 1-byte RX alignment requirements are only for the QCA955x SoCs. I'll add in the relaxed requirements as I review the datasheets and do testing. * Add a hardware flags field and 1-byte / 4-byte TX/RX alignment. * .. defaulting to 4-byte TX and 4-byte RX alignment. * Only enforce the TX alignment fixup if the hardware requires a 4-byte TX alignment. This avoids a call to m_defrag(). * Add counters for various situations for further debugging. * Set the 1-byte and 4-byte busdma alignment requirement when the tag is created. This improves the straight bridging performance from 130mbit/sec to 180mbit/sec, purely by removing the need for TX path bounce buffers. The main performance issue is the RX alignment requirement and any RX bounce buffering that's occurring. (In a local test, removing the RX fixup path and just aligning buffers raises the performance to above 400mbit/sec.) In theory it's a no-op for SoCs before the QCA955x. Tested: * QCA9558 SoC in AP135 board, using software bridging between arge0/arge1.	2015-10-18 00:59:28 +00:00
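
Note on commit 69e096c120 (CONFIG0_VI): the message says the VI flag is bit 3 of the config0 word, not bit 2, and that the old value caused false positives when the kseg0 cache coherency attribute held a large value. A minimal sketch of why, assuming the K0 field occupies config0 bits 2:0 as in the MIPS32 architecture documents. The macro and function names below are hypothetical and are not the identifiers used in the tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: K0 (kseg0 cache coherency attribute) is config0 bits 2:0. */
#define	CFG0_K0_MASK	0x00000007u
/* Hypothetical name for the old mask: bit 2, which lies inside K0. */
#define	CFG0_VI_BIT2	(1u << 2)
/* Hypothetical name for the corrected mask: bit 3, just above K0. */
#define	CFG0_VI_BIT3	(1u << 3)

/* Old test: reports "virtually indexed" whenever K0 is 4 or larger. */
static inline bool
icache_virtual_old(uint32_t config0)
{
	return ((config0 & CFG0_VI_BIT2) != 0);
}

/* Corrected test: looks only at bit 3 and ignores the K0 field. */
static inline bool
icache_virtual_new(uint32_t config0)
{
	return ((config0 & CFG0_VI_BIT3) != 0);
}
```

With config0 = 0x5 (K0 = 5, bit 3 clear) the old test returns true and the corrected test returns false; with config0 = 0x8 only the corrected test returns true.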

2054 Commits