159609 Commits

Author SHA1 Message Date
bms
a7de2dc1be MFC r200871:
Use ALLOW_NEW_SOURCES and BLOCK_OLD_SOURCES to signal a join or leave
 with SSM MLDv2 by default.
 This is current practice and complies with RFC 4604, as well as being
 required by production IPv6 networks in Japan.
 The behaviour may be disabled by setting the net.inet6.mld.use_allow
 sysctl/tunable to 0.

Requested by:	Hideki Yamamoto, dikshie
2010-01-07 14:15:34 +00:00
kib
34b3205171 MFC r201347:
Allow swap out of the kernel stack for the thread with priority greater
or equal to PSOCK, not less or equal.
2010-01-07 11:41:47 +00:00
kib
0801ed4053 MFC r201504:
PG_NOSYNC has been called VPO_NOSYNC for a long time.
2010-01-07 11:33:57 +00:00
yongari
e5472454f5 MFC r200088,200227-200228,200246,200264,201446
r200088:
  Add a workaround for a hardware limitation which allows only a
  single outstanding DMA read operation. Most controllers targeted at
  clients with a PCIe bus interface (e.g. BCM5761) may have this
  limitation. Controllers for servers do not have this limitation.
  Collapsing mbuf chains to reduce the number of memory reads before
  transmitting was the most effective way to work around this. I got
  about 940Mbps, up from 850Mbps, with mbuf collapsing on BCM5761.
  However, it takes a lot of CPU cycles to collapse mbuf chains, so
  add a tunable to control the number of allowed TX buffers before
  collapsing. The default value is 0, which effectively disables the
  forced collapsing. For most cases 2 would yield the best
  performance (about 930Mbps) without sacrificing many CPU cycles.
  Note the collapsing is only activated when the controller is on
  PCIe bus and the frame does not need TSO operation. TSO does not
  seem to suffer from the hardware limitation because the payload
  size is much bigger than normal IP datagram.
  Thanks to davidch@ who told me the limitation of client controllers
  and actually gave possible workarounds to mitigate the limitation.
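  A minimal sketch of the decision r200088 describes, written in Python
  rather than driver C. The function name, its parameters and the exact
  comparison are illustrative assumptions, not bge(4) code; only the
  conditions (PCIe bus, no TSO, a tunable of 0 disabling the feature)
  come from the text above.

```python
def should_collapse(nsegs, forced_collapse, on_pcie, needs_tso):
    """Return True if an mbuf chain of `nsegs` buffers should be collapsed.

    `forced_collapse` models the tunable: 0 disables forced collapsing,
    otherwise it is the number of TX buffers allowed before collapsing.
    """
    if forced_collapse == 0:
        return False  # default: forced collapsing is disabled
    if not on_pcie or needs_tso:
        return False  # only PCIe controllers, and never for TSO frames
    return nsegs > forced_collapse


# A five-buffer chain with the tunable set to 2 on a PCIe controller.
print(should_collapse(nsegs=5, forced_collapse=2, on_pcie=True, needs_tso=False))
```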

r200227:
  Remove PHY isolate/power down code in bge_stop(). The isolation
  handler in brgphy(4) does not exist and brgphy(4) just resets the
  PHY and returns EINVAL as it has no isolation handler. I also agree
  with Marius's opinion that the stop handler of every NIC driver
  seems to be the wrong place for implementing PHY isolate/power down.
  If we need PHY isolate/power down it should be implemented in
  brgphy(4) and users should administratively down the PHY.

r200228:
  Don't access jumbo frame related registers if controller lacks the
  feature. These registers are reserved on controllers that have no
  support for jumbo frame.
  Only BCM5700 has mini ring so do not poke mini ring related
  registers if controller is not BCM5700.

r200246:
  Partially revert r200228. For mini RCB case, bge(4) still has to
  disable mini ring without regard to mini ring support.

r200264:
  Create sysctl node(dev.bge.%d.forced_collapse) instead of
  hw.bge.forced_collapse. hw.bge.forced_collapse affects all bge(4)
  controllers on the system, which may not be the desired behavior of
  the sysctl node. Also allow the sysctl node to be modified at any
  time.

r201446:
  Fix regression introduced in r198318. BCM5754/BCM5754M uses the
  same ASIC ID as BCM5758, such that r198318 incorrectly enabled TSO
  on BCM5754/BCM5754M controllers. BCM5754/BCM5754M needs special
  firmware to enable TSO and bge(4) does not support firmware based
  TSO.
2010-01-07 00:55:07 +00:00
yongari
4daba51ba7 MFC r199670-199671,199674,199679,199761,199807-199808
r199670:
  Fix two long standing bugs on bge(4). Most pre BCM5755 controllers
  have a DMA bug when buffer address crosses a multiple of the 4GB
  boundary(e.g. 4GB, 8GB, 12GB etc). Limit DMA address to be within
  4GB address for these controllers. The second DMA bug limits DMA
  address to be within 40bit address space. This bug applies to
  BCM5714 and BCM5715 and 5708(bce(4) controller). This is not
  actually a MAC controller bug but an issue with the embedded PCIe
  to PCI-X bridge in the device. So for BCM5714/BCM5715 controllers
  also limit the DMA address to be within 40bit address space.
  Special thanks to davidch@ who gave me detailed errata information.
  I think this change will fix long standing bge(4) instability
  issues on systems with more than 4GB memory.
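  The two address conditions in r199670 can be sketched as simple
  range checks. This is an illustration in Python, not the driver's
  bus_dma(9) tag setup; the function names are hypothetical, and the
  sketch only tests whether a buffer would trip either erratum.

```python
FOUR_GB = 1 << 32
FORTY_BIT_LIMIT = 1 << 40


def crosses_4gb_boundary(addr, length):
    """True if the buffer [addr, addr + length) spans a multiple of 4GB."""
    last = addr + length - 1
    return (addr // FOUR_GB) != (last // FOUR_GB)


def within_40bit(addr, length):
    """True if the whole buffer lies inside the 40-bit address space."""
    return addr + length - 1 < FORTY_BIT_LIMIT


# A buffer that starts just below 4GB and runs past it.
print(crosses_4gb_boundary(0xFFFFF000, 0x2000))
# The same size of buffer starting exactly at 4GB.
print(crosses_4gb_boundary(FOUR_GB, 0x2000))
```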

r199671:
  Implement TSO for BCM5755 or newer controllers. Some controllers
  seem to require a special firmware to use TSO. But the firmware is
  not available to FreeBSD and Linux claims that the TSO performed by
  the firmware is slower than hardware based TSO. Moreover the
  firmware based TSO has one known bug which can't handle TSO if
  ethernet header + IP/TCP header is greater than 80 bytes. The
  workaround for the TSO bug exists but it seems to be more expensive
  than not using TSO at all. Some hardware also has the TSO bug, so
  limit TSO to the controllers that are not affected by TSO issues
  (e.g. 5755 or higher).
  While I'm here set VLAN tag bit on all descriptors that belong to
  a frame instead of only the first descriptor of a frame. The
  datasheet is not clear on how to handle the VLAN tag bit but it
  worked either way in my testing. This simplifies TSO configuration
  a little bit.

  Big thanks to davidch@ who sent me detailed TSO information.
  Without this I was not able to implement it.

r199674:
  Add missing function prototype in r199671.

r199679:
  Reduce status block size DMAed by controller. bge(4) uses a single
  Tx/Rx/Rx return ring such that a large part of the status block was
  not used at all. All bge(4) controllers except BCM5700 AX/BX have a
  feature to control the size of the status block. So use the minimum
  status block size allowed by the controller. This reduces the DMAed
  status block size to 32 bytes from 80 bytes.

r199761:
  BGE_FLAG_40BIT_BUG should be set before creating DMA tags.

r199807:
  Make sure one shot MSI is enabled.

r199808:
  Fix typo which inverted the logic, which in turn disabled MSI.
2010-01-07 00:44:54 +00:00
yongari
04868bdad3 MFC r199667-199668
r199667:
  Cache Rx producer/Tx consumer index as soon as we know status block
  update and then clear status block. Previously it used to access
  these indices without synchronization, which may cause problems when
  bounce buffers are used. Also add missing bus_dmamap_sync(9) in
  polling handler. Since we now update status block in driver, adjust
  bus_dmamap_sync(9) for status block.

r199668:
  For MSI case, interrupt is not shared and we don't need to force
  PCI flush to get correct status block update. Add an optimized
  interrupt handler that is activated for MSI case. Actual interrupt
  handling is done by taskqueue such that the handler does not
  require driver lock for Rx path. The MSI capable bge(4) controllers
  automatically disable further interrupts once they enter interrupt
  state so we don't need PIO access to disable interrupts in the
  interrupt handler.
2010-01-06 23:42:15 +00:00
yongari
f07ee87d4c MFC 199663-199666
r199663:
  Due to newly added PCIe capabilities fallback code for finding the
  PCIe capability did not work right on recent controllers. Remove
  FreeBSD 6.x support code.

r199664:
  Use capability pointer to access PCIe registers rather than
  directly access them at fixed address. While I'm here don't touch
  other bits of PCIe device control register except max payload size.

r199665:
  Controller does not write Rx descriptors, remove BUS_DMASYNC_PREREAD.

r199666:
  Rearrange bge_start_locked to see whether we can send more frames
  by checking IFF_DRV_RUNNING and IFF_DRV_OACTIVE flags. Also if we
  have less than 16 free send BDs set IFF_DRV_OACTIVE and try it
  later. Previously bge(4) used to reserve 16 free send BDs after
  loading dma maps but hardware just needs one reserved send BD. If
  the producer index has the same value as the consumer index it
  means the Tx queue is empty.
  While I'm here check IFQ_DRV_IS_EMPTY first to save one lock
  operation.
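  The index rule in r199666 can be illustrated with a small ring
  model. This is a sketch under assumptions: the class, its method
  names and the ring size are hypothetical, and only the two facts
  from the text are modeled (one reserved send BD, equal producer and
  consumer indices meaning an empty queue).

```python
class TxRing:
    """Toy send ring with producer/consumer indices and one reserved BD."""

    def __init__(self, size, reserved=1):
        self.size = size
        self.reserved = reserved
        self.prod = 0  # next descriptor the driver will fill
        self.cons = 0  # next descriptor the hardware will complete

    def used(self):
        return (self.prod - self.cons) % self.size

    def free(self):
        return self.size - self.reserved - self.used()

    def is_empty(self):
        # With one BD always held back, equal indices can only mean empty.
        return self.prod == self.cons

    def enqueue(self, count):
        if count > self.free():
            raise ValueError("not enough free send BDs")
        self.prod = (self.prod + count) % self.size

    def complete(self, count):
        if count > self.used():
            raise ValueError("completing more BDs than are in flight")
        self.cons = (self.cons + count) % self.size


ring = TxRing(size=512)
ring.enqueue(511)  # fill every BD except the reserved one
print(ring.free(), ring.is_empty())
```

  Because the reserved descriptor is never handed out, a completely
  full ring still has different producer and consumer indices.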
2010-01-06 23:34:53 +00:00
yongari
6dab00a0cb MFC r199065,199115-199116,199153,199661-199662
r199065:
  Correct disabling checksum offloading for BCM5700 B0.

r199115:
  Add missing bus_dmamap_sync(9) before issuing kick command.

r199116:
  Zero out Tx/Rx descriptors before using them. Also add missing
  bus_dmamap_sync(9) after Tx descriptor initialization.

r199153:
  Controller does not update Tx descriptors(send BDs) after sending
  frames so remove unnecessary BUS_DMASYNC_PREREAD and
  BUS_DMASYNC_POSTREAD of bus_dmamap_sync(9).

r199661:
  Remove extra white space.

r199662:
  Fix typo introduced in r199011.
2010-01-06 23:26:09 +00:00
yongari
aa8c5c8e14 MFC r198967,199009-199011,199014,199020,199035-199036,199054
r198967:
  Correct MSI mode register bits.

r199009:
  bge(4) already switched to use UMA backed page allocator and the
  local memory allocator for jumbo frames was removed a long time
  ago. Remove macros that are no longer used.

r199010:
  Do bus_dmamap_sync call only if frame size is greater than
  standard buffer size. If controller is not capable of handling
  jumbo frames, the interface MTU cannot be larger than the standard
  MTU, which in turn means received frames should fit in a standard
  buffer. This fixes bus_dmamap_sync being called for the jumbo ring
  even if the interface is configured to use the standard MTU.
  Also if the total frame size fits into a standard buffer don't
  use jumbo buffers.

r199011:
  Reimplement Rx buffer allocation to handle dma map load failure.
  Introduce two spare dma maps for standard buffer and jumbo buffer
  respectively. If loading a dma map failed reuse previously loaded
  dma map. This should fix use of an unloaded dma map in case of dma
  map load failure. Also don't blindly unload dma map and defer
  dma map sync and unloading operation until we know dma map for new
  buffer is successfully loaded. This change saves unnecessary dma
  load/unload operations. Previously bge(4) tried to reuse an mbuf
  with an unloaded dma map, which is a really bad thing from the
  bus_dma(9) perspective.
  While I'm here update if_iqdrops if we can't allocate Rx buffers.
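  The spare-map scheme in r199011 can be sketched as follows. This is
  a Python model, not driver code: `RxSlot`, `refill` and the `load`
  callback are hypothetical stand-ins, and the sync/unload of the old
  map is only marked by a comment.

```python
class RxSlot:
    """One Rx ring entry: the buffer it holds and the DMA map loaded for it."""

    def __init__(self, dmamap, buf):
        self.dmamap = dmamap
        self.buf = buf


def refill(slot, spare_map, new_buf, load):
    """Try to give `slot` a fresh buffer.

    `load(dmamap, buf)` stands in for the DMA map load call and returns
    True on success. Returns (ok, spare_map_for_next_time).
    """
    if not load(spare_map, new_buf):
        # Load failed: the slot keeps its old buffer and its still-loaded map.
        return False, spare_map
    # Load succeeded: only now would the old map be synced and unloaded.
    old_map = slot.dmamap
    slot.dmamap, slot.buf = spare_map, new_buf
    # The old map becomes the spare for the next refill.
    return True, old_map


slot = RxSlot(dmamap="map0", buf="mbuf0")
ok, spare = refill(slot, "spare", "mbuf1", load=lambda dmamap, buf: True)
print(ok, slot.dmamap, slot.buf, spare)
```

  The point of the swap is that a failed load never leaves the ring
  entry pointing at an unloaded map.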

r199014:
  Fix I missed in r199011. Rx ring index also should be updated.
  If we fill Rx ring full instead of half we can simplify this logic
  but this requires more experimentation.

r199020:
  Tell upper layer we support long frames. ether_ifattach()
  initializes it to ETHER_HDR_LEN so we have to override it after
  calling ether_ifattach().
  While I'm here remove setting if_mtu value, it's initialized in
  ether_ifattach().

r199035:
  Don't count input errors twice, we always read input errors from
  MAC in bge_tick. Previously it used to show a higher number of
  input errors. I noticed actual input errors were less than 8% even for
  64 bytes UDP frames generated by netperf.
  Since we always access BGE_RXLP_LOCSTAT_IFIN_DROPS register in
  bge_tick, remove useless code protected by #ifdef notyet.

r199036:
  Count number of inbound packets which were chosen to be discarded
  as input errors. Also count out of receive BDs as input errors.

r199054:
  Partially revert r199035.
  Revision 1.158 says only lower ten bits of
  BGE_RXLP_LOCSTAT_IFIN_DROPS register is valid. For BCM5761 case it
  seems the controller maintains 16bits value for the register.
  However 16bits are still too small to count all dropped packets
  that occur in a second. To get a correct counter we have to read the
  register in bge_rxeof() which would be too expensive.
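  The width problem in r199054 is ordinary counter wraparound. A
  minimal sketch, assuming the register is sampled periodically and
  the difference between two reads is accumulated; `counter_delta` is
  a hypothetical helper, and only the 10-bit and 16-bit widths come
  from the text above.

```python
def counter_delta(prev, cur, bits):
    """Events counted between two reads of a wrapping `bits`-wide counter."""
    mask = (1 << bits) - 1
    return (cur - prev) & mask


# A single wrap between two reads is recovered correctly.
print(counter_delta(prev=1020, cur=4, bits=10))

# If more than 2**bits events occur between reads, the excess is lost.
true_drops = 70000
print(counter_delta(prev=0, cur=true_drops & 0xFFFF, bits=16))
```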
2010-01-06 23:02:35 +00:00
yongari
a43cb3d4c3 MFC r198923-198924,198927-198928
r198923:
  Use correct dma tag for jumbo buffer.

r198924:
  Convert bge_newbuf_std to use bus_dmamap_load_mbuf_sg(9). Note,
  bge_newbuf_std still has a bug for handling dma map load failure
  under high network load. Just reusing mbuf is not enough as driver
  already unloaded the dma map of the mbuf. Graceful recovery needs
  more work.
  Ideally we can just update the dma address part of an Rx descriptor
  because the controller never overwrites the Rx descriptor. This
  requires some Rx initialization code changes and it would be done
  later after fixing other incorrect bus_dma(9) usages.

r198927:
  Remove common DMA tag used for TX/RX mbufs and create Tx DMA tag
  and Rx DMA tag separately. Previously it used a common mbuf DMA tag
  for both Tx and Rx path but Rx buffer(standard ring case) should
  have a single DMA segment and maximum buffer size of the segment
  should be less than or equal to MCLBYTES. This change also makes it
  possible to add TSO with minor changes.

r198928:
  Make bge_newbuf_std()/bge_newbuf_jumbo() return the actual error
  code for buffer allocation. If the driver knows we are out of Rx
  buffers let the controller stop. This should fix a panic when the
  interface is run even if it had no configured Rx buffers.
2010-01-06 22:45:49 +00:00
simon
92b5431ace Fix BIND named(8) cache poisoning with DNSSEC validation.
[SA-10:01]

Fix ntpd mode 7 denial of service. [SA-10:02]

Fix ZFS ZIL playback with insecure permissions. [SA-10:03]

Various FreeBSD 8.0-RELEASE improvements. [EN-10:01]

Security:	FreeBSD-SA-10:01.bind
Security:	FreeBSD-SA-10:02.ntpd
Security:	FreeBSD-SA-10:03.zfs
Errata:		FreeBSD-EN-10:01.freebsd
Approved by:	so (simon)
2010-01-06 21:45:30 +00:00
gavin
bff2fcd685 MFC r200819:
Grammar and minor tweaks to powerd(8) man page.

PR:		docs/133186
Approved by:	ed (mentor, implicit)
2010-01-06 20:54:04 +00:00
gavin
29a4b1bc6c MFC r200820:
Support the tablet in (at least) the Toshiba Portege M200 Tablet PC.
  This device only appears on the ACPI bus, so isn't caught by the current
  entry for it in the uart(4) ISA attachment.

PR:		kern/140172
Reviewed by:	jhb, marcel
Approved by:	ed (mentor, implicit)
2010-01-06 20:40:41 +00:00
jkim
e262ac646f MFC: r200251
- Try pre-allocating all FIBs upfront.  Previously we tried pre-allocating
128 FIBs first and allocated more later if necessary.  Remove now unused
definitions from the header file[1].
- Force sequential bus scanning.  It seems parallel scanning is in fact
slower and causes more harm than good[1].  Adjust a comment to reflect that.
2010-01-06 20:28:47 +00:00
ru
1e75263796 MFC r201290: Treat an empty argument as an error, instead of
fetching the contents of the root directory.
2010-01-06 08:26:43 +00:00
bz
4e0b0a9186 According to basic instructions from jhb clean-up mergeinfo from r201614.
2010-01-05 23:03:59 +00:00
qingli
b738408ac2 MFC r201319
Remove a deleted comment line that was brought back by
my previous commit.
2010-01-05 22:37:05 +00:00
qingli
d8e285292f MFC r201285
Consolidate the route message generation code for when address
aliases are added or deleted. The announced route entry for
an address alias is no longer empty because this empty route
entry was causing some routing daemons to fail and exit abnormally.
2010-01-05 22:33:10 +00:00
qingli
60e03ff574 MFC r201284
Multiple IPv6 addresses of the same prefix can be installed on the
same interface. The first address will install the prefix route into
the kernel routing table and that prefix will be marked as on-link.
Without RADIX_MPATH enabled, the other address aliases of the same
prefix will update the prefix reference count but no other routes
will be installed. Consequently the prefixes associated with these
addresses would not be marked as on-link. As such, incoming packets
destined to these address aliases will fail the ND6 on-link check
on input. This patch fixes the above problem by searching the kernel
routing table for an on-link prefix on the given interface.
2010-01-05 22:28:23 +00:00
qingli
ea5192e625 MFC r201282, r201543
r201282
-------
The proxy arp entries could not be added into the system over the
IFF_POINTOPOINT link types. The reason was that the routing
entry returned from the kernel covering the remote end is of an
interface type that does not support ARP. This patch fixes this
problem by providing a hint to the kernel routing code, which
indicates the prefix route instead of the PPP host route should
be returned to the caller. Since a host route to the local end
point is also added into the routing table, and there could be
multiple such instantiations because multiple PPP links can be
created with the same local end IP address, this patch also fixes
the loopback route installation failure problem observed prior to
this patch. The reference count of loopback route to local end would
be either incremented or decremented. The first instantiation would
create the entry and the last removal would delete the route entry.
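The reference counting described for the loopback route can be
sketched like this. It is a Python model under assumptions: the class
and method names are hypothetical and a dictionary stands in for the
routing table; the address is from the documentation range.

```python
class LoopbackRoutes:
    """Reference-counted loopback routes keyed by local end address."""

    def __init__(self):
        self.refs = {}

    def add(self, addr):
        """Take a reference; return True if the route entry was created."""
        count = self.refs.get(addr, 0)
        self.refs[addr] = count + 1
        return count == 0

    def remove(self, addr):
        """Drop a reference; return True if the route entry was deleted."""
        count = self.refs[addr] - 1
        if count == 0:
            del self.refs[addr]
            return True
        self.refs[addr] = count
        return False


routes = LoopbackRoutes()
# Two PPP links sharing the same local end address.
print(routes.add("192.0.2.1"), routes.add("192.0.2.1"))
print(routes.remove("192.0.2.1"), routes.remove("192.0.2.1"))
```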

r201543
-------
The IFA_RTSELF address flag marks that a loopback route has been installed
for the interface address. This marker is necessary to properly support
PPP types of links where multiple links can have the same local end
IP address. The IFA_RTSELF flag bit maps to the RTF_HOST value, which
was combined into the route flag bits during prefix installation in
IPv6. This inclusion caused the prefix route to be unusable. This
patch fixes this bug by excluding the IFA_RTSELF flag during route
installation.

PR:		ports/141342, kern/141134
2010-01-05 22:14:55 +00:00
jhb
7bf8a1b9d6 MFC 201196:
Change vlan interfaces to cope more usefully with the parent interface being
renamed.  Previously the vlan interfaces would lose their configuration as if
the parent interface had been physically removed.  Now vlan interfaces ignore
rename events.
- Add a new ifnet flag (IFF_RENAMING) that is set while an ifnet is being
  renamed.  This flag can be checked in ifnet departure/arrival event
  handlers to treat rename events differently.
- Change the ifnet departure event handler in the if_vlan(4) driver to
  ignore departure events due to a trunk interface being renamed.
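A rough model of the behaviour described above, in Python. Everything
here is a sketch: the flag value is a placeholder rather than the
kernel's definition, and `Ifnet`, `VlanTable` and `rename` are
hypothetical names that only mirror the idea of a departure handler
checking a rename-in-progress flag.

```python
IFF_RENAMING = 0x1  # placeholder bit for this sketch, not the kernel's value


class Ifnet:
    def __init__(self, name):
        self.name = name
        self.flags = 0


class VlanTable:
    """vlan configuration keyed by parent interface object."""

    def __init__(self):
        self.by_parent = {}

    def attach(self, parent, tag):
        self.by_parent.setdefault(parent, []).append(tag)

    def on_departure(self, ifp):
        if ifp.flags & IFF_RENAMING:
            return  # rename in progress: keep the vlan configuration
        self.by_parent.pop(ifp, None)  # real departure: drop it


def rename(ifp, new_name, departure_handlers):
    """Rename an interface, raising departure events with the flag set."""
    ifp.flags |= IFF_RENAMING
    try:
        for handler in departure_handlers:
            handler(ifp)
        ifp.name = new_name
    finally:
        ifp.flags &= ~IFF_RENAMING


vlans = VlanTable()
em0 = Ifnet("em0")
vlans.attach(em0, 100)
rename(em0, "lan0", [vlans.on_departure])
print(em0.name, vlans.by_parent.get(em0))
```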
2010-01-05 18:25:41 +00:00
jhb
c979d2b5cc MFC 200847:
- Rename the __tcpi_(snd|rcv)_mss fields of the tcp_info structure to remove
  the leading underscores since they are now implemented.
- Implement the tcpi_rto and tcpi_last_data_recv fields in the tcp_info
  structure.
2010-01-05 17:04:14 +00:00
mav
8fec2c4c93 MFC 200977:
Avoid false positive probe on ICH6 chipsets.
2010-01-05 14:03:46 +00:00
mav
18590136fc MFC 200991:
Teach twe driver to report array stripe size to GEOM.
2010-01-05 14:02:12 +00:00
mav
f4128931eb MFC 200969:
Report stripe size only if physical sector size is not equal to logical.
2010-01-05 13:58:18 +00:00
mav
7a9bcff074 MFC 200968:
Make diskinfo report disk stripe size and offset. It should help users to
make file systems optimally aligned and tuned for better performance.
2010-01-05 13:56:58 +00:00
mav
a031d7ce23 MFC r196799:
Don't bother obtaining the ident if we are not going to print it.
2010-01-05 13:55:49 +00:00
mav
41ffc478e5 MFC r200934:
Add two disk ioctls, giving user-level tools information about disk/array
stripe (optimal access block) size and offset.
2010-01-05 13:51:23 +00:00
mav
536d45b203 MFC r200942:
Make geom_concat pass through the stripe parameters of the first component,
hoping that the rest will fit.
2010-01-05 13:50:14 +00:00
mav
3deed09e22 MFC r200940:
Since geom_raid3 reports its own stripe as the sector size, report the
largest underlying provider's stripe, multiplied by the number of data
disks in the array (due to the transformation done), as the array stripe.
2010-01-05 13:49:18 +00:00
mav
0f3f0f89b5 MFC r200935:
Since mirror has no stripes of its own, report the largest stripe of the
underlying components, hoping the others fit if they are not equal.
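The stripe-size rules stated in this commit and in MFC r200940 reduce
to two small formulas. A sketch in Python, with hypothetical function
names and sizes in bytes; it follows the wording of the commit
messages only and is not taken from the GEOM sources.

```python
def mirror_stripe(component_stripes):
    """A mirror reports the largest stripe of its components."""
    return max(component_stripes)


def raid3_stripe(component_stripes, data_disks):
    """RAID3 reports the largest component stripe times the data disk count."""
    return max(component_stripes) * data_disks


print(mirror_stripe([4096, 65536]))
print(raid3_stripe([4096, 4096, 4096], data_disks=2))
```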
2010-01-05 13:47:55 +00:00
mav
867e021455 MFC r200933:
Make geom_stripe report its stripe size to upper layers.
2010-01-05 13:46:39 +00:00
kib
b8ea201676 MFC r201400:
Remove reference to the bug in FreeBSD 2.0.
2010-01-05 12:34:16 +00:00
kib
194840d7a1 MFC r201194:
Use clock_gettime(CLOCK_SECOND) instead of gettimeofday(2) for
implementation of time(3). CLOCK_SECOND is much faster.
2010-01-05 12:32:09 +00:00
jhb
7186116758 MFC 201351:
Use stricter checking to match possible vlan clones by not allowing extra
garbage characters around or within the tag.
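What "stricter checking" can look like, sketched in Python. The
pattern, the `parent.tag` name form and the 12-bit range check are
assumptions made for illustration; this is not the matching code in
if_vlan(4).

```python
import re

# Whole-string match: a parent name without dots or whitespace, one dot,
# then decimal digits only. Nothing may precede, follow or split the tag.
_VLAN_CLONE = re.compile(r"([^.\s]+)\.([0-9]+)")


def parse_vlan_clone(name):
    """Return (parent, tag) for names like 'em0.100', else None."""
    match = _VLAN_CLONE.fullmatch(name)
    if match is None:
        return None
    tag = int(match.group(2))
    if tag > 0xFFF:  # must fit the 12-bit 802.1Q VLAN ID field
        return None
    return match.group(1), tag


for candidate in ("em0.100", "em0.100x", "em0.1 00", "em0.5000"):
    print(candidate, parse_vlan_clone(candidate))
```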
2010-01-04 22:44:48 +00:00
imp
cf0d4c6060 Revert 201158. DEFAULTS isn't for this kind of thing.
2010-01-04 21:33:10 +00:00
kensmith
a2d8e867c0 MFC r200775:
Add FreeBSD- to the beginning of the ISO image filenames.
2010-01-04 19:57:35 +00:00
jhb
93c1d4d2b3 MFC 201216:
Remove a trailing reference to the obsolete vaps_<IF> variable.
2010-01-04 19:27:17 +00:00
syrinx
3fa873baa9 MFC r201254:
Make sure the multicast forwarding cache entry's stall queue is properly
initialized before trying to insert an entry into it.

PR:		kern/142052
Reviewed by:	bms
2010-01-04 15:58:36 +00:00
ume
d5f1472e0f MFC r200055, r200102:
- Teach IPv6 to the debug prints.
- Use INET_ADDRSTRLEN and INET6_ADDRSTRLEN rather than hard
  coded number.
2010-01-04 15:22:38 +00:00
ume
3b3ffe36df MFC r200027: Teach IPv6 to send_pkt() and ipfw_tick().
This fixes the issue where keep-alive doesn't work for IPv6.
2010-01-04 15:05:11 +00:00
jh
8ff22b74d0 MFC r198940:
File flags handling fixes for ext2fs:

- Disallow setting of flags not supported by ext2fs.
- Map EXT2_APPEND_FL to SF_APPEND.
- Map EXT2_IMMUTABLE_FL to SF_IMMUTABLE.
- Map EXT2_NODUMP_FL to UF_NODUMP.

Note that ext2fs doesn't support user settable append and immutable flags.
EXT2_NODUMP_FL is a user settable flag on Linux as well.
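The mapping listed above can be written as a small table. This is a
sketch in Python, not the ext2fs code: the function name is
hypothetical, the EXT2_* values are the usual Linux header values
stated here as an assumption, and the BSD flag constants come from
Python's `stat` module.

```python
import stat

# Assumed on-disk ext2 flag bits (as commonly defined in Linux headers).
EXT2_IMMUTABLE_FL = 0x00000010
EXT2_APPEND_FL = 0x00000020
EXT2_NODUMP_FL = 0x00000040

_BSD_TO_EXT2 = {
    stat.SF_APPEND: EXT2_APPEND_FL,
    stat.SF_IMMUTABLE: EXT2_IMMUTABLE_FL,
    stat.UF_NODUMP: EXT2_NODUMP_FL,
}


def bsd_to_ext2_flags(bsd_flags):
    """Translate BSD file flags to ext2 flags, rejecting unsupported bits."""
    supported = 0
    for bsd_bit in _BSD_TO_EXT2:
        supported |= bsd_bit
    if bsd_flags & ~supported:
        raise ValueError("flag not supported by ext2fs")
    ext2_flags = 0
    for bsd_bit, ext2_bit in _BSD_TO_EXT2.items():
        if bsd_flags & bsd_bit:
            ext2_flags |= ext2_bit
    return ext2_flags


print(hex(bsd_to_ext2_flags(stat.SF_APPEND | stat.UF_NODUMP)))
```

A user settable append flag such as UF_APPEND is rejected by this
sketch, matching the note that ext2fs has no user settable append or
immutable flags.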

PR:		kern/122047
Approved by:	trasz (mentor)
2010-01-04 14:35:36 +00:00
delphij
32f7e2fc4c MFC r201137:
Grammar fix.

Submitted by:	Kenyon Ralph <kenyon kenyonralph com>
2010-01-04 01:09:59 +00:00
delphij
41d80c75ba Plug a memory leak.
PR:		bin/141835
Submitted by:	Henning Petersen <henning.petersen t-online.de>
2010-01-04 01:08:27 +00:00
delphij
768b2f543d MFC r200793:
Plug a memory leak.

PR:		bin/141836
Submitted by:	Henning Petersen <henning.petersen at t-online.de>
2010-01-04 01:07:32 +00:00
delphij
35b3b2ceeb MFC r200727:
Apply fix for Solaris bug 6764159: restore_object() makes a call
that can block while having a tx open but not yet committed
(onnv revision 7994)

Submitted by:	mm
Approved by:	pjd
Obtained from:	OpenSolaris
2010-01-03 03:10:28 +00:00
delphij
9ec4dfa13c MFC r200726:
Apply fix for Solaris bug 6801979: zfs recv can fail with E2BIG
(onnv revision 8986)

PR:		kern/141355
Requested by:	mm
Submitted by:	pjd
Obtained from:	OpenSolaris
2010-01-03 03:05:30 +00:00
delphij
87394b81d9 MFC r200724:
Apply fix for Solaris bug 6462803: zfs snapshot -r failed because
filesystem was busy.

PR:		kern/141387
Submitted by:	mm
Approved by:	pjd
Obtained from:	OpenSolaris (onnv 8989:cfce31f4eebf)
2010-01-03 02:58:05 +00:00
delphij
9ea5418d3f MFC r200516:
Add an option to specify that the received ZFS should not be automatically
mounted (receive -u).

Obtained from:	OpenSolaris (onnv revision 8584:327a1b6dd944)
Approved by:	pjd
2010-01-03 00:27:35 +00:00
imp
53aa96e334 Welcome to 2010.
2010-01-02 20:34:13 +00:00