69766 Commits

Author SHA1 Message Date
Stanislav Sedov
e4ec1e683e - Eliminate unused variable. [1]
- Check for runt frames entering the stack. [2]

Suggested by:	ganbold[1], yongari[2]
Approved by:	kib (mentor)
MFC after:	2 weeks
2008-12-06 14:23:45 +00:00
Randall Stewart
830d754d52 Code from the hack-session known as the IETF (and a
bit of debugging afterwards):
- Fix protection code for notification generation.
- Decouple associd from vtag
- Allow vtags to have less strigent requirements in non-uniqueness.
   o don't pre-hash them when you issue one in a cookie.
   o Allow duplicates and use addresses and ports to
     discriminate amongst the duplicates during lookup.
- Add support for the NAT draft draft-ietf-behave-sctpnat-00, this
  is still experimental and needs more extensive testing with the
  Jason Butt ipfw changes.
- Support for the SENDER_DRY event to get DTLS in OpenSSL working
  with a set of patches from Michael Tuexen (hopefully heading to OpenSSL soon).
- Update the support of SCTP-AUTH by Peter Lei.
- Use macros for refcounting.
- Fix MTU for UDP encapsulation.
- Fix reporting back of unsent data.
- Update assoc send counter handling to be consistent with endpoint sent counter.
- Fix a bug in PR-SCTP.
- Fix so we only send another FWD-TSN when a SACK arrives IF and only
  if the adv-peer-ack point progressed. However we still make sure
  a timer is running if we do have an adv_peer_ack point.
- Fix PR-SCTP bug where chunks were retransmitted if they are sent
  unreliable but not abandoned yet.

With the help of:	Michael Teuxen and Peter Lei :-)
MFC after:	 4 weeks
2008-12-06 13:19:54 +00:00
Edward Tomasz Napierala
d27a975f72 Make it possible to use gjournal for the root filesystem. Previously,
an unclean shutdown would make it impossible to mount rootfs at boot.

PR:		kern/128529
Reviewed by:	pjd
Approved by:	rwatson (mentor)
Sponsored by:	FreeBSD Foundation
2008-12-06 11:33:10 +00:00
Daniel Gerzo
f1f8583397 - correct variable name
PR:		docs/129448
Submitted by:	Kenyon Ralph <kralph@gmail.com>
MFC after:	Revision 1.91 is merged
2008-12-06 11:21:10 +00:00
George V. Neville-Neil
7f15419bb7 Bug fix to support N310 version of Chelsio cards (board ID 1088).
Obtained from:	Chelsio Inc.
MFC after:	3 days
2008-12-06 02:10:53 +00:00
Alexander Motin
d8208d9eb1 Forget current bus power settings on full reset. Chip must be reconfigured.
Do not issue command if there is no card, clock or power.
Controller will not detect command timeout without clock active.
2008-12-06 01:31:07 +00:00
George V. Neville-Neil
5197f3abd7 Re submit code to print the part and serial number for Chelsio cards.
The original code was accidentally removed in another commit.

MFC after: 1 day
2008-12-05 21:40:11 +00:00
John Baldwin
ee03ef72e3 Add simple locking for the in-kernel iconv code. Translation operations
do not need any locking.  Opening and closing translators is serialized
using an sx lock.

Note: This depends on the earlier fix to kern_module.c to properly order
MOD_UNLOAD events.

MFC after:	2 months
2008-12-05 21:19:24 +00:00
Konstantin Belousov
2640173120 Unconditionally use locked addition of zero to tip of the stack for
memory barriers on i386. It works as a serialization instruction on
all IA32 CPUs.

Alternative solution of using {s,l,}fence requires run-time checking
of the presense of the corresponding SSE or SSE2 extensions, and
possible boot-time patching of the kernel text.

Suggested by:	many
2008-12-05 21:17:54 +00:00
Konstantin Belousov
aeb325719a Several threads in a process may do vfork() simultaneously. Then, all
parent threads sleep on the parent' struct proc until corresponding
child releases the vmspace. Each sleep is interlocked with proc mutex of
the child, that triggers assertion in the sleepq_add(). The assertion
requires that at any time, all simultaneous sleepers for the channel use
the same interlock.

Silent the assertion by using conditional variable allocated in the
child. Broadcast the variable event on exec() and exit().

Since struct proc * sleep wait channel is overloaded for several
unrelated events, I was unable to remove wakeups from the places where
cv_broadcast() is added, except exec().

Reported and tested by:	ganbold
Suggested and reviewed by:	jhb
MFC after:	2 week
2008-12-05 20:50:24 +00:00
John Baldwin
75444a8590 When the SYSINIT() to load a module invokes the MOD_LOAD event successfully,
move that module to the head of the associated linker file's list of modules.
The end result is that once all the modules are loaded, they are sorted in
the reverse of their load order.  This causes the kernel linker to invoke
the MOD_QUIESCE and MOD_UNLOAD events in the reverse of the order that
MOD_LOAD was invoked.  This means that the ordering of MOD_LOAD events that
is set by the SI_* paramters to DECLARE_MODULE() are now honored in the same
order they would be for SYSUNINIT() for the MOD_QUIESCE and MOD_UNLOAD
events.

MFC after:	1 month
2008-12-05 16:47:30 +00:00
Rafal Jaworowski
b09d6bf325 Avoid confusion and adjust link address range of Marvell Orion kernel so it is
the same as for Kirkwood and Discovery.
2008-12-05 15:31:51 +00:00
Rafal Jaworowski
fe3ea17045 Fix configuration of the PCI bridge. This got omitted in the initial import of
this code.
2008-12-05 15:27:28 +00:00
Gleb Smirnoff
0b476f1cce In a case of CARP status change run through the if_link_state_change()
routine, so that devd(8) and others are notified about link state change.
2008-12-05 14:37:14 +00:00
John Baldwin
b4824b48b4 - Invoke MOD_QUIESCE on all modules in a linker file (kld) before
unloading any modules.  As a result, if any module veto's an unload
  request via MOD_QUIESCE, the entire set of modules for that linker
  file will remain loaded and active now rather than leaving the kld
  in a weird state where some modules are loaded and some are unloaded.
- This also moves the logic for handling the "forced" unload flag out of
  kern_module.c and into kern_linker.c which is a bit cleaner.
- Add a module_name() routine that returns the name of a module and use that
  instead of printing pointer values in debug messages when a module fails
  MOD_QUIESCE or MOD_UNLOAD.

MFC after:	1 month
2008-12-05 13:40:25 +00:00
Konstantin Belousov
482b7172da Improve db_backtrace() for compat ia32 on amd64. 32bit image enters
the kernel via Xint0x80_syscall().

Submitted by:	dchagin
MFC after:	1 week
2008-12-05 11:34:36 +00:00
Warner Losh
5b9ee137aa Move to using filter for the change interrupts. Also rework the power
interrupt code to be more robust.  I've been running these changes for
over a year...  With these changes, I don't see the ath card going
into reset like the code in the tree.
2008-12-05 05:20:08 +00:00
Warner Losh
ca446278b1 Minor style nit. 2008-12-05 04:48:04 +00:00
Warner Losh
f40b0e2e97 Augment comments, and move things around a smidge. 2008-12-05 04:46:26 +00:00
Warner Losh
414f7ec8bd Implement a method described in NetBSD PR 36652 for coping with the
BAD VCC bit.
2008-12-05 04:43:25 +00:00
George V. Neville-Neil
9036240993 Fix a bug with the ael1006 PHY. The bug shows up as persistent but incomplete
packet loss, of between 10-30%. The fix is to put the PHY into
and take it out of local loopback mode when resetting the interface.

Obtained from:	Chelsio Inc.
MFC after:	3 days
2008-12-04 20:32:53 +00:00
Warner Losh
78bc7eec0d Put the MIPS support back in after it was removed in r185029. 2008-12-04 16:31:08 +00:00
Kip Macy
23dc562170 Integrate 185578 from dfr
Use newbus to managed devices
2008-12-04 07:59:05 +00:00
Kip Macy
cbc936b6d8 fix initialization for case of normal kernbase
remove unused shutdown code
2008-12-04 07:28:13 +00:00
Pyun YongHyeon
450ab47230 Add HW MAC counter support for newer JMC250/JMC260 revisions. 2008-12-04 02:16:53 +00:00
Pyun YongHyeon
f37739d7ab Add support for newer JMC250/JMC260 revisions.
o Chip full mask revision 2 or later controllers have to
   set correct Tx MAC and Tx offload clock depending on negotiated
   link speed.
 o JMC260 chip full mask revision 2 has a silicon bug that can't
   handle 64bit DMA addressing. Add workaround to the bug by
   limiting DMA address space to be within 32bit.
 o Valid FIFO space of receive control and status register was
   changed on chip full mask revision 2 or later controllers. For
   these controllers, use default 16QW as it's supposed to be the
   safest value for maximum PCIe compatibility. JMicron confirmed
   performance will not be reduced even if the FIFO space is set
   to 16QW.
 o When interface is put into suspend/shutdown state, remove Tx MAC
   and Tx offload clock to save more power. We don't need Tx clock
   at all in this state.
 o Added new register definition for chip full mask revision 2 or
   later controllers.

Thanks to JMicron for their continuous support of FreeBSD.
2008-12-04 01:58:40 +00:00
Xin LI
2dd5c73163 Don't attempt to clear status updates if we did not do a link state
change.  As a side effect, this makes the excessive interrupts to
disappear which has been observed as a regression in recent stable/7.

Reported by:	many (on -stable@)
Reviewed by:	davidch
2008-12-03 23:00:00 +00:00
John Baldwin
3cdf485f87 When unloading a 32-bit system call module, restore the sysent vector in
the 32-bit system call table instead of the main system call table.
2008-12-03 18:45:38 +00:00
Alexander Kabaev
7d464bbf38 Change nfsserver slightly so that it does not trip over the timestamp
validation code on ZFS.

Problem: when opening file with O_CREAT|O_EXCL NFS has to jump through
extra hoops to ensure O_EXCL semantics. Namely, client supplies of 8
bytes (NFSX_V3CREATEVERF) bytes of verification data to uniquely
identify this create request. Server then creates a new file with access
mode 0, copies received 8 bytes into va_atime member of struct vattr and
attempt to set the atime on file using VOP_SETATTR. If that succeeds, it
fetches file attributes with VOP_GETATTR and verifies that atime
timestamps match.  If timestamps do not match, NFS server concludes it
has probbaly lost the race to another process creating the file with the
same name and bails with EEXIST.

This scheme works OK when exported FS is FFS, but if underlying
filesystem is ZFS _and_ server is running 64bit kernel, it breaks down
due to sanity checking in zfs_setattr function, which refuses to accept
any timestamps which have tv_sec that cannot be represented as 32bit
int. Since struct timespec fields are 64 bit integers on 64bit platforms
and server just copies NFSX_V3CREATEVERF bytes info va_atime, all eight
bytes supplied by client end up in va_atime.tv_sec, forcing it out of
valid 32bit range.

The solution this change implements is simple: it treats
NFSX_V3CREATEVERF as two 32bit integers and unpacks them separately into
va_atime.tv_sec and va_atime.tv_nsec respectively, thus guaranteeing
that tv_sec remains in 32 bit range and ZFS remains happy.

Reviewed by: kib
2008-12-03 17:54:09 +00:00
Joseph Koshy
b4d091f3a4 Fixes for Core2 Extreme support.
Submitted by:	 "Artem Belevich" <artemb at gmail dot com>
2008-12-03 17:30:36 +00:00
Doug Ambrisko
d205405c23 Change new card identification names.
Submitted by:	LSI
MFC after:	3 days
2008-12-03 16:29:12 +00:00
Bjoern A. Zeeb
118258f5c2 Fix a credential reference leak. [1]
Close subtle but relatively unlikely race conditions when
propagating the vnode write error to other active sessions
tracing to the same vnode, without holding a reference on
the vnode anymore. [2]

PR:		kern/126368 [1]
Submitted by:	rwatson [2]
Reviewed by:	kib, rwatson
MFC after:	4 weeks
2008-12-03 15:54:35 +00:00
Joseph Koshy
a10c6ee6bd Add aliases that map architectural event names to fixed function counters. 2008-12-03 15:23:08 +00:00
Luigi Rizzo
ae3096705c Another, hopefully final set of changes to boot0 and boot0cfg.
boot0.S changes:

+ import a patch from Christoph Mallon to rearrange the various
  print functions and save another couple of bytes;

+ implement the suggestion in PR 70531 to enable booting from
  any valid partition because even the extended partitions that
  were previously in our kill list may contain a valid boot loader.
  This simplifies the code and saves some bytes;

+ followwing up PR 127764, implement conditional code to preserve
  the 'Volume ID' which might be used by other OS (NT, XP, Vista)
  and is located at offset 0x1b8. This requires a relocation of the
  parameter block within the boot sector -- there is no other
  possible workaround.
  To address this, boot0cfg has been updated to handle both
  versions of the boot code;

+ slightly rearrange the strings printed in the menus to make
  the code buildable with all options. Given the tight memory
  budget, this means that with certain options we need to
  shrink or remove certain labels.

and especially:

	make -DVOLUME_LABEL -DPXE the default options.

  This means that the newly built boot0 block will preserve the
  Volume ID, and has the (hidden) option F6 to boot from INT18/PXE.
  I think the extra functionality is well worth the change.

  The most visible difference here is that the 'Default: ' string
  now becomes 'Boot: ' (it can be reverted to the old value
  but then we need to nuke 1/2 partition name or entries to
  make up for the extra room).

boot0cfg changes:

+ modify the code to recognise the new boot0 structure (with the
  relocated options block to make room for the Volume id).

+ add two options, '-i xxxx-xxxx' to set the volume ID, -e c
  to modify the character printed in case of bad input

PR:		127764 70531
Submitted by:	Christoph Mallon (portions)
MFC after:	4 weeks
2008-12-03 14:53:59 +00:00
Pyun YongHyeon
1ce1618851 AR8113 also need to set DMA read burst value. This should fix
occasional DMA read error seen on AR8113.

Submitted by:	Jie Yang < Jie.Yang <> Atheros com >
2008-12-03 09:01:12 +00:00
Pyun YongHyeon
19042fb8c7 Add some PHY magic to enable PHY hibernation and 1000baseT/10baseT
power adjustment. This change is required to guarantee correct
operation on certain switches.

Submitted by:	Jie Yang < Jie.Yang <> Atheros com >
2008-12-03 08:56:01 +00:00
Pyun YongHyeon
1f9cbabcdb Update if_iqdrops instead of if_ierrors when m_devget(9) fails. 2008-12-03 03:20:18 +00:00
Robert Watson
52267f7411 Merge OpenBSM 1.1 alpha 2 from the OpenBSM vendor branch to head, both
contrib/openbsm (svn merge) and sys/{bsm,security/audit} (manual merge).

- Add OpenBSM contrib tree to include paths for audit(8) and auditd(8).
- Merge support for new tokens, fixes to existing token generation to
  audit_bsm_token.c.
- Synchronize bsm includes and definitions.

OpenBSM history for imported revisions below for reference.

MFC after:      1 month
Sponsored by:   Apple Inc.
Obtained from:  TrustedBSD Project

--

OpenBSM 1.1 alpha 2

- Include files in OpenBSM are now broken out into two parts: library builds
  required solely for user space, and system includes, which may also be
  required for use in the kernels of systems integrating OpenBSM.  Submitted
  by Stacey Son.
- Configure option --with-native-includes allows forcing the use of native
  include for system includes, rather than the versions bundled with OpenBSM.
  This is intended specifically for platforms that ship OpenBSM, have adapted
  versions of the system includes in a kernel source tree, and will use the
  OpenBSM build infrastructure with an unmodified OpenBSM distribution,
  allowing the customized system includes to be used with the OpenBSM build.
  Submitted by Stacey Son.
- Various strcpy()'s/strcat()'s have been changed to strlcpy()'s/strlcat()'s
  or asprintf().  Added compat/strlcpy.h for Linux.
- Remove compatibility defines for old Darwin token constant names; now only
  BSM token names are provided and used.
- Add support for extended header tokens, which contain space for information
  on the host generating the record.
- Add support for setting extended host information in the kernel, which is
  used for setting host information in extended header tokens.  The
  audit_control file now supports a "host" parameter which can be used by
  auditd to set the information; if not present, the kernel parameters won't
  be set and auditd uses unextended headers for records that it generates.

OpenBSM 1.1 alpha 1

- Add option to auditreduce(1) which allows users to invert sense of
  matching, such that BSM records that do not match, are selected.
- Fix bug in audit_write() where we commit an incomplete record in the
  event there is an error writing the subject token.  This was submitted
  by Diego Giagio.
- Build support for Mac OS X 10.5.1 submitted by Eric Hall.
- Fix a bug which resulted in host XML attributes not being arguments so
  that const strings can be passed as arguments to tokens.  This patch was
  submitted by Xin LI.
- Modify the -m option so users can select more then one audit event.
- For Mac OS X, added Mach IPC support for audit trigger messages.
- Fixed a bug in getacna() which resulted in a locking problem on Mac OS X.
- Added LOG_PERROR flag to openlog when -d option is used with auditd.
- AUE events added for Mac OS X Leopard system calls.
2008-12-02 23:26:43 +00:00
Bjoern A. Zeeb
4b79449e2f Rather than using hidden includes (with cicular dependencies),
directly include only the header files needed. This reduces the
unneeded spamming of various headers into lots of files.

For now, this leaves us with very few modules including vnet.h
and thus needing to depend on opt_route.h.

Reviewed by:	brooks, gnn, des, zec, imp
Sponsored by:	The FreeBSD Foundation
2008-12-02 21:37:28 +00:00
Ed Schouten
bfba40a452 Remove "[KEEP THIS!]" from COMPAT_43TTY. It's not really that important.
Sgtty is a programming interface that has been replaced by termios over
the years. In June we already removed <sgtty.h>, which exposes the
ioctl()'s that are implemented by this interface. The importance of this
flag is overrated right now.
2008-12-02 19:09:08 +00:00
George V. Neville-Neil
c9c0c99f30 Bug fix from Chelsio which addresses the issue of the device resetting
when it sees only received packets.  In some cases where a device only
recieves data it mistakenly thinks that its transmitting side is broken
and resets the device.

Obtained from:	Chelsio Inc.
MFC after:	3 days
2008-12-02 15:42:47 +00:00
Luigi Rizzo
4b91bf7948 This commits brings in a lot of documentation and some enhancement
of the boot0.S code, with a number of compile-time selectable options,
the most interesting one being the ability to select PXE booting.

The code is completely compatible with the previous one, and with
the boot0cfg program. Even the actual code is largely unmodified,
with only minor rearrangements or fixes to make room for the new
features.

The behaviour of the standard build differs from the previous
version in the following, minor things:

+ 'noupdate' is the default, which means the code does not
  write back the selection to disk. You can enable the feature
  at runtime with boot0cfg, or changing the flags in the Makefile.

+ a drive number of 0x00 (floppy, or USB in floppy emulation) is
  now accepted as valid. Previously, it was overridden with 0x80,
  meaning that the partition table coming from the media was
  used to access sectors on a possibly different media.
  You can revert to the previous mode building with -DCHECK_DRIVE,
  and you can always use the 'setdrv' option in boot0cfg

+ certain FAT or NTFS partitions are listed as WIN instead of DOS.

+ the 'bel' character on a bad selection is replaced by a '#' to
  make it clear that the system is not hang even if the machine
  does not have a speaker. This can be reverted back at compile
  time, or at runtime with an upcoming boot0cfg option.

Additional features are available as compile time options,
and may be become the default if deemed useful. In particular:

+ INT18/PXE boot (make -DPXE)
  This option enables booting through INT 18h (which on certain
  BIOSes can be hooked to PXE) by pressing F6. There is unfortunately
  no room to print the additional menu option.
  Also, to make room for the code, the 'Default: ' string is
  changed to 'Boot: '

+ print current drive number (make -DTEST)
  Prints a line indicating the current drive number.
  This is useful to figure out what is going on for machines/bioses
  which remap drives in sometimes surprising ways.

+ disable numeric keys in console mode (make -DONLY_F_KEYS)
  Not really a significant option, but it is needed to make
  room for the -DTEST mode.

+ disable floppy support (make -DCHECK_DRIVE)
  Revert to the old behaviour of only accepting 0x80 and above
  as valid drive numbers.

MFC after:	6 weeks
2008-12-02 14:57:48 +00:00
Ganbold Tsagaankhuu
7613f162e9 Remove unused variable.
Found with:     Coverity Prevent(tm)
CID: 3685

Approved by: jhb
2008-12-02 14:19:53 +00:00
Konstantin Belousov
d6568724e1 Shared lookup makes it possible to create several negative cache
entries for one name. Then, creating inode with that name would remove
one entry, leaving others dormant. Reclaiming the vnode would uncover
negative entries, causing false return of ENOENT from the calls like
stat, that do not create inode.

Prevent creation of the duplicated negative entries.

Reported and debugged with:	pho
Reviewed by:	jhb
X-MFC:	after shared lookup changes
2008-12-02 11:14:16 +00:00
Konstantin Belousov
ffdaeffe21 Do not lock vnode interlock around reading of v_iflag to check VI_DOOMED.
Read of the pointer is atomic, and flag cannot be set while vnode lock
is held.

Requested by:	jhb
MFC after:	1 month
2008-12-02 11:12:50 +00:00
Joseph Koshy
dfd9bc23c9 - Efficiency tweak: when checking for PMC overflows, only go to
hardware for PMCs that have been configured for sampling.

- Bug fix: acknowledge PMC hardware overflows irrespective of the
  the (software) PMC's state.
2008-12-02 10:46:35 +00:00
Peter Wemm
a8f4518c30 kf_offset was supposed to be signed. 2008-12-02 10:39:47 +00:00
Kip Macy
f3c713fdaf - fix bug where dnsperf would stop transmitting after a few seconds
- break complex conditionals in to multiple lines to avoid wrapping
- remove copious unused debug statements
- be more aggressive about cleaning in the calling thread
- eliminate usage of ENOSPC
- increase number of iterations that cxgbsp can do
- eliminate "initerr" usage to simplify ENOBUFS handling
- when coalescing pass all packets to BPF
- always set overrun if hardware queue is full
2008-12-02 07:01:18 +00:00
Peter Wemm
43151ee6cf Merge user/peter/kinfo branch as of r185547 into head.
This changes struct kinfo_filedesc and kinfo_vmentry such that they are
same on both 32 and 64 bit platforms like i386/amd64 and won't require
sysctl wrapping.

Two new OIDs are assigned.  The old ones are available under
COMPAT_FREEBSD7 - but it isn't that simple.  The superceded interface
was never actually released on 7.x.

The other main change is to pack the data passed to userland via the
sysctl.  kf_structsize and kve_structsize are reduced for the copyout.
If you have a process with 100,000+ sockets open, the unpacked records
require a 132MB+ copyout.  With packing, it is "only" ~35MB.  (Still
seriously unpleasant, but not quite as devastating).  A similar problem
exists for the vmentry structure - have lots and lots of shared libraries
and small mmaps and its copyout gets expensive too.

My immediate problem is valgrind.  It traditionally achieves this
functionality by parsing procfs output, in a packed format.  Secondly, when
tracing 32 bit binaries on amd64 under valgrind, it uses a cross compiled
32 bit binary which ran directly into the differing data structures in 32
vs 64 bit mode.  (valgrind uses this to track file descriptor operations
and this therefore affected every single 32 bit binary)

I've added two utility functions to libutil to unpack the structures into
a fixed record length and to make it a little more convenient to use.
2008-12-02 06:50:26 +00:00
Warner Losh
7d0fb1f7ef Don't call destroy_dev on the alias. This fixes half a dozen PRs I think. 2008-12-02 04:54:31 +00:00