Commit Graph

149303 Commits

Author SHA1 Message Date
Bjoern A. Zeeb
7cf8b4b933 Make libkvm work on live systems and crashdumps with and
without VIMAGE virtualization in the kernel.

If we cannot resolve a symbol try to see if we can find it with
prefix of the virtualized subsystem, currently only "vnet_entry"
by identifying either the vnet of the current process for a
live system or the vnet of proc0 (or of dumptid if compiled
in a non-default way).

The way this is done currently allows us to only touch libkvm
but no single application. Once we are going to virtualize more
subsystems we will have to review this decision for better scaling.

Submitted by:	rwatson (initial version of kvm_vnet.c, lots of ideas)
Reviewed by:	rwatson
Approved by:	re (kib)
2009-07-23 21:12:21 +00:00
Robert Watson
d0728d7174 Introduce and use a sysinit-based initialization scheme for virtual
network stacks, VNET_SYSINIT:

- Add VNET_SYSINIT and VNET_SYSUNINIT macros to declare events that will
  occur each time a network stack is instantiated and destroyed.  In the
  !VIMAGE case, these are simply mapped into regular SYSINIT/SYSUNINIT.
  For the VIMAGE case, we instead use SYSINIT's to track their order and
  properties on registration, using them for each vnet when created/
  destroyed, or immediately on module load for already-started vnets.
- Remove vnet_modinfo mechanism that existed to serve this purpose
  previously, as well as its dependency scheme: we now just use the
  SYSINIT ordering scheme.
- Implement VNET_DOMAIN_SET() to allow protocol domains to declare that
  they want init functions to be called for each virtual network stack
  rather than just once at boot, compiling down to DOMAIN_SET() in the
  non-VIMAGE case.
- Walk all virtualized kernel subsystems and make use of these instead
  of modinfo or DOMAIN_SET() for init/uninit events.  In some cases,
  convert modular components from using modevent to using sysinit (where
  appropriate).  In some cases, do minor rejuggling of SYSINIT ordering
  to make room for or better manage events.

Portions submitted by:	jhb (VNET_SYSINIT), bz (cleanup)
Discussed with:		jhb, bz, julian, zec
Reviewed by:		bz
Approved by:		re (VIMAGE blanket)
2009-07-23 20:46:49 +00:00
Alan Cox
3313145243 Eliminate unnecessary cache and TLB flushes by pmap_change_attr(). (This
optimization was implemented in the amd64 version roughly 1 year ago.)

Approved by:	re (kensmith)
2009-07-23 19:43:23 +00:00
Nathan Whitehorn
ccf6415e82 Fix serial console on Apple Xserve G5 by falling back to input-device-1
if input-device is unavailable. The Xserve G5 defaults to using
screen/keyboard for output-device/input-device even if these are not
installed, and then falls back to serial ports at boot time.

Reviewed by:	marcel
Hardware from:	grehan
Approved by:	re (kib)
2009-07-23 12:51:27 +00:00
Brian Somers
aa14b5ab89 Add the -d switch to the usage message.
Submitted by:	Emil Mikulic - emil at dmr dot ath dot cx
Approved by:	re (kib)
MFC after:	1 week
2009-07-23 10:20:12 +00:00
Ken Smith
1e237f59b5 It is believed the last of the base system that could have an issue with
IDs larger than 16-bits has been updated so adjust sysinstall to allow
IDs up to the current system-wide size of 32-bits.

Approved by:	re (kib)
2009-07-22 22:13:42 +00:00
Ken Smith
764eca2591 It is believed the last subsystem that limited ID sizes to something
other than the current system-wide size (32-bits) has been updated so
for now just cautiously turn the check off.  While here fix the check
for IDs being too large which doesn't work due to type mis-matches.

Reviewed by:    jhb (previous version)
Approved by:	re (kib)
MFC after:	1 month (type mis-match fixes only)
2009-07-22 20:46:17 +00:00
Rick Macklem
93a15e2054 When vfs.newnfs.callback_addr is set to an IPv4 address, the
experimental NFSv4 client might try and use it as an IPv6 address,
breaking callbacks. The fix simply initializes the isinet6 variable
for this case.

Approved by:	re (kensmith), kib (mentor)
2009-07-22 18:10:44 +00:00
Edward Tomasz Napierala
d2ceff236a Fix extattr_list_file(2) on ZFS in case the attribute directory
doesn't exist and user doesn't have write access to the file.
Without this fix, it returns bogus value instead of 0.  For some
reason this didn't manifest on my kernel compiled with -O0.

PR:		kern/136601
Submitted by:	Jaakko Heinonen <jh at saunalahti dot fi>
Approved by:	re (kib)
2009-07-22 15:15:58 +00:00
Rick Macklem
c79e697621 Add changes to the experimental nfs client to use the PBDRY flag for
msleep(9) when a vnode lock or similar may be held. The changes are
just a clone of the changes applied to the regular nfs client by
r195703.

Approved by:	re (kensmith), kib (mentor)
2009-07-22 14:37:53 +00:00
Konstantin Belousov
206a336872 When the page caching attributes are changed, after new mapping is
established, OS shall flush the caches on all processors that may have
used the mapping previously. This operation is not needed if processors
support self-snooping. If not, but clflush instruction is implemented
on the CPU, series of the clflush can be used on the mapping region.
Otherwise, we have to flush the whole cache. The later operation is very
expensive, and AMD-made CPUs do not have self-snooping.

Implement cache flush for remapped region by using clflush for amd64,
when supported by CPU.

Proposed and reviewed by:	alc
Approved by:	re (kensmith)
2009-07-22 14:32:38 +00:00
Rick Macklem
7f1968ba10 When using an NFSv4 mount in the experimental nfs client with delegations
being issued from the server, there was a case where an Open issued locally
based on the delegation would be released before the associated vnode
became inactive. If the delegation was recalled after the open was released,
an Open against the server would not have been acquired and subsequent I/O
operations would need to use the special stateid of all zeros. This patch
fixes that case.

Approved by:	re (kensmith), kib (mentor)
2009-07-22 14:32:28 +00:00
Andrew Gallatin
94c7d993a3 mxge's tunable hw.mxge.rss_hash_type cannot be set from the
loader, because it uses a reserved suffix (_type).  Fix
this by removing the "_" and renaming the tunable to
hw.mxge.rss_hashtype.  The old (rss_hash_type) tunable is
still fetched, in case people load the driver via scripts.
When both are present in the kernel environment,
the new value (hw.mxge.rss_hashtype) overrides the old
value.

Approved by:	re (kib)
2009-07-22 11:57:34 +00:00
Colin Percival
90ad51e897 Remove the "dedicated disk mode" partitioning option from sysinstall, in
both the disk partitioning screen (the 'F' key) and via install.cfg (the
VAR_DEDICATED_DISK option).  This functionality is currently broken in 8.x
due to libdisk and geom generating different partition names; this commit
merely acts to help steer users away from the breakage.

Submitted by:	randi
Approved by:	re (kensmith)
2009-07-22 03:50:54 +00:00
Bruce M Simpson
f667763060 Output DWARF debug information for global 'using' declarations, instead
of just blowing up. A very similar change to this exists which is
GPLv3 licensed, this is my own change.

This problem was triggered by running the Boost regression tests.

See also:	http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31899
Reviewed by:	luigi
Approved by:	re (kib)
2009-07-22 01:07:11 +00:00
Bjoern A. Zeeb
a08362ce46 sysctl_msec_to_ticks is used with both virtualized and
non-vrtiualized sysctls so we cannot used one common function.

Add a macro to convert the arg1 in the virtualized case to
vnet.h to not expose the maths to all over the code.

Add a wrapper for the single virtualized call, properly handling
arg1 and call the default implementation from there.

Convert the two over places to use the new macro.

Reviewed by:	rwatson
Approved by:	re (kib)
2009-07-21 21:58:55 +00:00
Sam Leffler
e50821abe7 store mesh timers as ticks and sysctls for changing the defaults
Reviewed by:	rpaulo
Approved by:	re (kib)
2009-07-21 19:38:22 +00:00
Sam Leffler
411ccf5f63 Correct handling of keys that already have a hardware/device key index:
this was broken in r183248 when the check of wk_keyix was replaced by
a check of IEEE80211_KEY_DEVKEY (because the flag was clobbered).  Define
IEEE80211_KEY_DEVICE to specify flags that are owned by net80211/driver
and use this to preserve IEEE80211_KEY_DEVKEY so we don't ask the driver
for another key index when we already have one.

Testing by:	Daniel Thiele, Wes Morgan
Reviewed by:	rpaulo
Approved by:	re (kib)
2009-07-21 19:36:32 +00:00
Sam Leffler
05917235c0 update for recent mesh additions
Approved by:	re (kib)
2009-07-21 19:25:25 +00:00
Sam Leffler
bd6c4dd05d correct setup of opt_ddb.h
Submitted by:	jkim
Approved by:	re (kib)
2009-07-21 19:24:53 +00:00
Sam Leffler
82cadd5aa0 Fix handling of AR_RX_FILTER_BSSID: write the shadow value for AR_MISC_MODE
so other register writes preserve the setting of AR_MISC_MODE_BSSID_MATCH_FORCE.

Reviewed by:	rpaulo
Approved by:	re (kib)
2009-07-21 19:23:34 +00:00
Marius Strobl
fada2a867d Add a MD __PCI_BAR_ZERO_VALID which denotes that BARs containing 0
actually specify valid bases that should be treated just as normal.
The PCI specifications have no indication that 0 would be a magic value
indicating a disabled BAR as commonly used on at least amd64 and i386
but not sparc64. It's unclear what to do in pci_delete_resource()
instead of writing 0 to a BAR though as there's no (other) way do
disable individual BARs so its decoding is left enabled in case of
__PCI_BAR_ZERO_VALID for now.

Approved by:	re (kib), jhb
MFC after:	1 week
2009-07-21 19:06:39 +00:00
Sam Leffler
fe0dd78965 track whether any mesh vaps are present to correctly setup the rx filter
when, for example, an ap vap is created first

Reviewed by:	rpaulo
Approved by:	re (kib)
2009-07-21 19:01:04 +00:00
Alan Cox
9e367360e3 Catch up with r195249, "Improve the handling of cpuset with interrupts."
Specifically, update the return type of xenpic_assign_cpu() so that this
file compiles again.

Approved by:	re (kib)
2009-07-21 16:54:11 +00:00
Sam Leffler
f8c44ada2e Fix the logic to count the number of "live interfaces". With this change
dhclient now terminates when the underlying ifnet is destroyed (e.g.
on card eject).

Reviewed by:	brooks
Approved by:	re (kib)
2009-07-21 15:06:10 +00:00
Rui Paulo
a6d54dae20 Enable mesh support.
Submitted by:	jkim
Approved by:	re (kib)
2009-07-21 14:23:05 +00:00
Rui Paulo
5b4d5f9ee2 Improve the printf message when a module failed to load. This gives the
user some clue about the possibility of a __FreeBSD_version mismatch.

Discussed with:	rwatson, jhb
Approved by:	re (kib)
2009-07-21 14:18:25 +00:00
Alexander Motin
67b87e4429 Add siis CAM driver for SiliconImage SiI3124/3132/3531 SATA2 controllers.
Driver supports Serial ATA and ATAPI devices, Port Multipliers
(including FIS-based switching), hardware command queues (31 command
per port) and Native Command Queuing. This is probably the second on
popularity, after AHCI, type of SATA2 controllers, that benefits from
using CAM, because of hardware command queuing support.

Approved by:    re (kib)
2009-07-21 12:32:46 +00:00
Yi-Jheng Lin
bd8311d481 - Add my birthday
- Add myself to ports committers and to lwhsu's mentee list

Approved by:	re (kib), lwhsu (mentor)
2009-07-21 09:54:04 +00:00
Rafal Jaworowski
570255d8a5 Do not use OCP85XX_LBC_OFF twice when accessing LBC registers on MPC85XX.
It turns LBC control registers were not programmed correctly on MPC85XX. We
were accessing bogus addresses as the base offset (OCP85XX_LBC_OFF) was
erroneously added during offset calculations.  Effectively the state of LBC
control registers was not altered by the kernel initialization code, but
everything worked as long as we coincided to use the same settings (LBC decode
windows) as firmware has initialized.

Submitted by:	Lukasz Wojcik
Reviewed by:	marcel
Approved by:	re (kensmith)
Obtained from:	Semihalf
2009-07-21 08:38:45 +00:00
Rafal Jaworowski
84a5818564 Make dcache_inv_range() point to the proper routines on ARM9 and ARM9E/ARM10.
On some ARM variations CPU func dispatcher has the D-cache invalidate method
point to write-back invalidate, which is wrong, and can lead to a crash/panic
on affected platforms.

Spotted by:	HPS
Reviewed by:	cognet
Approved by:	re (kib)
2009-07-21 08:29:19 +00:00
Coleman Kane
e0d12c4b94 Fix regression in last set of commits. Submitted via e-mail and then
nagged again via PR. Thank Paul for his persistence and contributions.

PR:		136895
Submitted by:	Paul B. Mahol <onemda@gmail.com>
Reviewed by:	sam (timeout, 10 days), weongyo (timeout, 10 days), me
Approved by:	re (Kostik Belousov <kostikbel@gmail.com>)
2009-07-20 23:21:19 +00:00
Antoine Brodin
7b1399f6df Update ObsoleteFiles.inc
- Remove some USB headers that were resurrected recently
- Add new obsolete files (2 man pages and libvgl.so.5)

Approved by:	re (kib)
2009-07-20 19:51:47 +00:00
Robert Watson
a511354af4 Back out the moving in r195782 of V_ip_id's initialization from the top
back to the bottom of ip_init() as found in 7.x.  I missed the fact that
the bottom half of the init routine only runs in the !VNET case.

Submitted by:	zec
Approved by:	re (vimage blanket)
2009-07-20 19:40:09 +00:00
Edward Tomasz Napierala
65588fd503 Fix permission handling for extended attributes in ZFS. Without
this change, ZFS uses SunOS Alternate Data Streams semantics - each
EA has its own permissions, which are set at EA creation time
and - unlike SunOS - invisible to the user and impossible to change.
From the user point of view, it's just broken: sometimes access
is granted when it shouldn't be, sometimes it's denied when
it shouldn't be.

This patch makes it behave just like UFS, i.e. depend on current
file permissions.  Also, it fixes returned error codes (ENOATTR
instead of ENOENT) and makes listextattr(2) return 0 instead
of EPERM where there is no EA directory (i.e. the file never had
any EA).

Reviewed by:	pjd (idea, not actual code)
Approved by:	re (kib)
2009-07-20 19:16:42 +00:00
Rui Paulo
c104cff26e More mesh bits, namely:
* bridge support (sam)
* handling of errors (sam)
* deletion of inactive routing entries
* more debug msgs (sam)
* fixed some inconsistencies with the spec.
* decap is now specific to mesh (sam)
* print mesh seq. no. on ifconfig list mesh
* small perf. improvements

Reviewed by:	sam
Approved by:	re (kib)
2009-07-20 19:12:08 +00:00
Robert Watson
0a4747d4d0 Garbage collect vnet module registrations that have neither constructors
nor destructors, as there's no actual work to do.

In most cases, the constructors weren't needed because of the existing
protocol initialization functions run by net_init_domain() as part of
VNET_MOD_NET, or they were eliminated when support for static
initialization of virtualized globals was added.

Garbage collect dependency references to modules without constructors or
destructors, notably VNET_MOD_INET and VNET_MOD_INET6.

Reviewed by:	bz
Approved by:	re (vimage blanket)
2009-07-20 13:55:33 +00:00
Rafal Jaworowski
331f685743 ARM pmap fixes.
a)  nocache-remap problem

   When a page is remapped into a non-cacheable virtual memory region there
   was no associated write-back invalidate operation performed. We remove
   writeback of the original buffer size from bus_dmamem_alloc() and add
   appropriate L1/L2 flush operation.

b) missing write-back invalidate operation

   In pmap_kremove a page is removed so we must do a write-back
   invalidate operation aligned to the page virtual address.

Submitted by:	Michal Hajduk
Reviewed by:	Mark Tinguely, rpaulo, stas
Approved by:	re (kib)
Obtained from:	Semihalf
2009-07-20 07:53:07 +00:00
Robert Watson
17ef1feb8a Add macros VNET_SETNAME and VNET_SYMPREFIX, and expose to userspace if
_WANT_VNET is defined.  This way we don't need separate definitions in
libkvm.

Reviewed by:	bz
Approved by:	re (vimage blanket)
2009-07-20 07:50:50 +00:00
Scott Long
62fdee1dda Fix an apparently harmless typo.
Approved by:	re
2009-07-20 03:59:00 +00:00
Alan Cox
9861cbc6ca Change the handling of fictitious pages by pmap_page_set_memattr() on
amd64 and i386.  Essentially, fictitious pages provide a mechanism for
creating aliases for either normal or device-backed pages.  Therefore,
pmap_page_set_memattr() on a fictitious page needn't update the direct
map or flush the cache.  Such actions are the responsibility of the
"primary" instance of the page or the device driver that "owns" the
physical address.  For example, these actions are already performed by
pmap_mapdev().

The device pager needn't restore the memory attributes on a fictitious
page before releasing it.  It's now pointless.

Add pmap_page_set_memattr() to the Xen pmap.

Approved by:	re (kib)
2009-07-19 21:40:19 +00:00
Konstantin Belousov
7ac5806bfb When buffer write is failed, it is wrong for brelse() to invalidate
portion of the page that was written. Among other problems, this
page might be picked up by pagedaemon, with failed assertion in
vm_pageout_flush() about validity of the page.

Reported and tested by:	pho
Approved by:	re (kensmith)
MFC after:	3 weeks
2009-07-19 20:25:59 +00:00
Brian Somers
8e41399c8f Don't get stuck in an infinite loop comparing (short++ <= maxshort)
PR:		136893
Submitted by:	Aragon Gouveia - aragon at phat dot za dot net (mostly)
Approved by:	re (kib)
MFC after:	3 weeks
2009-07-19 19:01:30 +00:00
Robert Watson
006e9db452 Normalize field naming for struct vnet, fix two debugging printfs that
print them.

Reviewed by:	bz
Approved by:	re (kensmith, kib)
2009-07-19 17:40:45 +00:00
Jilles Tjoelker
80a0e9b5f5 Allow creating hard links to symlinks using ln(1).
This implements the POSIX.1-2008 -L and -P flags.

The default remains to create hard links to the target of symlinks.

Approved by:	re (kib), ed (mentor)
2009-07-19 17:35:23 +00:00
Ken Smith
3ca3047aee Bump the version of all non-symbol-versioned shared libraries in
preparation for 8.0-RELEASE.  Add the previous version of those
libraries to ObsoleteFiles.inc and bump __FreeBSD_Version.

Reviewed by:    kib
Approved by:    re (rwatson)
2009-07-19 17:25:24 +00:00
Sam Leffler
030d10ee97 add urtw
Approved by:	re (kib)
2009-07-19 16:54:24 +00:00
Jilles Tjoelker
22e4c1c47c Correct AT_SYMLINK_FOLLOW flag name in linkat(2) man page.
Approved by:	re (kib), ed (mentor)
2009-07-19 16:48:25 +00:00
Rick Macklem
80e556956e Fix two bugs in the experimental nfs client:
- When the root vnode was acquired during mounting, mnt_stat.f_iosize was
  still set to 0, so getnewvnode() would set bo_bsize == 0. This would
  confuse getblk(), so that it always returned the first block causing
  the problem when the root directory of the mount point was greater
  than one block in size. It was fixed by setting mnt_stat.f_iosize to
  NFS_DIRBLKSIZ before calling ncl_nget() to acquire the root vnode.
- NFSMNT_INT was being set temporarily while the initial connect to a
  server was being done. This erroneously configured the krpc for
  interruptible RPCs, which caused problems because signals weren't
  being masked off as they would have been for interruptible mounts.
  This code was deleted to fix the problem. Since mount_nfs does an
  NFS null RPC before the mount system call, connections to the server
  should work ok.

Tested by:	swell dot k at gmail dot com
Approved by:	re (kensmith), kib (mentor)
2009-07-19 16:44:26 +00:00
Robert Watson
94da184f68 Expose the definitions of 'struct vnet' and 'VNET_MAGIC_N' to userspace
if _WANT_VNET is defined.  This is required so that libkvm can locate
virtual network stack instances in order to reach their global variables
for monitoring and crashdump analysis.

Reviewed by:	bz
Approved by:	re (kib)
2009-07-19 15:21:42 +00:00