11775 Commits

Author SHA1 Message Date
mav
92d6358872 There is no atrtc driver in pc98, so hide atrtcclock_disable variable usage
in APM driver for this platform. This should fix pc98 build.
2009-05-04 08:36:47 +00:00
mav
98565a1214 Rename statclock_disable variable to atrtcclock_disable that it actually is,
and hide it inside of atrtc driver. Add new tunable hint.atrtc.0.clock
controlling it. Setting it to 0 disables using RTC clock as stat-/
profclock sources.

Teach i386 and amd64 SMP platforms to emulate stat-/profclocks using i8254
hardclock, when LAPIC and RTC clocks are disabled.

This allows to reduce global interrupt rate of idle system down to about
100 interrupts per core, permitting C3 and deeper C-states provide maximum
CPU power efficiency.
2009-05-03 17:47:21 +00:00
kmacy
7066780d7a fix XEN compilation 2009-05-02 22:22:00 +00:00
mav
b704e6092a Add support for using i8254 and rtc timers as event sources for i386 SMP
system. Redistribute hard-/stat-/profclock events to other CPUs using IPI.
2009-05-02 12:59:47 +00:00
dchagin
32b5830d97 Move extern variable definitions to the header file.
Approved by:	kib (mentor)
MFC after:	1 month
2009-05-02 10:06:49 +00:00
mav
50b57c0fb5 Small addition to r191720.
Restore previous behaviour for the case of unknown interrupt. Invocation
of IRQ -1 crashes my system on resume. Returning 0, as it was, is not
perfect also, but at least not so dangerous.
2009-05-01 20:53:37 +00:00
sam
c0a4585083 o add uath
o sort usb wireless drivers
2009-05-01 17:20:16 +00:00
mav
364f1b1af0 Use value -1 instead of 0 for marking unused APIC vectors. This fixes
IRQ0 routing on LAPIC-enabled systems.

Add hint.apic.0.clock tunable. Setting it 0 disables using LAPIC timers
as hard-/stat-/profclock sources falling back to using i8254 and rtc timers.

On modern CPUs LAPIC is a part of CPU core which is shutting down when CPU
enters C3 or deeper power state. It makes no problems for interrupt
processing, as chipset wakes up CPU on interrupt triggering. But entering
C3 state kills LAPIC timer and freezes system time, making C3 and deeper
states practically unusable. Using i8254 timer allows to avoid this
problem.

By using i8254 timer my T7700 C2D CPU with UP kernel successfully enters
C3 state, saving more then a Watt of total idle power (>10%) in addition to
all other power-saving techniques.

This technique is not working for SMP yet, as only one CPU receives
timer interrupts. But I think that problem could be fixed by forwarding
interrupts to other CPUs with IPI.
2009-05-01 17:05:49 +00:00
dchagin
dca50049ce Reimplement futexes.
Old implemention used Giant to protect the kernel data structures,
but at the same time called malloc(M_WAITOK), that could cause the
calling thread to sleep and lost Giant protection. User-visible
result was the missed wakeup.

New implementation uses one sx lock per futex. The sx protects
the futex structures and allows to sleep while copyin or copyout
are performed.

Unlike linux, we return EINVAL when FUTEX_CMP_REQUEUE operation
is requested and either caller specified futexes are equial or
second futex already exists. This is acceptable since the situation
can only occur from the application error, and glibc falls back to
old FUTEX_WAKE operation when FUTEX_CMP_REQUEUE returns an error.

Approved by:	kib (mentor)
MFC after:	1 month
2009-05-01 15:36:02 +00:00
jkim
680fc00a3e - Fix divide-by-zero panic when SMP kernel is used on UP system[1].
- Avoid possible divide-by-zero panic on SMP system when the CPUID is
disabled, unsupported, or buggy.

Submitted by:	pluknet (pluknet at gmail dot com)[1]
2009-04-30 22:10:04 +00:00
jeff
9339d50dc3 - Add support for cpuid leaf 0xb. This allows us to determine the
topology of nehalem/corei7 based systems.
 - Remove the cpu_cores/cpu_logical detection from identcpu.
 - Describe the layout of the system in cpu_mp_announce().

Sponsored by:   Nokia
2009-04-29 06:54:40 +00:00
jhb
3ca2800ae4 Reduce the number of bounce zones (and thus the number of bounce pages
used in some cases):
- Ignore DMA tag boundaries when allocating bounce pages.  The boundaries
  don't determine whether or not parts of a DMA request bounce.  Instead,
  they are just used to carve up segments.
- Allow tags with sub-page alignment to share bounce pages since bounce
  pages are always page aligned.

Reviewed by:	scottl (amd64)
MFC after:	1 month
2009-04-23 20:24:19 +00:00
jhb
5ff418d071 Adjust the way we number CPUs on x86 so that we attempt to "group" all
logical CPUs in a package.  We do this by numbering the non-boot CPUs
by starting with the first CPU whose APIC ID is after the boot CPU and
wrapping back around to APIC ID 0 if needed rather than always starting
at APIC ID 0.  While here, adjust the cpu_mp_announce() routine to list
CPUs based on the mapping established by assign_cpu_ids() rather than
making assumptions about the algorithm assign_cpu_ids() uses.

MFC after:	1 month
2009-04-22 21:40:37 +00:00
rwatson
21a8b350dc Don't conditionally define CACHE_LINE_SHIFT, as we anticipate sizing
a fair number of static data structures, making this an unlikely
option to try to change without also changing source code. [1]

Change default cache line size on ia64, sparc64, and sun4v to 128
bytes, as this was what rtld-elf was already using on those
platforms. [2]

Suggested by:	bde [1], jhb [2]
MFC after:	2 weeks
2009-04-20 12:59:23 +00:00
rwatson
ab17fac487 Add description and cautionary note regarding CACHE_LINE_SIZE.
MFC after:	2 weeks
Suggested by:	alc
2009-04-19 21:26:36 +00:00
rwatson
8df790f38f For each architecture, define CACHE_LINE_SHIFT and a derived
CACHE_LINE_SIZE constant.  These constants are intended to
over-estimate the cache line size, and be used at compile-time
when a run-time tuning alternative isn't appropriate or
available.

Defaults for all architectures are 64 bytes, except powerpc
where it is 128 bytes (used on G5 systems).

MFC after:	2 weeks
Discussed on:   arch@
2009-04-19 20:19:13 +00:00
kmacy
1aef8359b1 - Import infrastructure for caching flows as a means of accelerating L3 and L2 lookups
as well as providing stateful load balancing when used with RADIX_MPATH.
- Currently compiled in to i386 and amd64 but disabled by default, it can be enabled at
  runtime with 'sysctl net.inet.flowtable.enable=1'.

- Embedded users can remove it entirely from the kernel by adding 'nooption FLOWTABLE' to
  their kernel config files.

- A minimal hookup will be added to ip_output in a subsequent commit. I would like to see
  more review before bringing in changes that require more churn.

Supported by: Bitgravity Inc.
2009-04-19 00:16:04 +00:00
jhb
360bcf2161 Restore bus DMA bounce pages to an offset of 0 when they are released by
a tag that has BUS_DMA_KEEP_PG_OFFSET set.  Otherwise the page could be
reused with a non-zero offset by a tag that doesn't have
BUS_DMA_KEEP_PG_OFFSET leading to data corruption.

Sleuthing by:	avg
Reviewed by:	scottl
2009-04-17 13:22:18 +00:00
marcel
cf8f14f029 Add a compat option to the EBR scheme that controls the
naming of the partitions (GEOM_PART_EBR_COMPAT).  When
compatibility is enabled, changes to the partitioning are
disallowed.

Remove the device name aliasing added previously to provide
backward compatibility, but which in practice doesn't give
us anything.

Enable compatibility on amd64 and i386.
2009-04-15 22:38:22 +00:00
jkim
e8cee11d8d A simple rewrite of biossmap.c:
- Do not iterate int 15h, function e820h twice.  Instead, we use STAILQ to
store each return buffer and copy all at once.
- Export optional extended attributes defined in ACPI 3.0 as separate
metadata.  Currently, there are only two bits defined in the specification.
For example, if the descriptor has extended attributes and it is not
enabled, it has to be ignored by OS.  We may implement it in the kernel
later if it is necessary and proven correct in reality.
- Check return buffer size strictly as suggested in ACPI 3.0.

Reviewed by:	jhb
2009-04-15 17:31:22 +00:00
kib
9c0149c147 The bus_dmamap_load_uio(9) shall use pmap of the thread recorded in the
uio_td to extract pages from, instead of unconditionally use kernel
pmap.

Submitted by:	Jason Harmening <jason.harmening gmail com> (amd64 version)
PR:	amd64/133592
Reviewed by:	scottl (original patch), jhb
MFC after:	2 weeks
2009-04-13 19:20:32 +00:00
ed
a0f5dad6a9 Simplify in/out functions (for i386 and AMD64).
Remove a hack to generate more efficient code for port numbers below
0x100, which has been obsolete for at least ten years, because GCC has
an asm constraint to specify that.

Submitted by:	Christoph Mallon <christoph mallon gmx de>
2009-04-11 14:01:01 +00:00
ed
93a9ed75b4 Also remove the unused __word_swap_int*() macros.
Submitted by:	Christoph Mallon <christoph.mallon@gmx.de>
2009-04-08 19:10:20 +00:00
ed
9141c649b8 Implement __bswap16() without using inline assembly.
Most compilers nowadays (including GCC) are smart enough to know what's
going on and generate more efficient code anyway.

Submitted by:	Christoph Mallon <christoph.mallon@gmx.de>
2009-04-08 19:06:47 +00:00
dchagin
01bf63c9fb Fix KBI breakage by r190520 which affects older linux.ko binaries:
1) Move the new field (brand_note) to the end of the Brandinfo structure.
2) Add a new flag BI_BRAND_NOTE that indicates that the brand_note pointer
   is valid.
3) Use the brand_note field if the flag BI_BRAND_NOTE is set and as old
   modules won't have the flag set, so the new field brand_note would be
   ignored.

Suggested by:	jhb
Reviewed by:	jhb
Approved by:	kib (mentor)
MFC after:	6 days
2009-04-05 09:27:19 +00:00
alc
85b0c58343 Retire VM_PROT_READ_IS_EXEC. It was intended to be a micro-optimization,
but I see no benefit from it today.

VM_PROT_READ_IS_EXEC was only intended for use on processors that do not
distinguish between read and execute permission.  On an mmap(2) or
mprotect(2), it automatically added execute permission if the caller
specified permissions included read permission.  The hope was that this
would reduce the number of vm map entries needed to implement an address
space because there would be fewer neighboring vm map entries that differed
only in the presence or absence of VM_PROT_EXECUTE.  (See vm/vm_mmap.c
revision 1.56.)

Today, I don't see any real applications that benefit from
VM_PROT_READ_IS_EXEC.  In any case, vm map entries are now organized
as a self-adjusting binary search tree instead of an ordered list.  So,
the need for coalescing vm map entries is not as great as it once was.
2009-04-04 23:12:14 +00:00
dfr
df0ed71781 Fix the Xen build for i386 PV mode. 2009-04-01 17:06:28 +00:00
kib
a2d099881f Sync definitions for struct sigcontext for i386 and amd64 architectures
to struct mcontext.
2009-04-01 13:44:28 +00:00
kib
aaeb4d6376 Fill the fsbase and gsbase fields of the mcontext structure on i386.
In collaboration with:	pho
Discussed with:	peter
Reviewed by:	jhb
2009-04-01 12:46:05 +00:00
kib
8e7a736a88 Add all segment registers for the amd64 CPU to struct reg and mcontext.
To keep these structures ABI-compatible, half the size of r_trapno,
r_err, mc_trapno, mc_flags.

Add fsbase and gsbase to mcontext on both amd64 and i386.
Add flags to amd64 mcontext to indicate that it contains valid segments
or bases.

In collaboration with:	pho
Discussed with:	peter
Reviewed by:	jhb
2009-04-01 12:44:17 +00:00
jkim
49cb67d70e Fix an uninitialized variable from the previous commit. 2009-03-31 21:14:05 +00:00
jkim
01c7b1ae8d Probe size of installed memory modules from loader and display it
as 'real memory' instead of Maxmem if the value is available.
Note amd64 displayed physmem as 'usable memory' since machdep.c r1.640
to unconfuse users.  Now it is consistent across amd64 and i386 again.
While I am here, clean up smbios.c a bit and update copyright date.

Reviewed by:	jhb
2009-03-31 21:02:55 +00:00
mr
b6886fc07b Extend comment in copyright notice as requested by author.
Submitted by:	G.Otsuji
2009-03-29 13:35:20 +00:00
mr
705fd8f6f9 Add support for Phenom (Family 10h) to cpufreq.
Its a newer version provided by the author than in the PR.

PR:		kern/128575
Submitted by:	Gen Otsuji annona2 [at] gmail.com
2009-03-28 08:54:47 +00:00
kib
a7383c9a55 Convert gdt_segs and ldt_segs initialization to C99 style.
Reviewed by:	jhb
2009-03-26 18:07:13 +00:00
jhb
afc2ecb61b Fix a few nits in the earlier changes to prevent local information leakage
in AMD FPUs:
- Do not clear the affected state in the case that the FPU registers for
  the thread that already owns the FPU are changed via fpu_setregs().  The
  only local information the thread would see is its own state in that
  case.
- Fix a type mismatch for the dummy variable used in a "fld".  It accepts
  a float, not a double.

Reviewed by:	bde
Approved by:	so (cperciva)
MFC after:	1 month
2009-03-25 22:08:30 +00:00
jhb
6cd843f315 Rename (fpu|npx)_cleanstate to (fpu|npx)_initialstate to better reflect
their purpose.

Inspired by:	bde
MFC after:	1 month
2009-03-25 14:17:08 +00:00
jhb
ac5c4c1c38 Fall back to using configuration type 1 accesses for PCI config requests if
the requested PCI bus falls outside of the bus range given in the ACPI
MCFG table.  Several BIOSes seem to not include all of the PCI busses in
systems in their MCFG tables.  It maybe that the BIOS is simply buggy and
does support all the busses, but it is more conservative to just fall back
to the old method unless it is certain that memory accesses will work.
2009-03-24 18:10:22 +00:00
alc
1623994337 Update stale comments. The alternate address space mapping was eliminated
when PAE support was added to i386.  The direct mapping exists on amd64.
2009-03-22 18:56:26 +00:00
alc
e68330f894 Eliminate the recomputation of pcb_cr3 from cpu_set_upcall(). The
bcopy()ed value from the old thread is the correct value because the new
thread and the old thread will share a page table.
2009-03-22 02:33:48 +00:00
thompsa
11f8f68779 Remove the uscanner(4) driver, this follows the removal of the kernel scanner
driver in Linux 2.6. uscanner was just a simple wrapper around a fifo and
contained no logic, the default interface is now libusb (supported by sane).

Reviewed by:	HPS
2009-03-19 20:33:26 +00:00
kib
7695aca762 Add AT_EXECPATH ELF auxinfo entry type. The value's a_ptr is a pointer
to the full path of the image that is being executed.
Increase AT_COUNT.

Remove no longer true comment about types used in Linux ELF binaries,
listed types contain FreeBSD-specific entries.

Reviewed by:	kan
2009-03-17 12:50:16 +00:00
jkim
3eda4741da Initial suspend/resume support for amd64.
This code is heavily inspired by Takanori Watanabe's experimental SMP patch
for i386 and large portion was shamelessly cut and pasted from Peter Wemm's
AP boot code.
2009-03-17 00:48:11 +00:00
rwatson
70b6a8119c Remove IFF_NEEDSGIANT, a compatibility infrastructure introduced
in FreeBSD 5.x to allow network device drivers to run with Giant
despite the network stack being Giant-free.  This significantly
simplifies calls into ioctl() on network interfaces, especially
in the multicast code, as well as eliminates deferred invocation
of interface if_start routines.

Disable the build on device drivers still depending on
IFF_NEEDSGIANT as they no longer compile.  They will be removed
in a few weeks if they haven't been made MPSAFE in that time.
Disabled drivers:

        if_ar
        if_axe
        if_aue
        if_cdce
        if_cue
        if_kue
        if_ray
        if_rue
        if_rum
        if_sr
        if_udav
        if_ural
        if_zyd

Drivers that were already disabled because of tty changes:

        if_ppp
        if_sl

Discussed on:	arch@
2009-03-15 14:21:05 +00:00
alc
7f1b26ac0c MFamd64 r189785
Update the pmap's resident page count when a page table page is freed in
  pmap_remove_pde() and pmap_remove_pages().

MFC after:	6 weeks
2009-03-14 15:37:19 +00:00
dchagin
2408b715a0 Implement new way of branding ELF binaries by looking to a
".note.ABI-tag" section.

The search order of a brand is changed, now first of all the
".note.ABI-tag" is looked through.

Move code which fetch osreldate for ELF binary to check_note() handler.

PR:		118473
Approved by:	kib (mentor)
2009-03-13 16:40:51 +00:00
dfr
598fb4217f Merge in support for Xen HVM on amd64 architecture. 2009-03-11 15:30:12 +00:00
rwatson
5277812531 Trim comments about the MP-safety of various bits of the amd64/i386
system call entry path and i386 IP checksum generation: we now assume
all code is MPSAFE unless explicitly marked otherwise.  Remove XXX
Giant comments along similar lines: the code by the comments either
doesn't need or doesn't want Giant (especially the NMI handler).

MFC after:	3 days
2009-03-09 13:11:16 +00:00
sobomax
82279c3ff2 Small comment nit: "run time" -> "run-time".
Submitted by:	rwatson
2009-03-08 05:01:39 +00:00
thompsa
4ab7fdce63 Reenable ndis in the LINT build now that it has been updated for USB. Thanks to
HPS and Weongyo.
2009-03-07 19:54:30 +00:00