Commit Graph

1336 Commits

Author SHA1 Message Date
jhb
ae432f93f2 - Always call exec_free_args() in kern_execve() instead of doing it in all
the callers if the exec either succeeds or fails early.
- Move the code to call exit1() if the exec fails after the vmspace is
  gone to the bottom of kern_execve() to cut down on some code duplication.
2006-02-06 22:06:54 +00:00
jeff
eeadb385e2 - Remove ifdef disabled code that doesn't have a chance of working anymore. 2006-02-06 10:10:42 +00:00
rwatson
b2508d6d8d Regenerate. 2006-02-04 13:29:09 +00:00
rwatson
e95beb66d3 Audit FreeBSD 32-bit system calls on 64-bit FreeBSD systems.
Obtained from:	TrustedBSD Project
2006-02-04 13:28:55 +00:00
jeff
4a759adfc8 - vn_lock with LK_RETRY can not return an error. The code that handled this
case was not necessary.

Sponsored by:	Isilon Systems, Inc.
2006-01-30 08:22:56 +00:00
cognet
a256f935eb Fix a typo : deivce => device
Spotted by:	rwatson
2006-01-26 21:48:50 +00:00
cognet
d6ecc915cc Linux compat bits needed to make linux programs use the new ptys :
linux_ioctl.[ch] : Implement LINUX_TIOCGPTN, which returns the pty number
linux_stats.c :
	- Return the magic number for devfs.
	- In various stats()-related functions, check that we're stating a
file in /dev/pts, and if so, change the st_rdev field to match what linux
expects to be there for a slave pty device. The glibc checks for this, and
their openpty() fails if it is no correct.
2006-01-26 01:32:46 +00:00
ambrisko
c64b5e9dbd Fix the build. When I added the lutimes the futimes definitions
went away in the generated files?  This didn't happen on my amd64
test machine but did when I committed it on my other i386 machine.
I need to figure this out since a regen on the amd64 doesn't fix it
now.  For now make the build work again.  Matt caught this before
my local mirror caught up.
2006-01-20 20:51:27 +00:00
ambrisko
b203612a40 Regen. 2006-01-20 16:22:37 +00:00
ambrisko
1e3545831a Add 32bit version of lutimes so untar doesn't mess up sym-links on amd64. 2006-01-20 16:22:06 +00:00
trhodes
b4c8f182a7 Cast tv_sec to intmax_t and print with %jd in some ifdef'ed code. 2005-12-28 07:08:54 +00:00
glebius
3529ccbc92 Add \n to log() message.
Submitted by:	Stanislaw Halik <weirdo tehran.lain.pl>
2005-12-27 00:17:11 +00:00
sobomax
34fa5a81a5 Remove kern.elf32.can_exec_dyn sysctl. Instead extend Brandinfo structure
with flags bitfield and set BI_CAN_EXEC_DYN flag for all brands that usually
allow executing elf dynamic binaries (aka shared libraries). When it is
requested to execute ET_DYN elf image check if this flag is on after we
know the elf brand allowing execution if so.

PR:		kern/87615
Submitted by:	Marcin Koziej <creep@desk.pl>
2005-12-26 21:23:57 +00:00
ru
779525de38 Regen. 2005-12-23 20:06:50 +00:00
ru
0fbab0b85d Fix build. 2005-12-23 20:06:14 +00:00
phk
213233d76c Regenerate sysent with new abort2 system call.
Implement abort2(const char *reason, int narg, void **args);

Submitted by:	"Wojciech A. Koszek" <dunstan@freebsd.czest.pl>
2005-12-23 11:58:42 +00:00
phk
80e1c5a55d Add missing 455-462 syscalls as unimplemented 2005-12-23 11:56:39 +00:00
phk
69832fdcf6 Add abort2() systemcall. 2005-12-23 11:54:11 +00:00
jhb
feebef55c2 Remove linux_mib_destroy() (which I actually added in between 5.0 and 5.1)
which existed to cleanup the linux_osname mutex.  Now that MTX_SYSINIT()
has grown a SYSUNINIT to destroy mutexes on unload, the extra destroy here
was redundant and resulted in panics in debug kernels.

MFC after:	1 week
Reported by:	Goran Gajic ggajic at afrodita dot rcub dot bg dot ac dot yu
2005-12-15 16:30:41 +00:00
delphij
53041c5448 In Linux, kernel parameters passed to ioctl are by value, while in FreeBSD
they are passed by reference.  Handle the difference within the
linux_ioctl_termio on the LINUX_TCFLSH path.

Submitted by:	Jaroslav Drzik <jaro_AT_coop-voz_dot_sk>
2005-12-13 15:32:52 +00:00
mlaier
a5a3178da8 Fix calculation of meminfo's swaptotal and swapfree on at least amd64.
MFC after:	3 days
2005-12-11 21:37:42 +00:00
ambrisko
8f15339baa Regen for futimes. 2005-12-08 22:15:09 +00:00
ambrisko
8cf667cb0b Add 32bit version of futimes so untar doesn't result in bad dates
(Jan 1, 1970) when run on amd64.

Reviewed by:	ps
2005-12-08 22:14:25 +00:00
glebius
d6ef4fe5b2 Suppress logging about unimplemented syscalls to one time per process. This
prevents hard flood of the system console.

Reviewed by:	bde
2005-12-08 13:33:57 +00:00
peter
a854646ee0 Catch up to the system siginfo changes. Use a union for the ia32 layout
of siginfo just like the system one.  There are now two fields to copy
instead of one.
2005-12-06 23:06:29 +00:00
ru
522e9c2b7b Fix -Wundef. 2005-12-04 02:12:43 +00:00
rodrigc
a5e716d31f Remove MNT_NODEV mount option. In RELENG_6, MNT_NODEV was a no-op.
The presence of MNT_NODEV was confusing the am-utils autoconf scripts.

PR:	conf/79715
2005-11-29 00:28:17 +00:00
wpaul
3d571e2f28 Somehow memmove() got mapped to memset() in the patch table. Create a
real memmove() implementation and use that instead.
2005-11-23 17:10:46 +00:00
wpaul
0b2e0bd48c Correct the API for Windows interupt handling a little. The prototype
for a Windows ISR is 'BOOLEAN isrfunc(KINTERRUPT *, void *)' meaning
the ISR get a pointer to the interrupt object and a context pointer,
and returns TRUE if the ISR determines the interrupt was really generated
by the associated device, or FALSE if not.

I had mistakenly used 'void isrfunc(void *)' instead. It happens the
only thing this affects is the internal ndis_intr() ISR in subr_ndis.c,
but it should be fixed just in case we ever need to register a real
Windows ISR vi IoConnectInterrupt().

For NDIS miniports that provide a MiniportISR() method, the 'is_our_intr'
value returned by the method serves as the return value from ndis_isr(),
and 'call_isr' is used to decide whether or not to schedule the interrupt
handler via DPC. For drivers that only supply MiniportEnableInterrupt()
and MiniportDisableInterrupt() methods, call_isr is always TRUE and
is_our_intr is always FALSE.

In the end, there should be no functional changes, except that now
ntoskrnl_intr() can terminate early once it finds the ISR that wants
to service the interrupt.
2005-11-20 01:29:29 +00:00
ru
96c796ea3f Unlike the rest of the world, NDIS code can access "struct
ifnet" before is has been fully initialized by if_attach().
Account for that to avoid a null pointer dereference.
2005-11-14 18:19:57 +00:00
wpaul
c3427572d7 Restore backwards source compatibility with 6.x and 5.x. 2005-11-13 21:36:48 +00:00
ru
f70f525b49 - Store pointer to the link-level address right in "struct ifnet"
rather than in ifindex_table[]; all (except one) accesses are
  through ifp anyway.  IF_LLADDR() works faster, and all (except
  one) ifaddr_byindex() users were converted to use ifp->if_addr.

- Stop storing a (pointer to) Ethernet address in "struct arpcom",
  and drop the IFP2ENADDR() macro; all users have been converted
  to use IF_LLADDR() instead.
2005-11-11 16:04:59 +00:00
wpaul
358decbed5 Implement RtlZeroMemory() and RtlCopyMemory(). This seems to allow
the Broadcom Win64 wireless driver for the BCM4318 to work on amd64.
2005-11-10 02:22:55 +00:00
wpaul
72f875157d Change the definition for EXT_NDIS to EXT_NET_DRV. Since the latest
mbuf code changes, MEXTADD() can be used to add an external buffer with
arbitrary type, but mb_ext_free() won't let you free it.
2005-11-07 16:57:14 +00:00
wpaul
6daa19f5d4 The latest version of the Intel 2200BG/2915ABG driver (9.0.0.3-9) from
Intel's web site requires some minor tweaks to get it to work:

- The driver seems to have been released with full WMI tracing enabled,
  and makes references to some WMI APIs, namely IoWMIRegistrationControl(),
  WmiQueryTraceInformation() and WmiTraceMessage(). Only the first
  one is ever called (during intialization). These have been implemented
  as do-nothing stubs for now. Also added a definition for STATUS_NOT_FOUND
  to ntoskrnl_var.h, which is used as a return code for one of the WMI
  routines.

- The driver references KeRaiseIrqlToDpcLevel() and KeLowerIrql()
  (the latter as a function, which is unusual because normally
  KeLowerIrql() is a macro in the Windows DDK that calls KfLowewIrql()).
  I'm not sure why these are being called since they're not really
  part of WDM. Presumeably they're being used for backwards
  compatibility with old versions of Windows. These have been
  implemented in subr_hal.c. (Note that they're _stdcall routines
  instead of _fastcall.)

- When querying the OID_802_11_BSSID_LIST OID to get a BSSID list,
  you don't know ahead of time how many networks the NIC has found
  during scanning, so you're allowed to pass 0 as the list length.
  This should cause the driver to return an 'insufficient resources'
  error and set the length to indicate how many bytes are actually
  needed. However for some reason, the Intel driver does not honor
  this convention: if you give it a length of 0, it returns some
  other error and doesn't tell you how much space is really needed.
  To get around this, if using a length of 0 yields anything besides
  the expected error case, we arbitrarily assume a length of 64K.
  This is similar to the hack that wpa_supplicant uses when doing
  a BSSID list query.
2005-11-06 19:38:34 +00:00
ps
0bae323c8a Copy out the number of iovecs in freebsd32_recvmsg, not the length
of a single iovec.
2005-11-06 18:12:43 +00:00
ps
e0951fe504 Calling setrlimit from 32bit apps could potentially increase certain
limits beyond what should be capiable in a 32bit process, so we
must fixup the limits.

Reviewed by:	jhb
2005-11-02 21:18:07 +00:00
wpaul
c104267c1a Tests with my dual Opteron system have shown that it's possible
for code to start out on one CPU when thunking into Windows
mode in ctxsw_utow(), and then be pre-empted and migrated to another
CPU before thunking back to UNIX mode in ctxsw_wtou(). This is
bad, because then we can end up looking at the wrong 'thread environment
block' when trying to come back to UNIX mode. To avoid this, we now
pin ourselves to the current CPU when thunking into Windows code.

Few other cleanups, since I'm here:

- Get rid of the ndis_isr(), ndis_enable_interrupt() and
  ndis_disable_interrupt() wrappers from kern_ndis.c and just invoke
  the miniport's methods directly in the interrupt handling routines
  in subr_ndis.c. We may as well lose the function call overhead,
  since we don't need to export these things outside of ndis.ko
  now anyway.

- Remove call to ndis_enable_interrupt() from ndis_init() in if_ndis.c.
  We don't need to do it there anyway (the miniport init routine handles
  it, if needed).

- Fix the logic in NdisWriteErrorLogEntry() a little.

- Change some NDIS_STATUS_xxx codes in subr_ntoskrnl.c into STATUS_xxx
  codes.

- Handle kthread_create() failure correctly in PsCreateSystemThread().
2005-11-02 18:01:04 +00:00
andre
0df84f5a83 Retire MT_HEADER mbuf type and change its users to use MT_DATA.
Having an additional MT_HEADER mbuf type is superfluous and redundant
as nothing depends on it.  It only adds a layer of confusion.  The
distinction between header mbuf's and data mbuf's is solely done
through the m->m_flags M_PKTHDR flag.

Non-native code is not changed in this commit.  For compatibility
MT_HEADER is mapped to MT_DATA.

Sponsored by:	TCP/IP Optimization Fundraise 2005
2005-11-02 13:46:32 +00:00
wpaul
87ccfda2f1 Clean up one remaining 'multiple DPC thread' bogon: only bzero() one
sizeof(kq_queue), not sizeof(kq_queue) * mp_ncpus.
2005-11-01 09:24:35 +00:00
ps
bd0529b5a0 Reformat socket control messages on input/output for 32bit compatibility
on 64bit systems.

Submitted by:	ps, ups
Reviewed by:	jhb
2005-10-31 21:09:56 +00:00
peter
234aa74f77 Regenerate (with the correct #ifdef COMPAT_43 tests now) 2005-10-26 22:21:03 +00:00
peter
a5633b701c There is no 'freebsd3_' prefix for COMPAT_43 syscalls. Those are all
bundled under MCOMPAT and have an 'o' prefix.  Adjust as appropriate.
This re-enables compiling without COMPAT_43 again.
2005-10-26 22:19:51 +00:00
wpaul
b382eabb5f Minor nit: in ntoskrnl_finddev(), only free the 'children' device_t
array if device_find_children() actually returned a non-NULL array pointer.
2005-10-26 20:21:45 +00:00
wpaul
6140104fb2 Clean up and apply the fix for PR 83477. The calculation for locating
the start of the section headers has to take into account the fact
that the image_nt_header is really variable sized. It happens that
the existing calculation is correct for _most_ production binaries
produced by the Windows DDK, but if we get a binary with oddball
offsets, the PE loader could crash.

Changes from the supplied patch are:

- We don't really need to use the IMAGE_SIZEOF_NT_HEADER() macro when
  computing how much of the header to return to callers of
  pe_get_optional_header(). While it's important to take the variable
  size of the header into account in other calculations, we never
  actually look at anything outside the non-variable portion of the
  header. This saves callers from having to allocate a variable sized
  buffer off the heap (I purposely tried to avoid using malloc()
  in subr_pe.c to make it easier to compile in both the -D_KERNEL and
  !-D_KERNEL case), and since we're copying into a buffer on the
  stack, we always have to copy the same amount of data or else
  we'll trash the stack something fierce.

- We need <stddef.h> to get offsetof() in the !-D_KERNEL case.

- ndiscvt.c needs the IMAGE_FIRST_SECTION() macro too, since it does
  a little bit of section pre-processing.

PR: kern/83477
2005-10-26 18:46:27 +00:00
wpaul
a160984738 Get rid of the timer tracking and reaping code in NdisMInitializeTimer()
and ndis_halt_nic(). It's been disabled for some time anyway, and
it turns out there's a possible deadlock in NdisMInitializeTimer() when
acquiring the miniport block lock to modify the timer list: it's
possible for a driver to call NdisMInitializeTimer() when the miniport
block lock has already been acquired by an earlier piece of code. You
can't acquire the same spinlock twice, so this can deadlock.

Also, implement MmMapIoSpace() and MmUnmapIoSpace(), and make
NdisMMapIoSpace() and NdisMUnmapIoSpace() use them. There are some
drivers that want MmMapIoSpace() and MmUnmapIoSpace() so that they can
map arbitrary register spaces not directly associated with their
device resources. For example, there's an Atheros driver for
a miniPci card (0x168C:0x1014) on the IBM Thinkpad x40 that wants
to map some I/O spaces at 0xF00000 and 0xE00000 which are held by
the acpi0 device. I don't know what it wants these ranges for,
but if it can't map and access them, the MiniportInitialize() method
fails.
2005-10-26 06:52:57 +00:00
wpaul
34370764e1 Fix handling of message table messages that got broken when I
converted NdisWriteErrorLogEntry() to use the RtlXXX unicode/ansi
conversion routines.
2005-10-24 05:05:09 +00:00
obrien
2546866153 Add a 'clean' target. 2005-10-23 23:58:23 +00:00
ps
349caab4d4 regen 2005-10-23 10:43:39 +00:00
ps
88dbbb6ebf Implement for FreeBSD 3 32 binaries:
sigaction, sigprocmask, sigpending, sigvec, sigblock, sigsetmask,
sigsuspend, sigstack
2005-10-23 10:43:14 +00:00
wpaul
be0b87ae93 Make the multiple DPC threads an option, and create only one by default.
This avoids the need for sched_bind() in the default case so that you
can start up the NDIS subsystem at boot time when only CPU 0 is running.

There are potentially ways to fix it so that the DPC threads aren't
started until after the other CPUs are launched, but doing it correctly
is tricky. You need to defer the startup of the ntoskrnl subsystem
(ntoskrnl_libinit()), not just defer ndis_attach().

For now, I don't think it will make much difference having just the
single DPC thread (I started out with just one anyway). Note that this
turns the KeSetTargetProcessorDpc() routine into a no-op, since the
CPU number in struct kdpc is now ignored.
2005-10-22 05:15:20 +00:00
wpaul
8be176b07b Correct the macro definition for KeRaiseIrql(). The official API
is KeRaiseIrql(newirql, &oldirql), not oldirql = KeRaiseIrql(newirql).
(The macro ultimately translates to KfRaiseIrql() which does use
the latter API, so this has no effect on generated code.)

Also, wait for thread termination the right way: kthread_exit()
will ultimately do a wakeup(td->td_proc). This is the event we
should wait on. Eliminate the previous synchronization machinery
for this since it was never guaranteed to work correctly.
2005-10-21 05:23:20 +00:00
wpaul
b76fb84305 Use sched_bind() to make sure the DPC threads are bound to the correct
processor, to insure DPC thread 0 runs on CPU0, DPC thread 1 runs on
CPU1, and so on.

Elevate the priority of the workitem threads, though don't use as
high a priority as the DPC threads.
2005-10-20 17:45:58 +00:00
davidxu
22847b1b84 Fix compiling problem by adding prefix name svr4 to si_xxx macro, the
si_xxx macro should not be used in compat headers, as these are standard
member names or only can be used in our native header file signal.h.
2005-10-19 09:33:15 +00:00
wpaul
81737fff08 Another round of cleanups and fixes:
- Change ndis_return() from a DPC to a workitem so that it doesn't
  run at DISPATCH_LEVEL (with the dispatcher lock held).

- In if_ndis.c, submit packets to the stack via (*ifp->if_input)() in
  a workitem instead of doing it directly in ndis_rxeof(), because
  ndis_rxeof() runs in a DPC, and hence at DISPATCH_LEVEL. This
  implies that the 'dispatch level' mutex for the current CPU is
  being held, and we don't want to call if_input while holding
  any locks.

- Reimplement IoConnectInterrupt()/IoDisconnectInterrupt(). The original
  approach I used to track down the interrupt resource (by scanning
  the device tree starting at the nexus) is prone to problems when
  two devices share an interrupt. (E.g removing ndis1 might disable
  interrupts for ndis0.) The new approach is to multiplex all the
  NDIS interrupts through a common internal dispatcher (ntoskrnl_intr())
  and allow IoConnectInterrupt()/IoDisconnectInterrupt() to add or
  remove interrupts from the dispatch list.

- Implement KeAcquireInterruptSpinLock() and KeReleaseInterruptSpinLock().

- Change the DPC and workitem threads to use the KeXXXSpinLock
  API instead of mtx_lock_spin()/mtx_unlock_spin().

- Simplify the NdisXXXPacket routines by creating an actual
  packet pool structure and using the InterlockedSList routines
  to manage the packet queue.

- Only honor the value returned by OID_GEN_MAXIMUM_SEND_PACKETS
  for serialized drivers. For deserialized drivers, we now create
  a packet array of 64 entries. (The Microsoft DDK documentation
  says that for deserialized miniports, OID_GEN_MAXIMUM_SEND_PACKETS
  is ignored, and the driver for the Marvell 8335 chip, which is
  a deserialized miniport, returns 1 when queried.)

- Clean up timer handling in subr_ntoskrnl.

- Add the following conditional debugging code:
	NTOSKRNL_DEBUG_TIMERS - add debugging and stats for timers
	NDIS_DEBUG_PACKETS - add extra sanity checking for NdisXXXPacket API
	NTOSKRNL_DEBUG_SPINLOCKS - add test for spinning too long

- In kern_ndis.c, always start the HAL first and shut it down last,
  since Windows spinlocks depend on it. Ntoskrnl should similarly be
  started second and shut down next to last.
2005-10-18 19:52:15 +00:00
ps
5dceb7fc4d regen after recvmsg, recvfrom, sendmsg 2005-10-15 05:57:34 +00:00
ps
a72385743d Implement the 32bit versions of recvmsg, recvfrom, sendmsg
Partially obtained from:	jhb
2005-10-15 05:57:06 +00:00
ps
36d05b2d9f regen for clock_gettime, clock_settime, clock_getres 2005-10-15 02:54:39 +00:00
ps
c63313b35d Implement 32bit wrappers for clock_gettime, clock_settime, and
clock_getres.
2005-10-15 02:54:18 +00:00
ps
a936b63f3d regen 2005-10-15 02:40:34 +00:00
ps
c7a8fca230 Correct the prototype for freebsd32_nanosleep and use the proper
size when copying struct timespec32 in and out.
2005-10-15 02:40:10 +00:00
davidxu
3fbdb3c215 1. Change prototype of trapsignal and sendsig to use ksiginfo_t *, most
changes in MD code are trivial, before this change, trapsignal and
   sendsig use discrete parameters, now they uses member fields of
   ksiginfo_t structure. For sendsig, this change allows us to pass
   POSIX realtime signal value to user code.

2. Remove cpu_thread_siginfo, it is no longer needed because we now always
   generate ksiginfo_t data and feed it to libpthread.

3. Add p_sigqueue to proc structure to hold shared signals which were
   blocked by all threads in the proc.

4. Add td_sigqueue to thread structure to hold all signals delivered to
   thread.

5. i386 and amd64 now return POSIX standard si_code, other arches will
   be fixed.

6. In this sigqueue implementation, pending signal set is kept as before,
   an extra siginfo list holds additional siginfo_t data for signals.
   kernel code uses psignal() still behavior as before, it won't be failed
   even under memory pressure, only exception is when deleting a signal,
   we should call sigqueue_delete to remove signal from sigqueue but
   not SIGDELSET. Current there is no kernel code will deliver a signal
   with additional data, so kernel should be as stable as before,
   a ksiginfo can carry more information, for example, allow signal to
   be delivered but throw away siginfo data if memory is not enough.
   SIGKILL and SIGSTOP have fast path in sigqueue_add, because they can
   not be caught or masked.
   The sigqueue() syscall allows user code to queue a signal to target
   process, if resource is unavailable, EAGAIN will be returned as
   specification said.
   Just before thread exits, signal queue memory will be freed by
   sigqueue_flush.
   Current, all signals are allowed to be queued, not only realtime signals.

Earlier patch reviewed by: jhb, deischen
Tested on: i386, amd64
2005-10-14 12:43:47 +00:00
wpaul
0ce580e541 Convert ndis_set_info() and ndis_get_info() from using msleep()
to KeSetEvent()/KeWaitForSingleObject(). Also make object argument
of KeWaitForSingleObject() a void * like it's supposed to be.
2005-10-12 03:02:50 +00:00
wpaul
ef07dbe57f This commit makes a big round of updates and fixes many, many things.
First and most importantly, I threw out the thread priority-twiddling
implementation of KeRaiseIrql()/KeLowerIrq()/KeGetCurrentIrql() in
favor of a new scheme that uses sleep mutexes. The old scheme was
really very naughty and sought to provide the same behavior as
Windows spinlocks (i.e. blocking pre-emption) but in a way that
wouldn't raise the ire of WITNESS. The new scheme represents
'DISPATCH_LEVEL' as the acquisition of a per-cpu sleep mutex. If
a thread on cpu0 acquires the 'dispatcher mutex,' it will block
any other thread on the same processor that tries to acquire it,
in effect only allowing one thread on the processor to be at
'DISPATCH_LEVEL' at any given time. It can then do the 'atomic sit
and spin' routine on the spinlock variable itself. If a thread on
cpu1 wants to acquire the same spinlock, it acquires the 'dispatcher
mutex' for cpu1 and then it too does an atomic sit and spin to try
acquiring the spinlock.

Unlike real spinlocks, this does not disable pre-emption of all
threads on the CPU, but it does put any threads involved with
the NDISulator to sleep, which is just as good for our purposes.

This means I can now play nice with WITNESS, and I can safely do
things like call malloc() when I'm at 'DISPATCH_LEVEL,' which
you're allowed to do in Windows.

Next, I completely re-wrote most of the event/timer/mutex handling
and wait code. KeWaitForSingleObject() and KeWaitForMultipleObjects()
have been re-written to use condition variables instead of msleep().
This allows us to use the Windows convention whereby thread A can
tell thread B "wake up with a boosted priority." (With msleep(), you
instead have thread B saying "when I get woken up, I'll use this
priority here," and thread A can't tell it to do otherwise.) The
new KeWaitForMultipleObjects() has been better tested and better
duplicates the semantics of its Windows counterpart.

I also overhauled the IoQueueWorkItem() API and underlying code.
Like KeInsertQueueDpc(), IoQueueWorkItem() must insure that the
same work item isn't put on the queue twice. ExQueueWorkItem(),
which in my implementation is built on top of IoQueueWorkItem(),
was also modified to perform a similar test.

I renamed the doubly-linked list macros to give them the same names
as their Windows counterparts and fixed RemoveListTail() and
RemoveListHead() so they properly return the removed item.

I also corrected the list handling code in ntoskrnl_dpc_thread()
and ntoskrnl_workitem_thread(). I realized that the original logic
did not correctly handle the case where a DPC callout tries to
queue up another DPC. It works correctly now.

I implemented IoConnectInterrupt() and IoDisconnectInterrupt() and
modified NdisMRegisterInterrupt() and NdisMDisconnectInterrupt() to
use them. I also tried to duplicate the interrupt handling scheme
used in Windows. The interrupt handling is now internal to ndis.ko,
and the ndis_intr() function has been removed from if_ndis.c. (In
the USB case, interrupt handling isn't needed in if_ndis.c anyway.)

NdisMSleep() has been rewritten to use a KeWaitForSingleObject()
and a KeTimer, which is how it works in Windows. (This is mainly
to insure that the NDISulator uses the KeTimer API so I can spot
any problems with it that may arise.)

KeCancelTimer() has been changed so that it only cancels timers, and
does not attempt to cancel a DPC if the timer managed to fire and
queue one up before KeCancelTimer() was called. The Windows DDK
documentation seems to imply that KeCantelTimer() will also call
KeRemoveQueueDpc() if necessary, but it really doesn't.

The KeTimer implementation has been rewritten to use the callout API
directly instead of timeout()/untimeout(). I still cheat a little in
that I have to manage my own small callout timer wheel, but the timer
code works more smoothly now. I discovered a race condition using
timeout()/untimeout() with periodic timers where untimeout() fails
to actually cancel a timer. I don't quite understand where the race
is, using callout_init()/callout_reset()/callout_stop() directly
seems to fix it.

I also discovered and fixed a bug in winx32_wrap.S related to
translating _stdcall calls. There are a couple of routines
(i.e. the 64-bit arithmetic intrinsics in subr_ntoskrnl) that
return 64-bit quantities. On the x86 arch, 64-bit values are
returned in the %eax and %edx registers. However, it happens
that the ctxsw_utow() routine uses %edx as a scratch register,
and x86_stdcall_wrap() and x86_stdcall_call() were only preserving
%eax before branching to ctxsw_utow(). This means %edx was getting
clobbered in some cases. Curiously, the most noticeable effect of this
bug is that the driver for the TI AXC110 chipset would constantly drop
and reacquire its link for no apparent reason. Both %eax and %edx
are preserved on the stack now. The _fastcall and _regparm
wrappers already handled everything correctly.

I changed if_ndis to use IoAllocateWorkItem() and IoQueueWorkItem()
instead of the NdisScheduleWorkItem() API. This is to avoid possible
deadlocks with any drivers that use NdisScheduleWorkItem() themselves.

The unicode/ansi conversion handling code has been cleaned up. The
internal routines have been moved to subr_ntoskrnl and the
RtlXXX routines have been exported so that subr_ndis can call them.
This removes the incestuous relationship between the two modules
regarding this code and fixes the implementation so that it honors
the 'maxlen' fields correctly. (Previously it was possible for
NdisUnicodeStringToAnsiString() to possibly clobber memory it didn't
own, which was causing many mysterious crashes in the Marvell 8335
driver.)

The registry handling code (NdisOpen/Close/ReadConfiguration()) has
been fixed to allocate memory for all the parameters it hands out to
callers and delete whem when NdisCloseConfiguration() is called.
(Previously, it would secretly use a single static buffer.)

I also substantially updated if_ndis so that the source can now be
built on FreeBSD 7, 6 and 5 without any changes. On FreeBSD 5, only
WEP support is enabled. On FreeBSD 6 and 7, WPA-PSK support is enabled.

The original WPA code has been updated to fit in more cleanly with
the net80211 API, and to eleminate the use of magic numbers. The
ndis_80211_setstate() routine now sets a default authmode of OPEN
and initializes the RTS threshold and fragmentation threshold.
The WPA routines were changed so that the authentication mode is
always set first, followed by the cipher. Some drivers depend on
the operations being performed in this order.

I also added passthrough ioctls that allow application code to
directly call the MiniportSetInformation()/MiniportQueryInformation()
methods via ndis_set_info() and ndis_get_info(). The ndis_linksts()
routine also caches the last 4 events signalled by the driver via
NdisMIndicateStatus(), and they can be queried by an application via
a separate ioctl. This is done to allow wpa_supplicant to directly
program the various crypto and key management options in the driver,
allowing things like WPA2 support to work.

Whew.
2005-10-10 16:46:39 +00:00
jhb
2cc06a19bc Use the constants for the syscall names from syscall.h rather than
hardcoding the numbers for the SYSVIPC syscalls.
2005-10-03 18:34:17 +00:00
rwatson
2b01dbdaa0 Back out alpha/alpha/trap.c:1.124, osf1_ioctl.c:1.14, osf1_misc.c:1.57,
osf1_signal.c:1.41, amd64/amd64/trap.c:1.291, linux_socket.c:1.60,
svr4_fcntl.c:1.36, svr4_ioctl.c:1.23, svr4_ipc.c:1.18, svr4_misc.c:1.81,
svr4_signal.c:1.34, svr4_stat.c:1.21, svr4_stream.c:1.55,
svr4_termios.c:1.13, svr4_ttold.c:1.15, svr4_util.h:1.10,
ext2_alloc.c:1.43, i386/i386/trap.c:1.279, vm86.c:1.58,
unaligned.c:1.12, imgact_elf.c:1.164, ffs_alloc.c:1.133:

Now that Giant is acquired in uprintf() and tprintf(), the caller no
longer leads to acquire Giant unless it also holds another mutex that
would generate a lock order reversal when calling into these functions.
Specifically not backed out is the acquisition of Giant in nfs_socket.c
and rpcclnt.c, where local mutexes are held and would otherwise violate
the lock order with Giant.

This aligns this code more with the eventual locking of ttys.

Suggested by:	bde
2005-09-28 07:03:03 +00:00
peter
15eec54c70 Regenerate 2005-09-27 18:04:52 +00:00
peter
fe69f6532f Implement 32 bit getcontext/setcontext/swapcontext on amd64. I've added
stubs for ia64 to keep it compiling.  These are used by 32 bit apps such
as gdb.
2005-09-27 18:04:20 +00:00
rwatson
c479a90eb8 Add GIANT_REQUIRED and WITNESS sleep warnings to uprintf() and tprintf(),
as they both interact with the tty code (!MPSAFE) and may sleep if the
tty buffer is full (per comment).

Modify all consumers of uprintf() and tprintf() to hold Giant around
calls into these functions.  In most cases, this means adding an
acquisition of Giant immediately around the function.  In some cases
(nfs_timer()), it means acquiring Giant higher up in the callout.

With these changes, UFS no longer panics on SMP when either blocks are
exhausted or inodes are exhausted under load due to races in the tty
code when running without Giant.

NB: Some reduction in calls to uprintf() in the svr4 code is probably
desirable.

NB: In the case of nfs_timer(), calling uprintf() while holding a mutex,
or even in a callout at all, is a bad idea, and will generate warnings
and potential upset.  This needs to be fixed, but was a problem before
this change.

NB: uprintf()/tprintf() sleeping is generally a bad ideas, as is having
non-MPSAFE tty code.

MFC after:	1 week
2005-09-19 16:51:43 +00:00
andre
84937ac44f Test the mbuf flags against the correct constant. The previous version
worked as intended but only by chance.  MT_HEADER == M_PKTHDR == 0x2.
2005-08-30 16:21:51 +00:00
delphij
c829bf7c00 Fix kernel build.
Reported by:	tinderbox
2005-08-28 13:11:08 +00:00
rodrigc
3627dcf262 Rewrite linux_ifconf() to be more like ifconf() in net/if.c
so that we do not call uiomove() while IFNET_RLOCK() is held.
This eliminates the witness warning:

Calling uiomove() with the following non-sleepable locks held:
exclusive sleep mutex ifnet r = 0 (0xc096dd60) locked @
/usr/src/sys/modules/linux/../../compat/linux/linux_ioctl.c:2170

MFC after:	2 days
2005-08-27 14:44:10 +00:00
rwatson
5d770a09e8 Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags.  Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags.  This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by:	pjd, bz
MFC after:	7 days
2005-08-09 10:20:02 +00:00
jhb
42b3b6a20e Add missing dependencies on the SYSVIPC modules. 2005-07-29 19:41:04 +00:00
jhb
114f6b764d Move MODULE_DEPEND() statements for SYSVIPC dependencies to linux_ipc.c
so that they aren't duplicated 3 times and are also in the same file as
the code that depends on the SYSVIPC modules.
2005-07-29 19:40:39 +00:00
jhb
8ca187d620 Regen. 2005-07-13 20:35:09 +00:00
jhb
7e35629af2 Make a pass through all the compat ABIs sychronizing the MP safe flags
with the master syscall table as well as marking several ABI wrapper
functions safe.

MFC after:	1 week
2005-07-13 20:32:42 +00:00
jhb
7010162a5e Regen. 2005-07-13 15:14:54 +00:00
jhb
ce792db906 - Stop hardcoding #define's for options and use the appropriate
opt_foo.h headers instead.
- Hook up the IPC SVR4 syscalls.

MFC after:	3 days
2005-07-13 15:14:33 +00:00
jhb
6c122e4c77 Wrap the ia64-specific freebsd32_mmap_partial() hack in Giant for now
since it calls into VFS and VM.  This makes the freebsd32_mmap() routine
MP safe and the extra Giants here can be revisited later.

Glanced at by:	marcel
MFC after:	3 days
2005-07-13 15:12:19 +00:00
jhb
d7eebc79f5 Add Giant around linux_getcwd_common() in linux_getcwd().
Approved by:	re (scottl)
2005-07-09 12:34:49 +00:00
jhb
8816876fa9 Add missing locking to linux_connect() so that it can be marked MP safe:
- Conditionally grab Giant around the EISCONN hack at the end based on
  debug.mpsafenet.
- Protect access to so_emuldata via SOCK_LOCK.

Reviewed by:	rwatson
Approved by:	re (scottl)
2005-07-09 12:26:22 +00:00
rik
d732ff5ca2 Use implicit type cast for ->k_lock to fix compilation of ndis
as a part of the GENERIC kernel with INVARIANT* and WITNESS*
turned off.
(For non GENERIC kernel KTR and MUTEX_PROFILING should be also
off).

Submitted by:	Eygene A. Ryabinkin <rea at rea dot mbslab dot kiae dot ru>
Approved by:	re (scottl)
PR:		81767
2005-07-08 18:36:59 +00:00
jhb
89c65bc296 Lock Giant in svr4_add_socket() so that the various svr4_*stat() calls
can be marked MP safe as this is the only part of them that is not
already MP safe.

Approved by:	re (scottl)
2005-07-07 19:27:29 +00:00
jhb
f2ca5e0f59 Remove an unused syscallarg() macro leftover from this code's origins in
NetBSD.

Approved by:	re (scottl)
2005-07-07 19:26:43 +00:00
jhb
64250ecf1e Rototill this file so that it actually compiles. It doesn't do anything
in the build still due to some #undef's in svr4.h, but if you hack around
that and add some missing entries to syscalls.master, then this file will
now compile.  The changes involved proc -> thread, using FreeBSD syscall
names instead of NetBSD, and axeing syscallarg() and retval arguments.

Approved by:	re (scottl)
2005-07-07 19:25:47 +00:00
jhb
d7828dd231 Fix the computation of uptime for linux_sysinfo(). Before it was returning
the uptime in seconds mod 60 which wasn't very useful.

Approved by:	re (scottl)
2005-07-07 19:17:55 +00:00
jhb
f962c71215 Regenerate.
Approved by:	re (scottl)
2005-07-07 18:20:38 +00:00
jhb
cf15cbb1b6 - Add two new system calls: preadv() and pwritev() which are like readv()
and writev() except that they take an additional offset argument and do
  not change the current file position.  In SAT speak:
  preadv:readv::pread:read and pwritev:writev::pwrite:write.
- Try to reduce code duplication some by merging most of the old
  kern_foov() and dofilefoo() functions into new dofilefoo() functions
  that are called by kern_foov() and kern_pfoov().  The non-v functions
  now all generate a simple uio on the stack from the passed in arguments
  and then call kern_foov().  For example, read() now just builds a uio and
  calls kern_readv() and pwrite() just builds a uio and calls kern_pwritev().

PR:		kern/80362
Submitted by:	Marc Olzheim marcolz at stack dot nl (1)
Approved by:	re (scottl)
MFC after:	1 week
2005-07-07 18:17:55 +00:00
peter
921b3c5ee4 Jumbo-commit to enhance 32 bit application support on 64 bit kernels.
This is good enough to be able to run a RELENG_4 gdb binary against
a RELENG_4 application, along with various other tools (eg: 4.x gcore).
We use this at work.

ia32_reg.[ch]: handle the 32 bit register file format, used by ptrace,
	procfs and core dumps.
procfs_*regs.c: vary the format of proc/XXX/*regs depending on the client
	and target application.
procfs_map.c: Don't print a 64 bit value to 32 bit consumers, or their
	sscanf fails.  They expect an unsigned long.
imgact_elf.c: produce a valid 32 bit coredump for 32 bit apps.
sys_process.c: handle 32 bit consumers debugging 32 bit targets.  Note
	that 64 bit consumers can still debug 32 bit targets.

IA64 has got stubs for ia32_reg.c.

Known limitations: a 5.x/6.x gdb uses get/setcontext(), which isn't
implemented in the 32/64 wrapper yet.  We also make a tiny patch to
gdb pacify it over conflicting formats of ld-elf.so.1.

Approved by:	re
2005-06-30 07:49:22 +00:00
jhb
62d0fed7ec - Change the commented out freebsd32_xxx() example to use kern_xxx() along
with a single copyin() + translate and translate + copyout() rather than
  using the stackgap.
- Remove implementation of the stackgap for freebsd32 since it is no longer
  used for that compat ABI.

Approved by:	re (scottl)
2005-06-29 15:16:20 +00:00
jhb
0a8b4194dc Correct the amount of data to allocate in these local copies of
exec_copyin_strings() to catch up to rev 1.266 of kern_exec.c.  This fixes
panics on amd64 with compat binaries since exec_free_args() was freeing
more memory than these functions were allocating and the mismatch could
cause memory to be freed out from under other concurrent execs.

Approved by:	re (scottl)
2005-06-24 17:41:28 +00:00
pjd
a99a8a69bd Actually only protect mount-point if security.jail.enforce_statfs is set to 2.
If we don't return statistics about requested file systems, system tools
may not work correctly or at all.

Approved by:	re (scottl)
2005-06-23 22:13:29 +00:00
pjd
be79126844 Do not allocate memory based on not-checked argument from userland.
It can be used to panic the kernel by giving too big value.
Fix it by moving allocation and size verification into kern_getfsstat().
This even simplifies kern_getfsstat() consumers, but destroys symmetry -
memory is allocated inside kern_getfsstat(), but has to be freed by the
caller.

Found by:	FreeBSD Kernel Stress Test Suite: http://www.holm.cc/stress/
Reported by:	Peter Holm <peter@holm.cc>
2005-06-11 14:58:20 +00:00
brooks
567ba9b00a Stop embedding struct ifnet at the top of driver softcs. Instead the
struct ifnet or the layer 2 common structure it was embedded in have
been replaced with a struct ifnet pointer to be filled by a call to the
new function, if_alloc(). The layer 2 common structure is also allocated
via if_alloc() based on the interface type. It is hung off the new
struct ifnet member, if_l2com.

This change removes the size of these structures from the kernel ABI and
will allow us to better manage them as interfaces come and go.

Other changes of note:
 - Struct arpcom is no longer referenced in normal interface code.
   Instead the Ethernet address is accessed via the IFP2ENADDR() macro.
   To enforce this ac_enaddr has been renamed to _ac_enaddr.
 - The second argument to ether_ifattach is now always the mac address
   from driver private storage rather than sometimes being ac_enaddr.

Reviewed by:	sobomax, sam
2005-06-10 16:49:24 +00:00
pjd
47f442bcb9 Rename sysctl security.jail.getfsstatroot_only to security.jail.enforce_statfs
and extend its functionality:

value	policy
0	show all mount-points without any restrictions
1	show only mount-points below jail's chroot and show only part of the
	mount-point's path (if jail's chroot directory is /jails/foo and
	mount-point is /jails/foo/usr/home only /usr/home will be shown)
2	show only mount-point where jail's chroot directory is placed.

Default value is 2.

Discussed with:	rwatson
2005-06-09 18:49:19 +00:00
pjd
3af857a21a Avoid code duplication in serval places by introducing universal
kern_getfsstat() function.

Obtained from:	jhb
2005-06-09 17:44:46 +00:00
sobomax
307c6bb149 Properly convert FreeBSD priority values into Linux values in the
getpriority(2) syscall.

PR:		kern/81951
Submitted by:	Andriy Gapon <avg@icyb.net.ua>
2005-06-08 20:41:28 +00:00
ps
bac0ce72d5 Wrap copyin/copyout for kevent so the 32bit wrapper does not have
to malloc nchanges * sizeof(struct kevent) AND/OR nevents *
sizeof(struct kevent) on every syscall.

Glanced at by:	peter, jmg
Obtained from:	Yahoo!
MFC after:	2 weeks
2005-06-03 23:15:01 +00:00
rwatson
5010364761 Rebuild generated system call definition files following the addition of
the audit event field to the syscalls.master file format.

Submitted by:	wsalamon
Obtained from:	TrustedBSD Project
2005-05-30 15:20:21 +00:00
rwatson
370e72b242 Introduce a new field in the syscalls.master file format to hold the
audit event identifier associated with each system call, which will
be stored by makesyscalls.sh in the sy_auevent field of struct sysent.
For now, default the audit identifier on all system calls to AUE_NULL,
but in the near future, other BSM event identifiers will be used.  The
mapping of system calls to event identifiers is many:one due to
multiple system calls that map to the same end functionality across
compatibility wrappers, ABI wrappers, etc.

Submitted by:	wsalamon
Obtained from:	TrustedBSD Project
2005-05-30 15:09:18 +00:00
nyan
0fce92f5c4 Remove bus_{mem,p}io.h and related code for a micro-optimization on i386
and amd64.  The optimization is a trivial on recent machines.

Reviewed by:	-arch (imp, marcel, dfr)
2005-05-29 04:42:30 +00:00
pjd
311d5e1182 Remove (now) unused argument 'td' from bsd_to_linux_statfs(). 2005-05-27 19:25:39 +00:00
ps
5f0ea7b68e Copyout to userland if kern_sigaction succeeds 2005-05-24 17:52:14 +00:00
pjd
abf2289872 The code is under '#ifdef not_that_way', but anyway:
- Add missing prison_check_mount() check.
2005-05-22 22:30:31 +00:00
pjd
a6e0e217b2 If we need to hide fsid, kern_statfs()/kern_fstatfs() will do it for us,
so do not duplicate the code in cvtstatfs().
Note, that we now need to clear fsid in freebsd4_getfsstat().

This moves all security related checks from functions like cvtstatfs()
and will allow to add more security related stuff (like statfs(2), etc.
protection for jails) a bit easier.
2005-05-22 21:52:30 +00:00
wpaul
df85175aeb Missed kern_windrv.c in the last checkin. 2005-05-20 04:01:36 +00:00
wpaul
29dcab8107 Deal with a few bootstrap issues:
We can't call KeFlushQueuedDpcs() during bootstrap (cold == 1), since
the flush operation sleeps to wait for completion, and we can't sleep
here (clowns will eat us).

On an i386 SMP system, if we're loaded/probed/attached during bootstrap,
smp_rendezvous() won't run us anywhere except CPU 0 (since the other CPUs
aren't launched until later), which means we won't be able to set up
the GDTs anywhere except CPU 0. To deal with this case, ctxsw_utow()
now checks to see if the TID for the current processor has been properly
initialized and sets up the GTD for the current CPU if not.

Lastly, in if_ndis.c:ndis_shutdown(), do an ndis_stop() to insure we
really halt the NIC and stop interrupts from happening.

Note that loading a driver during bootstrap is, unfortunately, kind of
a hit or miss sort of proposition. In Windows, the expectation is that
by the time a given driver's MiniportInitialize() method is called,
the system is already in 'multiuser' state, i.e. it's up and running
enough to support all the stuff specified in the NDIS API, which includes
the underlying OS-supplied facilities it implicitly depends on, such as
having all CPUs running, having the DPC queues initialized, WorkItem
threads running, etc. But in UNIX, a lot of that stuff won't work during
bootstrap. This causes a problem since we need to call MiniportInitialize()
at least once during ndis_attach() in order to find out what kind of NIC
we have and learn its station address.

What this means is that some cards just plain won't work right if
you try to pre-load the driver along with the kernel: they'll only be
probed/attach correctly if the driver is kldloaded _after_ the system
has reached multiuser. I can't really think of a way around this that
would still preserve the ability to use an NDIS device for diskless
booting.
2005-05-20 04:00:50 +00:00
wpaul
cb815ff30f In ndis_halt_nic(), invalidate the miniportadapterctx early to try and
prevent anything from making calls to the NIC while it's being shut down.
This is yet another attempt to stop things like mdnsd from trying to
poke at the card while it's not properly initialized and panicking
the system.

Also, remove unneeded debug message from if_ndis.c.
2005-05-20 02:35:43 +00:00
wpaul
8e4107ff8f Fix some of the things I broke so that the SMC2602W (AMD Am1772) driver
works again.

This driver uses NdisScheduleWorkItem(), and we have to take special steps
to insure that its workitems don't collide with any of the other workitems
used by the NDISulator. In particular, if one of the driver's work jobs
blocks, it can prevent NdisMAllocateSharedMemoryAsync() from completing
when expected.

The original hack to fix this was to have NdisMAllocateSharedMemoryAsync()
defer its work to the DPC queue instead of the general task queue. To
fix it now, I decided to add some additional workitem threads. (There's
supposed to be a pool of worker threads in Windows anyway.) Currently,
there are 4. There should be at least 2. One is reserved for the legacy
ExQueueWorkItem() API, while the others are used in round-robin by the
IoQueueWorkItem() API. NdisMAllocateSharedMemoryAsync() uses the latter
API while NdisScheduleWorkItem() uses the former, so the deadlock is
avoided.

Fixed NdisMRegisterDevice()/NdisMDeregisterDevice() to work a little
more sensibly with the new driver_object/device_object framework. It
doesn't really register a working user-mode interface, but the existing
code was completely wrong for the new framework.

Fixed a couple of bugs dealing with the cancellation of events and
DPCs. When cancelling an event that's still on the timer queue (i.e.
hasn't expired yet), reset dh_inserted in its dispatch header to FALSE.
Previously, it was left set to TRUE, which would make a cancelled
timer appear to have not been cancelled. Also, when removing a DPC
from a queue, reset its list pointers, otherwise a cancelled DPC
might mistakenly be treated as still pending.

Lastly, fix the behavior of ntoskrnl_wakeup() when dealing with
objects that have nobody waiting on them: sync event objects get
their signalled state reset to FALSE, but notification objects
should still be set to TRUE.
2005-05-19 04:44:26 +00:00
wpaul
3e9d45596e Remove harmless bit of leftover debug code. 2005-05-16 15:44:41 +00:00
wpaul
0228d4cac8 Correct some problems with workitem usage. NdisScheduleWorkItem() does
not use exactly the same workitem sturcture as ExQueueWorkItem() like
I originally thought it did.
2005-05-16 15:29:21 +00:00
wpaul
09647ee931 Add support for NdisMEthIndicateReceive() and MiniportTransferData().
The Ralink RT2500 driver uses this API instead of NdisMIndicateReceivePacket().

Drivers use NdisMEthIndicateReceive() when they know they support
802.3 media and expect to hand their packets only protocols that want
to deal with that particular media type. With this API, the driver does
not manage its own NDIS_PACKET/NDIS_BUFFER structures. Instead, it
lets bound protocols have a peek at the data, and then they supply
an NDIS_PACKET/NDIS_BUFFER combo to the miniport driver, into which
it copies the packet data.

Drivers use NdisMIndicateReceivePacket() to allow their packets to
be read by any protocol, not just those bound to 802.3 media devices.

To make this work, we need an internal pool of NDIS_PACKETS for
receives. Currently, we check to see if the driver exports a
MiniportTransferData() method in its characteristics structure,
and only allocate the pool for drivers that have this method.

This should allow the RT2500 driver to work correctly, though I
still have to fix ndiscvt(8) to parse its .inf file properly.

Also, change kern_ndis.c:ndis_halt_nic() to reap timers before
acquiring NDIS_LOCK(), since the reaping process might entail sleeping
briefly (and we can't sleep with a lock held).
2005-05-15 04:27:59 +00:00
wpaul
51b4d0ab71 More fixes for multibus drivers. When calling out to the match
function in if_ndis_pci.c and if_ndis_pccard.c, provide the bustype
too so the stubs can ignore devlists that don't concern them.
2005-05-08 23:19:20 +00:00
wpaul
ebc77ad893 Fix support for Windows drivers that support both PCI and PCMCIA devices at
the same time.

Fix if_ndis_pccard.c so that it sets sc->ndis_dobj and sc->ndis_regvals.

Correct IMPORT_SFUNC() macros for the READ_PORT_BUFFER_xxx() routines,
which take 3 arguments, not 2.

This fixes it so that the Windows driver for my Cisco Aironet 340 PCMCIA
card works again. (Yes, I know the an(4) driver supports this card natively,
but it's the only PCMCIA device I have with a Windows XP driver.)
2005-05-08 23:07:51 +00:00
wpaul
36f8fdfd36 Correct the patch table entries for the 64-bit intrinsic math
routines (_alldiv(), _allmul(), _alludiv(), _aullmul(), etc...)
that use the _stdcall calling convention.

These routines all take two arguments, but the arguments are 64 bits wide.
On the i386 this means they each consume two 32-bit slots on the stack.
Consequently, when we specify the argument count in the IMPORT_SFUNC()
macro, we have to lie and claim there are 4 arguments instead of two.
This will cause the resulting i386 assembly wrapper to push the right
number of longwords onto the stack.

This fixes a crash I discovered with the RealTek 8180 driver, which
uses these routines a lot during initialization.
2005-05-08 09:16:33 +00:00
wpaul
d2ae5c8a71 Cast 64 bit quantity to uintmax_t to print it with %jx. This is
technically a no-op since uintmax_t is uint64_t on all currently
supported architectures, but we should use an explicit cast instead
of depending on this obscure coincidence.
2005-05-05 22:33:06 +00:00
wpaul
819bac8d1e Use %jx instead of %qx to silence compiler warning on amd64. 2005-05-05 15:56:41 +00:00
wpaul
077b71e0fa Avoid sleeping with mutex held in kern_ndis.c.
Remove unused fields from ndis_miniport_block.

Fix a bug in KeFlushQueuedDpcs() (we weren't calculating the kq pointer
correctly).

In if_ndis.c, clear the IFF_RUNNING flag before calling ndis_halt_nic().

Add some guards in kern_ndis.c to avoid letting anyone invoke ndis_get_info()
or ndis_set_info() if the NIC isn't fully initialized. Apparently, mdnsd
will sometimes try to invoke the ndis_ioctl() routine at exactly the
wrong moment (to futz with its multicast filters) when the interface
comes up, and can trigger a crash unless we guard against it.
2005-05-05 06:14:59 +00:00
wpaul
bb2b136388 Remove extranaous free() of ASCII filename from NdisOpenFile().
Oh, one additional change I forgot to mention in the last commit:
NdisOpenFile() was broken in the case for firmware files that were
pre-loaded as modules. When searching for the module in NdisOpenFile(),
we would match against a symbol name, which would contain the string
we were looking for, then save a pointer to the linker file handle.
Later, in NdisMapFile(), we would refer to the filename hung off
this handle when trying to find the starting address symbol. Only
problem is, this filename is different from the embedded symbol
name we're searching for, so the mapping would fail. I found this
problem while testing the AirGo driver, which requires a small
firmware file.
2005-05-05 04:16:13 +00:00
wpaul
e9bace5ba1 This commit makes a bunch of changes, some big, some not so big.
- Remove the old task threads from kern_ndis.c and reimplement them in
  subr_ntoskrnl.c, in order to more properly emulate the Windows DPC
  API. Each CPU gets its own DPC queue/thread, and each queue can
  have low, medium and high importance DPCs. New APIs implemented:
  KeSetTargetProcessorDpc(), KeSetImportanceDpc() and KeFlushQueuedDpcs().
  (This is the biggest change.)

- Fix a bug in NdisMInitializeTimer(): the k_dpc pointer in the
  nmt_timer embedded in the ndis_miniport_timer struct must be set
  to point to the DPC, also embedded in the struct. Failing to do
  this breaks dequeueing of DPCs submitted via timers, and in turn
  breaks cancelling timers.

- Fix a bug in KeCancelTimer(): if the timer is interted in the timer
  queue (i.e. the timeout callback is still pending), we have to both
  untimeout() the timer _and_ call KeRemoveQueueDpc() to nuke the DPC
  that might be pending. Failing to do this breaks cancellation of
  periodic timers, which always appear to be inserted in the timer queue.

- Make use of the nmt_nexttimer field in ndis_miniport_timer: keep a
  queue of pending timers and cancel them all in ndis_halt_nic(), prior
  to calling MiniportHalt(). Also call KeFlushQueuedDpcs() to make sure
  any DPCs queued by the timers have expired.

- Modify NdisMAllocateSharedMemory() and NdisMFreeSharedMemory() to keep
  track of both the virtual and physical addresses of the shared memory
  buffers that get handed out. The AirGo MIMO driver appears to have a bug
  in it: for one of the segments is allocates, it returns the wrong
  virtual address. This would confuse NdisMFreeSharedMemory() and cause
  a crash. Why it doesn't crash Windows too I have no idea (from reading
  the documentation for NdisMFreeSharedMemory(), it appears to be a violation
  of the API).

- Implement strstr(), strchr() and MmIsAddressValid().

- Implement IoAllocateWorkItem(), IoFreeWorkItem(), IoQueueWorkItem() and
  ExQueueWorkItem(). (This is the second biggest change.)

- Make NdisScheduleWorkItem() call ExQueueWorkItem(). (Note that the
  ExQueueWorkItem() API is deprecated by Microsoft, but NDIS still uses
  it, since NdisScheduleWorkItem() is incompatible with the IoXXXWorkItem()
  API.)

- Change if_ndis.c to use the NdisScheduleWorkItem() interface for scheduling
  tasks.

With all these changes and fixes, the AirGo MIMO driver for the Belkin
F5D8010 Pre-N card now works. Special thanks to Paul Robinson
(paul dawt robinson at pwermedia dawt net) for the loan of a card
for testing.
2005-05-05 03:56:09 +00:00
jeff
f869be5c72 - Pass the ISOPEN flag to namei so filesystems will know we're about to
open them or otherwise access the data.
2005-04-27 09:05:19 +00:00
wpaul
b493dd59e2 Throw the switch on the new driver generation/loading mechanism. From
here on in, if_ndis.ko will be pre-built as a module, and can be built
into a static kernel (though it's not part of GENERIC). Drivers are
created using the new ndisgen(8) script, which uses ndiscvt(8) under
the covers, along with a few other tools. The result is a driver module
that can be kldloaded into the kernel.

A driver with foo.inf and foo.sys files will be converted into
foo_sys.ko (and foo_sys.o, for those who want/need to make static
kernels). This module contains all of the necessary info from the
.INF file and the driver binary image, converted into an ELF module.
You can kldload this module (or add it to /boot/loader.conf) to have
it loaded automatically. Any required firmware files can be bundled
into the module as well (or converted/loaded separately).

Also, add a workaround for a problem in NdisMSleep(). During system
bootstrap (cold == 1), msleep() always returns 0 without actually
sleeping. The Intel 2200BG driver uses NdisMSleep() to wait for
the NIC's firmware to come to life, and fails to load if NdisMSleep()
doesn't actually delay. As a workaround, if msleep() (and hence
ndis_thsuspend()) returns 0, use a hard DELAY() to sleep instead).
This is not really the right thing to do, but we can't really do much
else. At the very least, this makes the Intel driver happy.

There are probably other drivers that fail in this way during bootstrap.
Unfortunately, the only workaround for those is to avoid pre-loading
them and kldload them once the system is running instead.
2005-04-24 20:21:22 +00:00
wpaul
637643a433 Now that the GDT has been reorganized and GNDIS_SEL has been reserved
for us, use it if it's available, otherwise default to using slot 7
as before.
2005-04-17 19:36:08 +00:00
wpaul
f2f41f37b1 When setting up the new stack for a function in x86_64_wrap(), make
sure to make it 16-byte aligned, in keeping with amd64 calling
convention requirements.

Submitted by:	Mikore Li at sun dot com
2005-04-16 04:47:15 +00:00
jeff
afab3762a0 - Change all filesystems and vfs_cache to relock the dvp once the child is
locked in the ISDOTDOT case.  Se vfs_lookup.c r1.79 for details.

Sponsored by:	Isilon Systems, Inc.
2005-04-13 10:59:09 +00:00
mdodd
91ee5f450f Implement SOUND_MIXER_INFO ioctl in compat layer. 2005-04-13 04:33:06 +00:00
mdodd
6f940cf20f Add support for O_NOFOLLOW and O_DIRECT to Linux fcntl() F_GETFL/F_SETFL. 2005-04-13 04:31:43 +00:00
wpaul
a5aab37b5d In winx32_wrap.S, preserve return values in the fastcall and regparm
wrappers by pushing them onto the stack rather than keeping them in %esi
and %edi.
2005-04-11 17:04:49 +00:00
wpaul
a3b2d3191d Create new i386 windows/bsd thunking layer, similar to the amd64 thunking
layer, but with a twist.

The twist has to do with the fact that Microsoft supports structured
exception handling in kernel mode. On the i386 arch, exception handling
is implemented by hanging an exception registration list off the
Thread Environment Block (TEB), and the TEB is accessed via the %fs
register. The problem is, we use %fs as a pointer to the pcpu stucture,
which means any driver that tries to write through %fs:0 will overwrite
the curthread pointer and make a serious mess of things.

To get around this, Project Evil now creates a special entry in
the GDT on each processor. When we call into Windows code, a context
switch routine will fix up %fs so it points to our new descriptor,
which in turn points to a fake TEB. When the Windows code returns,
or calls out to an external routine, we swap %fs back again. Currently,
Project Evil makes use of GDT slot 7, which is all 0s by default.
I fully expect someone to jump up and say I can't do that, but I
couldn't find any code that makes use of this entry anywhere. Sadly,
this was the only method I could come up with that worked on both
UP and SMP. (Modifying the LDT works on UP, but becomes incredibly
complicated on SMP.) If necessary, the context switching stuff can
be yanked out while preserving the convention calling wrappers.

(Fortunately, it looks like Microsoft uses some special epilog/prolog
code on amd64 to implement exception handling, so the same nastiness
won't be necessary on that arch.)

The advantages are:

- Any driver that uses %fs as though it were a TEB pointer won't
  clobber pcpu.
- All the __stdcall/__fastcall/__regparm stuff that's specific to
  gcc goes away.

Also, while I'm here, switch NdisGetSystemUpTime() back to using
nanouptime() again. It turns out nanouptime() is way more accurate
than just using ticks(). On slower machines, the Atheros drivers
I tested seem to take a long time to associate due to the loss
in accuracy.
2005-04-11 02:02:35 +00:00
peter
0cf2f68921 Fix 32 bit signals on amd64. It turns out that I was sign extending
the register values coming back from sigreturn(2).  Normally this wouldn't
matter because the 32 bit environment would truncate the upper 32 bits
and re-save the truncated values at the next trap.  However, if we got
a fast second signal and it was pending while we were returning from
sigreturn(2) in the signal trampoline, we'd never have had a chance to
truncate the bogus values in 32 bit mode, and the new sendsig would get
an EFAULT when trying to write to the bogus user stack address.
2005-04-05 22:41:49 +00:00
jhb
a3c6b782c3 - Change the vm_mmap() function to accept an objtype_t parameter specifying
the type of object represented by the handle argument.
- Allow vm_mmap() to map device memory via cdev objects in addition to
  vnodes and anonymous memory.  Note that mmaping a cdev directly does not
  currently perform any MAC checks like mapping a vnode does.
- Unbreak the DRM getbufs ioctl by having it call vm_mmap() directly on the
  cdev the ioctl is acting on rather than trying to find a suitable vnode
  to map from.

Reviewed by:	alc, arch@
2005-04-01 20:00:11 +00:00
wpaul
5f091f4a37 Fix another KeInitializeDpc()/amd64 calling convention issue:
ndis_intrhand() has to be wrapped for the same reason as ndis_timercall().
2005-04-01 16:40:22 +00:00
jhb
38bfdc988f - Use a custom version of copyinuio() to implement readv/writev using
kern_readv/writev.
- Use kern_settimeofday() and kern_adjtime() rather than stackgapping it.
2005-03-31 22:58:13 +00:00
wpaul
6848972c73 Apparently I'm cursed. ndis_findwrap() should be searching ndis_functbl,
not ntoskrnl_functbl.
2005-03-31 21:20:19 +00:00
wpaul
123cbe5f9b Fix an amd64 issue I overlooked. When setting up a callout to
ndis_timercall() in NdisMInitializeTimer(), we can't use the raw
function pointer. This is because ntoskrnl_run_dpc() expects to
invoke a function with Microsoft calling conventions. On i386,
this works because ndis_timercall() is declared with the __stdcall
attribute, but this is a no-op on amd64. To do it correctly, we
have to generate a wrapper for ndis_timercall() and us the wrapper
instead of of the raw function pointer.

Fix this by adding ndis_timercall() to the funcptr table in subr_ndis.c,
and create ndis_findwrap() to extract the wrapped function from the
table in NdisMInitializeTimer() instead of just passing ndis_timercall()
to KeInitializeDpc() directly.
2005-03-31 16:38:48 +00:00
wpaul
2e83169800 Fix a possible mutex leak in KeSetTimerEx(): if timer is NULL, we
bail out without releasing the dispatcher lock. Move the lock acquisition
after the pointer test to avoid this.
2005-03-30 16:22:48 +00:00
wpaul
0257351eba Remove a couple of #ifdef 0'ed code blocks left over from Atheros debugging.
Remember to reset ndis_pendingreq to NULL when bailing out of
ndis_set_info() or ndis_get_info() due to miniportadapterctx not
being set.
2005-03-30 02:50:06 +00:00
jeff
bcbda3d771 - Initial cn_lkflags to LK_EXCLUSIVE.
Sponsored by:	Isilon Systems, Inc.
2005-03-29 10:16:12 +00:00
wpaul
bb3714b327 The filehandle allocated in NdisOpenFile() is allocated using
ExAllocatePoolWithTag(), not malloc(), so it should be released
with ExFreePool(), not free(). Fix a couple if instances of
free(fh, ...) that got overlooked.
2005-03-28 22:03:47 +00:00
wpaul
9920f4f146 Another Coverity fix from Sam: add NULL pointer test in
NdisMFreeSharedMemory() (if the list is already empty, just bail).
2005-03-28 21:09:00 +00:00
wpaul
3e1273271d More additions for amd64:
- On amd64, InterlockedPushEntrySList() and InterlockedPopEntrySList()
  are mapped to ExpInterlockedPushEntrySList and
  ExpInterlockedPopEntrySList() via macros (which do the same thing).
  Add IMPORT_FUNC_MAP()s for these.

- Implement ExQueryDepthSList().
2005-03-28 20:46:08 +00:00
wpaul
386c0634e4 Fix resource leak found by Coverity (via Sam Leffler). 2005-03-28 20:16:26 +00:00
wpaul
7f6e70f819 Fix for amd64. 2005-03-28 20:13:14 +00:00
wpaul
b1238bf60e Fix another amd64 issue with lookaside lists: we initialize the
alloc and free routine pointers in the lookaside list with pointers
to ExAllocatePoolWithTag() and ExFreePool() (in the case where the
driver does not provide its own alloc and free routines). For amd64,
this is wrong: we have to use pointers to the wrapped versions of these
functions, not the originals.
2005-03-28 19:27:58 +00:00
wpaul
12c3eeb4fa Tweak to hopefully make lookaside lists work on amd64: in Windows, the
nll_obsoletelock field in the lookaside list structure is only defined
for the i386 arch. For amd64, the field is gone, and different list
update routines are used which do their locking internally. Apparently
the Inprocomm amd64 driver uses lookaside lists. I'm not positive this
will make it work yet since I don't have an Inprocomm NIC to test, but
this needs to be fixed anyway.
2005-03-28 17:36:06 +00:00
wpaul
913253e3c7 Spell '0' as 'FALSE' when initializing npp_validcounts. (Doesn't change
the code, but emphasises that this field is used as a boolean.)
2005-03-28 17:06:47 +00:00
wpaul
a8513e48c5 Unbreak the build: correct the resource list traversal code for
__FreeBSD_version >= 600022.
2005-03-28 16:49:27 +00:00
wpaul
74837aa85b Argh. PCI resource list became an STAILQ instead of an SLIST. Try to
deal with this while maintaining backards source compatibility with
stable.
2005-03-27 10:35:07 +00:00
wpaul
e41bbf9219 Check in ntoskrnl_var.h, which should have been included in the
previous commit.
2005-03-27 10:16:45 +00:00
wpaul
959879757b Finally bring an end to the great "make the Atheros NDIS driver
work on SMP" saga. After several weeks and much gnashing of teeth,
I have finally tracked down all the problems, despite their best
efforts to confound and annoy me.

Problem nunmber one: the Atheros windows driver is _NOT_ a de-serialized
miniport! It used to be that NDIS drivers relied on the NDIS library
itself for all their locking and serialization needs. Transmit packet
queues were all handled internally by NDIS, and all calls to
MiniportXXX() routines were guaranteed to be appropriately serialized.
This proved to be a performance problem however, and Microsoft
introduced de-serialized miniports with the NDIS 5.x spec. Microsoft
still supports serialized miniports, but recommends that all new drivers
written for Windows XP and later be deserialized. Apparently Atheros
wasn't listening when they said this.

This means (among other things) that we have to serialize calls to
MiniportSendPackets(). We also have to serialize calls to MiniportTimer()
that are triggered via the NdisMInitializeTimer() routine. It finally
dawned on me why NdisMInitializeTimer() takes a special
NDIS_MINIPORT_TIMER structure and a pointer to the miniport block:
the timer callback must be serialized, and it's only by saving the
miniport block handle that we can get access to the serialization
lock during the timer callback.

Problem number two: haunted hardware. The thing that was _really_
driving me absolutely bonkers for the longest time is that, for some
reason I couldn't understand, my test machine would occasionally freeze
or more frustratingly, reset completely. That's reset and in *pow!*
back to the BIOS startup. No panic, no crashdump, just a reset. This
appeared to happen most often when MiniportReset() was called. (As
to why MiniportReset() was being called, see problem three below.)
I thought maybe I had created some sort of horrible deadlock
condition in the process of adding the serialization, but after three
weeks, at least 6 different locking implementations and heroic efforts
to debug the spinlock code, the machine still kept resetting. Finally,
I started single stepping through the MiniportReset() routine in
the driver using the kernel debugger, and this ultimately led me to
the source of the problem.

One of the last things the Atheros MiniportReset() routine does is
call NdisReadPciSlotInformation() several times to inspect a portion
of the device's PCI config space. It reads the same chunk of config
space repeatedly, in rapid succession. Presumeably, it's polling
the hardware for some sort of event. The reset occurs partway through
this process. I discovered that when I single-stepped through this
portion of the routine, the reset didn't occur. So I inserted a 1
microsecond delay into the read loop in NdisReadPciSlotInformation().
Suddenly, the reset was gone!!

I'm still very puzzled by the whole thing. What I suspect is happening
is that reading the PCI config space so quickly is causing a severe
PCI bus error. My test system is a Sun w2100z dual Opteron system,
and the NIC is a miniPCI card mounted in a miniPCI-to-PCI carrier card,
plugged into a 100Mhz PCI slot. It's possible that this combination of
hardware causes a bus protocol violation in this scenario which leads
to a fatal machine check. This is pure speculation though. Really all I
know for sure is that inserting the delay makes the problem go away.
(To quote Homer Simpson: "I don't know how it works, but fire makes
it good!")

Problem number three: NdisAllocatePacket() needs to make sure to
initialize the npp_validcounts field in the 'private' section of
the NDIS_PACKET structure. The reason if_ndis was calling the
MiniportReset() routine in the first place is that packet transmits
were sometimes hanging. When sending a packet, an NDIS driver will
call NdisQueryPacket() to learn how many physical buffers the packet
resides in. NdisQueryPacket() is actually a macro, which traverses
the NDIS_BUFFER list attached to the NDIS_PACKET and stashes some
of the results in the 'private' section of the NDIS_PACKET. It also
sets the npp_validcounts field to TRUE To indicate that the results are
now valid. The problem is, now that if_ndis creates a pool of transmit
packets via NdisAllocatePacketPool(), it's important that each time
a new packet is allocated via NdisAllocatePacket() that validcounts
be initialized to FALSE. If it isn't, and a previously transmitted
NDIS_PACKET is pulled out of the pool, it may contain stale data
from a previous transmission which won't get updated by NdisQueryPacket().
This would cause the driver to miscompute the number of fragments
for a given packet, and botch the transmission.

Fixing these three problems seems to make the Atheros driver happy
on SMP, which hopefully means other serialized miniports will be
happy too.

And there was much rejoicing.

Other stuff fixed along the way:

- Modified ndis_thsuspend() to take a mutex as an argument. This
  allows KeWaitForSingleObject() and KeWaitForMultipleObjects() to
  avoid any possible race conditions with other routines that
  use the dispatcher lock.

- Fixed KeCancelTimer() so that it returns the correct value for
  'pending' according to the Microsoft documentation

- Modfied NdisGetSystemUpTime() to use ticks and hz rather than
  calling nanouptime(). Also added comment that this routine wraps
  after 49.7 days.

- Added macros for KeAcquireSpinLock()/KeReleaseSpinLock() to hide
  all the MSCALL() goop.

- For x86, KeAcquireSpinLockRaiseToDpc() needs to be a separate
  function. This is because it's supposed to be _stdcall on the x86
  arch, whereas KeAcquireSpinLock() is supposed to be _fastcall.
  On amd64, all routines use the same calling convention so we can
  just map KeAcquireSpinLockRaiseToDpc() directly to KfAcquireSpinLock()
  and it will work. (The _fastcall attribute is a no-op on amd64.)

- Implement and use IoInitializeDpcRequest() and IoRequestDpc() (they're
  just macros) and use them for interrupt handling. This allows us to
  move the ndis_intrtask() routine from if_ndis.c to kern_ndis.c.

- Fix the MmInitializeMdl() macro so that is uses sizeof(vm_offset_t)
  when computing mdl_size instead of uint32_t, so that it matches the
  MmSizeOfMdl() routine.

- Change a could of M_WAITOKs to M_NOWAITs in the unicode routines in
  subr_ndis.c.

- Use the dispatcher lock a little more consistently in subr_ntoskrnl.c.

- Get rid of the "wait for link event" hack in ndis_init(). Now that
  I fixed NdisReadPciSlotInformation(), it seems I don't need it anymore.
  This should fix the witness panic a couple of people have reported.

- Use MSCALL1() when calling the MiniportHangCheck() function in
  ndis_ticktask(). I accidentally missed this one when adding the
  wrapping for amd64.
2005-03-27 10:14:36 +00:00
brooks
f16c448930 Use the CTASSERT() macro instead of rolling my own, non-portable one
using #error.

Suggested by:	jhb
2005-03-24 19:26:50 +00:00
brooks
b25337dcb4 Compile errors are way more useful then panics later.
Replace a KASSERT of LINUX_IFNAMSIZ == IFNAMSIZ with a preprocessor
check and #error message.  This will prevent nasty suprises if users
change IFNAMSIZ without updating the linux code appropriatly.
2005-03-24 17:51:15 +00:00
das
6a2a1d9492 Bounds check the user-supplied length used in a copyout() in
svr4_do_getmsg().  In principle this bug could disclose data from
kernel memory, but in practice, the SVR4 emulation layer is probably
not functional enough to cause the relevant code path to be executed.
In any case, the emulator has been disconnected from the build since
5.0-RELEASE.

Found by:	Coverity Prevent analysis tool
2005-03-23 08:28:06 +00:00
das
fbf7a9b2ee Reject packets larger than IP_MAXPACKET in linux_sendto() for sockets
with the IP_HDRINCL option set.  Without this change, a Linux process
with access to a raw socket could cause a kernel panic.  Raw sockets
must be created by root, and are generally not consigned to untrusted
applications; hence, the security implications of this bug are
minimal.  I believe this only affects 6-CURRENT on or after 2005-01-30.

Found by:	Coverity Prevent analysis tool
Security:	Local DOS
2005-03-23 08:28:00 +00:00
phk
00a6eab3e5 s/SLIST/STAILQ/
/imp/a\
pointy hat
.
2005-03-18 11:57:44 +00:00
phk
9cea99e06b Neuter the duplicated disk-device magic code for now. Somebody with
serious linux-clue is necessary to fix this properly.
2005-03-15 11:58:40 +00:00
sobomax
b795e2430a Add kernel-only flag MSG_NOSIGNAL to be used in emulation layers to surpress
SIGPIPE signal for the duration of the sento-family syscalls. Use it to
replace previously added hack in Linux layer based on temporarily setting
SO_NOSIGPIPE flag.

Suggested by:	alfred
2005-03-08 16:11:41 +00:00
sobomax
a5d845fec6 Handle MSG_NOSIGNAL flag in linux_send() by setting SO_NOSIGPIPE on socket
for the duration of the send() call. Such approach may be less than ideal
in threading environment, when several threads share the same socket and it
might happen that several of them are calling linux_send() at the same time
with and without SO_NOSIGPIPE set.

However, such race condition is very unlikely in practice, therefore this
change provides practical improvement compared to the previous behaviour.

PR:		kern/76426
Submitted by:	Steven Hartland <killing@multiplay.co.uk>
MFC after:	3 days
2005-03-07 07:26:42 +00:00
wpaul
a72168b811 When you call MiniportInitialize() for an 802.11 driver, it will
at some point result in a status event being triggered (it should
be a link down event: the Microsoft driver design guide says you
should generate one when the NIC is initialized). Some drivers
generate the event during MiniportInitialize(), such that by the
time MiniportInitialize() completes, the NIC is ready to go. But
some drivers, in particular the ones for Atheros wireless NICs,
don't generate the event until after a device interrupt occurs
at some point after MiniportInitialize() has completed.

The gotcha is that you have to wait until the link status event
occurs one way or the other before you try to fiddle with any
settings (ssid, channel, etc...). For the drivers that set the
event sycnhronously this isn't a problem, but for the others
we have to pause after calling ndis_init_nic() and wait for the event
to arrive before continuing. Failing to wait can cause big trouble:
on my SMP system, calling ndis_setstate_80211() after ndis_init_nic()
completes, but _before_ the link event arrives, will lock up or
reset the system.

What we do now is check to see if a link event arrived while
ndis_init_nic() was running, and if it didn't we msleep() until
it does.

Along the way, I discovered a few other problems:

- Defered procedure calls run at PASSIVE_LEVEL, not DISPATCH_LEVEL.
  ntoskrnl_run_dpc() has been fixed accordingly. (I read the documentation
  wrong.)

- Similarly, the NDIS interrupt handler, which is essentially a
  DPC, also doesn't need to run at DISPATCH_LEVEL. ndis_intrtask()
  has been fixed accordingly.

- MiniportQueryInformation() and MiniportSetInformation() run at
  DISPATCH_LEVEL, and each request must complete before another
  can be submitted. ndis_get_info() and ndis_set_info() have been
  fixed accordingly.

- Turned the sleep lock that guards the NDIS thread job list into
  a spin lock. We never do anything with this lock held except manage
  the job list (no other locks are held), so it's safe to do this,
  and it's possible that ndis_sched() and ndis_unsched() can be
  called from DISPATCH_LEVEL, so using a sleep lock here is
  semantically incorrect. Also updated subr_witness.c to add the
  lock to the order list.
2005-03-07 03:05:31 +00:00
sobomax
f706f4bce8 Handle unimplemented syscall by instantly returning ENOSYS instead of sending
signal first and only then returning ENOSYS to match what real linux does.

PR:		kern/74302
Submitted by:	Travis Poppe <tlp@LiquidX.org>
2005-03-07 00:18:06 +00:00
sobomax
6f0b5d23e8 Always produce cpuX entries, even in the case when there is only one CPU
in the system. This is consistent with what real linuxes do.

PR:		kern/75848
Submitted by:	Andriy Gapon <avg@icyb.net.ua>
MFC after:	3 days
2005-03-06 22:28:14 +00:00
wpaul
593ae58297 MAXPATHLEN is 1024, which means NdisOpenFile() and ndis_find_sym() were
both consuming 1K of stack space. This is unfriendly. Allocate the buffers
off the heap instead. It's a little slower, but these aren't performance
critical routines.

Also, add a spinlock to NdisAllocatePacketPool(), NdisAllocatePacket(),
NdisFreePacketPool() and NdisFreePacket(). The pool is maintained as a
linked list. I don't know for a fact that it can be corrupted, but why
take chances.
2005-03-03 03:51:02 +00:00
jhb
407a285a6f Remove linux_emul_find() and the CHECKALT*() macros as they are no longer
used.
2005-03-01 17:57:45 +00:00
ps
90b32391ca Use kern_kevent instead of the stackgap for 32bit syscall wrapping.
Submitted by:	jhb
Tested on:	amd64
2005-03-01 17:45:55 +00:00
wpaul
88eacaa717 In windrv_load(), I was allocating the driver object using
malloc(sizeof(device_object), ...) by mistake. Correct this, and
rename "dobj" to "drv" to make it a bit clearer what this variable
is supposed to be.

Spotted by: Mikore Li at Sun dot comnospamplzkthx
2005-03-01 17:21:25 +00:00
ps
3477112899 Ooops. I will compile test before committing. The stackgap version
of kevent32 will be going away shortly, so this is temporary until
I commit the non-stackgap version.
2005-03-01 13:50:57 +00:00
ps
7f5f318c48 Correct the freebsd32_kevent prototype. 2005-03-01 06:32:53 +00:00
wpaul
d20416c2fe Don't need to do MmInitializeMdl() in ndis_mtop() anymore:
IoInitializeMdl() does it internally (and doing it again here
messes things up).
2005-02-26 07:11:17 +00:00
wpaul
15a925bf93 MDLs are supposed to be variable size (they include an array of pages
that describe a buffer of variable size). The problem is, allocating
MDLs off the heap is slow, and it can happen that drivers will allocate
lots and lots of lots of MDLs as they run.

As a compromise, we now do the following: we pre-allocate a zone for
MDLs big enough to describe any buffer with 16 or less pages. If
IoAllocateMdl() needs a MDL for a buffer with 16 or less pages, we'll
allocate it from the zone. Otherwise, we allocate it from the heap.
MDLs allocate from the zone have a flag set in their mdl_flags field.
When the MDL is released, IoMdlFree() will uma_zfree() the MDL if
it has the MDL_ZONE_ALLOCED flag set, otherwise it will release it
to the heap.

The assumption is that 16 pages is a "big number" and we will rarely
need MDLs larger than that.

- Moved the ndis_buffer zone to subr_ntoskrnl.c from kern_ndis.c
  and named it mdl_zone.

- Modified IoAllocateMdl() and IoFreeMdl() to use uma_zalloc() and
  uma_zfree() if necessary.

- Made ndis_mtop() use IoAllocateMdl() instead of calling uma_zalloc()
  directly.

Inspired by: discussion with Giridhar Pemmasani
2005-02-26 00:22:16 +00:00
sam
b3f1395f0d fixup signal mapping:
o change the mapping arrays to have a zero offset rather than base 1;
  this eliminates lots of signo adjustments and brings the code
  back inline with the original netbsd code
o purge use of SVR4_SIGTBLZ; SVR4_NSIG is the only definition for
  how big a mapping array is
o change the mapping loops to explicitly ignore signal 0
o purge some bogus code from bsd_to_svr4_sigset
o adjust svr4_sysentvec to deal with the mapping table change

Enticed into fixing by:	Coverity Prevent analysis tool
Glanced at by:	marcel, jhb
2005-02-25 19:34:10 +00:00
wpaul
371673aec8 Add macros to construct Windows IOCTL codes, and to extract function
codes from an IOCTL. (The USB module will need them later.)
2005-02-25 18:25:48 +00:00
wpaul
64968e6acf Fix a couple of callback instances that should have been wrapped with
MSCALLx().

Add definition for STATUS_PENDING error code.
2005-02-25 08:34:32 +00:00
wpaul
7a502b7309 Compute the right length to use with bzero() when initializing an IRP
in IoInitializeIrp() (must use IoSizeOfIrp() to account for the stack
locations).
2005-02-25 06:31:45 +00:00
wpaul
efb3e8caac - Correct one aspect of the driver_object/device_object/IRP framework:
when we create a PDO, the driver_object associated with it is that
  of the parent driver, not the driver we're trying to attach. For
  example, if we attach a PCI device, the PDO we pass to the NdisAddDevice()
  function should contain a pointer to fake_pci_driver, not to the NDIS
  driver itself. For PCI or PCMCIA devices this doesn't matter because
  the child never needs to talk to the parent bus driver, but for USB,
  the child needs to be able to send IRPs to the parent USB bus driver, and
  for that to work the parent USB bus driver has to be hung off the PDO.

  This involves modifying windrv_lookup() so that we can search for
  bus drivers by name, if necessary. Our fake bus drivers attach themselves
  as "PCI Bus," "PCCARD Bus" and "USB Bus," so we can search for them
  using those names.

  The individual attachment stubs now create and attach PDOs to the
  parent bus drivers instead of hanging them off the NDIS driver's
  object, and in if_ndis.c, we now search for the correct driver
  object depending on the bus type, and use that to find the correct PDO.

  With this fix, I can get my sample USB ethernet driver to deliver
  an IRP to my fake parent USB bus driver's dispatch routines.

- Add stub modules for USB support: subr_usbd.c, usbd_var.h and
  if_ndis_usb.c. The subr_usbd.c module is hooked up the build
  but currently doesn't do very much. It provides the stub USB
  parent driver object and a dispatch routine for
  IRM_MJ_INTERNAL_DEVICE_CONTROL. The only exported function at
  the moment is USBD_GetUSBDIVersion(). The if_ndis_usb.c stub
  compiles, but is not hooked up to the build yet. I'm putting
  these here so I can keep them under source code control as I
  flesh them out.
2005-02-24 21:49:14 +00:00
jhb
54ea9f1912 Regen. 2005-02-24 18:24:29 +00:00
jhb
aa85f8d747 Use msync() to implement msync() for freebsd32 emulation. This isn't quite
right for certain MAP_FIXED mappings on ia64 but it will work fine for all
other mappings and works fine on amd64.

Requested by:	ps, Christian Zander
MFC after:	1 week
2005-02-24 18:24:16 +00:00
wpaul
6e74cf6e34 Couple of lessons learned during USB driver testing:
- In kern_ndis.c:ndis_unload_driver(), test that ndis_block->nmb_rlist
  is not NULL before trying to free() it.

- In subr_pe.c:pe_get_import_descriptor(), do a case-insensitive
  match on the import module name. Most drivers I have encountered
  link against "ntoskrnl.exe" but the ASIX USB ethernet driver I'm
  testing with wants "NTOSKRNL.EXE."

- In subr_ntoskrnl.c:IoAllocateIrp(), return a pointer to the IRP
  instead of NULL. (Stub code leftover.)

- Also in subr_ntoskrnl.c, add ExAllocatePoolWithTag() and ExFreePool()
  to the function table list so they'll get exported to drivers properly.
2005-02-24 17:58:27 +00:00
wpaul
954c02c21f Implement IoCancelIrp(), IoAcquireCancelSpinLock(), IoReleaseCancelSpinLock()
and a machine-independent though inefficient InterlockedExchange().
In Windows, InterlockedExchange() appears to be implemented in header
files via inline assembly. I would prefer using an atomic.h macro for
this, but there doesn't seem to be one that just does a plain old
atomic exchange (as opposed to compare and exchange). Also implement
IoSetCancelRoutine(), which is just a macro that uses InterlockedExchange().

Fill in IoBuildSynchronousFsdRequest(), IoBuildAsynchronousFsdRequest()
and IoBuildDeviceIoControlRequest() so that they do something useful,
and add a bunch of #defines to ntoskrnl_var.h to help make these work.
These may require some tweaks later.
2005-02-23 16:44:33 +00:00
phk
5bbf7f6810 Neuter linux_ustat() until somebody finds time to try to fix it.
The fundamental problem is that we get only the lower 8 bits of the
minor device number so there is no guarantee that we can actually
find the disk device in question at all.

This was probably a bigger issue pre-GEOM where the upper bits
signaled which slice were in use.

The secondary problem is how we get from (partial) dev_t to vnode.

The correct implementation will involve traversing the mount list
looking for a perfect match or a possible match (for truncated
minor).
2005-02-22 13:39:46 +00:00
sam
340ab0ebcb remove dead code
Submitted by:	Coverity Prevent analysis tool
2005-02-22 01:26:48 +00:00
jhb
847763ff6e - Add a custom version of exec_copyin_args() to deal with the 32-bit
pointers in argv and envv in userland and use that together with
  kern_execve() and exec_free_args() to implement freebsd32_execve()
  without using the stackgap.
- Fix freebsd32_adjtime() to call adjtime() rather than utimes().  Still
  uses stackgap for now.
- Use kern_setitimer(), kern_getitimer(), kern_select(), kern_utimes(),
  kern_statfs(), kern_fstatfs(), kern_fhstatfs(), kern_stat(),
  kern_fstat(), and kern_lstat().

Tested by:	cokane (amd64)
Silence on:	amd64, ia64
2005-02-18 18:56:04 +00:00
wpaul
359989e277 Fix a couple of u_int_foos that should have been uint_foos. 2005-02-18 04:33:34 +00:00
wpaul
cb91ac4b68 Make the Win64 -> ELF64 template a little smaller by using a string
copy op to shift arguments on the stack instead of transfering each
argument one by one through a register. Probably doesn't affect overall
operation, but makes the code a little less grotty and easier to update
later if I choose to make the wrapper handle more args. Also add
comments.
2005-02-18 03:22:37 +00:00
wpaul
90e2d970fc Remove redundant label. 2005-02-16 21:24:04 +00:00
wpaul
61fae0841d Fix freeing of custom driver extensions. (ExFreePool() was being
called with the wrong pointer.)
2005-02-16 19:21:07 +00:00
wpaul
a372ba85ce KeAcquireSpinLockRaiseToDpc() and KeReleaseSpinLock() are (at least
for now) exactly the same as KfAcquireSpinLock() and KfReleaseSpinLock().
I implemented the former as small routines in subr_ntoskrnl.c that just
turned around and invoked the latter. But I don't really need the wrapper
routines: I can just create an entries in the ntoskrnl func table that
map KeAcquireSpinLockRaiseToDpc() and KeReleaseSpinLock() to
KfAcquireSpinLock() and KfReleaseSpinLock() directly. This means
the stubs can go away.
2005-02-16 18:18:30 +00:00
wpaul
07b632956a Add support for Windows/x86-64 binaries to Project Evil.
Ville-Pertti Keinonen (will at exomi dot comohmygodnospampleasekthx)
deserves a big thanks for submitting initial patches to make it
work. I have mangled his contributions appropriately.

The main gotcha with Windows/x86-64 is that Microsoft uses a different
calling convention than everyone else. The standard ABI requires using
6 registers for argument passing, with other arguments on the stack.
Microsoft uses only 4 registers, and requires the caller to leave room
on the stack for the register arguments incase the callee needs to
spill them. Unlike x86, where Microsoft uses a mix of _cdecl, _stdcall
and _fastcall, all routines on Windows/x86-64 uses the same convention.
This unfortunately means that all the functions we export to the
driver require an intermediate translation wrapper. Similarly, we have
to wrap all calls back into the driver binary itself.

The original patches provided macros to wrap every single routine at
compile time, providing a secondary jump table with a customized
wrapper for each exported routine. I decided to use a different approach:
the call wrapper for each function is created from a template at
runtime, and the routine to jump to is patched into the wrapper as
it is created. The subr_pe module has been modified to patch in the
wrapped function instead of the original. (On x86, the wrapping
routine is a no-op.)

There are some minor API differences that had to be accounted for:

- KeAcquireSpinLock() is a real function on amd64, not a macro wrapper
  around KfAcquireSpinLock()
- NdisFreeBuffer() is actually IoFreeMdl(). I had to change the whole
  NDIS_BUFFER API a bit to accomodate this.

Bugs fixed along the way:
- IoAllocateMdl() always returned NULL
- kern_windrv.c:windrv_unload() wasn't releasing private driver object
  extensions correctly (found thanks to memguard)

This has only been tested with the driver for the Broadcom 802.11g
chipset, which was the only Windows/x86-64 driver I could find.
2005-02-16 05:41:18 +00:00
njl
25c48f7867 Unbreak the kernel build. Pointy hat to: sobomax. 2005-02-13 19:50:57 +00:00
sobomax
52ae2ac0b9 Backout previous change (disabling of security checks for signals delivered
in emulation layers), since it appears to be too broad.

Requested by:   rwatson
2005-02-13 17:37:20 +00:00
sobomax
1d558007d0 Split out kill(2) syscall service routine into user-level and kernel part, the
former is callable from user space and the latter from the kernel one. Make
kernel version take additional argument which tells if the respective call
should check for additional restrictions for sending signals to suid/sugid
applications or not.

Make all emulation layers using non-checked version, since signal numbers in
emulation layers can have different meaning that in native mode and such
protection can cause misbehaviour.

As a result remove LIBTHR from the signals allowed to be delivered to a
suid/sugid application.

Requested (sorta) by:	rwatson
MFC after:	2 weeks
2005-02-13 16:42:08 +00:00
sobomax
22b03e0f5d Semctl with IPC_STAT command should return zero in case of success.
PR:		73778
Submitted by:	Andriy Gapon <avg@icyb.net.ua>
MFC after:	2 weeks
2005-02-11 13:46:55 +00:00
wpaul
df89b62698 Next step on the road to IRPs: create and use an imitation of the
Windows DRIVER_OBJECT and DEVICE_OBJECT mechanism so that we can
simulate driver stacking.

In Windows, each loaded driver image is attached to a DRIVER_OBJECT
structure. Windows uses the registry to match up a given vendor/device
ID combination with a corresponding DRIVER_OBJECT. When a driver image
is first loaded, its DriverEntry() routine is invoked, which sets up
the AddDevice() function pointer in the DRIVER_OBJECT and creates
a dispatch table (based on IRP major codes). When a Windows bus driver
detects a new device, it creates a Physical Device Object (PDO) for
it. This is a DEVICE_OBJECT structure, with semantics analagous to
that of a device_t in FreeBSD. The Windows PNP manager will invoke
the driver's AddDevice() function and pass it pointers to the DRIVER_OBJECT
and the PDO.

The AddDevice() function then creates a new DRIVER_OBJECT structure of
its own. This is known as the Functional Device Object (FDO) and
corresponds roughly to a private softc instance. The driver uses
IoAttachDeviceToDeviceStack() to add this device object to the
driver stack for this PDO. Subsequent drivers (called filter drivers
in Windows-speak) can be loaded which add themselves to the stack.
When someone issues an IRP to a device, it travel along the stack
passing through several possible filter drivers until it reaches
the functional driver (which actually knows how to talk to the hardware)
at which point it will be completed. This is how Windows achieves
driver layering.

Project Evil now simulates most of this. if_ndis now has a modevent
handler which will use MOD_LOAD and MOD_UNLOAD events to drive the
creation and destruction of DRIVER_OBJECTs. (The load event also
does the relocation/dynalinking of the image.) We don't have a registry,
so the DRIVER_OBJECTS are stored in a linked list for now. Eventually,
the list entry will contain the vendor/device ID list extracted from
the .INF file. When ndis_probe() is called and detectes a supported
device, it will create a PDO for the device instance and attach it
to the DRIVER_OBJECT just as in Windows. ndis_attach() will then call
our NdisAddDevice() handler to create the FDO. The NDIS miniport block
is now a device extension hung off the FDO, just as it is in Windows.
The miniport characteristics table is now an extension hung off the
DRIVER_OBJECT as well (the characteristics are the same for all devices
handled by a given driver, so they don't need to be per-instance.)
We also do an IoAttachDeviceToDeviceStack() to put the FDO on the
stack for the PDO. There are a couple of fake bus drivers created
for the PCI and pccard buses. Eventually, there will be one for USB,
which will actually accept USB IRP.s

Things should still work just as before, only now we do things in
the proper order and maintain the correct framework to support passing
IRPs between drivers.

Various changes:

- corrected the comments about IRQL handling in subr_hal.c to more
  accurately reflect reality
- update ndiscvt to make the drv_data symbol in ndis_driver_data.h a
  global so that if_ndis_pci.o and/or if_ndis_pccard.o can see it.
- Obtain the softc pointer from the miniport block by referencing
  the PDO rather than a private pointer of our own (nmb_ifp is no
  longer used)
- implement IoAttachDeviceToDeviceStack(), IoDetachDevice(),
  IoGetAttachedDevice(), IoAllocateDriverObjectExtension(),
  IoGetDriverObjectExtension(), IoCreateDevice(), IoDeleteDevice(),
  IoAllocateIrp(), IoReuseIrp(), IoMakeAssociatedIrp(), IoFreeIrp(),
  IoInitializeIrp()
- fix a few mistakes in the driver_object and device_object definitions
- add a new module, kern_windrv.c, to handle the driver registration
  and relocation/dynalinkign duties (which don't really belong in
  kern_ndis.c).
- made ndis_block and ndis_chars in the ndis_softc stucture pointers
  and modified all references to it
- fixed NdisMRegisterMiniport() and NdisInitializeWrapper() so they
  work correctly with the new driver_object mechanism
- changed ndis_attach() to call NdisAddDevice() instead of ndis_load_driver()
  (which is now deprecated)
- used ExAllocatePoolWithTag()/ExFreePool() in lookaside list routines
  instead of kludged up alloc/free routines
- added kern_windrv.c to sys/modules/ndis/Makefile and files.i386.
2005-02-08 17:23:25 +00:00
jhb
b03a8bb21b - Implement svr4_emul_find() using kern_alternate_path(). This changes
the semantics in that the returned filename to use is now a kernel
  pointer rather than a user space pointer.  This required changing the
  arguments to the CHECKALT*() macros some and changing the various system
  calls that used pathnames to use the kern_foo() functions that can accept
  kernel space filename pointers instead of calling the system call
  directly.
- Use kern_open(), kern_access(), kern_msgctl(), kern_execve(),
  kern_mkfifo(), kern_mknod(), kern_statfs(), kern_fstatfs(),
  kern_setitimer(), kern_stat(), kern_lstat(), kern_fstat(), kern_utimes(),
  kern_pathconf(), and kern_unlink().
2005-02-07 21:53:42 +00:00
jhb
3c3db95194 - Use kern_{l,f,}stat() and kern_{f,}statfs() functions rather than
duplicating the contents of the same functions inline.
- Consolidate common code to convert a BSD statfs struct to a Linux struct
  into a static worker function.
2005-02-07 18:47:28 +00:00
jhb
6fab308776 Make linux_emul_convpath() a simple wrapper for kern_alternate_path(). 2005-02-07 18:46:05 +00:00
jhb
71c05d27c0 - Tweak kern_msgctl() to return a copy of the requested message queue id
structure in the struct pointed to by the 3rd argument for IPC_STAT and
  get rid of the 4th argument.  The old way returned a pointer into the
  kernel array that the calling function would then access afterwards
  without holding the appropriate locks and doing non-lock-safe things like
  copyout() with the data anyways.  This change removes that unsafeness and
  resulting race conditions as well as simplifying the interface.
- Implement kern_foo wrappers for stat(), lstat(), fstat(), statfs(),
  fstatfs(), and fhstatfs().  Use these wrappers to cut out a lot of
  code duplication for freebsd4 and netbsd compatability system calls.
- Add a new lookup function kern_alternate_path() that looks up a filename
  under an alternate prefix and determines which filename should be used.
  This is basically a more general version of linux_emul_convpath() that
  can be shared by all the ABIs thus allowing for further reduction of
  code duplication.
2005-02-07 18:44:55 +00:00
jhb
6e2f7d4c8e Use kern_setitimer() to implement linux_alarm() instead of fondling the
real interval timer directly.
2005-02-07 18:36:21 +00:00
sobomax
69aa6843ef Boot away another stackgap (one of the lest ones in linuxlator/i386) by
providing special version of CDIOCREADSUBCHANNEL ioctl(), which assumes that
result has to be placed into kernel space not user space. In the long run
more generic solution has to be designed WRT emulating various ioctl()s
that operate on userspace buffers, but right now there is only one such
ioctl() is emulated, so that it makes little sense.

MFC after:	2 weeks
2005-01-30 08:12:37 +00:00
sobomax
68d0bd2186 Extend kern_sendit() to take another enum uio_seg argument, which specifies
where the buffer to send lies and use it to eliminate yet another stackgap
in linuxlator.

MFC after:	2 weeks
2005-01-30 07:20:36 +00:00
sobomax
f489acaf0f o Split out kernel part of execve(2) syscall into two parts: one that
copies arguments into the kernel space and one that operates
  completely in the kernel space;

o use kernel-only version of execve(2) to kill another stackgap in
  linuxlator/i386.

Obtained from:  DragonFlyBSD (partially)
MFC after:      2 weeks
2005-01-29 23:12:00 +00:00
sobomax
896df27c1a Split out kernel side of msgctl(2) into two parts: the first that pops data
from the userland and pushes results back and the second which does
actual processing. Use the latter to eliminate stackgap in the linux wrapper
of that syscall.

MFC after:      2 weeks
2005-01-26 00:46:36 +00:00
sobomax
35611d3699 Split out kernel side of {get,set}itimer(2) into two parts: the first that
pops data from the userland and pushes results back and the second which does
actual processing. Use the latter to eliminate stackgap in the linux wrappers
of those syscalls.

MFC after:	2 weeks
2005-01-25 21:28:28 +00:00
wpaul
e00c1df907 Apparently, the Intel icc compiler doesn't like it when you use
attributes in casts (i.e. foo = (__stdcall sometype)bar). This only
happens in two places where we need to set up function pointers, so
work around the problem with some void pointer magic.
2005-01-25 17:00:54 +00:00
wpaul
361515a412 Begin the first phase of trying to add IRP support (and ultimately
USB device support):

- Convert all of my locally chosen function names to their actual
  Windows equivalents, where applicable. This is a big no-op change
  since it doesn't affect functionality, but it helps avoid a bit
  of confusion (it's now a lot easier to see which functions are
  emulated Windows API routines and which are just locally defined).

- Turn ndis_buffer into an mdl, like it should have been. The structure
  is the same, but now it belongs to the subr_ntoskrnl module.

- Implement a bunch of MDL handling macros from Windows and use them where
  applicable.

- Correct the implementation of IoFreeMdl().

- Properly implement IoAllocateMdl() and MmBuildMdlForNonPagedPool().

- Add the definitions for struct irp and struct driver_object.

- Add IMPORT_FUNC() and IMPORT_FUNC_MAP() macros to make formatting
  the module function tables a little cleaner. (Should also help
  with AMD64 support later on.)

- Fix if_ndis.c to use KeRaiseIrql() and KeLowerIrql() instead of
  the previous calls to hal_raise_irql() and hal_lower_irql() which
  have been renamed.

The function renaming generated a lot of churn here, but there should
be very little operational effect.
2005-01-24 18:18:12 +00:00
ps
87f1a6a1a6 Add a 32bit syscall wrapper for modstat
Obtained from:	Yahoo!
2005-01-19 17:53:06 +00:00
ps
db53196a48 - rename nanosleep1 to kern_nanosleep
- Add a 32bit syscall entry for nanosleep

Reviewed by:	peter
Obtained from:	Yahoo!
2005-01-19 17:44:59 +00:00
wpaul
2fe7b09cb1 Fix a problem reported by Pierre Beyssac. Sometinmes when ndis_get_info()
calls MiniportQueryInformation(), it will return NDIS_STATUS_PENDING.
When this happens, ndis_get_info() will sleep waiting for a completion
event. If two threads call ndis_get_info() and both end up having to
sleep, they will both end up waiting on the same wait channel, which
can cause a panic in sleepq_add() if INVARIANTS are turned on.

Fix this by having ndis_get_info() use a common mutex rather than
using the process mutex with PROC_LOCK(). Also do the same for
ndis_set_info(). Note that Pierre's original patch also made ndis_thsuspend()
use the new mutex, but ndis_thsuspend() shouldn't need this since
it will make each thread that calls it sleep on a unique wait channel.

Also, it occured to me that we probably don't want to enter
MiniportQueryInformation() or MiniportSetInformation() from more
than one thread at any given time, so now we acquire a Windows
spinlock before calling either of them. The Microsoft documentation
says that MiniportQueryInformation() and MiniportSetInformation()
are called at DISPATCH_LEVEL, and previously we would call
KeRaiseIrql() to set the IRQL to DISPATCH_LEVEL before entering
either routine, but this only guarantees mutual exclusion on
uniprocessor machines. To make it SMP safe, we need to use a real
spinlock. For now, I'm abusing the spinlock embedded in the
NDIS_MINIPORT_BLOCK structure for this purpose. (This may need to be
applied to some of the other routines in kern_ndis.c at a later date.)

Export ntoskrnl_init_lock() (KeInitializeSpinlock()) from subr_ntoskrnl.c
since we need to use in in kern_ndis.c, and since it's technically part
of the Windows kernel DDK API along with the other spinlock routines. Use
it in subr_ndis.c too rather than frobbing the spinlock directly.
2005-01-14 22:39:44 +00:00
obrien
98e2482a94 Match the LINUX32's style with existing style
Submitted by:	Jung-uk Kim <jkim@niksun.com>

Use positive, not negative logic.
2005-01-14 04:44:56 +00:00
obrien
98c3a8a894 Fix Linux compat 'uname -m' on AMD64.
Submitted by:	Jung-uk Kim <jkim@niksun.com>
		(patch reworked by me)
2005-01-14 03:45:26 +00:00
phk
108b39b837 Remove duplicate code. 2005-01-13 19:27:28 +00:00
imp
362fcfc1e2 Start each of the license/copyright comments with /*- 2005-01-05 22:34:37 +00:00
jhb
3ec0dff7ad - Move the function prototypes for kern_setrlimit() and kern_wait() to
sys/syscallsubr.h where all the other kern_foo() prototypes live.
- Resort kern_execve() while I'm there.
2005-01-05 22:19:44 +00:00
jhb
830736d271 Regenerate. 2005-01-04 18:54:40 +00:00
jhb
adb721c906 Partial sync up to the master syscalls.master file:
- Mark mount, unmount and nmount MPSAFE.
- Add a stub for _umtx_op().
- Mark open(), link(), unlink(), and freebsd32_sigaction() MPSAFE.

Pointy hats to:	several
2005-01-04 18:53:32 +00:00
jhb
7b611b0cb2 Stop explicitly touching td_base_pri outside of the scheduler and simply
set a thread's priority via sched_prio() when that is the desired action.
The schedulers will start managing td_base_pri internally shortly.
2004-12-30 20:29:58 +00:00
phk
b0e48f2258 Do not blindly pass linux filesystem specific mount data across. 2004-12-03 18:14:22 +00:00
cperciva
ebbf4e4bde Fix unvalidated pointer dereference. This is FreeBSD-SA-04:17.procfs. 2004-12-01 21:33:02 +00:00
das
130bed6547 Don't include sys/user.h merely for its side-effect of recursively
including other headers.
2004-11-27 06:51:39 +00:00
das
7f13dc5af0 Axe the semblance of support for PECOFF and Linux a.out core dumps. 2004-11-27 06:46:45 +00:00
phk
e2512dff3e Ignore MNT_NODEV option, it is implicit in choice of filesystem. 2004-11-26 07:39:20 +00:00
das
8d8b5ace18 Maintain the broken state of backwards compatibilty for a.out (and
PECOFF!) core dumps.  None of the old versions of gdb I tried were
able to read a.out core dumps before or after this change.

Reviewed by:	arch@
2004-11-20 02:32:04 +00:00
marks
dca89dc1d6 Rebuild from compat/freebsd32/syscalls.master:1.43
Reviewed by:	imp, phk, njl, peter
Approved by:	njl
2004-11-18 23:56:09 +00:00
marks
60b0534c96 32-bit FreeBSD ABI compatibility stubs from syscalls.master:1.179
Reviewed by:	imp, phk, njl, peter
Approved by:	njl
2004-11-18 23:54:26 +00:00
phk
216166ee0d Introduce an alias for FILEDESC_{UN}LOCK() with the suffix _FAST.
Use this in all the places where sleeping with the lock held is not
an issue.

The distinction will become significant once we finalize the exact
lock-type to use for this kind of case.
2004-11-13 11:53:02 +00:00
phk
d09bec0098 Pick up the inode number using VOP_GETATTR() rather than caching it
in all vnodes on the off chance that linprocfs needs it.  If we can afford
to call vn_fullpath() we can afford the much cheaper VOP_GETATTR().
2004-11-10 07:25:37 +00:00
phk
e65c41f01e More sensible FILEDESC_ locking. 2004-11-07 15:59:27 +00:00
rwatson
e095dbaea3 Rebuild from FreeBSD32 syscalls.master:1.42. 2004-10-23 20:05:42 +00:00
rwatson
8bec59acae 32-bit FreeBSD ABI compatibility stubs from syscalls.master:1.178. 2004-10-23 20:04:56 +00:00
peter
09964b7499 Put on my peril sensitive sunglasses and add a flags field to the internal
sysctl routines and state.  Add some code to use it for signalling the need
to downconvert a data structure to 32 bits on a 64 bit OS when requested by
a 32 bit app.

I tried to do this in a generic abi wrapper that intercepted the sysctl
oid's, or looked up the format string etc, but it was a real can of worms
that turned into a fragile mess before I even got it partially working.

With this, we can now run 'sysctl -a' on a 32 bit sysctl binary and have
it not abort.  Things like netstat, ps, etc have a long way to go.

This also fixes a bug in the kern.ps_strings and kern.usrstack hacks.
These do matter very much because they are used by libc_r and other things.
2004-10-11 22:04:16 +00:00
dwmalone
d52e344f9f Rename thread args to be called "td" rather than "p" to be
consistent with other bits of this file. There should be no
functional change.

Submitted by:	Andrea Campi (many moons ago)
MFC after:	2 month
2004-10-10 18:34:30 +00:00
mtm
0a21f474dc Close a race between a thread exiting and the freeing of it's stack.
After some discussion the best option seems to be to signal the thread's
death from within the kernel. This requires that thr_exit() take an
argument.

Discussed with: davidxu, deischen, marcel
MFC after: 3 days
2004-10-06 14:23:00 +00:00
jhb
ce2d3f89af Rework how we store process times in the kernel such that we always store
the raw values including for child process statistics and only compute the
system and user timevals on demand.

- Fix the various kern_wait() syscall wrappers to only pass in a rusage
  pointer if they are going to use the result.
- Add a kern_getrusage() function for the ABI syscalls to use so that they
  don't have to play stackgap games to call getrusage().
- Fix the svr4_sys_times() syscall to just call calcru() to calculate the
  times it needs rather than calling getrusage() twice with associated
  stackgap, etc.
- Add a new rusage_ext structure to store raw time stats such as tick counts
  for user, system, and interrupt time as well as a bintime of the total
  runtime.  A new p_rux field in struct proc replaces the same inline fields
  from struct proc (i.e. p_[isu]ticks, p_[isu]u, and p_runtime).  A new p_crux
  field in struct proc contains the "raw" child time usage statistics.
  ruadd() has been changed to handle adding the associated rusage_ext
  structures as well as the values in rusage.  Effectively, the values in
  rusage_ext replace the ru_utime and ru_stime values in struct rusage.  These
  two fields in struct rusage are no longer used in the kernel.
- calcru() has been split into a static worker function calcru1() that
  calculates appropriate timevals for user and system time as well as updating
  the rux_[isu]u fields of a passed in rusage_ext structure.  calcru() uses a
  copy of the process' p_rux structure to compute the timevals after updating
  the runtime appropriately if any of the threads in that process are
  currently executing.  It also now only locks sched_lock internally while
  doing the rux_runtime fixup.  calcru() now only requires the caller to
  hold the proc lock and calcru1() only requires the proc lock internally.
  calcru() also no longer allows callers to ask for an interrupt timeval
  since none of them actually did.
- calcru() now correctly handles threads executing on other CPUs.
- A new calccru() function computes the child system and user timevals by
  calling calcru1() on p_crux.  Note that this means that any code that wants
  child times must now call this function rather than reading from p_cru
  directly.  This function also requires the proc lock.
- This finishes the locking for rusage and friends so some of the Giant locks
  in exit1() and kern_wait() are now gone.
- The locking in ttyinfo() has been tweaked so that a shared lock of the
  proctree lock is used to protect the process group rather than the process
  group lock.  By holding this lock until the end of the function we now
  ensure that the process/thread that we pick to dump info about will no
  longer vanish while we are trying to output its info to the console.

Submitted by:	bde (mostly)
MFC after:	1 month
2004-10-05 18:51:11 +00:00
jhb
dcd41a1a0d Add a proc *p pointer for td->td_proc to make this code easier to read. 2004-09-24 20:26:15 +00:00
phk
1a87f07f3c Hold thread reference while frobbing cdevsw. 2004-09-24 06:37:00 +00:00
jhb
3956303607 Various small style fixes. 2004-09-22 15:24:33 +00:00
bms
94e51111dc Fix compiler warnings, when __stdcall is #defined, by adding explicit casts.
These normally only manifest if the ndis compat module is statically
compiled into a kernel image by way of 'options NDISAPI'.

Submitted by:	Dmitri Nikulin
Approved by:	wpaul
PR:		kern/71449
MFC after:	1 week
2004-09-17 19:54:26 +00:00
jhb
ac08ecfc54 Regenerate after fcntl() wrappers were marked MP safe. 2004-08-24 20:24:34 +00:00
jhb
cc23ea84d0 Fix the ABI wrappers to use kern_fcntl() rather than calling fcntl()
directly.  This removes a few more users of the stackgap and also marks
the syscalls using these wrappers MP safe where appropriate.

Tested on:	i386 with linux acroread5
Compiled on:	i386, alpha LINT
2004-08-24 20:21:21 +00:00
des
bf69a16558 Don't try to translate the control message unless we're certain it's
valid; otherwise a caller could trick us into changing any 32-bit word
in kernel memory to LINUX_SOL_SOCKET (0x00000001) if its previous value
is SOL_SOCKET (0x0000ffff).

MFC after:	3 days
2004-08-23 12:41:29 +00:00
wpaul
097de72734 I'm a dumbass: remember to initialize fh->nf_map to NULL in
ndis_open_file() in the module loading case.
2004-08-16 19:25:27 +00:00
wpaul
5b5d2c54bc The Texas Instruments ACX111 driver wants srand(), so provide it. 2004-08-16 18:52:37 +00:00
wpaul
9f377407f3 Make the Texas Instruments 802.11g chipset work with the NDISulator.
This was tested with a Netgear WG311v2 802.11b/g PCI card. Things
that were fixed:

- This chip has two memory mapped regions, one at PCIR_BAR(0) and the
  other at PCIR_BAR(1). This is a little different from the other
  chips I've seen with two PCI shared memory regions, since they tend
  to have the second BAR ad PCIR_BAR(2). if_ndis_pci.c tests explicitly
  for PCIR_BAR(2). This has been changed to simply fill in ndis_res_mem
  first and ndis_res_altmem second, if a second shared memory range
  exists. Given that NDIS drivers seem to scan for BARs in ascending
  order, I think this should be ok.

- Fixed the code that tries to process firmware images that have been
  loaded as .ko files. To save a step, I was setting up the address
  mapping in ndis_open_file(), but ndis_map_file() flags pre-existing
  mappings as an error (to avoid duplicate mappings). Changed this so
  that the mapping is now donw in ndis_map_file() as expected.

- Made the typedef for 'driver_entry' explicitly include __stdcall
  to silence gcc warning in ndis_load_driver().

NOTE: the Texas Instruments ACX111 driver needs firmware. With my
card, there were 3 .bin files shipped with the driver. You must
either put these files in /compat/ndis or convert them with
ndiscvt -f and kldload them so the driver can use them. Without
the firmware image, the NIC won't work.
2004-08-16 18:50:20 +00:00
obrien
23e2b54285 Fix the 'DEBUG' argument code to unbreak the amd64 LINT build. 2004-08-16 12:15:07 +00:00
obrien
2e13038823 Fix the 'DEBUG' argument code to unbreak the amd64 LINT build. 2004-08-16 11:12:57 +00:00
obrien
4156b8dbb9 Fix the 'DEBUG' argument code to unbreak the LINT build. 2004-08-16 10:36:12 +00:00
tjr
33d20b8677 Add support for 32-bit Linux binary emulation on amd64:
- include <machine/../linux32/linux.h> instead of <machine/../linux/linux.h>
  if building with the COMPAT_LINUX32 option.
- make minimal changes to the i386 linprocfs_docpuinfo() function to support
  amd64. We return a fake CPU family of 6 for now.
2004-08-16 08:19:18 +00:00
tjr
6d0528abdf Changes to MI Linux emulation code necessary to run 32-bit Linux binaries
on AMD64, and the general case where the emulated platform has different
size pointers than we use natively:
- declare certain structure members as l_uintptr_t and use the new PTRIN
  and PTROUT macros to convert to and from native pointers.
- declare some structures __packed on amd64 when the layout would differ
  from that used on i386.
- include <machine/../linux32/linux.h> instead of <machine/../linux/linux.h>
  if compiling with COMPAT_LINUX32. This will need to be revisited before
  32-bit and 64-bit Linux emulation support can coexist in the same kernel.
- other small scattered changes.

This should be a no-op on i386 and Alpha.
2004-08-16 07:28:16 +00:00
tjr
94699de209 Replace linux_getitimer() and linux_setitimer() with implementations
based on those in freebsd32_misc.c, removing the assumption that Linux
uses the same layout for struct itimerval as we use natively.
2004-08-15 12:34:15 +00:00
tjr
9b0e1093a1 Avoid assuming that l_timeval is the same as the native struct timeval
in linux_select().
2004-08-15 12:24:05 +00:00