freebsd-nq

Author	SHA1	Message	Date
David Xu	9104847f21	1. Change prototype of trapsignal and sendsig to use ksiginfo_t *; most changes in MD code are trivial. Before this change, trapsignal and sendsig used discrete parameters; now they use member fields of the ksiginfo_t structure. For sendsig, this change allows us to pass the POSIX realtime signal value to user code. 2. Remove cpu_thread_siginfo; it is no longer needed because we now always generate ksiginfo_t data and feed it to libpthread. 3. Add p_sigqueue to proc structure to hold shared signals which were blocked by all threads in the proc. 4. Add td_sigqueue to thread structure to hold all signals delivered to thread. 5. i386 and amd64 now return POSIX standard si_code, other arches will be fixed. 6. In this sigqueue implementation, the pending signal set is kept as before; an extra siginfo list holds additional siginfo_t data for signals. Kernel code that uses psignal() still behaves as before: it will not fail even under memory pressure. The only exception is when deleting a signal: we should call sigqueue_delete to remove the signal from the sigqueue, not SIGDELSET. Currently no kernel code delivers a signal with additional data, so the kernel should be as stable as before. A ksiginfo can carry more information, for example, allow a signal to be delivered but throw away siginfo data if memory is not enough. SIGKILL and SIGSTOP have a fast path in sigqueue_add, because they cannot be caught or masked. The sigqueue() syscall allows user code to queue a signal to a target process; if resources are unavailable, EAGAIN will be returned as the specification says. Just before a thread exits, signal queue memory will be freed by sigqueue_flush. Currently, all signals are allowed to be queued, not only realtime signals. Earlier patch reviewed by: jhb, deischen Tested on: i386, amd64	2005-10-14 12:43:47 +00:00
Bill Paul	85c13a8375	Convert ndis_set_info() and ndis_get_info() from using msleep() to KeSetEvent()/KeWaitForSingleObject(). Also make object argument of KeWaitForSingleObject() a void * like it's supposed to be.	2005-10-12 03:02:50 +00:00
Bill Paul	21628ddbd6	This commit makes a big round of updates and fixes many, many things. First and most importantly, I threw out the thread priority-twiddling implementation of KeRaiseIrql()/KeLowerIrq()/KeGetCurrentIrql() in favor of a new scheme that uses sleep mutexes. The old scheme was really very naughty and sought to provide the same behavior as Windows spinlocks (i.e. blocking pre-emption) but in a way that wouldn't raise the ire of WITNESS. The new scheme represents 'DISPATCH_LEVEL' as the acquisition of a per-cpu sleep mutex. If a thread on cpu0 acquires the 'dispatcher mutex,' it will block any other thread on the same processor that tries to acquire it, in effect only allowing one thread on the processor to be at 'DISPATCH_LEVEL' at any given time. It can then do the 'atomic sit and spin' routine on the spinlock variable itself. If a thread on cpu1 wants to acquire the same spinlock, it acquires the 'dispatcher mutex' for cpu1 and then it too does an atomic sit and spin to try acquiring the spinlock. Unlike real spinlocks, this does not disable pre-emption of all threads on the CPU, but it does put any threads involved with the NDISulator to sleep, which is just as good for our purposes. This means I can now play nice with WITNESS, and I can safely do things like call malloc() when I'm at 'DISPATCH_LEVEL,' which you're allowed to do in Windows. Next, I completely re-wrote most of the event/timer/mutex handling and wait code. KeWaitForSingleObject() and KeWaitForMultipleObjects() have been re-written to use condition variables instead of msleep(). This allows us to use the Windows convention whereby thread A can tell thread B "wake up with a boosted priority." (With msleep(), you instead have thread B saying "when I get woken up, I'll use this priority here," and thread A can't tell it to do otherwise.) The new KeWaitForMultipleObjects() has been better tested and better duplicates the semantics of its Windows counterpart. 
I also overhauled the IoQueueWorkItem() API and underlying code. Like KeInsertQueueDpc(), IoQueueWorkItem() must ensure that the same work item isn't put on the queue twice. ExQueueWorkItem(), which in my implementation is built on top of IoQueueWorkItem(), was also modified to perform a similar test. I renamed the doubly-linked list macros to give them the same names as their Windows counterparts and fixed RemoveListTail() and RemoveListHead() so they properly return the removed item. I also corrected the list handling code in ntoskrnl_dpc_thread() and ntoskrnl_workitem_thread(). I realized that the original logic did not correctly handle the case where a DPC callout tries to queue up another DPC. It works correctly now. I implemented IoConnectInterrupt() and IoDisconnectInterrupt() and modified NdisMRegisterInterrupt() and NdisMDisconnectInterrupt() to use them. I also tried to duplicate the interrupt handling scheme used in Windows. The interrupt handling is now internal to ndis.ko, and the ndis_intr() function has been removed from if_ndis.c. (In the USB case, interrupt handling isn't needed in if_ndis.c anyway.) NdisMSleep() has been rewritten to use a KeWaitForSingleObject() and a KeTimer, which is how it works in Windows. (This is mainly to ensure that the NDISulator uses the KeTimer API so I can spot any problems with it that may arise.) KeCancelTimer() has been changed so that it only cancels timers, and does not attempt to cancel a DPC if the timer managed to fire and queue one up before KeCancelTimer() was called. The Windows DDK documentation seems to imply that KeCancelTimer() will also call KeRemoveQueueDpc() if necessary, but it really doesn't. The KeTimer implementation has been rewritten to use the callout API directly instead of timeout()/untimeout(). I still cheat a little in that I have to manage my own small callout timer wheel, but the timer code works more smoothly now. 
I discovered a race condition using timeout()/untimeout() with periodic timers where untimeout() fails to actually cancel a timer. I don't quite understand where the race is, using callout_init()/callout_reset()/callout_stop() directly seems to fix it. I also discovered and fixed a bug in winx32_wrap.S related to translating _stdcall calls. There are a couple of routines (i.e. the 64-bit arithmetic intrinsics in subr_ntoskrnl) that return 64-bit quantities. On the x86 arch, 64-bit values are returned in the %eax and %edx registers. However, it happens that the ctxsw_utow() routine uses %edx as a scratch register, and x86_stdcall_wrap() and x86_stdcall_call() were only preserving %eax before branching to ctxsw_utow(). This means %edx was getting clobbered in some cases. Curiously, the most noticeable effect of this bug is that the driver for the TI AXC110 chipset would constantly drop and reacquire its link for no apparent reason. Both %eax and %edx are preserved on the stack now. The _fastcall and _regparm wrappers already handled everything correctly. I changed if_ndis to use IoAllocateWorkItem() and IoQueueWorkItem() instead of the NdisScheduleWorkItem() API. This is to avoid possible deadlocks with any drivers that use NdisScheduleWorkItem() themselves. The unicode/ansi conversion handling code has been cleaned up. The internal routines have been moved to subr_ntoskrnl and the RtlXXX routines have been exported so that subr_ndis can call them. This removes the incestuous relationship between the two modules regarding this code and fixes the implementation so that it honors the 'maxlen' fields correctly. (Previously it was possible for NdisUnicodeStringToAnsiString() to possibly clobber memory it didn't own, which was causing many mysterious crashes in the Marvell 8335 driver.) 
The registry handling code (NdisOpen/Close/ReadConfiguration()) has been fixed to allocate memory for all the parameters it hands out to callers and delete them when NdisCloseConfiguration() is called. (Previously, it would secretly use a single static buffer.) I also substantially updated if_ndis so that the source can now be built on FreeBSD 7, 6 and 5 without any changes. On FreeBSD 5, only WEP support is enabled. On FreeBSD 6 and 7, WPA-PSK support is enabled. The original WPA code has been updated to fit in more cleanly with the net80211 API, and to eliminate the use of magic numbers. The ndis_80211_setstate() routine now sets a default authmode of OPEN and initializes the RTS threshold and fragmentation threshold. The WPA routines were changed so that the authentication mode is always set first, followed by the cipher. Some drivers depend on the operations being performed in this order. I also added passthrough ioctls that allow application code to directly call the MiniportSetInformation()/MiniportQueryInformation() methods via ndis_set_info() and ndis_get_info(). The ndis_linksts() routine also caches the last 4 events signalled by the driver via NdisMIndicateStatus(), and they can be queried by an application via a separate ioctl. This is done to allow wpa_supplicant to directly program the various crypto and key management options in the driver, allowing things like WPA2 support to work. Whew.	2005-10-10 16:46:39 +00:00
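The %edx clobbering described in the commit above can be pictured with a small sketch. On i386 a 64-bit return value is carried as two 32-bit halves, the low half in %eax and the high half in %edx, so losing %edx silently changes the upper 32 bits of the result. The helper name `join_edx_eax` is hypothetical and only models the register pair in plain C; it is not code from winx32_wrap.S.

```c
#include <stdint.h>

/*
 * Model of how an i386 caller reassembles a 64-bit return value:
 * %edx holds the high 32 bits, %eax holds the low 32 bits.
 * (Hypothetical helper for illustration only.)
 */
uint64_t
join_edx_eax(uint32_t edx, uint32_t eax)
{
	return (((uint64_t)edx << 32) | (uint64_t)eax);
}
```

If a wrapper preserves only %eax, the caller effectively sees `join_edx_eax(garbage, eax)` in place of the intended value, which is consistent with the commit's note that both registers are now saved on the stack.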
John Baldwin	f2107e8d54	Use the constants for the syscall names from syscall.h rather than hardcoding the numbers for the SYSVIPC syscalls.	2005-10-03 18:34:17 +00:00
Robert Watson	5f419982c2	Back out alpha/alpha/trap.c:1.124, osf1_ioctl.c:1.14, osf1_misc.c:1.57, osf1_signal.c:1.41, amd64/amd64/trap.c:1.291, linux_socket.c:1.60, svr4_fcntl.c:1.36, svr4_ioctl.c:1.23, svr4_ipc.c:1.18, svr4_misc.c:1.81, svr4_signal.c:1.34, svr4_stat.c:1.21, svr4_stream.c:1.55, svr4_termios.c:1.13, svr4_ttold.c:1.15, svr4_util.h:1.10, ext2_alloc.c:1.43, i386/i386/trap.c:1.279, vm86.c:1.58, unaligned.c:1.12, imgact_elf.c:1.164, ffs_alloc.c:1.133: Now that Giant is acquired in uprintf() and tprintf(), the caller no longer needs to acquire Giant unless it also holds another mutex that would generate a lock order reversal when calling into these functions. Specifically not backed out is the acquisition of Giant in nfs_socket.c and rpcclnt.c, where local mutexes are held and would otherwise violate the lock order with Giant. This aligns this code more with the eventual locking of ttys. Suggested by: bde	2005-09-28 07:03:03 +00:00
Peter Wemm	a11ea6e325	Regenerate	2005-09-27 18:04:52 +00:00
Peter Wemm	add121a476	Implement 32 bit getcontext/setcontext/swapcontext on amd64. I've added stubs for ia64 to keep it compiling. These are used by 32 bit apps such as gdb.	2005-09-27 18:04:20 +00:00
Robert Watson	84d2b7df26	Add GIANT_REQUIRED and WITNESS sleep warnings to uprintf() and tprintf(), as they both interact with the tty code (!MPSAFE) and may sleep if the tty buffer is full (per comment). Modify all consumers of uprintf() and tprintf() to hold Giant around calls into these functions. In most cases, this means adding an acquisition of Giant immediately around the function. In some cases (nfs_timer()), it means acquiring Giant higher up in the callout. With these changes, UFS no longer panics on SMP when either blocks are exhausted or inodes are exhausted under load due to races in the tty code when running without Giant. NB: Some reduction in calls to uprintf() in the svr4 code is probably desirable. NB: In the case of nfs_timer(), calling uprintf() while holding a mutex, or even in a callout at all, is a bad idea, and will generate warnings and potential upset. This needs to be fixed, but was a problem before this change. NB: uprintf()/tprintf() sleeping is generally a bad idea, as is having non-MPSAFE tty code. MFC after: 1 week	2005-09-19 16:51:43 +00:00
Andre Oppermann	e72b668b69	Test the mbuf flags against the correct constant. The previous version worked as intended but only by chance. MT_HEADER == M_PKTHDR == 0x2.	2005-08-30 16:21:51 +00:00
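A brief sketch of why the earlier check mentioned in the commit above still passed: when two unrelated constants share a value, masking with the wrong one gives the same answer. The values follow the commit message (MT_HEADER == M_PKTHDR == 0x2). The `EX_` prefixed names, `struct pkt`, and both helpers are hypothetical stand-ins, not the kernel's mbuf definitions.

```c
#include <stdint.h>

/* Values as stated in the commit message; EX_ names are illustrative. */
#define EX_MT_HEADER 0x2	/* an mbuf type, not meant to be used as a flag */
#define EX_M_PKTHDR  0x2	/* the packet-header flag */

/* Hypothetical stand-in for the flags field of an mbuf. */
struct pkt {
	uint32_t flags;
};

/* Tests the flags against the type constant: right answer, wrong reason. */
int
has_pkthdr_by_chance(const struct pkt *p)
{
	return ((p->flags & EX_MT_HEADER) != 0);
}

/* Tests the flags against the flag constant, as the commit intends. */
int
has_pkthdr(const struct pkt *p)
{
	return ((p->flags & EX_M_PKTHDR) != 0);
}
```

Both helpers agree only while the two constants happen to be equal; if either value changed, the first would quietly start testing the wrong bit.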
Xin LI	e68796868a	Fix kernel build. Reported by: tinderbox	2005-08-28 13:11:08 +00:00
Craig Rodrigues	8739cd44d0	Rewrite linux_ifconf() to be more like ifconf() in net/if.c so that we do not call uiomove() while IFNET_RLOCK() is held. This eliminates the witness warning: Calling uiomove() with the following non-sleepable locks held: exclusive sleep mutex ifnet r = 0 (0xc096dd60) locked @ /usr/src/sys/modules/linux/../../compat/linux/linux_ioctl.c:2170 MFC after: 2 days	2005-08-27 14:44:10 +00:00
Robert Watson	13f4c340ae	Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to ifnet.if_drv_flags. Device drivers are now responsible for synchronizing access to these flags, as they are in if_drv_flags. This helps prevent races between the network stack and device driver in maintaining the interface flags field. Many __FreeBSD__ and __FreeBSD_version checks maintained and continued; some less so. Reviewed by: pjd, bz MFC after: 7 days	2005-08-09 10:20:02 +00:00
John Baldwin	ec1f24a934	Add missing dependencies on the SYSVIPC modules.	2005-07-29 19:41:04 +00:00
John Baldwin	813a5e14ec	Move MODULE_DEPEND() statements for SYSVIPC dependencies to linux_ipc.c so that they aren't duplicated 3 times and are also in the same file as the code that depends on the SYSVIPC modules.	2005-07-29 19:40:39 +00:00
John Baldwin	ac5ee935dd	Regen.	2005-07-13 20:35:09 +00:00
John Baldwin	8683e7fdc1	Make a pass through all the compat ABIs synchronizing the MP safe flags with the master syscall table as well as marking several ABI wrapper functions safe. MFC after: 1 week	2005-07-13 20:32:42 +00:00
John Baldwin	6e9b02cf80	Regen.	2005-07-13 15:14:54 +00:00
John Baldwin	2773347338	- Stop hardcoding #define's for options and use the appropriate opt_foo.h headers instead. - Hook up the IPC SVR4 syscalls. MFC after: 3 days	2005-07-13 15:14:33 +00:00
John Baldwin	fa34d9b7a5	Wrap the ia64-specific freebsd32_mmap_partial() hack in Giant for now since it calls into VFS and VM. This makes the freebsd32_mmap() routine MP safe and the extra Giants here can be revisited later. Glanced at by: marcel MFC after: 3 days	2005-07-13 15:12:19 +00:00
John Baldwin	02295eedc7	Add Giant around linux_getcwd_common() in linux_getcwd(). Approved by: re (scottl)	2005-07-09 12:34:49 +00:00
John Baldwin	4641373fde	Add missing locking to linux_connect() so that it can be marked MP safe: - Conditionally grab Giant around the EISCONN hack at the end based on debug.mpsafenet. - Protect access to so_emuldata via SOCK_LOCK. Reviewed by: rwatson Approved by: re (scottl)	2005-07-09 12:26:22 +00:00
Roman Kurakin	fbb7165a4b	Use implicit type cast for ->k_lock to fix compilation of ndis as a part of the GENERIC kernel with INVARIANT* and WITNESS* turned off. (For non GENERIC kernel KTR and MUTEX_PROFILING should be also off). Submitted by: Eygene A. Ryabinkin <rea at rea dot mbslab dot kiae dot ru> Approved by: re (scottl) PR: 81767	2005-07-08 18:36:59 +00:00
John Baldwin	55522478e6	Lock Giant in svr4_add_socket() so that the various svr4_*stat() calls can be marked MP safe as this is the only part of them that is not already MP safe. Approved by: re (scottl)	2005-07-07 19:27:29 +00:00
John Baldwin	03badf38ab	Remove an unused syscallarg() macro leftover from this code's origins in NetBSD. Approved by: re (scottl)	2005-07-07 19:26:43 +00:00
John Baldwin	07fac65b15	Rototill this file so that it actually compiles. It doesn't do anything in the build still due to some #undef's in svr4.h, but if you hack around that and add some missing entries to syscalls.master, then this file will now compile. The changes involved proc -> thread, using FreeBSD syscall names instead of NetBSD, and axeing syscallarg() and retval arguments. Approved by: re (scottl)	2005-07-07 19:25:47 +00:00
John Baldwin	8d948cd1ec	Fix the computation of uptime for linux_sysinfo(). Before it was returning the uptime in seconds mod 60 which wasn't very useful. Approved by: re (scottl)	2005-07-07 19:17:55 +00:00
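To illustrate the symptom described in the commit above, here is a minimal sketch contrasting an uptime value reduced modulo 60 with the full number of seconds. The log does not show the original expression, so both function names and the use of `struct timespec` are assumptions made purely for illustration.

```c
#include <time.h>

/*
 * Hypothetical reconstruction of the symptom: only the seconds within
 * the current minute survive, so the value wraps every 60 seconds.
 */
long
uptime_mod60(const struct timespec *ts)
{
	return ((long)(ts->tv_sec % 60));
}

/* Whole seconds since boot, which is what a sysinfo consumer expects. */
long
uptime_seconds(const struct timespec *ts)
{
	return ((long)ts->tv_sec);
}
```

For an uptime of 3725 seconds (1 hour, 2 minutes, 5 seconds), the first helper yields 5 while the second yields 3725.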
John Baldwin	9f3157a254	Regenerate. Approved by: re (scottl)	2005-07-07 18:20:38 +00:00
John Baldwin	bcd9e0dd20	- Add two new system calls: preadv() and pwritev() which are like readv() and writev() except that they take an additional offset argument and do not change the current file position. In SAT speak: preadv:readv::pread:read and pwritev:writev::pwrite:write. - Try to reduce code duplication some by merging most of the old kern_foov() and dofilefoo() functions into new dofilefoo() functions that are called by kern_foov() and kern_pfoov(). The non-v functions now all generate a simple uio on the stack from the passed in arguments and then call kern_foov(). For example, read() now just builds a uio and calls kern_readv() and pwrite() just builds a uio and calls kern_pwritev(). PR: kern/80362 Submitted by: Marc Olzheim marcolz at stack dot nl (1) Approved by: re (scottl) MFC after: 1 week	2005-07-07 18:17:55 +00:00
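The preadv() behaviour described in the commit above can be sketched from userland: a scatter read at an explicit offset that leaves the file position alone. The function name `preadv_keeps_offset` and the use of a temporary file are assumptions for this example; the `_DEFAULT_SOURCE` define is only there so the prototype is visible on glibc systems.

```c
#define _DEFAULT_SOURCE	/* exposes preadv() on glibc; harmless elsewhere */
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Write a known string to a temporary file, then read bytes 4..11 back
 * with preadv() into two 4-byte buffers. Returns 1 if the data matched
 * and the file position was left untouched, 0 otherwise.
 */
int
preadv_keeps_offset(void)
{
	static const char data[] = "0123456789abcdef";
	char first[4], second[4];
	struct iovec iov[2];
	off_t before, after;
	FILE *fp;
	int fd, ok;

	fp = tmpfile();
	if (fp == NULL)
		return (0);
	fd = fileno(fp);
	if (write(fd, data, sizeof(data) - 1) != (ssize_t)(sizeof(data) - 1)) {
		fclose(fp);
		return (0);
	}

	iov[0].iov_base = first;
	iov[0].iov_len = sizeof(first);
	iov[1].iov_base = second;
	iov[1].iov_len = sizeof(second);

	before = lseek(fd, 0, SEEK_CUR);	/* position after the write */
	ok = preadv(fd, iov, 2, 4) == 8;	/* explicit offset 4, 8 bytes */
	after = lseek(fd, 0, SEEK_CUR);		/* should equal 'before' */

	ok = ok && memcmp(first, "4567", 4) == 0 &&
	    memcmp(second, "89ab", 4) == 0 && before == after;
	fclose(fp);
	return (ok);
}
```

A plain readv() on the same descriptor would instead read from the current position and advance it, which is the distinction the commit's analogy (preadv:readv::pread:read) describes.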
Peter Wemm	62919d788b	Jumbo-commit to enhance 32 bit application support on 64 bit kernels. This is good enough to be able to run a RELENG_4 gdb binary against a RELENG_4 application, along with various other tools (eg: 4.x gcore). We use this at work. ia32_reg.[ch]: handle the 32 bit register file format, used by ptrace, procfs and core dumps. procfs_regs.c: vary the format of proc/XXX/regs depending on the client and target application. procfs_map.c: Don't print a 64 bit value to 32 bit consumers, or their sscanf fails. They expect an unsigned long. imgact_elf.c: produce a valid 32 bit coredump for 32 bit apps. sys_process.c: handle 32 bit consumers debugging 32 bit targets. Note that 64 bit consumers can still debug 32 bit targets. IA64 has got stubs for ia32_reg.c. Known limitations: a 5.x/6.x gdb uses get/setcontext(), which isn't implemented in the 32/64 wrapper yet. We also make a tiny patch to gdb to pacify it over conflicting formats of ld-elf.so.1. Approved by: re	2005-06-30 07:49:22 +00:00
John Baldwin	19042f9cce	- Change the commented out freebsd32_xxx() example to use kern_xxx() along with a single copyin() + translate and translate + copyout() rather than using the stackgap. - Remove implementation of the stackgap for freebsd32 since it is no longer used for that compat ABI. Approved by: re (scottl)	2005-06-29 15:16:20 +00:00
John Baldwin	de1c01ad37	Correct the amount of data to allocate in these local copies of exec_copyin_strings() to catch up to rev 1.266 of kern_exec.c. This fixes panics on amd64 with compat binaries since exec_free_args() was freeing more memory than these functions were allocating and the mismatch could cause memory to be freed out from under other concurrent execs. Approved by: re (scottl)	2005-06-24 17:41:28 +00:00
Pawel Jakub Dawidek	06a137780b	Actually only protect mount-point if security.jail.enforce_statfs is set to 2. If we don't return statistics about requested file systems, system tools may not work correctly or at all. Approved by: re (scottl)	2005-06-23 22:13:29 +00:00
Pawel Jakub Dawidek	3a996d6e91	Do not allocate memory based on not-checked argument from userland. It can be used to panic the kernel by giving too big value. Fix it by moving allocation and size verification into kern_getfsstat(). This even simplifies kern_getfsstat() consumers, but destroys symmetry - memory is allocated inside kern_getfsstat(), but has to be freed by the caller. Found by: FreeBSD Kernel Stress Test Suite: http://www.holm.cc/stress/ Reported by: Peter Holm <peter@holm.cc>	2005-06-11 14:58:20 +00:00
Brooks Davis	fc74a9f93a	Stop embedding struct ifnet at the top of driver softcs. Instead the struct ifnet or the layer 2 common structure it was embedded in have been replaced with a struct ifnet pointer to be filled by a call to the new function, if_alloc(). The layer 2 common structure is also allocated via if_alloc() based on the interface type. It is hung off the new struct ifnet member, if_l2com. This change removes the size of these structures from the kernel ABI and will allow us to better manage them as interfaces come and go. Other changes of note: - Struct arpcom is no longer referenced in normal interface code. Instead the Ethernet address is accessed via the IFP2ENADDR() macro. To enforce this ac_enaddr has been renamed to _ac_enaddr. - The second argument to ether_ifattach is now always the mac address from driver private storage rather than sometimes being ac_enaddr. Reviewed by: sobomax, sam	2005-06-10 16:49:24 +00:00
Pawel Jakub Dawidek	820a0de9a9	Rename sysctl security.jail.getfsstatroot_only to security.jail.enforce_statfs and extend its functionality: value policy 0 show all mount-points without any restrictions 1 show only mount-points below jail's chroot and show only part of the mount-point's path (if jail's chroot directory is /jails/foo and mount-point is /jails/foo/usr/home only /usr/home will be shown) 2 show only mount-point where jail's chroot directory is placed. Default value is 2. Discussed with: rwatson	2005-06-09 18:49:19 +00:00
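The path trimming described for policy value 1 in the commit above can be sketched as a prefix strip. This is a minimal userland illustration under stated assumptions: `jail_visible_path` is a hypothetical helper, both paths are assumed to be absolute and already normalized, and the kernel's actual logic is not shown in this log.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the part of mount point 'mnt' that is visible inside a jail
 * whose chroot directory is 'root', or NULL when 'mnt' is not at or
 * below 'root'. Hypothetical helper for illustration only.
 */
const char *
jail_visible_path(const char *root, const char *mnt)
{
	size_t len = strlen(root);

	/* Ignore a trailing slash so "/" and "/jails/foo/" both work. */
	if (len > 0 && root[len - 1] == '/')
		len--;
	if (strncmp(root, mnt, len) != 0)
		return (NULL);
	if (mnt[len] == '\0')
		return ("/");		/* the jail's own root */
	if (mnt[len] != '/')
		return (NULL);		/* e.g. /jails/foobar vs /jails/foo */
	return (mnt + len);
}
```

With the commit's own example, a chroot of /jails/foo and a mount point of /jails/foo/usr/home yields /usr/home.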
Pawel Jakub Dawidek	13a82b9623	Avoid code duplication in several places by introducing universal kern_getfsstat() function. Obtained from: jhb	2005-06-09 17:44:46 +00:00
Maxim Sobolev	bc165ab0fe	Properly convert FreeBSD priority values into Linux values in the getpriority(2) syscall. PR: kern/81951 Submitted by: Andriy Gapon <avg@icyb.net.ua>	2005-06-08 20:41:28 +00:00
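A short sketch of the kind of conversion the commit above refers to. The Linux getpriority system call reports 20 minus the nice value, so that a successful result is never negative, whereas the FreeBSD call reports the nice value itself. The helper name is hypothetical, and the exact expression used in the commit is not shown in this log, so treat this as an assumption about its shape.

```c
/*
 * Convert a FreeBSD-style priority (nice value, lower is more
 * favourable) into the value the Linux getpriority syscall reports,
 * which is 20 - nice. Hypothetical helper for illustration only.
 */
int
bsd_to_linux_getpriority(int bsd_prio)
{
	return (20 - bsd_prio);
}
```

Under this mapping a nice value of 0 is reported as 20, -20 as 40, and 19 as 1.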
Paul Saab	efe5becafa	Wrap copyin/copyout for kevent so the 32bit wrapper does not have to malloc nchanges * sizeof(struct kevent) AND/OR nevents * sizeof(struct kevent) on every syscall. Glanced at by: peter, jmg Obtained from: Yahoo! MFC after: 2 weeks	2005-06-03 23:15:01 +00:00
Robert Watson	3984b2328c	Rebuild generated system call definition files following the addition of the audit event field to the syscalls.master file format. Submitted by: wsalamon Obtained from: TrustedBSD Project	2005-05-30 15:20:21 +00:00
Robert Watson	f3596e3370	Introduce a new field in the syscalls.master file format to hold the audit event identifier associated with each system call, which will be stored by makesyscalls.sh in the sy_auevent field of struct sysent. For now, default the audit identifier on all system calls to AUE_NULL, but in the near future, other BSM event identifiers will be used. The mapping of system calls to event identifiers is many:one due to multiple system calls that map to the same end functionality across compatibility wrappers, ABI wrappers, etc. Submitted by: wsalamon Obtained from: TrustedBSD Project	2005-05-30 15:09:18 +00:00
Yoshihiro Takahashi	d4fcf3cba5	Remove bus_{mem,p}io.h and related code for a micro-optimization on i386 and amd64. The optimization is trivial on recent machines. Reviewed by: -arch (imp, marcel, dfr)	2005-05-29 04:42:30 +00:00
Pawel Jakub Dawidek	d0cad55da8	Remove (now) unused argument 'td' from bsd_to_linux_statfs().	2005-05-27 19:25:39 +00:00
Paul Saab	473dd55f2e	Copyout to userland if kern_sigaction succeeds	2005-05-24 17:52:14 +00:00
Pawel Jakub Dawidek	672d95c55d	The code is under '#ifdef not_that_way', but anyway: - Add missing prison_check_mount() check.	2005-05-22 22:30:31 +00:00
Pawel Jakub Dawidek	a0e96a49df	If we need to hide fsid, kern_statfs()/kern_fstatfs() will do it for us, so do not duplicate the code in cvtstatfs(). Note, that we now need to clear fsid in freebsd4_getfsstat(). This moves all security related checks from functions like cvtstatfs() and will allow to add more security related stuff (like statfs(2), etc. protection for jails) a bit easier.	2005-05-22 21:52:30 +00:00
Bill Paul	0b6c3bf1bc	Missed kern_windrv.c in the last checkin.	2005-05-20 04:01:36 +00:00
Bill Paul	450a94af7a	Deal with a few bootstrap issues: We can't call KeFlushQueuedDpcs() during bootstrap (cold == 1), since the flush operation sleeps to wait for completion, and we can't sleep here (clowns will eat us). On an i386 SMP system, if we're loaded/probed/attached during bootstrap, smp_rendezvous() won't run us anywhere except CPU 0 (since the other CPUs aren't launched until later), which means we won't be able to set up the GDTs anywhere except CPU 0. To deal with this case, ctxsw_utow() now checks to see if the TID for the current processor has been properly initialized and sets up the GDT for the current CPU if not. Lastly, in if_ndis.c:ndis_shutdown(), do an ndis_stop() to ensure we really halt the NIC and stop interrupts from happening. Note that loading a driver during bootstrap is, unfortunately, kind of a hit or miss sort of proposition. In Windows, the expectation is that by the time a given driver's MiniportInitialize() method is called, the system is already in 'multiuser' state, i.e. it's up and running enough to support all the stuff specified in the NDIS API, which includes the underlying OS-supplied facilities it implicitly depends on, such as having all CPUs running, having the DPC queues initialized, WorkItem threads running, etc. But in UNIX, a lot of that stuff won't work during bootstrap. This causes a problem since we need to call MiniportInitialize() at least once during ndis_attach() in order to find out what kind of NIC we have and learn its station address. What this means is that some cards just plain won't work right if you try to pre-load the driver along with the kernel: they'll only be probed/attached correctly if the driver is kldloaded _after_ the system has reached multiuser. I can't really think of a way around this that would still preserve the ability to use an NDIS device for diskless booting.	2005-05-20 04:00:50 +00:00
Bill Paul	cebddbda3b	In ndis_halt_nic(), invalidate the miniportadapterctx early to try and prevent anything from making calls to the NIC while it's being shut down. This is yet another attempt to stop things like mdnsd from trying to poke at the card while it's not properly initialized and panicking the system. Also, remove unneeded debug message from if_ndis.c.	2005-05-20 02:35:43 +00:00
Bill Paul	0621191ab9	Fix some of the things I broke so that the SMC2602W (AMD Am1772) driver works again. This driver uses NdisScheduleWorkItem(), and we have to take special steps to insure that its workitems don't collide with any of the other workitems used by the NDISulator. In particular, if one of the driver's work jobs blocks, it can prevent NdisMAllocateSharedMemoryAsync() from completing when expected. The original hack to fix this was to have NdisMAllocateSharedMemoryAsync() defer its work to the DPC queue instead of the general task queue. To fix it now, I decided to add some additional workitem threads. (There's supposed to be a pool of worker threads in Windows anyway.) Currently, there are 4. There should be at least 2. One is reserved for the legacy ExQueueWorkItem() API, while the others are used in round-robin by the IoQueueWorkItem() API. NdisMAllocateSharedMemoryAsync() uses the latter API while NdisScheduleWorkItem() uses the former, so the deadlock is avoided. Fixed NdisMRegisterDevice()/NdisMDeregisterDevice() to work a little more sensibly with the new driver_object/device_object framework. It doesn't really register a working user-mode interface, but the existing code was completely wrong for the new framework. Fixed a couple of bugs dealing with the cancellation of events and DPCs. When cancelling an event that's still on the timer queue (i.e. hasn't expired yet), reset dh_inserted in its dispatch header to FALSE. Previously, it was left set to TRUE, which would make a cancelled timer appear to have not been cancelled. Also, when removing a DPC from a queue, reset its list pointers, otherwise a cancelled DPC might mistakenly be treated as still pending. Lastly, fix the behavior of ntoskrnl_wakeup() when dealing with objects that have nobody waiting on them: sync event objects get their signalled state reset to FALSE, but notification objects should still be set to TRUE.	2005-05-19 04:44:26 +00:00
Bill Paul	5b5687f6ba	Remove harmless bit of leftover debug code.	2005-05-16 15:44:41 +00:00
Bill Paul	d9ccba1ac4	Correct some problems with workitem usage. NdisScheduleWorkItem() does not use exactly the same workitem structure as ExQueueWorkItem() like I originally thought it did.	2005-05-16 15:29:21 +00:00
Bill Paul	433d61bb56	Add support for NdisMEthIndicateReceive() and MiniportTransferData(). The Ralink RT2500 driver uses this API instead of NdisMIndicateReceivePacket(). Drivers use NdisMEthIndicateReceive() when they know they support 802.3 media and expect to hand their packets only to protocols that want to deal with that particular media type. With this API, the driver does not manage its own NDIS_PACKET/NDIS_BUFFER structures. Instead, it lets bound protocols have a peek at the data, and then they supply an NDIS_PACKET/NDIS_BUFFER combo to the miniport driver, into which it copies the packet data. Drivers use NdisMIndicateReceivePacket() to allow their packets to be read by any protocol, not just those bound to 802.3 media devices. To make this work, we need an internal pool of NDIS_PACKETS for receives. Currently, we check to see if the driver exports a MiniportTransferData() method in its characteristics structure, and only allocate the pool for drivers that have this method. This should allow the RT2500 driver to work correctly, though I still have to fix ndiscvt(8) to parse its .inf file properly. Also, change kern_ndis.c:ndis_halt_nic() to reap timers before acquiring NDIS_LOCK(), since the reaping process might entail sleeping briefly (and we can't sleep with a lock held).	2005-05-15 04:27:59 +00:00
Bill Paul	239a676456	More fixes for multibus drivers. When calling out to the match function in if_ndis_pci.c and if_ndis_pccard.c, provide the bustype too so the stubs can ignore devlists that don't concern them.	2005-05-08 23:19:20 +00:00
Bill Paul	6169e4d097	Fix support for Windows drivers that support both PCI and PCMCIA devices at the same time. Fix if_ndis_pccard.c so that it sets sc->ndis_dobj and sc->ndis_regvals. Correct IMPORT_SFUNC() macros for the READ_PORT_BUFFER_xxx() routines, which take 3 arguments, not 2. This fixes it so that the Windows driver for my Cisco Aironet 340 PCMCIA card works again. (Yes, I know the an(4) driver supports this card natively, but it's the only PCMCIA device I have with a Windows XP driver.)	2005-05-08 23:07:51 +00:00
Bill Paul	0ad8336bc5	Correct the patch table entries for the 64-bit intrinsic math routines (_alldiv(), _allmul(), _alludiv(), _aullmul(), etc...) that use the _stdcall calling convention. These routines all take two arguments, but the arguments are 64 bits wide. On the i386 this means they each consume two 32-bit slots on the stack. Consequently, when we specify the argument count in the IMPORT_SFUNC() macro, we have to lie and claim there are 4 arguments instead of two. This will cause the resulting i386 assembly wrapper to push the right number of longwords onto the stack. This fixes a crash I discovered with the RealTek 8180 driver, which uses these routines a lot during initialization.	2005-05-08 09:16:33 +00:00
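The argument-count arithmetic in the commit above can be sketched as follows: on i386 each stack slot is one 32-bit longword, so a 64-bit argument occupies two slots and a routine taking two 64-bit arguments must be declared as if it took four. The helper `stdcall_arg_slots` and the `SLOT_BYTES` constant are hypothetical names used for this illustration; they are not part of the IMPORT_SFUNC() machinery.

```c
#include <stddef.h>

#define SLOT_BYTES 4	/* one i386 stack slot (a 32-bit longword) */

/*
 * Count how many 32-bit stack slots a list of arguments occupies,
 * given each argument's size in bytes. Sizes are rounded up to a
 * whole number of slots. Hypothetical helper for illustration only.
 */
size_t
stdcall_arg_slots(const size_t *arg_sizes, size_t nargs)
{
	size_t i, slots = 0;

	for (i = 0; i < nargs; i++)
		slots += (arg_sizes[i] + SLOT_BYTES - 1) / SLOT_BYTES;
	return (slots);
}
```

For a routine such as _alldiv(), which takes two 64-bit arguments, the sizes {8, 8} give 4 slots, matching the count the commit says has to be passed to the macro.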
Bill Paul	2f60d4f83f	Cast 64 bit quantity to uintmax_t to print it with %jx. This is technically a no-op since uintmax_t is uint64_t on all currently supported architectures, but we should use an explicit cast instead of depending on this obscure coincidence.	2005-05-05 22:33:06 +00:00
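A minimal sketch of the idiom the commit above describes: the `%jx` conversion expects a `uintmax_t`, so a 64-bit value is cast explicitly and the code does not rely on the two types happening to coincide. The function name `hex64` and its static buffer are assumptions made for this example.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 64-bit value as lowercase hex using %jx with an explicit
 * cast to uintmax_t. Returns a pointer to a static buffer, so the
 * result is overwritten by the next call. Hypothetical helper.
 */
const char *
hex64(uint64_t v)
{
	static char buf[2 * sizeof(uintmax_t) + 1];

	snprintf(buf, sizeof(buf), "%jx", (uintmax_t)v);
	return (buf);
}
```

The cast costs nothing where `uintmax_t` is 64 bits wide, and keeps the format string correct on any platform where it is wider.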
Bill Paul	3a712851ab	Use %jx instead of %qx to silence compiler warning on amd64.	2005-05-05 15:56:41 +00:00
Bill Paul	eb31d50cc7	Avoid sleeping with mutex held in kern_ndis.c. Remove unused fields from ndis_miniport_block. Fix a bug in KeFlushQueuedDpcs() (we weren't calculating the kq pointer correctly). In if_ndis.c, clear the IFF_RUNNING flag before calling ndis_halt_nic(). Add some guards in kern_ndis.c to avoid letting anyone invoke ndis_get_info() or ndis_set_info() if the NIC isn't fully initialized. Apparently, mdnsd will sometimes try to invoke the ndis_ioctl() routine at exactly the wrong moment (to futz with its multicast filters) when the interface comes up, and can trigger a crash unless we guard against it.	2005-05-05 06:14:59 +00:00
Bill Paul	5514ba90b2	Remove extraneous free() of ASCII filename from NdisOpenFile(). Oh, one additional change I forgot to mention in the last commit: NdisOpenFile() was broken in the case for firmware files that were pre-loaded as modules. When searching for the module in NdisOpenFile(), we would match against a symbol name, which would contain the string we were looking for, then save a pointer to the linker file handle. Later, in NdisMapFile(), we would refer to the filename hung off this handle when trying to find the starting address symbol. Only problem is, this filename is different from the embedded symbol name we're searching for, so the mapping would fail. I found this problem while testing the AirGo driver, which requires a small firmware file.	2005-05-05 04:16:13 +00:00
Bill Paul	9b307fe2be	This commit makes a bunch of changes, some big, some not so big. - Remove the old task threads from kern_ndis.c and reimplement them in subr_ntoskrnl.c, in order to more properly emulate the Windows DPC API. Each CPU gets its own DPC queue/thread, and each queue can have low, medium and high importance DPCs. New APIs implemented: KeSetTargetProcessorDpc(), KeSetImportanceDpc() and KeFlushQueuedDpcs(). (This is the biggest change.) - Fix a bug in NdisMInitializeTimer(): the k_dpc pointer in the nmt_timer embedded in the ndis_miniport_timer struct must be set to point to the DPC, also embedded in the struct. Failing to do this breaks dequeueing of DPCs submitted via timers, and in turn breaks cancelling timers. - Fix a bug in KeCancelTimer(): if the timer is inserted in the timer queue (i.e. the timeout callback is still pending), we have to both untimeout() the timer _and_ call KeRemoveQueueDpc() to nuke the DPC that might be pending. Failing to do this breaks cancellation of periodic timers, which always appear to be inserted in the timer queue. - Make use of the nmt_nexttimer field in ndis_miniport_timer: keep a queue of pending timers and cancel them all in ndis_halt_nic(), prior to calling MiniportHalt(). Also call KeFlushQueuedDpcs() to make sure any DPCs queued by the timers have expired. - Modify NdisMAllocateSharedMemory() and NdisMFreeSharedMemory() to keep track of both the virtual and physical addresses of the shared memory buffers that get handed out. The AirGo MIMO driver appears to have a bug in it: for one of the segments it allocates, it returns the wrong virtual address. This would confuse NdisMFreeSharedMemory() and cause a crash. Why it doesn't crash Windows too I have no idea (from reading the documentation for NdisMFreeSharedMemory(), it appears to be a violation of the API). - Implement strstr(), strchr() and MmIsAddressValid(). - Implement IoAllocateWorkItem(), IoFreeWorkItem(), IoQueueWorkItem() and ExQueueWorkItem(). 
(This is the second biggest change.) - Make NdisScheduleWorkItem() call ExQueueWorkItem(). (Note that the ExQueueWorkItem() API is deprecated by Microsoft, but NDIS still uses it, since NdisScheduleWorkItem() is incompatible with the IoXXXWorkItem() API.) - Change if_ndis.c to use the NdisScheduleWorkItem() interface for scheduling tasks. With all these changes and fixes, the AirGo MIMO driver for the Belkin F5D8010 Pre-N card now works. Special thanks to Paul Robinson (paul dawt robinson at pwermedia dawt net) for the loan of a card for testing.	2005-05-05 03:56:09 +00:00
Jeff Roberson	7625cbf3cc	- Pass the ISOPEN flag to namei so filesystems will know we're about to open them or otherwise access the data.	2005-04-27 09:05:19 +00:00
Bill Paul	96b50ea387	Throw the switch on the new driver generation/loading mechanism. From here on in, if_ndis.ko will be pre-built as a module, and can be built into a static kernel (though it's not part of GENERIC). Drivers are created using the new ndisgen(8) script, which uses ndiscvt(8) under the covers, along with a few other tools. The result is a driver module that can be kldloaded into the kernel. A driver with foo.inf and foo.sys files will be converted into foo_sys.ko (and foo_sys.o, for those who want/need to make static kernels). This module contains all of the necessary info from the .INF file and the driver binary image, converted into an ELF module. You can kldload this module (or add it to /boot/loader.conf) to have it loaded automatically. Any required firmware files can be bundled into the module as well (or converted/loaded separately). Also, add a workaround for a problem in NdisMSleep(). During system bootstrap (cold == 1), msleep() always returns 0 without actually sleeping. The Intel 2200BG driver uses NdisMSleep() to wait for the NIC's firmware to come to life, and fails to load if NdisMSleep() doesn't actually delay. As a workaround, if msleep() (and hence ndis_thsuspend()) returns 0, use a hard DELAY() to sleep instead. This is not really the right thing to do, but we can't really do much else. At the very least, this makes the Intel driver happy. There are probably other drivers that fail in this way during bootstrap. Unfortunately, the only workaround for those is to avoid pre-loading them and kldload them once the system is running instead.	2005-04-24 20:21:22 +00:00
Bill Paul	427fea0ba6	Now that the GDT has been reorganized and GNDIS_SEL has been reserved for us, use it if it's available, otherwise default to using slot 7 as before.	2005-04-17 19:36:08 +00:00
Bill Paul	d84ed2322c	When setting up the new stack for a function in x86_64_wrap(), make sure to make it 16-byte aligned, in keeping with amd64 calling convention requirements. Submitted by: Mikore Li at sun dot com	2005-04-16 04:47:15 +00:00
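The alignment step named in the commit above amounts to rounding the stack pointer down to a 16-byte boundary. The real x86_64_wrap() is assembly; the C below is only a sketch of the arithmetic, and `align_down16()` is a hypothetical name.

```c
#include <stdint.h>

/*
 * Hypothetical helper: round a stack address down to the nearest
 * 16-byte boundary by clearing the low four bits. Stacks grow down on
 * amd64, so rounding down never moves into space already in use.
 */
static uintptr_t
align_down16(uintptr_t sp)
{
	return (sp & ~(uintptr_t)0xF);
}
```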
Jeff Roberson	4585e3ac5a	- Change all filesystems and vfs_cache to relock the dvp once the child is locked in the ISDOTDOT case. See vfs_lookup.c r1.79 for details. Sponsored by: Isilon Systems, Inc.	2005-04-13 10:59:09 +00:00
Matthew N. Dodd	f9763094f1	Implement SOUND_MIXER_INFO ioctl in compat layer.	2005-04-13 04:33:06 +00:00
Matthew N. Dodd	73c730a694	Add support for O_NOFOLLOW and O_DIRECT to Linux fcntl() F_GETFL/F_SETFL.	2005-04-13 04:31:43 +00:00
Bill Paul	0a5c534cd2	In winx32_wrap.S, preserve return values in the fastcall and regparm wrappers by pushing them onto the stack rather than keeping them in %esi and %edi.	2005-04-11 17:04:49 +00:00
Bill Paul	d02239a3af	Create new i386 windows/bsd thunking layer, similar to the amd64 thunking layer, but with a twist. The twist has to do with the fact that Microsoft supports structured exception handling in kernel mode. On the i386 arch, exception handling is implemented by hanging an exception registration list off the Thread Environment Block (TEB), and the TEB is accessed via the %fs register. The problem is, we use %fs as a pointer to the pcpu structure, which means any driver that tries to write through %fs:0 will overwrite the curthread pointer and make a serious mess of things. To get around this, Project Evil now creates a special entry in the GDT on each processor. When we call into Windows code, a context switch routine will fix up %fs so it points to our new descriptor, which in turn points to a fake TEB. When the Windows code returns, or calls out to an external routine, we swap %fs back again. Currently, Project Evil makes use of GDT slot 7, which is all 0s by default. I fully expect someone to jump up and say I can't do that, but I couldn't find any code that makes use of this entry anywhere. Sadly, this was the only method I could come up with that worked on both UP and SMP. (Modifying the LDT works on UP, but becomes incredibly complicated on SMP.) If necessary, the context switching stuff can be yanked out while preserving the calling convention wrappers. (Fortunately, it looks like Microsoft uses some special epilog/prolog code on amd64 to implement exception handling, so the same nastiness won't be necessary on that arch.) The advantages are: - Any driver that uses %fs as though it were a TEB pointer won't clobber pcpu. - All the __stdcall/__fastcall/__regparm stuff that's specific to gcc goes away. Also, while I'm here, switch NdisGetSystemUpTime() back to using nanouptime() again. It turns out nanouptime() is way more accurate than just using ticks(). 
On slower machines, the Atheros drivers I tested seem to take a long time to associate due to the loss in accuracy.	2005-04-11 02:02:35 +00:00
Peter Wemm	50860ac0ee	Fix 32 bit signals on amd64. It turns out that I was sign extending the register values coming back from sigreturn(2). Normally this wouldn't matter because the 32 bit environment would truncate the upper 32 bits and re-save the truncated values at the next trap. However, if we got a fast second signal and it was pending while we were returning from sigreturn(2) in the signal trampoline, we'd never have had a chance to truncate the bogus values in 32 bit mode, and the new sendsig would get an EFAULT when trying to write to the bogus user stack address.	2005-04-05 22:41:49 +00:00
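The difference between sign-extending and zero-extending a 32-bit register value, which is the root of the bug described above, can be sketched as follows. Both function names are hypothetical, and the example address is arbitrary; the sign-extending cast assumes the usual two's-complement conversion.

```c
#include <stdint.h>

/* Buggy widening: bit 31 is copied into the upper 32 bits. */
static uint64_t
widen_sign(uint32_t reg)
{
	return ((uint64_t)(int64_t)(int32_t)reg);
}

/* Correct widening for a 32-bit process: upper 32 bits stay zero. */
static uint64_t
widen_zero(uint32_t reg)
{
	return ((uint64_t)reg);
}
```

A 32-bit user stack address with bit 31 set turns into an address outside the 32-bit range when sign-extended, which is consistent with the EFAULT the commit describes.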
John Baldwin	98df9218da	- Change the vm_mmap() function to accept an objtype_t parameter specifying the type of object represented by the handle argument. - Allow vm_mmap() to map device memory via cdev objects in addition to vnodes and anonymous memory. Note that mmaping a cdev directly does not currently perform any MAC checks like mapping a vnode does. - Unbreak the DRM getbufs ioctl by having it call vm_mmap() directly on the cdev the ioctl is acting on rather than trying to find a suitable vnode to map from. Reviewed by: alc, arch@	2005-04-01 20:00:11 +00:00
Bill Paul	92b9707e2d	Fix another KeInitializeDpc()/amd64 calling convention issue: ndis_intrhand() has to be wrapped for the same reason as ndis_timercall().	2005-04-01 16:40:22 +00:00
John Baldwin	48052f99e7	- Use a custom version of copyinuio() to implement readv/writev using kern_readv/writev. - Use kern_settimeofday() and kern_adjtime() rather than stackgapping it.	2005-03-31 22:58:13 +00:00
Bill Paul	2c87b2b73f	Apparently I'm cursed. ndis_findwrap() should be searching ndis_functbl, not ntoskrnl_functbl.	2005-03-31 21:20:19 +00:00
Bill Paul	621b33fc5b	Fix an amd64 issue I overlooked. When setting up a callout to ndis_timercall() in NdisMInitializeTimer(), we can't use the raw function pointer. This is because ntoskrnl_run_dpc() expects to invoke a function with Microsoft calling conventions. On i386, this works because ndis_timercall() is declared with the __stdcall attribute, but this is a no-op on amd64. To do it correctly, we have to generate a wrapper for ndis_timercall() and use the wrapper instead of the raw function pointer. Fix this by adding ndis_timercall() to the funcptr table in subr_ndis.c, and create ndis_findwrap() to extract the wrapped function from the table in NdisMInitializeTimer() instead of just passing ndis_timercall() to KeInitializeDpc() directly.	2005-03-31 16:38:48 +00:00
Bill Paul	c3c51190cc	Fix a possible mutex leak in KeSetTimerEx(): if timer is NULL, we bail out without releasing the dispatcher lock. Move the lock acquisition after the pointer test to avoid this.	2005-03-30 16:22:48 +00:00
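The reordering described above (test the pointer first, take the lock second) can be sketched like this. Everything here is a hypothetical stand-in: `sketch_lock` merely counts acquisitions so that a leak would be visible, and it is not the real dispatcher lock.

```c
#include <stddef.h>

/* Hypothetical stand-in for the dispatcher lock; counts acquisitions. */
struct sketch_lock { int depth; };
struct sketch_timer { int armed; };

static void sketch_lock_acquire(struct sketch_lock *l) { l->depth++; }
static void sketch_lock_release(struct sketch_lock *l) { l->depth--; }

/* Returns 0 on success, -1 if the timer pointer is NULL. */
static int
sketch_set_timer(struct sketch_lock *l, struct sketch_timer *t)
{
	if (t == NULL)
		return (-1);	/* bail out before the lock is taken */

	sketch_lock_acquire(l);
	t->armed = 1;
	sketch_lock_release(l);
	return (0);
}
```

Because the NULL test comes before the acquisition, the early return cannot leave the lock held.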
Bill Paul	76e96613b2	Remove a couple of #ifdef 0'ed code blocks left over from Atheros debugging. Remember to reset ndis_pendingreq to NULL when bailing out of ndis_set_info() or ndis_get_info() due to miniportadapterctx not being set.	2005-03-30 02:50:06 +00:00
Jeff Roberson	9f3d9acd26	- Initialize cn_lkflags to LK_EXCLUSIVE. Sponsored by: Isilon Systems, Inc.	2005-03-29 10:16:12 +00:00
Bill Paul	18be2d04d8	The filehandle allocated in NdisOpenFile() is allocated using ExAllocatePoolWithTag(), not malloc(), so it should be released with ExFreePool(), not free(). Fix a couple of instances of free(fh, ...) that got overlooked.	2005-03-28 22:03:47 +00:00
Bill Paul	c6cb2045e4	Another Coverity fix from Sam: add NULL pointer test in NdisMFreeSharedMemory() (if the list is already empty, just bail).	2005-03-28 21:09:00 +00:00
Bill Paul	f3d5302e1a	More additions for amd64: - On amd64, InterlockedPushEntrySList() and InterlockedPopEntrySList() are mapped to ExpInterlockedPushEntrySList and ExpInterlockedPopEntrySList() via macros (which do the same thing). Add IMPORT_FUNC_MAP()s for these. - Implement ExQueryDepthSList().	2005-03-28 20:46:08 +00:00
Bill Paul	59abc1c4f3	Fix resource leak found by Coverity (via Sam Leffler).	2005-03-28 20:16:26 +00:00
Bill Paul	c0c6e20248	Fix for amd64.	2005-03-28 20:13:14 +00:00
Bill Paul	269dfbe780	Fix another amd64 issue with lookaside lists: we initialize the alloc and free routine pointers in the lookaside list with pointers to ExAllocatePoolWithTag() and ExFreePool() (in the case where the driver does not provide its own alloc and free routines). For amd64, this is wrong: we have to use pointers to the wrapped versions of these functions, not the originals.	2005-03-28 19:27:58 +00:00
Bill Paul	9a1c9424cf	Tweak to hopefully make lookaside lists work on amd64: in Windows, the nll_obsoletelock field in the lookaside list structure is only defined for the i386 arch. For amd64, the field is gone, and different list update routines are used which do their locking internally. Apparently the Inprocomm amd64 driver uses lookaside lists. I'm not positive this will make it work yet since I don't have an Inprocomm NIC to test, but this needs to be fixed anyway.	2005-03-28 17:36:06 +00:00
Bill Paul	97b4ef94b5	Spell '0' as 'FALSE' when initializing npp_validcounts. (Doesn't change the code, but emphasises that this field is used as a boolean.)	2005-03-28 17:06:47 +00:00
Bill Paul	da1accf806	Unbreak the build: correct the resource list traversal code for __FreeBSD_version >= 600022.	2005-03-28 16:49:27 +00:00
Bill Paul	e0c8c9460c	Argh. PCI resource list became an STAILQ instead of an SLIST. Try to deal with this while maintaining backwards source compatibility with stable.	2005-03-27 10:35:07 +00:00
Bill Paul	91f9f476ee	Check in ntoskrnl_var.h, which should have been included in the previous commit.	2005-03-27 10:16:45 +00:00
Bill Paul	7c1968ad82	Finally bring an end to the great "make the Atheros NDIS driver work on SMP" saga. After several weeks and much gnashing of teeth, I have finally tracked down all the problems, despite their best efforts to confound and annoy me. Problem number one: the Atheros windows driver is _NOT_ a de-serialized miniport! It used to be that NDIS drivers relied on the NDIS library itself for all their locking and serialization needs. Transmit packet queues were all handled internally by NDIS, and all calls to MiniportXXX() routines were guaranteed to be appropriately serialized. This proved to be a performance problem however, and Microsoft introduced de-serialized miniports with the NDIS 5.x spec. Microsoft still supports serialized miniports, but recommends that all new drivers written for Windows XP and later be deserialized. Apparently Atheros wasn't listening when they said this. This means (among other things) that we have to serialize calls to MiniportSendPackets(). We also have to serialize calls to MiniportTimer() that are triggered via the NdisMInitializeTimer() routine. It finally dawned on me why NdisMInitializeTimer() takes a special NDIS_MINIPORT_TIMER structure and a pointer to the miniport block: the timer callback must be serialized, and it's only by saving the miniport block handle that we can get access to the serialization lock during the timer callback. Problem number two: haunted hardware. The thing that was _really_ driving me absolutely bonkers for the longest time is that, for some reason I couldn't understand, my test machine would occasionally freeze or more frustratingly, reset completely. That's reset and in pow! back to the BIOS startup. No panic, no crashdump, just a reset. This appeared to happen most often when MiniportReset() was called. (As to why MiniportReset() was being called, see problem three below.) 
I thought maybe I had created some sort of horrible deadlock condition in the process of adding the serialization, but after three weeks, at least 6 different locking implementations and heroic efforts to debug the spinlock code, the machine still kept resetting. Finally, I started single stepping through the MiniportReset() routine in the driver using the kernel debugger, and this ultimately led me to the source of the problem. One of the last things the Atheros MiniportReset() routine does is call NdisReadPciSlotInformation() several times to inspect a portion of the device's PCI config space. It reads the same chunk of config space repeatedly, in rapid succession. Presumably, it's polling the hardware for some sort of event. The reset occurs partway through this process. I discovered that when I single-stepped through this portion of the routine, the reset didn't occur. So I inserted a 1 microsecond delay into the read loop in NdisReadPciSlotInformation(). Suddenly, the reset was gone!! I'm still very puzzled by the whole thing. What I suspect is happening is that reading the PCI config space so quickly is causing a severe PCI bus error. My test system is a Sun w2100z dual Opteron system, and the NIC is a miniPCI card mounted in a miniPCI-to-PCI carrier card, plugged into a 100MHz PCI slot. It's possible that this combination of hardware causes a bus protocol violation in this scenario which leads to a fatal machine check. This is pure speculation though. Really all I know for sure is that inserting the delay makes the problem go away. (To quote Homer Simpson: "I don't know how it works, but fire makes it good!") Problem number three: NdisAllocatePacket() needs to make sure to initialize the npp_validcounts field in the 'private' section of the NDIS_PACKET structure. The reason if_ndis was calling the MiniportReset() routine in the first place is that packet transmits were sometimes hanging. 
When sending a packet, an NDIS driver will call NdisQueryPacket() to learn how many physical buffers the packet resides in. NdisQueryPacket() is actually a macro, which traverses the NDIS_BUFFER list attached to the NDIS_PACKET and stashes some of the results in the 'private' section of the NDIS_PACKET. It also sets the npp_validcounts field to TRUE to indicate that the results are now valid. The problem is, now that if_ndis creates a pool of transmit packets via NdisAllocatePacketPool(), it's important that each time a new packet is allocated via NdisAllocatePacket() that validcounts be initialized to FALSE. If it isn't, and a previously transmitted NDIS_PACKET is pulled out of the pool, it may contain stale data from a previous transmission which won't get updated by NdisQueryPacket(). This would cause the driver to miscompute the number of fragments for a given packet, and botch the transmission. Fixing these three problems seems to make the Atheros driver happy on SMP, which hopefully means other serialized miniports will be happy too. And there was much rejoicing. Other stuff fixed along the way: - Modified ndis_thsuspend() to take a mutex as an argument. This allows KeWaitForSingleObject() and KeWaitForMultipleObjects() to avoid any possible race conditions with other routines that use the dispatcher lock. - Fixed KeCancelTimer() so that it returns the correct value for 'pending' according to the Microsoft documentation. - Modified NdisGetSystemUpTime() to use ticks and hz rather than calling nanouptime(). Also added comment that this routine wraps after 49.7 days. - Added macros for KeAcquireSpinLock()/KeReleaseSpinLock() to hide all the MSCALL() goop. - For x86, KeAcquireSpinLockRaiseToDpc() needs to be a separate function. This is because it's supposed to be _stdcall on the x86 arch, whereas KeAcquireSpinLock() is supposed to be _fastcall. 
On amd64, all routines use the same calling convention so we can just map KeAcquireSpinLockRaiseToDpc() directly to KfAcquireSpinLock() and it will work. (The _fastcall attribute is a no-op on amd64.) - Implement and use IoInitializeDpcRequest() and IoRequestDpc() (they're just macros) and use them for interrupt handling. This allows us to move the ndis_intrtask() routine from if_ndis.c to kern_ndis.c. - Fix the MmInitializeMdl() macro so that it uses sizeof(vm_offset_t) when computing mdl_size instead of uint32_t, so that it matches the MmSizeOfMdl() routine. - Change a couple of M_WAITOKs to M_NOWAITs in the unicode routines in subr_ndis.c. - Use the dispatcher lock a little more consistently in subr_ntoskrnl.c. - Get rid of the "wait for link event" hack in ndis_init(). Now that I fixed NdisReadPciSlotInformation(), it seems I don't need it anymore. This should fix the witness panic a couple of people have reported. - Use MSCALL1() when calling the MiniportHangCheck() function in ndis_ticktask(). I accidentally missed this one when adding the wrapping for amd64.	2005-03-27 10:14:36 +00:00
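The 49.7-day wrap noted for NdisGetSystemUpTime() in the commit above follows from reporting uptime in milliseconds in a 32-bit value: 2^32 ms is roughly 49.7 days. The sketch below shows the arithmetic only. Function names are hypothetical, and the tick-to-millisecond conversion is an assumed formula, not the code from subr_ndis.c.

```c
#include <stdint.h>

/*
 * Sketch only: derive a millisecond uptime from a tick counter and
 * the tick rate (hz), truncated to 32 bits as the NDIS API requires.
 */
static uint32_t
sketch_uptime_ms(uint64_t ticks, uint32_t hz)
{
	return ((uint32_t)(ticks * 1000 / hz));
}

/* Days until a 32-bit millisecond counter wraps: 2^32 / (86400 * 1000). */
static double
sketch_wrap_days(void)
{
	return (4294967296.0 / (86400.0 * 1000.0));
}
```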
Brooks Davis	044ba81b85	Use the CTASSERT() macro instead of rolling my own, non-portable one using #error. Suggested by: jhb	2005-03-24 19:26:50 +00:00
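The two compile-time checks discussed in this commit and the one before it can be sketched side by side. The SKETCH_ constants are hypothetical stand-ins for LINUX_IFNAMSIZ and IFNAMSIZ, and C11 `_Static_assert` is used here in place of the kernel's CTASSERT() macro.

```c
/* Hypothetical stand-ins for LINUX_IFNAMSIZ and IFNAMSIZ. */
#define SKETCH_LINUX_IFNAMSIZ	16
#define SKETCH_IFNAMSIZ		16

/* Earlier approach: preprocessor comparison plus #error. */
#if SKETCH_LINUX_IFNAMSIZ != SKETCH_IFNAMSIZ
#error "SKETCH_LINUX_IFNAMSIZ != SKETCH_IFNAMSIZ"
#endif

/* Later approach: a compile-time assertion (CTASSERT() in the kernel). */
_Static_assert(SKETCH_LINUX_IFNAMSIZ == SKETCH_IFNAMSIZ,
    "interface name sizes must match");

/* Runtime view of the same invariant, for demonstration only. */
static int
sketch_sizes_match(void)
{
	return (SKETCH_LINUX_IFNAMSIZ == SKETCH_IFNAMSIZ);
}
```

Either form stops the build when the sizes diverge, instead of deferring the failure to a KASSERT panic at run time.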
Brooks Davis	fe753c29f7	Compile errors are way more useful than panics later. Replace a KASSERT of LINUX_IFNAMSIZ == IFNAMSIZ with a preprocessor check and #error message. This will prevent nasty surprises if users change IFNAMSIZ without updating the linux code appropriately.	2005-03-24 17:51:15 +00:00
David Schultz	a3e1ec194d	Bounds check the user-supplied length used in a copyout() in svr4_do_getmsg(). In principle this bug could disclose data from kernel memory, but in practice, the SVR4 emulation layer is probably not functional enough to cause the relevant code path to be executed. In any case, the emulator has been disconnected from the build since 5.0-RELEASE. Found by: Coverity Prevent analysis tool	2005-03-23 08:28:06 +00:00
David Schultz	aa675b572f	Reject packets larger than IP_MAXPACKET in linux_sendto() for sockets with the IP_HDRINCL option set. Without this change, a Linux process with access to a raw socket could cause a kernel panic. Raw sockets must be created by root, and are generally not consigned to untrusted applications; hence, the security implications of this bug are minimal. I believe this only affects 6-CURRENT on or after 2005-01-30. Found by: Coverity Prevent analysis tool Security: Local DOS	2005-03-23 08:28:00 +00:00
Poul-Henning Kamp	be1bf4d2b8	s/SLIST/STAILQ/ /imp/a\ pointy hat .	2005-03-18 11:57:44 +00:00
Poul-Henning Kamp	bbbc2d967e	Neuter the duplicated disk-device magic code for now. Somebody with serious linux-clue is necessary to fix this properly.	2005-03-15 11:58:40 +00:00
Maxim Sobolev	8d6e40c3f1	Add kernel-only flag MSG_NOSIGNAL to be used in emulation layers to suppress SIGPIPE signal for the duration of the sendto-family syscalls. Use it to replace previously added hack in Linux layer based on temporarily setting SO_NOSIGPIPE flag. Suggested by: alfred	2005-03-08 16:11:41 +00:00
Maxim Sobolev	2302f0fea8	Handle MSG_NOSIGNAL flag in linux_send() by setting SO_NOSIGPIPE on socket for the duration of the send() call. Such an approach may be less than ideal in a threaded environment, when several threads share the same socket and it might happen that several of them are calling linux_send() at the same time with and without SO_NOSIGPIPE set. However, such a race condition is very unlikely in practice, therefore this change provides practical improvement compared to the previous behaviour. PR: kern/76426 Submitted by: Steven Hartland <killing@multiplay.co.uk> MFC after: 3 days	2005-03-07 07:26:42 +00:00
Bill Paul	58a6edd121	When you call MiniportInitialize() for an 802.11 driver, it will at some point result in a status event being triggered (it should be a link down event: the Microsoft driver design guide says you should generate one when the NIC is initialized). Some drivers generate the event during MiniportInitialize(), such that by the time MiniportInitialize() completes, the NIC is ready to go. But some drivers, in particular the ones for Atheros wireless NICs, don't generate the event until after a device interrupt occurs at some point after MiniportInitialize() has completed. The gotcha is that you have to wait until the link status event occurs one way or the other before you try to fiddle with any settings (ssid, channel, etc...). For the drivers that set the event synchronously this isn't a problem, but for the others we have to pause after calling ndis_init_nic() and wait for the event to arrive before continuing. Failing to wait can cause big trouble: on my SMP system, calling ndis_setstate_80211() after ndis_init_nic() completes, but _before_ the link event arrives, will lock up or reset the system. What we do now is check to see if a link event arrived while ndis_init_nic() was running, and if it didn't we msleep() until it does. Along the way, I discovered a few other problems: - Deferred procedure calls run at PASSIVE_LEVEL, not DISPATCH_LEVEL. ntoskrnl_run_dpc() has been fixed accordingly. (I read the documentation wrong.) - Similarly, the NDIS interrupt handler, which is essentially a DPC, also doesn't need to run at DISPATCH_LEVEL. ndis_intrtask() has been fixed accordingly. - MiniportQueryInformation() and MiniportSetInformation() run at DISPATCH_LEVEL, and each request must complete before another can be submitted. ndis_get_info() and ndis_set_info() have been fixed accordingly. - Turned the sleep lock that guards the NDIS thread job list into a spin lock. 
We never do anything with this lock held except manage the job list (no other locks are held), so it's safe to do this, and it's possible that ndis_sched() and ndis_unsched() can be called from DISPATCH_LEVEL, so using a sleep lock here is semantically incorrect. Also updated subr_witness.c to add the lock to the order list.	2005-03-07 03:05:31 +00:00
Maxim Sobolev	e3478fe000	Handle unimplemented syscall by instantly returning ENOSYS instead of sending signal first and only then returning ENOSYS to match what real linux does. PR: kern/74302 Submitted by: Travis Poppe <tlp@LiquidX.org>	2005-03-07 00:18:06 +00:00
Maxim Sobolev	996358f55c	Always produce cpuX entries, even in the case when there is only one CPU in the system. This is consistent with what real linuxes do. PR: kern/75848 Submitted by: Andriy Gapon <avg@icyb.net.ua> MFC after: 3 days	2005-03-06 22:28:14 +00:00
Bill Paul	7d962e5cc5	MAXPATHLEN is 1024, which means NdisOpenFile() and ndis_find_sym() were both consuming 1K of stack space. This is unfriendly. Allocate the buffers off the heap instead. It's a little slower, but these aren't performance critical routines. Also, add a spinlock to NdisAllocatePacketPool(), NdisAllocatePacket(), NdisFreePacketPool() and NdisFreePacket(). The pool is maintained as a linked list. I don't know for a fact that it can be corrupted, but why take chances.	2005-03-03 03:51:02 +00:00
John Baldwin	501ce30561	Remove linux_emul_find() and the CHECKALT*() macros as they are no longer used.	2005-03-01 17:57:45 +00:00
Paul Saab	b8a4edc17e	Use kern_kevent instead of the stackgap for 32bit syscall wrapping. Submitted by: jhb Tested on: amd64	2005-03-01 17:45:55 +00:00
Bill Paul	2628b0b7ab	In windrv_load(), I was allocating the driver object using malloc(sizeof(device_object), ...) by mistake. Correct this, and rename "dobj" to "drv" to make it a bit clearer what this variable is supposed to be. Spotted by: Mikore Li at Sun dot comnospamplzkthx	2005-03-01 17:21:25 +00:00
Paul Saab	5d83706b23	Ooops. I will compile test before committing. The stackgap version of kevent32 will be going away shortly, so this is temporary until I commit the non-stackgap version.	2005-03-01 13:50:57 +00:00
Paul Saab	a95e8cd364	Correct the freebsd32_kevent prototype.	2005-03-01 06:32:53 +00:00
Bill Paul	303ff38659	Don't need to do MmInitializeMdl() in ndis_mtop() anymore: IoInitializeMdl() does it internally (and doing it again here messes things up).	2005-02-26 07:11:17 +00:00
Bill Paul	a944e196da	MDLs are supposed to be variable size (they include an array of pages that describe a buffer of variable size). The problem is, allocating MDLs off the heap is slow, and it can happen that drivers will allocate lots and lots of MDLs as they run. As a compromise, we now do the following: we pre-allocate a zone for MDLs big enough to describe any buffer with 16 or fewer pages. If IoAllocateMdl() needs a MDL for a buffer with 16 or fewer pages, we'll allocate it from the zone. Otherwise, we allocate it from the heap. MDLs allocated from the zone have a flag set in their mdl_flags field. When the MDL is released, IoFreeMdl() will uma_zfree() the MDL if it has the MDL_ZONE_ALLOCED flag set, otherwise it will release it to the heap. The assumption is that 16 pages is a "big number" and we will rarely need MDLs larger than that. - Moved the ndis_buffer zone to subr_ntoskrnl.c from kern_ndis.c and named it mdl_zone. - Modified IoAllocateMdl() and IoFreeMdl() to use uma_zalloc() and uma_zfree() if necessary. - Made ndis_mtop() use IoAllocateMdl() instead of calling uma_zalloc() directly. Inspired by: discussion with Giridhar Pemmasani	2005-02-26 00:22:16 +00:00
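The zone-or-heap decision described above can be sketched in userland C. All names are hypothetical, `malloc()` and `free()` stand in for both uma_zalloc()/uma_zfree() and the kernel heap, and the structure is a simplified placeholder for the real MDL layout.

```c
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_ZONE_PAGES	16	/* threshold named in the commit */
#define SKETCH_ZONE_ALLOCED	0x1	/* hypothetical flag value */

struct sketch_mdl {
	uint32_t	flags;
	uint32_t	npages;
	/* in a real MDL the page array follows the header */
};

static size_t
sketch_mdl_size(uint32_t npages)
{
	return (sizeof(struct sketch_mdl) + (size_t)npages * sizeof(uintptr_t));
}

static struct sketch_mdl *
sketch_mdl_alloc(uint32_t npages)
{
	struct sketch_mdl *m;
	int from_zone = (npages <= SKETCH_ZONE_PAGES);

	/* Zone items are fixed-size; malloc() stands in for uma_zalloc(). */
	m = malloc(sketch_mdl_size(from_zone ? SKETCH_ZONE_PAGES : npages));
	if (m == NULL)
		return (NULL);
	m->flags = from_zone ? SKETCH_ZONE_ALLOCED : 0;
	m->npages = npages;
	return (m);
}

/* Returns 1 if the MDL belonged to the "zone", 0 if to the heap. */
static int
sketch_mdl_free(struct sketch_mdl *m)
{
	int zone = (m->flags & SKETCH_ZONE_ALLOCED) != 0;

	free(m);	/* real code would choose uma_zfree() or a heap free */
	return (zone);
}
```

Recording the origin in a flag at allocation time is what lets the free path choose the matching release routine without knowing the buffer size.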
Sam Leffler	960f641e6d	fixup signal mapping: o change the mapping arrays to have a zero offset rather than base 1; this eliminates lots of signo adjustments and brings the code back in line with the original netbsd code o purge use of SVR4_SIGTBLZ; SVR4_NSIG is the only definition for how big a mapping array is o change the mapping loops to explicitly ignore signal 0 o purge some bogus code from bsd_to_svr4_sigset o adjust svr4_sysentvec to deal with the mapping table change Enticed into fixing by: Coverity Prevent analysis tool Glanced at by: marcel, jhb	2005-02-25 19:34:10 +00:00
Bill Paul	849bcccac9	Add macros to construct Windows IOCTL codes, and to extract function codes from an IOCTL. (The USB module will need them later.)	2005-02-25 18:25:48 +00:00
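For orientation, Windows I/O control codes pack the device type into bits 16 and up, the access mode into bits 14-15, the function code into bits 2-13 and the transfer method into bits 0-1. The macros below sketch that layout. The SKETCH_ names are hypothetical and are not the macro names added by this commit.

```c
#include <stdint.h>

/* Build an IOCTL code from its four fields (Windows CTL_CODE layout). */
#define SKETCH_CTL_CODE(dev, func, method, access)			\
	((uint32_t)(((uint32_t)(dev) << 16) | ((uint32_t)(access) << 14) | \
	((uint32_t)(func) << 2) | (uint32_t)(method)))

/* Extract the 12-bit function code from an IOCTL code. */
#define SKETCH_CTL_FUNC(code)	(((uint32_t)(code) >> 2) & 0xFFF)

/* Build a code and pull the function number back out again. */
static uint32_t
sketch_ioctl_roundtrip(uint32_t dev, uint32_t func)
{
	return (SKETCH_CTL_FUNC(SKETCH_CTL_CODE(dev, func, 0, 0)));
}
```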
Bill Paul	ed7003a9f3	Fix a couple of callback instances that should have been wrapped with MSCALLx(). Add definition for STATUS_PENDING error code.	2005-02-25 08:34:32 +00:00
Bill Paul	17bd4e32e1	Compute the right length to use with bzero() when initializing an IRP in IoInitializeIrp() (must use IoSizeOfIrp() to account for the stack locations).	2005-02-25 06:31:45 +00:00
Bill Paul	63ba67b69c	- Correct one aspect of the driver_object/device_object/IRP framework: when we create a PDO, the driver_object associated with it is that of the parent driver, not the driver we're trying to attach. For example, if we attach a PCI device, the PDO we pass to the NdisAddDevice() function should contain a pointer to fake_pci_driver, not to the NDIS driver itself. For PCI or PCMCIA devices this doesn't matter because the child never needs to talk to the parent bus driver, but for USB, the child needs to be able to send IRPs to the parent USB bus driver, and for that to work the parent USB bus driver has to be hung off the PDO. This involves modifying windrv_lookup() so that we can search for bus drivers by name, if necessary. Our fake bus drivers attach themselves as "PCI Bus," "PCCARD Bus" and "USB Bus," so we can search for them using those names. The individual attachment stubs now create and attach PDOs to the parent bus drivers instead of hanging them off the NDIS driver's object, and in if_ndis.c, we now search for the correct driver object depending on the bus type, and use that to find the correct PDO. With this fix, I can get my sample USB ethernet driver to deliver an IRP to my fake parent USB bus driver's dispatch routines. - Add stub modules for USB support: subr_usbd.c, usbd_var.h and if_ndis_usb.c. The subr_usbd.c module is hooked up to the build but currently doesn't do very much. It provides the stub USB parent driver object and a dispatch routine for IRP_MJ_INTERNAL_DEVICE_CONTROL. The only exported function at the moment is USBD_GetUSBDIVersion(). The if_ndis_usb.c stub compiles, but is not hooked up to the build yet. I'm putting these here so I can keep them under source code control as I flesh them out.	2005-02-24 21:49:14 +00:00
John Baldwin	ead6bc8265	Regen.	2005-02-24 18:24:29 +00:00
John Baldwin	ddcc2a3ff3	Use msync() to implement msync() for freebsd32 emulation. This isn't quite right for certain MAP_FIXED mappings on ia64 but it will work fine for all other mappings and works fine on amd64. Requested by: ps, Christian Zander MFC after: 1 week	2005-02-24 18:24:16 +00:00
Bill Paul	70211f5ef5	Couple of lessons learned during USB driver testing: - In kern_ndis.c:ndis_unload_driver(), test that ndis_block->nmb_rlist is not NULL before trying to free() it. - In subr_pe.c:pe_get_import_descriptor(), do a case-insensitive match on the import module name. Most drivers I have encountered link against "ntoskrnl.exe" but the ASIX USB ethernet driver I'm testing with wants "NTOSKRNL.EXE." - In subr_ntoskrnl.c:IoAllocateIrp(), return a pointer to the IRP instead of NULL. (Stub code leftover.) - Also in subr_ntoskrnl.c, add ExAllocatePoolWithTag() and ExFreePool() to the function table list so they'll get exported to drivers properly.	2005-02-24 17:58:27 +00:00
Bill Paul	d3e4cd0609	Implement IoCancelIrp(), IoAcquireCancelSpinLock(), IoReleaseCancelSpinLock() and a machine-independent though inefficient InterlockedExchange(). In Windows, InterlockedExchange() appears to be implemented in header files via inline assembly. I would prefer using an atomic.h macro for this, but there doesn't seem to be one that just does a plain old atomic exchange (as opposed to compare and exchange). Also implement IoSetCancelRoutine(), which is just a macro that uses InterlockedExchange(). Fill in IoBuildSynchronousFsdRequest(), IoBuildAsynchronousFsdRequest() and IoBuildDeviceIoControlRequest() so that they do something useful, and add a bunch of #defines to ntoskrnl_var.h to help make these work. These may require some tweaks later.	2005-02-23 16:44:33 +00:00
Poul-Henning Kamp	1e247cc2ce	Neuter linux_ustat() until somebody finds time to try to fix it. The fundamental problem is that we get only the lower 8 bits of the minor device number so there is no guarantee that we can actually find the disk device in question at all. This was probably a bigger issue pre-GEOM where the upper bits signaled which slice was in use. The secondary problem is how we get from (partial) dev_t to vnode. The correct implementation will involve traversing the mount list looking for a perfect match or a possible match (for truncated minor).	2005-02-22 13:39:46 +00:00
Sam Leffler	1ca1ea77be	remove dead code Submitted by: Coverity Prevent analysis tool	2005-02-22 01:26:48 +00:00
John Baldwin	38765a3178	- Add a custom version of exec_copyin_args() to deal with the 32-bit pointers in argv and envv in userland and use that together with kern_execve() and exec_free_args() to implement freebsd32_execve() without using the stackgap. - Fix freebsd32_adjtime() to call adjtime() rather than utimes(). Still uses stackgap for now. - Use kern_setitimer(), kern_getitimer(), kern_select(), kern_utimes(), kern_statfs(), kern_fstatfs(), kern_fhstatfs(), kern_stat(), kern_fstat(), and kern_lstat(). Tested by: cokane (amd64) Silence on: amd64, ia64	2005-02-18 18:56:04 +00:00
Bill Paul	e6f328fb03	Fix a couple of u_int_foos that should have been uint_foos.	2005-02-18 04:33:34 +00:00
Bill Paul	6e121c5427	Make the Win64 -> ELF64 template a little smaller by using a string copy op to shift arguments on the stack instead of transferring each argument one by one through a register. Probably doesn't affect overall operation, but makes the code a little less grotty and easier to update later if I choose to make the wrapper handle more args. Also add comments.	2005-02-18 03:22:37 +00:00
Bill Paul	2b0dcd6b18	Remove redundant label.	2005-02-16 21:24:04 +00:00
Bill Paul	513c5292f8	Fix freeing of custom driver extensions. (ExFreePool() was being called with the wrong pointer.)	2005-02-16 19:21:07 +00:00
Bill Paul	2adbfd5436	KeAcquireSpinLockRaiseToDpc() and KeReleaseSpinLock() are (at least for now) exactly the same as KfAcquireSpinLock() and KfReleaseSpinLock(). I implemented the former as small routines in subr_ntoskrnl.c that just turned around and invoked the latter. But I don't really need the wrapper routines: I can just create entries in the ntoskrnl func table that map KeAcquireSpinLockRaiseToDpc() and KeReleaseSpinLock() to KfAcquireSpinLock() and KfReleaseSpinLock() directly. This means the stubs can go away.	2005-02-16 18:18:30 +00:00
Bill Paul	d8f2dda739	Add support for Windows/x86-64 binaries to Project Evil. Ville-Pertti Keinonen (will at exomi dot comohmygodnospampleasekthx) deserves a big thanks for submitting initial patches to make it work. I have mangled his contributions appropriately. The main gotcha with Windows/x86-64 is that Microsoft uses a different calling convention than everyone else. The standard ABI requires using 6 registers for argument passing, with other arguments on the stack. Microsoft uses only 4 registers, and requires the caller to leave room on the stack for the register arguments in case the callee needs to spill them. Unlike x86, where Microsoft uses a mix of _cdecl, _stdcall and _fastcall, all routines on Windows/x86-64 use the same convention. This unfortunately means that all the functions we export to the driver require an intermediate translation wrapper. Similarly, we have to wrap all calls back into the driver binary itself. The original patches provided macros to wrap every single routine at compile time, providing a secondary jump table with a customized wrapper for each exported routine. I decided to use a different approach: the call wrapper for each function is created from a template at runtime, and the routine to jump to is patched into the wrapper as it is created. The subr_pe module has been modified to patch in the wrapped function instead of the original. (On x86, the wrapping routine is a no-op.) There are some minor API differences that had to be accounted for: - KeAcquireSpinLock() is a real function on amd64, not a macro wrapper around KfAcquireSpinLock() - NdisFreeBuffer() is actually IoFreeMdl(). I had to change the whole NDIS_BUFFER API a bit to accommodate this. 
Bugs fixed along the way: - IoAllocateMdl() always returned NULL - kern_windrv.c:windrv_unload() wasn't releasing private driver object extensions correctly (found thanks to memguard) This has only been tested with the driver for the Broadcom 802.11g chipset, which was the only Windows/x86-64 driver I could find.	2005-02-16 05:41:18 +00:00
Nate Lawson	1e8d246eee	Unbreak the kernel build. Pointy hat to: sobomax.	2005-02-13 19:50:57 +00:00
Maxim Sobolev	1a88a252fd	Backout previous change (disabling of security checks for signals delivered in emulation layers), since it appears to be too broad. Requested by: rwatson	2005-02-13 17:37:20 +00:00
Maxim Sobolev	d8ff44b79f	Split out kill(2) syscall service routine into user-level and kernel part, the former is callable from user space and the latter from the kernel one. Make kernel version take additional argument which tells if the respective call should check for additional restrictions for sending signals to suid/sugid applications or not. Make all emulation layers using non-checked version, since signal numbers in emulation layers can have different meaning than in native mode and such protection can cause misbehaviour. As a result remove LIBTHR from the signals allowed to be delivered to a suid/sugid application. Requested (sorta) by: rwatson MFC after: 2 weeks	2005-02-13 16:42:08 +00:00
Maxim Sobolev	282fae35d6	Semctl with IPC_STAT command should return zero in case of success. PR: 73778 Submitted by: Andriy Gapon <avg@icyb.net.ua> MFC after: 2 weeks	2005-02-11 13:46:55 +00:00
Bill Paul	b545a3b822	Next step on the road to IRPs: create and use an imitation of the Windows DRIVER_OBJECT and DEVICE_OBJECT mechanism so that we can simulate driver stacking. In Windows, each loaded driver image is attached to a DRIVER_OBJECT structure. Windows uses the registry to match up a given vendor/device ID combination with a corresponding DRIVER_OBJECT. When a driver image is first loaded, its DriverEntry() routine is invoked, which sets up the AddDevice() function pointer in the DRIVER_OBJECT and creates a dispatch table (based on IRP major codes). When a Windows bus driver detects a new device, it creates a Physical Device Object (PDO) for it. This is a DEVICE_OBJECT structure, with semantics analogous to that of a device_t in FreeBSD. The Windows PNP manager will invoke the driver's AddDevice() function and pass it pointers to the DRIVER_OBJECT and the PDO. The AddDevice() function then creates a new DEVICE_OBJECT structure of its own. This is known as the Functional Device Object (FDO) and corresponds roughly to a private softc instance. The driver uses IoAttachDeviceToDeviceStack() to add this device object to the driver stack for this PDO. Subsequent drivers (called filter drivers in Windows-speak) can be loaded which add themselves to the stack. When someone issues an IRP to a device, it travels along the stack passing through several possible filter drivers until it reaches the functional driver (which actually knows how to talk to the hardware) at which point it will be completed. This is how Windows achieves driver layering. Project Evil now simulates most of this. if_ndis now has a modevent handler which will use MOD_LOAD and MOD_UNLOAD events to drive the creation and destruction of DRIVER_OBJECTs. (The load event also does the relocation/dynalinking of the image.) We don't have a registry, so the DRIVER_OBJECTS are stored in a linked list for now. Eventually, the list entry will contain the vendor/device ID list extracted from the .INF file. 
When ndis_probe() is called and detects a supported device, it will create a PDO for the device instance and attach it to the DRIVER_OBJECT just as in Windows. ndis_attach() will then call our NdisAddDevice() handler to create the FDO. The NDIS miniport block is now a device extension hung off the FDO, just as it is in Windows. The miniport characteristics table is now an extension hung off the DRIVER_OBJECT as well (the characteristics are the same for all devices handled by a given driver, so they don't need to be per-instance.) We also do an IoAttachDeviceToDeviceStack() to put the FDO on the stack for the PDO. There are a couple of fake bus drivers created for the PCI and pccard buses. Eventually, there will be one for USB, which will actually accept USB IRPs. Things should still work just as before, only now we do things in the proper order and maintain the correct framework to support passing IRPs between drivers. Various changes: - corrected the comments about IRQL handling in subr_hal.c to more accurately reflect reality - update ndiscvt to make the drv_data symbol in ndis_driver_data.h a global so that if_ndis_pci.o and/or if_ndis_pccard.o can see it. - Obtain the softc pointer from the miniport block by referencing the PDO rather than a private pointer of our own (nmb_ifp is no longer used) - implement IoAttachDeviceToDeviceStack(), IoDetachDevice(), IoGetAttachedDevice(), IoAllocateDriverObjectExtension(), IoGetDriverObjectExtension(), IoCreateDevice(), IoDeleteDevice(), IoAllocateIrp(), IoReuseIrp(), IoMakeAssociatedIrp(), IoFreeIrp(), IoInitializeIrp() - fix a few mistakes in the driver_object and device_object definitions - add a new module, kern_windrv.c, to handle the driver registration and relocation/dynalinking duties (which don't really belong in kern_ndis.c). 
- made ndis_block and ndis_chars in the ndis_softc structure pointers and modified all references to it - fixed NdisMRegisterMiniport() and NdisInitializeWrapper() so they work correctly with the new driver_object mechanism - changed ndis_attach() to call NdisAddDevice() instead of ndis_load_driver() (which is now deprecated) - used ExAllocatePoolWithTag()/ExFreePool() in lookaside list routines instead of kludged up alloc/free routines - added kern_windrv.c to sys/modules/ndis/Makefile and files.i386.	2005-02-08 17:23:25 +00:00
John Baldwin	c87b5f76aa	- Implement svr4_emul_find() using kern_alternate_path(). This changes the semantics in that the returned filename to use is now a kernel pointer rather than a user space pointer. This required changing the arguments to the CHECKALT*() macros some and changing the various system calls that used pathnames to use the kern_foo() functions that can accept kernel space filename pointers instead of calling the system call directly. - Use kern_open(), kern_access(), kern_msgctl(), kern_execve(), kern_mkfifo(), kern_mknod(), kern_statfs(), kern_fstatfs(), kern_setitimer(), kern_stat(), kern_lstat(), kern_fstat(), kern_utimes(), kern_pathconf(), and kern_unlink().	2005-02-07 21:53:42 +00:00
John Baldwin	f7a2587298	- Use kern_{l,f,}stat() and kern_{f,}statfs() functions rather than duplicating the contents of the same functions inline. - Consolidate common code to convert a BSD statfs struct to a Linux struct into a static worker function.	2005-02-07 18:47:28 +00:00
John Baldwin	25771ec2a4	Make linux_emul_convpath() a simple wrapper for kern_alternate_path().	2005-02-07 18:46:05 +00:00
John Baldwin	76951d21d1	- Tweak kern_msgctl() to return a copy of the requested message queue id structure in the struct pointed to by the 3rd argument for IPC_STAT and get rid of the 4th argument. The old way returned a pointer into the kernel array that the calling function would then access afterwards without holding the appropriate locks and doing non-lock-safe things like copyout() with the data anyways. This change removes that unsafeness and resulting race conditions as well as simplifying the interface. - Implement kern_foo wrappers for stat(), lstat(), fstat(), statfs(), fstatfs(), and fhstatfs(). Use these wrappers to cut out a lot of code duplication for freebsd4 and netbsd compatibility system calls. - Add a new lookup function kern_alternate_path() that looks up a filename under an alternate prefix and determines which filename should be used. This is basically a more general version of linux_emul_convpath() that can be shared by all the ABIs thus allowing for further reduction of code duplication.	2005-02-07 18:44:55 +00:00
John Baldwin	12dd959a7d	Use kern_setitimer() to implement linux_alarm() instead of fondling the real interval timer directly.	2005-02-07 18:36:21 +00:00
Maxim Sobolev	4379219537	Boot away another stackgap (one of the last ones in linuxlator/i386) by providing special version of CDIOCREADSUBCHANNEL ioctl(), which assumes that result has to be placed into kernel space not user space. In the long run more generic solution has to be designed WRT emulating various ioctl()s that operate on userspace buffers, but right now only one such ioctl() is emulated, so that it makes little sense. MFC after: 2 weeks	2005-01-30 08:12:37 +00:00
Maxim Sobolev	a6886ef173	Extend kern_sendit() to take another enum uio_seg argument, which specifies where the buffer to send lies and use it to eliminate yet another stackgap in linuxlator. MFC after: 2 weeks	2005-01-30 07:20:36 +00:00
Maxim Sobolev	610ecfe035	o Split out kernel part of execve(2) syscall into two parts: one that copies arguments into the kernel space and one that operates completely in the kernel space; o use kernel-only version of execve(2) to kill another stackgap in linuxlator/i386. Obtained from: DragonFlyBSD (partially) MFC after: 2 weeks	2005-01-29 23:12:00 +00:00
Maxim Sobolev	f4b6eb045f	Split out kernel side of msgctl(2) into two parts: the first that pops data from the userland and pushes results back and the second which does actual processing. Use the latter to eliminate stackgap in the linux wrapper of that syscall. MFC after: 2 weeks	2005-01-26 00:46:36 +00:00
Maxim Sobolev	cfa0efe7ab	Split out kernel side of {get,set}itimer(2) into two parts: the first that pops data from the userland and pushes results back and the second which does actual processing. Use the latter to eliminate stackgap in the linux wrappers of those syscalls. MFC after: 2 weeks	2005-01-25 21:28:28 +00:00
Bill Paul	26805b1855	Apparently, the Intel icc compiler doesn't like it when you use attributes in casts (i.e. foo = (__stdcall sometype)bar). This only happens in two places where we need to set up function pointers, so work around the problem with some void pointer magic.	2005-01-25 17:00:54 +00:00
Bill Paul	df7b7cf4c3	Begin the first phase of trying to add IRP support (and ultimately USB device support): - Convert all of my locally chosen function names to their actual Windows equivalents, where applicable. This is a big no-op change since it doesn't affect functionality, but it helps avoid a bit of confusion (it's now a lot easier to see which functions are emulated Windows API routines and which are just locally defined). - Turn ndis_buffer into an mdl, like it should have been. The structure is the same, but now it belongs to the subr_ntoskrnl module. - Implement a bunch of MDL handling macros from Windows and use them where applicable. - Correct the implementation of IoFreeMdl(). - Properly implement IoAllocateMdl() and MmBuildMdlForNonPagedPool(). - Add the definitions for struct irp and struct driver_object. - Add IMPORT_FUNC() and IMPORT_FUNC_MAP() macros to make formatting the module function tables a little cleaner. (Should also help with AMD64 support later on.) - Fix if_ndis.c to use KeRaiseIrql() and KeLowerIrql() instead of the previous calls to hal_raise_irql() and hal_lower_irql() which have been renamed. The function renaming generated a lot of churn here, but there should be very little operational effect.	2005-01-24 18:18:12 +00:00
Paul Saab	0e214fad37	Add a 32bit syscall wrapper for modstat Obtained from: Yahoo!	2005-01-19 17:53:06 +00:00
Paul Saab	7fdf2c856f	- rename nanosleep1 to kern_nanosleep - Add a 32bit syscall entry for nanosleep Reviewed by: peter Obtained from: Yahoo!	2005-01-19 17:44:59 +00:00
Bill Paul	52378c7ead	Fix a problem reported by Pierre Beyssac. Sometimes when ndis_get_info() calls MiniportQueryInformation(), it will return NDIS_STATUS_PENDING. When this happens, ndis_get_info() will sleep waiting for a completion event. If two threads call ndis_get_info() and both end up having to sleep, they will both end up waiting on the same wait channel, which can cause a panic in sleepq_add() if INVARIANTS are turned on. Fix this by having ndis_get_info() use a common mutex rather than using the process mutex with PROC_LOCK(). Also do the same for ndis_set_info(). Note that Pierre's original patch also made ndis_thsuspend() use the new mutex, but ndis_thsuspend() shouldn't need this since it will make each thread that calls it sleep on a unique wait channel. Also, it occurred to me that we probably don't want to enter MiniportQueryInformation() or MiniportSetInformation() from more than one thread at any given time, so now we acquire a Windows spinlock before calling either of them. The Microsoft documentation says that MiniportQueryInformation() and MiniportSetInformation() are called at DISPATCH_LEVEL, and previously we would call KeRaiseIrql() to set the IRQL to DISPATCH_LEVEL before entering either routine, but this only guarantees mutual exclusion on uniprocessor machines. To make it SMP safe, we need to use a real spinlock. For now, I'm abusing the spinlock embedded in the NDIS_MINIPORT_BLOCK structure for this purpose. (This may need to be applied to some of the other routines in kern_ndis.c at a later date.) Export ntoskrnl_init_lock() (KeInitializeSpinlock()) from subr_ntoskrnl.c since we need to use it in kern_ndis.c, and since it's technically part of the Windows kernel DDK API along with the other spinlock routines. Use it in subr_ndis.c too rather than frobbing the spinlock directly.	2005-01-14 22:39:44 +00:00
David E. O'Brien	1997c537be	Match the LINUX32's style with existing style Submitted by: Jung-uk Kim <jkim@niksun.com> Use positive, not negative logic.	2005-01-14 04:44:56 +00:00
David E. O'Brien	9c0552ce3e	Fix Linux compat 'uname -m' on AMD64. Submitted by: Jung-uk Kim <jkim@niksun.com> (patch reworked by me)	2005-01-14 03:45:26 +00:00
Poul-Henning Kamp	fc5571cc25	Remove duplicate code.	2005-01-13 19:27:28 +00:00
Warner Losh	898b0535b7	Start each of the license/copyright comments with /*-	2005-01-05 22:34:37 +00:00
John Baldwin	c88379381b	- Move the function prototypes for kern_setrlimit() and kern_wait() to sys/syscallsubr.h where all the other kern_foo() prototypes live. - Resort kern_execve() while I'm there.	2005-01-05 22:19:44 +00:00
John Baldwin	d7d1139749	Regenerate.	2005-01-04 18:54:40 +00:00
John Baldwin	20ae37df8c	Partial sync up to the master syscalls.master file: - Mark mount, unmount and nmount MPSAFE. - Add a stub for _umtx_op(). - Mark open(), link(), unlink(), and freebsd32_sigaction() MPSAFE. Pointy hats to: several	2005-01-04 18:53:32 +00:00
John Baldwin	63710c4d35	Stop explicitly touching td_base_pri outside of the scheduler and simply set a thread's priority via sched_prio() when that is the desired action. The schedulers will start managing td_base_pri internally shortly.	2004-12-30 20:29:58 +00:00
Poul-Henning Kamp	c9b621fb98	Do not blindly pass linux filesystem specific mount data across.	2004-12-03 18:14:22 +00:00
Colin Percival	691b3b0df9	Fix unvalidated pointer dereference. This is FreeBSD-SA-04:17.procfs.	2004-12-01 21:33:02 +00:00
David Schultz	6004362e66	Don't include sys/user.h merely for its side-effect of recursively including other headers.	2004-11-27 06:51:39 +00:00
David Schultz	d3adf76902	Axe the semblance of support for PECOFF and Linux a.out core dumps.	2004-11-27 06:46:45 +00:00
Poul-Henning Kamp	f8524838b9	Ignore MNT_NODEV option, it is implicit in choice of filesystem.	2004-11-26 07:39:20 +00:00
David Schultz	0ef5c36ff1	Maintain the broken state of backwards compatibility for a.out (and PECOFF!) core dumps. None of the old versions of gdb I tried were able to read a.out core dumps before or after this change. Reviewed by: arch@	2004-11-20 02:32:04 +00:00
Mark Santcroos	463b173e50	Rebuild from compat/freebsd32/syscalls.master:1.43 Reviewed by: imp, phk, njl, peter Approved by: njl	2004-11-18 23:56:09 +00:00
Mark Santcroos	f16ab45fbc	32-bit FreeBSD ABI compatibility stubs from syscalls.master:1.179 Reviewed by: imp, phk, njl, peter Approved by: njl	2004-11-18 23:54:26 +00:00
Poul-Henning Kamp	124e4c3be8	Introduce an alias for FILEDESC_{UN}LOCK() with the suffix _FAST. Use this in all the places where sleeping with the lock held is not an issue. The distinction will become significant once we finalize the exact lock-type to use for this kind of case.	2004-11-13 11:53:02 +00:00
Poul-Henning Kamp	7689860fd5	Pick up the inode number using VOP_GETATTR() rather than caching it in all vnodes on the off chance that linprocfs needs it. If we can afford to call vn_fullpath() we can afford the much cheaper VOP_GETATTR().	2004-11-10 07:25:37 +00:00
Poul-Henning Kamp	0ac3a7f694	More sensible FILEDESC_ locking.	2004-11-07 15:59:27 +00:00
Robert Watson	a4bde6f695	Rebuild from FreeBSD32 syscalls.master:1.42.	2004-10-23 20:05:42 +00:00
Robert Watson	8e36528346	32-bit FreeBSD ABI compatibility stubs from syscalls.master:1.178.	2004-10-23 20:04:56 +00:00
Peter Wemm	a7bc3102c4	Put on my peril sensitive sunglasses and add a flags field to the internal sysctl routines and state. Add some code to use it for signalling the need to downconvert a data structure to 32 bits on a 64 bit OS when requested by a 32 bit app. I tried to do this in a generic abi wrapper that intercepted the sysctl oid's, or looked up the format string etc, but it was a real can of worms that turned into a fragile mess before I even got it partially working. With this, we can now run 'sysctl -a' on a 32 bit sysctl binary and have it not abort. Things like netstat, ps, etc have a long way to go. This also fixes a bug in the kern.ps_strings and kern.usrstack hacks. These do matter very much because they are used by libc_r and other things.	2004-10-11 22:04:16 +00:00
David Malone	08de85f54a	Rename thread args to be called "td" rather than "p" to be consistent with other bits of this file. There should be no functional change. Submitted by: Andrea Campi (many moons ago) MFC after: 2 month	2004-10-10 18:34:30 +00:00
Mike Makonnen	401901ac43	Close a race between a thread exiting and the freeing of its stack. After some discussion the best option seems to be to signal the thread's death from within the kernel. This requires that thr_exit() take an argument. Discussed with: davidxu, deischen, marcel MFC after: 3 days	2004-10-06 14:23:00 +00:00
John Baldwin	78c85e8dfc	Rework how we store process times in the kernel such that we always store the raw values including for child process statistics and only compute the system and user timevals on demand. - Fix the various kern_wait() syscall wrappers to only pass in a rusage pointer if they are going to use the result. - Add a kern_getrusage() function for the ABI syscalls to use so that they don't have to play stackgap games to call getrusage(). - Fix the svr4_sys_times() syscall to just call calcru() to calculate the times it needs rather than calling getrusage() twice with associated stackgap, etc. - Add a new rusage_ext structure to store raw time stats such as tick counts for user, system, and interrupt time as well as a bintime of the total runtime. A new p_rux field in struct proc replaces the same inline fields from struct proc (i.e. p_[isu]ticks, p_[isu]u, and p_runtime). A new p_crux field in struct proc contains the "raw" child time usage statistics. ruadd() has been changed to handle adding the associated rusage_ext structures as well as the values in rusage. Effectively, the values in rusage_ext replace the ru_utime and ru_stime values in struct rusage. These two fields in struct rusage are no longer used in the kernel. - calcru() has been split into a static worker function calcru1() that calculates appropriate timevals for user and system time as well as updating the rux_[isu]u fields of a passed in rusage_ext structure. calcru() uses a copy of the process' p_rux structure to compute the timevals after updating the runtime appropriately if any of the threads in that process are currently executing. It also now only locks sched_lock internally while doing the rux_runtime fixup. calcru() now only requires the caller to hold the proc lock and calcru1() only requires the proc lock internally. calcru() also no longer allows callers to ask for an interrupt timeval since none of them actually did. 
- calcru() now correctly handles threads executing on other CPUs. - A new calccru() function computes the child system and user timevals by calling calcru1() on p_crux. Note that this means that any code that wants child times must now call this function rather than reading from p_cru directly. This function also requires the proc lock. - This finishes the locking for rusage and friends so some of the Giant locks in exit1() and kern_wait() are now gone. - The locking in ttyinfo() has been tweaked so that a shared lock of the proctree lock is used to protect the process group rather than the process group lock. By holding this lock until the end of the function we now ensure that the process/thread that we pick to dump info about will no longer vanish while we are trying to output its info to the console. Submitted by: bde (mostly) MFC after: 1 month	2004-10-05 18:51:11 +00:00
John Baldwin	4afec35169	Add a proc *p pointer for td->td_proc to make this code easier to read.	2004-09-24 20:26:15 +00:00
Poul-Henning Kamp	f69f5fbd42	Hold thread reference while frobbing cdevsw.	2004-09-24 06:37:00 +00:00
John Baldwin	7eaec467d8	Various small style fixes.	2004-09-22 15:24:33 +00:00
Bruce M Simpson	6120a003b4	Fix compiler warnings, when __stdcall is #defined, by adding explicit casts. These normally only manifest if the ndis compat module is statically compiled into a kernel image by way of 'options NDISAPI'. Submitted by: Dmitri Nikulin Approved by: wpaul PR: kern/71449 MFC after: 1 week	2004-09-17 19:54:26 +00:00
John Baldwin	8a7aa72dec	Regenerate after fcntl() wrappers were marked MP safe.	2004-08-24 20:24:34 +00:00
John Baldwin	2ca25ab53e	Fix the ABI wrappers to use kern_fcntl() rather than calling fcntl() directly. This removes a few more users of the stackgap and also marks the syscalls using these wrappers MP safe where appropriate. Tested on: i386 with linux acroread5 Compiled on: i386, alpha LINT	2004-08-24 20:21:21 +00:00
Dag-Erling Smørgrav	72261b9f61	Don't try to translate the control message unless we're certain it's valid; otherwise a caller could trick us into changing any 32-bit word in kernel memory to LINUX_SOL_SOCKET (0x00000001) if its previous value is SOL_SOCKET (0x0000ffff). MFC after: 3 days	2004-08-23 12:41:29 +00:00
Bill Paul	ae58ccaa60	I'm a dumbass: remember to initialize fh->nf_map to NULL in ndis_open_file() in the module loading case.	2004-08-16 19:25:27 +00:00
Bill Paul	161a639981	The Texas Instruments ACX111 driver wants srand(), so provide it.	2004-08-16 18:52:37 +00:00
Bill Paul	f454f98c31	Make the Texas Instruments 802.11g chipset work with the NDISulator. This was tested with a Netgear WG311v2 802.11b/g PCI card. Things that were fixed: - This chip has two memory mapped regions, one at PCIR_BAR(0) and the other at PCIR_BAR(1). This is a little different from the other chips I've seen with two PCI shared memory regions, since they tend to have the second BAR at PCIR_BAR(2). if_ndis_pci.c tests explicitly for PCIR_BAR(2). This has been changed to simply fill in ndis_res_mem first and ndis_res_altmem second, if a second shared memory range exists. Given that NDIS drivers seem to scan for BARs in ascending order, I think this should be ok. - Fixed the code that tries to process firmware images that have been loaded as .ko files. To save a step, I was setting up the address mapping in ndis_open_file(), but ndis_map_file() flags pre-existing mappings as an error (to avoid duplicate mappings). Changed this so that the mapping is now done in ndis_map_file() as expected. - Made the typedef for 'driver_entry' explicitly include __stdcall to silence gcc warning in ndis_load_driver(). NOTE: the Texas Instruments ACX111 driver needs firmware. With my card, there were 3 .bin files shipped with the driver. You must either put these files in /compat/ndis or convert them with ndiscvt -f and kldload them so the driver can use them. Without the firmware image, the NIC won't work.	2004-08-16 18:50:20 +00:00
David E. O'Brien	b61c60d401	Fix the 'DEBUG' argument code to unbreak the amd64 LINT build.	2004-08-16 12:15:07 +00:00
David E. O'Brien	4a16b489ca	Fix the 'DEBUG' argument code to unbreak the amd64 LINT build.	2004-08-16 11:12:57 +00:00
David E. O'Brien	3a2e3a4aa7	Fix the 'DEBUG' argument code to unbreak the LINT build.	2004-08-16 10:36:12 +00:00
Tim J. Robbins	84880f87d0	Add support for 32-bit Linux binary emulation on amd64: - include <machine/../linux32/linux.h> instead of <machine/../linux/linux.h> if building with the COMPAT_LINUX32 option. - make minimal changes to the i386 linprocfs_docpuinfo() function to support amd64. We return a fake CPU family of 6 for now.	2004-08-16 08:19:18 +00:00
Tim J. Robbins	4af2762336	Changes to MI Linux emulation code necessary to run 32-bit Linux binaries on AMD64, and the general case where the emulated platform has different size pointers than we use natively: - declare certain structure members as l_uintptr_t and use the new PTRIN and PTROUT macros to convert to and from native pointers. - declare some structures __packed on amd64 when the layout would differ from that used on i386. - include <machine/../linux32/linux.h> instead of <machine/../linux/linux.h> if compiling with COMPAT_LINUX32. This will need to be revisited before 32-bit and 64-bit Linux emulation support can coexist in the same kernel. - other small scattered changes. This should be a no-op on i386 and Alpha.	2004-08-16 07:28:16 +00:00
Tim J. Robbins	ae8e14a6ac	Replace linux_getitimer() and linux_setitimer() with implementations based on those in freebsd32_misc.c, removing the assumption that Linux uses the same layout for struct itimerval as we use natively.	2004-08-15 12:34:15 +00:00
Tim J. Robbins	d1d6dbf120	Avoid assuming that l_timeval is the same as the native struct timeval in linux_select().	2004-08-15 12:24:05 +00:00
Tim J. Robbins	6fa534bad8	Use sv_psstrings from the current process's sysentvec structure instead of PS_STRINGS. This is a no-op at present, but it will be needed when running 32-bit Linux binaries on amd64 to ensure PS_STRINGS is in addressable memory.	2004-08-15 11:52:45 +00:00
Poul-Henning Kamp	41befa53a4	Add XXX comment about findcdev() misuse.	2004-08-14 08:38:17 +00:00
Marcel Moolenaar	4da47b2fec	Add __elfN(dump_thread). This function is called from __elfN(coredump) to allow dumping per-thread machine specific notes. On ia64 we use this function to flush the dirty registers onto the backingstore before we write out the PRSTATUS notes. Tested on: alpha, amd64, i386, ia64 & sparc64 Not tested on: arm, powerpc	2004-08-11 02:35:06 +00:00
Bill Paul	6f4481422e	More minor cleanups and one small bug fix: - In ntoskrnl_var.h, I had defined compat macros for ntoskrnl_acquire_spinlock() and ntoskrnl_release_spinlock() but never used them. This is fortunate since they were stale. Fix them to work properly. (In Windows/x86 KeAcquireSpinLock() is a macro that calls KefAcquireSpinLock(), which lives in HAL.dll. To imitate this, ntoskrnl_acquire_spinlock() is just a macro that calls hal_lock(), which lives in subr_hal.o.) - Add macros for ntoskrnl_raise_irql() and ntoskrnl_lower_irql() that call hal_raise_irql() and hal_lower_irql(). - Use these macros in kern_ndis.c, subr_ndis.c and subr_ntoskrnl.c. - Along the way, I realised subr_ndis.c:ndis_lock() was not calling hal_lock() correctly (it was using the FASTCALL2() wrapper when in reality this routine is FASTCALL1()). Using the ntoskrnl_acquire_spinlock() fixes this. Not sure if this actually caused any bugs since hal_lock() would have just ignored what was in %edx, but it was still bogus. This hides many of the uses of the FASTCALLx() macros which makes the code a little cleaner. Should not have any effect on generated object code, other than the one fix in ndis_lock().	2004-08-04 18:22:50 +00:00
Bill Paul	20b03f4992	In ndis_alloc_bufpool() and ndis_alloc_packetpool(), the test to see if allocating pool memory succeeded was checking the wrong pointer (should have been looking at *pool, not pool). Corrected this.	2004-08-01 21:15:29 +00:00
Bill Paul	f13b900a9e	Big mess 'o changes: - Give ndiscvt(8) the ability to process a .SYS file directly into a .o file so that we don't have to emit big messy char arrays into the ndis_driver_data.h file. This behavior is currently optional, but may become the default some day. - Give ndiscvt(8) the ability to turn arbitrary files into .ko files so that they can be pre-loaded or kldloaded. (Both this and the previous change involve using objcopy(1)). - Give NdisOpenFile() the ability to 'read' files out of kernel memory that have been kldloaded or pre-loaded, and disallow the use of the normal vn_open() file opening method during bootstrap (when no filesystems have been mounted yet). Some people have reported that kldloading if_ndis.ko works fine when the system is running multiuser but causes a panic when the module is pre-loaded by /boot/loader. This happens with drivers that need to use NdisOpenFile() to access external files (i.e. firmware images). NdisOpenFile() won't work during kernel bootstrapping because no filesystems have been mounted. To get around this, you can now do the following: o Say you have a firmware file called firmware.img o Do: ndiscvt -f firmware.img -- this creates firmware.img.ko o Put the firmware.img.ko in /boot/kernel o add firmware.img_load="YES" in /boot/loader.conf o add if_ndis_load="YES" and ndis_load="YES" as well Now the loader will suck the additional file into memory as a .ko. The phony .ko has two symbols in it: filename_start and filename_end, which are generated by objcopy(1). ndis_open_file() will traverse each module in the module list looking for these symbols and, if it finds them, it'll use them to generate the file mapping address and length values that the caller of NdisOpenFile() wants. As a bonus, this will even work if the file has been statically linked into the kernel itself, since the "kernel" module is searched too. (ndiscvt(8) will generate both filename.o and filename.ko for you).
- Modify the mechanism used to provide make-pretend FASTCALL support. Rather than using inline assembly to yank the first two arguments out of %ecx and %edx, we now use the __regparm__(3) attribute (and the __stdcall__ attribute) and use some macro magic to re-order the arguments and provide dummy arguments as needed so that the arguments passed in registers end up in the right place. Change taken from DragonflyBSD version of the NDISulator.	2004-08-01 20:04:31 +00:00
Poul-Henning Kamp	ebb48ffd65	Use kernel_vmount() instead of vfs_nmount().	2004-07-27 21:38:42 +00:00
Colin Percival	56f21b9d74	Rename suser_cred()'s PRISON_ROOT flag to SUSER_ALLOWJAIL. This is somewhat clearer, but more importantly allows for a consistent naming scheme for suser_cred flags. The old name is still defined, but will be removed in a few days (unless I hear any complaints...) Discussed with: rwatson, scottl Requested by: jhb	2004-07-26 07:24:04 +00:00
Bill Paul	020732be39	sigh Fix source code compatibility with 5.2.1-RELEASE _again_. (Make kdb stuff conditional.)	2004-07-20 20:28:57 +00:00
David Malone	fb75797e40	I missed two pieces of the commit to this file. Robert has already added one, this adds the other.	2004-07-18 09:26:34 +00:00
Robert Watson	38da2381cd	Remove 'sg' argument to linux_sendto_hdrincl, which is what I think was intended. This fixes the build, but might require revision.	2004-07-18 04:09:40 +00:00

1225 Commits