flag so that it can see if the message string is unicode or not and
do the conversion itself rather than doing it in subr_pe.c. This
prevents subr_pe.c from being dependent on subr_ndis.c.
the RT_MESSAGETABLE resources that some driver binaries have.
This allows us to print error messages in ndis_syslog().
- Correct the implementation of InterlockedIncrement() and
InterlockedDecrement() -- they return uint32_t, not void.
- Correct the declarations of the 64-bit arithmetic shift
routines in subr_ntoskrnl.c (_allshr, allshl, etc...). These
do not follow the _stdcall convention: instead, they appear
to be __attribute__((regparm(3)).
- Change the implementation of KeInitializeSpinLock(). There is
no complementary KeFreeSpinLock() function, so creating a new
mutex on each call to KeInitializeSpinLock() leaks resources
when a driver is unloaded. For now, KeInitializeSpinLock()
returns a handle to the ntoskrnl interlock mutex.
- Use a driver's MiniportDisableInterrupt() and MiniportEnableInterrupt()
routines if they exist. I'm not sure if I'm doing this right
yet, but at the very least this shouldn't break any currently
working drivers, and it makes the Intel PRO/1000 driver work.
- In ndis_register_intr(), save some state that might be needed
later, and save a pointer to the driver's interrupt structure
in the ndis_miniport_block.
- Save a pointer to the driver image for use by ndis_syslog()
when it calls pe_get_message().
and MiniportHandleInterrupt() is fired off later via a task queue in
ndis_intrtask(). This more accurately follows the NDIS interrupt handling
model, where the ISR does a minimal amount of work in interrupt context
and the handler is defered and run at a lower priority.
Create a separate ndis_intrmtx mutex just for the guarding the ISR.
Modify NdisSynchronizeWithInterrupt() to aquire the ndis_intrmtx
mutex before invoking the synchronized procedure. (The purpose of
this function is to provide mutual exclusion for code that shares
variables with the ISR.)
Modify NdisMRegisterInterrupt() to save a pointer to the miniport
block in the ndis_miniport_interrupt structure so that
NdisSynchronizeWithInterrupt() can grab it later and derive
ndis_intrmtx from it.
driver was compiled with.
Remove debug printf from ndis_assicn_pcirsc(). It doesn't serve
much purpose.
Implement NdisMIndicateStatus() and NdisMIndicateStatusComplete()
as functions in subr_ndis.c. In NDIS 4.0, they were functions. In
NDIS 5.0 and later, they're just macros.
Allocate a few extra packets/buffers beyond what the driver asks
for since sometimes it seems they can lie about how many they really
need, and some extra stupid ones don't check to see if NdisAllocatePacket()
and/or NdisAllocateBuffer() actually succeed.
calling the haltfunc. If an interrupt is triggered by the init
or halt func, the IFF_UP flag must be set in order for us to be able
to service it.
In kern_ndis.c: implement a handler for NdisMSendResourcesAvailable()
(currently does nothing since we don't really need it).
In subr_ndis.c:
- Correct ndis_init_string() and ndis_unicode_to_ansi(),
which were both horribly broken.
- Implement NdisImmediateReadPciSlotInformation() and
NdisImmediateWritePciSlotInformation().
- Implement NdisBufferLength().
- Work around my first confirmed NDIS driver bug.
The SMC 9462 gigE driver (natsemi 83820-based copper)
incorrectly creates a spinlock in its DriverEntry()
routine and then destroys it in its MiniportHalt()
handler. This is wrong: spinlocks should be created
in MiniportInit(). In a Windows environment, this is
often not a problem because DriverEntry()/MiniportInit()
are called once when the system boots and MiniportHalt()
or the shutdown handler is called when the system halts.
With this stuff in place, this driver now seems to work:
ndis0: <SMC EZ Card 1000> port 0xe000-0xe0ff mem 0xda000000-0xda000fff irq 10 at device 9.0 on pci0
ndis0: assign PCI resources...
ndis_open_file("FLASH9.hex", 18446744073709551615)
ndis0: Ethernet address: 00:04:e2:0e:d3:f0
subr_ndis.c: implement NdisDprAllocatePacket() and NdisDprFreePacket()
(which are aliased to NdisAllocatePacket() and NdisFreePacket()), and
bump the value we return in ndis_mapreg_cnt() to something ridiculously
large, since some drivers apparently expect to be able to allocate
way more than just 64.
These changes allow the Level 1 1000baseSX driver to work for
the following card:
ndis0: <SMC TigerCard 1000 Adapter> port 0xe000-0xe0ff mem 0xda004000-0xda0043ff irq 10 at device 9.0 on pci0
ndis0: Ethernet address: 00:e0:29:6f:cc:04
This is already supported by the lge(4) driver, but I decided
to take a try at making the Windows driver that came with it work too,
since I still had the floppy diskette for it lying around.
the NTx86 section decoration).
subr_ndis.c: correct the behavior of ndis_query_resources(): if the
caller doesn't provide enough space to return the resources, tell it
how much it needs to provide and return an error.
subr_hal.c & subr_ntoskrnl.c: implement/stub a bunch of new routines;
ntoskrnl:
KefAcquireSpinLockAtDpcLevel
KefReleaseSpinLockFromDpcLevel
MmMapLockedPages
InterlockedDecrement
InterlockedIncrement
IoFreeMdl
KeInitializeSpinLock
HAL:
KfReleaseSpinLock
KeGetCurrentIrql
KfAcquireSpinLock
Lastly, correct spelling of "_aullshr" in the ntoskrnl functable.
copyrights to the inf parser files.
Add a -n flag to ndiscvt to allow the user to override the default
device name of NDIS devices. Instead of "ndis0, ndis1, etc..."
you can have "foo0, foo1, etc..." This allows you to have more than
one kind of NDIS device in the kernel at the same time.
Convert from printf() to device_printf() in if_ndis.c, kern_ndis.c
and subr_ndis.c.
Create UMA zones for ndis_packet and ndis_buffer structs allocated
on transmit. The zones are created and destroyed in the modevent
handler in kern_ndis.c.
printf() and UMA changes submitted by green@freebsd.org
peter and jhb: use __volatile__ to prevent gcc from possibly reordering
code, use a null inline instruction instead of a no-op movl (I would
have done this myself if I knew it was allowed) and combine two register
assignments into a single asm statement.
- if_ndis.c: set the NDIS_STATUS_PENDING flag on all outgoing packets
in ndis_start(), make the resource allocation code a little smarter
about how it selects the altmem range, correct a lock order reversal
in ndis_tick().
ndis_var.h
- In kern_ndis.c:ndis_send_packets(), avoid dereferencing NULL pointers
created when the driver's send routine immediately calls the txeof
handler (which releases the packets for us anyway).
- In if_ndis.c:ndis_80211_setstate(), implement WEP support.
method with something a little more intelligent: use BUS_GET_RESOURCE_LIST()
to run through all resources allocated to us and map them as needed. This
way we know exactly what resources need to be mapped and what their RIDs
are without having to guess. This simplifies both ndis_attach() and
ndis_convert_res(), and eliminates the unfriendly "ndisX: couldn't map
<foo>" messages that are sometimes emitted during driver load.
nb_size field in an ndis_buffer is meant to represent, but it does not
represent the original allocation size, so the sanity check doesn't
make any sense now that we're using the Windows-mandated initialization
method.
Among other things, this makes the following card work with the
NDISulator:
ndis0: <NETGEAR PA301 Phoneline10X PCI Adapter> mem 0xda004000-0xda004fff irq 10 at device 9.0 on pci0
This is that notoriously undocumented 10Mbps HomePNA Broadcom chipset
that people wanted support for many moons ago. Sadly, the only other
HomePNA NIC I have handy is a 1Mbps device, so I can't actually do
any 10Mbps performance tests, but it talks to my 1Mbps ADMtek card
just fine.
For received packets, an status of NDIS_STATUS_RESOURCES means we need
to copy the packet data and return the ndis_packet to the driver immediatel.
NDIS_STATUS_SUCCESS means we get to hold onto the packet, but we have
to set the status to NDIS_STATUS_PENDING so the driver knows we're
going to hang onto it for a while.
For transmit packets, NDIS_STATUS_PENDING means the driver will
asynchronously return the packet to us via the ndis_txeof() routine,
and NDIS_STATUS_SUCCESS means the driver sent the frame, and NDIS
(i.e. the OS) retains ownership of the packet and can free it
right away.
evaluate them. Whatever they're meant to do, they're doing it wrong.
Also:
- Clean up last bits of NULL fallout in subr_pe
- Don't let ndis_ifmedia_sts() do anything if the IFF_UP flag isn't set
- Implement NdisSystemProcessorCount() and NdisQueryMapRegisterCount().
packet being freed has NDIS_STATUS_PENDING in the status field of
the OOB data. Finish implementing the "alternative" packet-releasing
function so it doesn't crash.
For those that are curious about ndis0: <ORiNOCO 802.11abg ComboCard Gold>:
1123 packets transmitted, 1120 packets received, 0% packet loss
round-trip min/avg/max/stddev = 3.837/6.146/13.919/1.925 ms
Not bad!
The log message for rev.1.160 of kern/uipc_syscalls.c and associated
changes only claimed to add restrict qualifiers (which have no effect in
the kernel so they probably shouldn't be added), but the following
interface changes were also made:
- caddr_t to `void *' and `struct sockaddr_t *'
- `int *' to `socklen_t *'.
These interface changes are not quite null, and this fix is quick (like
the changes in uipc_syscalls 1.160) because it uses bogus casts instead
of complete bounds-checked conversions.
Things should be fixed better when the conversions can be done without
using the stack gap. linux_check_hdrincl() already uses the stack gap
and is fixed completely though the type mismatches in it were not fatal
(there were only fatal type mismatches from unopaquing pointers to
[o]sockaddr't's -- the difference between accept()'s args and oaccept()'s
args is now non-opaque, but this is not reflected in their args structs).
mbuf<->packet housekeeping. Instead, add a couple of extra fields
to the end of ndis_packet. These should be invisible to the Windows
driver module.
This also lets me get rid of a little bit of evil from ndis_ptom()
(frobbing of the ext_buf field instead of relying on the MEXTADD()
macro).
- Fix ndis_time().
- Implement NdisGetSystemUpTime().
- Implement RtlCopyUnicodeString() and RtlUnicodeStringToAnsiString().
- In ndis_getstate_80211(), use sc->ndis_link to determine connect
status.
Submitted by: Brian Feldman <green@freebsd.org>
- Add explicit cardbus attachment in if_ndis.c
- Clean up after moving bus_setup_intr() in ndis_attach().
- When setting an ssid, program an empty ssid as a 1-byte string
with a single 0 byte. The Microsoft documentation says this is
how you're supposed to tell the NIC to attach to 'any' ssid.
- Keep trace of callout handles for timers externally from the
ndis_miniport_timer structs, and run through and clobber them
all after invoking the haltfunc just in case the driver left one
running. (We need to make sure all timers are cancelled on driver
unload.)
- Handle the 'cancelled' argument in ndis_cancel_timer() correctly.
NDIS_80211_NET_INFRA_BSS: I accidentally reversed them during
transcription from the Microsoft headers. Note that the
driver will default to BSS mode, and you need to specify
'mediaopt adhoc' to get it into IBSS mode.
supposed to be opaque to the driver, however it is exposed through
several macros which expect certain behavior. In my original
implementation, I used the mappedsystemva member of the structure
to hold a pointer to the buffer and bytecount to hold the length.
It turns out you must use the startva pointer to point to the
page containing the start of the buffer and set byteoffset to
the offset within the page where the buffer starts. So, for a buffer
with address 'baseva,' startva is baseva & ~(PAGE_SIZE -1) and
byteoffset is baseva & (PAGE_SIZE -1). We have to maintain this
convention everywhere that ndis_buffers are used.
Fortunately, Microsoft defines some macros for initializing and
manipulating NDIS_BUFFER structures in ntddk.h. I adapted some
of them for use here and used them where appropriate.
This fixes the discrepancy I observed between how RX'ed packet sizes
were being reported in the Broadcom wireless driver and the sample
ethernet drivers that I've tested. This should also help the
Intel Centrino wireless driver work.
Also try to properly initialize the 802.11 BSS and IBSS channels.
(Sadly, the channel value is meaningless since there's no way
in the existing NDIS API to get/set the channel, but this should
take care of any 'invalid channel (NULL)' messages printed on
the console.
In NdisQueryBuffer() and NdisQueryBufferSafe(), the vaddr argument is
optional, so test it before trying to dereference it.
Also correct NdisGetFirstBufferFromPacket()/NdisGetFirstBufferFromPacketSafe():
we need to use nb_mappedsystemva from the buffer, not nb_systemva.
routines: NdisUnchainBufferAtBack(), NdisGetFirstBufferFromPacketSafe()
and NdisGetFirstBufferFromPacket(). This should bring us a little
closer to getting the Intel centrino wireless NIC to work.
Note: I have not actually tested these additions since I don't
have a driver that calls them, however they're pretty simple, and
one of them is taken pretty much directly from the Windows ndis.h
header file, so I'm fairly confident they work, but disclaimers
apply.
- Make ndis_get_info()/ndis_set_info() sleep on the setdone/getdone
routines if they get back NDIS_STATUS_PENDING.
- Add a bunch of net80211 support so that 802.11 cards can be twiddled
with ifconfig. This still needs more work and is not guaranteed to
work for everyone. It works on my 802.11b/g card anyway.
The problem here is Microsoft doesn't provide a good way to a) learn
all the rates that a card supports (if it has more than 8, you're
kinda hosed) and b) doesn't provide a good way to distinguish between
802.11b, 802.11b/g an 802.11a/b/g cards, so you sort of have to guess.
Setting the SSID and switching between infrastructure/adhoc modes
should work. WEP still needs to be implemented. I can't find any API
for getting/setting the channel other than the registry/sysctl keys.
definitions for more than one device (usually differentiated by
the PCI subvendor/subdevice ID). Each device also has its own tree
of registry keys. In some cases, each device has the same keys, but
sometimes each device has a unique tree but with overlap. Originally,
I just had ndiscvt(8) dump out all the keys it could find, and we
would try to apply them to every device we could find. Now, each key
has an index number that matches it to a device in the device ID list.
This lets us create just the keys that apply to a particular device.
I also added an extra field to the device list to hold the subvendor
and subdevice ID.
Some devices are generic, i.e. there is no subsystem definition. If
we have a device that doesn't match a specific subsystem value and
we have a generic entry, we use the generic entry.
make it more robust. This should fix problems with crashes under
heavy traffic loads that have been reported. Also add a 'query done'
callback handler to satisfy the e100bex.sys sample Intel driver.
NdisAnsiStringToUnicodeString(), NdisWriteConfiguration().
Also add stubs for NdisMGetDeviceProperty(), NdisTerminateWrapper(),
NdisOpenConfigurationKeyByName(), NdisOpenConfigurationKeyByIndex()
and NdisMGetDeviceProperty().
- fix ndis_time() so that it returns a time based on the proper
epoch (wacky though it may be)
- implement NdisInitializeString() and NdisFreeString(), and add
stub for NdisMRemoveMiniport()
ntoskrnl_var.h:
- add missing member to the general_lookaside struct (gl_listentry)
subr_ntoskrnl.c:
- Fix arguments to the interlocked push/pop routines: 'head' is an
slist_header *, not an slist_entry *
- Kludge up _fastcall support for the push/pop routines. The _fastcall
convention is similar to _stdcall, except the first two available
DWORD-sized arguments are passed in %ecx and %edx, respectively.
One kludge for this __attribute__ ((regparm(3))), however this
isn't entirely right, as it assumes %eax, %ecx and %edx will be
used (regparm(2) assumes %eax and %edx). Another kludge is to
declare the two fastcall-ed args as local register variables and
explicitly assign them to %ecx and %edx, but experimentation showed
that gcc would not guard %ecx and %edx against being clobbered.
Thus, I came up with a 3rd kludge, which is to use some inline
assembly of the form:
void *arg1;
void *arg2;
__asm__("movl %%ecx, %%ecx" : "=c" (arg1));
__asm__("movl %%edx, %%edx" : "=d" (arg2));
This lets gcc know that we're going to reference %ecx and %edx and
that it should make an effort not to let it get trampled. This wastes
an instruction (movl %reg, %reg is a no-op) but insures proper
behavior. It's possible there's a better way to do this though:
this is the first time I've used inline assembler in this fashion.
The above fixes to ntoskrnl_var.h an subr_ntoskrnl.c make lookaside
lists work for the two drivers I have that use them, one of which
is an NDIS 5.0 miniport and another which is 5.1.
subr_ndis.c: NdisGetCurrentSystemTime() which, according to the
Microsoft documentation returns "the number of 100 nanosecond
intervals since January 1, 1601." I have no idea what's so special
about that epoch or why they chose 100 nanosecond ticks. I don't
know the proper offset to convert nanotime() from the UNIX epoch
to January 1, 1601, so for now I'm just doing the unit convertion
to 100s of nanoseconds.
subr_ntoskrnl.c: memcpy(), memset(), ExInterlockedPopEntrySList(),
ExInterlockedPushEntrySList().
The latter two are different from InterlockedPopEntrySList()
and InterlockedPushEntrySList() in that they accept a spinlock to
hold while executing, whereas the non-Ex routines use a lock
internal to ntoskrnl. I also modified ExInitializePagedLookasideList()
and ExInitializeNPagedLookasideList() to initialize mutex locks
within the lookaside structures. It seems that in NDIS 5.0,
the lookaside allocate/free routines ExInterlockedPopEntrySList()
and ExInterlockedPushEntrySList(), which require the use of the
per-lookaside spinlock, whereas in NDIS 5.1, the per-lookaside
spinlock is deprecated. We need to support both cases.
Note that I appear to be doing something wrong with
ExInterlockedPopEntrySList() and ExInterlockedPushEntrySList():
they don't appear to obtain proper pointers to their arguments,
so I'm probably doing something wrong in terms of their calling
convention (they're declared to be FASTCALL in Widnows, and I'm
not sure what that means for gcc). It happens that in my stub
lookaside implementation, they don't need to do any work anyway,
so for now I've hacked them to always return NULL, which avoids
corrupting the stack. I need to do this right though.
it's an error to set the buffer bytecount to anything larger than
the buffer's original allocation size, but anything less than that
is ok.
Also, in ndis_ptom(), use the same logic: if the bytecount is
larger than the allocation size, consider the bytecount invalid
and the allocation size as the packet fragment length (m_len)
instead of the bytecount.
This corrects a consistency problem between the Broadcom wireless
driver and some of the ethernet drivers I've tested: the ethernet
drivers all report the packet frag sizes in buf->nb_bytecount, but
the Broadcom wireless driver reports them in buf->nb_size. This
seems like a bug to me, but it clearly must work in Windows, so
we have to deal with it here too.
is provided to NDIS via the the miniport characteristics structure
supplied in the call to NdisMRegisterMiniport(). But in NDIS 5.0
and earlier, you had to call NdisMRegisterAdapterShutdownHandler()
and supply both a function pointer and context pointer.
We try to handle both cases in ndis_shutdown_nic(). If the
driver registered a shutdown routine and a context,then used
that context, otherwise pass it the adapter context from
NdisMSetAttributesEx().
This fixes a panic on shutdown with the sample Intel 82559 e100bex.sys
driver from the Windows DDK.
function pointer
Yes, it's what you think it is. Yes, you should run away now.
This is a special compatibility module for allowing Windows NDIS
miniport network drivers to be used with FreeBSD/x86. This provides
_binary_ NDIS compatibility (not source): you can run NDIS driver
code, but you can't build it. There are three main parts:
sys/compat/ndis: the NDIS compat API, which provides binary
compatibility functions for many routines in NDIS.SYS, HAL.dll
and ntoskrnl.exe in Windows (these are the three modules that
most NDIS miniport drivers use). The compat module also contains
a small PE relocator/dynalinker which relocates the Windows .SYS
image and then patches in our native routines.
sys/dev/if_ndis: the if_ndis driver wrapper. This module makes
use of the ndis compat API and can be compiled with a specially
prepared binary image file (ndis_driver_data.h) containing the
Windows .SYS image and registry key information parsed out of the
accompanying .INF file. Once if_ndis.ko is built, it can be loaded
and unloaded just like a native FreeBSD kenrel module.
usr.sbin/ndiscvt: a special utility that converts foo.sys and foo.inf
into an ndis_driver_data.h file that can be compiled into if_ndis.o.
Contains an .inf file parser graciously provided by Matt Dodd (and
mercilessly hacked upon by me) that strips out device ID info and
registry key info from a .INF file and packages it up with a binary
image array. The ndiscvt(8) utility also does some manipulation of
the segments within the .sys file to make life easier for the kernel
loader. (Doing the manipulation here saves the kernel code from having
to move things around later, which would waste memory.)
ndiscvt is only built for the i386 arch. Only files.i386 has been
updated, and none of this is turned on in GENERIC. It should probably
work on pc98. I have no idea about amd64 or ia64 at this point.
This is still a work in progress. I estimate it's about %85 done, but
I want it under CVS control so I can track subsequent changes. It has
been tested with exactly three drivers: the LinkSys LNE100TX v4 driver
(Lne100v4.sys), the sample Intel 82559 driver from the Windows DDK
(e100bex.sys) and the Broadcom BCM43xx wireless driver (bcmwl5.sys). It
still needs to have a net80211 stuff added to it. To use it, you would
do something like this:
# cd /sys/modules/ndis
# make; make load
# cd /sys/modules/if_ndis
# ndiscvt -i /path/to/foo.inf -s /path/to/foo.sys -o ndis_driver_data.h
# make; make load
# sysctl -a | grep ndis
All registry keys are mapped to sysctl nodes. Sometimes drivers refer
to registry keys that aren't mentioned in foo.inf. If this happens,
the NDIS API module creates sysctl nodes for these keys on the fly so
you can tweak them.
An example usage of the Broadcom wireless driver would be:
# sysctl hw.ndis0.EnableAutoConnect=1
# sysctl hw.ndis0.SSID="MY_SSID"
# sysctl hw.ndis0.NetworkType=0 (0 for bss, 1 for adhoc)
# ifconfig ndis0 <my ipaddr> netmask 0xffffff00 up
Things to be done:
- get rid of debug messages
- add in ndis80211 support
- defer transmissions until after a status update with
NDIS_STATUS_CONNECTED occurs
- Create smarter lookaside list support
- Split off if_ndis_pci.c and if_ndis_pccard.c attachments
- Make sure PCMCIA support works
- Fix ndiscvt to properly parse PCMCIA device IDs from INF files
- write ndisapi.9 man page
with the sendsig code in the MD area. It is not safe to assume that all
the register conventions will be the same. Also, the way of producing
32 bit code (.code32 directives) in this file is amd64 specific.
The split-up code is derived from the ia64 code originally.
Note that I have only compile-tested this, not actually run-tested it.
The ia64 side of the force is missing some significant chunks of signal
delivery code.
purpose and the resulting vattr structure was ignored. In addition,
the VOP_GETATTR call was made with no vnode lock held, resulting in
vnode locking violation panic with debug kernels.
Reported by: truckman
Approved by: re@ (rwatson)
- improve sysinfo(2) syscall;
- add dummy fadvise64(2) syscall;
- add dummy *xattr(2) family of syscalls;
- add protos for the syscalls 222-225, 238-249 and 253-267;
- add exit_group(2) syscall, which is currently just wired to exit(2).
Obtained from: OpenBSD
MFC after: 2 weeks
is highly MD in an emulation environment since it operates on the host
environment. Although the setregs functions are really for exec support
rather than signals, they deal with the same sorts of context and include
files. So I put it there rather than create yet another file.
1.36 +73 -60 src/sys/compat/linux/linux_ipc.c
1.83 +102 -48 src/sys/kern/sysv_shm.c
1.8 +4 -0 src/sys/sys/syscallsubr.h
That change was intended to support vmware3, but
wantrem parameter is useless because vmware3 uses SYSV shared memory
to talk with X server and X server is native application.
The patch worked because check for wantrem was not valid
(wantrem and SHMSEG_REMOVED was never checked for SHMSEG_ALLOCATED segments).
Add kern.ipc.shm_allow_removed (integer, rw) sysctl (default 0) which when set
to 1 allows to return removed segments in
shm_find_segment_by_shmid() and shm_find_segment_by_shmidx().
MFC after: 1 week
if_xname, if_dname, and if_dunit. if_xname is the name of the interface
and if_dname/unit are the driver name and instance.
This change paves the way for interface renaming and enhanced pseudo
device creation and configuration symantics.
Approved By: re (in principle)
Reviewed By: njl, imp
Tested On: i386, amd64, sparc64
Obtained From: NetBSD (if_xname)
structures come out the right size.
Fix the ones that broke. stat32 had some missing fields from the end
and statfs32 was broken due to the strange definition of MNAMELEN
(which is dependent on sizeof(long))
I'm not sure if this fixes any actual problems or not.
- Return NULL instead of returning memory outside of the stackgap
in stackgap_alloc() (FreeBSD-SA-00:42.linux)
- Check for stackgap_alloc() returning NULL in svr4_emul_find(),
and clean_pipe().
- Avoid integer overflow on large nfds argument in svr4_sys_poll()
- Reject negative nbytes argument in svr4_sys_getdents()
- Don't copy out past the end of the struct componentname
pathname buffer in svr4_sys_resolvepath()
- Reject out-of-range signal numbers in svr4_sys_sigaction(),
svr4_sys_signal(), and svr4_sys_kill().
- Don't malloc() user-specified lengths in show_ioc() and
show_strbuf(), place arbitrary limits instead.
- Range-check lengths in si_listen(), ti_getinfo(), ti_bind(),
svr4_do_putmsg(), svr4_do_getmsg(), svr4_stream_ti_ioctl().
Some fixes obtain from OpenBSD.
- Allocate storage for uap->msg always because it is copyin()'ed in
native sendmsg().
- Convert sockopt level from Linux to FreeBSD after native recvmsg() calling.
- Some cleanups.
Tested with: Oracle 9i shared server connection mode.
MFC after: 1 week
systems where the data/stack/etc limits are too big for a 32 bit process.
Move the 5 or so identical instances of ELF_RTLD_ADDR() into imgact_elf.c.
Supply an ia32_fixlimits function. Export the clip/default values to
sysctl under the compat.ia32 heirarchy.
Have mmap(0, ...) respect the current p->p_limits[RLIMIT_DATA].rlim_max
value rather than the sysctl tweakable variable. This allows mmap to
place mappings at sensible locations when limits have been reduced.
Have the imgact_elf.c ld-elf.so.1 placement algorithm use the same
method as mmap(0, ...) now does.
Note that we cannot remove all references to the sysctl tweakable
maxdsiz etc variables because /etc/login.conf specifies a datasize
of 'unlimited'. And that causes exec etc to fail since it can no
longer find space to mmap things.
with 64-bit longs again. This was fixed in rev.1.42 but the fix
rotted non-fatally in rev.1.105 and fatally in rev.1.137.
Many more non-egregrious casts are strictly required for conversions
from semi-opaque types to pointers, but we avoid most of them by using
types that are almost certain to be compatible with uintptr_t for
representing pointers (e.g., vm_offset_t). Here we don't really want
the u_longs, but we have them because a.out.h and its support code
doesn't use typedefs (it uses unsigned in V7 and unsigned long in
FreeBSD) and is too obsolete to fix now.
from the ia32 specific stuff. Some of this still needs to move to the MI
freebsd32 area, and some needs to move to the MD area. This is still
work-in-progress.
like we have on other platforms. Move savectx() to <machine/pcb.h>.
A lot of files got these MD prototypes through the indirect inclusion
of <machine/cpu.h> and now need to include <machine/md_var.h>. The
number of which is unexpectedly large...
osf1_misc.c especially is tricky because szsigcode is redefined in
one of the osf1 header files. Reordering of the include files was
needed.
linprocfs.c now needs an explicit extern declaration.
Tested with: LINT
- cut the version string at the newline, suppressing information about
who built the kernel and in what directory. Most of this information
was already lost to truncation.
- on i386, return the precise CPU class (if known) rather than just
"i386". Linux software which uses this information to select
which binary to run often does not know what to make of "i386".
contain the filedescriptor number on opens from userland.
The index is used rather than a "struct file *" since it conveys a bit
more information, which may be useful to in particular fdescfs and /dev/fd/*
For now pass -1 all over the place.
paging space and how much of it is in use (in pages).
Use this interface from the Linuxolator instead of groping around in the
internals of the swap_pager.
the VMIN and VTIME members of the c_cc array. These members are not
special control characters. By not excluding these members we
changed the noncanonical mode input processing when both members
were 0 on entry (=LINUX_POSIX_VDISABLE) as we would remap them to 255
(=_POSIX_VDISABLE). See termios(4) case A for how that screws up
your terminal I/O.
PR: 23173
Originator: Bjarne Blichfeldt <bbl@dk.damgaard.com>
Patch by: Boris Nikolaus <bn@dali.tellique.de> (original submission)
Philipp Mergenthaler <philipp.mergenthaler@stud.uni-karlsruhe.de>
Reminders by: Joseph Holland King <gte743n@cad.gatech.edu>
MFC after: 5 days
Several of the subtypes have an associated vnode which is used for
stuff like the f*() functions.
By giving the vnode a speparate field, a number of checks for the specific
subtype can be replaced simply with a check for f_vnode != NULL, and
we can later free f_data up to subtype specific use.
At this point in time, f_data still points to the vnode, so any code I
might have overlooked will still work.
having their stack at the 512GB mark. Give 4GB of user VM space for 32
bit apps. Note that this is significantly more than on i386 which gives
only about 2.9GB of user VM to a process (1GB for kernel, plus page
table pages which eat user VM space).
Approved by: re (blanket)
load_gs() calls into a single place that is less likely to go wrong.
Eliminate the per-process context switching of MSR_GSBASE, because it
should be constant for a single cpu. Instead, save/restore it during
the loading of the new %gs selector for the new process.
Approved by: re (amd64/* blanket)
stolen from the ia64/ia32 code (indeed there was a repocopy), but I've
redone the MD parts and added and fixed a few essential syscalls. It
is sufficient to run i386 binaries like /bin/ls, /usr/bin/id (dynamic)
and p4. The ia64 code has not implemented signal delivery, so I had
to do that.
Before you say it, yes, this does need to go in a common place. But
we're in a freeze at the moment and I didn't want to risk breaking ia64.
I will sort this out after the freeze so that the common code is in a
common place.
On the AMD64 side, this required adding segment selector context switch
support and some other support infrastructure. The %fs/%gs etc code
is hairy because loading %gs will clobber the kernel's current MSR_GSBASE
setting. The segment selectors are not used by the kernel, so they're only
changed at context switch time or when changing modes. This still needs
to be optimized.
Approved by: re (amd64/* blanket)
- Move struct sigacts out of the u-area and malloc() it using the
M_SUBPROC malloc bucket.
- Add a small sigacts_*() API for managing sigacts structures: sigacts_alloc(),
sigacts_free(), sigacts_copy(), sigacts_share(), and sigacts_shared().
- Remove the p_sigignore, p_sigacts, and p_sigcatch macros.
- Add a mutex to struct sigacts that protects all the members of the struct.
- Add sigacts locking.
- Remove Giant from nosys(), kill(), killpg(), and kern_sigaction() now
that sigacts is locked.
- Several in-kernel functions such as psignal(), tdsignal(), trapsignal(),
and thread_stopped() are now MP safe.
Reviewed by: arch@
Approved by: re (rwatson)
argument to the functions shm{at,ctl}1 and shm_find_segment_by_shmid{x}.
The BSD semantics didn't allow the usage of shared segment after
being marked for removal through IPC_RMID.
The patch involves the following functions:
- shmat
- shmctl
- shm_find_segment_by_shmid
- shm_find_segment_by_shmidx
- linux_shmat
- linux_shmctl
Submitted by: Orlando Bassotto <orlando.bassotto@ieo-research.it>
Reviewed by: marcel
do for newstat_copyout().
Lie about disk drives which are character devices
in FreeBSD but block devices under Linux.
PR: 37227
Submitted by: Vladimir B. Grebenschikov <vova@sw.ru>
Reviewed by: phk
MFC after: 2 weeks
FreeBSD flags instead of just adding one to the Linux flags. This should
be identical to the previous version except that I have at least one report
of this patch fixing problems people were having with Linux apps after my
last commit to this file. It is safer to use the switch then to make
assumptions about the flag values anyways, esp. since we currently use
MD defines for the values of the flags and this is MI code.
Tested by: Michael Class <michael_class@gmx.net>
kern_sigprocmask() in the various binary compatibility emulators.
- Replace calls to sigsuspend(), sigaltstack(), sigaction(), and
sigprocmask() that used the stackgap with calls to the corresponding
kern_sig*() functions instead without using the stackgap.
by allprison_mtx), a unique prison/jail identifier field, two path
fields (pr_path for reporting and pr_root vnode instance) to store
the chroot() point of each jail.
o Add jail_attach(2) to allow a process to bind to an existing jail.
o Add change_root() to perform the chroot operation on a specified
vnode.
o Generalize change_dir() to accept a vnode, and move namei() calls
to callers of change_dir().
o Add a new sysctl (security.jail.list) which is a group of
struct xprison instances that represent a snapshot of active jails.
Reviewed by: rwatson, tjr
a follow on commit to kern_sig.c
- signotify() now operates on a thread since unmasked pending signals are
stored in the thread.
- PS_NEEDSIGCHK moves to TDF_NEEDSIGCHK.
functions are now all basically identical except that alpha linux uses
Elf64 arguments and svr4 and i386 linux use Elf32. The fixups include
changing the first argument to be a register_t ** to match the prototype
for fixup functions, asserting that the process in the image_params struct
is always curproc and removing unnecessary locking to read credentials as a
result, and a few style fixes.
but I decided that it was important for this patch to not bit-rot, and
since it is mainly moving code around, the total amount of entropy is
epsilon /phk)
This is a patch to move the common parts of linux_getcwd() back into
kern/vfs_cache.c so that the standard FreeBSD libc getcwd() can use it's
extended functionality. The linux syscall linux_getcwd() in
compat/linux/linux_getcwd.c has been rewritten to use it too. It should
be possible to simplify libc's getcwd() after this. No doubt this code
needs some cleaning up, since I've left in the sysctl variables I used
for debugging.
PR: 48169
Submitted by: James Whitwell <abacau@yahoo.com.au>
take a thread instead of a proc for their first argument.
- Add a mutex to protect the system-wide Linux osname, osrelease, and
oss_version variables.
- Change linux_get_prison() to take a thread instead of a proc for its
first argument and to use td_ucred rather than p_ucred. This is ok
because a thread's prison does not change even though it's ucred might.
- Also, change linux_get_prison() to return a struct prison * instead of
a struct linux_prison * since it returns with the struct prison locked
and this makes it easier to safely unlock the prison when we are done
messing with it.
sched_lock around accesses to p_stats->p_timer[] to avoid a potential
race with hardclock. getitimer(), setitimer() and the realitexpire()
callout are now Giant-free.
so be more careful about calling stackgap_init.
Tested by: Fred Souza <fred@storming.org>
2) Linux_sendmsg was forgetting to fill out the bsd_args struct.
Reviewed by: ume
3) The args to linux_connect have differently named types on alpha and
i386, so add a cast to stop gcc complaining.
Spotted by: peter
pointer types, and remove a huge number of casts from code using it.
Change struct xfile xf_data to xun_data (ABI is still compatible).
If we need to add a #define for f_data and xf_data we can, but I don't
think it will be necessary. There are no operational changes in this
commit.
code, make the emulator use it.
Rename unsupported_msg() to unimplemented_syscall(). Rename some arguments
for clarity
Fixup grammar.
Requested by: bde
the same as fcntl() except that it supports the new 64-bit file
locking commands (LINUX_F_GETLK64 etc) that use the `flock64'
structure. We had been interpreting all flock structures passed to
fcntl64() as `struct flock64' instead of only the ones from F_*64
commands.
The glibc in linux_base-7 uses fcntl64() by default, but the bug
was often non-fatal since the misinterpretation typically only
causes junk to appear in the `l_len' field and most junk values are
accepted as valid range lengths. The result is occasional EINVAL
errors from F_SETLK and a few bytes after the supplied `struct
flock' getting clobbered during F_GETLK.
PR: kern/37656
Reviewed by: marcel
Approved by: re
MFC after: 1 week
(1) Permit userland applications to request a change of label atomic
with an execve() via mac_execve(). This is required for the
SEBSD port of SELinux/FLASK. Attempts to invoke this without
MAC compiled in result in ENOSYS, as with all other MAC system
calls. Complexity, if desired, is present in policy modules,
rather than the framework.
(2) Permit policies to have access to both the label of the vnode
being executed as well as the interpreter if it's a shell
script or related UNIX nonsense. Because we can't hold both
vnode locks at the same time, cache the interpreter label.
SEBSD relies on this because it supports secure transitioning
via shell script executables. Other policies might want to
take both labels into account during an integrity or
confidentiality decision at execve()-time.
Approved by: re
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
describes an image activation instance. Instead, make use of the
existing fname structure entry, and introduce two new entries,
userspace_argv, and userspace_envv. With the addition of
mac_execve(), this divorces the image structure from the specifics
of the execve() system call, removes a redundant pointer, etc.
No semantic change from current behavior, but it means that the
structure doesn't depend on syscalls.master-generated includes.
There seems to be some redundant initialization of imgact entries,
which I have maintained, but which could probably use some cleaning
up at some point.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
It is never used. I left it there from pre-KSE days as I didn't know
if I'd need it or not but now I know I don't.. It's functionality
is in TDI_IWAIT in the thread.
This is for the not-quite-ready signal/fpu abi stuff. It may not see
the light of day, but I'm certainly not going to be able to validate it
when getting shot in the foot due to syscall number conflicts.
execve_secure() system call, which permits a process to pass in a label
for a label change during exec. This permits SELinux to change the
label for the resulting exec without a race following a manual label
change on the process. Because this interface uses our general purpose
MAC label abstraction, we call it execve_mac(), and wrap our port of
SELinux's execve_secure() around it with appropriate sid mappings.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
- add wrappers for mmap2(2) and ftruncate64(2) system calls;
- don't spam console with printf's when VFAT_READDIR_BOTH ioctl(2) is invoked;
- add support for SOUND_MIXER_READ_STEREODEVS ioctl(2);
- make msgctl(IPC_STAT) and IPC_SET actually working by converting from
BSD msqid_ds to Linux and vice versa;
- properly return EINVAL if semget(2) is called with nsems being negative.
Reviewed by: marcel
Approved by: marcel
Tested with: LSB runtime test