Commit Graph

2456 Commits

Author SHA1 Message Date
Hans Petter Selasky
3a8bec33ef Fix handling of IOCTLs in the LinuxKPI.
Linux requires that all IOCTL data resides in userspace. FreeBSD
always moves the main IOCTL structure into a kernel buffer before
invoking the IOCTL handler and then copies it back into userspace,
before returning. Hide this difference in the "linux_copyin()" and
"linux_copyout()" functions by remapping userspace addresses in the
range from 0x10000 to 0x20000, to the kernel IOCTL data buffer.

It is assumed that the userspace code, data and stack segments starts
no lower than memory address 0x400000, which is also stated by "man 1
ld", which means any valid userspace pointer can be passed to regular
LinuxKPI handled IOCTLs.

Bump the FreeBSD version to force recompilation of all kernel modules.

Discussed with:	kmacy @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-05-12 11:38:28 +00:00
Hans Petter Selasky
15c98ff2f1 Remove redundant "task_struct_set()".
This is done by the "linux_kthread_fn()".

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-05-12 09:11:18 +00:00
Hans Petter Selasky
464d20bcc8 Create a dummy "task_struct" on the stack which is returned by
"current" inside all LinuxKPI file operation callbacks. The "current"
is frequently used for various debug prints, printing the thread name
and thread ID for example.

Obtained from:	kmacy @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-05-12 09:06:54 +00:00
Hans Petter Selasky
dacb734ea8 Match Linux behaviour and iterate the IDR tree unlocked. The caller is
responsible the IDR tree stays unmodified while iterating.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-05-11 17:20:20 +00:00
Hans Petter Selasky
b3c89b5ad9 Return a proper error code instead of panicing when an I/O vector
having the wrong number of entries is detected.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-05-11 10:50:59 +00:00
Hans Petter Selasky
fd42d62378 Add more IDR and IDA related functions to the LinuxKPI.
Obtained from:	kmacy @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-05-11 10:40:04 +00:00
Hans Petter Selasky
f8221002a5 Factor out common code into "idr_find_layer_locked()" and fix inverted
bitmap test for free entry in "idr_replace()".

Obtained from:	kmacy @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-05-11 10:35:15 +00:00
Hans Petter Selasky
5d35d77707 Add missing destruction of mutex.
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-05-11 10:06:58 +00:00
Hans Petter Selasky
8457719578 Add more atomic LinuxKPI functions.
Obtained from:	kmacy @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-05-11 07:58:43 +00:00
Hans Petter Selasky
f2dbb750f4 Implement ioremap_wt() and use that in the MEMREMAP_WT case for i386
and amd64.

Suggested by:	cem @
Discussed with:	kmacy @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-05-10 17:51:17 +00:00
Hans Petter Selasky
684a5fef01 Add more LinuxKPI I/O functions.
Obtained from:	kmacy @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-05-10 12:04:57 +00:00
Hans Petter Selasky
7652bc32f7 Use function macros when possible to avoid stray substitutions.
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-05-10 11:39:36 +00:00
Hans Petter Selasky
f2f5b1337e Add missing semicolon and properly wrap macro argument.
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-05-10 11:34:22 +00:00
Hans Petter Selasky
c7d81c66df Allow the argument for the cpu_to_xxxp() and xxx_to_cpup() macros to
point to a constant.

Obtained from:	kmacy @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-05-10 11:31:00 +00:00
Hans Petter Selasky
0754e66c54 Fix file polling bug.
Ensure the actual poll result is returned by the "linux_file_poll()"
function instead of zero which means no data is available.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2016-05-09 11:52:57 +00:00
Pedro F. Giffuni
1ce4275dd2 sys/compat/linux*: spelling fixes.
Mostly on comments but there are some user-visible messages as well.

MFC after: 2 weeks
2016-04-30 00:53:10 +00:00
Pedro F. Giffuni
72ffecf147 ndis: spelling fixes in comments.
No functional change.
2016-04-30 00:35:46 +00:00
Pedro F. Giffuni
2bede1b82a x86bios: spelling fix in a comment.
No functional change.
2016-04-30 00:34:04 +00:00
Pedro F. Giffuni
5cb9d60e30 x86bios_alloc(): Unsign a counter.
The value can't even be signed so we can avoid the signed vs. unsigned
comparison.

Reviewed by:	jkim
2016-04-29 20:22:10 +00:00
Pedro F. Giffuni
5fe2c518bd ndis(4): it's rather unrealistic to expect a size_t here.
int was actually OK, and u_int is more than enough.
2016-04-28 03:19:53 +00:00
Pedro F. Giffuni
9119df34df ndis(4): unsign some indexes to prevent overflows.
The "len" parameter is uint32_t, indexing it with an int may
end up in a signed integer overflow.

strlen(3) returns an integer of size_t so the corresponding index should
have that size.

MFC after:	1 week
2016-04-28 01:58:56 +00:00
Conrad Meyer
aa90aec270 osd(9): Change array pointer to array pointer type from void*
This is a minor follow-up to r297422, prompted by a Coverity warning.  (It's
not a real defect, just a code smell.)  OSD slot array reservations are an
array of pointers (void **) but were cast to void* and back unnecessarily.
Keep the correct type from reservation to use.

osd.9 is updated to match, along with a few trivial igor fixes.

Reported by:	Coverity
CID:		1353811
Sponsored by:	EMC / Isilon Storage Division
2016-04-26 19:57:35 +00:00
Pedro F. Giffuni
55e0987aea sys: extend use of the howmany() macro when available.
We have a howmany() macro in the <sys/param.h> header that is
convenient to re-use as it makes things easier to read.
2016-04-26 15:38:17 +00:00
Jamie Gritton
d56cf22d22 linux_map_osrel doesn't need to be checked in linux_prison_set,
since it already was in linux_prison_check.
2016-04-25 06:08:45 +00:00
Dmitry Chagin
6c960bf07f Allow to build svr4 module with SYSV support separatelly from the kernel build.
PR:		208464
Reported by:	Kristoffer Eriksson
MFC after:	2 week
2016-04-23 20:31:18 +00:00
Dmitry Chagin
9f8621b1d5 Fix streams and svr4 module dependency. Both modules are complaining about
undefined symbol svr4_delete_socket which was moved from streams to the svr4 module
in r160558 that created a two-way dependency between them.

PR:		208464
Submitted by:	Kristoffer Eriksson
Reported by:	Kristoffer Eriksson
MFC after:	2 week
2016-04-23 20:29:55 +00:00
Pedro F. Giffuni
b66bb393f2 Cleanup redundant parenthesis from existing howmany()/roundup() macro uses. 2016-04-22 16:57:42 +00:00
Conrad Meyer
8d340432aa linprocfs_doproclimits: Initialize error return before use
Reported by:	Coverity
CID:		1354623
Sponsored by:	EMC / Isilon Storage Division
2016-04-20 01:03:06 +00:00
Conrad Meyer
e78adba3fe linprocfs: Don't print uninitialized values
Reported by:	Coverity
CID:		1354624
Sponsored by:	EMC / Isilon Storage Division
2016-04-20 01:00:13 +00:00
Pedro F. Giffuni
02abd40029 kernel: use our nitems() macro when it is available through param.h.
No functional change, only trivial cases are done in this sweep,

Discussed in:	freebsd-current
2016-04-19 23:48:27 +00:00
Pedro F. Giffuni
500ed14d6e compat/linux: for pointers replace 0 with NULL.
plvc is a pointer, no functional change.

Found with devel/coccinelle.
2016-04-15 16:21:13 +00:00
Pedro F. Giffuni
74b8d63dcc Cleanup unnecessary semicolons from the kernel.
Found with devel/coccinelle.
2016-04-10 23:07:00 +00:00
Dmitry Chagin
5743aa47f5 More complete implementation of /proc/self/limits.
Fix the way the code accesses process limits struct - pointed out by mjg@.

PR:		207386
Reviewed by:	no objection form des@
MFC after:	3 weeks
2016-04-10 07:11:29 +00:00
Ed Schouten
ab83575070 Make CloudABI's way of doing TLS more friendly to userspace emulators.
We're currently seeing how hard it would be to run CloudABI binaries on
operating systems cannot be modified easily (Windows, Mac OS X). The
idea is that we want to just run them without any sandboxing. Now
that CloudABI executables are PIE, this is already a bit easier, but TLS
is still problematic:

- CloudABI executables want to write to the %fs, which typically
  requires extra system calls by the emulator every time it needs to
  switch between CloudABI's and its own TLS.

- If CloudABI executables overwrite the %fs base unconditionally, it
  also becomes harder for the emulator to store a backup of the old
  value of %fs. To solve this, let's no longer overwrite %fs, but just
  %fs:0.

As CloudABI's C library does not use a TCB, this space can now be used
by an emulator to keep track of its internal state. The executable can
now safely overwrite %fs:0, as long as it makes sure that the TCB is
copied over to the new TLS area.

Ensure that there is an initial TLS area set up when the process starts,
only containing a bogus TCB. We don't really care about its contents on
FreeBSD.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D5836
2016-04-06 11:11:31 +00:00
Pedro F. Giffuni
ae26eab161 Fix indentation oops. 2016-04-03 14:40:54 +00:00
Dmitry Chagin
8bc21bafba Move Linux specific times tests up to guarantee the values are defined.
CID:		1305178
Submitted by:	pfg@
MFC after:	1 week
2016-04-03 06:33:16 +00:00
Sepherosa Ziehau
1ea448225c tcp/lro: Change SLIST to LIST, so that removing an entry is O(1)
This is kinda critical to the performance when the CPU is slow and
network bandwidth is high, e.g. in the hypervisor.

Reviewed by:	rrs, gallatin, Dexuan Cui <decui microsoft com>
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5765
2016-04-01 06:43:05 +00:00
Ed Schouten
4a8b3b18cc Make Position Independent Executables work for CloudABI.
- Set BI_CAN_EXEC_DYN, so we can execute ET_DYN ELF files in addition to
  regular ET_EXECs.
- Provide an AT_BASE entry in the auxiliary vector, so the executable
  knows at which address it got loaded and can apply relocations.
2016-03-31 18:52:00 +00:00
Ed Schouten
2054309b6a Regenerate system call table after r297468. 2016-03-31 18:50:52 +00:00
Ed Schouten
38526a2cf1 Sync in the latest CloudABI system call definitions.
Some time ago I made a change to merge together the memory scope
definitions used by mmap (MAP_{PRIVATE,SHARED}) and lock objects
(PTHREAD_PROCESS_{PRIVATE,SHARED}). Though that sounded pretty smart
back then, it's backfiring. In the case of mmap it's used with other
flags in a bitmask, but for locking it's an enumeration. As our plan is
to automatically generate bindings for other languages, that looks a bit
sloppy.

Change all of the locking functions to use separate flags instead.

Obtained from:	https://github.com/NuxiNL/cloudabi
2016-03-31 18:50:06 +00:00
Navdeep Parhar
b03384114d Add wait_event_interruptible_timeout to linuxkpi.
Submitted by:	Krishnamraju Eraparaju @ Chelsio
Reviewed by:	hselasky@
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D5776
2016-03-31 17:11:58 +00:00
Hans Petter Selasky
9ad5ce9d01 Fix bugs in currently unused bit searching loop.
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2016-03-31 06:19:15 +00:00
Jamie Gritton
7ab25e3d18 Use osd_reserve / osd_jail_set_reserved, which is known to succeed.
Also don't work around nonexistent osd_register failure.
2016-03-30 17:05:04 +00:00
Gleb Smirnoff
9c64cfe56c The sendfile(2) allows to send extra data from userspace before the file
data (headers).  Historically the size of the headers was not checked
against the socket buffer space.  Application could easily overcommit the
socket buffer space.

With the new sendfile (r293439) the problem remained, but a KASSERT was
inserted that checked that amount of data written to the socket matches
its space.  In case when size of headers is bigger that socket space,
KASSERT fires.  Without INVARIANTS the new sendfile won't panic, but
would report incorrect amount of bytes sent.

o With this change, the headers copyin is moved down into the cycle, after
  the sbspace() check.  The uio size is trimmed by socket space there,
  which fixes the overcommit problem and its consequences.
o The compatibility handling for FreeBSD 4 sendfile headers API is pushed
  up the stack to syscall wrappers.  This required a copy and paste of the
  code, but in turn this allowed to remove extra stack carried parameter
  from fo_sendfile_t, and embrace entire compat code into #ifdef.  If in
  future we got more fo_sendfile_t function, the copy and paste level would
  even reduce.

Reviewed by:	emax, gallatin, Maxim Dounin <mdounin mdounin.ru>
Tested by:	Vitalij Satanivskij <satan ukr.net>
Sponsored by:	Netflix
2016-03-29 19:57:11 +00:00
Dmitry Chagin
7c5982000d Revert r297310 as the SOL_XXX are equal to the IPPROTO_XX except SOL_SOCKET.
Pointed out by:	ae@
2016-03-27 10:09:10 +00:00
Dmitry Chagin
c826fcfe22 iConvert Linux SOL_IPV6 level.
MFC after:	1 week
2016-03-27 08:12:01 +00:00
Dmitry Chagin
e667ee63f6 Whitespaces and style(9) fix. No functional changes.
MFC after:	1 week
2016-03-27 08:10:20 +00:00
Dmitry Chagin
09806d8e3e When write(2) on eventfd object fails with the error EAGAIN do not return
the number of bytes written.

MFC after:	1 week
2016-03-26 19:16:53 +00:00
Dmitry Chagin
2bb14e7541 Implement O_NONBLOCK flag via fcntl(F_SETFL) for eventfd object.
MFC after:	1 week
2016-03-26 19:15:23 +00:00
Ed Schouten
3cf50041ef Regenerate system call table after r297247. 2016-03-24 21:49:39 +00:00
Ed Schouten
1f3bbfd875 Replace the CloudABI system call table by a machine generated version.
The type definitions and constants that were used by COMPAT_CLOUDABI64
are a literal copy of some headers stored inside of CloudABI's C
library, cloudlibc. What is annoying is that we can't make use of
cloudlibc's system call list, as the format is completely different and
doesn't provide enough information. It had to be synced in manually.

We recently decided to solve this (and some other problems) by moving
the ABI definitions into a separate file:

	https://github.com/NuxiNL/cloudabi/blob/master/cloudabi.txt

This file is processed by a pile of Python scripts to generate the
header files like before, documentation (markdown), but in our case more
importantly: a FreeBSD system call table.

This change discards the old files in sys/contrib/cloudabi and replaces
them by the latest copies, which requires some minor changes here and
there. Because cloudabi.txt also enforces consistent names of the system
call arguments, we have to patch up a small number of system call
implementations to use the new argument names.

The new header files can also be included directly in FreeBSD kernel
space without needing any includes/defines, so we can now remove
cloudabi_syscalldefs.h and cloudabi64_syscalldefs.h. Patch up the
sources to include the definitions directly from sys/contrib/cloudabi
instead.
2016-03-24 21:47:15 +00:00
Dmitry Chagin
2ad0231309 Check bsd_to_linux_statfs() return value. Forgotten in r297070.
MFC after:	1 week
2016-03-20 19:06:21 +00:00
Dmitry Chagin
525c9796c3 Return EOVERFLOW in case when actual statfs values are large enough and
not fit into 32 bit fileds of a Linux struct statfs.

PR:		181012
MFC after:	1 week
2016-03-20 18:31:30 +00:00
Dmitry Chagin
7958a34cb5 Whitespaces, style(9) fixes. No functional changes.
MFC after:	1 week
2016-03-20 14:06:27 +00:00
Dmitry Chagin
99546279d6 Implement fstatfs64 system call.
PR:		181012
Submitted by:	John Wehle
MFC after:	1 week
2016-03-20 13:21:20 +00:00
Dmitry Chagin
4525bb829f Rework r296543:
1. Limit secs to INT32_MAX / 2 to avoid errors from kern_setitimer().
   Assert that kern_setitimer() returns 0.
   Remove bogus cast of secs.
   Fix style(9) issues.

2. Increment the return value if the remaining tv_usec value more than 500000 as a Linux does.

Pointed out by: [1] Bruce Evans

MFC after:	1 week
2016-03-20 11:40:52 +00:00
Justin Hibbits
da1b038af9 Use uintmax_t (typedef'd to rman_res_t type) for rman ranges.
On some architectures, u_long isn't large enough for resource definitions.
Particularly, powerpc and arm allow 36-bit (or larger) physical addresses, but
type `long' is only 32-bit.  This extends rman's resources to uintmax_t.  With
this change, any resource can feasibly be placed anywhere in physical memory
(within the constraints of the driver).

Why uintmax_t and not something machine dependent, or uint64_t?  Though it's
possible for uintmax_t to grow, it's highly unlikely it will become 128-bit on
32-bit architectures.  64-bit architectures should have plenty of RAM to absorb
the increase on resource sizes if and when this occurs, and the number of
resources on memory-constrained systems should be sufficiently small as to not
pose a drastic overhead.  That being said, uintmax_t was chosen for source
clarity.  If it's specified as uint64_t, all printf()-like calls would either
need casts to uintmax_t, or be littered with PRI*64 macros.  Casts to uintmax_t
aren't horrible, but it would also bake into the API for
resource_list_print_type() either a hidden assumption that entries get cast to
uintmax_t for printing, or these calls would need the PRI*64 macros.  Since
source code is meant to be read more often than written, I chose the clearest
path of simply using uintmax_t.

Tested on a PowerPC p5020-based board, which places all device resources in
0xfxxxxxxxx, and has 8GB RAM.
Regression tested on qemu-system-i386
Regression tested on qemu-system-mips (malta profile)

Tested PAE and devinfo on virtualbox (live CD)

Special thanks to bz for his testing on ARM.

Reviewed By: bz, jhb (previous)
Relnotes:	Yes
Sponsored by:	Alex Perez/Inertial Computing
Differential Revision: https://reviews.freebsd.org/D4544
2016-03-18 01:28:41 +00:00
John Baldwin
823590d40e Regen. 2016-03-12 22:55:07 +00:00
John Baldwin
8d91aced32 Regen. 2016-03-09 19:06:46 +00:00
John Baldwin
399e8c1773 Simplify AIO initialization now that it is standard.
- Mark AIO system calls as STD and remove the helpers to dynamically
  register them.
- Use COMPAT6 for the old system calls with the older sigevent instead of
  an 'o' prefix.
- Simplify the POSIX configuration to note that AIO is always available.
- Handle AIO in the default VOP_PATHCONF instead of special casing it in
  the pathconf() system call.  fpathconf() is still hackish.
- Remove freebsd32_aio_cancel() as it just called the native one directly.

Reviewed by:	kib
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D5589
2016-03-09 19:05:11 +00:00
Andrey V. Elsukov
86a9058b01 Add support for IPPROTO_IPV6 socket layer for getsockopt/setsockopt calls.
Also add mapping for several options from RFC 3493 and 3542.

Reviewed by:	dchagin
Tested by:	Joe Love <joe at getsomwhere dot net>
MFC after:	2 weeks
2016-03-09 09:12:40 +00:00
Dmitry Chagin
a87488d1e4 Better english.
Submitted by:	Kevin P. Neal
MFC after:	1 week
2016-03-08 19:40:01 +00:00
Dmitry Chagin
649ca5e9dc Put a commit message from r296502 about Linux alarm() system call
behaviour to the source.

Suggested by:	emaste@

MFC after:	1 week
2016-03-08 19:20:57 +00:00
Dmitry Chagin
91f514e413 Does not leak fp. While here remove bogus cast of fp->f_data.
MFC after:	1 week
2016-03-08 15:55:43 +00:00
Dmitry Chagin
fc4b98fb88 Linux accept() system call return EOPNOTSUPP errno instead of EINVAL
for UDP sockets.

MFC after:	1 week
2016-03-08 15:15:34 +00:00
Dmitry Chagin
15c3b371e2 According to POSIX and Linux implementation the alarm() system call
is always successfull.
So, ignore any errors and return 0 as a Linux do.

XXX. Unlike POSIX, Linux in case when the invalid seconds value specified
always return 0, so in that case Linux does not return proper remining time.

MFC after:	1 week
2016-03-08 15:12:49 +00:00
Dmitry Chagin
9f4e66afb9 Link the newly created process to the corresponding parent as
if CLONE_PARENT is set, then the parent of the new process will
be the same as that of the calling process.

MFC after:	1 week
2016-03-08 15:08:22 +00:00
Hans Petter Selasky
cb19abd277 Run the LinuxKPI PCI shutdown handler free of the Giant mutex.
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-03-07 14:35:31 +00:00
Hans Petter Selasky
510ebed7be Add more functions to the LinuxKPI.
Define strnicmp as a function macro instead of a regular macro while
at it.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-03-03 09:56:04 +00:00
Mark Johnston
0acf5d0bfd Improve error handling for posix_fallocate(2) and posix_fadvise(2).
- Set td_errno so that ktrace and dtrace can obtain the syscall error
  number in the usual way.
- Pass negative error numbers directly to the syscall layer, as they're
  not intended to be returned to userland.

Reviewed by:	kib
Sponsored by:	EMC / Isilon Storage Division
Differential Revision: https://reviews.freebsd.org/D5425
2016-02-25 19:58:23 +00:00
Ed Schouten
c0af8d16d8 Call cap_rights_init() properly.
Even though or'ing the individual rights works in this specific case, it
may not work in general. Pass them in as varargs.
2016-02-24 10:54:26 +00:00
Ed Schouten
70907712be Make handling of mmap()'s prot argument more strict.
- Make the system call fail if prot contains bits other than read, write
  and exec.
- Similar to OpenBSD's W^X, don't allow write and exec to be set at the
  same time. I'd like to see for now what happens if we enforce this
  policy unconditionally. If it turns out that this is far too strict,
  we'll loosen this requirement.
2016-02-23 09:22:00 +00:00
Svatopluk Kraus
35a0bc1260 As <machine/vmparam.h> is included from <vm/vm_param.h>, there is no
need to include it explicitly when <vm/vm_param.h> is already included.

Suggested by:	alc
Reviewed by:	alc
Differential Revision:	https://reviews.freebsd.org/D5379
2016-02-22 09:08:04 +00:00
Svatopluk Kraus
a1e1814d76 As <machine/pmap.h> is included from <vm/pmap.h>, there is no need to
include it explicitly when <vm/pmap.h> is already included.

Reviewed by:	alc, kib
Differential Revision:	https://reviews.freebsd.org/D5373
2016-02-22 09:02:20 +00:00
Dag-Erling Smørgrav
f4d6a773f8 Implement /proc/$$/limits.
PR:		207386
Submitted by:	Szymon Śliwa <knight.erraunt@gmail.com>
MFC after:	3 weeks
2016-02-21 14:56:05 +00:00
Jung-uk Kim
c2a9e596ed Silence VPS-Studio errors (V512). These buffer underflows are intentional. 2016-02-18 19:37:39 +00:00
Konstantin Belousov
db57c70a5b Rename P_KTHREAD struct proc p_flag to P_KPROC.
I left as is an apparent bug in ntoskrnl_var.h:AT_PASSIVE_LEVEL()
definition.

Suggested by:	jhb
Sponsored by:	The FreeBSD Foundation
2016-02-09 16:30:16 +00:00
Mateusz Guzik
813361c140 fork: plug a use after free of the returned process
fork1 required its callers to pass a pointer to struct proc * which would
be set to the new process (if any). procdesc and racct manipulation also
used said pointer.

However, the process could have exited prior to do_fork return and be
automatically reaped, thus making this a use-after-free.

Fix the problem by letting callers indicate whether they want the pid or
the struct proc, return the process in stopped state for the latter case.

Reviewed by:	kib
2016-02-04 04:25:30 +00:00
Mateusz Guzik
33fd9b9a2b fork: pass arguments to fork1 in a dedicated structure
Suggested by:	kib
2016-02-04 04:22:18 +00:00
Hans Petter Selasky
fe68f570d4 Update and add various macros to the LinuxKPI and resolve a macro
redefinition issue in the cxgb driver.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
Reviewed by:	np @
2016-01-26 15:26:35 +00:00
Hans Petter Selasky
c7c96d1093 LinuxKPI list updates:
- Add some new hlist macros.
- Update existing hlist macros removing the need for a temporary
  iteration variable.
- Properly define the RCU hlist macros to be SMP safe with regard
  to RCU.
- Safe list macro arguments by adding a pair of parentheses.
- Prefix the _list_add() and _list_splice() functions with "linux"
  to reflect they are LinuxKPI internal functions.

Obtained from:	Linux
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-01-26 15:12:31 +00:00
Hans Petter Selasky
e6ef991e5e Implement ether_addr_equal(), ether_addr_equal_64bits() and
random_ether_addr() for the LinuxKPI.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-01-26 14:36:16 +00:00
Hans Petter Selasky
f15ffb5e63 Implement is_vlan_dev() and vlan_dev_vlan_id() for the LinuxKPI.
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-01-26 14:33:20 +00:00
Hans Petter Selasky
e28297940b Implement bitmap_weight() and bitmap_equal() for the LinuxKPI.
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-01-26 14:31:20 +00:00
Hans Petter Selasky
d211177315 Add more network related macros and functions to the LinuxKPI.
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-01-26 14:29:50 +00:00
Hans Petter Selasky
4a19cc98b3 Add definition for the NETDEV_CHANGE event and tidy up the LinuxKPI
notifier header file a bit while at it.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-01-26 14:27:00 +00:00
Hans Petter Selasky
9f34efb9f4 Define __get_user() and __put_user() for the LinuxKPI.
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-01-26 14:21:30 +00:00
Hans Petter Selasky
a65ef21558 Add more LinuxKPI PCI related functions and defines.
Removed comments deriving from Linux.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-01-26 14:20:25 +00:00
Hans Petter Selasky
f919b7a664 Implement 64-bit atomic operations for the LinuxKPI.
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-01-21 17:56:23 +00:00
Hans Petter Selasky
29cbb3bef2 LinuxKPI atomic fixes:
- Fix implementation of atomic_add_unless(). The atomic_cmpset_int()
  function returns a boolean and not the previous value of the atomic
  variable.
- The atomic counters should be signed according to Linux.
- Some minor cosmetics and styling while at it.

Reviewed by:	alfred @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-01-21 17:52:55 +00:00
Hans Petter Selasky
1f6112d50a Use function macro instead of non-function macro to reduce chance of
incorrect expansion.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-01-21 17:36:06 +00:00
Hans Petter Selasky
2d1bee654a Implement idr_preload(), idr_preload_end(), idr_alloc() and
idr_alloc_cyclic() in the LinuxKPI. Bump the FreeBSD version to
force recompilation of all KLDs due to IDR structure size change.

MFC after:	2 weeks
Sponsored by:	Mellanox Technologies
2016-01-21 14:57:45 +00:00
John Baldwin
e23cd1b923 Initialize vm_page_prot to VM_MEMATTR_DEFAULT instead of 0.
If a driver's Linux mmap callback passed vm_page_prot through unchanged,
then linux_dev_mmap_single() would try to apply whatever VM_MEMATTR_xxx
value 0 is to the mapping.  On x86, VM_MEMATTR_DEFAULT is the PAT value
for write-back (WB) which is 6, while 0 maps to the PAT value for
uncacheable (UC).  Thus, any mmap request that did not explicitly set
page_prot was tried to map memory as UC triggering the warning in
sg_pager_getpages().

Tested by:	np
Reported by:	Krishnamraju Eraparaju @ Chelsio
MFC after:	3 days
Sponsored by:	Chelsio Communications
2016-01-20 00:14:34 +00:00
Dmitry Chagin
67968b35a0 Prevent double free of control in common sendmsg path as sosend
already freeing it.
2016-01-17 19:28:13 +00:00
Hans Petter Selasky
7d1333938f Implement support for PCI suspend, resume and shutdown events in the
LinuxKPI. Fix a few spaces to tabs. Bump the FreeBSD version to force
recompilation of existing KMODs.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-01-15 11:18:58 +00:00
Gleb Smirnoff
c8358c6e0d Call crextend() before copying old credentials to the new credentials
and replace crcopysafe by crcopy as crcopysafe is is not intended to be
safe in a threaded environment, it drops PROC_LOCK() in while() that
can lead to unexpected results, such as overwrite kernel memory.

In my POV crcopysafe() needs special attention. For now I do not see
any problems with this function, but who knows.

Submitted by:	dchagin
Found by:	trinity
Security:	SA-16:04.linux
2016-01-14 10:16:25 +00:00
Gleb Smirnoff
037f750877 Change linux get_robust_list system call to match actual linux one.
The set_robust_list system call request the kernel to record the head
of the list of robust futexes owned by the calling thread. The head
argument is the list head to record.
The get_robust_list system call should return the head of the robust
list of the thread whose thread id is specified in pid argument.
The list head should be stored in the location pointed to by head
argument.

In contrast, our implemenattion of get_robust_list system call copies
the known portion of memory pointed by recorded in set_robust_list
system call pointer to the head of the robust list to the location
pointed by head argument.

So, it is possible for a local attacker to read portions of kernel
memory, which may result in a privilege escalation.

Submitted by:	mjg
Security:	SA-16:03.linux
2016-01-14 10:13:58 +00:00
Dmitry Chagin
6437b8e7d9 Unlock process lock when return error from getrobustlist call and add
an forgotten dtrace probe when return the same error.

MFC after:	3 days
XMFC with:	r292743
2016-01-10 07:36:43 +00:00
Dmitry Chagin
038c720553 Implement vsyscall hack. Prior to 2.13 glibc uses vsyscall
instead of vdso. An upcoming linux_base-c6 needs it.

Differential Revision:  https://reviews.freebsd.org/D1090

Reviewed by:	kib, trasz
MFC after:	1 week
2016-01-09 20:18:53 +00:00
Hans Petter Selasky
0c510167fb LinuxKPI style changes:
- Properly prefix internal functions with "linux_" instead of only a
  single underscore to avoid future namespace collisions.
- Make some functions global instead of inline to ease debugging and
  to avoid unnecessary code duplication.
- Remove no longer existing kthread_create() function's prototype.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-01-08 10:04:19 +00:00