This project is mirrored from https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git.
Pull mirroring updated .
- Apr 02, 2009
-
-
Ilpo Järvinen authored
It seems that trivial reset of pcount to one was not sufficient in tcp_retransmit_skb. Multiple counters experience a positive miscount when skb's pcount gets lowered without the necessary adjustments (depending on skb's sacked bits which exactly), at worst a packets_out miscount can crash at RTO if the write queue is empty! Triggering this requires mss change, so bidir tcp or mtu probe or like. Signed-off-by:
Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Reported-by:
Markus Trippelsdorf <markus@trippelsdorf.de> Tested-by:
Uwe Bugla <uwe.bugla@gmx.de> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Ilpo Järvinen authored
We need full-scale adjustment to fix a TCP miscount in the next patch, so just move it into a helper and call for that from the other places. Signed-off-by:
Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Stephen Hemminger authored
GRO assumes that there is a one-to-one relationship between NAPI structure and network device. Some devices like sky2 share multiple devices on a single interrupt so only have one NAPI handler. Rather than split GRO from NAPI, just have GRO assume if device changes that it is a different flow. Signed-off-by:
Stephen Hemminger <shemminger@vyatta.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
Commit 78454473 (netfilter: iptables: lock free counters) forgot to disable BH in arpt_do_table(), ipt_do_table() and ip6t_do_table() Use rcu_read_lock_bh() instead of rcu_read_lock() cures the problem. Reported-and-bisected-by:
Roman Mindalev <r000n@r000n.net> Signed-off-by:
Eric Dumazet <dada1@cosmosbay.com> Acked-by:
Patrick McHardy <kaber@trash.net> Acked-by:
Stephen Hemminger <shemminger@vyatta.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Andy Grover authored
We have a 64bit value that needs to be set atomically. This is easy and quick on all 64bit archs, and can also be done on x86/32 with set_64bit() (uses cmpxchg8b). However other 32b archs don't have this. I actually changed this to the current state in preparation for mainline because the old way (using a spinlock on 32b) resulted in unsightly #ifdefs in the code. But obviously, being correct takes precedence. Signed-off-by:
Andy Grover <andy.grover@oracle.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Andy Grover authored
This fixes a bug where a connection was unexpectedly not on *any* list while being destroyed. It also cleans up some code duplication and regularizes some function names. * Grab appropriate lock in conn_free() and explain in comment * Ensure via locking that a conn is never not on either a dev's list or the nodev list * Add rds_xx_remove_conn() to match rds_xx_add_conn() * Make rds_xx_add_conn() return void * Rename remove_{,nodev_}conns() to destroy_{,nodev_}conns() and unify their implementation in a helper function * Document lock ordering as nodev conn_lock before dev_conn_lock Reported-by:
Yosef Etigin <yosefe@voltaire.com> Signed-off-by:
Andy Grover <andy.grover@oracle.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Andy Grover authored
rs_send_drop_to() is called during socket close. If it takes m_rs_lock without disabling interrupts, then rds_send_remove_from_sock() can run from the rx completion handler and thus deadlock. Signed-off-by:
Andy Grover <andy.grover@oracle.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Apr 01, 2009
-
-
Trond Myklebust authored
Also ensure that we use the protocol family instead of the address family when calling sock_create_kern(). Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Davide Libenzi authored
Add support for event-aware wakeups to the sockets code. Events are delivered to the wakeup target, so that epoll can avoid spurious wakeups for non-interesting events. Signed-off-by:
Davide Libenzi <davidel@xmailserver.org> Acked-by:
Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Miller <davem@davemloft.net> Cc: William Lee Irwin III <wli@movementarian.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Alexey Dobriyan authored
struct tty_operations::proc_fops took it's place and there is one less create_proc_read_entry() user now! Signed-off-by:
Alexey Dobriyan <adobriyan@gmail.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Alexey Dobriyan authored
Signed-off-by:
Alexey Dobriyan <adobriyan@gmail.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Al Viro authored
current->fs->umask is what most of fs_struct users are doing. Put that into a helper function. Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk>
-
- Mar 31, 2009
-
-
Wei Yongjun authored
Remove pointless conditional before kfree(). Signed-off-by:
Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Rami Rosen authored
Signed-off-by:
Rami Rosen <ramirose@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Mar 30, 2009
-
-
Alexey Dobriyan authored
Setting ->owner as done currently (pde->owner = THIS_MODULE) is racy as correctly noted at bug #12454. Someone can lookup entry with NULL ->owner, thus not pinning enything, and release it later resulting in module refcount underflow. We can keep ->owner and supply it at registration time like ->proc_fops and ->data. But this leaves ->owner as easy-manipulative field (just one C assignment) and somebody will forget to unpin previous/pin current module when switching ->owner. ->proc_fops is declared as "const" which should give some thoughts. ->read_proc/->write_proc were just fixed to not require ->owner for protection. rmmod'ed directories will be empty and return "." and ".." -- no harm. And directories with tricky enough readdir and lookup shouldn't be modular. We definitely don't want such modular code. Removing ->owner will also make PDE smaller. So, let's nuke it. Kudos to Jeff Layton for reminding about this, let's say, oversight. http://bugzilla.kernel.org/show_bug.cgi?id=12454 Signed-off-by:
Alexey Dobriyan <adobriyan@gmail.com>
-
Matt LaPlante authored
Signed-off-by:
Matt LaPlante <kernel1@cyberdogtech.com> Acked-by:
Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by:
Jiri Kosina <jkosina@suse.cz>
-
Rusty Russell authored
Impact: cleanup Time to clean up remaining laggards using the old cpu_ functions. Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au> Cc: Greg Kroah-Hartman <gregkh@suse.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Trond.Myklebust@netapp.com
-
- Mar 29, 2009
-
-
Pablo Neira Ayuso authored
This patch fixes a dependency with IPv6: ERROR: "__ipv6_addr_type" [net/netfilter/xt_cluster.ko] undefined! This patch adds a function that checks if the higher bits of the address is 0xFF to identify a multicast address, instead of adding a dependency due to __ipv6_addr_type(). I came up with this idea after Patrick McHardy pointed possible problems with runtime module dependencies. Reported-by:
Steven Noonan <steven@uplinklabs.net> Reported-by:
Randy Dunlap <randy.dunlap@oracle.com> Reported-by:
Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Harvey Harrison authored
Allows for the removal of byteswapping in some places and the removal of HIPQUAD (replaced by %pI4). Signed-off-by:
Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Harvey Harrison authored
dcc_ip is treated as a host-endian value in the first printk, but the second printk uses %pI4 which expects a be32. This will cause a mismatch between the debug statement and the warning statement. Treat as a be32 throughout and avoid some byteswapping during some comparisions, and allow another user of HIPQUAD to bite the dust. Signed-off-by:
Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Herbert Xu authored
When GRO/frag_list support was added to GSO, I made an error which broke the support for segmenting linear GSO packets (GSO packets are normally non-linear in the payload). These days most of these packets are constructed by the tun driver, which prefers to allocate linear memory if possible. This is fixed in the latest kernel, but for 2.6.29 and earlier it is still the norm. Therefore this bug causes failures with GSO when used with tun in 2.6.29. Reported-by:
James Huang <jamesclhuang@gmail.com> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Mar 28, 2009
-
-
Chuck Lever authored
We just augmented the kernel's RPC service registration code so that it automatically adjusts to what is supported in user space. Thus we no longer need the kernel configuration option to enable registering RPC services with v4 -- it's all done automatically. This patch is part of a series that addresses http://bugzilla.kernel.org/show_bug.cgi?id=12256 Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Chuck Lever authored
Move error reporting for RPC registration to rpcb_register's caller. This way the caller can choose to recover silently from certain errors, but report errors it does not recognize. Error reporting for kernel RPC service registration is now handled in one place. This patch is part of a series that addresses http://bugzilla.kernel.org/show_bug.cgi?id=12256 Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Chuck Lever authored
The kernel registers RPC services with the local portmapper with an rpcbind SET upcall to the local portmapper. Traditionally, this used rpcbind v2 (PMAP), but registering RPC services that support IPv6 requires rpcbind v3 or v4. Since we now want separate PF_INET and PF_INET6 listeners for each kernel RPC service, svc_register() will do only one of those registrations at a time. For PF_INET, it tries an rpcb v4 SET upcall first; if that fails, it does a legacy portmap SET. This makes it entirely backwards compatible with legacy user space, but allows a proper v4 SET to be used if rpcbind is available. For PF_INET6, it does an rpcb v4 SET upcall. If that fails, it fails the registration, and thus the transport creation. This let's the kernel detect if user space is able to support IPv6 RPC services, and thus whether it should maintain a PF_INET6 listener for each service at all. This provides complete backwards compatibilty with legacy user space that only supports rpcbind v2. The only down-side is that registering a new kernel RPC service may take an extra exchange with the local portmapper on legacy systems, but this is an infrequent operation and is done over UDP (no lingering sockets in TIMEWAIT), so it shouldn't be consequential. This patch is part of a series that addresses http://bugzilla.kernel.org/show_bug.cgi?id=12256 Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Chuck Lever authored
Our initial implementation of svc_unregister() assumed that PMAP_UNSET cleared all rpcbind registrations for a [program, version] tuple. However, we now have evidence that PMAP_UNSET clears only "inet" entries, and not "inet6" entries, in the rpcbind database. For backwards compatibility with the legacy portmapper, the svc_unregister() function also must work if user space doesn't support rpcbind version 4 at all. Thus we'll send an rpcbind v4 UNSET, and if that fails, we'll send a PMAP_UNSET. This simplifies the code in svc_unregister() and provides better backwards compatibility with legacy user space that does not support rpcbind version 4. We can get rid of the conditional compilation in here as well. This patch is part of a series that addresses http://bugzilla.kernel.org/show_bug.cgi?id=12256 Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Chuck Lever authored
The user space TI-RPC library uses an empty string for the universal address when unregistering all target addresses for [program, version]. The kernel's rpcb client should behave the same way. Here, we are switching between several registration methods based on the protocol family of the incoming address. Rename the other rpcbind v4 registration functions to make it clear that they, as well, are switched on protocol family. In /etc/netconfig, this is either "inet" or "inet6". NB: The loopback protocol families are not supported in the kernel. Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Chuck Lever authored
RFC 1833 has little to say about the contents of r_owner; it only specifies that it is a string, and states that it is used to control who can UNSET an entry. Our port of rpcbind (from Sun) assumes this string contains a numeric UID value, not alphabetical or symbolic characters, but checks this value only for AF_LOCAL RPCB_SET or RPCB_UNSET requests. In all other cases, rpcbind ignores the contents of the r_owner string. The reference user space implementation of rpcb_set(3) uses a numeric UID for all SET/UNSET requests (even via the network) and an empty string for all other requests. We emulate that behavior here to maintain bug-for-bug compatibility. Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Chuck Lever authored
Clean up: Simplify rpcb_v4_register() and its helpers by moving the details of sockaddr type casting to rpcb_v4_register()'s helper functions. Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Chuck Lever authored
The RPC client returns -EPROTONOSUPPORT if there is a protocol version mismatch (ie the remote RPC server doesn't support the RPC protocol version sent by the client). Helpers for the svc_register() function return -EPROTONOSUPPORT if they don't recognize the passed-in IPPROTO_ value. These are two entirely different failure modes. Have the helpers return -ENOPROTOOPT instead of -EPROTONOSUPPORT. This will allow callers to determine more precisely what the underlying problem is, and decide to report or recover appropriately. This patch is part of a series that addresses http://bugzilla.kernel.org/show_bug.cgi?id=12256 Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Chuck Lever authored
The kernel uses an IPv6 loopback address when registering its AF_INET6 RPC services so that it can tell whether the local portmapper is actually IPv6-enabled. Since the legacy portmapper doesn't listen on IPv6, however, this causes a long timeout on older systems if the kernel happens to try creating and registering an AF_INET6 RPC service. Originally I wanted to use a connected transport (either TCP or connected UDP) so that the upcall would fail immediately if the portmapper wasn't listening on IPv6, but we never agreed on what transport to use. In the end, it's of little consequence to the kernel whether the local portmapper is listening on IPv6. It's only important whether the portmapper supports rpcbind v4. And the kernel can't tell that at all if it is sending requests via IPv6 -- the portmapper will just ignore them. So, send both rpcbind v2 and v4 SET/UNSET requests via IPv4 loopback to maintain better backwards compatibility between new kernels and legacy user space, and prevent multi-second hangs in some cases when the kernel attempts to register RPC services. This patch is part of a series that addresses http://bugzilla.kernel.org/show_bug.cgi?id=12256 Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Chuck Lever authored
We are about to convert to using separate RPC listener sockets for PF_INET and PF_INET6. This echoes the way IPv6 is handled in user space by TI-RPC, and eliminates the need for ULPs to worry about mapped IPv4 AF_INET6 addresses when doing address comparisons. Start by setting the IPV6ONLY flag on PF_INET6 RPC listener sockets. Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Chuck Lever authored
Since an RPC service listener's protocol family is specified now via svc_create_xprt(), it no longer needs to be passed to svc_create() or svc_create_pooled(). Remove that argument from the synopsis of those functions, and remove the sv_family field from the svc_serv struct. Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Chuck Lever authored
The sv_family field is going away. Pass a protocol family argument to svc_create_xprt() instead of extracting the family from the passed-in svc_serv struct. Again, as this is a listener socket and not an address, we make this new argument an "int" protocol family, instead of an "sa_family_t." Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Chuck Lever authored
Since the sv_family field is going away, modify svc_setup_socket() to extract the protocol family from the passed-in socket instead of from the passed-in svc_serv struct. Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Chuck Lever authored
The sv_family field is going away. Instead of using sv_family, have the svc_register() function take a protocol family argument. Since this argument represents a protocol family, and not an address family, this argument takes an int, as this is what is passed to sock_create_kern(). Also make sure svc_register's helpers are checking for PF_FOO instead of AF_FOO. The value of [AP]F_FOO are equivalent; this is simply a symbolic change to reflect the semantics of the value stored in that variable. sock_create_kern() should return EPFNOSUPPORT if the passed-in protocol family isn't supported, but it uses EAFNOSUPPORT for this case. We will stick with that tradition here, as svc_register() is called by the RPC server in the same path as sock_create_kern(). Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Chuck Lever authored
Clean up: add documentating comment and use appropriate data types for svc_find_xprt()'s arguments. This also eliminates a mixed sign comparison: @port was an int, while the return value of svc_xprt_local_port() is an unsigned short. Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Chuck Lever authored
In 2007, commit e65fe397 added additional sanity checking to rpcb_decode_getaddr() to make sure we were getting a reply that was long enough to be an actual universal address. If the uaddr string isn't long enough, the XDR decoder returns EIO. However, an empty string is a valid RPCB_GETADDR response if the requested service isn't registered. Moreover, "::.n.m" is also a valid RPCB_GETADDR response for IPv6 addresses that is shorter than rpcb_decode_getaddr()'s lower limit of 11. So this sanity check introduced a regression for rpcbind requests against IPv6 remotes. So revert the lower bound check added by commit e65fe397, and add an explicit check for an empty uaddr string, similar to libtirpc's rpcb_getaddr(3). Pointed-out-by:
Jeff Layton <jlayton@redhat.com> Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Paul Moore authored
This patch cleans up a lot of the Smack network access control code. The largest changes are to fix the labeling of incoming TCP connections in a manner similar to the recent SELinux changes which use the security_inet_conn_request() hook to label the request_sock and let the label move to the child socket via the normal network stack mechanisms. In addition to the incoming TCP connection fixes this patch also removes the smk_labled field from the socket_smack struct as the minor optimization advantage was outweighed by the difficulty in maintaining it's proper state. Signed-off-by:
Paul Moore <paul.moore@hp.com> Acked-by:
Casey Schaufler <casey@schaufler-ca.com> Signed-off-by:
James Morris <jmorris@namei.org>
-
Paul Moore authored
The socket_post_accept() hook is not currently used by any in-tree modules and its existence continues to cause problems by confusing people about what can be safely accomplished using this hook. If a legitimate need for this hook arises in the future it can always be reintroduced. Signed-off-by:
Paul Moore <paul.moore@hp.com> Signed-off-by:
James Morris <jmorris@namei.org>
-
Paul Moore authored
The current NetLabel/SELinux behavior for incoming TCP connections works but only through a series of happy coincidences that rely on the limited nature of standard CIPSO (only able to convey MLS attributes) and the write equality imposed by the SELinux MLS constraints. The problem is that network sockets created as the result of an incoming TCP connection were not on-the-wire labeled based on the security attributes of the parent socket but rather based on the wire label of the remote peer. The issue had to do with how IP options were managed as part of the network stack and where the LSM hooks were in relation to the code which set the IP options on these newly created child sockets. While NetLabel/SELinux did correctly set the socket's on-the-wire label it was promptly cleared by the network stack and reset based on the IP options of the remote peer. This patch, in conjunction with a prior patch that adjusted the LSM hook locations, works to set the correct on-the-wire label format for new incoming connections through the security_inet_conn_request() hook. Besides the correct behavior there are many advantages to this change, the most significant is that all of the NetLabel socket labeling code in SELinux now lives in hooks which can return error codes to the core stack which allows us to finally get ride of the selinux_netlbl_inode_permission() logic which greatly simplfies the NetLabel/SELinux glue code. In the process of developing this patch I also ran into a small handful of AF_INET6 cleanliness issues that have been fixed which should make the code safer and easier to extend in the future. Signed-off-by:
Paul Moore <paul.moore@hp.com> Acked-by:
Casey Schaufler <casey@schaufler-ca.com> Signed-off-by:
James Morris <jmorris@namei.org>
-