Skip to content
Snippets Groups Projects
This project is mirrored from https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git. Pull mirroring updated .
  1. Oct 10, 2007
    • Pavel Emelyanov's avatar
      [FS] seq_file: Introduce the seq_open_private() · 39699037
      Pavel Emelyanov authored
      
      This function allocates the zeroed chunk of memory and
      call seq_open(). The __seq_open_private() helper returns
      the allocated memory to make it possible for the caller
      to initialize it.
      
      Signed-off-by: default avatarPavel Emelyanov <xemul@openvz.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      39699037
    • Pavel Emelyanov's avatar
      [NETNS]: Move some code into __init section when CONFIG_NET_NS=n · 4665079c
      Pavel Emelyanov authored
      
      With the net namespaces many code leaved the __init section,
      thus making the kernel occupy more memory than it did before.
      Since we have a config option that prohibits the namespace
      creation, the functions that initialize/finalize some netns
      stuff are simply not needed and can be freed after the boot.
      
      Currently, this is almost not noticeable, since few calls
      are no longer in __init, but when the namespaces will be
      merged it will be possible to free more code. I propose to
      use the __net_init, __net_exit and __net_initdata "attributes"
      for functions/variables that are not used if the CONFIG_NET_NS
      is not set to save more space in memory.
      
      The exiting functions cannot just reside in the __exit section,
      as noticed by David, since the init section will have
      references on it and the compilation will fail due to modpost
      checks. These references can exist, since the init namespace
      never dies and the exit callbacks are never called. So I
      introduce the __exit_refok attribute just like it is already
      done with the __init_refok.
      
      Signed-off-by: default avatarPavel Emelyanov <xemul@openvz.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4665079c
    • Eric W. Biederman's avatar
      [NET]: Fix race when opening a proc file while a network namespace is exiting. · 077130c0
      Eric W. Biederman authored
      
      The problem:  proc_net files remember which network namespace the are
      against but do not remember hold a reference count (as that would pin
      the network namespace).   So we currently have a small window where
      the reference count on a network namespace may be incremented when opening
      a /proc file when it has already gone to zero.
      
      To fix this introduce maybe_get_net and get_proc_net.
      
      maybe_get_net increments the network namespace reference count only if it is
      greater then zero, ensuring we don't increment a reference count after it
      has gone to zero.
      
      get_proc_net handles all of the magic to go from a proc inode to the network
      namespace instance and call maybe_get_net on it.
      
      PROC_NET the old accessor is removed so that we don't get confused and use
      the wrong helper function.
      
      Then I fix up the callers to use get_proc_net and handle the case case
      where get_proc_net returns NULL.  In that case I return -ENXIO because
      effectively the network namespace has already gone away so the files
      we are trying to access don't exist anymore.
      
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Acked-by: default avatarPaul E. McKenney <paulmck@us.ibm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      077130c0
    • Daniel Lezcano's avatar
      [NETNS]: Fix export symbols. · 36ac3135
      Daniel Lezcano authored
      
      Add the appropriate EXPORT_SYMBOLS for proc_net_create,
      proc_net_fops_create and proc_net_remove to fix errors when
      compiling allmodconfig
      
      Signed-off-by: default avatarMark Nelson <markn@au1.ibm.com>
      Acked-by: default avatarBenjamin Thery <benjamin.thery@bull.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      36ac3135
    • David S. Miller's avatar
      3c12afe7
    • Eric W. Biederman's avatar
      [NET]: Make the device list and device lookups per namespace. · 881d966b
      Eric W. Biederman authored
      
      This patch makes most of the generic device layer network
      namespace safe.  This patch makes dev_base_head a
      network namespace variable, and then it picks up
      a few associated variables.  The functions:
      dev_getbyhwaddr
      dev_getfirsthwbytype
      dev_get_by_flags
      dev_get_by_name
      __dev_get_by_name
      dev_get_by_index
      __dev_get_by_index
      dev_ioctl
      dev_ethtool
      dev_load
      wireless_process_ioctl
      
      were modified to take a network namespace argument, and
      deal with it.
      
      vlan_ioctl_set and brioctl_set were modified so their
      hooks will receive a network namespace argument.
      
      So basically anthing in the core of the network stack that was
      affected to by the change of dev_base was modified to handle
      multiple network namespaces.  The rest of the network stack was
      simply modified to explicitly use &init_net the initial network
      namespace.  This can be fixed when those components of the network
      stack are modified to handle multiple network namespaces.
      
      For now the ifindex generator is left global.
      
      Fundametally ifindex numbers are per namespace, or else
      we will have corner case problems with migration when
      we get that far.
      
      At the same time there are assumptions in the network stack
      that the ifindex of a network device won't change.  Making
      the ifindex number global seems a good compromise until
      the network stack can cope with ifindex changes when
      you change namespaces, and the like.
      
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      881d966b
    • Eric W. Biederman's avatar
      [NET]: Support multiple network namespaces with netlink · b4b51029
      Eric W. Biederman authored
      
      Each netlink socket will live in exactly one network namespace,
      this includes the controlling kernel sockets.
      
      This patch updates all of the existing netlink protocols
      to only support the initial network namespace.  Request
      by clients in other namespaces will get -ECONREFUSED.
      As they would if the kernel did not have the support for
      that netlink protocol compiled in.
      
      As each netlink protocol is updated to be multiple network
      namespace safe it can register multiple kernel sockets
      to acquire a presence in the rest of the network namespaces.
      
      The implementation in af_netlink is a simple filter implementation
      at hash table insertion and hash table look up time.
      
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b4b51029
    • Eric W. Biederman's avatar
      [NET]: Make /proc/net per network namespace · 457c4cbc
      Eric W. Biederman authored
      
      This patch makes /proc/net per network namespace.  It modifies the global
      variables proc_net and proc_net_stat to be per network namespace.
      The proc_net file helpers are modified to take a network namespace argument,
      and all of their callers are fixed to pass &init_net for that argument.
      This ensures that all of the /proc/net files are only visible and
      usable in the initial network namespace until the code behind them
      has been updated to be handle multiple network namespaces.
      
      Making /proc/net per namespace is necessary as at least some files
      in /proc/net depend upon the set of network devices which is per
      network namespace, and even more files in /proc/net have contents
      that are relevant to a single network namespace.
      
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      457c4cbc
    • Eric W. Biederman's avatar
      [NET]: Don't implement dev_ifname32 inline · 32da477a
      Eric W. Biederman authored
      
      The current implementation of dev_ifname makes maintenance difficult
      because updates to the implementation of the ioctl have to made in two
      places.  So this patch updates dev_ifname32 to do a classic 32/64
      structure conversion and call sys_ioctl like the rest of the
      compat calls do.
      
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      32da477a
  2. Oct 09, 2007
  3. Oct 08, 2007
  4. Oct 03, 2007
  5. Oct 01, 2007
  6. Sep 28, 2007
  7. Sep 26, 2007
  8. Sep 25, 2007
  9. Oct 03, 2007
  10. Sep 21, 2007
  11. Sep 20, 2007
    • Sunil Mushran's avatar
      ocfs2: Pack vote message and response structures · 813d974c
      Sunil Mushran authored
      
      The ocfs2_vote_msg and ocfs2_response_msg structs needed to be
      packed to ensure similar sizeofs in 32-bit and 64-bit arches. Without this,
      we had inadvertantly broken 32/64 bit cross mounts.
      
      Signed-off-by: default avatarSunil Mushran <sunil.mushran@oracle.com>
      Signed-off-by: default avatarMark Fasheh <mark.fasheh@oracle.com>
      813d974c
    • Mark Fasheh's avatar
      ocfs2: Don't double set write parameters · 5c26a7b7
      Mark Fasheh authored
      
      The target page offsets were being incorrectly set a second time in
      ocfs2_prepare_page_for_write(), which was causing problems on a 16k page
      size kernel. Additionally, ocfs2_write_failure() was incorrectly using those
      parameters instead of the parameters for the individual page being cleaned
      up.
      
      Signed-off-by: default avatarMark Fasheh <mark.fasheh@oracle.com>
      5c26a7b7
    • Mark Fasheh's avatar
      ocfs2: Fix pos/len passed to ocfs2_write_cluster · db56246c
      Mark Fasheh authored
      
      This was broken for file systems whose cluster size is greater than page
      size. Pos needs to be incremented as we loop through the descriptors, and
      len needs to be capped to the size of a single cluster.
      
      Signed-off-by: default avatarMark Fasheh <mark.fasheh@oracle.com>
      db56246c
    • Mark Fasheh's avatar
      ocfs2: Allow smaller allocations during large writes · 415cb800
      Mark Fasheh authored
      
      The ocfs2 write code loops through a page much like the block code, except
      that ocfs2 allocation units can be any size, including larger than page
      size. Typically it's equal to or larger than page size - most kernels run 4k
      pages, the minimum ocfs2 allocation (cluster) size.
      
      Some changes introduced during 2.6.23 changed the way writes to pages are
      handled, and inadvertantly broke support for > 4k page size. Instead of just
      writing one cluster at a time, we now handle the whole page in one pass.
      
      This means that multiple (small) seperate allocations might happen in the
      same pass. The allocation code howver typically optimizes by getting the
      maximum which was reserved. This triggered a BUG_ON in the extend code where
      it'd ask for a single bit (for one part of a > 4k page) and get back more
      than it asked for.
      
      Fix this by providing a variant of the high level allocation function which
      allows the caller to specify a maximum. The traditional function remains and
      just calls the new one with a maximum determined from the initial
      reservation.
      
      Signed-off-by: default avatarMark Fasheh <mark.fasheh@oracle.com>
      415cb800
    • Davide Libenzi's avatar
      signalfd simplification · b8fceee1
      Davide Libenzi authored
      
      This simplifies signalfd code, by avoiding it to remain attached to the
      sighand during its lifetime.
      
      In this way, the signalfd remain attached to the sighand only during
      poll(2) (and select and epoll) and read(2).  This also allows to remove
      all the custom "tsk == current" checks in kernel/signal.c, since
      dequeue_signal() will only be called by "current".
      
      I think this is also what Ben was suggesting time ago.
      
      The external effect of this, is that a thread can extract only its own
      private signals and the group ones.  I think this is an acceptable
      behaviour, in that those are the signals the thread would be able to
      fetch w/out signalfd.
      
      Signed-off-by: default avatarDavide Libenzi <davidel@xmailserver.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      b8fceee1
    • Christoph Hellwig's avatar
      [XFS] fix valid but harmless sparse warning · 1bc5858d
      Christoph Hellwig authored
      
      The new xlog_recover_do_reg_buffer checks call be16_to_cpu on di_gen which
      is a 32bit value so sparse rightly complains. Fortunately the warning is
      harmless because we don't care for the value, but only whether it's
      non-NULL. Due to that fact we can simply kill the endian swaps on this and
      the previous di_mode check entirely.
      
      SGI-PV: 969656
      SGI-Modid: xfs-linux-melb:xfs-kern:29709a
      
      Signed-off-by: default avatarChristoph Hellwig <hch@infradead.org>
      Signed-off-by: default avatarLachlan McIlroy <lachlan@sgi.com>
      Signed-off-by: default avatarTim Shimmin <tes@sgi.com>
      1bc5858d
    • Eric Sandeen's avatar
      [XFS] fix filestreams on 32-bit boxes · bcc7b445
      Eric Sandeen authored
      
      xfs_filestream_mount() sets up an mru cache with:
        err = xfs_mru_cache_create(&mp->m_filestream, lifetime, grp_count,
        (xfs_mru_cache_free_func_t)xfs_fstrm_free_func);
      but that cast is causing problems...
        typedef void (*xfs_mru_cache_free_func_t)(unsigned long, void*);
      but:
        void xfs_fstrm_free_func( xfs_ino_t ino, fstrm_item_t *item)
      so on a 32-bit box, it's casting (32, 32) args into (64, 32) and I assume
      it's getting garbage for *item, which subsequently causes an explosion.
      With this change the filestreams xfsqa tests don't oops on my 32-bit box.
      
      SGI-PV: 967795
      SGI-Modid: xfs-linux-melb:xfs-kern:29510a
      
      Signed-off-by: default avatarEric Sandeen <sandeen@sandeen.net>
      Signed-off-by: default avatarDavid Chinner <dgc@sgi.com>
      Signed-off-by: default avatarTim Shimmin <tes@sgi.com>
      bcc7b445
  12. Sep 19, 2007
    • Eric Sandeen's avatar
      ext34: ensure do_split leaves enough free space in both blocks · ef2b02d3
      Eric Sandeen authored
      
      The do_split() function for htree dir blocks is intended to split a leaf
      block to make room for a new entry.  It sorts the entries in the original
      block by hash value, then moves the last half of the entries to the new
      block - without accounting for how much space this actually moves.  (IOW,
      it moves half of the entry *count* not half of the entry *space*).  If by
      chance we have both large & small entries, and we move only the smallest
      entries, and we have a large new entry to insert, we may not have created
      enough space for it.
      
      The patch below stores each record size when calculating the dx_map, and
      then walks the hash-sorted dx_map, calculating how many entries must be
      moved to more evenly split the existing entries between the old block and
      the new block, guaranteeing enough space for the new entry.
      
      The dx_map "offs" member is reduced to u16 so that the overall map size
      does not change - it is temporarily stored at the end of the new block, and
      if it grows too large it may be overwritten.  By making offs and size both
      u16, we won't grow the map size.
      
      Also add a few comments to the functions involved.
      
      This fixes the testcase reported by hooanon05@yahoo.co.jp on the
      linux-ext4 list, "ext3 dir_index causes an error"
      
      Thanks to Andreas Dilger for discussing the problem & solution with me.
      
      Signed-off-by: default avatarEric Sandeen <sandeen@redhat.com>
      Signed-off-by: default avatarAndreas Dilger <adilger@clusterfs.com>
      Tested-by: default avatarJunjiro Okajima <hooanon05@yahoo.co.jp>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: <linux-ext4@vger.kernel.org>
      Cc: <stable@kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      ef2b02d3
    • Alexey Dobriyan's avatar
      nfs: fix oops re sysctls and V4 support · 49af7ee1
      Alexey Dobriyan authored
      
      NFS unregisters sysctls only if V4 support is compiled in.  However, sysctl
      table is not V4 specific, so unregister it always.
      
      Steps to reproduce:
      
      	[build nfs.ko with CONFIG_NFS_V4=n]
      	modrobe nfs
      	rmmod nfs
      	ls /proc/sys
      
      Unable to handle kernel paging request at ffffffff880661c0 RIP:
       [<ffffffff802af8e3>] proc_sys_readdir+0xd3/0x350
      PGD 203067 PUD 207063 PMD 7e216067 PTE 0
      Oops: 0000 [1] SMP
      CPU 1
      Modules linked in: lockd nfs_acl sunrpc
      Pid: 3335, comm: ls Not tainted 2.6.23-rc3-bloat #2
      RIP: 0010:[<ffffffff802af8e3>]  [<ffffffff802af8e3>] proc_sys_readdir+0xd3/0x350
      RSP: 0018:ffff81007fd93e78  EFLAGS: 00010286
      RAX: ffffffff880661c0 RBX: ffffffff80466370 RCX: ffffffff880661c0
      RDX: 00000000000014c0 RSI: ffff81007f3ad020 RDI: ffff81007efd8b40
      RBP: 0000000000000018 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000001 R11: ffffffff802a8570 R12: ffffffff880661c0
      R13: ffff81007e219640 R14: ffff81007efd8b40 R15: ffff81007ded7280
      FS:  00002ba25ef03060(0000) GS:ffff81007ff81258(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: ffffffff880661c0 CR3: 000000007dfaf000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process ls (pid: 3335, threadinfo ffff81007fd92000, task ffff81007d8a0000)
      Stack:  ffff81007f3ad150 ffffffff80283f30 ffff81007fd93f48 ffff81007efd8b40
       ffff81007ee00440 0000000422222222 0000000200035593 ffffffff88037e9a
       2222222222222222 ffffffff80466500 ffff81007e416400 ffff81007e219640
      Call Trace:
       [<ffffffff80283f30>] filldir+0x0/0xf0
       [<ffffffff80283f30>] filldir+0x0/0xf0
       [<ffffffff802840c7>] vfs_readdir+0xa7/0xc0
       [<ffffffff80284376>] sys_getdents+0x96/0xe0
       [<ffffffff8020bb3e>] system_call+0x7e/0x83
      
      Code: 41 8b 14 24 85 d2 74 dc 49 8b 44 24 08 48 85 c0 74 e7 49 3b
      RIP  [<ffffffff802af8e3>] proc_sys_readdir+0xd3/0x350
       RSP <ffff81007fd93e78>
      CR2: ffffffff880661c0
      Kernel panic - not syncing: Fatal exception
      
      Signed-off-by: default avatarAlexey Dobriyan <adobriyan@gmail.com>
      Acked-by: default avatarTrond Myklebust <trond.myklebust@fys.uio.no>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: <stable@kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      49af7ee1
    • Eric Sandeen's avatar
      dir_index: error out instead of BUG on corrupt dx dirs · 3d82abae
      Eric Sandeen authored
      
      Convert asserts (BUGs) in dx_probe from bad on-disk data to recoverable
      errors with helpful warnings.  With help catching other asserts from Duane
      Griffin <duaneg@dghda.com>
      
      Signed-off-by: default avatarEric Sandeen <sandeen@redhat.com>
      Acked-by: default avatarDuane Griffin <duaneg@dghda.com>
      Acked-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Cc: <stable@kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      3d82abae
  13. Sep 18, 2007
  14. Sep 17, 2007
  15. Sep 14, 2007
  16. Sep 12, 2007
  17. Sep 11, 2007
Loading