This project is mirrored from https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git. Pull mirroring updated .
  1. 20 Mar, 2012 1 commit
  2. 05 Mar, 2012 3 commits
    • Curt Wohlgemuth's avatar
      ext4: add comments to definition of ext4_io_end_t · 4188188b
      Curt Wohlgemuth authored
      
      
      This should make it more clear what this structure is used
      for, and how some of the (mutually exclusive) fields are
      used to keep page cache references.
      
      Signed-off-by: default avatarCurt Wohlgemuth <curtw@google.com>
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      4188188b
    • Jeff Moyer's avatar
      ext4: fix race between sync and completed io work · 491caa43
      Jeff Moyer authored
      The following command line will leave the aio-stress process unkillable
      on an ext4 file system (in my case, mounted on /mnt/test):
      
      aio-stress -t 20 -s 10 -O -S -o 2 -I 1000 /mnt/test/aiostress.3561.4 /mnt/test/aiostress.3561.4.20 /mnt/test/aiostress.3561.4.19 /mnt/test/aiostress.3561.4.18 /mnt/test/aiostress.3561.4.17 /mnt/test/aiostress.3561.4.16 /mnt/test/aiostress.3561.4.15 /mnt/test/aiostress.3561.4.14 /mnt/test/aiostress.3561.4.13 /mnt/test/aiostress.3561.4.12 /mnt/test/aiostress.3561.4.11 /mnt/test/aiostress.3561.4.10 /mnt/test/aiostress.3561.4.9 /mnt/test/aiostress.3561.4.8 /mnt/test/aiostress.3561.4.7 /mnt/test/aiostress.3561.4.6 /mnt/test/aiostress.3561.4.5 /mnt/test/aiostress.3561.4.4 /mnt/test/aiostress.3561.4.3 /mnt/test/aiostress.3561.4.2
      
      This is using the aio-stress program from the xfstests test suite.
      That particular command line tells aio-stress to do random writes to
      20 files from 20 threads (one thread per file).  The files are NOT
      preallocated, so you will get writes to random offsets within the
      file, thus creating holes and extending i_size.  It also opens the
      file with O_DIRECT and O_SYNC.
      
      On to the problem.  When an I/O requires unwritten extent conversion,
      it is queued onto the completed_io_list for the ext4 inode.  Two code
      paths will pull work items from this list.  The first is the
      ext4_end_io_work routine, and the second is ext4_flush_completed_IO,
      which is called via the fsync path (and O_SYNC handling, as well).
      There are two issues I've found in these code paths.  First, if the
      fsync path beats the work routine to a particular I/O, the work
      routine will free the io_end structure!  It does not take into account
      the fact that the io_end may still be in use by the fsync path.  I've
      fixed this issue by adding yet another IO_END flag, indicating that
      the io_end is being processed by the fsync path.
      
      The second problem is that the work routine will make an assignment to
      io->flag outside of the lock.  I have witnessed this result in a hang
      at umount.  Moving the flag setting inside the lock resolved that
      problem.
      
      The problem was introduced by commit b82e384c
      
       ("ext4: optimize
      locking for end_io extent conversion"), which first appeared in 3.2.
      As such, the fix should be backported to that release (probably along
      with the unwritten extent conversion race fix).
      
      Signed-off-by: default avatarJeff Moyer <jmoyer@redhat.com>
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      CC: stable@kernel.org
      491caa43
    • Theodore Ts'o's avatar
      ext4: make ext4_show_options() be table-driven · 5a916be1
      Theodore Ts'o authored
      
      
      Consistently show mount options which are the non-default, so that
      /proc/mounts accurately shows the mount options that would be
      necessary to mount the file system in its current mode of operation.
      
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      5a916be1
  3. 03 Mar, 2012 1 commit
  4. 02 Mar, 2012 1 commit
  5. 20 Feb, 2012 3 commits
    • Jeff Moyer's avatar
      ext4: fix race between unwritten extent conversion and truncate · 266991b1
      Jeff Moyer authored
      
      
      The following comment in ext4_end_io_dio caught my attention:
      
      	/* XXX: probably should move into the real I/O completion handler */
              inode_dio_done(inode);
      
      The truncate code takes i_mutex, then calls inode_dio_wait.  Because the
      ext4 code path above will end up dropping the mutex before it is
      reacquired by the worker thread that does the extent conversion, it
      seems to me that the truncate can happen out of order.  Jan Kara
      mentioned that this might result in error messages in the system logs,
      but that should be the extent of the "damage."
      
      The fix is pretty straight-forward: don't call inode_dio_done until the
      extent conversion is complete.
      
      Reviewed-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarJeff Moyer <jmoyer@redhat.com>
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      266991b1
    • Theodore Ts'o's avatar
      ext4: fix INCOMPAT feature codepoint reservation for INLINEDATA · 856cbcf9
      Theodore Ts'o authored
      In commit 9b90e5e0
      
       I incorrectly reserved the wrong bit for
      EXT4_FEATURE_INCOMPAT_INLINEDATA per the discussion on the linux-ext4
      list on December 7, 2011.  The codepoint 0x2000 should be used for
      EXT4_FEATURE_INCOMPAT_USE_META_CSUM, so INLINEDATA will be assigned
      the value 0x8000.
      
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      856cbcf9
    • Theodore Ts'o's avatar
      ext4: fix race when setting bitmap_uptodate flag · 813e5727
      Theodore Ts'o authored
      
      
      In ext4_read_{inode,block}_bitmap() we were setting bitmap_uptodate()
      before submitting the buffer for read.  The is bad, since we check
      bitmap_uptodate() without locking the buffer, and so if another
      process is racing with us, it's possible that they will think the
      bitmap is uptodate even though the read has not completed yet,
      resulting in inodes and blocks potentially getting allocated more than
      once if we get really unlucky.
      
      Addresses-Google-Bug: 2828254
      
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      813e5727
  6. 05 Jan, 2012 2 commits
  7. 04 Jan, 2012 3 commits
  8. 29 Dec, 2011 2 commits
  9. 01 Nov, 2011 1 commit
  10. 31 Oct, 2011 1 commit
  11. 29 Oct, 2011 1 commit
  12. 25 Oct, 2011 1 commit
    • Dmitry Monakhov's avatar
      ext4: update EOFBLOCKS flag on fallocate properly · a4e5d88b
      Dmitry Monakhov authored
      
      
      EOFBLOCK_FL should be updated if called w/o FALLOCATE_FL_KEEP_SIZE
      Currently it happens only if new extent was allocated.
      
      TESTCASE:
      fallocate test_file -n -l4096
      fallocate test_file -l4096
      Last fallocate cmd has updated size, but keept EOFBLOCK_FL set. And
      fsck will complain about that.
      
      Also remove ping pong in ext4_fallocate() in case of new extents,
      where ext4_ext_map_blocks() clear EOFBLOCKS bit, and later
      ext4_falloc_update_inode() restore it again.
      
      Signed-off-by: default avatarDmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      a4e5d88b
  13. 08 Oct, 2011 2 commits
  14. 09 Sep, 2011 15 commits
  15. 03 Sep, 2011 2 commits
    • Theodore Ts'o's avatar
      ext4: improve handling of conflicting mount options · 56889787
      Theodore Ts'o authored
      
      
      If the user explicitly specifies conflicting mount options for
      delalloc or dioread_nolock and data=journal, fail the mount, instead
      of printing a warning and continuing (since many user's won't look at
      dmesg and notice the warning).
      
      Also, print a single warning that data=journal implies that delayed
      allocation is not on by default (since it's not supported), and
      furthermore that O_DIRECT is not supported.  Improve the text in
      Documentation/filesystems/ext4.txt so this is clear there as well.
      
      Similarly, if the dioread_nolock mount option is specified when the
      file system block size != PAGE_SIZE, fail the mount instead of
      printing a warning message and ignoring the mount option.
      
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      56889787
    • Allison Henderson's avatar
      ext4: Add new ext4_discard_partial_page_buffers routines · 4e96b2db
      Allison Henderson authored
      
      
      This patch adds two new routines: ext4_discard_partial_page_buffers
      and ext4_discard_partial_page_buffers_no_lock.
      
      The ext4_discard_partial_page_buffers routine is a wrapper
      function to ext4_discard_partial_page_buffers_no_lock.
      The wrapper function locks the page and passes it to
      ext4_discard_partial_page_buffers_no_lock.
      Calling functions that already have the page locked can call
      ext4_discard_partial_page_buffers_no_lock directly.
      
      The ext4_discard_partial_page_buffers_no_lock function
      zeros a specified range in a page, and unmaps the
      corresponding buffer heads.  Only block aligned regions of the
      page will have their buffer heads unmapped.  Unblock aligned regions
      will be mapped if needed so that they can be updated with the
      partial zero out.  This function is meant to
      be used to update a page and its buffer heads to be zeroed
      and unmapped when the corresponding blocks have been released
      or will be released.
      
      This routine is used in the following scenarios:
      * A hole is punched and the non page aligned regions
        of the head and tail of the hole need to be discarded
      
      * The file is truncated and the partial page beyond EOF needs
        to be discarded
      
      * The end of a hole is in the same page as EOF.  After the
        page is flushed, the partial page beyond EOF needs to be
        discarded.
      
      * A write operation begins or ends inside a hole and the partial
        page appearing before or after the write needs to be discarded
      
      * A write operation extends EOF and the partial page beyond EOF
        needs to be discarded
      
      This function takes a flag EXT4_DISCARD_PARTIAL_PG_ZERO_UNMAPPED
      which is used when a write operation begins or ends in a hole.
      When the EXT4_DISCARD_PARTIAL_PG_ZERO_UNMAPPED flag is used, only
      buffer heads that are already unmapped will have the corresponding
      regions of the page zeroed.
      
      Signed-off-by: default avatarAllison Henderson <achender@linux.vnet.ibm.com>
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      4e96b2db
  16. 31 Aug, 2011 1 commit