Skip to content
Snippets Groups Projects
This project is mirrored from https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git. Pull mirroring updated .
  1. Jun 16, 2009
  2. Jun 12, 2009
  3. Jun 11, 2009
    • Kiyoshi Ueda's avatar
      block: add request clone interface (v2) · b0fd271d
      Kiyoshi Ueda authored
      
      This patch adds the following 2 interfaces for request-stacking drivers:
      
        - blk_rq_prep_clone(struct request *clone, struct request *orig,
      		      struct bio_set *bs, gfp_t gfp_mask,
      		      int (*bio_ctr)(struct bio *, struct bio*, void *),
      		      void *data)
            * Clones bios in the original request to the clone request
              (bio_ctr is called for each cloned bios.)
            * Copies attributes of the original request to the clone request.
              The actual data parts (e.g. ->cmd, ->buffer, ->sense) are not
              copied.
      
        - blk_rq_unprep_clone(struct request *clone)
            * Frees cloned bios from the clone request.
      
      Request stacking drivers (e.g. request-based dm) need to make a clone
      request for a submitted request and dispatch it to other devices.
      
      To allocate request for the clone, request stacking drivers may not
      be able to use blk_get_request() because the allocation may be done
      in an irq-disabled context.
      So blk_rq_prep_clone() takes a request allocated by the caller
      as an argument.
      
      For each clone bio in the clone request, request stacking drivers
      should be able to set up their own completion handler.
      So blk_rq_prep_clone() takes a callback function which is called
      for each clone bio, and a pointer for private data which is passed
      to the callback.
      
      NOTE:
      blk_rq_prep_clone() doesn't copy any actual data of the original
      request.  Pages are shared between original bios and cloned bios.
      So caller must not complete the original request before the clone
      request.
      
      Signed-off-by: default avatarKiyoshi Ueda <k-ueda@ct.jp.nec.com>
      Signed-off-by: default avatarJun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Cc: Boaz Harrosh <bharrosh@panasas.com>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      b0fd271d
  4. Jun 10, 2009
  5. Jun 09, 2009
    • Li Zefan's avatar
      tracing/events: convert block trace points to TRACE_EVENT() · 55782138
      Li Zefan authored
      
      TRACE_EVENT is a more generic way to define tracepoints. Doing so adds
      these new capabilities to this tracepoint:
      
        - zero-copy and per-cpu splice() tracing
        - binary tracing without printf overhead
        - structured logging records exposed under /debug/tracing/events
        - trace events embedded in function tracer output and other plugins
        - user-defined, per tracepoint filter expressions
        ...
      
      Cons:
      
        - no dev_t info for the output of plug, unplug_timer and unplug_io events.
          no dev_t info for getrq and sleeprq events if bio == NULL.
          no dev_t info for rq_abort,...,rq_requeue events if rq->rq_disk == NULL.
      
          This is mainly because we can't get the deivce from a request queue.
          But this may change in the future.
      
        - A packet command is converted to a string in TP_assign, not TP_print.
          While blktrace do the convertion just before output.
      
          Since pc requests should be rather rare, this is not a big issue.
      
        - In blktrace, an event can have 2 different print formats, but a TRACE_EVENT
          has a unique format, which means we have some unused data in a trace entry.
      
          The overhead is minimized by using __dynamic_array() instead of __array().
      
      I've benchmarked the ioctl blktrace vs the splice based TRACE_EVENT tracing:
      
            dd                   dd + ioctl blktrace       dd + TRACE_EVENT (splice)
      1     7.36s, 42.7 MB/s     7.50s, 42.0 MB/s          7.41s, 42.5 MB/s
      2     7.43s, 42.3 MB/s     7.48s, 42.1 MB/s          7.43s, 42.4 MB/s
      3     7.38s, 42.6 MB/s     7.45s, 42.2 MB/s          7.41s, 42.5 MB/s
      
      So the overhead of tracing is very small, and no regression when using
      those trace events vs blktrace.
      
      And the binary output of TRACE_EVENT is much smaller than blktrace:
      
       # ls -l -h
       -rw-r--r-- 1 root root 8.8M 06-09 13:24 sda.blktrace.0
       -rw-r--r-- 1 root root 195K 06-09 13:24 sda.blktrace.1
       -rw-r--r-- 1 root root 2.7M 06-09 13:25 trace_splice.out
      
      Following are some comparisons between TRACE_EVENT and blktrace:
      
      plug:
        kjournald-480   [000]   303.084981: block_plug: [kjournald]
        kjournald-480   [000]   303.084981:   8,0    P   N [kjournald]
      
      unplug_io:
        kblockd/0-118   [000]   300.052973: block_unplug_io: [kblockd/0] 1
        kblockd/0-118   [000]   300.052974:   8,0    U   N [kblockd/0] 1
      
      remap:
        kjournald-480   [000]   303.085042: block_remap: 8,0 W 102736992 + 8 <- (8,8) 33384
        kjournald-480   [000]   303.085043:   8,0    A   W 102736992 + 8 <- (8,8) 33384
      
      bio_backmerge:
        kjournald-480   [000]   303.085086: block_bio_backmerge: 8,0 W 102737032 + 8 [kjournald]
        kjournald-480   [000]   303.085086:   8,0    M   W 102737032 + 8 [kjournald]
      
      getrq:
        kjournald-480   [000]   303.084974: block_getrq: 8,0 W 102736984 + 8 [kjournald]
        kjournald-480   [000]   303.084975:   8,0    G   W 102736984 + 8 [kjournald]
      
        bash-2066  [001]  1072.953770:   8,0    G   N [bash]
        bash-2066  [001]  1072.953773: block_getrq: 0,0 N 0 + 0 [bash]
      
      rq_complete:
        konsole-2065  [001]   300.053184: block_rq_complete: 8,0 W () 103669040 + 16 [0]
        konsole-2065  [001]   300.053191:   8,0    C   W 103669040 + 16 [0]
      
        ksoftirqd/1-7   [001]  1072.953811:   8,0    C   N (5a 00 08 00 00 00 00 00 24 00) [0]
        ksoftirqd/1-7   [001]  1072.953813: block_rq_complete: 0,0 N (5a 00 08 00 00 00 00 00 24 00) 0 + 0 [0]
      
      rq_insert:
        kjournald-480   [000]   303.084985: block_rq_insert: 8,0 W 0 () 102736984 + 8 [kjournald]
        kjournald-480   [000]   303.084986:   8,0    I   W 102736984 + 8 [kjournald]
      
      Changelog from v2 -> v3:
      
      - use the newly introduced __dynamic_array().
      
      Changelog from v1 -> v2:
      
      - use __string() instead of __array() to minimize the memory required
        to store hex dump of rq->cmd().
      
      - support large pc requests.
      
      - add missing blk_fill_rwbs_rq() in block_rq_requeue TRACE_EVENT.
      
      - some cleanups.
      
      Signed-off-by: default avatarLi Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <4A2DF669.5070905@cn.fujitsu.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      55782138
    • FUJITA Tomonori's avatar
      bsg: setting rq->bio to NULL · c1d4c41f
      FUJITA Tomonori authored
      
      Due to commit 1cd96c24 ("block: WARN
      in __blk_put_request() for potential bio leak"), BSG SMP requests get
      the false warnings:
      
      WARNING: at block/blk-core.c:1068 __blk_put_request+0x52/0xc0()
      
      This sets rq->bio to NULL to avoid that false warnings.
      
      Signed-off-by: default avatarFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      c1d4c41f
    • Martin K. Petersen's avatar
      block: Add missing bounce_pfn stacking and fix comments · 77634f33
      Martin K. Petersen authored
      
      DM no longer needs to set limits explicitly when calling blk_stack_limits.
      Let the latter automatically deal with bounce_pfn scaling.
      
      Fix kerneldoc variable names.
      
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      77634f33
    • Jens Axboe's avatar
      Revert "block: Fix bounce limit setting in DM" · 9df1bb9b
      Jens Axboe authored
      
      This reverts commit a05c0205.
      
      DM doesn't need to access the bounce_pfn directly.
      
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      9df1bb9b
    • FUJITA Tomonori's avatar
      block: needs to set the residual length of a bidi request · dbb66c4b
      FUJITA Tomonori authored
      
      Tejun's "block: set rq->resid_len to blk_rq_bytes() on issue" patch
      seems to be incomplete; It doesn't set rq->resid_len to blk_rq_bytes()
      for a bidi request (req->next_rq). As a result, all bidi users are
      broken.
      
      Signed-off-by: default avatarFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      dbb66c4b
  6. Jun 03, 2009
  7. Jun 02, 2009
  8. May 30, 2009
  9. May 28, 2009
  10. May 27, 2009
    • Kiyoshi Ueda's avatar
      block: fix no diskstat problem · 3c4198e8
      Kiyoshi Ueda authored
      
      The commit below in 2.6-block/for-2.6.31 causes no diskstat problem
      because the blk_discard_rq() check was added with '&&'.
      It should be 'blk_fs_request() || blk_discard_rq()'.
      This patch does it and fixes the no diskstat problem.
      Please review and apply.
      
      ------ /proc/diskstat without this patch -------------------------------------
         8       0 sda 0 0 0 0 0 0 0 0 0 0 0
      ------------------------------------------------------------------------------
      
      ----- /proc/diskstat with this patch applied ---------------------------------
         8       0 sda 4186 303 373621 61600 9578 3859 107468 169479 2 89755 231059
      ------------------------------------------------------------------------------
      
      --------------------------------------------------------------------------
      commit c69d4854
      Author: Jens Axboe <jens.axboe@oracle.com>
      Date:   Fri Apr 24 08:12:19 2009 +0200
      
          block: include discard requests in IO accounting
      
          We currently don't do merging on discard requests, but we potentially
          could. If we do, then we need to include discard requests in the IO
          accounting, or merging would end up decrementing in_flight IO counters
          for an IO which never incremented them.
      
          So enable accounting for discard requests.
      
      <snip>
      
       static inline int blk_do_io_stat(struct request *rq)
       {
      -       return rq->rq_disk && blk_rq_io_stat(rq) && blk_fs_request(rq);
      +       return rq->rq_disk && blk_rq_io_stat(rq) && blk_fs_request(rq) &&
      +               blk_discard_rq(rq);
       }
      --------------------------------------------------------------------------
      
      Signed-off-by: default avatarKiyoshi Ueda <k-ueda@ct.jp.nec.com>
      Signed-off-by: default avatarJun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      3c4198e8
    • James Bottomley's avatar
      block: fix oops with block tag queueing · ba396a6c
      James Bottomley authored
      
      commit e8939a50466fd963eb1ba9118c34b9ffb7ff6aa6
      Author: Tejun Heo <tj@kernel.org>
      Date:   Fri May 8 11:54:16 2009 +0900
      
          block: implement and enforce request peek/start/fetch
      
      Added a BUG_ON(blk_queued_rq(req)) to the top of blk_finish_req().
      Unfortunately, this checks whether req->queuelist is empty.  This list
      is doing double duty both as the queue list and the tag list, so tagged
      requests come in here with this not empty and boom (the tag list is
      emptied by blk_queue_end_tag() lower down).
      
      Fix this by moving the BUG_ON to below the end tag we also seem
      vulnerable to this in blk_requeue_request() as well.  I think all uses
      of blk_queued_rq() need auditing because the check is clearly wrong in
      the tagged case.
      
      Signed-off-by: default avatarJames Bottomley <James.Bottomley@HansenPartnership.com>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      ba396a6c
  11. May 22, 2009
  12. May 20, 2009
  13. May 19, 2009
  14. May 12, 2009
    • Kazuhisa Ichikawa's avatar
      block: fix the bio_vec array index out-of-bounds test · af498d7f
      Kazuhisa Ichikawa authored
      
      Current bio_vec array index out-of-bounds test within
      __end_that_request_first() does not seem correct.
      It checks bio->bi_idx against bio->bi_vcnt, but the subsequent code
      uses idx (which is, bio->bi_idx + next_idx) as the array index into
      bio_vec array. This means that the test really make sense only at
      the first iteration of !(nr_bytes >=bio->bi_size) case (when next_idx
      == zero). Fix this by replacing bio->bi_idx with idx.
      (This patch applies to 2.6.30-rc4.)
      
      Signed-off-by: default avatarKazuhisa Ichikawa <ki@epsilou.com>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      af498d7f
  15. May 11, 2009
    • FUJITA Tomonori's avatar
      block: move completion related functions back to blk-core.c · b1f74493
      FUJITA Tomonori authored
      
      Let's put the completion related functions back to block/blk-core.c
      where they have lived. We can also unexport blk_end_bidi_request() and
      __blk_end_bidi_request(), which nobody uses.
      
      Signed-off-by: default avatarFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      b1f74493
    • Tejun Heo's avatar
      block: implement and enforce request peek/start/fetch · 9934c8c0
      Tejun Heo authored
      
      Till now block layer allowed two separate modes of request execution.
      A request is always acquired from the request queue via
      elv_next_request().  After that, drivers are free to either dequeue it
      or process it without dequeueing.  Dequeue allows elv_next_request()
      to return the next request so that multiple requests can be in flight.
      
      Executing requests without dequeueing has its merits mostly in
      allowing drivers for simpler devices which can't do sg to deal with
      segments only without considering request boundary.  However, the
      benefit this brings is dubious and declining while the cost of the API
      ambiguity is increasing.  Segment based drivers are usually for very
      old or limited devices and as converting to dequeueing model isn't
      difficult, it doesn't justify the API overhead it puts on block layer
      and its more modern users.
      
      Previous patches converted all block low level drivers to dequeueing
      model.  This patch completes the API transition by...
      
      * renaming elv_next_request() to blk_peek_request()
      
      * renaming blkdev_dequeue_request() to blk_start_request()
      
      * adding blk_fetch_request() which is combination of peek and start
      
      * disallowing completion of queued (not started) requests
      
      * applying new API to all LLDs
      
      Renamings are for consistency and to break out of tree code so that
      it's apparent that out of tree drivers need updating.
      
      [ Impact: block request issue API cleanup, no functional change ]
      
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Mike Miller <mike.miller@hp.com>
      Cc: unsik Kim <donari75@gmail.com>
      Cc: Paul Clements <paul.clements@steeleye.com>
      Cc: Tim Waugh <tim@cyberelk.net>
      Cc: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Laurent Vivier <Laurent@lvivier.info>
      Cc: Jeff Garzik <jgarzik@pobox.com>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Grant Likely <grant.likely@secretlab.ca>
      Cc: Adrian McMenamin <adrian@mcmen.demon.co.uk>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      Cc: Borislav Petkov <petkovbb@googlemail.com>
      Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
      Cc: Alex Dubov <oakad@yahoo.com>
      Cc: Pierre Ossman <drzeus@drzeus.cx>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Markus Lidel <Markus.Lidel@shadowconnect.com>
      Cc: Stefan Weinhuber <wein@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Pete Zaitcev <zaitcev@redhat.com>
      Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      9934c8c0
    • Tejun Heo's avatar
      block: hide request sector and data_len · a2dec7b3
      Tejun Heo authored
      
      Block low level drivers for some reason have been pretty good at
      abusing block layer API.  Especially struct request's fields tend to
      get violated in all possible ways.  Make it clear that low level
      drivers MUST NOT access or manipulate rq->sector and rq->data_len
      directly by prefixing them with double underscores.
      
      This change is also necessary to break build of out-of-tree codes
      which assume the previous block API where internal fields can be
      manipulated and rq->data_len carries residual count on completion.
      
      [ Impact: hide internal fields, block API change ]
      
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      a2dec7b3
    • Tejun Heo's avatar
      block: drop request->hard_* and *nr_sectors · 2e46e8b2
      Tejun Heo authored
      
      struct request has had a few different ways to represent some
      properties of a request.  ->hard_* represent block layer's view of the
      request progress (completion cursor) and the ones without the prefix
      are supposed to represent the issue cursor and allowed to be updated
      as necessary by the low level drivers.  The thing is that as block
      layer supports partial completion, the two cursors really aren't
      necessary and only cause confusion.  In addition, manual management of
      request detail from low level drivers is cumbersome and error-prone at
      the very least.
      
      Another interesting duplicate fields are rq->[hard_]nr_sectors and
      rq->{hard_cur|current}_nr_sectors against rq->data_len and
      rq->bio->bi_size.  This is more convoluted than the hard_ case.
      
      rq->[hard_]nr_sectors are initialized for requests with bio but
      blk_rq_bytes() uses it only for !pc requests.  rq->data_len is
      initialized for all request but blk_rq_bytes() uses it only for pc
      requests.  This causes good amount of confusion throughout block layer
      and its drivers and determining the request length has been a bit of
      black magic which may or may not work depending on circumstances and
      what the specific LLD is actually doing.
      
      rq->{hard_cur|current}_nr_sectors represent the number of sectors in
      the contiguous data area at the front.  This is mainly used by drivers
      which transfers data by walking request segment-by-segment.  This
      value always equals rq->bio->bi_size >> 9.  However, data length for
      pc requests may not be multiple of 512 bytes and using this field
      becomes a bit confusing.
      
      In general, having multiple fields to represent the same property
      leads only to confusion and subtle bugs.  With recent block low level
      driver cleanups, no driver is accessing or manipulating these
      duplicate fields directly.  Drop all the duplicates.  Now rq->sector
      means the current sector, rq->data_len the current total length and
      rq->bio->bi_size the current segment length.  Everything else is
      defined in terms of these three and available only through accessors.
      
      * blk_recalc_rq_sectors() is collapsed into blk_update_request() and
        now handles pc and fs requests equally other than rq->sector update.
        This means that now pc requests can use partial completion too (no
        in-kernel user yet tho).
      
      * bio_cur_sectors() is replaced with bio_cur_bytes() as block layer
        now uses byte count as the primary data length.
      
      * blk_rq_pos() is now guranteed to be always correct.  In-block users
        converted.
      
      * blk_rq_bytes() is now guaranteed to be always valid as is
        blk_rq_sectors().  In-block users converted.
      
      * blk_rq_sectors() is now guaranteed to equal blk_rq_bytes() >> 9.
        More convenient one is used.
      
      * blk_rq_bytes() and blk_rq_cur_bytes() are now inlined and take const
        pointer to request.
      
      [ Impact: API cleanup, single way to represent one property of a request ]
      
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Boaz Harrosh <bharrosh@panasas.com>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      2e46e8b2
    • Tejun Heo's avatar
      block: convert to pos and nr_sectors accessors · 83096ebf
      Tejun Heo authored
      
      With recent cleanups, there is no place where low level driver
      directly manipulates request fields.  This means that the 'hard'
      request fields always equal the !hard fields.  Convert all
      rq->sectors, nr_sectors and current_nr_sectors references to
      accessors.
      
      While at it, drop superflous blk_rq_pos() < 0 test in swim.c.
      
      [ Impact: use pos and nr_sectors accessors ]
      
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Acked-by: default avatarGeert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
      Tested-by: default avatarGrant Likely <grant.likely@secretlab.ca>
      Acked-by: default avatarGrant Likely <grant.likely@secretlab.ca>
      Tested-by: default avatarAdrian McMenamin <adrian@mcmen.demon.co.uk>
      Acked-by: default avatarAdrian McMenamin <adrian@mcmen.demon.co.uk>
      Acked-by: default avatarMike Miller <mike.miller@hp.com>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      Cc: Borislav Petkov <petkovbb@googlemail.com>
      Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
      Cc: Eric Moore <Eric.Moore@lsi.com>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Pete Zaitcev <zaitcev@redhat.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Paul Clements <paul.clements@steeleye.com>
      Cc: Tim Waugh <tim@cyberelk.net>
      Cc: Jeff Garzik <jgarzik@pobox.com>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Alex Dubov <oakad@yahoo.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Dario Ballabio <ballabio_dario@emc.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: unsik Kim <donari75@gmail.com>
      Cc: Laurent Vivier <Laurent@lvivier.info>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      83096ebf
    • Tejun Heo's avatar
      block: implement blk_rq_pos/[cur_]sectors() and convert obvious ones · 5b93629b
      Tejun Heo authored
      
      Implement accessors - blk_rq_pos(), blk_rq_sectors() and
      blk_rq_cur_sectors() which return rq->hard_sector, rq->hard_nr_sectors
      and rq->hard_cur_sectors respectively and convert direct references of
      the said fields to the accessors.
      
      This is in preparation of request data length handling cleanup.
      
      Geert	: suggested adding const to struct request * parameter to accessors
      Sergei	: spotted error in patch description
      
      [ Impact: cleanup ]
      
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Acked-by: default avatarGeert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
      Acked-by: default avatarStephen Rothwell <sfr@canb.auug.org.au>
      Tested-by: default avatarGrant Likely <grant.likely@secretlab.ca>
      Acked-by: default avatarGrant Likely <grant.likely@secretlab.ca>
      Ackec-by: default avatarSergei Shtylyov <sshtylyov@ru.mvista.com>
      Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      Cc: Borislav Petkov <petkovbb@googlemail.com>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      5b93629b
    • Tejun Heo's avatar
      block: add rq->resid_len · c3a4d78c
      Tejun Heo authored
      
      rq->data_len served two purposes - the length of data buffer on issue
      and the residual count on completion.  This duality creates some
      headaches.
      
      First of all, block layer and low level drivers can't really determine
      what rq->data_len contains while a request is executing.  It could be
      the total request length or it coulde be anything else one of the
      lower layers is using to keep track of residual count.  This
      complicates things because blk_rq_bytes() and thus
      [__]blk_end_request_all() relies on rq->data_len for PC commands.
      Drivers which want to report residual count should first cache the
      total request length, update rq->data_len and then complete the
      request with the cached data length.
      
      Secondly, it makes requests default to reporting full residual count,
      ie. reporting that no data transfer occurred.  The residual count is
      an exception not the norm; however, the driver should clear
      rq->data_len to zero to signify the normal cases while leaving it
      alone means no data transfer occurred at all.  This reverse default
      behavior complicates code unnecessarily and renders block PC on some
      drivers (ide-tape/floppy) unuseable.
      
      This patch adds rq->resid_len which is used only for residual count.
      
      While at it, remove now unnecessasry blk_rq_bytes() caching in
      ide_pc_intr() as rq->data_len is not changed anymore.
      
      Boaz	: spotted missing conversion in osd
      Sergei	: spotted too early conversion to blk_rq_bytes() in ide-tape
      
      [ Impact: cleanup residual count handling, report 0 resid by default ]
      
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      Cc: Borislav Petkov <petkovbb@googlemail.com>
      Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
      Cc: Mike Miller <mike.miller@hp.com>
      Cc: Eric Moore <Eric.Moore@lsi.com>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Doug Gilbert <dgilbert@interlog.com>
      Cc: Mike Miller <mike.miller@hp.com>
      Cc: Eric Moore <Eric.Moore@lsi.com>
      Cc: Darrick J. Wong <djwong@us.ibm.com>
      Cc: Pete Zaitcev <zaitcev@redhat.com>
      Cc: Boaz Harrosh <bharrosh@panasas.com>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      c3a4d78c
Loading