    RAID5: batch adjacent full stripe write · 59fc630b
    shli@kernel.org authored
    
    
    The stripe cache uses 4k stripes, so even adjacent full stripe writes are
    handled in 4k units. Ideally we should use a bigger size for adjacent full
    stripe writes. A bigger stripe size means fewer stripes running in the state
    machine, which reduces CPU overhead. It also lets us dispatch bigger IOs to
    the underlying disks.
    
    With this patch, we automatically batch adjacent full stripe writes
    together. Such stripes are added to a batch list. Only the first stripe
    of the list is put on handle_list and so runs handle_stripe(). Some steps
    of handle_stripe() are extended to cover all stripes of the list, including
    ops_run_io, ops_run_biodrain and so on. With this patch, fewer stripes
    run in handle_stripe() and we send the IO of the whole stripe list together
    to increase the IO size.
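
    As a rough illustration of the batch list idea, here is a userspace sketch.
    It is not the kernel code: struct stripe, batch_add() and batch_io_bytes()
    are hypothetical names, and the real struct stripe_head carries far more
    state. It only shows that one head stripe stays queued while a batched step
    walks every member.

```c
#include <assert.h>
#include <stddef.h>

#define STRIPE_SECTORS 8        /* one 4k stripe in 512-byte sectors */

/* Hypothetical, heavily simplified stand-in for struct stripe_head. */
struct stripe {
    unsigned long long sector;  /* start sector of this stripe on each disk */
    struct stripe *batch_head;  /* head of the batch, NULL if not batched */
    struct stripe *batch_next;  /* next member, NULL at the tail */
    int on_handle_list;         /* 1 if queued for the state machine */
};

/* Append sh to the batch led by head; only head stays on the handle list. */
static void batch_add(struct stripe *head, struct stripe *sh)
{
    struct stripe *tail = head;

    assert(head != sh);
    while (tail->batch_next)
        tail = tail->batch_next;
    /* Members must be adjacent on disk. */
    assert(sh->sector == tail->sector + STRIPE_SECTORS);

    head->batch_head = head;
    head->on_handle_list = 1;
    tail->batch_next = sh;
    sh->batch_head = head;
    sh->batch_next = NULL;
    sh->on_handle_list = 0;
}

/* Model of a batched step: walk the list once, return bytes sent per disk. */
static size_t batch_io_bytes(const struct stripe *head)
{
    size_t bytes = 0;

    for (const struct stripe *sh = head; sh; sh = sh->batch_next)
        bytes += (size_t)STRIPE_SECTORS * 512;
    return bytes;
}
```

    Batching eight adjacent stripes this way leaves one entry on the handle
    list and yields a single 32k IO per disk instead of eight 4k IOs.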
    
    Stripes added to a batch list have some limitations. A batch list can only
    include full stripe writes and can't cross a chunk boundary, which makes
    sure the stripes have the same parity disk. Stripes in a batch list must be
    in the same state (no written, toread and so on). If a stripe is in a batch
    list, any new read/write passed to add_stripe_bio is blocked as an overlap
    conflict until the batch list is handled. These limitations make sure the
    stripes in a batch list are in exactly the same state throughout their life
    cycle.
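
    The limitations above can be read as an eligibility check. The sketch below
    is an assumption-laden model, not the patch itself: struct stripe_state and
    can_batch() are hypothetical, and the flags are reduced to the few that
    this message mentions.

```c
#include <assert.h>
#include <stdbool.h>

#define STRIPE_SECTORS 8        /* one 4k stripe in 512-byte sectors */

/* Hypothetical summary of the per-stripe state that matters for batching. */
struct stripe_state {
    unsigned long long sector;  /* start sector of the stripe on each disk */
    bool full_write;            /* every data block is being overwritten */
    bool has_toread;            /* pending reads attached */
    bool has_written;           /* completed writes not yet returned */
};

/* May 'next' join a batch whose last member is 'prev'? */
static bool can_batch(const struct stripe_state *prev,
                      const struct stripe_state *next,
                      unsigned int chunk_sectors)
{
    assert(chunk_sectors != 0);

    /* Only full stripe writes are batched. */
    if (!prev->full_write || !next->full_write)
        return false;
    /* All members must be in the same clean state. */
    if (prev->has_toread || prev->has_written ||
        next->has_toread || next->has_written)
        return false;
    /* Stripes must be adjacent on disk. */
    if (next->sector != prev->sector + STRIPE_SECTORS)
        return false;
    /* Same chunk means same parity disk. */
    return prev->sector / chunk_sectors == next->sector / chunk_sectors;
}
```

    With a 32k chunk (64 sectors), the stripes at sectors 48 and 56 can be
    batched, but 56 and 64 cannot because 64 starts the next chunk.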
    
    I tested 160k randwrite on a RAID5 array with a 32k chunk size and 6 PCIe
    SSDs. This patch improves performance by around 30%, and the IO size sent to
    the underlying disks is exactly 32k. I also ran a 4k randwrite test on the
    same array to make sure performance isn't changed by the patch.
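
    The numbers in this test line up as follows, assuming the 160k writes are
    aligned to 160k boundaries (not stated explicitly here). The helper names
    below are hypothetical and exist only to make the arithmetic checkable.

```c
#include <assert.h>

/* RAID5 keeps one parity block per stripe, so data disks = members - 1. */
static unsigned int raid5_data_disks(unsigned int raid_disks)
{
    assert(raid_disks >= 3);
    return raid_disks - 1;
}

/* Size in KiB of a write that covers one whole chunk on every data disk. */
static unsigned int full_chunk_write_kib(unsigned int raid_disks,
                                         unsigned int chunk_kib)
{
    return raid5_data_disks(raid_disks) * chunk_kib;
}

/* Stripes of stripe_kib that fit in one chunk: the largest possible batch. */
static unsigned int max_batch_len(unsigned int chunk_kib,
                                  unsigned int stripe_kib)
{
    assert(stripe_kib != 0 && chunk_kib % stripe_kib == 0);
    return chunk_kib / stripe_kib;
}
```

    For 6 disks and a 32k chunk, full_chunk_write_kib(6, 32) is 160, so each
    160k write fills one chunk on all 5 data disks. max_batch_len(32, 4) is 8,
    so a batch holds 8 stripes and each disk sees one 32k IO. A 4k write
    touches a single data block, is not a full stripe write, and is therefore
    never batched.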
    
    Signed-off-by: Shaohua Li <shli@fusionio.com>
    Signed-off-by: NeilBrown <neilb@suse.de>