    dcache: return -ESTALE not -EBUSY on distributed fs race · 3d330dc1
    J. Bruce Fields authored
    On a distributed filesystem it's possible for lookup to discover that a
    directory it just found is already cached elsewhere in the directory
    hierarchy.  The dcache won't let us keep the directory in both places,
    so we have to move the dentry to the new location from the place we
    previously had it cached.
    
    If the parent has changed, then this requires all the same locks as we'd
    need to do a cross-directory rename.  But we're already in lookup
    holding one parent's i_mutex, so it's too late to acquire those locks in
    the right order.
    
    The (unreliable) solution in __d_unalias is to trylock() the required
    locks and return -EBUSY if it fails.
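
The trylock-and-bail pattern described above can be sketched in userspace
C.  This is only an illustration, not the kernel code: the two pthread
mutexes and both function names are hypothetical stand-ins for the locks
a cross-directory rename would need, and the sketch returns -ESTALE on
failure, as this patch proposes.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-ins for the locks a cross-directory move needs. */
static pthread_mutex_t rename_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t old_parent_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Try to take both locks without blocking.  Blocking could deadlock,
 * because the caller already holds another lock taken out of order.
 * Returns 0 with both locks held, or -ESTALE with neither held.
 */
static int try_lock_for_move(void)
{
	if (pthread_mutex_trylock(&rename_lock) != 0)
		return -ESTALE;

	if (pthread_mutex_trylock(&old_parent_lock) != 0) {
		/* Back out so that a failed attempt holds nothing. */
		pthread_mutex_unlock(&rename_lock);
		return -ESTALE;
	}

	return 0;
}

/* Release both locks in reverse order of acquisition. */
static void unlock_after_move(void)
{
	pthread_mutex_unlock(&old_parent_lock);
	pthread_mutex_unlock(&rename_lock);
}
```

A caller would do the move only when try_lock_for_move() returns 0, and
otherwise pass the error up so that a higher layer can retry the lookup.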
    
    I see no particular reason for returning -EBUSY, and -ESTALE is already
    the result of some other lookup races on NFS.  I think -ESTALE is the
    more helpful error return.  It also allows us to take advantage of the
    logic Jeff Layton added in c6a94284 "vfs: fix renameat to retry on
    ESTALE errors" and ancestors, which hopefully resolves some of these
    errors before they're returned to userspace.
    
    I can reproduce these cases using NFS with:
    
    	ssh root@$client '
    		mount -olookupcache=pos '$server':'$export' /mnt/
    		mkdir /mnt/TO
    		mkdir /mnt/DIR
    		touch /mnt/DIR/test.txt
    		while true; do
    			strace -e open cat /mnt/DIR/test.txt 2>&1 | grep EBUSY
    		done
    	'
    	ssh root@$server '
    		while true; do
    			mv $export/DIR $export/TO/DIR
    			mv $export/TO/DIR $export/DIR
    		done
    	'
    
    It also helps to add some other concurrent use of the directory on the
    client (e.g., "ls /mnt/TO").  And you can replace the server-side mv's
    by client-side mv's that are repeatedly killed.  (If the client is
    interrupted while waiting for the RENAME response then it's left with a
    dentry that has to go under one parent or the other, but it doesn't yet
    know which.)
    
    Acked-by: Jeff Layton <jlayton@primarydata.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>