From: Mike Christie <mchristi@redhat.com>
Date: Fri, 18 Jun 2010 19:09:52 -0500
Subject: [iscsi] fix slow failover times
Message-id: <1276906192-2764-1-git-send-email-mchristi@redhat.com>
O-Subject: [PATCH RHEL 5.3.z]: Fix iscsi failover/shutdown (ver 2)
Bugzilla: 583898
RH-Acked-by: Tomas Henzl <thenzl@redhat.com>

From: Mike Christie <mchristi@redhat.com>

This is for BZ 583898.

This patch fixes a couple of bugs in the shutdown/relogin code:

1. If we try to log out of an iscsi connection and close the socket
while there is a problem with the connection (someone pulled a cable,
a switch died, etc.) and the iscsi layer is sending lots of IO, the
network code can be waiting in sk_stream_wait_memory. This patch adds
a wake_up on the sock so we do not have to wait the full sk_sndtimeo
secs.

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d7d05548a62c87ee55b0c81933669177f885aa8d
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=b64e77f70b8c11766e967e3485331a9e6ef01390

2. There is a race where the xmit or scsi eh thread can reset the
session->state while the recovery code thread is trying to clean up
the session resources.

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=4ae0a6c15efcc37e94e3f30e3533bdec03c53126

I can replicate the problem here and have verified that the patch
fixes it. I also ran the iscsi test scripts to check for regressions.

Ver 2:
- Rebuild and retest patch against current 5.3.z stream kernel.
---
 drivers/scsi/iscsi_tcp.c |    9 +++++++++
 drivers/scsi/libiscsi.c  |    5 +++--
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c
index e7a7978..98fca22 100644
--- a/drivers/scsi/iscsi_tcp.c
+++ b/drivers/scsi/iscsi_tcp.c
@@ -1870,6 +1870,15 @@ iscsi_tcp_conn_stop(struct iscsi_cls_conn *cls_conn, int flag)
 {
 	struct iscsi_conn *conn = cls_conn->dd_data;
 	struct iscsi_tcp_conn *tcp_conn = conn->dd_data;
+	struct socket *sock = tcp_conn->sock;
+
+	if (!sock)
+		return;
+
+	if (sock->sk->sk_sleep) {
+		sock->sk->sk_err = EIO;
+		wake_up_interruptible(sock->sk->sk_sleep);
+	}
 
 	iscsi_conn_stop(cls_conn, flag);
 	iscsi_tcp_release_conn(conn);
diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index 526218a..f0d3694 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -2059,14 +2059,15 @@ static void iscsi_start_session_recovery(struct iscsi_session *session,
 		session->state = ISCSI_STATE_TERMINATE;
 	else if (conn->stop_stage != STOP_CONN_RECOVER)
 		session->state = ISCSI_STATE_IN_RECOVERY;
+
+	old_stop_stage = conn->stop_stage;
+	conn->stop_stage = flag;
 	spin_unlock_bh(&session->lock);
 
 	del_timer_sync(&conn->transport_timer);
 	iscsi_suspend_tx(conn);
 
 	spin_lock_bh(&session->lock);
-	old_stop_stage = conn->stop_stage;
-	conn->stop_stage = flag;
 	conn->c_stage = ISCSI_CONN_STOPPED;
 	spin_unlock_bh(&session->lock);
 
-- 
1.6.6.1
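
Note on fix 2: the libiscsi.c hunk moves the stop_stage update so that it
happens before session->lock is dropped. The user-space sketch below is
only a single-threaded model of why that ordering matters. Every name in
it (struct session, report_failure, recover_old, recover_new) is
hypothetical and is not the kernel code. In particular, it assumes the
failure path leaves session->state alone once stop_stage is nonzero; the
patch description does not spell that out. The call to report_failure()
stands in for another thread running in the window where the lock is not
held.

```c
#include <assert.h>

/* Hypothetical, simplified stand-ins; not the libiscsi definitions. */
enum state { STATE_LOGGED_IN, STATE_FAILED, STATE_IN_RECOVERY };

struct session {
	enum state state;
	int stop_stage;		/* 0 means no stop is in progress */
};

/*
 * Models an xmit or scsi eh thread reporting a failure. Assumption:
 * it only overwrites the state while no stop is in progress.
 */
void report_failure(struct session *s)
{
	if (s->stop_stage == 0)
		s->state = STATE_FAILED;
}

/* Old ordering: stop_stage is published only after the unlocked window. */
enum state recover_old(struct session *s, int flag)
{
	assert(flag != 0);
	s->state = STATE_IN_RECOVERY;	/* "lock held" */
	report_failure(s);		/* "lock dropped": other thread runs */
	s->stop_stage = flag;		/* "lock retaken": too late */
	return s->state;
}

/* New ordering: stop_stage is published before the lock is dropped. */
enum state recover_new(struct session *s, int flag)
{
	assert(flag != 0);
	s->state = STATE_IN_RECOVERY;	/* "lock held" */
	s->stop_stage = flag;		/* still "lock held" */
	report_failure(s);		/* "lock dropped": state is left alone */
	return s->state;
}
```

Starting from a logged-in session with stop_stage 0, recover_old() ends
with the state overwritten by the failure path, while recover_new() keeps
the recovery state chosen under the lock.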