
Python: File Download Using Ftplib Hangs Forever After File Is Successfully Downloaded

I have been trying to troubleshoot an issue where, when we download a file from FTP/FTPS, the file gets downloaded successfully but no operation is performed after the download; the script just hangs forever.

Solution 1:

Before doing anything, note that there is something very wrong with your connection, and diagnosing that and getting it fixed is far better than working around it. But sometimes, you just have to deal with a broken server, and even sending keepalives doesn't help. So, what can you do?

The trick is to download a chunk at a time, then abort the download—or, if the server can't handle aborting, close and reopen the connection.

Note that I'm testing everything below with ftp://speedtest.tele2.net/5MB.zip; hopefully this doesn't cause a million people to start hammering their server. Of course you'll want to test it with your actual server.

Testing for REST

The entire solution of course relies on the server being able to resume transfers, which not all servers can do—especially when you're dealing with something badly broken. So we'll need to test for that. Note that this test will be very slow, and very heavy on the server, so do not test with your 3GB file; find something much smaller. Also, if you can put something readable there, it will help for debugging, because you may be stuck comparing files in a hex editor.

from ftplib import FTP

def downit():
    with open('5MB.zip', 'wb') as f:
        while True:
            # Reconnect from scratch for every chunk, resuming at the
            # current file position via REST
            ftp = FTP(host='speedtest.tele2.net', user='anonymous', passwd='test@example.com')
            pos = f.tell()
            print(pos)
            ftp.sendcmd('TYPE I')  # binary mode, so REST offsets are byte counts
            sock = ftp.transfercmd('RETR 5MB.zip', rest=pos)
            buf = sock.recv(1024 * 1024)
            if not buf:
                return
            f.write(buf)
            sock.close()
            ftp.close()  # don't leak a connection per iteration

You will probably not get 1MB at a time, but instead something under 8KB. Let's assume you're seeing 1448, then 2896, 4344, etc.

  • If you get an exception from the REST, the server does not handle resuming—give up, you're hosed.
  • If the file goes on past the actual file size, hit ^C, and check it in a hex editor.
    • If you see the same 1448 bytes (or whatever amount you saw it printing out) repeated over and over, you're hosed again.
    • If you have the right data, but with extra bytes between each chunk of 1448 bytes, that's actually fixable. If you run into this and can't figure out how to fix it by using f.seek, I can explain—but you probably won't run into it.
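When you do end up byte-comparing files, a few lines of Python can stand in for the hex editor, at least for finding where the damage starts. This is just a convenience sketch; the function name is my own:

```python
def first_difference(path_a, path_b):
    # Report the offset of the first differing byte between two files,
    # or None if they match byte-for-byte (including length).
    with open(path_a, 'rb') as a, open(path_b, 'rb') as b:
        offset = 0
        while True:
            ca, cb = a.read(8192), b.read(8192)
            if ca != cb:
                for i, (x, y) in enumerate(zip(ca, cb)):
                    if x != y:
                        return offset + i
                # Chunks agree where they overlap, so one file is shorter
                return offset + min(len(ca), len(cb))
            if not ca:
                return None
            offset += len(ca)
```

Call it with your downloaded copy and a known-good copy; the returned offset tells you where to start looking in the hex editor.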

Testing for ABOR

One thing we can do is try to abort the download and not reconnect.

from ftplib import FTP

def downit():
    with open('5MB.zip', 'wb') as f:
        # One connection for the whole download; abort after each chunk
        ftp = FTP(host='speedtest.tele2.net', user='anonymous', passwd='test@example.com')
        while True:
            pos = f.tell()
            print(pos)
            ftp.sendcmd('TYPE I')
            sock = ftp.transfercmd('RETR 5MB.zip', rest=pos)
            buf = sock.recv(1024 * 1024)
            if not buf:
                return
            f.write(buf)
            sock.close()
            ftp.abort()

You're going to want to try multiple variations:

  • No sock.close.
  • No ftp.abort.
  • With sock.close after ftp.abort.
  • With ftp.abort after sock.close.
  • All four of the above repeated with TYPE I moved to before the loop instead of each time.

Some will raise exceptions. Others will just appear to hang forever. If that's true for all 8 of them, we need to give up on aborting. But if any of them works, great!
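To keep track of which of the eight combinations you've tried, it can help to enumerate them programmatically rather than editing the function by hand each time. A sketch (the tuple layout here is my own, not anything from ftplib):

```python
from itertools import product

# Three choices from the list above: call sock.close(), call
# ftp.abort(), and their relative order; crossed with whether TYPE I
# is sent once before the loop or on every iteration. "No close and
# no abort" isn't one of the variations, so there are exactly four
# orderings rather than a full 2x2x2.
orderings = [
    ('sock.close',),                # no ftp.abort
    ('ftp.abort',),                 # no sock.close
    ('ftp.abort', 'sock.close'),    # sock.close after ftp.abort
    ('sock.close', 'ftp.abort'),    # ftp.abort after sock.close
]
variations = [(steps, type_each_loop)
              for steps, type_each_loop in product(orderings, [True, False])]
print(len(variations))  # 8
```

Loop over `variations`, apply each one to the download function, and record which raise, which hang, and which work.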

Downloading a full chunk

The other way to speed things up is to download 1MB (or more) at a time before aborting or reconnecting. Just replace this code:

buf = sock.recv(1024 * 1024)
if not buf:
    return
f.write(buf)

with this:

chunklen = 1024 * 1024
while chunklen:
    print('   ', f.tell())
    buf = sock.recv(chunklen)
    if not buf:
        break
    f.write(buf)
    chunklen -= len(buf)

Now, instead of reading 1448 or 8192 bytes for each transfer, you're reading up to 1MB for each transfer. Try pushing it farther.

Combining with keepalives

If, say, your downloads were failing at 10MB, and the keepalive code in your question got things up to 512MB, but it just wasn't enough for 3GB—you can combine the two. Use keepalives to read 512MB at a time, then abort or reconnect and read the next 512MB, until you're done.
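Putting the pieces together, the combined version looks something like this. It's only a sketch—the chunk size is a knob you'll need to tune, and you'd plug your existing keepalive mechanism into the inner loop if you need it:

```python
from ftplib import FTP

CHUNK = 512 * 1024 * 1024  # bytes to read per connection; tune to taste

def downit(host='speedtest.tele2.net', path='5MB.zip',
           user='anonymous', passwd='test@example.com'):
    # Reconnect for each CHUNK-sized slice, resuming via REST; add your
    # keepalive code inside the inner loop if the slices are long enough
    # to need it.
    with open(path, 'wb') as f:
        while True:
            ftp = FTP(host=host, user=user, passwd=passwd)
            try:
                pos = f.tell()
                ftp.sendcmd('TYPE I')
                sock = ftp.transfercmd('RETR ' + path, rest=pos)
                chunklen = CHUNK
                while chunklen:
                    buf = sock.recv(min(chunklen, 1024 * 1024))
                    if not buf:
                        return  # transfer complete
                    f.write(buf)
                    chunklen -= len(buf)
                sock.close()
            finally:
                ftp.close()

# With 512MB per connection, a 3GB file takes six connections:
print((3 * 1024**3) // CHUNK)  # 6
```

Each connection only has to stay healthy for one slice, so even a server that dies somewhere past 512MB can get you through the whole 3GB.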
