Python script to download files from a server

FTP (File Transfer Protocol) is a standard network protocol used for transferring files between a client and a server on a computer network. It enables users to upload and download files to and from a remote server.

Here's an overview of FTP for file download:

  1. Server Setup: To use FTP for file download, you need an FTP server set up and running. The server hosts the files that can be accessed and downloaded by clients.

  2. Client Connection: The client is the device or software that connects to the FTP server to initiate file transfers. The client can be a dedicated FTP client program, a script, or a web browser with built-in FTP capabilities (although current versions of Chrome and Firefox no longer include FTP support).

  3. Connecting to the FTP Server: The client establishes a connection to the FTP server using the server's hostname or IP address, along with a username and password for authentication.

  4. Navigating the Remote File System: Once connected, the client can navigate the directory structure of the remote server using FTP commands such as LIST, CWD, and PWD. These commands allow the client to view the available files and directories on the server.

  5. Downloading Files: To download a file from the FTP server, the client typically uses the RETR (Retrieve) command followed by the filename or path of the file. This initiates the transfer of the specified file from the server to the client. The file is then saved to the client's local system.

  6. Binary or ASCII Mode: FTP supports two main transfer modes: binary and ASCII. Binary mode transfers the bytes unchanged and is used for non-text files like images, videos, and executables. ASCII mode is meant for plain text files and may convert line endings between systems, so using it on a binary file can corrupt the file. When in doubt, use binary mode.

  7. Handling Large Files: FTP can resume an interrupted download if the server supports it. The client reconnects to the server and sends the REST (Restart) command with the byte offset to resume from, followed immediately by RETR, so that only the remaining part of the file is transferred.

  8. Closing the Connection: Once the file transfer is complete, the client can issue the QUIT command to gracefully terminate the FTP session and close the connection to the server.
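Items 6 and 7 can be sketched with Python's ftplib, where retrlines performs an ASCII-mode transfer and retrbinary performs a binary-mode transfer with an optional rest offset. This is a minimal sketch, not a definitive implementation: TEXT_SUFFIXES, is_text_file, resume_offset, and fetch are hypothetical names chosen for illustration, and the suffix list is an assumption you would adapt to your own files.

```python
import ftplib
import os

# Hypothetical list of extensions to treat as plain text (an assumption)
TEXT_SUFFIXES = ('.txt', '.csv', '.log')


def is_text_file(name):
    """Return True if the file name looks like a plain text file."""
    return name.lower().endswith(TEXT_SUFFIXES)


def resume_offset(local_path):
    """Return the size of a partial local file, or 0 if there is none."""
    return os.path.getsize(local_path) if os.path.exists(local_path) else 0


def fetch(ftp, remote_name, local_path):
    """Download remote_name over an already logged-in ftplib.FTP connection."""
    if is_text_file(remote_name):
        # ASCII mode: retrlines hands over each line without its line ending,
        # so the newline is added back when writing.
        with open(local_path, 'w') as out:
            def write_line(line):
                out.write(line + '\n')
            ftp.retrlines('RETR ' + remote_name, write_line)
    else:
        # Binary mode: resume from the current local size if a partial file exists.
        offset = resume_offset(local_path)
        with open(local_path, 'ab' if offset else 'wb') as out:
            # rest=offset makes ftplib send REST right before RETR
            ftp.retrbinary('RETR ' + remote_name, out.write, rest=offset or None)
```

Passing the offset through the rest parameter, rather than sending REST as a separate command, leaves it to ftplib to place REST directly before RETR.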

Here's a complete example script that downloads new .tiff files from an FTP server and keeps a history of what it has fetched:

import os
import ftplib

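# Credentials and paths used in this example; replace them with your own
username = "dlpuser"
pwd = "rNrKYTX9g7z3RgJRmxWuGHbeu"
remote_dir = 'images'
local_dir = 'C:/Users/USER/OneDrive/Images/PINTEREST/'
history_file = 'history.txt'

# Make sure the local download directory exists before writing into it
os.makedirs(local_dir, exist_ok=True)

# Connect to the server and log in
ftp = ftplib.FTP("ftp.dlptest.com")
ftp.login(username, pwd)

# Switch to the remote directory that holds the files
ftp.cwd(remote_dir)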

# Create an empty history file if it doesn't exist
if not os.path.exists(history_file):
    with open(history_file, 'w') as file:
        # Write an initial message or leave it blank
        file.write('This is the history file for downloaded files.\n')       


# Read the previously downloaded files from history file
downloaded_files = []
if os.path.exists(history_file):
    with open(history_file, 'r') as file:
        downloaded_files = file.read().splitlines()

# Get the list of files in the current directory on the FTP server 
files = ftp.nlst()

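# Compare and download only the new files
new_downloads = []
for file in files:
    # Skip anything that is not a TIFF or is already listed in the history
    if not file.endswith('.tiff') or file in downloaded_files:
        print('Skipped: %s' % file)
        continue

    local_filepath = os.path.join(local_dir, file)

    # A file that exists locally but is missing from the history is treated
    # as a partial download and resumed from its current size
    offset = os.path.getsize(local_filepath) if os.path.exists(local_filepath) else 0
    mode = 'ab' if offset else 'wb'

    try:
        with open(local_filepath, mode) as local_file:
            # rest=offset makes ftplib send REST immediately before RETR
            ftp.retrbinary('RETR %s' % file, local_file.write, rest=offset or None)
    except ftplib.all_errors as error:
        print('Failed to download: %s (%s)' % (file, error))
        continue

    new_downloads.append(file)
    print('Downloaded: %s' % file)

# Update the history file with the new downloads, one name per line
with open(history_file, 'a') as history:
    for name in new_downloads:
        history.write(name + '\n')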

ftp.quit()

Here's an explanation of what is happening in the code:

  1. The code begins by importing the necessary modules, os and ftplib, which provide functions for interacting with the operating system and FTP functionality, respectively.

  2. It then sets the FTP server credentials (username and pwd), the remote directory on the server where the files are located (remote_dir), the local directory where the files will be downloaded to (local_dir), and the name of the history file to track downloaded files (history_file).

  3. The code establishes a connection to the FTP server by creating an FTP object with the server address, then authenticates with the provided credentials using the login method.

  4. It changes the current working directory on the server to the specified remote directory using the cwd method of the FTP object.

  5. If the history file doesn't exist, it creates the file and writes an initial line describing its purpose.

  6. It checks if the history file exists and, if so, reads the contents to retrieve the list of previously downloaded files.

  7. The code retrieves the list of files in the current directory on the FTP server using the nlst method of the FTP object.

  8. It loops through each file in the directory and checks if it has a ".tiff" extension and if it has not been previously downloaded.

  9. For new files, it constructs the local file path by joining the local directory and the file name.

  10. If the file already exists locally, it is treated as a partial download: the code gets the size of the local file using the os.path.getsize() function and uses the FTP REST command to tell the server to start sending from that byte offset.

  11. It opens the local file in append mode ('ab') and uses the retrbinary method of the FTP object to resume downloading the file from the server and append the new data to the existing file.

  12. If the file doesn't exist locally, it opens the local file in write mode ('wb') and downloads the file from the server using the retrbinary method.

  13. After each download, it prints a message confirming that the file has been downloaded. Every file that was not downloaded gets its own message as well.

  14. Once all the files have been processed, the code updates the history file by appending the names of the downloaded files.

  15. Finally, it closes the FTP connection using the quit method of the FTP object.
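The selection rule from step 8 can be isolated into a small pure function, which makes it easy to test without a server. This is only a sketch: select_new_files is a hypothetical helper name, and the default '.tiff' suffix is taken from the script above.

```python
def select_new_files(remote_names, downloaded, suffix='.tiff'):
    """Return the remote names that end with suffix and are not yet downloaded.

    The order of remote_names is preserved.
    """
    seen = set(downloaded)  # set lookup is faster than scanning a list
    return [name for name in remote_names
            if name.endswith(suffix) and name not in seen]


remote = ['a.tiff', 'b.jpg', 'c.tiff']
history = ['a.tiff']
print(select_new_files(remote, history))  # ['c.tiff']
```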

Overall, the code connects to the FTP server, checks for previously downloaded files using the history file, compares the files on the server with the downloaded files, and selectively downloads the new files while allowing for resumption of interrupted downloads. It also keeps track of the downloaded files in the history file for future reference.
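The history tracking can also be sketched as two small helpers. The assumption here is that each file name is recorded as soon as its download finishes, so an interrupted run still remembers what it completed. load_history and record_download are hypothetical names, not part of ftplib.

```python
from pathlib import Path


def load_history(path):
    """Return the set of file names recorded in the history file.

    A missing history file simply means nothing has been downloaded yet.
    """
    history = Path(path)
    if not history.exists():
        return set()
    lines = history.read_text().splitlines()
    return {line.strip() for line in lines if line.strip()}


def record_download(path, name):
    """Append one file name to the history file, one name per line."""
    with open(path, 'a') as handle:
        handle.write(name + '\n')
```

In the download loop, record_download(history_file, file) would be called right after each successful retrbinary call.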

It's worth noting that FTP has been widely used in the past, but newer protocols like SFTP and HTTP(S) have gained popularity due to their improved security, ease of use, and compatibility with firewalls. Plain FTP sends usernames, passwords, and file contents unencrypted, so it is best avoided for sensitive data. However, FTP is still supported and used where legacy systems or specific requirements call for it.

Happy Coding!