This project implements a simple backup system using a client-server architecture. The client regularly backs up files from a specified directory to a remote server. The server listens for incoming file transfers and saves them in a backup directory. Both the client and server use MD5 hashing to ensure that only modified or new files are transferred.
- Features
- Prerequisites
- Installation
- Usage
- Configuration
- File Backup Flow
- Class Diagram
- Sequence Diagram
- Client-Side:
  - Periodically backs up files from a local directory to the server, at a configurable interval.
  - Allows configuring the types of files to back up via extensions.
  - Supports incremental backups by comparing file hashes.
  - Provides a graphical user interface (GUI) for easier configuration and backup management.
- Server-Side:
  - Listens for incoming file transfers from clients.
  - Saves files in a backup directory, avoiding unnecessary overwrites if the file hasn’t changed.
  - Tracks the state of backups to identify modified or new files.
The documentation in the doc folder was generated by Sphinx.
- Python 3.x
- Required Python packages:
  - schedule (for periodic backups)
  - argparse
  - tk
  - pycryptodome
- openssl (it's important to have the same version on the client and the server)
You can install the required Python packages by running:
pip install -r requirements.txt
⚠️ You may need to create a virtual environment if your Python environment is flagged as externally managed. Use a venv on both the client and the server; otherwise network errors may occur.
python3 -m venv venv
source venv/bin/activate
- Clone the repository:
  git clone <repository-url>
  cd <repository-directory>
- The server and client code are in separate scripts. Make sure both scripts are available before you start.
This project uses SSL certificates to ensure secure communication between the client and the server. SSL certificates help encrypt the data transmitted over the network, providing an additional layer of security.
To generate SSL certificates, you can use the provided generate_server_certificate_and_key.sh script. This script creates a self-signed SSL certificate and a private key.
- Make the script executable:
  chmod +x generate_server_certificate_and_key.sh
- Run the script:
  ./generate_server_certificate_and_key.sh <password> <server_name>
  - <password>: The password for the private key.
  - <server_name>: The name or address of the server.
Example:
./generate_server_certificate_and_key.sh mypassword myserver.com
The script will create a certificate directory containing server_key.pem and server_cert.pem.
The client and server use the SSL certificate to verify each other's identity and establish a secure connection. The certificate files should be placed in the certificate directory.
- Client: The client uses the SSL certificate to verify the server's identity. The certificate file should be named server_cert.pem.
- Server: The server uses the SSL certificate and private key to establish a secure connection with the client. The certificate file should be named server_cert.pem and the private key file should be named server_key.pem.
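As an illustration of how the two sides might load these files with Python's standard ssl module, here is a minimal sketch. The helper names, the default paths, and the way the password is passed are assumptions made for this example; they are not taken from the project's code.

```python
import ssl


def make_server_context(cert_file="certificate/server_cert.pem",
                        key_file="certificate/server_key.pem",
                        password=None):
    """Build a TLS context for the server side (hypothetical helper)."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # With password=None and an encrypted key, OpenSSL prompts on the
    # terminal for the pass phrase.
    context.load_cert_chain(certfile=cert_file, keyfile=key_file,
                            password=password)
    return context


def make_client_context(ca_file=None):
    """Build a TLS context for the client side (hypothetical helper)."""
    # Passing the self-signed server_cert.pem as ca_file makes the client
    # trust that certificate; None falls back to the system trust store.
    return ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
```

A client would typically call make_client_context("certificate/server_cert.pem") and wrap its socket with context.wrap_socket(sock, server_hostname=...), using the same server name that was given to the certificate script.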
To start the server, simply run:
$ python3 -m Server.Server
The SSL package will ask for the PEM password you defined when running ./generate_server_certificate_and_key.sh:
$ python3 Server.py --host 192.168.191.118
Enter PEM pass phrase:
By default, the server listens on 127.0.0.1:12345 and the backup directory is backup. You can change these defaults by adjusting the parameters of the Server class constructor or via environment variables.
The server will:
- Listen for incoming connections on the specified port.
- Accept file transfers from clients and save them in the backup directory.
- Track the state of files using a backup_state.json file to determine if a file is modified.
The server must be running for the client to work.
To start the client, run:
python3 Client/Client.py --server <server-ip> --source <source-directory> --extensions <config-file> --port <port>
- --server: IP address of the server to which files will be backed up.
- --source: Directory from which files will be backed up.
- --extensions: Path to a configuration file containing the allowed file extensions for backup (one extension per line).
- --port: (Optional) Port number for the server. Default is 12345.
- --remove: (Optional) Path to backup directory to delete.
- --restore: (Optional) Backup ID to restore.
- --path: (Optional) Custom path to restore the backup.
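For readers who want to see how such options are usually declared, the following is a rough sketch using argparse. The flag names and the default port come from the list above; the function name, the help texts, and the choice not to mark any flag as required are assumptions for illustration.

```python
import argparse


def build_parser():
    """Declare the client options listed above (illustrative sketch only)."""
    parser = argparse.ArgumentParser(description="Backup client (sketch)")
    parser.add_argument("--server", help="IP address of the backup server")
    parser.add_argument("--source", help="Directory to back up")
    parser.add_argument("--extensions",
                        help="File with allowed extensions, one per line")
    parser.add_argument("--port", type=int, default=12345, help="Server port")
    parser.add_argument("--remove", help="Path to backup directory to delete")
    parser.add_argument("--restore", help="Backup ID to restore")
    parser.add_argument("--path", help="Custom path to restore the backup")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args([
    "--server", "192.168.1.100",
    "--source", "/tmp/docs",
    "--extensions", "allowed_extensions.txt",
])
```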
The client will:
- Calculate the hash of each file and compare it with the previously stored hash (if any) to detect changes.
- Send modified or new files to the server.
- Save the state (file hashes) to the state file.
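The three steps above can be sketched roughly as follows. This is a simplified illustration, not the project's implementation: the function names are hypothetical, and the state is kept as a flat mapping of relative path to MD5 hash rather than in the nested layout used by local_hash_log.json.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_changed_files(source_dir, previous_state):
    """Return (changed_paths, new_state) for everything under source_dir.

    previous_state maps relative paths to hashes from the last run
    (flat layout assumed for this sketch).
    """
    changed = []
    new_state = {}
    for root, _dirs, files in os.walk(source_dir):
        for name in sorted(files):
            full_path = os.path.join(root, name)
            rel_path = os.path.relpath(full_path, source_dir)
            file_hash = md5_of_file(full_path)
            new_state[rel_path] = file_hash
            # A missing entry (new file) or a different hash (modified file)
            # both mean the file has to be sent.
            if previous_state.get(rel_path) != file_hash:
                changed.append(rel_path)
    return changed, new_state
```

After the changed files have been sent, new_state would be written back to the state file so that the next run only picks up later modifications.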
To back up files from the directory ~/Documents to the server at 192.168.1.100, using a configuration file allowed_extensions.txt, run:
python3 Client.py --server 192.168.1.100 --source ~/Documents --extensions allowed_extensions.txt
You can schedule periodic backups by providing the --interval argument (in hours):
python3 Client.py --server 192.168.1.100 --source ~/Documents --extensions allowed_extensions.txt --interval 1
This will schedule backups every hour.
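The project lists the schedule package for this purpose. To stay self-contained, the sketch below shows the same idea with the standard library only; run_periodically, its parameters, and the choice to run the first backup immediately are assumptions for illustration.

```python
import time


def run_periodically(job, interval_hours, max_runs=None, sleep=time.sleep):
    """Call job() now, then again every interval_hours.

    max_runs limits the number of calls (None means run forever);
    sleep is injectable so the loop can be exercised without waiting.
    """
    interval_seconds = interval_hours * 3600
    runs = 0
    while max_runs is None or runs < max_runs:
        job()
        runs += 1
        if max_runs is not None and runs >= max_runs:
            break
        sleep(interval_seconds)
    return runs
```

With --interval 1, a client built this way would call its backup routine once per hour until it is stopped.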
The client can also be run through a GUI, which provides an interface to configure and manage backups. To start the GUI client, run:
python3 Client/Client.py
The GUI will allow you to:
- Enter the server IP and port.
- Browse and select the source directory for backup.
- Browse and select the configuration file for allowed extensions.
- Start the backup process.
- Delete existing backups.
- Restore backups.
The configuration file should contain a list of file extensions (one per line) that you want to back up. For example:
txt
jpg
pdf
Only files with these extensions will be considered for backup.
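A minimal sketch of how such a file could be read and applied is shown below. The function names are hypothetical, and ignoring case, blank lines, and a leading dot are assumptions of this example rather than documented behavior.

```python
import os


def load_extensions(config_path):
    """Read one extension per line, skipping blank lines."""
    extensions = set()
    with open(config_path, encoding="utf-8") as handle:
        for line in handle:
            # Normalize "PDF", ".pdf" and "pdf" to the same value (assumption).
            ext = line.strip().lstrip(".").lower()
            if ext:
                extensions.add(ext)
    return extensions


def is_allowed(file_name, extensions):
    """Return True if the file's extension is in the allowed set."""
    ext = os.path.splitext(file_name)[1].lstrip(".").lower()
    return ext in extensions
```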
- Client:
  - The client calculates the MD5 hash of each file in the specified source directory.
  - The client compares the file hash with the saved state in the local_hash_log.json file. If the hash is different (or if the file is new), the client prepares to send it to the server.
  - Each root folder that is backed up is assigned a unique code for identification purposes. This code is used to track the backup state and link it to the original file path.
- Server:
  - The server listens for incoming file transfer requests from the client.
  - Upon receiving file information (name and size), the server checks if the file has been modified since the last backup. If modified, it accepts the file and saves it.
  - The server uses the unique code to identify the backup and manage the backup state.
- State Management:
  - Client-Side: The client stores the backup state in a JSON file (local_hash_log.json), which tracks the hashes of files that have been backed up. This ensures that only modified or new files are transferred during subsequent backups. The local_hash_log.json file contains information about each backup, including the unique code, the original file path, and the date of the backup. The format of the file is as follows:
    {
      "source_dir": {
        "folder_id": "unique_id",
        "files_hash": {
          "relative_path": {
            "source_path": "original_file_path",
            "is_folder": true,
            "folder_hash": "hash_value",
            "files_hash": {
              "file_name": {
                "source_path": "original_file_path",
                "is_folder": false,
                "hash": "hash_value"
              }
            }
          }
        }
      }
    }
    - source_dir: The source directory being backed up.
    - folder_id: A unique identifier for the backup.
    - files_hash: A nested structure containing the hashes of files and folders.
    - relative_path: The relative path of the folder or file.
    - source_path: The original file path on the client side.
    - is_folder: A boolean indicating whether the item is a folder.
    - folder_hash: The hash value of the folder.
    - hash: The hash value of the file.
  - Server-Side: The server stores the backup state in a JSON file (backup_state.json), which tracks the hashes of files that have been backed up. This ensures that only modified or new files are transferred during subsequent backups. The backup_state.json file contains information about each backup, including the unique code, the original file path, and the date of the backup. The format of the file is as follows:
    {
      "idDossier": {
        "absoluteFilePath": "filePathRootFolderOnServerSide",
        "date": "date",
        "sourcePath": "originalFilePathOnClientSide"
      }
    }
    - idDossier: A unique identifier for the backup.
    - absoluteFilePath: The absolute file path of the backup on the server side.
    - date: The date and time of the backup.
    - sourcePath: The original file path on the client side.
This file helps in tracking the state of backups and ensures that only modified or new files are transferred during subsequent backups.
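To make the two state layouts more concrete, here is a small sketch that works with them. It assumes the keys inside files_hash are path components (folder or file names) and that the date is stored as an ISO 8601 string; both are guesses, since the format above does not specify them. The function names are hypothetical.

```python
import json
from datetime import datetime


def flatten_files_hash(files_hash, prefix=""):
    """Turn the nested client-side files_hash into {relative_path: hash}."""
    flat = {}
    for name, entry in files_hash.items():
        path = f"{prefix}/{name}" if prefix else name
        if entry.get("is_folder"):
            # Folders carry their own nested files_hash; recurse into it.
            flat.update(flatten_files_hash(entry.get("files_hash", {}), path))
        else:
            flat[path] = entry["hash"]
    return flat


def record_backup(server_state, folder_id, absolute_path, source_path, when=None):
    """Add or update one entry in the server-side state dictionary."""
    when = when or datetime.now()
    server_state[folder_id] = {
        "absoluteFilePath": absolute_path,
        "date": when.isoformat(timespec="seconds"),  # date format is assumed
        "sourcePath": source_path,
    }
    return server_state


# Example: build a server-side entry and show it as JSON.
example_state = record_backup({}, "example-id", "/srv/backup/example-id",
                              "/home/user/Documents")
print(json.dumps(example_state, indent=2))
```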
The following class diagram illustrates the structure and interactions of the main components in the backup system. It includes the Client, Server, Handlers, and BackupClientGUI classes, along with their attributes, methods, and relationships.
- Client: Represents the client-side component of the backup system. It handles file hashing, backup initiation, and communication with the server.
- Server: Represents the server-side component of the backup system. It listens for incoming file transfers, manages backup states, and handles client requests.
- Handlers: Represents the different handlers used by the server to process commands from the client.
- BackupClientGUI: Represents the graphical user interface for the backup client. It allows users to configure and manage backups through a user-friendly interface.
The following sequence diagram illustrates the detailed interactions between the BackupClientGUI, Client, Server, and Handlers classes in the backup system. It shows the flow of commands and data between these components during the backup and deletion processes.
- User Interaction:
  - The user starts the BackupClientGUI.
  - The user enters the server IP, source directory, config file, and port.
- Initialization:
  - The BackupClientGUI initializes the Client with the provided parameters.
- Start Backup:
  - The BackupClientGUI calls start_backup() on the Client.
  - The Client sends a CHECK_SOURCE command to the Server to check if the source directory exists.
  - The Server handles the command using the appropriate Handler and returns a response.
  - The Client retrieves the list of files to back up.
  - The Client sends a FILE_TRANSFER command to the Server with file metadata.
  - The Server handles the command using the appropriate Handler and returns a response.
  - The Client sends DATA_TRANSFER commands to the Server with file chunks.
  - The Server handles each command using the appropriate Handler and returns a response.
  - The Client sends a TRANSFER_COMPLETE command to the Server to indicate that the file transfer is complete.
  - The Server handles the command using the appropriate Handler and returns a response.
  - The Client sends a BACKUP_COMPLETE command to the Server to indicate that the backup process is complete.
  - The Server handles the command using the appropriate Handler and returns a response.
  - The BackupClientGUI displays a success message to the user.
- Delete Backup:
  - The user requests to delete a backup.
  - The BackupClientGUI calls delete_backup(backup_id) on the Client.
  - The Client sends a DELETE_BACKUP command to the Server.
  - The Server handles the command using the appropriate Handler and returns a response.
  - The BackupClientGUI displays a success or error message to the user.
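The command flow above can be illustrated with a toy, in-memory model. The command names come from the sequence description; everything else (the payload shapes, the "OK"/"ERROR" responses, the chunk size, and the class and function names) is assumed for the sake of the example, and only a subset of the commands is covered. No networking is involved.

```python
def plan_file_transfer(file_name, data, chunk_size=4096):
    """Return the (command, payload) sequence a client might send for one file."""
    commands = [("FILE_TRANSFER", {"name": file_name, "size": len(data)})]
    for offset in range(0, len(data), chunk_size):
        commands.append(("DATA_TRANSFER", data[offset:offset + chunk_size]))
    commands.append(("TRANSFER_COMPLETE", None))
    return commands


class InMemoryServer:
    """Toy stand-in for the Server: routes each command to a handler method."""

    def __init__(self):
        self.files = {}
        self._current = None
        # Dispatch table: command name -> handler.
        self.handlers = {
            "FILE_TRANSFER": self.handle_file_transfer,
            "DATA_TRANSFER": self.handle_data_transfer,
            "TRANSFER_COMPLETE": self.handle_transfer_complete,
            "DELETE_BACKUP": self.handle_delete_backup,
        }

    def handle(self, command, payload):
        handler = self.handlers.get(command)
        if handler is None:
            return "ERROR unknown command"
        return handler(payload)

    def handle_file_transfer(self, meta):
        # Start collecting data for the announced file.
        self._current = meta["name"]
        self.files[self._current] = b""
        return "OK"

    def handle_data_transfer(self, chunk):
        if self._current is None:
            return "ERROR no transfer in progress"
        self.files[self._current] += chunk
        return "OK"

    def handle_transfer_complete(self, _payload):
        self._current = None
        return "OK"

    def handle_delete_backup(self, name):
        # In this toy model a "backup" is just one stored file.
        if name not in self.files:
            return "ERROR not found"
        del self.files[name]
        return "OK"
```

Feeding the output of plan_file_transfer() into InMemoryServer.handle() one command at a time mirrors the request/response pairs in the Start Backup steps; a DELETE_BACKUP call afterwards mirrors the Delete Backup steps.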