Documentation

Welcome to the technical reference for the Face-Swap Engine. This project performs real-time, high-quality face swapping between photos using Deep Learning methods. It is optimized to run on NVIDIA RTX series graphics cards (CUDA).

Note: The system automatically manages hardware acceleration, prioritizing NVIDIA CUDA cores when available, falling back to CPU execution providers seamlessly.
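The fallback order described in the note can be sketched as follows. This is a minimal illustration, not the project's actual code: `choose_providers` is a hypothetical helper, while the provider names and `onnxruntime.get_available_providers()` are standard ONNX Runtime identifiers.

```python
def choose_providers(available):
    """Order execution providers so CUDA is tried first and CPU is the fallback."""
    preferred = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    chosen = [p for p in preferred if p in available]
    # Always end up with at least the CPU provider.
    return chosen or ["CPUExecutionProvider"]


try:
    import onnxruntime as ort
    available = ort.get_available_providers()
except ImportError:
    # onnxruntime is not installed here; assume CPU only for the demo.
    available = ["CPUExecutionProvider"]

providers = choose_providers(available)
print(providers)
```

The resulting list is in the form that ONNX Runtime accepts for the `providers` argument of `InferenceSession`, which tries the entries in order.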

Installation Guide

Please follow these steps sequentially to set up the project environment correctly.

System Requirements:

  • OS: Windows 10 or 11
  • Hardware: NVIDIA RTX Graphics Card (Recommended: RTX 3060+)
  • Software: Python 3.11.x

1. Basic Setup

If Python is not installed, download version 3.11.9. Important: You must check the box "Add python.exe to PATH" during installation.

Additionally, "Visual Studio C++ Build Tools" are required for the InsightFace library. If not present, install the "Desktop development with C++" workload via the Visual Studio Installer.

2. Project Environment (Venv)

Navigate to the project folder, open your terminal (CMD/Terminal), and execute the following commands to create and activate the virtual environment:

# Create the virtual environment
>> python -m venv venv

# Activate the environment (You should see (venv) prefix)
>> venv\Scripts\activate

3. Installing Libraries

With (venv) active in your terminal, install the required packages using the commands below:

Category       | Command
Basic Tools    | pip install numpy==1.26.4 opencv-python==4.10.0.84 PyQt5 scipy
AI Engine      | pip install insightface==0.7.3 onnxruntime-gpu
NVIDIA CUDA 12 | pip install nvidia-cudnn-cu12 nvidia-cublas-cu12 nvidia-cufft-cu12 nvidia-cuda-runtime-cu12 nvidia-cuda-nvrtc-cu12

4. NVIDIA Configuration

DLL files must be moved so Python can detect the NVIDIA libraries. An automated script is provided for this task:

>> python setup_nvidia.py

If successful, you will see a "SUCCESS" message. If it fails, you must manually copy the DLL files from the site-packages\nvidia\...\bin folders into your venv\Scripts folder.
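For reference, the manual copy step can be approximated with a few lines of Python. This is a sketch under stated assumptions and not the contents of setup_nvidia.py: `copy_nvidia_dlls` is a hypothetical helper, and it assumes that every `*.dll` sitting in a folder named `bin` anywhere under `site-packages\nvidia` should be copied.

```python
import shutil
from pathlib import Path


def copy_nvidia_dlls(site_packages, scripts_dir):
    """Copy each *.dll found in a `bin` folder under <site_packages>/nvidia into scripts_dir."""
    nvidia_root = Path(site_packages) / "nvidia"
    target = Path(scripts_dir)
    if not nvidia_root.is_dir():
        return []  # nothing to copy: the NVIDIA pip packages are not installed
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for dll in sorted(nvidia_root.rglob("*.dll")):
        # Assumption: only DLLs located directly inside a `bin` folder are needed.
        if dll.parent.name.lower() == "bin":
            shutil.copy2(dll, target / dll.name)
            copied.append(dll.name)
    return copied


# Example call from the project folder on Windows (paths are assumptions):
# copy_nvidia_dlls(r"venv\Lib\site-packages", r"venv\Scripts")
```

The function returns the names of the copied files, so an empty list is a quick sign that the NVIDIA packages from step 3 were not found.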

5. Execution

Ensure the model file inswapper_128.onnx is located in the same directory as interface.py. To start the application:

>> python interface.py

Note: When clicking "Swap Faces" for the first time, there may be a 5-10 second delay as the AI model loads onto the GPU. Subsequent operations run without this loading delay.
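The model-location requirement from this step can be verified before launching. The check below is a minimal sketch: `find_model` is a hypothetical helper and is not part of interface.py.

```python
from pathlib import Path

MODEL_NAME = "inswapper_128.onnx"


def find_model(app_dir):
    """Return the model path inside app_dir, or raise if the file is missing."""
    path = Path(app_dir) / MODEL_NAME
    if not path.is_file():
        raise FileNotFoundError(f"{MODEL_NAME} not found in {Path(app_dir).resolve()}")
    return path


# Example: check the folder that contains interface.py (here assumed to be the current folder).
# find_model(".")
```

Failing early with a clear message is easier to diagnose than a model-loading error that only appears on the first "Swap Faces" click.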

System Architecture

The application is structured into two distinct layers to ensure separation of concerns and thread safety.

  • Backend (`main.py`): Handles all inference, image-processing matrix operations, and file I/O. It holds no UI state.
  • Frontend (`interface.py`): Manages the PyQt5 event loop, signals, and slots. It never calls blocking backend functions directly on the main thread.

Deep Learning Engine

Class DeepLearningEngine

This is the primary engine for photorealistic swaps using the ONNX Runtime.

Initialization

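import insightface
from insightface.app import FaceAnalysis

# Assumption: MODEL_PATH points at inswapper_128.onnx next to interface.py
MODEL_PATH = "inswapper_128.onnx"

class DeepLearningEngine:
    def __init__(self, device='gpu'):
        # Initializes InsightFace with the buffalo_l model
        self.face_analyser = FaceAnalysis(name='buffalo_l', ...)

        # Loads the swapper model (inswapper_128.onnx)
        self.swapper = insightface.model_zoo.get_model(MODEL_PATH)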

Key Methods

Method      | Parameters                 | Description
process()   | img_a, img_b, idx_a, idx_b | Performs the swap (A to B) and applies the sharpening enhancement filter.
get_faces() | img_path                   | Detects faces and caches results to avoid re-computation on the same files.
_enhance()  | swapped_img, target_face   | Internal method. Applies a 3x3 convolution kernel to sharpen facial features.
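To illustrate what a 3x3 sharpening convolution does, here is a NumPy-only sketch. The kernel weights and the `sharpen` helper are assumptions chosen for illustration; the kernel actually used by `_enhance()` is not documented here.

```python
import numpy as np

# A common 3x3 sharpening kernel (assumed). Its weights sum to 1,
# so flat regions keep their brightness while edges are amplified.
SHARPEN_KERNEL = np.array([[0, -1, 0],
                           [-1, 5, -1],
                           [0, -1, 0]], dtype=np.float32)


def sharpen(img, kernel=SHARPEN_KERNEL):
    """Filter an HxW or HxWxC uint8 image with a 3x3 kernel, replicating border pixels."""
    img_f = img.astype(np.float32)
    pad = [(1, 1), (1, 1)] + [(0, 0)] * (img_f.ndim - 2)
    padded = np.pad(img_f, pad, mode="edge")
    h, w = img_f.shape[:2]
    out = np.zeros_like(img_f)
    # Accumulate the nine shifted copies of the image, each weighted by one kernel entry.
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return np.clip(out, 0, 255).astype(np.uint8)
```

In an OpenCV-based project the same filtering is usually done with `cv2.filter2D(img, -1, kernel)`, which is faster and differs only in its default border handling.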

Geometric Engine

Class GeometricEngine

A fallback engine that uses pure geometry rather than neural networks. It is computationally lighter but requires precise alignment.

Workflow

  1. Mesh Generation: Uses MediaPipe to find 468 landmarks.
  2. Triangulation: scipy.spatial.Delaunay creates a mesh from landmarks.
  3. Warping: Affine transforms map texture from Source triangles to Target triangles.

def process(self, img_a, img_b, ...):
    # 1. Get Landmarks
    lm_a = self.get_landmarks(img_a)
    
    # 2. Seamless Clone (Poisson Blending)
    result = cv2.seamlessClone(warped_face, img_b, mask, center, ...)
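The warping step in the workflow relies on one affine transform per triangle pair. The sketch below shows how such a transform can be derived from three vertex correspondences with NumPy alone. Both helper names are hypothetical; the matrix layout (2x3) matches what `cv2.getAffineTransform` returns and `cv2.warpAffine` expects.

```python
import numpy as np


def affine_from_triangles(src_tri, dst_tri):
    """Return the 2x3 affine matrix that maps each source vertex onto its target vertex."""
    src = np.asarray(src_tri, dtype=np.float64)  # shape (3, 2)
    dst = np.asarray(dst_tri, dtype=np.float64)  # shape (3, 2)
    # Homogeneous source coordinates: each row is [x, y, 1].
    src_h = np.hstack([src, np.ones((3, 1))])
    # Solve src_h @ X = dst for X (3x2); fails if the triangle is degenerate.
    X = np.linalg.solve(src_h, dst)
    return X.T  # 2x3: [linear part | translation]


def apply_affine(M, points):
    """Apply a 2x3 affine matrix to an (N, 2) array of points."""
    pts = np.asarray(points, dtype=np.float64)
    return pts @ M[:, :2].T + M[:, 2]


src = [(0, 0), (1, 0), (0, 1)]
dst = [(10, 20), (12, 20), (10, 23)]
M = affine_from_triangles(src, dst)
print(apply_affine(M, src))  # reproduces the target vertices
```

Every pixel inside the source triangle is mapped with the same matrix, which is why a Delaunay mesh over the landmarks is enough to warp the whole face piecewise.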

Threading Model (Frontend)

The GUI relies on Qt's signal/slot mechanism to remain responsive. No heavy computation occurs in `FaceSwapApp`.

ComparisonWorker (QThread)

This worker handles the end-to-end processing pipeline.

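import numpy as np
from PyQt5.QtCore import QThread, pyqtSignal

class ComparisonWorker(QThread):
    progress = pyqtSignal(int, str)
    # Note: this declaration shadows QThread's built-in finished signal
    finished = pyqtSignal(np.ndarray, np.ndarray, float)

    def run(self):
        # Calls the blocking backend function on this worker thread, not the GUI thread
        backend.process_comparison(..., progress_callback=self.progress.emit)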

Dev Note: The progress_callback parameter allows the backend to update the frontend UI bar without the backend knowing about Qt libraries.
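The decoupling described in the Dev Note can be shown without Qt at all. In the sketch below, `process_comparison` is a hypothetical stand-in for the backend function (its real parameters are not documented here); the only contract it assumes is that the callback accepts an `int` percentage and a `str` message, matching the `progress` signal signature.

```python
def process_comparison(steps, progress_callback=None):
    """Stand-in backend routine: runs each step and reports (percent, message)."""
    total = len(steps)
    for i, name in enumerate(steps, start=1):
        # ... the real work for this step would run here ...
        if progress_callback is not None:
            progress_callback(100 * i // total, name)
    return total


# Any callable with the (int, str) signature works: a Qt signal's emit,
# a logger, or a plain list collector as used here.
events = []
process_comparison(["detect", "swap", "enhance"],
                   progress_callback=lambda pct, msg: events.append((pct, msg)))
print(events)
```

Because the backend only sees a plain callable, it can be unit-tested or run from a script without importing PyQt5.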