Immich OCR Using Extra Threads: A Bug Report & Discussion

by Admin

Hey guys! Today, we're diving into a peculiar issue reported by an Immich user concerning the OCR (Optical Character Recognition) functionality. It seems that since version 2.2.0, Immich's OCR is exhibiting some unexpected behavior by utilizing two threads by default. Let's break down the problem, explore the details, and see what might be causing this.

The Issue: OCR Thread Overload

Our user, who's running version 2.2.1, noticed that the OCR process automatically spins up two threads. Now, here's where it gets interesting: when they configure Immich to run two OCR jobs in parallel, instead of sticking to the expected two threads, it adds another one, resulting in a total of three threads chugging away. This behavior is observed when using the server model. The original poster provides the following:

I am running 2.2.1 and since 2.2.0 OCR uses 2 threads by default. If I configure it to run 2 jobs in parallell it will add another one, so it runs on three threads instead of the expected two. I am using the server model.

Why is this a problem? Well, excessive thread usage can lead to increased CPU load, potentially impacting the overall performance of your Immich server. It's like having too many cooks in the kitchen – things can get a bit chaotic and inefficient. We need to understand why this is happening and how to manage it.

Diving Deeper: Environment and Configuration

To get a clearer picture, let's look at the user's setup. They're running Immich Server on Debian 13, with both the server and the mobile app at version 2.2.1. The issue affects the server side only. Now, let's peek at the docker-compose.yml and .env files to understand the configuration.

Docker Compose File

The docker-compose.yml file outlines the services that make up the Immich ecosystem. Here's a snippet:

services:
  immich-server:
    container_name: immich_server
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    volumes:
      - ${UPLOAD_LOCATION}:/data
      - /etc/localtime:/etc/localtime:ro
    env_file:
      - .env
    ports:
      - '127.0.0.1:2283:2283'
    depends_on:
      - redis
      - database
    restart: always
    healthcheck:
      disable: false
    logging:
      driver: "journald"
      options:
        tag: "immich-server"

  immich-machine-learning:
    container_name: immich_machine_learning
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}
    volumes:
      - model-cache:/cache
    env_file:
      - .env
    restart: always
    healthcheck:
      disable: false

The snippet shows the Immich server and the machine learning service; Redis and the PostgreSQL database appear only as depends_on entries, with their definitions left out here. Notably, the immich-machine-learning service is responsible for the OCR tasks. The compose file doesn't reveal any explicit thread settings, so the default behavior is likely the culprit.

Environment Variables

The .env file contains environment-specific configurations. Here's the relevant part:

UPLOAD_LOCATION=/data/immich-library
DB_DATA_LOCATION=/root/immich-data/postgres-data
TZ=Europe/Berlin
IMMICH_VERSION=v2
DB_PASSWORD=stripped
DB_USERNAME=postgres
DB_DATABASE_NAME=immich

MACHINE_LEARNING_PRELOAD__CLIP__TEXTUAL=ViT-SO400M-16-SigLIP2-384__webli

Again, nothing here sets the number of threads used by the OCR process. With no thread-related variables in either file, the thread count is either a built-in default or decided at runtime by Immich itself, not something this user configured.
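To double-check your own .env for thread settings, a small script can do the looking for you. This is a minimal sketch: the four variable names are what I recall from Immich's environment-variable documentation for the machine learning container, so treat them as assumptions and verify them against the docs for your version. The helper names parse_env and thread_settings are made up for this example.

```python
from typing import Optional

# Assumed names of thread-related settings for the machine learning
# container. Verify these against the Immich docs for your version.
THREAD_VARS = (
    "MACHINE_LEARNING_WORKERS",
    "MACHINE_LEARNING_REQUEST_THREADS",
    "MACHINE_LEARNING_MODEL_INTER_OP_THREADS",
    "MACHINE_LEARNING_MODEL_INTRA_OP_THREADS",
)


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def thread_settings(text: str) -> dict:
    """Map each assumed thread variable to its value, or None if unset."""
    env = parse_env(text)
    result = {}
    for name in THREAD_VARS:
        value: Optional[str] = env.get(name)
        result[name] = value
    return result
```

Calling thread_settings(open(".env").read()) on the file above should return None for every entry, which matches the observation that the defaults are in play.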

Reproduction Steps

The user provided a clear set of steps to reproduce the issue:

  1. Use server OCR model
  2. Set OCR parallel jobs to 1
  3. Start OCR queue
  4. Check CPU load
  5. See 2 active ML threads
  6. Stop OCR queue
  7. See no active ML threads

By following these steps, you can confirm whether you're experiencing the same problem.
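Steps 4 to 7 come down to counting which threads of the ML process are actually burning CPU. One way to do that on Linux without extra tools is to sample the per-thread CPU counters under /proc twice and compare. The sketch below assumes a Linux host and a known process ID; the function names and the min_ticks threshold are choices made for this example, not anything Immich provides.

```python
import os
import time


def parse_stat(stat_line: str) -> tuple:
    """Return (name, state, cpu_ticks) from a /proc/<pid>/task/<tid>/stat line.

    The thread name sits in parentheses and may contain spaces, so split
    on the last ')' instead of on whitespace.
    """
    head, _, tail = stat_line.rpartition(")")
    name = head.split("(", 1)[1]
    fields = tail.split()
    state = fields[0]                                # field 3 in proc(5)
    utime, stime = int(fields[11]), int(fields[12])  # fields 14 and 15
    return name, state, utime + stime


def thread_ticks(pid: int) -> dict:
    """Snapshot cumulative CPU ticks for every thread of a process."""
    ticks = {}
    task_dir = f"/proc/{pid}/task"
    for tid in os.listdir(task_dir):
        try:
            with open(f"{task_dir}/{tid}/stat") as fh:
                ticks[int(tid)] = parse_stat(fh.read())[2]
        except (FileNotFoundError, ProcessLookupError):
            continue  # thread exited between listing and reading
    return ticks


def busy_threads(pid: int, interval: float = 5.0, min_ticks: int = 10) -> list:
    """Thread IDs that used at least min_ticks of CPU during the interval."""
    before = thread_ticks(pid)
    time.sleep(interval)
    after = thread_ticks(pid)
    return sorted(
        tid for tid, n in after.items() if n - before.get(tid, 0) >= min_ticks
    )
```

Run busy_threads(pid) once while the OCR queue is active and once after stopping it. If you see the same thing as the reporter, the first call returns two thread IDs with parallel jobs set to 1, and the second returns an empty list. Note that the worker PID printed in the ML log is the PID inside the container, so either run the script in the container or look up the host-side PID first.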

Log Output Analysis

Reviewing the provided log output, we can see the initialization of the machine learning component:

[10/31/25 23:07:40] INFO Starting gunicorn 23.0.0
[10/31/25 23:07:40] INFO Listening at: http://[::]:3003 (8)
[10/31/25 23:07:40] INFO Using worker: immich_ml.config.CustomUvicornWorker
[10/31/25 23:07:40] INFO Booting worker with pid: 9
[10/31/25 23:07:47] INFO Started server process [9]
[10/31/25 23:07:47] INFO Waiting for application startup.
[10/31/25 23:07:47] INFO Created in-memory cache with unloading after 300s of inactivity.
[10/31/25 23:07:47] INFO Initialized request thread pool with 4 threads.
[10/31/25 23:07:47] INFO Preloading models:
 clip:textual='ViT-SO400M-16-SigLIP2-384__webli'
 visual=None facial_recognition:recognition=None
 detection=None
[10/31/25 23:07:47] INFO Loading textual model 'ViT-SO400M-16-SigLIP2-384__webli' to memory
[10/31/25 23:07:47] INFO Setting execution providers to
 ['CPUExecutionProvider'], in descending order of
 preference
[10/31/25 23:07:52] INFO Application startup complete.
[10/31/25 23:07:52] INFO Loading detection model 'PP-OCRv5_server' to memory
[10/31/25 23:07:52] INFO Setting execution providers to
 ['CPUExecutionProvider'], in descending order of
 preference
[10/31/25 23:07:53] INFO Using engine_name: onnxruntime
[10/31/25 23:08:17] INFO Loading recognition model 'PP-OCRv5_server' to
 memory
[10/31/25 23:08:17] INFO Setting execution providers to
 ['CPUExecutionProvider'], in descending order of
 preference
[10/31/25 23:08:18] INFO Using engine_name: onnxruntime

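This log shows the machine learning component initializing a request thread pool with four threads. Judging by its name, that pool serves incoming requests, so on its own it doesn't explain why two threads are busy when only one OCR job is running. What the log does tell us is that OCR loads two PP-OCRv5_server models (detection and recognition), and that both run on onnxruntime with the CPUExecutionProvider. That's where the CPU work happens.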
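If you want to compare your own ML logs against this one, the interesting facts can be pulled out with a few regular expressions. This is a minimal sketch that matches the message wording seen in the log above; other Immich versions may phrase these lines differently, and summarize_ml_log is a name invented for this example.

```python
import re

# Patterns follow the wording of the log lines shown above.
POOL_RE = re.compile(r"Initialized request thread pool with (\d+) threads")
MODEL_RE = re.compile(r"Loading (\w+) model '([^']+)'")
ENGINE_RE = re.compile(r"Using engine_name: (\S+)")


def summarize_ml_log(log_text: str) -> dict:
    """Extract request pool size, loaded models and inference engines."""
    pool = POOL_RE.search(log_text)
    return {
        "request_threads": int(pool.group(1)) if pool else None,
        "models": MODEL_RE.findall(log_text),          # (task, model name)
        "engines": sorted(set(ENGINE_RE.findall(log_text))),
    }
```

Feeding it the output of your ML container's log gives a quick side-by-side: same pool size, same models, same engine, or not.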

Possible Causes and Solutions

So, what could be causing this behavior, and how can we fix it?

  1. Default Thread Configuration: The log shows OCR running PP-OCRv5_server models through onnxruntime, so the two threads may simply be the default thread count that Immich passes to the inference runtime. This could be a setting within the Immich codebase or a default of the runtime itself.

    • Solution: Check the Immich documentation and codebase for a way to configure the number of threads used for model inference. This might involve passing environment variables to the immich-machine-learning service or, failing that, modifying the service itself.
  2. Parallelism Within the Inference Runtime: ONNX Runtime can split a single inference across several threads (its session options include intra_op_num_threads and inter_op_num_threads). This could explain why two threads are active even when processing a single job.

    • Solution: Check the ONNX Runtime documentation on threading and see whether Immich exposes those options. You might be able to limit inference to a single thread.
  3. Bug in Immich Code: There might be a bug in Immich's code that's causing it to launch an extra thread when OCR is enabled.

    • Solution: This would require a code fix from the Immich developers. Reporting the issue on the Immich GitHub repository is the best way to get it addressed.
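One way to narrow down these candidates is to ask which simple scaling rule fits the two data points from the report: one parallel job gave two busy threads, and two parallel jobs gave three. The sketch below checks a few hypothetical rules against those numbers. The rules are illustrations invented for this example and say nothing about the actual mechanism inside Immich.

```python
# Observations from the report: parallel OCR jobs -> busy ML threads.
OBSERVED = {1: 2, 2: 3}

# Hypothetical rules for how busy threads could scale with job count.
HYPOTHESES = {
    "one thread per job": lambda jobs: jobs,
    "two threads per job": lambda jobs: 2 * jobs,
    "one thread per job plus one extra": lambda jobs: jobs + 1,
}


def consistent_hypotheses(observed: dict, hypotheses: dict) -> list:
    """Names of the rules that reproduce every observation exactly."""
    return [
        name
        for name, predict in hypotheses.items()
        if all(predict(jobs) == threads for jobs, threads in observed.items())
    ]


print(consistent_hypotheses(OBSERVED, HYPOTHESES))
# ['one thread per job plus one extra']
```

Of these three rules, only "one per job plus one extra" matches both measurements. A plain two-threads-per-job default would predict four threads at two parallel jobs, which is not what the user saw. A third measurement with three parallel jobs would be a cheap way to test the rule further.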

Next Steps and Community Input

If you're experiencing this issue, here are some steps you can take:

  • Experiment with OCR Settings: If there are any OCR-related settings in the Immich admin panel or configuration files, try adjusting them to see if they affect the thread usage.
  • Monitor CPU Usage: Keep an eye on your CPU usage to see how much of an impact this issue is having on your server's performance.
  • Engage with the Immich Community: Share your findings and experiences on the Immich GitHub repository or community forum. This can help the developers identify and fix the issue more quickly.

Let's work together to get to the bottom of this and ensure that Immich's OCR functionality is as efficient and reliable as possible! Happy Immich-ing, everyone!