Why You Should Stop Using Imagick for PDF Previews (And What to Do Instead)

 

If you manage a web application where users upload documents, you’ve undoubtedly run into that classic Friday afternoon nightmare: a user uploads a heavy, complex PDF, and the server suddenly crashes due to memory exhaustion.

Nine times out of ten, the culprit is Imagick. But how did we end up relying so heavily on this module, and why has it become such a massive bottleneck today? Let’s take a step back to understand the root of the problem.

A Little History: Why Did We Always Use Imagick?

In the PHP ecosystem, image manipulation has historically been handled by the GD library. It is lightweight, fast, and pre-installed on virtually every hosting environment. So why did everyone choose Imagick for handling PDFs?

The answer lies in how files are structured:

  • The GD library only understands pixels: It was built to manipulate raster formats (PNG, JPEG, GIF). It doesn’t have the slightest clue how a PDF file is put together.
  • PDF is a vector format (and more): A PDF contains geometric coordinates, embedded fonts, text layers, and vector paths. Turning a PDF page into a JPEG requires a highly complex rasterization engine.

This is where ImageMagick (via the PHP Imagick extension) stepped in. It historically delegated this heavy lifting to Ghostscript, a powerful PostScript interpreter installed at the OS level. For years, this was the only viable path: PHP passed the PDF to Imagick, Ghostscript chewed through it on the server, and spat out the first page as a JPEG.

The Hidden Cost on Your Server

While this server-side approach works, it introduces three major pain points in production environments and on shared hosting:

  1. Devastating Resource Consumption: Rasterization via Ghostscript is incredibly RAM and CPU intensive. If multiple users upload files simultaneously, your server’s resources evaporate instantly.
  2. Security and Maintenance Risk: Configuring and maintaining Ghostscript on Linux servers can be a security nightmare, especially given its history of vulnerabilities, which has led many distributions and hosting providers to disable PDF reading in ImageMagick entirely.
  3. Synchronous User Experience: The user has to sit and wait for the heavy server-side processing to finish before they can even see a thumbnail, resulting in frustratingly long loading times.

The Modern Solution: “Client-Driven” Architecture

Modern browsers possess incredible computing power. Why not offload the heavy lifting to the client?

By leveraging PDF.js (Mozilla’s brilliant open-source library that powers Firefox’s native PDF viewer), we can make the user’s browser download the PDF, render the first page inside an HTML5 <canvas>, and generate the image locally.

Once generated, the preview image is sent back to the server via a simple AJAX request. The PHP backend performs zero rendering calculations; it just receives a ready-to-go JPEG and saves it. Minimal effort, maximum efficiency.

1. The Frontend (JavaScript)

The strategy is to intercept the rendering of the first page using PDF.js, extract the Blob from the canvas, and POST it to our backend.


// Rendering the first page on a Canvas using PDF.js
pdfDoc.getPage(1).then(page => {
    const viewport = page.getViewport({ scale: 1.0 });
    const canvas = document.createElement('canvas');
    const context = canvas.getContext('2d');
    
    canvas.width = viewport.width;
    canvas.height = viewport.height;

    const renderContext = { canvasContext: context, viewport: viewport };
    
    // Execute visual rendering on the canvas
    page.render(renderContext).promise.then(() => {
        
        // THE TRICK: Convert the canvas into a compressed JPEG on the client side
        canvas.toBlob(function(blob) {
            if (!blob) return;

            // Prepare the payload for the async request
            const formData = new FormData();
            formData.append('file_id', '12345'); // Indicative file ID
            formData.append('my_lazy_preview', blob, 'lazy_preview.jpg');

            // Send the preview to the server (Lazy Preview)
            fetch('ajax-save-lazy-preview.php', {
                method: 'POST',
                body: formData
            })
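            .then(r => r.json())
            .then(data => {
                // The PHP endpoint reports failures in the JSON body, so check the flag explicitly
                if (!data.success) throw new Error(data.error || 'Preview not saved');
                console.log("Preview saved successfully!", data);
            })
            .catch(err => console.error("Preview upload failed", err));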
            
        }, 'image/jpeg', 0.85); // 0.85 strikes the perfect balance between quality and file size
    });
});
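
The snippet above assumes a pdfDoc object already exists, and it renders at scale 1.0, so the preview comes out as large as the page itself. Below is a minimal sketch of one way to obtain the document and cap the preview width. The names previewScale and loadFirstPageViewport and the 600-pixel default are hypothetical, chosen only for illustration; getDocument, getPage, and getViewport are the PDF.js calls. The library object is passed in as a parameter so the sketch does not assume how PDF.js is loaded on your page.

```javascript
// Hypothetical helper: choose a render scale so the preview is at most `maxWidth` pixels wide.
function previewScale(pageWidth, maxWidth = 600) {
    if (!(pageWidth > 0) || !(maxWidth > 0)) {
        throw new RangeError('pageWidth and maxWidth must be positive numbers');
    }
    // Never upscale small pages; only shrink large ones
    return Math.min(1, maxWidth / pageWidth);
}

// Hypothetical helper: load the PDF and return the first page with a capped viewport.
// `pdfjsLib` is the PDF.js library object, injected by the caller.
async function loadFirstPageViewport(pdfjsLib, url, maxWidth = 600) {
    const pdfDoc = await pdfjsLib.getDocument(url).promise;
    const page = await pdfDoc.getPage(1);

    // Measure the page at scale 1.0, then derive the scale that fits maxWidth
    const unscaled = page.getViewport({ scale: 1.0 });
    const viewport = page.getViewport({ scale: previewScale(unscaled.width, maxWidth) });

    return { pdfDoc, page, viewport };
}
```

The returned page and viewport can be fed to the canvas code above in place of the fixed scale of 1.0. A smaller canvas also keeps the JPEG well under any upload size limit enforced by the backend.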

2. The Backend (PHP)

Because the heavy computational work happened in the user’s browser, our PHP script becomes incredibly lightweight and secure. No Imagick required, no Ghostscript needed. Just native, standard PHP functions:


<?php
// ajax-save-lazy-preview.php

header('Content-Type: application/json');

if ($_SERVER['REQUEST_METHOD'] === 'POST' && isset($_FILES['my_lazy_preview'])) {
    $fileId = intval($_POST['file_id'] ?? 0);
    $file = $_FILES['my_lazy_preview'];

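    // Basic security validation (upload status, size, and real content type).
    // $file['type'] is sent by the client and can be spoofed, so inspect the file itself.
    $finfo = new finfo(FILEINFO_MIME_TYPE);
    if ($fileId <= 0
        || $file['error'] !== UPLOAD_ERR_OK
        || $file['size'] > 2 * 1024 * 1024
        || $finfo->file($file['tmp_name']) !== 'image/jpeg') {
        echo json_encode(['success' => false, 'error' => 'Invalid file ID, file type, or file too large']);
        exit;
    }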

    // Destination path (indicative)
    $targetDir = __DIR__ . "/previews/";
    $targetFile = $targetDir . "preview_" . $fileId . ".jpg";

    // Store the image directly without reprocessing it
    if (move_uploaded_file($file['tmp_name'], $targetFile)) {
        // Optional: Update your database here to flag that this file now has a preview
        echo json_encode(['success' => true, 'message' => 'Preview saved!']);
    } else {
        echo json_encode(['success' => false, 'error' => 'Failed to save file']);
    }
    exit;
}

echo json_encode(['success' => false, 'error' => 'Invalid request']);

Done! 🥳


Handling Edge Cases Safely: Do you know Dokky Script?

Dokky Suite - Self Hosted Document management and much more

Implementing this workflow from scratch works perfectly for smaller, straightforward projects. However, if you are building an ecosystem centered around documentation or complex file management, tedious edge cases will eventually pop up: What if the user scrolls through pages too quickly? How do we prevent browser memory leaks when handling 500-page PDFs? How do we handle secure token-based streaming?

If you are developing a document-centric system and need a production-ready, highly optimized solution, it’s worth looking into Dokky script.

Dokky is a self-hosted PHP script specifically engineered for managing documentation. It natively integrates this exact Lazy Preview mechanism, using an IntersectionObserver to monitor visible pages, dynamically unloading distant ones to free up client RAM, and automating async preview generation directly to your backend.

Instead of manually wrestling with canvas lifecycles and server-side configurations, you can explore the Dokky Live Demo to see it in action, or check out the Project Description Page to see how it can slot cleanly into your current development stack.

How to Display PDFs from IPFS on the Web (Without Losing Your Mind Over Gateways)

 

If you have ever tried to integrate IPFS (InterPlanetary File System) into a traditional web application, you’ve likely stumbled upon the worst-kept secret of decentralization: public gateway latency.

The core concept is brilliant: you upload a PDF, get a unique and immutable CID (Content Identifier), and you’re good to go. The file is secure, distributed, and tamper-proof. But the moment you need to pass that file to, for example, a browser-based PDF viewer, a major hurdle arises: Which public gateway will respond first? And what if that specific gateway is down today?

How to Render PDFs from IPFS on the Web

Today, we’ll explore how I solved this issue on the backend, optimizing performance using a parallel gateway racing system and smart caching.

The Problem: The IPFS Gateway Lottery

To display an IPFS file inside an iframe or a JavaScript viewer, we need a public HTTP gateway (such as Cloudflare, Pinata, or Web3.Storage) to translate the IPFS protocol into a standard https:// URL.

The catch is that public gateways are highly unpredictable:

  • A gateway might be lightning-fast right now and painfully slow five minutes later.
  • Some will time out if the file hasn’t fully propagated across the network yet.
  • Hardcoding a single gateway introduces a single point of failure (SPOF).

The solution? Pit the gateways against each other.
First one to respond wins.

Architecture of the Solution (In a Nutshell…)

To guarantee maximum loading speeds for the document viewer, we have implemented a three-step strategy:

  1. Session Memory: The backend remembers the last gateway that responded successfully during the user’s active session.
  2. Parallel Gateway Racing: If the preferred gateway fails or lags, the backend queries a list of alternative gateways simultaneously using asynchronous HTTP requests.
  3. Smart Caching & Fallback: The backend caches the winning URL (e.g., via SQLite or JSON) to avoid re-running the race on every page refresh. If the entire IPFS network is unreachable, it falls back to a locally stored file.

Implementation: The curl_multi_init Trick

Here is a conceptual snippet showing how to implement parallel checks in PHP. Instead of testing gateways sequentially (which would take ages), we fire off HEAD requests all at once.


// A list of public gateways to put to the test
$default_gateways = [
    'https://gateway.pinata.cloud/ipfs',
    'https://cloudflare-ipfs.com/ipfs',
    'https://w3s.link/ipfs'
];

function checkIPFSGatewayParallel($cid, $gateways) {
    $mh = curl_multi_init();
    $chs = [];

    // Prepare asynchronous requests (HEAD only, to keep it lightning fast)
    foreach ($gateways as $gateway) {
        $url = rtrim($gateway, '/') . '/' . $cid;
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_NOBODY, true); // Don't download the PDF, just look for a 200 OK
        curl_setopt($ch, CURLOPT_TIMEOUT, 2);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        
        curl_multi_add_handle($mh, $ch);
        $chs[$gateway] = $ch;
    }

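    // Execute handles in parallel and stop at the first HTTP 200
    $active = null;
    $found_url = false;
    do {
        $status = curl_multi_exec($mh, $active);
        if ($active) {
            curl_multi_select($mh, 0.1); // Wait for socket activity instead of busy-looping
        }

        // Inspect every transfer that has finished so far
        while ($found_url === false && ($info = curl_multi_info_read($mh)) !== false) {
            $ch = $info['handle'];
            if ($info['result'] === CURLE_OK && curl_getinfo($ch, CURLINFO_HTTP_CODE) === 200) {
                // The first gateway to return HTTP Code 200 wins the race!
                $found_url = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
            }
        }
    } while ($active && $status === CURLM_OK && $found_url === false);

    // Clean up every handle, including requests still in flight
    foreach ($chs as $ch) {
        curl_multi_remove_handle($mh, $ch);
        curl_close($ch);
    }
    curl_multi_close($mh);

    return $found_url;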
}

Why is this approach so efficient?

By setting CURLOPT_NOBODY to true, we issue a HEAD request. We aren’t downloading the actual PDF (which could be several megabytes); we are merely asking the gateway: “Hey, do you have this CID ready to serve?” The first one to reply with a 200 OK wins the right to serve the file to our viewer.
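
For readers more at home on the JavaScript side, the same "first 200 wins" idea can be sketched with Promise.any. This is an illustration of the racing technique only, not the PHP implementation described in this article. The function name raceGateways, the fetchImpl option, and the 2000 ms default are assumptions made for the example; it expects a fetch-compatible function such as the global fetch in Node 18+ and is meant to run server-side, where browser CORS rules do not apply.

```javascript
// Hypothetical helper: probe all gateways at once and return the first URL that answers 200.
// Resolves to false when every gateway fails or the timeout expires.
async function raceGateways(cid, gateways, { fetchImpl = fetch, timeoutMs = 2000 } = {}) {
    const controller = new AbortController();
    const timer = setTimeout(() => controller.abort(), timeoutMs);

    const probes = gateways.map(async (gateway) => {
        const url = gateway.replace(/\/+$/, '') + '/' + cid;
        // HEAD request: ask whether the CID is available without downloading the PDF
        const response = await fetchImpl(url, { method: 'HEAD', signal: controller.signal });
        if (response.status !== 200) {
            throw new Error(`${url} answered ${response.status}`);
        }
        return url;
    });

    try {
        // Promise.any fulfills with the first successful probe and ignores earlier rejections
        return await Promise.any(probes);
    } catch {
        return false;
    } finally {
        clearTimeout(timer);
        controller.abort(); // Cancel the probes that lost the race
    }
}
```

A call such as raceGateways(cid, gatewayList) would then yield either the winning URL or false, mirroring the return value of checkIPFSGatewayParallel above.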

Here is the overall logical flow. Within the application lifecycle, verifying the CID follows an optimized pipeline:

1 – Cache Check: Has this CID been linked to a working gateway in the last hour? If yes, use it.
2 – Fast Track: Try the gateway stored in the user’s session ($_SESSION['preferred_gateway']). This costs a single lightweight request.
3 – The Race: If the Fast Track fails, the parallel race described above triggers. The winner is saved to both the session and the cache.
4 – Hard Fallback: If no gateway responds (for example, because the IPFS network is experiencing temporary congestion), the system switches to the local file system (uploads/file.pdf), ensuring the user always sees their document.
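
The four steps above can be sketched as a single resolver function. This is a JavaScript illustration of the control flow only, under stated assumptions: resolveDocumentUrl is a hypothetical name, cache is a Map, session is a plain object standing in for the PHP session, probe(url) resolves to true when a URL answers 200, and race(cid) resolves to a winning URL or false. None of these names come from the original implementation.

```javascript
const ONE_HOUR_MS = 60 * 60 * 1000;

// Hypothetical resolver implementing: cache check, fast track, race, hard fallback.
async function resolveDocumentUrl(cid, { cache, session, probe, race, localUrl, now = Date.now }) {
    // 1 - Cache Check: reuse a URL that worked within the last hour
    const cached = cache.get(cid);
    if (cached && now() - cached.savedAt < ONE_HOUR_MS) {
        return cached.url;
    }

    // 2 - Fast Track: try the gateway remembered in the session
    if (session.preferredGateway) {
        const url = `${session.preferredGateway}/${cid}`;
        if (await probe(url)) {
            return url;
        }
    }

    // 3 - The Race: query all gateways, then remember the winner in session and cache
    const winner = await race(cid);
    if (winner) {
        session.preferredGateway = winner.slice(0, winner.length - cid.length - 1);
        cache.set(cid, { url: winner, savedAt: now() });
        return winner;
    }

    // 4 - Hard Fallback: serve the locally stored copy
    return localUrl;
}
```

Passing the dependencies in as arguments keeps each step replaceable, so the cache could be backed by SQLite or a JSON file without touching the flow itself.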

Once the definitive URL is resolved (whether from IPFS or local storage), you simply pass it to your client-side viewer (such as an iframe or a JavaScript viewer), and you are good to go.

Want to Skip the Headache?

Dealing with cURL timeouts, configuring caching databases, managing local fallbacks, and tweaking viewers to avoid CORS issues with IPFS gateways can quickly turn into a micro-management nightmare.

If you want to integrate an immutable, decentralized, and blazing-fast Document Management System into your projects without writing, testing, and maintaining all this backend boilerplate… well, there’s Dokky Suite. It handles exactly this (and a whole lot more) natively, with just one click. 😉

Dokky Suite v2.3.0: Lighter, More Self-Contained, Ready for Growth

 

With version 2.3.0, Dokky Suite Script takes a significant step toward a clear objective: offering a Document Management and Collaboration Platform that is simpler to install, easier to manage, and more robust for daily use.

This release introduces significant improvements, both under the hood and in the user experience, with a precise focus on autonomy, portability, and overall quality. The result is a Suite that is more modern, more flexible, and, above all, better suited to real-world environments where reliability, speed, and deployment freedom are paramount.

Dokky Suite - where Documents meet Possibilities

One of the most significant changes concerns PDF preview generation.
We have eliminated the server-side dependency on ImageMagick, thereby simplifying the architecture and tangibly reducing installation and compatibility hurdles. This makes Dokky easier to deploy even in resource-constrained environments, without sacrificing essential functionality.

Another major step forward is our new approach to OCR (Optical Character Recognition).
The system has been redesigned as a native, self-hosted process with no mandatory external dependencies. In practical terms, Dokky can now handle text extraction autonomously while retaining the ability to integrate with third-party services whenever necessary. The advantage is clear: greater control for administrators and greater flexibility for end users.

On the user experience front, we have enhanced the document editing page and various user-specific options, with the aim of making daily workflows clearer and faster. Every detail has been carefully crafted to help those working with documents achieve more with fewer steps.

The search engine has also received a major update.
Both global search and tag-based search functions have been improved to deliver more precise and consistent results by more effectively cross-referencing available information. This means finding what truly matters, and finding it faster, even as your document archive grows larger and more complex.

We have also updated the installer, making it clearer and more effective at checking system requirements.
This translates into a simpler onboarding process and a more reliable initial setup, two fundamental aspects for anyone looking to get started without wasting time.

On the administrative side, the global Dashboard has been updated, along with the MySQL database, translations, and FAQs.
Rounding out the release is a series of minor bug fixes and performance enhancements that contribute to a smoother and more stable overall user experience.

Dokky Suite v2.3.0 is a release designed to combine power with simplicity.
Fewer dependencies, greater autonomy, improved search capabilities, and enhanced document control: everything converges to create a more mature and competitive Platform.

If you are looking for a self-hosted solution to manage, publish, and distribute documents using a modern, flexible approach, this is the perfect release to discover its value.

Discover Dokky Suite, try the Live Demo, and see how it can fit into your workflow.
If you want a Document Management Platform that combines autonomy, portability, and control, Dokky is ready for the next step!