[GH-ISSUE #2501] Bug: PDF and Page Archives served as plain text (Missing Content-Type header) in v0.30.0 #1499

Open
opened 2026-03-02 11:57:41 +03:00 by kerem · 6 comments

Originally created by @MCQSJ on GitHub (Feb 22, 2026).
Original GitHub issue: https://github.com/karakeep-app/karakeep/issues/2501

Describe the Bug

Description

When trying to preview or download PDF assets or full-page archives, the browser renders them as raw plain text instead of displaying the PDF or HTML content correctly.

When clicking the download button, the browser navigates to a URL that displays the entire file as text. If I use "Save link as" in Edge, the default file extension is incorrectly suggested as .txt. I have to manually rename the extension to .pdf or .html to view the content properly.

Steps to Reproduce

  1. Use Karakeep (Hoarder) version 0.30.0.
  2. Navigate to a bookmark with a PDF attachment or a "Full Page Archive".
  3. Click on the "Preview" or "Download" action for that asset.
  4. Observation: The browser displays raw text/code. The Content-Type header is missing from the response.

Steps to Reproduce

  1. Log into Karakeep (Hoarder) version 0.30.0.
  2. Navigate to a bookmark that contains a PDF attachment or has a "Full Page Archive" generated.
  3. Click the "Preview" icon or the "Download" button for the specific PDF/Archive asset.
  4. Observe that the browser opens a new tab/window showing raw text content instead of the rendered file.
  5. Check the Network tab in DevTools; notice the absence of a Content-Type header in the response from /api/assets/....
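
Step 5 can also be checked outside the browser. The following is a minimal sketch, not part of the original report: it requests an asset URL with Python's standard library and reports whether the response carries a Content-Type header. The helper names and the optional bearer-token argument are assumptions made for illustration; how a given instance authenticates API requests may differ.

```python
import urllib.request
from typing import Dict, Mapping, Optional


def describe_content_type(headers: Mapping[str, str]) -> str:
    """Return the Content-Type value, or "missing" if it is absent or empty."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    value = lowered.get("content-type", "").strip()
    return value if value else "missing"


def fetch_headers(url: str, token: Optional[str] = None) -> Dict[str, str]:
    """Fetch the response headers for `url` (hypothetical helper).

    `token` is an assumed bearer credential; adapt this to your own setup.
    """
    request = urllib.request.Request(url, method="GET")
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request) as response:
        return dict(response.headers.items())
```

Calling `describe_content_type(fetch_headers(url))` with the URL of an affected asset should return "missing" if the header is absent, matching what DevTools shows.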

Expected Behaviour

  1. When clicking "Preview" for a PDF, the browser's built-in PDF viewer should open and render the file correctly.
  2. When clicking "Preview" for a Page Archive, the archived HTML should be rendered as a webpage.
  3. When clicking "Download", the browser should trigger a file download with the correct file extension (.pdf or .html) instead of opening it as a text page.
  4. The server should include the correct Content-Type header (e.g., application/pdf or text/html) in the API response.

Screenshots or Additional Context

This issue affects both the built-in previewer and the direct download links via the API. The server should explicitly set the Content-Type based on the stored asset type.
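
To make the request concrete, here is a minimal sketch of choosing response headers from a stored asset type. It is written in Python purely for illustration and is not Karakeep's actual implementation: the asset kind names, the mapping tables, and the `build_asset_headers` function are all hypothetical.

```python
from typing import Dict

# Hypothetical mapping from a stored asset kind to a MIME type.
# The keys are illustrative and are not Karakeep's real asset type names.
MIME_BY_ASSET_KIND: Dict[str, str] = {
    "pdf": "application/pdf",
    "page_archive": "text/html",
}

# Matching file extensions, used to suggest a sensible download name.
EXTENSION_BY_ASSET_KIND: Dict[str, str] = {
    "pdf": ".pdf",
    "page_archive": ".html",
}


def build_asset_headers(asset_kind: str, base_name: str, download: bool) -> Dict[str, str]:
    """Build Content-Type and Content-Disposition headers for an asset."""
    # Fall back to a generic binary type instead of omitting the header.
    content_type = MIME_BY_ASSET_KIND.get(asset_kind, "application/octet-stream")
    extension = EXTENSION_BY_ASSET_KIND.get(asset_kind, "")
    # "inline" lets the browser render the asset; "attachment" forces a download.
    disposition = "attachment" if download else "inline"
    # Naive sanitizing: drop characters that would break the quoted filename.
    safe_name = (base_name + extension).replace('"', "").replace("\\", "")
    return {
        "Content-Type": content_type,
        "Content-Disposition": f'{disposition}; filename="{safe_name}"',
    }
```

With headers like these, a preview request would use `download=False` and a download request `download=True`, so the suggested file extension would follow the asset type rather than defaulting to .txt.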

[Three screenshots attached to the original issue]

Device Details

OS: Windows 11
Browser: Microsoft Edge 145.0.3800.65

Exact Karakeep Version

Karakeep v0.30.0

Environment Details

Docker, Windows, OpenResty

Debug Logs

Technical Logs (Edge Console)

The request to the asset API returns a 200 OK but lacks the necessary MIME type header.

Request URL: https://book.home.com/api/assets/e4e4acd4-0a20-4904-9ed7-756443ae8a7f
Key Response Headers:

HTTP/1.1 200 OK
access-control-allow-origin: *
cache-control: private, max-age=31536000, immutable
content-length: 1883575
content-security-policy: sandbox; default-src 'none'; base-uri 'none'; form-action 'none'; img-src https: data: blob:; style-src 'unsafe-inline' https:; connect-src 'none'; media-src https: data: blob:; object-src 'none'; frame-src 'none'
x-content-type-options: nosniff
# Missing Content-Type (e.g., application/pdf or text/html)

Have you checked the troubleshooting guide?

- [x] I have checked the troubleshooting guide and I haven't found a solution to my problem

@ElectricTea commented on GitHub (Feb 23, 2026):

I do not have the ability to test this on Karakeep 0.30.0 and Windows Edge. I was unable to replicate this bug on Karakeep 0.31.0 using Linux Firefox/Chrome, where the content-type application/pdf header is present.


@MCQSJ commented on GitHub (Feb 28, 2026):

> I do not have the ability to test this on Karakeep 0.30.0 and Windows Edge. I was unable to replicate this bug on Karakeep 0.31.0 using Linux Firefox/Chrome, where the content-type application/pdf header is present.

I can still reproduce this issue in Karakeep 0.31.0, even when using Edge in private mode to rule out plugin interference. The application only uses OpenResty as a reverse proxy, yet the problem persists. Both PDFs and full-page archives are displayed as plain text when previewed or downloaded.


@MohamedBassem commented on GitHub (Feb 28, 2026):

Does the problem reproduce if you hit Karakeep directly, without the reverse proxy?


@MCQSJ commented on GitHub (Feb 28, 2026):

> does the problem reproduce if you hit karakeep directly without the reverse proxy?

I tested it, and the issue persists even without enabling the reverse proxy.

[Two screenshots attached to the original comment]
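
One way to separate a proxy problem from an application problem is to capture the response headers from both paths and compare them. The sketch below is an illustration only and was not posted in the thread; `diff_headers` is a hypothetical helper that takes two header mappings, however they were collected, and lists the headers whose values differ.

```python
from typing import Dict, Mapping, Tuple


def diff_headers(
    direct: Mapping[str, str], proxied: Mapping[str, str]
) -> Dict[str, Tuple[str, str]]:
    """Return headers that differ as {name: (direct_value, proxied_value)}.

    A header absent on one side is reported with an empty string.
    """
    # Header names are case-insensitive, so compare them in lowercase.
    a = {name.lower(): value.strip() for name, value in direct.items()}
    b = {name.lower(): value.strip() for name, value in proxied.items()}
    differences: Dict[str, Tuple[str, str]] = {}
    for name in sorted(set(a) | set(b)):
        left, right = a.get(name, ""), b.get(name, "")
        if left != right:
            differences[name] = (left, right)
    return differences
```

If `content-type` appears in the result with a value only on the direct side, the proxy is the likely culprit. If it is missing from both responses, as reported here, the proxy is unlikely to be stripping it.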

@MohamedBassem commented on GitHub (Feb 28, 2026):

Is this happening for all bookmarks or only some of them?


@MCQSJ commented on GitHub (Mar 1, 2026):

> is this happening for all bookmarks or only some of them?

This is happening for all bookmarks.
