[PR #216] [MERGED] feature request: pdf support #28 #1559

Closed
opened 2026-03-02 11:58:11 +03:00 by kerem · 0 comments

📋 Pull Request Information

Original PR: https://github.com/karakeep-app/karakeep/pull/216
Author: @kamtschatka
Created: 6/10/2024
Status: Merged
Merged: 6/22/2024
Merged by: @MohamedBassem

Base: main ← Head: pdf-support-asset


📝 Commits (4)

  • ddb5766 feature request: pdf support #28
  • d0dd32f remove pdf parsing from the crawler
  • 17f5324 extract the http logic into its own function to avoid duplicating the post-processing actions (openai/index)
  • bc52a0e Add 5s timeout to the content type fetch

📊 Changes

10 files changed (+1263 additions, -93 deletions)

View changed files

📝 .gitignore (+4 -0)
📝 apps/web/components/dashboard/preview/BookmarkPreview.tsx (+14 -2)
📝 apps/workers/crawlerWorker.ts (+163 -57)
➕ packages/db/drizzle/0023_late_night_nurse.sql (+1 -0)
➕ packages/db/drizzle/meta/0023_snapshot.json (+1022 -0)
📝 packages/db/drizzle/meta/_journal.json (+7 -0)
📝 packages/db/schema.ts (+35 -28)
📝 packages/shared/assetdb.ts (+14 -6)
📝 packages/shared/types/bookmarks.ts (+1 -0)
📝 packages/trpc/routers/bookmarks.ts (+2 -0)

📄 Description

Added a new sourceUrl column to asset bookmarks.
Link bookmarks that point at a PDF are now transformed into asset bookmarks.
The "View Original" link is now also shown for asset bookmarks that have a sourceUrl.
Updated .gitignore for IDEA.
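
The link-to-asset transformation described above, together with the 5s content-type timeout from commit bc52a0e, could be sketched roughly as follows. This is only an illustration under assumptions, not the code merged in apps/workers/crawlerWorker.ts: the names `isPdfContentType`, `getContentType`, `planCrawl`, `FetchLike`, and `CrawlPlan` are hypothetical, and the use of a HEAD request is an assumption about how the content type might be fetched.

```typescript
// Hypothetical sketch; names and the HEAD-request approach are assumptions,
// not taken from the PR's actual crawlerWorker.ts.
type FetchLike = (
  url: string,
  init: { method: string; signal: AbortSignal },
) => Promise<{ headers: { get(name: string): string | null } }>;

// Mirrors the "5s timeout" mentioned in commit bc52a0e.
const CONTENT_TYPE_TIMEOUT_MS = 5_000;

/** Returns true when a Content-Type header value denotes a PDF. */
function isPdfContentType(contentType: string | null): boolean {
  if (!contentType) return false;
  // Drop parameters such as "; charset=binary" before comparing.
  const mimeType = contentType.split(";")[0].trim().toLowerCase();
  return mimeType === "application/pdf";
}

/** Fetches only the headers; resolves to null on network failure or timeout. */
async function getContentType(
  url: string,
  fetchImpl: FetchLike,
  timeoutMs: number = CONTENT_TYPE_TIMEOUT_MS,
): Promise<string | null> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetchImpl(url, {
      method: "HEAD",
      signal: controller.signal,
    });
    return response.headers.get("content-type");
  } catch {
    // Treat errors and aborts alike: fall back to normal link crawling.
    return null;
  } finally {
    clearTimeout(timer);
  }
}

// What the crawler should do with a bookmark URL.
type CrawlPlan =
  | { kind: "link"; url: string }
  | { kind: "asset"; assetType: "pdf"; sourceUrl: string };

/** Decides whether a link bookmark should become a PDF asset bookmark. */
async function planCrawl(
  url: string,
  fetchImpl: FetchLike,
  timeoutMs: number = CONTENT_TYPE_TIMEOUT_MS,
): Promise<CrawlPlan> {
  const contentType = await getContentType(url, fetchImpl, timeoutMs);
  return isPdfContentType(contentType)
    ? { kind: "asset", assetType: "pdf", sourceUrl: url }
    : { kind: "link", url };
}
```

In a runtime with a global `fetch`, the call would look like `await planCrawl(bookmarkUrl, fetch)`. Keeping the original URL as `sourceUrl` on the asset plan is what would let the preview keep showing a "View Original" link after the conversion.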


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.
