[PR #88] [MERGED] feature: Add PDF support #1524

Closed
opened 2026-03-02 11:58:01 +03:00 by kerem · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/karakeep-app/karakeep/pull/88
Author: @ahmadmucom
Created: 4/8/2024
Status: Merged
Merged: 4/11/2024
Merged by: @MohamedBassem

Base: main ← Head: main


📝 Commits (10+)

  • 362a30f feature: Add PDF support
  • 636ffdf fix: PDF feature enhancements
  • 6b836dc fix: Freeze expo-share-intent version to prevent breaking changes
  • a6fb5b6 fix: set endOfLine to auto for cross-platform development
  • 4520eb6 fix: Upgrading eslint/parser and eslint-plugin to 7.6.0 to solve the linting issues
  • beeaf5a fix: enhancing PDF feature
  • 8fddd02 Merge remote-tracking branch 'upstream/main'
  • 546d15e fix: Allowing null in fiename for backward compatibility
  • 8dd8d3a fix: update pnpm file with pnpm 9.0.0-alpha-8
  • e60b241 fix:(web): PDF Preview for web

📊 Changes

24 files changed (+2387 additions, -107 deletions)


📝 apps/mobile/lib/upload.ts (+4 -2)
📝 apps/mobile/package.json (+1 -1)
📝 apps/web/app/api/assets/route.ts (+5 -2)
📝 apps/web/components/dashboard/UploadDropzone.tsx (+4 -3)
📝 apps/web/components/dashboard/bookmarks/AssetCard.tsx (+7 -0)
📝 apps/web/components/dashboard/preview/AssetContentSection.tsx (+22 -17)
📝 apps/workers/openaiWorker.ts (+57 -12)
📝 apps/workers/package.json (+2 -0)
📝 apps/workers/searchWorker.ts (+7 -0)
📝 apps/workers/utils.ts (+32 -0)
➕ packages/db/drizzle/0015_first_reavers.sql (+3 -0)
➕ packages/db/drizzle/0016_shallow_rawhide_kid.sql (+2 -0)
➕ packages/db/drizzle/meta/0015_snapshot.json (+959 -0)
➕ packages/db/drizzle/meta/0016_snapshot.json (+959 -0)
📝 packages/db/drizzle/meta/_journal.json (+14 -0)
📝 packages/db/schema.ts (+4 -1)
📝 packages/shared/assetdb.ts (+1 -0)
📝 packages/shared/search.ts (+2 -0)
📝 packages/trpc/routers/bookmarks.ts (+3 -0)
📝 packages/trpc/types/bookmarks.ts (+2 -1)

...and 4 more files

📄 Description

  • Allowed PDF uploads on both mobile and web.
  • The asset type is now sent to the server dynamically, based on the file's MIME type, from both mobile and web.
  • Added PDF viewing in an iframe on both mobile and web.
  • Added the pdf-parse library to parse PDF files.
  • Added content, info, and metadata fields to the bookmarkAssets table to store the PDF parsing results so they can be indexed in the future. The content field could later also store a generated description of an image.
  • Added inferTagsFromPDF, which parses the PDF, updates the bookmarkAssets row, and then infers tags from text that includes the PDF content.
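
As a rough illustration of the MIME-type-based dispatch described above, here is a minimal sketch. The names `AssetType`, `MIME_TO_ASSET_TYPE`, and `assetTypeFromMime` are hypothetical, and the list of accepted MIME types is an assumption; the code actually merged in this PR may look quite different.

```typescript
// Hypothetical sketch: names and the allow-list below are assumptions,
// not taken from the PR's actual code.
type AssetType = "image" | "pdf";

// A Map avoids accidental hits on Object.prototype keys such as "constructor".
const MIME_TO_ASSET_TYPE = new Map<string, AssetType>([
  ["image/jpeg", "image"],
  ["image/png", "image"],
  ["image/webp", "image"],
  ["application/pdf", "pdf"],
]);

// Returns the asset type for a MIME type, or null if the type is not supported.
function assetTypeFromMime(mimeType: string): AssetType | null {
  // Drop parameters such as "; charset=binary" and normalize case before lookup.
  const essence = (mimeType.split(";")[0] ?? "").trim().toLowerCase();
  return MIME_TO_ASSET_TYPE.get(essence) ?? null;
}
```

A client could call `assetTypeFromMime(file.type)` before uploading and reject the file when the result is `null`, so that mobile and web apply the same rule.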

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.
