mirror of
https://github.com/ArchiveBox/ArchiveBox.git
synced 2026-04-25 09:06:02 +03:00
[GH-ISSUE #990] Feature Request: Add MERCURY_ARGS extractor option to enable saving article text as .md markdown #2128
Originally created by @cmuc24 on GitHub (Jun 16, 2022).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/990
Feature Request, a short description related to extractors, result items
As an admin, I need a Markdown output format, similar to the existing readability format,
so that .md items can be picked up by further, external downstream processes.
the ideal specific solution
As a user, I can define MD as an (additional) format for the items I archive,
so that these items flow into our publishing workflow and, as a result,
back into our commonly used MD-based knowledge base.
This would be useful for applications like MkDocs, Trilium, or other MD-based applications.
current solution
At the moment an MD web clipper, installed as a browser extension, does the job and offers some additional options as well.
https://github.com/deathau/markdownload (MIT and Apache2 licenses only)
how important
As far as I can see (I'm not certain), a twin of the readability extractor, paired with the good parts of e.g. the .md web clipper mentioned above, could be low-hanging fruit :-)
Edit: if the primary focus is on converting, then an info.json could solve it (like yt-download does).
Low-code users like me can work better with lightweight formats :)
Personally, I'd say it's an improvement that can increase efficiency, and not only in downstream processes.
@ntevenhere commented on GitHub (Sep 12, 2022):
The mercury extractor has --format=markdown. In ArchiveBox the option is set to --format=text. You just need a way to change the argument.
MERCURY_ARGS config option when? 😄 (I could work on that actually)
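To make the suggestion above concrete, here is a minimal sketch of how a MERCURY_ARGS option could work, assuming it is read from an environment variable as a shell-style string. It is not ArchiveBox's actual implementation: the helper names (`get_mercury_args`, `build_mercury_cmd`, `output_extension`), the `mercury-parser` binary name, and the format-to-extension mapping are all hypothetical. Only the `--format=text` default and the `--format=markdown` flag come from the comment above.

```python
import os
import shlex

# Default taken from the comment above: the format is currently set to text.
DEFAULT_MERCURY_ARGS = ["--format=text"]


def get_mercury_args(env=None):
    """Return user-supplied args from MERCURY_ARGS, else the default (hypothetical helper)."""
    env = os.environ if env is None else env
    raw = env.get("MERCURY_ARGS", "").strip()
    # shlex.split handles quoting the same way a POSIX shell would.
    return shlex.split(raw) if raw else list(DEFAULT_MERCURY_ARGS)


def build_mercury_cmd(binary, url, env=None):
    """Build the argv list for the extractor subprocess (hypothetical helper)."""
    return [binary, url, *get_mercury_args(env)]


def output_extension(args):
    """Pick a file extension matching the requested format (assumed mapping)."""
    for arg in args:
        if arg.startswith("--format="):
            fmt = arg.split("=", 1)[1]
            return {"markdown": ".md", "html": ".html"}.get(fmt, ".txt")
    return ".txt"


if __name__ == "__main__":
    env = {"MERCURY_ARGS": "--format=markdown"}
    print(build_mercury_cmd("mercury-parser", "https://example.com", env=env))
    print(output_extension(get_mercury_args(env)))
```

With a setup like this, setting `MERCURY_ARGS="--format=markdown"` would change only the flag passed to the extractor. The output file extension would still need to follow the chosen format, which is what the last helper hints at.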