mirror of
https://github.com/ArchiveBox/ArchiveBox.git
synced 2026-04-25 09:06:02 +03:00
[GH-ISSUE #244] Behavior: Use relative paths in index.json metadata to avoid leaking full filesystem layout information #1680
Originally created by @Syd on GitHub (May 26, 2019).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/244
ArchiveBox has quite a bit of potential for researchers, activists, journalists, etc. However, it currently leaks quite a bit of information about directory structures, usernames, and more, unnecessarily.
An example from index.json:
"/home/{username}/ArchiveBox/output/sources/stdin-1558782247.txt"
There is no real reason to give away directory structure and usernames in public archives. Some users are likely to be targeted by those unhappy about what is being archived, and knowing the webserver layout and the environment of the archiver itself makes that easier.
@pirate commented on GitHub (May 27, 2019):
I considered this originally when designing it, but decided against it because I had a use case where I was archiving bookmarks from multiple users and wanted paths like this:
/home/someuser/Downloads/bookmarks.html
/home/someotheruser/Downloads/bookmarks.html
Maybe it could be a config option, or I can just do relative paths by default and let people deal with that issue themselves.
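For illustration only, here is a minimal sketch of what "relative paths by default, with an opt-out" could look like. The function name `relativize` and the `keep_absolute` flag are hypothetical and are not taken from the ArchiveBox codebase. The sketch assumes POSIX-style paths like the ones quoted in this thread.

```python
from pathlib import PurePosixPath


def relativize(path: str, base_dir: str, keep_absolute: bool = False) -> str:
    """Return `path` relative to `base_dir` so the index does not expose
    the full filesystem layout.

    `keep_absolute` is a hypothetical opt-out for setups that want to keep
    full paths (e.g. to tell apart bookmark files from multiple users).
    """
    if keep_absolute:
        return path
    try:
        # Paths inside the output folder become e.g. "sources/stdin-123.txt".
        return PurePosixPath(path).relative_to(base_dir).as_posix()
    except ValueError:
        # Paths outside the output folder: keep only the file name so that
        # usernames and parent directories are not written to the index.
        return PurePosixPath(path).name


print(relativize(
    "/home/user/ArchiveBox/output/sources/stdin-1558782247.txt",
    "/home/user/ArchiveBox/output",
))
```

Note that the file-name fallback collapses `/home/someuser/Downloads/bookmarks.html` and `/home/someotheruser/Downloads/bookmarks.html` to the same value, which is exactly the multi-user trade-off described above. That is why the sketch keeps an opt-out.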
@cdvv7788 commented on GitHub (Oct 20, 2020):
With the change to the SQLite database, is this still an issue? The index.json is not generated automatically, and the SQL index does not have absolute paths anywhere. I can double check, but I don't think the web UI has the issue either.
@pirate commented on GitHub (Oct 22, 2020):
I believe absolute paths are still used in the archive/<timestamp>/index.json files, but I could be wrong.
@cdvv7788 commented on GitHub (Oct 22, 2020):
Good point. I will double check.
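One rough way to check this is to walk every archive/<timestamp>/index.json file and report string values that look like absolute paths. The sketch below makes two assumptions: the files are plain JSON, and "absolute path" means a string starting with `/` (a POSIX-only heuristic). The helper names `find_absolute_paths` and `scan_archive` are hypothetical and not part of ArchiveBox.

```python
import json
from pathlib import Path
from typing import Any, Dict, Iterator, List


def find_absolute_paths(value: Any) -> Iterator[str]:
    """Recursively yield string values that start with "/" (heuristic)."""
    if isinstance(value, str):
        if value.startswith("/"):
            yield value
    elif isinstance(value, dict):
        for item in value.values():
            yield from find_absolute_paths(item)
    elif isinstance(value, list):
        for item in value:
            yield from find_absolute_paths(item)


def scan_archive(archive_dir: str) -> Dict[str, List[str]]:
    """Map each <archive_dir>/<timestamp>/index.json to the absolute-looking
    paths found inside it. Files with no hits are omitted."""
    hits: Dict[str, List[str]] = {}
    for index_file in sorted(Path(archive_dir).glob("*/index.json")):
        data = json.loads(index_file.read_text(encoding="utf-8"))
        found = list(find_absolute_paths(data))
        if found:
            hits[str(index_file)] = found
    return hits
```

Calling `scan_archive("archive")` from the output folder would return an empty dict if no per-snapshot index contains such strings.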