mirror of
https://github.com/ArchiveBox/ArchiveBox.git
synced 2026-04-25 09:06:02 +03:00
[GH-ISSUE #663] Question: How to get Chromium Browsing data to ArchiveBox when it is in a VM? #415
Originally created by @voarsh2 on GitHub (Mar 15, 2021).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/663
I'm running Chromium on my desktop, completely separate from my ArchiveBox VM.
How would I get my browser history into it?
The documentation assumes they're both running together?
@pirate commented on GitHub (Mar 15, 2021):
They don't have to be running together, you can copy/mount your CHROME_DATA_DIR inside the VM and pass it to archivebox. Or export your history using https://github.com/ArchiveBox/ArchiveBox/blob/dev/bin/export_browser_history.sh on the desktop and pass the list of URLs to archivebox as text.
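For readers without a bash environment, the export option above can be approximated in Python, which runs on Windows as well. This is only a sketch and not the project's helper script: it assumes the browser history is a SQLite file named History with a `urls` table that has `url` and `last_visit_time` columns (the layout Chrome and Chromium use as far as I know), and the function names `read_history_urls` and `write_url_list` are made up for illustration. It queries a copy of the file rather than the original, on the assumption that this avoids contention with a running browser; whether the copy itself succeeds while the browser is open may depend on the platform.

```python
import shutil
import sqlite3
import tempfile
from pathlib import Path


def read_history_urls(history_db):
    """Return URLs from a Chromium-style History database, newest first.

    Assumes a `urls` table with `url` and `last_visit_time` columns.
    """
    history_db = Path(history_db)
    with tempfile.TemporaryDirectory() as tmp:
        # Query a snapshot so we never hold a lock on the browser's own file.
        snapshot = Path(tmp) / "History.snapshot"
        shutil.copy2(history_db, snapshot)
        conn = sqlite3.connect(snapshot)
        try:
            rows = conn.execute(
                "SELECT url FROM urls ORDER BY last_visit_time DESC"
            ).fetchall()
        finally:
            # Close before the temporary directory is cleaned up.
            conn.close()
    return [row[0] for row in rows]


def write_url_list(urls, out_path):
    """Write one URL per line, i.e. a plain-text list of URLs."""
    Path(out_path).write_text("".join(u + "\n" for u in urls), encoding="utf-8")
```

The resulting text file can then be copied to the machine running ArchiveBox and fed to it as a list of URLs, for example with `archivebox add < urls.txt`.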
@voarsh2 commented on GitHub (Mar 15, 2021):
I can't run .sh on Windows, I need Linux for that.
The VM I refer to... is on another host system, I can't mount my chrome directory from my PC, over the network to another host....
I might be able to install Chrome on a network share... but that's getting messy here....?
@mAAdhaTTah commented on GitHub (Mar 15, 2021):
The new WSL2 should be able to run it. If you're not familiar you can install it here: https://docs.microsoft.com/en-us/windows/wsl/install-win10
@voarsh2 commented on GitHub (Mar 16, 2021):
Lucky I have Windows 10 Pro. The average home user doesn't have this, I believe.
Basically you're suggesting that I run ArchiveBox via WSL2 (Ubuntu), which I will not do. I have my own dedicated hardware. I cannot mount my Chrome data in WSL2 and send it to the remote ArchiveBox VM (over LAN).
If this is so tricky, then I think the project needs some sort of API, or chrome extension or native Windows app. If I want chrome browser history I must use WSL2 (on my desktop).
@pirate commented on GitHub (Mar 16, 2021):
ArchiveBox doesn't require WSL2, just the browser history export helper script. I don't use Windows, nor do I have a Windows machine to test on. You'll have to figure out a different way to get your history into ArchiveBox if you don't have a way to run the bash script or share your data dir with archivebox.
@voarsh2 commented on GitHub (Mar 16, 2021):
Basically telling me to stuff it, not your problem, because you have Linux and run this completely on your desktop computer, not as a web service on a remote machine, not that most people would be on Windows....
Unlikely, even when I install Chrome, it doesn't let me specify an install location.
The only thing I can think of is to move the AppData app location to a shared NTFS drive... not ideal.
The project should really think of running this like an actual web service. If I have lots of data to "archive" I'm hardly going to run this on a desktop computer, unless I have tons of HDDs/USB hard drives. The average user doesn't.
Instead I have a dedicated server with tons of drives, and isolated VM environments, and I can't even send an API call or run a little script to "post" my chrome data to it....
There's no extension or way to run a script in Windows to post data to a remote address, which is what I would expect from a project like this.
Any funding options?
@pirate commented on GitHub (Mar 16, 2021):
I run it as a web service in a Docker container on Linux, and on my desktop Mac. I don't archive all my history (it would quickly fill many terabytes). It's much easier to archive a subset of your history using bookmarks or a tool like Pocket / Pinboard, and send that to your VM.
@voarsh2 commented on GitHub (Mar 16, 2021):
Yes, but when you say VM, you mean like oracle box, or some wsl2 or something like that on a normal PC.
I have TBs of storage... not a problem.
I just can't get my browsing data to a remote machine for ArchiveBox.
The project in its current state isn't fit for my purpose, which is sad for me. But I'm sure it works for many who only have a handful of sites to archive and don't have dedicated hardware.
@pirate commented on GitHub (Mar 16, 2021):
No, it's a dedicated VM running in the cloud, not on a local PC. Why not rsync your Chrome data dir to the server periodically and then run the export script on the VM?
@voarsh2 commented on GitHub (Mar 16, 2021):
What folder exactly needs to go to the remote?
I am using a Chromium browser (not official Chrome). The User Data folder?
@pirate commented on GitHub (Mar 16, 2021):
Yup, whatever folder contains the Default folder (your default chrome profile).
C:\Users\<username>\AppData\Local\Google\Chrome\User Data\Default <- not this
C:\Users\<username>\AppData\Local\Google\Chrome\User Data <- this one
Chromium is what archivebox uses (not chrome), so it should be fine, just make sure you're on or near the latest version.
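To illustrate why the parent User Data folder is the one to pass along rather than Default: it holds one subfolder per browser profile, and each profile keeps its own history database. The sketch below lists those databases. It assumes each profile folder contains a file literally named History, which is the Chrome/Chromium layout as far as I know; other Chromium-based browsers may keep User Data somewhere else entirely. The name `find_history_files` is hypothetical.

```python
from pathlib import Path


def find_history_files(user_data_dir):
    """Return the History file of every profile folder under user_data_dir.

    Assumes a profile is any direct subfolder that contains a file
    named "History" (e.g. "Default", "Profile 1").
    """
    root = Path(user_data_dir)
    return sorted(
        child / "History"
        for child in root.iterdir()
        if child.is_dir() and (child / "History").is_file()
    )
```

Called on the User Data folder, this returns one path per profile; called on Default itself, it would find nothing, because the History file sits directly inside that folder rather than in a subfolder.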
@voarsh2 commented on GitHub (Mar 17, 2021):
Closing for now.
I will need to check that rsync works with Windows. Perhaps I can write a batch script that'll send it to my NTFS share.
Thanks.
@voarsh2 commented on GitHub (Mar 27, 2021):
I tried scp from cmd to copy the files over to my remote Linux VM, but it didn't work. I can't even zip my Chromium data, as I get permission/read errors while the browser is open (I'm not going to close it every time it tries to copy /User Data/).
I hope this project can be more friendly towards Windows.