[GH-ISSUE #1033] Bug: CHROME_USER_DATA_DIR env not working #3667

Closed
opened 2026-03-14 23:57:51 +03:00 by kerem · 6 comments
Owner

Originally created by @green1052 on GitHub (Sep 28, 2022).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/1033

Describe the bug

The CHROME_USER_DATA_DIR environment variable is not working.

Steps to reproduce

Follow the steps at
https://github.com/ArchiveBox/ArchiveBox/wiki/Configuration#chrome_user_data_dir

and log in.

Then move the profile to /data/chrome-profile and set
CHROME_USER_DATA_DIR=/data/chrome-profile

Finally, connect to the web UI and archive a URL.
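
Before archiving, the configured path from the steps above can be sanity-checked. The following is a minimal sketch, not ArchiveBox code: the helper name `check_profile_dir` is hypothetical, and the check for a `Default` subfolder assumes the usual Chromium user data dir layout.

```python
import os
from pathlib import Path


def check_profile_dir(env=None):
    """Return (ok, message) for the directory named by CHROME_USER_DATA_DIR.

    Hypothetical helper for illustration; ArchiveBox does its own checks.
    """
    env = os.environ if env is None else env
    raw = env.get("CHROME_USER_DATA_DIR")
    if not raw:
        return False, "CHROME_USER_DATA_DIR is not set"
    path = Path(raw)
    if not path.is_dir():
        return False, f"{path} is not a directory"
    # Assumption: a Chromium user data dir keeps the profile in a "Default" subfolder.
    if not (path / "Default").is_dir():
        return False, f"{path} has no Default/ subfolder"
    return True, f"{path} looks like a Chromium user data dir"


if __name__ == "__main__":
    ok, message = check_profile_dir()
    print("OK" if ok else "PROBLEM", "-", message)
```

Run inside the container, this reads the same environment the archiver sees, so it also catches a variable that was set on the host but never passed through.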

Screenshots or log output

web:

[+] [2022-09-28 16:21:20] Adding 1 links to index (crawl depth=0)...
[Errno 2] No such file or directory: 'https:/arca.live'
    > Saved verbatim input to sources/1664382080-import.txt
    > Parsed 1 URLs from input (Generic TXT)
    > Found 1 new URLs not already in index

[*] [2022-09-28 16:21:20] Writing 1 links to main index...
    √ ./index.sqlite3

[▶] [2022-09-28 16:21:20] Starting archiving of 1 snapshots in index...

[+] [2022-09-28 16:21:20] "arca.live"
    https://arca.live
    > ./archive/1664382080.370713
      > favicon
      > singlefile
      > title
      > media
    4 files (655.2 KB) in 0:00:18s

[√] [2022-09-28 16:21:38] Update of 1 pages complete (18.14 sec)
    - 0 links skipped
    - 2 links updated
    - 1 links had errors

    Hint: To manage your archive in a Web UI, run:
        archivebox server 0.0.0.0:8000
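
The `[Errno 2]` line in the log quotes `'https:/arca.live'` with a single slash. One plausible reading, which nothing in this thread confirms, is that the input was first tried as a local file path and path normalization collapsed the `//`. The sketch below only shows that POSIX path normalization produces exactly that string; it does not claim to be what ArchiveBox does internally.

```python
import posixpath

url = "https://arca.live"

# Treating the URL as a POSIX path collapses the doubled slash after the scheme.
as_path = posixpath.normpath(url)

print(as_path)  # https:/arca.live
```

If that reading is right, the message is unrelated to CHROME_USER_DATA_DIR, since the same log goes on to parse and archive the URL.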

ArchiveBox version

ArchiveBox v0.6.3
Cpython Linux Linux-4.4.180+-x86_64-with-glibc2.31 x86_64
IN_DOCKER=True DEBUG=False IS_TTY=True TZ=UTC SEARCH_BACKEND_ENGINE=ripgrep

[i] Dependency versions:
 √  ARCHIVEBOX_BINARY     v0.6.3          valid     /usr/local/bin/archivebox                                                   
 √  PYTHON_BINARY         v3.10.4         valid     /usr/local/bin/python3.10                                                   
 √  DJANGO_BINARY         v3.1.14         valid     /usr/local/lib/python3.10/site-packages/django/bin/django-admin.py          
 √  CURL_BINARY           v7.74.0         valid     /usr/bin/curl                                                               
 √  WGET_BINARY           v1.21           valid     /usr/bin/wget                                                               
 √  NODE_BINARY           v17.9.0         valid     /usr/bin/node                                                               
 √  SINGLEFILE_BINARY     v0.3.16         valid     /node/node_modules/single-file/cli/single-file                              
 √  READABILITY_BINARY    v0.0.2          valid     /node/node_modules/readability-extractor/readability-extractor              
 √  MERCURY_BINARY        v1.0.0          valid     /node/node_modules/@postlight/mercury-parser/cli.js                         
 √  GIT_BINARY            v2.30.2         valid     /usr/bin/git                                                                
 √  YOUTUBEDL_BINARY      v2022.04.08     valid     /usr/local/bin/yt-dlp                                                       
 √  CHROME_BINARY         v101.0.4951.41  valid     /usr/bin/chromium                                                           
 √  RIPGREP_BINARY        v12.1.1         valid     /usr/bin/rg                                                                 

[i] Source-code locations:
 √  PACKAGE_DIR           24 files        valid     /app/archivebox                                                             
 √  TEMPLATES_DIR         4 files         valid     /app/archivebox/templates                                                   
 -  CUSTOM_TEMPLATES_DIR  -               disabled                                                                              

[i] Secrets locations:
 √  CHROME_USER_DATA_DIR  34 files        valid     ./chrome-profile                                                            
 -  COOKIES_FILE          -               disabled                                                                              

[i] Data locations:
 √  OUTPUT_DIR            6 files         valid     /data                                                                       
 √  SOURCES_DIR           10 files        valid     ./sources                                                                   
 √  LOGS_DIR              1 files         valid     ./logs                                                                      
 √  ARCHIVE_DIR           42 files        valid     ./archive                                                                   
 √  CONFIG_FILE           81.0 Bytes      valid     ./ArchiveBox.conf                                                           
 √  SQL_INDEX             11.5 MB         valid     ./index.sqlite3             
kerem closed this issue 2026-03-14 23:57:56 +03:00
Author
Owner

@pirate commented on GitHub (Nov 18, 2022):

For future reference, and for people finding this via Google, the docs for this are here:

- https://github.com/ArchiveBox/ArchiveBox/wiki/Chromium-Install#setting-up-a-chromium-user-profile
- https://github.com/ArchiveBox/ArchiveBox/wiki/Security-Overview#archiving-private-content
- https://github.com/ArchiveBox/ArchiveBox/wiki/Configuration#chrome_user_data_dir

Author
Owner

@green1052 commented on GitHub (Nov 19, 2022):

It did not work with a Windows profile.
It only worked with a Linux profile :(

Author
Owner

@pirate commented on GitHub (Nov 19, 2022):

Yeah, it won't work with a Windows profile because ArchiveBox doesn't really support Windows. It only supports Docker, and the version of the browser that creates the profile needs to match the version ArchiveBox runs.
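
The version requirement above can be checked before pointing ArchiveBox at a profile. This is a rough sketch under two assumptions that are not from this thread: that the user data dir contains a `Last Version` text file written by the browser that last used it, and that comparing major versions is a good enough test. Both function names are hypothetical.

```python
from pathlib import Path


def major_version(version):
    """Return the major component of a dotted version such as 'v101.0.4951.41'."""
    return int(version.strip().lstrip("v").split(".")[0])


def profile_matches_browser(profile_dir, browser_version):
    """Compare the profile's recorded browser version with the one in use.

    Assumption: the user data dir holds a 'Last Version' text file.
    Returns None when that file is missing, so "unknown" stays distinct
    from "mismatch".
    """
    marker = Path(profile_dir) / "Last Version"
    if not marker.is_file():
        return None
    return major_version(marker.read_text()) == major_version(browser_version)
```

For the setup in this issue, the second argument would be the CHROME_BINARY version reported by `archivebox version`, for example `profile_matches_browser("/data/chrome-profile", "v101.0.4951.41")`.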

Author
Owner

@Michael-Z-Freeman commented on GitHub (Mar 22, 2023):

Is there a local-only way of setting this up, or does it have to be done using Docker?

Author
Owner

@pirate commented on GitHub (Mar 26, 2023):

Not on Windows, but it absolutely works without Docker on Linux/BSD/macOS. Just follow the README instructions to install it and point it to your Chrome profile using a local path.

Author
Owner

@Michael-Z-Freeman commented on GitHub (Mar 27, 2023):

> Not on Windows, but it absolutely works without Docker on Linux/BSD/macOS. Just follow the README instructions to install it and point it to your Chrome profile using a local path.

OK, great. Thanks for that confirmation. However, it never manages to work with a Microsoft/OAuth2/Learning Space login. I'll try to produce some more detailed output of how I have it set up and come back.
