[GH-ISSUE #426] pocket import corruptions #265

Closed
opened 2026-02-25 23:33:49 +03:00 by kerem · 1 comment
Owner

Originally created by @questor on GitHub (May 12, 2022).
Original GitHub issue: https://github.com/go-shiori/shiori/issues/426

Hi! I tried the Pocket import and noticed that URLs of the form "www.youtube.com?v=abcd" do not work, because the script it uses has a minor (but important) error. The linked script (https://gist.githubusercontent.com/dchakro/fa43b0e89f884826d3bd60f51e48b078/raw/pocket2shiori.sh) is:

#!/bin/sh
# Extracting URLs from the exported HTML from getpocket.com/export
# In the following line $1 is the exported HTML from pocket 
#  which will be passed to this script as a CLI parameter

grep -Eoi '<a [^>]+>' $1 | grep -Eo 'href="[^\"]+"' | cut -d'=' -f 2 | tr -d '"' | tac > pocket2shiori.txt

# Reading the URLs one by one and adding to shiori
while IFS= read -r line; do
    shiori add $line
done < pocket2shiori.txt

rm pocket2shiori.txt
exit 0

The correct preprocessing line should be:

grep -Eoi '<a [^>]+>' $1 | grep -Eo 'href="[^\"]+"' | cut -d'=' -f 2- | tr -d '"' | tac > pocket2shiori.txt

That is, cut -d'=' -f 2- (please note the additional -, which keeps everything after the first = so that any = inside the URL itself is not cut off).
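To make the difference concrete, here is a minimal sketch that runs both variants of the pipeline on a single sample line. The sample anchor tag, the `extract_href` helper, and the `broken`/`fixed` variable names are made up for illustration and are not part of the gist; real export lines may carry different attributes.

```shell
#!/bin/sh
# Hypothetical sample line, shaped like an anchor tag in a Pocket HTML export.
sample='<a href="https://www.youtube.com/watch?v=abcd&t=10" time_added="0">Video</a>'

# Shared front of the pipeline (same greps as the script): isolate href="...".
# extract_href is an illustrative helper, not something the gist defines.
extract_href() {
    printf '%s\n' "$1" | grep -Eoi '<a [^>]+>' | grep -Eo 'href="[^\"]+"'
}

# -f 2 keeps only the text between the first and the second '='.
broken=$(extract_href "$sample" | cut -d'=' -f 2 | tr -d '"')

# -f 2- keeps everything after the first '=', including any later '=' signs.
fixed=$(extract_href "$sample" | cut -d'=' -f 2- | tr -d '"')

printf 'broken: %s\n' "$broken"
printf 'fixed:  %s\n' "$fixed"
```

With -f 2 the URL is truncated at its first = (here, right after "watch?v"), while -f 2- passes the full query string through to shiori add.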

I've added a comment to the gist, but your link in the wiki points to the not-fully-working version.

kerem closed this issue 2026-02-25 23:33:49 +03:00
Author
Owner

@fmartingr commented on GitHub (May 26, 2022):

Hey @questor, thanks for pointing this out!

I've forked the script and fixed it with your suggestion. Also, the wiki link now points to the gist page itself rather than the raw file, so that users who end up there can see the comments.

Reference
starred/shiori#265