[PR #30] [MERGED] Fix Pinboard JSON duplicate timestamps error #1038

Closed
opened 2026-03-01 14:48:10 +03:00 by kerem · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/ArchiveBox/ArchiveBox/pull/30
Author: @bardisty
Created: 7/1/2017
Status: Merged
Merged: 7/1/2017
Merged by: @pirate

Base: master ← Head: fix-pinboard-json-timestamps-error


📝 Commits (2)

  • 71159bd Fix Pinboard JSON duplicate timestamps error
  • 17cc60d Add pinboard.json example file

📊 Changes

2 files changed (+8 additions, -2 deletions)

📝 archive.py (+2 -2)
examples/pinboard.json (+6 -0)

📄 Description

If the JSON exported by Pinboard contains duplicate timestamps, Python
raises a TypeError:

TypeError: argument of type 'float' is not iterable

This is because time.mktime() returns a floating point number, while
the '.' in timestamp check in next_uniq_timestamp() needs a string.
Wrapping time.mktime() in str() gives it one.

time.mktime() has also been wrapped in int() to drop the
unnecessary decimal part (.0) returned for each time value,
and to keep the script consistent with the other export functions.
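
A minimal sketch of the failure and the fix described above. It does not reproduce the code in archive.py. The helper name contains_dot and the strptime format string are assumptions made for illustration; only time.mktime(), int(), str() and the '.' in timestamp check come from the description and the traceback.

```python
import time
from datetime import datetime

# Parse a Pinboard-style time string (value taken from the test data).
# The format string is an assumption, not copied from archive.py.
parsed = datetime.strptime("2014-06-14T15:51:42Z", "%Y-%m-%dT%H:%M:%SZ")

# time.mktime() interprets the struct_time as local time and returns a float.
raw = time.mktime(parsed.timetuple())

# The fix: drop the trailing ".0" with int(), then convert to str.
fixed = str(int(raw))


def contains_dot(timestamp):
    """Hypothetical stand-in for the check seen in the traceback."""
    return '.' in timestamp


# A float does not support the `in` operator, so this raises TypeError.
try:
    contains_dot(raw)
    float_error = None
except TypeError as exc:
    float_error = str(exc)

print(type(raw).__name__, "->", float_error)
print(type(fixed).__name__, "->", contains_dot(fixed))
```

Because time.mktime() works in local time, the numeric value depends on the machine's timezone, so the sketch prints types and results rather than a fixed timestamp.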

Full Error:

$ env CHROME_BINARY=google-chrome-stable ./archive.py ~/tmp/test
[+] [2017-07-01 02:57:35] Starting archive from /home/bah/tmp/test export file.
Traceback (most recent call last):
  File "./archive.py", line 468, in <module>
    create_archive(export_file, service=export_type, resume=resume_from)
  File "./archive.py", line 412, in create_archive
    links = uniquefied_links(links)         # fix duplicate timestamps, returns sorted list
  File "./archive.py", line 283, in uniquefied_links
    link['timestamp'] = next_uniq_timestamp(seen_timestamps, link['timestamp'])
  File "./archive.py", line 251, in next_uniq_timestamp
    if '.' in timestamp:
TypeError: argument of type 'float' is not iterable

JSON test data:

[{"href":"https:\/\/pushover.net\/","description":"Pushover: Simple Notifications for Android, iOS, and Desktop","extended":"","meta":"1e68511234d9390d10b7772c8ccc4b9e","hash":"bb93374ead8a937b18c7c46e13168a7d","time":"2014-06-14T15:51:42Z","shared":"no","toread":"no","tags":"app android"},
{"href":"http:\/\/www.reddit.com\/r\/Android","description":"r\/android","extended":"","meta":"18a973f09c9cc0608c116967b64e0419","hash":"910293f019c2f4bb1a749fb937ba58e3","time":"2014-06-14T15:51:42Z","shared":"no","toread":"no","tags":"reddit android"}]

Output after fix:

$ env CHROME_BINARY=google-chrome-stable ./archive.py ~/tmp/test           
[+] [2017-07-01 04:09:17] Starting archive from /home/bah/tmp/test export file.
[*] [2017-07-01 04:09:17] Created archive index with 2 links.
[*] Checking Dependencies:
/bin/google-chrome-stable
/bin/wget
/bin/curl
[+] [1402786302 (2014-06-14 15:51:42)] "Pushover: Simple Notifications for Android, iOS, and Desktop": pushover.net/
    - Downloading Full Site
    - Printing PDF
      Exception: Exception Failed to print PDF
    - Snapping Screenshot
    - Submitting to archive.org
    - Fetching Favicon
[+] [1402786302.1 (2014-06-14 15:51:42)] "r/android": www.reddit.com/r/Android
    - Downloading Full Site
    - Printing PDF
      Exception: Exception Failed to print PDF
    - Snapping Screenshot
    - Submitting to archive.org
    - Fetching Favicon
[√] [2017-07-01 04:09:39.035009] Archive update complete.
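
In the output above, the second link gets the timestamp 1402786302.1 instead of a duplicate 1402786302. The following is a hypothetical sketch of how such suffixing could work once timestamps are strings. It is not the implementation in archive.py; the function names next_unique_timestamp and uniquify are made up for this example.

```python
def next_unique_timestamp(seen, timestamp):
    """Return timestamp unchanged if unused, else append the first free '.N' suffix."""
    if timestamp not in seen:
        return timestamp

    # String timestamps make this membership test valid; a float would raise TypeError.
    base = timestamp.split('.')[0] if '.' in timestamp else timestamp

    nonce = 1
    candidate = f"{base}.{nonce}"
    while candidate in seen:
        nonce += 1
        candidate = f"{base}.{nonce}"
    return candidate


def uniquify(timestamps):
    """Resolve duplicates in order, keeping the first occurrence unchanged."""
    seen = set()
    result = []
    for ts in timestamps:
        unique = next_unique_timestamp(seen, ts)
        seen.add(unique)
        result.append(unique)
    return result


# Two bookmarks saved in the same second, as in the JSON test data.
print(uniquify(["1402786302", "1402786302"]))
# ['1402786302', '1402786302.1']
```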

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.
