[GH-ISSUE #806] Could ArchiveBox be used to automatically download new Twitter posts? #3528

Closed
opened 2026-03-14 23:23:06 +03:00 by kerem · 3 comments

Originally created by @holygamer on GitHub (Jul 24, 2021).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/806

I would like to be able to enter a Twitter account URL and have the program automatically archive all new Tweets by that person as soon as the Tweets are made. Is that possible as is or could I mod the program to do that? What would the limitations be if monitoring thousands of Twitter accounts like that at once? If I bought a single server would it be able to handle it?

Would Twitter block the IP of the server running ArchiveBox for making too many page requests at the same time? If so, could I queue the requests so that many Tweets aren't archived all at once?

kerem closed this issue 2026-03-14 23:23:11 +03:00

@pirate commented on GitHub (Jul 28, 2021):

`archivebox schedule --every=hour https://twitter.com/...`
`archivebox schedule --every=day --depth=1 https://twitter.com/...`
`archivebox schedule --help`

You'll have to try it and see; I don't know the answers for that kind of scale.
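
As a rough illustration of the queuing idea raised in the question (this is not from the ArchiveBox maintainers), the sketch below archives a list of URLs one at a time with a pause between them, by shelling out to `archivebox add`. The function names `build_add_command` and `run_queue` are hypothetical, and the 30-second default delay is an arbitrary assumption; nothing in this thread says what request rate Twitter tolerates.

```python
import subprocess
import time
from typing import Callable, Iterable, List, Sequence


def build_add_command(url: str, depth: int = 0) -> List[str]:
    """Build the `archivebox add` invocation for a single URL."""
    return ["archivebox", "add", f"--depth={depth}", url]


def run_queue(
    urls: Iterable[str],
    delay_seconds: float = 30.0,  # assumed value, tune by experiment
    runner: Callable[[Sequence[str]], object] = subprocess.run,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Archive URLs sequentially, sleeping between consecutive requests.

    `runner` and `sleep` are injectable so the queue can be dry-run
    without calling ArchiveBox or actually waiting.
    Returns the number of URLs processed.
    """
    count = 0
    for url in urls:
        if count:
            # Pause before every request except the first.
            sleep(delay_seconds)
        runner(build_add_command(url))
        count += 1
    return count
```

Calling `run_queue(account_urls, delay_seconds=60)` from inside an initialized ArchiveBox data directory would process the accounts one after another, where `account_urls` is your own list of profile URLs. Passing `runner=print` gives a dry run that only shows the commands.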


@mhfowler commented on GitHub (Aug 19, 2021):

I would also be curious to see this experiment with Instagram, although an Instagram-specific archiver may be necessary.


@pirate commented on GitHub (Jan 19, 2024):

Closing as stale.
