mirror of
https://github.com/RD17/ambar.git
synced 2026-04-25 07:25:55 +03:00
[GH-ISSUE #59] imap crawler: [error] error retrieving message b'20' failded to fetch #60
Originally created by @buster39 on GitHub (Aug 6, 2017).
Original GitHub issue: https://github.com/RD17/ambar/issues/59
Hello,
I tried several IMAP servers with the crawler, but only a few local installations worked as expected.
I still have problems with Gmail, and outlook.com gave me the same error:
2017-08-06 11:31:28.752: [info] filecrawler initialized
2017-08-06 11:31:30.049: [info] crawling xxx@gmail.com at imap.gmail.com
2017-08-06 11:31:30.349: [error] error retrieving message b'20' failded to fetch
2017-08-06 11:31:30.650: [info] done
My config:
{
"id": "Gmail",
"uid": "Gmail_d033e22ae348aeb5660fc2140aec35850c4da997",
"description": "Test",
"type": "imap",
"locations": [
{
"host_name": "imap.gmail.com",
"ip_address": "",
"location": "xxx@gmail.com"
}
],
"file_regex": "(\.doc[a-z]
)|(\\.xls[a-z]*)|(\.txt$)|(\.csv$)|(\.htm[a-z])|(\\.ppt[a-z]*)|(\.pdf$)|(\.msg$)|(\.zip$)|(\.eml$)|(\.rtf$)|(\.md$)|(\.png$)|(\.bmp$)|(\.tif[f])|(\\.jp[e]*g)|(\.hwp$)","credentials": {
"auth_type": "basic",
"login": "xxx@gmail.com",
"password": "****",
"token": ""
},
"schedule": {
"is_active": false,
"cron_schedule": "/15 * * * *"
},
"max_file_size_bytes": 30000000,
"verbose": true
}
Thank you!
@akropp commented on GitHub (Aug 11, 2017):
I found the following change to imapcrawler.py:
from:
callResult, data = self.connection.fetch(messageId, '(RFC822)')
to:
callResult, data = self.connection.uid('fetch', messageId, '(BODY.PEEK[])')
This makes Gmail work. I'm not sure why calling the fetch method directly, instead of going through the uid call, makes it choke on the message IDs. Also, changing RFC822 to BODY.PEEK[] keeps your mail unread.
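For anyone reading later, here is a minimal sketch of the UID-based fetch pattern described above. It is not the actual Ambar code: the function name iter_messages_by_uid and the way the IDs are obtained (a UID SEARCH) are assumptions made for illustration. One possible explanation for the original error, which I have not verified against the crawler source, is that IMAP keeps message sequence numbers and UIDs as separate numbering schemes, so an ID obtained as a UID can fail when passed to a plain FETCH, which expects a sequence number. BODY.PEEK[] returns the same full message as RFC822 but does not set the \Seen flag.

```python
import email
from email.message import Message
from typing import Iterator, Tuple


def iter_messages_by_uid(connection) -> Iterator[Tuple[str, Message]]:
    """Yield (uid, parsed message) pairs from the selected mailbox.

    `connection` is assumed to be a logged-in imaplib.IMAP4 / IMAP4_SSL
    object on which select() has already been called. (Hypothetical helper,
    not part of Ambar.)
    """
    # UID SEARCH returns UIDs, so every later command must also go through uid().
    status, data = connection.uid('search', None, 'ALL')
    if status != 'OK' or not data or data[0] is None:
        raise RuntimeError('UID SEARCH failed: %r' % (data,))

    for raw_uid in data[0].split():
        uid = raw_uid.decode('ascii')
        # BODY.PEEK[] fetches the whole message without marking it as read.
        status, parts = connection.uid('fetch', uid, '(BODY.PEEK[])')
        if status != 'OK':
            continue
        # The message body arrives as the second item of a (header, literal) tuple.
        raw = next((p[1] for p in parts if isinstance(p, tuple)), None)
        if raw is None:
            continue  # UID vanished or the server returned no data for it
        yield uid, email.message_from_bytes(raw)
```

To try it against a real server, you would create the connection with imaplib.IMAP4_SSL('imap.gmail.com'), call login() and select('INBOX', readonly=True), and then loop over iter_messages_by_uid(connection). Opening the mailbox read-only is an extra safeguard against changing message flags.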
@isido993 commented on GitHub (Aug 22, 2017):
Implemented, see 2fc84df85cd06895e0ec1b282348c64672d035ab
Thanks for your input!