[GH-ISSUE #8] Error when processing file. #8

Closed
opened 2026-02-27 15:54:26 +03:00 by kerem · 4 comments

Originally created by @rshibanov on GitHub (Apr 7, 2017).
Original GitHub issue: https://github.com/RD17/ambar/issues/8

I loaded several PDF reports, and Ambar can't parse some of them.
However, it doesn't remove the malformed files from the queue,
so after some time Ambar tries to parse them again.

2017-04-07 13:42:17.512: [error] [p0] error parsing //default/cisco's talos intelligence group blog_ the _wizzards_ of adware.pdf 'utf-8' codec can't decode byte 0xed in position 1605: invalid continuation byte
2017-04-07 13:42:19.413: [verbose] [p0] task received 7609b6aac3bff0e4bd067824dfc2925100797d2e72a1712bcf3edb1266660131
2017-04-07 13:42:19.450: [verbose] [p0] file content received //default/cisco's talos intelligence group blog_ threat round-up for the week of mar 6 - mar 10.pdf
2017-04-07 13:42:19.458: [verbose] [p0] parsing //default/cisco's talos intelligence group blog_ threat round-up for the week of mar 6 - mar 10.pdf
2017-04-07 13:42:19.778: [error] [p0] error parsing //default/cisco's talos intelligence group blog_ threat round-up for the week of mar 6 - mar 10.pdf 'utf-8' codec can't decode byte 0xed in position 1075: invalid continuation byte
2017-04-07 13:42:33.929: [verbose] [p0] task received e0d13847f3079e244c9b4503635d59f0ce8840f37d410e9d6d8a0160895eb62a
2017-04-07 13:42:33.966: [verbose] [p0] file content received //default/cisco's talos intelligence group blog_ threat spotlight_ holiday greetings from pro pos – is your payment card data someone else’s christmas present_.pdf
2017-04-07 13:42:33.974: [verbose] [p0] parsing //default/cisco's talos intelligence group blog_ threat spotlight_ holiday greetings from pro pos – is your payment card data someone else’s christmas present_.pdf
2017-04-07 13:42:34.184: [error] [p0] error parsing //default/cisco's talos intelligence group blog_ threat spotlight_ holiday greetings from pro pos – is your payment card data someone else’s christmas present_.pdf 'utf-8' codec can't decode byte 0xed in position 1307: invalid continuation byte
2017-04-07 13:42:34.200: [verbose] [p0] task received 7c8c073ba712de285d29877e3d01de448f6d7f0c0a4b061f700edfac0fa4d67b
2017-04-07 13:42:34.223: [verbose] [p0] file content received //default/cisco's talos intelligence group blog_ threat spotlight_ dyre_dyreza_ an analysis to discover the dga.pdf
2017-04-07 13:42:34.239: [verbose] [p0] parsing //default/cisco's talos intelligence group blog_ threat spotlight_ dyre_dyreza_ an analysis to discover the dga.pdf
2017-04-07 13:42:34.495: [error] [p0] error parsing //default/cisco's talos intelligence group blog_ threat spotlight_ dyre_dyreza_ an analysis to discover the dga.pdf 'utf-8' codec can't decode byte 0xed in position 1697: invalid continuation byte
2017-04-07 13:42:35.406: [verbose] [p0] task received 073d9dde2cc486865db972a40f077660904439f0a00195bac24e33a3ebeecf72
2017-04-07 13:42:35.427: [verbose] [p0] file content received //default/cisco's talos intelligence group blog_ vulnerability deep dive - ichitaro office excel file code execution vulnerability.pdf
2017-04-07 13:42:35.436: [verbose] [p0] parsing //default/cisco's talos intelligence group blog_ vulnerability deep dive - ichitaro office excel file code execution vulnerability.pdf
2017-04-07 13:42:35.615: [error] [p0] error parsing //default/cisco's talos intelligence group blog_ vulnerability deep dive - ichitaro office excel file code execution vulnerability.pdf 'utf-8' codec can't decode byte 0xed in position 222: invalid continuation byte
2017-04-07 13:42:38.804: [verbose] [p0] task received 7ecec8cba21bd5e0b0ee831045de5815a479f7dd20ccaaeb5ce48c30869f93ed
2017-04-07 13:42:38.840: [verbose] [p0] file content received //default/cisco's talos intelligence group blog_ threat spotlight_ teslacrypt - decrypt it yourself.pdf
2017-04-07 13:42:38.853: [verbose] [p0] parsing //default/cisco's talos intelligence group blog_ threat spotlight_ teslacrypt - decrypt it yourself.pdf
2017-04-07 13:42:39.089: [error] [p0] error parsing //default/cisco's talos intelligence group blog_ threat spotlight_ teslacrypt - decrypt it yourself.pdf 'utf-8' codec can't decode byte 0xed in position 809: invalid continuation byte
2017-04-07 13:42:41.151: [verbose] [p0] task received 7fdb77feff9667babbe327a33f27db33c28b057c7032134a9b033b5ade09904b
2017-04-07 13:42:41.183: [verbose] [p0] file content received //default/cisco's talos intelligence group blog_ want tofsee my pictures_ a botnet gets aggressive.pdf
2017-04-07 13:42:41.194: [verbose] [p0] parsing //default/cisco's talos intelligence group blog_ want tofsee my pictures_ a botnet gets aggressive.pdf
2017-04-07 13:42:41.404: [error] [p0] error parsing //default/cisco's talos intelligence group blog_ want tofsee my pictures_ a botnet gets aggressive.pdf 'utf-8' codec can't decode byte 0xed in position 544: invalid continuation byte
2017-04-07 13:42:43.720: [verbose] [p0] task received 8efde018b5242043d68dba573d67456712c74802c7b2ba64465793ec7daad1ff
2017-04-07 13:42:43.743: [verbose] [p0] file content received //default/cisco's talos intelligence group blog_ your files are encrypted with a _windows 10 upgrade_.pdf
2017-04-07 13:42:43.749: [verbose] [p0] parsing //default/cisco's talos intelligence group blog_ your files are encrypted with a _windows 10 upgrade_.pdf
2017-04-07 13:42:44.011: [error] [p0] error parsing //default/cisco's talos intelligence group blog_ your files are encrypted with a _windows 10 upgrade_.pdf
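
For context, the message in the log lines above matches the text of Python's `UnicodeDecodeError`. The sketch below is not Ambar's code: it only reproduces the same kind of error on a made-up byte string and shows one possible lenient fallback. The helper name `decode_lenient` and the sample bytes are assumptions for illustration.

```python
def decode_lenient(raw: bytes) -> str:
    """Decode as UTF-8; on failure, substitute U+FFFD for undecodable bytes."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Hypothetical fallback: keep the readable text, mark the bad bytes.
        return raw.decode("utf-8", errors="replace")


# 0xED opens a three-byte UTF-8 sequence, so continuation bytes must follow it.
# Here it is followed by a space (0x20), which is not a continuation byte.
sample = b"caf\xed bar"

try:
    sample.decode("utf-8")
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

print(message)
print(decode_lenient(sample))
```

Whether replacing bad bytes is acceptable depends on the use case; for search indexing it may be preferable to losing the whole document, but that is a design choice rather than something stated in this issue.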

kerem 2026-02-27 15:54:26 +03:00
  • closed this issue
  • added the bug label

@sochix commented on GitHub (Apr 7, 2017):

These look like malformed or possibly encrypted PDFs. Ambar moves such files to the dead queue, and they will stay there. This won't harm or slow down Ambar's performance, so you can just ignore it.
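
As a rough illustration of the dead-queue idea described above (not Ambar's actual implementation), here is a minimal in-memory sketch. The names `drain` and `strict_utf8`, and the use of plain Python containers instead of a real message broker, are assumptions made for this example.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple


def drain(tasks: Deque[bytes], parse: Callable[[bytes], str]) -> Tuple[List[str], List[bytes]]:
    """Parse every task once; failures go to a dead queue instead of being requeued."""
    parsed: List[str] = []
    dead: List[bytes] = []
    while tasks:
        raw = tasks.popleft()
        try:
            parsed.append(parse(raw))
        except UnicodeDecodeError:
            # Park the bad payload; it is never put back on the work queue.
            dead.append(raw)
    return parsed, dead


def strict_utf8(raw: bytes) -> str:
    """Stand-in parser that fails on invalid UTF-8, like the errors in the log."""
    return raw.decode("utf-8")


work = deque([b"good report", b"bad \xed report", b"another good one"])
parsed, dead = drain(work, strict_utf8)
print(parsed)
print(dead)
```

The point of the pattern is that a file that cannot be parsed is set aside once, so it neither blocks the remaining tasks nor gets retried indefinitely.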

Can you please share some of these files with us?


@rshibanov commented on GitHub (Apr 10, 2017):

https://1drv.ms/f/s!AtSJqJWfRSn2uRcTsS24r3JirUww


@sochix commented on GitHub (Apr 10, 2017):

Fixed. Please update.


@rshibanov commented on GitHub (Apr 12, 2017):

Thanks, all fine now.
