[GH-ISSUE #284] RSS feed not supported #246

Closed
opened 2026-02-25 21:34:32 +03:00 by kerem · 35 comments
Owner

Originally created by @ehanuise on GitHub (Sep 25, 2018).
Original GitHub issue: https://github.com/cypht-org/cypht/issues/284

Originally assigned to: @jasonmunro on GitHub.

I wanted to add two Belgian newspaper feeds:
http://www.lesoir.be/rss/81853/cible_principale_gratuit
http://www.lalibre.be/rss.xml

The first works OK; the second is refused by Cypht: http://www.lalibre.be/rss.xml

Cypht should be able to process both.

I noticed a small difference in the headers, which might be the cause.
This one is Le Soir:

 <?xml version="1.0" encoding="utf-8" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xml:base="http://www.lesoir.be/rss/81853/cible_principale_gratuit?status=1" xmlns:media="http://search.yahoo.com/mrss/" >
    <channel>

This one is La Libre:

<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
kerem 2026-02-25 21:34:32 +03:00
  • closed this issue
  • added the feeds label
Author
Owner

@jasonmunro commented on GitHub (Sep 25, 2018):

Thanks for the feedback. I will re-create the issue and figure out what's wrong!


@jasonmunro commented on GitHub (Sep 25, 2018):

I just popped it in and it worked without issue. I'm running the git master branch, however. If you are running the latest release, could you try switching over to the latest code? The latest release is quite old and out of date; I'm actually trying to start a new release cycle this week.


@ehanuise commented on GitHub (Sep 25, 2018):

Hi.
I installed with the script from your site. Doesn't it download the latest version?
I'm not familiar with git - not a dev. If you have a script, I'm game :-)


@jasonmunro commented on GitHub (Sep 25, 2018):

Actually, the script as defined on the install page at cypht.org does use the latest git master branch. I can't explain the issue you are seeing here - I was able to add that RSS source without a problem. Can you tell me more about your PHP version, and which PHP packages are installed?

Thanks!


@ehanuise commented on GitHub (Sep 25, 2018):

It's PHP 7.1.1 on the latest version of Debian.

I can try and capture some logs if you tell me where to look :)


@ehanuise commented on GitHub (Sep 26, 2018):

I get this error on the la libre feed : Cound not add feed: php_network_getaddresses: getaddrinfo failed: Name or service not known


@jasonmunro commented on GitHub (Sep 26, 2018):

Weird. So this is telling me your server cannot resolve the address for that feed. What happens when you try:

nslookup www.lalibre.be

From your server?


@ehanuise commented on GitHub (Sep 27, 2018):

It resolves just fine:

 nslookup www.lalibre.be
Server:		127.0.0.1
Address:	127.0.0.1#53

Non-authoritative answer:
Name:	www.lalibre.be
Address: 81.246.65.146

@jasonmunro commented on GitHub (Sep 27, 2018):

Thanks for the follow-up. Looking a bit closer at the code, the first thing we do is split the URL into its parts and try to connect to the host portion; this is where it is failing for you. Is it possible you had a typo? It looks like even a leading space before the address could cause an issue here (which I will fix). Can you retry and, if it still fails, run this from your server:

php -r 'print_r(parse_url(" http://www.lalibre.be/rss.xml"));'

It should return the following:

Array
(
    [scheme] => http
    [host] => www.lalibre.be
    [path] => /rss.xml
)
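The leading-space problem mentioned above suggests trimming the user's input before splitting the URL. Here is a minimal sketch of that idea in Python (this is not Cypht's PHP code, and the helper name `normalize_feed_url` is hypothetical):

```python
from urllib.parse import urlsplit


def normalize_feed_url(raw: str) -> tuple[str, str, str]:
    """Trim surrounding whitespace, then split the URL into scheme, host, and path.

    Hypothetical helper for illustration only. It raises ValueError when the
    scheme or host is missing, rather than trying to connect to an empty host.
    """
    url = raw.strip()
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"not a usable feed URL: {raw!r}")
    return parts.scheme, parts.hostname, parts.path or "/"


if __name__ == "__main__":
    # The stray leading space no longer matters once the input is trimmed.
    print(normalize_feed_url(" http://www.lalibre.be/rss.xml"))
```

Rejecting input that has no scheme or host up front gives the user a clearer message than a failed DNS lookup on an empty host name.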


@ehanuise commented on GitHub (Sep 27, 2018):

Thanks.

I get a different result:

php -r 'print_r(parse_url(" http://www.lalibre.be/rss.xml"));'

Array
(
    [path] =>  http://www.lalibre.be/rss.xml
)


@jasonmunro commented on GitHub (Sep 27, 2018):

Oh shoot, I copied the command with the leading space, which does not work. It should be this:

php -r 'print_r(parse_url("http://www.lalibre.be/rss.xml"));'


@ehanuise commented on GitHub (Sep 27, 2018):

:-)

This one works :

php -r 'print_r(parse_url("http://www.lalibre.be/rss.xml"));'
Array
(
    [scheme] => http
    [host] => www.lalibre.be
    [path] => /rss.xml
)


@jasonmunro commented on GitHub (Sep 27, 2018):

Yep, looks good. Did you try adding it again in Cypht, making sure there is no leading space?


@ehanuise commented on GitHub (Sep 27, 2018):

Yup, sure. I typed it by hand and double-checked; the issue is somewhere else, I'm afraid :(


@jasonmunro commented on GitHub (Sep 27, 2018):

How about this command?
php -r 'print_r(fsockopen("www.lalibre.be", 80));'


@ehanuise commented on GitHub (Sep 27, 2018):

php -r 'print_r(fsockopen("www.lalibre.be", 80));'
Resource id #4USERNAME@HOSTNAME:/home/USERNAME   (caps are edited)


@ehanuise commented on GitHub (Sep 27, 2018):

Ah, sorry, that was my shell prompt.

It just returns:

Resource id #4


@jasonmunro commented on GitHub (Sep 27, 2018):

Well, this is a puzzle! I can't reproduce this here, so the only thing I can think of is to give you a patch that inserts some debugging info into the feed-related code and writes some data to the PHP/webserver error log.


@ehanuise commented on GitHub (Sep 27, 2018):

I enabled debug mode and copied the output from when I try to add the feed:

[Thu Sep 27 18:44:10.803701 2018] [php7:notice] [pid 20177] [client
XXXXXXXXXXXXX:60378] Array\n(\n    [0] => Using Hm_PHP_Session with
Hm_Auth_IMAP\n    [1] => Using file based user configuration\n    [2] =>
Using sapi: apache2handler\n [3] => Request type: HTTP\n    [4] =>
Request path: /webmail/\n    [5] => TLS request: 1\n    [6] => Mobile
request: 0\n    [7] => Page ID: servers\n    [8] => LOGGED IN\n    [9]
=> XML Parse error: Reserved XML Name\n    [10] => Setting cookie: name:
hm_msgs, lifetime: 0, path: /webmail/, domain: www.XXXXXXX.com, secure:
1, html_only 1\n    [11] => Redirecting to /webmail/?page=servers\n   
[12] => PHP version 7.1.20-1+0~20180910100430.3+jessie~1.gbp17c613\n   
[13] => Zend version 3.1.0\n    [14] => Peak Memory: 2048\n    [15] =>
PID: 20177\n    [16] => Included files: 68\n)\n, referer:
https://www.XXXXXXXX.com/webmail/?page=servers
[Thu Sep 27 18:44:10.919458 2018] [php7:notice] [pid 20177] [client
XXXXXXXXXX:60378] Array\n(\n    [0] => Using Hm_PHP_Session with
Hm_Auth_IMAP\n    [1] => Using file based user configuration\n    [2] =>
Using sapi: apache2handler\n [3] => Request type: HTTP\n    [4] =>
Request path: /webmail/\n    [5] => TLS request: 1\n    [6] => Mobile
request: 0\n    [7] => Page ID: servers\n    [8] => LOGGED IN\n    [9]
=> Deleting cookie: name: hm_msgs, lifetime: 1538063050, path:
/webmail/, domain: www.XXXXXXX.com, secure: 1, html_only 1\n    [10] =>
TRANSLATION NOT FOUND :Could not find an RSS or ATOM feed at that
address:\n    [11] => TRANSLATION NOT FOUND :Office365:\n    [12] =>
TRANSLATION NOT FOUND :STARTTLS or unencrypted:\n    [13] => TRANSLATION
NOT FOUND :STARTTLS or unencrypted:\n    [14] => TRANSLATION NOT FOUND
:STARTTLS or unencrypted:\n    [15] => PHP version
7.1.20-1+0~20180910100430.3+jessie~1.gbp17c613\n    [16] => Zend version
3.1.0\n    [17] => Peak Memory: 2048\n    [18] => PID: 20177\n    [19]
=> Included files: 69\n)\n, referer:
https://www.XXXXXXXX.com/webmail/?page=servers


@jasonmunro commented on GitHub (Sep 27, 2018):

XML Parse error: Reserved XML Name

This is interesting: it looks like we are getting the XML from the feed but can't parse it. Thanks, this helps! I still can't explain why it's working here (yet), but it's a clue :)


@ehanuise commented on GitHub (Sep 27, 2018):

https://stackoverflow.com/questions/11107592/xml-error-parsing-soap-payload-reserved-xml-name/15604229

I notice the 'lalibre' feed has no whitespace before the <?xml statement
and the 'Le Soir' feed does.

I can't get them to change it, of course, and it works with other feed
readers, so maybe this is the root cause?

Here's another feed that works OK elsewhere, doesn't work in Cypht, and
has no whitespace before <?xml

http://www.bitcoin.fr/feed/rss2


@jasonmunro commented on GitHub (Sep 27, 2018):

Do you have PHP curl installed?


@jasonmunro commented on GitHub (Sep 27, 2018):

Looks like we try to use curl if it's installed; otherwise we fall back to file_get_contents(). If you don't have curl, maybe this is the issue.


@ehanuise commented on GitHub (Sep 27, 2018):

It's installed: php7.1-curl 7.1.20-1+0~20180910100430.3+jessie~1.gbp17c613


@jasonmunro commented on GitHub (Sep 27, 2018):

Darn, I thought we might have a hit on that one :) I saw the leading whitespace issues from googling the error as well, but that still does not explain why it works fine here :/


@ehanuise commented on GitHub (Sep 27, 2018):

Sorry, I can't help much more at this point :)

Maybe a change between PHP versions?

I'll email you privately a copy of phpinfo();


@jasonmunro commented on GitHub (Sep 27, 2018):

It's looking like this is not a bug in Cypht; however, we could do a few things better to help diagnose issues like this:

  • We should capture the HTTP status code and log it or return it as a user message.
  • We should trim the feed response XML just in case it has leading whitespace.
  • On XML parsing failures, we should debug-log some of the XML, maybe even just the first 1024 bytes.
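The three points above can be sketched roughly as follows. This is a Python illustration under assumptions, not the change that went into Cypht (which is PHP); the function name `parse_feed_response` and the logger name are hypothetical.

```python
import logging
import xml.etree.ElementTree as ET

log = logging.getLogger("feed_debug")  # hypothetical logger name


def parse_feed_response(status: int, body: bytes):
    """Return the parsed XML root element, or None after logging why it failed."""
    # 1. Capture the HTTP status code so a 403 or 404 is visible, not hidden
    #    behind a confusing XML parse error.
    if not 200 <= status < 300:
        log.warning("feed request returned HTTP %d", status)
        return None
    # 2. Trim the response in case it has leading whitespace.
    trimmed = body.strip()
    try:
        return ET.fromstring(trimmed)
    except ET.ParseError as exc:
        # 3. On a parse failure, log only the first 1024 bytes of the payload.
        log.debug("XML parse error: %s; body starts with: %r", exc, trimmed[:1024])
        return None
```

Checking the status first matters most here: an error page served with a non-200 status never reaches the XML parser at all.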

@jasonmunro commented on GitHub (Sep 27, 2018):

Better debugging added in github.com/jasonmunro/cypht@354536bf13. @ehanuise, I will leave this open for a while in case you run into further issues!


@dumblob commented on GitHub (Sep 27, 2018):

To cover cases where the RSS feed (the XML) is invalid, we could switch from an XML parser to an HTML parser, which is far more tolerant of mistakes. Anything that uses libxml2 at its core should be able to use the built-in HTMLparser API, which parses HTML 4.0.

I'm not sure, though, how difficult this switch would be or how high a priority it is to make the RSS parser more tolerant.


@jasonmunro commented on GitHub (Sep 27, 2018):

@dumblob not a bad idea. For the record, this was not a badly formatted feed, but a 403 permission-denied response with a small HTML payload. I track about 12 feeds and don't recall seeing any badly formatted XML over the last few years (though maybe I just have not noticed, and YMMV since that is not a very large sample).

For now I think the additional debugging will shine some light on potentially problematic feeds, and if we decide to use something more forgiving as a fallback for bad formatting we can look more closely into it.


@ehanuise commented on GitHub (Sep 28, 2018):

OK, I dug a bit further.
On http://www.lalibre.be/rss.xml or any other part of that site, I get a Varnish 403 error. Looks like a problem on their end with my server and fixed IP - I contacted them to investigate.

I also tried http://www.bitcoin.fr/feed This one is more interesting for our purposes here: when I try to open it in lynx or w3m from the server, it receives an HTML file and offers to download it. The file is in fact the correctly formed RSS XML feed.
Other RSS readers process this OK, but Cypht misses it and can't open the feed.

@ehanuise commented on GitHub (Sep 28, 2018):

Might be a CURL referrer issue:

https://unix.stackexchange.com/questions/139698/why-would-curl-and-wget-result-in-a-403-forbidden
https://stackoverflow.com/questions/26173689/curl-not-able-to-download-image-file-from-server-running-varnish-cache

@jasonmunro commented on GitHub (Sep 28, 2018):

Reproduced and fixed in https://github.com/jasonmunro/cypht/commit/e7489ed1ca9862bb2400386d74dbcfff8a8411af. The issue was that we were not following HTTP redirects, which we should :) @ehanuise you can get this fix, and all the additional debugging I have added recently, by downloading and copying in this file:

https://raw.githubusercontent.com/jasonmunro/cypht/master/modules/feeds/hm-feed.php

Thanks for the great feedback on feeds - already several great improvements thanks to your reports!
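
For readers wondering what "following HTTP redirects" involves, here is a minimal Python sketch. It is not the code from the commit above, and `fetch` is a hypothetical callable standing in for a real HTTP client. It returns `(status, headers, body)` for a URL:

```python
from urllib.parse import urljoin

# Status codes that tell the client to retry at the URL in the Location header.
REDIRECT_CODES = {301, 302, 303, 307, 308}


def fetch_following_redirects(fetch, url, max_redirects=5):
    """Call fetch(url) repeatedly, following Location headers.

    Returns (final_url, status, body). Raises RuntimeError if the chain
    is longer than max_redirects, which also guards against loops.
    """
    for _ in range(max_redirects + 1):
        status, headers, body = fetch(url)
        location = headers.get("Location")
        if status not in REDIRECT_CODES or not location:
            return url, status, body
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    raise RuntimeError(f"more than {max_redirects} redirects")
```

Without a step like this, the client hands the body of the 3xx response to the XML parser, and that body is usually empty or a short HTML page rather than the feed. HTTP libraries generally offer this as a switch (PHP's cURL bindings have `CURLOPT_FOLLOWLOCATION`, for example), so hand-rolling the loop is rarely necessary.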


@ehanuise commented on GitHub (Sep 29, 2018):

Thanks. The http://www.bitcoin.fr/feed feed now works, looks all good so far :)
I will add other feeds and report any issues that aren't tied to the Varnish 403.
I added 50 feeds, so I can see what it's like with a loaded set of feeds. I created an improvement ticket for the feeds UI with some suggestions ;-)
Only the lalibre feed still eludes me - I'll use feedburner to bypass that 403 issue.


@jasonmunro commented on GitHub (Oct 17, 2018):

@ehanuise I'm going to close this since I think all issues in this thread are resolved. If not, please feel free to open a new issue around any specific problem remaining. Thanks for the feedback!
