#30099 closed defect (bug) (invalid)
fetch_feed does not properly parse URLs with query strings
Reported by: | leereamsnyder | Owned by: | |
---|---|---|---|
Milestone: | Priority: | normal | |
Severity: | normal | Version: | 4.0 |
Component: | General | Keywords: | |
Focuses: | Cc: |
Description
Using fetch_feed with this url:
http://www.ibm.com/developerworks/views/global/rss/libraryview.jsp?&contentarea_by=global&topic_by=BlueMix&product_by=&type_by=All%20Types&search_by=&industry_by=&sort_by=Date&series_title_by=
the function Simplepie::IRI::parse_iri splits it into this (output of print_r($match)):
Array
(
    [0] => http://www.ibm.com/developerworks/views/global/rss/libraryview.jsp?contentarea_by=global&topic_by=BlueMix&product_by=&type_by=All%20Types&search_by=&industry_by=&sort_by=Date&series_title_by=
    [1] => http:
    [scheme] => http
    [2] => http
    [3] => //www.ibm.com
    [authority] => www.ibm.com
    [4] => www.ibm.com
    [path] => /developerworks/views/global/rss/libraryview.jsp
    [5] => /developerworks/views/global/rss/libraryview.jsp
    [6] => ?contentarea_by=global&
    [query] => contentarea_by=global&
    [7] => contentarea_by=global&
    [8] => #038;topic_by=BlueMix&product_by=&type_by=All%20Types&search_by=&industry_by=&sort_by=Date&series_title_by=
    [fragment] => 038;topic_by=BlueMix&product_by=&type_by=All%20Types&search_by=&industry_by=&sort_by=Date&series_title_by=
    [9] => 038;topic_by=BlueMix&product_by=&type_by=All%20Types&search_by=&industry_by=&sort_by=Date&series_title_by=
)
The $match[query] value stops after the first parameter, and the rest of the query string incorrectly ends up in $match[fragment].
Weirder: the first ampersand between parameters (but only the first) appears to be HTML-encoded to '&#038;'. That entity contains a '#', which is why the query ends with '&' and the fragment starts with '038;'. I can't for the life of me explain where in parse_iri it's being escaped.
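To illustrate the split described above, here is a minimal sketch that applies the generic URI pattern from RFC 3986, Appendix B, written with named groups so that its keys line up with those in the print_r output. It is not copied from SimplePie; the helper name split_uri and the example.com URLs are made up for the illustration. It assumes only that the parser treats the first '#' as the start of the fragment.

```php
<?php
// Hypothetical helper, not SimplePie's code: split a URI with the generic
// pattern from RFC 3986, Appendix B, and return the query and fragment.
function split_uri(string $uri): array
{
    $pattern = '/^((?P<scheme>[^:\/?#]+):)?(\/\/(?P<authority>[^\/?#]*))?'
             . '(?P<path>[^?#]*)(\?(?P<query>[^#]*))?(#(?P<fragment>.*))?$/';
    preg_match($pattern, $uri, $match);

    // Groups that did not take part in the match may be absent, so default to ''.
    return [
        'query'    => $match['query'] ?? '',
        'fragment' => $match['fragment'] ?? '',
    ];
}

// Made-up URLs: the second has its first ampersand HTML-encoded as &#038;.
$clean_parts   = split_uri('http://example.com/feed.jsp?a=1&b=2&c=3');
$encoded_parts = split_uri('http://example.com/feed.jsp?a=1&#038;b=2&c=3');

print_r($clean_parts);   // query: a=1&b=2&c=3   fragment: (empty)
print_r($encoded_parts); // query: a=1&          fragment: 038;b=2&c=3
```

The second result has the same shape as the output in this ticket: the query is cut off right after the '&' of the entity, and everything after the '#' is treated as the fragment. That suggests the parser is behaving as specified and the URL already contained '&#038;' when it was parsed.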
Change History (3)
I am unable to reproduce this with the given URL. I copied the parse_iri function into a separate file for testing and passed it the URL given in the ticket. The result was normal.
Here is the test I ran. The first URL is the input before the trim(), the second URL is the input after the trim(), and the array is the output of print_r( $match ).
Example: http://jmortensen.bluehoststaff.com/IRI
It seems to me that the URL encoding is happening somewhere else, before the URL reaches parse_iri.
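If the URL is being entity-encoded before it is parsed, one way to check is to decode it and compare the parsed parts. The sketch below uses only PHP built-ins (html_entity_decode and parse_url) on a made-up example.com URL. It is an assumption about a possible workaround, not a confirmed diagnosis. In WordPress, esc_url() encodes ampersands as '&#038;' for display, which is one possible source, but this ticket does not establish that this is what happened here.

```php
<?php
// Made-up URL with its first ampersand HTML-encoded, as in the report.
$encoded = 'http://example.com/feed.jsp?a=1&#038;b=2&c=3';

// Decode HTML entities so '&#038;' becomes a plain '&' again.
$decoded = html_entity_decode($encoded, ENT_QUOTES, 'UTF-8');

// Compare how PHP's own URL parser sees both versions.
$encoded_query    = parse_url($encoded, PHP_URL_QUERY);    // a=1&
$encoded_fragment = parse_url($encoded, PHP_URL_FRAGMENT); // 038;b=2&c=3
$decoded_query    = parse_url($decoded, PHP_URL_QUERY);    // a=1&b=2&c=3

echo $decoded, "\n"; // http://example.com/feed.jsp?a=1&b=2&c=3
```

If the decoded URL parses with the full query string, the fix belongs wherever the URL was escaped, for example by passing the raw URL to fetch_feed rather than one prepared for HTML output.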