Make WordPress Core

Opened 10 years ago

Closed 10 years ago

Last modified 10 years ago

#30099 closed defect (bug) (invalid)

fetch_feed does not properly parse URLs with query strings

Reported by: leereamsnyder
Owned by:
Milestone:
Priority: normal
Severity: normal
Version: 4.0
Component: General
Keywords:
Focuses:
Cc:

Description

Using fetch_feed with this url:

http://www.ibm.com/developerworks/views/global/rss/libraryview.jsp?&contentarea_by=global&topic_by=BlueMix&product_by=&type_by=All%20Types&search_by=&industry_by=&sort_by=Date&series_title_by=

the function Simplepie::IRI::parse_iri parses it into this (output of print_r( $match )):

Array
(
    [0] => http://www.ibm.com/developerworks/views/global/rss/libraryview.jsp?contentarea_by=global&topic_by=BlueMix&product_by=&type_by=All%20Types&search_by=&industry_by=&sort_by=Date&series_title_by=
    [1] => http:
    [scheme] => http
    [2] => http
    [3] => //www.ibm.com
    [authority] => www.ibm.com
    [4] => www.ibm.com
    [path] => /developerworks/views/global/rss/libraryview.jsp
    [5] => /developerworks/views/global/rss/libraryview.jsp
    [6] => ?contentarea_by=global&
    [query] => contentarea_by=global&
    [7] => contentarea_by=global&
    [8] => #038;topic_by=BlueMix&product_by=&type_by=All%20Types&search_by=&industry_by=&sort_by=Date&series_title_by=
    [fragment] => 038;topic_by=BlueMix&product_by=&type_by=All%20Types&search_by=&industry_by=&sort_by=Date&series_title_by=
    [9] => 038;topic_by=BlueMix&product_by=&type_by=All%20Types&search_by=&industry_by=&sort_by=Date&series_title_by=
)

The $match[query] value stops after the first parameter, and the rest is incorrectly thrown into $match[fragment].

Weirder: the first ampersand (but only the first) appears to be HTML-encoded to '&#038;'. That entity contains a '#', which is why the query ends there and the fragment starts with '038;'. I can't for the life of me explain where in parse_iri it's being escaped.
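
The split described above can be reproduced outside WordPress. The following is a minimal sketch in Python, not SimplePie's code: it assumes the generic URI pattern from RFC 3986, Appendix B as a stand-in for whatever expression parse_iri uses, and it uses a shortened version of the URL from this ticket. The names URI_RE and split_uri are hypothetical.

```python
import re

# Generic URI pattern from RFC 3986, Appendix B.
# Assumption: this stands in for the expression used by parse_iri.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def split_uri(uri):
    """Split a URI into its five generic components (hypothetical helper)."""
    match = URI_RE.match(uri)
    return {
        "scheme": match.group(2),
        "authority": match.group(4),
        "path": match.group(5),
        "query": match.group(7),
        "fragment": match.group(9),
    }


# Shortened form of the ticket's URL, with the ampersand already
# HTML-encoded as "&#038;" before it reaches the parser.
encoded = (
    "http://www.ibm.com/developerworks/views/global/rss/libraryview.jsp"
    "?contentarea_by=global&#038;topic_by=BlueMix"
)

parts = split_uri(encoded)
print(parts["query"])     # contentarea_by=global&
print(parts["fragment"])  # 038;topic_by=BlueMix
```

Under that assumption the parser is behaving correctly: the '#' inside the entity is a legal fragment delimiter, so the query has to end there.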

Change History (3)

#1 @voldemortensen
10 years ago

I am unable to reproduce this with the given URL. I copied the parse_iri function into a separate file for testing and passed it the URL given in the ticket. The return value was normal.

Here is the test I ran. The first URL is the value before the trim(), the second URL is the value after the trim(), and the array is the output of print_r( $match ).

Example: http://jmortensen.bluehoststaff.com/IRI

It seems to me that the encoding of the URL is happening somewhere else.

#2 @leereamsnyder
10 years ago

  • Resolution set to invalid
  • Status changed from new to closed

Per your comment, I did some further testing by disabling other plugins, and it appears another filter was HTML-encoding the URL upstream. That was the source of the problem.
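
For anyone who hits the same symptom, one quick way to confirm that upstream HTML-encoding is the cause is to compare the parsed query before and after decoding entities. This is a rough sketch in Python rather than WordPress code; it again assumes the RFC 3986, Appendix B pattern, uses a shortened URL, and query_of is a hypothetical helper.

```python
import html
import re

# Generic URI pattern from RFC 3986, Appendix B (assumed stand-in parser).
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def query_of(url):
    """Return only the query component of a URL (hypothetical helper)."""
    return URI_RE.match(url).group(7)


# Shortened URL as it looks after an upstream filter HTML-encoded it.
encoded = (
    "http://www.ibm.com/developerworks/views/global/rss/libraryview.jsp"
    "?contentarea_by=global&#038;topic_by=BlueMix&sort_by=Date"
)

# Turn "&#038;" back into a plain "&" before parsing.
decoded = html.unescape(encoded)

print(query_of(encoded))  # contentarea_by=global&
print(query_of(decoded))  # contentarea_by=global&topic_by=BlueMix&sort_by=Date
```

If the decoded URL parses with the full query, the URL was encoded for display somewhere before it reached the feed fetcher, and the fix belongs in that filter rather than in the parser.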

Thanks for looking into this.

#3 @dd32
10 years ago

  • Milestone Awaiting Review deleted