Are question marks (?) allowed in URL passwords (netloc)?

Time:11-01

I've run into a bug in my code that boils down to unexpected behaviour in Python's urllib.parse.urlparse. It happens when a password in a URL contains ?. I'm hesitant to call this a bug because I'm not 100% certain the URLs are syntactically correct. For example:

https://user:pass?[email protected]/path

I'd expect this to parse as:

ParseResult(scheme='https', netloc='user:pass?[email protected]', path='/path', params='', query='', fragment='')

But it actually parses as:

ParseResult(scheme='https', netloc='user:pass', path='', params='', query='[email protected]/path', fragment='')
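
A minimal way to reproduce this, assuming a placeholder password `pass?word` and host `example.com` (both hypothetical, standing in for the real values):

```python
from urllib.parse import urlparse

# Hypothetical URL: the password "pass?word" contains an unencoded "?".
url = "https://user:pass?word@example.com/path"

result = urlparse(url)

# urlparse ends the netloc at the first "/", "?" or "#",
# so everything after the "?" is treated as the query string.
print(result.netloc)  # user:pass
print(result.path)    # (empty string)
print(result.query)   # word@example.com/path
```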

These URLs are being automatically formatted elsewhere, and the use of ? in the password needs to be supported. Interestingly enough, the URLs are actually SQLAlchemy connection strings, and SQLAlchemy and psycopg2 both interpret them as expected.


Question

Are question marks (?) allowed in a URL password (netloc)? Ideally, answers would refer to the appropriate RFC wording.

Or is Python's behaviour here correct?

User response:

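Section 3.2 of RFC 2396 specifies that ? is reserved within the authority component, which means it cannot be used there without encoding (see Section 2.2). The same section says the authority component is terminated by the next /, ?, or the end of the URI. As far as I understand, the authority component is the part that contains the user name and password of an HTTP-scheme URI (see Section 3.2.2). So a literal ? in the password is not valid, Python's behaviour is correct, and the ? has to be percent-encoded as %3F. RFC 3986, which obsoletes RFC 2396, keeps the same rule: its userinfo grammar does not allow an unencoded ?.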
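
As a rough sketch of what the encoding requirement means in practice: percent-encoding the password before building the URL turns ? into %3F, and urlparse then keeps the whole userinfo inside the netloc. The credentials and host below are hypothetical placeholders, not values from the question.

```python
from urllib.parse import quote, unquote, urlparse

# Hypothetical credentials and host, for illustration only.
user = "user"
password = "pass?word"
host = "example.com"

# safe="" makes quote() encode every reserved character, including "/" and "?".
encoded_password = quote(password, safe="")  # "pass%3Fword"
url = f"https://{quote(user, safe='')}:{encoded_password}@{host}/path"

result = urlparse(url)
print(result.netloc)             # user:pass%3Fword@example.com
print(result.path)               # /path

# urlparse does not decode the password; unquote() restores the original.
print(unquote(result.password))  # pass?word
```

Whether a given consumer of the URL decodes %3F back to ? is worth checking against that library's documentation before relying on it.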
