I don't know why this is happening in my Ruby session. Do you see the same behaviour?
3.1.2 :001 > ["url", "label:from", "label:type", "label:batch", "note"].index('url')
=> nil
3.1.2 :002 > ["url", "label:from", "label:type", "label:batch", "note"].index('note')
=> 4
3.1.2 :003 > ["Url", "label:from", "label:type", "label:batch", "note"].index('Url')
=> 0
It can't find 'url' when it is lowercase. Is this a reserved word?
Edit: it seems unable to find the first occurrence of the string "url":
["note", "url", "label:from", "label:type", "label:batch", "note", "url"].index 'url'
=> 6
CodePudding user response:
The first entry in your array is not what you think it is. Look at the raw bytes and you'll see:
["url", "label:from", "label:type", "label:batch", "note"].first.bytes.map { |x| x.to_s(16) }
# ["ef", "bb", "bf", "75", "72", "6c"]
The 0x75 0x72 0x6c is the "url" you see; the leading 0xef 0xbb 0xbf is a Byte Order Mark (BOM). Because the BOM is invisible when printed, the first entry looks like "url" but is not equal to it, which is why index returns nil. Byte order is meaningless in UTF-8, so a BOM serves no purpose there: it is valid but unusual and not recommended. You can have Ruby strip the BOM while reading files, if that's where the string is coming from.
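As a rough sketch of both options, assuming the headers come from a UTF-8 file: opening the file with the "r:bom|utf-8" mode asks Ruby to skip a leading BOM, and delete_prefix can clean a string that is already in memory. The helper names strip_bom and read_without_bom are made up for illustration, and the sample array is a shortened version of the one in the question.

```ruby
BOM = "\uFEFF" # U+FEFF, encoded in UTF-8 as the bytes EF BB BF

# Hypothetical helper: remove a leading BOM from a string already in memory.
def strip_bom(str)
  str.delete_prefix(BOM)
end

# Hypothetical helper: read a whole file, letting Ruby skip a leading BOM.
# The "bom|utf-8" prefix in the mode string is what triggers the stripping.
def read_without_bom(path)
  File.open(path, "r:bom|utf-8", &:read)
end

# Shortened version of the array from the question, with the BOM made explicit.
headers = ["#{BOM}url", "label:from", "note"]
cleaned = headers.map { |h| strip_bom(h) }

p headers.index("url") # the BOM-prefixed entry does not match
p cleaned.index("url") # matches once the BOM is removed
```

If the data comes from a CSV file, the same "bom|utf-8" value can usually be passed as the encoding when opening it, so the first header never picks up the BOM in the first place.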