I would like to match all cells that begin with the number 978, but the following code also matches 397854 and nan.
an_transaction_product["kniha"] = np.where(an_transaction_product["zbozi_ean"].str.contains('^978', regex=True) , 1, 0)
What am I doing wrong?
CodePudding user response:
This doesn't work because .str.contains will check if the regex occurs anywhere in the string.
If you insist on using regex, .str.match does what you want.
But for this simple case .str.startswith("978") is clearer.
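A minimal sketch of both suggestions, assuming the column holds strings plus some missing values. The sample data is made up; only the column names come from the question. One detail worth checking in your own data: on an object-dtype column these string methods return NaN for missing values unless na=False is passed, and np.where treats NaN as truthy, which would explain the nan rows being flagged.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column name comes from the question.
df = pd.DataFrame(
    {"zbozi_ean": ["9781234567890", "397854", np.nan, "9780000000001"]}
)

# Plain prefix test, no regex; na=False counts missing values as "no match".
starts_978 = df["zbozi_ean"].str.startswith("978", na=False)

# Regex alternative: str.match only matches at the start of each string.
matches_978 = df["zbozi_ean"].str.match("978", na=False)

# Boolean mask -> 1/0 flag column, as in the question.
df["kniha"] = starts_978.astype(int)
```

Both masks are plain booleans here, so they can also be passed to np.where as in the original code.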
CodePudding user response:
Apart from regex, you can use .loc to find cells that start with '978'. The code below will assign 1 to such cells in column 'A', just as an example:
df.loc[df['A'].astype(str).str[:3]=='978', 'A'] = 1
Note: astype(str) converts the number to a string, str[:3] takes the first 3 characters, and the result is then compared to '978'.
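Here is a small sketch of the same idea that writes the flag to a separate column instead of overwriting 'A'. The data and the column name 'flag' are invented for illustration. It assumes numeric values small enough to print without scientific notation, which holds for 13-digit EANs.

```python
import numpy as np
import pandas as pd

# Hypothetical numeric data; the NaN makes the column float64.
df = pd.DataFrame({"A": [9781234567890, 397854, np.nan]})

# Default every row to 0, then set 1 where the text form starts with '978'.
df["flag"] = 0
df.loc[df["A"].astype(str).str[:3] == "978", "flag"] = 1
```

The missing value never compares equal to '978', so that row keeps the default 0.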