I have a pandas DataFrame with a column called fips, which holds the code used to locate a particular county on a map. After merging in data from a dictionary, I found that some counties have a 4-digit code, while it should always be 5 digits. Luckily, I can just add a zero at the beginning of every 4-digit code to get a correct 5-digit code.
Now I need a condition that adds a zero to a cell if its length is 4 and does nothing if its length is 5. I wrote a lambda function for that:
df['fips'] = df['fips'].apply(lambda x: '0' + df.fips if df.fips.str.len() == 4 else df.fips)
But it does not work and raises this error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Any idea what I am doing wrong here? The fips column has object dtype.
CodePudding user response:
How did you get the data frame? If it comes from a csv file, you can just do:
df = pd.read_csv('file.csv', dtype={'fips':'str'})
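As a rough illustration of why the dtype argument matters, here is a minimal sketch. The CSV text and the county column are made up for the example and stand in for the real file; the point is only that the default parser reads the codes as integers and drops the leading zero, while dtype keeps them as text.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for 'file.csv'
csv_text = "fips,county\n01001,Autauga\n06037,Los Angeles\n"

# Default parsing infers an integer column, so the leading zero is lost
as_default = pd.read_csv(io.StringIO(csv_text))

# Forcing the column to str keeps the codes exactly as written in the file
as_text = pd.read_csv(io.StringIO(csv_text), dtype={"fips": str})

print(as_default["fips"].tolist())
print(as_text["fips"].tolist())
```

If the codes are read as strings from the start, no padding step should be needed afterwards.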
Now, back to your problem, you can just do:
df['fips'] = df['fips'].astype(str).str.zfill(5)
Note: the above solution can turn NaN values into the string 'nan' (which zfill then pads to '00nan'), so you might want to double-check for missing data.
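As for the error in the question: inside the lambda, x is a single cell value, but the code refers to the whole df.fips Series, so df.fips.str.len() == 4 produces a boolean Series that an if cannot evaluate. If the column already holds strings (as the object dtype suggests), a sketch like the one below pads the codes without calling astype(str), so missing values stay missing. The sample data and the pad_cell helper are assumptions made up for the example.

```python
import numpy as np
import pandas as pd

# Made-up sample: one 4-digit code, one 5-digit code, one missing value
df = pd.DataFrame({"fips": ["1001", "06037", np.nan]})

# Vectorised: .str.zfill pads each string and leaves missing values untouched
padded = df["fips"].str.zfill(5)

# Element-wise version of the original idea; x is one cell, not the Series
def pad_cell(x):
    return x.zfill(5) if isinstance(x, str) else x

padded_apply = df["fips"].apply(pad_cell)

print(padded.tolist())
print(padded_apply.tolist())
```

The vectorised .str.zfill form is usually the simpler choice; the apply version is shown only to mirror the lambda from the question.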