adding a symbol based on condition pandas

Time:07-02

I have a pandas DataFrame with a column called fips, a code that identifies a particular county on a map. After merging data from a dictionary, I found that some counties have a 4-digit code, while it should always be 5 digits. Luckily, I can just add a zero at the beginning of every 4-digit code to get a correct 5-digit code.

Now I need to write a condition that adds a zero to a cell if its length is 4 and does nothing if its length is 5. I wrote a lambda function for that:

df['fips'] = df['fips'].apply(lambda x: '0' + df.fips if df.fips.str.len() == 4 else df.fips)

But it does not work and throws this error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Any ideas what I am doing wrong here? The fips column has object dtype.

CodePudding user response:

How did you get the data frame? If it comes from a CSV file, you can read the column as strings so the leading zeros are kept:

df = pd.read_csv('file.csv', dtype={'fips':'str'})
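To see why the dtype argument matters, here is a minimal sketch using an in-memory CSV instead of a real file. The two rows and the county name "Example" are made up for illustration; only the column name fips comes from the question.

```python
import io

import pandas as pd

# Hypothetical two-row CSV; the first FIPS code starts with a zero.
csv_text = "fips,county\n01001,Autauga\n12345,Example\n"

# Default parsing infers an integer column, so the leading zero is lost.
as_int = pd.read_csv(io.StringIO(csv_text))

# Forcing a string dtype keeps the codes exactly as written in the file.
as_str = pd.read_csv(io.StringIO(csv_text), dtype={"fips": "str"})

print(as_int["fips"].tolist())
print(as_str["fips"].tolist())
```

If the data never passes through an integer column, the 4-digit codes should not appear in the first place.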

Now, back to your problem, you can just do:

df['fips'] = df['fips'].astype(str).str.zfill(5)

Note: astype(str) turns NaN values into the string 'nan', which is then padded to '00nan', so you might want to check for missing data first.
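As for the error itself: inside apply, the lambda receives one cell at a time as x, so the condition should test x rather than the whole df.fips column. Comparing a whole Series inside an if is what raises the "truth value of a Series is ambiguous" error. The sketch below shows a row-wise version and a vectorized one that leaves missing values alone. It assumes the cells are already strings; the sample values are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: a 4-digit code, a 5-digit code, and a missing value.
df = pd.DataFrame({"fips": ["1001", "12345", np.nan]})

# Row-wise version of the original idea: x is a single cell, not the column,
# so len(x) is a plain integer and the condition is unambiguous.
by_apply = df["fips"].apply(
    lambda x: "0" + x if isinstance(x, str) and len(x) == 4 else x
)

# Vectorized version: .str.zfill pads on the left to width 5
# and keeps missing values as missing instead of turning them into text.
by_zfill = df["fips"].str.zfill(5)

print(by_apply.tolist())
print(by_zfill.tolist())
```

Skipping astype(str) keeps NaN as NaN, but it only works when the column holds strings; if some cells are integers, convert those first.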
