How to replace strings with int in sublist?

Time:12-03

I'm trying to split a string column at the first special character, '/' or '_', whichever comes first in the string.

Here is a sample of the dataset (df2). The values in Unnamed: 1 are Korean place descriptions, e.g. 자택 means "home":

     4. 발견장소 코딩사유    Unnamed: 1
1    67488                  교외/야산_등산로 계곡 앞
2    100825                 자택_자택 방안 텐트
3    101199                 숙박업소_게스트하우스 21층 복도

I converted the Unnamed: 1 column like this:

     4. 발견장소 코딩사유    Unnamed: 1
1    67488                  [[교외],[야산_등산로 계곡 앞]]
2    100825                 [[자택],[자택 방안 텐트]]
3    101199                 [[숙박업소],[게스트하우스 21층 복도]]

Now I want to replace the first string in each sublist with a number. Here is the desired output:

     4. 발견장소 코딩사유    Unnamed: 1
1    67488   [[7],[야산_등산로 계곡 앞]]
2    100825  [[1],[자택 방안 텐트]]
3    101199  [[6],[게스트하우스 21층 복도]]

What I tried:

for i in range(1,122):
  # extract the KEY value
  index_key = df2['4. 발견장소 코딩사유'][i]

  # extract the FIND_PLACETP value
  index_rawdata = df.loc[df['KEY'] == index_key,'FIND_PLACETP'].index[0]
  num = df['FIND_PLACETP'][index_rawdata]

  # text split
  findplace = df2['Unnamed: 1'].str.split('/|_',expand=False)

  # replace words with numbers
  findplace[i][0] = findplace[i][0].str.replace('자택','1')
  findplace[i][0]

  findplace[i][0].str.replace(dict(zip(['자택','친척집','지인집','학교|직장','공공장소','숙박업소','교외|야산','병원','기타'],
                                   [1,2,3,4,5,6,7,8,9])),regex=True)

but it raised this error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-50-b8929ebb1248> in <module>
     11 
     12   # replace words with numbers
---> 13   findplace[i][0] = findplace[i][0].str.replace('자택','1')
     14   findplace[i][0]
     15   # findplace[i][0].str.replace(dict(zip(['자택','친척집','지인집','학교|직장','공공장소','숙박업소','교외|야산','병원','기타'],

AttributeError: 'str' object has no attribute 'str'

CodePudding user response:

findplace is a Series whose values are lists (the result of str.split with expand=False). findplace[i] looks up the row labelled i and returns that list, and findplace[i][0] is the first element of the list: a plain Python string. The .str accessor only exists on a pandas Series or Index, not on a str, which is why you get AttributeError: 'str' object has no attribute 'str'.

It isn't clear whether you want to replace the value in a single row or in every row.

For every row, take the first element of each list with .str[0] and call .str.replace on the resulting Series:

first = findplace.str[0].str.replace('자택','1')

For a single row i, call the built-in str.replace directly:

findplace[i][0] = findplace[i][0].replace('자택','1')
#findplace[i][0] is a string so replace can be applied directly
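
To get the desired output for all categories at once, a dictionary lookup may be simpler than chaining replace calls. The following is a minimal sketch in plain Python, not the asker's code: encode_place and PLACE_CODES are hypothetical names, the sample strings are copied from the question, and the codes follow the question's zip(...) call. Splitting '학교|직장' and '교외|야산' into separate keys with the same code is an assumption about what those patterns were meant to do.

```python
import re

# Category codes from the question; alternations such as '교외|야산'
# are assumed to mean "either label maps to the same code".
PLACE_CODES = {
    "자택": 1, "친척집": 2, "지인집": 3,
    "학교": 4, "직장": 4, "공공장소": 5,
    "숙박업소": 6, "교외": 7, "야산": 7,
    "병원": 8, "기타": 9,
}

def encode_place(text, codes=PLACE_CODES):
    """Split at the first '/' or '_' and replace the leading label with its code."""
    parts = re.split(r"[/_]", text, maxsplit=1)
    head = parts[0]
    tail = parts[1] if len(parts) > 1 else ""
    # Labels missing from the mapping are left unchanged.
    return [[codes.get(head, head)], [tail]]

# Sample values copied from the question.
rows = [
    "교외/야산_등산로 계곡 앞",
    "자택_자택 방안 텐트",
    "숙박업소_게스트하우스 21층 복도",
]
encoded = [encode_place(r) for r in rows]
```

On the DataFrame itself the same function could be applied row by row with df2['Unnamed: 1'].apply(encode_place), which avoids the explicit for loop over index labels.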