pandas dataframe not creating new column

Time:10-26

I have a CSV file with a domain column. For my project I want to apply a rule based on the domain names and store the result in a new column named new_url. If the domain contains .cdn., new_url should be the string before .cdn.; otherwise the url_parser library should parse the domain instead. The problem is that the CSV file I write (cleanurl.csv) has no new_url column. When I print the parsed URLs in the loop I can see them, so both the if and the else branch are working. Could you help me, please?


import pandas as pd 
import url_parser
from url_parser import parse_url,get_url,get_base_url
import numpy as np 

df = pd.read_csv("C:\\Users\\myuser\\Desktop\\raw_data.csv", sep=';')

i=-1
for x in df['domain']:

    i=i+1
    print("*",x,"*") 

    if '.cdn.' in x:
        parsed_url=x.split('.cdn')[0]
        print(parsed_url)
        df.iloc[i]['new_url']=parsed_url
       
    else:
        parsed_url=get_url(x).domain + '.' + get_url(x).top_domain
        print(parsed_url)
        df.iloc[i]['new_url']=parsed_url

df.to_csv("C:\\Users\\myuser\\Desktop\\cleanurl.csv", sep=';')

CodePudding user response:

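df.iloc[i]['new_url'] = ... is chained indexing: the assignment lands on a temporary row Series rather than on df, so the column is never created. Use .loc[row, 'column'] to create the new column: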

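for idx, x in df['domain'].items():
    if '.cdn.' in x:
        # keep only the part before ".cdn."
        parsed_url = x.split('.cdn')[0]
    else:
        # same url_parser call as in the question
        parsed_url = get_url(x).domain + '.' + get_url(x).top_domain
    # one .loc call with row label and column name writes into df itself
    df.loc[idx, 'new_url'] = parsed_url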
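
If the loop is not needed for anything else, the whole column can also be built in one assignment with Series.apply. The following is only a sketch under stated assumptions: clean_domain is a hypothetical helper, the sample domains are made up, and the else branch uses a simple "last two labels" stand-in rather than the url_parser call from the question, so that the example runs without that library.

```python
import io

import pandas as pd


def clean_domain(domain: str) -> str:
    """Hypothetical helper mirroring the rule described in the question."""
    if '.cdn.' in domain:
        # Keep only the part before ".cdn."
        return domain.split('.cdn')[0]
    # Stand-in for the url_parser call (an assumption, not its real logic):
    # keep the last two dot-separated labels.
    return '.'.join(domain.split('.')[-2:])


# Made-up sample data in place of raw_data.csv
df = pd.DataFrame({'domain': ['assets.cdn.example.com', 'www.example.org']})

# A single assignment creates the whole column on df itself
df['new_url'] = df['domain'].apply(clean_domain)

# Write to an in-memory buffer to check that the column reaches the CSV
buf = io.StringIO()
df.to_csv(buf, sep=';', index=False)
csv_text = buf.getvalue()
print(csv_text)
```

In the real script the stand-in line would be swapped back for the get_url(...) expression, and to_csv would write to the cleanurl.csv path instead of the buffer.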