NameError: name 'data' is not defined (Python)

I am running this code in Python, and I don't know why it always raises an error:

import string
import nltk
from sklearn.pipeline import Pipeline
import pandas as pd
import numpy as np
import re

data = pd.read_csv(r'C:\Users\Prihantoro Tri N\OneDrive\Documents\file toro\MSIB\Magang\Hukumonline\Project\youtube comments\dataset_komentar_instagram_cyberbullying.csv', sep=',', encoding='utf-8')

def casefolding(comment):
    comment = comment.lower()
    comment = comment.strip(" ")
    comment = re.sub(r'[?|$|.|!_:")(- ,]', '', comment)
    return comment
data['comment'] = data['comment'].apply(casefolding)
data.head(100)

It gives the following error:

NameError                                 Traceback (most recent call last)
Input In [3], in <cell line: 8>()
      6     comment = re.sub(r'[?|$|.|!_:")(- ,]', '', comment)
      7     return comment
----> 8 data['comment'] = data['comment'].apply(casefolding)
      9 data.head(100)

NameError: name 'data' is not defined

Sometimes the result is a different error instead: KeyError: 'comment'

CodePudding user response：

I think your DataFrame doesn't have a "comment" column, so please check all the columns in your DataFrame. Try running data.columns to see the actual column names.
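The column check can be sketched as follows. This is only an illustration: the CSV text is a made-up stand-in for the real file, and the assumption that the header differs from 'comment' only by whitespace or letter case may not match your data.

```python
import io
import pandas as pd

# Hypothetical stand-in for the real CSV: the header has stray spaces and a capital letter
csv_text = " Comment ,label\nHello World!,0\n"
data = pd.read_csv(io.StringIO(csv_text))

# Inspect the actual column names before indexing with data['comment']
print(list(data.columns))

# Normalize the names (assumption: only whitespace and case differ)
data.columns = data.columns.str.strip().str.lower()
has_comment = 'comment' in data.columns
print(has_comment)
```

If the printed list shows a different name altogether, use that name in place of 'comment' or rename the column with data.rename.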

CodePudding user response：

# Let's say your data comes from a CSV file
import pandas as pd
import re

df = pd.read_csv('your_csv_file.csv')
data = pd.DataFrame(df)  # optional: read_csv already returns a DataFrame

def casefolding(comment):
    comment = comment.lower()
    comment = comment.strip(" ")
    # The hyphen is escaped so "(- " is not read as an invalid character range
    comment = re.sub(r'[?|$|.|!_:")(\- ,]', '', comment)
    return comment

data['comment'] = data['comment'].apply(casefolding)

data.head(100)
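As for the NameError itself: in a notebook, a name such as data exists only after the cell that assigns it has run successfully in the current kernel session. The traceback points at line 8 of its cell, so one likely cause (an assumption, since the notebook itself is not shown) is that read_csv lives in another cell that was not run or that failed. A minimal sketch that keeps everything in one cell, using a small made-up DataFrame in place of the CSV:

```python
import re
import pandas as pd

def casefolding(comment):
    # Lowercase, trim spaces, then drop the listed punctuation and spaces.
    # The hyphen is escaped so it is treated as a literal character.
    comment = str(comment).lower().strip(" ")
    return re.sub(r'[?|$.!_:")(\- ,]', '', comment)

# Made-up sample data; in practice this would come from pd.read_csv(...)
data = pd.DataFrame({'comment': ['Hello, World!', ' Good-Bye :) ']})

# Fail early with a readable message instead of a bare KeyError
if 'comment' not in data.columns:
    raise KeyError(f"no 'comment' column; found {list(data.columns)}")

data['comment'] = data['comment'].apply(casefolding)
result = data['comment'].tolist()
print(result)
```

Defining data and using it in the same cell means a failed read_csv shows its own error immediately rather than surfacing later as a NameError.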