Detect language in pandas column in python

Time:01-17

I would like to detect the language of each row in a pandas column and write the language code to a new column in the same DataFrame. Below is the code I tried, but it raises an error. Please help.

Thank you.

  import pandas as pd
  from textblob import TextBlob

  data = {'text': ["It is a good option", "Better to have this way", "es un portal informático para geeks", "は、ギーク向けのコンピューターサイエンスポータルです"]}
  # Create DataFrame
  df = pd.DataFrame(data)

  # Get the language
  for i in df['text']:
      # Language detection
      df['lang'] = TextBlob(i)


CodePudding user response:

You can use the langdetect library for language detection in Python. Install it first:

pip install langdetect

Then apply detect to each row of the text column:

import pandas as pd
from langdetect import detect

data = {'text':  ["It is a good option","Better to have this way","es un portal informático para geeks","は、ギーク向けのコンピューターサイエンスポータルです"]}

df = pd.DataFrame(data)

# detect takes a single string, so it can be passed to apply directly
df['lang'] = df['text'].apply(detect)

CodePudding user response:

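I think this will be enough, as long as your TextBlob version still has detect_language(). Wrapping the text in TextBlob alone only creates a TextBlob object; you have to call the method to get the language code. Note that detect_language() relies on an online translation service, was deprecated in TextBlob 0.16.0, and is missing from newer releases, so this only works on older versions.

# Get the language
df['lang'] = df['text'].apply(lambda t: TextBlob(t).detect_language())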