In an Excel file I have 5 columns and 20 rows, and one of the columns contains text data, as shown below.
df['Content'] contains:
0 this is the final call
1 hello how are you doing
2 this is me please say hi
..
.. and so on
I want to create bigrams from the text in each row while keeping them attached to the original table.
I tried applying the function below to each row:
def find_bigrams(input_list):
    bigram_list = []
    for i in range(len(input_list)-1):
        bigram_list.append(input_list[1:])
    return bigram_list
Then I tried assigning the result back to the column using:
df['Content'] = df['Content'].apply(find_bigrams)
But I am getting the following output instead:
0 None
1 None
2 None
I am expecting output like this:
  Company  Code   Content
0 xyz      uh-11  (this,is),(is,the),(the,final),(final,call)
1 abc      yh-21  (hello,how),(how,are),(are,you),(you,doing)
CodePudding user response:
Your input_list is not actually a list; it is a string, so it has to be split into words before pairs can be built.
Try the function below:
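def find_bigrams(input_text):
    # Split the sentence into words
    input_list = input_text.split(" ")
    # Pair each word with the next one and join each pair as "word1,word2"
    bigram_list = list(map(','.join, zip(input_list[:-1], input_list[1:])))
    return bigram_list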
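If tuples are wanted rather than comma-joined strings, as in the expected output in the question, a minimal sketch could look like the one below. The sample DataFrame and the column name Bigrams are assumptions made for illustration; writing to a new column leaves the original Content text untouched.

```python
import pandas as pd

# Sample data modelled on the question; the real frame would be read from the Excel file.
df = pd.DataFrame({
    "Company": ["xyz", "abc"],
    "Code": ["uh-11", "yh-21"],
    "Content": ["this is the final call", "hello how are you doing"],
})

def find_bigrams(input_text):
    """Return the adjacent word pairs of a string as a list of tuples."""
    words = input_text.split()
    # zip pairs each word with its successor and stops when the shorter input ends
    return list(zip(words, words[1:]))

# "Bigrams" is a hypothetical column name; assign to df["Content"] to overwrite instead.
df["Bigrams"] = df["Content"].apply(find_bigrams)
print(df[["Company", "Code", "Bigrams"]])
```

This sketch assumes every cell in Content is a string; empty cells read from Excel would need to be filled or skipped first.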
CodePudding user response:
You can use itertools.permutations(), where s is the text column (for example df['Content']):
import itertools
s.str.split().map(lambda x: list(itertools.permutations(x,2))[::len(x)])
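The slice works because itertools.permutations emits pairs in index order, so for a list of n words the adjacent pair (i, i+1) lands at position i*n. The sketch below checks that against a plain zip on one sample sentence; the two helper names are hypothetical and not part of the answer above.

```python
import itertools

def bigrams_via_permutations(words):
    # Build all ordered pairs, then keep positions 0, n, 2n, ... where n = len(words)
    return list(itertools.permutations(words, 2))[::len(words)]

def bigrams_via_zip(words):
    # Pair each word with its successor directly
    return list(zip(words, words[1:]))

words = "this is the final call".split()
print(bigrams_via_permutations(words))
print(bigrams_via_zip(words))
```

The permutations version builds all n*(n-1) pairs before slicing, so zip is probably the cheaper choice for long texts. It also raises ValueError on an empty word list, because the slice step would be zero.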