Return all words in a dataframe column in lower case

Time:03-17

I want to convert all the words in the 'Split Tweets' column to lower case.

This is my code:

def word_splitter(df):
    df['Split Tweets'] = df['Tweets'].str.split()
    df['Split Tweets'] = df['Split Tweets'].str.lower()

    df = df[['Tweets', 'Date', 'Split Tweets']]

    return df

word_splitter(twitter_df.copy())

This is the output I get:

    Tweets                                              Date                Split Tweets
0   @BongaDlulane Please send an email to mediades...   2019-11-29 12:50:54 NaN
1   @saucy_mamiie Pls log a call on 0860037566          2019-11-29 12:46:53 NaN
2   @BongaDlulane Query escalated to media desk.        2019-11-29 12:46:10 NaN
3   Before leaving the office this afternoon, head...   2019-11-29 12:33:36 NaN
4   #ESKOMFREESTATE #MEDIASTATEMENT : ESKOM SUSPEN...   2019-11-29 12:17:43 NaN
... ... ... ...
195 Eskom's Visitors Centres’ facilities include i...   2019-11-20 10:29:07 NaN
196 #Eskom connected 400 houses and in the process...   2019-11-20 10:25:20 NaN
197 @ArthurGodbeer Is the power restored as yet?        2019-11-20 10:07:59 NaN
198 @MuthambiPaulina @SABCNewsOnline @IOL @eNCA @e...   2019-11-20 10:07:41 NaN
199 RT @GP_DHS: The @GautengProvince made a commit...   2019-11-20 10:00:09 NaN

This is the expected output:

word_splitter(twitter_df.copy()) 
    Tweets                                              Date                Split Tweets
0   @BongaDlulane Please send an email to mediades...   2019-11-29 12:50:54 [@bongadlulane, please, send, an, email, to, m...
1   @saucy_mamiie Pls log a call on 0860037566          2019-11-29 12:46:53 [@saucy_mamiie, pls, log, a, call, on, 0860037...
2   @BongaDlulane Query escalated to media desk.        2019-11-29 12:46:10 [@bongadlulane, query, escalated, to, media, d...
3   Before leaving the office this afternoon, head...   2019-11-29 12:33:36 [before, leaving, the, office, this, afternoon...
4   #ESKOMFREESTATE #MEDIASTATEMENT : ESKOM SUSPEN...   2019-11-29 12:17:43 [#eskomfreestate, #mediastatement, :, eskom, s...
... ... ... ...
195 Eskom's Visitors Centres’ facilities include i...   2019-11-20 10:29:07 [eskom's, visitors, centres’, facilities, incl...
196 #Eskom connected 400 houses and in the process...   2019-11-20 10:25:20 [#eskom, connected, 400, houses, and, in, the,...
197 @ArthurGodbeer Is the power restored as yet?        2019-11-20 10:07:59 [@arthurgodbeer, is, the, power, restored, as,...
198 @MuthambiPaulina @SABCNewsOnline @IOL @eNCA @e...   2019-11-20 10:07:41 [@muthambipaulina, @sabcnewsonline, @iol, @enc...
199 RT @GP_DHS: The @GautengProvince made a commit...   2019-11-20 10:00:09 [rt, @gp_dhs:, the, @gautengprovince, made, a,...

How do I do this?

CodePudding user response:

You need to convert the Tweets strings to lowercase before you split them. Use this instead:

df['Split Tweets'] = df['Tweets'].str.lower().str.split()
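
For context, here is a minimal sketch of how that line might fit into the asker's word_splitter function. The two-row sample_df is a hypothetical stand-in for twitter_df (which isn't shown in the question), built from two of the tweets visible in the output above.

```python
import pandas as pd

def word_splitter(df):
    # Lowercase the raw tweet text first, then split on whitespace,
    # so every element of each resulting list is already lower case.
    df['Split Tweets'] = df['Tweets'].str.lower().str.split()
    return df[['Tweets', 'Date', 'Split Tweets']]

# Hypothetical sample standing in for twitter_df
sample_df = pd.DataFrame({
    'Tweets': ['@ArthurGodbeer Is the power restored as yet?',
               '@saucy_mamiie Pls log a call on 0860037566'],
    'Date': ['2019-11-20 10:07:59', '2019-11-29 12:46:53'],
})

result = word_splitter(sample_df.copy())
print(result['Split Tweets'].tolist())
```

The 'Tweets' column keeps its original casing; only 'Split Tweets' is lowercased.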

CodePudding user response:

After you call str.split(), each value in df['Split Tweets'] is a list of words rather than a string. The .str.lower() accessor only works on string values, so it returns NaN for every list, which is why the whole column comes back as NaN.
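
A quick way to see this for yourself is the small check below. It is only a sketch on a one-element Series (the names tweets, split_tweets and lowered are made up for illustration), using one of the tweets from the question.

```python
import pandas as pd

tweets = pd.Series(['@BongaDlulane Query escalated to media desk.'])

split_tweets = tweets.str.split()    # each element is now a Python list
lowered = split_tweets.str.lower()   # list elements are not strings

print(type(split_tweets.iloc[0]))    # the element type after splitting
print(lowered.isna().all())          # True when every value became NaN
```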

Either change the order of the two calls, as the other answer suggests, or keep the split and lowercase each word inside the lists by mapping a lambda function over the column:

df['Split Tweets'] = df['Split Tweets'].map(lambda x: list(map(str.lower, x)))
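
Here is a small self-contained sketch of that element-wise approach. The one-row DataFrame is a hypothetical stand-in for twitter_df, and the list comprehension is just an equivalent spelling of list(map(str.lower, x)).

```python
import pandas as pd

# Hypothetical sample standing in for twitter_df
df = pd.DataFrame({'Tweets': ['@BongaDlulane Query escalated to media desk.']})

# Split first, so each row holds a list of words
df['Split Tweets'] = df['Tweets'].str.split()

# Lowercase every word inside each list
df['Split Tweets'] = df['Split Tweets'].map(
    lambda words: [word.lower() for word in words]
)

print(df['Split Tweets'].iloc[0])
```

Note that this assumes every row of 'Tweets' is a string. A missing value would pass through str.split() as NaN, and the lambda would then fail when it tries to iterate over it.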