How to remove the first n characters from all the cells in a column using Python pandas?

Time:05-08

df['opposition'].apply(lambda x: x[2:]). Please help me to understand the lambda function and how it works in this case. Thanks in advance.

CodePudding user response:

A few things are at play here:

  • df[column].apply(f) takes a function f as its argument and applies that function to every value in the column, returning a new column with the modified values.
  • lambda x: x[2:] defines a function that takes a value x and returns the slice x[2:]. I.e., when x is a string, it returns x without the first two characters.
  • Hence, df['opposition'].apply(lambda x: x[2:]) returns the 'opposition' column modified by removing the first 2 characters from all strings in it.
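To make the points above concrete, here is a minimal sketch. The DataFrame contents (team names with a two-character "v " prefix) and the helper name drop_first_two are made up for illustration; only the 'opposition' column name comes from the question. It shows that the lambda is just a function without a name, so a regular def passed to apply gives the same result.

```python
import pandas as pd

# Hypothetical data: each value carries a two-character "v " prefix.
df = pd.DataFrame({"opposition": ["v India", "v Australia", "v England"]})


def drop_first_two(x):
    # Named equivalent of: lambda x: x[2:]
    return x[2:]


# apply calls the given function once per value in the column.
with_lambda = df["opposition"].apply(lambda x: x[2:])
with_named = df["opposition"].apply(drop_first_two)

print(with_lambda.tolist())  # ['India', 'Australia', 'England']
print(with_named.tolist())   # ['India', 'Australia', 'England']
```

Neither call modifies df; assign the result back to df['opposition'] if the column itself should change.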

However, for this particular use case, there is a much better way to do this. You can use .str.slice() to perform the same operation:

df['opposition'].str.slice(start=2)

The methods in .str are specific to columns with string values. See the pandas documentation for more info.
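One practical difference between the two approaches shows up when the column has missing values. The sketch below uses a made-up Series with one missing entry: .str.slice() leaves the missing entry as missing, while the lambda version raises a TypeError because a missing value cannot be sliced.

```python
import pandas as pd

# Hypothetical column with one missing entry.
s = pd.Series(["v India", None, "v England"])

# The .str accessor skips missing values and keeps them missing.
sliced = s.str.slice(start=2)
print(sliced.isna().tolist())  # [False, True, False]

# Indexing on .str is shorthand for the same slice.
same = s.str[2:]

# The lambda version tries to slice the missing value itself and fails.
apply_failed = False
try:
    s.apply(lambda x: x[2:])
except TypeError as exc:
    apply_failed = True
    print("apply failed:", exc)
```

If apply has to be used on such a column, the lambda needs its own check for missing values.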

CodePudding user response:

First, read about lambda functions on their own, and then look at how one works on a list.

Let's take an example where we have to add 1 to every value in a list. We can do that by looping over the values and calling a function explicitly on each one.

# Approach 1: plain loop, calling the function explicitly
def add_one_to_every_value(value):
    return value + 1

my_list = [1, 2, 3, 4]

res_list = []

for i in my_list:
    res_val = add_one_to_every_value(i)
    res_list.append(res_val)

print(res_list)  # [2, 3, 4, 5]

# Approach 2: using map with the named function
res_list2 = list(map(add_one_to_every_value, my_list))
print(res_list2)

# Approach 3: the same operation using a lambda function.
# map calls the lambda on every value, applying the operation (x + 1).
res_list3 = list(map(lambda x: x + 1, my_list))
print(res_list3)

The pandas apply function works the same way: it calls the given function on every value in the column.

# Removing first 2 characters in a string is done as follows

test_str = "abcdefg"
print(test_str[2:])

So, using this slice inside a lambda to remove the first 2 characters, we get:

df['opposition'].apply(lambda x: x[2:])
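Since the question asks about the first n characters in general, the 2 can be replaced by a variable. This is a small sketch, not the only way to do it; the helper name remove_first_n and the sample data are hypothetical.

```python
import pandas as pd


def remove_first_n(column, n):
    # The lambda picks up n from the enclosing function's arguments.
    return column.apply(lambda x: x[n:])


# Hypothetical data with a two-character "v " prefix.
df = pd.DataFrame({"opposition": ["v India", "v Australia"]})

df["opposition"] = remove_first_n(df["opposition"], 2)
print(df["opposition"].tolist())  # ['India', 'Australia']
```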