How to modify a pandas Series (or DataFrame) in place while iterating?

I need to revise the values in a pandas Series (a column) according to another function.

While iterating, once I have the result, I don't want to look up the Series a second time, because I suspect it wastes time and should not be necessary.

For example:

import pandas as pd
s = pd.Series(['A', 'B', 'C'])
for index, value in s.items():
    s[index] = func_hard_to_vectorized(value)    # lookup again!!!

In C terms: "How do I get a reference to that cell?"

What I want looks like this:

import pandas as pd
s = pd.Series(['A', 'B', 'C'])
for index, value in s.items():
    value = func_hard_to_vectorized(value)    # change in place
    assert_equal(s[index], value)

The same problem exists for DataFrame, where it may affect performance even more.

How can I get a reference to a row of a pandas DataFrame?

Thanks!

CodePudding user response:

You can assign all the new values in a single operation instead of writing them one at a time inside the loop:

s[:] = [func_hard_to_vectorized(v) for v in s]

Or:

s[:] = s.apply(func_hard_to_vectorized)

This way the assignment happens only once, for all items together.

If you don't mind getting a new Series (i.e. if no other name refers to the original Series), you can also rebind the name:

s = s.apply(func_hard_to_vectorized)
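The difference between writing into the existing Series and rebinding the name can be sketched as follows. This is only an illustration: the body of func_hard_to_vectorized is a hypothetical stand-in (the question does not show it), and alias, t and alias_t are names introduced here for the demonstration.

```python
import pandas as pd

def func_hard_to_vectorized(value):
    # Hypothetical stand-in for the real, expensive function.
    return value.lower()

s = pd.Series(['A', 'B', 'C'])
alias = s  # a second name bound to the same Series object

# Slice assignment writes into the existing object, so the alias sees the change.
s[:] = [func_hard_to_vectorized(v) for v in s]

t = pd.Series(['A', 'B', 'C'])
alias_t = t  # a second name bound to the original Series

# Rebinding makes t point to a new Series; alias_t still holds the old values.
t = t.apply(func_hard_to_vectorized)
```

After this runs, alias shows the lowercase values because it is the same object as s, while alias_t still shows the uppercase values because t now refers to a different Series.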

Example using both the index and the value:

s = pd.Series(['A', 'B', 'C'])

def f(idx, v):
    return f'{v}_{idx}'

s[:] = [f(idx, v) for idx, v in s.items()]

Modified s:

0    A_0
1    B_1
2    C_2
dtype: object
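The question also asks about DataFrame rows. A rough sketch of the same idea, assuming the goal is to recompute one column from each row: build all new values in one pass and assign the column once. The column names label and n and the function combine are hypothetical, chosen only for this example.

```python
import pandas as pd

def combine(idx, label, n):
    # Hypothetical row-wise function standing in for one that is hard to vectorize.
    return f'{label}_{idx}_{n}'

df = pd.DataFrame({'label': ['A', 'B', 'C'], 'n': [1, 2, 3]})

# itertuples() yields (index, label, n) for each row; the column is assigned once at the end.
df['label'] = [combine(idx, label, n) for idx, label, n in df.itertuples()]
```

Note that this replaces the whole column in one assignment rather than handing out a reference to each row; it avoids the per-row write, which is the part the question is concerned about.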