import pandas as pd
data = {'Account':['Paul','Jenn']}
df = pd.DataFrame(data=data)
The desired output would be 1 for Paul and 2 for Jenn. The solution would form the basis of a for loop over a much bigger dataset, replacing account names with numeric values.
CodePudding user response:
You could do something like this: first create a dictionary that maps each unique name in Account to a number, ordered by first appearance. Then use .replace() to swap the values in the series for those numbers. This ensures that Paul is always replaced by 1 and Jenn by 2, however many times each name appears.
import pandas
import json
data = {'Account':['Paul','Jenn']}
df = pandas.DataFrame(data=data)
name_mapping = json.loads(pandas.Series(
    index=df.Account.unique(),
    data=range(1, len(df.Account.unique()) + 1)
).to_json())
df.Account = df.Account.replace(name_mapping)
Output:
>>> df
   Account
0        1
1        2
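If the JSON round trip feels heavy, here is a minimal sketch of the same idea using pandas.factorize. It assumes the numbering should follow the order of first appearance; the repeated 'Paul' row is not in the question and is added only to show that a repeated name keeps the same number.

```python
import pandas as pd

# Sample data; the second 'Paul' is an assumption added for illustration
df = pd.DataFrame({'Account': ['Paul', 'Jenn', 'Paul']})

# factorize returns 0-based integer codes in order of first appearance,
# plus the unique values those codes refer to
codes, uniques = pd.factorize(df['Account'])

# Shift by one so the numbering starts at 1 instead of 0
df['Account'] = codes + 1
```

Keeping `uniques` around lets you map a number back to its name later, since `uniques[n - 1]` is the name that was replaced by `n`.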
CodePudding user response:
Not entirely sure what you're trying to do; please elaborate further if I misunderstood the question :) If you want to replace the 'name' column with the index incremented by a value:
df['name'] = df.index + value
Increase the values of a column (or do any other arithmetic operation on a column):
df['column name'] += value
# or do that on one column and store the result in another column
df['result'] = df['other column'] + value
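For reference, a small runnable sketch of these column operations. The sample numbers and the offset of 1 are assumptions chosen for illustration; only the column names come from the snippets above.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({'name': ['Paul', 'Jenn'], 'other column': [10, 20]})
value = 1  # hypothetical offset

# Replace 'name' with the row index shifted by value
df['name'] = df.index + value

# Increase a column in place
df['other column'] += value

# Compute from one column and store the result in another
df['result'] = df['other column'] + value
```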
Count the number of occurrences of each value in the name column:
df = pd.DataFrame({"name": ["Paul", "Jenn", "Paul"]})
# count the number of occurrences of each name
df["count"] = df.groupby("name")['name'].transform('count')
# in case you don't want duplicate rows
df.drop_duplicates(inplace=True)
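If all you need is a tally per name rather than a new column on the frame, a shorter sketch with Series.value_counts may be enough. It uses the same sample data as above.

```python
import pandas as pd

df = pd.DataFrame({"name": ["Paul", "Jenn", "Paul"]})

# value_counts returns a Series indexed by name, holding how often each appears
counts = df["name"].value_counts()
```

Here `counts["Paul"]` is 2 and `counts["Jenn"]` is 1.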