Home > OS >  Adding numbers from one series to a series of letters within the same dataframes, possibly using reg
Adding numbers from one series to a series of letters within the same dataframes, possibly using reg

Time:11-12

I have a subset of a much larger dataframe:

import pandas as pd

data = {'Name': ['car- at', 'car- ar', 'truck- br', 'car- at'],
        'SysNum': [1, 2, 3, 4 ],
            }
     Name     SysNum
0   car- at     1   
1   car- ar     2   
2   truck- br   3   
3   car- at     4

I need to be able to enter the numbers into the name after the - . The dash is in each name so I tried to use a couple Regex lines but I am unable to get it so far. The dataframe should look like:

    Name      SysNum    Count
0   car- at     1      car-1 at
1   car- ar     2      car-2 ar
2   truck- br   3      truck-3 br
3   car- at     4      car-4 at

I have used something similar before:

df['BaseName'] = [re.sub('(?<=-)\d{1,6}', '', s) for s in df['Name'].values]
df['SysNum'] = [int(re.search('(?<=-)\d{1,6}', s)[0]) for s in df['Name'].values]
df['NewName'] = df['BaseName']   df['SysNum'].astype(str) 

This takes away the numbers after the - and then I add a new set of numbers back onto the end with the last line. I now have a set of letters at the end so I need to be able to put in a new set of numbers after the - and keep the letters on at the end. A new solution to this or a way to use regex that looks at the - and stop at the next set of letters should work. Thanks.

CodePudding user response:

You can use

df['Count'] = df.apply(lambda x: x['Name'].replace('- ', f"-{x['SysNum']} "), axis=1)
# => df['Count']
#    0      car-1 at
#    1      car-2 ar
#    2    truck-3 br
#    3      car-4 at
#    Name: Count, dtype: object

This is just replacing - space with - SysNum column value space.

If you want to make sure you insert the values after the first word that has a - after it, you may use a regex:

df['Count'] = df.apply(lambda x: re.sub(r'^\w -', fr"\g<0>{x['SysNum']}", x["Name"]), axis=1)
# => df
#         Name  SysNum       Count
# 0    car- at       1    car-1 at
# 1    car- ar       2    car-2 ar
# 2  truck- br       3  truck-3 br
# 3    car- at       4    car-4 at

Here, ^\w - matches start of string (^), \w matches one or more letters/digits/_, and then - is matched. The replacement is the whole match value (\g<0>) the value from the SysNum column.

CodePudding user response:

You can use zip and list comprehension!.

>>> df['NewCol'] = [name.replace("-", f"-{num}", 1) for name, num in zip(df.Name, df.SysNum)]
>>> df
        Name  SysNum      NewCol
0    car- at       1    car-1 at
1    car- ar       2    car-2 ar
2  truck- br       3  truck-3 br
3    car- at       4    car-4 at

Note: The third argument to the replace method ensure that we replace only the first occurrence of -

CodePudding user response:

One approach:

df["Count"] = [f"{pre}-{num} {suf}" for num, [pre, suf] in zip(df["SysNum"], df["Name"].str.split("- "))]
print(df)

Output

        Name  SysNum       Count
0    car- at       1    car-1 at
1    car- ar       2    car-2 ar
2  truck- br       3  truck-3 br
3    car- at       4    car-4 at

CodePudding user response:

parts = df['Name'].str.split('-', n=1)
df['NewName'] = parts.str[0]   '-'   df['SysNum'].astype(str)   parts.str[1]

Test:

>>> df
        Name  SysNum     NewName
0    car- at       1    car-1 at
1    car- ar       2    car-2 ar
2  truck- br       3  truck-3 br
3    car- at       4    car-4 at
  • Related