Home > Enterprise >  Change the value of a dataframe column to the value of a second column conditional on the value of a
Change the value of a dataframe column to the value of a second column conditional on the value of a

Time:03-06

I have data with current names of companies, old names, and the date of name changes. It looks like this:

name former_name1 name_change_date1
ACMAT CORP nan NaT
ACME ELECTRIC CORP nan NaT
ACME UNITED CORP nan NaT
COLUMBIA ACORN TRUST LIBERTY ACORN TRUST 2003-10-20
MULTIGRAPHICS INC AM INTERNATIONAL INC 1997-03-17
MILLER LLOYD I III nan NaT
AFFILIATED COMPUTER SERVICES INC nan NaT
ADAMS RESOURCES & ENERGY, INC. ADAMS RESOURCES & ENERGY INC 2005-04-01
BK Technologies Corp BK Technologies, Inc. 2019-03-28

I want to figure out what the name of each company was at a particular date. Let's say I want to figure out the name of a company as of January 1st 2002. Then I could create a new column called say, edited_name, which would contain the current name of the company unless the company has changed names since 1/1/2002, in which case it would contain the historical name (i.e. former_name1) of the company. So the output should look something like this:

name former_name1 name_change_date1 edited_name
ACMAT CORP nan NaT ACMAT CORP
ACME ELECTRIC CORP nan NaT ACME ELECTRIC CORP
ACME UNITED CORP nan NaT ACME UNITED CORP
COLUMBIA ACORN TRUST LIBERTY ACORN TRUST 2003-10-20 LIBERTY ACORN TRUST
MULTIGRAPHICS INC AM INTERNATIONAL INC 1997-03-17 MULTIGRAPHICS INC
MILLER LLOYD I III nan NaT MILLER LLOYD I III
AFFILIATED COMPUTER SERVICES INC nan NaT AFFILIATED COMPUTER SERVICES INC
ADAMS RESOURCES & ENERGY, INC. ADAMS RESOURCES & ENERGY INC 2005-04-01 ADAMS RESOURCES & ENERGY INC
BK Technologies Corp BK Technologies, Inc. 2019-03-28 BK Technologies, Inc.

In Stata (with which I am much more familiar) this could be easily accomplished with:

gen edited_name = name
replace edited_name = former_name1 if name_change_date_1 > date("2002-01-01", "YMD") & name_change_date_1 != .

Unfortunately I am at a loss of how to accomplish this in Python/Pandas.

Data:

{'name': ['ACMAT CORP', 'ACME ELECTRIC CORP', 'ACME UNITED CORP', 'COLUMBIA ACORN TRUST',
          'MULTIGRAPHICS INC', 'MILLER LLOYD I III', 'AFFILIATED COMPUTER SERVICES INC',
          'ADAMS RESOURCES & ENERGY, INC.', 'BK Technologies Corp'],
 'former_name1': [nan, nan, nan, 'LIBERTY ACORN TRUST', 'AM INTERNATIONAL INC', nan, nan,
                  'ADAMS RESOURCES & ENERGY INC', 'BK Technologies, Inc.'],
 'name_change_date1': [NaT, NaT, NaT, '2003-10-20', '1997-03-17', NaT, NaT,
                       '2005-04-01', '2019-03-28']}

CodePudding user response:

You could use enter image description here

  • Related