Concatenation of an int array and a string array in Python (similar to paste() in R)?

Time:02-18

I'm new to the Python world and have always relied on R's vectorized operations, so I have a basic question.

I have two arrays, one with int values and the other with strings, and I would like a pandas Series that concatenates both. The inputs look like this:

0      Enterobact
1        Pseudomo
2        Mycobact
3             Bac
4        Streptoc
5    Propionibact
6       Staphyloc
7           Morax
8        Synechoc
9            Gord
Name: fam, dtype: object

0    7275
1    3872
2    3869
3    1521
4    1408
5    1022
6     877
7     765
8     588
9     578
Name: frequency, dtype: int64

And I would like to get the following:

Enterobact - 7275
Pseudomo - 3872
Mycobact - 3869
# And so on...

What is the proper way to solve this in Python, rather than a workaround adapted for R users? Thank you very much in advance.

CodePudding user response:

I'm not sure what format you need the result in, so here are two methods. First, I assume that your data is stored in two variables:

print(fam_column)
print(freq_column)

Printing the two variables gives exactly what you have:

0      Enterobact
1        Pseudomo
2        Mycobact
3             Bac
4        Streptoc
5    Propionibact
6       Staphyloc
7           Morax
8        Synechoc
9            Gord
Name: fam, dtype: object

0    7275
1    3872
2    3869
3    1521
4    1408
5    1022
6     877
7     765
8     588
9     578
Name: frequency, dtype: int64

The first method relies on the fact that these are pandas Series (DataFrame columns), so pandas' vectorized operations apply. The code converts the integers to strings and concatenates the two columns element by element, with ' - ' in between:

result = fam_column + ' - ' + freq_column.astype(str)
print(result)

Output:

0      Enterobact - 7275
1        Pseudomo - 3872
2        Mycobact - 3869
3             Bac - 1521
4        Streptoc - 1408
5    Propionibact - 1022
6        Staphyloc - 877
7            Morax - 765
8         Synechoc - 588
9             Gord - 578
dtype: object
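
For a version that runs on its own, here is a minimal sketch. It assumes the data can be rebuilt as two Series; the three sample rows are copied from the question, and the variable names follow the assumption made above. It also shows Series.str.cat, which is probably the closest pandas counterpart to R's paste(..., sep=" - ").

```python
import pandas as pd

# Assumed sample data: the first three rows from the question.
fam_column = pd.Series(["Enterobact", "Pseudomo", "Mycobact"], name="fam")
freq_column = pd.Series([7275, 3872, 3869], name="frequency")

# Cast the integers to str first, since str + int is not defined.
result = fam_column + " - " + freq_column.astype(str)

# Same result via the string accessor, with the separator passed as sep.
result_cat = fam_column.str.cat(freq_column.astype(str), sep=" - ")

print(result.tolist())
```

Both forms align the two Series on their index, so they pair up the right rows only if the indexes match (as they do here, 0 to 9).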

In your question, you mentioned that you want to combine two arrays (lists, in Python), so here is a second method. It is not the preferred one, since working with the existing pandas columns is much simpler. It converts the two columns into lists and then combines them pairwise in a list comprehension to get the desired form.

list_fam = list(fam_column)
list_frequency = list(freq_column)

result = [x + ' - ' + str(y) for x, y in zip(list_fam, list_frequency)]
print(result)

The output is the following:

['Enterobact - 7275', 'Pseudomo - 3872', 'Mycobact - 3869', 'Bac - 1521', 'Streptoc - 1408', 'Propionibact - 1022', 'Staphyloc - 877', 'Morax - 765', 'Synechoc - 588', 'Gord - 578']
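
If the data really does start out as plain Python lists, the same pairing can be written with an f-string, which converts the integer implicitly. This is only a sketch: the two lists below are assumed sample data holding the first three rows from the question.

```python
# Assumed sample data: the first three rows from the question as plain lists.
list_fam = ["Enterobact", "Pseudomo", "Mycobact"]
list_frequency = [7275, 3872, 3869]

# zip pairs the elements by position; the f-string formats the int as text.
result = [f"{name} - {count}" for name, count in zip(list_fam, list_frequency)]
print(result)
```

Note that zip stops at the shorter list, so lists of unequal length are silently truncated.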