I have the following data frame, total_authority:
structure(list(country = c("Albania", "Algeria", "American Somoa",
"Angola", "Anguilla", "Antigua", "Argentina", "Armenia", "Aruba",
"Australia"), `1994` = c(0.0000000000000000312250225675825, 0.0000000000000000312250225675825,
0.0000000000000000312250225675825, 0.0000000000000000312250225675825,
0.0000000000000000312250225675825, 0.0000000000000000312250225675825,
0.00289122132708816, 0.0000000000000000312250225675825, 0.00000528966979389429,
0.00622391681538348), country.1 = c("Albania", "Algeria", "American Somoa",
"Angola", "Anguilla", "Antigua", "Argentina", "Armenia", "Aruba",
"Australia"), `1995` = c(0.00000320558770721281, 0.0000000000000000277555756156289,
0.0000000000000000277555756156289, 0.0000000000000000277555756156289,
0.0000000000000000277555756156289, 0.0000000000000000277555756156289,
0.0224538010858487, 0.0000000000000000277555756156289, 0.0000000000000000277555756156289,
0.407633483379219)), row.names = c(NA, 10L), class = "data.frame")
I would like to reshape it into long format, with one column for the countries, one for the year, and one for the value scored by each country in that year.
Printed, total_authority currently looks like this:
          country                      1994      country.1                      1995
1         Albania 0.00000000000000003122502        Albania 0.00000320558770721280500
2         Algeria 0.00000000000000003122502        Algeria 0.00000000000000002775558
3  American Somoa 0.00000000000000003122502 American Somoa 0.00000000000000002775558
4          Angola 0.00000000000000003122502         Angola 0.00000000000000002775558
5        Anguilla 0.00000000000000003122502       Anguilla 0.00000000000000002775558
6         Antigua 0.00000000000000003122502        Antigua 0.00000000000000002775558
7       Argentina 0.00289122132708816148572      Argentina 0.02245380108584869860433
8         Armenia 0.00000000000000003122502        Armenia 0.00000000000000002775558
9           Aruba 0.00000528966979389429437          Aruba 0.00000000000000002775558
10      Australia 0.00622391681538347896208      Australia 0.40763348337921861963551
The desired result is instead:
       country                     score year
       Albania 0.00000000000000003122502 1994
       Algeria 0.00000000000000003122502 1994
American Somoa 0.00000000000000003122502 1994
        Angola 0.00000000000000003122502 1994
      Anguilla 0.00000000000000003122502 1994
       Antigua 0.00000000000000003122502 1994
     Argentina 0.00289122132708816148572 1994
       Armenia 0.00000000000000003122502 1994
         Aruba 0.00000528966979389429437 1994
     Australia 0.00622391681538347896208 1994
       Albania 0.00000320558770721280500 1995
       Algeria 0.00000000000000002775558 1995
American Somoa 0.00000000000000002775558 1995
        Angola 0.00000000000000002775558 1995
      Anguilla 0.00000000000000002775558 1995
       Antigua 0.00000000000000002775558 1995
     Argentina 0.02245380108584869860433 1995
       Armenia 0.00000000000000002775558 1995
         Aruba 0.00000000000000002775558 1995
     Australia 0.40763348337921861963551 1995
This is my attempt (the for loop index count only ranges from 1 to 2 here, but that is just for the example):
actors <- c("Albania", "Algeria", "American Somoa", "Angola", "Anguilla",
            "Antigua", "Argentina", "Armenia", "Aruba", "Australia")
final_output <- data.frame()
for (count in 1:2) {
  df <- data.frame(country = actors)
  df$year <- rep(names(total_authority)[2*count], nrow(df))
  df$authority <- total_authority[2*count]
  final_output <- rbind(final_output, df)
}
However, I get the following error:
Error in `.rowNamesDF<-`(x, value = value) :
'row.names' duplicate are not allowed.
In addition: Warning message:
non-unique values when setting 'row.names': ‘1’, ‘10’, ‘2’, ‘3’, ‘4’, ‘5’, ‘6’, ‘7’, ‘8’, ‘9’
CodePudding user response:
We don't need a for loop here. Just index the data.frame to subset the columns, unlist them, and construct the data.frame directly:
out <- data.frame(country = unlist(total_authority[c(1, 3)]),
                  score = unlist(total_authority[c(2, 4)]),
                  year = rep(names(total_authority)[c(2, 4)], each = nrow(total_authority)))
row.names(out) <- NULL
-output
> out
          country                     score year
1         Albania 0.00000000000000003122502 1994
2         Algeria 0.00000000000000003122502 1994
3  American Somoa 0.00000000000000003122502 1994
4          Angola 0.00000000000000003122502 1994
5        Anguilla 0.00000000000000003122502 1994
6         Antigua 0.00000000000000003122502 1994
7       Argentina 0.00289122132708816018468 1994
8         Armenia 0.00000000000000003122502 1994
9           Aruba 0.00000528966979389429013 1994
10      Australia 0.00622391681538347982944 1994
11        Albania 0.00000320558770721281009 1995
12        Algeria 0.00000000000000002775558 1995
13 American Somoa 0.00000000000000002775558 1995
14         Angola 0.00000000000000002775558 1995
15       Anguilla 0.00000000000000002775558 1995
16        Antigua 0.00000000000000002775558 1995
17      Argentina 0.02245380108584869860433 1995
18        Armenia 0.00000000000000002775558 1995
19          Aruba 0.00000000000000002775558 1995
20      Australia 0.40763348337921900821357 1995
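If total_authority has more than two year columns, the column positions can be computed instead of typed by hand. This is only a minimal sketch, assuming the columns keep alternating country / year exactly as in the question; country_idx and year_idx are names introduced here for illustration, and the scores below are toy stand-in values, not the real data.

```r
# Toy stand-in for total_authority with the same alternating layout
# (scores are made-up placeholders, not the values from the question)
total_authority <- data.frame(
  country   = c("Argentina", "Aruba", "Australia"),
  `1994`    = c(1, 2, 3),
  country.1 = c("Argentina", "Aruba", "Australia"),
  `1995`    = c(4, 5, 6),
  check.names = FALSE  # keep "1994"/"1995" as column names
)

# Odd positions hold the countries, even positions hold the yearly scores
country_idx <- seq(1, ncol(total_authority), by = 2)
year_idx    <- seq(2, ncol(total_authority), by = 2)

# use.names = FALSE drops the names unlist() would otherwise generate,
# so no row.names cleanup is needed afterwards
out <- data.frame(
  country = unlist(total_authority[country_idx], use.names = FALSE),
  score   = unlist(total_authority[year_idx], use.names = FALSE),
  year    = rep(names(total_authority)[year_idx], each = nrow(total_authority))
)
```

With this layout, adding another country / year pair of columns requires no change to the reshaping code.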
Regarding the duplicate row.names error: it occurs because total_authority[2*count] uses [ and returns a data.frame with a single column, so df$authority becomes a data.frame nested inside df. We need a vector instead, which we get by extracting the column with [[:
final_output <- data.frame()
for (count in 1:2) {
  df <- data.frame(country = actors)
  df$year <- rep(names(total_authority)[2*count], nrow(df))
  df$authority <- total_authority[[2*count]]
  final_output <- rbind(final_output, df)
}
-output
> final_output
          country year                 authority
1         Albania 1994 0.00000000000000003122502
2         Algeria 1994 0.00000000000000003122502
3  American Somoa 1994 0.00000000000000003122502
4          Angola 1994 0.00000000000000003122502
5        Anguilla 1994 0.00000000000000003122502
6         Antigua 1994 0.00000000000000003122502
7       Argentina 1994 0.00289122132708816018468
8         Armenia 1994 0.00000000000000003122502
9           Aruba 1994 0.00000528966979389429013
10      Australia 1994 0.00622391681538347982944
11        Albania 1995 0.00000320558770721281009
12        Algeria 1995 0.00000000000000002775558
13 American Somoa 1995 0.00000000000000002775558
14         Angola 1995 0.00000000000000002775558
15       Anguilla 1995 0.00000000000000002775558
16        Antigua 1995 0.00000000000000002775558
17      Argentina 1995 0.02245380108584869860433
18        Armenia 1995 0.00000000000000002775558
19          Aruba 1995 0.00000000000000002775558
20      Australia 1995 0.40763348337921900821357
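The difference between [ and [[ can be checked directly on a small example. The sketch below uses a made-up data frame called toy (not part of the question's data) purely to show what each operator returns.

```r
# Toy data frame with the same kind of layout (hypothetical values)
toy <- data.frame(country = c("A", "B"), `1994` = c(1, 2), check.names = FALSE)

single_bracket <- toy[2]    # still a data.frame, with one column
double_bracket <- toy[[2]]  # the column itself, a numeric vector

is.data.frame(single_bracket)  # TRUE
is.data.frame(double_bracket)  # FALSE
is.numeric(double_bracket)     # TRUE
```

Assigning single_bracket to a column nests a data.frame inside another data.frame, whereas double_bracket gives an ordinary numeric column.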