I am attempting to import a plain-text file (.pgn - a summary of chess moves) into RStudio Cloud and extract only a handful of the rows for further analysis.
I import the file into a data frame with:
>pgn_df <- read.delim("test.pgn")
When I view the contents I see this:
>View(pgn_df)
X.Event.Live.Chess.
1 [Site Chess.com]
2 [Date 2022.06.14]
3 [Round -]
4 [White dervogel09]
5 [Black SuperCarp]
6 [Result 0-1]
7 [CurrentPosition r1b1n1k1/1p1n4/p2pN1p1/3Pp3/1P2P1rq/3B4/P2Q1P1K/RN3R2 w - -]
8 [Timezone UTC]
9 [ECO B06]
10 [ECOUrl https://www.chess.com/openings/Modern-Defense-with-1-e4-2.d4]
11 [UTCDate 2022.06.14]
12 [UTCTime 12:12:11]
13 [WhiteElo 1268]
14 [BlackElo 1234]
15 [TimeControl 900 10]
16 [Termination SuperCarp won by checkmate]
17 [StartTime 12:12:11]
18 [EndDate 2022.06.14]
19 [EndTime 12:34:16]
20 [Link https://www.chess.com/game/live/48947924239]
21 1. d4 g6 2. e4 Bg7 3. c4 d6 4. Nf3 Nf6 5. Bd3 e5 6. d5 c6 7. O-O cxd5 8. cxd5 a6
22 9. h3 Nbd7 10. Bg5 O-O 11. b4 h6 12. Be3 Ne8 13. Qd2 f5 14. Bxh6 f4 15. Bg5 Bf6
23 16. h4 Rf7 17. g3 Bxg5 18. Nxg5 Rf6 19. gxf4 Rxf4 20. Ne6 Rg4 21. Kh2 Qxh4# 0-1
However, when I try to extract some rows, it appears that only the first column contains data. I get the following results when I test:
>is_empty(pgn_df[1,1])
[1] FALSE
>is_empty(pgn_df[1,2])
TRUE
I get the same TRUE for all other rows. I am trying to extract just a handful of rows (white player, black player, opening moves, etc.), which I have done before with other plain-text (non-.pgn) files that I imported into data frames, but for some reason I'm getting null values here.
When I try to extract a single row, for example the white player, I get:
>white_player <- row(pgn_df, 4)
>View(white_player)
[,1]
[1,] 1
[2,] 2
[3,] 3
[4,] 4
[5,] 5
[6,] 6
[7,] 7
[8,] 8
[9,] 9
[10,] 10
[11,] 11
[12,] 12
[13,] 13
[14,] 14
[15,] 15
[16,] 16
[17,] 17
[18,] 18
[19,] 19
[20,] 20
[21,] 21
[22,] 22
[23,] 23
Levels: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
CodePudding user response:
To extract a single row, try pgn_df[4,]. To get multiple rows, pass a vector of indices: pgn_df[c(1, 4, 7),]. row doesn't do what it sounds like it should do: it returns a matrix of row indices, not the contents of a row. Your other example, is_empty(pgn_df[1,2]), fails because there you're asking for the first row of the second column, when there's only one column in the data set. There are some good resources online for learning to index data frames in R that might be worth reviewing as well.
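As a rough sketch of the indexing described above: the sample lines, the temporary file, and the names pgn_lines, raw, white_line and white_player are made up for illustration. It also swaps in readLines for read.delim, on the assumption that you want to keep the first line as data; the X.Event.Live.Chess. column name in your output suggests read.delim consumed it as a header. Because of that, the White tag sits in row 5 here rather than row 4.

```r
# Hypothetical stand-in for the first few lines of test.pgn
pgn_lines <- c(
  '[Event "Live Chess"]',
  '[Site "Chess.com"]',
  '[Date "2022.06.14"]',
  '[Round "-"]',
  '[White "dervogel09"]',
  '[Black "SuperCarp"]',
  '[Result "0-1"]'
)
tmp <- tempfile(fileext = ".pgn")
writeLines(pgn_lines, tmp)

# readLines() returns every line of the file as plain text,
# so the [Event ...] line is kept as data instead of becoming a header
raw <- readLines(tmp)
pgn_df <- data.frame(line = raw, stringsAsFactors = FALSE)

# The data frame has one column, so pgn_df[1, 2] has nothing to return
dim(pgn_df)

# Row indexing: a single row, then several rows at once.
# drop = FALSE keeps the result as a data frame.
white_row <- pgn_df[5, , drop = FALSE]
some_rows <- pgn_df[c(1, 5, 6), , drop = FALSE]

# Looking a tag up by name avoids depending on row positions
white_line   <- pgn_df$line[grepl("^\\[White ", pgn_df$line)]
white_player <- sub('^\\[White "(.*)"\\]$', "\\1", white_line)

# row() gives back row indices, not row contents
idx <- row(pgn_df)
```

With a single-column data frame, pgn_df[5, ] without drop = FALSE returns the bare value rather than a one-row data frame, which may or may not be what you want for later steps.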