Better way to create a data frame in R

Time:11-18

I created the sample data frame below in a rather crude, somewhat "dumb" way, and I would like to know whether there is a shorter or neater way to do it. Many thanks.

library(pedquant)

PECCPC<-md_stock(c("600028","601857","00386.HK","00857.HK"),type='real')

a<-as.data.frame(PECCPC$symbol[1:2])
b<-as.data.frame(PECCPC$close[1:2])
c<-as.data.frame(PECCPC$symbol[3:4])
d<-as.data.frame(PECCPC$close[3:4])

e<-cbind(a,b)
f<-cbind(c,d)

g<-cbind(e,f)
g$spread<-g[,2]-g[,4]
colnames(g)<-c("A-shares","Price","H-shares","Price","AH_spread")
g

   A-shares Price H-shares Price AH_spread
1 600028.SS  4.31 00386.HK  3.42      0.89
2 601857.SS  5.04 00857.HK  3.38      1.66

CodePudding user response:

This is a tibble, but it works:

library(pedquant)
library(tibble)  # provides tibble(), which the code below calls

PECCPC<-md_stock(c("600028","601857","00386.HK","00857.HK"),type='real')

g <- tibble(
    "A-shares" = PECCPC$symbol[1:2],
    "A_Price" = PECCPC$close[1:2],
    "H-shares" = PECCPC$symbol[3:4],
    "H_Price"= PECCPC$close[3:4],
    "AH_spread" = A_Price - H_Price
)

g

# A tibble: 2 x 5
  `A-shares` A_Price `H-shares` H_Price AH_spread
  <chr>        <dbl> <chr>        <dbl>     <dbl>
1 600028.SS     4.31 00386.HK      3.42      0.89
2 601857.SS     5.04 00857.HK      3.38      1.66
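The tibble() call works in one step because tibble() evaluates its arguments in order, so AH_spread can refer to A_Price and H_Price. Base R's data.frame() cannot do that, so a rough base-R equivalent needs a second step. The sketch below is only an illustration: the symbols and prices are hard-coded from the output shown in the question rather than fetched with md_stock(), and the syntactic column names (A_share, H_share, ...) are my own choice, not the asker's.

```r
# Values copied from the output in the question; a live md_stock() call
# would return whatever the prices are at that moment.
symbols <- c("600028.SS", "601857.SS", "00386.HK", "00857.HK")
closes  <- c(4.31, 5.04, 3.42, 3.38)

# Build the four base columns in a single data.frame() call.
g <- data.frame(
  A_share = symbols[1:2],
  A_price = closes[1:2],
  H_share = symbols[3:4],
  H_price = closes[3:4]
)

# data.frame() cannot see its own columns while it is being built,
# so the spread is added afterwards.
g$AH_spread <- g$A_price - g$H_price
g
```

Syntactic names also avoid the backticks that a name like `A-shares` requires every time the column is referenced.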

CodePudding user response:

Following up on my comment, here is a very simple and standard way to do this with a data.frame and nothing but base R. I use the EuStockMarkets data set included in R, and skip the share symbols, since you are after the spread calculation:

> library(datasets)
> data(EuStockMarkets)
> head(EuStockMarkets)
         DAX    SMI    CAC   FTSE
[1,] 1628.75 1678.1 1772.8 2443.6
[2,] 1613.63 1688.5 1750.5 2460.2
[3,] 1606.51 1678.6 1718.0 2448.2
[4,] 1621.04 1684.1 1708.1 2470.4
[5,] 1618.16 1686.6 1723.1 2484.7
[6,] 1610.61 1671.6 1714.3 2466.8
>

So, given this data set, we create a data.frame by assigning two columns (which also gives them names). We then use these two columns in a subsequent assignment, for which we use within() to operate within the scope of the data.frame, making the column names directly accessible:

> DF <- data.frame(DAX_level=EuStockMarkets[,"DAX"], SMI_level=EuStockMarkets[,"SMI"])
> DF <- within(DF, DAX_SMI_spread <- DAX_level - SMI_level)
> head(DF)
  DAX_level SMI_level DAX_SMI_spread
1   1628.75    1678.1         -49.35
2   1613.63    1688.5         -74.87
3   1606.51    1678.6         -72.09
4   1621.04    1684.1         -63.06
5   1618.16    1686.6         -68.44
6   1610.61    1671.6         -60.99
> 
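The same data.frame() plus within() pattern could be applied to the A/H layout from the question. The following is a sketch under assumptions, not a tested pedquant recipe: `quotes` is a stand-in for the md_stock() result holding only the `symbol` and `close` columns that the question's code uses, with values copied from the question's output, and `ah_spread_table()` is a hypothetical helper that assumes H-share symbols end in ".HK" and that A and H rows are listed in matching order.

```r
# Stand-in for the md_stock() result (values taken from the question's output).
quotes <- data.frame(
  symbol = c("600028.SS", "601857.SS", "00386.HK", "00857.HK"),
  close  = c(4.31, 5.04, 3.42, 3.38)
)

# Hypothetical helper: split rows into A and H shares by symbol suffix,
# pair them by position, and compute the spread with within().
ah_spread_table <- function(quotes) {
  is_h <- grepl("\\.HK$", quotes$symbol)
  # Pairing by position only makes sense with equal group sizes.
  stopifnot(sum(is_h) == sum(!is_h))
  out <- data.frame(
    A_share = quotes$symbol[!is_h],
    A_price = quotes$close[!is_h],
    H_share = quotes$symbol[is_h],
    H_price = quotes$close[is_h]
  )
  # within() exposes the columns by name, as in the DAX/SMI example.
  within(out, AH_spread <- A_price - H_price)
}

ah_spread_table(quotes)
```

Splitting on the suffix rather than on fixed positions such as 1:2 and 3:4 means the helper does not need editing when more pairs are added, as long as the ordering assumption holds.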