How to convert the .hdf file to a dataset?

Time:08-12

I am using one of the files from http://orca.science.oregonstate.edu/1080.by.2160.monthly.hdf.vgpm.m.chl.m.sst.php:

untar(tarfile = "http://orca.science.oregonstate.edu/data/1x2/monthly/vgpm.r2018.m.chl.m.sst/hdf/vgpm.m.2010.tar", exdir = "./foo")

I get this error: tar.exe: Error opening archive: Failed to open 'http://orca.science.oregonstate.edu/data/1x2/monthly/vgpm.r2018.m.chl.m.sst/hdf/vgpm.m.2010.tar'

So I had to download the file manually and untar it (which is why I can't provide a reproducible example here). Inside are files in .hdf format.

I was also not able to read them:

library(ncdf4)
ncin <- nc_open(".\\vgpm.m.2010\\vgpm.2010001.hdf")
ncin

lon <- ncvar_get(ncin,"fakeDim0")
head(lon)


lat <- ncvar_get(ncin,"fakeDim1")
head(lat)

fillvalue <- ncatt_get(ncin,"npp","_FillValue")

Can you please help explain why I can't untar the file and why the .hdf files have no fill value?

CodePudding user response:

  1. You should be able to untar the file once you have downloaded it. Download the file to your working directory first, then untar it from there: untar("vgpm.m.2002.tar", exdir = "mydir"). Your issue is likely with the connection. There can be many reasons for that, and they are specific to your computer's settings, so you'll need to troubleshoot that separately.

  2. Once you untar the archive, the files inside are not .hdf files. They are compressed .hdf files (which is why their file names end in .gz). You'll need to decompress them:

library(R.utils)
gunzip("mydir/vgpm.2002335.hdf.gz", remove = FALSE)
  3. Once you actually have the .hdf file, you need to open it and then read it. You are right to use ncdf4, because it accommodates multiple .hdf file formats. Some of the older formats would need different packages or software.

  4. To open and read it, you need two different functions, nc_open() and ncvar_get():

library(ncdf4)
dat <- nc_open("mydir/vgpm.2002335.hdf", write = TRUE)
ncvar_get(dat)

Note that these functions will NOT work unless you have completed the prerequisite setup explained in detail in the documentation. For example:

Both the netCDF library and the HDF5 library must already be installed on your machine for this R interface to the library to work.
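The download, untar, and decompress steps above can be sketched end to end as follows. This is a minimal sketch rather than a tested recipe for this server: the helper names fetch_tar() and unpack_hdf() are made up for illustration, the URL in the usage comment is the one from the question, and the code assumes the R.utils package is installed. download.file() is called with mode = "wb" so the archive is written as binary, which matters on Windows.

```r
# Hypothetical helpers (names are illustrative, not from any package).

# Download a tar archive to a local file and return the local path.
fetch_tar <- function(url, destfile = basename(url)) {
  # mode = "wb" writes the archive as binary (needed on Windows)
  download.file(url, destfile = destfile, mode = "wb")
  destfile
}

# Untar a local archive, gunzip every .hdf.gz inside it,
# and return the paths of the decompressed .hdf files.
unpack_hdf <- function(tar_path, exdir) {
  dir.create(exdir, showWarnings = FALSE, recursive = TRUE)
  untar(tar_path, exdir = exdir)
  gz_files <- list.files(exdir, pattern = "\\.hdf\\.gz$",
                         full.names = TRUE, recursive = TRUE)
  for (f in gz_files) {
    # remove = FALSE keeps the .gz file next to the decompressed copy
    R.utils::gunzip(f, remove = FALSE, overwrite = TRUE)
  }
  list.files(exdir, pattern = "\\.hdf$", full.names = TRUE, recursive = TRUE)
}

# Example usage (commented out because it needs network access):
# tar_path  <- fetch_tar("http://orca.science.oregonstate.edu/data/1x2/monthly/vgpm.r2018.m.chl.m.sst/hdf/vgpm.m.2010.tar")
# hdf_files <- unpack_hdf(tar_path, exdir = "mydir")
```

Keeping the download separate from the unpacking means the second step can be re-run on an archive that was fetched manually, as in the question.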

CodePudding user response:

I also tried reading it as a raster, which works well too:

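library(raster)
x <- raster(".\\vgpm.2010001.hdf")
# The file carries no georeferencing, so set the global extent and CRS by hand
extent(x) <- extent(-180, 180, -90, 90)
crs(x) <- "+proj=longlat +datum=WGS84"
# Treat -9999 as missing data
NAvalue(x) <- -9999
#plot(x)

# Convert to a data frame with x/y coordinates for each cell
f1 <- as.data.frame(x, xy = TRUE)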
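If you go the ncdf4 route instead, the same conversion can be done by hand once the grid is in memory as a matrix. The sketch below is an illustration under stated assumptions, not the data provider's method: grid_to_df() is a hypothetical helper, it assumes rows run north to south and columns west to east over the global extent used above, and it treats -9999 as the missing-data marker, as in the NAvalue() call.

```r
# Hypothetical helper: turn a global grid (matrix) into a long data frame.
# Assumptions: rows = latitude from north to south, columns = longitude from
# west to east, covering -180..180 and -90..90; `fill` marks missing cells.
grid_to_df <- function(mat, fill = -9999) {
  nr <- nrow(mat)
  nc <- ncol(mat)
  # Cell-centre coordinates for each row and column
  lat <- 90 - (seq_len(nr) - 0.5) * 180 / nr
  lon <- -180 + (seq_len(nc) - 0.5) * 360 / nc
  # expand.grid varies its first argument fastest, matching the
  # column-major order of as.vector(mat) (rows vary fastest)
  df <- expand.grid(lat = lat, lon = lon)
  vals <- as.vector(mat)
  vals[!is.na(vals) & vals == fill] <- NA
  df$npp <- vals
  df
}

# Tiny example with a 2 x 2 grid and one missing cell
m <- matrix(c(1, -9999, 3, 4), nrow = 2)
grid_to_df(m)
```

If the array returned by ncvar_get() is oriented differently (for example, longitude first), transpose it with t() before passing it in.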