I have a list of column headers that are read from a file into a List:
// Read column headers from text file
List<string> colHeads = new List<string>();
string[] lines = File.ReadAllLines("C:\\DATA\\MLData\\colHeads.txt");
string[] spl = new string[100];
foreach (string line in lines)
{
spl = line.Split('=');
colHeads.Add(spl[0].Trim());
}
I would like to then read a CSV file but only read the columns that are in the list:
// Read CSV file
string[] lines = File.ReadAllLines("C:\\DATA\\MLData\\hist.csv");
string[] spl = new string[500];
foreach (string line in lines)
{
spl = line.Split(',');
var rec = new Record()
{
Name = spl[1],
ident = float.Parse(spl[4], CultureInfo.InvariantCulture.NumberFormat),
location = float.Parse(spl[5], CultureInfo.InvariantCulture.NumberFormat),
...
...
};
}
Example of colHeads.txt....
Name
ident
location
...
Example of hist.csv...
Name,yob,ident,level,location,score1
John B,1981,23,3,GB,54
There are more columns in hist.csv than in colHeads.txt, so I need a way to read the CSV file by column name rather than by column number, ignoring the columns that are not in the list. The names of the variables I'm assigning to match the column names exactly.
CodePudding user response:
Assuming that colHeads contains a list of header names, you can try something like this:
location = colHeads.Contains("location") ? float.Parse(spl[5], CultureInfo.InvariantCulture.NumberFormat) : 0
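If the set of available columns is always the same, and you just want to change which columns you read, this will probably work fine for basic use. Note that the column position (5 here) is still hard-coded, so it has to match the actual layout of the file.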
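If the column order can change, the position can be looked up in the header row instead of being hard-coded. The following is only a minimal sketch: the helper name ParseColumn and the class name are made up for illustration, and the sample rows are copied from the hist.csv example in the question. A missing column falls back to 0, as in the snippet above.

```csharp
using System;
using System.Globalization;

static class IndexLookupSketch
{
    // Hypothetical helper: returns the named column parsed as a float,
    // or 0 when the header row does not contain that name.
    public static float ParseColumn(string[] headers, string[] values, string name)
    {
        // Array.IndexOf returns -1 when the name is not found.
        int index = Array.IndexOf(headers, name);
        return index >= 0
            ? float.Parse(values[index], CultureInfo.InvariantCulture)
            : 0f;
    }

    static void Main()
    {
        // Header row and one data row from the hist.csv example.
        string[] headers = "Name,yob,ident,level,location,score1".Split(',');
        string[] spl = "John B,1981,23,3,GB,54".Split(',');

        Console.WriteLine(ParseColumn(headers, spl, "ident"));   // 23
        Console.WriteLine(ParseColumn(headers, spl, "missing")); // 0
    }
}
```

To also honour the list from colHeads.txt, the call can be guarded with colHeads.Contains(name) in the same way as the one-liner above.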
CodePudding user response:
var colHeads = File.ReadLines("colHeads.txt").ToHashSet();
var lines = File.ReadLines("hist.csv");
var headers = lines.First().Split(',');
var indexes = new Dictionary<string, int>();
var records = new List<Record>();
for (int i = 0; i < headers.Length; i++)
{
indexes[headers[i]] = colHeads.Contains(headers[i]) ? i : -1;
}
int nameIndex = indexes[nameof(Record.Name)];
int identIndex = indexes[nameof(Record.ident)];
int locationIndex = indexes[nameof(Record.location)];
//...
foreach (var line in lines.Skip(1))
{
var values = line.Split(',');
var record = new Record();
records.Add(record);
if (nameIndex >= 0) record.Name = values[nameIndex];
if (identIndex >= 0) record.ident = float.Parse(values[identIndex], CultureInfo.InvariantCulture);
if (locationIndex >= 0) record.location = values[locationIndex];
//...
}
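First we read the header row from the hist.csv file and store the index of each column, using -1 for columns that are not listed in colHeads.txt.
Next, for each data row, we use these indexes to set the properties of the record, skipping any property whose index is -1.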
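One thing to watch for: indexes[nameof(Record.Name)] throws a KeyNotFoundException if hist.csv has no such header at all. Below is a hedged variant of the same idea that uses TryGetValue instead, so a column that is missing from either file is simply skipped. The Record class here is an assumption (the real one is not shown in the question), the Parse method name is made up, and in-memory lines stand in for the two files so that the sketch runs on its own. Like the code above, it splits on commas naively and will not handle quoted fields that contain commas.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical Record type; property names match the CSV headers.
class Record
{
    public string Name { get; set; }
    public float ident { get; set; }
    public string location { get; set; }
}

static class CsvByHeaderSketch
{
    // Parses CSV lines (header row first), keeping only columns named in 'wanted'.
    public static List<Record> Parse(IEnumerable<string> csvLines, ISet<string> wanted)
    {
        var records = new List<Record>();
        Dictionary<string, int> indexes = null;

        foreach (var line in csvLines)
        {
            var values = line.Split(',');

            if (indexes == null)
            {
                // First line: map each wanted header name to its position.
                indexes = new Dictionary<string, int>();
                for (int i = 0; i < values.Length; i++)
                {
                    string header = values[i].Trim();
                    if (wanted.Contains(header)) indexes[header] = i;
                }
                continue;
            }

            // Data line: set only the properties whose column was found.
            var record = new Record();
            if (indexes.TryGetValue(nameof(Record.Name), out int n))
                record.Name = values[n];
            if (indexes.TryGetValue(nameof(Record.ident), out int id))
                record.ident = float.Parse(values[id], CultureInfo.InvariantCulture);
            if (indexes.TryGetValue(nameof(Record.location), out int loc))
                record.location = values[loc];
            records.Add(record);
        }
        return records;
    }

    static void Main()
    {
        // Stand-ins for colHeads.txt and hist.csv, taken from the question.
        var wanted = new HashSet<string> { "Name", "ident", "location" };
        var csv = new[]
        {
            "Name,yob,ident,level,location,score1",
            "John B,1981,23,3,GB,54"
        };

        foreach (var r in Parse(csv, wanted))
            Console.WriteLine($"{r.Name} | {r.ident} | {r.location}");
        // prints: John B | 23 | GB
    }
}
```

With real files, File.ReadLines("hist.csv") can be passed as csvLines and File.ReadLines("colHeads.txt").ToHashSet() as wanted, as in the answer above.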