OK, so I wasn't sure how to ask this, but I would love some answers; I've been stumped for hours. Let's say I have a CSV file and I want to get all the data at index position 1 (the Company Name column in the sample below) and compare the values to each other.
I am currently using this line of code to read in the CSV file line by line,
string[] csvData = System.IO.File.ReadAllLines(@"C:\Path");
Then I split each line on commas and try to grab the wanted column like this:
var comNames = new List<string>();
for (int i = 0; i < csvData.Length; i++) {
    // split the current line into its comma-separated fields
    string[] fields = csvData[i].Split(',');
    comNames.Add(fields[1]);
}
But as you all know, that won't work for lines 4 and 5, even though the text there still belongs to the same column. Is there a way for me to delete the CRLFs that are causing this issue so I can make this code work, or is there another workaround?
Image in text format:
Serial Number,Company Name,Employee Markme,Description,Leave
9788189999599,TALES OF SHIVA,Mark,mark,0
9780099578079,1Q84
THE
COMPLETE
TRILOGY,HARUKI MURAKAMI,Mark,0
9780198082897,MY KUMAN,Mark,Mark,0
CodePudding user response:
The code below will work if the following assumptions hold true:
- There is always a serial #
- There is always a company name
- There is always a comma before and after the company name
- The serial # is always exactly 13 digits
The first three assumptions are required for this solution. You can tweak the regex pattern to deal with the fourth.
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

public List<string> GetListOfCompanies() {
    string data = File.ReadAllText(@"C:\Users\adam\Documents\test.csv");
    var companies = new List<string>();
    var pattern = @"\d{13}";
    // replace the line ending with something unique
    // (this assumes the file uses the platform's line ending)
    data = data.Replace(System.Environment.NewLine, "#thisisreallyunique#");
    // find each serial number, and grab the item after it
    foreach (Match match in Regex.Matches(data, pattern)) {
        var temp = data.Substring(match.Index); // cut off everything before this match
        var temp2 = temp.Substring(temp.IndexOf(",") + 1); // cut off the serial # and the comma following it
        // at this point we have the company name, plus everything after it
        var company = temp2.Substring(0, temp2.IndexOf(",")); // cut off everything after it
        // oh, and put the spaces back into the company
        company = company.Replace("#thisisreallyunique#", " ");
        companies.Add(company);
    }
    return companies;
}
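If the serial number is not always exactly 13 digits, one possible tweak is to anchor the pattern to the start of a line instead of counting digits. The following is only a sketch under extra assumptions: the class name CompanyNameSketch and the method GetCompanies are made up for illustration, the method takes the CSV text as a string rather than reading a file, and it assumes no continuation line of a company name starts with digits followed by a comma.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CompanyNameSketch
{
    // Hypothetical helper: takes the CSV text directly so it can be tried without a file.
    public static List<string> GetCompanies(string data)
    {
        var companies = new List<string>();
        // A serial is any run of digits at the start of a line, followed by a comma.
        // The company name is everything up to the next comma, line breaks included.
        var pattern = new Regex(@"^\d+,([^,]*),", RegexOptions.Multiline);
        foreach (Match match in pattern.Matches(data))
        {
            // Collapse each run of CR/LF characters inside the name into one space.
            string name = Regex.Replace(match.Groups[1].Value, @"[\r\n]+", " ").Trim();
            companies.Add(name);
        }
        return companies;
    }
}
```

With the sample data from the question, GetCompanies(File.ReadAllText(path)) should give "TALES OF SHIVA", "1Q84 THE COMPLETE TRILOGY" and "MY KUMAN". Because it collapses any mix of CR and LF, it does not depend on the file using the platform's line ending.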