How to return the customer IDs of the best N customers from the transaction data

Time:12-13

I have transaction data in a CSV file with the columns Customer ID, Transaction Amount, and Transaction Date. I have a function that takes transactions_csv_file_path (a string) and N (an integer) as parameters, and I want it to return the best N customers from the transaction data. NOTE: the best customer is the one with the longest period of consecutive daily payments. I read the CSV as below:

public static string[,] ProcessCSV(string file_path, int n)
{
    List<string> transData = new List<string>();
    using (StreamReader sr = new StreamReader(file_path))
    {
        string strResult = sr.ReadToEnd();
        var values = strResult.Split(',');
        transData.Add(values[0]);
        transData.Add(values[1]);
    }
    return transData.ToArray();
}

When debugging, I only get the column headers without any data. I want to find the consecutive daily payments by date and return the customer IDs. For example, if N=1, I expect the output to be ['K20008']; if N=3, the output should be ['K20987', 'K20008', 'K20233'].

How do I get the array data from the CSV and get the best N customer IDs with the longest period of consecutive daily payments?

To consider: define consecutive daily payments as at least one transaction per calendar day. If there are any ties, break them in ascending order of customer ID. For example, K20003 comes before K20005.

CodePudding user response:

I'd perhaps make a method that calculates the longest run length:

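// Requires: using System; using System.Collections.Generic; using System.Linq;
int MaxRun(IEnumerable<DateTime> ds){
  int max = 0;
  int current = 0;
  var prev = DateTime.MinValue;

  // Reduce to distinct calendar days so several payments on one day count once
  foreach(var d in ds.Select(x => x.Date).Distinct().OrderBy(x => x)){
    if((d - prev).Days == 1)
      current++;      // this day directly follows the previous one: extend the run
    else
      current = 1;    // gap (or first date): start a new run of one day
    prev = d;
    if(current > max)
      max = current;
  }
  return max;
}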

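And then use a bit of LINQ to group the transactions by customer, calculate each customer's max run, order the customers by max run (longest first, with ties broken by ascending ID), and output the best N. Here transactions stands for the parsed rows and n for the requested count:

transactions
  .GroupBy(t => t.Customer, t => t.TransactionDate)
  .Select(g => new { g.Key, MR = MaxRun(g) })
  .OrderByDescending(at => at.MR)  // longest run first
  .ThenBy(at => at.Key)            // ties: ascending customer ID
  .Take(n)                         // keep only the best N
  .Select(at => at.Key)
  .ToArray()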
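As for why only the headers come back: the question's code calls ReadToEnd, splits the whole file on commas, and keeps only values[0] and values[1], which are the first two header names. Reading line by line and skipping the header row avoids that. The following is only a rough sketch of how the pieces might fit together, not a definitive implementation. The class name BestCustomers and the methods TopN and FromFile are made up for illustration, and it assumes a header row, no quoted fields containing commas, and dates in a format that DateTime.Parse accepts under the invariant culture (for example 2021-01-31).

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;

// Hypothetical helper class; names are illustrative, not from the question.
public static class BestCustomers
{
    // Longest run of consecutive calendar days, counted in days.
    static int MaxRun(IEnumerable<DateTime> dates)
    {
        int max = 0, current = 0;
        DateTime? prev = null;
        foreach (var d in dates.Select(x => x.Date).Distinct().OrderBy(x => x))
        {
            // Extend the run if this day directly follows the previous one.
            current = (prev.HasValue && (d - prev.Value).Days == 1) ? current + 1 : 1;
            prev = d;
            max = Math.Max(max, current);
        }
        return max;
    }

    // Assumed layout: a header row, then "CustomerId,Amount,Date" per line.
    public static string[] TopN(IEnumerable<string> csvLines, int n)
    {
        return csvLines
            .Skip(1)                                        // drop the header row
            .Where(line => !string.IsNullOrWhiteSpace(line))
            .Select(line => line.Split(','))                // naive split, no quoted fields
            .Select(f => new
            {
                Customer = f[0].Trim(),
                Date = DateTime.Parse(f[2].Trim(), CultureInfo.InvariantCulture)
            })
            .GroupBy(t => t.Customer, t => t.Date)
            .Select(g => new { g.Key, MR = MaxRun(g) })
            .OrderByDescending(x => x.MR)                   // longest run first
            .ThenBy(x => x.Key, StringComparer.Ordinal)     // ties: ascending ID
            .Take(n)
            .Select(x => x.Key)
            .ToArray();
    }

    // Convenience wrapper that streams the file one line at a time.
    public static string[] FromFile(string path, int n) => TopN(File.ReadLines(path), n);
}
```

Calling BestCustomers.FromFile(transactions_csv_file_path, 3) would then give the three IDs with the longest runs. If the real file uses another date format or quoted fields, DateTime.ParseExact or a proper CSV parser would be the safer choice.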