How to merge multiple list by id and get specific data?

Time:12-15

I have 3 lists with common IDs. I need to group by Name in one list and extract data from the other two. Here is an example to make it clearer:

table for groupNames:

| Id | Name    |
|----|---------|
| 1  | Hello   |
| 2  | Hello   |
| 3  | Hey     |
| 4  | Dude    |
| 5  | Dude    |

table for countId:

| Id | whatever |
|----|----------|
| 1  | test0    |
| 1  | test1    |
| 2  | test2    |
| 3  | test3    |
| 3  | test4    |

table for lastTime:

| Id | timestamp  |
|----|------------|
| 1  | 1636585230 |
| 1  | 1636585250 |
| 2  | 1636585240 |
| 3  | 1636585231 |
| 3  | 1636585230 |
| 5  | 1636585330 |

I'm expecting a result list like this:

| Name    | whateverCnt | lastTimestamp |
|---------|-------------|---------------|
| Hello   | 3           | 1636585250    |
| Hey     | 2           | 1636585231    |
| Dude    | 0           | 1636585330    |

So far I have something like this, but it doesn't work:

            return groupNames
              .GroupBy(x => x.Name)
              .Select(x =>
              {
                  return new myElem
                  {
                      Name = x.Name,
                      lastTimestamp = new DateTimeOffset(lastTime.Where(a => groupNames.Where(d => d.Name == x.Key).Select(d => d.Id).Contains(a.Id)).Max(m => m.timestamp)).ToUnixTimeMilliseconds(),
                      whateverCnt = countId.Where(q => (groupNames.Where(d => d.Name == x.Key).Select(d => d.Id)).ToList().Contains(q.Id)).Count()
                    };
              })
             .ToList();

Many thanks for any advice.

CodePudding user response:

In your example, the safest approach would be to build a list of the result object and query the other collections with LINQ for the matching ids.

So something like:

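public List<SomeObject> MergeListsById(
  IEnumerable<GroupNames> groupNames,
  IEnumerable<CountId> countIds,
  IEnumerable<LastTime> lastTimes)
{
  // IEnumerable<T> has neither Add nor ForEach, so use a List<T> and a plain foreach
  var mergedList = new List<SomeObject>();

  foreach (var group in groupNames.GroupBy(gn => gn.Name))
  {
    // All ids that share this name
    var ids = group.Select(gn => gn.Id).ToList();

    mergedList.Add(new SomeObject {
      Name = group.Key,
      // Number of countIds rows whose id belongs to this name
      whateverCnt = countIds.Count(ci => ids.Contains(ci.Id)),
      // Latest timestamp for this name; default value when there is none
      lastTimeStamp = lastTimes.Where(lt => ids.Contains(lt.Id))
                               .Select(lt => lt.timestamp)
                               .DefaultIfEmpty()
                               .Max()
    });
  }

  return mergedList;
}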

Try it in a Fiddle or throwaway project and tweak it to your needs. A solution in pure LINQ is probably not desirable here, for the sake of readability and maintainability.

And yes, as the comments say, do carefully consider whether LINQ is your best option here. While it works, it does not always outperform a "simple" foreach. LINQ's main selling point is, and always has been, short, one-line query statements that stay readable.

CodePudding user response:

I think I'd mostly skip LINQ for this

class Thing{
  public string Name {get;set;}
  public int Count {get;set;}
  public long LastTimestamp {get;set;}
}

...

var ids = new Dictionary<int, string>();      // id -> name
var result = new Dictionary<string, Thing>(); // name -> aggregated row
foreach(var g in groupNames) {
  ids[g.Id] = g.Name;
  result[g.Name] = new Thing { Name = g.Name };
}

// Every row in countId bumps the count of the name its id maps to
foreach(var c in countId)
  result[ids[c.Id]].Count++;

// Keep the largest timestamp seen per name
foreach(var l in lastTime){
  var t = result[ids[l.Id]];
  if(t.LastTimestamp < l.timestamp) t.LastTimestamp = l.timestamp;
}

We start off by making two dictionaries (you could use ToDictionary for this). If groupNames is already a dictionary that maps id:name, you can skip making the ids dictionary and use groupNames directly. This gives us fast lookup from ID to Name, but we actually want to collect results into a name:something mapping, so we make one of those too. Doing result[name] = thing always succeeds, even if we've seen the name before. We could save some object creation with a ContainsKey check here if you want.
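As a rough sketch of the ToDictionary variant mentioned above: the tuple element names and the sample rows are assumptions (the question does not show the real types), and the value tuple only stands in for the Thing class.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sample rows from the question, modelled as tuples (assumed shape)
var groupNames = new List<(int Id, string Name)>
{
    (1, "Hello"), (2, "Hello"), (3, "Hey"), (4, "Dude"), (5, "Dude"),
};

// id -> name lookup; ToDictionary throws on duplicate keys, so ids must be unique
Dictionary<int, string> ids = groupNames.ToDictionary(g => g.Id, g => g.Name);

// name -> accumulator; names repeat, so de-duplicate before building the dictionary
Dictionary<string, (int Count, long LastTimestamp)> result = groupNames
    .Select(g => g.Name)
    .Distinct()
    .ToDictionary(name => name, name => (Count: 0, LastTimestamp: 0L));

Console.WriteLine($"{ids.Count} ids map onto {result.Count} names");
```

Unlike the indexer assignment in the loop, ToDictionary does not tolerate repeated keys, which is why the names go through Distinct first.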

Then all we need to do is enumerate our other N collections, building the result. The result we want is accessed from result[ids[some_id_value_here]], and it always exists if the groupNames id space is complete (that is, we never have an id in the counts that we do not have in groupNames).

For counts, we don't care about any of the other data; just the presence of the id is enough to increment the count.

For dates, it's a simple max algorithm of "if known max is less than new max, make known max = new max". If you know your dates list is sorted ascending, you can skip that if too.
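If the id space might not be complete, one option is to guard the lookups with TryGetValue instead of indexing directly. The following is only a sketch under some assumptions: the rows are modelled as tuples with made-up element names, timestamps are treated as plain numbers as in the tables, and two small dictionaries (counts and latest, both hypothetical names) replace the Thing class to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sample data from the question, modelled as tuples (assumed shape)
var groupNames = new List<(int Id, string Name)>
    { (1, "Hello"), (2, "Hello"), (3, "Hey"), (4, "Dude"), (5, "Dude") };
var countId = new List<(int Id, string Whatever)>
    { (1, "test0"), (1, "test1"), (2, "test2"), (3, "test3"), (3, "test4") };
var lastTime = new List<(int Id, long Timestamp)>
    { (1, 1636585230), (1, 1636585250), (2, 1636585240),
      (3, 1636585231), (3, 1636585230), (5, 1636585330) };

var ids = groupNames.ToDictionary(g => g.Id, g => g.Name); // id -> name
var counts = new Dictionary<string, int>();                // name -> row count
var latest = new Dictionary<string, long>();               // name -> max timestamp
foreach (var g in groupNames)
{
    counts[g.Name] = 0;
    latest[g.Name] = 0;
}

// Rows whose id is unknown to groupNames are skipped instead of throwing
foreach (var c in countId)
    if (ids.TryGetValue(c.Id, out var countName))
        counts[countName]++;

// Simple running max per name
foreach (var l in lastTime)
    if (ids.TryGetValue(l.Id, out var timeName) && latest[timeName] < l.Timestamp)
        latest[timeName] = l.Timestamp;

foreach (var name in counts.Keys)
    Console.WriteLine($"{name}: {counts[name]} rows, last {latest[name]}");
```

With the sample data this yields the values from the expected table in the question: 3 rows for Hello, 2 for Hey and 0 for Dude.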

CodePudding user response:

Well, having

  List<(int id, string name)> groupNames = new List<(int id, string name)>() {
    ( 1, "Hello"),
    ( 2, "Hello"),
    ( 3, "Hey"),
    ( 4, "Dude"),
    ( 5, "Dude"),
  };

  List<(int id, string comments)> countId = new List<(int id, string comments)>() {
    ( 1  , "test0"),
    ( 1  , "test1"),
    ( 2  , "test2"),
    ( 3  , "test3"),
    ( 3  , "test4"),
  };

  List<(int id, int time)> lastTime = new List<(int id, int time)>() {
    ( 1  , 1636585230 ),
    ( 1  , 1636585250 ),
    ( 2  , 1636585240 ),
    ( 3  , 1636585231 ),
    ( 3  , 1636585230 ),
    ( 5  , 1636585330 ),
  };

you can use the LINQ query below:

      var result = groupNames
        .GroupBy(item => item.name, item => item.id)
        .Select(group => (Name          : group.Key,
                          whateverCnt   : group
                            .Sum(id => countId.Count(item => item.id == id)),
                          lastTimestamp : lastTime
                            .Where(item => group.Any(g => g == item.id))
                            .Max(item => item.time)));
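
One caveat with the query above: Max over a sequence of non-nullable int throws InvalidOperationException when the sequence is empty, which would happen for any name that has no rows in lastTime. The sample data does not hit this case, but a possible guard is DefaultIfEmpty. In the sketch below the "Quiet" row is a hypothetical name added only to trigger the empty case, and 0 is an arbitrary placeholder for "no timestamp".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// "Quiet" (id 2) is a made-up row with no timestamps at all
var groupNames = new List<(int id, string name)> { (1, "Hello"), (2, "Quiet") };
var lastTime = new List<(int id, int time)> { (1, 1636585230), (1, 1636585250) };

var result = groupNames
    .GroupBy(item => item.name, item => item.id)
    .Select(group => (Name: group.Key,
                      lastTimestamp: lastTime
                          .Where(item => group.Contains(item.id))
                          .Select(item => item.time)
                          .DefaultIfEmpty(0) // yields a single 0 when nothing matched
                          .Max()))
    .ToList();

foreach (var row in result)
    Console.WriteLine($"{row.Name}: {row.lastTimestamp}");
```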