this is code we have for retrieving a header left joined to its lines (one query), instead of returning two result sets (one for head, one for lines) for getting related data from a database, we always get ONE head, and however many lines it has.
Could you please help me understand why if the Distinct() at the bottom is removed it would return HEAD duplicates corresponding to the number of lines that were retrieved.
Is this because for every row in the reader, we project a HEAD even if it's the same? so if the HEAD has 40 lines, we are projecting the same HEAD 40 times? then a DISTINCT will eliminate 39 and return just one?
Would just doing .FirstOrDefault() without Distinct() be equivalent in this scenario as Distinct() because at the end of the day, it is projecting the same HEAD object?
public static IEnumerable<T> Select<T>(this IDataReader dr, Func<T> selector)
{
if (dr == null)
throw new ArgumentException(nameof(dr));
while (dr.Read())
yield return selector();
}
public void Test()
{
DTOHead head = null;
Dictionary<string, DTOHead> entryDictionary = new Dictionary<string, DTOHead>();
using (DbDataReader reader = cmd.ExecuteReader())
{
var head = reader.Select(dr =>
{
DTOHead entry = null;
if (!entryDictionary.TryGetValue((string)dr["Key"], out entry))
{
DTOHead dtoHead = new DTOHead();
dtoHead.Key = (string)dr["Key"]
dtoHead.Description = (string)dr["DESCRIPTION"];
dtoHead.Lines = new List<DTOLine>();
entry = dtoHead;
entryDictionary.Add(entry.Key, entry);
}
if (dr["LINE_NO"] != DBNull.Value)//skip, there are no lines for this one
{
DTOLine dtoLine = new DTOLine();
dtoLine.LineNo = (string)dr["LINE_NO"];
dtoLine.Qty = (string)dr["QTY"];
entry.Lines.Add(dtoLine);
}
return entry;
}).Distinct();
}
}
CodePudding user response:
Is this because for every row in the reader, we project a HEAD even if it's the same? so if the HEAD has 40 lines, we are projecting the same HEAD 40 times? then a DISTINCT will eliminate 39 and return just one?
Yes, it is this exactly. Your implementation of Select
will project one DTOHead
for each row in reader
. Assuming that you only have one unique "Key"
in your result set, it will always be the same DTOHead
reference... the one that you create then add to entryDictionary
. The call to Distinct
then removes all the duplicates and leaves you with an IEnumerable<DTOHead>
with one item.
Would just doing .FirstOrDefault() without Distinct() be equivalent in this scenario as Distinct() because at the end of the day, it is projecting the same HEAD object?
Since you indicate that your result set will only contain ONE Head then yeah... you can drop the call to Distinct
and just use FirstOrDefault
, assuming that you don't want an IEnumerable<DTOHead>
and only want an instance of DTOHead
.
If that's the case though, you don't need entryDictionary
. You can just read the first row from reader
, then project the remaining rows into an IEnumerable<DTOLine>
with your Select
method.
public void Test()
{
DTOHead head = null;
using (DbDataReader reader = cmd.ExecuteReader())
{
if (reader.Read())
{
// Deal with the first row by creating the DTOHead.
head = new DTOHead();
head.Key = (string)reader["Key"];
head.Description = (string)reader["DESCRIPTION"];
head.Lines = new List<DTOLine>();
if (reader["LINE_NO"] != DBNull.Value)//skip, there are no lines for this one
{
// Deal with the first row by creating the first DTOLine.
DTOLine line = new DTOLine();
line.LineNo = (string)reader["LINE_NO"];
line.Qty = (string)reader["QTY"];
head.Lines.Add(dtoLine);
// Project the remaining rows into lines.
head.Lines.AddRange(reader.Select(dr =>
{
DTOLine dtoLine = new DTOLine();
dtoLine.LineNo = (string)dr["LINE_NO"];
dtoLine.Qty = (string)dr["QTY"];
return dtoLine;
});
}
}
}
}