I have two variables, div1 and div2, and want to get all the values from both of them.
I can loop through one variable with foreach, but is it possible to get the InnerHtml of both sets of divs?
HtmlWeb web = new HtmlWeb();
HtmlAgilityPack.HtmlDocument doc = web.Load(url);
var div1 = doc.DocumentNode.SelectNodes("//div[contains(@class,'class1')]");
var div2 = doc.DocumentNode.SelectNodes("//div[contains(@class,'class2')]");
foreach (HtmlNode div in div1)
{
String text = div.InnerHtml;
Debug.WriteLine(text);
}
CodePudding user response:
The answers from @mcjmzn, @Jonathan, and @Nenad are correct as far as printing all the InnerHtml values goes.
I'm guessing you want to print the first div1 InnerHtml, then the first div2 InnerHtml, then the second div1 InnerHtml, then the second div2 InnerHtml, and so on. You'll want a regular for loop instead of a foreach, with checks to make sure you don't run past the end of either the div1 or the div2 collection:
var div1Max = div1.Count;
var div2Max = div2.Count;
var overallMax = Math.Max(div1Max, div2Max);
for (var i = 0; i < overallMax; i++)
{
if (i < div1Max)
{
String text1 = div1[i].InnerHtml;
Debug.WriteLine(text1);
}
if (i < div2Max)
{
String text2 = div2[i].InnerHtml;
Debug.WriteLine(text2);
}
}
CodePudding user response:
You can use the Concat extension method of IEnumerable (from System.Linq) to combine both collections of nodes.
HtmlWeb web = new HtmlWeb();
HtmlAgilityPack.HtmlDocument doc = web.Load(url);
var div1 = doc.DocumentNode.SelectNodes("//div[contains(@class,'class1')]");
var div2 = doc.DocumentNode.SelectNodes("//div[contains(@class,'class2')]");
var allNodes = div1.Concat(div2);
foreach (HtmlNode div in allNodes)
{
String text = div.InnerHtml;
Debug.WriteLine(text);
}
CodePudding user response:
Why not simply iterate over one collection after the other, instead of concatenating them?
foreach (HtmlNode div in div1)
{
String text = div.InnerHtml;
Debug.WriteLine(text);
}
foreach (HtmlNode div in div2)
{
String text = div.InnerHtml;
Debug.WriteLine(text);
}
CodePudding user response:
Create a container list at the beginning, add the results to it, and then loop through the container list:
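// List<HtmlNode> (System.Collections.Generic) is used because HtmlNodeCollection needs a parent node
// in its constructor and its Add method takes a single HtmlNode, not a collection.
// Note: SelectNodes returns null when nothing matches, and AddRange throws on null.
var nodes = new List<HtmlNode>();
nodes.AddRange(doc.DocumentNode.SelectNodes("//div[contains(@class,'class1')]"));
nodes.AddRange(doc.DocumentNode.SelectNodes("//div[contains(@class,'class2')]"));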
foreach(HtmlNode node in nodes){
Debug.WriteLine(node.InnerHtml);
}
It is also possible to build a single query that gets all the class1 and class2 divs at the same time:
doc.DocumentNode.SelectNodes("//div[contains(@class,'class1') or contains(@class,'class2')]");
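For what it's worth, here is a minimal sketch of that combined query in use. It assumes the HtmlAgilityPack NuGet package is installed; the inline HTML string and the class name CombinedQueryExample are made up for illustration and are not from the question. The null check is there because SelectNodes returns null rather than an empty collection when nothing matches.

```csharp
using System.Diagnostics;
using HtmlAgilityPack;

public static class CombinedQueryExample
{
    public static void Main()
    {
        // Hypothetical inline HTML so the sketch does not depend on a live URL.
        var doc = new HtmlDocument();
        doc.LoadHtml("<div class='class1'>a</div><div class='class2'>b</div>");

        // One XPath query that matches divs carrying either class.
        var divs = doc.DocumentNode.SelectNodes(
            "//div[contains(@class,'class1') or contains(@class,'class2')]");

        // SelectNodes returns null when no node matches, so guard before looping.
        if (divs != null)
        {
            foreach (HtmlNode div in divs)
            {
                Debug.WriteLine(div.InnerHtml);
            }
        }
    }
}
```

With a single query the nodes come back in document order, so class1 and class2 divs are interleaved as they appear on the page instead of being grouped by class.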
Edit after comment @ 22:24:56Z:
If there is only one result for each selector, you could simplify your approach something like this:
var text1 = doc.DocumentNode.SelectSingleNode("//div[contains(@class,'class1')]")?.InnerHtml ?? String.Empty;
var text2 = doc.DocumentNode.SelectSingleNode("//div[contains(@class,'class2')]")?.InnerHtml ?? String.Empty;
The ?. is the null-conditional operator and the ?? is the null-coalescing operator. See:
https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/operators/null-coalescing-operator