I've been trying to parse this code for a very long time:
<html>
<body class="detailpage">
<div id="innerLayout">
<section id="body-container">
<div class="wrapper">
<div class="content" id="offer_active">
<div class="clr offerbody">
<div class="offercontent fleft rel ">
<div class="offercontentinner">
<script>
texto = {"name":"John"};
</script>
</div>
</div>
</div>
</div>
</div>
</section>
</div>
</body>
</html>
I'd prefer to use HtmlAgilityPack, and I want to get "name":"John" as the result, but I haven't been successful.
This is my attempt:
string stringThatKeepsYourHtml = @"<!DOCTYPE html> <head> <title>Title</title> </head> <body> <div id=""myId"" class=""myClass""> <div class=""myClass"">hello</div> </div> </body> </html>";
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(stringThatKeepsYourHtml);
string whatUrLookingFor = doc.DocumentNode.
SelectNodes("//div").
First().
SelectNodes("//div").
First().
InnerText;
Console.WriteLine(whatUrLookingFor);
Console.ReadKey(true);
How can I get this working?
CodePudding user response:
I'm not sure what the problem with parsing it is. This worked fine for me:
var html = @"
<html>
<body class=""detailpage"">
<div id=""innerLayout"">
<section id=""body-container"">
<div class=""wrapper"">
<div class=""content"" id=""offer_active"">
<div class=""clr offerbody"">
<div class=""offercontent fleft rel "">
<div class=""offercontentinner"">
<script>
texto = {""name"":""John""};
</script>
</div>
</div>
</div>
</div>
</div>
</section>
</div>
</body>
</html>";
var htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml(html);
string scr = htmlDoc.DocumentNode.SelectSingleNode("//script").InnerText;
Console.WriteLine(scr);
scr contains the full script text, texto = {"name":"John"}; (plus surrounding whitespace). You can remove the texto = prefix and the trailing semicolon and then JSON-parse the remainder, or just take everything between { and } using a substring, for example:
var openBra = scr.IndexOf('{');
var closeBra = scr.LastIndexOf('}');
var between = scr[(openBra + 1)..closeBra]; // C# 8 ranges feature; use Substring if you're on an older C# version
I'm not really clear on what you wanted to do with it
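If you'd rather go the JSON-parse route than slice out the text between the braces, something like the following might work. It's only a rough sketch: ScriptJson and ExtractName are names I made up for illustration, and it assumes System.Text.Json is available (.NET Core 3.0 or later) and that the script contains exactly one object literal that is valid JSON.

```csharp
using System;
using System.Text.Json;

// Hypothetical helper, not part of HtmlAgilityPack.
public static class ScriptJson
{
    // Takes script text such as: texto = {"name":"John"};
    // and returns the string value of its "name" property.
    public static string ExtractName(string script)
    {
        int open = script.IndexOf('{');
        int close = script.LastIndexOf('}');
        if (open < 0 || close <= open)
            throw new FormatException("No JSON object found in the script text.");

        // Keep the braces this time so the slice is a complete JSON object.
        string json = script.Substring(open, close - open + 1);

        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            // GetString() can return null for a JSON null, so fall back to "".
            return doc.RootElement.GetProperty("name").GetString() ?? string.Empty;
        }
    }
}
```

With the scr variable from above you would call ScriptJson.ExtractName(scr). This breaks down if the script holds anything that isn't strict JSON (single quotes, unquoted keys, several objects), in which case the plain substring approach is probably the safer bet.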