I have been searching for a way to pull in all the words a person sees on a web page with PHP. I know I can use curl or file_get_contents(), but those return all the HTML. I just want the rendered text (not the formatting, images, or anything else). Can someone point me in the right direction?
CodePudding user response:
curl is not an HTML processor; it only fetches the raw response. Lynx is a text-mode browser that does process HTML, so you can use lynx -dump https://stackoverflow.com
The Lynx manual describes the -dump option like this: "dumps the formatted output of the default document or those specified on the command line to standard output. Unlike interactive mode, all documents are processed."
You can run Linux/Windows commands from PHP with the proc_open() function; see the proc_open entry in the PHP manual.
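Putting the two together, here is a minimal sketch of calling lynx -dump through proc_open(). It assumes lynx is installed and on the PATH and that you are on PHP 7.4 or later (needed to pass the command as an array, which avoids the shell). The function names runCommand() and pageText() are made up for this example.

```php
<?php
// Run a command without going through a shell and return its standard output.
// Passing the command as an array requires PHP 7.4 or later.
function runCommand(array $command): string
{
    // Capture only stdout; stdin and stderr are left as inherited from PHP.
    $descriptors = [1 => ['pipe', 'w']];

    $process = proc_open($command, $descriptors, $pipes);
    if ($process === false) {
        throw new RuntimeException('Could not start: ' . $command[0]);
    }

    // Read everything the command prints, then close the pipe before proc_close().
    $output = stream_get_contents($pipes[1]);
    fclose($pipes[1]);

    $exitCode = proc_close($process);
    if ($exitCode !== 0) {
        throw new RuntimeException($command[0] . " exited with code $exitCode");
    }

    return $output === false ? '' : $output;
}

// Hypothetical helper: return the rendered text of a page via "lynx -dump".
// Assumes the lynx binary is installed and reachable on the PATH.
function pageText(string $url): string
{
    // Reject anything that is not a plain http(s) URL before handing it to lynx.
    $scheme = parse_url($url, PHP_URL_SCHEME);
    if (filter_var($url, FILTER_VALIDATE_URL) === false
        || !in_array($scheme, ['http', 'https'], true)) {
        throw new InvalidArgumentException('Not an http(s) URL: ' . $url);
    }

    return runCommand(['lynx', '-dump', $url]);
}
```

Usage would be something like echo pageText('https://stackoverflow.com');. By default the dump ends with a numbered list of the links on the page; adding lynx's -nolist option to the command array should leave that list out if you only want the body text.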