Why is '\' invalid in this command called with os/exec?

Time:02-25

When I execute this code written in Go:

package main

import ( "fmt" 
"os/exec"
)

func donde(num string) string {                                                                                                                                                         
  cmd := fmt.Sprintf("wget -qO- \"https://www.pasion.com/contactos-mujeres/%s.htm?edadd=18&edadh=30\"|grep -av \"https:\"|grep -av \"contactos\"|grep -av \"javascript\"|grep -av \"href=\\"/\"", num)
  out, err := exec.Command("bash","-c",cmd).Output()
        if err != nil {
                return fmt.Sprintf("Failed to execute command: %s", cmd)
        }
        return string(out)
}

func main() {

chicas := map[string][]string{ "Alexia":{"600080000"}, 
"Paola":{"600070008", "600050007", "600000005", "600000001", "600004", "600000000"}}    

for k, v := range chicas { 
    fmt.Printf("%s\n", k)
    for index := range v {
        c := donde(v[index])
        exec.Command("bash", "-c", c)
        fmt.Println(c)}

  }
    
}

I get:

./favoritas.go:8:189: invalid operation: "wget -qO- \"https://www.pasion.com/contactos-mujeres/%s.htm?edadd=18... / "" (operator / not defined on untyped string)
./favoritas.go:8:190: invalid character U+005C '\'

grep -av \"href=\\"/\" seems to be the culprit. Interestingly, similar Python code works just fine:

from subprocess import run
v = "600000005"
dnd = run('wget -qO- \"https://www.pasion.com/contactos-mujeres/' + v + '.htm?edadd=18&edadh=30\" |grep -av \"https:\"|grep -av \"contactos\"|grep -av \"javascript\" |grep -av \"href=\\"/\"' , capture_output=True, shell=True, text=True, encoding='latin-1').stdout
print(dnd)

and wget -qO- "https://www.pasion.com/contactos-mujeres/600000003.htm?edadd=18&edadh=30" |grep -av "https:"|grep -av "contactos"|grep -av "javascript" |grep -av "href=\"/" executed from my shell (I use Bash) works fine as well. Why can't I accomplish the same in my Go code? How might I resolve this issue?

P.S. What is pasted here are just snippets of more lengthy programs.

CodePudding user response:

Escaping quotes within a language within a language is hard. Use alternate syntax when available to alleviate this pain.

Your syntax is complex because you chose to enclose the string in double quotes, but the string itself contains double quotes, so they must be escaped. Additionally, one of the quoted shell arguments contains a double quote of its own, which must be escaped for Bash as well as for Go. You escaped them, but made a typo in the escaping at the end:

"wget -qO- \"https://www.pasion.com/contactos-mujeres/%s.htm?edadd=18&edadh=30\"|grep -av \"https:\"|grep -av \"contactos\"|grep -av \"javascript\"|grep -av \"href=\\"/\""

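You escaped the backslash (\\), but did not include an additional backslash to escape the quote that follows it, so the string literal ended at that quote. The / is therefore outside the string and is applied to it as an operator. Strings have no / operator, hence the first error; the \ that follows is not valid outside a string literal, hence the second. The sequence needed at that point is \\\" (an escaped backslash followed by an escaped quote).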

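Written as a raw string literal (backticks), with the last pattern single-quoted for Bash, the same command needs no escaping at all:

`wget -qO- "https://www.pasion.com/contactos-mujeres/%s.htm?edadd=18&edadh=30"|grep -av "https:"|grep -av "contactos"|grep -av "javascript"|grep -av 'href="/'`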

Key takeaway: use backticks (Go raw string literals) when appropriate to enclose strings that contain quotes; then you won't need to escape quotes within the string.
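To make the escaping rule concrete, here is a minimal sketch that writes only the last grep stage both ways and checks that the two literals hold the same text. The constant names escaped and raw are made up for this illustration.

```go
package main

import "fmt"

// escaped is an interpreted string literal. Bash must receive \" inside its
// double quotes, so Go needs \\ (one backslash) followed by \" (one quote):
// three backslashes in a row before the quote.
const escaped = "grep -av \"href=\\\"/\""

// raw is the same text as a raw string literal. Nothing is escaped for Go,
// so it reads exactly as it would be typed at a Bash prompt.
const raw = `grep -av "href=\"/"`

func main() {
	fmt.Println(escaped)
	fmt.Println(raw)
	fmt.Println(escaped == raw) // the two literals are identical
}
```

The raw form is easier to check by eye because it can be compared character for character with the command that already works in the shell.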

Additionally, if you use single quotes in Bash, every character is treated literally until the next single quote. grep -av 'href="/' is more straightforward, no?

Key takeaway: use single quotes in Bash, when appropriate, to delimit literal strings.
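Both takeaways can be combined in the function that builds the command. The following is only a sketch: buildCommand and cmdTemplate are hypothetical names, and the program prints the command rather than running it.

```go
package main

import "fmt"

// cmdTemplate is a raw string literal, so no character needs escaping for Go.
// The last grep pattern is single-quoted for Bash, so its inner double quote
// is literal there too. A literal percent sign would have to be written as %%
// because the template goes through fmt.Sprintf.
const cmdTemplate = `wget -qO- "https://www.pasion.com/contactos-mujeres/%s.htm?edadd=18&edadh=30"|grep -av "https:"|grep -av "contactos"|grep -av "javascript"|grep -av 'href="/'`

// buildCommand (hypothetical name) fills the %s placeholder with the page id.
func buildCommand(num string) string {
	return fmt.Sprintf(cmdTemplate, num)
}

func main() {
	// Print only; passing the result to exec.Command("bash", "-c", ...) would run it.
	fmt.Println(buildCommand("600000005"))
}
```

Since the template is now identical to the shell command apart from the %s placeholder, it can be pasted into a terminal for testing after substituting a number.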

Better yet, don't shell out unless you really have to

All your pain here comes from taking code that was valid in Bash and trying to embed it in another programming language. Don't do that unless you really have to.

Consider an alternative here that might make your life easier:

  • Make the http request with Go's net/http library instead of wget.

  • Parse the HTML in the response with https://pkg.go.dev/golang.org/x/net/html which will be more robust than grep. HTML content does not grep well.
