I am working on an audio-related project. Is there a way to tell programmatically whether an audio URL is a streaming (radio) source, for example from the response headers or somewhere else? I want to apply different filters or processing depending on whether the audio is a live stream or not.
CodePudding user response:
I'd request the resource and check the Content-Type header, which should indicate what the response contains. Several values are used for audio, but probably only a few are used for streaming. In some cases you might also have to look at the file name extension.
If you want to check the MIME type without downloading the audio stream (which by design never ends), send an HTTP HEAD request.
From https://www.rfc-editor.org/rfc/rfc7231#section-4.3.2:
The HEAD method is identical to GET except that the server MUST NOT send a message body in the response (i.e., the response terminates at the end of the header section). The server SHOULD send the same header fields in response to a HEAD request as it would have sent if the request had been a GET, except that the payload header fields (Section 3.3) MAY be omitted. This method can be used for obtaining metadata about the selected representation without transferring the representation data and is often used for testing hypertext links for validity, accessibility, and recent modification.
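A minimal sketch of that check in Python, using only the standard library. The function names, the User-Agent string, and the choice to lowercase header names are illustrative assumptions, and not every streaming server necessarily answers HEAD the same way it answers GET, so treat a failure here as "unknown" rather than "not audio".

```python
import urllib.request


def build_head_request(url):
    # method="HEAD" asks the server for headers only, without the body.
    # The User-Agent value is a made-up placeholder.
    return urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": "stream-probe/0.1"}
    )


def fetch_headers(url, timeout=10):
    # Returns the response headers with lowercased names (an assumption
    # made here so later lookups are case-insensitive).
    with urllib.request.urlopen(build_head_request(url), timeout=timeout) as resp:
        return {name.lower(): value for name, value in resp.headers.items()}


def looks_like_audio(content_type):
    # Strip parameters such as "; charset=..." before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type.startswith("audio/") or media_type == "application/ogg"
```

For example, `looks_like_audio(fetch_headers(url).get("content-type", ""))` would give a first rough answer about the content, though not about whether it is live.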
CodePudding user response:
Unfortunately, it's not really possible to say definitively whether a URL is "live" or not. There are, however, several heuristics you can use:
Response has a Content-Length header
A live radio stream has no determinate ending, and therefore no length. If the response declares that it does have a length, you can usually guess that it is not live.
Just keep in mind that the inverse is not always true. There are still servers that do not provide a Content-Length even if the length can be known. (All the examples I can think of are poorly configured reverse proxies, but they do exist.)
Also, sometimes a streaming server will return a very large length. This allows it to comply with HTTP/1.1 without using chunked transfer encoding. (Many streaming clients do not support chunked transfer encoding.) If you rely on this method, make sure the Content-Length is actually some sane value, and not MAX_INT.
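The Content-Length heuristic could be sketched roughly like this. The function name, the return labels, and the 1 GiB "sane" ceiling are all assumptions chosen for illustration; pick a ceiling that fits the files you actually expect. The `headers` argument is assumed to be a plain dict with lowercased header names.

```python
# Assumed ceiling: anything at or above this is treated as a placeholder
# length (e.g. a MAX_INT-style value) rather than a real file size.
SANE_MAX_BYTES = 1024 ** 3  # 1 GiB


def classify_by_content_length(headers, sane_max=SANE_MAX_BYTES):
    raw = headers.get("content-length")
    if raw is None:
        # No length: could be live, could be a misconfigured proxy.
        return "unknown"
    try:
        length = int(raw.strip())
    except ValueError:
        return "unknown"
    if length <= 0 or length >= sane_max:
        # Implausible length, so it tells us nothing reliable.
        return "unknown"
    return "probably not live"
```

Note that this only ever answers "probably not live" or "unknown", which matches the point above that a missing length does not prove a stream is live.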
Server response header contains "SHOUTcast" or "Icecast"
These are the two most popular HTTP progressive streaming servers, and serve almost all of the typical radio streams. If the server identifies itself as one of these, it's likely a live radio stream.
A caveat is that at least Icecast can also be used to serve static files. It's not uncommon to see it serving recordings of streams, for example.
Also, servers can say they are anything they want, so there's no guarantee the server isn't lying to you.
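Checking the Server header is a simple substring match. In this sketch the function name is hypothetical and `headers` is again assumed to be a dict with lowercased names; given the caveats above, the result is only a hint.

```python
STREAMING_SERVER_NAMES = ("icecast", "shoutcast")


def server_suggests_stream(headers):
    # Case-insensitive match, since servers vary in how they spell
    # their own name and usually append a version string.
    server = headers.get("server", "").lower()
    return any(name in server for name in STREAMING_SERVER_NAMES)
```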
Response rate
If the server sends you 30 minutes worth of audio data in 30 seconds, then that's a really good indicator that it isn't live. :-)
It's rarely that clear-cut, though. Streaming servers are usually configured with a decent-sized buffer that they send to you as quickly as possible when you connect. Sometimes it's tiny, like 64 KB; other times it's a whole megabyte. So, if you want to rely on this, you'll have to fetch a decent amount of data to get a good indication.
Some non-live servers will also throttle their responses for regular pre-recorded files if they think they can "stream" them to you. I haven't seen this recently, but it used to be a common way to deter people from downloading, as it would take a long time. And then there are servers that just don't have the bandwidth to deliver to you quickly, so the stream may seem "live" even if it isn't.
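One way to put a number on the response rate is to compare how many seconds of audio arrived with how many seconds elapsed. This sketch assumes you already know or can guess the stream's bitrate, and it discounts an assumed initial burst (1 MiB by default, the upper end of the buffer sizes mentioned above). The function and parameter names are illustrative.

```python
def realtime_ratio(bytes_received, elapsed_seconds, bitrate_bps,
                   burst_bytes=1024 * 1024):
    """Seconds of audio delivered per wall-clock second, ignoring the
    initial burst. Around 1.0 suggests real-time delivery; much larger
    values suggest a plain file download."""
    if elapsed_seconds <= 0 or bitrate_bps <= 0:
        raise ValueError("elapsed_seconds and bitrate_bps must be positive")
    # Drop the assumed connect-time buffer so it does not inflate the rate.
    payload_bytes = max(bytes_received - burst_bytes, 0)
    audio_seconds = payload_bytes * 8 / bitrate_bps
    return audio_seconds / elapsed_seconds
```

With a 128 kbps stream, 30 minutes of audio arriving in 30 seconds gives a ratio far above 1, while a ratio close to 1 is consistent with live delivery, with throttling, or simply with a slow server, as noted above.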
"As-live" streams
Many channels are generated programmatically and are recordings sent to you as if they were live. There isn't a way to determine this.
So, what to do?
This depends on your specific needs. If you're making a crawler or something, connect to the stream and if it doesn't end after [some bytes] or [some duration] or [some length of connection time], then flag it as "probably live" and move on.
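The crawler approach might look roughly like the following. It is only a sketch: the function name, the default limits, and the return labels are assumptions, and `stream` can be any object with a `read(n)` method, such as the response returned by `urllib.request.urlopen(url, timeout=10)`.

```python
import time


def probe_stream(stream, max_bytes=5 * 1024 * 1024, max_seconds=30.0,
                 chunk_size=16 * 1024):
    start = time.monotonic()
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            # The server closed the response before we hit a limit.
            return "ended"
        total += len(chunk)
        if total >= max_bytes or time.monotonic() - start >= max_seconds:
            # Still sending after our budget: flag it and move on.
            return "probably live"
```

The time limit is only checked between reads, so a socket timeout on the underlying connection is still needed to keep a stalled server from blocking the crawler.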