[Icecast-dev] Icecast stats.xml
Roger Hågensen
2014-10-23 06:38:39 UTC
Consider this a Ticket for Icecast 2.4

********************************************************************************
If you look at
{{{
admin/stats.xml
}}}

on an Icecast-KH server (default setup) and an Icecast 2.4 server
(default setup) the following is one of the things that the KH branch
has as extra info.

{{{
<listener id="3581">
<ID>3581</ID>
<IP>127.0.0.1</IP>
<UserAgent>foobar2000/1.3.3</UserAgent>
<lag>42631</lag>
<Connected>1028</Connected>
</listener>
}}}

The fact that Icecast 2.4 lacks this info makes it impossible (or close
to impossible short of scraping the listener page) to collect listener
time stats.

The listener stats (the ID, IP and Connected values) are vital for building the
logs that StreamLicensing.com needs.
Due to this SL can only support Shoutcast v1, Shoutcast v2 and Icecast-KH.
These stats are vital for the calculation and reporting of the royalties
to SoundExchange and various PROs.
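
For illustration, here is a minimal Python sketch of how a collector might pull the ID, IP and Connected values out of a KH-style stats.xml. The listener block is the one shown above. The surrounding icestats/source wrapper, the mount name, and the reading of Connected as seconds are assumptions, not something confirmed in this thread.

```python
import xml.etree.ElementTree as ET

# The <listener> block is copied from the post; the <icestats>/<source>
# wrapper and the mount name "/stream" are assumptions for this sketch.
SAMPLE = """<icestats>
  <source mount="/stream">
    <listener id="3581">
      <ID>3581</ID>
      <IP>127.0.0.1</IP>
      <UserAgent>foobar2000/1.3.3</UserAgent>
      <lag>42631</lag>
      <Connected>1028</Connected>
    </listener>
  </source>
</icestats>"""


def parse_listeners(xml_text):
    """Return one dict per <listener> element found anywhere in the document."""
    root = ET.fromstring(xml_text)
    listeners = []
    for node in root.iter("listener"):
        listeners.append({
            "id": node.findtext("ID"),
            "ip": node.findtext("IP"),
            "user_agent": node.findtext("UserAgent"),
            # <Connected> is treated as seconds connected (an assumption).
            "connected": int(node.findtext("Connected", default="0")),
        })
    return listeners


for listener in parse_listeners(SAMPLE):
    print(listener["id"], listener["ip"], listener["connected"])
```

Without the listener elements, as on a default Icecast 2.4 setup, the same function would simply return an empty list.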

This causes a small issue as server hosting companies only support
Icecast (not the -KH branch), and Centova Cast (which many hosters use
for their backend) does not support Icecast-KH either.

This causes a deadzone where Icecast 2.4 can not be used as the
streaming server.

Icecast 2.4 and Icecast-KH should have parity on stats.xml as Icecast
2.4 and Icecast-KH should be interchangeable using the default
out-of-the-box settings.
********************************************************************************

PS! Something must be wrong with Trac on Xiph.org.
Each time I tried to submit the above I got an error saying:
"SpamBayes determined spam probability of 72.83%"
How can the above be spam (the ***** not included, obviously)? Has the
Bayesian filter been poisoned?
Initially I got a spam probability of 50%. I changed the example url of
admin/stats.xml to not include http and localhost and port number
thinking that was the issue, but removing that made it reach 73% instead.
--
Roger "Rescator" Hågensen.
Freelancer - http://www.EmSai.net/
Thomas B. Rücker
2014-10-23 07:02:49 UTC
Hi,

Thanks for taking the time to report this.
Post by Roger Hågensen
Consider this a Ticket for Icecast 2.4
********************************************************************************
If you look at
{{{
admin/stats.xml
}}}
on an Icecast-KH server (default setup) and an Icecast 2.4 server
(default setup) the following is one of the things that the KH branch
has as extra info.
{{{
<listener id="3581">
<ID>3581</ID>
<IP>127.0.0.1</IP>
<UserAgent>foobar2000/1.3.3</UserAgent>
<lag>42631</lag>
<Connected>1028</Connected>
</listener>
}}}
The fact that Icecast 2.4 lacks this info makes it impossible (or close
to impossible short of scraping the listener page) to collect listener
time stats.
The listener stats (the ID, IP and Connected values) are vital for building the
logs that StreamLicensing.com needs.
Due to this SL can only support Shoutcast v1, Shoutcast v2 and Icecast-KH.
These stats are vital for the calculation and reporting of the royalties
to SoundExchange and various PROs.
Hmm, thanks for summarizing this.
I have an item to look into exactly this type of thing, but it didn't
even reach trac yet.
Post by Roger Hågensen
This causes a small issue as server hosting companies only support
Icecast (not the -KH branch), and Centova Cast (which many hosters use
for their backend) does not support Icecast-KH either.
This causes a deadzone where Icecast 2.4 can not be used as the
streaming server.
It is very important for us to know about such things. So far we were
mostly under the impression that a combination of playlist.log and the
information available through the various admin XML representations
(there is more than the main stats XML) would be sufficient. Only last
June I ran into some mentions of a licensing service and actually
reached out to them, but have yet to go through the information they
provided.
Post by Roger Hågensen
Icecast 2.4 and Icecast-KH should have parity on stats.xml as Icecast
2.4 and Icecast-KH should be interchangeable using the default
out-of-the-box settings.
We'd prefer it this way, yes. Sadly KH has been continuously diverging
and it's really hard to still call it a branch. It only vaguely syncs
things from Icecast trunk, but there is zero flow back to trunk. As much
as I don't like doing that, I'm going to call it a fork.
Post by Roger Hågensen
********************************************************************************
PS! Something must be wrong with Trac on Xiph.org.
"SpamBayes determined spam probability of 72.83%"
I'm terribly sorry that you get caught in this.
Post by Roger Hågensen
How can the above be spam (the ***** not included, obviously)? Has the
Bayesian filter been poisoned?
I've been training the filter with good submissions, but it still
sometimes comes up and barfs at submissions like yours. I suspect it
might be due to the similarity to HTML and how many spammers just try to
dump random HTML into forms.
Post by Roger Hågensen
Initially I got a spam probability of 50%. I changed the example url of
admin/stats.xml to not include http and localhost and port number
thinking that was the issue, but removing that made it reach 73% instead.
I'll disable the Bayesian filter, as it has caused too many problems. This
may lead to false negatives, but we usually catch them quite quickly.


Cheers

Thomas
Daniel James
2014-10-23 09:28:12 UTC
Hi Thomas,
Post by Thomas B. Rücker
So far we were
mostly under the impression that a combination of playlist.log and the
information available through the various admin XML representations
(there is more than the main stats XML) would be sufficient.
The critical issue for us has been whether the Icecast service is
provided by a third party. In that case, you need some API to fetch
listener connection times and listener IP addresses for geolocation,
assuming that your Icecast provider does not deliver these stats as part
of the hosting package.

Since there is no world-wide flat fee system for calculating music
royalties, you need to know at least the 'aggregate tuning hours' and
the territories of listeners in order to make sure that you are paying
the right amount to the right national collecting society.
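
As a worked example of the arithmetic, 'aggregate tuning hours' is just the sum of all listening time, expressed in hours, which can then be split by territory. The sketch below assumes each session has already been reduced to a (territory, seconds connected) pair; the sample values are made up.

```python
from collections import defaultdict


def aggregate_tuning_hours(sessions):
    """Sum listening time per territory, in hours.

    `sessions` is an iterable of (territory, seconds_connected) pairs.
    """
    totals = defaultdict(float)
    for territory, seconds in sessions:
        totals[territory] += seconds / 3600.0
    return dict(totals)


# Made-up sample: two US sessions (30 and 90 minutes), one Norwegian (60 minutes).
sessions = [("US", 1800), ("US", 5400), ("NO", 3600)]
print(aggregate_tuning_hours(sessions))  # {'US': 2.0, 'NO': 1.0}
```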

There are some reciprocal royalty arrangements for recording royalties
organised by the IFPI, but these arrangements do not include
SoundExchange in the USA. Then you have the separate issue of songwriter
copyrights, which vary from country to country.

If you run your own Icecast server, you can collect and report this
listener information from logs using something like Piwik or Kibana, e.g.:

http://sourcefabric.booktype.pro/airtime-25-for-broadcasters/icecast-statistics-with-piwik/

but even then, you might prefer to isolate the servers and fetch the
information over an API instead.

Ultimately, it would be great if the Icecast admin interface provided
the stats breakdown and something like monthly reports, but that might
be more feature creep than you're willing to contemplate.

Cheers!

Daniel
Roger Hågensen
2014-10-23 19:12:14 UTC
Post by Thomas B. Rücker
Thanks for taking the time to report this.
It is very important for us to know about such things. So far we were
mostly under the impression that a combination of playlist.log and the
information available through the various admin XML representations
(there is more than the main stats XML) would be sufficient. Only last
June I ran into some mentions of a licensing service and actually
reached out to them, but have yet to go through the information they
provided.
The way StreamLicensing.org does it is to log in (authenticate) as admin
on the Shoutcast/Icecast server, and in the case of Icecast-KH they just
retrieve stats.xml.
For Icecast 2.4 this does not work. Sure, the song title and some other
info is there, just like with Icecast-KH (which only has an extra
yellowpages line repeating existing info for some reason),
but Icecast 2.4 lacks the listener list, which is what StreamLicensing needs.
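
A rough sketch of what such an authenticated fetch could look like in Python, assuming the admin interface is protected by HTTP Basic authentication. The host, port and credentials are placeholders, and this is not StreamLicensing's actual code.

```python
import base64
import urllib.request


def build_stats_request(host, port, user, password):
    """Build (but do not send) an HTTP Basic authenticated request for stats.xml."""
    url = f"http://{host}:{port}/admin/stats.xml"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def fetch_stats(request, timeout=10):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Placeholder values; nothing is sent until fetch_stats() is called.
request = build_stats_request("localhost", 8000, "admin", "changeme")
print(request.full_url)
```

The returned text can then be handed to any XML parser. Whether it contains a listener list depends on which server produced it.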

If modifying Icecast 2.4.1 (or later) so stats.xml matches what
Icecast-KH outputs is not an option, then perhaps some of the info
available in stats.xml could also be presented in a listeners.xml instead?
That way StreamLicensing (and any other similar services, though after
LoudCity shut down there really aren't any others) would only need to fetch
admin/listeners.xml
With the stats from stats.xml plus a listener list, the needed info is
there in the .xsl (from what I can see).

The system StreamLicensing uses is fully automated, whereas the log
would have to be manually submitted. (as far as I'm aware)

I could poke StreamLicensing for more details on the stuff from
stats.xml they need.
But I know (from previous emails with them) that the following is the
key part, UserAgent is probably not needed, but I'm assuming that some
people (like me) might want to automatically fetch UserAgent info to see
which players (and versions) are the most common.

<listener id="3581">
<ID>3581</ID>
<IP>127.0.0.1</IP>
<UserAgent>foobar2000/1.3.3</UserAgent>
<Connected>1028</Connected>
</listener>


Do note that while having it identical to Icecast-KH would allow
StreamLicensing to simply point their grabber at an Icecast 2.4.x server
with no code changes, if the tag names and such are different then
they'd be willing to add code for parsing that, as long as the key
information is still there.
I.e.: the IP, the ID, the Connected info (and preferably the UserAgent too).

I'll double check with StreamLicensing what the minimum info needed is,
and what the preferable is (same as Icecast-KH's stats.xml is my guess
though).

Regards,
Roger.
Roger Hågensen
2014-10-24 08:50:18 UTC
I got word back from the guy who developed the polling/parsing code for StreamLicensing.
The following are the XML tags/fields that are fetched from Icecast-KH's stats.xml.

Regards,
Roger.



*******************************
We connect to each stream’s stats.xml file for collection of Icecast stats.
The typical URL is of course hostname:port/admin/stats.xml?mount=/mountname
We collect the following data. Note that the listener nodes are repeated, one for each listener on that stream.
Some of the metadata like max_listener and listener_peak we store but do not currently use.
However, unless we are to change our scripts to support Icecast2, which may be in the cards in the future, then we expect to see xml in a format used by Icecast.
I understand that clustering and/or an aggregation stats server may be needed, but because each stream is polled every 60 to 90 seconds, parsing large chunks of aggregated data can cause latency in excess of our safety thresholds.
Please consider that requirement in your design approach.

<client_connections>
<source>
<genre>
<server_url>
<server_name>
<bitrate>
<title>
<max_listener>
<listener_peak>

<listener id>
<IP>
<UserAgent>
<Connected>
********************************
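
To make the field list above concrete, here is a hedged Python sketch that builds the per-mount URL and pulls those tags out of a response. The tag names are taken as written in the list. The nesting (client_connections directly under the root, the rest under a source element with a mount attribute) and the http:// scheme are assumptions.

```python
import xml.etree.ElementTree as ET

# Per-source tags, spelled exactly as in the list above.
SOURCE_FIELDS = ("genre", "server_url", "server_name", "bitrate",
                 "title", "max_listener", "listener_peak")


def stats_url(host, port, mount):
    """Build the hostname:port/admin/stats.xml?mount=/mountname style URL."""
    return f"http://{host}:{port}/admin/stats.xml?mount={mount}"


def extract_mount_stats(xml_text, mount):
    """Return the listed fields for one mount, or None if the mount is absent."""
    root = ET.fromstring(xml_text)
    for source in root.iter("source"):
        if source.get("mount") != mount:
            continue
        # Assumed layout: <client_connections> is a direct child of the root.
        stats = {"client_connections": root.findtext("client_connections")}
        for name in SOURCE_FIELDS:
            stats[name] = source.findtext(name)  # None when the tag is missing
        # One entry per repeated <listener> node on this stream.
        stats["listeners"] = [
            {
                "id": node.get("id"),
                "ip": node.findtext("IP"),
                "user_agent": node.findtext("UserAgent"),
                "connected": int(node.findtext("Connected", default="0")),
            }
            for node in source.iter("listener")
        ]
        return stats
    return None


print(stats_url("example.org", 8000, "/mountname"))
```

Requesting a single mount keeps each response small, which fits the concern above about parsing large aggregated documents on every poll.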
Daniel James
2014-10-27 12:00:05 UTC
Hi Roger,
Post by Roger Hågensen
The way StreamLicensing.org does it, is to login (authenticate) as admin
on the shoutcast/icecast server, and in the case of Icecast-KH they just
retrieve the stats.xml
Logging in as admin is not ideal for this purpose because you may not want
your stats provider or licensing authority to have that much control
over your Icecast instances. It would be better to have a 'stats'
login/password that could only gather stats, nothing else.
Post by Roger Hågensen
I.e.: the IP, the ID, the Connected info (and preferably the UserAgent too).
There are privacy issues here. Is it really necessary for the licensing
authority to have the timestamp/IP address/user agent of each listener?
Imagine the radio station might have political content which could get
listeners into trouble if the government of a certain country knew who
they were.

It might be better to look up the geolocation for each IP, then delete
the IP from the record. In most cases I know of, the licensing authority
only requires the headline numbers (such as aggregate tuning hours for
listeners in the US, in the case of SoundExchange).
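
A minimal sketch of that idea in Python: resolve each IP to a country, then drop the IP before the record is stored or passed on. The lookup function here is a hypothetical stand-in backed by a tiny table. A real setup would query whatever GeoIP database is available.

```python
def anonymize(listeners, lookup_country):
    """Return copies of the listener records with 'ip' replaced by 'country'."""
    cleaned = []
    for listener in listeners:
        # Copy every field except the IP address.
        record = {key: value for key, value in listener.items() if key != "ip"}
        record["country"] = lookup_country(listener["ip"])
        cleaned.append(record)
    return cleaned


def demo_lookup(ip):
    """Hypothetical stand-in for a GeoIP lookup (documentation-range addresses)."""
    table = {"192.0.2.10": "US", "198.51.100.7": "NO"}
    return table.get(ip, "ZZ")  # "ZZ" marks an unknown territory in this sketch


listeners = [
    {"id": "1", "ip": "192.0.2.10", "connected": 1800},
    {"id": "2", "ip": "198.51.100.7", "connected": 3600},
]
print(anonymize(listeners, demo_lookup))
```

The original records are left untouched, so it is up to the caller to discard them once the anonymized copies exist.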

Also, there are issues with polling stats every 60-90 seconds. This will
miss a lot of short duration listeners, which can give you useful
information.

For example, if a particular user agent has lots of very short
connections, there could be a compatibility issue with the stream format
which you would want to know about. Or, people in a particular country
don't like your content as much as listeners in another country, so they
manually disconnect in the first minute.
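
To illustrate how short sessions slip between polls, here is a small Python sketch. It assumes polls happen at exact multiples of the interval (0, 60, 120, ... seconds) and that a session is only seen if a poll falls while it is connected. The sample sessions are made up.

```python
import math


def seen_by_polling(start, duration, interval):
    """True if a poll at t = 0, interval, 2*interval, ... lands in [start, start + duration)."""
    first_poll = math.ceil(start / interval) * interval  # first poll at or after start
    return first_poll < start + duration


# (start_second, duration_seconds) pairs, made up for illustration.
sessions = [(5, 20), (50, 15), (100, 300), (130, 45)]
missed = [s for s in sessions if not seen_by_polling(s[0], s[1], 60)]
print(missed)  # [(5, 20), (130, 45)]
```

Any session shorter than the interval can be missed, depending on when it starts, while a session at least as long as the interval is always caught by some poll.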

One solution is to provide instantaneous stats by polling, so the radio
station has some live feedback, but for the detailed reports compile
daily, weekly or monthly stats from the full log.
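
As a sketch of the log-based half of that approach, the Python below sums listening time per day from access log lines. It assumes a combined-style log line with the connection duration in seconds as a trailing field. Check this against your own access.log before relying on it. The sample lines are made up.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumed layout: combined log format plus a trailing duration in seconds.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)" '
    r'(?P<seconds>\d+)$'
)


def hours_per_day(lines):
    """Sum listening hours per calendar day; lines that do not match are skipped."""
    totals = defaultdict(float)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue
        stamp = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        totals[stamp.date().isoformat()] += int(match.group("seconds")) / 3600.0
    return dict(totals)


sample_lines = [
    '192.0.2.10 - - [23/Oct/2014:06:38:39 +0000] "GET /stream HTTP/1.1" 200 4112000 "-" "foobar2000/1.3.3" 1800',
    '198.51.100.7 - - [23/Oct/2014:09:10:00 +0000] "GET /stream HTTP/1.1" 200 9000000 "-" "VLC/2.1.5" 5400',
]
print(hours_per_day(sample_lines))  # {'2014-10-23': 2.0}
```

Because the log has one line per connection, very short sessions are counted too. Weekly or monthly totals follow by grouping on a coarser key than the date.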

Cheers!

Daniel
