ServiceInfo - specification for service metadata

Requirement

To build apps like https://the-federation.info, https://podupti.me, https://www.hello-matrix.net/public_servers.php, https://fediverse.network/ and others, application builders need to somehow get information about servers.

History

Different platforms have had different ways to provide metadata about themselves.

  • Diaspora initially offered an HTTP response header indicating the version of the software
  • Diaspora and compatible platforms later implemented a statistics.json route which was a simple JSON document listing version and usage information
  • This evolved into NodeInfo which is currently used there and in compatible networks, and additionally on many ActivityPub platforms.
  • I forked NodeInfo into NodeInfo2 to provide a more flexible metainfo document.
  • Mastodon offers an API endpoint for both server information and user activity.
  • Matrix offers an API endpoint for version information of the server.

Problems

NodeInfo

Before forking NodeInfo, I voiced criticism of how it hardcodes enum values for protocols and services into the specification. This means, for example, that the current NodeInfo is incompatible with many platforms like Matrix, and if a new version is released, every single platform needs to update its NodeInfo version while keeping backwards compatibility with the old ones forever. Additionally, the specification requires a .well-known path but still needs a separate lookup, due to the versioning scheme.

The lack of flexibility in NodeInfo means that some platforms “fill what they can” in the given keys and then fill “the real information” in the freeform metadata key.

My opinion is that the versioning is unnecessarily complex, which makes lookups complex, and that changing enum values is not possible without releasing a new, backwards-incompatible version.

NodeInfo2

NodeInfo2 did not go far enough in rethinking what platforms would need to export as information. The name and usage metrics are too closely tied to how NodeInfo does things.

Platform specific API endpoints

The Mastodon API endpoint is Mastodon specific, but platforms that have also implemented the Mastodon API (to make use of Mastodon mobile clients) look like Mastodon servers to apps reading information about the servers.

The Matrix API endpoint has a server software name key, but lacks all other metadata and metrics about the server.

Proposal

I would like to propose a new specification under Feneas or some other (neutral) organization namespace to collect the good parts of the previous metadata specs and learn from the bad parts, creating something that has support from a wide range of the federated web developers. This specification should be able to cover the common needs across the federated web, independent of protocols used, and also be flexible enough for platforms to offer more detailed information about server capabilities or platform specific information that might not interest other servers.

I have a strong personal interest in getting this done, if for nothing else than to allow https://the-federation.info to build a better view of the federated web, so I am happy to collect discussion and sync comments to the specification, from here and from the other sites where I will be spamming this call-out.

The current draft, which is a fork of NodeInfo2, can be found here:

https://git.feneas.org/feneas/serviceinfo

Comments, discussion, proposals, etc welcome here, on the issue tracker and anywhere you can reach me :slight_smile: I will try to sync comments outside of here and the issue tracker to here. I’ll be contacting various developers in the federated web directly to try and get as much cross-platform support as possible for this specification.

A follow up post to this will explain a few of the reasons for the changes between ServiceInfo and NodeInfo2/NodeInfo.


Edit: renamed to ServiceInfo

I’ve reviewed the above spec and example and it looks like a good start. I do have some questions though:

The areas that are the most free form are the protocol capabilities and the features. Everything else is relatively prescriptive. For the protocol I can see that it’s to avoid a lock-in to a protocol description that works for microblogging but not for sound sharing or messaging. Each protocol would then have a typical set of capabilities of its own, and if you care about the protocol you’d know it. Is that the essence of it for protocols? The features, however, are going to be totally free form, which would at best leave them as a KV lookup and nothing else. Or is there an idea that there would be standard fields around that too?

On the metrics, would it be possible to have multiple timespans reported for each metric? So for example you could pull daily, weekly, or monthly active user counts? Maybe I’m reading the metrics field wrong and it’s not a description of metrics you can query but instead the instantaneous metrics of the server. If it is the stats themselves, it seems like we are mixing static server data (everywhere else) with dynamic server data (the stats).

With respect to the formation of the file it could be useful to have a standard library or place to create this which can help fill in the fields and avoid having an accidental typo in a protocol name or something like that or having people scrounging around for templates. That could probably handle the 80-90% case. We could also write a validator which can flag format errors (say mangled JSON or missing required fields) or warnings (say the protocol isn’t in a list of known protocols).
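To make the validator idea a bit more concrete, here is a minimal sketch of the error/warning split described above. The required field names, the list of known protocols and the assumption that `protocols` is an object keyed by protocol name are all made up for illustration; the draft schema would define the real ones.

```python
import json

# Hypothetical values for illustration only; the real lists would come
# from the specification draft.
REQUIRED_FIELDS = ("version", "service")
KNOWN_PROTOCOLS = {"activitypub", "diaspora", "matrix", "xmpp"}


def validate(raw):
    """Return (errors, warnings) for a raw metadata document string."""
    errors, warnings = [], []
    try:
        doc = json.loads(raw)
    except json.JSONDecodeError as exc:
        # Mangled JSON is a hard error; nothing else can be checked.
        return [f"mangled JSON: {exc.msg}"], warnings
    if not isinstance(doc, dict):
        return ["top level must be a JSON object"], warnings
    for field in REQUIRED_FIELDS:
        if field not in doc:
            errors.append(f"missing required field: {field}")
    # Assumption: "protocols" is an object keyed by protocol name.
    protocols = doc.get("protocols", {})
    if isinstance(protocols, dict):
        for name in protocols:
            if name not in KNOWN_PROTOCOLS:
                # Unknown names may be typos, so warn rather than fail.
                warnings.append(f"unknown protocol: {name}")
    return errors, warnings
```

A typo such as `activtypub` would then show up as a warning rather than an error, which keeps the document free form while still catching accidents.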

Yes, I think the protocol capabilities would probably be very specific to a single protocol. I think consensus on how to utilize this would emerge through implementation. I added an example defining an array of extensions for ActivityPub, and one for Matrix indicating whether the server supports presence, which is a feature in the protocol but not all servers have it activated. I think there is a distinction between “protocol capabilities” and “server features”, and the two should not be confused. For example, the XMPP chat in Diaspora is a feature of the Diaspora server implementation, not a capability of the protocol.

It might make sense to draft some example files for the common platforms to get a better understanding of what the ServerInfo documents could look like.

Yes definitely that is the idea.

For example, Diaspora currently exposes monthly, half-year and yearly active user counts (looking back from the time the document is accessed), Mastodon exposes weekly buckets with start timestamps, and Matrix generates monthly active users via a scheduled process (but doesn’t currently expose it). I tried to think of a compromise, with ServerInfo allowing custom buckets to be defined with a time period while providing some example values for the types of metrics and bucket sizes.
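As a rough illustration of custom buckets with a time period, here is a minimal sketch that counts active users over several look-back periods, in the Diaspora style of looking back from the time of access. The bucket labels, the period lengths and the function name `active_user_counts` are assumptions for the example, not something taken from the draft.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical bucket definitions: label -> length of the look-back period.
BUCKETS = {
    "weekly": timedelta(days=7),
    "monthly": timedelta(days=30),
    "halfyear": timedelta(days=180),
}


def active_user_counts(last_seen, now, buckets=BUCKETS):
    """Count users whose latest activity falls inside each look-back period.

    last_seen maps a user id to the datetime of that user's latest activity.
    """
    return {
        label: sum(1 for seen in last_seen.values() if now - seen <= period)
        for label, period in buckets.items()
    }


# Example: four users, last active 1, 20, 100 and 400 days ago.
now = datetime(2019, 8, 1, tzinfo=timezone.utc)
last_seen = {
    "a": now - timedelta(days=1),
    "b": now - timedelta(days=20),
    "c": now - timedelta(days=100),
    "d": now - timedelta(days=400),
}
counts = active_user_counts(last_seen, now)
```

A platform that only has weekly data would simply pass a smaller `buckets` mapping, which is the flexibility the compromise is after.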

JSONSchema has excellent library support and already has validators, but building a generator and validator hosted at Feneas, for example, would be pretty trivial :+1:

Syncing from outside, on a related note, Marius Orcsik posted about https://federated.id, his idea for an AS2 Service exposing some information about an ActivityPub server: https://metalhead.club/@mariusor/102462702973576465

grin@spora.grin.hu mentioned that he has submitted an MSC to the Matrix spec for minimal uptime, user count and registration status information. Another MSC also exists for a well-known providing admin contact information.

Both parts would be covered by ServerInfo, but ServerInfo isn’t something that makes sense to reuse directly in the Matrix spec. It may still be a good idea to ensure that the provided information is at least compatible, should a Matrix server want to expose, for example, both Matrix API related information endpoints and a more generic information endpoint for the wider federated world (outside the Matrix spec).

Syncing some thoughts from a good private chat with grin, loosely written up by me.

ServerInfo is a site-wide well-known. As such it should be more generic and allow describing multiple services running on the domain. Possibly consider making serverinfo an array of services?

How organization.account should be formed is not specified. Possibly use some standard from the IETF?

Lots of pondering on whether ServerInfo could be hosted at a chosen location without being a well-known. One possibility would be discovery via .well-known/host-meta, which is an IETF standard. This still leaves the problem of only being able to describe one service, unless ServerInfo is an array.

Follow-up thoughts to improve:

  • Rename ServerInfo to ServiceInfo?
  • Several possible discovery methods instead of a given .well-known path, as follows:
  • via .well-known/host-meta as per RFC6415. A host-meta should be able to have multiple references to ServiceInfo documents, for example (JRD version), one defining service software in the link properties, one just defining the url:
...,
"links":[
        {
          "rel":"serviceinfo",
          "type":"application/json",
          "href":"https://matrix.example.com/_synapse/serverinfo",
          "properties":{
            "https://feneas.org/specs/serviceinfo#service.software":"synapse"
          }
        },
        {
          "rel":"serviceinfo",
          "type":"application/json",
          "href":"https://mastodon.example.com/api/serverinfo"
        }
]
  • via ActivityPub Actor. Provide an extension that can be referred to in context that indicates the presence of ServiceInfo property, for example serviceinfo:object (url or full document), instead of using the AS2 defined endpoints.
  • Instead of discovery, document that the full object can be given in host-meta or in an ActivityPub object, as per the host-meta and JSON-LD specifications.
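To show what the host-meta discovery option could look like from the client side, here is a minimal sketch that pulls ServiceInfo references out of a JRD host-meta document. It assumes the short `serviceinfo` rel and the `service.software` property URL from the example above; fetching the document (for example from `/.well-known/host-meta.json`, as RFC 6415 describes for the JRD form) is left out, and the function name is hypothetical.

```python
import json

# Both values are taken from the example above and may change in the draft.
REL = "serviceinfo"
SOFTWARE_PROP = "https://feneas.org/specs/serviceinfo#service.software"


def serviceinfo_links(jrd_text, rel=REL):
    """Return (href, software) pairs for each matching host-meta link.

    software is None when the link does not carry the software property.
    """
    jrd = json.loads(jrd_text)
    found = []
    for link in jrd.get("links", []):
        if link.get("rel") != rel or "href" not in link:
            continue
        properties = link.get("properties") or {}
        found.append((link["href"], properties.get(SOFTWARE_PROP)))
    return found
```

A crawler interested only in Synapse servers could filter on the second element of each pair and skip fetching the other documents.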

I think these changes would be better than making the document itself an array. That way ServiceInfo becomes a document that can be given for multiple services on the same domain. This also gets us away from requiring a new .well-known.

Other changes/additions I plan to propose:

  • “No crawling” indication ie a kind of robots.txt property. Probably just a boolean? This could be used to not include the instance in lists like the-federation.info.
  • Make service.software an object, with name (required), version (required) and repository (optional). This would mirror the latest addition to NodeInfo.
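Here is a minimal sketch of how a consumer like the-federation.info might handle these two additions. The property name `noCrawl` is purely hypothetical (no name has been decided), and the checks only cover the required/optional split listed above.

```python
# Hypothetical property name; nothing in the draft names it yet.
NO_CRAWL_KEY = "noCrawl"


def software_problems(software):
    """Return a list of problems found in a service.software value."""
    if not isinstance(software, dict):
        return ["service.software must be an object"]
    problems = []
    for key in ("name", "version"):
        # name and version are required; repository is optional.
        if not software.get(key):
            problems.append(f"service.software.{key} is required")
    return problems


def may_list(doc):
    """Allow listing unless the document explicitly opts out of crawling."""
    return doc.get(NO_CRAWL_KEY) is not True
```

Defaulting to "listing allowed" when the flag is absent mirrors how a missing robots.txt is usually treated, but that default is a design choice open for discussion.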

Opinions welcome! I’ll try to turn these changes into a PR and then produce some platform specific examples.

Using host-meta is definitely a good idea. It’s actually one of my issues with these nodeinfo standards, not one of them is registered at IANA: https://www.iana.org/assignments/well-known-uris/well-known-uris.xml

The same applies to rels: short names need to be registered (http://www.iana.org/assignments/link-relations/link-relations.xhtml), but there is an alternative, which is using URLs, e.g.:

          "rel":"https://serviceinfo.example.com/serverinfo",
          "type":"application/json",
          "href":"https://matrix.example.com/_synapse/serverinfo",
          "properties":{
            "https://feneas.org/specs/serviceinfo#service.software":"synapse"
          }

Have a nice day! :wave:

Thanks, good clarification!

Changing it to ServiceInfo makes sense. What’s the advantage of distributing via ActivityPub? Is that to support pub/sub?

Just like an AS2 actor can be either the full document or just the id URI, I think it makes sense that the serviceinfo property behaves the same way. Or not :thinking: Possibly the extension property could be serviceInfoUrl, to make it explicitly a URL that requires fetching, since the ServiceInfo document isn’t an AS2 Object type. I suppose it could be. But since ServiceInfo isn’t ActivityPub specific, I’d rather not go down that road.

There is a PR with a lot of modifications, including name change to ServiceInfo.

I will be additionally doing the discovery changes discussed above tomorrow. Then I hope to produce a single HTML document for the spec instead of two separate markdown docs. After that, will be pushing for discussion to various places like project issue trackers and the W3C SocialCG group.

Feedback welcome.

Users bob and nolan indicated on the fediverse that RDF (JSON-LD) should be used to make the document extensible. I replied as follows:


I did think about this, contemplating making the document JSON-LD. I felt though that this would introduce unnecessary complexity into the spec and frighten away those projects that don’t currently deal with JSON-LD at all (i.e. non-ActivityPub projects).

However, lately I’ve been thinking that it would still make sense to do, for the extensibility. These comments here reinforce that thought, so thanks for raising it. I’ll mirror these comments on the forum thread and propose JSON-LD serialization in the next draft.

My experience with RDF is rather limited (from AP world only), so comments on the draft would be very welcome once incorporated.

Note, to those going to cry “oh no please don’t” - JSON-LD can be treated as pure JSON if one wants to. That is what I do with ActivityPub :slight_smile: But I do agree it’s a nice thing to have on top for future proofing the specification.

Thoughts?

The specification draft now describes discovery via host-meta and/or JSON-LD objects and that ServiceInfo itself is a JSON-LD document.

Next:

  • rewrite the schema in a way that makes sense with the move to JSON-LD
  • provide actual examples
  • possibly add support to my federation library so I can get some live documents out there?

Then going to submit again for wider review, including the W3C SocialCG group.

Thoughts welcome, as always.

Btw, I’m kind of conflicted on requiring XRD format host-meta from clients and services, but that is what the host-meta basically is. JRD is only a recommendation. Many fediverse platforms already implement an XRD well-known due to webfinger.
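For what it is worth, reading an XRD host-meta is not much work either. A minimal sketch using only the Python standard library, assuming the same short `serviceinfo` rel as in the JRD example earlier in the thread (the function name is hypothetical and fetching is left out):

```python
import xml.etree.ElementTree as ET

# XRD 1.0 namespace as defined by OASIS.
XRD_NS = "http://docs.oasis-open.org/ns/xri/xrd-1.0"


def serviceinfo_hrefs(xrd_text, rel="serviceinfo"):
    """Return the href of every Link element with the given rel."""
    root = ET.fromstring(xrd_text)
    return [
        link.get("href")
        for link in root.iter(f"{{{XRD_NS}}}Link")
        if link.get("rel") == rel and link.get("href")
    ]


sample = """<XRD xmlns="http://docs.oasis-open.org/ns/xri/xrd-1.0">
  <Link rel="lrdd" template="https://example.com/.well-known/webfinger?resource={uri}"/>
  <Link rel="serviceinfo" type="application/json"
        href="https://matrix.example.com/_synapse/serverinfo"/>
</XRD>"""
hrefs = serviceinfo_hrefs(sample)
```

Platforms that already serve an XRD well-known for webfinger would only need to add one more `Link` element per service.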