Associating a commit author with an ActivityPub actor

In ActivityPub, usually you’d refer to a person by their actor URI. But when dealing with VCS repos, we get a different representation of identity. For example, in Git usually the author of a commit looks like Jane Doe <jane@doe.example>. The identifying detail is the email address.

The current situation, I suppose, is that a forge grabs the commit author’s email address and checks if there’s a user with that email address. The email address can be used to grab the user’s Libravatar etc., and to verify that the user’s email address is associated with the GPG key that signs the commit, if the commit is signed.

How do we do this on the Fediverse? There’s no global index matching email addresses to ActivityPub actors.

Ideas:

  • Somehow associate an actor URI with each commit, in addition to the email address or instead
  • Treat the author email address as a WebFinger account URI, query the WebFinger endpoint and get the actor URI from there
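
To make the WebFinger idea concrete, here is a minimal sketch in Python. It assumes the email host also serves WebFinger for that address, and that the actor is advertised as a rel="self" link with an ActivityPub media type (a common convention on the Fediverse, not something WebFinger itself requires). The function names are made up for illustration, and the HTTP request itself is left out.

```python
from typing import Optional
from urllib.parse import urlencode


def webfinger_url(email: str) -> str:
    """Build the WebFinger query URL for an author email treated as an acct: URI."""
    local, _, host = email.rpartition("@")
    if not local or not host:
        raise ValueError(f"not an address of the form user@host: {email!r}")
    query = urlencode({"resource": f"acct:{email}"})
    return f"https://{host}/.well-known/webfinger?{query}"


def actor_uri_from_jrd(jrd: dict) -> Optional[str]:
    """Return the href of the first rel="self" link with an ActivityPub media type."""
    activitypub_types = {
        "application/activity+json",
        'application/ld+json; profile="https://www.w3.org/ns/activitystreams"',
    }
    for link in jrd.get("links", []):
        if link.get("rel") == "self" and link.get("type") in activitypub_types:
            return link.get("href")
    return None  # the host answered, but advertises no ActivityPub actor
```

A forge would fetch webfinger_url("jane@doe.example"), parse the JSON response, and pass it to actor_uri_from_jrd. If the email host doesn’t run WebFinger at all, this simply yields nothing.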

How can this be done in Git? And in Darcs? And in other VCSs?

More ideas:

  • Use git notes to attach the actor URI to the commit
  • Have a file in the repo, called e.g. .gitauthors or _darcsauthors etc., which maps author email addresses to their ForgeFed actor URIs
  • Use the git prepare-commit-msg hook to insert the actor URI into the commit
  • Keep the git author actor URI in the git config, e.g. as user.forgefed-actor alongside user.email
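
As a rough illustration of the authors-file idea, here is a sketch of a parser in Python. The file format is purely an assumption (one "email actor-URI" pair per line, lines starting with # are comments); nothing like this is specified anywhere yet, and parse_authors_file is a hypothetical name.

```python
from typing import Dict


def parse_authors_file(text: str) -> Dict[str, str]:
    """Parse a hypothetical .gitauthors file into an email -> actor URI map."""
    mapping: Dict[str, str] = {}
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split()
        if len(parts) != 2:
            raise ValueError(f"line {number}: expected 'email actor-uri', got {raw!r}")
        email, actor_uri = parts
        # Compare emails case-insensitively; a later entry overrides an earlier one.
        mapping[email.lower()] = actor_uri
    return mapping
```

A forge could read this file from the default branch and use it when rendering commits. Note that anyone with push access can edit the file, so by itself it proves nothing about who owns an address.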

I think putting it into the VCS itself (e.g. into commit, repo, or config) is not the way to go.

I would rather build a mapping index based on the known network.
For example, each forge could expose a collection holding a mail→URI-ID map that is kept up to date by federating changes to it.
The mapping should be keyed on a hash of the mail address (similar to how Libravatar and Gravatar work); we don’t want spammers to harvest these lists, right?
Maybe that could even be part of the ForgeFed (or a related) spec.
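
A small Python sketch of what such a hashed index could look like locally. The normalization (trim whitespace, lowercase, then SHA-256) follows what Libravatar does for its hashes; the class and method names are hypothetical, and how the collection would actually be federated is left open.

```python
import hashlib
from typing import Dict, Optional


def email_hash(email: str) -> str:
    """Hash an email Libravatar-style: trim, lowercase, SHA-256, hex digest."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class HashedActorIndex:
    """In-memory map from email hash to actor URI (hypothetical structure)."""

    def __init__(self) -> None:
        self._by_hash: Dict[str, str] = {}

    def add(self, email: str, actor_uri: str) -> None:
        self._by_hash[email_hash(email)] = actor_uri

    def lookup(self, email: str) -> Optional[str]:
        return self._by_hash.get(email_hash(email))

    def export(self) -> Dict[str, str]:
        """The part that could be published: hashes and actor URIs only."""
        return dict(self._by_hash)
```

The exported map contains no plain addresses, so it can’t be harvested directly. Someone who already has a list of candidate addresses can still hash them and check for matches, so this limits harvesting rather than preventing it.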

And if the mail is not in the index, then you fall back to WebFinger and, if required, even Libravatar/Gravatar to at least show a picture if one is available there.

I think putting it into the VCS itself (e.g. into commit, repo, or config) is not the way to go.

Hmmm can you please explain your thinking, why you think putting the URI in the VCS isn’t a good idea and using an index is a good idea?

Here are some of my thoughts about this 🙂

The way I see it, putting the URI in the commit is the best way. The ForgeFed actor URI has the same status as the email address: It’s a way to identify the author, to contact them, to send them patches and so on (and it should work even when the author email isn’t even provided). In the same way the author email is attached to the commit, the actor URI can be too. Perhaps, should be.

Putting the URI in the commit makes it possible to associate GPG signatures on commits with the actor URI. It also allows ForgeFed not to depend on email addresses. The email-to-actor index complicates things, and forces people to use email. Also, email address hosts have no obvious match with forge URLs, so, given an email address, how do you even know which forge to ask? And what if that forge happens to be offline? And what if that forge doesn’t exist anymore?

Putting the URI in the commit just seems like the easiest and most secure option, because it would work in the same way author email works. Introducing a level of indirection ties ForgeFed to email addresses and introduces a whole new model with its own security implications.
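
One possible shape for "URI in the commit" is a trailer line at the end of the commit message, next to things like Signed-off-by. Below is a sketch of how a forge might read it back, in Python. The trailer name Forgefed-Actor is an assumption made up for this example, and the parsing is deliberately simpler than what git itself does for trailers.

```python
from typing import Optional

TRAILER_KEY = "Forgefed-Actor"  # hypothetical trailer name, not defined by any spec


def actor_from_commit_message(message: str) -> Optional[str]:
    """Return the actor URI from the trailer block of a commit message, if any."""
    paragraphs = [p for p in message.strip().split("\n\n") if p.strip()]
    if len(paragraphs) < 2:
        return None  # a lone subject line has no trailer block
    # Trailers live in the last paragraph, one "Key: value" pair per line.
    for line in paragraphs[-1].splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == TRAILER_KEY.lower():
            return value.strip()
    return None
```

Since a commit signature covers the commit message, a trailer like this would be covered by the GPG signature in the same way the author line is.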

Obviously, existing commits don’t have any actor URI attached to them. And then indeed we need some other way. Ideas:

  • Remember that some commits are made by people who aren’t on the Fediverse at all, so there’s no actor URI, and in ForgeFed that could mean just having an object for attributedTo that doesn’t have an @id, just name and email
  • For signed commits, find the actor URI attached to the GPG key (is it possible to do that, and to publish to a keyserver?)
  • If the email address is in use by a local user, can we assume it’s them?
  • Use some kind of index? But I’m not sure how. The safe way would be to have email servers be the authority over this, but Idk if that’s reasonably practical. Matrix has this identity server thing, but AFAIK it’s not federated and almost everyone uses the identity server run by matrix.org? I’d like to try not to introduce a centralized component.
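
For the first point, this is roughly what I mean, sketched in Python. The property names are assumptions: "id" is the usual ActivityStreams alias for @id, while "email" is not part of the ActivityStreams vocabulary and would have to be defined by ForgeFed.

```python
from typing import Dict, Optional


def attributed_to(name: str, email: str, actor_uri: Optional[str] = None) -> Dict[str, str]:
    """Build the attributedTo value for a commit author.

    Authors without a Fediverse account get an object with no id,
    carrying only the name and email found in the commit.
    """
    author = {"type": "Person", "name": name, "email": email}
    if actor_uri is not None:
        author["id"] = actor_uri
    return author
```

So attributed_to("Jane Doe", "jane@doe.example") gives an anonymous object, and passing an actor URI as the third argument gives the same object plus an id.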

Perhaps it’s also reasonable to just apply some guesses when the actor URI isn’t in the commit (look at GPG key if commit is signed; check if email address is associated with a local account; check if email address is associated with a known remote account; check if we have some existing Commit in which this email address is already associated with some actor URI), and over time people will be putting actor URIs in new commits and this problem will gradually disappear.
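
The chain of guesses could be sketched like this in Python. Everything here is hypothetical: each strategy is just a function from email address to actor URI (or None), tried in order, and the GPG-key step is omitted because it depends on the commit rather than on the email alone.

```python
from typing import Callable, Iterable, Optional, Tuple

# A lookup takes an author email and returns an actor URI, or None if it has no answer.
Lookup = Callable[[str], Optional[str]]


def resolve_actor(
    email: str, strategies: Iterable[Tuple[str, Lookup]]
) -> Tuple[Optional[str], Optional[str]]:
    """Try each named strategy in order; return (actor URI, strategy name)."""
    for name, lookup in strategies:
        actor_uri = lookup(email)
        if actor_uri is not None:
            return actor_uri, name
    return None, None  # nobody knows this address: show name and email only


# Example wiring with plain dicts standing in for real data sources.
local_accounts = {"jane@doe.example": "https://forge.example/users/jane"}
known_remote = {"john@doe.example": "https://other.example/users/john"}
seen_in_commits = {}

strategies = [
    ("local account", local_accounts.get),
    ("known remote account", known_remote.get),
    ("earlier commit", seen_in_commits.get),
]
```

Returning the strategy name along with the URI lets the UI distinguish a verified local match from a weaker guess.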

Indeed for the avatar we can use the email with Libravatar without needing the actor URI.

I’m also wondering whether we can/should do some public matching between the actor URI and the email address. Is that secure? Generally email addresses are privately used for login and for password reset, like, generally on the internet. Using email address hashes would be an improvement, I suppose, yeah. But we also need to consider how that index would work: Is it good enough to trust the “known network?” And more generally, what are the uses of the actor URI and how bad is it if sometimes it won’t be available? I suppose we can add an emailHash property to ForgeFed. Or is it better to add the email address itself? Is it a good idea? Does it cause any privacy/spam problem that doesn’t already exist?

Everyone’s thoughts are very welcome, please share 🙂

I am unable to write a proper reply, but here are some snippets I’d like to share:

RE Actor ID in Commit
I agree that is the cleanest way.
But I think that the problem will only slowly “gradually disappear” because of missing support for it in the VCS clients themselves.
For example, for git there does not seem to be a global prepare-commit-msg hook. (There is this template thing for new repositories, but you have to proactively configure that.)

In that sense I think that ForgeFed will have to deal with many more mail-only commits than with-actor-id commits for a long time.

RE Mail→ID mapping authority by mail domains
That would be equivalent to WebFinger, and I don’t see trust issues with that. (If you cannot trust the content hosted on the domain that belongs to your mail address, that’s not our problem.)

The mapping index of the mail address, btw, has nothing to do with the actual mail domain. If I search for the actor URI for foo@example.org, I won’t request an index from example.org (because if they had implemented our ForgeFed index, they could just as well have implemented WebFinger, and then no mapping via an index would be required).

So far… 🙂

We could use the ActivityPub handle as the email address: instead of @someuser@instance.tld, write it as someuser@instance.tld

I’m still not sure ForgeFed should try to describe commits at all except by the bare minimum like ref, author and obviously repository (see comment here, which I promise I will reply to…). I feel it’s better not to duplicate all the information between the VCS and the layer for transporting activities (ForgeFed). The forges themselves, or anyone interested in the commits, can just check the repository.

For this reason, I don’t think it makes sense to add stuff to the commit itself.

But since we can’t prove someone owns an email address in a commit, except if that email is verified locally, it’s not possible to trust remote profile emails to provide local links to profiles or profile pictures for commits shown in the UI, like GitHub or GitLab do.

Maybe just accept for now this is a side effect of federation until there is a way to securely prove you own an email, like an index based on trust? There are various decentralized identity server projects out there being worked on that might be useful without needing to reinvent the wheel.