Properties for a Push activity's split list of commits

A Push activity should contain a list of the commits pushed. In the trivial scenario, this list would simply be the value of the standard object property. But I have a use case that doesn't fit into a single list.

Suppose you push 1000 commits at once. This may not be common, but it happens. For example perhaps when creating a new branch, or a new repo that is a clone/fork with existing commits. Should the Push activity have all those 1000 commits?

I checked what forges do with web hooks. Pushing a web hook with 1000 commits means a huge payload, and it’s very likely the web hook handler doesn’t really need all those commits. One forge, IIRC Gitea, simply trims the list to 20 commits at most. Whoever needs more, can just git pull from the repo.

For a long time now, I’ve been doing IRC commit reports like this: If there are many commits, above some number, then display some of the first ones, then an ellipsis, then some of the latest ones. So, instead of displaying the last 20 commits, we could e.g. list the latest 10 commits pushed and the earliest 10. Or, say, the latest 9 and the 1 earliest commit pushed.
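The truncation I use in those IRC reports can be sketched roughly like this. It's a minimal illustration, not my actual bot code: the function names `abbreviate` and `report_lines`, the default numbers, and the assumption that the list is ordered oldest-first are all made up for the example.

```python
def abbreviate(commits, limit=20, head=10):
    """Split an oldest-first commit list into (first, omitted, last).

    `limit` is the maximum number of commits to show; `head` of them are
    taken from the start of the list, the rest from the end.
    """
    if not 0 <= head <= limit:
        raise ValueError("head must be between 0 and limit")
    if len(commits) <= limit:
        # Short enough: show everything, nothing omitted.
        return list(commits), 0, []
    tail = limit - head
    first = list(commits[:head])
    last = list(commits[len(commits) - tail:])
    return first, len(commits) - limit, last


def report_lines(commits, limit=20, head=10):
    """Build report lines: earliest commits, an ellipsis, latest commits."""
    first, omitted, last = abbreviate(commits, limit, head)
    lines = list(first)
    if omitted:
        lines.append(f"... ({omitted} commits omitted) ...")
    lines.extend(last)
    return lines


if __name__ == "__main__":
    demo = [f"commit {n}" for n in range(1, 1001)]
    # The earliest commit, an ellipsis line, then the latest 9.
    for line in report_lines(demo, limit=10, head=1):
        print(line)
```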

When you just see a list of some 3/5/10/20/whatever last commits, it’s not clear exactly what got pushed. Often people just guess, “oh this looks like many commits, including ones I’ve already seen, it’s likely someone just force-pushed or made a new branch/fork” etc. But instead of guessing, I want that info to be really provided, even if 1000 commits got pushed. What’s the earliest commit pushed, and what’s the latest? Also, specify how many commits got omitted in the middle.

So, I’m wondering how to model this in ActivityPub. In the simple list case, I’m doing this as follows: object maps to an OrderedCollection object, where items is a list of commits and totalItems says how many really got pushed, e.g. items could list 20 commits but totalItems could be 1000, letting you know that stuff got omitted for performance reasons etc.
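Here's a rough sketch of what that simple case could look like, built as a Python dict. The function name `build_push`, the example URLs and the newest-first ordering are assumptions for illustration, and `@context` is left out. I say items above; in the ActivityStreams 2.0 JSON serialization, the list of an OrderedCollection usually appears under the key `orderedItems`, so that's what the sketch emits.

```python
import json


def build_push(actor, commit_ids, limit=20):
    """Build a Push activity whose object lists at most `limit` commits.

    `commit_ids` is assumed to be ordered newest-first; the URLs used as
    commit ids are hypothetical.
    """
    return {
        "type": "Push",
        "actor": actor,
        "object": {
            "type": "OrderedCollection",
            # How many commits were really pushed.
            "totalItems": len(commit_ids),
            # Only the latest `limit` commits are actually listed.
            "orderedItems": list(commit_ids[:limit]),
        },
    }


if __name__ == "__main__":
    ids = [f"https://forge.example/repo/commits/{n}" for n in range(1000, 0, -1)]
    push = build_push("https://forge.example/alice", ids)
    print(json.dumps(push, indent=2)[:400])
```

A receiver comparing `totalItems` with the length of the list can tell that commits were left out.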

How do I do it with multiple lists? I can’t come up with any sane way. Ideally, keep having object map to an OrderedCollection, but somehow have 2 lists of items in there. Here’s one idea:

  • Suppose 1000 commits got pushed, but our limit to send is 20, so we’ll include the first 10 and the last 10, and we’ll omit the 980 in between
  • To support items and totalItems working the standard way, items will contain the latest 10 commits and totalItems will be set to 1000
  • An additional custom property earlyItems will contain the first 10 commits
  • Whoever wants to get the number 980 will need to grab the 1000, subtract the length of the earlyItems and items lists, and conclude “oh cool I have 10 first and 10 last, and in between 980 that I didn’t receive”
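The bullets above could be sketched like this, including the subtraction a receiver would do. The helper names `build_split_collection` and `omitted_count` are hypothetical, the commit list is assumed to be ordered newest-first, and the standard list is emitted under `orderedItems` (the usual JSON key for what I call items above). `earlyItems` is the custom property from the idea, not a standard one.

```python
def build_split_collection(commit_ids, latest_n=10, early_n=10):
    """Build an OrderedCollection with the latest and earliest commits.

    `commit_ids` is assumed newest-first. If everything fits within
    latest_n + early_n, all commits go into the standard list and
    `earlyItems` is not set.
    """
    total = len(commit_ids)
    collection = {"type": "OrderedCollection", "totalItems": total}
    if total <= latest_n + early_n:
        collection["orderedItems"] = list(commit_ids)
        return collection
    # Top of the list: the latest commits.
    collection["orderedItems"] = list(commit_ids[:latest_n])
    # Bottom of the list: the earliest commits (custom property).
    collection["earlyItems"] = list(commit_ids[total - early_n:])
    return collection


def omitted_count(collection):
    """Receiver side: how many commits were pushed but not included."""
    listed = len(collection.get("orderedItems", []))
    early = len(collection.get("earlyItems", []))
    return collection["totalItems"] - listed - early


if __name__ == "__main__":
    ids = [f"https://forge.example/repo/commits/{n}" for n in range(1000, 0, -1)]
    print(omitted_count(build_split_collection(ids)))
```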

This is weird, but I think it’s what I’ll do for now, just to get the implementation going.

Ideas very very welcome!!

I wish Collections had some built-in mechanism to specify first-items and last-items and how-many-omitted-in-between :slight_smile:

Another idea, maybe more weird, is to have a Collection object with totalItems set to e.g. 1000, and first set to a CollectionPage object listing e.g. 10 items, and last also set to a CollectionPage listing another 10 items.
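A rough sketch of that shape, just to make it concrete. The function name `build_paged_collection` is made up, the pages are embedded inline rather than linked, and putting the latest commits in first and the earliest in last is my assumption here; the list is assumed newest-first.

```python
def build_paged_collection(commit_ids, page_size=10):
    """Build a Collection whose `first` and `last` pages are embedded.

    `commit_ids` is assumed newest-first, so `first` holds the latest
    commits and `last` the earliest. Everything between the two pages
    is simply not included.
    """
    total = len(commit_ids)
    if total <= 2 * page_size:
        # Nothing to omit: a single embedded page carries everything.
        page = {"type": "CollectionPage", "items": list(commit_ids)}
        return {"type": "Collection", "totalItems": total, "first": page}
    return {
        "type": "Collection",
        "totalItems": total,
        "first": {
            "type": "CollectionPage",
            "items": list(commit_ids[:page_size]),
        },
        "last": {
            "type": "CollectionPage",
            "items": list(commit_ids[total - page_size:]),
        },
    }


if __name__ == "__main__":
    ids = [f"https://forge.example/repo/commits/{n}" for n in range(1000, 0, -1)]
    coll = build_paged_collection(ids)
    print(coll["totalItems"], len(coll["first"]["items"]), len(coll["last"]["items"]))
```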

So the purpose of the Push is not to convey all of the git information, but to publish the fact that there was a repository update? Is the actual repository content synchronized across instances using git itself? I haven’t looked at the ForgeFed implementation, but I was assuming that git repo synchronization uses standard mechanisms, not AP. If so, then your suggestion of including only up to 20 commits and having the total number sounds like the most reasonable idea.

@hankg, the purpose of the Push activity is the same as the purpose of the web hooks commonly available in forges like GitLab CE and Gitea (and I assume GitHub has them too). I’m just imitating the way those web hooks work: list some info about the commits, or at least some of the commits if the list is very long. And if someone needs the actual repo content, or the full list of commits, etc., they can just git pull and get the whole repo. That part is nothing innovative, it’s the way web hooks already work. All I’m doing is speccing their format in ForgeFed/ActivityPub, so that we’ll have a common format all forges (and other fediverse apps of course) can use to interact with and display that info.

It might make sense to phrase the limit on the number of commits as a SHOULD: servers SHOULD limit the list to the most recent X commits, or something like that. I would avoid a hard MUST here and instead leave a bit of room for implementation detail.

In the end, receiving servers are going to be the ones likely to limit what information they actually display and consume.

I would not over-engineer things too much. The first option, i.e. object is an OrderedCollection that contains a limited number of items and a totalItems count, sounds like the better of the two options to me :+1:

Thanks for the feedback :slight_smile: Current status:

  • Limiting the number of commits is entirely optional, and the limit itself is up to the server. I’m limiting to 20 right now in my implementation, just to demonstrate this feature.
  • The commits are in an OrderedCollection, and there’s a custom property earlyItems that can be used to send items from the bottom of the list, in addition to the standard items listing the top items. This is harmless and compatible: a receiver that doesn’t know earlyItems can ignore it and still read items and totalItems the standard way.

I’m almost done coding this, demo coming soon :slight_smile:
