Conceptual overview of ActivityPub done

Christopher Lemmer Webber 2019-07-18 13:12:57 -04:00
parent bca5003de8
commit 8e578a58c8
No known key found for this signature in database
GPG Key ID: 4BC025925FF8F4D3


@ -12,16 +12,106 @@
# - ActivityPub is an actor model protocol.
# - The general design can be understood from the overview section of the spec
[[https://www.w3.org/TR/activitypub/][ActivityPub]] is a federated social network protocol.
Its general design is fairly easy to understand by reading the
[[https://www.w3.org/TR/activitypub/#Overview][Overview section of the standard]].
In short, just as anyone can host their own email server, anyone can
host their own ActivityPub server, and yet different users on different
servers can interact.
At the time of writing, ActivityPub is seeing major uptake, with
several thousand nodes and several million registered users (with the
caveat that registered users are not the same as active users).
ActivityPub defines both a client-to-server and a server-to-server
protocol, but at this time the server-to-server protocol is the more
widely used of the two and is the primary concern of this article.
# - In general, most of the design of ActivityPub is fairly clean, with
# a few exceptions
# - sharedInbox is a break from the actor model protocol and was a late
# addition
ActivityPub's core design is fairly clean, following the
[[https://en.wikipedia.org/wiki/Actor_model#Fundamental_concepts][actor model]].
Different entities on the network can create other actors/objects
(such as someone writing a note) and communicate via message passing.
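As a rough illustration of the message passing described above, here is a minimal in-memory sketch.
The =Actor= class, the helper names, and the example URLs are hypothetical; only the =Create= and =Note= types and the =@context= URL come from ActivityStreams.
A real server would deliver with an HTTP POST to the recipient's =inbox= rather than a method call.

```python
from dataclasses import dataclass, field


@dataclass
class Actor:
    """A toy actor: an id plus an inbox that receives messages."""
    id: str
    inbox: list = field(default_factory=list)

    def receive(self, activity: dict) -> None:
        # Stand-in for an HTTP POST to this actor's inbox endpoint.
        self.inbox.append(activity)


def create_note(author: Actor, content: str, recipients: list) -> dict:
    """Wrap a new Note object in a Create activity addressed to recipients."""
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Create",
        "actor": author.id,
        "to": [r.id for r in recipients],
        "object": {
            "type": "Note",
            "attributedTo": author.id,
            "content": content,
        },
    }


def deliver(activity: dict, recipients: list) -> None:
    """Message passing: hand the activity to each addressed actor."""
    for recipient in recipients:
        recipient.receive(activity)


# Hypothetical actors on two different servers.
alyssa = Actor("https://social.example/alyssa")
ben = Actor("https://chatty.example/ben")

activity = create_note(alyssa, "Say, did you finish reading that book?", [ben])
deliver(activity, [ben])
```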
A core set of behaviors is defined in the spec for common message
types, but the system is extensible so that implementations may define
new terms with minimal ambiguity.
If two instances both understand the same terms, they may be able to
operate using behaviors not defined in the original protocol.
This is called an "open world assumption" and is necessary for a
protocol as general as ActivityPub; it would be extremely egotistical
of the ActivityPub authors to assume that we could predict all future
needs of users.[fn:json-ld]
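One way to picture the open world assumption in code is a parser that handles the terms it knows and carries everything else along untouched.
This is only a sketch: the =Activity= class, the =KNOWN_TERMS= set, and the extension term URL are made up for illustration, and real implementations may instead run full json-ld processing.

```python
from dataclasses import dataclass, field

# Terms this hypothetical implementation understands.
KNOWN_TERMS = {"type", "actor", "content"}


@dataclass
class Activity:
    type: str
    actor: str
    content: str = ""
    # Bucket for terms we do not understand but must not throw away.
    extra: dict = field(default_factory=dict)


def parse_activity(raw: dict) -> Activity:
    """Split a raw message into known fields and preserved unknown terms."""
    known = {k: v for k, v in raw.items() if k in KNOWN_TERMS}
    extra = {k: v for k, v in raw.items() if k not in KNOWN_TERMS}
    return Activity(extra=extra, **known)


def serialize_activity(activity: Activity) -> dict:
    """Write the message back out, including terms we did not understand."""
    out = {
        "type": activity.type,
        "actor": activity.actor,
        "content": activity.content,
    }
    out.update(activity.extra)
    return out


# A message carrying an extension term defined by some other implementation.
MOOD = "https://extensions.example/ns#mood"
raw = {
    "type": "Note",
    "actor": "https://social.example/alyssa",
    "content": "hello",
    MOOD: "cheerful",
}

parsed = parse_activity(raw)
round_tripped = serialize_activity(parsed)
```

An instance that does understand the extension term can read it from =parsed.extra=, while one that does not still passes it along intact.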
# - (json-ld conversations outside of the scope of this particular post)
Unfortunately (mostly due to time constraints and lack of consensus),
even though most of what is defined in ActivityPub is fairly
clean/simple, ActivityPub needed to be released with "holes in the
spec".
Certain key aspects critical to a functioning ActivityPub server are
not specified:
# - authentication is not specified. The community has settled
# on using http signatures for signing requests, though there is no
# "clean agreement" on how to attach signatures *to* posts yet.
# - authorization is not specified
# - (json-ld conversations outside of the scope of this particular post)
- Authentication is not specified. Authentication is important to
verify "did this entity really say this thing".[fn:did-you-say-it]
However, the community has mostly converged on using [[https://tools.ietf.org/html/draft-cavage-http-signatures-11][HTTP Signatures]]
to sign requests when delivering posts to other users.
The advantage of HTTP Signatures is that they are extremely simple
to implement and require no normalization of message structure;
simply sign the body (and some headers) as-you-are-sending-it.
The disadvantage of HTTP Signatures is that this signature does
not "stick" to the original post and so cannot be "carried around"
the network.
A minority of implementations have implemented early versions
of [[https://w3c-dvcg.github.io/ld-proofs/][Linked Data Proofs]] (formerly known as "Linked Data Signatures").
However, this requires a normalization algorithm for which not
every language has a library, so Linked Data Proofs
have not yet caught on as widely as HTTP Signatures.
- Authorization is also not specified. (Authentication and
authorization are frequently confused, especially because the
English words are so similar, but they mean two very different
things: the former is checking who said/did a thing, the latter is
checking whether they are allowed to do a thing.) As of right now,
authorization tends to be extremely ad hoc in ActivityPub systems,
sometimes resting on unspecified heuristics such as tracking who
received messages previously, who sent a message the first time,
and so on. Sadly, the primary workaround is that
interactions which require richer authorization simply have not
been rolled out onto the ActivityPub network.
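To make the HTTP Signatures point above concrete, the following sketch builds the signing string in the style of the draft and signs it.
Two loud assumptions: it uses =hmac-sha256= with a shared secret so that it runs on the standard library alone, whereas deployed ActivityPub servers generally sign with RSA key pairs, and the function names, key id, and URLs are invented for the example.

```python
import base64
import hashlib
import hmac

SIGNED_HEADERS = ["(request-target)", "host", "date", "digest"]


def digest_header(body: bytes) -> str:
    """Digest header value covering the exact bytes being sent."""
    return "SHA-256=" + base64.b64encode(hashlib.sha256(body).digest()).decode("ascii")


def signing_string(method: str, path: str, headers: dict) -> str:
    """One 'name: value' line per signed header, joined by newlines."""
    lines = []
    for name in SIGNED_HEADERS:
        if name == "(request-target)":
            lines.append(f"(request-target): {method.lower()} {path}")
        else:
            lines.append(f"{name}: {headers[name]}")
    return "\n".join(lines)


def sign_request(key_id, secret, method, path, headers, body):
    """Return a copy of headers with digest and signature added."""
    headers = dict(headers, digest=digest_header(body))
    to_sign = signing_string(method, path, headers).encode("utf-8")
    # Assumption: HMAC stands in for the RSA signature used in practice.
    mac = hmac.new(secret, to_sign, hashlib.sha256).digest()
    signature = base64.b64encode(mac).decode("ascii")
    names = " ".join(SIGNED_HEADERS)
    headers["signature"] = (
        f'keyId="{key_id}",algorithm="hmac-sha256",'
        f'headers="{names}",signature="{signature}"'
    )
    return headers


def verify_request(key_id, secret, method, path, headers, body) -> bool:
    """Recompute the signature for this exact request and compare."""
    unsigned = {k: v for k, v in headers.items() if k not in ("digest", "signature")}
    expected = sign_request(key_id, secret, method, path, unsigned, body)
    return hmac.compare_digest(expected["signature"], headers.get("signature", ""))


KEY_ID = "https://social.example/alyssa#main-key"  # hypothetical key id
SECRET = b"demo-secret"
body = b'{"type": "Create", "object": {"type": "Note", "content": "hi"}}'
headers = {"host": "chatty.example", "date": "Thu, 18 Jul 2019 17:12:57 GMT"}

signed = sign_request(KEY_ID, SECRET, "POST", "/ben/inbox", headers, body)
ok_same_request = verify_request(KEY_ID, SECRET, "POST", "/ben/inbox", signed, body)
ok_other_inbox = verify_request(KEY_ID, SECRET, "POST", "/carol/inbox", signed, body)
```

The same body checked against a different inbox path fails verification, which is the "does not stick" limitation: the signature covers one request, not the post itself.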
Compounding this situation is the general belief that
authorization must stem from authentication.
This document aims to show that not only is this not true, it is also
a dangerous assumption with unintended consequences.
An alternative approach based on "object capabilities" is
demonstrated, showing that the actor model, taken in
its purest form, is already a sufficient authorization system.
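As a small illustration of the object capability idea, the sketch below grants authority by handing out a reference and never checks who the caller is.
The =Inbox= class and =make_revocable= helper are hypothetical names, and Python is used only for readability: it does not enforce the encapsulation a real object capability system relies on.

```python
class Inbox:
    """A toy inbox; holding a way to call post() is the authority to post."""

    def __init__(self):
        self.messages = []

    def post(self, message):
        self.messages.append(message)


def make_revocable(target):
    """Return (forwarder, revoke). The forwarder is the capability handed out."""
    state = {"target": target}

    def forwarder(message):
        # No identity check: possession of this function is the permission.
        if state["target"] is None:
            raise PermissionError("capability revoked")
        state["target"].post(message)

    def revoke():
        state["target"] = None

    return forwarder, revoke


ben_inbox = Inbox()
send_to_ben, revoke = make_revocable(ben_inbox)

send_to_ben("hello")  # allowed: the sender holds the capability
revoke()              # the grantor withdraws it

try:
    send_to_ben("hello again")
    delivered_after_revoke = True
except PermissionError:
    delivered_after_revoke = False
```

Nothing here asks "who are you?"; whether a message gets through depends only on whether the sender was given a live reference.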
# - sharedInbox is a break from the actor model protocol and was a late
# addition
Unfortunately there is a complication.
At the last minute of ActivityPub's standardization, =sharedInbox= was
added as a form of mutated behavior from the previously described
=publicInbox= (which was a place for servers to share public content).
The motivation of =sharedInbox= is admirable: while ActivityPub is based
on explicit message sending to actors' =inbox= endpoints, if an actor
on server A needs to send a message to 1000 followers on server B,
why should server A make 1000 separate requests when it could do it
in one?
A good point, but the primary mistake is in how this one request is made:
rather than sending one message with a listing of all 1000 recipients
on that server (which would preserve the actor model integrity),
it was argued that since servers are already tracking follower information,
the receiving server can decide whom to deliver the message to.
Unfortunately this decision breaks the actor model and also our suggested
solution to authorization; see [[https://github.com/WebOfTrustInfo/rwot9-prague/blob/master/topics-and-advance-readings/ap-unwanted-messages.md#org7937fed][MultiBox]] for a suggestion on how we
can solve this.
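The alternative described above, one request per server that still names every recipient, can be sketched as follows.
This is not the MultiBox proposal itself, only an illustration of explicit recipient listing; the function names, the shape of the delivery record, and the example URLs are assumptions.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_by_server(recipient_ids):
    """Group recipient actor ids by the host that serves them."""
    by_server = defaultdict(list)
    for recipient_id in recipient_ids:
        by_server[urlsplit(recipient_id).netloc].append(recipient_id)
    return dict(by_server)


def plan_deliveries(activity, recipient_ids):
    """One delivery per server, each carrying its explicit recipient list."""
    return [
        {"server": server, "recipients": recipients, "activity": activity}
        for server, recipients in group_by_server(recipient_ids).items()
    ]


# 1000 followers on server B plus one follower elsewhere (hypothetical URLs).
followers = [f"https://b.example/users/{n}" for n in range(1000)]
followers.append("https://c.example/users/carol")

note = {"type": "Create", "object": {"type": "Note", "content": "hi"}}
deliveries = plan_deliveries(note, followers)
```

Server A makes two requests instead of 1001, and the sender, not the receiving server, still decides exactly who the message is for.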
# - What to do about the holes in the spec? Many community members have
# asked that we codify current behavior. However, as this document lays
@ -34,14 +124,44 @@
# people who are actively concerned with human rights and the
# well-being of marginalized groups.
** The mess we're in
Despite these issues, ActivityPub has achieved major adoption.
ActivityPub has the good fortune that its earliest adopters tended to
be people who cared about human rights and the needs of marginalized
groups, and spam has been relatively minimal.
[fn:json-ld] The technology that ActivityPub uses to accomplish this is
called [[https://json-ld.org/][json-ld]] and admittedly has been one of the most controversial
decisions in the ActivityPub specification.
Most of the objections have surrounded the unavailability of json-ld
libraries in some languages or the difficulty of mapping an open-world
assumption onto strongly typed systems without an "other data" bucket.
Since a project like ActivityPub must allow for the possibility of
extensions, we cannot escape open-world assumptions.
There may be things that can be done to improve happiness
about which extension mechanism is used, but these discussions are out of
scope for this particular document.
[fn:did-you-say-it] Or more accurately, since users may appoint
someone else to manage posting for them, "was this post really made
by someone who is authorized to speak on behalf of this entity".
** Unwanted messages, from spam to harassment
# - "there are no nazis on the fediverse"
# - social networks: breadth vs depth?
# - wholesale borrowing of surveillance capitalist assumptions
** Freedom of speech also means freedom to filter
** Don't pretend we can prevent what we can't
** Freedom of speech also means freedom to filter
# - introduce ocap community phrase
# - introduce revised version
# - "the fediverse is not indexed"
#
** Anti-solutions