move matrix-generic documentation from synapse/docs into new matrix-doc project
This commit is contained in:
parent
06723d7465
commit
556e3f8a71
8 changed files with 3721 additions and 0 deletions
53
drafts/definitions.rst
Normal file
53
drafts/definitions.rst
Normal file
|
@ -0,0 +1,53 @@
|
|||
Definitions
|
||||
===========
|
||||
|
||||
# *Event* -- A JSON object that represents a piece of information to be
|
||||
distributed to the the room. The object includes a payload and metadata,
|
||||
including a `type` used to indicate what the payload is for and how to process
|
||||
them. It also includes one or more references to previous events.
|
||||
|
||||
# *Event graph* -- Events and their references to previous events form a
|
||||
directed acyclic graph. All events must be a descendant of the first event in a
|
||||
room, except for a few special circumstances.
|
||||
|
||||
# *State event* -- A state event is an event that has a non-null string valued
|
||||
`state_key` field. It may also include a `prev_state` key referencing exactly
|
||||
one state event with the same type and state key, in the same event graph.
|
||||
|
||||
# *State tree* -- A state tree is a tree formed by a collection of state events
|
||||
that have the same type and state key (all in the same event graph.
|
||||
|
||||
# *State resolution algorithm* -- An algorithm that takes a state tree as input
|
||||
and selects a single leaf node.
|
||||
|
||||
# *Current state event* -- The leaf node of a given state tree that has been
|
||||
selected by the state resolution algorithm.
|
||||
|
||||
# *Room state* / *state dictionary* / *current state* -- A mapping of the pair
|
||||
(event type, state key) to the current state event for that pair.
|
||||
|
||||
# *Room* -- An event graph and its associated state dictionary. An event is in
|
||||
the room if it is part of the event graph.
|
||||
|
||||
# *Topological ordering* -- The partial ordering that can be extracted from the
|
||||
event graph due to it being a DAG.
|
||||
|
||||
(The state definitions are purposely slightly ill-defined, since if we allow
|
||||
deleting events we might end up with multiple state trees for a given event
|
||||
type and state key pair.)
|
||||
|
||||
Federation specific
|
||||
-------------------
|
||||
# *(Persistent data unit) PDU* -- An encoding of an event for distribution of
|
||||
the server to server protocol.
|
||||
|
||||
# *(Ephemeral data unit) EDU* -- A piece of information that is sent between
|
||||
servers and doesn't encode an event.
|
||||
|
||||
Client specific
|
||||
---------------
|
||||
# *Child events* -- Events that reference a single event in the same room
|
||||
independently of the event graph.
|
||||
|
||||
# *Collapsed events* -- Events that have all child events that reference it
|
||||
included in the JSON object.
|
79
drafts/human-id-rules.rst
Normal file
79
drafts/human-id-rules.rst
Normal file
|
@ -0,0 +1,79 @@
|
|||
This document outlines the format for human-readable IDs within matrix.
|
||||
|
||||
Overview
|
||||
--------
|
||||
UTF-8 is quickly becoming the standard character encoding set on the web. As
|
||||
such, Matrix requires that all strings MUST be encoded as UTF-8. However,
|
||||
using Unicode as the character set for human-readable IDs is troublesome. There
|
||||
are many different characters which appear identical to each other, but would
|
||||
identify different users. In addition, there are non-printable characters which
|
||||
cannot be rendered by the end-user. This opens up a security vulnerability with
|
||||
phishing/spoofing of IDs, commonly known as a homograph attack.
|
||||
|
||||
Web browers encountered this problem when International Domain Names were
|
||||
introduced. A variety of checks were put in place in order to protect users. If
|
||||
an address failed the check, the raw punycode would be displayed to disambiguate
|
||||
the address. Similar checks are performed by home servers in Matrix. However,
|
||||
Matrix does not use punycode representations, and so does not show raw punycode
|
||||
on a failed check. Instead, home servers must outright reject these misleading
|
||||
IDs.
|
||||
|
||||
Types of human-readable IDs
|
||||
---------------------------
|
||||
There are two main human-readable IDs in question:
|
||||
|
||||
- Room aliases
|
||||
- User IDs
|
||||
|
||||
Room aliases look like ``#localpart:domain``. These aliases point to opaque
|
||||
non human-readable room IDs. These pointers can change, so there is already an
|
||||
issue present with the same ID pointing to a different destination at a later
|
||||
date.
|
||||
|
||||
User IDs look like ``@localpart:domain``. These represent actual end-users, and
|
||||
unlike room aliases, there is no layer of indirection. This presents a much
|
||||
greater concern with homograph attacks.
|
||||
|
||||
Checks
|
||||
------
|
||||
- Similar to web browsers.
|
||||
- blacklisted chars (e.g. non-printable characters)
|
||||
- mix of language sets from 'preferred' language not allowed.
|
||||
- Language sets from CLDR dataset.
|
||||
- Treated in segments (localpart, domain)
|
||||
- Additional restrictions for ease of processing IDs.
|
||||
- Room alias localparts MUST NOT have ``#`` or ``:``.
|
||||
- User ID localparts MUST NOT have ``@`` or ``:``.
|
||||
|
||||
Rejecting
|
||||
---------
|
||||
- Home servers MUST reject room aliases which do not pass the check, both on
|
||||
GETs and PUTs.
|
||||
- Home servers MUST reject user ID localparts which do not pass the check, both
|
||||
on creation and on events.
|
||||
- Any home server whose domain does not pass this check, MUST use their punycode
|
||||
domain name instead of the IDN, to prevent other home servers rejecting you.
|
||||
- Error code is ``M_FAILED_HUMAN_ID_CHECK``. (generic enough for both failing
|
||||
due to homograph attacks, and failing due to including ``:`` s, etc)
|
||||
- Error message MAY go into further information about which characters were
|
||||
rejected and why.
|
||||
- Error message SHOULD contain a ``failed_keys`` key which contains an array
|
||||
of strings which represent the keys which failed the check e.g::
|
||||
|
||||
failed_keys: [ user_id, room_alias ]
|
||||
|
||||
Other considerations
|
||||
--------------------
|
||||
- Basic security: Informational key on the event attached by HS to say "unsafe
|
||||
ID". Problem: clients can just ignore it, and since it will appear only very
|
||||
rarely, easy to forget when implementing clients.
|
||||
- Moderate security: Requires client handshake. Forces clients to implement
|
||||
a check, else they cannot communicate with the misleading ID. However, this is
|
||||
extra overhead in both client implementations and round-trips.
|
||||
- High security: Outright rejection of the ID at the point of creation /
|
||||
receiving event. Point of creation rejection is preferable to avoid the ID
|
||||
entering the system in the first place. However, malicious HSes can just allow
|
||||
the ID. Hence, other home servers must reject them if they see them in events.
|
||||
Client never sees the problem ID, provided the HS is correctly implemented.
|
||||
- High security decided; client doesn't need to worry about it, no additional
|
||||
protocol complexity aside from rejection of an event.
|
51
drafts/state_resolution.rst
Normal file
51
drafts/state_resolution.rst
Normal file
|
@ -0,0 +1,51 @@
|
|||
State Resolution
|
||||
================
|
||||
This section describes why we need state resolution and how it works.
|
||||
|
||||
|
||||
Motivation
|
||||
-----------
|
||||
We want to be able to associate some shared state with rooms, e.g. a room name
|
||||
or members list. This is done by having a current state dictionary that maps
|
||||
from the pair event type and state key to an event.
|
||||
|
||||
However, since the servers involved in the room are distributed we need to be
|
||||
able to handle the case when two (or more) servers try and update the state at
|
||||
the same time. This is done via the state resolution algorithm.
|
||||
|
||||
|
||||
State Tree
|
||||
------------
|
||||
State events contain a reference to the state it is trying to replace. These
|
||||
relations form a tree where the current state is one of the leaf nodes.
|
||||
|
||||
Note that state events are events, and so are part of the PDU graph. Thus we
|
||||
can be sure that (modulo the internet being particularly broken) we will see
|
||||
all state events eventually.
|
||||
|
||||
|
||||
Algorithm requirements
|
||||
----------------------
|
||||
We want the algorithm to have the following properties:
|
||||
- Since we aren't guaranteed what order we receive state events in, except that
|
||||
we see parents before children, the state resolution algorithm must not depend
|
||||
on the order and must always come to the same result.
|
||||
- If we receive a state event whose parent is the current state, then the
|
||||
algorithm will select it.
|
||||
- The algorithm does not depend on internal state, ensuring all servers should
|
||||
come to the same decision.
|
||||
|
||||
These three properties mean it is enough to keep track of the current state and
|
||||
compare it with any new proposed state, rather than having to keep track of all
|
||||
the leafs of the tree and recomputing across the entire state tree.
|
||||
|
||||
|
||||
Current Implementation
|
||||
----------------------
|
||||
The current implementation works as follows: Upon receipt of a newly proposed
|
||||
state change we first find the common ancestor. Then we take the maximum
|
||||
across each branch of the users' power levels, if one is higher then it is
|
||||
selected as the current state. Otherwise, we check if one chain is longer than
|
||||
the other, if so we choose that one. If that also fails, then we concatenate
|
||||
all the pdu ids and take a SHA1 hash and compare them to select a common
|
||||
ancestor.
|
Loading…
Add table
Add a link
Reference in a new issue