Mathieu Pasquet: XMPP and metadata
news.movim.eu / PlanetJabber • Yesterday - 17:49 • 14 minutes
I had the pleasure of giving a talk on "XMPP and Metadata" during the last Chaos Communication Congress, in the Critical Decentralization Cluster area. It was my first public presentation in a very long while (and in English, too), so the talk went okay-ish at best. The end of the year was also hectic and I did not manage to prepare or rehearse as much as I would have liked to.
This blog post is a longer, more complete version of the talk. You can nonetheless find the talk slides on the CDC pretalx. Thanks a lot to the people who proofread the blog post to fix stuff or suggest additional content.
The talk was about metadata, but also more generally about data retention and what the server sees.
Metadata
First I want to define the basics: metadata is data about the data. In a messaging system, data is usually considered to be the message contents (what is being exchanged), and metadata is about the envelope:
- who (sender and receiver)
- when (timestamp)
- hints about the what (e.g. payload size)
Outside of messages, you can additionally have data/metadata about the users themselves, like online status or location, which can be either sent in the clear to the server, or deduced from user activity.
Obvious message workflow
This might be too obvious for most people, but for clarity’s sake, I want to assert that to send a message to another entity, you need:
- a sender
- a message
- a receiver
This is not technical, this is baked into the concept of sending a message. Those elements will always be present somewhere in the workflow. Assuming a working encryption system, the message data itself will not be considered.
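As a rough illustration of that split (a sketch, not something from the talk), here is what an intermediary can still record when the message body is encrypted. The `Envelope` class, the `observe` helper and the addresses are made up for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Envelope:
    """What the infrastructure can observe even when the body is encrypted."""
    sender: str
    receiver: str
    timestamp: datetime
    payload_size: int


def observe(sender: str, receiver: str, ciphertext: bytes) -> Envelope:
    # The ciphertext itself is opaque; only its length leaks.
    return Envelope(sender, receiver, datetime.now(timezone.utc), len(ciphertext))


# Hypothetical addresses and a dummy 128-byte ciphertext.
env = observe("alice@example.org", "bob@example.net", b"\x00" * 128)
print(env.sender, env.receiver, env.payload_size)
```

The who, the when and a hint about the what are all there, without ever touching the plaintext.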
There are, however, some technical tricks that can hide a lot of things from the infrastructure layer.
XMPP
I cannot really make this an introduction to XMPP, but to summarize, XMPP is an extensible federated protocol for messaging and presence. It uses XML for the most part but nobody should care (except trolls, I guess). It started in 1999 as Jabber, and grew to be an IETF standard under the name XMPP after Jabber got bought by Cisco (we can still use the Jabber name in many ways).
The protocol started server-heavy with light clients (and in fact, you will read as much in the "XMPP: The Definitive Guide" book), but the trend has reversed over the last decade, due to the rise of mobile clients, which can be updated very often, and other circumstances.
There are clients and servers, and it is therefore a protocol made of client-to-server interactions and server-to-server interactions, each with their own privacy implications.
The key elements to remember in order to assess threats in the XMPP network fabric would be:
- Your server is the only entity sending data to other servers. Every single bit of XML your XMPP client sends goes through your server.
- Other servers will only see your interactions with their own users.
Those two points are true in most non-P2P models. Centralized models can be thought of as a specialization of this model, with a single server.
That is why it is essential to rely on a server you can trust, either operated by people you trust, or at least by people who have some accountability in place, for example the services listed on providers.xmpp.net.
Threats
I can roughly point out four types of "passive" threats on metadata for XMPP:
- A server compromise (present)
  - Correlation of data streams in real time
- A server compromise (future)
  - Exploitation of the static data available on the server
- An attacker present on the server network
  - Can see what the server does (both with clients and servers)
- An attacker on the client (your) network
  - Can see what your client does
Network metadata threats
Client network attacker
Only a bit of metadata is visible on the client network:
- Timing and size of every network call
- HTTP(s) interactions that cross the XMPP boundary, like file upload
- Jingle interactions
- Other services (STUN, TURN services, etc)
On XMPP, one client, with only one account, will only ever connect to the XMPP server the account is living on. Some services might go through HTTP, which then crosses the boundary and creates network calls outside of the XMPP TLS stream, like HTTP Upload (XEP-0363). Likewise, applications like audio and video calls or direct file transfers will also cross that boundary.
STUN and TURN servers are used to bypass NAT restrictions, which are everywhere nowadays. STUN can be described as a way for clients to find out their reachable address from behind the NAT and then use it to establish a peer-to-peer connection; TURN is used in more restricted networks, where a relay is necessary to be able to communicate. TURN adds yet another server overseeing the (encrypted) communication, and in most modern XMPP setups the server operator also operates the TURN server; a network attacker will also see the encrypted stream from the client to the TURN server. In the STUN case, clients make a short network call to find out their external address, then establish the direct session to the other user, making that metadata available to an attacker as well.
I do not know of any work related to fingerprinting encrypted client-to-server XMPP traffic, but I would guess that it is fairly easy if no precautions are taken.
Some examples:
- Ping (XEP-0199) as connectivity check: periodic request/answer to your server, with a fixed size
- Room join (XEP-0045): send disco, receive disco info, send join request, receive plenty of presences in a short period of time
- Sending messages: composing chat state (somewhat fixed size), sometimes followed by other chat states (e.g. paused), followed by a longer payload, the message, whose size can often be predicted from the time elapsed since the chat state.
- etc…
This metadata will depend on the XMPP client used and how much it is customized, because while protocol workflows are mostly the same for all clients, some might do different or extra steps in common cases.
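To give an idea of how little is needed to spot something like the ping pattern, here is a minimal sketch that looks for packet sizes recurring at a steady interval in a list of `(timestamp, size)` observations. The function name, the thresholds and the capture are invented for the example; real traffic would need far more care (TLS record sizes, batching, retransmissions).

```python
from collections import defaultdict
from statistics import mean, pstdev


def periodic_sizes(packets, min_count=4, max_jitter=0.1):
    """Return {size: mean_interval} for packet sizes that recur at a steady pace.

    packets: iterable of (timestamp_seconds, size_bytes) seen on the wire.
    max_jitter: allowed standard deviation as a fraction of the mean interval.
    """
    by_size = defaultdict(list)
    for ts, size in packets:
        by_size[size].append(ts)

    result = {}
    for size, stamps in by_size.items():
        if len(stamps) < min_count:
            continue
        stamps.sort()
        gaps = [b - a for a, b in zip(stamps, stamps[1:])]
        avg = mean(gaps)
        if avg > 0 and pstdev(gaps) <= max_jitter * avg:
            result[size] = avg
    return result


# Hypothetical capture: a 97-byte record roughly every 60 s, plus irregular traffic.
capture = [(0.0, 97), (60.1, 97), (119.9, 97), (180.0, 97), (240.2, 97),
           (12.3, 410), (13.0, 1280), (95.4, 410), (200.7, 388)]
found = periodic_sizes(capture)
print(found)
```

Only the 97-byte size is reported here, with an interval of about a minute, which is exactly the kind of regularity a keepalive produces.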
Server network attacker
If an attacker can eavesdrop on the server network, it gets access to a lot more info than on the client network:
- It has a view on both open client and server connections
- It can then correlate quite easily the destination of stanzas sent by the client (since it will see something go from the client to the server, then immediately something from the server to another server)
- If the server has lots of active users, correlation gets harder, but the metadata of single-user servers is a very easy target!
If there is plenty of activity on the server, it will be much harder to find out what the source of the activity is:
All stanzas are going through a single stream for each destination server which makes it difficult for an observer to find out which payload is going where provided there is enough activity, and luckily XMPP is a somewhat chatty protocol.
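A toy version of that correlation, under the assumption that the observer only has timestamps and the remote server of each outbound packet (the function, the 50 ms window and the server names are all made up):

```python
from bisect import bisect_left, bisect_right


def candidate_destinations(client_send_time, outbound, window=0.05):
    """List the remote servers that received traffic shortly after a client send.

    outbound: list of (timestamp, remote_server) sorted by timestamp.
    window: how long after the client packet we look, in seconds.
    """
    times = [ts for ts, _ in outbound]
    lo = bisect_left(times, client_send_time)
    hi = bisect_right(times, client_send_time + window)
    return sorted({server for _, server in outbound[lo:hi]})


# Single-user server: one outbound packet follows the client packet.
quiet = [(10.01, "friend.example")]
# Busy server: several server-to-server streams are active in the same window.
busy = [(10.00, "a.example"), (10.01, "friend.example"),
        (10.02, "b.example"), (10.03, "c.example"), (10.20, "d.example")]

print(candidate_destinations(10.0, quiet))
print(candidate_destinations(10.0, busy))
```

On the quiet server the destination is unambiguous; on the busy one the observer is left with four candidates for a single client packet.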
Server compromise
On-disk data and metadata
In an event where a server gets compromised later (or even a backup gets compromised), it is important to know what kind of data and metadata will be visible to the attacker.
There is a lot of data available on servers with normal XMPP usage, and a few categories stand out.
User accounts
Accounts are tied to a specific server; they can in some ways be migrated to another server but the process is currently not fun.
For a normal user, one of the important troves of metadata is the roster (contact list): it contains all the people you have added to the list, which allows you to put them in groups, as well as letting the server broadcast your online presence when you get online.
Another important piece of information is your bookmarks list: this allows you to sync the chatrooms you are in between all your clients, but incidentally that information stays in cleartext on the server as well.
The most identifying piece of information about the user on the server can be the vCard, as it is literally a collection of personal data (including the user’s avatar).
And the server will usually store some useful information it can know without asking the user: date of last connection, last seen IP address, etc… Some of this data can be very useful to assist manual account recovery!
Groupchats
Groupchats in XMPP are hosted on a specific server, which does not have to be (and often isn’t) the user’s server. The room identifiers are in the form room@a-groupchat-server.net, where a-groupchat-server.net gives the information of where the room is hosted.
The server hosting a persistent room needs to keep, on-disk:
- A message archive (not mandatory, but very nice to have)
- A members/admins/owners/ban list; only the owner list is mandatory, as long as everyone else is fine with being a "normal" participant
- Data related to the room, like configuration, access control (which can be tied to membership), topic, etc
There are two types of rooms:
- Semi-anonymous rooms, where only room admins (and their servers, as well as the room server) can see the real addresses of the users exchanging messages, creating a pseudonymous situation.
- Non-anonymous rooms, where everyone sees everyone’s addresses; here all participants and their servers can know who is in the room and who authors messages.
In terms of live data, due to how XMPP is designed, the servers of all the active participants will be able to see who is sending messages.
But in semi-anonymous rooms, they only get a nickname, and not the real address.
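The difference between the two room types can be sketched as a small function returning what one participant (and therefore their server) learns about the sender of a groupchat message. The function and the addresses are invented for the illustration, and the rule is simplified to "admins see real addresses", as described above.

```python
def visible_sender(real_jid: str, room: str, nick: str,
                   semi_anonymous: bool, viewer_is_admin: bool) -> dict:
    """Return the addressing information exposed to one room participant."""
    seen = {"from": f"{room}/{nick}"}  # room address plus nickname, always visible
    if not semi_anonymous or viewer_is_admin:
        seen["real_jid"] = real_jid  # the real address leaks as well
    return seen


room = "room@a-groupchat-server.net"
as_member = visible_sender("alice@example.org", room, "al", True, False)
as_admin = visible_sender("alice@example.org", room, "al", True, True)
in_open_room = visible_sender("alice@example.org", room, "al", False, False)
print(as_member)
print(as_admin)
print(in_open_room)
```

In every case the room server itself sees the real address, since it is the one doing the mapping.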
Live stream interception
As stated earlier, choosing a server you trust is the very first step, and if you do not trust your server (and operators) at all, why are you there?
If the server itself is compromised, the point-to-point TLS encryption between clients and servers, and between servers, becomes useless, which means the "server network attacker" scenarios can now be matched with 100% precision, and more:
- Correlation between sender and receiver is exact
- The user’s XMPP address can be mapped to an IP and port
- Stanza type is exposed: whether message, presence, or iq stanzas are sent, everything is known
- Activity patterns can get a lot more detailed
- E2EE still works to protect data, but can get disrupted
Some solutions for XMPP
Remediation of network metadata leaks
I do not see magic bullets here, and everything has a cost. The idea would be to add noise that is not easy to filter out when parsing network activity at scale.
- Inserting random padding in server-to-server and client-to-server communications would help, of course, but network data is not always cheap, and processing XML also has a cost.
- Another solution would be "garbage" stanzas sent in various manners to other servers, but again there is a cost here, and since very little literature exists on such attacks, it remains to be seen whether there is an upside to this.
- Another one would be to allow stanzas to be grouped on the wire when possible (e.g. presence updates or groupchat messages)
- The last one I can see would be to induce random delays in stanza delivery when processing them on a server. This of course has a usability cost, but for large servers, adding random delays < 0.3s would probably be enough to make correlation quite hard. Ordering must of course be preserved.
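The random delay idea with preserved ordering can be sketched as follows. This is only an illustration of the scheduling rule for a single stream, not a proposal for how a server should implement it; the function name and the seeded generator are mine.

```python
import random


def schedule_releases(arrivals, max_delay=0.3, rng=None):
    """Assign a release time to each stanza of one stream.

    arrivals: arrival timestamps in seconds, in stream order.
    Each stanza is delayed by a random amount up to max_delay, but never
    released before the stanza that precedes it, so ordering is preserved.
    """
    rng = rng or random.Random()
    releases = []
    previous = float("-inf")
    for arrived in arrivals:
        release = max(arrived + rng.uniform(0.0, max_delay), previous)
        releases.append(release)
        previous = release
    return releases


# Hypothetical burst of three stanzas followed by a later one.
arrivals = [0.0, 0.01, 0.02, 0.5]
releases = schedule_releases(arrivals, rng=random.Random(1))
for arrived, released in zip(arrivals, releases):
    print(f"arrived {arrived:.2f}s, released {released:.3f}s")
```

Clamping each release time to the previous one keeps the stream in order, and as long as arrivals are in order no stanza waits longer than `max_delay`. The price is that stanzas arriving in a burst tend to leave close together, which is a pattern in itself.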
Remediation of roster issues
Presence-based messaging has been going downhill for a while thanks to a smartphone-centric population and apps, which means the roster is not as useful as it could once be.
It still provides useful information in my opinion, and is also very useful for abuse prevention, since you do not (hopefully) receive abuse from your contacts.
The roster could be stored on devices instead of servers, and we could find a way to share it encrypted and cross-signed with the user’s other devices. This does not change the fact that your server will see who you are chatting with or sending presence to.
The bookmarks could likely be shared in the same way, and removed from the server as well.
Fixing groupchat metadata
As mentioned earlier, if we phase out presences in many cases, being passively inside a room would at least prevent being exposed to servers looking at the traffic, as long as nobody requests the participants list.
I do not see a good way out of this for metadata.
Fixing many types of metadata issues
The most deployed end-to-end encryption solution on the network is OMEMO, but not in its latest version, which allows for Stanza Content Encryption. The version most people use only encrypts the message’s body tag, which is the text sent by the user. By extension, uploads are also protected there, because they go through OMEMO Media Sharing: the file is simply encrypted using AES-GCM, and the key and IV are shared inside the URL sent through the OMEMO session.
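To make the "key and IV inside the URL" part concrete, here is a sketch that splits such a link into its pieces. The layout is an assumption on my side, based on my reading of XEP-0454: an `aesgcm://` scheme, with a hex-encoded fragment made of the IV (12 bytes, 16 in some older clients) followed by a 32-byte key. The host and path are made up, and no decryption is attempted.

```python
from urllib.parse import urlsplit


def parse_aesgcm_url(url: str):
    """Split an aesgcm:// link into (https_url, iv, key).

    Assumes the fragment is hex(IV) followed by hex(32-byte key), with a
    12-byte IV (or 16 bytes for some older clients).
    """
    parts = urlsplit(url)
    if parts.scheme != "aesgcm":
        raise ValueError("not an aesgcm:// URL")
    secret = bytes.fromhex(parts.fragment)
    if len(secret) not in (44, 48):
        raise ValueError("unexpected fragment length")
    iv, key = secret[:-32], secret[-32:]
    # The fragment is never sent to the HTTP server, only the part before it.
    https_url = f"https://{parts.netloc}{parts.path}"
    return https_url, iv, key


# Hypothetical link with a dummy IV and key.
link = "aesgcm://upload.example.org/abc/photo.jpg#" + "00" * 12 + "11" * 32
https_url, iv, key = parse_aesgcm_url(link)
print(https_url, len(iv), len(key))
```

The upload server only ever sees the ciphertext and the HTTPS request (so timing, size and IP address), while the secret part travels inside the encrypted message body.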
This means people wanting to use nice features like composing chat states and the like have the choice to either generate this metadata in the clear, or disable the feature, which is not ideal.
Other interesting solutions
Serverless messaging
XEP-0174 describes how to operate XMPP inside a trusted network, which bypasses the need for a server by using mDNS on this network and advertising your address with your local IP.
I liked it a lot and had fun at the time, but it was from a blissful era where encryption was not seen as the foundation upon which everything should be built. It means that everything transits unencrypted on the network, both metadata and data are unprotected. Data could be encrypted using OTR, but some required bricks for the modern XMPP experience like the Personal Eventing Protocol are unavailable, making OMEMO in its current form a no-go.
Implementations are scarce nowadays, as an incomplete XMPP layer inside a normal client is usually pretty convoluted to maintain and after a while it was removed from most clients where it existed.
XTLS
XTLS describes a way to create a direct and encrypted TLS stream between two entities using Jingle, which could then be used to exchange stanzas in a secure manner, without going through the server.
That means that the metadata threats shift from server-based to client-based, which may or may not be an upside. The layer used for the channel is also quite important: it could be In-Band Bytestreams, which means the data goes through the usual client-server-server-client route. That would provide additional E2EE cloaking of all data exchanged between clients, while still going through all the entities to route data.
Other services/protocols
Signal
Signal is a pioneer in encryption systems at scale, and keeps pushing the boundaries of what is possible to do securely. Nonetheless, their messenger is centralized, with systems running on AWS and Azure (as far as I can tell), which makes them very dependent on the US political wasteland as well as the tech landlords’ whims.
They do a lot of things to ensure things are as secure as they can be within the constraints they imposed on themselves, and as such, while I trust them for now, their servers certainly have a lot of opportunities to collect and store a ton of metadata, simply due to the fact that this is a centralized system. While their cryptography work is class-leading, which makes my data secure (as long as someone does not bust the secure enclave which protects my "recovery code", I guess), keeping my metadata volatile and secure there is only a leap of faith on my part, as I can have no guarantees.
Matrix
Matrix is a federated protocol which has many of the same flaws as XMPP with regards to metadata.
One notable difference is that Matrix is more like a distributed database with built-in conflict resolution, which leads to every participant’s server replicating the state (data and metadata) of the rooms they are in. This creates a more difficult situation for metadata than XMPP, because while XMPP servers can see what goes through them, in Matrix the servers are required to store this information.
Matrix also has two different sets of APIs for client-to-server and server-to-server communication, which should allow it to batch messages when appropriate.
SimpleX
SimpleX is a protocol with a lot of cryptography baked in, and has interesting properties such as the absence of user accounts and therefore identifiers (which means very little data on the servers can be compromised).
One of its more interesting properties is that it has 2-stage onion routing baked in, which allows it to sidestep many issues around metadata due to connecting to servers.
The whitepaper stresses that it is still important to choose your servers well, but that is still less critical than in XMPP since you can easily switch at very little cost.
(P.S.: calling your protocol "SMP" is not nice if it is not based on the Socialist Millionaire Protocol. I haven’t checked thoroughly, but skimming did not reveal any mention of it.)
Conclusion
As Daniel noted on Mastodon, there are some low-hanging fruits for improving the metadata issues around XMPP (and some higher-hanging fruits as well). I agree that this is not going to be even a blip on XMPP adoption, but we should do what we can nonetheless to improve the situation, in order to improve the standard and the ecosystem. That said, I can perfectly understand that since a lot of the work is purely volunteer-driven and our time and energy are limited, it can appear to be a waste of time to dedicate them to removing bits of metadata here and there.