Synapse quick sands

The result

I’ve put together a few scripts and some documentation on how to “quickly” prepare a Synapse server, with SSL management, a TURN server for quick and easy VoIP, a load balancer and a hosted Riot client.

https://github.com/Miouyouyou/matrix-coturn-docker-setup

The reasons

Oh, a new toy !

At the end of December 2019, I started to have more free time for myself and, after some GitLab debacles which went like “we’re changing our TOS, SIGN or GTFO !”, I started to think about hosting more and more services myself.

Aaand, I also wanted to play with ActivityPub and federated tools, and fiddle a bit with some chat servers, to see how the new “Chat servers” are doing now.

A few months earlier, I had already stumbled upon Synapse, a MATRIX protocol server, which ought to combine the joys of “Federated servers” and chat protocols, while providing all the bells and whistles of current chat rooms (picture, file and video sharing, video-conferencing, bridges towards other protocols, …).

Then I started to read the documentation and went like “Ugh… Typical hipster project.”

“A Docker ? Oh yeah, we have one, you just need to setup various environment variables, run it once, edit the configuration file then relaunch it again, but with different variables this time, …”

And then I went like “Good… I’ll just try to install this, test it a little, and then document the installation in a more ‘friendly’ manner in order to let other people try this project. The installation documentation is terrible but the Riot client for this server looks very nice and promising, so let’s give it a shot !”

This was a terrible idea.

WHAT THE !?

Can I add a user, please ?

First : The installation was a pain. The documentation is horribly done and I quickly discovered that the main reason why they ask you to generate a configuration file with specific tools beforehand is that the main configuration file contains some “keys” that have various password and cryptographic uses…

Because, YEAH, I love to put my SSH keys in sshd_config !

I learned a few weeks ago, while fiddling with Synapse code and trying to put up my guide, scripts and Docker build scripts, that Synapse’s configuration file can actually be split, so the keys could be in a separate file loaded with “docker secrets”. But, yeah…

I went on with their default Docker setup first, which invokes the default configuration file generation script, because I didn’t want to go with the whole Ansible Playbook (I haven’t had the time to play with Ansible… One thing at a time…).

I rebuilt their Docker image from their Dockerfile because I had the good idea to run this on a cheap ARM server from Scaleway, started to run it and… it started ?

Good !

I remember hitting a few issues with my HAProxy setup, and with the fact that you need to serve a .well-known/matrix/server file that Synapse’s own webserver will neither serve nor generate by itself ! Because why would you serve files that pertain to the protocol related to your program, hmm ? Just force the admin to serve this on another web server, it’s much more fun !
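
For reference, here is a minimal sketch of what that file contains, as far as I understand the Matrix server discovery mechanism : a tiny JSON document with a single “m.server” key pointing at the host and port that actually serve federation. The host name matrix.example.org and the helper name are made up for the example.

```python
import json


def well_known_server_body(delegate_host: str, port: int = 443) -> str:
    """Build the JSON body to serve at /.well-known/matrix/server.

    delegate_host and port are placeholders: they should point at whatever
    front-end (HAProxy in my setup) actually handles federation traffic.
    """
    return json.dumps({"m.server": f"{delegate_host}:{port}"})


if __name__ == "__main__":
    # Save this output as a static file and serve it from any web server.
    print(well_known_server_body("matrix.example.org"))
```

The output is static, so it can be generated once and dropped into the document root of whatever web server sits in front of Synapse.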

I then used the Riot client on https://riot.im, configured it to ping my server and, after fixing a few issues in my DNS and HAProxy configuration, it worked !
GOOD !
Well kinda… It accepted that I try to connect to my own server but I forgot the most essential part : I didn’t add a user with a password to connect to my chat server !
So, let’s look at the main README.md !

It’s not written…

Wow…

Why would you go so far as creating a chat server that can do so much and forget the essential part, “How to add a user” ?

Now, I had this reflex because Riot actually asks for a user/password combo.
Thinking about it, IRC servers don’t need such things and work quite well ! Well, until 10000 people start using your chat server for WAREZ and “file sharing”, then it’s less fun suddenly.

So, yeah, I need to know how to add a user to my chat server !

It turned out that the repository with the Ansible playbook for installing Synapse has more useful documentation than the actual Synapse repository ! So, after searching for 30 minutes, I learned that you have to invoke this command, on the Synapse host :

register_new_matrix_user -c /etc/synapse/homeserver.yaml -u chat_user -p chat_password -a http://localhost:8008

Now, /etc/synapse/homeserver.yaml is just “a YAML file” containing “a registration_shared_secret: directive” that is used by your Synapse server.

Running that command with the wrong URL will lead to various hangs and crashes, while failing to provide the configuration file will lead to a quick error message telling you that it cannot guess the registration_shared_secret, which seems to be used to encrypt the password (I guess ?).

Which leads to my main question :
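
After digging a bit more : as far as I understand Synapse’s shared-secret registration API, the secret is not used to encrypt the password. It is used as an HMAC key to sign the registration request, so that only someone who can read homeserver.yaml can create users this way. The sketch below reflects my reading of it (HMAC-SHA1 over the nonce, user name, password and admin flag, separated by NUL bytes). The function name is mine and the details should be checked against the Synapse admin API documentation.

```python
import hashlib
import hmac


def registration_mac(shared_secret: str, nonce: str, user: str,
                     password: str, admin: bool = False) -> str:
    """Sign a registration request with the registration_shared_secret.

    Assumption: the signed message is nonce, user, password and the
    admin marker, joined by NUL bytes, hashed with HMAC-SHA1.
    """
    mac = hmac.new(shared_secret.encode("utf-8"), digestmod=hashlib.sha1)
    mac.update(nonce.encode("utf-8"))
    mac.update(b"\x00")
    mac.update(user.encode("utf-8"))
    mac.update(b"\x00")
    mac.update(password.encode("utf-8"))
    mac.update(b"\x00")
    mac.update(b"admin" if admin else b"notadmin")
    return mac.hexdigest()
```

The nonce is handed out by the server for each registration attempt, which would explain why the command needs both the configuration file (for the secret) and a working server URL (for the nonce).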

Isn’t there an administration UI by now !? Do they do everything by hand !? Why develop a very extensive chat client while forgetting the basics : Managing your server !

Anyway, I connected to my chat server, created a room and…
“Well, it works, at last… I can chat with myself…”

Then I clicked on “Room discovery” and, due to the whole Federation thing, filled with dreams and rainbows, you can list the chat rooms of other serv… No, from “matrix.org”… And only matrix.org.
If you want to list chatrooms from other servers, you’ll have to know their addresses first…

Wow… So much for the “Federation”…

“It’s okay, let’s look at the chat rooms. WTF ? The first room has 40000 users ! Let’s jump in !”

That was a terrible idea.

Resources management ? What’s this ? Can you eat it ?

ONE SINGLE USER ! ON ONE SINGLE SERVER !
IS ALL IT TAKES TO TAKE DOWN A SYNAPSE MATRIX SERVER !

I, ALONE, ON MY OWN SERVER, was able to DOS it by joining ONE room !

That’s where my opinion for Synapse went from “The documentation is HORRENDOUS but at least the chat features are there” to “WTF !? Dumby The Clown connects to your server, joins a room and THE SERVER GOES OUT OF RESOURCES !? THIS IS INSANE !!!”

Some people will say “But, hey, it’s 40K users maaan !”, to which I’ll answer : “Let’s do the math, people”.

I understand that, now, most software is written like shit and if you don’t have the latest 4 GHz processor with 32 GB of RAM, you’ll feel that “it’s ok if it lags, or if you’re out of resources”, but you really have to understand the POWER you have with a single machine, and how little a chat consumes.

So, there are 40K users. Let’s just say that for each individual message you consume 4KB of RAM for the message and 4KB of RAM for the metadata of this message. I’m ignoring files and pictures for a reason, they can be ignored until you deal with the messages first.

So that’s 8KB of RAM for every single message. It’s a chat room, most messages will be less than 150 characters, meaning that even with UTF-8 encoding, it’s still “overkill”.

Now let’s say that EVERY user in the 40K-user chatroom sends ONE message per second, you’ll have to deal with :
8192 bytes/message * 40 000 messages = 327 680 000 bytes for the messages. Roughly 330 MB (or roughly 312.5 MiB).

Then, let’s say that you use double buffers for every single user, because why not :
You’ll use 330 MB per buffer * 2 buffers -> 660 MB in total.
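
The same back-of-the-envelope math as a small script, so you can plug in your own numbers. The function and parameter names are just mine, and the figures are the deliberately pessimistic assumptions from above (4KB of payload, 4KB of metadata, one message per user per second, two buffers).

```python
def estimate_chat_memory(users: int, payload_bytes: int = 4096,
                         metadata_bytes: int = 4096, buffers: int = 2) -> dict:
    """Worst-case RAM needed to hold one second of messages for a room."""
    bytes_per_message = payload_bytes + metadata_bytes
    bytes_per_second = bytes_per_message * users  # one message per user per second
    return {
        "bytes_per_message": bytes_per_message,
        "bytes_per_second": bytes_per_second,
        "mib_per_second": bytes_per_second / (1024 * 1024),
        "buffered_bytes": bytes_per_second * buffers,
    }


if __name__ == "__main__":
    for name, value in estimate_chat_memory(40_000).items():
        print(f"{name}: {value}")
```

With the exact figure the double-buffered total is 655 360 000 bytes, which I round up to 660 MB.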

Understand that you’ll have to output 330 MB PER SECOND to the chat users, which is no-no with my connection. I’m “at best” at 2 MB/sec, that’s not going to cut it.
Also 40K messages per second is unreadable and will mostly bring your browser to its knees.

BUT 660 MB is not THAT much on a system with 2 GB of RAM. You can still store them and, if messages are not sent quickly enough, you can either ditch the old ones by overwriting their buffers OR wait for the messages to be sent and ignore new messages for the moment.
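
To make those two strategies concrete, here is a minimal sketch of a bounded message buffer. This is NOT how Synapse works internally, only an illustration of the idea; the class and its names are hypothetical.

```python
from collections import deque


class BoundedRoomBuffer:
    """Fixed-capacity message buffer that never grows past max_messages."""

    def __init__(self, max_messages: int, drop_oldest: bool = True):
        self._messages = deque()
        self._max = max_messages
        self._drop_oldest = drop_oldest

    def push(self, message: str) -> bool:
        """Store a message; return False if it was refused because the buffer is full."""
        if len(self._messages) >= self._max:
            if not self._drop_oldest:
                return False  # strategy 2: ignore new messages for the moment
            self._messages.popleft()  # strategy 1: ditch the oldest message
        self._messages.append(message)
        return True

    def drain(self) -> list:
        """Return the pending messages, oldest first, and empty the buffer."""
        pending = list(self._messages)
        self._messages.clear()
        return pending
```

Either way, memory use is capped by max_messages whatever the room does, which is the whole point.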

So what’s the problem ? Synapse ate 2 GB of RAM in a few seconds and then got killed by the OOM killer ! All of this for ZERO message sent to my chat client !

See, if you eat more than 660 MB of RAM in this kind of setup, it’s already OVERKILL. Sure, you might deal with 400 MB for storing and sending multimedia content (videos are generally YouTube links with a thumbnail). Yet, even in that kind of unrealistic setup, 1GB of RAM is more than enough !
Just in case, I retried this with a PostgreSQL setup and 4GB of RAM : Almost the same result ! I was able to see the users list but then OOM and boom !

So, yeah, after seeing this, I left it alone for a few days, then went like “Let’s just write some documentation, say that the service is shit in its current state and call it a day… ?”.
Then I thought about it and tried to look at the documentation, to see if there weren’t some other tricks I could try.
Note that I tried the tricks in the README.md, under the section :
“Help!! Synapse is slow and eats all my RAM/CPU!”

Given how Synapse ate 4 GB of RAM with a simple PostgreSQL setup, I’m pretty convinced that these ‘tricks’ should be the default setup for Synapse. If your service is known to take down a server with just a few individuals, maybe you should think about the default configuration.

But, yeah, I gave these tricks a try and they didn’t make that much of a difference. The whole SYNAPSE_CACHE_FACTOR thing just delays the moment when the server goes down after eating all the RAM, but it still made it possible to join less crowded rooms.
Which is where I found that most of these rooms are QUIET. With at most 4 people chatting… I mean 30 messages in a day, roughly !
Also most of the “online users” were users bridged to Matrix from various other protocols, which gave a “we overinflated the number of users in our chat to make it cooler !” vibe.

In the end, the only thing I can say about this is “Nice chat client. It works nicely on a single server with my chat rooms setup… But the resource management is horrendous when you check up federated chat rooms !”.

Is there a point to federation ?

The main issue with Federation in Synapse is that I don’t see the other servers ! Riot is unable to list them directly, you have to configure the servers you want to see in the “Federated” list in its config.json (which is weird BTW, the CLIENT SHOULD ASK THE SERVER about which other servers it federates with !).

So on the basic setup, any user using “Riot” will :

  • connect to your server
  • list the chatrooms on matrix.org
  • join the crowded ones
  • down your server
  • never come back again

Now, by looking through other parts of the documentation (yeah, you read it right), I found a configuration directive named “room_complexity” which allows you to tell your server “Hey, I won’t try to relay messages from rooms that are ‘too complex’”.

The room_complexity comes with a score that makes no sense. You can start with 1.0, which basically will make any client connecting to your server unable to join most of the rooms listed in matrix.org. There are a few rooms at best you’ll be able to connect to, but they are not that common.

Still, where are the other servers ? How do I discover them ?
See, when you use that kind of federated services, you’re interested in these little servers with peculiar rooms (and straight weird shit too). However, here you can only see matrix.org rooms. If you’re looking for the addresses for these little servers, just search for some index of federated servers on the web…

To me, that’s the opposite of Mastodon and the Fediverse, where you have your server, you start to federate with the “mainstream” servers and already start seeing tons of messages coming from tons of small servers. Then you start browsing these little servers, have a look at their federations, find some new federation streams that interest you and start adding these streams back to your server !

In Mastodon, Federation “just works”.
In Matrix Synapse, Federation “just sucks”.

That’s mostly due to the main client used for that, Riot, which is powerful but seems to be more and more tailored towards ‘matrix.org’ and their services, and less and less towards small federated servers.

I understand that they want to compete with Discord and Slack, in order to generate revenue and pay the developers working on their software full-time.

But at the same time, I really feel that it’s missing the point. I still have to test “Rocket.chat”, but my biggest question is : Why would I use Matrix Synapse for something like a Discord server ? Just open up a Discord server, it’s quick, easy and gratis.
It’s clearly not “Open Source” nor Free Software but most users don’t give a shit about that. They care about features, usability and integration. And Discord is feature-full, easy (but not quite) to use, and is being MORE and MORE integrated in the Gaming community, to the point where some chat rooms in mobile games are directly connected to Discord servers.

Meanwhile, the whole “interconnected” and “big community” that federated MATRIX servers provide is a feature that big services like Discord or Slack cannot compete with.
But, for that, you’ll need a client that actually lists all the servers federated with matrix.org and lets the user discover the chat rooms of these servers.

But the server doesn’t exist !

I forgot… In the Matrix protocol, a room is the aggregation of messages sent by different elements from different federated servers.

See, take Mastodon and the federated stream. Take a client and filter the federated stream with a Hashtag. Make every message from a ‘chatroom’ send ‘toots’ with this Hashtag.
BAM ! You got a Matrix chatroom !
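
Here is that analogy as a toy sketch : events arrive from several servers in one big federated stream, and a “room” is just the subset carrying the same room identifier. Everything in it (the Event fields, the server names, the room ids) is made up for the illustration, and real Matrix rooms are more involved than a timestamp sort, since events form a graph that the servers have to reconcile.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    room_id: str        # plays the role of the "hashtag"
    origin_server: str  # server the sender belongs to
    timestamp: int
    sender: str
    body: str


def room_timeline(federated_stream, room_id):
    """Filter the federated stream on one room and order it by time."""
    return sorted(
        (event for event in federated_stream if event.room_id == room_id),
        key=lambda event: event.timestamp,
    )


if __name__ == "__main__":
    stream = [
        Event("!cats", "small.example", 3, "@bob", "meow"),
        Event("!dogs", "big.example", 2, "@eve", "woof"),
        Event("!cats", "big.example", 1, "@alice", "hello"),
    ]
    for event in room_timeline(stream, "!cats"):
        print(event.origin_server, event.sender, event.body)
```

Note that the resulting timeline mixes messages from both servers, and neither of them “owns” the room.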

When you list the chatrooms in matrix.org, you actually list all the public federated chatrooms from servers federating with matrix.org.
It just DOESN’T LOOK LIKE THIS. BUT IT IS.

Now, matrix chatrooms have all the bells and whistles (and video-conferencing too, you won’t get that with Mastodon).
Still, that means that your server doesn’t ‘host’ chatrooms per-se… AAAND that’s where it becomes VERY blurry. However, the documentation insists on how your chatroom isn’t really hosted by a single entity, yada yada yada.
So, MAYBE they’re not listing other servers with their chatrooms to enforce their points ?

Now, there are cases for Federation that I can clearly see from a corporate point of view.
Let’s say that you want to provide chat services, with your own chat servers scattered around the globe.
You want these chat rooms federated so that every user sees the messages sent to every server.
All of that with minimum latency.

With a federated chat server, you could do this quickly :

  • Remove federation with the matrix.org server.
  • Add the addresses of the different chat servers from around the world in the federated list (in homeserver.yaml).
  • Ensure low-latency (peered) communications between your servers
  • PROFIT !

The users will connect to the closest end-point, reducing lag, and the servers will relay messages sent to the chatrooms from any endpoint with low latencies !

However, for that, you’ll first need to be able to setup the chat server quickly and efficiently…
Also, that’s a very limited use for Federation, and is just a clustered server with ‘shards’ all over the place.

Back on the installation

Anyway, I wanted to finish this guide about “How to install a Synapse server” that was not as INSANE as the original Docker image from Synapse.

The reason why I called this post “Synapse Quick Sands” is because the more I tried to do this correctly, the more the Synapse project seemed to be fucked up for simple installations, the more I felt that it was a wrong decision to try generating such guides.
This gradually went from “Providing nice and user-friendly documentation on how to quickly set up a working Synapse server could be a nice plus” to “I should just ditch this shit and forget about it ! This is taking WAY TOO MUCH TIME ! This project is made by insane people !”.

I won’t talk about the issues I got with CoTURN, in order to have a working video-conferencing setup… this project also took me a while to set up, mostly because the documentation is also written “by the main developers, for the main developers”.
Note that the point of TURN is to relay the media traffic between endpoints that would like to set up a video-conference directly but cannot, because NATs or firewalls block the direct communication.

So, over these two weeks, I put in place a ‘semi-automated’ Docker Compose setup that allows you to prepare and install a Synapse Matrix server with :

  • a TURN server (using CoTURN), for video-conferencing setup;
  • a PostgreSQL server (because Synapse’s default SQLite setup has terrible performance);
  • an HAProxy load balancer, to handle SSL connections, provide ALPN and provide some basic HTTP protections against bots;
  • an NGINX server hosting potential ACME challenges for Let’s Encrypt SSL certificate creation, and ready to host a Riot-web client.

The setup takes a few script calls to configure the various services, can help you set up SSL certificates with Let’s Encrypt, and should allow any admin with Docker skills to have a try at “Synapse”, “Riot” and all their joys.
Video-conferencing works ‘out of the box’, as long as you set up your DNS servers correctly. Tried with my smartphone and my PC, one connected on a 3G network and one connected on my Wi-Fi.
You can share files, pictures, links, …

Now, I remade parts of the Docker build image for nothing… After checking Synapse code to understand how the initial Docker image generated the configuration, and how I could modify the part concerning the database, which is set up for SQLite by default, I learned that the main YAML file could be split into multiple files…
So, yeah, if you want to enhance it, remove the overcomplex.sh reference in Synapse build file, use Synapse as the ENTRYPOINT for the Docker image, create and split an initial configuration into multiple YAML files in one folder, make the ‘docker-prepare.sh’ script generate the ‘macaroon’ and other secret keys independently into a specific YAML file and this should be better already.
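
A rough sketch of that last step, in case someone wants to try : generate the secrets on their own and render them as a small YAML fragment, to be written to a separate file (say, a hypothetical secrets.yaml) that sits next to the main configuration. The three key names are the ones I saw in a generated homeserver.yaml; the helper itself is hypothetical and not part of my scripts nor of Synapse.

```python
import secrets

# Keys that hold secrets in a generated homeserver.yaml (as far as I could tell).
SECRET_KEYS = ("macaroon_secret_key", "registration_shared_secret", "form_secret")


def render_secrets_yaml(nbytes: int = 32) -> str:
    """Return a YAML fragment with one freshly generated secret per key."""
    lines = [f'{key}: "{secrets.token_urlsafe(nbytes)}"' for key in SECRET_KEYS]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Redirect this to a file readable only by the Synapse user.
    print(render_secrets_yaml(), end="")
```

Keeping this fragment out of the main file means homeserver.yaml can be versioned or shared without leaking anything.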

Still, I’m taking a break from Synapse. It looks like a nice project, but it lacks SO MANY basic things to make it usable, and to make people want to host such servers, that I’ll just give up on it for now.