Element-web: Feature request: export chat logs

Created on 22 Nov 2016  ·  68 Comments  ·  Source: vector-im/element-web

A very useful feature. I find it strange it isn't already implemented, unless I am bad at searching.

Alternatives

  • specify time range
  • possibly, specify file format or formatting

What do you think?

feature p1 timeline

Most helpful comment

i’ve been thinking about this while implementing a smarter clipboard for riot-web. I realised i don’t actually necessarily understand what the 63 people upvoting are after here. is it:

  1. 👍 - ability to export logs from Riot/Web for a given room(s) as a big lump of static HTML, suitable for printing or grepping or sharing out of band? This is hard, given public rooms can easily have millions of messages, and you would probably run out of RAM (or bandwidth or time) trying to export them. But we could implement it with a date range or msgcount limit.
  2. ❤️ - ability for Riot/Desktop to save all the messages it sees on disk as HTML or plain text, a bit like an IRC client would, spidering to fill in any gaps in the logs?
  3. 🚀 - actually, pantalaimon / matrix-recorder / matrix-dl etc actually have solved this already for my use case.

Seshat may help with option 2 in the near future. Meanwhile, i may be able to tweak the clipboard code to easily support option 1.

Please vote by upvoting this msg with the matching emoji so I can get more of a handle.

All 68 comments

yup, i find myself wanting this repeatedly too. #2129 is close to it, but would be nice to just be able to say "download room as log".

I was once toying with the idea of selecting some of the chat and hitting some kind of key combo or menu option to copy an IRC-style log to the clipboard?

This would also be cool!

Any plans to do this ?

as a p2 feature req this is basically stuck behind everything tagged as p1, which is around 200 issues right now. but given this is FOSS anyone is welcome to contribute it!

I find the search feature of matrix/riot pretty inefficient for an advanced user like me who is used to running greps (even with regexes) over simple text files.

So, being unable to extract simple plain text logs from my matrix conversations (both private or in rooms) is a serious regression for me, when compared with my previous experience with Jabber or IRC clients.

https://gitlab.com/argit/matrix-recorder is a good workaround for this for now (and even does e2e if you desire, albeit storing it unencrypted). We could also build something like this into riot itself in future.

+1 for this feature request

I also need to inform that two my accounts at gmail and yahoo been locked and i have no access to them long time ago. I also worry that my citizen auth key was stored at that acc. Maybe anyone could help if you have my chats and so powerfullm

Being able to export IRC-like logs would be an extremely useful feature to have for quickly copying chat logs for recurring FOSS project meetings to archive somewhere. Glad to see it's now labelled p1 instead of p2.

I miss this feature a lot and I'm tempted to give this a go. However the riot-web code base looks very daunting, especially with the weird separation of sdk and actual app. Can someone who has an overview of the code base give a rough plan of the relevant cutpoints where this would be implemented in riot-web?

@Valodim all the code would be in https://github.com/matrix-org/matrix-react-sdk/

This feature is also required by GDPR. I think GDPR favours structured formats, json and the like, but I don't mind whether it's txt or anything else.
But besides the legal stuff it would be a really really useful feature. Copying the history manually is a p.i.t.a..
Also GDPR and other valid reasons convince server operators to delete older history (we are discussing something in between 3-6 month atm). So whatever is not copied gets lost.
So if @Valodim picks this up it would be great.

@ilu33 https://matrix.org/docs/projects/other/matrix-recorder.html
riot-web doesn't store the data so it doesn't have to be able to export it

Nobody requires riot-web to store data. riot-web is just the user frontend and it pulls data from the hs all the time. If the user wants to store selected data the user frontend should provide a way to do so.

Using another tool is a crutch at best. How would the average end user even install matrix-recorder? Not to mention https://gitlab.com/argit/matrix-recorder/issues/1 which makes the tool unusable for everybody who's in several high traffic rooms (which no sane person would want to archive). Also, routinely archiving multi-user rooms without any reason or need to do so (just because the tool happens to be catch-all) defies every purpose of data protection. It might be allowed as long as you don't publish the data but it sounds immoral to me. I would not want to do that.

And regarding encryption: Matrix has already problems handling E2EE if more than one device is involved. It regularly breaks for no obvious reasons (issues are up here on github). I would not want to recommend using another device to anybody.

Yet on the other hand you run into much bigger limitations in webapps in the sizes of files you can generate before the browser kills you

hm - limit the amount of lines? Maybe only export the stuff riot-web has already loaded into the browser window?

It seems that matrix-recorder has a similar problem (ORG.MATRIX.JSSDK_TIMEOUT)?

Note @t3chguy : I edited my previous comment.

Why has nothing been done since the ticket was opened on Nov 22, 2016? The Riot client frontend should have a simple, user-friendly basic chat export for admins: just copy chat from 1.1.2018 to 1.2.2018 in channel x to the clipboard. It is also a stubborn and ridiculous excuse to say the client doesn't need to have this feature because of size limitations. Seriously? We are speaking of chat logs, not of file attachments. If you copy a chat log for a day or a week, it's like 10-1000kb in size.

@makedir It does indeed seem like a very useful feature, but please keep it civil. Developers have their own priorities. If you believe it is not very hard, why not give it a go ? Contributing to a super cool project like Riot is a fun experience !

JFYI, @thiblahute has developed a simple but useful command-line tool for this:
https://gitlab.gnome.org/thiblahute/matrix-dl

To install matrix-dl without messing up the system's Python setup, we will document how to install it in a virtual environment. The virtualenv command is provided by a Python package that enables the creation of these virtual environments. You can install virtualenv using your packaging system.

Run:

virtualenv -p python3 matrix
cd matrix
source bin/activate

Now, clone the code:

git clone https://gitlab.gnome.org/thiblahute/matrix-dl.git

And install the dependencies and the script itself in the virtual environment:

cd matrix-dl
python setup.py install

Usage:

The tool's usage instructions are these:

matrix-dl [-h] [--password PASSWORD] [--matrix-url MATRIX_URL]
          [--start-date START_DATE]
          username room

Download backlogs from Matrix as raw test

positional arguments:
  username
  room

optional arguments:
  -h, --help               show this help message and exit
  --password PASSWORD      Will be asked later if not provided
  --matrix-url MATRIX_URL
  --start-date START_DATE  format %d%m%Y

A couple of examples:

Let's download the conversations from Example channel since the beginning of 2018:

matrix-dl --matrix-url https://matrix.example.com/ --start-date 01012018 \
  <fsurname> "Example" > example-2018.log

You will then be asked for your password and, if there are no errors, the conversations will be dumped to the file example-2018.log in the format hh:mm:ss — @user: message
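Since the dump is plain text, it can be post-processed with ordinary scripting. The following is only a minimal Python sketch, assuming every line really has the shape described above (time, a U+2014 dash, the sender, a colon, then the message); the actual matrix-dl output may differ, and the names LogLine, parse_line and messages_from are hypothetical.

```python
import re
from typing import Iterable, Iterator, NamedTuple, Optional


class LogLine(NamedTuple):
    time: str
    sender: str
    message: str


# Assumed line shape: "hh:mm:ss <U+2014> @user: message".
# The sender may or may not carry a ":server" suffix, so match lazily
# up to the first ": " (colon followed by a space).
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}) \u2014 (@\S+?): (.*)$")


def parse_line(line: str) -> Optional[LogLine]:
    """Return the parsed line, or None if it does not match the assumed shape."""
    match = LINE_RE.match(line.rstrip("\n"))
    return LogLine(*match.groups()) if match else None


def messages_from(lines: Iterable[str], sender: str) -> Iterator[LogLine]:
    """Yield only the lines written by the given sender."""
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None and parsed.sender == sender:
            yield parsed
```

For example, opening example-2018.log and passing the file object to messages_from(fh, "@alice") would yield only the lines from that (hypothetical) user.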

You can also dump conversations from unnamed rooms, such as personal conversations; you just need the room's internal ID. You can get this string in Riot by clicking the room's settings icon (the gear, so far); at the end of the settings, in the advanced section, there's the room's ID:

matrix-dl --matrix-url https://matrix.example.com/ --start-date 01012018 \
  <fsurname> \!i4BiDaYPkvfbcWdAgb:example.com > my-chat-2018.log

Remember to escape the symbol !, otherwise the shell may consider it an operator.
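The --start-date value is easy to get wrong by hand because it has no separators. As a small sketch, the string can be generated with Python's strftime using the same %d%m%Y format shown in the usage text; the helper names start_date_arg and days_ago are made up for illustration.

```python
from datetime import date, timedelta
from typing import Optional


def start_date_arg(day: date) -> str:
    """Format a date as DDMMYYYY, i.e. the %d%m%Y format from the usage text."""
    return day.strftime("%d%m%Y")


def days_ago(n: int, today: Optional[date] = None) -> str:
    """Return the --start-date string for n days before today."""
    today = today or date.today()
    return start_date_arg(today - timedelta(days=n))


print(start_date_arg(date(2018, 1, 1)))  # 01012018
```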

What Thibault @thiblahute has programmed is for sure useful, but only for programmers who know how to install and deal with that software.

An export of the chatlog to simple text is essential for example for group meetings, to be able to write the minutes after the meeting. Therefore:
Please, please, please implement a possibility to export the chatlog to a simple text file.
Thank you very much.

Any progress on this? "manually running something like select * from events where user=?" as suggested here https://matrix.org/blog/2018/05/08/gdpr-compliance-in-matrix/ is not really a solution. I can't even find this issue in the GDPR project timeline.

JFYI, @thiblahute has developed a simple but useful command-line tool for this:
https://gitlab.gnome.org/thiblahute/matrix-dl

@psaavedra is it working on encrypted channels too ?

Thanks for this tool BTW.

Please, please, please implement a possibility to export the chatlog to a simple text file.

Allowing every user to do this in a convenient way sounds like an important option to me, at least for backup before deleting the history, or exporting to another channel, or searching in an encrypted channel (as that is not working right now), and so on.

@thiblahute, @psaavedra when I try to run the command line tool to dump log for unencrypted named rooms I receive this error:
raise JSONDecodeError("Expecting value", s, err.value) from None json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

instead for encrypted rooms I receive this error:
event not found
any suggestion on how to debug these?

Being able to export the logs would be extremely useful.

@psaavedra , I do not understand what the matrix URL is. Can you please clarify what matrix.example.com is? I mean, is the 'example' in the URL the channel name?
I get a lot of exceptions while doing the same but replacing the 'example' with the channel I've created. BTW there is nothing called 'channel', but only rooms right?

I get an exception like this (screenshot not preserved).

I've got no idea what URL is the right one

Really sorry if this is the wrong place to ask for help, but it is absolutely critical that I download a Matrix chat room and I don't know where else to go. I'm trying to use @thiblahute's tool, like so:

python matrix-dl --matrix-url https://riot.im/ [my username] "[room name]"

then I enter my password. Then I get this error:

[my username] connecting to https://riot.im/
Traceback (most recent call last):
  File "matrix-dl", line 164, in <module>
    getter.run()
  File "matrix-dl", line 80, in run
    password=self.password)
  File "C:\_Programs\Python\lib\site-packages\matrix_client-0.3.2-py3.7.egg\matrix_client\client.py", line 249, in login_with_password
    return self.login(username, password, limit, sync=True)
  File "C:\_Programs\Python\lib\site-packages\matrix_client-0.3.2-py3.7.egg\matrix_client\client.py", line 270, in login
    "m.login.password", user=username, password=password, device_id=device_id
  File "C:\_Programs\Python\lib\site-packages\matrix_client-0.3.2-py3.7.egg\matrix_client\api.py", line 160, in login
    return self._send("POST", "/login", content)
  File "C:\_Programs\Python\lib\site-packages\matrix_client-0.3.2-py3.7.egg\matrix_client\api.py", line 691, in _send
    code=response.status_code, content=response.text
matrix_client.errors.MatrixRequestError: 404: <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>404 Not Found</title>
</head><body>
<h1>Not Found</h1>
<p>The requested URL /_matrix/client/r0/login was not found on this server.</p>
</body></html>

What am I doing wrong?

In the future you should open an issue on that repo https://gitlab.gnome.org/thiblahute/matrix-dl, not here. I assume the problem is Riot is not a matrix server. The server is the bit at the end of your username. So if my account was @aaron:matrix.org, https://matrix.org would be the server or as they call it matrix-url.

No need to reply thanks if that fixed it. If it didn’t go open an issue on https://gitlab.gnome.org/thiblahute/matrix-dl.
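The rule of thumb above (the server is the part after the colon in your user ID) can be sketched in a few lines of Python. This is an illustration only: the function name homeserver_url is hypothetical, and it assumes the server answers client requests directly at https://<server name>. Some servers publish a different client API address through /.well-known/matrix/client, which this sketch does not look up.

```python
def homeserver_url(user_id: str) -> str:
    """Guess the --matrix-url value from a full Matrix user ID.

    Assumption: the client API is reachable at https://<server name>;
    any .well-known delegation is ignored here.
    """
    if not user_id.startswith("@") or ":" not in user_id:
        raise ValueError(f"not a full Matrix user ID: {user_id!r}")
    # Split only on the first colon so a port in the server name survives.
    _localpart, server_name = user_id[1:].split(":", 1)
    if not server_name:
        raise ValueError(f"missing server name in {user_id!r}")
    return f"https://{server_name}"


print(homeserver_url("@aaron:matrix.org"))  # https://matrix.org
```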

… backup before deleting the history, …

Or (edge case) before rejoining a room from which you have been kicked through no fault of your own.

An example of history lost after rejoining: https://github.com/matrix-org/synapse/issues/2212#issuecomment-487407191

I also would love to see this feature...

Is there any way to do that? I tried what's on https://gitlab.gnome.org/thiblahute/matrix-dl but it doesn't work for me. I opened an issue there but it seems like development there is inactive.

Is there any way to do that? I tried what's on https://gitlab.gnome.org/thiblahute/matrix-dl but it doesn't work for me. I opened an issue there but it seems like development there is inactive.

the method works fine if you do it correctly, it is really annoying though. and the matrix-dl client doesn't export all ASCII characters correctly. I asked the dev about it and he never wrote back.

i’ve been thinking about this while implementing a smarter clipboard for riot-web. I realised i don’t actually necessarily understand what the 63 people upvoting are after here. is it:

  1. 👍 - ability to export logs from Riot/Web for a given room(s) as a big lump of static HTML, suitable for printing or grepping or sharing out of band? This is hard, given public rooms can easily have millions of messages, and you would probably run out of RAM (or bandwidth or time) trying to export them. But we could implement it with a date range or msgcount limit.
  2. ❤️ - ability for Riot/Desktop to save all the messages it sees on disk as HTML or plain text, a bit like an IRC client would, spidering to fill in any gaps in the logs?
  3. 🚀 - actually, pantalaimon / matrix-recorder / matrix-dl etc actually have solved this already for my use case.

Seshat may help with option 2 in the near future. Meanwhile, i may be able to tweak the clipboard code to easily support option 1.

Please vote by upvoting this msg with the matching emoji so I can get more of a handle.
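To make the "date range or msgcount limit" idea in option 1 concrete, here is a rough Python sketch, not Riot code. It assumes events arrive as an iterable of dicts carrying the standard origin_server_ts field (milliseconds since the epoch); the function name bounded_export is hypothetical. Because it is a generator, it never holds more than one event in memory at a time.

```python
from datetime import datetime
from typing import Iterable, Iterator


def bounded_export(
    events: Iterable[dict],
    start: datetime,
    end: datetime,
    max_messages: int,
) -> Iterator[dict]:
    """Yield events with start <= timestamp < end, at most max_messages of them.

    start and end should be timezone-aware datetimes.
    """
    if max_messages <= 0:
        return
    start_ms = int(start.timestamp() * 1000)
    end_ms = int(end.timestamp() * 1000)
    emitted = 0
    for event in events:
        if start_ms <= event["origin_server_ts"] < end_ms:
            yield event
            emitted += 1
            if emitted >= max_messages:
                return  # hit the message-count limit, stop early
```

A caller would feed this from whatever paginated source it has and write each yielded event out immediately, so the export size is capped by the limit rather than by the size of the room.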

What would be perfect to me is option 2 :heart:

I wasn't able to check those you mentioned in option 3 but I will try them out now.... Still, Riot could maybe have such an option.

Isn't it obvious what people want? A simple per-chat-room export button with a start date and an end date: export all chat messages from start date to end date and provide them as an html, txt or zip file.

I want to export personal chats with only thousands to tens of thousands of messages, so running out of ram isn't a concern for me. But even if it was, "you have to use disk as ram to export chats" is better than "you can't export chats at all"

No 1 :+1: , per channel, with start and end date and probably a reasonable msgcount limit. If the room is big or old, people would have to maneuver around this limit by selecting periodic chunks.

Maybe give format options, xml or json would work too. And thanks for coming back to this.
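The "periodic chunks" workaround mentioned above amounts to splitting one long date range into shorter ones that each stay under the limit. A minimal Python sketch, with a hypothetical function name date_chunks and half-open ranges so that no day is exported twice:

```python
from datetime import date, timedelta
from typing import Iterator, Tuple


def date_chunks(start: date, end: date, days: int) -> Iterator[Tuple[date, date]]:
    """Split [start, end) into consecutive half-open chunks of at most `days` days."""
    if days <= 0:
        raise ValueError("days must be positive")
    step = timedelta(days=days)
    cursor = start
    while cursor < end:
        upper = min(cursor + step, end)
        yield cursor, upper
        cursor = upper


for lower, upper in date_chunks(date(2018, 1, 1), date(2018, 2, 1), 14):
    print(lower, "to", upper)
```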

No 1 👍, but it should be possible to select multiple rooms at once

Why not both? Export on demand, and also save all logs/events to disk as plaintext logs for posterity.

I just noticed I spammed Matrix with lots of pictures (1.5MB each) and would like to help you clean up, but scrolling up takes around 10 minutes (I used a heavy item on the pgup). It would be so great if this process was easier, that is exporting and deleting chatrooms.

Does the integration of search through seshat which claims to support E2EE rooms https://github.com/vector-im/riot-web/pull/11125 change this feature request ? Could we use the indexed data collected there ?

Allow for import as well! 👍

@martindale indeed. this would help unload the high traffic on matrix.org by enabling a migration to another (self-hosted?) server.

@arthurlutz I assume you are aware but you can already migrate to another server (that’s kind of the whole point of Matrix, that no one server owns a room). Just join the room from your new server. If you want the full history you can just scroll up in the room and then your new server will have a full copy of the room. (Yes there should probably be a button in Riot that performs both steps, joins a room and requests the full history. I just filed https://github.com/vector-im/riot-web/issues/12766 for that.)

If you want the full history you can just scroll up in the room and then your new server will have a full copy of the room.

That depends on your room settings and is not always possible. But thanks for filing the issue, hope it can be resolved. AFAIK the plan is to use a UUID derived from a private key as identifier (instead of username@server), and the server just being something you can change as you wish.

Has anyone actually managed to export E2E encrypted chats successfully? With COVID-19, there are many of us working from home now, and presumably many looking for new chat platforms. Without exported chat logs, Matrix/Riot are not viable options for me.

I've checked out the existing export options. matrix-dl doesn't appear to export encrypted rooms. I can't even get matrix-recorder to build, and it looks like it may have been abandoned anyway. pantalaimon was mentioned above, but it's unclear to me how to export chat logs with this.

I tried a number of approaches several months ago but I ended up giving up. If it's presently possible to export E2E chats, it is really really difficult to do so and it needs to be more accessible.

Thanks @JimmyCushnie. I might have to go to a closed-source platform for now then.

Will the future "daemon" (I don't know if it's the correct name) that will allow to search an encrypted channel be able to export chat logs ?
I thought I've seen that information somewhere, yet I can't find it.

Seshat will only be indexing encrypted rooms, it could possibly be extended to exporting but it isn't in place atm.

There is functionality to export file based events (m.image, m.video, etc...) generalizing this to export all events for a room or all events for all rooms isn't that hard. How and if this should be exposed in Riot is another question.

@poljar good news! Could you point me to how to use the "export file based events" functionality, or to its documentation?

It's the source for the File Panel on seshat-enabled riot-desktop builds.

If you mean developer docs those can be found here for the Rust side of things and here for the Javascript side of things.

As for Riot like @t3chguy mentioned, it's used for the File Panel.

It would already help if one could PRINT the message pane of a room.. at the moment that seems neither possible in web nor in desktop version.. The latter does not allow print, the former just spits out the room overview and room members pane..

I would also like this feature - even above everything else that is not yet implemented.
I've voted here as well: https://github.com/vector-im/riot-web/issues/2630#issuecomment-526944604

In Germany, Blackberry holds a patent for exporting chats. That's because Whatsapp has removed this feature from their version for Germany (German link: https://t3n.de/news/whatsapp-kein-chat-export-mehr-1238327/).
Could that be a problem if this feature gets implemented in Riot?

That case is about sending the logs to a third party via email while this feature request is about saving to a local file. Unrelated.

That case is about sending the logs to a third party via email while this feature request is about saving to a local file. Unrelated.

Agree. It would be ridiculous to deprive users of the ability to archive their own content for this reason, law or not.

In Germany, Blackberry holds a patent for exporting chats.

IANAL but you can't patent an idea. The open-source Pidgin chat client has had export for 15+ years.

Anyone working on this now? It's been almost 4 years. matrix-recorder works for now but it lacks the ability to record older encrypted messages and seems to have been abandoned.

I think this feature is still needed in Element, but while it's not I successfully exported logs and media using https://github.com/russelldavies/matrix-archive

Let me reformulate how bad the lack of offline history is, in hopes to increase its priority.

Imagine we have a community using a chat system not only to share cat picz but to build something great towards a big goal. Like an open source or an open hardware project. In the process we share valuable knowledge and ideas in this chat. In other words, we rely on chat for something serious.

Now consider the same scenario for two systems.

A: Any decade-old IRC or XMPP client:

  1. enable saving of chat logs to local text files
  2. chat with your community for years, accumulating knowledge
  3. the single server we all used is gone (for whatever reason)

Result: a lot of members have replicas of community's entire chat history and can cooperate to quickly respawn it in a new location.

B: Matrix flagship client (Element) in 2020:

  1. chat with your community for years, accumulating knowledge
  2. the single server we all used is gone (for whatever reason)

Result: there is a high chance that everything is lost. If we are lucky, admins of participating homeservers will help to recover _some_ of the lost content. Now we have to scratch our heads: how does s2s replication work? Do homeservers replicate the entirety of _all_ rooms or only slices of rooms relevant to homeserver's users?

I didn't study other Matrix clients but I assume it is the same for many.

Now I open matrix.org:

An open network for secure, decentralized communication

and

Matrix is really a decentralised conversation store rather than a messaging protocol.

and so on.

Considering the practical difficulty of running homeservers let me ask a simple question: What is more decentralized?

a. 1 central IRC server + 10 always-online users that save chat logs and publish them somewhere + 50 users who periodically sync them

b. 1-3 Matrix homeservers you will find in small communities + 0 users saving chat logs

I should also mention that in the IRC scenario it is practically impossible to delete messages from users' local files. In Matrix it is possible to delete messages and while I hope the database is append-only, I'm not sure it is possible for users to recover deleted messages in all cases.

Closing my mini-rant, Matrix is the best open-source, actively-developed, semi-decentralized, self-hosted, E2EE-enabled chat system I am aware of. I highly respect what Matrix is doing and I am very thankful that we now have a decent alternative to Slack/Discord/Telegram where freedom and sovereignty mean something.

The point of this message is to make a case that from a _user perspective_, as it is today, Matrix is less decentralized than IRC. I don't want to sound too negative, but such a system is _training_ end users to rely on servers and believe/hope everything will be fine. And all of it comes from one missing feature: easy-to-save offline history that keeps a copy of the knowledge the community generates every day.

I kindly ask to reconsider the priority of offline history.

Thank you for your attention.

Speaking of how to implement it, I suggest these steps:

  • command-line tool that simply saves events of all rooms as local json files, e.g. 1 file per one day of events

    • subsequent invocations incrementally sync only what is missing
    • this tool alone will enable archivists to make rooms much more resilient to loss
  • another command-line tool that reads these files, reads e2ee key file and renders local html files, again one per day of events

  • add UI feature to Element to select rooms, date range, and retrieve a compressed archive that uses same formats as above
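The first step in the list above (one JSON file per day, with later runs adding only what is missing) could look roughly like the following Python sketch. It is not an existing tool. It assumes events are plain dicts with the standard event_id and origin_server_ts fields, groups them by UTC day, and writes one JSON object per line; the function names and the .jsonl file naming are choices made for this example.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Dict, Iterable, Set


def day_of(event: dict) -> str:
    """UTC calendar day of an event, from its millisecond timestamp."""
    seconds = event["origin_server_ts"] / 1000
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime("%Y-%m-%d")


def known_event_ids(path: Path) -> Set[str]:
    """Event IDs already stored in a day file (empty set if the file is missing)."""
    if not path.exists():
        return set()
    with path.open(encoding="utf-8") as fh:
        return {json.loads(line)["event_id"] for line in fh if line.strip()}


def archive_events(events: Iterable[dict], out_dir: Path) -> int:
    """Append events to out_dir/<YYYY-MM-DD>.jsonl, skipping ones already saved.

    Returns the number of newly written events, so a repeated run over the
    same events writes nothing.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    seen: Dict[str, Set[str]] = {}
    written = 0
    for event in events:
        day = day_of(event)
        path = out_dir / f"{day}.jsonl"
        if day not in seen:
            seen[day] = known_event_ids(path)
        if event["event_id"] in seen[day]:
            continue
        with path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(event, ensure_ascii=False, sort_keys=True) + "\n")
        seen[day].add(event["event_id"])
        written += 1
    return written
```

Keeping the raw events untouched means a separate renderer (HTML, plain text) can be written or replaced later without re-downloading anything.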

* command-line tool that simply saves events of all rooms as local json files, e.g. 1 file per one day of events

Considering this is only a matter of output formatting, which can easily offer several choices, plain text can also be a good choice for grepping through them :)

Speaking from experience, this issue is one of only two that block me from considering a switch from full IRC to Matrix: I hold years of IRC logs, which act as an integral part of my memory. Sometimes it's easier to remember the context in which a link was given or a conversation happened than the content itself.

So this is more of a personal matter than a project one, but these logs are an integral part of my memory. Aside from the fact that, as xaur suggested, this makes that memory rely on the server, I also consider that a fancy UI won't ever achieve the power and flexibility good ol' greps & pipes can get you when you're used to those.

plain text can also be a good choice for grepping through them :)

Even plaintext logs would be infinitely better than nothing. But the reason I suggest keeping the original event JSONs is that they carry more information. If you have the original events, you can then render them as plaintext, but not the other way round.
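To illustrate the one-way nature of that conversion, here is a minimal Python sketch that renders a stored event as a grep-friendly line. It assumes an unencrypted m.room.message event with the usual sender, origin_server_ts and content.body fields; the function name render_plaintext and the output layout are made up for this example. Everything else in the event (event ID, message type, formatted body, relations) is dropped, which is exactly why the JSON cannot be rebuilt from the text.

```python
from datetime import datetime, timezone
from typing import Optional


def render_plaintext(event: dict) -> Optional[str]:
    """Render a message event as "[date time] <sender> body", else None.

    Non-message events, encrypted events and messages without a body
    (for example redacted ones) are skipped.
    """
    if event.get("type") != "m.room.message":
        return None
    body = event.get("content", {}).get("body")
    if body is None:
        return None
    stamp = datetime.fromtimestamp(event["origin_server_ts"] / 1000, tz=timezone.utc)
    return f"[{stamp:%Y-%m-%d %H:%M:%S}] <{event['sender']}> {body}"
```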

matrix-archive project linked earlier looks like the most up-to-date solution so I suggested this feature in their repo: https://github.com/russelldavies/matrix-archive/issues/14

"How to export / save chat logs" seems to be one of the most popular questions in Element-related chats. While writing yet another answer to such a question, a thought came to my mind:

Until this feature is implemented in Element itself, maybe add a "placeholder" menu item to all of Element's interfaces that suggests using external scripts for this task as a workaround?

I think this would be better than nothing, which is what we have now.
