urlpreview

A bot that replies to links with a link preview embed, using the Matrix API to fetch meta tags.



Usage

Sending any link in chat will have the bot reply to your message with the link's embed details.

The bot will first mark the chat as read, to indicate that it has initiated properly.

If a message contains multiple links, the bot will fetch up to max_links (3) of them. If fetching a link fails, the bot will skip the embed for that link.
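
The link handling described above can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: `URL_PATTERN`, `extract_links`, `collect_previews`, and the `fetch` callable are all hypothetical names, and the real URL matcher may differ.

```python
import re

# Hypothetical pattern; the plugin's real URL matcher is not shown in this README.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def extract_links(message: str, max_links: int = 3) -> list[str]:
    """Return at most max_links URLs, in order of appearance."""
    # Trailing punctuation is trimmed so "https://a.example/x," still works.
    links = [m.group(0).rstrip(".,;:!?)") for m in URL_PATTERN.finditer(message)]
    return links[:max_links]


def collect_previews(message: str, fetch, max_links: int = 3) -> list:
    """Fetch a preview per link; a link whose fetch fails is skipped."""
    previews = []
    for url in extract_links(message, max_links):
        try:
            previews.append(fetch(url))
        except Exception:
            # Assumption: any fetch error simply drops that link's embed.
            continue
    return previews
```

Here `fetch` stands in for whichever parser retrieves the preview data; passing it as a parameter keeps the sketch independent of any particular data source.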

If the link returns a 404, the bot will react to your message with the no_results_react emoji (💨) to show that no results were returned.

The url_blacklist and user_blacklist options let you restrict which links and which users urlpreview responds to.
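
As a rough sketch of how such checks could work, the following assumes that each url_blacklist entry is either an IP range in CIDR notation or a regular expression, and that IP ranges only apply when the link's host is a literal IP address. The function names are hypothetical and the plugin's real matching rules may differ.

```python
import ipaddress
import re
from urllib.parse import urlsplit


def is_url_blacklisted(url: str, url_blacklist: list[str]) -> bool:
    """Return True if url matches an IP range or regex entry."""
    host = urlsplit(url).hostname or ""
    try:
        host_ip = ipaddress.ip_address(host)
    except ValueError:
        host_ip = None  # host is a domain name, not an IP literal

    for entry in url_blacklist:
        try:
            network = ipaddress.ip_network(entry, strict=False)
        except ValueError:
            network = None  # not an IP range, so treat it as a regex
        if network is not None:
            if host_ip is not None and host_ip in network:
                return True
            continue
        if re.search(entry, url):
            return True
    return False


def is_user_blacklisted(sender: str, user_blacklist: list[str]) -> bool:
    """Assumption: users are matched by exact Matrix ID."""
    return sender in user_blacklist
```

For example, `is_url_blacklisted("http://192.168.1.5:8080/admin", ["192.168.0.0/16"])` would return True under these assumptions.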


Config

  • ext_enabled - Change which data sources to use for meta tags (last in array takes priority)
  • html_custom_headers - Set custom headers (e.g. User-Agent, Accept-Encoding) for data fetching
  • max_links - Change how many links you'd like to process per message. 1-3 is recommended.
  • max_image_embed - Change the maximum image width displayed in the embed. 300 is recommended.
  • no_results_react - Adds a reaction emoji to the message to show that no results were returned. Put '' to disable.
  • url_blacklist - Disable urlpreview for an IP range or for URLs matching a regex entry
  • user_blacklist - Disable urlpreview for a user
htmlparser

N/A

json
  • json_max_char - Set a maximum character limit for outputted JSON, to prevent long files from blocking chat. Default 2000.
synapse
  • appid - Your bot's access token. This is needed to make the request to the Matrix Synapse URL Preview API.
  • homeserver - Your homeserver (matrix-client.matrix.org by default). Don't add https:// in front.
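
To illustrate how appid and homeserver fit together, here is a hedged sketch that only builds the request without sending it. The endpoint path `/_matrix/media/v3/preview_url` is an assumption taken from the Matrix specification's legacy media API. The plugin may use a different path, and newer homeservers may expose URL previews elsewhere. `build_preview_request` is a hypothetical helper name.

```python
from urllib.parse import urlencode


def build_preview_request(homeserver: str, appid: str, link: str):
    """Return (request_url, headers) for a URL preview lookup."""
    # homeserver is given without a scheme, as in the Config entry above.
    # Assumed endpoint path; check your homeserver's documentation.
    endpoint = f"https://{homeserver}/_matrix/media/v3/preview_url"
    request_url = f"{endpoint}?{urlencode({'url': link})}"
    # The access token is sent as a bearer token.
    headers = {"Authorization": f"Bearer {appid}"}
    return request_url, headers
```

Calling `build_preview_request("matrix-client.matrix.org", "TOKEN", "https://example.com/")` yields a URL and headers that any HTTP client can send as a GET request.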

Notes

  • This bot comes with three parsers: htmlparser, json, and synapse. By default, all are enabled.
  • You can control which ones to enable/disable or prioritize using ext_enabled (last in array takes priority).
  • Due to the length of some embeds, line-breaks are stripped from any og:description tags.
  • Image width relies on og:image:width provided by websites, and falls back to max_image_embed px wide. There may be an option in the future to install a dependency that'll parse image height.
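
The notes above can be sketched as three small helpers. All names here are hypothetical, and several details are assumptions rather than confirmed behaviour: that empty values from a later parser do not overwrite earlier ones, that line breaks are replaced by a single space, and that a reported width larger than max_image_embed is capped to it.

```python
def merge_results(results_by_parser: dict, ext_enabled: list[str]) -> dict:
    """Merge meta tags; parsers later in ext_enabled take priority."""
    merged = {}
    for name in ext_enabled:
        for key, value in results_by_parser.get(name, {}).items():
            if value:  # assumption: empty values never overwrite
                merged[key] = value
    return merged


def strip_line_breaks(description: str) -> str:
    """Join the lines of an og:description into a single line."""
    parts = (line.strip() for line in description.splitlines())
    return " ".join(part for part in parts if part)


def embed_width(meta: dict, max_image_embed: int = 300) -> int:
    """Use og:image:width when usable, else fall back to max_image_embed."""
    try:
        width = int(meta.get("og:image:width", ""))
    except (TypeError, ValueError):
        return max_image_embed
    if width <= 0:
        return max_image_embed
    return min(width, max_image_embed)
```

With `ext_enabled` set to `["htmlparser", "synapse"]`, a title found by both parsers would come from synapse in this sketch, while tags found only by htmlparser are kept.
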
htmlparser
  • htmlparser works out-of-the-box by directly fetching the HTML page and parsing using htmlparser (built-in).
  • htmlparser may leak your server's IP, so it is recommended for bots hosted in a VPS/server environment.
  • Some sites protected by Cloudflare/similar services may not return results.
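
As an illustration of the approach, the sketch below collects meta tags from an HTML string with Python's built-in `html.parser` module. It assumes that this is the "htmlparser" referred to above and that the first occurrence of a tag wins. `MetaTagParser` and `parse_meta_tags` are hypothetical names, and fetching the page is left out.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collect <meta property=...> and <meta name=...> content values."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        key = attributes.get("property") or attributes.get("name")
        content = attributes.get("content")
        if key and content is not None:
            # Assumption: keep the first value seen for each key.
            self.meta.setdefault(key, content)


def parse_meta_tags(html: str) -> dict:
    parser = MetaTagParser()
    parser.feed(html)
    parser.close()
    return parser.meta
```

Self-closing tags such as `<meta ... />` are handled as well, because `HTMLParser` routes them through `handle_starttag` by default.
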
json
  • json works out-of-the-box by directly fetching pages with application/json mime_type and parsing using json (built-in).
  • json may leak your server's IP, so it is recommended for bots hosted in a VPS/server environment.
  • By default, JSON results are truncated to json_max_char (2000) characters in chat.
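
A minimal sketch of the truncation step, assuming the JSON is re-serialised with indentation before being cut to json_max_char characters. The function name and the formatting choices are hypothetical. The plugin may format or mark truncated output differently.

```python
import json


def format_json_preview(body: str, json_max_char: int = 2000) -> str:
    """Pretty-print a JSON document and cut it to json_max_char characters."""
    data = json.loads(body)
    text = json.dumps(data, indent=2, ensure_ascii=False)
    # Plain slicing: the result may end mid-value when truncated.
    return text[:json_max_char]
```

Because the cut is a plain character slice, a truncated result is generally not valid JSON on its own. It is meant for display in chat only.
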
synapse

Upgrade Guide

If you're updating from older urlpreview versions, delete the whole ext_enabled: [...] line and click "Save" to activate new parsers.

To pick up new Config entries, open your instance under Instances in Maubot Manager and click "Save" (even with no changes). This restores any missing Config entries with their default values. You can also delete some or all of your Config entries and click "Save" to restore their defaults.

Known Bugs

  • As of v0.3, image previews will expire after a few days. If you would like to preserve any images, please re-upload them to the chat manually.
  • YouTube doesn't put line breaks in their og:description, which may lead to improperly parsed links in your Matrix client.