omnivore-discover
What is this?
One of my biggest problems is the discoverability of articles. I have my five sites, and my link aggregators like Reddit. This is a bubble, and I miss a lot this way.
So I wanted to see if I could create something that would enable discoverability from Omnivore.
I had a few goals when creating Omnivore Discover.
Features
Automatic Categorisation
A while ago I worked on a proof of concept for automatically adding user tags to an article. I still need to work on that further, but the basics worked well.
I wanted to take the learnings from this and use it to add automatic categorisation of stories.
I created a few topics and added descriptions to them. I generate an embedding from each description using OpenAI's embeddings. These can be seen below.
When ingesting articles (see Ingesting Articles) we use their title and short description to create an embedding. We can then use cosine similarity to identify which category the story should be a part of.
This is of course not 100% accurate, but it does a good enough job at categorising articles.
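The matching step can be illustrated with a minimal sketch. It assumes the topic embeddings and the article embedding have already been computed. The `Category` shape, the function names, and the toy three-dimensional vectors are hypothetical and only stand in for real embeddings; this is not the repository's actual code.

```typescript
// Hypothetical shape for a topic and its precomputed embedding.
interface Category {
  name: string;
  embedding: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Embeddings must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Pick the category whose embedding is most similar to the article's.
function pickCategory(
  articleEmbedding: number[],
  categories: Category[]
): string | undefined {
  let bestName: string | undefined;
  let bestScore = -Infinity;
  for (const category of categories) {
    const score = cosineSimilarity(articleEmbedding, category.embedding);
    if (score > bestScore) {
      bestScore = score;
      bestName = category.name;
    }
  }
  return bestName;
}

// Toy vectors only; real embeddings have far more dimensions.
const exampleCategories: Category[] = [
  { name: "technology", embedding: [1, 0, 0] },
  { name: "politics", embedding: [0, 1, 0] },
];
const exampleCategory = pickCategory([0.9, 0.1, 0], exampleCategories);
```

In this toy case `exampleCategory` is `"technology"`, because the article vector points almost entirely along that topic's axis. A similarity threshold could be added so that weak matches stay uncategorised, but that is a design choice this sketch leaves out.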
Social Features
Discord Integration.
I created Omnivore Discover, and added it to the Omnivore WebApp.
I wanted to also add some social features to this. We have a fantastic community within the Omnivore Discord. I have found a lot of interesting reads in the #recommendations channel.
I wanted to be able to take these recommendations, and expose them to the Omnivore Community.
We do this using a Discord bot. To moderate these recommendations, a moderator must approve a story by adding an emoji (🦥) to it.
The story is then ingested in the same way as the other stories, which means it is also categorised. It is also added to the Community Picks tab.
Popularity
There is also a popularity feed. It assigns each story a score based on recent saves, weighting newer articles more heavily. This gives us a Popular tab, which lists in order the stories that are most popular on Omnivore right now according to the community.
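One way such a score could be computed is sketched below. The exact formula is not described here, so the exponential decay, the 48-hour half-life, and the `ArticleStats` shape are all assumptions made for illustration, not the scoring Omnivore actually uses.

```typescript
// Hypothetical per-article statistics.
interface ArticleStats {
  title: string;
  recentSaves: number; // saves counted within some recent window
  publishedAt: Date;
}

const MS_PER_HOUR = 3600000;

// Assumed formula: recent saves, halved for every `halfLifeHours` of article age.
function popularityScore(
  article: ArticleStats,
  now: Date,
  halfLifeHours: number = 48
): number {
  const ageMs = now.getTime() - article.publishedAt.getTime();
  const ageHours = Math.max(0, ageMs / MS_PER_HOUR);
  return article.recentSaves * Math.pow(0.5, ageHours / halfLifeHours);
}

// Return a new array sorted from most to least popular.
function rankByPopularity(articles: ArticleStats[], now: Date): ArticleStats[] {
  return [...articles].sort(
    (x, y) => popularityScore(y, now) - popularityScore(x, now)
  );
}

const exampleNow = new Date("2024-01-03T00:00:00Z");
const exampleRanking = rankByPopularity(
  [
    { title: "Older, more saves", recentSaves: 10, publishedAt: new Date("2024-01-01T00:00:00Z") },
    { title: "Newer, fewer saves", recentSaves: 6, publishedAt: new Date("2024-01-03T00:00:00Z") },
  ],
  exampleNow
);
```

With these numbers the older article scores 10 × 0.5 = 5 and the newer one scores 6, so the newer article ranks first despite having fewer saves.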
Ingesting Articles
I ensured that articles could come from multiple locations. This is why I chose an RxJS poller.
This project also grew out of the automatic labelling project, which was another important reason to enable ingestion from multiple places, including a PubSub queue.
I wanted one of the main sources of the articles to be RSS Feeds.
I did this because I thought that some of this functionality might, in the future, be extendable to other RSS Feeds.
I have chosen three article sources for now: Wired, Ars Technica, and The Atlantic.
Technologies
Below is a list of the technologies that were used to build this feature. This repository represents the RxJS side.
- RxJS
- TypeScript
- Axios
- pgvector
- Discord Bot
- PubSub
Running
Creation of the PubSub Topic and Subscription is external to this app.




