167 Commits

Author SHA1 Message Date
4e582fb55d Improving Self-Hosting and Removing 3rd Party dependencies. (#4513)
* fix: Library Header layout shift

* Bump Github Actions versions.

* Self-Hosting Changes

* Fix Minio Environment Variable

* Just make pdfs successful, due to lack of PDFHandler

* Fix issue where flag was set wrong

* Added an NGINX Example file

* Add some documentation for self-hosting via Docker Compose

* Make some adjustments to Puppeteer due to failing sites.

* adjust timings

* Add start of Mail Service

* Fix Docker Files

* More email service stuff

* Add Guide to use Zapier for Email-Importing.

* Ensure that if no env is provided it uses the old email settings

* Add some instructions for self-hosted email

* Add SNS Endpoints for Mail Watcher

* Add steps and functionality for using SES and SNS for email

* Uncomment a few jobs.

* Added option for Firefox for parser. Was having issues with Chromium on Docker.

* Add missing space.

Co-authored-by: Russ Taylor <729694+russtaylor@users.noreply.github.com>

* Fix some wording on the Guide

* update browser extension to handle self-hosted instances

* add slight documentation to options page

* Fix MV

* Do raw handlers for Medium

* Fix images in Medium

* Update self-hosting/GUIDE.md

Co-authored-by: Mike Baker <1426795+mbaker3@users.noreply.github.com>

* Update Guide with other variables

* Add The Verge to JS-less handlers

* Update regex and image-proxy

* Update self-hosting/nginx/nginx.conf

Co-authored-by: Mike Baker <1426795+mbaker3@users.noreply.github.com>

* Update regex and image-proxy

* Update self-hosting/docker-compose/docker-compose.yml

Co-authored-by: Mike Baker <1426795+mbaker3@users.noreply.github.com>

* Fix Minio for Export

* Merge to main

* Update GUIDE with newer NGINX

* Update nginx config to include api/save route

* Enable Native PDF View for PDFS

* feat:lover packages test

* feat:working build

* feat:alpine build

* docs:api dockerfile docs

* Write a PDF.js wrapper to replace pspdfkit

* Revert changes for replication, set settings to have default mode

* build folder got removed due to gitignore on pdf

* Add Box shadow to pdf pages

* Add Toggle for Progress in PDFS, enabled native viewer toggle

* Update node version to LTS

* Fix Linting issues

* Make env variable nullable

* Add touchend listener for mobile

* Make changes to PDF for mobile

* fix(android): change serverUrl to selfhosted first

* feat:2 stage alpine content fetch

* feat:separated start script

* fix:changed to node 22

* Add back youtube functionality and add guide

* trigger build

* Fix cache issue on YouTube

* Allow empty AWS_S3_ENDPOINT

* Add GHCR for all images

* Add GHCR for self-hosting.

* Test prebuilt.

* Test prebuilt

* Test prebuilt...

* Fix web image

* Remove Web Image (For now)

* Move docker-compose to images

* Move docker-compose files to correct locations

* Remove the need for ARGS

* Update packages, and Typescript versions

* Fix

* Fix issues with build on Web

* Correct push

* Fix Linting issues

* Fix Trace import

* Add missing types

* Fix Tasks

* Add information into guide about self-build

* Fix issues with PDF Viewer

---------

Co-authored-by: keumky2 <keumky2@woowahan.com>
Co-authored-by: William Theaker <wtheaker@nvidia.com>
Co-authored-by: Russ Taylor <729694+russtaylor@users.noreply.github.com>
Co-authored-by: David Adams <david@dadams2.com>
Co-authored-by: Mike Baker <1426795+mbaker3@users.noreply.github.com>
Co-authored-by: m1xxos <66390094+m1xxos@users.noreply.github.com>
Co-authored-by: Adil <mr.adil777@gmail.com>
2025-01-27 13:33:16 +01:00
066883a84d remove unused dependencies 2024-07-24 12:51:25 +08:00
29a5b20d2c remove scrapingbee from content-fetch 2024-07-24 12:17:13 +08:00
dfc0ce0e54 create a context for each new page 2024-07-10 17:20:04 +08:00
e61da03ed4 wait until body is fetched 2024-07-10 16:01:39 +08:00
431c4cc098 use old headless mode for better performance 2024-07-10 15:13:16 +08:00
75338f5927 bypass cloudflare captcha 2024-07-10 14:43:47 +08:00
73e180f43d add more dependencies to docker container 2024-07-09 19:16:21 +08:00
cd83199eb3 remove hardcode gpu vendor as it does not work 2024-07-09 18:18:11 +08:00
c0f45e2411 upgrade puppeteer-core 2024-07-09 18:06:42 +08:00
0899b8fc8f use swAngle 2024-07-09 14:31:51 +08:00
c75cbb39d6 injecting webgl fingerprint 2024-07-09 14:11:31 +08:00
2c15c21bf1 remove user-agent 2024-07-08 18:59:53 +08:00
728059c6f8 do not cache some urls 2024-07-05 19:05:36 +08:00
81fbaf9807 inject fingerprint 2024-07-05 18:52:20 +08:00
a6653414e8 fix: use software graphic rendering instead of gpu and reduce browser launch timeout to 10 seconds 2024-07-05 12:13:00 +08:00
1eb1d25960 remove specific user-data-dir 2024-07-04 19:34:23 +08:00
38a3e03780 improve args 2024-07-04 19:32:11 +08:00
b38b28c75e create a browser singleton instance and check browser existence before creating context 2024-07-04 19:12:42 +08:00
dde9f16396 put error message in the analytic event 2024-05-17 16:16:44 +08:00
f43c48e376 reduce chromium launch timeout to 30 seconds 2024-05-17 14:27:59 +08:00
293ed87100 remove redundant response from return value 2024-05-16 12:21:11 +08:00
cd315fa6c6 remove redundant assignment 2024-05-14 13:07:42 +08:00
484676750e reconnect/restart browser if it crashed/lost connections 2024-05-14 13:04:17 +08:00
a924c8448b capture content-fetch success and error events 2024-05-13 14:55:48 +08:00
d886c3b7d0 catch puppeteer page error 2024-05-13 14:35:47 +08:00
d23bccf459 upgrade puppeteer-core to prevent ProtocolTimeout and add more debug logs 2024-05-13 14:28:26 +08:00
0ac5299c32 do not pass browser instance to content-handler 2024-05-13 13:10:02 +08:00
86e637febd add more logs to debug browser context 2024-05-13 12:56:20 +08:00
475c636c1a print browser log 2024-05-13 12:55:15 +08:00
88a7e8d85b fix tests 2024-04-04 12:17:15 +08:00
0e46dc2302 save dir in the database 2024-03-04 12:28:51 +08:00
5e239d2568 run readability in save-page instead of puppeteer 2024-01-25 16:30:59 +08:00
94dd4be659 fix: page content not saved when title is empty but content is not 2024-01-23 16:47:42 +08:00
1411cf074e fix: finalUrl defaults to the url of the page saved 2024-01-23 14:14:54 +08:00
a03eee5ef7 fix dependencies 2024-01-18 18:48:46 +08:00
d9feb740cb convert content-fetch to typescript 2024-01-18 18:48:46 +08:00
cd3402b98a rewrite puppeteer in typescript 2024-01-18 18:48:46 +08:00
51e586ed3d separate content-fetch in puppeteer packages from saving page content 2024-01-18 18:48:46 +08:00
ad63c75e63 fix typo 2023-12-08 11:29:03 +08:00
3759e10615 fix feed url in pdf file not saved 2023-12-08 11:29:02 +08:00
d09ec51136 Merge pull request #3182 from omnivore-app/fix/importer-notification 2023-11-28 14:59:52 +08:00
b10b704da3 fix importer metrics not updated when failed to catch invalid url in the list 2023-11-28 12:14:27 +08:00
fd781644f1 feat: fetch content for rss feed items in following folder 2023-11-23 18:03:25 +08:00
c4773dc904 Landing page improvements and various supporting improvements 2023-10-24 09:43:39 +01:00
1b1cce7485 disable javascript for the host 2023-10-20 18:59:22 +08:00
d746510358 cont 2023-10-19 21:50:16 +08:00
f750648824 fix importer triggers thumbnailer unexpectedly 2023-10-19 21:46:43 +08:00
0fcc7096aa docs: fix typo in packages/puppeteer-parse/README.md 2023-10-18 17:33:22 +05:45
00bd183287 do not retry importer job if user account is deleted 2023-10-16 16:33:22 +08:00