93 Commits

SHA1 Message Date
d9feb740cb convert content-fetch to typescript 2024-01-18 18:48:46 +08:00
cd3402b98a rewrite puppeteer in typescript 2024-01-18 18:48:46 +08:00
51e586ed3d separate content-fetch in puppeteer packages from saving page content 2024-01-18 18:48:46 +08:00
41805d13db Bump @sentry/serverless from 6.19.3 to 7.77.0
Bumps [@sentry/serverless](https://github.com/getsentry/sentry-javascript) from 6.19.3 to 7.77.0.
- [Release notes](https://github.com/getsentry/sentry-javascript/releases)
- [Changelog](https://github.com/getsentry/sentry-javascript/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/getsentry/sentry-javascript/compare/6.19.3...7.77.0)

---
updated-dependencies:
- dependency-name: "@sentry/serverless"
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-11-07 03:35:00 +00:00
8ddfa0a389 Fixes for new docker images 2023-10-26 11:19:38 +08:00
939a00b893 Update yarn file 2023-10-25 16:58:11 +08:00
16784728cf Use debian for base content-fetch container 2023-10-25 09:44:15 +08:00
c4773dc904 Landing page improvements and various supporting improvements 2023-10-24 09:43:39 +01:00
4b171c0657 docs: fix typos in packages/content-fetch/README.md 2023-10-18 20:35:35 +05:45
a97fcd1e88 do not use single process in chromium 2023-08-02 15:58:32 +08:00
63cbb3011e upgrade puppeteer and update chromium args 2023-08-02 15:33:15 +08:00
fa1ff9ba17 Upgrade node to 18.16 2023-07-24 15:55:11 +08:00
6cd6994aff Fix docker image 2022-12-28 15:28:27 +08:00
a5f5e6fbdb Fix docker build 2022-12-28 11:51:11 +08:00
e866541ae1 Fix puppeteer launch in head mode 2022-11-17 11:28:46 +08:00
e75e49a7b4 Remove logging dependencies in puppeteer-parse 2022-11-17 11:28:26 +08:00
db8c9cf97d Add function-framework dependency 2022-11-16 10:52:07 +08:00
d6e687d5d1 Update env example 2022-11-16 10:15:49 +08:00
b18af10e75 Import puppeteer-parse in content-fetch 2022-11-16 10:15:49 +08:00
00fed8a0fb Remove content-fetch-gcf and create a Dockerfile for the cloud function 2022-11-16 10:15:49 +08:00
623bb8780c Call puppeteer module from content-fetch 2022-11-16 10:15:49 +08:00
cb858484c6 Make puppeteer-parse a module 2022-11-16 10:15:49 +08:00
b5926ccf1c Get old tweet thread with puppeteer and new tweet with twitter api 2022-10-26 20:41:51 +08:00
bc9b50c3cb Remove dockerfile-local 2022-10-06 12:57:30 +08:00
d6e465d482 Add Dockerfile for pdfHandler 2022-10-04 15:28:12 +08:00
53d6afe25f Fix tests 2022-10-04 10:47:58 +08:00
9cae703666 Fix Dockerfile 2022-10-04 10:20:13 +08:00
4b01fccad8 Fix content-fetch dockerfile 2022-10-03 14:21:31 +08:00
a9607adfd3 Import content-handler as local dependency 2022-10-03 11:11:24 +08:00
99956539a0 Handle newsletter in content-handlers 2022-09-30 12:51:22 +08:00
206d795c54 Import content-handler in puppeteer 2022-09-30 12:51:22 +08:00
8c61832c77 Import content-handler in content-fetch 2022-09-30 12:51:22 +08:00
cb609d893e Escape HTML entities in puppeteer-parse 2022-09-23 16:40:32 +08:00
aef83ee958 Escape HTML entities in Twitter title and description 2022-09-23 16:33:57 +08:00
7656b37e1b Escape youtube title and author name 2022-09-23 16:16:25 +08:00
e52013ccb1 Disabling the puppeteer timeout by setting it to 0 seems to cause issues,
so set the timeout to 2 minutes, which should be enough and works in my local env
2022-08-26 11:54:33 +08:00
d12f3642e6 Bump puppeteer-core from 15.3.2 to 16.1.0
Bumps [puppeteer-core](https://github.com/puppeteer/puppeteer) from 15.3.2 to 16.1.0.
- [Release notes](https://github.com/puppeteer/puppeteer/releases)
- [Changelog](https://github.com/puppeteer/puppeteer/blob/main/CHANGELOG.md)
- [Commits](https://github.com/puppeteer/puppeteer/compare/v15.3.2...v16.1.0)

---
updated-dependencies:
- dependency-name: puppeteer-core
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-08-11 02:15:16 +00:00
f17ee64676 Use ScrapingBee for some hosts 2022-07-16 14:09:45 -07:00
11f20ab64a Revert "close browser when request finished"
This reverts commit 7e68ad5237.
2022-07-15 15:35:07 +08:00
7e68ad5237 close browser when request finished 2022-07-15 15:23:02 +08:00
b2238ce7f2 revert no-sandbox 2022-07-15 14:43:17 +08:00
ed09d78980 remove no-sandbox 2022-07-15 14:32:02 +08:00
d9bb664fc0 remove puppeteer dependency in docker 2022-07-15 14:15:31 +08:00
9191f5710c remove single-process arg 2022-07-15 14:04:41 +08:00
4929bae81b close context if encounter error 2022-07-15 11:49:36 +08:00
610c790a7e do not use puppeteer-extra plugin 2022-07-15 11:09:57 +08:00
bb7ea78e8f Bump puppeteer-core from 13.7.0 to 15.3.2
Bumps [puppeteer-core](https://github.com/puppeteer/puppeteer) from 13.7.0 to 15.3.2.
- [Release notes](https://github.com/puppeteer/puppeteer/releases)
- [Changelog](https://github.com/puppeteer/puppeteer/blob/main/CHANGELOG.md)
- [Commits](https://github.com/puppeteer/puppeteer/compare/v13.7.0...v15.3.2)

---
updated-dependencies:
- dependency-name: puppeteer-core
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-07-11 21:35:51 +00:00
c94d5db259 Merge pull request #889 from omnivore-app/dependabot/npm_and_yarn/axios-0.27.2
Bump axios from 0.26.0 to 0.27.2
2022-07-08 13:48:27 -07:00
01353add63 Shorten the timeout requesting pages
I believe our process is sometimes being terminated before this
timeout is hit, which means we then don't have time to fetch
with a fallback.
2022-07-05 11:16:11 -07:00
9554f8f6ba Create a scrapingbee url when using the fallback
JavaScript hoists variables to the top of scope, so `url` here
refers to the `url` variable defined lower in the block.
2022-07-05 08:41:34 -07:00
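
The hoisting behavior described in commit 9554f8f6ba can be sketched as follows. This is only an illustration, assuming the original declaration used `var`; the function and variable names are hypothetical and are not taken from the repository.

```typescript
// Hypothetical sketch of `var` hoisting; not the repository's actual code.
function hoistingDemo(): { outer: string; before: string; after: string } {
  const url = "https://example.com/outer";

  function inner(): { before: string; after: string } {
    // The `var url` declared below is hoisted to the top of inner(),
    // so this line reads the inner, still-unassigned variable
    // instead of the outer `url`.
    const before = String(url);
    var url: string | undefined = "https://example.com/inner";
    return { before, after: String(url) };
  }

  return { outer: url, ...inner() };
}

const result = hoistingDemo();
console.log(result.before); // the inner variable has no value yet at that point
console.log(result.after);
```

Had the inner declaration used `let` or `const`, the early read would throw a `ReferenceError` instead of silently yielding `undefined`, which is one reason block-scoped declarations make this kind of bug easier to spot.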