|
|
f17ee64676
|
Use ScrapingBee for some hosts
|
2022-07-16 14:09:45 -07:00 |
|
|
|
11f20ab64a
|
Revert "close browser when request finished"
This reverts commit 7e68ad5237.
|
2022-07-15 15:35:07 +08:00 |
|
|
|
7e68ad5237
|
close browser when request finished
|
2022-07-15 15:23:02 +08:00 |
|
|
|
b2238ce7f2
|
revert no-sandbox
|
2022-07-15 14:43:17 +08:00 |
|
|
|
ed09d78980
|
remove no-sandbox
|
2022-07-15 14:32:02 +08:00 |
|
|
|
d9bb664fc0
|
remove puppeteer dependency in docker
|
2022-07-15 14:15:31 +08:00 |
|
|
|
9191f5710c
|
remove single-process arg
|
2022-07-15 14:04:41 +08:00 |
|
|
|
4929bae81b
|
close context if encounter error
|
2022-07-15 11:49:36 +08:00 |
|
|
|
610c790a7e
|
do not use puppeteer-extra plugin
|
2022-07-15 11:09:57 +08:00 |
|
|
|
bb7ea78e8f
|
Bump puppeteer-core from 13.7.0 to 15.3.2
Bumps [puppeteer-core](https://github.com/puppeteer/puppeteer) from 13.7.0 to 15.3.2.
- [Release notes](https://github.com/puppeteer/puppeteer/releases)
- [Changelog](https://github.com/puppeteer/puppeteer/blob/main/CHANGELOG.md)
- [Commits](https://github.com/puppeteer/puppeteer/compare/v13.7.0...v15.3.2)
---
updated-dependencies:
- dependency-name: puppeteer-core
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
|
2022-07-11 21:35:51 +00:00 |
|
|
|
c94d5db259
|
Merge pull request #889 from omnivore-app/dependabot/npm_and_yarn/axios-0.27.2
Bump axios from 0.26.0 to 0.27.2
|
2022-07-08 13:48:27 -07:00 |
|
|
|
01353add63
|
Shorten the timeout requesting pages
I believe our process is sometimes being terminated before this
timeout is hit, which means we then don't have time to fetch
with a fallback.
|
2022-07-05 11:16:11 -07:00 |
|
|
|
9554f8f6ba
|
Create a scrapingbee url when using the fallback
JavaScript hoists variables to the top of scope, so `url` here
refers to the `url` variable defined lower in the block.
|
2022-07-05 08:41:34 -07:00 |
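The hoisting behaviour described in commit 9554f8f6ba can be illustrated with a minimal sketch. The function names and URLs below are hypothetical and are not taken from the content-fetch source; the sketch only shows how a declaration lower in a block captures an earlier reference to the same name, assuming the original bug had this general shape.

```javascript
// Illustrative only: names and URLs are hypothetical, not from content-fetch.
const url = 'https://example.com/outer';

function withVar() {
  // `var url` below is hoisted to the top of the function, so this line reads
  // the inner, not-yet-assigned binding instead of the outer `url`.
  const seen = url;
  var url = 'https://example.com/inner';
  return seen;
}

function withConst() {
  let seen;
  try {
    // `const url` below is also scoped to the whole function body, but it is
    // uninitialized until its declaration runs, so reading it here throws.
    seen = url;
  } catch (err) {
    seen = err.name;
  }
  const url = 'https://example.com/inner';
  return seen;
}

console.log(withVar());
console.log(withConst());
```

In both cases the outer `url` is never consulted, which is why building the fallback URL under a separate name (or before the inner declaration's scope begins) avoids the problem.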
|
|
|
3a79710dbf
|
Always fall back to scrapingbee if there is an exception
|
2022-07-05 21:48:58 +08:00 |
|
|
|
37075f076e
|
Remove userDataDir
|
2022-06-29 22:56:14 +08:00 |
|
|
|
e91f25e58c
|
Bump axios from 0.26.0 to 0.27.2
Bumps [axios](https://github.com/axios/axios) from 0.26.0 to 0.27.2.
- [Release notes](https://github.com/axios/axios/releases)
- [Changelog](https://github.com/axios/axios/blob/v0.27.2/CHANGELOG.md)
- [Commits](https://github.com/axios/axios/compare/v0.26.0...v0.27.2)
---
updated-dependencies:
- dependency-name: axios
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
|
2022-06-27 21:31:14 +00:00 |
|
|
|
feb197c731
|
Fix a crash when parsing content fetches that are blocked
|
2022-06-22 14:55:37 -07:00 |
|
|
|
a9b3a5c925
|
Merge pull request #805 from omnivore-app/fix/duplicate-content
Remove duplicate content
|
2022-06-19 10:56:10 +08:00 |
|
|
|
1d99bfaa10
|
Use a different Dockerfile for content-fetch with App Engine and docker-compose
|
2022-06-17 17:12:33 -07:00 |
|
|
|
ddaac82653
|
Fix content-fetch on docker compose
|
2022-06-17 14:59:42 -07:00 |
|
|
|
58814e1854
|
Run the content-fetch service in docker compose
|
2022-06-17 14:19:06 -07:00 |
|
|
|
71f8834477
|
Fix detection of medium subdomains
|
2022-06-17 09:25:42 -07:00 |
|
|
|
486f22a594
|
remove redundant async
|
2022-06-15 22:31:55 +08:00 |
|
|
|
2aafd39650
|
add fastcompany.com to the non-script hosts list
|
2022-06-15 22:27:25 +08:00 |
|
|
|
e028e2e440
|
generate test page for fast company
|
2022-06-15 21:22:25 +08:00 |
|
|
|
486f3c930b
|
Remove PROXY_URL from content-fetch
|
2022-06-14 20:30:02 -07:00 |
|
|
|
ec5bbb8350
|
Return URL as a string
|
2022-06-14 16:23:07 -07:00 |
|
|
|
be2801477b
|
Add some extra debugging
|
2022-06-14 16:13:43 -07:00 |
|
|
|
159a7f8950
|
Fall back to scrapingbee if a page can't fetch content
|
2022-06-14 16:06:01 -07:00 |
|
|
|
814f6098a3
|
Log proxy url
|
2022-06-14 14:27:19 -07:00 |
|
|
|
a4ad78652a
|
Allow specifying a proxy url when launching puppeteer
|
2022-06-14 13:30:32 -07:00 |
|
|
|
b94215f1fc
|
Allow selectively disabling javascript on some hosts
Some hosts' readability is improved by disabling JavaScript
|
2022-06-10 13:25:14 -07:00 |
|
|
|
cb98a9cf86
|
Make clients opt into creating a page when uploading a file
|
2022-05-26 21:40:40 -07:00 |
|
|
|
0cc7e84a82
|
Fix content not getting parsed by linkedom properly without <html> tag by replacing innerHtml with outerHtml
|
2022-05-18 15:52:16 +08:00 |
|
|
|
8f0447ed3f
|
Stop blocking images and css file
|
2022-05-18 15:50:52 +08:00 |
|
|
|
629aa54c58
|
Fix youtube handler
|
2022-05-18 11:28:33 +08:00 |
|
|
|
ca662964e6
|
Fix not getting youtube video id from url
|
2022-05-17 21:51:03 +08:00 |
|
|
|
745f55a843
|
Set headless=true
|
2022-05-14 10:47:15 +08:00 |
|
|
|
80c14cd6ca
|
Remove single-process from chromium args
|
2022-05-14 10:37:06 +08:00 |
|
|
|
7bfb8cfee4
|
Merge pull request #597 from omnivore-app/remove-chrome-aws-lambda
Optimize puppeteer and remove chrome-aws-lambda dependencies
|
2022-05-13 16:12:24 -07:00 |
|
|
|
87b11277d1
|
Add token verification to the content-fetch service
|
2022-05-13 15:14:09 -07:00 |
|
|
|
6f09a4b31a
|
Fix missing variable name in medium handler
|
2022-05-13 17:47:21 +08:00 |
|
|
|
f5003c1370
|
Stop blocking script
|
2022-05-13 12:17:19 +08:00 |
|
|
|
37e55add98
|
Stop blocking stylesheet and media
|
2022-05-13 12:09:05 +08:00 |
|
|
|
ad99f933e5
|
Fix tests cont
|
2022-05-12 17:53:28 +08:00 |
|
|
|
60bbbb6cf3
|
Block requests to 'font', 'image', 'stylesheet', 'script', 'media' in puppeteer
|
2022-05-12 17:10:38 +08:00 |
|
|
|
b766e17189
|
Remove jsdom in content-fetch
|
2022-05-12 16:48:59 +08:00 |
|
|
|
9606cd6b28
|
Remove chrome-aws-lambda dependencies
|
2022-05-12 16:32:22 +08:00 |
|
|
|
e1e0ddf7fc
|
Merge pull request #582 from omnivore-app/optimize-parsing
Optimize parsing
|
2022-05-12 11:07:52 +08:00 |
|
|
|
0984dca183
|
Remove adblocker and block resources by url and also block mathJax script
|
2022-05-11 22:04:47 +08:00 |
|