Commit Graph

156 Commits

Author SHA1 Message Date
1cdbfaca26 Merge pull request #294 from omnivore-app/feature/add-created-or-updated_at-to-newsletter_email
add created_at and updated_at to newsletter emails and sort by create…
2022-03-25 09:10:54 -07:00
5036924b21 Merge pull request #280 from omnivore-app/dependabot/npm_and_yarn/jsdom-19.0.0
Bump jsdom from 16.7.0 to 19.0.0
2022-03-25 09:08:55 -07:00
27735f7310 add page id in updated page json file in gcs (#319) 2022-03-25 13:50:43 +08:00
d595d4a8c4 add success and failure count in assertions (#318) 2022-03-25 12:16:43 +08:00
e7203bebb5 add a function_resolver to return originalArticleUrl from article.url (#317)
* add a function_resolver to return originalArticleUrl from article.url

* add a test
2022-03-25 11:25:30 +08:00
0264bc6733 Merge pull request #283 from omnivore-app/dependabot/npm_and_yarn/sentry/integrations-6.19.1
Bump @sentry/integrations from 5.30.0 to 6.19.1
2022-03-23 20:05:09 -07:00
3ec6f22199 Bump jsdom from 16.7.0 to 19.0.0
Bumps [jsdom](https://github.com/jsdom/jsdom) from 16.7.0 to 19.0.0.
- [Release notes](https://github.com/jsdom/jsdom/releases)
- [Changelog](https://github.com/jsdom/jsdom/blob/master/Changelog.md)
- [Commits](https://github.com/jsdom/jsdom/compare/16.7.0...19.0.0)

---
updated-dependencies:
- dependency-name: jsdom
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-03-23 18:51:08 +00:00
79a7eb11a3 fix data loss by selecting data from links instead of pages 2022-03-23 17:27:55 +08:00
8a485af20d validate publishedAt before inserting to elastic 2022-03-23 14:30:33 +08:00
e1752bb176 uncomment testing code 2022-03-23 12:37:33 +08:00
1b0399f286 add assertion of migrated data in python script 2022-03-23 12:09:25 +08:00
6597cc37b6 add created_at and updated_at to newsletter emails and sort by created_at desc 2022-03-22 18:43:47 +08:00
7d19c933ac add elastic in README 2022-03-22 15:33:51 +08:00
1893e36375 Fix uploading endpoint returns 504 (#289)
* add timeout = 60s for uploading to private bucket

* add debug logs

* send response to client to close connection

* add tests for upload

* add PUBSUB_VERIFICATION_TOKEN in .env.test
2022-03-22 14:20:40 +08:00
3e9063c145 return empty content instead of null in search result (#291) 2022-03-22 13:34:44 +08:00
bee3b3c6fa fix article_id type mismatch (uuid => varchar) when saving highlights in postgres by setting article_id to be undefined (#285) 2022-03-22 11:48:01 +08:00
4ca7c89622 format date string as yyyy-mm-dd (#288) 2022-03-22 11:46:39 +08:00
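
As an illustration of the yyyy-mm-dd formatting this commit mentions, here is a minimal sketch. The helper name `formatDateYmd` is hypothetical, and the choice of UTC (rather than local time) is an assumption; the log does not say which the repository uses.

```typescript
// Hypothetical helper, not taken from the repository.
// Assumption: the date is rendered in UTC.
function formatDateYmd(date: Date): string {
  // toISOString() returns UTC in the form YYYY-MM-DDTHH:mm:ss.sssZ,
  // so the first ten characters are the zero-padded calendar date.
  return date.toISOString().slice(0, 10);
}

console.log(formatDateYmd(new Date(Date.UTC(2022, 2, 22, 11, 46, 39))));
```

Slicing the ISO string avoids manual zero-padding of month and day.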
c15efe6aad Fix sentry log size error (#287)
* reduce sentry log size error by partially logging page created in elastic

* remove some debugging logs

* remove more debugging logs
2022-03-22 11:45:58 +08:00
783118a2e4 Bump @sentry/integrations from 5.30.0 to 6.19.1
Bumps [@sentry/integrations](https://github.com/getsentry/sentry-javascript) from 5.30.0 to 6.19.1.
- [Release notes](https://github.com/getsentry/sentry-javascript/releases)
- [Changelog](https://github.com/getsentry/sentry-javascript/blob/master/CHANGELOG.md)
- [Commits](https://github.com/getsentry/sentry-javascript/compare/5.30.0...6.19.1)

---
updated-dependencies:
- dependency-name: "@sentry/integrations"
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-03-21 21:19:05 +00:00
f9bce1932c Print the logging level on startup 2022-03-21 09:31:05 -07:00
77050905a5 retry max three times if update labels is conflicted 2022-03-18 10:39:11 +08:00
9d8d2e40d7 reduce time to save reading progress in elastic update 2022-03-18 10:29:01 +08:00
ac567f7e71 retry max three times if update is conflicted 2022-03-18 10:17:20 +08:00
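
The bounded retry described in this commit and in 77050905a5 could look roughly like the sketch below. The names `isConflict` and `retryOnConflict` are hypothetical. Two assumptions are made: that "max three times" means three attempts in total, and that a conflict surfaces as an error object whose `statusCode` is 409 (HTTP 409 Conflict). The log does not show how the repository detects conflicts.

```typescript
// Assumption: a conflicting update is reported as an error object with
// statusCode 409. The repository's real check is not shown in the log.
function isConflict(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { statusCode?: number }).statusCode === 409
  );
}

// Runs `update` until it succeeds, retrying only on conflicts and
// giving up after `maxAttempts` attempts (three by default).
async function retryOnConflict<T>(
  update: () => Promise<T>,
  maxAttempts = 3
): Promise<T> {
  let lastError: unknown = new Error("maxAttempts must be at least 1");
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await update();
    } catch (err) {
      if (!isConflict(err)) throw err; // unrelated failures are not retried
      lastError = err;
    }
  }
  // Every attempt ended in a conflict; surface the last one to the caller.
  throw lastError;
}
```

Capping the attempts keeps a persistently conflicting document from blocking the request indefinitely, while other errors propagate immediately.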
7910565dbf Merge pull request #251 from omnivore-app/fix/jsonld-encoding
Use html decoding when getting values from fetched oembed
2022-03-17 16:53:21 -07:00
563131ea23 Fix/newsletter label not add (#256)
* default sort by savedAt so recently read pages do not appear at the top of the list

* revert lint on generated code

* fix elastic update script syntax error
2022-03-17 13:22:38 +08:00
a2ce98229e default sort by savedAt to avoid recent read page to appear on top of… (#254)
* default sort by savedAt so recently read pages do not appear at the top of the list

* revert lint on generated code
2022-03-17 10:54:00 +08:00
af037a2837 make readingProgress required in the elastic page data (#253)
* make readingProgress required in the elastic page data

* delete readingProgress from function_resolvers because we have stored them in elastic
2022-03-17 10:06:21 +08:00
ff1200f3a1 Use html decoding when getting values from fetched oembed
If we fetch oembed data from an external source, instead of
handling it in readabilityjs we need to html decode it.
2022-03-16 15:29:42 -07:00
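
To illustrate what HTML-decoding a fetched oembed value involves, here is a minimal sketch. It is not the repository's implementation: `decodeHtmlEntities` is a hypothetical name, and the table covers only a handful of named entities, whereas a real decoder (typically a library) handles the full HTML entity set.

```typescript
// Small, deliberately incomplete table of named entities (assumption:
// enough for illustration; real oembed data may contain others).
const NAMED_ENTITIES = new Map<string, string>([
  ["amp", "&"],
  ["lt", "<"],
  ["gt", ">"],
  ["quot", '"'],
  ["apos", "'"],
  ["nbsp", "\u00a0"],
]);

// Decodes named, decimal (&#39;) and hexadecimal (&#x27;) references
// in a single pass. Unknown or out-of-range references are left as-is.
function decodeHtmlEntities(input: string): string {
  return input.replace(
    /&(#x[0-9a-fA-F]+|#[0-9]+|[a-zA-Z]+);/g,
    (match: string, body: string): string => {
      if (body.startsWith("#")) {
        const isHex = body.startsWith("#x");
        const code = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
        // String.fromCodePoint throws above U+10FFFF, so guard first.
        return code <= 0x10ffff ? String.fromCodePoint(code) : match;
      }
      return NAMED_ENTITIES.get(body) ?? match;
    }
  );
}

console.log(decodeHtmlEntities("Tom &amp; Jerry&#39;s &quot;show&quot;"));
```

Because the replacement runs in one pass, a value such as `&amp;lt;` decodes to `&lt;` rather than being decoded twice.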
89f3719ba3 Fix linting 2022-03-16 14:23:39 -07:00
0314d8cf17 Add some extra logging 2022-03-16 13:53:55 -07:00
dc27c0c7a4 Return 0 if looking up page reading progress fails 2022-03-16 13:33:36 -07:00
27143cbedf Remove function handlers for reading progress
These shouldn't be needed anymore because the values are stored
in elastic.
2022-03-16 13:11:36 -07:00
88103cba9f Merge pull request #162 from omnivore-app/dependabot/npm_and_yarn/apollo-datasource-3.3.1
Bump apollo-datasource from 0.7.3 to 3.3.1
2022-03-16 11:03:26 -07:00
07dcd5da26 add label to page only if not exists 2022-03-16 23:00:37 +08:00
1c4dcd7b00 automatically update updatedAt when page is updated in elastic 2022-03-16 22:28:39 +08:00
f412758040 add Newsletter label to the page created by newsletters email 2022-03-16 18:40:05 +08:00
27157006c1 use private bucket to upload page events (#244)
* use private bucket to upload page events

* fix tests

* add GCS_UPLOAD_PRIVATE_BUCKET in test env

* allow GCS_UPLOAD_PRIVATE_BUCKET to be empty
2022-03-16 14:39:07 +08:00
e652a6ea8c Rebased version of the elastic PR (#225)
* Add elastic to our docker compose

* add AND/OR/NOT search operations

* add elastic and create article in elastic

* change error code when elastic throws error

* add search pages in elastic

* add search by labels

* Add elastic to GitHub Action

* Update elastic version

* Fix port for elastic

* add url in search query

* Set elastic features when running tests

* add debug logs

* Use localhost instead of service hostname

* refresh elastic after create/update

* update search labels query

* add typescript support

* search pages in elastic

* fix search queries

* use elastic for saving page

* fix test failure

* update getArticle api to use elastic

* use generic get page function

* add elastic migration python script

* fix bulk helper param

* save elastic page id in article_saving_request instead of postgres article_id

* fix page archiving and deleting

* add tests for deleteArticle

* remove custom date type in elastic mappings, which does not exist in older versions of elastic

* fix timestamp format issue

* add tests for save reading progress

* add tests for save file

* optimize search results

* add alias to index

* update migration script to receive env var as params

* Add failing test to validate we don't decrease reading progress

This test is failing with Elastic because we aren't fetching
the reading progress from elastic here, and are fetching it
from postgres.

* Rename readingProgress to readingProgressPercent

This is the name stored in elastic, so fixes issues pulling the
value out.

* Linting

* Add failing test for creating highlights w/elastic

This test fails because the highlight can't be looked up. Is there
a different ID we should be passing in to query for highlights,
or do we need to update the query to look for elastic_id?

* add tests code coverage threshold

* update nyc config

* include more files in test coverage

* change alias name

* update updateContent to update pages in elastic

* remove debug log

* fix createhighlight test

* search pages by alias in elastic

* update set labels and delete labels in elastic

* migration script enumeration

* make BULK_SIZE an env var

* fix pdf search indexing

* debug github action exit issue

* call pubsub when create/update/delete page in elastic

* fix json parsing bug and reduce reading data from file

* replace a deprecated pubsub api call

* debug github action exit issue

* debug github action exit issue

* add handler to upload elastic page data to GCS

* fix tests

* Use http_auth instead of basic_auth

* add index creation and existing postgres tables update in migration script

* fix a typo to connect to elastic

* rename readingProgress to readingProgressPercent

* migrate elastic_page_id in highlights and article_saving_request tables

* update migration script to include number of updated rows

* update db migration query

* read index mappings from file

* fix upload pages to gcs

* fix tests failure due to pageContext

* fix upload file id not exist error

* Handle savedAt & isArchived attributes w/out querying elastic

* Fix prettier issues

* fix content-type mismatching

* revert pageId to linkId because frontend was not deployed yet

* fix newsletters and attachment not saved in elastic

* put linkId in article for setting labels

* exclude originalHtml from search results to improve performance

* exclude content from search results to improve performance

* remove score sorting

* do not refresh immediately to reduce searching and indexing time

* do not replace the backup data in gcs

* fix no article id defined in articleSavingRequest

* add logging of elastic api running time

* reduce home feed pagination size to 15

* reduce home feed pagination size to 10

* stop revalidating first page

* do not use a separate api to fetch reading progress

* Remove unused comment

* get reading progress if not exists

* replace ngram tokenizer with standard tokenizer

* fix tests

* remove .env.local

* add sort keyword in searching to sort by score

Co-authored-by: Hongbo Wu <hongbo@omnivore.app>
2022-03-16 12:08:59 +08:00
a4533dc016 Merge pull request #201 from omnivore-app/feature/beehiiv-newsletter-support
Support newsletters hosted on beehiiv
2022-03-15 14:03:55 -07:00
8e1b4fb1a4 Formatting 2022-03-14 15:36:17 -07:00
a81181ee60 Don't make queries for readingProgressPercent unless we have to 2022-03-14 15:22:53 -07:00
fc7d972855 Fix typo in readability date handling causing this parse issue
Can remove our special handler for the published date now that we
are pulling it out correctly.
2022-03-14 10:20:19 -07:00
5983cfe2a6 Attempt to set publishedDate if readability fails to parse it
This can happen if JSONLD fails to load. The test page here has
an encoding issue that causes the oembed jsonld to fail to load
and then readability fails to parse the date.
2022-03-13 21:23:51 -07:00
78660c886d rm debug 2022-03-13 09:06:15 -07:00
a874482d11 Don't perform an extra query for isArchived 2022-03-13 09:00:25 -07:00
bea7d084c4 SetClaims when creating an email article 2022-03-09 19:45:52 -08:00
f7814a0c4a Remove unused function 2022-03-09 19:45:31 -08:00
b0fe9059a9 Don't try to generate highlight URL previews until share is re-enabled 2022-03-09 09:49:03 -08:00
a5a42a36e7 Merge pull request #206 from omnivore-app/dependabot/npm_and_yarn/opentelemetry/exporter-jaeger-1.0.1
Bump @opentelemetry/exporter-jaeger from 0.24.0 to 1.0.1
2022-03-08 21:00:53 -08:00
9bda6ede06 Bump @opentelemetry/exporter-jaeger from 0.24.0 to 1.0.1
Bumps [@opentelemetry/exporter-jaeger](https://github.com/open-telemetry/opentelemetry-js) from 0.24.0 to 1.0.1.
- [Release notes](https://github.com/open-telemetry/opentelemetry-js/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-js/blob/main/CHANGELOG.md)
- [Commits](https://github.com/open-telemetry/opentelemetry-js/compare/v0.24.0...stable/v1.0.1)

---
updated-dependencies:
- dependency-name: "@opentelemetry/exporter-jaeger"
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-03-09 04:45:18 +00:00