Commit Graph

35 Commits

Author SHA1 Message Date
0e523d8c73 upload readable content before exporting to cache the content 2024-08-27 15:33:04 +08:00
48b3f736f0 wait for write stream to finish 2024-08-27 14:57:31 +08:00
444c78f0cb use async job to handle exporter 2024-08-27 14:19:02 +08:00
8cfa24a847 allow downloading/uploading readable content 2024-05-15 21:46:22 +08:00
950b42899d fix mock cloud storage 2024-05-15 15:53:42 +08:00
9dee510be1 fix rss 2024-05-14 20:18:18 +08:00
dbd7b7932f cont 2024-05-14 17:23:56 +08:00
eddf9206d0 do not store original content in db 2024-05-14 17:15:51 +08:00
7a0b2f3d33 upload file only if it does not exist 2024-05-14 17:14:41 +08:00
9286174ec7 upload and download original content from GCS 2024-05-14 17:14:40 +08:00
6bb81dd5c3 skip uploading if file already exists 2024-05-11 11:30:41 +08:00
0f184c4c21 add get content api 2024-05-10 15:43:15 +08:00
01ebcbb16b add bulk upload original content job 2024-05-10 14:37:05 +08:00
4220922ff8 Preserve original URL if possible 2023-10-18 10:23:26 +08:00
25c0051fd4 Only use signed URLs for PDF attachments 2023-10-18 10:14:35 +08:00
efdbc4f345 use interpolated string or dictionary parameter in logs 2023-10-05 14:32:03 +08:00
4ff5484d8e change item_type to text 2023-10-05 14:31:06 +08:00
1bade22076 fix file uploading 2023-10-05 14:28:52 +08:00
0609043956 reduce retry 2023-08-17 17:59:24 +08:00
792cf0b207 upgrade google cloud sdk 2023-08-11 13:55:29 +08:00
99a52f8d56 replace all the console logs with logger logs 2023-07-27 16:06:44 +08:00
e53cc28683 Fix issue where we return EPUB content reader for items marked as book in metadata
If we parse a page with <meta property="og:type" content="book"> in
it now, we incorrectly set this to ContentReader.EPUB, which then
causes syncing issues on iOS which doesn't have an EPUB reader
type defined in its GraphQL schema yet.
2023-06-05 14:44:51 +08:00
69df32d428 Implement a reader for epubs 2023-04-18 09:00:48 +08:00
8148453503 When saving an existing page, update the content 2023-03-21 13:44:00 +08:00
9314c3d8f1 Add uploadImportFile API method
Fix prefix, counting max files uploaded

Add resolver types

Basic web ui for the uploader interface

Allow selecting type when uploading import files
2023-01-03 10:01:59 +08:00
77570aa5ab Synthesize text to speech with azure API 2022-08-18 19:24:36 +08:00
0419472c2e Upload audio file with public access right 2022-08-18 19:23:41 +08:00
05ea57f76f Linting 2022-06-09 12:08:42 -07:00
e23eada168 Use the public GCS URL for local uploaded files 2022-06-09 11:59:01 -07:00
b56c6dafa9 Disable sorting options that need API improvements 2022-06-03 12:47:43 -07:00
1893e36375 Fix uploading endpoint returns 504 (#289)
* add timeout = 60s for uploading to private bucket

* add debug logs

* send response to client to close connection

* add tests for upload

* add PUBSUB_VERIFICATION_TOKEN in .env.test
2022-03-22 14:20:40 +08:00
27157006c1 use private bucket to upload page events (#244)
* use private bucket to upload page events

* fix tests

* add GCS_UPLOAD_PRIVATE_BUCKET in test env

* allow GCS_UPLOAD_PRIVATE_BUCKET to be empty
2022-03-16 14:39:07 +08:00
e652a6ea8c Rebased version of the elastic PR (#225)
* Add elastic to our docker compose

* add AND/OR/NOT search operations

* add elastic and create article in elastic

* change error code when elastic throws error

* add search pages in elastic

* add search by labels

* Add elastic to GitHub Action

* Update elastic version

* Fix port for elastic

* add url in search query

* Set elastic features when running tests

* add debug logs

* Use localhost instead of service hostname

* refresh elastic after create/update

* update search labels query

* add typescript support

* search pages in elastic

* fix search queries

* use elastic for saving page

* fix test failure

* update getArticle api to use elastic

* use generic get page function

* add elastic migration python script

* fix bulk helper param

* save elastic page id in article_saving_request instead of postgres article_id

* fix page archiving and deleting

* add tests for deleteArticle

* remove custom date type in elastic mappings, which does not exist in older versions of elastic

* fix timestamp format issue

* add tests for save reading progress

* add tests for save file

* optimize search results

* add alias to index

* update migration script to receive env var as params

* Add failing test to validate we don't decrease reading progress

This test is failing with Elastic because we aren't fetching
the reading progress from elastic here, and are fetching it
from postgres.

* Rename readingProgress to readingProgressPercent

This is the name stored in elastic, so fixes issues pulling the
value out.

* Linting

* Add failing test for creating highlights w/elastic

This test fails because the highlight can't be looked up. Is there
a different ID we should be passing in to query for highlights,
or do we need to update the query to look for elastic_id?

* add tests code coverage threshold

* update nyc config

* include more files in test coverage

* change alias name

* update updateContent to update pages in elastic

* remove debug log

* fix createhighlight test

* search pages by alias in elastic

* update set labels and delete labels in elastic

* migration script enumeration

* make BULK_SIZE an env var

* fix pdf search indexing

* debug github action exit issue

* call pubsub when create/update/delete page in elastic

* fix json parsing bug and reduce reading data from file

* replace a deprecated pubsub api call

* debug github action exit issue

* debug github action exit issue

* add handler to upload elastic page data to GCS

* fix tests

* Use http_auth instead of basic_auth

* add index creation and existing postgres tables update in migration script

* fix a typo to connect to elastic

* rename readingProgress to readingProgressPercent

* migrate elastic_page_id in highlights and article_saving_request tables

* update migration script to include number of updated rows

* update db migration query

* read index mappings from file

* fix upload pages to gcs

* fix tests failure due to pageContext

* fix upload file id not exist error

* Handle savedAt & isArchived attributes w/out querying elastic

* Fix prettier issues

* fix content-type mismatching

* revert pageId to linkId because frontend was not deployed yet

* fix newsletters and attachment not saved in elastic

* put linkId in article for setting labels

* exclude originalHtml in the result of searching to improve performance

* exclude content in the result of searching to improve performance

* remove score sorting

* do not refresh immediately to reduce searching and indexing time

* do not replace the backup data in gcs

* fix no article id defined in articleSavingRequest

* add logging of elastic api running time

* reduce home feed pagination size to 15

* reduce home feed pagination size to 10

* stop revalidating first page

* do not use a separate api to fetch reading progress

* Remove unused comment

* get reading progress if not exists

* replace ngram tokenizer with standard tokenizer

* fix tests

* remove .env.local

* add sort keyword in searching to sort by score

Co-authored-by: Hongbo Wu <hongbo@omnivore.app>
2022-03-16 12:08:59 +08:00
7229c64da0 fix: Should not get fake uploading url in demo (#79)
* fix not getting fake uploading url in demo

* use internal service endpoint env

* update internal svc endpoint
2022-02-17 13:34:50 +08:00
84f32935f5 Open source omnivore 2022-02-11 09:24:33 -08:00