Database management
This workspace is used for database schema definitions and migrations.
The project currently uses PostgreSQL 11. Make sure you use the correct version.
Migrations usage
❗ Migrations that have already been performed are locked and must not be changed. For every schema alteration, generate a new migration and make the changes there.
Never commit changes to migration files that have already been performed elsewhere; doing so will break the migrations.
To migrate to latest version: yarn migrate
To migrate up/down to specific version: yarn migrate <version>
To migrate down to empty state: yarn migrate 0000
To generate new migration: yarn generate
* At the moment, version numbers are expected to be padded to 4 digits
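Because versions are expected to be zero-padded, a small shell helper can do the padding before calling the migrate script. This is only a sketch: pad_version is a hypothetical helper, not part of this workspace.

```shell
# Hypothetical helper (not part of this workspace): zero-pad a migration
# version to 4 digits, e.g. 7 -> 0007.
pad_version() {
  # Strip leading zeros first so printf does not read the value as octal
  n=$(printf '%s' "$1" | sed 's/^0*//')
  printf '%04d\n' "${n:-0}"
}

# Usage: migrate up/down to version 12
# yarn migrate "$(pad_version 12)"
```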
Policies and roles
We use Row Level Security when accessing the database from the application. In order to create a correct schema for new tables please study migration files for the schemas of previous tables (including possible later changes).
In order to use Row Level Security, every transaction with the database must set the correct role via the omnivore.set_claims function.
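As a rough sketch, setting claims at the start of a transaction could look like the following from psql. The argument list of omnivore.set_claims used here (a user ID followed by a role name) is an assumption, so check the migration that defines the function before relying on it. claims_sql is a hypothetical helper that only prints the SQL.

```shell
# Hypothetical helper: print a transaction that sets claims before querying.
# ASSUMPTION: omnivore.set_claims takes (user id, role name); verify this
# against the migration that defines the function.
claims_sql() {
  cat <<'SQL'
BEGIN;
SELECT omnivore.set_claims(:'uid', :'role');
-- queries placed here run with the claims set above
COMMIT;
SQL
}

# Usage (psql substitutes :'uid' and :'role' from the -v variables):
# claims_sql | psql -U app_user -d omnivore -v uid="<user-id>" -v role=omnivore_user
```

Passing the values as psql variables rather than interpolating them in the shell lets psql handle the quoting.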
Current roles
omnivore_user - a user role that is intended for a regular user to access the data. Currently, this is the primary user of the database.
Database users on GCP
postgres - administrator of the database, used for migrations.
app_user - a user that the app uses to log in to the database.
❗ Do not issue any grants to app_user beyond those needed to assume a given internal role, i.e. GRANT omnivore_user TO app_user
Installing and Using locally
- Install and run the postgresql service on your machine.
### Configure access to the database
On some systems, Postgres only allows the local postgres user to connect to the database. One way around this is to
set the authentication method in your pg_hba.conf file. The default is peer, which means the user must be logged
into the system as the postgres user. The trust method allows any user who can connect to the database to make
changes. Another option is md5 or password, which allows access as the postgres user with a password instead of
needing to be logged in locally as postgres. For simplicity, set the method to trust. You
will need to restart the postgresql service after making this change.
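A minimal sketch of that edit, assuming a Debian-style pg_hba.conf whose local entry for the postgres user uses peer; the exact line layout and the file location vary by system. use_trust_auth is a hypothetical helper that prints the modified file so you can review it before replacing the original.

```shell
# Hypothetical helper: print pg_hba.conf with the local postgres entry
# switched from peer to trust. It does not modify the file itself.
# ASSUMPTION: the entry looks like "local all postgres peer".
use_trust_auth() {
  sed -E 's/^(local[[:space:]]+all[[:space:]]+postgres[[:space:]]+)peer[[:space:]]*$/\1trust/' "$1"
}

# Usage: write the result to a scratch file, review it, then copy it over
# the original and restart the postgresql service.
# use_trust_auth /path/to/pg_hba.conf > /tmp/pg_hba.conf.new
```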
- Verify you can connect to postgres: psql -U postgres
- Quit psql (command is \q)
- Create database using handy CLI tool (which needs the -U <user> flag for now): $ createdb -U postgres omnivore
- Copy .env.example file to .env file: cp .env.example .env
- Modify .env and set PG_USER to postgres
- Run migration: yarn migrate
Accessing the database locally
Instead of accessing the database as the superuser, create a user with the omnivore_user role. You can use your local
username instead of app_user here to avoid needing the -U app_user flag in the psql command below.
- Create a user named app_user in Postgres
- Allow app_user to assume the roles necessary for the application. Do not manually grant any other role to app_user
$ psql -U postgres
# CREATE USER app_user WITH ENCRYPTED PASSWORD 'app_pass';
# GRANT omnivore_user to app_user;
- Update the PG_USER and PG_PASSWORD values in .env files (packages/db, pkg/api) to app_user and app_pass, respectively
- You can now use psql to log in to your database: psql -U app_user -d omnivore
Gotchas
Postgres Row-Level Security can at times catch us off guard: there are policies limiting select/update operations on
tables based on the active user role/ID in a transaction block. So when working on the local database, you may need
to log in as the postgres user to view all rows in the tables or to perform updates.