docker/psql updated init script and docker file for pair functions 2022-08-18 11:56:56 -04:00
src/recipe_graph refactoring 2022-10-15 14:40:16 -04:00
test added file for testing scrape script 2022-10-15 14:40:42 -04:00
.drone.yml drone-ci testing 2023-05-16 07:15:46 -04:00
.gitignore drone-ci testing 2023-05-16 07:15:46 -04:00
README.md added testing to readme 2023-05-15 10:49:38 -04:00
docker-compose.yml inital commit 2022-07-18 20:43:21 -04:00
pyproject.toml added requirements to package 2023-05-16 14:57:43 -04:00
requirements.txt added test coverage report 2022-10-15 11:27:12 -04:00

README.md

Recipe Graph

Setup

Prerequisites

  • Docker Compose
  • Python

Install Python requirements

python -m pip install -r requirements.txt

Environment (.env)

POSTGRES_URL=0.0.0.0
POSTGRES_USER=rgraph
POSTGRES_PASSWORD=rgraph
POSTGRES_DB=rgraph
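These variables are used by the Python scripts to reach the Postgres container. As a minimal, illustrative sketch only (assuming psycopg2 as the client and that the .env values are exported into the environment; check requirements.txt and src/db.py for what the project actually uses), a connection built from the same settings might look like:

import os
import psycopg2

# Illustrative sketch: open a connection using the same .env values.
conn = psycopg2.connect(
    host=os.environ.get("POSTGRES_URL", "0.0.0.0"),
    user=os.environ.get("POSTGRES_USER", "rgraph"),
    password=os.environ.get("POSTGRES_PASSWORD", "rgraph"),
    dbname=os.environ.get("POSTGRES_DB", "rgraph"),
)
print(conn.closed == 0)  # True while the connection is open
conn.close()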

Start database

docker-compose -p recipe-dev up

Example sites.json

[
    {
        "name": "Example Site Name",
        "ingredient_class": "example-ingredients-item-name",
        "name_class" : "example-heading-content",
        "base_url" : "https://www.example.com/recipe/"
    }
]
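Each entry tells the scraper where to look: base_url is the prefix a recipe identifier is appended to, while name_class and ingredient_class are the HTML classes of the recipe title and of the individual ingredient items. A rough sketch of how such an entry could be applied (illustrative only, assuming requests and BeautifulSoup, with a made-up identifier; not necessarily how src/scrape.py is implemented):

import requests
from bs4 import BeautifulSoup

site = {
    "name": "Example Site Name",
    "ingredient_class": "example-ingredients-item-name",
    "name_class": "example-heading-content",
    "base_url": "https://www.example.com/recipe/",
}

# Fetch a single (hypothetical) recipe page and extract title and ingredients by class.
html = requests.get(site["base_url"] + "12345").text
soup = BeautifulSoup(html, "html.parser")
title_el = soup.find(class_=site["name_class"])
title = title_el.get_text(strip=True) if title_el else None
ingredients = [el.get_text(strip=True) for el in soup.find_all(class_=site["ingredient_class"])]
print(title, ingredients)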

Initialize database and recipe sites

python src/db.py
python src/insert_sites.py data/sites.json

Shutdown database

docker-compose -p recipe-dev down

Usage

Scrape

Import new recipes

To scrape a single recipe:

python src/scrape.py <SiteName> -id <RecipeIdentifier>

or, to scrape <N> recipes:

python src/scrape.py <SiteName> -a <N>

By default, scraping starts at id 0 or at the greatest id already in the database. To start at another value, use -id and -a together.
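For example, with a hypothetical site named ExampleSiteName defined in sites.json:

python src/scrape.py ExampleSiteName -id 12345      # scrape only the recipe with identifier 12345
python src/scrape.py ExampleSiteName -a 20          # scrape the next 20 recipes from the default starting id
python src/scrape.py ExampleSiteName -id 100 -a 20  # scrape 20 recipes starting at id 100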

Scrape a recipe site for recipes

positional arguments:
  site                  Name of site

options:
  -h, --help            show this help message and exit
  -id ID, --identifier ID
                        url of recipe (relative to base url of site) or comma separated list
  -a N, --auto N        automatically generate identifier (must supply number of recipes to scrape)
  -v, --verbose
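The options above could be reproduced with an argparse setup along these lines (a sketch of an equivalent parser, not the actual contents of src/scrape.py):

import argparse

parser = argparse.ArgumentParser(description="Scrape a recipe site for recipes")
parser.add_argument("site", help="Name of site")
parser.add_argument("-id", "--identifier", metavar="ID",
                    help="url of recipe (relative to base url of site) or comma separated list")
parser.add_argument("-a", "--auto", metavar="N", type=int,
                    help="automatically generate identifier (must supply number of recipes to scrape)")
parser.add_argument("-v", "--verbose", action="store_true")
args = parser.parse_args()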

Testing

For testing, create a new set of Docker containers. Tests will fail if the database has already been initialized.

Start testing database

docker-compose -p recipe-test up

Run tests

pytest

WARNING: If you get ERROR at setup of test_db_connection or ERROR at setup of test_db_class_creation, check whether the testing database has already been initialized. Testing is destructive and should be run against a fresh database.

Shutdown testing database

docker-compose -p recipe-test down

Tests are written with the pytest framework and currently focus on unit tests; integration tests are to come.

To run the tests with coverage, use:

pytest --cov=src/recipe_graph --cov-report lcov --cov-report html

The HTML report is written to htmlcov/ and can be viewed in any browser. The lcov file can be used with the Coverage Gutters plugin for VS Code to view coverage in your editor.

TODO

☑ automate scraping
☑ extracting quantity and name (via regex)
☑ creating adjacency list
☐ api for web frontend
☐ random ingredient list generation
☐ visualization (web frontend)
☐ create ontology of ingredients
☐ extend importing functionality to more websites