Andrei Stoica 4c96bd8a28 removed unused imports 2022-10-15 11:59:34 -04:00
docker/psql updated init script and docker file for pair functions 2022-08-18 11:56:56 -04:00
src/recipe_graph testing db connection and table creation 2022-09-18 14:40:54 -04:00
test removed unused imports 2022-10-15 11:59:34 -04:00
.gitignore added test coverage reports to gitignore 2022-10-15 11:47:47 -04:00
README.md updated readme with environment variables for docker 2022-08-18 11:56:22 -04:00
docker-compose.yml inital commit 2022-07-18 20:43:21 -04:00
pyproject.toml restructured code for packaging 2022-09-18 13:01:19 -04:00
requirements.txt added test coverage report 2022-10-15 11:27:12 -04:00

README.md

Recipe Graph

Setup

Prerequisites

  • Docker compose
  • Python

Install python requirements

python -m pip install -r requirements.txt

Environment (.env)

POSTGRES_URL=0.0.0.0
POSTGRES_USER=rgraph
POSTGRES_PASSWORD=rgraph
POSTGRES_DB=rgraph
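
How the project's scripts consume these values is not shown in this README. As a minimal sketch, assuming they are read from the process environment and combined into a PostgreSQL connection URL (the helper name `build_db_url` and the default port 5432 are assumptions, not taken from the repository):

```python
import os
from urllib.parse import quote_plus


def build_db_url(env=None, port=5432):
    """Build a PostgreSQL URL from the .env variables (hypothetical helper).

    The port is an assumption: the README does not list one, so the
    PostgreSQL default is used here.
    """
    env = os.environ if env is None else env
    # Escape credentials so special characters do not break the URL.
    user = quote_plus(env["POSTGRES_USER"])
    password = quote_plus(env["POSTGRES_PASSWORD"])
    host = env["POSTGRES_URL"]
    database = env["POSTGRES_DB"]
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"


if __name__ == "__main__":
    example_env = {
        "POSTGRES_URL": "0.0.0.0",
        "POSTGRES_USER": "rgraph",
        "POSTGRES_PASSWORD": "rgraph",
        "POSTGRES_DB": "rgraph",
    }
    print(build_db_url(example_env))
```

Passing a plain dictionary instead of `os.environ` keeps the helper easy to test without touching the real environment.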

Start database

docker-compose up

Initialize database and recipe sites

python src/db.py
python src/insert_sites.py data/sites.json

Usage

Scrape

Import new recipes

python src/scrape.py <SiteName> -id <RecipeIdentifier>

To scrape only one recipe.

or

python src/scrape.py <SiteName> -a <N>

To scrape <N> recipes

By default, scraping starts at id 0 or at the greatest id already in the database. To start at another value, use both -id and -a.
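
The start-value rule above can be sketched as follows. This is only an illustration, assuming recipe identifiers are consecutive integers. The function name `resolve_identifiers` and its parameters are hypothetical and not part of `src/scrape.py`:

```python
def resolve_identifiers(n, max_id_in_db=None, start=None):
    """Return the n identifiers an automatic run would visit (hypothetical).

    n            -- number of recipes to scrape (the value given to -a)
    max_id_in_db -- greatest id already stored, or None for an empty database
    start        -- explicit start value (the value given to -id), if any
    """
    if start is None:
        # Default: 0 for an empty database, else the greatest stored id.
        start = 0 if max_id_in_db is None else max_id_in_db
    # Assumption: identifiers increase by one from the start value.
    return list(range(start, start + n))


if __name__ == "__main__":
    print(resolve_identifiers(3))
    print(resolve_identifiers(2, max_id_in_db=10))
    print(resolve_identifiers(2, max_id_in_db=10, start=50))
```

An explicit start value takes precedence over the database state, which corresponds to passing both -id and -a on the command line.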

Scrape a recipe site for recipes

positional arguments:
  site                  Name of site

options:
  -h, --help            show this help message and exit
  -id ID, --identifier ID
                        url of recipe (relative to base url of site) or comma separated list
  -a N, --auto N        automatically generate identifier (must supply number of recipes to scrape)
  -v, --verbose
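
A parser that produces help text of this shape could look roughly like the sketch below. It is reconstructed from the help output alone, so the `int` type for `-a` and the `store_true` action for `-v` are assumptions, and the actual parser in `src/scrape.py` may differ:

```python
import argparse


def build_parser():
    """Build a parser mirroring the help text (a sketch, not the real one)."""
    parser = argparse.ArgumentParser(
        description="Scrape a recipe site for recipes"
    )
    parser.add_argument("site", help="Name of site")
    parser.add_argument(
        "-id", "--identifier", metavar="ID",
        help="url of recipe (relative to base url of site) "
             "or comma separated list",
    )
    parser.add_argument(
        "-a", "--auto", metavar="N", type=int,  # int is an assumption
        help="automatically generate identifier "
             "(must supply number of recipes to scrape)",
    )
    # Assumption: -v is a simple on/off flag.
    parser.add_argument("-v", "--verbose", action="store_true")
    return parser


if __name__ == "__main__":
    # "ExampleSite" is a placeholder, not a site known to the project.
    args = build_parser().parse_args(["ExampleSite", "-a", "5"])
    print(args.site, args.identifier, args.auto, args.verbose)
```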

TODO

☑ automate scraping
☑ extracting quantity and name (via regex)
☑ creating adjacency list
☐ API for web frontend
☐ random ingredient list generation
☐ visualization (web frontend)
☐ create ontology of ingredients
☐ extend importing functionality to more websites