Andrei Stoica c30fea1ddc added pytest to requirements 2022-09-18 11:44:37 -04:00
docker/psql updated init script and docker file for pair functions 2022-08-18 11:56:56 -04:00
recipe_graph moved scripts to module 2022-09-18 11:39:54 -04:00
.gitignore refactored table creation into seperate function 2022-08-03 17:02:49 -04:00
README.md updated readme with environment variables for docker 2022-08-18 11:56:22 -04:00
docker-compose.yml inital commit 2022-07-18 20:43:21 -04:00
requirements.txt added pytest to requirements 2022-09-18 11:44:37 -04:00

README.md

Recipe Graph

Setup

Prerequisites

  • Docker Compose
  • Python

Install python requirements

python -m pip install -r requirements.txt

Environment (.env)

POSTGRES_URL=0.0.0.0
POSTGRES_USER=rgraph
POSTGRES_PASSWORD=rgraph
POSTGRES_DB=rgraph
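
As a rough illustration of how these variables might be consumed, the sketch below assembles a PostgreSQL connection URL from them. The function name `build_postgres_url` and the port 5432 (the PostgreSQL default) are assumptions made for this example; the project's own code may read the variables differently.

```python
import os
from urllib.parse import quote


def build_postgres_url(env=os.environ, port=5432):
    """Assemble a connection URL from the POSTGRES_* variables.

    Hypothetical helper: the port is an assumption (PostgreSQL default),
    since the .env file above does not define one.
    """
    # Percent-encode credentials so special characters do not break the URL.
    user = quote(env["POSTGRES_USER"], safe="")
    password = quote(env["POSTGRES_PASSWORD"], safe="")
    host = env["POSTGRES_URL"]
    database = env["POSTGRES_DB"]
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"


# Values copied from the .env example above.
example_env = {
    "POSTGRES_URL": "0.0.0.0",
    "POSTGRES_USER": "rgraph",
    "POSTGRES_PASSWORD": "rgraph",
    "POSTGRES_DB": "rgraph",
}
print(build_postgres_url(example_env))
```

Passing a plain dictionary instead of `os.environ` keeps the helper easy to test without touching the real environment.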

Start database

docker-compose up

Initialize database and recipe sites

python src/db.py
python src/insert_sites.py data/sites.json

Usage

Scrape

Import new recipes:

python src/scrape.py <SiteName> -id <RecipeIdentifier>

To scrape only one recipe.

or

python src/scrape.py <SiteName> -a <N>

To scrape <N> recipes

By default, scraping starts at id 0 or at the greatest id already in the database. To start at another value, pass both -id and -a.
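
The start-id rule described above can be sketched as follows. This is only an illustration under stated assumptions: the function `identifiers_to_scrape` and its parameters are hypothetical, identifiers are treated as consecutive integers, and whether the real scraper begins at the greatest stored id or at the one after it is not specified here.

```python
def identifiers_to_scrape(count, max_id_in_db=None, start=None):
    """Return the ids an automatic run would visit.

    count        -- number of recipes to scrape (the -a value)
    max_id_in_db -- greatest id already stored, or None for an empty database
    start        -- explicit starting id (the -id value), if given
    """
    if start is None:
        # Default: id 0, or the greatest id already in the database.
        start = 0 if max_id_in_db is None else max_id_in_db
    return list(range(start, start + count))


print(identifiers_to_scrape(3))                              # empty database
print(identifiers_to_scrape(2, max_id_in_db=40))             # resume from stored ids
print(identifiers_to_scrape(2, max_id_in_db=40, start=100))  # -id and -a together
```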

Scrape a recipe site for recipes

positional arguments:
  site                  Name of site

options:
  -h, --help            show this help message and exit
  -id ID, --identifier ID
                        url of recipe (relative to base url of site) or comma separated list
  -a N, --auto N        automatically generate identifier (must supply number of recipes to scrape)
  -v, --verbose
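
For readers who want to see how such an interface could be declared, here is a minimal `argparse` sketch reconstructed from the help text above. It is not the project's actual source: `build_parser`, the integer type for `-a`, and the site name `ExampleSite` are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Hypothetical reconstruction of the CLI shown in the help text."""
    parser = argparse.ArgumentParser(
        description="Scrape a recipe site for recipes"
    )
    parser.add_argument("site", help="Name of site")
    parser.add_argument(
        "-id", "--identifier", metavar="ID",
        help="url of recipe (relative to base url of site) "
             "or comma separated list",
    )
    parser.add_argument(
        "-a", "--auto", metavar="N", type=int,  # int is an assumption
        help="automatically generate identifier "
             "(must supply number of recipes to scrape)",
    )
    parser.add_argument("-v", "--verbose", action="store_true")
    return parser


# Parse an example command line instead of sys.argv.
args = build_parser().parse_args(["ExampleSite", "-id", "10,11", "-a", "5"])
identifiers = args.identifier.split(",")  # comma separated list of ids
print(args.site, identifiers, args.auto, args.verbose)
```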

TODO

☑ automate scraping
☑ extracting quantity and name (via regex)
☑ creating adjacency list
☐ API for web frontend
☐ random ingredient list generation
☐ visualization (web frontend)
☐ create ontology of ingredients
☐ extend importing functionality to more websites