Recipe Graph
Setup
Prerequisites
- Docker Compose
- Python
Install Python requirements
python -m pip install -r requirements.txt
Environment (.env)
POSTGRES_URL=0.0.0.0
POSTGRES_USER=rgraph
POSTGRES_PASSWORD=rgraph
POSTGRES_DB=rgraph
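How the project reads these variables is not shown here. As a minimal sketch, assuming the values from .env have been exported into the process environment, they could be combined into a standard PostgreSQL connection URI. The function name postgres_dsn is hypothetical, and no port is specified because the .env above does not define one.

```python
import os
from urllib.parse import quote


def postgres_dsn(env=os.environ):
    """Build a PostgreSQL connection URI from the variables listed in .env.

    Hypothetical helper, not part of the project: it only illustrates how
    the four variables relate to one another.
    """
    # Percent-encode credentials so characters like "@" or "/" stay valid in a URI.
    user = quote(env["POSTGRES_USER"], safe="")
    password = quote(env["POSTGRES_PASSWORD"], safe="")
    host = env["POSTGRES_URL"]  # despite the name, this holds a host address
    database = quote(env["POSTGRES_DB"], safe="")
    return f"postgresql://{user}:{password}@{host}/{database}"
```

A missing variable raises KeyError, which makes an incomplete .env visible immediately rather than at connection time.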
Start database
docker-compose -p recipe-dev up
Example sites.json
[
{
"name": "Example Site Name",
"ingredient_class": "example-ingredients-item-name",
"name_class" : "example-heading-content",
"base_url" : "https://www.example.com/recipe/"
}
]
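Before inserting sites, it can help to check that the file has the shape shown above. The following is a sketch only: parse_sites is a hypothetical helper, and treating all four keys as required is an assumption based on the example, not something the project states.

```python
import json

# Keys that appear in the example sites.json above (assumed to be required).
REQUIRED_KEYS = {"name", "ingredient_class", "name_class", "base_url"}


def parse_sites(text):
    """Parse the contents of a sites.json file and check each entry's keys."""
    sites = json.loads(text)
    if not isinstance(sites, list):
        raise ValueError("sites.json must contain a JSON array of site objects")
    for index, site in enumerate(sites):
        if not isinstance(site, dict):
            raise ValueError(f"site #{index} is not a JSON object")
        missing = REQUIRED_KEYS - set(site)
        if missing:
            raise ValueError(f"site #{index} is missing keys: {sorted(missing)}")
    return sites
```

For a file on disk, pass the result of reading it, for example parse_sites(open("data/sites.json").read()).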
Initialize database and recipe sites
python src/db.py
python src/insert_sites.py data/sites.json
Shutdown database
docker-compose -p recipe-dev down
Usage
Scrape
Import new recipes
python src/scrape.py <SiteName> -id <RecipeIdentifier>
To scrape only one recipe.
or
python src/scrape.py <SiteName> -a <N>
To scrape <N> recipes.
By default, scraping starts at id 0 or at the greatest id already in the
database. To start at another value, use both -id and -a.
Scrape a recipe site for recipes
positional arguments:
site Name of site
options:
-h, --help show this help message and exit
-id ID, --identifier ID
url of recipe (relative to base url of site) or comma-separated list
-a N, --auto N automatically generate identifier (must supply number of recipes to scrape)
-v, --verbose
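The options above can be reconstructed with argparse, which also makes the interaction between -id and -a easier to see. This is a sketch reconstructed from the help text, not the code in src/scrape.py. The helper resolve_identifiers and its greatest_stored_id parameter are hypothetical, and it assumes that automatically generated identifiers are consecutive integers.

```python
import argparse


def build_parser():
    """Parser mirroring the help text above (a reconstruction)."""
    parser = argparse.ArgumentParser(description="Scrape a recipe site for recipes")
    parser.add_argument("site", help="Name of site")
    parser.add_argument("-id", "--identifier", metavar="ID",
                        help="url of recipe (relative to base url of site) "
                             "or comma-separated list")
    parser.add_argument("-a", "--auto", metavar="N", type=int,
                        help="automatically generate identifier "
                             "(must supply number of recipes to scrape)")
    parser.add_argument("-v", "--verbose", action="store_true")
    return parser


def resolve_identifiers(args, greatest_stored_id=None):
    """Hypothetical helper: turn parsed options into a list of identifiers."""
    if args.auto is not None:
        if args.identifier is not None:
            start = int(args.identifier)   # -id together with -a sets the start
        elif greatest_stored_id is not None:
            start = greatest_stored_id     # resume from the greatest stored id
        else:
            start = 0                      # empty database: start at 0
        return [str(i) for i in range(start, start + args.auto)]
    if args.identifier is None:
        return []
    # Without -a, -id is one identifier or a comma-separated list of them.
    return [part.strip() for part in args.identifier.split(",") if part.strip()]
```

For example, parsing ["ExampleSite", "-id", "5", "-a", "2"] and passing the result to resolve_identifiers yields the identifiers "5" and "6" under these assumptions.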
Testing
For testing, create a new set of Docker containers. Tests will fail if the database is already initialized.
Starting testing db
docker-compose -p recipe-test up
Running tests
pytest
WARNING: If you get ERROR at setup of test_db_connection and
ERROR at setup of test_db_class_creation, check whether the testing database is
already initialized. Testing is destructive and should be done on a fresh database.
Shutting down testing db
docker-compose -p recipe-test down
Tests are written with the pytest framework. They currently focus on unit tests and code coverage. Integration tests are to come.
To run the tests with coverage reports, use:
pytest --cov=src/recipe_graph --cov-report lcov --cov-report html
The html report is under htmlcov/ and can be viewed through any browser.
The lcov file can be used with the Coverage Gutters
plugin for VS Code to view coverage in your editor.
TODO
☑ automate scraping
☑ extracting quantity and name (via regex)
☑ creating adjacency list
☐ api for web frontend
☐ random ingredient list generation
☐ visualization (web frontend)
☐ create ontology of ingredients
☐ extend importing functionality to more websites