# Recipe Graph

## Setup
Prerequisites
- Docker Compose
- Python

Install Python requirements
```sh
python -m pip install -r requirements.txt
```

Environment (`.env`)
```sh
POSTGRES_URL=0.0.0.0
POSTGRES_USER=rgraph
POSTGRES_PASSWORD=rgraph
POSTGRES_DB=rgraph
```

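To illustrate how the four settings above fit together, here is a minimal sketch that assembles them into a standard `postgresql://` connection URI. The helper name `build_postgres_dsn` is hypothetical, and how `src/db.py` actually reads these variables is an assumption; the project may load and combine them differently.

```python
import os
from urllib.parse import quote


def build_postgres_dsn(env=None):
    """Assemble a libpq-style connection URI from the .env settings.

    Hypothetical helper: the project's own code may do this differently.
    """
    env = os.environ if env is None else env
    # Percent-encode the parts that may contain reserved characters.
    user = quote(env["POSTGRES_USER"], safe="")
    password = quote(env["POSTGRES_PASSWORD"], safe="")
    host = env["POSTGRES_URL"]
    database = quote(env["POSTGRES_DB"], safe="")
    return f"postgresql://{user}:{password}@{host}/{database}"


# Values copied from the example .env above.
example_env = {
    "POSTGRES_URL": "0.0.0.0",
    "POSTGRES_USER": "rgraph",
    "POSTGRES_PASSWORD": "rgraph",
    "POSTGRES_DB": "rgraph",
}
print(build_postgres_dsn(example_env))
```

Passing a plain dictionary keeps the sketch independent of any particular `.env` loader; calling `build_postgres_dsn()` with no argument reads the same keys from the process environment.
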
Start database
```sh
docker-compose -p recipe-dev up
```

Example `sites.json`
```json
[
  {
    "name": "Example Site Name",
    "ingredient_class": "example-ingredients-item-name",
    "name_class": "example-heading-content",
    "base_url": "https://www.example.com/recipe/"
  }
]
```

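Before inserting sites, it can help to check that every entry carries the four keys shown in the example. The following is a minimal sketch; `validate_sites` and `REQUIRED_KEYS` are hypothetical names, and treating all four keys as mandatory is an assumption based only on the example above, not on what `src/insert_sites.py` enforces.

```python
import json

# Assumption: every key in the README example is required.
REQUIRED_KEYS = {"name", "ingredient_class", "name_class", "base_url"}


def validate_sites(text):
    """Parse a sites.json document and check each entry for the expected keys."""
    sites = json.loads(text)
    if not isinstance(sites, list):
        raise ValueError("sites.json must contain a JSON array")
    for index, site in enumerate(sites):
        if not isinstance(site, dict):
            raise ValueError(f"entry {index} is not a JSON object")
        missing = REQUIRED_KEYS - set(site)
        if missing:
            raise ValueError(f"entry {index} is missing: {sorted(missing)}")
    return sites


EXAMPLE = """
[
  {
    "name": "Example Site Name",
    "ingredient_class": "example-ingredients-item-name",
    "name_class": "example-heading-content",
    "base_url": "https://www.example.com/recipe/"
  }
]
"""
sites = validate_sites(EXAMPLE)
print(sites[0]["name"])
```
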
Initialize database and recipe sites
```sh
python src/db.py
python src/insert_sites.py data/sites.json
```

Shutdown database
```sh
docker-compose -p recipe-dev down
```

## Usage
### Scrape
Import new recipes:
```sh
python src/scrape.py <SiteName> -id <RecipeIdentifier>
```
This scrapes only one recipe.

Or:
```sh
python src/scrape.py <SiteName> -a <N>
```
This scrapes `<N>` recipes.

By default, scraping starts at id `0` or at the greatest id already in the
database. To start at another value, use both `-id` and `-a`.

```
Scrape a recipe site for recipies

positional arguments:
  site                  Name of site

options:
  -h, --help            show this help message and exit
  -id ID, --identifier ID
                        url of recipe(reletive to base url of site) or commma seperated list
  -a N, --auto N        automaticaly generate identifier(must supply number of recipies to scrape)
  -v, --verbose
```

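The options above, and the way `-id` and `-a` combine, can be sketched with `argparse`. This is a reconstruction from the help text, not the contents of `src/scrape.py`: `build_parser`, `resolve_identifiers`, and `last_id` are hypothetical names, and the sketch assumes that auto-generated identifiers are consecutive integers and that a default run starts exactly at the greatest stored id.

```python
import argparse


def build_parser():
    """Rebuild a parser with the same options as the help text (a sketch)."""
    parser = argparse.ArgumentParser(description="Scrape a recipe site for recipes")
    parser.add_argument("site", help="Name of site")
    parser.add_argument("-id", "--identifier", metavar="ID",
                        help="recipe URL relative to the site's base URL, "
                             "or a comma-separated list")
    parser.add_argument("-a", "--auto", type=int, metavar="N",
                        help="generate N identifiers automatically")
    parser.add_argument("-v", "--verbose", action="store_true")
    return parser


def resolve_identifiers(args, last_id=None):
    """Return the list of identifiers to scrape.

    last_id stands in for the greatest id already in the database
    (None when the database is empty). Assumption: ids are integers.
    """
    if args.auto is None:
        if args.identifier is None:
            raise ValueError("supply -id, -a, or both")
        # Explicit identifier, or a comma-separated list of them.
        return args.identifier.split(",")
    if args.identifier is not None:
        start = int(args.identifier)  # -id together with -a sets the start
    else:
        start = last_id if last_id is not None else 0
    return [str(start + offset) for offset in range(args.auto)]


parser = build_parser()
print(resolve_identifiers(parser.parse_args(["Example", "-a", "3"])))
print(resolve_identifiers(parser.parse_args(["Example", "-id", "10", "-a", "2"])))
```

With an empty database the first call yields ids `0` through `2`; the second starts at `10` because both flags are given.
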
## Testing
For testing, create a new set of Docker containers. Tests will fail if
the database is already initialized.

Start the testing database
```sh
docker-compose -p recipe-test up
```

Run the tests
```sh
pytest
```

**WARNING**: If you get `ERROR at setup of test_db_connection` and
`ERROR at setup of test_db_class_creation`, check whether the testing database is
already initialized. Testing is destructive and should be done on a fresh database.

Shut down the testing database
```sh
docker-compose -p recipe-test down
```

Tests are written with the pytest framework. They currently focus on unit tests;
integration tests are to come.

To run the tests with coverage reporting, use:
```
pytest --cov=src/recipe_graph --cov-report lcov --cov-report html
```

The HTML report is written to `htmlcov/` and can be viewed in any browser.
The `lcov` file can be used with the [Coverage Gutters](https://marketplace.visualstudio.com/items?itemName=ryanluker.vscode-coverage-gutters)
extension for VS Code to view coverage in your editor.

## TODO
> ☑ automate scraping\
> ☑ extracting quantity and name (via regex)\
> ☑ creating adjacency list\
> ☐ API for web frontend\
> ☐ random ingredient list generation\
> ☐ visualization (web frontend)\
> ☐ create ontology of ingredients\
> ☐ extend importing functionality to more websites