lingxue

wiseflow Chief Intelligence Officer Complete User Guide

Plenty of people online praise this tool, but I could not find a complete tutorial for it.

This article walks through how to get it running.

Because web scraping carries some risk, this article does not cover how to fix pages that fail to scrape correctly. You will need to solve those information retrieval problems yourself.

This tutorial is written for Windows; on other systems you may need to adapt the steps.

A quick introduction to the project#

Chief Intelligence Officer (Wiseflow) is an agile information mining tool. It extracts information from sources such as websites, WeChat official accounts, and social platforms according to the focus points you set, then automatically classifies the results and uploads them to the database.

Screenshot 2024-08-24 215708

Environment preparation#

python (tested with 3.11.6)

ollama client

git

wiseflow project

pocketbase database
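Once the installation steps below are done, a quick check can confirm that the command-line tools from this list are reachable. This is only a minimal sketch: the executable names (git, ollama, pip) and the 3.11 version floor are assumptions based on what this article installs, not requirements stated by the wiseflow project.

```python
import shutil
import sys

# Assumed executable names: what each installer normally puts on PATH.
REQUIRED_TOOLS = ("git", "ollama", "pip")


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


def python_version_ok(minimum=(3, 11)):
    """True when the running interpreter is at least `minimum` (assumed floor)."""
    return sys.version_info[:2] >= minimum


if __name__ == "__main__":
    status = "OK" if python_version_ok() else "older than the tested version"
    print("Python", sys.version.split()[0], status)
    missing = find_missing_tools()
    print("Missing from PATH:", ", ".join(missing) if missing else "none")
```

If something is reported as missing right after installing it, open a new command prompt first, since PATH changes only apply to newly opened windows.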

Installation#

Install python#

https://www.python.org/ftp/python/3.11.6/python-3.11.6-amd64.exe

Download and install it, and remember to check the option that adds Python to PATH

Install git#

https://github.com/git-for-windows/git/releases/download/v2.46.0.windows.1/Git-2.46.0-64-bit.exe

Download and install it. The default options are fine, so just click "Next" on every page

Install ollama#

https://ollama.com/download/OllamaSetup.exe

Download and install

Change pip source#

Then open a command prompt and run the following command to switch pip to the Huawei Cloud mirror

pip config set global.index-url https://mirrors.huaweicloud.com/repository/pypi/simple

Clone the project#

Then enter the command to clone the project

git clone https://github.com/TeamWiseFlow/wiseflow

pocketbase database#

https://github.com/pocketbase/pocketbase/releases/download/v0.22.19/pocketbase_0.22.19_windows_amd64.zip

Download it and keep it for a later step

Install project environment#

cd wiseflow
cd core
pip install -r requirements.txt

Configure the project#

wiseflow configuration#

Extract the downloaded pocketbase archive to /wiseflow/core/pb
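If you would rather script the extraction than use the file explorer, it could look roughly like this. The two paths in the example run are assumptions (the archive sitting in the current folder, and wiseflow cloned next to it); adjust them to your own layout.

```python
import zipfile
from pathlib import Path


def extract_pocketbase(zip_path, target_dir):
    """Unpack the PocketBase archive into target_dir and return the archived file names."""
    target = Path(target_dir)
    # Create the folder if needed; files already in it are left alone.
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
        return archive.namelist()


if __name__ == "__main__":
    # Assumed locations, not fixed by the project.
    zip_file = Path("pocketbase_0.22.19_windows_amd64.zip")
    pb_dir = Path("wiseflow") / "core" / "pb"
    if zip_file.exists():
        print("Extracted:", extract_pocketbase(zip_file, pb_dir))
    else:
        print("Archive not found:", zip_file)
```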

Go to the pb directory and run the following commands in the command line

.\pocketbase migrate up
.\pocketbase --dev admin create [set email randomly] [set password randomly]

Move the .sh scripts in the /core/scripts folder to the /core directory

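Modify start_backend.sh so that it reads

#!/bin/bash
set -o allexport
source ./env
set +o allexport
exec uvicorn backend:app --reload --host localhost --port 8077

Modify start_tasks.sh in the same way

#!/bin/bash
set -o allexport
source ./env
set +o allexport
exec python tasks.py

In both files the only line that matters is the source line. It should read "source ./env" (one "." removed from the "../" prefix, and no dot in front of env), because the env file created below is named env and is copied into the /core folder next to the scripts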

Then right-click on the sh file, click "Open with" and change it to the program in the screenshot below

Screenshot 2024-08-24 213503

Copy the env_sample file in the /wiseflow folder and rename it to env

Then modify the content as follows

export LLM_API_KEY=" " ##there is a space here, it will cause an error if not added
export LLM_API_BASE="http://127.0.0.1:11434/v1/" ##for local model services or calling non-OpenAI services with openai_wrapper
##strongly recommended to use the following model provided by siliconflow (consider both effect and price)
export GET_INFO_MODEL="qwen2:7b"
export REWRITE_MODEL="qwen2:7b"
export HTML_PARSE_MODEL="qwen2:7b" ##or"01-ai/Yi-1.5-9B-Chat"
export PROJECT_DIR="work_dir"
export PB_API_AUTH="[set email randomly]|[set password randomly]"
# export "PB_API_BASE"="" ##only use if your pb not run on 127.0.0.1:8090
export WS_LOG="verbose" ##for detail log info. If not need, just delete this item.

Then copy the env file to the /core folder
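A typo in the env file is easy to miss, so here is a rough sketch of a checker that reads the file and reports obvious problems. The list of required keys is simply the set of uncommented keys shown above, and the helper names (parse_env, check_env) are my own, not part of wiseflow.

```python
import re
from pathlib import Path

# Matches lines such as: export NAME="value" ##optional comment
ENV_LINE = re.compile(r'^\s*export\s+([A-Za-z_][A-Za-z0-9_]*)="([^"]*)"')

# Assumption: the uncommented keys from the env file in this article.
REQUIRED_KEYS = (
    "LLM_API_KEY", "LLM_API_BASE", "GET_INFO_MODEL", "REWRITE_MODEL",
    "HTML_PARSE_MODEL", "PROJECT_DIR", "PB_API_AUTH",
)


def parse_env(text):
    """Collect NAME -> value pairs from export lines; commented lines are skipped."""
    values = {}
    for line in text.splitlines():
        match = ENV_LINE.match(line)
        if match:
            values[match.group(1)] = match.group(2)
    return values


def check_env(values):
    """Return a list of human-readable problems found in the parsed values."""
    problems = [f"{key} is missing" for key in REQUIRED_KEYS if key not in values]
    if values.get("LLM_API_KEY") == "":
        problems.append("LLM_API_KEY is empty; this article uses a single space")
    auth = values.get("PB_API_AUTH")
    if auth is not None and "|" not in auth:
        problems.append("PB_API_AUTH should look like email|password")
    return problems


if __name__ == "__main__":
    env_file = Path("env")  # run this from the /core folder
    if env_file.exists():
        found = check_env(parse_env(env_file.read_text(encoding="utf-8")))
        print("\n".join(found) if found else "env file looks fine")
    else:
        print("No env file in the current folder")
```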

ollama configuration#

Since the official recommendation is to use qwen2:7b, let's use this model. If there are better options, please recommend them in the comments

Then enter the command

ollama pull qwen2:7b 
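To confirm the pull worked, you can ask the local Ollama server which models it has. The sketch below uses Ollama's /api/tags endpoint on the default port 11434 (the same host and port as LLM_API_BASE above); the function names are my own, and it assumes the Ollama service is already running.

```python
import json
import urllib.request


def model_names(tags_payload):
    """Extract model names from the JSON body returned by /api/tags."""
    return [entry.get("name", "") for entry in tags_payload.get("models", [])]


def has_model(tags_payload, wanted="qwen2:7b"):
    """True when `wanted` is among the locally available models."""
    return wanted in model_names(tags_payload)


def fetch_tags(base_url="http://127.0.0.1:11434", timeout=5):
    """Query the local Ollama server for its list of models."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    try:
        payload = fetch_tags()
    except OSError as error:  # URLError is a subclass of OSError
        print("Could not reach Ollama:", error)
    else:
        print("qwen2:7b available:", has_model(payload))
```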

Run the project#

Double-click the start_backend.sh and start_pb.sh files in the /core folder to start them
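If a window closes immediately or nothing seems to happen, it helps to check whether the two services are actually listening. This sketch just tries to open a TCP connection to each port: 8090 is the PocketBase address used in this article and 8077 is the port in start_backend.sh. The service labels are only for display.

```python
import socket

# Ports taken from this article: PocketBase on 8090, the uvicorn backend on 8077.
SERVICES = {"pocketbase": 8090, "backend": 8077}


def port_is_open(host, port, timeout=1.0):
    """True when a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for name, port in SERVICES.items():
        state = "listening" if port_is_open("localhost", port) else "not reachable"
        print(f"{name} on port {port}: {state}")
```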

Open http://127.0.0.1:8090/_/ in your browser

Then log in with the email and password you set above

Add sites and tags

Screenshot 2024-08-24 214655

Screenshot 2024-08-24 214748

Don't forget to activate the entries you added

Then run start_tasks.sh

The crawled article content is printed in the command line, and it also shows up in the articles section

Afterword#

Currently, the project does not support rsshub. Please solve the problem of crawling certain websites on your own

I ran this on an AMD RX 7600M XT graphics card, and its video memory was maxed out, so I can't say much about how well it performs. I can only confirm that this configuration does run

According to the project's official statement:

SiliconFlow officially announced that several LLM online reasoning services such as Qwen2-7B-Instruct and glm-4-9b-chat are now free, which means you can use the Chief Intelligence Officer for information mining at "zero cost"!

As of the publication of this article, however, this has become a paid service. Given the amount of data the program retrieves, using the paid API will be costly.
