Building a bs4-pymysql-flask pipeline

2019-04-29

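This experiment builds a small crawler-and-display demo: BeautifulSoup parses the scraped pages, pymysql writes the data to the database, SQLAlchemy reads it back, and Flask renders it on a web page.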

0.0.1. Data scraping

The data used here is the page-view count of each post on this blog.

  • getHTMLText:
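    import requests

    def getHTMLText(url, code='utf-8'):
        """Return the page text, or the string "wrong" if the request fails."""
        try:
            r = requests.get(url, timeout=30)
            r.raise_for_status()
            # Guess the encoding from the body; set r.encoding = code to force one.
            r.encoding = r.apparent_encoding
            return r.text
        except requests.RequestException:
            return "wrong"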
  • getBlogInfo:
    from bs4 import BeautifulSoup

    def getBlogInfo(html):
        """Return two parallel lists: post titles and their view counts."""
        soup = BeautifulSoup(html, 'html.parser')
        title = []
        readnum = []
        # attrs must be a dict mapping attribute name to value.
        for tags in soup.find_all('a', attrs={'class': 'article-title'}):
            title.append(tags.get_text())
        for tags in soup.find_all('span', attrs={'class': 'pageViews'}):
            readnum.append(tags.get_text())
        return title, readnum
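To see what the parser returns without hitting the network, it can be run against a small inline page. The sketch below is only an illustration: the class names `article-title` and `pageViews` come from getBlogInfo, while `SAMPLE_HTML`, the surrounding `<article>` markup, and the helper name `parse_sample` are assumptions made up for the example.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: only the two class names are taken from getBlogInfo;
# the rest of the structure is an assumption for illustration.
SAMPLE_HTML = """
<article>
  <a class="article-title" href="/a">First post</a>
  <span class="pageViews">12</span>
</article>
<article>
  <a class="article-title" href="/b">Second post</a>
  <span class="pageViews">7</span>
</article>
"""

def parse_sample(html):
    """Collect titles and view counts as two parallel lists."""
    soup = BeautifulSoup(html, "html.parser")
    titles = [a.get_text(strip=True)
              for a in soup.find_all("a", class_="article-title")]
    views = [s.get_text(strip=True)
             for s in soup.find_all("span", class_="pageViews")]
    return titles, views

titles, views = parse_sample(SAMPLE_HTML)
rows = list(zip(titles, views))  # pair each title with its count

# A failed fetch returns the string "wrong", which parses to two empty lists.
empty = parse_sample("wrong")
```

Because the two lists are built by separate searches, they only line up when every post has both elements; pairing them with `zip` stops at the shorter list.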

0.0.2. Storing the data

The database and table must be created in MySQL beforehand; here they are MyBlogDB and DB_Info.
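The post does not show the table definition, so the following is a sketch under assumptions: the column names and sizes are copied from the DBInfo model later in this post, and `CREATE_TABLE_SQL` and `create_table` are hypothetical names. The statement is plain enough to run on SQLite as well, which is what the dry run here uses; on MySQL the database itself would first be created with `CREATE DATABASE IF NOT EXISTS MyBlogDB`.

```python
import sqlite3

# Column sizes mirror the DBInfo model (String(50), String(20)).
# The primary key on article_title is what lets "replace into"
# overwrite an existing row instead of adding a duplicate.
CREATE_TABLE_SQL = (
    "CREATE TABLE IF NOT EXISTS DB_Info ("
    " article_title VARCHAR(50) PRIMARY KEY,"
    " article_readnum VARCHAR(20)"
    ")"
)

def create_table(conn):
    """Create DB_Info through a DB-API connection (pymysql or sqlite3)."""
    cur = conn.cursor()
    cur.execute(CREATE_TABLE_SQL)
    conn.commit()

# Local dry run on SQLite; with MySQL, pass a pymysql connection
# opened on MyBlogDB instead.
conn = sqlite3.connect(":memory:")
create_table(conn)
columns = [row[1] for row in conn.execute("PRAGMA table_info(DB_Info)")]
```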

  • savMySql:
    import pymysql

    def savMySql(title, readnum):
        db = pymysql.connect(host='localhost', user='root',
                             password='***', database='MyBlogDB')
        cursor = db.cursor()
        # Bound parameters let the driver escape titles that contain quotes.
        sql = "replace into DB_Info(article_title, article_readnum) values(%s, %s)"
        for t, n in zip(title, readnum):
            try:
                cursor.execute(sql, (t, n))
                db.commit()
            except pymysql.MySQLError:
                db.rollback()
                print("error")
        db.close()
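Post titles can contain quotes, so the write is safest with bound parameters rather than values pasted into the SQL text. The sketch below illustrates that, together with the overwrite behaviour of `replace into` on a primary key. It uses the standard-library `sqlite3` module as a stand-in for MySQL so it runs without a server; `save_rows` and the sample titles are hypothetical, and the placeholder is `?` here where pymysql uses `%s`.

```python
import sqlite3

def save_rows(conn, title, readnum):
    """Insert or overwrite (title, readnum) pairs using bound parameters."""
    cur = conn.cursor()
    # sqlite3 uses "?" placeholders; pymysql uses "%s" for the same purpose.
    sql = "REPLACE INTO DB_Info(article_title, article_readnum) VALUES (?, ?)"
    for t, n in zip(title, readnum):
        cur.execute(sql, (t, n))
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DB_Info (article_title TEXT PRIMARY KEY, article_readnum TEXT)"
)

# A title with an apostrophe is stored as-is because it is never spliced
# into the SQL string.
save_rows(conn, ["It's a post", "Other"], ["10", "3"])
# Same primary key again: the existing row is replaced, not duplicated.
save_rows(conn, ["It's a post"], ["11"])

rows = conn.execute(
    "SELECT article_title, article_readnum FROM DB_Info ORDER BY article_title"
).fetchall()
```

Re-running the crawler therefore refreshes the view counts in place, as long as the title column is the table's primary key.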

0.0.3. Reading the data

In app.py, the data is read back with the SQLAlchemy library.

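  • Configuration:
    from flask import Flask, render_template
    from flask_sqlalchemy import SQLAlchemy

    app = Flask(__name__)
    app.config['SQLALCHEMY_DATABASE_URI'] = 'mysql://root:***@localhost:3306/MyBlogDB?charset=utf8'
    app.config['SQLALCHEMY_TRACK_MODIFICATIONS'] = False
    db = SQLAlchemy(app)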
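  • DBInfo class:
    class DBInfo(db.Model):
        # Name the table explicitly; Flask-SQLAlchemy otherwise derives
        # a lowercase name from the class.
        __tablename__ = 'DB_Info'
        article_title = db.Column(db.String(50), primary_key=True)  # primary key
        article_readnum = db.Column(db.String(20))

        def __repr__(self):
            return '<DBInfo %r>' % self.article_title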

0.0.4. Web page

  • test:
    @app.route('/test')
    def test():
        # Load every row of DB_Info and pass the list to the template.
        dbinfo = DBInfo.query.all()
        return render_template('test.html', dbinfo=dbinfo)

The HTML template uses Bootstrap.

    {% block content %}
    <div class="container">
      <table class="table table-striped table-bordered table-hover table-condensed">
        <tbody>
          {% for tags in dbinfo %}
          <tr>
            <td>{{ tags.article_title }}</td>
            <td>{{ tags.article_readnum }}</td>
          </tr>
          {% endfor %}
        </tbody>
      </table>
    </div>
    {% endblock %}
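The route and template can be exercised without a database by feeding the same loop a few stand-in rows through Flask's test client. This is a sketch rather than the post's actual app: `Row`, `TEMPLATE`, `show_table`, `fetch_test_page`, and the sample data are all hypothetical, and the template is an inline, trimmed copy of the table loop above.

```python
from collections import namedtuple
from flask import Flask, render_template_string

app = Flask(__name__)

# Stand-in for a DBInfo row: only the two attributes the template reads.
Row = namedtuple("Row", ["article_title", "article_readnum"])

# Inline, trimmed copy of the loop in test.html (no Bootstrap classes).
TEMPLATE = """
<table>
  <tbody>
    {% for tags in dbinfo %}
    <tr>
      <td>{{ tags.article_title }}</td>
      <td>{{ tags.article_readnum }}</td>
    </tr>
    {% endfor %}
  </tbody>
</table>
"""

@app.route("/test")
def show_table():
    # Hard-coded rows replace DBInfo.query.all() for this dry run.
    dbinfo = [Row("First post", "12"), Row("A <b>bold</b> title", "7")]
    return render_template_string(TEMPLATE, dbinfo=dbinfo)

def fetch_test_page():
    """Request /test through the test client; no server is started."""
    with app.test_client() as client:
        return client.get("/test")

response = fetch_test_page()
```

Flask autoescapes values in string templates and in `.html` templates, so markup inside a title is shown as text rather than interpreted by the browser.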

0.0.5. Results

© 2024 何决云