Develop a NLP Model in Python & Deploy It with Flask, Step by Step
Flask API, Document Classification, Spam Filter
So far, we have developed many machine learning models, generated
numeric predictions on the testing data, and tested the results, all
offline. In reality, generating predictions is only part of a machine
learning project, although in my opinion it is the most important
part.
Consider a system that uses machine learning to detect spam SMS
text messages. Our ML system’s workflow looks like this: Train offline ->
Make model available as a service -> Predict online.
• A classifier is trained offline with spam and non-spam messages.
• The trained model is deployed as a service to serve users.
Susan Li · Dec 17, 2018 · 6 min read
When we develop a machine learning model, we need to think about
how to deploy it, that is, how to make the model available to other
users.
Kaggle and data science bootcamps are great for learning how to
build and optimize models, but they don’t teach engineers how to
take them to the next step. There is a major difference between
building a model and actually getting it ready for people to use in
their products and services.
In this article, we will focus on both: building a machine learning
model for spam SMS message classification, then creating an API for
the model using Flask, the Python micro framework for building web
applications. This API allows us to use the model’s predictive
capabilities through HTTP requests. Let’s get started!
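As a rough sketch of what serving predictions over HTTP can look like, the snippet below exposes a single JSON endpoint with Flask. The route name `/predict`, the JSON field `message`, and the keyword-based `predict_label` stand-in are assumptions made for illustration; in a real application the stand-in would be replaced by the trained classifier.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical stand-in for a trained classifier, used only to keep the
# sketch self-contained: 1 means spam, 0 means ham.
SPAM_HINTS = {"free", "prize", "winner"}


def predict_label(message):
    words = set(message.lower().split())
    return 1 if words & SPAM_HINTS else 0


@app.route("/predict", methods=["POST"])
def predict():
    # Expect a JSON body such as {"message": "some SMS text"}.
    payload = request.get_json(silent=True) or {}
    message = payload.get("message", "")
    if not message:
        return jsonify(error="field 'message' is required"), 400
    return jsonify(message=message, prediction=predict_label(message))


# Exercise the endpoint in-process with Flask's test client (no server needed).
client = app.test_client()
response = client.post("/predict", json={"message": "Claim your free prize now"})
result = response.get_json()
print(result)
```

The test client keeps the example runnable without opening a port; when the app is served for real, any HTTP client that can POST JSON should be able to call the same route.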
ML Model Building
The data is a collection of SMS messages tagged as spam or ham that
can be found here. First, we will use this dataset to build a prediction
model that will accurately classify which texts are spam.
Naive Bayes classifiers are a popular statistical technique for e-mail
filtering. They typically use bag-of-words features to identify spam
e-mail. Therefore, we’ll build a simple Naive Bayes message
classifier.
Figure 1
Not only is the Naive Bayes classifier easy to implement, it also
provides very good results.
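To make the bag-of-words idea concrete, here is a small from-scratch sketch of a multinomial Naive Bayes classifier with add-one (Laplace) smoothing. It illustrates the technique only and is not how scikit-learn implements `MultinomialNB`; the four training messages and the helper names (`tokenize`, `train`, `predict`) are made up for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Bag of words: lowercase, split on whitespace, ignore word order.
    return text.lower().split()


def train(messages, labels):
    word_counts = {}  # label -> Counter of word frequencies in that class
    class_counts = Counter(labels)
    vocabulary = set()
    for text, label in zip(messages, labels):
        tokens = tokenize(text)
        word_counts.setdefault(label, Counter()).update(tokens)
        vocabulary.update(tokens)
    return word_counts, class_counts, vocabulary


def predict(text, model):
    word_counts, class_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label, doc_count in class_counts.items():
        # Log prior: share of training messages with this label.
        score = math.log(doc_count / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocabulary:
                continue  # words never seen in training are skipped
            # Add-one smoothing keeps unseen (word, class) pairs above zero.
            likelihood = (word_counts[label][token] + 1) / (total_words + len(vocabulary))
            score += math.log(likelihood)
        scores[label] = score
    return max(scores, key=scores.get)


messages = [
    "win a free prize now",
    "free entry to win cash",
    "are we still meeting for lunch",
    "call me when you get home",
]
labels = ["spam", "spam", "ham", "ham"]

model = train(messages, labels)
print(predict("free cash prize", model))
print(predict("lunch at home", model))
```

Working in log space avoids multiplying many small probabilities together, which would otherwise underflow on longer messages.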
After training the model, it is desirable to have a way to persist the
model for future use without having to retrain. To achieve this, we
add the following lines to save our model as a .pkl file for later
use.
import joblib  # standalone package; sklearn.externals.joblib was removed in scikit-learn 0.23
joblib.dump(clf, 'NB_spam_model.pkl')
And we can load and use the saved model later like so:
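A minimal sketch of the save-and-load round trip, assuming `joblib` is installed as a standalone package. The toy training messages, the temporary directory, and the `vectorizer.pkl` file name are placeholders for illustration; only `NB_spam_model.pkl` comes from the text above. Saving the fitted `CountVectorizer` next to the classifier is one possible design choice, so that new text is transformed with the same vocabulary the model was trained on.

```python
import os
import tempfile

import joblib
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Toy data, made up for the sketch (1 = spam, 0 = ham).
texts = [
    "win a free prize now",
    "free entry to win cash",
    "are we still meeting for lunch",
    "call me when you get home",
]
labels = [1, 1, 0, 0]

cv = CountVectorizer()
clf = MultinomialNB().fit(cv.fit_transform(texts), labels)

# Save into a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "NB_spam_model.pkl")
    vectorizer_path = os.path.join(tmp_dir, "vectorizer.pkl")  # hypothetical name
    joblib.dump(clf, model_path)
    joblib.dump(cv, vectorizer_path)

    # Later, typically in another process: load both objects back.
    loaded_clf = joblib.load(model_path)
    loaded_cv = joblib.load(vectorizer_path)

# The loaded objects behave like the originals.
new_messages = ["free cash prize", "lunch at home"]
predictions = loaded_clf.predict(loaded_cv.transform(new_messages))
print(predictions)
```

Files written by `joblib.dump` are pickles, so they should only be loaded from sources you trust, and ideally with the same scikit-learn version that produced them.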
import pandas as pd
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import classification_report

df = pd.read_csv('spam.csv', encoding="latin-1")
# Drop the unused trailing columns
df.drop(['Unnamed: 2', 'Unnamed: 3', 'Unnamed: 4'], axis=1, inplace=True)

# Features and labels: ham -> 0, spam -> 1
df['label'] = df['class'].map({'ham': 0, 'spam': 1})
X = df['message']
y = df['label']

# Extract bag-of-words count features
cv = CountVectorizer()
X = cv.fit_transform(X) # Fit the Data
NB_spam.py
Figure 2
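The NB_spam.py listing above stops after the vectorizer is fitted. Judging by its imports (`train_test_split`, `MultinomialNB`, `classification_report`), the remaining steps are presumably a train/test split, fitting the classifier, and reporting metrics. The sketch below is a guess at that shape, not the author's code: the inline messages stand in for `spam.csv`, and the `test_size`, `random_state`, and `stratify` settings are assumptions.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Made-up messages standing in for the 'message' and 'label' columns.
messages = [
    "win a free prize now", "free entry to win cash",
    "claim your free cash prize", "urgent you win a free phone",
    "free ringtone text now to claim", "cash prize waiting claim now",
    "are we still meeting for lunch", "call me when you get home",
    "see you at the office tomorrow", "can you pick up the kids",
    "thanks for dinner last night", "running late see you soon",
]
labels = [1] * 6 + [0] * 6  # 1 = spam, 0 = ham

cv = CountVectorizer()
X = cv.fit_transform(messages)  # fit the vocabulary and build count features

# Hold out 4 of the 12 messages, keeping the spam/ham ratio in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=4, random_state=42, stratify=labels
)

clf = MultinomialNB()
clf.fit(X_train, y_train)

# Precision, recall, and F1 per class on the held-out messages.
y_pred = clf.predict(X_test)
report = classification_report(y_test, y_pred, zero_division=0)
print(report)
```

With so few messages the reported scores mean little; the point is only the order of the steps.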