Backing up and reusing sklearn models

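A trained sklearn model can be saved with pickle, Python's built-in serialization module (it is not part of sklearn itself), and loaded again later for reuse.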

from sklearn import svm
from sklearn import datasets
clf = svm.SVC()
iris = datasets.load_iris()
X, y = iris.data, iris.target
clf.fit(X, y)

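import pickle
s = pickle.dumps(clf)     # serialize the fitted model in memory
clf2 = pickle.loads(s)    # rebuild an equivalent model from the serialized data
clf2.predict(X[0:1])      # predict the class of the first sample

y[0]                      # true label of the first sample, for comparison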

The pickle.dump and pickle.load functions described below can also save the model to a file and read it back.

See the pickle documentation for details.

https://docs.python.org/2/library/pickle.html

pickle.dump(obj, file[, protocol])

Write a pickled representation of obj to the open file object file. This is equivalent to Pickler(file, protocol).dump(obj).

If the protocol parameter is omitted, protocol 0 is used. If protocol is specified as a negative value or HIGHEST_PROTOCOL, the highest protocol version will be used.

Changed in version 2.3: Introduced the protocol parameter.

file must have a write() method that accepts a single string argument. It can thus be a file object opened for writing, a StringIO object, or any other custom object that meets this interface.
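The protocol rules quoted above can be tried out with a minimal sketch. It assumes Python 3, where omitting the protocol selects a newer binary protocol rather than protocol 0 as in the Python 2 documentation quoted here. The `params` dictionary is a hypothetical stand-in for any picklable object, such as a fitted model.

```python
import pickle

# Hypothetical stand-in for any picklable object, such as a fitted model.
params = {"kernel": "rbf", "C": 1.0, "classes": [0, 1, 2]}

default_bytes = pickle.dumps(params)               # protocol omitted
text_bytes = pickle.dumps(params, protocol=0)      # oldest, ASCII-based protocol
highest_bytes = pickle.dumps(params, protocol=pickle.HIGHEST_PROTOCOL)
negative_bytes = pickle.dumps(params, protocol=-1)  # negative selects the highest

# Every protocol round-trips to an object equal to the original.
round_trips = [
    pickle.loads(blob) == params
    for blob in (default_bytes, text_bytes, highest_bytes, negative_bytes)
]
print(all(round_trips))

# A negative protocol is another way of asking for HIGHEST_PROTOCOL.
print(negative_bytes == highest_bytes)
```

pickle.loads needs no protocol argument, because the protocol is detected from the data itself.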

pickle.load(file)

Read a string from the open file object file and interpret it as a pickle data stream, reconstructing and returning the original object hierarchy. This is equivalent to Unpickler(file).load().

file must have two methods, a read() method that takes an integer argument, and a readline() method that requires no arguments. Both methods should return a string. Thus file can be a file object opened for reading, a StringIO object, or any other custom object that meets this interface.

This function automatically determines whether the data stream was written in binary mode or not.
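The pickle.dump / pickle.load pair described above can be sketched for the SVC model from the first example. This assumes Python 3, where the file has to be opened in binary mode ("wb" / "rb"). The file name `svc_iris.pkl` and the temporary directory are hypothetical choices for illustration.

```python
import os
import pickle
import tempfile

from sklearn import datasets, svm

# Train the same kind of classifier as in the first example.
iris = datasets.load_iris()
X, y = iris.data, iris.target
clf = svm.SVC()
clf.fit(X, y)

# Hypothetical file location; a temporary directory keeps the sketch tidy.
model_path = os.path.join(tempfile.mkdtemp(), "svc_iris.pkl")

# Write the pickled model to disk (binary mode is required in Python 3).
with open(model_path, "wb") as f:
    pickle.dump(clf, f)

# Read it back, possibly in another Python process.
with open(model_path, "rb") as f:
    restored = pickle.load(f)

# The restored model should predict exactly what the original predicts.
same = bool((restored.predict(X) == clf.predict(X)).all())
print(same)
```

Only load pickle files from a trusted source, since unpickling can execute arbitrary code.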

pickle.dumps(obj[, protocol])

Return the pickled representation of the object as a string, instead of writing it to a file.

To save a model to a file, joblib can be used as shown below.

In the specific case of the scikit, it may be more interesting to use joblib’s replacement of pickle (joblib.dump & joblib.load), which is more efficient on big data, but can only pickle to the disk and not to a string:

>>> import joblib  # sklearn.externals.joblib was removed in scikit-learn 0.23
>>> joblib.dump(clf, 'filename.pkl')

Later you can load back the pickled model (possibly in another Python process) with:

>>> clf = joblib.load('filename.pkl')
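The joblib round trip above can be checked end to end with a short sketch. It assumes the standalone joblib package is importable, since recent scikit-learn releases no longer ship sklearn.externals.joblib. The file name `svc_iris.joblib` and the temporary directory are hypothetical.

```python
import os
import tempfile

import joblib
from sklearn import datasets, svm

# Train the same kind of classifier as in the first example.
iris = datasets.load_iris()
X, y = iris.data, iris.target
clf = svm.SVC().fit(X, y)

# Hypothetical file location inside a temporary directory.
model_path = os.path.join(tempfile.mkdtemp(), "svc_iris.joblib")

# joblib.dump writes the model to disk and returns the file names it created.
written = joblib.dump(clf, model_path)

# Load the model back, possibly in another Python process.
restored = joblib.load(model_path)

# Compare predictions of the original and the restored model.
matches = bool((restored.predict(X) == clf.predict(X)).all())
print(matches)
```

As with pickle, joblib files should only be loaded from a trusted source.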

See the scikit-learn site below for details.
http://scikit-learn.org/stable/tutorial/basic/tutorial.html#machine-learning-the-problem-setting
