Data splitting using sklearn

Data splitting is the technique of dividing your data into a training set and a testing set. The training data is used to teach, or train, the model, while the test data, as the name implies, is used to measure how accurate the trained model is on data it has not seen. This kind of splitting is mostly used in supervised learning, where each sample has a label attached to it.

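```python
import numpy as np
from sklearn.model_selection import train_test_split

# Sample data: the integers 1 through 99
a = np.arange(1, 100)

# With default settings, the data is shuffled and 25% is held out for testing
a_train, a_test = train_test_split(a)
```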
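Since splitting is mostly used with labeled data, here is a minimal sketch of splitting features and labels together. The arrays `X` and `y` are made-up toy data for illustration only; in practice they would be your own features and labels.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 10 samples with 2 features each, one label per sample
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# Passing X and y in one call splits both with the same shuffled indices,
# so every row of features stays paired with its own label
X_train, X_test, y_train, y_test = train_test_split(X, y)

print(X_train.shape, y_train.shape)
print(X_test.shape, y_test.shape)
```

The outputs come back in the order train, test for each input array, which is why the unpacking is `X_train, X_test, y_train, y_test`.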

Some other options include:

  • test_size: When given as a float, it must be between 0 and 1 and sets the proportion of the data reserved for testing. If it is not set, 0.25 is used by default.

  • shuffle: Controls whether the data is shuffled before splitting. It is True by default; set it to False to keep the data in its original order.

  • random_state: Fixes the seed used for shuffling, so the split stays the same no matter how many times you split the data.
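The options above can be sketched as follows. The values `0.2` and `42` are arbitrary choices for illustration, not recommended settings.

```python
import numpy as np
from sklearn.model_selection import train_test_split

a = np.arange(1, 100)

# test_size=0.2 holds out roughly 20% of the data for testing;
# random_state=42 (an arbitrary seed) makes the shuffle repeatable
a_train, a_test = train_test_split(a, test_size=0.2, random_state=42)
b_train, b_test = train_test_split(a, test_size=0.2, random_state=42)

# Same seed, same split
print(np.array_equal(a_train, b_train), np.array_equal(a_test, b_test))

# shuffle=False keeps the original order: the first part of the data
# becomes the training set and the last part becomes the test set
c_train, c_test = train_test_split(a, test_size=0.2, shuffle=False)
print(c_train[:5], c_test[-5:])
```

Turning shuffling off can matter when the order carries meaning, for example with time-ordered data, where you usually want to test on the most recent samples.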

