Fórum Root.cz
Main topics => Development => Topic started by: Wangarad, 27. 12. 2020, 16:48:07
-
Hello.
I'm trying to get to grips with ML and prediction, and I'm trying the following.
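import os

from matplotlib import pyplot
from pandas import read_csv
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Load the dataset using pandas (data.csv sits next to this script)
output_path = os.path.dirname(__file__)
csv = os.path.join(output_path, 'data.csv')
names = ['Open', 'Closing']
#names = ['Open']
dataset = read_csv(csv, names=names, header=0)
dataset = dataset.astype(float)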
# box and whisker plots
# Split-out validation dataset
array = dataset.values
print (array)
X = array[:,0]
y = array[:,1]
X_train, X_validation, Y_train, Y_validation = train_test_split(X, y, test_size=0.20, random_state=1)
print ('After XY separation')
print (array)
pyplot.plot(array)
pyplot.ylabel('some numbers')
pyplot.show()
# Spot Check Algorithms
models = []
models.append(('LR', LogisticRegression(solver='liblinear', multi_class='ovr')))
models.append(('LDA', LinearDiscriminantAnalysis()))
models.append(('KNN', KNeighborsClassifier()))
models.append(('CART', DecisionTreeClassifier()))
models.append(('NB', GaussianNB()))
models.append(('SVM', SVC(gamma='auto')))
# evaluate each model in turn
results = []
names = []
for name, model in models:
    kfold = StratifiedKFold(n_splits=10, random_state=1, shuffle=True)
    cv_results = cross_val_score(model, X_train, Y_train, cv=kfold, scoring='accuracy')
    results.append(cv_results)
    names.append(name)
    print('%s: %f (%f)' % (name, cv_results.mean(), cv_results.std()))
# Compare Algorithms
pyplot.boxplot(results, labels=names)
pyplot.title('Algorithm Comparison')
pyplot.show()
The source data looks roughly like this:
Closing Open
870.13125 910.2475
905.45625 870.13125
900.36625 905.45625
948.89875 900.36625
971.645 948.89875
954.84375 971.645
952.455 954.84375
964.325 952.455
1009.97375 964.325
1028.33375 1009.97375
1047.09999 1028.33375
1140.385 1047.09999
985.93875 1140.385
837.83625 985.93875
923.52375 837.83625
The plot looks OK too, but the script dies with this error:
File "C:\Program Files\Python36\lib\site-packages\sklearn\model_selection\_split.py", line 589, in _make_test_folds
allowed_target_types, type_of_target_y))
ValueError: Supported target types are: ('binary', 'multiclass'). Got 'continuous' instead.
Does anyone know why?
Originally I also wanted to use the Date column, but I could not get it to load no matter what I tried.
I have a feeling I got quite tangled up with that array.
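On the Date column and the array slicing: one option is to let pandas parse the dates and then select columns by name instead of by position. The sketch below is only an illustration. The real layout of data.csv is not shown in the thread, so the comma separator, the Date column name and the dates themselves are assumptions; the prices are the first rows quoted above. Note also that the sample lists Closing before Open, while names = ['Open','Closing'] would label the first column Open, so selecting by the file's own header may avoid a mix-up.

```python
import io

import pandas as pd

# Hypothetical file content: separator, "Date" column and dates are assumed,
# only the prices come from the post above.
csv_text = """Date,Closing,Open
2020-01-01,870.13125,910.2475
2020-01-02,905.45625,870.13125
2020-01-03,900.36625,905.45625
"""

# parse_dates converts the column to datetimes, index_col makes it the index.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"], index_col="Date")

# Select by column name: scikit-learn expects a 2-D feature matrix
# and a 1-D target vector.
X = df[["Open"]].to_numpy()   # shape (n_samples, 1)
y = df["Closing"].to_numpy()  # shape (n_samples,)

print(df.index.dtype, X.shape, y.shape)
```

With a real file, io.StringIO(csv_text) would be replaced by the path to data.csv.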
-
So it seems I have solved it by adding, or rather modifying, this:
lab_enc = preprocessing.LabelEncoder()
X_train = lab_enc.fit_transform(X_train)
Y_train = lab_enc.fit_transform(Y_train)
Y_train= Y_train.reshape(-1, 1)
X_train= X_train.reshape(-1, 1)
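A word of caution on this workaround, as a small sketch using three of the prices quoted earlier: LabelEncoder gives every distinct float its own integer class. The first error goes away, but each price becomes a separate category and the actual price values are no longer what the model sees.

```python
from sklearn.preprocessing import LabelEncoder

# Three closing prices taken from the data sample in the first post.
prices = [870.13125, 905.45625, 900.36625]

encoder = LabelEncoder()
encoded = encoder.fit_transform(prices)

# Each distinct value is mapped to its position among the sorted unique values.
print(encoder.classes_)
print(encoded)
```

With mostly unique prices this yields roughly one class per row, which is rarely what a price prediction needs.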
-
But even so, the whole thing dies at
print(accuracy_score(Y_validation, predictions.round(), normalize=False))
File "C:\Program Files\Python36\lib\site-packages\sklearn\metrics\classification.py", line 176, in accuracy_score
y_type, y_true, y_pred = _check_targets(y_true, y_pred)
File "C:\Program Files\Python36\lib\site-packages\sklearn\metrics\classification.py", line 81, in _check_targets
"and {1} targets".format(type_true, type_pred))
ValueError: Classification metrics can't handle a mix of continuous and multiclass targets
:-\
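Both error messages in this thread come from the same check: scikit-learn inspects the target values and refuses to treat floats with fractional parts as class labels. A minimal way to see what it sees, assuming the validation targets are raw prices and the predictions are rounded numbers (the prediction values below are made up for illustration):

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

# Raw prices from the first post stand in for Y_validation.
y_validation = np.array([870.13125, 905.45625, 900.36625])
# Hypothetical model outputs, rounded as in the failing accuracy_score call.
predictions = np.array([880.2, 899.7, 910.4])

true_type = type_of_target(y_validation)
pred_type = type_of_target(predictions.round())

print(true_type, pred_type)
```

One side is reported as continuous and the other as multiclass, which is the mix that accuracy_score rejects. StratifiedKFold applies the same kind of check to its target.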
-
https://stackoverflow.com/questions/37367405/python-scikit-learn-cant-handle-mix-of-multiclass-and-continuous
-
I tried that, or rather looked through it, but I could not work out what is going on.
-
You need to be clear about whether you are training a classifier or a regressor. You are using the accuracy metric, which only works for classification tasks. At the same time you are using models that do regression, not classification. Whether the model should be a classifier or a regressor depends on the type of task, and you have to decide that yourself; nobody will do it for you. These are the absolute basics, so I recommend starting your study here, for example: https://machinelearningmastery.com/classification-versus-regression-in-machine-learning/
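If the goal is to predict a price, the task is regression. In the script as posted at the top of the thread, the estimators are all classifiers and the fold splitter is stratified, while the target column is continuous. A rough sketch of the regression counterpart is below. It is not the original poster's final code: the choice of Open as the feature and Closing as the target, the three regressors, n_splits=3, n_neighbors=3 and the MAE scoring are all assumptions made for illustration, and the data is just the 15 rows quoted in the first post.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

# The 15 rows quoted in the first post (Closing, Open).
closing = np.array([870.13125, 905.45625, 900.36625, 948.89875, 971.645,
                    954.84375, 952.455, 964.325, 1009.97375, 1028.33375,
                    1047.09999, 1140.385, 985.93875, 837.83625, 923.52375])
open_ = np.array([910.2475, 870.13125, 905.45625, 900.36625, 948.89875,
                  971.645, 954.84375, 952.455, 964.325, 1009.97375,
                  1028.33375, 1047.09999, 1140.385, 985.93875, 837.83625])

X = open_.reshape(-1, 1)  # features must be 2-D
y = closing               # continuous target stays 1-D and unencoded

X_train, X_validation, Y_train, Y_validation = train_test_split(
    X, y, test_size=0.20, random_state=1)

# Regressors instead of classifiers.
models = [
    ('LR', LinearRegression()),
    ('KNN', KNeighborsRegressor(n_neighbors=3)),
    ('CART', DecisionTreeRegressor(random_state=1)),
]

# Plain KFold: stratification needs discrete classes, which prices are not.
kfold = KFold(n_splits=3, shuffle=True, random_state=1)

results = {}
for name, model in models:
    # Regression scoring; scikit-learn reports MAE negated so higher is better.
    scores = cross_val_score(model, X_train, Y_train, cv=kfold,
                             scoring='neg_mean_absolute_error')
    results[name] = scores
    print('%s: MAE %f (%f)' % (name, -scores.mean(), scores.std()))
```

With so few rows the scores say little about real predictive quality; the point is only that the splitter, the estimators and the metric all match a continuous target.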
-
linuxak, thank you.
I followed this tutorial and tried to bend it towards predicting numbers.
https://shortl.online/o7tZI
But I still don't know where I made the mistake when I used my own data.
From the link I understood that classification is basically yes/no, i.e. whether something is valid or not.
Regression is more or less the prediction of the next number.
ValueError: Classification metrics can't handle a mix of continuous and multiclass targets
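That reading of the two terms can be made concrete with a small sketch. Predicting the price itself is regression and is scored with an error measure such as mean absolute error; accuracy only applies once the target is a discrete label. The up/down label below (whether Closing ends above Open) is a hypothetical example of such a label, not something from the tutorial, and the predicted prices are made up.

```python
import numpy as np
from sklearn.metrics import accuracy_score, mean_absolute_error

# First three rows of the data sample from the first post.
closing_true = np.array([870.13125, 905.45625, 900.36625])
open_ = np.array([910.2475, 870.13125, 905.45625])
# Hypothetical predicted closing prices.
closing_pred = np.array([880.0, 900.0, 910.0])

# Regression view: how far off are the predicted numbers on average?
mae = mean_absolute_error(closing_true, closing_pred)

# Classification view: turn prices into a yes/no label first
# (1 if the close is above the open, else 0), then accuracy is valid.
true_up = (closing_true > open_).astype(int)
pred_up = (closing_pred > open_).astype(int)
acc = accuracy_score(true_up, pred_up)

print('MAE:', mae)
print('accuracy:', acc)
```

Passing closing_true and closing_pred.round() straight to accuracy_score mixes a continuous target with rounded labels, which matches the situation described by the ValueError quoted above.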