All Questions

0 votes
1 answer
11 views

min_samples_leaf in GradientBoostedClassifier() having weird behavior

I'm trying to tune min_samples_leaf in a GradientBoostedClassifier(). I'm seeing expected results with the bias/variance tradeoff. However, just to test the boundaries, I made the min_samples_leaf > ...
deutan_suave
-1 votes
0 answers
12 views

I keep encountering this problem training a Random forest regressor

/usr/local/lib/python3.10/dist-packages/sklearn/base.py:432: UserWarning: X has feature names, but RandomForestRegressor was fitted without feature names warnings.warn( I have tried adding .values ...
Emmanuel Owusu
1 vote
1 answer
29 views

Leave one out encoding on test set with transform

Context: When preprocessing a data set using sklearn, you use fit_transform on the training set and transform on the test set, to avoid data leakage. Using leave one out (LOO) encoding, you need the ...
Jelle • 251
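
The fit_transform/transform split described in this question can be sketched in plain Python. This is a minimal illustration, not the API of any encoding library: the class name `ToyLooEncoder` and the global-mean fallback for unseen or single-row categories are assumptions made for the example. Training rows are encoded with the category mean computed without the row's own target, while test rows, which have no target, get the full training mean for their category.

```python
from collections import defaultdict


class ToyLooEncoder:
    """Hypothetical leave-one-out target encoder for one categorical column."""

    def fit(self, categories, targets):
        # Accumulate the per-category sum and count of the training target.
        self.sums_ = defaultdict(float)
        self.counts_ = defaultdict(int)
        for cat, y in zip(categories, targets):
            self.sums_[cat] += y
            self.counts_[cat] += 1
        # Assumed fallback for unseen categories and single-row categories.
        self.global_mean_ = sum(targets) / len(targets)
        return self

    def fit_transform(self, categories, targets):
        # Training rows: leave the row's own target out of its category mean.
        self.fit(categories, targets)
        encoded = []
        for cat, y in zip(categories, targets):
            n = self.counts_[cat]
            if n > 1:
                encoded.append((self.sums_[cat] - y) / (n - 1))
            else:
                encoded.append(self.global_mean_)
        return encoded

    def transform(self, categories):
        # Test rows: no target to leave out, so use the full training mean.
        return [
            self.sums_[cat] / self.counts_[cat]
            if cat in self.counts_
            else self.global_mean_
            for cat in categories
        ]


train_cats = ["a", "a", "b", "b", "b"]
train_y = [1, 0, 1, 1, 0]

enc = ToyLooEncoder()
train_encoded = enc.fit_transform(train_cats, train_y)
test_encoded = enc.transform(["a", "b", "c"])  # "c" never appears in training
```

Because `transform` never sees a target, the test-set encoding depends only on statistics learned from the training set, which is the property that avoids leakage.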
1 vote
1 answer
19 views

MultiLabelBinarizer: inverse_transform. How can I get a list of labels sorted according to their probability?

I am doing a multi-label classification and I use MultiLabelBinarizer to translate the list of labels into zeros and ones. I could get the labels using inverse_transform, which is super. However, in a ...
sveer • 462
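
One possible way to rank labels by probability, sketched in plain Python. It assumes the per-label scores for a sample arrive as a sequence in the same column order as the binarizer's `classes_` attribute; the helper name `labels_by_probability`, the sample labels, and the scores are all hypothetical.

```python
def labels_by_probability(classes, probabilities, threshold=0.0):
    """Return (label, probability) pairs sorted from most to least likely.

    `classes` stands in for MultiLabelBinarizer.classes_ and `probabilities`
    for one sample's per-label scores in the same column order (assumption).
    """
    pairs = [
        (label, p)
        for label, p in zip(classes, probabilities)
        if p >= threshold
    ]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


classes = ["action", "comedy", "drama"]  # hypothetical label set
probs = [0.2, 0.9, 0.6]                  # hypothetical scores for one sample

ranked = labels_by_probability(classes, probs, threshold=0.5)
```

Unlike `inverse_transform`, which only reports which labels are switched on, this keeps the score next to each label so the ordering is explicit.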
0 votes
0 answers
12 views

Is it possible to integrate feature extraction within Core ML model/pipeline?

I have a Core ML model (.mlmodel) which I can then use in Swift code via the Core ML framework to make predictions. However, the feature extraction is also happening in this code. I currently use ...
mirkap • 7
-3 votes
0 answers
35 views

How to improve Random Forest accuracy? [closed]

I am working on a basic random forest classifier but I currently have extremely low results and I don't know why. I need another set of eyes to see if I have an obvious problem. My dataset is a list ...
Abby • 9
0 votes
0 answers
26 views

Is it possible to limit the scikit learn model to only predict certain tags?

I have two models trained on a number of tags and use them for predicting the genre of a game. I noticed that, due to how the models were trained, sometimes the same input data can have the two models ...
linkey apiacess
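
A rough sketch of one way to restrict a prediction to a subset of tags: take the per-class probabilities for a sample and pick the most probable class among the allowed ones. The function name `predict_within`, the tag names, and the scores are hypothetical; `classes` is assumed to match the column order of the probabilities (as a classifier's `classes_` does for its `predict_proba` output).

```python
def predict_within(classes, probabilities, allowed):
    """Return the most probable label among `allowed` for one sample."""
    candidates = [
        (p, label)
        for label, p in zip(classes, probabilities)
        if label in allowed
    ]
    if not candidates:
        raise ValueError("none of the allowed tags are known to the model")
    # Compare on probability only; ties resolve to the first candidate.
    return max(candidates, key=lambda c: c[0])[1]


classes = ["puzzle", "rpg", "shooter"]  # hypothetical tags
probs = [0.5, 0.3, 0.2]                 # hypothetical scores for one sample

best = predict_within(classes, probs, allowed={"rpg", "shooter"})
```

Applying the same allowed set to both models' outputs is one way to keep their predictions within a shared vocabulary; whether that fits the setup in this question is not clear from the excerpt.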
0 votes
0 answers
29 views

ValueError when loading sklearn DecisionTreeClassifier pickle in Python 3.10

I'm encountering an issue while transitioning from Python 3.7.3 to Python 3.10 due to the deprecation of the older version. The problem arises when attempting to load a pickled sklearn ...
user3369545
-2 votes
0 answers
37 views

Using an array as a feature/independent variable? [closed]

Let's say I have a dataframe of: subject id (int), age (int), sex (int: 1 for male, 2 for female), and Pearson CC representing a functional network (matrix). This means I have an array of arrays for the ...
Cal • 7
-1 votes
0 answers
20 views

Very high MAE and MSE on my RandomForestRegressor

I got a flight predictions dataset that I wanted to try my machine learning skills on. I cleaned the data, fixed some new features, and removed others; I also got out some valuable data. But when I ...
Jaldu • 1
-2 votes
0 answers
25 views

sklearn and imputer library issues [duplicate]

ImportError: cannot import name 'Imputer' from 'sklearn.preprocessing' (C:\Users\user\anaconda3\lib\site-packages\sklearn\preprocessing\__init__.py) I'm getting this. I'm tired of these errors I've ...
Kyalo Josephine
0 votes
0 answers
34 views

XGBoost Classifier, Grid search

I am trying to apply an XGBoost Classifier model in Python via XGBClassifier (see documentation here). The idea is to train the model on a city-year (e.g. zurich-2000) couple and test it on data for ...
Lusian • 661
0 votes
0 answers
17 views

Store/load Scikit Learn objects (Pipeline, ColumnTransformer) without different version considerations?

I need to be able to somehow store and reload a Scikit Learn pipeline in a way that breaking changes between different Scikit Learn versions aren't as much of a concern as they are when I use e.g. ...
Hendrik Wiese
-1 votes
1 answer
38 views

K-Means taking a long time to run

I have a dataset that has 3 million records with 15 columns that I'm using for customer segmentation. I've used KMeans and MiniBatchKMeans but it's running even after 45 hours (did not run them ...
Ayushman Mishra
1 vote
0 answers
34 views

When running via command line, receive error "no attribute predict_proba"

I have some code that works fine when I run it in a Python interpreter (3.8.4). However, when I try to run it via the command line, I end up receiving an error: AttributeError: This '...
linkey apiacess