Questions tagged [imbalanced-data]

-2 votes
0 answers
10 views

Solving the imbalanced numbers of DICOM data in medical MRI datasets

There is a dataset of brain MRI scans in DICOM format that we want to use for training a model. Every ID has 3 folders, each folder contains between 16 and 20 files, and my problem is how I can balance ...
Omid Taji
0 votes
0 answers
12 views

ROSE package in R not reading variable correctly; does not read updated value contained in variable

I'm hoping to receive some help here as I've struggled for a while now and I cannot figure out the problem. I am using the ROSE package in R, attempting to make use of the function for random over/...
studentinneed
0 votes
0 answers
5 views

Imbalanced Dataset Correlation in Machine Learning

If there is an imbalanced dataset, I cannot figure out the correlation or dependency of the target column on different features. How can I check that? I am using countplot but with that, I cannot ...
Maisara Waseem
0 votes
0 answers
167 views

Managing problems of class imbalance in machine learning models using spatial data in R

I am trying to simultaneously perform feature selection and hyperparameter tuning on stacked learners (glmnet and rpart). However, I am encountering the following error message with the classif.glmnet ...
Marine Régis
0 votes
0 answers
12 views

I faced an error when I used PCA with an LSTM model

I have a time series dataset with 20 classes, but they are imbalanced. When I tried a method like "RandomOverSampler", I got an error because our data is 3D, so could you suggest a ...
Zineb Adaika
0 votes
0 answers
15 views

Weighted F1-score

I'm training and validating models for a binary classification problem in a dataset that has great class imbalance. When searching for metrics for evaluating the performance of the models, I found ...
Juan Segundo Peña Loray
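As a rough illustration of the metric named in the title, the sketch below computes a support-weighted average of per-class F1 scores, which is the definition scikit-learn uses for `average="weighted"`. The function name `weighted_f1` and the toy labels are hypothetical and are not taken from the question.

```python
from collections import Counter

def weighted_f1(y_true, y_pred):
    """Average the per-class F1 scores, weighting each class by its support."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    score = 0.0
    for cls, n_true in support.items():
        true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        n_pred = sum(1 for p in y_pred if p == cls)
        precision = true_pos / n_pred if n_pred else 0.0
        recall = true_pos / n_true
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        score += (n_true / total) * f1
    return score

# Toy case (hypothetical): a classifier that always predicts the majority class.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 10
print(weighted_f1(y_true, y_pred))
```

In this toy case the minority class contributes an F1 of zero, yet the weighted score stays fairly high because the majority class dominates the weights. That is worth keeping in mind when the imbalance is severe.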
0 votes
0 answers
22 views

Class imbalance calculation for each class in a dataset

I am trying to compute class imbalance in each dataset and my approach was to check the average and standard deviation of the counts. The average is the total number of samples in class 1 / total number ...
Aparna Bhat
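One possible way to summarise per-class imbalance, sketched under assumptions the excerpt does not state: the labels are available as a flat list, and imbalance is reported as each class's share of the samples plus the ratio of the largest class to the smallest. The name `class_imbalance_summary` and the sample labels are hypothetical.

```python
from collections import Counter

def class_imbalance_summary(labels):
    """Return per-class counts, per-class shares, and the max/min count ratio."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    total = sum(counts.values())
    shares = {cls: n / total for cls, n in counts.items()}
    # Ratio of the most frequent class to the least frequent one.
    imbalance_ratio = max(counts.values()) / min(counts.values())
    return counts, shares, imbalance_ratio

# Hypothetical labels: 90 samples of class 0 and 10 samples of class 1.
labels = [0] * 90 + [1] * 10
counts, shares, ratio = class_imbalance_summary(labels)
print(counts, shares, ratio)
```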
1 vote
2 answers
56 views

Does XGBoost's scale_pos_weight correctly balance the positive samples if the training dataset has more positive than negative samples?

After researching, I realized that scale_pos_weight is typically calculated as the ratio of the number of negative samples to the number of positive samples in the training data. My dataset has 840 ...
viji
  • 477
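The ratio described in the excerpt can be sketched as follows. The counts are hypothetical, since the question's own figures are truncated, and `scale_pos_weight_ratio` is an illustrative helper, not part of XGBoost.

```python
def scale_pos_weight_ratio(n_negative, n_positive):
    """Ratio of negative to positive samples, the value commonly used for scale_pos_weight."""
    if n_positive <= 0:
        raise ValueError("n_positive must be greater than zero")
    return n_negative / n_positive

# Hypothetical counts where the positive class is the majority.
ratio = scale_pos_weight_ratio(n_negative=100, n_positive=300)
print(ratio)
```

With more positives than negatives the ratio falls below 1, so passing it as `scale_pos_weight` would down-weight the positive class instead of up-weighting it.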
0 votes
1 answer
52 views

Which parts of the Imbalanced Learn Pipeline are applied to the test set?

I have created an imbalanced-learn Pipeline consisting of RobustScaler, SMOTE-NC, RandomUnderSampler and a Random Forest Classifier. A RandomizedSearchCV is used to select the best hyperparameters. I ...
CodeSurgeon
1 vote
1 answer
57 views

Class_weight parameter not impacting results in imbalanced dataset with RandomForestClassifier

I'm fairly new to ML and now I'm in the process of predicting employee attrition in a medium-sized dataset. I have been able to run everything smoothly, but, as the dataset is imbalanced, I've been ...
Raughar
  • 13
0 votes
0 answers
16 views

Working with classWeight in model parameters for highly imbalanced datasets in pyspark

I am working on a binary classification problem with a highly imbalanced dataset (majority class 0: 523152826, and minority class 1: 2711142). I tried the logistic regression model from pyspark.ml....
DS_nerd
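For the class counts quoted in the excerpt, one common heuristic is the "balanced" weighting `n_samples / (n_classes * count)`. The sketch below only computes the weights in plain Python. It assumes they would then be attached as a per-row weight column and passed through the `weightCol` parameter of the `pyspark.ml` logistic regression, and the helper name `balanced_class_weights` is hypothetical.

```python
def balanced_class_weights(class_counts):
    """Weight each class by n_samples / (n_classes * class_count)."""
    n_samples = sum(class_counts.values())
    n_classes = len(class_counts)
    return {cls: n_samples / (n_classes * count) for cls, count in class_counts.items()}

# Class counts as quoted in the question excerpt.
counts = {0: 523152826, 1: 2711142}
weights = balanced_class_weights(counts)
print(weights)
```

Under this scheme every class contributes the same total weight, so the minority class receives a much larger per-sample weight than the majority class.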
0 votes
0 answers
15 views

Using class_weight in model.fit() doesn't work

I have an imbalanced dataset and I would like to use class_weight in model.fit(). When I use model.fit() without class_weight, it works correctly, but if I add class_weight, I get an error. My ...
user24560346
0 votes
0 answers
43 views

How do I add a bias to the last layer in my model if my model outputs logits and not probabilities?

I'm working on a medical image binary segmentation problem using a U-Net in TensorFlow, and my classes are extremely unbalanced (about 1 in 10,000). As a result, my model wastes a ton of time going ...
Thao Nguyen
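A minimal sketch of one common approach, assuming the final layer feeds a sigmoid: set the output bias to the log-odds of the positive class, so the untrained model's predicted probability starts at the class prior. The 1-in-10,000 figure comes from the excerpt, while the function names are hypothetical.

```python
import math

def initial_output_bias(positive_fraction):
    """Log-odds of the positive class, for use as the last layer's initial bias."""
    if not 0.0 < positive_fraction < 1.0:
        raise ValueError("positive_fraction must be strictly between 0 and 1")
    return math.log(positive_fraction / (1.0 - positive_fraction))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# About 1 positive pixel in 10,000, as stated in the question.
bias = initial_output_bias(1 / 10_000)
print(bias, sigmoid(bias))  # sigmoid(bias) recovers the prior
```

Because the bias is added to the logit before the sigmoid, this works when the model outputs logits. In Keras it could be supplied through the layer's `bias_initializer`, for example with a constant initializer, though whether that fits this particular U-Net is an assumption.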
0 votes
0 answers
13 views

Use an external system-installed Scala library in Python in Databricks notebook

In the context of fixing an imbalanced dataset in PySpark, I found the following external library in Scala which is similar to SMOTE for imbalanced data: I installed it on my system with > $...
Malek BEN HMIDA
0 votes
0 answers
22 views

Highly imbalanced pyspark dataset

I have a highly imbalanced PySpark dataset (523148956 for majority class vs 2722245 for minority class) and I would like to perform techniques to balance it without having to convert it to pandas. Can ...
Malek BEN HMIDA
