Questions tagged [dvc]

Data Version Control (DVC) is an open-source version control system for ML and data science projects. Use this tag for questions related to DVC usage and workflows.

dvc
1 vote
1 answer
19 views

DVC ERROR: Failed to import "file" due to SCM error: "github.com/thursday/myrepo". name 'urllib3' is not defined

I tried using DVC to import my zipped data with the command `dvc import https://github.com/thursday/myrepo xyz.zip -o data/myrepo/xyz.zip`. This is my GitHub: "https://github.com/...
Thursday U
0 votes
0 answers
29 views

Module not found error using dvc repro, issues with pandas and pyenv

OS: Windows 10. I cloned my teammate's Git repo, but when I run dvc repro I get an error saying that there is no module named pandas. project_path (main) $ dvc repro Running stage 'data_collection': &...
prayner
  • 415
0 votes
0 answers
37 views

Can I create a dynamic variable for dvc.yaml that gets a new value every time?

I'm aiming to incorporate a unique identifier into my dvc.yaml file every time I execute dvc repro. I'm using a Python script, generate_uuid.py, to generate a UUID and store it in a JSON file. Then, ...
Razor
  • 99
1 vote
1 answer
91 views

Extract current running stage from dvc

I'm conducting an experiment using 'dvc repro -f', where multiple stages are executed according to the dvc.yaml configuration. For instance: Stages: Training: foreach: -cycle: 0 -cycle: 1 ...
Razor
  • 99
2 votes
1 answer
184 views

How can I download data from just one of the DVC repositories?

I have a project that uses several databases. To avoid versioning huge files in Git, I used DVC to manage them on gdrive. I followed these steps: start DVC (dvc init), dvc add #...
L. Guilherme P. Melquiades
3 votes
1 answer
77 views

Adding data using dagshub.upload.Repo(USER_NAME,REPO_NAM)

I want to add a raw dataset file to my DagsHub repo (my first repo, and it's being used alongside an MLflow tutorial). This is the line that is giving me trouble: repo = dagshub.upload.Repo(USER_NAME,...
J.Kent
  • 195
1 vote
0 answers
61 views

DVC using cached run although parameter changed

I am trying to perform pipeline tracking using DVC. The problem is that if I change, for example, the size parameter in params.yaml, it does not rerun the stage but simply uses a cached run, ...
Beathvn
  • 11
1 vote
1 answer
36 views

Parameter-based dependencies and outs in DVC from constants file

I am trying to define a single-source set of paths such that it can be modified if necessary from a single spot rather than modifying it in various places across many scripts. I am doing this by ...
Jack Avante
  • 1,523
0 votes
0 answers
55 views

conda can't activate existing virtual environment

I'm trying to learn how to automate a data science project using DVC, with cookiecutter for the project structure. I made a conda environment and installed all my libraries in the venv file. Everything was working ...
user23304627
1 vote
0 answers
37 views

dvc push: local variable referenced before assignment

Error while pushing files to DVC: dvc push ERROR: unexpected error - local variable 'paths' referenced before assignment Having any troubles? Hit us ...
zkhrnkk
  • 11
2 votes
0 answers
106 views

DVC with Azure Blob Storage, blobs get deleted after second dvc push

I am using DVC (Data Version Control) with Azure Blob Storage as remote. The connection works. I added two files and when I ran dvc push the blobs appeared in my storage container on Azure. However, ...
Simon
  • 2,389
1 vote
1 answer
108 views

Cannot dvc push to a NAS

I am trying to use DVC with a remote set to a NAS. So I did dvc remote add -d myremote1 https://mynas.url.something.es:3444/thefolder/dvcstore1/ but after this I cannot push. I tried dvc remote ...
KansaiRobot
  • 9,123
0 votes
0 answers
776 views

How to solve HTTPSConnectionPool(host='pypi.org', port=443) when I use poetry install

I am getting this error when I use poetry install in different repos of my organization (using a Mac): HTTPSConnectionPool(host='pypi.org', port=443): Max retries exceeded with url: /simple/dvc/ (...
Nina Lima
1 vote
0 answers
35 views

How do I drop all real files after checking them out from DVC, keeping only the stubs?

DVC, or data version control, allows me to stub out large files with an MD5 hash, push them to a remote store, and version control the hash. Then, I can checkout those files from the remote and have ...
Chris
  • 30.5k
1 vote
0 answers
93 views

How to see files under DVC TRACKED section in VS Code after adding the folder using 'dvc add'?

After setting up DVC in my project, I am able to see modifications, commit, and push through the command line, but VS Code doesn't show anything in the 'DVC TRACKED' section. First, I set up my DVC project ...
Deb
  • 529
