
Questions tagged [delta]

A Delta is a file that represents the changes between two or more revisions of structured or semi-structured data. Use the delta-lake tag for questions about the Delta Lake file format.

0 votes
0 answers
6 views

VM Restore Shows Old Data Only

I had to recover the VMDK of one of my file servers. When I add it to the inventory and load it, it shows only one drive and all the data is old. It should have two drives in total and data as recent as ...
Deepak Itkar
1 vote
0 answers
49 views

ADF/Databricks: The table is not in delta format

Getting an error when trying to copy data from CSV (Azure Blob) to a Databricks Delta table, using an ADF Copy activity with Blob Storage as the source and Databricks Delta Lake as the sink. Getting an error that the table is ...
Tautvydas Perminas
1 vote
1 answer
87 views

Spark-delta not working when upgrading to Spark 3.5.0 and Delta 3.1.0

I have a Docker project to work with Spark locally, as follows: Ubuntu 20.04 (WSL), openjdk:17.0.2, Scala 2.12, Spark 3.4.0, Spark-delta 2.4.0, JupyterLab. Everything works fine, but when I wanted to ...
Yaya • 13
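For reference, one thing that changes between Delta 2.4.0 and 3.1.0 is the Maven artifact name: the Spark integration is published as delta-spark rather than delta-core. A minimal PySpark session sketch for Spark 3.5.x with Delta 3.1.0, using the standard Delta Lake settings (the app name and paths here are illustrative, not taken from the question):

    from pyspark.sql import SparkSession

    # Delta 3.x ships the Spark bindings as "delta-spark" (it was "delta-core" in 2.x).
    spark = (
        SparkSession.builder
        .appName("delta-3.1-smoke-test")
        .config("spark.jars.packages", "io.delta:delta-spark_2.12:3.1.0")
        .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
        .config("spark.sql.catalog.spark_catalog",
                "org.apache.spark.sql.delta.catalog.DeltaCatalog")
        .getOrCreate()
    )

    # Quick smoke test that the Delta data source is actually on the classpath.
    spark.range(5).write.format("delta").mode("overwrite").save("/tmp/delta_smoke_test")
    spark.read.format("delta").load("/tmp/delta_smoke_test").show()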
0 votes
1 answer
114 views

How can I efficiently move data from Databricks to Snowflake?

I'm going to explain a real problem I'm facing in my job, related to moving data from Databricks, in Delta format, to Snowflake (which we use as an exposure layer). I work as a Data ...
Miguel • 1
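For context, a common route here is the Spark connector for Snowflake: read the Delta table into a dataframe and write it out with the snowflake data source. A hedged sketch, assuming a Databricks session already bound to spark; the connection values, path, and table name are all placeholders:

    # Read the Delta table (placeholder path).
    df = spark.read.format("delta").load("/mnt/lake/my_delta_table")

    # Connection options for the Snowflake Spark connector; all values are placeholders.
    sf_options = {
        "sfUrl": "myaccount.snowflakecomputing.com",
        "sfUser": "my_user",
        "sfPassword": "my_password",
        "sfDatabase": "MY_DB",
        "sfSchema": "PUBLIC",
        "sfWarehouse": "MY_WH",
    }

    # Write the dataframe into a Snowflake table.
    (df.write
       .format("snowflake")
       .options(**sf_options)
       .option("dbtable", "MY_TARGET_TABLE")
       .mode("overwrite")
       .save())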
1 vote
0 answers
26 views

Move a Bool data type from an X register to a D register in a Delta PLC (DIADesigner)

I am trying to move the Bool data type of the X or Y registers (X0.0\Y0.0) to the D registers (D10000, which is a word/dword data type) so I can send the data from the PLC to the PC using a TCP socket ...
Vasanth • 11
0 votes
0 answers
18 views

How to add jars to a pyspark interpreter in Zeppelin

Got a working standalone PySpark script that reads a local delta file and creates a table on top of it for querying. I was asked to move it to a Zeppelin notebook so it can run on AWS infrastructure. The ...
coredump • 131
1 vote
1 answer
64 views

Which is the correct approach for operating on Delta files?

New to the PySpark, Parquet, and Delta ecosystem, and a little confused by the multiple ways to work with delta files. Can someone help me understand which one is correct or preferred for updating a ...
coredump • 131
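As one point of comparison for the question above, in-place updates usually go through the DeltaTable API rather than rewriting the whole dataframe. A minimal sketch, assuming a Delta-enabled session bound to spark; the path, column names, and values are illustrative:

    from delta.tables import DeltaTable
    from pyspark.sql import functions as F

    # Bind to an existing Delta table by path.
    dt = DeltaTable.forPath(spark, "/tmp/events_delta")

    # Update matching rows in place.
    dt.update(
        condition=F.col("status") == "pending",
        set={"status": F.lit("processed")},
    )

    # Read it back to confirm the change.
    spark.read.format("delta").load("/tmp/events_delta").show()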
0 votes
1 answer
209 views

Adding a dataframe to an existing delta table throws DELTA_FAILED_TO_MERGE_FIELDS error

Fix: the issue was due to mismatched data types. Explicitly declaring the schema type resolved the issue. schema = StructType([ StructField("_id", StringType(), True), StructField("...
coredump • 131
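A slightly fuller sketch of the fix summarised above: declare the schema explicitly so the incoming dataframe's types line up with the existing Delta table before appending. This assumes a Delta-enabled session bound to spark; the second field, the sample rows, and the path are made up for illustration:

    from pyspark.sql.types import StructType, StructField, StringType

    # Explicit schema so every column type matches the target Delta table.
    schema = StructType([
        StructField("_id", StringType(), True),
        StructField("name", StringType(), True),   # illustrative extra field
    ])

    rows = [("1", "alice"), ("2", "bob")]           # stand-in source data
    df = spark.createDataFrame(rows, schema=schema)

    # The append succeeds once the field types agree with the table's schema.
    df.write.format("delta").mode("append").save("/tmp/target_delta_table")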
0 votes
2 answers
140 views

Can I use Parquet format v2 when writing Delta tables from Spark?

Is there a way to configure Spark to write a specific version of the Parquet format when writing a dataframe as a Delta table? I could not find anything to help me configure the file format version in ...
Takreem
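One knob that is sometimes suggested for this is the parquet-mr writer property parquet.writer.version, passed through Spark's Hadoop configuration. Whether Delta's writer honours it can depend on the Spark and Delta versions, so treat the sketch below as something to try and then verify on the produced files, not as a guaranteed setting:

    from pyspark.sql import SparkSession

    # "spark.hadoop."-prefixed settings are forwarded to the Hadoop configuration,
    # where parquet-mr reads parquet.writer.version ("v1" or "v2").
    spark = (
        SparkSession.builder
        .config("spark.hadoop.parquet.writer.version", "v2")
        .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
        .config("spark.sql.catalog.spark_catalog",
                "org.apache.spark.sql.delta.catalog.DeltaCatalog")
        .getOrCreate()
    )

    spark.range(100).write.format("delta").mode("overwrite").save("/tmp/delta_v2_test")
    # Inspect the resulting .parquet files (e.g. with parquet-tools) to confirm
    # the writer version actually changed before relying on this.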
1 vote
0 answers
10 views

How to search for the best epsilon and delta for an LCS matrix (what objective function)?

How to search for the best epsilon and delta for an LCS matrix (what objective function)? I want to use heuristic algorithms to search for the values of epsilon and delta to find the best epsilon for time series data. I don'...
Valliporygmail com Valliporigm
1 vote
1 answer
81 views

Column accepting null strings and struct values for delta table

I'm working with unstructured data (JSON files) in Databricks using PySpark. Basically I have some JSON files which contain a field, "Field1", which is a regular StructType, but when it ...
BlueVelvet
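One common way to handle a field that is sometimes a struct and sometimes null (or a bare string) in the raw JSON is to read with an explicit schema, so Spark never infers a conflicting type. A minimal sketch, assuming a Delta-enabled session bound to spark; the nested field names and paths are assumptions:

    from pyspark.sql.types import StructType, StructField, StringType

    # Force Field1 to be a struct; inference never gets the chance to pick StringType.
    schema = StructType([
        StructField("Field1", StructType([
            StructField("code", StringType(), True),    # assumed nested fields
            StructField("label", StringType(), True),
        ]), True),
    ])

    df = spark.read.schema(schema).json("/mnt/raw/source/*.json")
    df.printSchema()
    df.write.format("delta").mode("append").save("/mnt/bronze/target_delta")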
0 votes
1 answer
42 views

Is there a safe way to overwrite a stream delta table?

I need to fully overwrite a stream delta table using PySpark without messing with the checkpoint; is there any safe way to do this? I do not need to keep any previous version of the delta table.
s528060 • 73
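For reference, a plain batch overwrite of a Delta table is an atomic, versioned operation; the sketch below shows that overwrite, assuming a Delta-enabled session bound to spark, with placeholder data and path. Whether a downstream streaming reader and its checkpoint survive the overwrite depends on how the stream is restarted, so that part needs verifying separately:

    # df holds the new, complete contents of the table (placeholder data).
    df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])

    (df.write
       .format("delta")
       .mode("overwrite")
       .option("overwriteSchema", "true")   # only needed if the schema changes too
       .save("/tmp/stream_target_delta"))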
0 votes
0 answers
86 views

PySpark - reading a CSV and saving it to a delta format folder errors out (no pandas)

New to PySpark and trying to play with the Parquet/Delta ecosystem. Trying to write a script that does the following: read a CSV file into a Spark dataframe, save it as a Parquet file, read the above saved ...
coredump • 131
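A compact sketch of the pipeline described above, which can help pin down which step actually errors out. It assumes a Delta-enabled SparkSession bound to spark; the local paths and the header/inferSchema options are assumptions:

    # 1. Read a CSV into a Spark dataframe.
    df = (spark.read
          .option("header", "true")
          .option("inferSchema", "true")
          .csv("/tmp/input.csv"))

    # 2. Save it as Parquet.
    df.write.mode("overwrite").parquet("/tmp/stage_parquet")

    # 3. Read the Parquet back and save it as a Delta folder.
    staged = spark.read.parquet("/tmp/stage_parquet")
    staged.write.format("delta").mode("overwrite").save("/tmp/final_delta")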
1 vote
0 answers
73 views

Delta Lake (OSS) merge operation never finishes (or takes too long)

I am experimenting with Delta Lake as the primary storage solution for my tabular data, which is updated daily. I tried to mimic the basic use case - an existing target table is updated by the new ...
Anton Kretov
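For comparison with the scenario above, a bare-bones MERGE through the DeltaTable API; a common first lever when a merge never finishes is adding a partition or date predicate to the join condition so Delta can prune files. The paths, column names, and date are illustrative, and a Delta-enabled session bound to spark is assumed:

    from delta.tables import DeltaTable

    target = DeltaTable.forPath(spark, "/tmp/target_delta")
    updates = spark.read.format("delta").load("/tmp/daily_updates_delta")

    (target.alias("t")
        .merge(
            updates.alias("s"),
            # Restricting the date range lets Delta skip untouched files.
            "t.id = s.id AND t.event_date >= '2024-01-01'",
        )
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute())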
1 vote
1 answer
94 views

How can I filter and update a delta table in pyspark and save the result?

I have a delta table saved in S3, and I'm using an AWS Glue job to read a set of CSVs into a PySpark dataframe, and then to update the delta table by appending the dataframe rows to it. ...
Boris • 794
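A minimal sketch of the append step from the question above, assuming a Glue job whose Spark session (bound to spark) has the Delta extensions enabled; the S3 paths and the filter are placeholders:

    # Read the incoming CSVs into a dataframe.
    incoming = (spark.read
                .option("header", "true")
                .csv("s3://my-bucket/incoming/*.csv"))

    # Optionally filter/transform before writing.
    to_append = incoming.filter("status = 'active'")   # illustrative filter

    # Append the rows to the existing Delta table in S3.
    (to_append.write
        .format("delta")
        .mode("append")
        .save("s3://my-bucket/delta/my_table"))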
