Questions tagged [delta]
A Delta is a file that represents the changes between two or more revisions of structured or semi-structured data. Use the delta-lake tag for questions about the Delta Lake file format.
delta
366
questions
0
votes
0
answers
6
views
VM Restore Shows Old Data Only
I had to recover the VMDK of one of my file servers. When I add it to the inventory and load it, it shows only one drive and all the data is old. It should have two drives in total and data as recent as ...
1
vote
0
answers
49
views
ADF/Databricks : The table is not delta format
Getting an error when trying to copy data from a CSV (Azure Blob) to a Databricks Delta table, using the ADF Copy activity with Blob Storage as the source and Databricks Delta Lake as the sink. The error says the table is ...
1
vote
1
answer
87
views
Spark-delta not working when upgrading to Spark 3.5.0 and Delta 3.1.0
I have a docker project to work with spark locally, as follows:
Ubuntu 20.04 (WSL)
openjdk:17.0.2
Scala 2.12
Spark 3.4.0
Spark-delta 2.4.0
JupyterLab
Everything works fine, but when I wanted to ...
0
votes
1
answer
114
views
How can I efficiently move data from Databricks to Snowflake?
I'm going to explain a real problem I'm facing in my job, related to moving data in Delta format from Databricks to Snowflake (which we use as an exposure layer).
I work as a Data ...
1
vote
0
answers
26
views
Move the Bool data type from the X register to the D register in a Delta PLC (DIADesigner)
I am trying to move the bool data type of the X or Y registers (X0.0/Y0.0) to the D registers (D10000, which is a word/dword data type) so I can send the data from the PLC to the PC using a TCP socket ...
0
votes
0
answers
18
views
How to add jars to a pyspark interpreter in Zeppelin
Got a working standalone PySpark script that reads a local delta file and creates a table on top of it for querying.
Asked to move it to a Zeppelin notebook so it can run on AWS infrastructure.
The ...
1
vote
1
answer
64
views
Which is the correct approach to operate on Delta Files
New to the PySpark, Parquet, and Delta ecosystem, and a little confused after seeing multiple ways to work with delta files.
Can someone help me understand which one is correct or preferred for updating a ...
0
votes
1
answer
209
views
Adding a dataframe to an existing delta table throws DELTA_FAILED_TO_MERGE_FIELDS error
Fix
The issue was due to mismatched data types; explicitly declaring the schema type resolved it.
schema = StructType([
StructField("_id", StringType(), True),
StructField("...
0
votes
2
answers
140
views
Can I use parquet format v2 when writing delta tables from spark?
Is there a way to configure Spark to write a specific version of the Parquet format when writing a DataFrame as a Delta table?
I could not find anything to help me configure the file format version in ...
1
vote
0
answers
10
views
How to search for the best epsilon and delta for an LCS matrix (what objective function?)
I want to use heuristic algorithms to search for values of epsilon and delta to find the best epsilon for time series data. I don'...
1
vote
1
answer
81
views
Column accepting null strings and struct values for delta table
I'm working with unstructured data (JSON files) in Databricks using PySpark.
Basically I have some JSON files which contain a field,
"Field1", which is a regular StructType, but when it ...
0
votes
1
answer
42
views
Is there a safe way to overwrite a stream delta table?
I need to fully overwrite a streaming delta table using PySpark without messing with the checkpoint; is there any safe way to do this? I do not need to keep any versions of the previous delta.
0
votes
0
answers
86
views
PySpark - reading a CSV and saving it to a delta format folder errors out (no pandas)
New to PySpark and trying to play with the Parquet/Delta ecosystem.
Trying to write a script that does the following:
Read a CSV file into a Spark DataFrame.
Save it as a Parquet file.
Read the above saved ...
1
vote
0
answers
73
views
Delta Lake (OSS) merge operation never finishes (or takes too long)
I am experimenting with Delta Lake as the primary storage solution for my tabular data, which is updated daily. I tried to mimic the basic use case: an existing target table is updated by the new ...
1
vote
1
answer
94
views
How can I filter and update a delta table in pyspark and save the result?
I have a delta table saved in S3, and I'm using an AWS Glue job to read a set of CSVs into a PySpark DataFrame and then update the delta table by appending the DataFrame's rows to it. ...