Questions tagged [etl]
ETL is an acronym for Extract, Transform, and Load. It refers to the process of extracting data from source systems, transforming the data in some way (manipulating it, filtering it, combining it with other sources), and finally loading the transformed data into the target system(s).
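As a rough illustration of the three stages, here is a minimal sketch using only the Python standard library. The CSV layout, the `sales` table, and the function names `extract`, `transform`, and `load` are all hypothetical, chosen only for this example; real pipelines usually rely on dedicated tooling.

```python
import csv
import io
import sqlite3


def extract(source):
    """Extract: read raw rows from a CSV source (any file-like object)."""
    return list(csv.DictReader(source))


def transform(rows):
    """Transform: drop rows with no amount, tidy names, cast amounts to float."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # filter out incomplete records
        cleaned.append((row["name"].strip().title(), float(row["amount"])))
    return cleaned


def load(rows, conn):
    """Load: write the transformed rows into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (name TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()


# Hypothetical in-memory source and target, so the sketch is self-contained.
source = io.StringIO("name,amount\n alice ,10.5\nbob,\ncarol,3\n")
conn = sqlite3.connect(":memory:")

load(transform(extract(source)), conn)
result = conn.execute("SELECT name, amount FROM sales ORDER BY name").fetchall()
print(result)
```

Keeping each stage as a separate function makes it possible to test the transformation logic without touching the source or the target.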
5,956
questions
-1
votes
0
answers
14
views
Is it acceptable to source datasets from EDW/LakeHouse for operational pipelines? [closed]
When using a data platform like Databricks, is it acceptable to source datasets from the EDW/LakeHouse for operational pipelines? e.g. for a data sync to a third-party platform.
For the project I'm ...
0
votes
1
answer
23
views
Error while trying to migrate data from one table to another in BigQuery
I run into a problem when trying to migrate data between BigQuery tables. I have an old table that has an attribute nested as Float64, and I want to migrate the data from this old table to another one, ...
0
votes
1
answer
58
views
How to update a BigQuery table with data from a local csv file while using Python?
Every day, I migrate data from a MySQL database to BigQuery. For most processes, I use LoadJobConfig().writedisposition with WRITE_APPEND. However, for my raw_stock table, using WRITE_APPEND doesn't ...
0
votes
0
answers
31
views
Not able to create job in Dataflow for streaming data
I am running my Apache Beam code in Google Cloud Shell. The code executes without errors, but no jobs are created in Dataflow.
I assigned the roles below to the service account:
Dataflow Worker, ...
-1
votes
1
answer
69
views
Choosing good approach to copy multiple tables in Azure Data Factory
I need to copy hundreds of tables (full or delta) from source to target using Azure Data Factory (ADF). I have two options:
Option A: 1 Pipeline per Table
Pros
Uses native ADF functionality.
...
0
votes
0
answers
45
views
Using PowerShell to data drop a CSV file from a WebEx API to SQL Server: Exception calling "WriteToServer" with "1" argument(s):
I am trying to drop WebEx meeting data to my company's SQL Server. After running line by line, the CSV data and datatable are loaded. However, when I run the bulk copy method, it returns this error.
...
0
votes
0
answers
14
views
Dynamically pass file name to FTP task in Realisable Iman
I want to generate a dynamic file name and pass it to the 'Remote file' field of an FTP task in Realisable Iman. I have used a script task (VB) to generate the file name but I can't find a way to pass ...
0
votes
1
answer
38
views
Extracting json array in Postgres
My Postgres database contains a list of json objects as records. I am trying to extract an array from the record, and Postgres does not seem to like what I'm proposing.
Here's an example of a record.
...
2
votes
1
answer
66
views
What is a good way to add a column to an existing SQL database table using Spark?
I have an existing Postgres SQL table with some features.
I want to use Spark to:
Read that table
Create some additional columns
Add those columns to the table.
Is there any way to make Spark add ...
-1
votes
1
answer
40
views
How to import only the new data and new records that have changed using the ETL process in a data warehouse?
I have an ETL process that allows me to load data from one database to another, applying transformations along the way. The process currently starts by deleting all records from all tables, and then ...
0
votes
1
answer
54
views
Unable to use both SQL Server and Postgres connection together on the same job - Talend
For a test, I created this simple Talend job:
The tRowGenerator generates a row with an int column, which is staged to a temporary database in Postgres.
The issue occurs whenever I run the job. The ...
1
vote
0
answers
33
views
Error parsing CSV File when copying Data to Snowflake after July 3rd Incident
I am encountering an error while trying to copy a CSV file into Snowflake from an S3 bucket. This process was functioning correctly until an incident occurred in Snowflake on July 3rd. The error ...
-1
votes
0
answers
14
views
SQL: How to convert iCal event blob to start and end timestamp
We have a production service that schedules events (both ad-hoc and recurring) via the iCal format. We then replicate these events to a Snowflake data warehouse as the event "blob" itself. ...
0
votes
1
answer
34
views
Drop message in function execution time - Spark
I'm trying to run some functions for my ETL pipelines and log them in the process. The issue is that when I call my function, my log message is shown instantly, whereas I want to display it during the ...
0
votes
0
answers
31
views
ETL design over an existing DDD aggregate
I hope you can help me with the design of a data ingestion process.
Currently, I have an existing aggregate called ExperiencePricing and an existing command called SetExperiencePricingCommand. The ...