I am trying to write a DataFrame to a SQL table whose data contains some extended characters, such as the em-dash and the pound symbol. The data is not being written correctly: the table shows junk characters in place of the pound symbol or em-dash.
Is there an alternative way to write the data to the SQL table without corrupting it? I have already added "useUnicode=true&characterEncoding=utf8&characterSetResults=utf8" to my JDBC URL and used "DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_520_ci" in my CREATE TABLE schema, but the issue still persists!
JDBC URL: jdbc:mysql://localhost:3306/MySchema?zeroDateTimeBehavior=convertToNull&allowPublicKeyRetrieval=true&useSSL=false&serverTimezone=UTC&useUnicode=true&characterEncoding=utf8&characterSetResults=utf8
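A variant of this URL, assuming MySQL Connector/J 8.x (which maps characterEncoding=UTF-8 to utf8mb4 and supports a connectionCollation property to pin the session collation), would look like the sketch below; I have not verified that it fixes the issue:

```
jdbc:mysql://localhost:3306/MySchema?zeroDateTimeBehavior=convertToNull&allowPublicKeyRetrieval=true&useSSL=false&serverTimezone=UTC&characterEncoding=UTF-8&connectionCollation=utf8mb4_unicode_520_ci
```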
Sample Code:
// Attempted workaround: re-encode via UTF-8 bytes.
// Note: new String(bytes) with no charset decodes using the JVM default charset.
val handleSpecialChar = udf { (make: String) => new String(make.getBytes("UTF-8")) }

val test_df = Seq(
  ("Row 1", "Hello–World"),
  ("Row 2", "Hello—World"),
  ("Row 3", "Hello˜World"),
  ("Row 4", "Hello•World"),
  ("Row 5", "Hello™World")
)

var test = sparkSession.createDataFrame(test_df).toDF("id", "test_string")
test.show(false)

test = test.withColumn("test_string", handleSpecialChar(test("test_string")))
test.show(false)

test.coalesce(1).write.mode(SaveMode.Overwrite).option("truncate", "true")
  .jdbc(dbUrlMis, "MySchema.testdata", misConnectionProperties)
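A side note on the UDF above: new String(bytes) without an explicit charset decodes with the JVM's default charset, so the round-trip is only lossless when that default is UTF-8. A minimal plain-Scala sketch (no Spark, names are mine) illustrating the difference:

```scala
import java.nio.charset.StandardCharsets

object CharsetRoundTrip extends App {
  val s = "Hello—World" // em-dash, U+2014

  // Encoding to UTF-8 and decoding with the same charset is lossless.
  val safe = new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8)
  assert(safe == s)

  // Decoding those same UTF-8 bytes with a single-byte charset (what happens
  // implicitly when the JVM default is e.g. ISO-8859-1) mangles the em-dash.
  val mangled = new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1)
  assert(mangled != s)

  println(s"safe=$safe, mangled=$mangled")
}
```

So even if the JDBC side were configured correctly, this UDF could itself introduce corruption on a machine whose default file.encoding is not UTF-8.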
CREATE statement for the testdata table:
CREATE TABLE mySchema.`testdata` (
`id` varchar(200) DEFAULT NULL,
`test_string` varchar(200) DEFAULT NULL,
KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_520_ci;
Sample database table output after writing: (https://i.sstatic.net/VbnwEOth.jpg)
Sample IntelliJ console output after executing test.show(): (https://i.sstatic.net/zEyRyg5n.jpg)