Handling Base64 Encoded Attachments - Converting and Sending Binary Content

Hi, I am working with a customer who has difficulty sending an attachment from his ticketing system to our ticketing system (via Apigee). His ticketing system uses JavaScript (ECMAScript 6) and, for some reason, he is having difficulty sending a request with an attachment. The only thing that works for him is to read the image file into binary and then encode it with base64. From this base64 content, he can make a normal request to Apigee by sending the base64 content.

I have set up a new endpoint to handle this. I use a Python resource to simplify the base64 decoding and to construct the multipart request. I can attach the file to our ticketing system using the base64 content. However, image attachments arrive corrupted, while text files work fine.
I noticed that flow.getVariable("request.content") returns an org.python.core.PyUnicode type. I believe the body needs to be in bytes, but I have not been able to convert it successfully. How can I convert PyUnicode to binary so I can place the content in the body?
 

 

import base64

# flow.getVariable("request.content") and decodedContent type are class org.python.core.PyUnicode
decodedContent = base64.b64decode(flow.getVariable("request.content"))

flow.setVariable('request.header.Content-Type','multipart/form-data; boundary=boundary123')

attachmentBody = ""
attachmentBody += "--boundary123\r\n"
attachmentBody += 'Content-Disposition: form-data; name="file"; filename="testpy.png"\r\n'
attachmentBody += "Content-Type: image/png\r\n\r\n"
attachmentBody += decodedContent + "\r\n"
attachmentBody += "--boundary123--\r\n"

# attachmentBody type is class org.python.core.PyUnicode
# Modifying the request content
flow.setVariable('request.content',  attachmentBody)

 

ACCEPTED SOLUTION

@dchiesa1, I got it working with Jython without the need for jar files!! Thank you so much for the leads and your note and code sample about Java's "[B@1cf582a1", which is a string representation of a byte array.
The trickiest part was that the data manipulation needed to be done in Java instead of Python. To accomplish this, we needed to chain the Java commands to prevent corruption. The code has been successfully tested with the following attachments: DOCX, XLSX, PDF, JPG, and PNG.
Below is the bare minimum solution along with comments.

# The flow.setVariable('debug-') is used to print messages to Apigee variables. This is especially needed when using the Apigee Emulator because Python's print function doesn't output to stepExecution-stdout.
# Why is Jython used for this script? RhinoJS doesn't provide a base64 function out of the box. 
# Jython reduces Java's boilerplate, but it uses Python v2.
# When manipulating data in Jython, make sure to leverage Java's functions (use Java syntax) to prevent corruption. 


import hashlib # This is only needed for debugging purposes. We use this to calculate the SHA-256 checksum for each step where the content is stored into a variable.
from java.util import Base64 # DO NOT USE Jython's base64. Jython's base64 returns PyUnicode (string). We want it to return PyArray (byte array).
from java.lang import String # Used to convert PyUnicode to Java String
from java.nio.charset import Charset # Used to convert Java String to Java byte array 
from java.io import ByteArrayInputStream # Used to wrap the Java byte array in an InputStream for message.setContent

# Debug>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
flow.setVariable('debug-OriginContentType', type(flow.getVariable("request.content"))) #org.python.core.PyUnicode
# <<<<<<<<<<<<<<<<<<<<<<<<<<<<<Debug

requestContent = flow.getVariable("request.content")
fileName = flow.getVariable("request.header.filename")

# Debug>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
# Checksum of base64-encoded data
sha256_base64Payload = hashlib.sha256(requestContent).hexdigest()
flow.setVariable('debug-sha256_base64Payload', sha256_base64Payload)
# <<<<<<<<<<<<<<<<<<<<<<<<<<<<<Debug

# We use Java's Base64 so it returns PyArray. Jython's base64 returns PyUnicode
bytesPayload = Base64.getDecoder().decode(requestContent)

# Debug>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
flow.setVariable('debug-bytesPayloadType', type(bytesPayload)) # class org.python.core.PyArray
# This SHA-256 should match the output of `shasum -a 256 attachmentFileName` run on the original file
sha256_bytesPayload = hashlib.sha256(bytesPayload).hexdigest()
flow.setVariable('debug-sha256_bytesPayload', sha256_bytesPayload)
# <<<<<<<<<<<<<<<<<<<<<<<<<<<<<Debug

# Constructing MIME-multipart

boundary = "----boundary123"

# MIME Header
MIMEHeaderStr = ""
MIMEHeaderStr += "--" + boundary + "\r\n"
MIMEHeaderStr += 'Content-Disposition: form-data; name="file"; filename="' + fileName + '"\r\n'
MIMEHeaderStr += "Content-Type: application/octet-stream\r\n\r\n"
# MIMEHeaderStr is a PyUnicode (string). We need to convert it to a byte array later, before we join it with the image payload

# MIME Tail
MIMETailStr = "\r\n"
MIMETailStr += "--" + boundary + "--\r\n"


# Constructing the MIME multipart body by converting MIMEHeaderStr and MIMETailStr from PyUnicode to PyArray.
# We need to chain the Java calls; otherwise, Jython's own string handling is picked up and corrupts the data
MIMEPayload = String(MIMEHeaderStr).getBytes(Charset.forName("UTF-8")) + bytesPayload + String(MIMETailStr).getBytes(Charset.forName("UTF-8"))

# Debug>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
flow.setVariable('debug-MIMEHeaderLength', len(MIMEHeaderStr))
flow.setVariable('debug-MIMETailLength', len(MIMETailStr))
flow.setVariable('debug-bytesPayloadLength', len(bytesPayload))
flow.setVariable('debug-MIMEPayloadLength', len(MIMEPayload))
flow.setVariable('debug-MIMEPayloadType', type(MIMEPayload)) # class org.python.core.PyArray
flow.setVariable('debug-sha256_MIMEPayload', hashlib.sha256(MIMEPayload).hexdigest())
# <<<<<<<<<<<<<<<<<<<<<<<<<<<<<Debug


# Setting up final requests
flow.setVariable('request.header.content-type', 'multipart/form-data; boundary='+ boundary)

# Typically, the Target server will calculate the content length and overwrite what we send
# We calculate the length anyway, just in case the Target server doesn't do what it is supposed to do
flow.setVariable('request.header.content-length', str(len(MIMEPayload)))

# DO NOT set request.content using flow.setVariable. It will return HTTP 400 because the request content becomes something like `[B@beab919`.
# The `[B@beab919` is a string representation of a byte array
# Instead, use Java
message.setContent(ByteArrayInputStream(MIMEPayload))
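
For testing the framing outside Apigee, the same assembly can be sketched in plain CPython 3 (not Jython). The function name `build_multipart_body` is hypothetical; the boundary, field name, and part headers mirror the script above. Everything stays in `bytes` from start to finish, which is the property the Java calls enforce in the Jython version.

```python
def build_multipart_body(file_bytes, file_name, boundary="----boundary123"):
    """Frame file_bytes as a single-part multipart/form-data body (bytes)."""
    header_lines = [
        "--" + boundary,
        'Content-Disposition: form-data; name="file"; filename="%s"' % file_name,
        "Content-Type: application/octet-stream",
        "",  # the two empty entries produce the blank line
        "",  # that separates the part headers from the data
    ]
    # Only the textual framing is encoded; the file bytes are never decoded.
    header = "\r\n".join(header_lines).encode("utf-8")
    tail = ("\r\n--" + boundary + "--\r\n").encode("utf-8")
    return header + file_bytes + tail


# Sample binary payload: the PNG signature followed by every byte value.
payload = b"\x89PNG\r\n\x1a\n" + bytes(range(256))
body = build_multipart_body(payload, "testpy.png")

# The Content-Length to send is the length of the final byte string.
print(len(body))
```

Comparing `len(body)` and a SHA-256 of `body` against the debug variables from the Apigee trace is one way to confirm that both versions produce the same bytes.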

 


3 REPLIES

Update: I compared the Apigee X debug output for our ticketing system's original attachment endpoint and for the new base64 endpoint. The corrupted image has extra characters. According to Gemini and ChatGPT, Jython's base64.b64decode expects a byte string instead of Unicode. I still haven't had any luck converting it to a byte string.

[Image: Succesful vs corrupted.png]
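
One mechanism that would produce this kind of damage is sketched below in plain CPython 3 rather than Jython. It is an assumption about the cause, not a confirmed diagnosis: the decoded bytes get treated as text and are re-encoded on the way out. ASCII-only text files survive that round trip, while binary files do not, because every byte at or above 0x80 becomes two bytes in UTF-8. The sample bytes are the standard 8-byte PNG signature.

```python
import base64

# The standard 8-byte PNG signature; its first byte (0x89) is not ASCII.
png_signature = b"\x89PNG\r\n\x1a\n"
encoded = base64.b64encode(png_signature)

# base64 itself is lossless: decoding returns the original bytes.
decoded = base64.b64decode(encoded)

# Simulate the suspected fault: the bytes are treated as text, one
# character per byte, and later re-encoded as UTF-8.
as_text = decoded.decode("latin-1")
corrupted = as_text.encode("utf-8")

# A pure-ASCII payload survives the same treatment unchanged.
ascii_payload = b"hello, ticket\n"
ascii_roundtrip = ascii_payload.decode("latin-1").encode("utf-8")

print(len(png_signature), len(corrupted))  # the corrupted copy is longer
print(ascii_roundtrip == ascii_payload)
```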

I would not describe that difference as "extra characters". It looks like it is being decoded differently.

Stepping back, you wrote of your customer:

he is having difficulty sending a request with an attachment. The only thing that works for him is to read the image file into binary and then encode it with base64. From this base64 content, he can make a normal request to Apigee by sending the base64 content.

So this part is new? This base64 encoding on the client side?

If I were working on this, I would start by breaking down the problem. There are multiple steps that could be introducing problems.

  1. The client (customer) side is base64-encoding an image and sending it to your Apigee endpoint.
  2. The Apigee endpoint is base64-decoding the image blob.
  3. The Apigee endpoint is creating a multi-part form payload to send to YOUR ticketing system.

I would want to verify each step independently.

Have you verified that the base64 encoding on the client side is working correctly? One way to do this: produce a PNG image, preferably something involving a kitten, and compute the sha256 of the image. Then send it in some way (email?) to your customer, and tell the customer to also compute the sha256 of that image, maybe using a command-line tool. The sha256 values should match.

Then, have your customer send THAT IMAGE as a test to your endpoint. At this point you do not need your Apigee endpoint to connect to a ticketing system. You just need to recalculate the sha256 on the decoded bytestream. If it is the same, you know the customer was able to send you the image in the correct form. You've verified step 1 and step 2.
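
The checksum comparison described above can be sketched as follows, in plain CPython 3. The helper name `sha256_hex` and the in-memory sample bytes are illustrative assumptions; in practice the customer would hash the real file (for example with `shasum -a 256`) and the endpoint would hash the decoded request body.

```python
import base64
import hashlib


def sha256_hex(data):
    """Return the SHA-256 digest of a bytes object as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Stand-in for the test image; any bytes work, including non-ASCII ones.
original = bytes(range(256)) * 4

# Step 1 (client side): hash the file, then base64-encode it for sending.
checksum_at_client = sha256_hex(original)
wire_payload = base64.b64encode(original).decode("ascii")

# Step 2 (endpoint side): base64-decode the body and hash the result.
received = base64.b64decode(wire_payload)
checksum_at_endpoint = sha256_hex(received)

# Matching digests mean steps 1 and 2 preserved every byte.
print(checksum_at_client == checksum_at_endpoint)
```

If the digests differ, the fault lies in the encoding, the transport, or the decoding, and the multi-part step can be ruled out for now.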

Then you can work on your multi-part form message. I don't know Python or how it interacts with the Apigee message flow. But I have built proxies that create multi-part form messages before, involving binary attachments. For that I used Java, specifically this callout: https://github.com/DinoChiesa/Apigee-Java-MultipartForm-V2

Good luck
