This page explains how to encode and decode rows when you prepare data in the Wrangler workspace of the Cloud Data Fusion Studio.
Encode a row
You can use base encoding to store or transfer data in environments that, for legacy reasons, are restricted to US-ASCII data. You might also use it in new applications without those legacy restrictions, because it lets you manipulate objects with text editors.
You can apply the following encoding schemes, which are based on RFC-4648, to all values in a column:
- Base32
- Base64
- Hex
- URL
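To make the schemes concrete, the following minimal sketch applies each one to a sample string with the Python standard library. It only illustrates what the encodings look like; it is not Wrangler's implementation, the sample value is hypothetical, and Wrangler's exact output (for example, how URL encoding treats spaces) might differ.

```python
import base64
from urllib.parse import quote

# Hypothetical sample value; any string works.
value = "Hello, Wrangler!"
raw = value.encode("utf-8")

# Each scheme maps arbitrary bytes to a restricted ASCII alphabet.
encoded = {
    "base32": base64.b32encode(raw).decode("ascii"),
    "base64": base64.b64encode(raw).decode("ascii"),
    "hex": base64.b16encode(raw).decode("ascii"),
    # Percent-encode every character outside the unreserved set.
    "url": quote(value, safe=""),
}

for scheme, text in encoded.items():
    print(f"{scheme}: {text}")
```

Every result contains only US-ASCII characters, which is what makes the encoded values safe to store or transfer in ASCII-only environments.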
When you encode, Wrangler generates a new column with a name in the following format: <column>_encode_<type>, except for url-encode.
Cloud Data Fusion uses the following rules for the column values:
- If the column is null, the resulting column is also null.
- If the chosen column isn't found in the row, the row is skipped.
- If the column value doesn't have a string or byte data type, the transformation fails, and an error displays.
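The rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not Wrangler's code: the function name encode_column and the row-as-dictionary model are hypothetical, and the sketch interprets "skipped" as passing the row through unchanged.

```python
import base64

# Hypothetical encoders keyed by the <type> part of the new column name.
ENCODERS = {
    "base32": base64.b32encode,
    "base64": base64.b64encode,
    "hex": base64.b16encode,
}


def encode_column(rows, column, scheme):
    """Return rows with a <column>_encode_<type> column added (sketch only)."""
    target = f"{column}_encode_{scheme}"
    out = []
    for row in rows:
        if column not in row:
            # Column not found: skip the row (assumed to mean "leave as-is").
            out.append(row)
            continue
        value = row[column]
        if value is None:
            # A null input produces a null output column.
            out.append({**row, target: None})
            continue
        if isinstance(value, str):
            value = value.encode("utf-8")
        elif not isinstance(value, (bytes, bytearray)):
            # Only string and byte values can be encoded.
            raise TypeError(f"cannot encode {type(value).__name__} in '{column}'")
        out.append({**row, target: ENCODERS[scheme](value).decode("ascii")})
    return out


rows = [{"id": 1, "name": "abc"}, {"id": 2, "name": None}, {"id": 3}]
result = encode_column(rows, "name", "base64")
print(result)
```

In this example, the first row gains name_encode_base64 with the value YWJj, the second gains a null column, and the third is left untouched because it has no name column.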
Supported encoding options
Wrangler supports the following encoding options:
- Encode base64: The Base64 option adds the encode64 directive as a transformation step to the recipe and creates a new column with encoded values.
- Encode base32: The Base32 option adds the encode32 directive as a transformation step to the recipe and creates a new column with encoded values.
- Encode hex: The Hex option adds the encode_hex directive as a transformation step to the recipe and creates a new column with encoded values.
- Encode URL: The URL option adds the url-encode directive as a transformation step to the recipe and encodes the current column.
Decode a row
Decoding reverses base encoding: it restores the original values from data that was encoded for storage or transfer in environments that, for legacy reasons, are restricted to US-ASCII data. You might also need it in new applications without those legacy restrictions, because base-encoded data is often used there to allow the manipulation of objects with text editors.
You can apply the following decoding schemes, which are based on RFC-4648, to each value in a column:
- Base32
- Base64
- Hex
- URL
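As a minimal sketch of what each decoding scheme does, the following Python snippet decodes one sample value per scheme with the standard library. The sample inputs are hypothetical, and the snippet illustrates the schemes themselves rather than Wrangler's implementation.

```python
import base64
from urllib.parse import unquote

# Each input below is the encoded form of a short ASCII string.
decoded = {
    "base32": base64.b32decode("MFRGG===").decode("utf-8"),
    "base64": base64.b64decode("YWJj").decode("utf-8"),
    "hex": base64.b16decode("616263").decode("utf-8"),
    # Percent-escapes are replaced with the characters they represent.
    "url": unquote("a%20b%2Fc"),
}

for scheme, text in decoded.items():
    print(f"{scheme}: {text!r}")
```

The three base schemes all decode to abc here, and the URL-encoded sample decodes to "a b/c".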
When you decode, Wrangler generates a new column with a name in the following format: <column>_decode_<type>, except for url-decode.
Cloud Data Fusion uses the following rules for the column values:
- If the column is null, the resulting column is also null.
- If the chosen column isn't found in the row, the row is skipped.
- If the column value doesn't have a string or byte array data type, the operation fails.
Supported decoding options
Wrangler supports the following decoding options:
- Decode base64: The Base64 option adds the decode64 directive as a transformation step to the recipe and creates a new column with the decoded values.
- Decode base32: The Base32 option adds the decode32 directive as a transformation step to the recipe and creates a new column with the decoded values.
- Decode hex: The Hex option adds the decode hex directive as a transformation step to the recipe and creates a new column with the decoded values.
- Decode URL: The URL option adds the url-decode directive as a transformation step to the recipe and decodes the current column.
What's next
- Learn more about Wrangler directives.