Rust - Converting between file formats - JSON, YAML, & TOML
Rust’s serde library is a generic serialize-deserialize framework that has been implemented for many file formats. It’s an incredibly powerful framework, and the documentation is well worth a read.
It can deserialize a file format into a strongly typed Rust data structure, so that the data in code has no tie to the format it was read from, and that structure can then be serialized into another file format.
#![allow(unused_variables)]
use serde_derive::{Deserialize, Serialize};

#[derive(Serialize, Deserialize, Debug, Clone, PartialEq)]
#[serde(rename_all = "camelCase", deny_unknown_fields)]
struct MyData {
    field_one: usize,
    field_two: String,
    field_three: bool,
    some_data: std::collections::HashMap<String, usize>,
}

fn main() -> anyhow::Result<()> {
    let my_data_yaml = r#"
        fieldOne: 7
        fieldTwo: "lorem"
        fieldThree: true
        someData:
          x: 1
          y: 2
          z: 3
    "#;
    let my_data_toml = r#"
        fieldOne = 7
        fieldTwo = "lorem"
        fieldThree = true
        [someData]
        x = 1
        y = 2
        z = 3
    "#;
    let my_data_json = r#"
        {
            "fieldOne": 7,
            "fieldTwo": "lorem",
            "fieldThree": true,
            "someData": {
                "x": 1,
                "y": 2,
                "z": 3
            }
        }
    "#;
    let deserialized_yaml = serde_yaml::from_str::<MyData>(my_data_yaml);
    let deserialized_toml = toml::from_str::<MyData>(my_data_toml);
    let deserialized_json = serde_json::from_str::<MyData>(my_data_json);
    assert!(deserialized_yaml.is_ok());
    assert!(deserialized_toml.is_ok());
    assert!(deserialized_json.is_ok());
    let deserialized_toml_copy = deserialized_toml.clone();
    assert_eq!(deserialized_yaml?, deserialized_toml?);
    assert_eq!(deserialized_toml_copy?, deserialized_json?);

    let my_data_yaml_missing_field = r#"
        fieldOne: 7
        fieldTwo: "lorem"
        someData:
          x: 1
          y: 2
          z: 3
    "#;
    let my_data_yaml_extra_field = r#"
        fieldOne: 7
        fieldTwo: "lorem"
        fieldThree: true
        someData:
          x: 1
          y: 2
          z: 3
        out_of_schema_data: 42
    "#;
    let data_missing_field = serde_yaml::from_str::<MyData>(my_data_yaml_missing_field);
    let data_extra_field = serde_yaml::from_str::<MyData>(my_data_yaml_extra_field);
    assert!(data_missing_field.is_err());
    // Because MyData is decorated with `deny_unknown_fields`, adding extra fields
    // will cause parsing to fail.
    assert!(data_extra_field.is_err());
    Ok(())
}
You can also read data into a type that can represent any data in a particular format, if you don’t want to or can’t strongly define the contents:
let yaml_data = serde_yaml::from_str::<serde_yaml::Value>(my_data_yaml)?;
let toml_data = toml::from_str::<toml::Value>(my_data_toml)?;
let json_data = serde_json::from_str::<serde_json::Value>(my_data_json)?;
Recently I’ve found that serde can even deserialize one format straight into a type that’s meant to represent another, and conversely can serialize a type meant for one format into another. For JSON, YAML, and TOML there are the types serde_json::Value, serde_yaml::Value, and toml::Value, which represent any data in their respective formats and can be used to deserialize data when we can’t or don’t want to define its precise structure. It turns out you can read one format straight into another format’s Value type.
let toml_from_yaml = serde_yaml::from_str::<toml::Value>(my_data_yaml)?;
This conversion can fail: if the data is not representable in the other format’s Value, you’ll get an Err. For example, serde_yaml::Mapping allows keys of any variant of Value, whereas serde_json::map::Map is only implemented with String as a key type:
let some_yaml = r#"
[5,6]: true
"#;
let try_yaml = serde_yaml::from_str::<serde_yaml::Value>(some_yaml);
let try_json = serde_yaml::from_str::<serde_json::Value>(some_yaml);
assert!(try_yaml.is_ok());
assert!(try_json.is_err());
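To make the reason for that failure concrete, here is a toy model of the two value types using only the standard library. It is a sketch of the key-type mismatch, not how serde_yaml or serde_json are actually implemented; `YamlLike` and `JsonLike` are names invented for this illustration.

```rust
use std::collections::BTreeMap;
use std::convert::TryFrom;

/// Toy stand-in for a YAML-style value: mapping keys may be any value.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum YamlLike {
    Bool(bool),
    Int(i64),
    Str(String),
    Seq(Vec<YamlLike>),
    Map(BTreeMap<YamlLike, YamlLike>),
}

/// Toy stand-in for a JSON-style value: object keys must be strings.
#[derive(Debug, Clone, PartialEq)]
enum JsonLike {
    Bool(bool),
    Int(i64),
    Str(String),
    Array(Vec<JsonLike>),
    Object(BTreeMap<String, JsonLike>),
}

impl TryFrom<YamlLike> for JsonLike {
    type Error = String;

    fn try_from(value: YamlLike) -> Result<Self, Self::Error> {
        Ok(match value {
            YamlLike::Bool(b) => JsonLike::Bool(b),
            YamlLike::Int(i) => JsonLike::Int(i),
            YamlLike::Str(s) => JsonLike::Str(s),
            YamlLike::Seq(items) => JsonLike::Array(
                items
                    .into_iter()
                    .map(JsonLike::try_from)
                    .collect::<Result<Vec<_>, _>>()?,
            ),
            YamlLike::Map(entries) => {
                let mut object = BTreeMap::new();
                for (key, val) in entries {
                    // Only string keys have a JSON-style equivalent.
                    let key = match key {
                        YamlLike::Str(s) => s,
                        other => return Err(format!("non-string key: {:?}", other)),
                    };
                    object.insert(key, JsonLike::try_from(val)?);
                }
                JsonLike::Object(object)
            }
        })
    }
}
```

A map keyed by `YamlLike::Seq(vec![YamlLike::Int(5), YamlLike::Int(6)])` converts to an `Err`, which mirrors the `[5,6]: true` case above, while a map with only `Str` keys converts cleanly.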
This has real practical implications. JSON, for example, is far less human friendly than either YAML or TOML, yet there are libraries packed with useful functionality that expect to receive data in one particular format. Serde’s design decouples the two: the library can keep its types while the user picks the format. JSON Schema is an example. It’s a richly featured schema implementation, but usually you have to write your schemas in JSON. JSON Schema Tool is an online tool that generates an inferred schema from some example JSON. Using the example given when you visit the page, this JSON:
{
    "checked": false,
    "dimensions": {
        "width": 5,
        "height": 10
    },
    "id": 1,
    "name": "A green door",
    "price": 12.5,
    "tags": [
        "home",
        "green"
    ]
}
, generates this schema:
{
    "$schema": "http://json-schema.org/draft-07/schema",
    "$id": "http://example.com/root.json",
    "type": "object",
    "title": "The Root Schema",
    "description": "The root schema is the schema that comprises the entire JSON document.",
    "default": {},
    "required": [
        "checked",
        "dimensions",
        "id",
        "name",
        "price",
        "tags"
    ],
    "properties": {
        "checked": {
            "$id": "#/properties/checked",
            "type": "boolean",
            "title": "The Checked Schema",
            "description": "An explanation about the purpose of this instance.",
            "default": false,
            "examples": [
                false
            ]
        },
        "dimensions": {
            "$id": "#/properties/dimensions",
            "type": "object",
            "title": "The Dimensions Schema",
            "description": "An explanation about the purpose of this instance.",
            "default": {},
            "examples": [
                {
                    "height": 10.0,
                    "width": 5.0
                }
            ],
            "required": [
                "width",
                "height"
            ],
            "properties": {
                "width": {
                    "$id": "#/properties/dimensions/properties/width",
                    "type": "integer",
                    "title": "The Width Schema",
                    "description": "An explanation about the purpose of this instance.",
                    "default": 0,
                    "examples": [
                        5
                    ]
                },
                "height": {
                    "$id": "#/properties/dimensions/properties/height",
                    "type": "integer",
                    "title": "The Height Schema",
                    "description": "An explanation about the purpose of this instance.",
                    "default": 0,
                    "examples": [
                        10
                    ]
                }
            }
        },
        "id": {
            "$id": "#/properties/id",
            "type": "integer",
            "title": "The Id Schema",
            "description": "An explanation about the purpose of this instance.",
            "default": 0,
            "examples": [
                1
            ]
        },
        "name": {
            "$id": "#/properties/name",
            "type": "string",
            "title": "The Name Schema",
            "description": "An explanation about the purpose of this instance.",
            "default": "",
            "examples": [
                "A green door"
            ]
        },
        "price": {
            "$id": "#/properties/price",
            "type": "number",
            "title": "The Price Schema",
            "description": "An explanation about the purpose of this instance.",
            "default": 0,
            "examples": [
                12.5
            ]
        },
        "tags": {
            "$id": "#/properties/tags",
            "type": "array",
            "title": "The Tags Schema",
            "description": "An explanation about the purpose of this instance.",
            "default": [],
            "examples": [
                [
                    "home",
                    "green"
                ]
            ],
            "items": {
                "$id": "#/properties/tags/items",
                "type": "string",
                "title": "The Items Schema",
                "description": "An explanation about the purpose of this instance.",
                "default": "",
                "examples": [
                    "home",
                    "green"
                ]
            }
        }
    }
}
You could imagine wanting to expand that schema even further with defaults, examples, and other rules constraining the values of some fields.
Rust has a library, schemars, that implements JSON Schema. While the documentation goes through examples of defining schemas in code and serializing them with serde_json, you can also read a SchemaObject straight from YAML or TOML. So your application can leverage the functionality of SchemaObject and the schemars library while letting users write schemas in a more human-readable form.
let json_schema = serde_json::from_str::<schemars::schema::RootSchema>(
    &std::fs::read_to_string("example.schema.json")?,
)?;
let json_schema_from_yaml = serde_yaml::from_str::<schemars::schema::RootSchema>(
    &std::fs::read_to_string("example.schema.yaml")?,
)?;
The same JSON Schema from above is much more readable when written in YAML:
$schema: http://json-schema.org/draft-07/schema
$id: http://example.com/root.json
type: object
title: The Root Schema
description: The root schema is the schema that comprises the entire JSON document.
default: {}
required:
- checked
- dimensions
- id
- name
- price
- tags
properties:
  checked:
    $id: '#/properties/checked'
    type: boolean
    title: The Checked Schema
    description: An explanation about the purpose of this instance.
    default: false
    examples:
    - false
  dimensions:
    $id: '#/properties/dimensions'
    type: object
    title: The Dimensions Schema
    description: An explanation about the purpose of this instance.
    default: {}
    examples:
    - height: 10
      width: 5
    required:
    - width
    - height
    properties:
      width:
        $id: '#/properties/dimensions/properties/width'
        type: integer
        title: The Width Schema
        description: An explanation about the purpose of this instance.
        default: 0
        examples:
        - 5
      height:
        $id: '#/properties/dimensions/properties/height'
        type: integer
        title: The Height Schema
        description: An explanation about the purpose of this instance.
        default: 0
        examples:
        - 10
  id:
    $id: '#/properties/id'
    type: integer
    title: The Id Schema
    description: An explanation about the purpose of this instance.
    default: 0
    examples:
    - 1
  name:
    $id: '#/properties/name'
    type: string
    title: The Name Schema
    description: An explanation about the purpose of this instance.
    default: ''
    examples:
    - A green door
  price:
    $id: '#/properties/price'
    type: number
    title: The Price Schema
    description: An explanation about the purpose of this instance.
    default: 0
    examples:
    - 12.5
  tags:
    $id: '#/properties/tags'
    type: array
    title: The Tags Schema
    description: An explanation about the purpose of this instance.
    default: []
    examples:
    - - home
      - green
    items:
      $id: '#/properties/tags/items'
      type: string
      title: The Items Schema
      description: An explanation about the purpose of this instance.
      default: ''
      examples:
      - home
      - green
I learned that serde could do this cross-format parsing by reading the source code of the refmt project.
The examples from this post are kept on GitHub.