Json: Complete Guide - Progressive Robot

URL: https://www.progressiverobot.com/python-pretty-print-json/

JSON (JavaScript Object Notation) has become the de facto standard for data exchange across the web. From configuring applications to transmitting data between APIs, its lightweight and human-readable format makes it incredibly versatile. However, while machines can easily process JSON in its most compact form (often minified into a single line), that compactness can become a significant hurdle for human developers.

This guide will walk you through the fundamentals, starting with Python's built-in json module. From there, you'll move beyond the basics to tackle more complex, real-world challenges. You'll learn how to serialize custom Python objects, process massive JSON files without running out of memory, and leverage high-performance alternative libraries to speed up your applications.

Key Takeaways:

Use indent and sort_keys in json.dumps() to instantly make your JSON output readable and consistently structured.
Control your JSON output's compactness and character encoding with the separators and ensure_ascii parameters.
Serialize custom Python objects that are not natively supported by creating a JSONEncoder subclass or by passing a handler function to the default parameter.
Process massive JSON files without exhausting memory by using streaming parsers like ijson or adopting the line-delimited JSON format.
Boost performance in data-intensive applications by replacing the standard json module with faster alternatives like orjson.
Reconstruct your custom Python objects from a JSON string by using the object_hook parameter in json.loads() to intercept and transform the data.
Improve the debugging experience by using the rich library to pretty-print JSON with syntax highlighting directly in your terminal.

Python Pretty Print JSON String

We can use json.dumps() with the indent parameter to get a pretty-printed JSON string.

				
import json

json_data = '[{"ID":10,"Name":"Pankaj","Role":"CEO"},' \
 '{"ID":20,"Name":"David Lee","Role":"Editor"}]'

json_object = json.loads(json_data)

json_formatted_str = json.dumps(json_object, indent=2)

print(json_formatted_str)

This outputs the formatted JSON:

				
[
  {
    "ID": 10,
    "Name": "Pankaj",
    "Role": "CEO"
  },
  {
    "ID": 20,
    "Name": "David Lee",
    "Role": "Editor"
  }
]

First, we use json.loads() to parse the JSON string into a Python object (here, a list of dictionaries).
The json.dumps() method takes that object and returns a JSON-formatted string. The indent parameter sets the number of spaces used for each nesting level.
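The Key Takeaways mention sort_keys alongside indent. As a minimal sketch (the record below is a made-up example whose keys are deliberately out of order), combining the two parameters gives output that is both indented and consistently ordered:

```python
import json

# Hypothetical record whose keys are deliberately out of alphabetical order.
record = {"Role": "CEO", "ID": 10, "Name": "Pankaj"}

# sort_keys=True emits keys in sorted order; indent=2 adds the line breaks.
sorted_json = json.dumps(record, indent=2, sort_keys=True)
print(sorted_json)
```

Sorted keys make two dumps of the same data comparable line by line, regardless of the order in which the dictionary was built.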

Python Pretty Print JSON File

Let's see what happens when we print data loaded from a JSON file. In this example, the Cars.json file is already saved in a pretty-printed format.

				
import json

with open('Cars.json', 'r') as json_file:
    json_object = json.load(json_file)

print(json_object)

print(json.dumps(json_object))

print(json.dumps(json_object, indent=1))

Output:

				
[{'Car Name': 'Honda City', 'Car Model': 'City', 'Car Maker': 'Honda', 'Car Price': '20,000 USD'}, {'Car Name': 'Bugatti Chiron', 'Car Model': 'Chiron', 'Car Maker': 'Bugatti', 'Car Price': '3 Million USD'}]
[{"Car Name": "Honda City", "Car Model": "City", "Car Maker": "Honda", "Car Price": "20,000 USD"}, {"Car Name": "Bugatti Chiron", "Car Model": "Chiron", "Car Maker": "Bugatti", "Car Price": "3 Million USD"}]
[
 {
  "Car Name": "Honda City",
  "Car Model": "City",
  "Car Maker": "Honda",
  "Car Price": "20,000 USD"
 },
 {
  "Car Name": "Bugatti Chiron",
  "Car Model": "Chiron",
  "Car Maker": "Bugatti",
  "Car Price": "3 Million USD"
 }
]

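The first line is the Python representation of the loaded list (note the single quotes), and the second is compact JSON on a single line. Only the third call, which passes an indent value, produces pretty-printed JSON, no matter how the source file was formatted.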
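The same indent parameter is accepted by json.dump(), which writes to a file object instead of returning a string. The following is a minimal sketch of saving data in pretty-printed form; the file name cars_pretty.json and the shortened records are assumptions made for illustration, and a temporary directory is used so that no real files are touched.

```python
import json
import os
import tempfile

cars = [
    {"Car Name": "Honda City", "Car Maker": "Honda"},
    {"Car Name": "Bugatti Chiron", "Car Maker": "Bugatti"},
]

# A temporary directory keeps the sketch from touching real files.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "cars_pretty.json")  # hypothetical file name

    # json.dump() writes straight to the file object and accepts indent too.
    with open(path, "w", encoding="utf-8") as json_file:
        json.dump(cars, json_file, indent=4)
        json_file.write("\n")  # json.dump() does not add a trailing newline

    with open(path, "r", encoding="utf-8") as json_file:
        written_text = json_file.read()

print(written_text)
```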

Advanced json.dumps() Parameters

While indent and sort_keys are commonly used for creating readable JSON, the json.dumps() function offers several other powerful parameters that give you finer control over the serialization process. Let's explore some of the most useful ones.

separators: Controlling Whitespace for Compact Output

The separators parameter allows you to customize the delimiter characters used in your JSON output. It takes a tuple containing two strings: (item_separator, key_separator). By default, Python uses (', ', ': ') when indent is None and (',', ': ') when indent is set, so a space always follows each colon and, in single-line output, each comma.

You can create a more compact representation by removing this whitespace. This is useful for reducing file size when readability is not the primary concern.

For example, let's define a simple Python dictionary:

				
import json

data = {
    "name": "John Doe",
    "age": 30,
    "isStudent": False,
    "courses": [
        {"title": "History", "credits": 3},
        {"title": "Math", "credits": 4}
    ]
}

Now, let's serialize it using the default separators and a more compact version:

				
# Default pretty-printed output
print(json.dumps(data, indent=4))

# Compact pretty-printed output
print(json.dumps(data, indent=4, separators=(',', ':')))

The output demonstrates the difference:

Default Output:

				
{
    "name": "John Doe",
    "age": 30,
    "isStudent": false,
    "courses": [
        {
            "title": "History",
            "credits": 3
        },
        {
            "title": "Math",
            "credits": 4
        }
    ]
}

Compact Output with separators:

				
{
    "name":"John Doe",
    "age":30,
    "isStudent":false,
    "courses":[
        {
            "title":"History",
            "credits":3
        },
        {
            "title":"Math",
            "credits":4
        }
    ]
}

As you can see, the second version removes the space after the colons within each key-value pair, resulting in a slightly smaller output.
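For the smallest output, separators is usually combined with no indent at all. The sketch below, which uses a trimmed-down version of the dictionary as an assumption, compares the default single-line form with a fully minified one:

```python
import json

data = {
    "name": "John Doe",
    "age": 30,
    "courses": [{"title": "History", "credits": 3}],
}

default_text = json.dumps(data)                          # separators (', ', ': ')
minified_text = json.dumps(data, separators=(",", ":"))  # no whitespace at all

print(minified_text)
print(len(default_text), len(minified_text))
```

Both strings parse back to the same data; only the whitespace between tokens differs.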

ensure_ascii: Working with International Characters

By default, json.dumps() escapes all non-ASCII characters. For example, a character like 'é' would be converted to \u00e9. While this guarantees compatibility, it can make the JSON difficult to read if you are working with languages other than English.

By setting ensure_ascii=False, you can instruct json.dumps() to write these characters directly. This is highly recommended when your output destination supports UTF-8, which is standard for modern web APIs and file systems.

Consider this example with a non-ASCII character:

				
import json

data = {"name": "Søren", "city": "København"}

# Default behavior with ASCII escaping
print(json.dumps(data))

# With ensure_ascii=False for direct output
print(json.dumps(data, ensure_ascii=False))

Output:

				
{"name": "S\u00f8ren", "city": "K\u00f8benhavn"}
{"name": "Søren", "city": "København"}

The second line is much more readable and is the preferred format for UTF-8 compatible systems.
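When the output goes to a file rather than the terminal, it helps to name the encoding explicitly. This is a minimal sketch under that assumption; the file name people.json is hypothetical and a temporary directory stands in for a real location.

```python
import json
import os
import tempfile

data = {"name": "Søren", "city": "København"}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "people.json")  # hypothetical file name

    # Name the encoding explicitly so the result does not depend on the
    # platform's default text encoding.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, ensure_ascii=False)

    with open(path, "rb") as f:
        raw_bytes = f.read()

# The character is stored as UTF-8 bytes rather than as the escape \u00f8.
print(raw_bytes)
print(json.loads(raw_bytes))
```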

default: Handling Custom Python Objects

A TypeError is raised when you try to serialize a Python object that isn't directly supported by the JSON specification, such as a datetime object or a custom class instance. The default parameter provides an elegant way to handle this.

You can pass a function to default that will be called for any object that the serializer doesn't know how to handle. This function should return a JSON-serializable version of the object.

Let's see how to serialize a datetime object and a custom User object.

				
import json
from datetime import datetime

class User:
    def __init__(self, name, registered_at):
        self.name = name
        self.registered_at = registered_at

def custom_serializer(obj):
    # Custom JSON serializer for objects not serializable by default.
    if isinstance(obj, datetime):
        return obj.isoformat()
    if isinstance(obj, User):
        return {
            "name": obj.name,
            "registered_at": obj.registered_at.isoformat(),
            "__class__": "User"  # Optional: for custom decoding
        }
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

user = User("Jane Doe", datetime.now())

# Use the default parameter to handle the custom User object and datetime
json_string = json.dumps(user, default=custom_serializer, indent=4)
print(json_string)

Output:

				
{
    "name": "Jane Doe",
    "registered_at": "2025-09-11T15:03:18.673824",
    "__class__": "User"
}

This approach allows you to centrally define serialization logic for any custom types in your application, making your code cleaner and more maintainable.
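For quick debugging output, where an exact round trip does not matter, a common shortcut is to pass the built-in str as the default handler. The sketch below uses a made-up order record; note that every unsupported value becomes a plain string, so type information is lost.

```python
import json
from datetime import date
from decimal import Decimal
from uuid import UUID

# Hypothetical record mixing several types the json module rejects by default.
order = {
    "id": UUID("12345678-1234-5678-1234-567812345678"),
    "placed_on": date(2025, 9, 11),
    "total": Decimal("19.99"),
}

# default=str is invoked only for values the encoder cannot handle itself.
order_json = json.dumps(order, default=str, indent=2)
print(order_json)
```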

Handling Large JSON Files

When working with data, you may encounter JSON files that are too large to fit into your computer's memory. Loading such a file with json.load() builds the entire document in memory at once and can fail with a MemoryError. Fortunately, there are techniques for processing large JSON files without consuming excessive memory.

Streaming Parsers: ijson

Streaming parsers read and parse a file incrementally, piece by piece, rather than loading the entire document at once. This approach allows you to process files of any size with a small, constant memory footprint.

A popular library for this in Python is ijson. It can parse a JSON stream and yield items as they are found.

First, install the library:

				
pip install ijson

Imagine you have a large JSON file large_data.json containing an array of user objects:

				
[
  {"id": 1, "name": "Alice", "data": "..." },
  {"id": 2, "name": "Bob", "data": "..." },
  ...
]

Instead of loading the whole list, you can iterate over it with ijson:

				
import ijson

filename = "large_data.json"

# ijson works on bytes, so open the file in binary mode
with open(filename, 'rb') as f:
    users = ijson.items(f, 'item')
    for user in users:
        # Process each user object one by one
        print(f"Processing user: {user['name']}")

In this example, ijson.items(f, 'item') creates a generator that yields each object from the root array; the prefix 'item' selects the elements of the top-level array. Only one user object is held in memory at a time, so memory use stays low even for very large files.

Line-Delimited JSON (NDJSON)

Another common format for handling large datasets is Line-Delimited JSON, also known as Newline Delimited JSON (NDJSON). In this format, each line in the file is a complete, valid JSON object.

An example data.ndjson file would look like this:

				
{"id": 1, "event": "login", "timestamp": "2025-09-11T10:00:00Z"}
{"id": 2, "event": "click", "target": "button_a", "timestamp": "2025-09-11T10:01:15Z"}
{"id": 1, "event": "logout", "timestamp": "2025-09-11T10:05:30Z"}

This format is excellent for streaming because you can process the file line by line. Each line can be parsed independently.

Processing an NDJSON file in Python is straightforward:

				
import json

filename = "data.ndjson"

with open(filename, 'r') as f:
    for line in f:
        try:
            # Each line is a separate JSON object
            event_data = json.loads(line)
            print(f"Processed event: {event_data['event']} for user {event_data['id']}")
        except json.JSONDecodeError:
            print(f"Skipping malformed line: {line.strip()}")

This method is not only memory-efficient but also robust, as a malformed line doesn't prevent the rest of the file from being processed.
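Producing NDJSON is the mirror image of reading it: serialize each record without indent and end it with a newline. A minimal sketch, using an in-memory buffer in place of a real file and made-up event records:

```python
import io
import json

events = [
    {"id": 1, "event": "login"},
    {"id": 2, "event": "click", "target": "button_a"},
]

# io.StringIO stands in for a real file opened in text mode.
buffer = io.StringIO()
for event in events:
    # No indent here: each record has to stay on a single line.
    buffer.write(json.dumps(event) + "\n")

ndjson_text = buffer.getvalue()
print(ndjson_text, end="")

# Reading it back parses one line at a time.
round_tripped = [json.loads(line) for line in ndjson_text.splitlines() if line.strip()]
```

Pretty-printing and NDJSON do not mix: an indented record would span several lines and break the one-object-per-line rule.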

Alternative JSON Libraries

While Python's built-in json module is sufficient for many use cases, several third-party libraries offer improved performance and additional features.

Library	Key Features	Best For
`orjson`	Very high performance, serializes additional types (datetimes, UUIDs, dataclasses), produces compact, UTF-8 binary output.	Performance-critical applications, web APIs.
`simplejson`	The externally maintained library on which the standard `json` module is based. Released more frequently, so new features can appear there first.	A near drop-in replacement for `json` when you need its newer features or extra options.
`rich`	Not a parser, but provides beautiful, syntax-highlighted pretty-printing of JSON in the terminal.	Enhancing debuggability and readability during development.

orjson: The High-Performance Choice

orjson is a JSON library for Python, written in Rust and designed for performance. It is typically much faster than the standard json module.

First, install orjson:

				
pip install orjson

Using orjson is similar to the built-in module, but orjson.dumps() returns a bytes object rather than a str.

				
import orjson
from datetime import datetime

data = {
    "name": "Project X",
    "deadline": datetime(2026, 1, 1),
    "status": "active"
}

# orjson handles datetime objects automatically
json_bytes = orjson.dumps(data)

print(json_bytes)
print(orjson.loads(json_bytes))

Output:

				
b'{"name":"Project X","deadline":"2026-01-01T00:00:00","status":"active"}'
{'name': 'Project X', 'deadline': '2026-01-01T00:00:00', 'status': 'active'}

simplejson: A Feature-Rich Alternative

simplejson is the external library that the json module was originally based on. It is still actively developed and sometimes includes features or performance optimizations before they make it into the standard library.

Install simplejson:

				
pip install simplejson

For common operations its API matches the json module, so you can usually use it as a drop-in replacement.

				
import simplejson as json

data = {"key": "value"}
print(json.dumps(data))

rich: For Beautiful Terminal Output

When you're debugging or inspecting JSON data in a terminal, readability is key. The rich library excels at producing beautifully formatted and syntax-highlighted output for various data types, including JSON.

Install rich:

				
pip install rich

To pretty-print JSON with rich, you pass a JSON string to the print_json() method of a Console object.

				
import json
from rich.console import Console

data = {
    "name": "John Doe",
    "age": 30,
    "isStudent": False,
    "courses": [
        {"title": "History", "credits": 3},
        {"title": "Math", "credits": 4}
    ]
}

console = Console()
json_string = json.dumps(data)

# Print the JSON with syntax highlighting
console.print_json(json_string)

This will produce a color-coded, indented output in your terminal, making nested structures much easier to read than with the standard print() function.

Working with Custom Objects

We previously saw how the default parameter in json.dumps() can help serialize custom objects. For more complex scenarios, especially when you also need custom deserialization logic, creating custom encoder and decoder classes provides a more structured, object-oriented approach.

Custom JSONEncoder

Subclassing json.JSONEncoder allows you to create a reusable encoder for your custom objects. You only need to override the default() method. This approach encapsulates the serialization logic within a class, which is cleaner than a standalone function if you have multiple custom types.

Let's refactor our earlier example to use a custom encoder.

				
import json
from datetime import datetime

class User:
    def __init__(self, name, registered_at):
        self.name = name
        self.registered_at = registered_at

class CustomEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, datetime):
            return obj.isoformat()
        if isinstance(obj, User):
            return {
                "name": obj.name,
                "registered_at": obj.registered_at.isoformat(),
                "__class__": "User"
            }
        # Let the base class default method raise the TypeError
        return super().default(obj)

user = User("Jane Doe", datetime.now())

# Use the cls parameter to specify the custom encoder
json_string = json.dumps(user, cls=CustomEncoder, indent=4)
print(json_string)

This produces the same output as before but packages the logic into a reusable CustomEncoder class.

Custom JSONDecoder

Deserialization is the process of converting a JSON string back into a Python object. To reconstruct your custom objects, you can use the object_hook parameter in json.loads() or create a custom JSONDecoder subclass.

The object_hook is a function that gets called with the result of any object literal decoded (a dict). It can then transform this dictionary into a different object.

Let's use the __class__ key we added during encoding to identify and reconstruct our User object.

				
import json
from datetime import datetime

# Assume User and CustomEncoder classes are defined as above

def from_json_object(dct):
    """Object hook to decode custom objects."""
    if "__class__" in dct and dct["__class__"] == "User":
        return User(name=dct["name"], registered_at=datetime.fromisoformat(dct["registered_at"]))
    return dct

json_string = """
{
    "name": "Jane Doe",
    "registered_at": "2025-09-11T15:03:18.673824",
    "__class__": "User"
}
"""

# Use object_hook to deserialize the string into a User object
user_object = json.loads(json_string, object_hook=from_json_object)

print(type(user_object))
print(user_object.name)
print(user_object.registered_at)

Output:

				
<class '__main__.User'>
Jane Doe
2025-09-11 15:03:18.673824

This demonstrates how object_hook successfully converted the dictionary back into an instance of our User class. Creating a full JSONDecoder subclass is also possible but is often unnecessary, as object_hook handles most use cases with less boilerplate code.
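For completeness, here is one way a JSONDecoder subclass could look. This is only a sketch: the class name UserDecoder and its _to_user helper are hypothetical, and the subclass simply installs the same kind of hook so that callers can pass cls= instead of object_hook=.

```python
import json
from datetime import datetime


class User:
    def __init__(self, name, registered_at):
        self.name = name
        self.registered_at = registered_at


class UserDecoder(json.JSONDecoder):
    """Hypothetical decoder that installs its own object hook."""

    def __init__(self, **kwargs):
        # Respect a caller-supplied object_hook; otherwise use ours.
        kwargs.setdefault("object_hook", self._to_user)
        super().__init__(**kwargs)

    @staticmethod
    def _to_user(dct):
        # Rebuild a User from dictionaries tagged with "__class__": "User".
        if dct.get("__class__") == "User":
            return User(dct["name"], datetime.fromisoformat(dct["registered_at"]))
        return dct


text = '{"name": "Jane Doe", "registered_at": "2025-09-11T15:03:18", "__class__": "User"}'
user = json.loads(text, cls=UserDecoder)
print(type(user).__name__, user.name, user.registered_at)
```

The subclass mainly pays off when the decoder has to be passed around as a single object, for example to a framework that accepts a decoder class.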

Debugging API responses

When you make an API request, you often get a single, long line of JSON response to save bandwidth. This is incredibly difficult for humans to read and understand, especially for complex or deeply nested data. Pretty-printing transforms this unreadable string into a structured, indented, and human-readable format, making it far easier to identify correct data, missing fields, or unexpected errors.

Here's an example of how to fetch an API response and pretty-print its content. We're using the JSONPlaceholder API for testing.

				
import requests
import json

url = "https://jsonplaceholder.typicode.com/posts/1"
# A timeout keeps the script from hanging if the server does not respond
response = requests.get(url, timeout=10)

if response.status_code == 200:
    data = response.json()
    print(json.dumps(data))            # compact, single line
    print(json.dumps(data, indent=2))  # pretty-printed
else:
    print(f"Error: {response.status_code}")

This will print the following output:

				
{"userId": 1, "id": 1, "title": "sunt aut facere repellat provident occaecati excepturi optio reprehenderit", "body": "quia et suscipit\nsuscipit recusandae consequuntur expedita et cum\nreprehenderit molestiae ut ut quas totam\nnostrum rerum est autem sunt rem eveniet architecto"}

{
  "userId": 1,
  "id": 1,
  "title": "sunt aut facere repellat provident occaecati excepturi optio reprehenderit",
  "body": "quia et suscipit\nsuscipit recusandae consequuntur expedita et cum\nreprehenderit molestiae ut ut quas totam\nnostrum rerum est autem sunt rem eveniet architecto"
}

As you can see, the pretty printed JSON is easier to read and debug.
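Response bodies are not always valid JSON; an error page or an empty body is common while debugging. The helper below is a hypothetical sketch (the name pretty_body and the sample bodies are assumptions) that pretty-prints when it can and otherwise returns the text unchanged:

```python
import json


def pretty_body(text):
    """Return text re-indented as JSON, or unchanged if it is not valid JSON."""
    try:
        return json.dumps(json.loads(text), indent=2)
    except json.JSONDecodeError:
        return text


# Hypothetical response bodies standing in for response.text.
print(pretty_body('{"userId": 1, "id": 1}'))
print(pretty_body("<html>502 Bad Gateway</html>"))
```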

Logging structured data

Traditional log messages are often plain text. When you need to log complex events, user actions, or system states that are best represented as JSON (e.g., a full request payload, an error context, or a processed data record), logging it as a single, unformatted string makes logs difficult to read, parse, and analyze. Pretty-printing JSON within your logs makes them immediately understandable, especially when manually sifting through log files or using log aggregation tools that might not automatically format JSON.

This is suitable for local development. However, in production environments, it's better to use a log management system (e.g., ELK, Splunk, DataDog) to parse the data into searchable fields. This is more efficient for large volumes and automated analysis.
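For local development, one possible pattern is to pretty-print a payload only at DEBUG level. The sketch below makes several assumptions: the logger name, the payload, and the in-memory stream (used here so the example is self-contained) are all illustrative.

```python
import io
import json
import logging

# Send log records to an in-memory stream so the sketch is self-contained.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

logger = logging.getLogger("payload_debug")  # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.addHandler(handler)
logger.propagate = False

payload = {"user": "alice", "action": "login", "ok": True}  # hypothetical payload

# Format only when DEBUG is enabled, so the dumps() cost is skipped otherwise.
if logger.isEnabledFor(logging.DEBUG):
    logger.debug("Request payload:\n%s", json.dumps(payload, indent=2))

log_output = stream.getvalue()
print(log_output)
```

Because an indented payload spans several lines, this style fits a console or a plain log file better than a line-oriented log shipper.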

Improving readability of config files

Configuration files are the backbone of many applications, defining settings, database connections, API keys, and more. While many are manually written and thus already formatted, pretty-printing JSON becomes invaluable in several scenarios:

When a script or application generates or modifies a JSON config file, it might save it in a minified format. Pretty-printing ensures that the saved file is human-readable for subsequent manual edits or review.
If you receive a JSON config file from an external source, pretty-printing it can quickly reveal its structure and content, making it easier to validate against expectations.
When changes are made to a JSON config file, pretty-printing with consistent indentation (and optionally sorted keys) ensures that diffs in version control systems (like Git) are clean and meaningful, showing only the actual content changes rather than formatting shifts.
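The version-control point can be illustrated with the standard difflib module. In this sketch the two config revisions and the normalize helper are made up for illustration: both revisions are normalized with a fixed indent and sorted keys, so the diff contains only the value that actually changed even though the key order differs.

```python
import difflib
import json


def normalize(text):
    """Re-serialize JSON text with a fixed indent and sorted keys."""
    return json.dumps(json.loads(text), indent=2, sort_keys=True) + "\n"


# Hypothetical config revisions: keys reordered, one value changed.
old_config = '{"debug": false, "port": 8080, "host": "localhost"}'
new_config = '{"host": "localhost", "port": 9090, "debug": false}'

diff_lines = list(
    difflib.unified_diff(
        normalize(old_config).splitlines(),
        normalize(new_config).splitlines(),
        fromfile="config.json (old)",
        tofile="config.json (new)",
        lineterm="",
    )
)
print("\n".join(diff_lines))
```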

FAQs

1. What’s the best way to indent JSON output in Python?

The best and easiest way to indent JSON output in Python is by using the indent parameter in the json.dumps() function.

				
import json

data = {"name": "Alice", "age": 30, "hobbies": ["reading", "chess", "hiking"]}

# Indent JSON output by 4 spaces
json_string = json.dumps(data, indent=4)
print(json_string)

2. What's the difference between json.dumps() and pprint?

json.dumps() converts Python objects (like a dict or list) to a JSON-formatted string, so use it when you need to serialize Python data into valid JSON. pprint() pretty-prints any Python data structure using Python's own literal syntax (single quotes, True, None). It is usually used for debugging or for displaying nested Python objects, and its output is generally not valid JSON.
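A short comparison makes the difference concrete. This sketch uses a small made-up dictionary and pprint.pformat(), which returns the text that pprint() would print:

```python
import json
import pprint

data = {"active": True, "manager": None, "name": "Alice"}

python_repr = pprint.pformat(data)  # Python literal syntax
json_text = json.dumps(data)        # JSON syntax

print(python_repr)
print(json_text)

# Only the json.dumps() result can be parsed back as JSON.
try:
    json.loads(python_repr)
    repr_is_valid_json = True
except json.JSONDecodeError:
    repr_is_valid_json = False
print(repr_is_valid_json)
```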

3. Is there a tool to automatically pretty print JSON in Python scripts?

There are several tools to automatically pretty-print JSON in Python scripts. Here are a few options:

You can use json.tool in the terminal to pretty-print JSON from a file or standard input:

				
python -m json.tool input.json

jq is another powerful and fast tool for formatting and querying JSON:

				
jq . input.json

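jq is a standalone command-line tool rather than a Python package, so you will first have to install it with your system's package manager (for example, apt install jq or brew install jq). Note that pip install jq installs Python bindings, not the jq command itself.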

Many editors like VS Code, PyCharm, and Sublime Text have built-in or plugin-based JSON formatters that you can use.

4. When should I use orjson instead of the built-in json module?

You should switch to orjson when performance is a critical factor. orjson is significantly faster than the standard json library, making it the ideal choice for high-throughput applications like web APIs, data processing pipelines, or any system where serialization and deserialization speed is a bottleneck. Additionally, orjson natively supports types like datetime and dataclasses, which can simplify your code by eliminating the need for custom handlers.

5. Besides debugging, what are other practical uses for pretty-printing JSON?

While debugging is the most common use case, pretty-printing JSON is valuable in several other scenarios. It is essential for creating human-readable configuration files (.json), generating clear and understandable API documentation with example responses, and for logging structured data in a way that is easy for developers to inspect and analyze later. Any situation where a human needs to read or verify structured data can benefit from pretty-printing.

Conclusion

Pretty-printing JSON isn't merely about making your JSON data look pretty; it's a powerful technique to improve readability and enhance your debugging capabilities. With Python's json.dumps() function and the pprint module, you can quickly format output for better clarity. You can also use advanced parameters to handle non-ASCII characters, serialize custom Python objects, and fine-tune whitespace for compact output.

This extends beyond simple output, proving valuable for debugging API responses, logging structured data, and improving readability of config files. You are now equipped to tackle more advanced challenges, from processing massive JSON files with streaming parsers like ijson to creating your own classes for full control over custom data types. By exploring high-performance alternative libraries like orjson and simplejson, you can optimize your applications for speed and efficiency. It's a small change with a big impact on your development experience, elevating your ability to manage data in any scenario.
