The Day 15 Files Became One

Meet Sam. Four months ago, Sam joined a fast-growing document management startup as a backend developer. The company's product processes documents in various formats: JSON, XML, YAML, CSV, and more. Sam's job? Build a robust document parsing system.

Sounds straightforward, right? It was... until it wasn't.

It's Thursday, 4:23 PM. Sam is in a meeting with the product team.

Product Manager (Emma): "Great news! We just signed a huge enterprise client. They need support for TOML, INI, and PDF text extraction. Can you add those by Monday?"

Sam (internally screaming): "Sure, let me check the code..."

Sam opens the laptop. The color drains from Sam's face.

The Current Code

Here's what Sam has been dealing with:

// THE NIGHTMARE - Sam's current parser selection code.

// In document_service.rs.
pub fn parse_document(content: &str, format: &str) -> Result<ParsedDoc, Error> {
    let parser: Box<dyn Parser> = if format == "json" {
        Box::new(JsonParser::new())
    } else if format == "xml" {
        Box::new(XmlParser::new())
    } else if format == "yaml" {
        Box::new(YamlParser::new())
    } else if format == "csv" {
        Box::new(CsvParser::new())
    } else if format == "markdown" {
        Box::new(MarkdownParser::new())
    } else if format == "html" {
        Box::new(HtmlParser::new())
    } else if format == "toml" {
        Box::new(TomlParser::new())
    } else if format == "ini" {
        Box::new(IniParser::new())
    } else {
        return Err(Error::UnsupportedFormat(format.to_string()));
    };

    parser.parse(content)
}

// In file_processor.rs - SAME CODE AGAIN!
pub fn process_file(path: &Path) -> Result<ParsedDoc, Error> {
    let extension = path.extension()...;
    let parser: Box<dyn Parser> = if extension == "json" {
        Box::new(JsonParser::new())
    } else if extension == "xml" {
        Box::new(XmlParser::new())
    } else if extension == "yaml" {
        Box::new(YamlParser::new())
    }
    // ... DUPLICATED 8 MORE TIMES!
}

// In api_handler.rs - SAME CODE THIRD TIME!
async fn handle_upload(file: MultipartFile) -> Result<Response> {
    let format = detect_format(&file);
    let parser: Box<dyn Parser> = if format == "json" {
        Box::new(JsonParser::new())
    }
    // ... DUPLICATED AGAIN!
}

// In batch_processor.rs - FOURTH TIME!
// In validator.rs - FIFTH TIME!
// In converter.rs - SIXTH TIME!
// ... This pattern repeats in 15+ FILES!

Let me break down what's happening here for those new to Rust.

The code is trying to create a "parser": a piece of code that reads a document and makes sense of its structure. Since different file formats (JSON, XML, YAML) need different parsing logic, Sam created a separate parser type for each one. The Box<dyn Parser> syntax is Rust's way of saying "give me any type that knows how to parse, and I don't care which specific one." That's actually a good idea!

The problem isn't the concept; it's the execution. Every single file that needs a parser has to include this massive decision tree. Want to parse a document in the API handler? Copy the if-else chain. Need to parse in the batch processor? Copy it again. File validator? You guessed it, copy it a third time. Now imagine adding a new format like TOML. You'd have to hunt down every single copy of this chain and add a new branch. Miss one? That's a bug waiting to happen in production.

If you've been coding for any length of time, you've probably felt that sinking feeling Sam is experiencing right now. That moment when you realise the codebase has grown into something unwieldy, and every small change requires touching a dozen files.

Sam's internal monologue:

"Every time we add a new format, I have to modify FIFTEEN different files. Each with the same giant if-else chain. Last time I added the HTML parser, I missed updating it in two places and we had a production bug. How am I supposed to add THREE new formats by Monday?!"

The specific horrors:

Massive Duplication: Same if-else chain in 15+ files.
Error-Prone: Easy to miss updating one place.
No Single Source of Truth: Parser selection logic scattered everywhere.
Hard to Test: Can't test parser creation independently.
No Extensibility: Can't add parsers without editing existing code.
Maintenance Nightmare: Every change touches multiple files.

The Parser Chaos

Let's break down exactly what's wrong with Sam's code. If you're new to software design, this section will help you recognise patterns in your own projects that might be heading toward the same cliff.

Problem 1: The If-Else Chain

// This appears in 15 different files!
let parser: Box<dyn Parser> = if format == "json" {
    Box::new(JsonParser::new())
} else if format == "xml" {
    Box::new(XmlParser::new())
} else if format == "yaml" {
    Box::new(YamlParser::new())
} else if format == "csv" {
    Box::new(CsvParser::new())
} else if format == "markdown" {
    Box::new(MarkdownParser::new())
} else if format == "html" {
    Box::new(HtmlParser::new())
} else if format == "toml" {
    Box::new(TomlParser::new())
} else if format == "ini" {
    Box::new(IniParser::new())
} else {
    return Err(Error::UnsupportedFormat(format.to_string()));
};

Problems with this approach:

Length: 18 lines just to create a parser.
Duplication: Repeated 15+ times across the codebase.
Fragility: Typo in one place equals a bug.
Not Extensible: Can't add formats without modifying existing code.

Problem 2: No Abstraction

Every piece of code that needs a parser must know all possible parser types, how to construct each one, the exact string matching logic, and the mapping from format string to parser type. This violates something called the Dependency Inversion Principle, one of the fundamental rules of clean software design. In simple terms, high-level code (like your document service) shouldn't depend on low-level implementation details (like knowing about JsonParser, XmlParser, and every other concrete parser type).

https://stackify.com/dependency-inversion-principle/

Think of it like this: when you order coffee at a cafe, you don't need to know the exact temperature of the water, the grind size of the beans, or which brand of espresso machine they're using. You just say "I'll have a latte" and the barista handles the details. That's the abstraction we're missing here.
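To make the principle concrete, here is a minimal sketch rather than Sam's actual code. The names Parse, TrimmingParser, and describe are hypothetical and exist only for this illustration; the point is that the high-level function names the trait and never a concrete parser type.

```rust
/// Hypothetical abstraction: the only thing high-level code knows about.
pub trait Parse {
    fn parse(&self, content: &str) -> Result<String, String>;
}

/// Hypothetical low-level detail: one concrete implementation.
pub struct TrimmingParser;

impl Parse for TrimmingParser {
    fn parse(&self, content: &str) -> Result<String, String> {
        let trimmed = content.trim();
        if trimmed.is_empty() {
            Err("empty document".to_string())
        } else {
            Ok(trimmed.to_string())
        }
    }
}

/// High-level code: depends on the `Parse` trait, never on a concrete type.
pub fn describe(parser: &dyn Parse, content: &str) -> String {
    match parser.parse(content) {
        Ok(text) => format!("parsed {} bytes", text.len()),
        Err(e) => format!("failed: {}", e),
    }
}
```

Calling describe(&TrimmingParser, "  hello  ") works today, and it would keep working unchanged if a different type implementing Parse were passed in instead.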

Problem 3: Testing

#[test]
fn test_parse_json() {
    let format = "json";

    // To test, must include full if-else chain.
    let parser: Box<dyn Parser> = if format == "json" {
        Box::new(JsonParser::new())
    } else if format == "xml" {
        // ... need entire chain!
    };

    // Can't easily mock or stub.
    // Can't test parser creation in isolation.
}

When you can't test something easily, you tend not to test it at all. And untested code is a ticking time bomb waiting to explode in production.

The Real-World Impact

Sam keeps a log of incidents:

Week 1: Added HTML parser, forgot to update batch_processor.rs
        → Batch jobs crashed, 2 hours downtime

Week 2: Typo "yml" vs "yaml" in one file caused inconsistent behavior
        → Customer complaints, had to hotfix

Week 3: Added TOML parser, updated 14 files, missed 1
        → CLI tool couldn't parse TOML, support tickets

Week 4: Tried to write unit test, realized it's impossible to test
        → No tests = more bugs in production

Week 5: Boss wants 3 new formats by Monday
        → Sam considers quitting

What's cyclomatic complexity? It's a fancy way of measuring how many independent paths your code can take. Each if or else-if condition adds one to the count. Sam's chain has eight conditions, so its cyclomatic complexity is 8 + 1 = 9: nine paths through that function, which means nine ways things can go wrong and nine scenarios you need to test. That's way too high for what should be a simple "give me the right parser" operation.

"I'm not solving problems anymore. I'm just copying and pasting if-else statements. There has to be a better way!"

The Breaking Point

Friday morning. Sam is called into a meeting with Emma (Product Manager) and the CTO, David.

Emma: "Sam, I know we just talked about adding three new formats, but I have more news..."

Sam (nervously): "More?"

Emma: "Our new enterprise client uses 12 different document formats. They need support for all of them. Can we…"

Sam (interrupting): "No. We can't. Not with the current architecture."

David: "What do you mean?"

Sam (frustrated): "Every time we add a new format, I have to modify 15 different files. Each one has the same giant if-else chain. It takes hours, it's error-prone, and I've already introduced bugs three times this month."

Sam opens the laptop and shows them the code.

Sam: "Look at this. This if-else chain is duplicated everywhere. Adding your 12 formats means updating it 15 times per format. That's 180 places I have to modify. And if I miss even ONE, we get production bugs."

David (understanding dawning): "Oh. That's... not good."

Emma: "Is there a way to fix this?"

David: "Yes. We need to refactor. Sam, have you heard of the Factory Pattern?"

Sam: "Factory? Like a factory that makes things?"

David draws on the whiteboard:

Sam: "So instead of everyone knowing how to create parsers, they just ask the factory?"

David: "Right! And we can make it even better with a registry. That way, you can add new parsers without even modifying the factory."

Sam (excited): "So adding the 12 new formats would just mean creating 12 new parser structs and registering them?"

David: "Exactly. No changes to existing code."

Emma: "How long would this refactoring take?"

David: "A few days for the factory. But then adding new formats becomes trivial, maybe 30 minutes each."

Sam: "Let's do it. I can't keep maintaining this mess."

Enter the Factory Pattern

Sam spends the weekend reading about the Factory Pattern. Monday morning, armed with coffee and determination, Sam is ready to refactor.

What is the Factory Pattern?

https://refactoring.guru/design-patterns/factory-method

The Factory Pattern is what we call a creational design pattern. Design patterns are essentially battle-tested solutions to common programming problems, recipes that generations of developers have refined over decades. The Factory Pattern specifically deals with the problem of creating objects.

Here's the core insight: instead of scattering object creation code throughout your application, you centralise it in one place. That "one place" is the factory.

The Core Idea

Instead of this:

Client Code → Directly creates object
           → Knows about all concrete types
           → Scattered creation logic

We do this:

Client Code → Asks factory for object
Factory → Creates the right type
       → Centralizes creation logic
       → Hides concrete types

It's like the difference between cooking at home versus ordering from a restaurant. When you cook at home, you need to know all the recipes, have all the ingredients, and understand every technique. When you order from a restaurant, you just say what you want and the kitchen handles everything.

Let's break down what each piece does:

Product Interface (DocumentParser): This is a trait in Rust (similar to an interface in other languages). It defines a contract, "here's what all parsers must be able to do." Every parser promises to implement parse(), name(), supported_extensions(), and description().
Concrete Products (JsonParser, XmlParser, etc.): These are the actual implementations. Each one knows how to parse its specific format. JsonParser knows JSON, XmlParser knows XML, and so on.
Factory (ParserFactory): This is the star of the show. It takes a format string like "json" and returns the appropriate parser. Client code never needs to know which concrete type it's getting; it just works with the trait.
Client: This is any code that needs a parser. It asks the factory, gets a parser, and uses it. Simple.

Real-World Analogy

David's explanation:

"Think of a car factory. You walk in and say 'I want a sedan.' You don't need to know:

How to build a sedan
What parts it needs
The assembly process

The factory handles all that. You just get your car.

Same with the parser factory. You say 'I want a JSON parser.' You don't need to know:

How to construct it
What dependencies it has
Implementation details

The factory gives you a parser. You just use it."

Factory Pattern & Our Code

Let's dive deeper into how the Factory Pattern works and why it solves Sam's problem.

Before vs After

BEFORE (Without Factory):

// In document_service.rs.
let parser = if format == "json" {
    Box::new(JsonParser::new())
} else if format == "xml" {
    Box::new(XmlParser::new())
}
// ... 8 more branches.

// In file_processor.rs.
let parser = if format == "json" {  // DUPLICATED!
    Box::new(JsonParser::new())
} else if format == "xml" {
    Box::new(XmlParser::new())
}
// ... 8 more branches.

// In api_handler.rs.
let parser = if format == "json" {  // DUPLICATED AGAIN!
    Box::new(JsonParser::new())
}
// ... you get the idea.

AFTER (With Factory):

// In document_service.rs
let parser = ParserFactory::create(format)?;

// In file_processor.rs
let parser = ParserFactory::create(format)?;  // Same call, but OK!

// In api_handler.rs
let parser = ParserFactory::create(format)?;  // Consistent!

From 18 lines per file to 1 line per file!

The magic here isn't just that the code is shorter; it's that every file now does the same thing in the same way. If you need to change how parsers are created, you change it in ONE place, and every caller picks up the change automatically.

This diagram shows the dance that happens when you request a parser. The client never talks directly to JsonParser or XmlParser, it only talks to the Factory. The Factory is the matchmaker, connecting requests to the right implementations.

Key Benefits

Centralised Creation: One place to modify when adding or changing parsers.
Encapsulation: Hides construction complexity from client code.
Flexibility: Easy to change what's created without affecting callers.
Testability: Parser creation can be tested in one place, and any DocumentParser implementation can be swapped for a test double.
Maintainability: Changes don't ripple across the codebase.

The SOLID Principles

If you're not familiar with SOLID, these are five principles of object-oriented design that help create maintainable software. The Factory Pattern naturally supports several of them:

1. Single Responsibility Principle

Each piece of code should do one thing and do it well:

Factory's only job: create parsers
Parser's only job: parse documents
Clear separation of concerns

2. Open/Closed Principle

Software should be open for extension but closed for modification. With our factory:

Open for extension: we can add new parsers
Closed for modification: we don't need to change existing client code
This is achieved through the registry pattern (coming up!)

3. Dependency Inversion Principle

High-level code shouldn't depend on low-level details:

Client depends on the Parser trait (abstraction)
Not on JsonParser, XmlParser (concrete types)
Factory bridges the gap

The Solution

"Alright," Sam says, opening a fresh terminal. "Let's build this factory!"

https://github.com/kartikmehta8/document-parser-api

Step 1: Define the Parser Trait

First, we need the interface that all parsers will implement. In Rust, we use a trait for this. It's like a contract that says "any type that wants to be a parser must provide these capabilities."

Create src/parser_trait.rs:

use std::error::Error;

/// The core Parser trait - all parsers implement this.
///
/// This is the "Product" interface in Factory Pattern terminology.
pub trait DocumentParser: Send + Sync {
    /// Parse the document content and return structured data.
    fn parse(&self, content: &str) -> Result<serde_json::Value, Box<dyn Error>>;

    /// Get the parser name. (e.g., "JSON", "XML")
    fn name(&self) -> &str;

    /// Get supported file extensions. (e.g., ["json", "jsonl"])
    fn supported_extensions(&self) -> Vec<&str>;

    /// Get a human-readable description.
    fn description(&self) -> &str;
}

What's happening here?

The DocumentParser trait defines four methods that every parser must implement. The Send + Sync bounds are Rust-specific: they require every parser type to be safe to send to, and share between, threads. This matters because our API might handle multiple requests simultaneously.

The parse method returns Result<serde_json::Value, Box<dyn Error>>. In Rust, Result is how we handle operations that might fail. It's either Ok(value) for success or Err(error) for failure. serde_json::Value is a flexible type that can represent any JSON structure, perfect for our parsed documents.
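To see what the Send + Sync bounds buy us, here is a small sketch that shares one parser across several threads. It is an illustration under assumptions, not the project's code: the trait is a simplified stand-in that returns a String instead of serde_json::Value so the example needs no external crates, and EchoParser and parse_in_parallel are hypothetical names.

```rust
use std::sync::Arc;
use std::thread;

/// Simplified stand-in for the article's trait (assumption: returns a
/// `String` instead of `serde_json::Value` so the sketch needs no crates).
pub trait DocumentParser: Send + Sync {
    fn parse(&self, content: &str) -> Result<String, String>;
}

/// Hypothetical parser used only for this illustration.
pub struct EchoParser;

impl DocumentParser for EchoParser {
    fn parse(&self, content: &str) -> Result<String, String> {
        Ok(content.to_uppercase())
    }
}

/// Parse each input on its own thread, sharing one parser instance.
pub fn parse_in_parallel(parser: Arc<dyn DocumentParser>, inputs: Vec<String>) -> Vec<String> {
    let handles: Vec<_> = inputs
        .into_iter()
        .map(|input| {
            let parser = Arc::clone(&parser);
            // `thread::spawn` requires the closure to be `Send`; that holds
            // because the trait demands `Send + Sync` of every implementor.
            thread::spawn(move || parser.parse(&input).unwrap_or_default())
        })
        .collect();

    // Join in spawn order so results line up with the inputs.
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

If the Send + Sync bounds were removed from the trait, the thread::spawn call would be rejected at compile time, which is exactly the kind of mistake the bounds are there to catch.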

Sam's note: "This trait defines what ALL parsers must do. Any parser can be used interchangeably because they all implement this interface. That's the power of abstraction!"

Step 2: Create Concrete Parsers

Now let's build the actual parsers. Each one implements the trait in its own way.

Create src/parsers/json_parser.rs:

use crate::parser_trait::DocumentParser;
use std::error::Error;

/// JSON document parser.
pub struct JsonParser;

impl JsonParser {
    pub fn new() -> Self {
        Self
    }
}

impl DocumentParser for JsonParser {
    fn parse(&self, content: &str) -> Result<serde_json::Value, Box<dyn Error>> {
        // Parse JSON using serde_json.
        let value: serde_json::Value = serde_json::from_str(content)?;
        Ok(value)
    }

    fn name(&self) -> &str {
        "JSON"
    }

    fn supported_extensions(&self) -> Vec<&str> {
        vec!["json", "jsonl"]
    }

    fn description(&self) -> &str {
        "Parses JSON (JavaScript Object Notation) documents"
    }
}

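Notice how simple this is! The JsonParser struct is actually empty; it's what Rust calls a "zero-sized type." It doesn't need to store any data; it just provides the parsing behaviour. One caveat: serde_json::from_str expects exactly one JSON value, so a multi-line JSON Lines (.jsonl) file would be rejected as written and would need line-by-line handling.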

The ? operator in serde_json::from_str(content)? is Rust's way of propagating errors. If parsing fails, the error automatically gets returned to the caller. No try-catch blocks needed.
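Here is a tiny standard-library-only sketch of the same error propagation, in case the mechanics are new to you. The parse_pair helper is hypothetical and not part of Sam's code; it simply shows two different error types both being converted into Box<dyn Error> by the ? operator.

```rust
use std::error::Error;

/// Hypothetical helper: parse "key=value" where value must be an integer.
pub fn parse_pair(line: &str) -> Result<(String, i64), Box<dyn Error>> {
    // A missing '=' becomes an error; `?` converts the &str message
    // into a Box<dyn Error> and returns early.
    let (key, raw) = line.split_once('=').ok_or("expected key=value")?;

    // A failed integer parse yields a ParseIntError; `?` boxes it
    // and returns early in the same way.
    let value: i64 = raw.trim().parse()?;

    Ok((key.trim().to_string(), value))
}
```

The caller sees a single error type no matter which step failed, which is the same reason the parsers in this article can all share one return type.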

Let's create a YAML parser too:

Create src/parsers/yaml_parser.rs:

use crate::parser_trait::DocumentParser;
use std::error::Error;

/// YAML document parser.
pub struct YamlParser;

impl YamlParser {
    pub fn new() -> Self {
        Self
    }
}

impl DocumentParser for YamlParser {
    fn parse(&self, content: &str) -> Result<serde_json::Value, Box<dyn Error>> {
        // Parse YAML and convert to JSON value.
        let value: serde_json::Value = serde_yaml::from_str(content)?;
        Ok(value)
    }

    fn name(&self) -> &str {
        "YAML"
    }

    fn supported_extensions(&self) -> Vec<&str> {
        vec!["yaml", "yml"]
    }

    fn description(&self) -> &str {
        "Parses YAML (YAML Ain't Markup Language) documents"
    }
}

Sam: "Each parser is now a standalone module. They don't know about each other. Perfect separation of concerns!"

Step 3: Create the Factory

Now for the main event, the factory itself! This is where the magic happens.

Create src/factory.rs:

use crate::parser_trait::DocumentParser;
use crate::parsers::*;
use std::sync::Arc;

/// Parser Factory - Creates parsers based on format.
///
/// This is the FACTORY in Factory Pattern terminology.
pub struct ParserFactory;

impl ParserFactory {
    /// Create a parser based on format string.
    ///
    /// This is a STATIC FACTORY METHOD - no instance needed.
    pub fn create(format: &str) -> Result<Arc<dyn DocumentParser>, String> {
        match format.to_lowercase().as_str() {
            "json" | "jsonl" => Ok(Arc::new(JsonParser::new())),
            "xml" | "xhtml" => Ok(Arc::new(XmlParser::new())),
            "yaml" | "yml" => Ok(Arc::new(YamlParser::new())),
            "csv" | "tsv" => Ok(Arc::new(CsvParser::new())),
            "md" | "markdown" => Ok(Arc::new(MarkdownParser::new())),
            _ => Err(format!("Unsupported format: {}", format)),
        }
    }

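    /// Create parser from filename extension.
    pub fn create_from_filename(filename: &str) -> Result<Arc<dyn DocumentParser>, String> {
        // `rsplit_once` returns None when the name contains no '.',
        // so an extensionless name is reported as such instead of
        // being treated as a format.
        let (_, extension) = filename
            .rsplit_once('.')
            .ok_or_else(|| "No file extension found".to_string())?;

        Self::create(extension)
    }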
}

What just happened?

Let's break this down piece by piece:

match expression: This replaces the ugly if-else chain. Rust's match must be exhaustive: the compiler rejects it unless every possible input is handled, which is why the _ arm is there to catch anything not explicitly matched.
format.to_lowercase(): We normalise the input so "JSON", "Json", and "json" all work the same way. Small detail, big usability improvement.
Arc<dyn DocumentParser>: Arc stands for "Atomic Reference Counted." It's a smart pointer that allows multiple parts of your code to share ownership of the same parser safely. The dyn DocumentParser means "any type that implements DocumentParser"; that's how we achieve polymorphism in Rust.
create_from_filename: A convenience method. Given a file name like "data.json", it extracts the text after the last dot ("json") and creates the right parser.
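File names have edge cases: no dot at all, a leading dot, or several dots. One way to explore them, assuming a hypothetical helper named extension_of that is not part of Sam's code, is to lean on std::path::Path from the standard library:

```rust
use std::path::Path;

/// Hypothetical helper: return the lowercase extension of a file name, if any.
pub fn extension_of(filename: &str) -> Option<String> {
    Path::new(filename)
        .extension() // None when the name has no extension
        .and_then(|ext| ext.to_str()) // None if the extension is not valid UTF-8
        .map(|ext| ext.to_lowercase())
}
```

With this helper, "data.JSON" yields "json" and "archive.tar.gz" yields "gz", while "README" and ".gitignore" both yield None, so a caller can report a missing extension instead of guessing a format.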

The transformation:

// Before: 18 lines of if-else per file.
// After: 1 line.
let parser = ParserFactory::create("json")?;

Sam (amazed): "That's it? That's the whole factory?"

David: "Yep! Simple, clean, and now there's only ONE place to update when adding new formats."

Step 4: Using the Factory

Now let's update the document service to use our shiny new factory:

// BEFORE (the nightmare):
pub fn parse_document(content: &str, format: &str) -> Result<ParsedDoc, Error> {
    let parser: Box<dyn Parser> = if format == "json" {
        Box::new(JsonParser::new())
    } else if format == "xml" {
        Box::new(XmlParser::new())
    } else if format == "yaml" {
        Box::new(YamlParser::new())
    } else if format == "csv" {
        Box::new(CsvParser::new())
    } else if format == "markdown" {
        Box::new(MarkdownParser::new())
    } else {
        return Err(Error::UnsupportedFormat(format.to_string()));
    };

    parser.parse(content)
}

// AFTER (clean and beautiful):
// Assumes `ParsedDoc` is the type the parser returns and that `Error`
// implements `From` for the parser's boxed error type.
pub fn parse_document(content: &str, format: &str) -> Result<ParsedDoc, Error> {
    let parser = ParserFactory::create(format)
        .map_err(Error::UnsupportedFormat)?;

    parser.parse(content).map_err(Error::from)
}

From 18 lines to 5 lines! And this same transformation happens in ALL 15 files that needed parser creation.
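The refactored parse_document leans on an application-level Error type that this article never shows. A minimal sketch of what such a type might look like follows. Only the UnsupportedFormat variant appears in Sam's code; the Parse variant, the From implementation, and the failing_parse helper are assumptions made for illustration.

```rust
use std::error::Error as StdError;
use std::fmt;

/// Hypothetical application error; only `UnsupportedFormat` appears in the
/// article, the `Parse` variant is an assumption for this sketch.
#[derive(Debug)]
pub enum Error {
    UnsupportedFormat(String),
    Parse(String),
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::UnsupportedFormat(msg) => write!(f, "{}", msg),
            Error::Parse(msg) => write!(f, "parse error: {}", msg),
        }
    }
}

impl StdError for Error {}

/// Lets `map_err(Error::from)` turn a parser's boxed error into `Error`.
impl From<Box<dyn StdError>> for Error {
    fn from(err: Box<dyn StdError>) -> Self {
        Error::Parse(err.to_string())
    }
}

/// Hypothetical stand-in for a parser call that fails.
pub fn failing_parse() -> Result<i32, Box<dyn StdError>> {
    let n: i32 = "not a number".parse()?;
    Ok(n)
}
```

With a type along these lines, a failed factory lookup and a failed parse end up as two distinct variants, so callers can tell "we don't support this format" apart from "the document is malformed."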

Step 5: Testing the Factory

One of the biggest wins with the Factory Pattern is testability. Now we can test parser creation independently:

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_factory_creates_json_parser() {
        let parser = ParserFactory::create("json").unwrap();
        assert_eq!(parser.name(), "JSON");
    }

    #[test]
    fn test_factory_creates_yaml_parser() {
        let parser = ParserFactory::create("yaml").unwrap();
        assert_eq!(parser.name(), "YAML");
    }

    #[test]
    fn test_factory_handles_case_insensitivity() {
        let parser1 = ParserFactory::create("JSON").unwrap();
        let parser2 = ParserFactory::create("json").unwrap();
        assert_eq!(parser1.name(), parser2.name());
    }

    #[test]
    fn test_factory_from_filename() {
        let parser = ParserFactory::create_from_filename("data.json").unwrap();
        assert_eq!(parser.name(), "JSON");
    }

    #[test]
    fn test_factory_unsupported_format() {
        let result = ParserFactory::create("unsupported");
        assert!(result.is_err());
    }

    #[test]
    fn test_parse_json() {
        let parser = ParserFactory::create("json").unwrap();
        let content = r#"{"name": "test", "value": 42}"#;
        let result = parser.parse(content);
        assert!(result.is_ok());
    }
}

Sam: "Look! I can test the factory independently! And each parser independently! This was impossible before!"
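Because client code only sees the trait, it can also be tested with a stand-in parser instead of a real one. The sketch below assumes a simplified version of the trait (returning a String so no external crates are needed); CountingStub and parse_twice are hypothetical names used only to show the idea of a test double.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Simplified stand-in for the article's `DocumentParser` trait (assumption:
/// the real trait returns `serde_json::Value`; a `String` keeps this std-only).
pub trait DocumentParser: Send + Sync {
    fn parse(&self, content: &str) -> Result<String, String>;
    fn name(&self) -> &str;
}

/// Hypothetical test double that counts how often `parse` is called.
pub struct CountingStub {
    calls: AtomicUsize,
}

impl CountingStub {
    pub fn new() -> Self {
        Self { calls: AtomicUsize::new(0) }
    }

    pub fn calls(&self) -> usize {
        self.calls.load(Ordering::SeqCst)
    }
}

impl DocumentParser for CountingStub {
    fn parse(&self, content: &str) -> Result<String, String> {
        self.calls.fetch_add(1, Ordering::SeqCst);
        Ok(format!("stub:{}", content))
    }

    fn name(&self) -> &str {
        "STUB"
    }
}

/// Hypothetical client code under test: it only sees the trait object.
pub fn parse_twice(parser: Arc<dyn DocumentParser>, content: &str) -> Vec<String> {
    (0..2).filter_map(|_| parser.parse(content).ok()).collect()
}
```

A test can hand parse_twice the stub, then check both the returned values and the call count, all without any real JSON or YAML parsing being involved.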

The Registry Pattern

David: "Sam, the factory is great. But we can make it even better with a registry."

Sam: "A registry?"

David: "Right now, to add a new parser, you still have to modify the factory's match expression. What if we could register parsers dynamically?"

The Problem with the Simple Factory

Even with our nice factory, adding a new parser requires changing the factory code:

// To add TOML parser, must modify this:
pub fn create(format: &str) -> Result<Arc<dyn DocumentParser>, String> {
    match format.to_lowercase().as_str() {
        "json" | "jsonl" => Ok(Arc::new(JsonParser::new())),
        "xml" | "xhtml" => Ok(Arc::new(XmlParser::new())),
        // ... existing parsers ...
        "toml" => Ok(Arc::new(TomlParser::new())),  // NEW! Must edit factory.
        _ => Err(format!("Unsupported format: {}", format)),
    }
}

It's way better than before, but can we do even better?

The Registry Solution

A registry is like a phone book for parsers. Instead of hardcoding which parsers exist, we maintain a dynamic collection that can be modified at runtime.

Create src/registry.rs:

use crate::parser_trait::DocumentParser;
use once_cell::sync::Lazy;
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Parser Registry - Extensible parser registration system.
///
/// This allows adding new parsers WITHOUT modifying the factory!
pub struct ParserRegistry {
    parsers: RwLock<HashMap<String, Arc<dyn DocumentParser>>>,
}

impl ParserRegistry {
    fn new() -> Self {
        Self {
            parsers: RwLock::new(HashMap::new()),
        }
    }

    /// Register a new parser for a given format.
    pub fn register(&self, format: String, parser: Arc<dyn DocumentParser>) {
        let mut parsers = self.parsers.write().unwrap();
        parsers.insert(format.to_lowercase(), parser);
    }

    /// Get a parser by format name.
    pub fn get(&self, format: &str) -> Option<Arc<dyn DocumentParser>> {
        let parsers = self.parsers.read().unwrap();
        parsers.get(&format.to_lowercase()).cloned()
    }

    /// List all registered parsers.
    pub fn list_all(&self) -> Vec<(String, Arc<dyn DocumentParser>)> {
        let parsers = self.parsers.read().unwrap();
        parsers
            .iter()
            .map(|(k, v)| (k.clone(), Arc::clone(v)))
            .collect()
    }
}

/// Global parser registry singleton.
static PARSER_REGISTRY: Lazy<ParserRegistry> = Lazy::new(|| {
    let registry = ParserRegistry::new();

    // Pre-register default parsers.
    use crate::parsers::*;
    registry.register("json".to_string(), Arc::new(JsonParser::new()));
    registry.register("xml".to_string(), Arc::new(XmlParser::new()));
    registry.register("yaml".to_string(), Arc::new(YamlParser::new()));
    registry.register("csv".to_string(), Arc::new(CsvParser::new()));
    registry.register("markdown".to_string(), Arc::new(MarkdownParser::new()));

    registry
});

/// Get the global parser registry.
pub fn get_registry() -> &'static ParserRegistry {
    &PARSER_REGISTRY
}

Let's unpack what's happening here:

The RwLock<HashMap<...>> is a thread-safe way to store our parsers. RwLock allows multiple readers OR one writer at a time, perfect for a registry that's read often but written rarely.

Lazy from the once_cell crate gives us lazy initialisation. The registry isn't created until the first time someone calls get_registry(). After that, the same instance is reused forever. This is the Singleton pattern working alongside our Factory pattern.

The default parsers are registered when the registry is first accessed. But here's the key insight: you can call register() at any time to add more parsers!
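If you would rather not add a dependency, the same lazy singleton can be sketched with std::sync::OnceLock from the standard library (available in recent Rust versions). The version below is an illustration under assumptions: the trait is reduced to a single method, JsonParser is a placeholder, and the free functions registry, register, and create are hypothetical names. It also shows a factory-style create function delegating to the registry.

```rust
use std::collections::HashMap;
use std::sync::{Arc, OnceLock, RwLock};

/// Simplified stand-in for the article's trait (only `name` is needed here).
pub trait DocumentParser: Send + Sync {
    fn name(&self) -> &str;
}

/// Hypothetical placeholder parser for the sketch.
pub struct JsonParser;

impl DocumentParser for JsonParser {
    fn name(&self) -> &str {
        "JSON"
    }
}

type ParserMap = RwLock<HashMap<String, Arc<dyn DocumentParser>>>;

/// Global map, built on first access and reused afterwards.
fn registry() -> &'static ParserMap {
    static REGISTRY: OnceLock<ParserMap> = OnceLock::new();
    REGISTRY.get_or_init(|| {
        let mut map: HashMap<String, Arc<dyn DocumentParser>> = HashMap::new();
        map.insert("json".to_string(), Arc::new(JsonParser));
        RwLock::new(map)
    })
}

/// Register a parser under a case-insensitive format name.
pub fn register(format: &str, parser: Arc<dyn DocumentParser>) {
    registry().write().unwrap().insert(format.to_lowercase(), parser);
}

/// Factory-style lookup that delegates to the registry.
pub fn create(format: &str) -> Result<Arc<dyn DocumentParser>, String> {
    let map = registry().read().unwrap();
    map.get(&format.to_lowercase())
        .cloned()
        .ok_or_else(|| format!("Unsupported format: {}", format))
}
```

Client code keeps calling create("json") as before; whether the lookup is backed by a match or by a map becomes an internal detail.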

Using the Registry

// Get parser from registry.
let parser = get_registry()
    .get("json")
    .ok_or("Parser not found")?;

// Add new parser WITHOUT modifying existing code!
get_registry().register(
    "toml".to_string(),
    Arc::new(TomlParser::new())
);

Sam (excited): "So to add TOML, I just create TomlParser and register it? No changes to factory OR client code?"

David: "Exactly! You could even load parsers from plugins at runtime if you wanted!"

Adding a New Parser

Here's the complete process for adding TOML support:

// Step 1: Create the parser. (new file: src/parsers/toml_parser.rs)
use crate::parser_trait::DocumentParser;
use std::error::Error;

/// TOML document parser.
pub struct TomlParser;

impl TomlParser {
    pub fn new() -> Self {
        Self
    }
}

impl DocumentParser for TomlParser {
    fn parse(&self, content: &str) -> Result<serde_json::Value, Box<dyn Error>> {
        let value: toml::Value = toml::from_str(content)?;
        // Convert TOML value to JSON value for consistency.
        let json = serde_json::to_value(value)?;
        Ok(json)
    }

    fn name(&self) -> &str {
        "TOML"
    }

    fn supported_extensions(&self) -> Vec<&str> {
        vec!["toml"]
    }

    fn description(&self) -> &str {
        "Parses TOML (Tom's Obvious, Minimal Language) configuration files"
    }
}

// Step 2: Register it. (in main.rs or initialization code)
get_registry().register("toml".to_string(), Arc::new(TomlParser::new()));

// Step 3: Done! No other changes needed!

That's it! No modifications to:

Factory code
Client code
Other parsers
API handlers
Tests

Just:

Create new parser
Register it
It works everywhere!
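One detail worth watching: the registry's defaults are keyed only by canonical names such as "yaml", so an alias like "yml" resolves only if it is registered too. A possible way to avoid forgetting aliases is to register a parser under every extension it reports. The sketch below assumes a simplified trait and a hypothetical register_all method; it is not part of the registry shown earlier.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Simplified stand-in for the article's trait.
pub trait DocumentParser: Send + Sync {
    fn name(&self) -> &str;
    fn supported_extensions(&self) -> Vec<&str>;
}

/// Placeholder parser for the sketch.
pub struct YamlParser;

impl DocumentParser for YamlParser {
    fn name(&self) -> &str {
        "YAML"
    }

    fn supported_extensions(&self) -> Vec<&str> {
        vec!["yaml", "yml"]
    }
}

pub struct ParserRegistry {
    parsers: RwLock<HashMap<String, Arc<dyn DocumentParser>>>,
}

impl ParserRegistry {
    pub fn new() -> Self {
        Self { parsers: RwLock::new(HashMap::new()) }
    }

    /// Hypothetical addition: register one parser under every extension it
    /// reports, so aliases such as "yml" cannot be forgotten.
    pub fn register_all(&self, parser: Arc<dyn DocumentParser>) {
        let mut parsers = self.parsers.write().unwrap();
        for ext in parser.supported_extensions() {
            parsers.insert(ext.to_lowercase(), Arc::clone(&parser));
        }
    }

    /// Case-insensitive lookup by format name or extension.
    pub fn get(&self, format: &str) -> Option<Arc<dyn DocumentParser>> {
        let parsers = self.parsers.read().unwrap();
        parsers.get(&format.to_lowercase()).cloned()
    }
}
```

The parser itself becomes the single source of truth for its aliases, which is the same idea that motivated the factory in the first place.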

Sam (amazed): "This is incredible! Emma wanted 12 new formats. With the registry, I can add all 12 without touching ANY existing code!"

The Resolution

Monday Morning - The Demo

Sam calls a meeting with Emma and David to show the new system.

Sam: "Alright, remember when you wanted support for 12 new formats? Watch this."

Sam opens the terminal:

# Create a new parser (takes 5 minutes)
# src/parsers/toml_parser.rs - 30 lines of code.

# Register it (takes 1 line)
# get_registry().register("toml".to_string(), Arc::new(TomlParser::new()));

# Test it immediately.
curl -X POST http://localhost:3000/parse \
  -H "Content-Type: application/json" \
  -d '{"content": "name = \"test\"\nvalue = 42", "format": "toml"}'

# Response: Success!

Emma (amazed): "Wait, that's it? You just added TOML support in 5 minutes?"

Sam: "Yep. And I didn't touch ANY existing code. No changes to handlers, no changes to other parsers, no changes to tests."

David: "That's the power of the Factory Pattern with Registry. Show them the before/after metrics."

The Metrics That Matter

Before (If-Else Hell):

Metric	Value
Lines of duplicate code	270 lines (18 × 15 files)
Time to add new format	2-3 hours
Files to modify per format	15+ files
Bugs per new format	1-2 on average
Test coverage	12% (untestable)
Cyclomatic complexity	9 per chain
Developer happiness	2/10

After (Factory + Registry):

Metric	Value	Improvement
Lines of duplicate code	0 lines	100% elimination
Time to add new format	10-15 minutes	92% faster
Files to modify per format	1 file (new parser)	93% reduction
Bugs per new format	0 (isolated)	100% reduction
Test coverage	87% (testable!)	625% increase
Cyclomatic complexity	1-2 per function	78% reduction
Developer happiness	9/10	350% increase
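For readers who want to check the Improvement column, the percentages appear to follow from simple before/after arithmetic. The sketch below reproduces them under stated assumptions: range values use their midpoints (2.5 hours is 150 minutes, and 10-15 minutes is 12.5), and the "after" cyclomatic complexity uses the upper bound of 2. The function names are hypothetical.

```rust
/// Percentage decrease from `before` to `after`, rounded to a whole number.
pub fn percent_reduction(before: f64, after: f64) -> i64 {
    ((before - after) / before * 100.0).round() as i64
}

/// Percentage increase from `before` to `after`, rounded to a whole number.
pub fn percent_increase(before: f64, after: f64) -> i64 {
    ((after - before) / before * 100.0).round() as i64
}

/// Recompute the table's figures as (metric, percentage) pairs.
pub fn improvement_table() -> Vec<(&'static str, i64)> {
    vec![
        // 150 minutes (midpoint of 2-3 hours) down to 12.5 minutes.
        ("time to add format", percent_reduction(150.0, 12.5)),
        // 15 files down to 1 file.
        ("files to modify", percent_reduction(15.0, 1.0)),
        // 12% coverage up to 87% coverage, as a relative increase.
        ("test coverage", percent_increase(12.0, 87.0)),
        // Complexity 9 down to 2 (upper bound of "1-2").
        ("cyclomatic complexity", percent_reduction(9.0, 2.0)),
        // Happiness 2/10 up to 9/10.
        ("developer happiness", percent_increase(2.0, 9.0)),
    ]
}
```

Note that the test coverage figure is a relative increase (75 percentage points on a base of 12), which is why it looks so much larger than the others.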

Architecture Evolution

Before:

If-Else Chains Everywhere
document_service.rs    ────┐
file_processor.rs      ────┤
api_handler.rs         ────┤
batch_processor.rs     ────┤  All contain
validator.rs           ────┼─ same 18-line
converter.rs           ────┤  if-else chain
cli_tool.rs            ────┤
upload_handler.rs      ────┤
... 8 more files       ────┘

Adding format = Modify 15+ files

After:

Clean Factory + Registry
                    ┌──────────────┐
All Clients ───────▶│   Registry   │
                    │  (Singleton) │
                    └──────┬───────┘
                           │
            ┌──────────────┼──────────────┐
            ▼              ▼              ▼
        JsonParser    YamlParser    CsvParser
            ▼              ▼              ▼
        XmlParser    TomlParser    ... more

Adding format = Create 1 file + 1 line to register

Team Reaction

Emma: "So you can add all 12 formats for the enterprise client?"

Sam: "I can add them this week. Probably 2-3 hours total for all 12."

Emma (excited): "Last time you estimated 3 hours PER format!"

Sam: "That was before the refactor. Now it's just:

1. Create the parser (15 min)
2. Write tests (10 min)
3. Register it (1 line)

Done!"

David: "And the best part? The code is now maintainable. When we hire new developers, they won't need to hunt through 15 files to understand parser selection. It's all in one place."

Production Deployment

Sam deploys the new factory-based system to production.

Deployment Log:

All existing parsers working
Zero breaking changes
Response times improved (10% faster)
Memory usage down (parsers shared via Arc)
Test coverage up from 12% to 87%
Code complexity down 78%

Incident Log (Next 30 Days):

Week 1: 0 parser-related bugs (was 2-3/week)
Week 2: 0 parser-related bugs
Week 3: 0 parser-related bugs
Week 4: 0 parser-related bugs

Total: 0 bugs!

Feature Velocity:

Formats added: 12 (all enterprise client formats)
Time taken: 3.5 hours total
Bugs introduced: 0
Files changed: 12 new parser files, no existing parser or client code modified (registration is one line per format)

When to Use (and Not Use) Factory Pattern

Now that you've seen the Factory Pattern in action, let's talk about when it makes sense and when it doesn't.

Use Factory Pattern When:

1. Creating objects requires complex logic

If you find yourself with multiple conditional branches, configuration-based instantiation, or object creation that varies based on input, a factory can help centralise and simplify that logic.
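
As a rough illustration of configuration-based instantiation, the sketch below keeps every creation decision inside one create_parser function. The ParserConfig struct, its fields, and the toy CsvParser and LineParser types are all hypothetical, invented here to keep the example small.

```rust
// Hypothetical configuration; none of these fields come from Sam's codebase.
#[derive(Debug, Clone, PartialEq)]
pub struct ParserConfig {
    pub format: String,
    pub delimiter: Option<char>,
}

pub trait Parser {
    fn fields(&self, line: &str) -> Vec<String>;
}

// Splits a line on a configurable delimiter.
struct CsvParser {
    delimiter: char,
}

impl Parser for CsvParser {
    fn fields(&self, line: &str) -> Vec<String> {
        line.split(self.delimiter)
            .map(|f| f.trim().to_string())
            .collect()
    }
}

// Treats the whole line as a single field.
struct LineParser;

impl Parser for LineParser {
    fn fields(&self, line: &str) -> Vec<String> {
        vec![line.trim().to_string()]
    }
}

// All the "which type, with which settings?" logic lives here and nowhere else.
pub fn create_parser(config: &ParserConfig) -> Result<Box<dyn Parser>, String> {
    match config.format.as_str() {
        // CSV falls back to a comma when no delimiter is configured.
        "csv" => Ok(Box::new(CsvParser {
            delimiter: config.delimiter.unwrap_or(','),
        })),
        // TSV reuses the same parser type with a fixed delimiter.
        "tsv" => Ok(Box::new(CsvParser { delimiter: '\t' })),
        "text" => Ok(Box::new(LineParser)),
        other => Err(format!("unsupported format: {other}")),
    }
}
```

Callers hand over a config and get back something that implements Parser; defaults and special cases never leak out of the factory.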

2. You have a family of related classes

When you have multiple types that implement the same interface and are used interchangeably (like our parsers), a factory provides a clean way to select between them.

3. Object creation is scattered

The biggest red flag is finding the same creation code duplicated across your codebase. That's exactly what Sam was dealing with, and exactly what the factory fixed.

4. You want loose coupling

If you want client code to work with abstractions rather than concrete types, a factory acts as the bridge. Clients ask for "a parser" without knowing or caring about the specific implementation.
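
Here is a small sketch of what that loose coupling can look like from the client's side. The summarize function and the FailingParser test double are hypothetical names made up for this example, and the ParsedDoc and error types are simplified assumptions; the point is only that the client depends on the trait and nothing else.

```rust
// Simplified stand-in for the article's ParsedDoc.
#[derive(Debug, PartialEq)]
pub struct ParsedDoc {
    pub fields: Vec<String>,
}

pub trait Parser {
    fn parse(&self, content: &str) -> Result<ParsedDoc, String>;
}

// Client code: it sees only the trait, never a concrete parser type.
pub fn summarize(parser: &dyn Parser, content: &str) -> String {
    match parser.parse(content) {
        Ok(doc) => format!("parsed {} field(s)", doc.fields.len()),
        Err(e) => format!("parse failed: {e}"),
    }
}

// One "real" implementation (a toy comma splitter).
pub struct CsvParser;

impl Parser for CsvParser {
    fn parse(&self, content: &str) -> Result<ParsedDoc, String> {
        if content.is_empty() {
            return Err("empty input".to_string());
        }
        Ok(ParsedDoc {
            fields: content.split(',').map(|f| f.trim().to_string()).collect(),
        })
    }
}

// A test double: handy for exercising the client's error path.
pub struct FailingParser;

impl Parser for FailingParser {
    fn parse(&self, _content: &str) -> Result<ParsedDoc, String> {
        Err("boom".to_string())
    }
}
```

Because summarize accepts any &dyn Parser, a factory can hand it a real parser in production and a test can hand it FailingParser, with no change to the client.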

Don't Use Factory Pattern When:

1. Object creation is simple

If creating an object is just calling new() with no complex logic, a factory adds unnecessary abstraction:

// Overkill.
let user = UserFactory::create(name, email);

// Better - just use new directly.
let user = User::new(name, email);

2. You only have one concrete type

If no polymorphism is needed and there are no variants to choose from, a factory adds no value.

3. Performance is critical

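Factories add a small amount of indirection. In Rust, returning a Box<dyn Parser> typically means a heap allocation plus dynamic dispatch through a vtable, and a registry adds a map lookup on top. For most applications this is negligible, but in performance-critical hot paths, direct instantiation may be faster. Always profile before optimizing!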
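
To make the trade-off concrete, here is a sketch that puts the two call styles side by side. The count_fields method and the helper functions are hypothetical, chosen only to keep the example short, and the sketch makes no claim about how large the difference is in practice; that is what profiling is for.

```rust
pub trait Parser {
    fn count_fields(&self, line: &str) -> usize;
}

pub struct CsvParser {
    pub delimiter: char,
}

impl Parser for CsvParser {
    fn count_fields(&self, line: &str) -> usize {
        line.split(self.delimiter).count()
    }
}

// Factory style: the caller gets a boxed trait object, which here means
// a heap allocation and a vtable lookup on each method call.
pub fn make_parser() -> Box<dyn Parser> {
    Box::new(CsvParser { delimiter: ',' })
}

pub fn total_dynamic(parser: &dyn Parser, lines: &[&str]) -> usize {
    lines.iter().map(|line| parser.count_fields(line)).sum()
}

// Direct style: the concrete type is known at compile time, so the call is
// statically dispatched and the compiler is free to inline it.
pub fn total_static<P: Parser>(parser: &P, lines: &[&str]) -> usize {
    lines.iter().map(|line| parser.count_fields(line)).sum()
}
```

Both functions return the same totals; the only difference is how the call to count_fields is resolved. If a profiler points at the dynamic version in a hot loop, the generic version is one possible alternative.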

4. The added complexity isn't justified

If your team doesn't understand the pattern, or if you're solving a simple problem with a complex solution, step back and ask if you really need it. Remember YAGNI: You Aren't Gonna Need It.

Conclusion

When I first started writing backend systems, I made the same mistakes Sam did. I'd copy-paste code, add another else if branch, and tell myself "I'll clean this up later." Spoiler: later never came. Instead, the codebase grew into something I dreaded opening every morning.

The Factory Pattern wasn't something I learned from a textbook and immediately understood. It clicked for me only after I'd experienced the pain firsthand: after I'd introduced production bugs because I forgot to update one of twelve files, and after I'd spent entire weekends doing what should have been a thirty-minute task.

Here's what I want you to take away from this:

Design patterns aren't academic exercises. They're battle scars turned into blueprints. Every pattern exists because thousands of developers before us hit the same wall and figured out how to climb over it. The Factory Pattern exists because object creation gets messy, and centralising that mess into one place makes everything else cleaner.

Start simple, refactor when it hurts. I'm not saying you should use the Factory Pattern everywhere from day one. If you're building something small with two or three types, direct instantiation is fine. But pay attention to the warning signs: duplicated creation logic, fear of adding new types, bugs that keep appearing in the same places. When you feel that pain, that's when patterns become your friend.

The Registry Pattern is underrated. Combining Factory with Registry gave me something I didn't expect: the ability to extend the system without touching existing code. That's not just convenient; it's liberating. New requirements stopped feeling like a burden and started feeling like opportunities.

Rust makes this elegant. Traits, Arc, match expressions: Rust's type system practically guides you toward clean factory implementations. The compiler catches mistakes that would have been runtime bugs in other languages. If you're coming from a dynamically typed background, lean into Rust's strictness. It's not fighting you; it's protecting you.
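
One way to see that protection at work is to model formats as an enum instead of raw strings. This is a sketch under my own assumptions, not the API we built earlier: the Format enum, its FromStr implementation, and the toy parsers are hypothetical. What it shows is real, though: a match over an enum must be exhaustive, so a new variant without a matching arm is a compile error rather than a runtime surprise.

```rust
use std::str::FromStr;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Format {
    Json,
    Yaml,
    Toml,
}

// Strings from the outside world are validated once, at the boundary.
impl FromStr for Format {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.to_ascii_lowercase().as_str() {
            "json" => Ok(Format::Json),
            "yaml" | "yml" => Ok(Format::Yaml),
            "toml" => Ok(Format::Toml),
            other => Err(format!("unsupported format: {other}")),
        }
    }
}

pub trait Parser {
    fn format(&self) -> Format;
}

struct JsonParser;
impl Parser for JsonParser {
    fn format(&self) -> Format {
        Format::Json
    }
}

struct YamlParser;
impl Parser for YamlParser {
    fn format(&self) -> Format {
        Format::Yaml
    }
}

struct TomlParser;
impl Parser for TomlParser {
    fn format(&self) -> Format {
        Format::Toml
    }
}

// No wildcard arm: adding a Format variant without handling it here
// stops the build instead of failing in production.
pub fn create_parser(format: Format) -> Box<dyn Parser> {
    match format {
        Format::Json => Box::new(JsonParser),
        Format::Yaml => Box::new(YamlParser),
        Format::Toml => Box::new(TomlParser),
    }
}
```

A closed enum like this trades away some of the registry's open-ended extensibility for compile-time checking, so which one fits depends on whether new formats arrive from inside or outside your codebase.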

Looking back at the document parser API we built together, I'm genuinely proud of how it turned out. It's not over-engineered. It's not clever for the sake of being clever. It's just... clean. And clean code is code you can hand off to someone else without writing a novel of documentation. It's code you can come back to six months later and actually understand.

If you've made it this far, thank you for reading. I hope Sam's story resonated with you, and I hope the next time you find yourself drowning in if-else chains, you'll remember there's a better way.

Now go build something. And when it gets messy, because it will, you'll know what to do.
