Architecture Diagrams

This page presents various architecture diagrams of the DocuSnap-Backend system, helping readers understand the system design and component relationships from different perspectives.

Overall Architecture Diagram

The following diagram shows the overall architecture of DocuSnap-Backend, including the three-layer architecture (Backend Server, OCR Server, LLM Provider) and the main data flows.

[Figure: Overall Architecture Diagram]

Diagram Explanation:

  1. Client Layer:
     - Shows different types of clients interacting with the system
     - Communicates with the backend server via an encrypted REST API

  2. Backend Server Layer:
     - The core component of the system, based on the Flask framework
     - Contains multiple functional modules such as task processing, security encryption, and caching
     - Manages communication with the OCR server and the LLM provider

  3. OCR Server Layer:
     - Independent OCR processing service, based on CnOCR
     - Responsible for image text recognition

  4. LLM Provider Layer:
     - External LLM service (Zhipu AI)
     - Provides text analysis and information extraction capabilities

  5. Data Flow:
     - Arrows indicate the direction of data flow
     - Shows the complete path of request processing

This architecture diagram clearly shows the layered structure and component relationships of the system, helping to understand the overall design and working principles of the system.
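
The layering described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and method names (`OcrService`, `LlmService`, `BackendServer`, `recognize`, `complete`, `handle`) and the prompt text are hypothetical and are not taken from the DocuSnap-Backend code. The point is that the backend server is the only layer that talks to both the OCR server and the LLM provider.

```python
from typing import Protocol, Sequence


class OcrService(Protocol):
    """OCR Server Layer: turns image bytes into text (hypothetical interface)."""

    def recognize(self, image: bytes) -> str: ...


class LlmService(Protocol):
    """LLM Provider Layer: turns a prompt into a text answer (hypothetical interface)."""

    def complete(self, prompt: str) -> str: ...


class BackendServer:
    """Backend Server Layer: coordinates the two downstream services."""

    def __init__(self, ocr: OcrService, llm: LlmService) -> None:
        self.ocr = ocr
        self.llm = llm

    def handle(self, images: Sequence[bytes]) -> str:
        # Images go to the OCR layer first; the recognized text is then
        # forwarded to the LLM layer for analysis.
        text = "\n".join(self.ocr.recognize(img) for img in images)
        return self.llm.complete(f"Extract the key fields from:\n{text}")


# Stand-in services so the sketch runs without CnOCR or Zhipu AI.
class FakeOcr:
    def recognize(self, image: bytes) -> str:
        return image.decode("utf-8")


class FakeLlm:
    def complete(self, prompt: str) -> str:
        # Count the text lines that follow the instruction line.
        return f"analysed {len(prompt.splitlines()) - 1} line(s)"


server = BackendServer(FakeOcr(), FakeLlm())
result = server.handle([b"Invoice 42", b"Total 10.00"])
print(result)
```

Because the backend depends only on the two small interfaces, either downstream service could be swapped or mocked without touching the request-handling code.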

Core Modules Relationship Diagram

The following diagram shows the relationships and interactions between the five core modules of the DocuSnap-Backend system.

[Figure: Core Modules Relationship Diagram]

Diagram Explanation:

  1. Task Processing Module:
     - Located at the center of the system, coordinating the work of other modules
     - Manages task queues and worker threads
     - Interacts with all other modules

  2. OCR Processing Module:
     - Responsible for image processing and OCR service calls
     - Interacts with the Task Processing Module and LLM Processing Module

  3. LLM Processing Module:
     - Responsible for prompt construction and LLM API calls
     - Processes OCR results, generates structured output

  4. Security & Encryption Module:
     - Provides end-to-end encryption and request validation
     - Protects communication security with clients

  5. Cache & Persistence Module:
     - Manages storage of task status and results
     - Provides cache query and cleanup functions

The connections between modules represent their dependencies and data flows, with different colors distinguishing the functional scope of different modules.
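
As a rough illustration of how the Task Processing Module could sit between a queue, worker threads, and a result store, here is a minimal sketch using only the Python standard library. Everything in it is an assumption made for the example: the function names, the status strings (`pending`, `completed`, `error`), and the in-memory dictionary that stands in for the Cache & Persistence Module.

```python
import queue
import threading

task_queue = queue.Queue()       # pending (task_id, payload) pairs
results = {}                     # stand-in for the cache: task_id -> status/result
results_lock = threading.Lock()  # protects `results` across worker threads


def process(payload):
    # Placeholder for the OCR and LLM processing modules.
    return payload.upper()


def worker():
    while True:
        item = task_queue.get()
        if item is None:         # sentinel: shut this worker down
            task_queue.task_done()
            break
        task_id, payload = item
        try:
            outcome = {"status": "completed", "result": process(payload)}
        except Exception as exc:
            outcome = {"status": "error", "result": str(exc)}
        with results_lock:
            results[task_id] = outcome
        task_queue.task_done()


def run_tasks(tasks, num_workers=2):
    threads = [threading.Thread(target=worker, daemon=True) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for task_id, payload in tasks:
        with results_lock:
            results[task_id] = {"status": "pending", "result": None}
        task_queue.put((task_id, payload))
    for _ in threads:            # one sentinel per worker, queued after all tasks
        task_queue.put(None)
    task_queue.join()
    for t in threads:
        t.join()
    return results


finished = run_tasks([("t1", "abc"), ("t2", "def")])
print(finished)
```

Recording a `pending` status before the task is queued lets a status lookup answer immediately, even while the worker is still busy.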

Data Flow Diagram

The following diagram shows the complete process of data processing in the DocuSnap-Backend system, from client request to response.

[Figure: Data Flow Diagram]

Diagram Explanation:

  1. Request Processing Phase:
     - Client sends encrypted request
     - System decrypts and validates
     - Checks task status, decides whether to create a new task

  2. Task Processing Phase:
     - Task enters the queue
     - Worker thread retrieves task
     - Selects processing strategy based on task type

  3. OCR Processing Phase:
     - Sends images to OCR service
     - Processes multiple images in parallel
     - Merges OCR results

  4. LLM Processing Phase:
     - Builds prompts suitable for task type
     - Calls LLM API
     - Parses LLM response

  5. Result Processing Phase:
     - Stores processing results
     - Encrypts response data
     - Returns results to client

The numbers in the data flow diagram indicate the sequence of processing steps, helping to understand the system's workflow and data transformation process.
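
The OCR and LLM phases can be sketched as a small pipeline. This is a hedged example, not the project's implementation: the task types (`doc`, `form`), the prompt templates, and all function names are hypothetical, the OCR and LLM calls are replaced by local stand-ins, and the decryption and encryption steps are left out.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def ocr_image(image):
    # Stand-in for the HTTP call to the OCR service.
    return image.decode("utf-8")


def run_ocr(images):
    # Images are processed in parallel; pool.map yields results in input
    # order, so the merged text keeps the original page order.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return "\n".join(pool.map(ocr_image, images))


# Hypothetical prompt templates, one per task type.
PROMPTS = {
    "doc": "Summarise this document as JSON:\n{text}",
    "form": "Extract the form fields as JSON:\n{text}",
}


def build_prompt(task_type, text):
    return PROMPTS[task_type].format(text=text)


def call_llm(prompt):
    # Stand-in for the LLM API call: echoes the text lines back as JSON.
    body = prompt.split("\n", 1)[1]
    return json.dumps({"lines": body.split("\n")})


def process_task(task_type, images):
    text = run_ocr(images)                  # OCR phase: parallel calls, merged result
    prompt = build_prompt(task_type, text)  # LLM phase: prompt chosen by task type
    return json.loads(call_llm(prompt))     # LLM phase: parse the response


parsed = process_task("form", [b"Name: Ada", b"Date: 2024-01-01"])
print(parsed)
```

Keeping each phase in its own function mirrors the phases in the diagram and makes it straightforward to replace a stand-in with a real service call.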

Deployment Architecture Diagram

The following diagram shows the deployment architecture of the DocuSnap-Backend system in a production environment.

[Figure: Deployment Architecture Diagram]

Diagram Explanation:

  1. Frontend Deployment:
     - Web clients and mobile clients
     - Communicates with backend via HTTPS

  2. Load Balancing Layer:
     - Nginx reverse proxy
     - Distributes requests to multiple application instances

  3. Application Service Layer:
     - Multiple Flask application instances
     - Uses Gunicorn as WSGI server

  4. OCR Service Layer:
     - Independent OCR service instances
     - Can be horizontally scaled

  5. Data Storage Layer:
     - SQLite database for caching
     - File storage for temporary data

  6. External Service Layer:
     - Zhipu AI LLM service
     - Other potential external services

The deployment architecture diagram shows the distribution and connection of system components in physical or virtual environments, helping to understand the system's deployment structure and scaling strategy.
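
Gunicorn reads its settings from a Python file, so the Application Service Layer might be configured along the following lines. The file name, bind address, port, and all numeric values are assumptions chosen for illustration; `bind`, `workers`, `threads`, and `timeout` are standard Gunicorn settings.

```python
# gunicorn.conf.py (hypothetical file name and values)
import multiprocessing

# Address the Nginx reverse proxy would forward requests to (assumed port).
bind = "127.0.0.1:8000"

# Common starting point from the Gunicorn documentation: (2 x cores) + 1.
workers = multiprocessing.cpu_count() * 2 + 1

# Threads per worker process; an assumed value.
threads = 2

# Seconds before an unresponsive worker is restarted. A generous value is
# assumed here because OCR and LLM calls can be slow.
timeout = 120
```

With such a file in place, an instance could be started with `gunicorn -c gunicorn.conf.py app:app`, where `app:app` is a placeholder for the actual Flask module and application object.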

Architecture Diagram Design Notes

All architecture diagrams were created with Graphviz, following these design principles:

  1. Clear Hierarchical Structure: Using layered layouts to clearly show system component relationships
  2. Consistent Visual Style: Using unified colors and shapes to identify different types of components
  3. Detailed Annotations: Adding appropriate labels and notes to explain diagram content
  4. Separation of Concerns: Different diagrams focus on different aspects of the system, providing a comprehensive perspective
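
To make the principles above concrete, the following sketch generates Graphviz DOT source for a simplified version of the overall architecture. It is not the source of the actual diagrams: the node identifiers, fill colors, and the edge labels `images` and `prompts` are assumptions made for this example.

```python
# Node id -> (label, fill color). One color per layer keeps the style consistent.
NODES = {
    "client": ("Client", "lightblue"),
    "backend": ("Backend Server (Flask)", "lightyellow"),
    "ocr": ("OCR Server (CnOCR)", "lightgreen"),
    "llm": ("LLM Provider (Zhipu AI)", "lightpink"),
}

# (source, target, edge label): labels act as the diagram's annotations.
EDGES = [
    ("client", "backend", "encrypted REST API"),
    ("backend", "ocr", "images"),
    ("backend", "llm", "prompts"),
]


def to_dot(nodes, edges):
    lines = [
        "digraph architecture {",
        "  rankdir=TB;",                       # top-to-bottom layered layout
        "  node [shape=box, style=filled];",   # one shape for all components
    ]
    for name, (label, color) in nodes.items():
        lines.append(f'  {name} [label="{label}", fillcolor="{color}"];')
    for src, dst, label in edges:
        lines.append(f'  {src} -> {dst} [label="{label}"];')
    lines.append("}")
    return "\n".join(lines)


dot_source = to_dot(NODES, EDGES)
print(dot_source)
```

Saving the output as `architecture.dot` and running `dot -Tpng architecture.dot -o architecture.png` would render it with the Graphviz command-line tool.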

Together, these architecture diagrams provide a comprehensive view of the DocuSnap-Backend system design, helping readers understand the system architecture and working principles from different angles.