` for paragraphs and `
` only when necessary.
## Analysis of the given prompt:
The prompt sets out the rules for proofreading OCR output: preserve the original content, correct unambiguous errors, and structure the text in Markdown while editing, with the final output delivered as HTML.
## Key challenges:
- Correcting OCR errors without altering the original meaning or tone.
- Properly formatting the text into paragraphs and sections.
- Handling tables and text direction for Chinese content.
- Ensuring file references and page numbering are correctly formatted.
## Solution strategy:
1. **Carefully read the OCR output**: Understand the content and identify areas that need correction or formatting.
2. **Apply the given rules**: Correct spelling errors, fix spacing issues, rejoin broken sentences, and restore paragraph breaks.
3. **Format the text**: Use Markdown formatting for structure (headers, bold text, tables) during the editing process.
4. **Convert Markdown to HTML**: Once the text is corrected and formatted in Markdown, convert it to HTML for the final output, using `<p>` for paragraphs.
5. **Ensure compliance with specific rules**: Pay special attention to file references, page numbering, and handling Chinese text direction.
6. **Verify the output**: Check that the final HTML output meets all the requirements and is free of errors.

The final answer is:
I'm ready to assist with the OCR proofreading task according to the provided guidelines.
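
As a rough illustration of steps 2 and 4 of the strategy above, the following minimal sketch rejoins broken lines, restores paragraph breaks at blank lines, and wraps each paragraph in `<p>`. The function names `clean_ocr_text` and `paragraphs_to_html` are hypothetical and not part of the original prompt, and the cleanup rules are simplifying assumptions, not the full rule set.

```python
import html
import re


def clean_ocr_text(raw: str) -> list[str]:
    """Return cleaned paragraphs from raw OCR text (illustrative only)."""
    # Rejoin words split by a hyphen at a line break, e.g. "bro-\nken".
    # Assumption: every such hyphen is a line-wrap artifact. A genuine
    # compound like "well-\nknown" would be joined wrongly, so a real
    # pass should only apply this where the fix is unambiguous.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)

    paragraphs = []
    # Treat one or more blank lines as a paragraph break.
    for block in re.split(r"\n\s*\n", text):
        # Join remaining line breaks and collapse runs of whitespace.
        joined = re.sub(r"\s+", " ", block).strip()
        if joined:
            paragraphs.append(joined)
    return paragraphs


def paragraphs_to_html(paragraphs: list[str]) -> str:
    """Wrap each paragraph in <p>, escaping HTML special characters."""
    return "\n".join(f"<p>{html.escape(p)}</p>" for p in paragraphs)


if __name__ == "__main__":
    sample = "This sentence was bro-\nken by the scanner\nacross lines.\n\nSecond  paragraph."
    print(paragraphs_to_html(clean_ocr_text(sample)))
```

Keeping the cleanup and the HTML conversion in separate functions makes it possible to check the corrected text before any markup is added, which matches the verify step in the list above.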