- Create new hocr-transform.ts utility for parsing hOCR output
- Add line-aware text processing with baseline and rotation support
- Implement width-based font size calculation to match word bounding boxes
- Fix text selection highlight not covering the full width of characters
- Add proper type definitions for OcrLine, OcrPage, WordTransform
- Support RTL languages and CJK word break handling
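
The width-based font sizing above can be sketched roughly as follows. This is an illustrative sketch only, not the actual contents of hocr-transform.ts: the `BBox` shape and the `parseBBox` and `fitFontSize` helpers are hypothetical names, and the caller is assumed to supply the text width measured at a known reference font size (for example via canvas `measureText`).

```typescript
// Hypothetical shape; the real OcrLine/WordTransform types in this change may differ.
interface BBox {
  x0: number;
  y0: number;
  x1: number;
  y1: number;
}

// Parse the "bbox x0 y0 x1 y1" property from an hOCR title attribute.
// Returns null when the title carries no bbox property.
function parseBBox(title: string): BBox | null {
  const m = /bbox\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)/.exec(title);
  if (!m) return null;
  const [x0, y0, x1, y1] = m.slice(1).map(Number);
  return { x0, y0, x1, y1 };
}

// Scale the font size so that text which measured `measuredWidth` pixels
// at `referenceSize` fills the word's bounding-box width.
// Falls back to the reference size for degenerate inputs.
function fitFontSize(bbox: BBox, measuredWidth: number, referenceSize: number): number {
  const targetWidth = bbox.x1 - bbox.x0;
  if (measuredWidth <= 0 || targetWidth <= 0) return referenceSize;
  return referenceSize * (targetWidth / measuredWidth);
}
```

Because rendered text width scales linearly with font size, a single measurement at the reference size is enough. Sizing by width rather than by box height is what lets the selection highlight span the whole word.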