Why OpenAI Models Struggle with PDF Extraction (and Why Gemini Fares Much Better) | Hacker News

Why OpenAI Models Struggle with PDF Extraction (and Why Gemini Fares Much Better) (medium.com/abasiri)
5 points by kapitalx 10 months ago | 2 comments

ban-lan-gen 10 months ago

Does this only apply to PDFs with many images? If a PDF is mostly text and tables, could the model just extract the plain text?

kapitalx 10 months ago

This applies to PDFs with lots of text. The smaller the text on the page, the more the extraction is affected.
