No subject


Thu Apr 11 16:32:37 PDT 2013


I think it would be helpful to open a discussion on how the FOMS/Xiph<br>
community can help support the underpinnings of OCR compatibility and<br>
open, portable-document formats (i.e., PDF and DjVu) with relatively<br>
minimal effort.<br>
<br>
As best I can tell, the hOCR specification has been unmaintained for<br>
the past three years; the 'hocr-tools' project has been largely inactive<br>
for the past four. The spec is elegant in that it is built entirely from<br>
HTML, but there are unresolved ambiguities and a few underdeveloped<br>
sections. It needs a new home...<br>
<br>
I've started the 'hocr-workflow-tools' project, making it possible to<br>
mark up images with hOCR (via Inkscape) then export-all as a<br>
text-searchable PDF (via command-line or Inkscape):<br>
<a href="https://groups.google.com/forum/#!msg/hocr/CjeiE5MiqS8/iumxuSXNvRsJ" target="_blank">https://groups.google.com/forum/#!msg/hocr/CjeiE5MiqS8/iumxuSXNvRsJ</a><br>
(Think text-searchable handwriting.)<br>
<br>
I'd like to extend hocr-workflow-tools to produce DjVu files as well, as a<br>
separate effort. DjVu is in a unique position: the patent license granted<br>
by LizardTech to the DjVu community only covers implementations of the<br>
current spec; innovation is disallowed. My understanding is that<br>
LizardTech's key patents are expiring now or in the near future; it's a<br>
neat, high-quality "codec" also in need of adoption (possibly). By<br>
contrast PDF is a proprietary specification provided by Adobe, even if<br>
"fair" via standardization ('PDF/A' is especially helpful).<br>
<br>
OCR inter-compatibility and portable-document formats are a big deal for<br>
libraries and law offices, and for other organizations too.<br>
<br>
<br>
Sincerely,<br>
George<br>
<br>
_______________________________________________<br>
foms mailing list<br>
<a href="mailto:foms at lists.annodex.net">foms at lists.annodex.net</a><br>
<a href="http://lists.annodex.net/cgi-bin/mailman/listinfo/foms" target="_blank">http://lists.annodex.net/cgi-bin/mailman/listinfo/foms</a><br>
</blockquote></div><br><br clear="all"><br>-- <br>
Andoni Morales Alastruey<br><br>
LongoMatch:The Digital Coach<br>
<a href="http://www.longomatch.ylatuya.es">http://www.longomatch.ylatuya.es</a>
</div></div>


