#ocr4all

Monika Barget: Every now & then, I give #ChatGPT a scan of my handwriting to test its skills in working with #handwrittentexts. Initially, it responded that it could not process the scans or gave me entirely fictional output, but today it got almost everything right. These results are better than those I achieved with #HWR models in #Tesseract & #OCR4all without additional training. I also asked ChatGPT what it "thought" about my writing & it called it "consistently shaped & large with stylistic strokes."
Daniela Schneider: Hi #histodons, I need your expertise. We want to integrate an #opensource #ocr tool into our #useGalaxy Platform so you can better analyse your texts, etc. I worked with #tesseract some years ago, and I heard about #ocr4all. Do you have experience with any of these - or other recommendations? We are also integrating #tranksribus via API but want another ocr-specific option. Looking forward to your experiences! @galaxyfreiburg @NFDI4Memory
Frederik Elwert: Re OCR/ATR, interestingly the #OCR4all paper also offers a very good overview of the different steps and workflows. It has a different purpose, but I think it can still be used in a class context. Reul, Christian et al. 2019. “OCR4all—An Open-Source Tool Providing a (Semi-)Automatic OCR Workflow for Historical Printings.” Applied Sciences 9 (22): 4853. https://doi.org/10.3390/app9224853.
Benjamin Rosemann: @tkinias as far as I understand you want to implement a PDF -> Text -> PDF workflow. Using plaintext as intermediate is problematic, as you (may) lose a lot of layout information. For high quality fulltext you may need a more sophisticated intermediate format like #PageXML or #AltoXML. But they also require a more sophisticated tool for editing like #OCR4All.
Frederik Elwert: A colleague just asked me about a good, free OCR software for a historical book they are scanning. I was checking out #OCR4all to see if I could recommend it. First thing on the "Getting started" page: A Linux terminal command to start docker … 😵‍💫 I’m not criticizing the project, which I think does important work, but it’s a rather peculiar definition of "all" …