**Major Hayden** @major@tootloop.com · Sep 26

We've been using #docling a lot at work lately to parse all kinds of documents in various formats. It's handy for converting them into a common JSON document.

https://major.io/p/fun-with-docling/

#rag #ai #knowledge

Replied to Major Hayden

**Alexandre B A Villares** @villares@ciberlandia.pt · Sep 20

@major Tell us more about #Docling!

**Markus Eisele** @myfear@mastodon.online · Jul 17

ICYMI: Taming Unstructured Data: From PDFs to JSON with Quarkus and Docling https://open.substack.com/pub/myfear/p/quarkus-docling-data-preparation-for-ai?r=17bggb&utm_campaign=post&utm_medium=web&showWelcomeOnShare=false
#Java #quarkus #Docling #Data

**Markus Eisele** @myfear@mastodon.online · Jul 12

Taming Unstructured Data: From PDFs to JSON with Quarkus and Docling
Build a fast, scalable converter to turn business documents into structured data
https://myfear.substack.com/p/quarkus-docling-data-preparation-for-ai
#Java #Quarkus #Docling #AIML #PDF #DocumentParsing

**Caio** @caiocco@bolha.us · Jul 8

Today's #TerSoftware topic is "paper management". I recently tested OCR for digitizing tables and... I wasn't very happy with the result.

I believe #OCR works best when it stays closely tied to the scanned document (for example, making a PDF file searchable), but for text extraction it is still a big "it depends".

In my short journey, I tested #Tesseract and #Docling. Maybe it works with well-written code, but I ended up giving in and doing it by hand.

Tesseract seems quite easy to install on Linux (even on #openSUSE Leap, which has its limitations because it derives from enterprise SUSE, I found it easy), but Docling required some juggling with Python environments (using conda and pip).

For running text, Tesseract already seems quite sufficient. It can be run from the command line and, at least on openSUSE Leap, several dictionaries are packaged to make things easier.

**olеg lаvrоvsky** @loleg@hachyderm.io · May 9 *

Taking part in the #Docling workshop at the #OpenSource AI conference. This is a project I heard about at #DINAconCH a few months ago, and it seems to have since exploded in popularity on PyPI and GitHub, in part thanks to the #CHopen community.

There are strong overlaps with what I've been doing at #ProxeusApp - my notes from the Docling deep-dive have been posted here: https://log.alets.ch/105/

**Markus Eisele** @myfear@mastodon.online · Apr 18

Simplify AI data integration with RamaLama and RAG
https://developers.redhat.com/articles/2025/04/03/simplify-ai-data-integration-ramalama-and-rag#
#Docling #Ramalama #podman #aiml

**Markus Eisele** @myfear@mastodon.online · Jan 5

Docling with ollama https://youtube.com/watch?v=GMHazLUQBQM&si=3IPrjQx2pMMRGwR9
#ollama #docling #rag #llm #genai

**Peter Bronez** @PeterBronez@hachyderm.io · Jan 3

Wrestling with PDF files today… delighted to find #Docling https://ds4sd.github.io/docling/

It’s a solid CLI for parsing documents. It was annoying to install, but works well. I still have manual cleanup to do, but way easier than manual and higher quality than other AI options

**Markus Eisele** @myfear@mastodon.online · Dec 12, 2024

Docling: AI-powered document processing!
PDFs & DOCXs in your AI workflow? Docling makes it easy! Converts to markdown & JSON for RAG and more. Blazing fast!

https://youtu.be/zSCxbqgqeJ8?si=mede5eJL_iGRFwAJ

#AI #Docling #DocumentProcessing

**Markus Eisele** @myfear@mastodon.online · Nov 16, 2024

Docling, IBM’s new open-source toolkit, is designed to more easily unearth that information for generative AI applications. The toolkit streamlines the process of turning unstructured documents into JSON and Markdown files that are easy for large language models (LLMs) and other foundation models to digest.

https://github.com/DS4SD/docling
#docling #aiml #ml #genai
