Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.
Updated Apr 14, 2024 - Python
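To give a sense of what a binding to the Tika REST services wraps, here is a minimal sketch using only the Python standard library. It is not Tika-Python's own code. It assumes a Tika Server is reachable on its default port (9998) and uses the server's `/tika` endpoint, which accepts a document via `PUT` and returns plain text when asked with `Accept: text/plain`. The function names `build_extract_request` and `extract_text` are hypothetical.

```python
import urllib.request

# Assumption: a Tika Server is listening locally on its default port, 9998.
TIKA_URL = "http://localhost:9998/tika"


def build_extract_request(data: bytes, url: str = TIKA_URL) -> urllib.request.Request:
    """Build a PUT request asking Tika Server for plain-text extraction."""
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={"Accept": "text/plain"},
    )


def extract_text(data: bytes, url: str = TIKA_URL) -> str:
    """Send the document bytes to the server and return the extracted text."""
    with urllib.request.urlopen(build_extract_request(data, url)) as resp:
        return resp.read().decode("utf-8")
```

With a server running, `extract_text(open("report.pdf", "rb").read())` would return the document's text; `report.pdf` is a placeholder file name.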
Extract and visualize locations from any file
📄🚀 Unleash a powerful Document Search Engine with Apache NiFi for lightning-fast, comprehensive text indexing and search.
If you are too lazy to read the whole document, generate word art and keywords instead.
Text extraction from scanned PDF documents in Java
Apache Tika Server as Debian GNU/Linux and Ubuntu Linux package
Tesseract OCR wrapper for Apache Tika and/or Open Semantic ETL that caches OCR results, so Tika-Server or Open Semantic ETL does not have to rerun slow and expensive OCR on the same images
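The caching idea in this description can be sketched roughly as follows. This is an illustration of content-addressed caching in general, not the wrapper's actual implementation: the function `cached_ocr`, its `run_ocr` callback, and the on-disk layout (one text file per SHA-256 digest) are all assumptions made for the example.

```python
import hashlib
from pathlib import Path
from typing import Callable


def cached_ocr(image: bytes, run_ocr: Callable[[bytes], str], cache_dir: Path) -> str:
    """Return OCR text for image bytes, reusing a cached result when one exists.

    run_ocr is a hypothetical callback standing in for the real OCR engine.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Key the cache on the image content, so identical images share one entry
    # regardless of file name or location.
    key = hashlib.sha256(image).hexdigest()
    entry = cache_dir / f"{key}.txt"
    if entry.exists():
        return entry.read_text(encoding="utf-8")
    # Cache miss: run the expensive OCR once and store the result.
    text = run_ocr(image)
    entry.write_text(text, encoding="utf-8")
    return text
```

Keying on a content hash rather than a file path means a renamed or copied image still hits the cache.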
A document search engine that combines modern technologies and architectures to handle complex data processing and retrieval tasks.
Web crawler with search indexing
Configurable Tika Server Docker image. https://hub.docker.com/repository/docker/kujira/tika
A dockerized image of Apache Tika Server - https://tika.apache.org/
A PHP application for testing the loading of PDF files, using docker-compose and Apache Tika.
A document searcher for files on the local host, based on Tika+OCR, Elasticsearch, and Kibana
Polymer 3.0 app for Apache Tika.
Containerized (Docker) GeoTopicParser-enabled Apache Tika Server with Lucene Geo Gazetteer.
Contains a custom Tika 2.x server Docker image.