Read (Almost) Any Document in Java

Apache Tika

VocabHunter

Reading Documents

Metadata metadata = new Metadata();

// Open the file as a TikaInputStream; try-with-resources closes it when done
try (InputStream in = TikaInputStream.get(file, metadata)) {
    Tika tika = new Tika();

    // A maximum length of -1 disables the limit on the extracted text
    return tika.parseToString(in, metadata, -1);
} catch (IOException | TikaException e) {
    throw new VocabHunterException(
        String.format("Unable to read file '%s'", file), e);
}
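
The snippet above converts checked exceptions into a single application exception that names the file. The following is a minimal sketch of that wrapping pattern only, written so that it runs without Tika on the classpath: the plain `Files.readAllBytes` call is a stand-in for `tika.parseToString`, and both the `DocumentReader` class and this definition of `VocabHunterException` are assumptions made for illustration, not VocabHunter's actual code.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for VocabHunter's exception type: unchecked,
// carrying a message and the underlying cause.
class VocabHunterException extends RuntimeException {
    VocabHunterException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical helper class; the name is an assumption for this sketch.
final class DocumentReader {
    private DocumentReader() {
    }

    // Reads the whole file as UTF-8 text. This plain read is a stand-in for
    // the Tika call in the article's snippet, so only plain text is handled.
    static String readText(Path file) {
        try {
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            // Wrap the checked exception, keeping the cause and naming the file
            throw new VocabHunterException(
                String.format("Unable to read file '%s'", file), e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("sample", ".txt");
        try {
            Files.write(file, "Hello, Tika".getBytes(StandardCharsets.UTF_8));
            System.out.println(readText(file));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Because the wrapper exception is unchecked in this sketch, callers do not need a `throws` clause, and the original `IOException` remains available through `getCause()` for logging.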

Final Words

I’m a Java software developer, based in Barcelona

Adam Carroll