
- added corpus info

master
Andreas Romeyke 10 months ago
parent
commit
53ab7bd51d
  1. README.md (4 changes)

@@ -11,7 +11,7 @@ The ideas are following:
- random sampling to improve scanning (we need very fast, not very accurate results)
- category check (what kind of data could be there in general?)
- filetype identification using bigram based estimation, learned by decision
-tree over files, identified by pronom
+tree over files (using format-corpus https://github.com/openpreserve/format-corpus and Mime::Types)
TODO ideas
@@ -20,3 +20,5 @@ TODO ideas
- scan EWF images, too
This will be the base for an upcoming standalone application.
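
The README's first and third ideas (random sampling and bigram-based estimation) can be illustrated with a minimal sketch. It is written in Python for illustration only and is not taken from this repository; the function name `sample_bigrams` and the parameters `n_samples`, `block_size` and `seed` are hypothetical. It only collects byte-bigram counts from randomly chosen blocks; training a decision tree on such counts against the format-corpus is not shown.

```python
import random
from collections import Counter


def sample_bigrams(data: bytes, n_samples: int = 64,
                   block_size: int = 512, seed: int = 0) -> Counter:
    """Count byte bigrams in randomly chosen blocks of `data`.

    Hypothetical helper: reads only n_samples blocks instead of the
    whole input, trading accuracy for speed.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    counts: Counter = Counter()
    if len(data) < 2:
        return counts  # no bigram can be formed
    last_start = max(len(data) - block_size, 0)
    for _ in range(n_samples):
        start = rng.randint(0, last_start)  # inclusive on both ends
        block = data[start:start + block_size]
        # Pair every byte with its successor inside the block.
        counts.update(zip(block, block[1:]))
    return counts
```

The resulting counts (keyed by pairs of byte values) could then serve as features for a classifier such as the decision tree mentioned in the README.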
