Project author: openpreserve

Project description:
File validation and characterisation.
Language: Java
Repository: git://github.com/openpreserve/jhove.git
Created: 2014-03-11T10:47:10Z
Project community: https://github.com/openpreserve/jhove

License: Other


JHOVE

JSTOR/Harvard Object Validation Environment

Licensing

Copyright 2003-2012 by JSTOR and the President and Fellows of Harvard College,
2015-2022 by the Open Preservation Foundation.
JHOVE is made available under the
GNU Lesser General Public License (LGPL).

Rev. 1.32.1, 2025-02-06

JHOVE Homepage

http://jhove.openpreservation.org

Overview

JHOVE (the JSTOR/Harvard Object Validation Environment, pronounced “jove”)
is an extensible software framework for performing format identification,
validation, and characterization of digital objects.

  • Format identification is the process of determining the format to which a
    digital object conforms: “I have a digital object; what format is it?”
  • Format validation is the process of determining the level of compliance of a
    digital object to the specification for its purported format: “I have an
    object purportedly of format F; is it?”
  • Format characterization is the process of determining the format-specific
    significant properties of an object of a given format: “I have an object of
    format F; what are its salient properties?”

These actions are frequently necessary during routine operation of digital
repositories and for digital preservation activities.
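
As an illustration of the first action, format identification, the sketch below guesses a format from the leading "magic" bytes of a file. It is a minimal, self-contained example and not JHOVE's own implementation: the class name FormatSniffer, the returned labels, and the choice of signatures are assumptions made for this example only.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Illustrative only: guesses a format from leading signature bytes. Not JHOVE code. */
final class FormatSniffer {

    /** Returns true if data begins with the given prefix bytes. */
    static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }

    private static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    /** Returns a short format label, or "BYTESTREAM" when no signature matches. */
    static String identify(byte[] head) {
        if (startsWith(head, ascii("%PDF-"))) {
            return "PDF";
        }
        if (startsWith(head, ascii("GIF87a")) || startsWith(head, ascii("GIF89a"))) {
            return "GIF";
        }
        if (startsWith(head, new byte[] {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF})) {
            return "JPEG";
        }
        // TIFF starts with a byte-order mark ("II" or "MM") followed by the number 42.
        if (startsWith(head, new byte[] {'I', 'I', 42, 0})
                || startsWith(head, new byte[] {'M', 'M', 0, 42})) {
            return "TIFF";
        }
        return "BYTESTREAM";
    }

    public static void main(String[] args) {
        System.out.println(identify(ascii("%PDF-1.4\n")));
        System.out.println(identify(ascii("plain text")));
    }
}
```

A signature match only answers the identification question. Validation and characterization require parsing the rest of the object against the format specification, which is what the format modules do.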

The output from JHOVE is controlled by output handlers. JHOVE uses an
extensible plug-in architecture; it can be configured at the time of its
invocation to include whichever format modules and output handlers are
desired. The initial release of JHOVE includes modules for arbitrary byte
streams, ASCII and UTF-8 encoded text, AIFF and WAVE audio, GIF, JPEG,
JPEG 2000, TIFF, and PDF; and text and XML output handlers.
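
The general shape of such a plug-in architecture can be sketched as a registry that looks up a format module and an output handler by name at invocation time. The interfaces and class below (SketchModule, SketchHandler, PluginRegistrySketch) are hypothetical names invented for this example; they are not JHOVE's actual API, which lives in the jhove-core module.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical plug-in registry; all names are illustrative, not JHOVE's API. */
final class PluginRegistrySketch {

    /** A format module inspects content and reports a one-line finding. */
    interface SketchModule {
        String describe(byte[] content);
    }

    /** An output handler decides how a finding is rendered. */
    interface SketchHandler {
        String render(String moduleName, String finding);
    }

    private final Map<String, SketchModule> modules = new LinkedHashMap<>();
    private final Map<String, SketchHandler> handlers = new LinkedHashMap<>();

    void registerModule(String name, SketchModule module) {
        modules.put(name, module);
    }

    void registerHandler(String name, SketchHandler handler) {
        handlers.put(name, handler);
    }

    /** Runs the named module and formats its finding with the named handler. */
    String run(String moduleName, String handlerName, byte[] content) {
        SketchModule module = modules.get(moduleName);
        SketchHandler handler = handlers.get(handlerName);
        if (module == null || handler == null) {
            throw new IllegalArgumentException("Unknown module or handler");
        }
        return handler.render(moduleName, module.describe(content));
    }

    public static void main(String[] args) {
        PluginRegistrySketch registry = new PluginRegistrySketch();
        registry.registerModule("BYTESTREAM", content -> content.length + " bytes");
        registry.registerHandler("TEXT", (name, finding) -> name + ": " + finding);
        registry.registerHandler("XML",
                (name, finding) -> "<module name=\"" + name + "\">" + finding + "</module>");
        System.out.println(registry.run("BYTESTREAM", "TEXT", new byte[3]));
        System.out.println(registry.run("BYTESTREAM", "XML", new byte[3]));
    }
}
```

Keeping modules and handlers behind separate interfaces means a new format or a new output style can be added without touching the other side.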

The JHOVE project is a collaboration of JSTOR and the Harvard University
Library. Development of JHOVE was funded in part by the Andrew W. Mellon
Foundation. JHOVE is made available under the GNU Lesser General Public
License (LGPL; see the file LICENSE for details).

JHOVE is currently being maintained by the
Open Preservation Foundation.

Pre-requisites

  1. Java JRE 1.8
    Version 1.20 of JHOVE is built and tested against Oracle JDK 8,
    and OpenJDK 8 on Travis. Releases are built using Oracle JDK 8
    from the OPF’s Jenkins server.

  2. If you would like to build JHOVE from source, then life will be easiest if
    you use Apache Maven.

Getting JHOVE

For Users: JHOVE Cross Platform Installer

You can download the latest version of JHOVE here.

For Developers: JHOVE JARs via Maven

From v1.16 onwards all production releases of JHOVE are deployed to Maven
Central. Add the version of JHOVE you’d like to use as a property in your Maven
POM:

    <properties>
      ...
      <jhove.version>1.20.1</jhove.version>
    </properties>

Use this dependency for the core classes Maven module (e.g. JhoveBase,
Module, ModuleBase, etc.):

    <dependency>
      <groupId>org.openpreservation.jhove</groupId>
      <artifactId>jhove-core</artifactId>
      <version>${jhove.version}</version>
    </dependency>

this for the JHOVE internal module implementations:

    <dependency>
      <groupId>org.openpreservation.jhove</groupId>
      <artifactId>jhove-modules</artifactId>
      <version>${jhove.version}</version>
    </dependency>

this for the JHOVE external module implementations:

    <dependency>
      <groupId>org.openpreservation.jhove</groupId>
      <artifactId>jhove-ext-modules</artifactId>
      <version>${jhove.version}</version>
    </dependency>

and this for the JHOVE applications:

    <dependency>
      <groupId>org.openpreservation.jhove</groupId>
      <artifactId>jhove-apps</artifactId>
      <version>${jhove.version}</version>
    </dependency>

If you want the latest development packages you’ll need to add the
Open Preservation Foundation’s Maven repository
to your settings file:

    <profiles>
      <profile>
        <id>opf-artifactory</id>
        <repositories>
          <repository>
            <snapshots>
              <enabled>false</enabled>
            </snapshots>
            <id>central</id>
            <name>opf-dev</name>
            <url>http://artifactory.openpreservation.org/artifactory/opf-dev</url>
          </repository>
        </repositories>
      </profile>
    </profiles>
    <activeProfiles>
      <activeProfile>opf-artifactory</activeProfile>
    </activeProfiles>

You can then follow the instructions above to include particular Maven modules,
but you can now also choose odd minor versioned development builds. At the time
of writing the latest development version could be included by using the
following property:

    <properties>
      ...
      <jhove.version>1.21.1</jhove.version>
    </properties>

or even:

    <properties>
      ...
      <jhove.version>[1.21.0,1.22.0)</jhove.version>
    </properties>

to always use the latest 1.21 build.
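
In Maven's version range notation a square bracket marks an inclusive bound and a round bracket an exclusive one, so the closing bracket decides whether 1.22.0 itself is eligible. The sketch below is a simplified illustration of that rule for plain dotted numeric versions and two-sided ranges. It is not Maven's resolver, which also handles qualifiers such as -SNAPSHOT and open-ended ranges; the class name VersionRangeSketch is invented for this example.

```java
/** Simplified illustration of Maven-style range matching (numeric dotted versions only). */
final class VersionRangeSketch {

    /** Compares dotted numeric versions such as "1.21.3"; missing parts count as 0. */
    static int compare(String a, String b) {
        String[] pa = a.split("\\.");
        String[] pb = b.split("\\.");
        for (int i = 0; i < Math.max(pa.length, pb.length); i++) {
            int x = i < pa.length ? Integer.parseInt(pa[i]) : 0;
            int y = i < pb.length ? Integer.parseInt(pb[i]) : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    /** Checks a version against a two-sided range such as "[1.21.0,1.22.0)". */
    static boolean matches(String range, String version) {
        String r = range.trim();
        boolean lowerInclusive = r.charAt(0) == '[';
        boolean upperInclusive = r.charAt(r.length() - 1) == ']';
        String[] bounds = r.substring(1, r.length() - 1).split(",");
        int low = compare(version, bounds[0].trim());
        int high = compare(version, bounds[1].trim());
        boolean aboveLower = lowerInclusive ? low >= 0 : low > 0;
        boolean belowUpper = upperInclusive ? high <= 0 : high < 0;
        return aboveLower && belowUpper;
    }

    public static void main(String[] args) {
        // Inclusive upper bound: 1.22.0 is accepted.
        System.out.println(matches("[1.21.0,1.22.0]", "1.22.0"));
        // Exclusive upper bound: 1.22.0 is rejected, any 1.21.x is accepted.
        System.out.println(matches("[1.21.0,1.22.0)", "1.22.0"));
        System.out.println(matches("[1.21.0,1.22.0)", "1.21.7"));
    }
}
```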

For Developers: Building JHOVE from Source

Clone this project, checkout the integration branch, and use Maven, e.g.:

    git clone git@github.com:openpreserve/jhove.git
    cd jhove
    git checkout integration
    mvn clean install

See the Project Structure section for a guide to
the Maven artifacts produced by the build.

Installation

Application Installation

Download the JHOVE installer. The installer itself requires Java 1.6 or later
to be pre-installed. Installation is OS-dependent:

Windows

Currently only tested on Windows 7.

Simply double-click the downloaded installer JAR. If Java is installed then the
windowed installer will guide you through selection. It’s best to stay with
the default choices if installing the beta.

Once the installation has finished you’ll be able to double-click
C:\Users\yourName\jhove\jhove-gui to start the JHOVE GUI. Alternatively,
open a Command window, e.g. press the Windows key and type cmd, then issue
these commands:

    C:\Users\yourName>cd jhove
    C:\Users\yourName\jhove>jhove

to display the command-line usage message.

It is also possible to use JHOVE with OpenJDK, e.g. jdk-13. You may need to
add the Java path to the environment variables, which usually requires
administrator rights on the Windows machine.

Mac OS

Currently only tested on OS X Mavericks.

Simply double-click the downloaded installer JAR. If Java is installed then the
windowed installer will guide you through selection. It’s best to stay with
the default choices if installing the beta.

Once the installation has finished you’ll be able to double-click
/Users/yourName/jhove/jhove-gui to start the JHOVE GUI. Alternatively,
open a Terminal command window and then issue these commands:

    cd ~/jhove
    ./jhove

to display the command-line usage message.

Linux

Currently tested on Ubuntu 16.10 and Debian Jessie.

Once the installer has downloaded, start a terminal, e.g. Ctrl+Alt+T,
and type the following, assuming the download is in ~/Downloads:

    java -jar ~/Downloads/jhove-latest.jar

Once the installation is finished you’ll be able to:

    cd ~/jhove
    ./jhove

to run the command-line application and show the usage message. Alternatively:

    cd ~/jhove
    ./jhove-gui

will run the GUI application.

Distribution

We’ve moved to Maven and have taken the opportunity to update the distribution.
For now we’re producing:

  • a Maven package, for developers wishing to incorporate JHOVE into their
    own software;
  • a “fat” (1MB) JAR that contains the old CLI and desktop GUI, for anyone
    who doesn’t want to use the new installer; and
  • a simple cross-platform installer that installs the application JAR, support
    scripts, etc.

Usage

    jhove [-c config] [-m module] [-h handler] [-e encoding] [-H handler]
          [-o output] [-x saxclass] [-t tempdir] [-b bufsize]
          [-l loglevel] [[-krs] dir-file-or-uri [...]]

      -c config        Configuration file pathname
      -m module        Module name
      -h handler       Output handler name (defaults to TEXT)
      -e encoding      Character encoding used by output handler (defaults to UTF-8)
      -H handler       About handler name
      -o output        Output file pathname (defaults to standard output)
      -x saxclass      SAX parser class (defaults to J2SE default)
      -t tempdir       Temporary directory in which to create temporary files
      -b bufsize       Buffer size for buffered I/O (defaults to J2SE 1.4 default)
      -l loglevel      Logging level
      -k               Calculate CRC32, MD5, and SHA-1 checksums
      -r               Display raw data flags, not textual equivalents
      -s               Format identification based on internal signatures only
      dir-file-or-uri  Directory or file pathname or URI of formatted content
                       stream
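
For reference, the three checksums named by the -k option can all be computed with the Java standard library. The sketch below shows one way to do so. It illustrates the algorithms only and makes no claim about how JHOVE computes or reports them; the class and method names are invented for this example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.zip.CRC32;

/** Illustrative checksum helpers using only the Java standard library. */
final class ChecksumSketch {

    /** Returns the CRC32 of the data as eight lowercase hex digits. */
    static String crc32Hex(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return String.format("%08x", crc.getValue());
    }

    /** Returns a message digest ("MD5" or "SHA-1") as lowercase hex. */
    static String digestHex(String algorithm, byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance(algorithm);
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(data)) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 and SHA-1 are required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "abc".getBytes(StandardCharsets.US_ASCII);
        System.out.println("CRC32: " + crc32Hex(data));
        System.out.println("MD5:   " + digestHex("MD5", data));
        System.out.println("SHA-1: " + digestHex("SHA-1", data));
    }
}
```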

All named modules and output handlers must be found on the Java CLASSPATH at
the time of invocation. The JHOVE driver script, jhove/jhove, automatically
sets the CLASSPATH and invokes the Jhove main class:

    jhove [-c config] [-m module] [-h handler] [-e encoding] [-H handler]
          [-o output] [-x saxclass] [-t tempdir] [-b bufsize] [-l loglevel]
          [[-krs] dir-file-or-uri [...]]
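
When calling the driver script from another Java program, it can help to assemble the argument list programmatically. The sketch below builds such a list using only flags that appear in the usage message (-m, -h, -o, -k). The script path ./jhove, the module name PDF-hul, and the file names are assumptions chosen for illustration, and the helper class is hypothetical rather than part of JHOVE.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper that assembles a jhove command line from the documented flags. */
final class JhoveCommandSketch {

    /** Builds the argument list; null options are omitted. */
    static List<String> build(String script, String module, String handler,
                              String output, boolean checksums, List<String> inputs) {
        List<String> cmd = new ArrayList<>();
        cmd.add(script);
        if (module != null) {
            cmd.add("-m");
            cmd.add(module);
        }
        if (handler != null) {
            cmd.add("-h");
            cmd.add(handler);
        }
        if (output != null) {
            cmd.add("-o");
            cmd.add(output);
        }
        if (checksums) {
            cmd.add("-k");
        }
        cmd.addAll(inputs);
        return cmd;
    }

    public static void main(String[] args) {
        List<String> cmd = build("./jhove", "PDF-hul", "XML", "report.xml", true,
                Arrays.asList("sample.pdf"));
        System.out.println(String.join(" ", cmd));
        // To execute it, the list could be passed to java.lang.ProcessBuilder:
        //   new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

Passing the arguments as a list rather than a single string avoids shell quoting problems with file names that contain spaces.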

The following additional programs are available, primarily for testing
and debugging purposes. They display a minimally processed, human-readable
version of the contents of AIFF, GIF, JPEG, JPEG 2000, PDF, TIFF, and WAVE
files:

    java ADump aiff-file
    java GDump gif-file
    java JDump jpeg-file
    java J2Dump jpeg2000-file
    java PDump pdf-file
    java TDump tiff-file
    java WDump wave-file

For convenience, the following driver scripts are also available:

    adump aiff-file
    gdump gif-file
    jdump jpeg-file
    j2dump jpeg2000-file
    pdump pdf-file
    tdump tiff-file
    wdump wave-file

The JHOVE Swing-based GUI can be invoked from a command shell in the
jhove/bin sub-directory:

    jhove-gui -c <configFile>

where <configFile> is the pathname of the JHOVE configuration file.

Project Structure

A quick introduction to the restructured Maven project. The project has been
broken into four Maven modules, with an additional installer module added.

    jhove/
    |-jhove-apps/
    |-jhove-core/
    |-jhove-installer/
    |-jhove-ext-modules/
    |-jhove-modules/

All Maven artifacts are produced in versioned form,
i.e. ${artifactId}-${project.version}.jar, where ${project.version} defaults
to 1.20.0 unless you explicitly set the version number.

jhove

The jhove project root acts as a Maven parent and reactor for the sub-modules.
This simply builds sub-modules and doesn’t produce any artifacts, but decides
which sub-modules are built.

The jhove-core and jhove-modules are most likely all that are required for
developers wishing to call and run JHOVE from their own code.

jhove-core

The jhove-core module contains all of the main data type definitions and the
output handlers. This module produces a single JAR:

    ./jhove/jhove-core/target/jhove-core-${project.version}.jar

The jhove-core JAR contains a single module implementation, the default
BytestreamModule. For the format-specific modules you’ll need
the jhove-modules JAR.

jhove-modules

The jhove-modules module contains all of JHOVE’s core format-specific module
implementations, specifically:

  • AIFF
  • ASCII
  • GIF
  • HTML
  • JPEG
  • JPEG 2000
  • PDF
  • TIFF
  • UTF-8
  • WAVE
  • XML

These are all packaged in a single modules JAR:

    ./jhove/jhove-modules/target/jhove-modules-${project.version}.jar

jhove-ext-modules

The jhove-ext-modules module contains JHOVE modules developed by external
parties, specifically:

  • PNG
  • WARC
  • GZIP
  • EPUB

These are all packaged in a single modules JAR:

    ./jhove/jhove-ext-modules/target/jhove-ext-modules-${project.version}.jar

jhove-apps

The jhove-apps module contains the command-line and GUI application code and
builds a fat JAR containing the entire Java application. This JAR can be used
to execute the command-line app:

    ./jhove/jhove-apps/target/jhove-apps-${project.version}.jar

jhove-installer

Finally, the jhove-installer module takes the fat JAR and creates a Java-based
installer for JHOVE. The installer bundles up invocation scripts and the like,
installs them under <userHome>/jhove/ (default, can be changed) while also
looking after:

  • variable substitution to ensure that JHOVE_HOME and the like are set to
    reflect a user’s install location;
  • making sure that Windows users get batch scripts, while Mac and Linux users
    get bash scripts; and
  • optionally generating unattended install and uninstall files.
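
The first two responsibilities in the list above can be illustrated with a small sketch: replacing placeholders in a script template with the chosen install location, and choosing a script flavour from the operating system name. This is not the installer's actual code. The ${NAME} placeholder syntax, the class name, and the "batch"/"bash" labels are assumptions made for this example.

```java
import java.util.Collections;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical illustration of install-time variable substitution. */
final class InstallerSubstitutionSketch {

    // Assumed placeholder syntax: ${NAME}
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\$\\{([A-Za-z_][A-Za-z0-9_]*)\\}");

    /** Replaces ${NAME} placeholders with values; unknown names are left untouched. */
    static String substitute(String template, Map<String, String> values) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = values.getOrDefault(m.group(1), m.group(0));
            // quoteReplacement keeps backslashes and dollar signs in paths literal.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    /** Picks a script flavour from an os.name-style string. */
    static String scriptKind(String osName) {
        return osName.toLowerCase(Locale.ROOT).startsWith("windows") ? "batch" : "bash";
    }

    public static void main(String[] args) {
        Map<String, String> values =
                Collections.singletonMap("JHOVE_HOME", "C:\\Users\\yourName\\jhove");
        System.out.println(substitute("set JHOVE_HOME=${JHOVE_HOME}", values));
        System.out.println(scriptKind(System.getProperty("os.name")));
    }
}
```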

The module produces two JARs, one called jhove-installer-${project.version},
which contains the JARs for the installer, and an executable JAR to install
JHOVE:

    ./jhove/jhove-installer/target/jhove-xplt-installer-${project.version}.jar

The xplt stands for cross-platform.
