Bibliographic Metadata Submission

Overview of Bibliographic Metadata Submission Process

Bibliographic records for HathiTrust items are managed by the University of California in a metadata management system called Zephir. For a full description of the submission process, including error reporting and other details, see the guide Submitting Metadata to Zephir. An overview of the process is given below.

  1. Contributor submits sample records to Zephir via FTPS
    • Contributors should contact support@hathitrust.org to initiate discussions about ingest, including bibliographic metadata. Through this process, contributors will be given account information (login and password) for the FTPS directory used to submit bibliographic data to Zephir. Instructions on using FTPS are in Appendix C of Submitting Metadata to Zephir. Contributors will initially be asked to submit sample records so that the Zephir team can evaluate whether the records are likely to be ingested successfully.
  2. Contributor submits files of records via FTPS
    • Once a contributor’s sample files have been analyzed and their records have been prepared for submission, full files of bibliographic records can be submitted via FTPS to the Zephir “submissions” directory using this address: ftps.cdlib.org. Contributors will need to use their account name and password to access this directory.
  3. Contributors’ files should:
    • contain records representing the print version of the digitized works
    • include no more than 50,000 records each
    • contain records only from a single digitization agent (e.g., Google)
    • correspond to the following file name convention:
      • <configuration code>_<date>_<other distinguishing data>.xml (Note: the date must be in YYYYMMDD format)
      • Examples: uiowa-2_20190820.xml, hvd_20190801_backfile1.xml
      • Note: metadata source codes and configuration codes will be provided to contributors, and “other distinguishing data” is optional.
      • All files submitted to Zephir must have unique file names.
  4. When contributors submit bibliographic records via FTPS, an email should also be sent to cdl-zphr-l@ucop.edu that includes the following transmission information, formatted in the message body as follows for machine readability:

file name=<file name>

file size=<file size in bytes>

record count=<number of records>

notification email=<email address to contact regarding file submissions>
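
The transmission information above can be assembled with a short script. The following is a minimal sketch, not part of any official Zephir tooling. It assumes the submitted file is XML in which each bibliographic record is a record element (with or without a namespace, as in MARCXML), and it writes one key=value pair per line. The function names count_records and build_notification_body are hypothetical, chosen for illustration.

```python
import os
import xml.etree.ElementTree as ET

MAX_RECORDS_PER_FILE = 50_000  # limit stated in the file requirements above


def count_records(path):
    """Count <record> elements in an XML file, ignoring any namespace."""
    count = 0
    for _event, elem in ET.iterparse(path, events=("end",)):
        # Tags look like "{namespace}record" when a namespace is declared.
        if elem.tag.rsplit("}", 1)[-1] == "record":
            count += 1
            elem.clear()  # drop the record's children to keep memory use low
    return count


def build_notification_body(path, contact_email):
    """Return the four key=value lines for the notification email body."""
    record_count = count_records(path)
    if record_count > MAX_RECORDS_PER_FILE:
        raise ValueError(
            f"{path} has {record_count} records; the limit is {MAX_RECORDS_PER_FILE}"
        )
    return "\n".join([
        f"file name={os.path.basename(path)}",
        f"file size={os.path.getsize(path)}",  # size in bytes
        f"record count={record_count}",
        f"notification email={contact_email}",
    ])
```

Calling build_notification_body("hvd_20190801_backfile1.xml", "metadata@example.org") would return text that can be pasted into the message body; the contact address here is a placeholder. Counting records from the file itself, rather than typing the number by hand, keeps the reported count consistent with what was actually uploaded.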

Record Scoring

When bibliographic records are loaded into Zephir, they are given a score based on the presence or absence of data in MARC metadata fields. When more than one institution deposits a record for an item, the record score determines which record is used in the HathiTrust catalog. See Zephir Record Scoring for details.

Notifications

When contributors send files to Zephir, they will receive an email notification confirming successful submission. If the submission is not successful, contributors will receive an email notification addressing one of these three error conditions, with information and directions for resolution: duplicate files, incorrectly named or unrecognizable files, and submission of a directory structure rather than a file. Other possible error conditions are reported directly to the Zephir team for resolution.
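
The three error conditions that trigger a contributor-facing notification can be checked locally before uploading. The sketch below is illustrative only and does not reproduce Zephir's own validation. It assumes that configuration codes contain no underscore (as in the examples uiowa-2 and hvd) and that the contributor keeps a local set of file names already submitted. The names check_file_name and preflight are hypothetical.

```python
import os
import re
from datetime import datetime

# Assumption: configuration codes contain no underscore, as in the examples
# "uiowa-2" and "hvd"; the authoritative rules are in the Zephir guide.
FILE_NAME_PATTERN = re.compile(
    r"^(?P<config>[^_]+)_(?P<date>\d{8})(?:_(?P<extra>.+))?\.xml$"
)


def check_file_name(name):
    """Return True if name matches <configuration code>_<YYYYMMDD>[_<other>].xml."""
    match = FILE_NAME_PATTERN.match(name)
    if match is None:
        return False
    try:
        datetime.strptime(match.group("date"), "%Y%m%d")
    except ValueError:
        return False  # eight digits, but not a real calendar date
    return True


def preflight(path, previously_submitted):
    """Return a list of problems found locally; an empty list means none."""
    problems = []
    name = os.path.basename(os.path.normpath(path))
    if os.path.isdir(path):
        problems.append("directory submitted instead of a file")
    if not check_file_name(name):
        problems.append("file name does not follow the naming convention")
    if name in previously_submitted:
        problems.append("file name was already used")
    return problems
```

A check such as preflight("hvd_20190801_backfile1.xml", already_sent) can run as the last step before upload, where already_sent is the contributor's own record of earlier file names. Passing it does not guarantee acceptance, since other error conditions are handled by the Zephir team.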

After files have been submitted and loaded into Zephir, contributors will receive an email notification with information about the loading run: a brief run report and histogram detailing MARC tag usage in submitted records (typically sent 1-2 days after files have been received). Additionally, contributors can retrieve more detailed run reports specific to each file run via FTPS. More information about what contributors can expect to find in email notifications, run reports, and histograms can be found in Submitting Metadata to Zephir.

Bibliographic Record Corrections

Contributors may be requested to correct or enhance records in response to errors or deficits identified during loading or observed after records have been included in the system.

It is the general policy of HathiTrust not to alter the content of contributors’ records, except where necessary to ensure the coordination of functions in the metadata management system. See the Bibliographic Metadata Correction Policy for details.

Detailed information about how errors (both loading and user-observed) are reported back to contributors and how contributors can address them is included in Submitting Metadata to Zephir. Corrected records can be re-submitted following the process outlined in Step #2 above.
