Application for batch loading GW ScholarSpace
Application for batch loading GW ScholarSpace
Requires Python >= 3.5
Get this code.
git clone https://github.com/gwu-libraries/batch-loader.git
Create a virtualenv.
virtualenv -p python3 ENV
source ENV/bin/activate
Copy configuration file.
cp example.config.py config.py
Edit configuration file. The file is annotated with descriptions of the configuration options.
To run batch-loader:
source ENV/bin/activate
python batch_loader.py <path to csv>
creator2
, creator3
, etc.Field Name | Required? | Label in GWSS UI (if different) | Notes |
---|---|---|---|
title1 | Y | ||
creator1 | Y | Author | |
resource_type1 | Y | Type of Work | For values, see https://github.com/gwu-libraries/scholarspace-hyrax/blob/master/config/authorities/resource_types.yml |
license1 | Y | For values, see https://github.com/gwu-libraries/scholarspace-hyrax/blob/master/config/authorities/licenses.yml - NOTE: use the id values |
|
rights_statement | Y | For values, see https://github.com/gwu-libraries/scholarspace-hyrax/blob/master/config/authorities/rights_statements.yml - NOTE: use the id values. Also note that this is a single-valued field. |
|
gw_affiliation1 | N | GW Unit | For values, see https://github.com/gwu-libraries/scholarspace-hyrax/blob/master/config/authorities/gw_affiliations.yml |
location1 | N | ||
date_created1 | N | Should be YYYY format | |
description1 | N | Abstract | |
keyword1 | N | ||
identifier1 | N | ||
contributor1 | N | ||
publisher1 | N | ||
language1 | N | ||
related_url1 | N | ||
bibliographic_citation1 | N | Previous Publication Information | |
depositor | Y | Email address for the depositor/owner in GWSS | |
files | Y | Path to the attachment file, or in the case of multiple attachments, to the folder containing the attachment files | |
first_file | Y | (Field name is required, value is not) Path to the file which should be positioned as the first attachment (used for the thumbnail, etc. | |
object_id | Y | (Field name is required, value is not) If specified, the GW ScholarSpace ID of the existing object to be updated |
To update items already in GW ScholarSpace, populate the object_id
column with the GWSS ID of the item.
When updating:
gw_affiliation1
column blank wil remove the GW Unit value in GWSS. If there is no gw_affiliation1
column in the CSV, then the GW Unit metadata in GWSS will not be modified, if present. first_file
is left blank, batch loading will only update metadata and will not update files. Any existing file attachments will be left in place.manifest.json
file, which it then points to when calling the rake task. However, it normally then deletes manifest.json
once the load is successful. For debugging purposes, you’ll often want to see this manifst.json
file. In config.py
, set debug_mode
to True
and the file won’t be deleted. You can then do things like calling the rake task manually, to see the stack trace thrown by the rake task.