Project author: ibm-watson-data-lab

Project description:
Serverless CouchDB/Cloudant attachment de-attacher
Language: JavaScript
Repository: git://github.com/ibm-watson-data-lab/detacher.git
Created: 2017-10-19T14:59:20Z
Project page: https://github.com/ibm-watson-data-lab/detacher


detacher

If you are using Cloudant or CouchDB and occasionally storing binary attachments inside documents, then detacher may be for you. It is a serverless function that runs in IBM Cloud Functions (based on Apache OpenWhisk) and is invoked whenever a Cloudant document changes. If the document contains attachments, those attachments are copied into Cloud Object Storage or AWS S3 and removed from the document.

This allows the Cloudant database to remain free of binary attachments with no loss of data.

Here is a typical document before processing:

    {
      "_id": "7",
      "_rev": "2-920d8da7eb1a1175fcbc10cf6f989d99",
      "first_name": "Glynn",
      "last_name": "Bird",
      "job": "Developer Advocate @ IBM",
      "twitter": "@glynn_bird",
      "_attachments": {
        "headshot.jpg": {
          "content_type": "image/jpeg",
          "revpos": 2,
          "digest": "md5-N0JXExRZxZaOD3sszjMXzA==",
          "length": 46998,
          "stub": true
        }
      }
    }

CouchDB/Cloudant stores attached files in an object called _attachments. After processing by detacher, the document is modified to look like this:

    {
      "_id": "7",
      "_rev": "3-c3272191e6e94d3bd2a3d72145c7d4fd",
      "first_name": "Glynn",
      "last_name": "Bird",
      "job": "Developer Advocate @ IBM",
      "twitter": "@glynn_bird",
      "attachments": {
        "headshot.jpg": {
          "content_type": "image/jpeg",
          "revpos": 2,
          "digest": "md5-N0JXExRZxZaOD3sszjMXzA==",
          "length": 46998,
          "stub": true,
          "Location": "https://detacher.s3.eu-west-2.amazonaws.com/7-headshot.jpg",
          "Key": "7-headshot.jpg"
        }
      }
    }

Notice that the _attachments key is no longer there: Cloudant is not storing the attachment anymore. In its place is attachments (without the underscore) which contains the same data but with an extra Location and Key which record where in your Object Storage the file is stored.
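The before/after shapes can be illustrated with a small pure function. This is only a sketch of the document transformation, not detacher's actual code: the helper name detachDocument is hypothetical, the Key format ("<_id>-<filename>") and the S3 URL pattern are inferred from the example above, and the upload itself and the write back to Cloudant (which assigns the new _rev) are left out.

```javascript
// Hypothetical helper, not taken from the detacher source.
// Returns the shape a document would have after its attachments are detached.
function detachDocument(doc, bucket, region) {
  const { _attachments, ...rest } = doc;
  if (!_attachments) {
    return doc; // nothing to detach
  }
  const attachments = {};
  for (const [filename, meta] of Object.entries(_attachments)) {
    // Assumption: key format inferred from the example, "<_id>-<filename>"
    const Key = `${doc._id}-${filename}`;
    // Assumption: AWS S3 virtual-hosted-style URL, as in the example
    const Location = `https://${bucket}.s3.${region}.amazonaws.com/${Key}`;
    attachments[filename] = { ...meta, Location, Key };
  }
  // _rev is carried over unchanged here; Cloudant assigns a new one on write
  return { ...rest, attachments };
}

const before = {
  _id: '7',
  first_name: 'Glynn',
  _attachments: {
    'headshot.jpg': { content_type: 'image/jpeg', length: 46998, stub: true }
  }
};
const after = detachDocument(before, 'detacher', 'eu-west-2');
console.log(after.attachments['headshot.jpg'].Location);
// https://detacher.s3.eu-west-2.amazonaws.com/7-headshot.jpg
```

The original metadata (content_type, length, digest and so on) is copied across untouched, so anything that read those fields from _attachments can read them from attachments instead.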

Pre-requisites

You need:

Installation

Ensure you have a new “bucket” in your Object Storage service and a new database in your Cloudant service.

Set up environment variables containing the credentials of your Cloudant service and Object storage service:

    export CLOUDANT_HOST="myhost.cloudant.com"
    export CLOUDANT_USERNAME="myusername"
    export CLOUDANT_PASSWORD="mypassword"
    export CLOUDANT_DATABASE="mydatabase"
    export AWS_ACCESS_KEY_ID="ABC123"
    export AWS_SECRET_ACCESS_KEY="XYZ987"
    export AWS_BUCKET="mybucket"
    export AWS_REGION="eu-west-2"
    export AWS_ENDPOINT="https://ec2.eu-west-2.amazonaws.com"

If you are using Amazon S3, you can omit the AWS_ENDPOINT environment variable. For the IBM Cloud Object Storage service, the endpoints are listed in the IBM Cloud Object Storage documentation.
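To make the optional endpoint concrete, here is a minimal sketch of how these variables might be turned into a storage configuration object. The helper buildStorageConfig is hypothetical; the option names accessKeyId, secretAccessKey, region and endpoint mirror those accepted by the S3 client in the AWS SDK for JavaScript, but whether detacher assembles its configuration this way is an assumption.

```javascript
// Hypothetical helper, not taken from the detacher source.
// Builds a storage configuration from environment variables,
// adding an endpoint only when AWS_ENDPOINT is set.
function buildStorageConfig(env) {
  const required = [
    'AWS_ACCESS_KEY_ID',
    'AWS_SECRET_ACCESS_KEY',
    'AWS_BUCKET',
    'AWS_REGION'
  ];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  const config = {
    accessKeyId: env.AWS_ACCESS_KEY_ID,
    secretAccessKey: env.AWS_SECRET_ACCESS_KEY,
    region: env.AWS_REGION,
    bucket: env.AWS_BUCKET
  };
  if (env.AWS_ENDPOINT) {
    // Only needed for non-AWS services such as IBM Cloud Object Storage
    config.endpoint = env.AWS_ENDPOINT;
  }
  return config;
}
```

Calling buildStorageConfig(process.env) after exporting the variables above would fail early with a list of any variables that were forgotten, rather than failing later during an upload.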

Then run the deploy.sh script:

    ./deploy.sh

You can now add a document to your database and add an attachment to it. In a few moments the document will have been updated: it will no longer contain attachments, only references to those files in your object storage.
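One way to try this out is to upload an attachment through the standard CouchDB HTTP API, which accepts PUT /{db}/{docid}/{attname}?rev=... with the file as the request body. The sketch below is not part of detacher: the helper names attachmentUrl and uploadAttachment are hypothetical, basic authentication is assumed, and the global fetch requires Node.js 18 or later.

```javascript
// Hypothetical helpers, not part of detacher.

// Builds the CouchDB/Cloudant URL for adding an attachment to an existing document.
function attachmentUrl(host, database, docId, filename, rev) {
  const path = [database, docId, filename].map(encodeURIComponent).join('/');
  return `https://${host}/${path}?rev=${encodeURIComponent(rev)}`;
}

// Uploads `body` (a Buffer or string) as an attachment on the given document revision.
// Assumes basic authentication and the global fetch available in Node.js 18+.
async function uploadAttachment(creds, docId, rev, filename, body, contentType) {
  const auth = Buffer.from(`${creds.username}:${creds.password}`).toString('base64');
  const url = attachmentUrl(creds.host, creds.database, docId, filename, rev);
  const response = await fetch(url, {
    method: 'PUT',
    headers: {
      'Content-Type': contentType,
      Authorization: `Basic ${auth}`
    },
    body
  });
  return response.json();
}

console.log(attachmentUrl('myhost.cloudant.com', 'mydatabase', '7', 'headshot.jpg', '1-abc'));
// https://myhost.cloudant.com/mydatabase/7/headshot.jpg?rev=1-abc
```

After the upload succeeds, fetching the document again a short while later should show the attachments object with Location and Key in place of _attachments.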