PRODUCT : Rapture Getting Started: Python

Tutorial 1: Your first Python Rapture Project

Objectives:

Use of the RIM user interface to interact with Rapture managed data
Use of a client-side python script app to highlight the Rapture API

This will cover the following steps:

Uploading a CSV file to a blob repo in your RIM instance
Translating raw CSV data files into document data type
Creating a series using document created in step #2
Running a graph based report from series data in step #3 and saving as blob pdf

For the purposes of this tutorial:

A python script is provided to allow you to exercise code and see the output in cmd line and follow along in the RIM user interface
It is recommended to also look at the source code
In this tutorial mac conventions are used for command line work

Before beginning this demonstration of Rapture it is expected that your environment satisfies the following pre-requisites

Java 8 (or later) runtime is installed and accessible
Python 2.7.10 (or later) is installed with the following python modules installed:
1. requests (https://pypi.python.org/pypi/requests)
2. numpy (https://pypi.python.org/pypi/numpy)

1. Create new Repos for use in this Tutorial

The first step in this tutorial is to login to your RIM instance and create the data repositories.

Process
Login to your demo environment using url and credentials provided during signup
Create a Document repository using RIM's *+ Add* Repositories interface: Set Name to tutorialDoc Set Type to Document Set Data System to MongoDB Press the OK Button
Create a Blob repository again using RIM's *+ Add* Repositories interface: Set Name to tutorialBlob Set Type to Blob Set Data System to MongoDB Press the OK Button
With the two repositories in place we are ready to move onto the next step in the tutorial: upload a CSV file.

If you'd like to see how to create repositories via Rapture Information Manager's UI, follow this process. If you wish to skip this, the repo creation will be handled by our tutorial code.

Create Repos

To Add a repo simply select "+ ADD" under REPOSITORIES.

2. Capture Data – Load a CSV

In folder $RAPTURE_HOME/Intro01/Python/src/main python script App.py:

$ python App.py -s UPLOAD -p <password>

Under the Hood

Click here to see what's going on under the hood...

This is the first step in the python Tutorial Intro01. All steps use the global login "rapture" from the "loginToRapture" function seen here:

def loginToRapture(url, username, password):

        	global rapture

        	rapture = raptureAPI.raptureAPI(url, username, password)

        	if 'valid' in rapture.context and rapture.context['valid']:

        	    print 'Logged in successfully ' + url

        	else:

        		print "Login unsuccessful"

The csv file is read from the file location (provided by argument or environment variable).
Next, we create a URI, repo config, and meta config for the blob repository we intend so upload our CSV file to:
```
# Create blob repo

        	blobRepoUri = "//tutorialBlob"

        	rawCsvUri = blobRepoUri + "/introDataInbound.csv"

        	config = "BLOB {} USING MONGODB {prefix=\"tutorialBlob\"}"

        	metaConfig = "REP {} USING MEMORY {prefix=\"tutorialBlob\"}"
```
This: BLOB {} USING MONGODB {prefix=\"tutorialBlob\"} this config declares the repo as a BLOB repo, using MongoDB (when supplied, other back-ends can be used such as Cassandra or Redis), with a MongoDB prefix of "tutorialBlob". The meta data config uses a non-versioned document repository ("REP") using system memory, with the same prefix.

We check if the repository may already exist, clean it out if it does, then (re-)create it:

if(rapture.doBlob_BlobRepoExists(blobRepoUri)):

        		print "Repo exists, cleaning & remaking"

        		rapture.doBlob_DeleteBlobRepo(blobRepoUri)

        	rapture.doBlob_CreateBlobRepo(blobRepoUri, config, metaConfig)

To finally upload the CSV to the BLOB repo we just created, we encode the raw .csv data using base64, and upload to the rawCsvUri with the description of "text/csv":

   # Encode and Store Blob

        	print "Uploading CSV"

        	encoded_blob = base64.b64encode(str(rawFileData))

        	rapture.doBlob_PutBlob(rawCsvUri, encoded_blob, "text/csv")

At this point we have uploaded the CSV file into the blob://tutorialBlob repository.

Look at the UPLOAD Function() in the RaptureTutorials project to see the API calls used.

Look at the RIM interface to see the blob you uploaded:

Next up we will run code to translate the CSV file (in the blob repo) to a document (in the document repo).

In this step

Run upload script to load CSV into Blob

View Blob

Windows users

Use %RAPTURE_HOME% if needed

3. Translate Data – Create documents from raw Data

In folder $RAPTURE_HOME/Intro01/Python/src/main python script App.py:

$ python App.py -s BLOBTODOC -p <password>

Under the Hood

Click here to see what's going on under the hood...

First thing we do in this step is to check if the CSV data has already been uploaded to the BLOB (done in UPLOAD step).

We then retrieve that data, and decode with base64 (recall we encoded it with base64 before uploading). We then transform the raw CSV data into a list of lists, with each list being a "row" in the CSV data & the very first one containing the keys (ex: frequency):

		# Pull encoded blob out of rapture

        	 	encoded_blob = rapture.doBlob_GetBlob(blobUri)['content']

        		# Decode blob using base64 python library

        		blob_string = base64.b64decode(str(encoded_blob))

        		# Add list identifiers to string so that we can convert to python list

        		blob_string = "[[\"" + blob_string.replace(',', "\",\"").replace('\n', "\"],[\"") + "\"]]"

        		# Convert string to python list

        		rows = ast.literal_eval(blob_string)

We then create a versioned document repository ("NREP") using MONGODB with a prefix of "tutorialDoc" by creating a configuration string to pass to the CreateDocRepo API call. The same check here is done as in the UPLOAD step to clean the repository out if it already exists:

		# Create doc repo

        		docRepoUri = "//tutorialDoc"

        		docUri = docRepoUri + "/introDataTranslated"

        		config = "NREP {} USING MONGODB {prefix=\"tutorialDoc\"}"

        		if(rapture.doDoc_DocRepoExists(docRepoUri)):

        			rapture.doDoc_DeleteDocRepo(docRepoUri)

        		rapture.doDoc_CreateDocRepo(docRepoUri, config)

The next block of code is the guts of the transformation of data from the list of lists into a flattened JSON structure using Python dictionaries. The structure goes as follows:
```
{Provider: <provider>,{series_type: <series_type>}, {frequency: <frequency>}, {index_id:{<index_id>:{<price_type>:{<date>:<index_price>}}}
```
Once we have the data structured the way we want it, we then make a Rapture API call to put it into our document repository we just created. Note that since we are in python, we must use "json.dumps" to convert the dictionary to a JSON object. Since an orderedDict was used to format the order the data appears, we set "sort_keys" to "False" to preserve the order.
```
		#PUT THE CSV DATA RETRIEVED FROM A BLOB & TRANSLATED INTO THE DOCUMENT REPOSITORY
        		rapture.doDoc_PutDoc(docUri, json.dumps(Order, sort_keys=False))
```

At this point the CSV (blob) data has been converted into a document type, which is in this repository: document://tutorialDoc/introDataTranslated

Look at the BLOBTODOC Function() in the RaptureTutorials project to see the API calls used.

Look at the RIM interface to see the document you created.

Next up we will look at creating series data from the document.

In this step

Run translate script to create Doc

View Doc

Windows users

Use %RAPTURE_HOME% if needed

4. Analyze Data – Update series Data from documents

In folder $RAPTURE_HOME/Intro01/Python/src/main python script App.py:

$ python App.py -s DOCTOSERIES -p <password>

Under the Hood

Click here to see what's going on under the hood...

The first thing we do in this step is to check that the document repository containing the JSON data exists (we created it in the last step: BLOBTODOC)

We then retrieve & convert the document content (JSON) back into a python dictionary:

		# Convert json string to a python dict object
        		doc = ast.literal_eval(rapture.doDoc_GetDoc(docUri))

We then check if the //datacapture series repo has already been created (It should have came pre-installed on your demo environment)
Next is the code block for generating a series URI for each price type. An Example URI would be:
//datacapture/HIST/Provider_1a/AUDUSD_CURNCY_Dummy/DAILY/PX_BID
The loop in this script generates disposable base URI's and is able to insert the data from our dictionary into the correct respective series URI's.
The API call to insert the data into a series is:
```
		# Store each date and price in the appropriate series
        		rapture.doSeries_AddDoubleToSeries(seriesUri, date, float(price))
```

At this point a a series has been created using data from document://tutorialDoc/introDataTranslated.

Look at the DOCTOSERIES Function() in the RaptureTutorials project to see the API calls used.

Your data has now been stored in Rapture Information Manager in three forms: raw CSV, JSON document, and series data.

You can view the series on the RIM web interface.

Next we will create a pdf from the series repository.

In this step

Run analyze script to create/update Series

View Series

Windows users

Use %RAPTURE_HOME% if needed

5. Distribute Data – create reports from series Data

This step will produce a pdf from the series data. The pdf is saved to your local directory and also the Rapture Instance.

$ cd $RAPTURE_HOME/Intro01/Java/ReportApp/build/install/ReportApp/bin

$ ./ReportApp

Look at the ReportApp code in RaptureTutorials project.

You can look at the pdf on the RIM interface at this url blob://tutorialBlob/output.pdf

6. Reset Data (Optional)

Reset data

Follow these steps to reset your demo environment.

Using RIM, go to Scripts/Tutorial01/tutorialCleanup.rfx and click open.
Execute the script using the "Run Script" button