PRODUCT : Rapture Getting Started: Python


Tutorial 1: Your first Python Rapture Project 

Objectives:

  1. Use of the RIM user interface to interact with Rapture-managed data 
  2. Use of a client-side Python script app to highlight the Rapture API 

This will cover the following steps:

  1. Uploading a CSV file to a blob repo in your RIM instance
  2. Translating the raw CSV data into a document
  3. Creating a series from the document created in step #2
  4. Running a graph-based report from the series data in step #3 and saving it as a PDF blob

For the purposes of this tutorial:

  • A Python script is provided so you can run the code, see the output on the command line, and follow along in the RIM user interface
  • It is recommended that you also look at the source code
  • This tutorial uses Mac conventions for command-line work

 


1. Create new Repos for use in this Tutorial

 The first step in this tutorial is to log in to your RIM instance and create the data repositories.

 

Process

 

Log in to your demo environment using the URL and credentials provided during signup.

Create a Document repository using RIM's + Add Repositories interface:
  1. Set Name to tutorialDoc
  2. Set Type to Document
  3. Set Data System to MongoDB
  4. Press the OK Button
Create a Blob repository again using RIM's + Add Repositories interface:
  1. Set Name to tutorialBlob
  2. Set Type to Blob
  3. Set Data System to MongoDB
  4. Press the OK Button
With the two repositories in place we are ready to move on to the next step in the tutorial: uploading a CSV file.

 


2. Capture Data – Load a CSV

In the folder $RAPTURE_HOME/Intro01/Python/src/main, run the Python script App.py:

$ python App.py -s UPLOAD -p <password> 

Under the Hood


This is the first step in the Python tutorial Intro01. All steps use the global "rapture" session created by the "loginToRapture" function, shown here:

def loginToRapture(url, username, password):
    global rapture
    rapture = raptureAPI.raptureAPI(url, username, password)
    if 'valid' in rapture.context and rapture.context['valid']:
        print 'Logged in successfully ' + url
    else:
        print "Login unsuccessful"

 

  1. The CSV file is read from its file location (provided by a command-line argument or an environment variable); a minimal sketch of this read is shown below.
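
    The sketch below is illustrative only; the variable and path names are hypothetical, and the real handling lives in App.py:

    # Illustrative sketch (hypothetical names): read the raw CSV from a path
    # supplied via an environment variable, falling back to a local file
    import os
    csvFilePath = os.environ.get("TUTORIAL_CSV_PATH", "introDataInbound.csv")
    with open(csvFilePath, "r") as csvFile:
        rawFileData = csvFile.read()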
  2. Next, we create a URI, a repo config, and a meta config for the blob repository we intend to upload our CSV file to:

    # Create blob repo
    blobRepoUri = "//tutorialBlob"
    rawCsvUri = blobRepoUri + "/introDataInbound.csv"
    config = "BLOB {} USING MONGODB {prefix=\"tutorialBlob\"}"
    metaConfig = "REP {} USING MEMORY {prefix=\"tutorialBlob\"}"

    The config BLOB {} USING MONGODB {prefix=\"tutorialBlob\"} declares the repo as a BLOB repo backed by MongoDB (other back-ends, such as Cassandra or Redis, can also be used), with a MongoDB prefix of "tutorialBlob". The metadata config declares a non-versioned document repository ("REP") held in system memory, with the same prefix.

  3. We check whether the repository already exists, clean it out if it does, then (re-)create it:

    if(rapture.doBlob_BlobRepoExists(blobRepoUri)):
        print "Repo exists, cleaning & remaking"
        rapture.doBlob_DeleteBlobRepo(blobRepoUri)
    rapture.doBlob_CreateBlobRepo(blobRepoUri, config, metaConfig)
  4. Finally, to upload the CSV to the BLOB repo we just created, we encode the raw .csv data using base64 and upload it to the rawCsvUri with a content type of "text/csv":

    # Encode and Store Blob
    print "Uploading CSV"
    encoded_blob = base64.b64encode(str(rawFileData))
    rapture.doBlob_PutBlob(rawCsvUri, encoded_blob, "text/csv")

At this point we have uploaded the CSV file into the blob://tutorialBlob repository.

Look at the UPLOAD function in the RaptureTutorials project to see the API calls used.

Look at the RIM interface to see the blob you uploaded.


Next up we will run code to translate the CSV file (in the blob repo) to a document (in the document repo).


3. Translate Data – Create documents from raw Data

In the folder $RAPTURE_HOME/Intro01/Python/src/main, run the Python script App.py:

$ python App.py -s BLOBTODOC -p <password> 

Under the Hood

  1. The first thing we do in this step is check that the CSV data has already been uploaded to the blob repo (done in the UPLOAD step).
  2. We then retrieve that data and decode it with base64 (recall we encoded it with base64 before uploading). We then transform the raw CSV data into a list of lists, with each inner list being a "row" of the CSV data and the very first one containing the keys (e.g. frequency):

    # Pull encoded blob out of rapture
    encoded_blob = rapture.doBlob_GetBlob(blobUri)['content']
    # Decode blob using base64 python library
    blob_string = base64.b64decode(str(encoded_blob))
    # Add list identifiers to string so that we can convert to python list
    blob_string = "[[\"" + blob_string.replace(',', "\",\"").replace('\n', "\"],[\"") + "\"]]"
    # Convert string to python list
    rows = ast.literal_eval(blob_string)
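
    Note that the string-replacement approach above assumes no commas appear inside quoted CSV fields. A more robust alternative (not what the tutorial script does) would be Python's built-in csv module, reusing encoded_blob from the snippet above:

    # Alternative sketch: parse the decoded CSV text with the csv module,
    # which also handles quoted fields correctly
    import csv
    decoded = base64.b64decode(str(encoded_blob))
    rows = list(csv.reader(decoded.splitlines()))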
  3. We then create a versioned document repository ("NREP") using MongoDB with a prefix of "tutorialDoc" by building a configuration string to pass to the CreateDocRepo API call. The same check as in the UPLOAD step is done here to clean out the repository if it already exists:

    # Create doc repo
    docRepoUri = "//tutorialDoc"
    docUri = docRepoUri + "/introDataTranslated"
    config = "NREP {} USING MONGODB {prefix=\"tutorialDoc\"}"
    if(rapture.doDoc_DocRepoExists(docRepoUri)):
        rapture.doDoc_DeleteDocRepo(docRepoUri)
    rapture.doDoc_CreateDocRepo(docRepoUri, config)
  4. The next block of code is the guts of the transformation from the list of lists into a flattened JSON structure using Python dictionaries. The structure is as follows, with a simplified sketch of the transformation shown after it:

    {
        "Provider": <provider>,
        "series_type": <series_type>,
        "frequency": <frequency>,
        "index_id": {
            <index_id>: {
                <price_type>: {
                    <date>: <index_price>
                }
            }
        }
    }
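
    A simplified sketch of that transformation (the key and column names are illustrative; App.py's exact handling may differ):

    # Illustrative sketch only: flatten the list of lists into nested dictionaries
    from collections import OrderedDict

    keys = rows[0]                      # header row from the CSV
    Order = OrderedDict()
    for row in rows[1:]:
        record = dict(zip(keys, row))   # map header names to this row's values
        Order["Provider"] = record["Provider"]
        Order["series_type"] = record["series_type"]
        Order["frequency"] = record["frequency"]
        indexes = Order.setdefault("index_id", OrderedDict())
        prices = indexes.setdefault(record["index_id"], OrderedDict())
        dates = prices.setdefault(record["price_type"], OrderedDict())
        dates[record["date"]] = record["index_price"]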
  5. Once we have the data structured the way we want it, we make a Rapture API call to put it into the document repository we just created. Note that since we are in Python, we must use json.dumps to convert the dictionary to a JSON string. Since an OrderedDict was used to control the order in which the data appears, we set sort_keys to False to preserve that order.

    # Put the CSV data retrieved from a blob & translated into the document repository
    rapture.doDoc_PutDoc(docUri, json.dumps(Order, sort_keys=False))

At this point the CSV (blob) data has been converted into a document, stored at: document://tutorialDoc/introDataTranslated

Look at the BLOBTODOC function in the RaptureTutorials project to see the API calls used.

Look at the RIM interface to see the document you created.

Next up we will look at creating series data from the document.


4. Analyze Data – Update series Data from documents

In the folder $RAPTURE_HOME/Intro01/Python/src/main, run the Python script App.py:

$ python App.py -s DOCTOSERIES -p <password> 

Under the Hood

  1. The first thing we do in this step is check that the document repository containing the JSON data exists (we created it in the last step, BLOBTODOC).
  2. We then retrieve the document and convert its content (JSON) back into a Python dictionary:

    # Convert json string to a python dict object
    doc = ast.literal_eval(rapture.doDoc_GetDoc(docUri))
  3. We then check whether the //datacapture series repo has already been created (it should have come pre-installed on your demo environment).
  4. Next is the code block that generates a series URI for each price type. An example URI would be:
    //datacapture/HIST/Provider_1a/AUDUSD_CURNCY_Dummy/DAILY/PX_BID 
    The loop in this script builds a base URI for each combination and inserts the data from our dictionary into the appropriate series URIs; a simplified sketch follows the API call below.
    The API call to insert the data into a series is:

    # Store each date and price in the appropriate series
    rapture.doSeries_AddDoubleToSeries(seriesUri, date, float(price))
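
Putting step 4 together, a simplified version of that loop might look like the sketch below. The URI segments mirror the example URI above, and the field names are assumptions based on the document structure from the previous step; App.py's exact names may differ:

# Illustrative sketch of the series-writing loop (field names assumed, not confirmed)
seriesRepoUri = "//datacapture"
for index_id, price_types in doc["index_id"].items():
    for price_type, points in price_types.items():
        # e.g. //datacapture/HIST/Provider_1a/AUDUSD_CURNCY_Dummy/DAILY/PX_BID
        seriesUri = "%s/HIST/%s/%s/%s/%s" % (
            seriesRepoUri, doc["Provider"], index_id, doc["frequency"], price_type)
        for date, price in points.items():
            # Store each date and price in the appropriate series
            rapture.doSeries_AddDoubleToSeries(seriesUri, date, float(price))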

At this point a series has been created using data from document://tutorialDoc/introDataTranslated.

Look at the DOCTOSERIES function in the RaptureTutorials project to see the API calls used.

Your data has now been stored in Rapture Information Manager in three forms: raw CSV, JSON document, and series data.

You can view the series on the RIM web interface.

Next we will create a PDF from the series data.


5. Distribute Data – create reports from series Data

This step will produce a PDF from the series data. The PDF is saved to your local directory and also to the Rapture instance.

$ cd $RAPTURE_HOME/Intro01/Java/ReportApp/build/install/ReportApp/bin
$ ./ReportApp

 

Look at the ReportApp code in the RaptureTutorials project.

You can view the PDF in the RIM interface at this URI: blob://tutorialBlob/output.pdf
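
If you would rather pull the report down with the Python API than the web interface, here is a minimal sketch reusing the rapture session from App.py. Depending on how ReportApp stored the blob, the content may need decoding before it is a usable PDF:

# Illustrative sketch: fetch the generated report from the blob repo
pdfUri = "//tutorialBlob/output.pdf"
pdfContent = rapture.doBlob_GetBlob(pdfUri)['content']
with open("output_copy.pdf", "wb") as reportFile:
    reportFile.write(pdfContent)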


6. Reset Data (Optional)

 

Reset data

Follow these steps to reset your demo environment.

  1. Using RIM, go to Scripts/Tutorial01/tutorialCleanup.rfx and click Open.
  2. Execute the script using the "Run Script" button.