My Journey with GraphSpace through GSoC 2017: Final Submission

This summer, unlike any other, started with a lot of excitement as I was selected to work on the GSoC project “Implementing a Python library for GraphSpace REST API” under NRNB. The excitement continued throughout the 12 weeks of the program, during which I was able to complete my project, as there was always something new to learn and something new to implement.

I got a good head start, as my mentor, Aditya Bharadwaj, had already started work on the project, and we picked up from there. Since an organized architecture is important for any library, our first job was to design and implement the architecture of the graphspace_python library. Next, we had to make the library completely object oriented, so we discussed and planned how to represent its entities (graphs, layouts, and groups) as easily usable objects. Once that was implemented, we moved on to writing the library's APIs, which consume the GraphSpace REST API. We implemented several APIs for the different endpoints of the REST API so that users can work on their graphs and layouts and manage groups from their Python-based applications.

Following the implementation of APIs, we worked on exception handling. To make the library robust, we wrote an exception handling module, which handles the exceptions occurring when consuming the REST API. We also maintained documentation for the APIs and classes of the library throughout the development. The library is well documented and we use the Google Style Python Docstrings to document the code. Also, all the APIs of the library have been covered by tests ensuring their proper functionality. We have also integrated Travis CI with our GitHub Repository for continuously checking that all the tests pass whenever a new commit is made.

Moreover, we worked on documenting the GraphSpace REST API using RAML. All endpoints of the GraphSpace REST API were documented, including their request, response and error details, in a machine-readable form.

Links to the work done in my project:

1. Implementation of graphspace_python library:

All work can be found on the develop branch of the graphspace_python repository. Link: https://github.com/adbharadwaj/graphspace-python/tree/develop

2. Documentation of GraphSpace REST API: 

Link to the commit documenting the REST API in the GraphSpace repository: https://github.com/Murali-group/GraphSpace/commit/c4b1445fff6f23b1be6632c7e706f510429d7a7c

3. Links to blog posts on my work:

Throughout the summer, I have learned a lot while developing the library. Maintaining the standards of open source development, writing proper documentation and employing effective techniques for testing have been among the many useful things that I have learned in this journey. I am thankful to Google for conducting such a program for students. Lastly, I would like to thank my mentors, Aditya Bharadwaj and T.M. Murali, who have been very supportive and always helped me whenever I was stuck or needed suggestions on the project.

Integrating Travis CI with your GitHub Project

Travis CI is a continuous integration service used to build and test projects hosted on GitHub. Every time a commit is made and pushed to the GitHub repository, Travis CI automatically detects it, builds the project and runs the tests. When the build finishes, it provides a report of all the jobs that have passed or failed. In this way you can check whether a commit has broken something and fix it before it becomes a problem.

Integrating Travis CI with your GitHub project can be done in 3 simple steps:

1. Visit your profile page on the Travis CI website. If you are not signed in, you can sign in with GitHub. You will then see a list of your repositories; flick the switch on for the repository you want to integrate with Travis.

2. Add a .travis.yml file to your repository. This file contains details such as the language, the versions of the language to test on, the command to install dependencies and the test script. Let's consider a Python project and define .travis.yml for it.

i. You have to first specify the language in the file:

language: python

ii. Add versions for the language to test on:

python:
  - "2.7"
  - "3.3"
  - "3.4"
  - "3.5"
  - "3.5-dev" # 3.5 development branch
  - "3.6"
  - "3.6-dev" # 3.6 development branch
  - "3.7-dev" # 3.7 development branch
  - "nightly" # currently points to 3.7-dev

If you don’t include the python key in your file then Travis will build the project for Python 2.7 by default.

iii. Add the command to install dependencies if any:

Travis CI uses isolated virtualenvs to install dependencies. To install Python dependencies use pip.

install:
  - pip install .
  - pip install -r requirements.txt

If you want to skip the installation step then you can specify it as:

install: true

iv. Specify the test script:

If you are using pytest to run the tests:

script: pytest

Or if you are using make test instead:

script: make test

Finally your .travis.yml file should look somewhat like the following:
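Assembling the fragments from step 2, a minimal .travis.yml might look like the sketch below. The Python versions listed and the pytest command are only the choices used in this walkthrough, so trim them to what your project actually supports.

```yaml
language: python
python:
  - "2.7"
  - "3.5"
  - "3.6"
install:
  - pip install .
  - pip install -r requirements.txt
script: pytest
```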

3. Commit your changes and trigger your first build with a git push. You can visit the dashboard and see all your builds for the project.

Importing docs on Read the Docs

Read the Docs hosts documentation for open source projects. It supports Sphinx docs written with reStructuredText and CommonMark. It builds documentation by pulling your code from Subversion, Bazaar, Git, or Mercurial repositories and hosts it.

Before importing a repository you need to sign up and connect an external account such as GitHub or Bitbucket. Connecting your account allows for easier importing and enables Read the Docs to configure your repository webhooks automatically.

Importing Docs

To import a repository visit your dashboard and click Import. If you have a connected account, a list of your repositories will be shown from which you can import a repository by clicking the import icon next to it.

If you do not have a connected account or if the repository you want to import doesn’t show up in the list, you can click on Import Manually and enter the information of the repository yourself. You will be asked for the project name, repository URL and repository type for your new project.

In the next step, you will be asked for a description of the project, documentation type, language, programming language, project homepage and tags.

After filling in and submitting the form, your project will be created. Within a few seconds your code will automatically be fetched from your public repository, and the documentation will be built.

Hosting docs of GraphSpace Python library on RTD

After creating the project for the GraphSpace Python library on RTD, the library needs to be installed inside a virtualenv using setup.py install before the documentation can be built. There is an option for this in the Advanced Settings under the Admin tab of the project: check the Install Project option and specify the Requirements File.

Read the Docs can also host multiple versions of your code. For that, go to the Versions section under the Admin tab of your project and choose the default and active versions for your project; you can also edit them. You can read more about this on the Versions documentation page.

Docstring formats in Python

Docstring stands for “documentation string”: a string literal placed as the first statement of a module, function, class or method, which Python stores on the object as its __doc__ attribute. Elements of interest such as functions, classes, methods and modules are described using docstrings. Docstrings are meant not only for developers but also for users.
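To make this concrete, here is a tiny sketch showing where a docstring lives and how it can be read back at runtime. The function `greet` is a made-up example, not something from the GraphSpace code.

```python
def greet(name):
    """Return a friendly greeting for ``name``."""
    return "Hello, " + name + "!"


# The docstring is stored on the function object, so tools and users can read it.
print(greet.__doc__)
# In an interactive session, help(greet) displays the same text.
```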

There are quite a few syntax conventions available for writing docstrings in Python, but a developer is always advised to stick to one particular syntax for the entire code of a project. Let's look at the popular docstring formats in Python.

reStructuredText

Since reST is the default format used by Sphinx to generate docs, it is widely used and the most popular. It is used not only in docstrings but also as a general markup format for documentation, e.g. on GitHub for documenting projects.

Example:

"""
This is an example of reST style docstring.

:param param1: Description of first param.
:type param1: str
:param param2: Description of second param.
:type param2: int
:returns: Description of what is returned.
:rtype: str
:raises MyError: Description of error.
"""

You can refer to this tutorial for more details about the reST documentation.

Googledoc

The Google docstring style is used and supported by Google. This style can also be interpreted by Sphinx, through its Napoleon extension. It is also popular and is used in several variants; Numpydoc is one of them.

Example:

"""
This is an example of Google style docstring.

Args:
    param1 (str): Description of first param.
    param2 (int, optional): Description of second param.

Returns:
    str: Description of what is returned.

Raises:
    MyError: Description of error.
"""

More examples of Google style Python docstrings can be found in this documentation.

Numpydoc

NumPy has its own docstring format, which the project recommends. It is based on the Google docstring format and also uses some reST syntax elements. Sphinx can use it as well, but needs the numpydoc extension so that the docstrings are handled correctly. You can refer to this documentation for using the Numpy docstring format.

Example:

"""
This is an example of Numpy style docstring.

Parameters
----------
first : str
    Description of first param.
second : int, optional
    Description of second param.

Returns
-------
str
    Description of what is returned.

Raises
------
MyError
    Description of error.
"""

More examples of Numpy style Python docstrings can be found in this documentation.

Documentation in GraphSpace Python client library:

We have adopted the Google Style docstrings for documenting the code of the GraphSpace Python library. It is easy, flexible and also readily interpreted by Sphinx for external docs generation.
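For Sphinx to understand Google style sections such as Args: and Returns:, the Napoleon extension has to be enabled in the project's conf.py. The fragment below is a minimal sketch for a generic Sphinx project; it is an assumption about a typical setup, not the configuration used in the graphspace_python repository.

```python
# conf.py (fragment): a sketch, not the configuration used by graphspace_python
extensions = [
    "sphinx.ext.autodoc",   # pulls docstrings out of the importable package
    "sphinx.ext.napoleon",  # converts Google/NumPy style sections to reST
]

# Napoleon parses both styles by default; here only the Google parser is kept on.
napoleon_google_docstring = True
napoleon_numpy_docstring = False
```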

 

Implementing a Python Client for RESTful APIs

RESTful APIs have become very popular in the last few years thanks to large sites like Facebook, Twitter and GitHub, which give developers the opportunity to extend their services with a wide variety of applications and services.

HTTP makes it possible to consume REST APIs from different applications. To consume a REST API, one makes an HTTP request to a specific API endpoint, providing the required data and headers, and gets the desired response in return.

This tutorial focuses on implementing a Python client for the GitHub API using Requests HTTP library.

Installing

Before starting make sure you have the Requests library installed. If not, you can follow this installation guide: Installation of Requests.

Importing

For using the Requests library in your Python program, first you have to import it.

>>> import requests

Making a Request & Working with the Response

For making a GET request, use the get method and pass the URL as parameter.

>>> response = requests.get('https://api.github.com/users/sandeepm96/repos')

A simple implementation of the GET request is as follows:
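A minimal sketch of such a client is given here. The function name `fetch_repo_names` and the optional `http` argument are my own additions; `http` only exists so the function can be tried with a stand-in object instead of the network.

```python
def fetch_repo_names(username, http=None):
    """Return the names of the public repositories owned by ``username``.

    ``http`` is anything exposing a requests-style ``get``; it defaults to
    the ``requests`` module itself.
    """
    if http is None:
        import requests  # third-party: pip install requests
        http = requests

    response = http.get("https://api.github.com/users/{}/repos".format(username))
    # Stop with an AssertionError on anything other than 200 OK.
    assert response.status_code == 200, "unexpected status {}".format(response.status_code)
    # The endpoint returns a JSON array with one object per repository.
    return [repo["name"] for repo in response.json()]


# Usage (performs a real request):
#   for name in fetch_repo_names("sandeepm96"):
#       print(name)
```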

The above is a simple client which makes a GET request to the user repositories endpoint and waits for a response. The program exits with an assertion error if the request is unsuccessful (status_code is not 200). If the request is successful, the list of repositories of the user gets printed.

For making a POST request, use the post method of the library, which takes a URL, the request body in the data parameter, and authentication details in the auth parameter. An example of a POST request is as follows:
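The sketch below creates a repository through the GitHub API. The choice of endpoint, the function name `create_repo` and the `http` stand-in argument are assumptions made for illustration. GitHub no longer accepts account passwords for API calls, so the second element of `auth` is assumed to be a personal access token.

```python
import json


def create_repo(name, username, token, http=None):
    """Create a repository for the authenticated user and return its JSON description."""
    if http is None:
        import requests  # third-party: pip install requests
        http = requests

    response = http.post(
        "https://api.github.com/user/repos",
        data=json.dumps({"name": name}),  # request body, serialised as JSON
        auth=(username, token),           # HTTP Basic auth tuple
    )
    # GitHub answers 201 Created when the repository has been made.
    assert response.status_code == 201, "unexpected status {}".format(response.status_code)
    return response.json()


# Usage (performs a real request and really creates a repository):
#   create_repo("scratch-repo", "your-username", "your-personal-access-token")
```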

The library also provides methods for PUT and DELETE requests.

Custom Headers

For adding custom HTTP headers to a request, pass them as a dictionary to the headers parameter.
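As a small illustration, the sketch below sends two headers with a GET request. The function name `fetch_user`, the User-Agent value and the `http` stand-in argument are hypothetical; the Accept value is the media type GitHub's REST documentation recommends.

```python
def fetch_user(username, http=None):
    """Fetch a GitHub user's public profile, sending custom headers."""
    if http is None:
        import requests  # third-party: pip install requests
        http = requests

    headers = {
        "Accept": "application/vnd.github+json",  # ask for GitHub's JSON media type
        "User-Agent": "my-github-client",         # hypothetical client name
    }
    response = http.get("https://api.github.com/users/" + username, headers=headers)
    assert response.status_code == 200, "unexpected status {}".format(response.status_code)
    return response.json()
```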

Introducing GraphSpace Python client library

GraphSpace Python Client is a client library for the GraphSpace REST API. It simplifies authentication, request construction and response parsing for developers and lets them perform all network-related operations, such as constructing graphs, adding nodes and edges, editing and uploading, directly from Python-based applications.

The code structure of the library is organised into the following:

  • Client class: The most important module of the library. It helps in connecting the user with GraphSpace. It has an internal method which helps make a request to the GraphSpace API by taking in parameters such as the request method, the graph data, the URI of the endpoint.
  • Endpoint classes: The library has 3 endpoint classes, one each for the graphs, layouts and groups endpoints. Under these classes, methods related to the respective endpoints are defined. These methods call the GraphSpace API to perform the required operation via the Client class’ request making method.
  • Object classes: There is an abstract class ResponseObject which provides methods to parse the response from the API into objects that can be easily worked upon. On top of it there are object classes for Graph, Layout and Group which encapsulate the graph, layout and group type dict in the response into objects via the parsing method provided by the ResponseObject class. The entire response from the API is encapsulated into a proper object using the APIResponse class.
  • Error Handling class: This class is responsible for raising exceptions whenever an HTTPError response is received from the API call. It raises the appropriate GraphSpace error based on the error code received in the response.
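The layering described above can be sketched roughly as follows. This is an illustration of the idea only: every class and method name here (`Client`, `Graphs`, `get_graph`, `GraphSpaceError`, `_make_request`) and the `/graphs/<id>` path are assumptions made for the sketch, not the actual graphspace_python code. The `transport` argument is expected to behave like a `requests.Session()`.

```python
import json


class GraphSpaceError(Exception):
    """Raised when the API answers with an HTTP error status."""

    def __init__(self, status_code, message):
        super().__init__("{}: {}".format(status_code, message))
        self.status_code = status_code


class ResponseObject:
    """Turns a plain dict from the API into an object with attributes."""

    def __init__(self, data):
        self.__dict__.update(data)


class Graph(ResponseObject):
    pass


class Client:
    """Holds credentials and funnels every call through one request method."""

    def __init__(self, username, password, transport,
                 base_url="http://www.graphspace.org/api/v1"):
        self.auth = (username, password)
        self.transport = transport  # e.g. a requests.Session()
        self.base_url = base_url
        self.graphs = Graphs(self)

    def _make_request(self, method, path, data=None):
        response = self.transport.request(
            method,
            self.base_url + path,
            data=json.dumps(data) if data is not None else None,
            auth=self.auth,
            headers={"Content-Type": "application/json", "Accept": "application/json"},
        )
        # Error handling: map HTTP error statuses to a library exception.
        if response.status_code >= 400:
            raise GraphSpaceError(response.status_code, response.text)
        return response.json()


class Graphs:
    """Endpoint class: one method per operation on /graphs."""

    def __init__(self, client):
        self.client = client

    def get_graph(self, graph_id):
        body = self.client._make_request("GET", "/graphs/{}".format(graph_id))
        return Graph(body)  # object class wraps the response dict
```

Keeping all HTTP traffic in one method means authentication, headers and error mapping are written once, and the endpoint classes stay thin.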

Currently, work is going on to improve the library and implement more features.

Using the GraphSpace REST API

The GraphSpace REST API provides endpoints for entities such as graphs, layouts, and groups that allow developers to interact with the GraphSpace website remotely by sending and receiving JSON objects. This API enables developers to create, read, update and delete GraphSpace content from client-side JavaScript or from applications written in any language. In this tutorial we are going to cover how to use the GraphSpace REST APIs using the Postman app.

Note: In order to fully utilize the features of GraphSpace REST API, you must have an account on GraphSpace.

Postman is an app for interacting with HTTP APIs, available as a native app and as a Google Chrome app. It provides a friendly GUI for constructing requests and reading responses. Postman makes it easy to test, develop and document APIs by allowing users to quickly put together both simple and complex HTTP requests.

Postman Installation

Postman is available as a native app (recommended) for Mac / Windows / Linux, and as a Chrome App. The Postman Chrome app can only run on the Chrome browser. To use the Postman Chrome app, you need to:

  1. Install Google Chrome: Install Chrome.
  2. If you already have Chrome installed, head over to Postman’s page on the Chrome Webstore – https://chrome.google.com/webstore/detail/postman-rest-client-packa/fhbjgbiflinjbdggehcddcbncdddomop?hl=en, and click ‘Add to Chrome’.
  3. After the download is complete, launch the app.

Using Postman for GraphSpace REST API

The GraphSpace REST APIs have the base URL http://www.graphspace.org/api/v1/. There are many endpoints defined under this base URL (the documentation of which can be found here), but to learn and understand the usage of GraphSpace REST APIs through Postman, we will consider only the /graphs endpoint with GET and POST requests.

  • The GET /graphs request fetches a list of graphs from GraphSpace matching the query parameters.
  • The POST /graphs request creates a graph in GraphSpace.

GET /graphs

1. The URL is the first thing to set for a request. We will set the URL to http://www.graphspace.org/api/v1/graphs.

2. Provide Authorization: Select ‘Basic Auth’ from Authorization type drop-down. Enter the username and password and click on ‘Update Request’.

3. Set Header: Add the following key value pairs, "Content-Type":"application/json" and "Accept":"application/json".

4. Select Method: Changing the method is straightforward. Just select the method from the select control. We will use GET method here.

5. Add URL Params:  Clicking on the URL Params button will open up the key-value editor for entering URL parameters. The details of the URL Params for /graphs endpoint can be found in the documentation.

6. Click on the Send button to send the request. A list of graphs matching the query parameters will be received in the response.
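For comparison, the same GET /graphs request can be sketched in Python with the Requests library. The function name `list_graphs` and the `http` stand-in argument are my own; the query parameters are passed through untouched, so take their names from the API documentation.

```python
def list_graphs(username, password, params=None, http=None):
    """Send GET /graphs with Basic Auth and JSON headers; return the parsed body."""
    if http is None:
        import requests  # third-party: pip install requests
        http = requests

    response = http.get(
        "http://www.graphspace.org/api/v1/graphs",  # step 1: URL
        auth=(username, password),                  # step 2: Basic Auth
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
        },                                          # step 3: headers
        params=params,                              # step 5: URL params
    )
    response.raise_for_status()  # raise on 4xx/5xx answers
    return response.json()


# Usage (performs a real request; "limit" is an assumed parameter name):
#   list_graphs("user@example.com", "password", params={"limit": 5})
```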

POST /graphs

1. The initial steps of setting URL, Authorization and Header are performed.

2. Change Method to POST.

3. Set Request Body: Click on Body to open the request body editor. Select raw from the choices and JSON (application/json) from the drop-down. Enter the JSON data for the graph to be created in the editor. The details regarding the properties of the JSON graph body can be found in the documentation.

4. Click on the Send button to send the request. A new graph object will be created and returned in the response.
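The POST /graphs steps translate to Python along the same lines. In this sketch the function name `create_graph` and the `http` stand-in argument are hypothetical, and `graph_body` is whatever dictionary the API documentation asks for; its properties are deliberately not spelled out here.

```python
import json


def create_graph(username, password, graph_body, http=None):
    """Send POST /graphs with a JSON body and return the created graph as a dict."""
    if http is None:
        import requests  # third-party: pip install requests
        http = requests

    response = http.post(
        "http://www.graphspace.org/api/v1/graphs",
        auth=(username, password),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        data=json.dumps(graph_body),  # step 3: raw JSON request body
    )
    response.raise_for_status()  # raise on 4xx/5xx answers
    return response.json()
```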

Postman Collection

A collection lets you group individual requests together. These requests can be further organized into folders to accurately mirror your API. Requests can also store sample responses when saved in a collection. You can add metadata like name and description too, so that all the information a developer needs to use your API is easily available. Collections can be exported as JSON files. Exporting a collection also saves the Authorization details, so it is advised to remove the Authorization details from the Header before exporting.
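If a collection has already been exported, the credentials can also be removed from the JSON file afterwards. The sketch below assumes the Collection v2 layout, where folders and requests sit under `item`, each request has a `header` list of key/value entries, and auth settings live under `auth`. Older exports may be laid out differently, so check your file first. The function names are my own.

```python
import copy


def strip_auth(node):
    """Return a copy of a collection (or folder/request item) without credentials."""
    cleaned = copy.deepcopy(node)  # leave the caller's data untouched
    _strip_in_place(cleaned)
    return cleaned


def _strip_in_place(node):
    node.pop("auth", None)  # collection- or folder-level auth block
    request = node.get("request")
    if isinstance(request, dict):
        request.pop("auth", None)  # request-level auth block
        headers = request.get("header")
        if isinstance(headers, list):
            # Drop any Authorization header, keep everything else.
            request["header"] = [
                h for h in headers
                if h.get("key", "").lower() != "authorization"
            ]
    for child in node.get("item", []):  # folders nest further items
        _strip_in_place(child)


# Usage:
#   import json
#   with open("collection.json") as f:
#       collection = json.load(f)
#   with open("collection.clean.json", "w") as f:
#       json.dump(strip_auth(collection), f, indent=2)
```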

For quick use of the GraphSpace REST APIs or if you are stuck somewhere and you want reference, you can download the collection of the APIs here. The collection has details regarding the API endpoints like params and body properties. Importing steps:

1. Click Import button in the top menu.

2. Choose the Import File in the pop up window.

3. Provide the Authorization details for the imported requests (the Authorization details have been removed for security reasons).

GSoC 2017 with National Resource for Network Biology: GraphSpace

4th of May, 2017. The results of Google Summer of Code 2017 were announced. My proposal for GraphSpace, a sub-org under the National Resource for Network Biology, got accepted. Yay!

I will be implementing a Python library for the GraphSpace RESTful APIs along with my mentors, Aditya Bharadwaj and T.M. Murali. Furthermore, we will be documenting the GraphSpace RESTful APIs themselves using RAML.

The coding period will start on 30 May, 2017, and the community bonding period is currently under way. I will use the community bonding period to interact with my mentors and discuss the workflow of the project. I will also research client libraries for REST APIs further and plan to study how to maintain user documentation using Sphinx. I’ll be updating this blog with every milestone I complete in my project.

Looking forward to an exciting summer of code. 🙂