Transition from soaplib to rpclib

This document describes the transition from the soaplib library to the new rpclib for handling CIS's SOAP functionality.

So far we have built the SOAP web service on twisted as the event-driven engine and soaplib for the SOAP functionality. The problem with this solution is that there is no link between the SOAP commands and the rest of the environment/context. In twisted we have access to everything we want, but in soaplib we only have access to the direct SOAP parameters. This means we can't get access to e.g. session IDs without hacking our way through the code.

However, there exists a fork of soaplib called rpclib, which has implemented new functionality that solves this problem. This is a description of how to benefit from it.

About rpclib

The rpclib project focuses on supporting different versions of both SOAP and XML-RPC, while soaplib focuses on supporting only SOAP 1.1. This more generic focus makes the framework a bit more flexible.

MethodContext

The main difference between soaplib and rpclib for our use cases is that every public SOAP method is given a MethodContext as its first argument. This object is unique per call and contains the SOAP environment, the SOAP headers and the current request. It even contains a User Defined Context (udc) that we can use for anything we want. We still don't have access to twisted's HTTP environment, but we can now use SOAP headers for the session ID instead. If we still need twisted's environment, I guess it is easier to hack it in here than in soaplib.

Events

Events are local functions that can be called at every SOAP request, at every response and when exceptions occur. This could be used to e.g. find the session ID in the SOAP headers and validate it. Another example is an event that checks the session and raises an AuthException if the client is not logged in to the web service.

Service classes are not instantiated anymore

Since rpclib makes use of MethodContext, it doesn't instantiate the Service classes anymore. This prevents us from storing data in the instances. Instead we have to use the MethodContext instance.

A tutorial for how rpclib works can be found at http://arskom.github.com/rpclib/manual/metadata.html, and example code for our use is put in https://github.com/plq/rpclib/blob/master/examples/authentication/server_soap.py. A short summary:

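from rpclib.decorator import rpc
from rpclib.model.complex import ComplexModel, Iterable
from rpclib.model.primitive import Mandatory, String
from rpclib.service import ServiceBase

# AuthenticationError and session_db are assumed to be defined elsewhere,
# as in the linked example.

class SessionHeader(ComplexModel):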
    """Our requested SOAP header"""
    session_id = Mandatory.String

class AuthenticationService(ServiceBase):
    """The Service for authentication"""
    @rpc(String, String, _returns=String, _throws=AuthenticationError)
    def authenticate(ctx, user_name, password):
        ret = ctx.udc.cerebrum.authenticate(user_name, password)
        if ret:
            return ctx.udc.site.makeSession(ret)
        raise AuthenticationError()

class GroupService(ServiceBase):
    """The main Service"""
    __in_header__ = SessionHeader

    @rpc(Mandatory.String, _returns=Iterable(String))
    def get_group_members(ctx, group_name):
        return ctx.udc.group.search_members(group_name)

def _on_method_call(ctx):
    """Event for validating session ID, which should be in SOAP header."""
    # A request without the SOAP header is rejected like an unknown session.
    if ctx.in_header is None or ctx.in_header.session_id not in session_db:
        raise AuthenticationError()

# Every call to methods in GroupService must first go through
# _on_method_call, which validates the session ID. This is the
# authentication step.
GroupService.event_manager.add_listener('method_call', _on_method_call)

# run twisted almost as before with soaplib

The new solution

This is what I have done to transition to rpclib:

  • All code is committed in its own branch to avoid conflicts:
^cerebrum/branches/soaplib-to-rpclib
  • We still have to subclass the call_wrapper to check for CerebrumRPCExceptions, but there are now subclasses of the Fault class which handle faultcodes and faultstrings more elegantly. As a consequence of this, the returned faultcodes have changed.

  • Session IDs can no longer be sent through HTTP, at least not without creating new hacks. They must be given through the SOAP headers instead. This functionality is implemented in an event.

    Service classes that need sessions must:

    1. Set __in_header__ and __out_header__ to the SessionHeader, which tells clients how to return the session ID.
    2. Add the event on_method_session as a callback. This event validates and creates new sessions through twisted.
    3. Add the exit event on_exit_session both for clean exits and exception exits. This gives the client its session ID.
  • The MethodContext's User Defined Context (udc) has been implemented as a dict for now, so that different functionality can be added to it. I am not sure if this is the best behaviour.

    For now, udc has been used for twisted's site instance, a pointer to the active session and the cerebrum instance. New services might want to add more data to it.

  • I have started on Cerebrum/modules/cis/auth.py, for authentication and access control. Authentication through username and password is almost done, but is not needed for the individuation service.
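
The session handling described above can be sketched in plain Python. This is only an illustration of the flow, not the actual implementation: StubContext is a stand-in for rpclib's MethodContext, and SESSIONS, in_session_id, out_session_id and the 'session_id' key in udc are hypothetical names. Only the event names on_method_session and on_exit_session and the udc['session'] key come from the description above.

```python
import uuid

# Hypothetical in-memory session store: session ID -> session data.
SESSIONS = {}


class StubContext(object):
    """Stand-in for a per-call context. This is not rpclib's MethodContext."""

    def __init__(self, in_session_id=None):
        self.in_session_id = in_session_id  # ID the client sent in the header
        self.out_session_id = None          # ID we return in the header
        self.udc = None                     # user defined context, a dict


def on_method_session(ctx):
    """Entry event: reuse the client's session if known, else create one."""
    sid = ctx.in_session_id
    if sid not in SESSIONS:
        # Missing or unknown ID: start a fresh session.
        sid = uuid.uuid4().hex
        SESSIONS[sid] = {}
    ctx.udc = {'session_id': sid, 'session': SESSIONS[sid]}


def on_exit_session(ctx):
    """Exit event: give the client its session ID, also after exceptions."""
    ctx.out_session_id = ctx.udc['session_id']
```

A first call without a session ID gets a new session, and a later call that presents the returned ID sees the same session data through ctx.udc['session'].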

How to make use of rpclib

  • Install the latest usable version of rpclib locally. The version that has been used in development and testing is rpclib 2.6.1-beta2, from the 25th of December 2011. The external packages that were required are already installed on cere-utv01. The latest rpclib version is now 2.7.0-beta. Should we upgrade to this version or wait for further updates? To make use of my local installation of rpclib, add:

    export PYTHONPATH=$PYTHONPATH:/site/jokim/rpclib/lib/python2.5/site-packages/
    
  • Update service code to use rpclib. The main functionality in rpclib works as for soaplib, so you don't need to change that much. However, what you need to update in your service:

    • Import rpclib instead of soaplib. For example, change from:

      from soaplib.core.service import rpc
      from soaplib.core.model.primitive import String
      from soaplib.core.model.clazz import ClassModel, Array
      from soaplib.core.model.exception import Fault
      

      To:

      from rpclib.model.primitive import String, Integer, Boolean
      from rpclib.model.complex import ComplexModel, Iterable
      from rpclib.model.fault import Fault
      from rpclib.decorator import rpc, srpc
      
    • Public methods can be decorated with either rpc or the static srpc. With rpc, the method's first argument must be the MethodContext, ctx. With srpc, the method is treated as a @staticmethod, that is, it gets no special arguments.

    • "Global" variables, e.g. cached data, must be stored in ctx.udc and not in the Service object anymore, as that is not instantiated anymore.

    • For language support, you must create an event that checks if the returned Fault is an EndUserFault and updates its faultstring to something proper. Services that don't return language specific messages don't need to do anything but raise CerebrumRPCExceptions, and the error message is returned to the client. Other exception types are logged and a generic UnknownException is returned to the client, to avoid giving out too much information.

    • For session support you must add events and set headers for it. You should then be able to find the session through:

      ctx.udc['session']
      
  • Cerebrum/modules/cis/auth.py is updated with code for authenticating clients. The wrap around BofhdAuth for access control is still needed, though. This is however not needed for the postmaster service and the forgotten-password service as they are today.
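
The error handling and language support described in the list above can be sketched roughly as follows. All classes here are plain Python stand-ins, not the rpclib or Cerebrum classes with the same names, and the MESSAGES table, the function names and the "Client"/"Server" faultcode values are assumptions made for the illustration. The faultcodes the real service returns are not stated in this document.

```python
import logging

logger = logging.getLogger("cis.sketch")

# Hypothetical translation table: message key -> text per language.
MESSAGES = {
    'person_notfound': {'en': 'Could not find the person.',
                        'nb': 'Fant ikke personen.'},
}


class Fault(Exception):
    """Stand-in for a SOAP fault with a code and a message."""

    def __init__(self, faultcode, faultstring):
        Exception.__init__(self, faultstring)
        self.faultcode = faultcode
        self.faultstring = faultstring


class EndUserFault(Fault):
    """Stand-in: a fault whose message is meant for the end user."""


class CerebrumRPCException(Exception):
    """Stand-in: an error whose message is safe to show to the client."""


def to_fault(exc):
    """Map an exception to the fault that should reach the client."""
    if isinstance(exc, CerebrumRPCException):
        return EndUserFault('Client', str(exc))
    # Anything else is logged and hidden behind a generic fault.
    logger.error("Unhandled exception: %r", exc)
    return Fault('Server', 'UnknownException')


def translate(fault, lang):
    """Event-style hook: replace an EndUserFault's key with proper text."""
    if isinstance(fault, EndUserFault) and fault.faultstring in MESSAGES:
        texts = MESSAGES[fault.faultstring]
        fault.faultstring = texts.get(lang, texts['en'])
    return fault
```

The point of the split is that only messages raised deliberately by the service reach the client, while unexpected errors end up in the log.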

Testing

The transition needs to be tested thoroughly, especially that the threading part works as it should. Every part of the functionality that is in production must be tested by going through the different steps manually. This means the forgotten-password service, as this is the main service that is using CIS.

Note that the testing was done with rpclib version 2.6.1-beta2, and the same twisted version as in production, 10.2.

Forgotten-password service

The forgotten-password service has been tested by going through every step in the process of both getting out the usernames and setting a new password.

  • Getting out username: Worked, by giving both student number and fnr.
  • Getting error message for unknown person: Worked, by giving a bogus fnr.
  • Sending bogus data through forms: Ampersands now work, but UTF-8 data outside of latin1 does not. I have added a check on the client side that gives the user a message if the given data is outside latin1. This is not the best solution, but it is what we have time for. After this fix it worked as it should.
  • Getting error message for unknown person on the password page: Worked, got error message by giving bogus fnr, username and phone number.
  • Getting same error message for known person, but unknown phone: Worked, same error message.
  • Authenticating at the password site: Worked, got an SMS with the token.
  • Adding wrong token: Worked, got an error message.
  • Adding wrong token the maximum number of times: Worked, the token was removed and I was sent back to the start page.
  • Adding correct token: Worked, was given the password form.
  • Trying to set a short password: Worked, got an error message about it being too short.
  • Trying to give utf8 characters and bogus data to the password checker: Worked as it should, gave the correct error message.
  • Set a valid password: Worked, password got set.
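
The latin1 check mentioned in the list above lives in the PHP client, and its code is not shown here. The same idea can be sketched in Python as follows. The function name is hypothetical.

```python
def is_latin1(text):
    """Return True if every character in `text` can be encoded as latin1."""
    try:
        text.encode('latin-1')
    except UnicodeEncodeError:
        return False
    return True
```

Norwegian letters such as æ, ø and å pass this check, while e.g. the euro sign or Japanese characters do not, so the user would get the message for those.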

To test the threading part, the same strategy was used as when the original CIS was tested for concurrency bugs. A script runs in an endless loop, making different calls to the website. Note that the captchas are turned off to be able to test this. The script makes different calls asking if a user exists or not, and checks that the result is as expected. By running more than 11 processes with this script from different IP addresses, started at different times, we stress the service enough to make use of all its threads, of which there are 10 by default in twisted. This is then run overnight, and the logs are checked for errors and warnings the next morning.
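
The idea behind the script can be sketched as follows. This is not the script that was used: it runs threads in one process for a fixed number of rounds instead of separate processes in an endless loop, and it calls a local function instead of the website. The names stress, run_worker, check and cases are hypothetical.

```python
import threading


def run_worker(check, cases, rounds, errors):
    """Call `check` for every case, `rounds` times, recording mismatches."""
    for _ in range(rounds):
        for arg, expected in cases:
            try:
                result = check(arg)
            except Exception as exc:
                errors.append((arg, repr(exc)))
                continue
            if result != expected:
                errors.append((arg, result))


def stress(check, cases, workers=13, rounds=100):
    """Run `workers` concurrent workers and return all recorded errors."""
    errors = []
    threads = [threading.Thread(target=run_worker,
                                args=(check, cases, rounds, errors))
               for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return errors
```

Here check would be a function that asks the service whether a user exists, and cases a list of (username, expected answer) pairs. An empty list back from stress means every call gave the expected answer.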

The stress test was run over the weekend 12th-13th of May with 13 processes from the hosts melk.uio.no, login1.uio.no, login2.uio.no, cere-utv01.uio.no, morpheus.uio.no and umbra.uio.no. The service gave no errors.

Note that this is not a proper stress test, but just enough to verify that the threads are not creating any major problems for each other. We could still have bugs related to the threads.

Postmaster service

The postmaster service was quickly tested:

  • Logging on (through bofhd): Worked.
  • Asking for a list of some affiliations: Found a bug, which got fixed. Worked afterwards.
  • Asking for a list of some affiliations and by OUs: Worked, got a list of e-mail addresses.

This service has not been stress tested, as it uses the same framework as the forgotten-password service.

Deployment

This section describes how to deploy these changes. Note that the actual deployment requires some downtime, as it takes a few minutes even if everything goes as planned.

This deployment description focuses on UiO, but the other instances are mostly deployed in the same manner, except for UiA. Note that UiA has their own PHP client, which they have to update themselves. This could take some time for them, depending on how much of the code they have changed.

1. Merging

The branch soaplib-to-rpclib has to be merged into trunk first, before the sysadmins want to deal with it. This was done in svn revision 15951.

2. Installing rpclib

The rpclib package needs to be installed in the production environment.

See the rpclib page for more information, or the page at python.org: http://pypi.python.org/pypi/rpclib. Note that the latest version has not been tested yet.

3. Deploying Cerebrum code

Note that this requires some downtime, as the update requires that the web client also gets updated.

Note that we have two such web services at UiO (Individuation and Postmaster), but both clients depend on the same code files, so we have to deploy both at the same time.

Related files:

  • Cerebrum/Errors.py
  • Cerebrum/modules/cis/ (everything under here)
  • Cerebrum/modules/no/uio/PostmasterCommands.py
  • servers/cis/ (everything under here)
  • cisconf (everything under here) - Note that some instances are not updated with this yet, which means that we have to move their config out of cereconf and into cisconf files. This is not done yet, to avoid getting in conflict with other deployments. You could for instance talk to jokim for doing this when the given instance is about to be deployed.

How to deploy:

  1. Update all related files.
  2. Verify that rpclib is installed by running this in a Python shell: import rpclib; rpclib.__version__
  3. Restart the SoapPostmasterService.py job.
  4. Make sure the job runs and check the log for errors: /cerebrum/uio/var/log/cerebrum/cis_postmaster.log
  5. Restart the SoapIndividuationService.py job.
  6. Make sure the job runs and check the log for errors: /cerebrum/uio/var/log/cerebrum/cis_individuation.log
  7. Deploy the PHP clients, as they don't work at this moment.
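
The version check in step 2 could also be scripted. The following is a minimal sketch using only the standard library. The helper name installed_version is hypothetical, and it assumes the module exposes a __version__ attribute, as the command in step 2 does.

```python
import importlib


def installed_version(module_name):
    """Return the module's __version__ string.

    Returns 'unknown' if the module has no __version__ attribute, and
    None if the module cannot be imported at all.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, '__version__', 'unknown')
```

On the production host, installed_version('rpclib') returning None would mean that rpclib is not installed for that Python, and any other value can be compared with the version that was used in testing.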

4. Deploying the PHP clients

We have two PHP clients that must be updated, the forgotten-password service and the postmaster service, but the latter is only for UiO. At UiO, both are placed underneath Brukerinfo. The other instances have their own locations for the forgotten-password service.

Related files:

  • clients/web/phplib/ (everything under here) - This is the shared code and is on curumo placed in the directory phplib/
  • clients/web/postmaster/ (everything under here) - This is for the postmaster site, and is in production placed in the directory src/postmaster/ (only for UiO).
  • clients/web/individuation/ (everything under here) - This is for the forgotten-password site, and is in production placed in the directory src/individuation/
  • cerebrum_sites/etc/$INSTANCE/cis_passord.web/ - This is the config directory, which might be updated for some instances. Most likely, this is not the case.

Postmaster webservice

  1. Update all the related files at curumo:/uio/caesar/no.uio.brukerinfo_443/
  2. chmod the files to be readable by everyone, but of course not the secret directory.
  3. Log on to the Postmaster webservice with your superuser account.
  4. Try to get out a list of some e-mail addresses.
  5. Double check the log for errors.

Forgotten-password webservice

  1. Update all the related files at curumo:/uio/caesar/no.uio.brukerinfo_443/
  2. chmod the files to be readable by everyone, but of course not the secret directory.
  3. Go to the forgotten-password site. Change the language from English to Norwegian and back again, and make sure that you get no errors. This shows that the server is responding.
  4. Try to get your usernames from the service.
  5. Verify that wrong input to the captcha blocks access to the site. The old code contained a security bug, which made it ignore the captcha totally.
  6. Not sure how we could test the password changer. Maybe catch a student, or just watch the logs for students that need to reset their passwords?
  7. Double check the log for errors.

How to deploy for UiA

UiA has their own PHP client, which they have modified for their own purposes. The communication part has changed, so they need to update the client's phplib code at the same time as we update our own code. We therefore can't update their Cerebrum code until they say go.

All the related updates are in: phplib/model/CICom.php

We could give them a test server to use for testing, depending on how many changes they have to make. We then have to use a port at cere-utv01 that is open for UiA, or ask nett-drift@usit.uio.no to open one for us.

UiA has their own fictitious student accounts for testing scenarios such as these, so they could test the password service themselves.

Cleaning up

After every instance is updated and is using rpclib, the old soaplib packages could be uninstalled.

Author: jokim

Published 25 July 2013 14:48