Open architecture for education

Open architecture

Centralised model

Technical spec

In this model, KLP sits in the centre of the network and maintains all the referential integrity by virtue of its special position in the network. Overlays of various other data sets can be made only with KLP's technical support.

Legal position

KLP will have some duties arising from possession of data from external organisations. These terms will have to be negotiated on a case-by-case basis. KLP will need to formulate a privacy policy, a data retention policy and a security policy.


Advantages

  • Since the data is tightly coupled, overlays are easier to implement
  • Policies can be formulated in a way that satisfies all KLP stakeholders
  • Data quality is likely to be better
  • Richer set of data for data mining
  • Lower barrier of entry for external organisations


Disadvantages

  • Returning data to an external organisation is hard to do
  • Hard to involve organisations who want to share only part of their data
  • Legal framework is more complicated

Federated model

This is the model that we have decided to go with.

Identifier map

The interface consists essentially of a map from unique identifiers to unique identifiers, along with a number of sets of unique identifiers. The data model has two types of entities: individuals and institutions. These entities are associated with geographies, of which there are several hierarchies: for example, MP constituencies, or education districts, blocks and clusters.

Each partner organisation maintains its own unique identifier for its entities. In the case of KLP we will maintain a unique identifier for every child and institution. The interface will provide a map of our identifiers to a global set of identifiers. The identifier will need to be associated with sufficient identifiable data so that all parties can be certain that they are referring to the same entity. In the case of a child, identifiable data might include name, parents' names, and date of birth.

The global identifier (GID) is used to establish which entity partners are referring to. There are two ways of querying the map. A local() query returns the local identifier for a global identifier, for the partner making the query. A global() query returns the global identifier for a local identifier of the partner making the request. The local identifiers will most likely be the primary keys for the entities in the partner's database. A minimal set of identifiable attributes for each entity will be associated with the entity's entry in the map.
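The two queries can be sketched as an in-memory map. This is only an illustration under stated assumptions: the class name, the register() method, the use of random hex strings as GIDs and the sample attributes are all invented here, and the methods are called global_id() and local_id() because global is a reserved word in Python.

```python
import uuid


class IdentifierMap:
    """Toy in-memory map between partner-local identifiers and GIDs."""

    def __init__(self):
        self._to_global = {}   # (partner, local_id) -> gid
        self._to_local = {}    # (partner, gid) -> local_id
        self._attributes = {}  # gid -> minimal identifiable attributes

    def register(self, partner, local_id, attributes=None, gid=None):
        """Link a local identifier to a GID, minting a new GID if none is given."""
        if gid is None:
            gid = uuid.uuid4().hex
            self._attributes[gid] = dict(attributes or {})
        elif gid not in self._attributes:
            raise KeyError(f"unknown GID: {gid}")
        self._to_global[(partner, local_id)] = gid
        self._to_local[(partner, gid)] = local_id
        return gid

    def global_id(self, partner, local_id):
        """global() query: the GID behind one of the partner's local identifiers."""
        return self._to_global[(partner, local_id)]

    def local_id(self, partner, gid):
        """local() query: the partner's own identifier for a GID."""
        return self._to_local[(partner, gid)]


# Two partners refer to the same child through different local identifiers.
idmap = IdentifierMap()
gid = idmap.register("klp", 1042, {"name": "Asha", "dob": "2003-06-14"})
idmap.register("partner_b", "CH-77", gid=gid)
```

Each partner only ever passes its own local identifiers and receives GIDs back, so neither needs to see how the other keys its database.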

All entity IDs described below (indid, iid, geoid) are GIDs. Partners need never know one another's local identifiers and can be kept ignorant of these internal details.

KLP occupies no special place in this model. As a practical matter we will implement the interface, but this does not confer any special position on KLP. The onus of security is shared among all partners.


Advantages

  • External organisations have full control over their data
  • Anyone can make overlays of any shared data
    • In case a particular overlay needs more data than is shared, a bilateral agreement can be drafted for the purpose of that overlay. KLP need not be consulted.
  • From a technical perspective, this model can be used to implement even the centralised model


Disadvantages

  • Since no one occupies a special position, enhancements and change processes are harder to implement
  • Even the minimal set of identifiable information associated with each identifier may weaken the privacy policy
  • Higher barrier of entry as participating organisations will need to have the technical knowledge to use the interface
    • This can be mitigated by implementing the interface glue code for them
  • Overlays are harder to implement
    • However, KLP's involvement is not required to implement overlays
  • Not all partners will be equal in all aspects
    • Data quality will need to be ranked

Legal position

Additional data sharing is done bilaterally and is the responsibility of the sharing organisations. KLP need not be a central repository of sharing agreements.

Privileged Interface specification

This is the interface that will be used by authenticated partners to read and write to the data store.

Read interface


indid[] = find_individual(name, dob, sex, mt, geoid, itype, parents[])

Apart from the self-explanatory inputs, the parameters are:

  • mt -- mother tongue
  • geoid -- location identifier. Could be in any of the geographical hierarchies.
  • itype -- institution type identifier.
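A minimal sketch of how such a lookup might behave, assuming that every supplied argument acts as an exact-match filter and that omitted arguments match anything. The specification does not state the matching rules, so this behaviour, the Individual record type and the sample records are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Individual:
    indid: str
    name: str
    dob: str
    sex: str
    mt: str      # mother tongue
    geoid: str   # location identifier
    itype: str   # institution type identifier
    parents: list = field(default_factory=list)


def find_individual(records, **criteria):
    """Return the indids of the records that match every supplied criterion."""
    matches = []
    for rec in records:
        if all(getattr(rec, key) == value
               for key, value in criteria.items()
               if value is not None):
            matches.append(rec.indid)
    return matches


# Invented sample records.
records = [
    Individual("ind-1", "Asha", "2003-06-14", "F", "kn", "geo-9", "school", ["Meera"]),
    Individual("ind-2", "Ravi", "2003-06-14", "M", "kn", "geo-9", "school"),
]
girls_born_that_day = find_individual(records, dob="2003-06-14", sex="F")
```

Because several individuals can share a name or a date of birth, the call returns a list of indids rather than a single value.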

indinfo = individual_info(indid)
indinfo = individual_info(indid, progid)

Returns identifiable information about an individual as key-value pairs.

  • name
  • dob
  • sex
  • geoid
  • iid[]
  • parents[]

In the case where progid is also supplied, it will additionally return the program assessment details for the individual in question.
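The optional progid can be modelled as a defaulted argument. The sketch below is an assumption-laden illustration: the store layout, the "assessment" key under which programme details are returned and the sample values are all invented, since the specification names none of them.

```python
def individual_info(store, indid, progid=None):
    """Return identifiable details, plus assessment details when progid is given."""
    record = store["individuals"][indid]
    info = {key: record[key]
            for key in ("name", "dob", "sex", "geoid", "iid", "parents")}
    if progid is not None:
        # Hypothetical key name; the specification does not name this field.
        info["assessment"] = store["assessments"].get((indid, progid), {})
    return info


# Invented sample data standing in for the data store.
store = {
    "individuals": {
        "ind-1": {"name": "Asha", "dob": "2003-06-14", "sex": "F",
                  "geoid": "geo-9", "iid": ["inst-3"], "parents": ["Meera"]},
    },
    "assessments": {("ind-1", "prog-2"): {"reading": "word level"}},
}
basic = individual_info(store, "ind-1")
with_programme = individual_info(store, "ind-1", "prog-2")
```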

indid[] = list_individuals(progid)
indid[] = list_individuals(geoid)
indid[] = list_individuals(iid)

Returns a list of individuals associated with one of the following:

  • progid -- program
  • geoid -- location
  • iid -- institution


icid[] = institution_components(iid)

Returns a list of institution components.

indid[] = institution_workers(iid)

Returns a list of individuals who work in an institution.

indid[] = institution_customers(iid)

Returns a list of individuals who use an institution.

Write interface


indid = create(name, dob, sex, geoid, [ parents[] ])

Creates an individual with the specified information and returns the indid of the newly created individual.

edit(indid, changes[])

Edits the details of an individual. Requires the indid of the individual being edited and the changes required as key-value pairs. cf. find_individual() for the list of possible key values.

associate(indid, progid, proginfo[])

Associates an individual's results of a programme assessment with the individual. proginfo[] contains the assessment details as key-value pairs.


iid = create(name, owner, type, geoid, [info[] ])

Creates an institution with the specified name.

  • owner -- indid of the owner.
  • type -- institution type
  • geoid -- location identifier. Could be in any of the geographical hierarchies.
  • info[] -- optional key-value pair list of parameters associated with the institution such as number of rooms, constructed date, etc.

FIXME: duplicate handling

icid = create_component(iid, name, owner, type, [info[] ])

Creates an institution component with the specified name. Classes are mapped to institution components.

associate(icid, indid[], type)

Associates individuals with an institution component as either workers or customers, as selected by type.

edit(iid, changes[])

Edits the details of an institution. Requires the iid of the institution being edited and the changes required as key-value pairs. cf. create() for the list of possible key values.

edit(icid, changes[])

Edits the details of an institution component. Requires the icid of the institution component being edited and the changes required as key-value pairs. cf. create_component() for the list of possible key values.

associate(iid, progid, proginfo[])

Associates an institution's results of a programme assessment with the institution. proginfo[] contains the assessment details as key-value pairs.


htype = create_type(name)

Creates a new hierarchy type with the given name.

geoid = create(name, parent, htype)

Creates a location with the given name under a particular hierarchy type, with the specified parent. A parent of 0 indicates a root node.
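A sketch of how a hierarchy could be built with these two calls. The Geography class, the integer identifiers, the check that a child stays in its parent's hierarchy and the place names are assumptions for illustration; only the use of 0 as the root marker comes from the specification.

```python
import itertools

ROOT = 0  # the specification uses parent 0 to mark a root node


class Geography:
    def __init__(self):
        self._ids = itertools.count(1)  # 0 is reserved for "no parent"
        self.htypes = {}                # htype -> name
        self.nodes = {}                 # geoid -> (name, parent, htype)

    def create_type(self, name):
        htype = next(self._ids)
        self.htypes[htype] = name
        return htype

    def create(self, name, parent, htype):
        if htype not in self.htypes:
            raise ValueError(f"unknown hierarchy type: {htype}")
        if parent != ROOT:
            if parent not in self.nodes:
                raise ValueError(f"unknown parent: {parent}")
            if self.nodes[parent][2] != htype:
                raise ValueError("parent belongs to a different hierarchy")
        geoid = next(self._ids)
        self.nodes[geoid] = (name, parent, htype)
        return geoid


geo = Geography()
education = geo.create_type("education")
district = geo.create("District A", ROOT, education)
block = geo.create("Block A1", district, education)
```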


ptype = create_type(name)

Creates a new programme type with the given name. Programme types are used to group programmes.

progid = create(name, ptype, time, params[])

Creates an instance of a particular programme type.

  • name -- name of the programme
  • ptype -- type of the programme
  • time -- the time period of the programme
  • params[] -- parameters that the assessments will measure
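One possible use of params[] is to check the keys of a later proginfo[] against what the programme declared it would measure. The specification does not require this check; the function names, the dict representation and the sample values below are assumptions made for illustration.

```python
def create_programme(name, ptype, time, params):
    """Return a programme record; a dict stands in for the stored row."""
    return {"name": name, "ptype": ptype, "time": time, "params": set(params)}


def check_proginfo(programme, proginfo):
    """Return the proginfo keys that the programme does not declare."""
    return sorted(set(proginfo) - programme["params"])


# Invented sample programme.
programme = create_programme(
    name="Reading programme",
    ptype="reading",
    time=("2009-06", "2010-03"),
    params=["letters", "words", "sentences"],
)
unknown = check_proginfo(programme, {"words": 12, "numeracy": 3})
```

Here unknown holds the single key "numeracy", which the write interface could reject or flag before calling associate().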


Authentication is not a big concern as of March 2010.

It will become important when a partner with their own data store wishes to interface that store with KLP's. A password-based authentication scheme might be sub-optimal. GSSAPI, and additionally PKCS#11, might be a better long-term choice.



All data traversing any public network link should be encrypted via TLS.
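On the client side, this requirement could be met with Python's standard ssl module. The sketch below only builds the context; the function name and the choice of TLS 1.2 as the minimum version are assumptions, not part of the specification.

```python
import ssl


def make_client_context():
    """Build a TLS context that verifies the server certificate and host name."""
    context = ssl.create_default_context()
    # Refuse protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_client_context()
```

A partner would then wrap its connection with context.wrap_socket(sock, server_hostname=host) before sending any request.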

Data store

The data store should not listen on any public interface.
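In practice this usually means binding the store to the loopback address, or to a private address, rather than to all interfaces. A minimal sketch with a plain TCP socket follows; the function name is hypothetical and a real data store would set this in its own configuration instead.

```python
import socket


def open_private_listener(port=0):
    """Listen on the loopback interface only; port 0 lets the OS pick a free port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", port))
    listener.listen()
    return listener


listener = open_private_listener()
host, port = listener.getsockname()
listener.close()
```

Binding to 0.0.0.0 instead would expose the same port on every interface, including public ones.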

Public Interface specification

This is the interface through which aggregated, anonymised information can be disseminated. This is a read-only interface and will only provide aggregated information. The KLP website could read from this interface. Data can be provided in two forms: JSON and XML.
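Both output forms can be produced from the same record with the Python standard library. The field names, the sample values and the "aggregate" root element below are invented for illustration; the specification does not define a schema.

```python
import json
import xml.etree.ElementTree as ET


def to_json(record):
    """Serialise a flat record as a JSON object with sorted keys."""
    return json.dumps(record, sort_keys=True)


def to_xml(record, root_tag="aggregate"):
    """Serialise a flat record as one XML element per key."""
    root = ET.Element(root_tag)
    for key, value in sorted(record.items()):
        child = ET.SubElement(root, key)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


# Invented sample record.
record = {"geoid": 12, "progid": 3, "assessed": 240}
as_json = to_json(record)
as_xml = to_xml(record)
```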


htype[] = list_htypes()

Returns a list of hierarchy types in the system.

geoid_struct = htype_info(htype)

Returns a hierarchical list of geoids associated with a hierarchy type.

htype = find_htype(geoid)

Returns the hierarchy type for a given geoid.

geoid_struct = children_of(geoid)

Returns a hierarchical list of all the children of a geoid.
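The shape of geoid_struct is not defined in this specification. One plausible representation, assumed here purely for illustration, is a nested mapping from each child geoid to its own children, built from a child-to-parent table.

```python
def children_of(parents, geoid):
    """Return the subtree below geoid as nested dicts: {child: {grandchild: {}}}."""
    return {
        child: children_of(parents, child)
        for child, parent in parents.items()
        if parent == geoid
    }


# child geoid -> parent geoid; 0 marks a root node. Invented sample hierarchy.
parents = {1: 0, 2: 1, 3: 1, 4: 2}
tree = children_of(parents, 1)
```

For this sample, tree is {2: {4: {}}, 3: {}}: geoid 1 has children 2 and 3, and 2 has the single child 4.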


progid[] = get_progs(geoid)
progid[] = get_progs(htype)
progid[] = get_progs(ptype)

Returns a list of programs associated with a particular location, hierarchy type or programme type, respectively.

prog_params[] = get_prog_params(progid)

Returns a list of the parameters that the assessment of a programme measured.

Assessment data

pdata = aggregate(geoid, progid)
pdata = aggregate(iid, progid)

Returns a programme's assessment data aggregated for a particular geoid or institution.
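A sketch of the kind of aggregation meant here: per-individual results are reduced to counts, so no indid leaves the interface. It assumes that the set of individuals belonging to the geoid or institution has already been resolved, and the data layout and sample results are invented.

```python
from collections import Counter


def aggregate(assessments, member_ids, progid):
    """Count individuals per result for one programme, dropping identities."""
    counts = Counter()
    for (indid, prog), result in assessments.items():
        if prog == progid and indid in member_ids:
            counts[result] += 1
    return dict(counts)


# Invented sample data: (indid, progid) -> assessment result.
assessments = {
    ("ind-1", "prog-1"): "words",
    ("ind-2", "prog-1"): "words",
    ("ind-3", "prog-1"): "letters",
    ("ind-3", "prog-2"): "sentences",
}
pdata = aggregate(assessments, {"ind-1", "ind-2", "ind-3"}, "prog-1")
```

With very small groups even counts can identify a child, so a real implementation might also suppress cells below some minimum size.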


type[] = list_types()

Returns a list of institution types in the system.

iid[] = list_institutions(geoid)
iid[] = list_institutions(geoid, itype)

Returns a list of institutions in a particular location. Optionally takes an institution type as a second parameter to restrict the list to that type.

FIXME: duplicate handling

iinfo = institution_info(iid)

Returns a list of key-value pairs about the institution.

  • owner -- The owner is the indid of a person who runs the institution.
  • name -- name of the institution.
  • itype -- type of an institution.
  • progid[] -- list of programs that the institution was involved in.
  • ncust -- number of customers
  • nworkers -- number of workers

Last modified on 03/24/10 17:31:15
