M3 - Monitoring, Metering and Management
This page discusses the designs that will support monitoring, metering and management of Ringside Social Application Servers.
Definitions
Before we start, let's define what those terms mean:
- Monitoring - the act of collecting metrics from different subsystems within the Ringside server. Metrics (aka "stats" or "statistics") can be things like, "how long did it take for a request to complete" or "what was the response time for the server to process an API call". Monitoring should have little to no impact on the server itself; it particularly should not cause the server to behave differently. In other words, the behavior of the Ringside server should not differ whether or not it is being actively monitored.
- Metering - the ability to alter the behavior of the server based on certain metrics within the server. This is similar to, but not quite the same as, authorization. An example illustrates the difference: let's say John is authorized to access web page index.php, but Charles is not. Authorization is a yes or no answer - "yes, John can always access index.php" and "no, Charles can never access index.php". Metering adds rules to these access checks to support something like, "sometimes Bob may be authorized to view index.php, but other times he may not be." An example is, "Bob can view index.php unless he's already viewed it 10 times today; after he's viewed it 10 times in a day, Bob is no longer authorized to view index.php for the rest of the day". Metering allows for throttling of services and can be used for subscription-based access controls ("gold customers can view index.php an unlimited number of times; however, bronze customers can only view it 10 times per day").
- Management - the ability to control the server's behavior by configuring parts of the server as well as starting/stopping its components. Being able to manage the Ringside server can mean things like deploying and undeploying applications, provisioning new APIs and shutting the server down.
We use the term M3 when talking about "Monitoring, Metering and Management" as a single concept.
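To make the difference between authorization and metering concrete, here is a minimal sketch of the "10 views per day" rule from the metering definition. It is written in Python purely for brevity (the Ringside server itself is PHP), and every name in it (ViewMeter, DAILY_LIMITS, has_permission, the tier names as dictionary keys) is an assumption made for illustration, not part of the M3 design.

```python
from collections import defaultdict
from datetime import date

# Hypothetical per-tier daily view limits; None means unlimited.
DAILY_LIMITS = {"gold": None, "bronze": 10}


class ViewMeter:
    """Counts views per (user, page, day) and enforces a daily limit."""

    def __init__(self, limits):
        self._limits = limits
        self._counts = defaultdict(int)

    def has_permission(self, user, tier, page, today):
        """Return True and record the view if the user is under the limit."""
        limit = self._limits[tier]
        key = (user, page, today)
        if limit is not None and self._counts[key] >= limit:
            return False  # limit reached: denied for the rest of the day
        self._counts[key] += 1
        return True


meter = ViewMeter(DAILY_LIMITS)
day = date(2008, 1, 1)  # arbitrary example date

# Bob is a bronze customer and tries to view index.php 11 times in one day.
bob_results = [
    meter.has_permission("bob", "bronze", "index.php", day) for _ in range(11)
]
print(bob_results.count(True))  # number of views granted to Bob
print(bob_results[-1])          # outcome of the 11th attempt
```

The answer for Bob changes over the course of the day, which is exactly what a plain yes/no authorization check cannot express. Because the day is part of the counter key, the count starts fresh on the next day without any explicit reset.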
Source Code Layout
The "M3 core" will be in its own top-level project called "m3". It will be broken down into modules once the implementation is fleshed out. There will be a local M3 client (for use by the web and social tiers mainly) and a remote M3 client (for use by applications mainly). Here's how the source will be laid out inside of SVN:
M3 API
The Ringside server is implemented with three main subsystems - the Web tier, the Social tier and the API tier.
The core M3 implementation will reside as a "fourth tier" such that the W/S/A tiers can call into the M3 tier via a "local client". Our low-level M3 features will also be exposed via a "remote client" to deployed applications as well as external management tools using the same REST protocol used everywhere else.
To add a new M3 API to the source, follow these instructions.
Some thoughts on M3 APIs that we can expose:
| API | Description |
| getStatistics() | Retrieves collected data, possibly timeboxed (e.g. "give me all the stats that have been collected in the past hour") |
| purgeStatistics() | Deletes collected data to free up space on the persistence store, making room for more stats to be stored |
| hasPermission() | Determines if the requestor is authorized to do something. We can pass in a tuple (NetworkID, ApplicationID, UserID) or anything within the request context that makes sense. This can potentially access a rules engine to do its check, or it can do a simple authz check. This is part of metering. |
| addStatisticData() | Allows an application to add its own set of statistic data. If we do it right, we can provide a monitoring framework for apps to easily plug into, giving them monitoring/metering "for free". This needs a lot of thought, but it should be doable. This can even be asynchronous from a user request, so long as the app sends us its app ID and key to authenticate itself. |
| getDeployedApplications | Has the API layer report what applications are currently deployed. Useful for a control panel or 3rd party management tool. |
| undeployApplication | Useful for a control panel or 3rd party management tool to manage applications |
| deployNewApplication | Useful for a control panel or 3rd party management tool to manage applications |
| ...anything else we can think of... |
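As a rough illustration of how the timeboxed getStatistics() and purgeStatistics() ideas could fit together, here is a small in-memory sketch. It is Python rather than PHP only to keep it short, and the class name, method signatures, metric name and integer timestamps are all assumptions; the real calls would go through the REST protocol and a persistent store.

```python
class StatisticsStore:
    """In-memory stand-in for the M3 persistence store (an assumption)."""

    def __init__(self):
        self._samples = []  # list of (timestamp_seconds, name, value)

    def add_statistic_data(self, timestamp, name, value):
        """Record one sample."""
        self._samples.append((timestamp, name, value))

    def get_statistics(self, since=None, until=None):
        """Return samples with since <= timestamp < until; None is unbounded."""
        return [
            s for s in self._samples
            if (since is None or s[0] >= since)
            and (until is None or s[0] < until)
        ]

    def purge_statistics(self, older_than):
        """Delete samples older than the timestamp; return how many were removed."""
        before = len(self._samples)
        self._samples = [s for s in self._samples if s[0] >= older_than]
        return before - len(self._samples)


store = StatisticsStore()
store.add_statistic_data(1000, "api.response_ms", 42)
store.add_statistic_data(4000, "api.response_ms", 17)
store.add_statistic_data(5000, "api.response_ms", 23)

now = 5400
last_hour = store.get_statistics(since=now - 3600)      # "past hour" timebox
purged = store.purge_statistics(older_than=now - 3600)  # drop everything older
print(len(last_hour), purged)
```

Using the same cutoff for both calls means a caller can read the last hour of data and then purge everything older without losing anything it just fetched.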
M3 Interception Points
The M3 layer needs to weave itself into the core Ringside code in fairly intimate ways. This is why AOP was such an attractive solution - we should have been able to define the joinpoints where we wanted advice code to run without injecting any code directly into the monitored classes. Alas, production-ready AOP implementations for PHP are not yet available. So, we'll need to pinpoint where in the code we want to intercept requests so we can do things like:
- collect statistics
- authorize calls
- meter access
For example, restserver.php is the main integration point when applications make calls into the API tier. We can therefore inject our own M3 code somewhere in that restserver call chain to do things like collect stats on what API calls are being made, when and by whom.
Currently, we have 3 interception points where we are looking to inject collection code:
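Without AOP, one way to keep collection code out of the monitored classes is to wrap a handler at the point where the call chain invokes it. The sketch below shows that idea in Python (the real code would be PHP inside the restserver call chain); the intercept function, the collected list and the users_get_info handler are hypothetical names used only for illustration.

```python
import time

collected = []  # stand-in for wherever M3 would persist its stats


def intercept(api_name, handler, clock=time.perf_counter):
    """Wrap handler so each call records what was called, by whom, and for how long."""
    def wrapped(caller, *args, **kwargs):
        start = clock()
        try:
            # The handler runs unchanged; its result or exception passes through.
            return handler(caller, *args, **kwargs)
        finally:
            collected.append({
                "api": api_name,
                "caller": caller,
                "elapsed": clock() - start,
            })
    return wrapped


def users_get_info(caller, uid):
    """Hypothetical API method standing in for a real REST handler."""
    return {"uid": uid, "name": "user%d" % uid}


handler = intercept("users.getInfo", users_get_info)
result = handler("app-123", 7)
print(result)
print(collected[0]["api"], collected[0]["caller"])
```

The wrapped handler returns exactly what the original returns, which matches the requirement that monitoring should not change the server's behavior.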
- Incoming requests to the Social tier
- Outgoing requests from the Social tier to a deployed application
- Incoming requests from a deployed application to the API tier
Good list.
One other would be SocialApiRender in renderLocal and renderRemote, but really RingsideSocialUtils getRequest
since at some point all requests to applications go through this mechanism.
1. whichApp was invoked (count)
2. elapsed time app invoked.
Rich
Mark Lugert wrote:
> below...
>
>
>> 1) where the social tier makes outbound requests to apps on behalf of a user
>>
> Not sure.
>
>> 2) where the social tier accepts inbound requests from external networks
>>
> I would expect you to put hooks into all the public endpoints for social:
> map.php
> proxyjs.php
> render.php
> trust.php
>
>> 3) where the web tier accepts inbound requests from client browsers
>>
> oneapp.inc probably? I think pretty much everything uses that.
>
>> 4) where the social tier renders a custom/facebook tag
>>
> RingsideSocialClientLocal has some render methods that are used by
> everything to initiate rendering I believe.
>
>> Also, give me any hints for any other good interception points you think
>> exist where we could collect some good stats.
>>
> Creating some Doctrine Interceptors could get you some good stats.
>
>
>> John Mazz
>>
>>
Managed Resources
From a management perspective, we will break up the Ringside Social Application Server into separate "managed resources". Each managed resource will be able to have its own statistics/metrics, configuration, operations, events, etc. associated with it. The Ringside Server will consist of the following managed resources:
- Ringside Server - this is the server itself, the software that is deployed within an Apache instance and contains all of its child services:
- Applications - a deployed application
- Networks - a network that is known to the Ringside server
- User - a member of the social network
- API - a deployed REST API
- Tags - a deployed tag that can be used to render content
Implementation Details
This section will briefly discuss the M3 implementation. These are the main packages in the M3 module (found under the PHP include path in ringside/m3/):
| Package | Description |
| event | Provides a generic event dispatcher framework |
| db | Files that are used to store data to a database |
| metric | Generic classes that provide metric data |
| paging | Classes used to page through data |
| util | Miscellaneous utility classes used throughout M3 |
Event
This is an event dispatcher framework. Other subsystems (like the API or Social tiers) use the DispatcherFactory to create dispatchers and send events with them. Dispatchers send events to listeners. Listeners can be created by the factory too, but users of dispatchers don't need to worry about that; the factory will create the listeners for you. There is a base, generic IEvent class that represents any event. An IListener receives the events that a dispatcher dispatches. There are going to be a few specialized events that can be used by those emitting events - for example, a ResponseTimeTupleEvent represents a duration of time taken by "something" for a user/app/network tuple request. (Side note: a "tuple" represents a user request, where the "tuple" data is the user ID/application ID/network ID.)
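The relationship between the factory, dispatchers, listeners and events might look roughly like the following. This is a Python sketch of the shape only, not the PHP implementation: DispatcherFactory and ResponseTimeTupleEvent are named in the description above, but every method name, constructor argument and the CollectingListener class are assumptions.

```python
class ResponseTimeTupleEvent:
    """Time taken by some activity for a user/application/network tuple."""

    def __init__(self, user_id, app_id, network_id, seconds):
        self.user_id = user_id
        self.app_id = app_id
        self.network_id = network_id
        self.seconds = seconds


class CollectingListener:
    """Hypothetical listener that simply remembers every event it receives."""

    def __init__(self):
        self.events = []

    def on_event(self, event):
        self.events.append(event)


class Dispatcher:
    """Sends each event to every listener it was created with."""

    def __init__(self, listeners):
        self._listeners = list(listeners)

    def dispatch(self, event):
        for listener in self._listeners:
            listener.on_event(event)


class DispatcherFactory:
    """Creates dispatchers already wired to listeners, so callers never build listeners."""

    def __init__(self):
        self.listener = CollectingListener()

    def create(self):
        return Dispatcher([self.listener])


# A tier only asks the factory for a dispatcher and sends events with it.
factory = DispatcherFactory()
dispatcher = factory.create()
dispatcher.dispatch(ResponseTimeTupleEvent("user-1", "app-1", "net-1", 0.25))
print(len(factory.listener.events))
```

The emitting tier never touches a listener directly, so what happens to an event (stored, aggregated, dropped) can change without touching the emitting code.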
DB
These are specialized files used by M3 when it needs to store data to the database.
Metric
No matter where M3 stores its metric data, there is some generic functionality that is required to obtain metric information. We abstract out the backing-store agnostic functionality to classes found in this package.
Paging
Metric data can be voluminous. As the metric data is collected and grows large, we will run into memory issues if we attempt to load it all in memory. Therefore, we need our M3 classes to support paging through that data. This package provides some objects that are used to facilitate paging through data and cleaning the data when appropriate.
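The memory concern can be illustrated with a page-at-a-time loop that only ever holds one page of metric data. The sketch is in Python for brevity, and the page_through helper and the fetch_page(offset, limit) callback are assumed shapes, not the actual classes in the paging package.

```python
def page_through(fetch_page, page_size):
    """Yield one page at a time until the data source is exhausted."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        if not page:
            return
        yield page
        if len(page) < page_size:
            return  # a short page means there is nothing left to fetch
        offset += page_size


# Stand-in data source: 25 metric values that would normally live in a database.
rows = list(range(25))


def fetch_page(offset, limit):
    """Return at most `limit` rows starting at `offset`."""
    return rows[offset:offset + limit]


page_sizes = []
total = 0
for page in page_through(fetch_page, page_size=10):
    page_sizes.append(len(page))
    total += sum(page)  # aggregate without keeping earlier pages around

print(page_sizes, total)
```

Aggregates such as totals or averages can be computed incrementally this way, so the full data set never has to be loaded at once.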
Util
Just some miscellaneous classes, like a stopwatch, generic file utilities, and a class used to obtain M3 configuration settings.
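As an example of the kind of utility meant here, a stopwatch could be as small as the sketch below. It is Python rather than PHP, and the class shape (start, stop, an injectable clock) is an assumption for illustration, not the actual M3 utility class.

```python
import time


class Stopwatch:
    """Accumulates elapsed time across one or more start/stop intervals."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock      # injectable so tests can supply a fake clock
        self._started = None
        self.elapsed = 0.0

    def start(self):
        self._started = self._clock()

    def stop(self):
        """Stop the current interval and return the total elapsed seconds."""
        if self._started is None:
            raise RuntimeError("stop() called before start()")
        self.elapsed += self._clock() - self._started
        self._started = None
        return self.elapsed


# Drive the stopwatch with a fake clock so the result is deterministic.
ticks = iter([10.0, 12.5])
watch = Stopwatch(clock=lambda: next(ticks))
watch.start()
measured = watch.stop()
print(measured)
```

Passing the clock in makes the class easy to test and keeps the choice of time source in one place.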