Rationale
Foreword
The PageBox project aims to provide components for distributing Presentations on the Internet and on intranets.
In this document, we present:
Definitions
- Presentation
We define a presentation here as a set of components able to generate and format what is displayed on a user equipment.
A presentation calls content producers to get the content that it processes.
A presentation as described here handles only HTTP/HTTPS though an intermediate gateway can convert from HTTP to another protocol
such as WAP. A presentation can generate a flow in miscellaneous formats such as XML, HTML, XHTML, WML, PDF, SVG and SWF.
A presentation calls a content producer using a network protocol such as XML over HTTP, SOAP or IIOP.
It can have a function of content adaptation.
- PageBox
PageBox is a means of enabling the hot deployment and update of presentations in Application Servers.
A PageBoxed Application Server (PAS) behaves like a browser: it downloads a presentation from a repository just
as a browser downloads an applet and, like a browser, a PAS runs the presentation in a sandbox with rights based
on the presentation signature.
Unlike a browser, a PAS downloads or updates a presentation only when commanded through an HTTP request.
PageBox has been designed for Internet deployment and for Application Service Providers (ASPs).
- Repository
A PageBox can subscribe to one or many repositories.
When a new archive is published on a repository, it is automatically deployed on all PageBoxes that have subscribed
to the repository.
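The subscribe/publish behaviour described above can be sketched as a simple observer pattern. This is only an illustration under stated assumptions: `Subscriber` and `RepositorySketch` are hypothetical names, not part of PageBox, and a real repository would notify PageBoxes over HTTP rather than through in-memory calls.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for a subscribed PageBox; not the PageBox API. */
interface Subscriber {
    void deploy(String archiveName);
}

/** Minimal in-memory repository: publishing an archive fans out to every subscriber. */
class RepositorySketch {
    private final Set<Subscriber> subscribers = new LinkedHashSet<>();
    private final List<String> archives = new ArrayList<>();

    void subscribe(Subscriber pageBox) {
        subscribers.add(pageBox);
    }

    /** Stores the archive name, notifies each subscriber, and returns how many were notified. */
    int publish(String archiveName) {
        archives.add(archiveName);
        for (Subscriber s : subscribers) {
            s.deploy(archiveName);
        }
        return subscribers.size();
    }

    /** Returns a copy of the names of the archives published so far. */
    List<String> archives() {
        return new ArrayList<>(archives);
    }
}
```

A PageBox that subscribes to several repositories would simply be registered with several `RepositorySketch` instances.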
- Constellation
A constellation is a set of PageBoxes and repositories with a trust relationship.
Whereas PageBoxes and repositories have a physical existence (they are hosted somewhere and run PageBox code) a constellation
is an organizational entity. A constellation can be characterized by:
- A list of repositories
- A common security mechanism allowing subscription and publication checking
- Mapper
The mapper is a mechanism embedded in PageBox that modifies links in specified static pages (HTML, XHTML),
named routing pages.
A user enters a presentation by specifying a well-known URL, handled by one PageBox or a small cluster of PageBoxes.
When the user clicks on a routing page, links can be modified to:
- Select a PageBox closer to the user
- Load-balance requests between PageBoxes
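To make the link rewriting concrete, here is a minimal sketch of how a routing page could be processed. It assumes a simple round-robin choice among candidate hosts; the class name `MapperSketch`, the regular expression and the host names in the usage note are illustrative assumptions, not the actual Mapper implementation.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MapperSketch {
    // Matches absolute http/https hrefs, capturing the prefix, the host and the rest of the attribute.
    private static final Pattern HREF =
            Pattern.compile("(href=\"https?://)([^/\"]+)([^\"]*\")");

    /**
     * Replaces wellKnownHost in every absolute href with a host picked
     * round-robin from targets. Links to other hosts are left untouched.
     */
    static String rewrite(String html, String wellKnownHost, List<String> targets) {
        if (targets.isEmpty()) {
            return html; // nothing to route to
        }
        Matcher m = HREF.matcher(html);
        StringBuffer out = new StringBuffer();
        int next = 0;
        while (m.find()) {
            String host = m.group(2);
            if (host.equals(wellKnownHost)) {
                host = targets.get(next++ % targets.size());
            }
            m.appendReplacement(out,
                    Matcher.quoteReplacement(m.group(1) + host + m.group(3)));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

For example, calling `MapperSketch.rewrite(page, "www.example.com", hosts)` with two candidate hosts alternates between them for each matching link. Selecting the PageBox closest to the user would replace the round-robin choice with a lookup based on the requestor's country or domain.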
Analysis
- Performance
Today to offer a Graphical User Interface a company must either:
- Write a Web application or
- Write a graphical front end
- Web application
Advantages:
- A Web application is easier to write and to maintain than a graphical front end. It also requires less skill.
- A Web application is a central application, so it is easy to deploy and update.
Drawbacks:
- Because a Web application is a central application, all application parts (presentation, business logic,
data caching and data access) run on a small set of servers. Resources (memory, CPU and disks) are more expensive
on large servers than on small computers
- Browsers are used to display Web application pages and these pages are downloaded using HTML or XML over HTTP.
Here the main drawback is that presentation is downloaded with data. As a consequence Web applications require more bandwidth
than applications invoked by graphical front ends
Web Applications are successful and serve the end-consumer market well, where availability and response time requirements
are lower. The end consumer neither pays nor is paid to use the application, but we think the major point here is that she or he
is an occasional user. Compared to a professional user, she or he is still a beginner and therefore slower.
- Graphical front end
Advantages:
- From a communication point of view, a graphical front end is the client part of a client/server application. It can use client/server protocols such as EJB over RMI/IIOP, which carry only data and require less bandwidth than Web applications
- It runs the presentation on the client and requires fewer resources on the server, where they are expensive
Drawbacks:
- A graphical front end is harder to develop and to maintain. It is more demanding in project management and developer skills.
This complexity is hard to reconcile with time-to-market constraints and frequent changes
- A graphical front end is hard and expensive to deploy and update on a large number of devices
- A third way: PageBox
The company still writes a Web Application, but the Web Application is deployed automatically on a large number of inexpensive
PageBox hosts. When a user makes a request, it is routed to a PageBox on her or his side or close to it. As a consequence:
- The graphical application is easy to write and update as it is a regular Web Application
- Deployment is the responsibility of the infrastructure. From the company's point of view it is no more than
a repository publication.
- As the PageBox is close to the user, the response time is better. We believe we can achieve a consistent sub-second response
time.
- PageBoxes run on inexpensive platforms
- The bandwidth requirement is the same as for a graphical front end
- Existing infrastructures
Such infrastructures exist today for static, non-customizable content.
Commercial infrastructures
A good example is Akamai. We recommend reading their white papers on the issue.
Another good example is Inktomi. They have an excellent Flash presentation of the subject and also white papers.
You can find them here.
Components
Examples of components are proxies aka Web caches such as Open Source Squid.
Here a set of users configure their browser to use a proxy. They share the proxy, which means that if user A has downloaded
a page, user B, who asks for the page later, is served by the proxy and no longer by the HTTP server.
Proxies can run in routers. They can cooperate with other proxies and be used to build caching infrastructures.
The analysis of the current offering suggests we must consider three aspects:
- Providing a component set allowing the distribution of presentations
- Implementing infrastructures we call constellations for presentation deployment
- Interfacing with existing solutions for static content. A solution optimized for static content will always handle
static content better than a general-purpose PageBox.
Internet
When we started working on the PageBox concept, we were focused on this technical/performance problem.
Now we think that there is another aspect to consider. The Internet used to be a network, a pipe that linked clients and servers
hosted on its border. Today numerous ASPs offer Web Hosting. An advantage of Web Hosting is that ASPs have bigger links than most
companies. In some cases they are also ISPs. We can see them as a part of the infrastructure and note that the Internet is becoming
a value-added network.
Therefore we see PageBox as a technology enabling the Internet to host presentations.
Let's see why it is possible and why it is good.
- Availability of mature PKIs
We have the following security needs:
- The repositories must be able to check the identity of the Presentation providers and of the subscribing PageBoxes.
- The PageBoxes must be able to run the Presentations in a sandbox and to check the identity of the repositories.
- The Presentations must be able to check the identity of the data providers and - as usual - of the users.
- The data providers must be able to check the identity of the Presentations.
We can address these needs with data providers, Presentation providers, PageBox hosts and repositories managed by independent
organizations thanks to Public/Private key infrastructure and independent Certificate Authorities such as Verisign.
- Move to standard protocols
XML over HTTP, XML over SMTP, SOAP and IIOP have free interoperable implementations.
This should also be the case for the upcoming XML Protocol (XMLP).
The Data Providers can publish their interfaces using DTDs, W3C schemas and IDL files. Then their applications become
Web Services.
The combination of XML over HTTP (synchronous) and XML over SMTP (asynchronous) is the most promising solution:
- HTTP and SMTP are firewall proof
- HTTP over SSL is free, mature and available on all environments
- SMTP is ubiquitous
- Non-repudiation can be easily implemented using a digital signature encrypted using the free JSSE
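As an illustration of the synchronous case, the following sketch shows how a presentation might call a content producer using XML over HTTP with the standard `HttpURLConnection` class. The class name `XmlOverHttpSketch`, the endpoint URL and the message content are assumptions made for the example; the source does not specify the message format.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

class XmlOverHttpSketch {
    /** POSTs an XML document to the endpoint and returns the response body as a string. */
    static String call(String endpoint, String xmlRequest) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(endpoint).toURL().openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "text/xml; charset=UTF-8");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(xmlRequest.getBytes(StandardCharsets.UTF_8));
        }
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }
}
```

Switching the endpoint to an `https://` URL would use HTTP over SSL without changing the calling code.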
RosettaNet Implementation Framework (RNIF) and ebXML
are good examples of protocols using XML over HTTP or SMTP. Messages have this format:
[Figure: XML protocol]
SOAP and XMLP provide standard APIs to generate and parse such messages.
Web service descriptions can be stored in and retrieved from a Universal Description, Discovery and Integration (UDDI) repository.
Note that queries to UDDI themselves use SOAP. UDDI is important for presentation providers. It allows:
- Listing possible data providers
- Finding how to invoke chosen providers
UDDI is defined to allow different technologies to work together. It supports describing any socket, IIOP or XML over HTTP binding.
However, today only the SOAP and HTTP bindings are standardized with the Web Services Description Language (WSDL).
You can download its specification from the IBM or
Microsoft sites.
- Data source/data presentation link
Today the company that owns the data also provides the presentation allowing accessing them.
We think that it is bad because:
- It hinders competition. Customers have to use the presentation of the data provider.
The other drawbacks we enumerate below are usual drawbacks of markets without competition.
- Data providers become two-headed companies instead of focusing on their core business.
- There is no incentive to improve processes. A data provider gets money only when a user decides to pay for her or his shopping cart,
whereas every customer action, such as an availability check, has a processing cost, often higher than the final purchase.
- The end consumers don't get a fair share of the process enhancement.
Big data providers
Producing large amounts of data requires time and money. Big data providers have had computers for a long time and often still
produce most of this data on mainframes. So their presentations are already PageBox-like presentations:
- They call other systems to get or update data.
- Therefore they manage only a little data on their own.
Big data providers would spare money by focusing on the delivery of gateways handling industry-standardized XML messages.
Small data providers
The case of small data providers is different: their application server relies on databases and only marginally calls other systems.
They have two problems:
- To match the competition, they have to invest more and more in Presentation instead of focusing on their core business
- As they have little data to offer, they cannot compete with big data providers.
Presentation providers can merge data of many small providers to build a comprehensive offer and address the second issue.
Small data providers can use their existing application server to handle XML data and address the first issue.
Presentation providers
What we said about data providers is not related to PageBoxes:
data providers exist and are already moving to XML to address B2B needs.
On the other hand, Presentation providers are new:
- A Presentation provider is primarily a software company as Presentation administration is handled by Constellations.
- It can specialize in a type of presentation, for instance mobile phones and PDAs, or Flash and SVG.
- It accesses data providers through their access points using their published messages.
This analysis considers long-term perspectives. We think that we needed to do so in order to define the right target and
clarify PageBox goals. We hope that it helps to understand the short-term product.
Target and status
- Roles
Data providers
Data providers provide access to their data in a standard protocol.
Presentation providers
Presentation providers write presentations and publish them to Presentation repositories.
A presentation can call one or many Data providers. It can access its own data but it should be mostly read-only.
A presentation is deployed on a large number of PageBoxes, so a Presentation provider cannot assume a given user
will always be served by the same PageBox, except for the duration of an Application Server session.
Users
Users access presentations using browsers.
Presentation repositories
Presentation repositories accept subscription requests from Presentation hosts and publication requests from Presentation
providers. When a Presentation repository receives a Publication request, it notifies subscribing PageBoxes that download the
presentation.
Presentation hosts
Presentation hosts host Presentations in PageBoxed application servers.
- Infrastructure
Data Provider directories
Data Provider directories use UDDI protocol and allow Presentation Providers to discover information about Web services:
- Data Provider access point
- Data Provider services (messages, functions)
UDDI has been designed to address Business to Business (B2B) needs.
As a Presentation Provider is a regular trading partner, UDDI satisfies PageBox requirements.
Presentation hosts
We developed a PageBox named JSPservlet under GPL 2. Its source is on the SourceForge site in the pagebox repository.
We distinguish two sorts of Presentation hosts:
"Turn key" hosts
These hosts are Network Appliances and routers.
We consider that our JSPservlet for Application Servers addresses the Network Appliance environment:
- Linux or similar Operating System
- Support for free (Tomcat) or inexpensive (Resin) Application Servers
- Enough resources (256 MB or more)
As resources are more expensive on routers, we developed a JSPservlet for embedded servers to reduce the footprint,
and then a diskless version for devices without disks, still running on embedded servers.
JSPservlet for embedded servers runs on Sun Java Embedded Server and can be ported with minimal effort to other embedded servers.
These embedded servers conform to the Open Services Gateway Initiative (OSGi) specification. You can download the specification from
the OSGi site. An interesting aspect is that it primarily targets home gateways,
which can be of interest to specialized Presentation providers.
For instance, a game publisher could use PageBoxes to host Web archive based games. In that case, the consumer subscribes to the
publisher's repository and her or his home PageBox downloads games from the repository.
An interesting aspect of our implementation is that it supports standard Web Applications, including
JSP 1.1 tag libraries and JSP beans. The only thing we failed to hide is that JSPservlet for embedded servers inherits from JES 2
the support of Servlet specification 2.1, not 2.2 as on Application Servers.
Application Server hosts
Application Server hosts are hosts that already operate Application Servers.
We distinguish three sorts of Application Server hosts:
ASPs
We see them as our main short-term target as they make the creation of constellations relatively inexpensive.
A company could be charged something like $650-1500 a month for a world-wide constellation of fifty PageBoxes - roughly 1/3-1/2
of the hosting cost of a dedicated server. At this cost, it would get a highly redundant solution and provide a better response
time.
ASPs that support Java use JServ (an Apache module), Tomcat and Resin. That is why we developed JSPservlet on
Tomcat and Resin. We are considering a JServ version, though it will have the same drawback as the embedded server version: support of an
old servlet specification.
Organizations such as universities
We believe these organizations mostly run the same Tomcat and Resin as ASPs and therefore JSPservlet for Application
Servers should meet their needs. We think it could be interesting to create cross-university constellations to reduce
network bills and enhance response time.
Companies
Companies use Tomcat and Resin but also commercial application servers. As we explain below, JSPservlet is designed to be highly
portable. If you run into a problem porting JSPservlet to your Application Server, please contact us. On the other hand, if you
have made the port, let us know and if possible send us the code.
Companies can use PageBoxes in four ways:
- Using a public constellation.
- Creating a private Internet constellation.
- Creating a private network constellation.
- Using standalone PageBoxes. Consider a simple B2B issue: Company A has developed a Web Application that it sells to Company B.
Company A keeps the data access part on its side and deploys the presentation part on a PageBox hosted at Company B.
The return on investment can come quickly if the companies would otherwise need to upgrade their link.
This is enough if Company B is a single-site medium or small company.
If Company B is a corporation with 2000 offices around the world, then it is in its interest to deploy a private constellation
and to allow Company A to publish its presentation on Company B's repository.
Today JSPservlet is a regular Web application. Though a closer integration could have some value, we found that this approach has significant
advantages:
- The Application Server provides support for servlets and JSPs.
- It is easy to port JSPservlet to another application server.
- It is easy to install JSPservlet on a machine where an application server is installed.
Users don't need to know the URL of the closest PageBox. Thanks to the Mapper component they can enter the well-known URL
of their application hosted in a PageBox. Then they are automatically routed to one of the closest PageBoxes.
The mechanism reliably identifies:
- The country
- Whether there are PageBoxes on the same domain as the requestor.
Mapper can be customized to suit a specific need more closely.
We plan to complement Mapper with the integration with a Web cache - Squid.
Repositories
We implemented a comprehensive repository tool, PublisherServer. It provides:
- Handling of subscription requests in HTTP
- Handling of publication requests in HTTP, sent by a Java application, PublisherClient
- A set of administrative pages to check the state of the subscriptions and publications and to display repository archives
- Fault-tolerant mechanisms to retry publication, unpublication and unsubscription when a PageBox is broken
PublisherServer runs in an Application Server.
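The retry behaviour mentioned in the list above can be pictured as a queue of pending notifications that is replayed until each target PageBox answers. This is a minimal sketch under that assumption; `RetryQueueSketch` is a hypothetical name and the code does not reflect how PublisherServer actually stores or schedules its retries.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Predicate;

class RetryQueueSketch {
    // Notifications (publication, unpublication, unsubscription) not yet delivered.
    private final Deque<String> pending = new ArrayDeque<>();

    void enqueue(String notification) {
        pending.addLast(notification);
    }

    /**
     * Attempts each pending notification once. The send predicate returns true
     * when the target PageBox accepted it; failed notifications are kept for
     * the next pass. Returns the number of notifications still pending.
     */
    int retryOnce(Predicate<String> send) {
        int count = pending.size();
        for (int i = 0; i < count; i++) {
            String item = pending.removeFirst();
            if (!send.test(item)) {
                pending.addLast(item);
            }
        }
        return pending.size();
    }
}
```

A scheduler would call `retryOnce` periodically, so a PageBox that was down at publication time receives the archive once it is back.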
We plan to implement access control functions. We favor simplicity of use, so we probably will use a password mechanism.
Constellations
We plan to implement a test constellation.
Any help is welcome but we would especially appreciate free hosting and even more free hosting in Europe,
Asia and Pacific.
Security
From the beginning we considered security as one of the most important issues PageBox had to address.
We have implemented sandboxes and SSL support. We still have some features to implement, such as secured publishing, but
they carry no technical risk.
The most important point is that our implementation can and should be able to check archive credentials with a Certificate
Authority.
More specifically, PageBox should be configured to query an LDAP CA to check that:
- The archive certificates were signed by a trusted CA
- The archive certificates were not revoked
This check implies a cost that is probably negligible for commercial use.
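The two checks (trusted signer, no revocation) map naturally onto Java's standard PKIX certification path API. The sketch below is an assumption about how such a check could be written, not the PageBox code: `ArchiveTrustSketch` is a hypothetical name, and how revocation data is fetched (for example from an LDAP directory) depends on the JDK and CA configuration and is not shown.

```java
import java.security.GeneralSecurityException;
import java.security.InvalidAlgorithmParameterException;
import java.security.cert.CertPath;
import java.security.cert.CertPathValidator;
import java.security.cert.PKIXParameters;
import java.security.cert.TrustAnchor;
import java.util.Set;

class ArchiveTrustSketch {
    /** Builds validation parameters that trust only the given CAs and require revocation checking. */
    static PKIXParameters strictParameters(Set<TrustAnchor> trustedCas)
            throws InvalidAlgorithmParameterException {
        PKIXParameters params = new PKIXParameters(trustedCas);
        params.setRevocationEnabled(true); // reject revoked archive certificates
        return params;
    }

    /** Returns true only if the archive's certificate chain validates against the parameters. */
    static boolean isTrusted(CertPath archiveCertPath, PKIXParameters params) {
        try {
            CertPathValidator.getInstance("PKIX").validate(archiveCertPath, params);
            return true;
        } catch (GeneralSecurityException e) {
            return false; // untrusted signer, revoked certificate, or missing revocation data
        }
    }
}
```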
For test purposes on the coming Ursa Minor constellation, we plan to support test
certificates issued by Verisign and Thawte.
It would be useful to have our own Certificate authority for a public, cheap or free constellation.
There are some issues. One of them is a certificate is only valid if the CA can check the identity of the requestor.
Mail address is not an option. A credit card number is... We would be happy if someone could help us to provide a free or
cheap CA.
Goals
- Standard conformant
- Public domain reference implementation
This reference implementation must also leverage Public domain products such as Tomcat or Cocoon and avoid competing
with other Public domain products.
- Secure
- Reliable
- Cheap
We try to reduce cost in three areas:
- Setup cost. We already developed helper servlets.
- Security management. This part is not completed but our goal is to provide an automated security management.
- Troubleshooting. We already provide log display through HTTP.
- Thin and fast
For the moment we have been successful in that area. PageBox code is small and has no significant overhead even
when it sandboxes Presentations.
Technology
In the current implementation, JSPservlet, we use the Java language and J2EE technology.
We are very satisfied for the following reasons:
- Functionality. We especially used:
- Class loaders
- Java 2 security
- Productivity
- Portability
- Availability of robust and scalable application servers in Open Source
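To illustrate the kind of rights a sandboxed Presentation might receive, the sketch below uses the standard Java 2 permission classes. The class name `SandboxRightsSketch`, the directory and the host are hypothetical, and the granted rights are an assumption rather than PageBox's actual policy. Recent JDKs deprecate the Security Manager, so this only models the rights check and does not enforce it.

```java
import java.io.FilePermission;
import java.net.SocketPermission;
import java.security.Permission;
import java.security.Permissions;

class SandboxRightsSketch {
    /**
     * Builds the rights granted to one presentation: read access to its own
     * directory tree and an outgoing connection to one data provider.
     */
    static Permissions rightsFor(String presentationDir, String dataProviderHost) {
        Permissions granted = new Permissions();
        // "/-" covers all files and subdirectories, recursively
        granted.add(new FilePermission(presentationDir + "/-", "read"));
        granted.add(new SocketPermission(dataProviderHost + ":443", "connect"));
        return granted;
    }

    /** Returns true if the requested permission is covered by the granted set. */
    static boolean allowed(Permissions granted, Permission requested) {
        return granted.implies(requested);
    }
}
```

With these rights, a presentation could read its own pages but could neither write to them nor read files elsewhere on the host.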
Our conviction is that PageBox has to be written in some form of VM-run or managed code.
The only appropriate environment besides Java is .NET, and we are seriously considering this option to support
ASP+/VB/C# development.
We also want to support PHP development.
We have checked that the current implementation supports:
- Java servlets and JavaServer Pages, taglibs, beans and classes
- XML, XSL, XSP with Cocoon, Xerces and Xalan libraries
It should also support the same scripting languages as the Bean Scripting Framework (BSF):
Netscape Rhino (JavaScript), VBScript, Perl, Tcl, Python, NetRexx and Rexx.
Contact:support@pagebox.net ©2001-2004 Alexis Grandemange.