Welcome to the Pubwan mini wiki at Scratchpad!

Welcome to pubwan

Pubwan is a proposed project in open-source, non-profit data mining. Applications to consumer economics, consumer education and consumer decision support are of especial interest. Activities of interest include data collection, data aggregation, encoding of normative criteria, data analysis, mathematical modeling and, if necessary, reverse engineering.

Rationale behind pubwan

There is an unmet need for aggregation of tabular data in the public domain. Software has had a presence in the public domain for some time, and now there is a lot of text in the public domain thanks to Wikipedia and other GFDL works. What is still lacking in the public domain is the type of machine-readable commercial data (there is a good deal of scientific data) suitable for analysis with database or spreadsheet applications, or data visualization tools.

Strategies for implementing pubwan

Volunteer recruiting

Keeping pubwan in the non-profit sector will require volunteer labor and funding primarily (if not exclusively) by individual donors. Data mining in service to consumer decision support, in itself, is nothing new. The catch is that for-profit services of the 'shop bot' variety treat their database content as proprietary. Information is dispensed one data point at a time. No claim of objectivity is made, and in fact 'exclusivity deals' with vendors are promoted as a selling point. So, a business intelligence engine that answers to consumers has to be controlled by consumers, and cannot be run as a business.

Volunteer recruiting is a challenge for any non-profit organization. People must be persuaded that they can afford to offer time or money, neither of which is ever plentiful. People must also be persuaded that their contributions can make a real difference. For the pubwan movement, this means overcoming certain compelling objections. Given the existence already of automation-leveraged comparison shopping via commercial shop bots, people need reasons to believe that a volunteer-run alternative really can provide a significantly higher-resolution map of the market. The effectiveness of volunteer effort depends largely on whether the movement ever grows to a critical mass, which is to say a large enough number of participants for network effects to kick in.

Data collection

People volunteer information about their consumer behavior in copious amounts. They do this by the use of credit and debit cards, 'loyalty cards' of retail establishments, purchasing RFID-tagged items, filling out warranty cards, allowing cookies to be copied onto their computers, and by countless other means. Even those who gripe about loss of privacy participate as often as not, because the business community has made volunteering information a path of lesser resistance than not volunteering information. The pubwan movement does not have the infrastructure to direct information its way by such means. People must be persuaded to put some amount of effort into feeding information to pubwan, which of course is a form of volunteer work. People must also be persuaded to sacrifice a certain amount of privacy. Hopefully this can be accomplished by pointing out the fact that privacy is a lost cause because it simply isn't technologically feasible, and that pubwan has the potential to offer a large return in market transparency for a small investment of sacrificed privacy.

Mathematical modeling

Since one objective of pubwan is to create a higher resolution map of the consumer marketplace than is currently publicly available, its technical arsenal must include some tools of mathematical modeling.

Maxhi schema

This schema is based on the assumption that there are some quantities one wishes to maximize and others one wishes to minimize. Additionally, it is assumed that some parameters of a transaction are prioritized over others. Maxhi schema is simply a notational shorthand for labeling parameters as desirable or undesirable (max or min) and high or low (hi or lo) priority. 'Parameters,' in this context, may represent quantity of product supplied, price or performance specifications. Maxhi schema can also be used on 'boolean' (yes/no) parameters such as presence or absence of certain product features, or absence or presence of 'strings attached' to a transaction.
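As a rough illustration, the labeling could be sketched in Python as follows. The class names, the example parameters and the rule of comparing hi-priority parameters before lo-priority ones are all assumptions made for this sketch; the schema itself only specifies the max/min and hi/lo labels.

```python
from dataclasses import dataclass
from enum import Enum


class Direction(Enum):
    MAX = "max"  # more is better
    MIN = "min"  # less is better


class Priority(Enum):
    HI = "hi"
    LO = "lo"


@dataclass(frozen=True)
class MaxhiLabel:
    direction: Direction
    priority: Priority

    @property
    def tag(self) -> str:
        # Shorthand such as "minhi" or "maxlo".
        return self.direction.value + self.priority.value


def sort_key(offer, schema):
    """Build a key that compares hi-priority parameters before lo-priority ones."""
    hi, lo = [], []
    for name, label in schema.items():
        value = float(offer[name])  # booleans become 0.0 or 1.0
        # Negate 'max' parameters so that a smaller key is always better.
        signed = -value if label.direction is Direction.MAX else value
        (hi if label.priority is Priority.HI else lo).append(signed)
    return (tuple(hi), tuple(lo))


# Hypothetical schema: price matters most; quantity and a yes/no feature less.
schema = {
    "price": MaxhiLabel(Direction.MIN, Priority.HI),
    "quantity": MaxhiLabel(Direction.MAX, Priority.LO),
    "fair_trade": MaxhiLabel(Direction.MAX, Priority.LO),
}

# Hypothetical offers, invented for illustration only.
offers = [
    {"name": "A", "price": 3.0, "quantity": 500, "fair_trade": False},
    {"name": "B", "price": 3.0, "quantity": 750, "fair_trade": True},
    {"name": "C", "price": 2.5, "quantity": 400, "fair_trade": False},
]

ranked = sorted(offers, key=lambda offer: sort_key(offer, schema))
ranked_names = [offer["name"] for offer in ranked]
```

Here offer C ranks first on price alone, and the tie between A and B is broken by the lo-priority parameters. Other ways of combining the labels (weighted scores, for instance) would fit the schema equally well.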

Swar schema

This is the idea of using spatial modeling to visualize data. It is exemplified by social network theory, spring embedding and other graph visualization techniques.
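A minimal sketch of spring embedding, in plain Python, might look like the following. The function name, the force constants and the example graph are assumptions chosen for illustration; real graph-visualization tools use more refined variants of the same idea. Linked nodes are joined by springs, every pair of nodes repels, and positions are nudged repeatedly until the layout settles.

```python
import math


def spring_layout(nodes, edges, iterations=300, rest_length=1.0,
                  repulsion=0.1, step=0.05, max_move=0.1):
    """Return {node: (x, y)} after a simple spring-embedding relaxation."""
    # Start on a unit circle so the result is deterministic.
    n = len(nodes)
    pos = {
        v: [math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n)]
        for i, v in enumerate(nodes)
    }
    linked = {frozenset(e) for e in edges}

    for _ in range(iterations):
        force = {v: [0.0, 0.0] for v in nodes}
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[v][0] - pos[u][0]
                dy = pos[v][1] - pos[u][1]
                dist = math.hypot(dx, dy) or 1e-9
                # Every pair repels; linked pairs also feel a spring.
                pull = -repulsion / dist ** 2
                if frozenset((u, v)) in linked:
                    pull += dist - rest_length
                fx, fy = pull * dx / dist, pull * dy / dist
                force[u][0] += fx
                force[u][1] += fy
                force[v][0] -= fx
                force[v][1] -= fy
        for v in nodes:
            mx, my = step * force[v][0], step * force[v][1]
            length = math.hypot(mx, my)
            if length > max_move:  # cap the move to keep the update stable
                mx, my = mx * max_move / length, my * max_move / length
            pos[v][0] += mx
            pos[v][1] += my
    return {v: tuple(p) for v, p in pos.items()}


# Hypothetical three-node chain: shopper - store - supplier.
nodes = ["shopper", "store", "supplier"]
edges = [("shopper", "store"), ("store", "supplier")]
layout = spring_layout(nodes, edges)
```

In the resulting layout the two linked pairs sit closer together than the unlinked shopper and supplier, which is the property that lets spatial distance stand in for relatedness.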

Hybrid schema

This is simply a combination of the maxhi and swar schemata.

Data mining

Data mining entails the care and feeding of large data sets, another reason why critical mass is probably the main obstacle to implementing pubwan. Data mining has a sister discipline called knowledge discovery. This results from the idea that there may be powerful insights to be gained by recognizing patterns in large data sets. The motives behind conceptualizing pubwan have included the desire to discover knowledge about elasticities of supply and demand, and personal utility functions plotted from empirical data. Of especial interest is the discovery of empirical methods to measure quantitatively how 'steep' is the tradeoff between competing objectives, especially efficiency objectives (essentially bang per buck ratio) and 'normative' objectives, which are public spirited in motivation. Examples of the latter might include minimizing carbon footprint, minimizing packaging, looking for the union label, and really anything implied (or at least claimed) by as-yet-untabulated information already in the public domain.
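To make the idea of measuring elasticity from empirical data concrete, here is a small sketch using the standard midpoint (arc) elasticity formula. The function name and the two price/quantity observations are hypothetical; pubwan does not prescribe any particular estimator.

```python
def arc_elasticity(p1, q1, p2, q2):
    """Midpoint (arc) price elasticity of demand between two observations."""
    pct_quantity = (q2 - q1) / ((q1 + q2) / 2)
    pct_price = (p2 - p1) / ((p1 + p2) / 2)
    return pct_quantity / pct_price


# Hypothetical observations: weekly units sold at two shelf prices.
elasticity = arc_elasticity(p1=2.00, q1=120, p2=2.50, q2=90)
```

For these made-up numbers the result is -9/7, or about -1.29. A magnitude above 1 would indicate that demand for the item is price-elastic over that range. With many observations per item, the same calculation could be repeated across adjacent price points to trace out a demand curve.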

Existing projects and ideas relevant to pubwan goals

Hypothetical ideas

Database strategies

The key to implementing pubwan will be implementing a publicly accessible database. Unprivileged and hopefully anonymous read/write access must be accomplished, but data integrity must also be maintained. Disinformation will almost certainly be a problem.
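One conceivable way to blunt disinformation, offered purely as an assumption and not as part of any pubwan design, is to accept anonymous reports freely but publish only a robust summary of them, such as the median. The function name, the minimum-report threshold and the sample prices below are hypothetical.

```python
from statistics import median


def consensus_price(reports, min_reports=3):
    """Return the median reported price, or None if there are too few reports."""
    if len(reports) < min_reports:
        return None
    return median(reports)


# Hypothetical anonymous reports for one item; 0.01 is a deliberately bad entry.
reports = [2.49, 2.59, 2.49, 0.01, 2.55]
price = consensus_price(reports)
```

A single bogus entry does not move the median, although a coordinated flood of false reports still would, so this addresses only the simplest form of the problem.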

Consumerium (now defunct)

One of many projects to promote informed conscientious consumption. Being a wiki, it warehoused text, not tables. Never reached critical mass. Language barriers may also have stymied progress.

Dumpster Divulger

A proposed informational amplification of the time-honored practice of garbage picking. Those of us who do it know that things retrieved are always a small subset of things noticed as interesting. Dumpster divulger is about sharing such tips with the larger community for a more efficient aggregate harvest.

Allocation models

Participatory economics (parecon)

Brainchild of Michael Albert. Probably naïve as an end-run around the Invisible Hand due to gross overestimation of computational feasibility, but incorporates some key features of pubwan. Essentially participants are asked to provide a systematic run-down of what their interests are concerning both production and consumption. In theory both work assignments and allotments of consumer goods can be allocated with satisfactory efficiency based on the data thus provided. Parecon is said to be implemented and operational in various anarchist-publishing-collective type organizations in Chicago, Montréal and perhaps other places.


Conceptually similar to parecon; this is the brainchild of John C. Lawrence.

Partially or fully implemented ideas

Movements and activism


'Sur' and 'sous' are the French words for 'above' and 'below,' respectively. In information theory, transparency is the idea that daylight is the best disinfectant, which is to say that investigation is best empowered by information. Asymmetric transparency is when providing information is one party's prerogative, and leveraging it is another's. The classic example is the 'blind box' help wanted ad that invites you to send a resume to an anonymous PO box, and later might invite you to come fill out an application for enjoyment, in which you sign waivers of your privacy rights concerning your consumer credit history, your body fluids, your personality profile, etc. In optics, examples of asymmetric transparency include the one-way mirror at the primate research center, and the 'mirror shades' so popular among the police. Asymmetric transparency is the basis for a surveillance society, modeled locally as a panopticon, and writ large as total information awareness. Practitioners of sousveillance hope that widespread use of cameras and microphones by members of the public can serve as an antidote to the surveillance society.

In terms of the (mostly retail trade oriented) applications of pubwan suggested so far, pubwan is to the dupermarket loyalty card as sousveillance is to surveillance. When you volunteer information to the dupermarket to avoid paying their above-market 'nominal' prices, you participate in a vast data mining project potentially capable of discovering deep insights about consumer behavior, plotting high-resolution demand curves, and providing clues as to your personality type (apparently called clusters in the trade). The dupermarket loyalty card collects this information automagically. You may find it worthwhile to manually [sic] relay some or all of this data to pubwan, where, if the project succeeds beyond our loftiest expectations, you will have contributed to open-source data mining yielding insights about retailer behavior, supply curves, and how much extra the Invisible Hand thinks such value-added features as 'fair trade certified' or 'American made' are worth to you...

By systematically coding and tabulating your preferences and priorities using maxhi schema, you may even gain some insights concerning yourself!

Database strategies

Barcode wikia
Online UPC database

Allocation models

The Sen-type social welfare function

Hardware for data capture

CueCat™, CueHack, CueJack