Photo Wiki

Ward Cunningham proposed this idea in April 2000. It was imagined as one step toward a truly distributed (p2p) wiki.

Excerpts from this archive are reproduced below.

I've been wondering if it might be possible to make something a little like wiki for people who don't care to write but love to take pictures.

My first thought was an imagemap hack where you touch something in the image and it takes you there, or if nothing is mapped to that spot, asks you to find or describe where you wanted to go. Pictures could be uploaded with a description that would aid in search.

My worry is that pictures are too physical to make this work. First, the link density will be low because only so much shows up in any picture. And second, the distance between interesting pictures will be too great to travel there by clicking and downloading images over even a fast web.

Now I am thinking that I'd like to chop images into little pieces so that the pieces can be reassembled to make collages, perhaps through some emergent process. I'd be pleased if some sort of Monty Python humor surfaced. -- WardCunningham

The problem with using images only to provide context is that they are almost acontextual. It's Wittgenstein all over the floor. You need languages to express higher order thoughts, and give higher order context. If you take this another way, most photo albums come with captions to explain where the photo came from and why it is significant.

Now, pictures may be worth a thousand words, but that information is embedded mostly non-spatially. That is, things like contrast, colors, etc. provide information that an image map can't distinguish.

However, with low-context linking, you get an almost animalistic information structure. As if you couldn't think higher thoughts than, "a tree" and "another tree." (compared to, "let's file down the tree into toothpicks that I can sell to people in Uruguay.") This could be interesting, albeit non-rewarding for the casual reader. People like context.

Given a caption, however, this could be solved easily. It's just that this would barely distinguish a photo wiki from a normal Wiki. But then again, that's not necessarily a /bad/ thing, is it? Perhaps you could use images as the primary key instead of page names.

Another solution is to use a hypergraph instead of a plain old graph. A hypergraph distinguishes itself from normal graphs by allowing edges to be many-to-many connectors instead of one-to-one connectors. Then, one vertex could use one edge to connect to several other vertices simultaneously.

In this way, you could link a "smile" to a collection of all smiles.

The best way I know of representing a hypergraph is to have two types of vertices (let's call them nodes): content and structural. A content node would represent one image. A structural node would act as a collection or what have you to navigate between photos. Perhaps it would have thumbnails. -- SunirShah
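A minimal sketch of the two node types described above, in Python. The class names, the `neighbours` helper, and the example URLs are made up for illustration; nothing here is a settled design.

```python
from dataclasses import dataclass, field


@dataclass
class ContentNode:
    """One image in the photo wiki (hypothetical name)."""
    image_url: str


@dataclass
class StructuralNode:
    """A hyperedge: one labelled collection joining many content nodes."""
    label: str
    members: list = field(default_factory=list)

    def add(self, node):
        # Avoid listing the same image twice in one collection.
        if node not in self.members:
            self.members.append(node)


def neighbours(node, collections):
    """Every other image reachable from `node` through any shared collection."""
    found = []
    for coll in collections:
        if node in coll.members:
            for other in coll.members:
                if other != node and other not in found:
                    found.append(other)
    return found


# Link three photos to the collection of all smiles.
smiles = StructuralNode("smile")
a = ContentNode("http://example.org/a.jpg")
b = ContentNode("http://example.org/b.jpg")
c = ContentNode("http://example.org/c.jpg")
for photo in (a, b, c):
    smiles.add(photo)
```

Here one structural node connects a photo to several others at once, which is what a plain one-to-one edge cannot do.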

I like Sunir's idea of the hypergraph. Here is how I am thinking it might work.

Suppose I were browsing the photo wiki and saw a picture that reminded me of one of my own. So I upload my jpeg and then point to the feature on both pictures that they share in common, forever associating the two images. Now, if I am looking at an image that has been heavily linked, something (javascript rollovers?) will give me a sense of what features are linked, and how many times. If I click on a feature with half a dozen links, then I'll get a page with half a dozen images, all similarly linked. By retrieving a half dozen pictures at a time I am more likely to see something that speaks to me.

Consider how this might work if there were lots of photo wiki servers. When I link my photo to one on your server, my linking tool will notify your server so that future retrievals of your image will be properly linked to mine. The nature of that notification is just about the only bit of protocol upon which our two servers must agree. This could be as simple as a http POST to each server that included:

  • the url of the local image that is being linked
  • the region of that image that contains the feature
  • the url of the other image that has the feature

If we made regions a standard size then the second item could just be the click location on an image map. This is getting pretty simple.
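To make the three-item notification concrete, here is a rough sketch in Python. The field names (`local`, `region`, `other`), the 32-pixel region size, and the example URLs are all assumptions; the only thing taken from the description above is that the POST carries two image urls and a region.

```python
from urllib.parse import urlencode, parse_qs

CELL = 32  # assumed standard region size, in pixels


def click_to_region(x, y, cell=CELL):
    """Map a click location to the (column, row) of its fixed-size region."""
    return (x // cell, y // cell)


def link_notification(local_url, click, other_url):
    """Build the form-encoded body of the POST carrying the three items."""
    col, row = click_to_region(*click)
    return urlencode({
        "local": local_url,         # the image held by the server being notified
        "region": f"{col},{row}",   # the region containing the shared feature
        "other": other_url,         # the image on the other server
    })


body = link_notification(
    "http://yours.example/photo/7.jpg",
    (100, 45),
    "http://mine.example/photo/3.jpg",
)
```

The receiving server would decode the body with something like `parse_qs(body)` and store the link however it pleases.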

It would be up to every photo wiki server to do something sensible with this information. I've already suggested that I'd like to see multiple images come up in response to a click on a well linked feature. Javascript could do this by constructing a frame that included each retrieved image in a separate pane. That would allow each to do their click handling a little differently if they chose to. -- Ward

Ward's latest description of his photo wiki concept is intriguing. Here's what I like about it.

  • It has the characteristic simplicity and elegance we have come to expect from Ward. For instance, the idea of a 'standard' size for click regions really makes things easy.

  • It has the kind of appeal that could attract a cult following of sorts, and provoke people to go out shopping for digital cameras and cable modems - good for the economy.

  • It uses an extremely simple protocol which allows for tons of cloning fun, and allows for some individuality on the server. Both would encourage server 'owners' to get involved.

  • It leaves photo uploading policy to the individual server owners.

I think I understand everything you've described, Ward, but I am wondering about specific ideas you have for implementing the linking UI. You mention a 'linking tool', but I'm not sure what you have in mind.

Here would be my ham-handed approach:

Each photo has an Add-Link link associated with it (perhaps not textual, but a small and tasteful graphic?). Following that link would present a form with the same photo and a field for entering the URL of the photo to be linked to. The user would fill in the 'linked photo' URL field and click the desired feature in the photo on the form to establish the link.

Is that anything like what you had in mind? -- TimVoght

I was vague about the linking tool because I hadn't worked it out myself. I was thinking of some frame based page that would show you the two pictures and let you pick the spots on both that match.

Let's imagine that the whole system were implemented in just a couple .cgi scripts. What would their urls look like? What do we need to say about the behavior of the scripts and the pages they produce in order to allow lots of implementations to work together? Let me write a few stories to get us started:

  • I browse upon an interesting photo and send its url to a friend. This is enough to get him interested in browsing and contributing to photo wiki.

  • I'm looking at two pictures from two different sites, probably using two browser windows. I connect the two pictures to each other by copying urls from each page into forms retrieved from the others and clicking on the features they have in common.

  • I browse to a site that hosts pictures like those I take. I file upload a few jpegs to the site and learn immediately what their assigned urls will be. I can caption the photos at my leisure.

  • I search the web for "photowiki art cars" and find a dozen servers hosting hundreds of pictures captioned as such.

  • I ask any site to list recent-uploads or recent-links. The answer is a page of thumbnails.


I like those stories. What do you think about this one? -- Tim

  • I am looking at a set of photos that I found linked from a ray-tracing of a '49 Buick Super. The photos have captions. I press a no-captions link at the bottom of the page, and get a page with the same photos, but without captions. That page has a with-captions link. The captions preference of the current page remains in effect for any photowiki page I click to from there.


Interesting story. I think it is less about the desirability of captions than about global preferences in a distributed system. I wonder if there is some way that a particular site could introduce a new preference and have it "catch on" at other sites? For example, a site operator might notice that many requests come with the "captions=off" preference and finally decide to implement this functionality. -- Ward

Use dynamic property binding and object swizzling to make new preferences.

Let's try that again.

A user logs in and gets a bunch of XML as his session information, including a <captioning> element that holds his caption preference.


When the user is handed off to another site, the other site ignores that property but keeps it as part of the user's session token. The other site may add

  <frames>on</frames>

The user returns to the first site and gets the session token again. It sees <frames>on</frames> and doesn't know what to do with that so it just ignores it. However, it too just leaves it in the user's session token.

However, since <captioning> is still there, it remembers what to do.

To do this, you must define a standard XML layout that is simple enough that <captioning> won't be placed in a hundred different places. (hint: make it flat) You must define a standard protocol that people use to transfer session tokens. You must set up a community mailing list or some other communication method so people can announce new token ids and you can avoid duplicates. That way, the chance another site will use <caption> instead of <captioning> lowers. And you must find a way of storing session tokens persistently so the user can enter the system at multiple points.
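A small sketch of the flat token idea in Python, using the standard library's ElementTree. The `<session>` wrapper element, the value `off`, and both function names are assumptions made for the example; only `<captioning>` and `<frames>on</frames>` come from the discussion.

```python
import xml.etree.ElementTree as ET


def read_preferences(token_xml, known):
    """Return only the properties this site understands; ignore the rest."""
    root = ET.fromstring(token_xml)
    return {child.tag: child.text for child in root if child.tag in known}


def set_preference(token_xml, name, value):
    """Add or update one property, leaving every other property in place."""
    root = ET.fromstring(token_xml)
    node = root.find(name)
    if node is None:
        node = ET.SubElement(root, name)
    node.text = value
    return ET.tostring(root, encoding="unicode")


# The first site issues a token; the value "off" is an assumed example.
token = "<session><captioning>off</captioning></session>"

# The second site does not understand <captioning> but adds its own property.
token = set_preference(token, "frames", "on")

# Back at the first site: <frames> is ignored, <captioning> is still there.
first_site = read_preferences(token, known={"captioning"})
```

Because the layout is flat, a site never has to walk a tree to find a property, and passing the token along unchanged is the default behavior.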

I'm not sure whether or not cookies can be shared between sites. If they can be, you've got the transferring and multiple point of entry problems licked. However, you're limited to 4k worth of options. (should be plenty... ;) -- Sunir

I first want to be sure my story was clear: what I was proposing was simply that the preference would be propagated from server to server through the URLs (using no more complicated mechanism than the way wiki passes search arguments now). As soon as the user browsed outside the system, the preferences would be lost. And yes, as Ward implies, it would be up to the server as to whether it would honor the preference.
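Roughly, a server honoring this could copy the preference from the current page's URL onto each photowiki link it emits. The sketch below is Python and assumes a query parameter named `captions` (borrowed from the "captions=off" example earlier); the function name and the URLs are invented.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def carry_preferences(current_url, link_url, names=("captions",)):
    """Copy the named preferences from the current page's URL onto a link."""
    current = dict(parse_qsl(urlsplit(current_url).query))
    parts = urlsplit(link_url)
    query = dict(parse_qsl(parts.query))
    for name in names:
        # Never override a preference the link already states.
        if name in current and name not in query:
            query[name] = current[name]
    return urlunsplit(parts._replace(query=urlencode(query)))


next_url = carry_preferences(
    "http://a.example/photo.cgi?id=7&captions=off",
    "http://b.example/photo.cgi?id=12",
)
```

A link that leaves the system simply would not be passed through this function, so the preference is lost there, as intended.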

The idea of sites introducing new preferences is really cool, and seems like it would work in a neat evolutionary way, with the fittest new preferences surviving.

The more my own concept of the photo wiki takes form, the more it seems that the individual server operator is given great latitude for creativity. Option preferences as Ward mentioned would be one.

Another thing that occurred to me was that - at least the way I'm thinking - each server could establish its own policy for image map grid size. Taking that a step further, it seems that a creative server provider might build an experimental server that processes submitted images (with a perhaps not-too-complicated edge detection algorithm) and generates an optimum image map layout with finer areas where the image has smaller detail, etc. -- Tim

When given the option between an automatic process that is 90% accurate and a simple human process that is 100% accurate, choose the simple human process.

Allow the readers to draw the image map. -- Sunir

Ward's suggestion of a fixed-size grid is definitely the way to go. My main point is that the policy is entirely up to the server. Casual browsers will not care how the feature grid was generated. I want a server provider who is interested in image processing to be able to try something like this. Someone else might choose to go with a simple-minded quadrant approach. -- Tim

If you are planning for future enhancements to the image map, you will either have to provide some sort of masking multichannel or a vector map as part of the protocol.

If the map will be limited to just line-delimited segments, the vectored approach would be sufficient, in a similar vein to the HTML <map> tag (if I recall correctly). On the other hand, you'd have to validate that the vectors are indeed correctly formed. Overlapping partition lines, unclosed segments, end points off the grid, etc. could easily nuke the server if it wasn't defensive enough.

If you wish freeform maps, you will need a channel for each link. This could be expensive, even if you limit the mask to only the bounding rectangle of the active region. On the other hand, masks are simpler to validate. Hey, one of the rare more powerful and simpler choices!

However, the drawbacks show clearly why a fixed grid is a good thing: it is simple. But a fixed grid is more likely to seem complex to a user. Imagine trying to select an object that falls on the intersection of two grid lines. The object will be in four segments!

I'm personally partial to the mask idea. The masks can easily be compressed to a series of pairs of 8 or 16 bit values per line and it's trivial to determine if two masks overlap or perform a pixelwise hit test.
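One way to read "pairs of values per line" is as (start, end) spans for each scanline. The Python sketch below assumes that reading, with half-open spans stored in a dict keyed by row; the representation and the example masks are illustrative guesses, not a proposed format.

```python
# A mask maps a row number to a list of (start, end) spans,
# where start is inclusive and end is exclusive.

def hit(mask, x, y):
    """Pixelwise hit test: is the pixel (x, y) inside the mask?"""
    return any(start <= x < end for start, end in mask.get(y, []))


def overlaps(mask_a, mask_b):
    """True if the two masks share at least one pixel."""
    for y, spans_a in mask_a.items():
        for a_start, a_end in spans_a:
            for b_start, b_end in mask_b.get(y, []):
                # Two half-open spans intersect when each starts
                # before the other ends.
                if a_start < b_end and b_start < a_end:
                    return True
    return False


face = {10: [(4, 9)], 11: [(3, 10)]}
hat = {9: [(2, 12)], 10: [(8, 11)]}
tree = {10: [(20, 30)]}
```

With these masks, `face` and `hat` share pixel (8, 10), while `face` and `tree` sit on the same row without touching.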

Advanced (read: whizbang) data structures can even make this superfast if you're really into that sort of thing. Not that I am. I don't think speed is more of an issue than space. -- Sunir

My comments about image map flexibility may have been out of line. For the time being, I'm looking back to Ward's original 3-point protocol description, and trying to grok it more fully. -- Tim

I'm getting confused talking about photo wiki because I don't know if my concept of how it works jibes with anyone else's. For that reason, I've tried to spell out my concept here. I know that this differs some from what Ward proposed. I've become enamored with the idea of putting as much policy as possible out of the protocol and into the server. I've used a lot of words in an attempt to be unambiguous. I hope it's readable... -- Tim

  • A PhotoPage is an HTML page containing, minimally, an image map and a link for accessing an AddLinkTool related to that PhotoPage.

  • A PhotoPageUrl is not a direct link to a PhotoPage, but to a cgi script which returns PhotoPages. The string following the script name in a PhotoPageUrl must conform to a standard format. (2)

  • An AddLinkTool provided by a server must provide a user with a way to link a PhotoPageUrl to an area on an image in a PhotoPage held by the server. The mechanism for doing this is server-specific, and is fully encapsulated by the AddLinkTool. (1)

  • When a mapped area of an image on a PhotoPage is clicked, the server (on which that PhotoPage is held) delivers a framed HTML page, each frame containing a PhotoPage that was linked to the mapped area of the image clicked. (3)


(1) By not specifying a standard linking script format, this rule practically rules out linking via tools other than the one provided by the server. The advantage is that the specific image mapping mechanism used is independent of the protocol.

(2) TBD

(3) 'framed HTML page' may be too restrictive. For instance, might we allow a server to deliver each PhotoPage in its own browser frame?
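As a rough illustration of the last rule, here is a Python sketch of a server-side link table and the framed page it might return. Everything in it is assumed: the photo and area identifiers, the function names, and the side-by-side frame layout. The PhotoPageUrls are treated as opaque strings, since their format is still TBD.

```python
from html import escape

# (photo id, area id) -> list of PhotoPageUrls linked to that mapped area.
links = {}


def add_link(photo, area, photo_page_url):
    """What an AddLinkTool would ultimately record on its own server."""
    links.setdefault((photo, area), []).append(photo_page_url)


def framed_page(photo, area):
    """Return a frameset with one frame per PhotoPage linked to the area."""
    urls = links.get((photo, area), [])
    if not urls:
        return "<html><body>No photos linked here yet.</body></html>"
    cols = ",".join(["*"] * len(urls))  # equal-width frames, side by side
    frames = "".join(
        f'<frame src="{escape(u, quote=True)}">' for u in urls
    )
    return f'<frameset cols="{cols}">{frames}</frameset>'


add_link("buick", "grille", "http://a.example/photo.cgi?first")
add_link("buick", "grille", "http://b.example/photo.cgi?second")
page = framed_page("buick", "grille")
```

How areas are identified stays entirely inside the server, which is the point of keeping the mapping mechanism out of the protocol.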

Here you'll find my 'spike' attempt at a photo wiki. If you try hard enough, you may surely break it. Instructions are sparse, just play.

A couple of notes: To make a link, click 'Link a photo to this one'. In the linking form, fill in a URL and click the spot where you want the link to be established. The correct URL to use in the form is shown on the image page you're linking to.

The images you see were things I had sitting around. The copyright status is questionable on some of them. Please keep that in mind. I'll be removing the questionable ones soon. This is only a short-term demo. There is currently no mechanism for submitting images.

Criticisms of 'fine points' of the implementation will not be particularly productive. I already have a list as long as my arm...

I'm hoping this will engender some discussions about the photo wiki problem domain. -- Tim


Last edited April 1, 2001