Website Redefined (12)

1 Name: talysman!!/0CigS8/ : 2007-03-23 17:01 ID:VVLgo/r2

This is mostly running notes as I work on the website this weekend. Some of the information is based on the XML structures discussion we had here earlier:

http://www.interrobangcartel.com/forums/kareha.pl/1169526414/

And some of it is based on data from the temporary wiki:

http://ibcwiki.spaceroom.org/tiki-index.php

Plus discussions Charlie, Doctroid and I had in the Moo.

2 Name: talysman!!/0CigS8/ : 2007-03-23 17:25 ID:VVLgo/r2

Simple XML Structure:

This is a blend of the two approaches mentioned in the wiki, but stripped down, with the idea that we'll design for expandability. All that matters right now is how to implement this quickly.

We're going to need three categories of XML file:

  • The site config file. In fact, we don't even need this, yet. We'll code it into the CGI script and migrate it out later. A user.xml file would fall in this category, too, but that's for later.
  • The XML for each page. This does not contain text to be displayed. It only contains display templates, media to link, and data sources.
  • XML files for each stand-alone type. For us, currently, this is albumart.xml and song.xml. There may be other types at a later date.

Text that gets displayed on a page (for example, lyrics for a song, or liner notes on how a song or album came to be) is not included in any XML file. It's a plain-text file that gets translated by Markdown. For each tab on a page, there is a directory with the same name (lyrics/, comments/, liner_notes/) and a file in that directory with a name based on the page name (lyrics/Schrodingers_Car.txt, comments/Schrodingers_Car.txt, etc.)
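
A minimal sketch of that lookup, assuming a simple conversion rule. The sub names (page_file_name, tab_text_path), the page title "Schrodinger's Car", and the rule itself (spaces become underscores, other punctuation is dropped) are assumptions for illustration; only the tab-directory/page-file layout comes from the notes above.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a page name into a file-name-friendly form.
# Assumed rule: whitespace becomes underscores, other punctuation is dropped.
sub page_file_name {
    my ($page_name) = @_;
    (my $file = $page_name) =~ s/\s+/_/g;
    $file =~ s/[^\w-]//g;
    return $file;
}

# Hypothetical helper: text for a tab lives in <tab>/<page file name>.txt.
sub tab_text_path {
    my ($tab, $page_name) = @_;
    return $tab . '/' . page_file_name($page_name) . '.txt';
}

print tab_text_path('lyrics', "Schrodinger's Car"), "\n";
```

The same two subs would serve every tab directory, so adding a new text tab only means adding a new directory.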

However, I'm debating whether to make the XML data a separate file (pages/Schrodingers_Car.xml) or an entry in a pages.xml. It depends on how big we expect pages.xml to be.

3 Name: talysman!!/0CigS8/ : 2007-03-23 17:52 ID:VVLgo/r2

Pages XML Content:

Each page has the following XML elements:

  • page-name -- used to find .txt files and for page title.
  • page-type -- identifies which template to use.
  • page-image -- displayed to left of info box, for example.
  • tab type="caption" -- a tab that will appear on the page.

The page-name and page-type elements can only appear once. The page-image element can appear more than once, in which case the page lets you cycle through the images.

There can be more than one tab element, but each one must have a separate caption. If there is no XML element with a name matching the caption, the script looks in a directory named caption/ for a text file named page-name.txt and displays it, as mentioned before. If there is an XML element with a name matching the caption, the script looks for a template file named caption.xslt (or whatever we decide to use for templates) and fills the template with data contained in the XML element.
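
The two cases could be sketched roughly like this. The sub name resolve_tab, the shape of the page hash, and the returned hash are all assumptions for illustration, not the actual script.

```perl
use strict;
use warnings;

# Hypothetical helper: decide how to fill one tab.
# $page is a hash ref of the page's XML elements, $caption is the tab caption,
# $file_name is the page name in file-name form.
sub resolve_tab {
    my ($page, $caption, $file_name) = @_;
    if (exists $page->{$caption}) {
        # An element matches the caption: fill a template with its data.
        return {
            source => 'template',
            file   => "$caption.xslt",
            data   => $page->{$caption},
        };
    }
    # No matching element: fall back to the plain-text file.
    return { source => 'text', file => "$caption/$file_name.txt" };
}

my $page   = { tracks => [ { type => 1, name => 'The Robot Song (dalek version)' } ] };
my $tracks = resolve_tab($page, 'tracks',     'Needs_More_Wanger');
my $notes  = resolve_tab($page, 'linernotes', 'Needs_More_Wanger');
print "$tracks->{source}: $tracks->{file}\n";
print "$notes->{source}: $notes->{file}\n";
```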

4 Name: talysman!!/0CigS8/ : 2007-03-23 18:23 ID:VVLgo/r2

Example Dynamic XML Tabs:

  • tab type="credits"

This builds a credits tab from a series of XML elements of the form <credits type="artist-type" name="artist-name" />. There can be multiple entries with the same artist-type, which will be displayed as a comma-separated list. However, true duplicates will be ignored.

Each artist name will link to an artist page.
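
A rough sketch of the grouping for the credits tab, assuming the credits have already been read into a list of hashes. The sub name credit_lines and the "type: names" output format are assumptions, and linking names to artist pages is left out.

```perl
use strict;
use warnings;

# Hypothetical helper: group credits by artist-type, skipping true duplicates.
# Each credit is a hash ref with 'type' and 'name' keys, in document order.
sub credit_lines {
    my (@credits) = @_;
    my (%names_for, %seen, @type_order);
    for my $credit (@credits) {
        my ($type, $name) = @{$credit}{qw(type name)};
        next if $seen{$type}{$name}++;    # same type and name: ignore
        push @type_order, $type unless $names_for{$type};
        push @{ $names_for{$type} }, $name;
    }
    return map { sprintf '%s: %s', $_, join(', ', @{ $names_for{$_} }) } @type_order;
}

my @lines = credit_lines(
    { type => 'vocals',      name => 'NotR' },
    { type => 'vocals',      name => 'Casey Bennetto' },
    { type => 'vocals',      name => 'NotR' },    # true duplicate, dropped
    { type => 'font design', name => 'Kibo' },
);
print "$_\n" for @lines;
```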

  • tab type="tracks"

This builds a track listing tab from a series of XML elements of the form <tracks type="1" name="songname" />. The type attribute here is used for the track number, and the songname is looked up in the songs.xml file to get credit information.

Each song name will link to a page for that song.

  • tab type="contributions"

This builds a contributions tab from a series of XML elements of the form <contributions type="contrib-type" name="contribution" />. If the type is "albumart" or "songs", the script looks up the name in the appropriate XML file to get the exact contribution(s), so there only needs to be one entry for each piece, even if the artist made multiple contributions. The script displays all relevant contributions based on credits XML elements in the albumart.xml or songs.xml file.

If the type is not the same name as an XML file, the script looks up the type in pages.xml to find all the credits entries and displays the ones with the same name. The name should be the same as page-name, in this case. Example: <contributions type="Needs More Wanger" name="Kibo" /> should find a credit for Kibo for the font design. This form is mainly used to add credits to an artist's page that don't fit the usual album art or songs categories, like coming up with the album concept or name, or posing for a photo someone else took.

If there is neither an XML file nor a page named "contrib-type", the script simply displays "contrib-type: contribution" on the page.
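
The three cases for a contributions entry might be dispatched something like this. The sub name contribution_text, both lookup hashes, and the returned strings are assumptions for illustration; the actual lookup inside albumart.xml or songs.xml is only stubbed out as a message.

```perl
use strict;
use warnings;

# Hypothetical helper: decide what one <contributions> entry displays.
# $xml_files: hash ref of stand-alone types that have an XML file.
# $pages:     hash ref of page name => page data (with a 'credits' list).
sub contribution_text {
    my ($type, $name, $xml_files, $pages) = @_;
    if ($xml_files->{$type}) {
        # Case 1: the type names an XML file (lookup itself not shown).
        return "look up '$name' in $type.xml";
    }
    if (my $page = $pages->{$type}) {
        # Case 2: the type names a page; keep its credits matching $name.
        my @roles = map  { $_->{type} }
                    grep { $_->{name} eq $name } @{ $page->{credits} || [] };
        return "$type: " . join(', ', @roles);
    }
    # Case 3: neither; display the entry as written.
    return "$type: $name";
}

my $xml_files = { albumart => 1, songs => 1 };
my $pages = {
    'Needs More Wanger' => {
        credits => [
            { type => 'font design', name => 'Kibo' },
            { type => 'title',       name => 'Dean Lenort' },
        ],
    },
};
print contribution_text('Needs More Wanger', 'Kibo', $xml_files, $pages), "\n";
```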

5 Name: talysman!!/0CigS8/ : 2007-03-23 19:59 ID:VVLgo/r2

Page XML:

Assuming all the pages are in a single pages.xml file (for now), the structure would look something like this for songs:

<page>

<page-name name="The Robot Song" />
<page-type type="songs" />
<tab type="lyrics" />
<tab type="linernotes" />
<tab type="credits" />
<tab type="comments" />

</page>

Things I'm thinking about:

  • The <page> element could have included type and name attributes, but I decided separate page-name and page-type elements were better. I'll test various representations, however, using XML::Simple.
  • The script will create another variable, $page_file_name, containing a converted form of the page name (underscores instead of spaces, for example). It will use this to build file names for inclusion.
  • I'm worried about tab order, but haven't decided which way is best. Probably a separate array.
  • Album name and images aren't listed. The template gets the album name for songs from the songs.xml file, not the pages.xml file. Likewise, that's where the <credits> elements are. The script looks up the album name in pages.xml to find a page-image element.
  • I'm debating what to do about versions, now. I might go back to the old way of making them separate pages, but there should be a main page that links to multiple versions. No matter how it's handled, the info will be in the songs.xml file.

Here's the XML for albums:

<page>

<page-name name="Needs More Wanger" />
<page-type type="album" />
<page-image type="albumart" name="Wanger_Front_Cover" />
<page-image type="albumart" name="Wanger_Insert" />
<page-image type="albumart" name="Wanger_Back_Cover" />
<tab type="linernotes" />
<tab type="tracks" />
<tab type="credits" />
<tab type="comments" />
<tracks type="1" name="The Robot Song (dalek version)" />
<credits type="font design" name="Kibo" />
<credits type="title" name="Dean Lenort" />

</page>

Incomplete tracks and credits listing, but it gets the point across. Again, no credits for individual songs or for the album art are included in this page entry, because that info is imported from albumart.xml and songs.xml.

Here's the XML for artists:

<page>

<page-name name="Kerri" />
<page-type type="artist" />
<page-image type="photos" name="Kerri" />
<tab type="bio" />
<tab type="links" />
<tab type="contributions" />
<tab type="comments" />
<links url="someurl" name="home page" />
<contributions type="songs" name="The Sun! She Explode!" />
<contributions type="songs" name="Beep" />
<contributions type="songs" name="Rewind (Simonesque version)" />
<contributions type="songs" name="Hatefukker" />

</page>

Again, not complete, but you get the idea: "bio" is the name of a directory where we can find Kerri.txt, and elements named "links" are combined for display on the links tab. The page-image element is used differently here. Presumably there is no photos.xml file, so the script looks in a directory named photos/ for any images beginning with "Kerri". The contributions tab is populated with data from songs.xml for the four songs listed.
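
The directory fallback for page images could be sketched as a simple prefix match. The sub name images_for_page is an assumption, and so is the choice to match on the raw file-name prefix and return the names sorted.

```perl
use strict;
use warnings;

# Hypothetical helper: list the files in $dir whose names begin with $prefix.
# Returns an empty list if the directory cannot be opened.
sub images_for_page {
    my ($dir, $prefix) = @_;
    opendir(my $dh, $dir) or return;
    my @found = grep { -f "$dir/$_" && index($_, $prefix) == 0 } readdir $dh;
    closedir $dh;
    @found = sort @found;
    return @found;
}

print "$_\n" for images_for_page('photos', 'Kerri');
```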

6 Name: talysman!!/0CigS8/ : 2007-03-24 22:00 ID:VVLgo/r2

Renaming of Attributes:

Some of the elements/attributes don't work well with XML::Simple, or at least I haven't found a good combination of options to make them work. I'm testing changes.

So far, these new forms for two of the elements work the way I expect:

  • <tab id="tracks" />
  • <tracks id="1" name="The Robot Song (Dalek version)" />

These work with the following code:

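use strict;
use warnings;
use XML::Simple;    # exports XMLin and XMLout

# Hash in the folded form: elements keyed on their id attribute.
my $ref = {
    'tab' => {
        'tracks' => {}
    },
    'tracks' => {
        '1' => { name => 'The Robot Song (Dalek version)' }
    }
};

# Unfold <tab> and <tracks> back into elements with id attributes.
my $xml = XMLout($ref, RootName => 'page',
    KeyAttr => { tab => 'id', tracks => 'id' });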

I'm still verifying workable forms for other elements.

7 Name: talysman!!/0CigS8/ : 2007-03-25 01:39 ID:VVLgo/r2

Further Testing:

I have a hash structure I can convert into XML and convert the XML back to the same structure. To get around some problems with the credits, I changed the format to match that of the tracks element. I did the same with the picture image, and embedded this in a pair of <page> tags. I also decided on a standardized set of attributes to use in the tags. Here is the structure for an album:

<page name="Needs More Wanger" type="album">
<credits id="1" name="Tim Chuma" type="concept" />
<credits id="2" name="Kibo" type="font design" />
<credits id="3" name="Dean Lenort" type="title" />
<image id="1" name="Wanger Front Cover" type="albumart" />
<image id="2" name="Wanger Back Cover" type="albumart" />
<image id="3" name="Wanger Insert" type="albumart" />
<tab key="comments" />
<tab key="credits" />
<tab key="linernotes" />
<tab key="tracks" />
<tracks id="1" name="The Robot Song (Dalek version)" />
<tracks id="2" name="The George Hammond Conspiracy" />
<tracks id="3" name="Pigskin Loofah (rip cut)" />
<tracks id="4" name="Pigskin Loofah (buzz cut)" />
<tracks id="5" name="Pumpkin, Mrs. Farnsworth (English Country Garden Mix)" />
<tracks id="6" name="The Robot Song (Data version)" />
<tracks id="7" name="Free Your Cones (the rest will follow)" />
<tracks id="8" name="Young Human Body Transplant 13" />
<tracks id="9" name="Chalice of Fire" />
<tracks id="10" name="320 World" />
<tracks id="11" name="Pumpkin, Mrs. Farnsworth (London Share House Mix)" />
<tracks id="12" name="Comar:" />
<tracks id="13" name="Beep" />
<tracks id="14" name="Bonus Track" />
<tracks id="15" name="Pumpkin, Mrs. Farnsworth (Spaghetti West End Mix)" />
</page>

So: the arrays fold on the "id" attribute first, which is always a number. They fold on the "key" attribute next, which is always used either to insert a text file or to build data from an element with the same name as the "key" value. The other two attributes are "type", which references either an XML file or a directory with the same name as the value of "type", and "name", which is always a text value that can be displayed on the page or looked up in an XML file (specified by "type").
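
One consequence of folding on "id" is that the tracks end up as hash keys, which have no order, so a listing has to sort them numerically. A minimal sketch, assuming the hash shape below (hand-built here, not parsed) and a hypothetical sub name track_listing:

```perl
use strict;
use warnings;

# Hash shaped the way the album XML is assumed to fold: <tracks> keyed on id.
my $page = {
    name   => 'Needs More Wanger',
    type   => 'album',
    tracks => {
        1  => { name => 'The Robot Song (Dalek version)' },
        2  => { name => 'The George Hammond Conspiracy' },
        10 => { name => '320 World' },
    },
};

# Hypothetical helper: return "id. name" lines in numeric id order.
sub track_listing {
    my ($page) = @_;
    my $tracks = $page->{tracks} || {};
    return map  { sprintf '%d. %s', $_, $tracks->{$_}{name} }
           sort { $a <=> $b } keys %$tracks;
}

print "$_\n" for track_listing($page);
```

The numeric comparison matters: a plain string sort would put track 10 ahead of track 2.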

Next, I will be testing song and artist page types in case there are any other problems to work out.

8 Name: talysman!!/0CigS8/ : 2007-03-25 19:32 ID:VVLgo/r2

Further Testing, Part II:

I've verified that the following XML structure for songs can be converted to a hash, and the hash can be converted back to the same XML structure.

<page name="The Robot Song" type="song">
<tab key="comments" />
<tab key="credits" />
<tab key="linernotes" />
<tab key="lyrics" />
<tab key="versions" />
<versions id="1" name="The Robot Song (Dalek version)" />
<versions id="2" name="The Robot Song (Data version)" />
</page>

I tested the extra "versions" tab and versions XML elements to verify they work, but I haven't decided whether to put information here or in the song.xml file. These are actually arrangements, rather than versions, according to the way we have discussed them before. Songs that use the same lyrics should probably be distinguished in the song file, not the page file, so for the canonical version for song pages, we should drop the <tab key="versions" /> and the <versions> elements.

However, songs like "Zebra Races" that have two sets of lyrics should probably include links on each page for the alternate version. For now, I am assuming there will be an XML element named <alternates>, in the same format as <versions> or <tracks>, which points to alternate versions of the lyrics.

9 Name: talysman!!/0CigS8/ : 2007-03-25 20:13 ID:VVLgo/r2

Further Testing, Part III:

I've verified the third XML structure, the one for artists. I can convert this XML to a hash and back again to the same XML structure.

<page name="sanspoof" type="artist">
<tab key="comments" />
<tab key="contributions" />
<tab key="bio" />
<tab key="links" />
<links id="1" name="LiveJournal" url="http://sanspoof.livejournal.com" />
<contributions id="1" name="Bad Coelacanth" type="albumart" />
<contributions id="2" name="Reverse Archaeologist" type="song" />
<contributions id="3" name="Interests Are For Jerks" type="song" />
</page>

(Yes, I cheated, I picked an artist with only a few contributions.)

Again, contributions are assembled from other XML files, specified by the "type" attribute. The script looks up "Bad Coelacanth" in albumart.xml and verifies that "Sanspoof" has a credit for album art.

I thought about putting the contact info and user preferences in the same XML file, but then we have problems with who has the right to edit the file and who gets to see the hidden content. It's better to put that in a users.xml file instead.

At this point, I've decided that the XML files referenced by the "type" attribute should all be singular (song.xml, albumart.xml, and video.xml when we add it), while tab "key" attributes should be plural in most cases (comments, contributions, credits; bio is the exception), as should any XML elements or directories those keys refer to. This should help us keep our naming semi-straight.

With the exception of the site config file, any system-wide XML files should be plural, too. I'm thinking of users.xml and pages.xml. After comparing the drawbacks of putting all the pages in one XML file vs. individual files for each page, I've decided that the least memory-intensive approach is to do both: an individual XML file for each page, plus a global pages.xml file that contains only one XML element per page. I think a pages.xml file would let us point more than one name at a given page, so that we could include abbreviated album titles like "Supermarket" and "Wanger", and it also makes a future search filter easier. We could also point "Zebra Races", for example, to a partial results page with links to both versions.

Another possibility I'm toying with is a page-tagging feature, but that's for later.

10 Name: talysman!!/0CigS8/ : 2007-03-25 21:42 ID:VVLgo/r2

The pages.xml File:

I tested this as a structure for pages.xml:

<pages>
<song key="Zebra Races" name="Zebra Races I" also="Zebra Races II" />
<song key="Zebra Races II" name="Zebra Races II" also="Zebra Races" />
<album key="Supermarket" name="The Last Days of the Crazy People's Supermarket" />
<album key="Last Days of the Crazy People's Supermarket" name="The Last Days of the Crazy People's Supermarket" />
<artist key="Sanspoof" name="Sanspoof" />
<artist key="JWGH" name="Jacob Haller" />
<artist key="Jacob Haller" name="Jacob Haller" />
<album key="Wanger" name="Needs More Wanger" />
<album key="Needs More Wanger" name="Needs More Wanger" />
</pages>

Originally, I wanted to use type="song", type="album", and type="artist", matching the way I used these in the individual page files. However, I couldn't get the structure I wanted, so I went with <song>, <album> and <artist> elements. This adds an extra step when looking up a page (the script has to check each type), but it makes sorting easier.

The new attribute for these XML elements is "also", which indicates a related page of the same type. This allows us to include references to similar pages. It's probably not important, but if we have to, we can use a type prefix followed by a colon to indicate a similar page of a different type, such as if we were to do a song with the same name as an album.
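
The "check each type" lookup could look roughly like this. The hash is hand-built in the shape pages.xml is assumed to fold into (each element type keyed on "key"), and the sub name find_page and its return values are assumptions; the type-prefix-and-colon form of "also" is not handled.

```perl
use strict;
use warnings;

# Hash shaped the way pages.xml is assumed to fold: each type keyed on "key".
my $pages = {
    song => {
        'Zebra Races'    => { name => 'Zebra Races I',  also => 'Zebra Races II' },
        'Zebra Races II' => { name => 'Zebra Races II', also => 'Zebra Races' },
    },
    album  => { 'Wanger' => { name => 'Needs More Wanger' } },
    artist => { 'JWGH'   => { name => 'Jacob Haller' } },
};

# Hypothetical helper: check each type in turn for a matching key.
# Returns (type, canonical page name, related key or undef), or an empty list.
sub find_page {
    my ($pages, $key) = @_;
    for my $type (qw(song album artist)) {
        my $entry = $pages->{$type}{$key} or next;
        return ($type, $entry->{name}, $entry->{also});
    }
    return;
}

my ($type, $name, $also) = find_page($pages, 'Zebra Races');
print "$type page '$name', see also '$also'\n";
```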

11 Name: talysman!!/0CigS8/ : 2007-04-03 23:41 ID:VVLgo/r2

<box> Elements:

I want to add another element type to page files, called <box>. It works pretty much the same as <tab>: you would see a couple of elements that look like this:

    <box key="image" />
    <box key="" />

The script interprets the key attribute in the same way as for <tab>, either reading a text file in a directory of that name and translating the text markup so it can be displayed in the box, or reading data from the data structure using the key name as an index.

The blank key (which maybe will be left out, so that it just reads as <box>) indicates that data should be read from the data structure root.

Not all data is used. Which data is actually displayed is determined by the template file. The (slightly modified) way of handling templates is: the script takes the page type (song, album, artist, or default), adds the box key name, and then adds the word "box" to determine the name of the template file. So, there would be templates named songimagebox.xslt, albumimagebox.xslt, artistimagebox.xslt, and defaultimagebox.xslt, which might all be identical but could have minor differences. There would also be a songbox.xslt, albumbox.xslt, artistbox.xslt, and defaultbox.xslt, which would not be identical; these would be the primary info box.
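
The naming rule is simple enough to sketch directly. The sub name box_template is an assumption, as is treating a missing or empty page type as "default".

```perl
use strict;
use warnings;

# Hypothetical helper: page type + box key + "box" + ".xslt".
# A missing type falls back to "default"; a blank key means the main info box.
sub box_template {
    my ($type, $key) = @_;
    $type = 'default' unless defined $type && length $type;
    $key  = ''        unless defined $key;
    return $type . $key . 'box.xslt';
}

print box_template('song',  'image'), "\n";    # songimagebox.xslt
print box_template('album', ''),      "\n";    # albumbox.xslt
print box_template(undef,   'image'), "\n";    # defaultimagebox.xslt
```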

Positions of each box on the page would be determined primarily by CSS.

The difference between boxes and tabs, display-wise, is that boxes are always visible, while only one tab is visible at a time.

Note also I have added a "default" type, which does not necessarily need to be indicated; it's assumed that if type isn't specified, the type is "default". This can be used for generic pages that don't fit other categories, like a list of unrecorded songs, for example. These would not display an image box, and the info box would only include the page title. There would be only two tabs, one for the page content and one for comments.

12 Name: talysman!!/0CigS8/ : 2007-04-04 23:31 ID:VVLgo/r2

The song.xml File:

I know I keep changing whether this is singular or plural, but I'm sticking with singular now. The song.xml file is a list of all songs that have been recorded. It identifies the credits, album name and other info for each recorded version. I tested the following XML structure and was able to confirm the hash it produced:

<songs>
<song name="The Robot Song" >
<credits id="1" name="NotR" type="arrangement" version="1" />
<credits id="2" name="NotR" type="vocals" version="1" />
<credits id="3" name="Casey Bennetto" type="vocals" version="2" />
<credits id="4" name="Casey Bennetto" type="arrangement" version="2" />
<songinfo id="1" name="Needs More Wanger" type="album" version="1" />
<songinfo id="2" name="Needs More Wanger" type="album" version="2" />
<versions id="1" name="The Robot Song (Dalek version)" />
<versions id="2" name="The Robot Song (Data version)" />
</song>
</songs>

In order to keep the data structure as shallow as possible and make incorporation into page data as painless as possible, I added one more element attribute: version, which indicates a numeric value that matches the value of the id attribute of one of the <versions> elements.
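
With the version attribute in place, pulling the credits for one version is a filter over a flat hash. A minimal sketch, assuming the song entry folds on "id" as shown (hand-built here, not parsed); the sub name credits_for_version and the "type: name" output format are assumptions.

```perl
use strict;
use warnings;

# Hash shaped the way one <song> entry is assumed to fold (keyed on id).
my $song = {
    name    => 'The Robot Song',
    credits => {
        1 => { name => 'NotR',           type => 'arrangement', version => 1 },
        2 => { name => 'NotR',           type => 'vocals',      version => 1 },
        3 => { name => 'Casey Bennetto', type => 'vocals',      version => 2 },
        4 => { name => 'Casey Bennetto', type => 'arrangement', version => 2 },
    },
    versions => {
        1 => { name => 'The Robot Song (Dalek version)' },
        2 => { name => 'The Robot Song (Data version)' },
    },
};

# Hypothetical helper: "type: name" credits for one version, in id order.
sub credits_for_version {
    my ($song, $version_id) = @_;
    my $credits = $song->{credits} || {};
    return map  { sprintf '%s: %s', $credits->{$_}{type}, $credits->{$_}{name} }
           grep { $credits->{$_}{version} == $version_id }
           sort { $a <=> $b } keys %$credits;
}

print "$song->{versions}{2}{name}\n";
print "  $_\n" for credits_for_version($song, 2);
```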

It occurs to me that the id should be unique and not necessarily sequential, except in the case of the <tracks> element used in album pages.
