June 11, 2006
Information Architecture Session 3
How do Web search tools see your site? One word: TEXT. They don’t see graphics. They don’t see text saved in a graphic. If you want search tools to index your site, you have to have text.
But how do search engines actually work? This Web site provides a good introduction.
For a more in-depth explanation of how search engines work, in terms of indexing & retrieval, read this article by IST's very own Liz Liddy.
So, when a search engine's spider visits your site, it will grab all of the text on all of the pages that it can get to by following links. It grabs all of the text that it finds, and it knows what's text & what's HTML tags because it just ignores everything inside <>s. The spider extracts from the HTML document the words on the page and their locations. Then an indexing application indexes these words.
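To make the spider-and-index step concrete, here is a minimal Python sketch. It is not how any particular search engine works; the class name, the function name, and the sample page are all made up for illustration. It uses the standard library's html.parser to skip everything inside the <>s, then records each word and its position in a simple index.

```python
from collections import defaultdict
from html.parser import HTMLParser
import re


class TextExtractor(HTMLParser):
    """Collects the words on a page, skipping the tags themselves."""

    def __init__(self):
        super().__init__()
        self.words = []

    def handle_data(self, data):
        # handle_data receives only the text found between tags.
        self.words.extend(re.findall(r"[a-z0-9]+", data.lower()))


def index_page(url, html, index):
    """Record every word's positions on the page in a shared index."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    for position, word in enumerate(parser.words):
        index[word][url].append(position)


# word -> page -> list of positions (a hypothetical, in-memory index)
index = defaultdict(lambda: defaultdict(list))
sample = "<html><body><h1>Radio news</h1><p>Local radio today</p></body></html>"
index_page("example.html", sample, index)
print(dict(index["radio"]))  # {'example.html': [0, 3]}
```

Notice that a word saved inside a graphic never reaches handle_data, which is exactly why the spider can't see it.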
Then, when a user searches, the tool will use an algorithm to determine the order of sites for the listing. However, this algorithm is kept hidden. This has led to a number of facts that you need to be aware of:
Search engine spiders don’t visit every site every day. In fact, spiders visit sites very rarely. Therefore, the items in the index will be out of date when you change your site until the spider visits again – which could be a while. You can, however, force a spider from a specific search engine to index or re-index your site by finding the page on a search engine's site that allows you to submit a website – this is in a different place on every search engine's site, so look around for it.
Think about the decisions a search engine has to make in this process. If a user searches for only one word, the search engine has to go through all of the pages it has seen and rank them based upon that word. To do so, the search engine looks at the position of the word on the page (higher = more relevant), whether the word is in a TITLE or HEADER tag, and the number of times the word appears on the page. So, if you want a search engine to return your page for a keyword, you need to use that keyword often and put it in prominent positions on the page.
Using Internet technologies and tools like Flash, graphics, or frames can disrupt this process, as they hide the text from search engines. Frames put the text in a different page, so if the search engine doesn’t index the whole site (which is common), it won’t find the text.
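Since the real ranking algorithms are kept hidden, the following Python sketch is only a toy. It combines the three signals just described (frequency, position, and TITLE/heading placement), and the weights and the function name score_page are assumptions invented for this example.

```python
import re


def tokenize(text):
    """Lowercase a string and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score_page(keyword, title, headings, body):
    """Toy relevance score; the weights are made up, not from any real engine."""
    keyword = keyword.lower()
    words = tokenize(body)
    score = 0.0
    # Frequency: one point per occurrence in the body text.
    score += words.count(keyword)
    # Position: an earlier first occurrence earns a bonus of up to one point.
    if keyword in words:
        score += 1.0 - words.index(keyword) / len(words)
    # Markup: extra credit when the word appears in the title or a heading.
    if keyword in tokenize(title):
        score += 3.0
    if any(keyword in tokenize(heading) for heading in headings):
        score += 2.0
    return score


page_a = score_page("radio", "WOR Radio", ["Radio schedule"],
                    "Radio programs and radio news for New York")
page_b = score_page("radio", "Home", [],
                    "Welcome to our site where you can find a radio")
print(page_a > page_b)  # True
```

The page that uses the keyword early, often, and in its title comes out ahead, which is the behavior the paragraph above describes.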
Some search engines only index the front page of a Web site. If that front page is an elaborate graphic splash page, guess what happens?
Different search engines index different Web pages. In fact, some research states that even the most comprehensive search engines only index about 15% of the Web. In order to let search engines know about your site, it’s good to submit it directly to the search engines.
Over the years, unscrupulous Web designers have tried to come up with ways to make their sites appear higher in the search results. When the search engine makers realize this is happening, they remove the pages that do it. Here are some examples you need to be aware of:
When people realized that repeating a word would make a page rank better, they would add small text and/or text the same color as the background with keywords repeated many times. Most search tools will discard pages that have “invisible” or tiny text.
Another tactic involves creating a page that has many, many keywords that may or may not be related to the content of the site and that, when accessed, instantly redirects the user to the “real” site. The search tools, however, grab the first page and index it. To combat this, most search tools will discard sites that redirect traffic. (Sound like an intro/splash page?)
There are some tags you can use in your header to aid the search tools. However, not all search tools look at these tags, and different tools give these different weight. But, it’s a good idea to include them on every page in the HEAD section of the page.
Here is the HTML:
The description tag is the short description of your site. Many search tools will take this and use it in the results listing. The keyword tag will tell the search tools what words are most important in describing the site. As both of these can be abused very easily, many search tools don’t put much weight on these META tags.
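As a rough illustration of how a search tool might read these two tags, here is a short Python sketch. The sample HEAD section and its wording are invented for this example (they are not the original listing from the lecture), and the MetaReader class is hypothetical.

```python
from html.parser import HTMLParser

# Made-up HEAD section showing a description tag and a keywords tag.
SAMPLE_HEAD = """
<head>
  <title>Example Library</title>
  <meta name="description" content="Catalog, hours, and research help for the Example Library.">
  <meta name="keywords" content="library, catalog, research help">
</head>
"""


class MetaReader(HTMLParser):
    """Collects name/content pairs from META tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            self.meta[name.lower()] = content


reader = MetaReader()
reader.feed(SAMPLE_HEAD)
reader.close()

# A tool could show the description in its results listing.
description = reader.meta.get("description", "")
# The keywords arrive as one comma-separated string.
keywords = [k.strip() for k in reader.meta.get("keywords", "").split(",") if k.strip()]
print(description)
print(keywords)
```

Nothing forces a search tool to trust what it reads here, which is why these tags carry so little weight.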
We will discuss current search topics such as paid-placement and paid-inclusions as well as more on controlled vocabulary and metadata later in the course.
Nomenclature, Verbal Branding and Naming
Nomenclature is another term for taxonomy (or controlled vocabulary). I use it for this session because the word does a good job of bridging the IA concepts of controlled vocabulary and labeling. Especially labeling.
When developing original content specifically for a site and its target users, naming/labeling that content takes on a higher level of responsibility. Verbally branding a section of content communicates the core characteristics and culture of an organization, a product or service, and the site’s purpose (i.e., the reason it was built in the first place).
Names must be easy to understand, and they must be distinct and memorable…
May 31, 2006
Information Architecture: Session 2
So we started a new week and a new session for my IA class. Here is an excerpt from the lecture:
Web Development Today
This is a broad topic. My intent though is for us to take a step back from the Web browser and desktop and put what we do as information professionals into a context (or perspective) that often gets lost in the rush to get projects done on time and within budget.
Not too long ago, Web site development was the sole domain of an organization’s IT department. As Internet usage increased, organizations started to allocate more funds to the “Web guys and girls” and soon enough new departments were formed with cool names like New Media, Interactive, and Online. Seeing the potential for increased communication power, productivity, and sales (to name a few applications), more traditional departments, such as marketing and human resources started to get in on the Web action as well. Before you knew it, an organization’s once novel Internet Web site [of mostly re-appropriated press information] quickly evolved into a useful and highly malleable extension of that organization’s mission, goals, values, image, and overall strategy for success.
Today, high-level, high-traffic Web sites/services are born out of the integrated effort of every department within that site’s parent organization. It is the information architect’s job to think about, make sense of, and organize the macro and micro goals and requirements of all those departments into a cohesive and easy-to-navigate online space.
In your first critiques, many of you concluded that the assigned Web site, wor710.com, was, more or less, not a very effective site based on its organization and labeling. These components are two of the founding building blocks of a site’s information architecture. And, as Rosenfeld and Morville write, “Information architecture happens, with or without information architects.” Decisions are made and Web sites take shape every day.
Large-scale and high-profile Web site development is a highly integrated effort that involves the input of a diverse group of stakeholders. The WOR site, however, was not born out of such an effort. With no IT department, the station engineer doubles as the “Web guy.” So, if the site looks as if it was built as an afterthought, that’s because in many ways it was. Why? Simply put, the site is not a top priority within the company.
Using Krug’s ‘don’t make me think’ doctrine, you would never know from just hitting their homepage that WOR is the first radio station in New York to broadcast in high-definition (HD). I don’t know about you, but if my station was the first to do anything technologically new I would want to get that message across sooner rather than later. The fact is, WOR, while over 80 years old, is one of the most technologically advanced stations in New York, but, again, you would never know it by visiting their Web site.
One of the sites I have assigned this session is WABC 77AM's wabcradio.com. Why? In the last few years, WABC has become a direct competitor with WOR in the New York City/Tri-State market. While the majority of the two stations’ Web site content is categorically the same, it’s clear that WABC has put more thought into the structural design of their online content and overall Web presence. How though? Therein lies one part of your homework this week…
End lecture.
I also wanted to include a link to this diagram that we have to "absorb" this week. It is called "The Elements of User Experience."
May 28, 2006
Escape from the walled garden (Library 2.0-style)
May 25, 2006
PennTags
So I was doing some research on this whole subject of tagging the library OPAC and I came across this posting at Shifted Librarian. It seems that there already is a tagging project going on at Penn. The project is called PennTags. If you look at a sample record from their catalog you can see a link on the bottom called "Add to PennTags." Interestingly enough, if you hit the link you are prompted for your Penn Library ID and password, so obviously you must be connected to the school somehow to add to it.
I still think that once more of these projects are underway it would be invaluable to aggregate them somehow and make them available to all library catalog users. This would be a great way to make the process truly social.
Update 5/25/2006...
I have been looking at the PennTags page and I really just find it confusing. There is a tag cloud on the top which gives you a view of the popular tags created, and then a very, very long list of random information. Doesn't seem like a very user-friendly space.
I think these tags should be more closely linked to the catalog. The designers seem to have been working under the assumption that people will want to tag library records in a selfish manner. While I am not privy to the successes of this project, I can't think of a good reason to tag a catalog record. With all the different citation services, like RefWorks, that organize bibliographic materials for you, why would someone need to go back to an OPAC record? Anyone?
May 23, 2006
Democratizing the subject heading
I found another hot social software application called Library Thing. This is software that allows you to create a catalog of book records that you, or anyone you want, can tag and add to. This discussion has been going around, but wouldn't it be interesting (useful?) if you could set up a directory of tags (supplied by students, faculty, and staff) for your library's collection? This way you would have increased options for searching: the more formal OPAC and the very informal tag directory.
I don't think giving students more options is such a bad idea. What I would worry about is putting the time into creating it and having no interest or response from the people it is created for...you need that social input in a tagging system.
Update 5/24/2006
I have been thinking about this idea more and I just wanted to get some points down before I forget them.
First, I think we have to remember that tagging is almost entirely a selfish activity. The motivation to do it seems strictly about organizing your own stuff (your own pictures, your own links) so you can find it later or share it with friends. What makes these social bookmarking directories work is the fact that you can search others' selfish tagging activities. If you find taggers with similar selfish organizational intentions you suddenly have access to a great deal of content that will more than likely be useful.
When it comes to a library OPAC there will clearly be a lack of interest in tagging it from most ordinary users; there just is no motivation that I can see. To create a useful directory you would need library staff tagging records. This is the workflow I thought of: you could have the cataloging department (in the case of my library, the cataloging person) tag each new acquisition when they process the book.
To make this valuable as a secondary way to search the library collection you would need tags that went beyond the controlled subject headings and metadata categories offered in the traditional OPAC. My thinking is the content should be tagged at a finer granularity than that of the catalog. What level of granularity is what I am not clear about. It would depend on the type of material being tagged. If it was just a plain old non-fiction book on one subject you could include tags that referenced chapter or section content. This would provide useful fodder for searching.
If you go beyond the simple example of a non-fiction book the process just seems to get hazy and complicated. For instance, how would you tag the content of a fiction book in any meaningful way beyond what the OPAC does? It seems you would need to read the book first to figure out the theme or something like that. What about an encyclopedia? Would you make tags of the entire index, with each entry having its own? There is just not enough time, and the sheer variety of methods of content delivery makes this seem like an impossible task.
You could compromise and only tag certain genres of books, but this would leave you with an incomplete (and hence not very useful) directory. Would this not lead to a situation where the directory would be useful to one type of student searching for one type of material, while being completely useless for others? How confusing is that?
Also, if you left the process of tagging solely to the library staff, doesn't this leave the "social" aspects of tagging (and all its benefits) at the curb? Even if you could make a useful tag directory out of an OPAC, don't you think that this ability to search the collection in two ways would lead to confusion on the part of the patron? It's hard enough getting our students to understand and actually use the regular catalog; now we want to offer an option? Is this something that we want to do?
So, I really want to believe that creating a database of tags for a library collection is a great idea, and in theory it is. The obstacles are just too many at this time (in the way I am thinking of the process) to do it. Please, readers, convince me that it is a good idea to tag a library catalog. I really want to believe.
Update 5/24/2006...
I think I have the solution to the problem. Where my thinking went astray is seeing the social tag directory and the OPAC as two separate entities (and never the twain shall meet). What if you could incorporate people's tag "suggestions" into the library's catalog? This would negate the confusion of having two different search databases.
This is what I am imagining. Someone (anyone) is searching the catalog and they come across a record that for some reason or another really excites them. Maybe it perfectly coincides with their research, or it's their favorite book of all time, whatever, it doesn't matter. Suppose we had a link in the catalog to an email account with a notice such as this:
"Think you know what this book is about? Tag it!"
The user could easily send a tag suggestion to the appropriate library staff member in charge of catalog updating. It could then be incorporated into the record, in effect enriching it. The person in charge of this tag incorporation could act as a filter. Of course you could set up your own filtering parameters, but I would say the looser the better. Anything short of vandalism or blatant misrepresentation should be allowed. The result would be a better and less authoritarian catalog.
Now this would create other issues. One would be how this new data should be incorporated into the catalog. Should it be entered into a pre-existing field like subject, or should a new one be made to differentiate between the two? I think the best way is the one that is most user-friendly, and user-friendly means as flexible and robust as possible. So I would say make a new sub-category under "subject" labeled "tags" or "social tags." Make the default subject search a search of both "subject" and "tag" but give the ability to search them separately.
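Here is a rough Python sketch of what I mean by a combined search with the option to split it. Everything in it is hypothetical: the CatalogRecord class, the field names, and the two sample records are made up, and a real OPAC would of course store this very differently.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogRecord:
    title: str
    subjects: list = field(default_factory=list)  # controlled subject headings
    tags: list = field(default_factory=list)      # patron tags approved by staff


def search(records, term, fields=("subjects", "tags")):
    """Return titles whose chosen fields mention the term (both fields by default)."""
    term = term.lower()
    hits = []
    for record in records:
        for name in fields:
            values = getattr(record, name)
            if any(term in value.lower() for value in values):
                hits.append(record.title)
                break  # one match per record is enough
    return hits


records = [
    CatalogRecord("Book A", ["Radio broadcasting"], ["new york"]),
    CatalogRecord("Book B", ["Web site design"], ["radio station sites"]),
]
print(search(records, "radio"))                        # ['Book A', 'Book B']
print(search(records, "radio", fields=("subjects",)))  # ['Book A']
print(search(records, "radio", fields=("tags",)))      # ['Book B']
```

Keeping the tags in their own field means the controlled headings stay untouched, while the default search still gets the benefit of both.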
This is a very local endeavor. If we made it global then we would have a very powerful tool. Since this would be a process that affected the catalog, and most library catalogs are connected to OCLC through WorldCat, it seems that creating a linked directory of tags that could be applied to all WorldCat catalogs would be a way to get a wide array of user tags and the most enrichment possible.
Think about it: a bunch of tags created for one library book in multiple locations, then recognized, sorted, cataloged and stored by OCLC. OCLC could then make all these tags available for linkage to any local library that wanted them. It would be a process that starts locally, aggregates globally and then disseminates locally again.
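To show the local, global, local loop in miniature, here is a Python sketch. It says nothing about how OCLC or WorldCat actually work; the record identifier "rec-1", the library names, and both function names are assumptions made up for the example.

```python
from collections import Counter, defaultdict


def aggregate(local_tags_by_library):
    """Pool every library's tags, counting how many libraries used each tag per record."""
    pooled = defaultdict(Counter)
    for records in local_tags_by_library.values():
        for record_id, tags in records.items():
            # A set keeps one library from counting the same tag twice.
            pooled[record_id].update({tag.lower() for tag in tags})
    return pooled


def tags_for_local_catalog(pooled, record_id, min_count=1):
    """Hand a local catalog the pooled tags for one record, most widely used first."""
    counts = pooled.get(record_id, Counter())
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, count in ranked if count >= min_count]


# Tags collected locally at two hypothetical libraries for the same shared record.
local = {
    "Library A": {"rec-1": ["Usability", "web design"]},
    "Library B": {"rec-1": ["usability", "navigation"]},
}
pooled = aggregate(local)
print(tags_for_local_catalog(pooled, "rec-1"))               # ['usability', 'navigation', 'web design']
print(tags_for_local_catalog(pooled, "rec-1", min_count=2))  # ['usability']
```

The min_count setting is one way a local library could act as its own filter, taking only the tags that more than one library agreed on.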
OCLC needs to get on this project.
May 22, 2006
Summer class in Information Architecture
I am taking a class over the summer through IST @ Syracuse University. The name of the class is "Information Architecture for Internet Services." My intention is to convey some of the things I learn in that class here. I also plan to post links to all of the assignments I hand in and other tidbits I think may be useful to the "librarian community."
Our first assignment is to critique this website based on two criteria: organization and labeling. In the first week introductory lecture we are instructed to "critique...in terms of how you currently understand the two criteria above." So the assignment is basically a benchmark to prove how much we have learned throughout the semester.
The assignment is due on Friday; I will post it when I turn it in.
Here is a key excerpt from the introductory lecture:
Gio Ponti once observed that the job of an architect is to "interpret the life of the inhabitants." The information architect's job is not that different. The IA's role is to interpret the needs of the Web site user then meet those needs by aiding in the planning and design of that online space.
Information architecture is the intersection of technology, strategy, and design. If well thought out and planned then all three elements will seamlessly connect to produce a cohesive and rewarding user-experience. If poorly planned, users will be lost, confused and frustrated. At which point, there's a good chance they will not revisit/reuse the site again.
My role for this course is to teach you the fundamental principles, concepts and know-how of information architecture as it applies to user-centered Web site design and development—regardless of your current programming or graphic design skills. Starting with the first critique, begin to look at Web sites more objectively, from a user's perspective. Also, consider non-Web user-experiences, products, and spaces (e.g., stores, airports, ATMs, cell phones) and think about how they organize and present their content.
The Venn diagram above is a graphical representation of the principles behind Information Architecture (IA). The three circles come together to instruct the Information Architect as to how content should be organized and displayed. Did you notice how the professor calls IA the "intersection of technology, strategy, and design," while the picture I included uses the words "content," "context," and "users"? Hmmmm...I am getting the sense that there is some ambiguity in the basic taxonomy of IA, as with many aspects of the study of information in the academic setting. Could we dare say that IA is controversial...?
Alright...I will also include a link here to the IA page on Wikipedia. Take a look, you may learn a thing or two. I did.
May 21, 2006
Cataloging a blog posting: Part 1 Introducing Lazybase
I'm in the process of cataloging my blog postings so they can be searchable in a database. I found this really cool site, Lazybase, which offers a free social database application. That means you can create a list of just about anything you'd like, construct it and make it searchable in just about any way you'd like, and make it available to just about anyone in the form of a URL. Let's just say the options here are quite daunting.
My adventure with Lazybase will be a series of postings. Right now I am thinking about how to set up the database in the most user-friendly way, given the restrictions set up by the makers of this really cool social software application.
The next posting in this series will deal with some of those restrictions.