mBsLOG

    Welcome to my weblog. It is an unconventional blog in that I am not planning to post daily or weekly, but only as topics of interest emerge. I enjoyed playing a little with my initials and the word blog and am amused by the fact that it is as much something I am slogging through as something I am blogging about. This listing only shows the five most recent posts.

    • Here is an index of all the topics with direct links to the post.
    • Here are the posts from 2007.
    • Here are the posts from 2008.
    • Here are the posts from 2009.
    • Here are the posts from 2010.

    I will try to discipline myself to keep a more or less regular set of reflections coming, but I can't promise. I have disabled commenting and discussion as it ended up being more maintenance and cleanup than I cared to deal with. That doesn't mean your comments and thoughts aren't welcome. Should you wish to comment on what I have said, I will be happy to add your comments verbatim so long as they are not spam. Simply send an email to me at Pitt -- see my home page. I will insert it in the appropriate post, with attribution if you wish. Please reference the title and date of the post on which you are commenting. Also, if you want to suggest a topic that might be covered or discussed, let me know and I will try to include it.

    Here is access to my mBsLOG as an RSS feed.


    Sat, 15 Dec 2007

    Bookmarks and Meaning (December 15, 2007)

    Social bookmarking systems provide a new source of information about resources. In this post, I try to set out some conceptual views of social bookmarking as a way of asking what might be derived from an analysis of social bookmarks. The delicious system works as follows:

    • A user posts a URL
    • To save the URL, the user must describe it -- this could default to the page title, but it may be more bookmarker-centered than page-author-centered
    • The user may add user notes and tags
    • The user may decide not to share the bookmark, making it private

    With this in mind, at the very least, a social bookmarking system would include a record that consists of a URLID (the normalized URL), a USERID, a DESCription, optional tags (OPTTag(s)), optional notes (OPTNotes), and a SHARE flag (default TRUE). A conceptual table such as this has the potential to provide the following information:

    • The number of URLs that have been recorded
    • The number of users of the system
    • The number of user-URLs that are marked private
    • The number of user-URLs that are shared
    • The number of URLs that are tagged
    • The number of user-URLs that have user notes
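
    As a rough sketch, the conceptual table above might be modeled as one record per user-URL pair. The class name `Bookmark`, its field names, the function `collection_counts`, and the sample rows are all made up for illustration; nothing here reflects how delicious actually stores its data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bookmark:
    """One row of the conceptual table (hypothetical field names)."""
    url_id: str          # URLID: the normalized URL
    user_id: str         # USERID
    description: str     # DESCription, required when saving
    tags: tuple = ()     # OPTTag(s), may be empty
    notes: str = ""      # OPTNotes, may be empty
    shared: bool = True  # SHARE, default TRUE


def collection_counts(bookmarks):
    """Collection-level counts listed in the bullets above."""
    return {
        "urls": len({b.url_id for b in bookmarks}),
        "users": len({b.user_id for b in bookmarks}),
        "private_user_urls": sum(1 for b in bookmarks if not b.shared),
        "shared_user_urls": sum(1 for b in bookmarks if b.shared),
        # A URL counts as tagged if at least one user tagged it.
        "tagged_urls": len({b.url_id for b in bookmarks if b.tags}),
        "user_urls_with_notes": sum(1 for b in bookmarks if b.notes),
    }


# Invented sample data, only to exercise the function.
SAMPLE = [
    Bookmark("example.org/a", "ann", "Intro to tagging", ("tagging", "folksonomy")),
    Bookmark("example.org/a", "bob", "Tagging primer", ("tagging",), notes="read later"),
    Bookmark("example.org/b", "ann", "My bank", shared=False),
]
counts = collection_counts(SAMPLE)
```

    Treating a URL as "tagged" when any one user has tagged it is a choice made here; counting only URLs that every bookmarker tagged would be an equally defensible reading.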

    For users, we can determine the following information:

    • The minimum, maximum, average, median number of total, shared, and private URLs/user
    • Various measures of the variance in the total, shared, and private URLs across users
    • The minimum, maximum, average, median number of tags/user
    • Various measures of the variance in the number of tags across users
    • The minimum, maximum, average, median number of descriptions/user
    • Various measures of the variance in the number of descriptions across users
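
    The per-user measures could be computed along the following lines. This is only a sketch: the row layout `(user, url, tags, shared)`, the helper names, and the sample rows are assumptions, and population variance stands in for the "various measures of the variance". Tags per user are counted here as tag assignments rather than distinct tags. Because every saved bookmark must carry a description, descriptions per user equal the total bookmarks per user and are not tracked separately.

```python
from collections import defaultdict
from statistics import mean, median, pvariance

# Invented rows: (user_id, url_id, tags, shared)
ROWS = [
    ("ann", "example.org/a", ("tagging", "folksonomy"), True),
    ("ann", "example.org/b", (), False),
    ("ann", "example.org/c", ("tagging",), True),
    ("bob", "example.org/a", ("tagging",), True),
]


def summarize(values):
    """Min, max, mean, median, and population variance of a sequence."""
    values = list(values)
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "median": median(values),
        "variance": pvariance(values),
    }


def per_user_summary(rows):
    """Summarize total, shared, and private URLs and tag counts per user."""
    total = defaultdict(int)
    shared = defaultdict(int)
    private = defaultdict(int)
    tag_count = defaultdict(int)
    for user, _url, tags, is_shared in rows:
        total[user] += 1
        if is_shared:
            shared[user] += 1
        else:
            private[user] += 1
        tag_count[user] += len(tags)  # tag assignments, not distinct tags
    users = sorted(total)
    # Users with no private (or shared) bookmarks contribute a zero.
    return {
        "total": summarize(total[u] for u in users),
        "shared": summarize(shared[u] for u in users),
        "private": summarize(private[u] for u in users),
        "tags": summarize(tag_count[u] for u in users),
    }


summary = per_user_summary(ROWS)
```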

    For URLs, we can determine:

    • The minimum, maximum, average, median number of total, shared, and private users/URL
    • Various measures of the variance in the total, shared, and private users across URLs
    • The minimum, maximum, average, median number of tags/URL
    • Various measures of the variance in the number of tags across URLs
    • The minimum, maximum, average, median number of unique tags/URL
    • Various measures of the variance in the number of unique tags across URLs
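
    The distinction between tags per URL and unique tags per URL is easiest to see in a small example. In the sketch below, the row layout, the function names, and the data are all assumed for illustration: "tags" counts every tag assignment on a URL, while "unique_tags" counts each distinct tag once no matter how many users applied it.

```python
from collections import defaultdict
from statistics import mean, median

# Invented rows: (user_id, url_id, tags)
ROWS = [
    ("ann", "example.org/a", ("tagging", "folksonomy")),
    ("bob", "example.org/a", ("tagging",)),
    ("cat", "example.org/a", ("tagging", "web2.0")),
    ("ann", "example.org/b", ("python",)),
]


def per_url_profile(rows):
    """For each URL: number of users, tag assignments, and distinct tags."""
    users = defaultdict(set)
    tag_uses = defaultdict(list)
    for user, url, tags in rows:
        users[url].add(user)
        tag_uses[url].extend(tags)
    return {
        url: {
            "users": len(users[url]),
            "tags": len(tag_uses[url]),
            "unique_tags": len(set(tag_uses[url])),
        }
        for url in users
    }


def summarize(profile, key):
    """(min, max, mean, median) of one measure across all URLs."""
    values = [entry[key] for entry in profile.values()]
    return min(values), max(values), mean(values), median(values)


profile = per_url_profile(ROWS)
unique_tag_summary = summarize(profile, "unique_tags")
```

    A large gap between the two counts for a URL suggests that its bookmarkers agree on a few tags; counts that are nearly equal suggest that each user describes the page in their own terms.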

    Beyond these measures, we can examine a number of issues:

    • Looking at tags, ordered by frequency of occurrence:
      • are there obvious groupings of types of tags (semantic, affective, personal)?
      • do the most frequently occurring tags tell us anything about the collection?
      • are there patterns in the co-occurrence of tags -- that is, for some threshold of frequency of co-occurrence across URLs, is there a clear relationship between the co-occurring terms that allows us to simplify or clarify the tagging? Does the same hold for low co-occurrence terms -- i.e., can we say some things about the terms?
      • is it possible to develop a tag map that would work as follows: take the n most frequently occurring terms and set them around the circumference of a circle. Take any term that co-occurs with one of those terms more than x% (e.g. 90%) of the time and bundle it with the more frequently occurring term. (If this was one of the original n, add a new n to the circle.) Take terms that co-occur 50-90% of the time and place them on strings proportionally distant from the terms they co-occur with. If they co-occur with two or three terms on the circle, web them such that they are proportionally distant from all the terms. If they only occur with one term, fan them outside the circle proportionally distant from the term. What kind of term map does that provide -- how might it be improved?
    • When we look at tags by users,
      • can we identify communities of interest? (common frequently occurring tags)
      • can we identify expertise? (high number of URLs with levels of commonly used tags)
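
    The bundling and stringing steps of the tag map described above can be sketched in a few lines. Several things here are assumptions rather than part of the original idea: the co-occurrence rate of a term with a circle term is taken to be the fraction of the term's bookmarks that also carry the circle term, the string length is taken to be 1 minus that rate, ties in frequency are broken alphabetically, and the step of promoting a new term onto the circle when one of the original n gets bundled is left out. The function name `tag_map` and the sample data are invented.

```python
from collections import Counter
from itertools import combinations


def tag_map(tag_sets, n=2, bundle_at=0.9, string_at=0.5):
    """Return (circle terms, bundles per circle term, strings per other term)."""
    # Number of bookmarks each tag appears on.
    freq = Counter(tag for tags in tag_sets for tag in set(tags))
    # Number of bookmarks each unordered tag pair appears on together.
    pair = Counter()
    for tags in tag_sets:
        for a, b in combinations(sorted(set(tags)), 2):
            pair[(a, b)] += 1

    ranked = sorted(freq, key=lambda t: (-freq[t], t))
    circle = ranked[:n]
    bundles = {top: [] for top in circle}
    strings = {}
    for term in ranked[n:]:
        # Share of this term's bookmarks that also carry each circle term.
        rates = {
            top: pair[tuple(sorted((term, top)))] / freq[term] for top in circle
        }
        best = max(circle, key=lambda top: rates[top])
        if rates[best] > bundle_at:
            bundles[best].append(term)
        else:
            # Assumed distance measure: 1 - rate, so closer means more shared use.
            links = {
                top: round(1 - rate, 2)
                for top, rate in rates.items()
                if rate >= string_at
            }
            if links:
                strings[term] = links
    return circle, bundles, strings


# Invented tag sets, one per bookmark.
TAG_SETS = [
    ("python", "programming"),
    ("python", "programming", "tutorial"),
    ("python", "py"),
    ("python", "py", "tutorial"),
    ("design", "css"),
    ("design", "css", "tutorial"),
    ("design",),
]
circle, bundles, strings = tag_map(TAG_SETS)
```

    With this sample, "python" and "design" land on the circle, "programming", "py", and "css" are bundled because they never appear without their circle term, and "tutorial" hangs on a string from "python". Terms that fall below the 50% threshold for every circle term are simply left off the map, which is one of the places the scheme might be improved.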

    There are surely many more questions that we might try to answer and there are surely more formal ways of formulating what might be inferred. I will be returning to this entry in the coming months and trying to add more thoughts about this.

    [/2007/12] permanent link



     
     
