Optimization of forums and other forms of WWW user communities

Author - Andriy Peleschyshyn, Doctor of Science

This article was written two years ago and is only now being published in the scientific journal of Lviv Polytechnic National University.
Therefore, you will not find the term Web 2.0 in this article, because it did not exist then. However, the article is about how to manage and optimize an Internet community, which is the basic concept of Web 2.0.
This article is published in a simplified and shortened form, in particular without the formulas, with simplified headings, and without some illustrations.
This article considers how the topics of a special class of sites, including forums, blogs and other forms of online communities, are defined and optimized.
The principal difference between Internet community Web sites and traditional sites that represent their owners is that their informational content depends heavily on the site visitors.
No such direct dependence exists for traditional representative or informational sites.
This feature sharply distinguishes Internet community Web sites from other sites when solving specific problems connected with positioning a site in the global environment.
First of all, it concerns the problem of determining the site’s topics and representing them effectively.
As with traditional sites, the topics of Internet community Web sites are generated from the site’s informational content and are reflected in the WWW audience that finds this information, shows interest in it and uses it. [17]
The difference is that the audience directly fills the site with information and forms, clarifies and changes the topics of the Internet community site. In addition to the direct “site-user” informational communication, there is reverse “user-site” communication, which is dominant for some types of online communities (in fact, the site becomes a derivative of the community).
There is a wide range of forms and ways of implementing Internet community Web sites.
Today, however, the dominant forms of such sites are the following:
• forums (Web-conference);
• blogs (online journals);
• integrated environments that unite blogs and forums;
• business communities (Internet auctions, Internet exchanges, etc.);
• chats.
This article considers forums in more detail, as the universal form of Internet community implementation. The features of other forms of community sites are also analyzed.
The aim of this research is to determine the basic approaches to modeling forums and their user audience, the principles by which forum topics and content are formed, and the methods of topic optimization.

Forums’ peculiarities.
In problems of topic definition and optimization (as in other problems associated with the systemic characteristics of the WWW), a forum should be considered as an autonomous site, even if it accompanies an existing site (“forum on the site”).
There are a number of significant reasons for this. [4] [14] [12]
• Forums, as a rule, are implemented with separate technical solutions that are autonomous from the rest of the site.
• Forums often have their own responsible persons and support staff: forum administrators and moderators.
• All forum pages have a fairly typical and regular structure, which usually differs from that of the main site pages (this factor is becoming more important with the development and practical adoption of Web Mining and Semantic Web technologies) [8] [1].
When a forum is treated as an autonomous site, it is also important to note a number of features that are significant for the problem under investigation [13] [10].
These features are the following:
• Forums often have a large number of pages. Forums are one of the types of sites with a potentially enormous number of pages.
• Forum pages are filled with textual information, and these texts are the products of human communication.
This ensures a good correspondence between the vocabulary of search engine queries and the text of the forum messages.
Namely, forum messages use common vocabulary, popular language phenomena specific to spoken language (such as slang, neologisms, acronyms, etc.), and words with typical spelling errors.
• Forum pages have a low rank in link-based page ranking algorithms such as PageRank.
This is because of the large number of links on each page and the large number of service pages that contain no external links (links leading outside the forum).
This leads to an even distribution of rank among the pages and, accordingly, to a lower rank for each individual page.
However, the overall rank of the forum pages (and, consequently, of some special pages such as the forum index) can be high.
• Arrival from search engines at discussion pages.
Visitors who reach the forum through search engine results arrive at particular discussions, bypassing the index page and other special highly ranked pages.
• Search queries that bring users to forum pages usually consist of several words.
This feature follows from the previous ones: individual pages have a low rank, and there is a large number of text-filled pages that search engines rank well by lexical criteria.
• Direct navigation to discussion pages.
Many visitors reach the site through direct links, which often lead to a particular discussion. It is possible that the index of deep external links to forums is one of the highest among site types such as mass media sites.
• A large number of visitors reach the site through direct recommendations from other users.
It is possible that this figure is higher for forums and chats than for any other kind of site.
• Forum correlation. There is a tendency for a large number of links to forum pages to appear on other forums. In particular, this happens when parallel discussions on the same topic take place or when the same people take part in several forums.
The model of forum audience.
Forum and site user’s model.

We will consider WWW users to be people and software agents that solve specific problems by accessing sites via the Internet.
Formally, the user is described by the following relation:
(basic characteristics, user’s objectives, user’s history), where:
basic characteristics - the basic information that identifies the user;
user’s objectives - the set of objectives that the user aims to achieve while working in the WWW;
user’s history - the history of the user’s meaningful actions in the WWW.
The user’s objectives in the WWW system are divided into the following classes:
informational objectives - obtaining the necessary information from the WWW system;
operational objectives - carrying out certain operations in the WWW (purchase or sale of goods, sending e-mail, etc.);
communicative objectives - communication with other users in the WWW.
An important subclass of the user’s informational objectives is navigational (search) objectives: finding the necessary resources in the WWW.
The history of a WWW user may be presented as the history of the user’s transactions in the WWW system.
A transaction is a sequence of interrelated user requests to WWW services and the results of their processing.
One transaction can consist of many user requests to the WWW.
At the same time, one request can belong to several transactions.
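The user relation described above can be sketched as a small data structure. This is only an illustration: the class and field names, the sample URLs and the login are assumptions made for the example and do not come from the original model.

```python
from dataclasses import dataclass, field
from enum import Enum


class ObjectiveClass(Enum):
    INFORMATIONAL = "informational"   # includes navigational (search) objectives
    OPERATIONAL = "operational"
    COMMUNICATIVE = "communicative"


@dataclass
class Request:
    service: str      # WWW service addressed (hypothetical URL)
    result: str       # outcome of processing the request


@dataclass
class Transaction:
    # A transaction is a sequence of interrelated requests.
    requests: list[Request] = field(default_factory=list)


@dataclass
class User:
    basic_characteristics: dict[str, str]
    objectives: set[ObjectiveClass]
    history: list[Transaction] = field(default_factory=list)


search = Request("forum/search?q=python", "results page")
read = Request("forum/topic/42", "topic page")
post = Request("forum/topic/42/reply", "message saved")

user = User(
    basic_characteristics={"login": "alice"},
    objectives={ObjectiveClass.INFORMATIONAL, ObjectiveClass.COMMUNICATIVE},
    # The `read` request belongs to two transactions at once.
    history=[Transaction([search, read]), Transaction([read, post])],
)

# Requests that occur in more than one transaction of the user's history.
shared = [r for r in (search, read, post)
          if sum(r in t.requests for t in user.history) > 1]
```

Here `shared` contains only the `read` request, which shows how one request can belong to several transactions.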
The main objectives of forum users are informational and communicative. The transactions carried out to reach these objectives are:
• first visit to the forum;
• repeat visit to the forum;
• registration on the forum;
• secondary action on the forum (voting, search, etc.);
• posting a message in an existing topic;
• opening a new topic;
• citing the forum;
• actions concerning the forum community (evaluation of other forum users, moderation, etc.).
The list of possible transactions may differ from forum to forum, but the above list is quite typical, especially given the trend of implementing forums on the basis of ready-made standard software.
In this list the transactions are ordered by increasing importance for a typical forum.
A single timestamp (the time at which the transaction occurred) is enough to describe a forum user’s transaction.
A user’s movement through the forum can be quite complicated because of the complexity of the navigation system, but it ends in the execution of one of the transactions named above with certain informational content.
The importance assigned to each transaction type strongly depends on the forum type and on the goals that its owners set.
However, as a rule, the importance of the transaction types corresponds to the order in which they are listed above.
The general principle for determining the importance of a transaction type is as follows:
the more the transaction affects the forum community and content, the more important it is.
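The typical ordering of transaction types can be captured in an enumeration. The sketch below assumes integer importance values that simply follow the position in the list; the enum names and the timestamps are hypothetical, and a real forum would assign its own weights.

```python
from enum import IntEnum


class TransactionType(IntEnum):
    # Values follow the order of the list above: larger means more important.
    FIRST_VISIT = 1
    REPEAT_VISIT = 2
    REGISTRATION = 3
    SECONDARY_ACTION = 4      # voting, search, etc.
    POST_IN_TOPIC = 5
    OPEN_TOPIC = 6
    CITE_FORUM = 7
    COMMUNITY_ACTION = 8      # rating other users, moderation, etc.


# A forum transaction is described by its type and a single timestamp.
log = [
    (TransactionType.FIRST_VISIT, "2006-03-01T10:00"),
    (TransactionType.OPEN_TOPIC, "2006-03-02T18:30"),
    (TransactionType.REGISTRATION, "2006-03-01T10:05"),
]

most_important = max(log, key=lambda entry: entry[0])
by_time = sorted(log, key=lambda entry: entry[1])
```

Because the timestamps are ISO 8601 strings, sorting them as text also sorts them chronologically.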
Measure of the forum user’s usefulness.
For an individual site visitor (user), we define a significance measure that reflects the mathematical expectation that the owners will achieve certain goals with respect to this visitor.
The user’s significance depends on the importance of the specific actions that he performed on the forum and on the importance of the action type.
The user significance model is more complicated than the expressions for an ordinary site.
The reason is that actions of one and the same type can differ significantly in their importance for the forum owners.
In particular, it is important that the significance of individual user actions (and therefore the significance of the user) may be negative.
Moreover, the absolute significance of users with a negative indicator can be larger than that of users with a positive indicator.
This is caused by the existence of interaction mechanisms between forum users (there is no such interaction on traditional sites).
A user with negative significance can have a bad influence on other forum users, which destroys the forum community.
That, in turn, leads to distortion of the forum topics, reduced forum activity and lower forum authority.
Unwanted actions with negative value are so widespread on forums that special terms have emerged to describe them (flame, flood, off-topic, spam, etc.).
In general, undesirable user actions are the following:
• tactless and impolite behavior on the forum;
• messages without content (flood);
• provoking hostility between forum users (flame);
• messages that disrupt the course of a discussion (off-topic);
• intrusive advertising (spam).

The measure of the user’s significance is the basis for further assessment of the significance of the site’s audience, its structuring, and thematic optimization of the site according to audience significance.
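Since the formulas are omitted from this version of the article, the following is only one possible way to express the idea that a user’s significance combines the importance of the action type with the value of each individual action. The weights, the per-action values and the function name are assumptions made for illustration.

```python
# Hypothetical importance weights per transaction type (not from the article).
TYPE_WEIGHT = {"visit": 1.0, "post": 5.0, "open_topic": 6.0}


def user_significance(actions):
    """Sum type weight times per-action value over a user's actions.

    `actions` is a list of (type, value) pairs; `value` is an assumed
    per-action score, negative for flood, flame, off-topic or spam.
    """
    return sum(TYPE_WEIGHT[kind] * value for kind, value in actions)


helpful = [("visit", 1.0), ("post", 1.0), ("post", 0.5)]
spammer = [("visit", 1.0), ("post", -1.0), ("post", -1.0), ("open_topic", -1.0)]

helpful_score = user_significance(helpful)   # 1.0 + 5.0 + 2.5
spammer_score = user_significance(spammer)   # 1.0 - 5.0 - 5.0 - 6.0
```

In this example the harmful user’s score is negative and larger in absolute value than the helpful user’s score, which is the situation the text warns about.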

Users’ interaction and ranking.
The evaluation of a user’s significance is based on the significance of his actions on the forum, primarily on an assessment of the significance of his posts and of other forms of interaction with other users.
In general, this evaluation is labor-intensive and cannot be clearly formalized. However, there are approaches that simplify the task of assessing the value of users.
The general idea of the simplification is to use the forum community itself to form a generalized evaluation.
This reduces the cost of forum maintenance for the owners and makes it possible to implement additional mechanisms for forming an integrated forum community.
In particular:
• the cost of monitoring forum users’ behavior decreases;
• the response time to the current situation on the forum decreases;
• the users’ subjective sense of participation in the forum’s activity increases.
However, this approach also has drawbacks, the main one being that the forum community acquires a certain independence from its owners, which reduces the opportunities for managing the forum.
The generalized evaluation of a forum user’s significance is based mainly on the following quantities:
• The user’s activity on the forum, which is, as a rule, the number of the user’s posts.
• Authority among other users, which can be estimated informally or through special ratings and polls.
• How often the user’s posts are cited.
To determine a user’s credibility and citation level, one can apply simple measures, such as the number of quotes or the average rating given by other users, or generalized weighted measures constructed by analogy with PageRank.
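A weighted measure built by analogy with PageRank could look like the sketch below. It applies the standard power-iteration scheme to a “who quotes whom” graph; the function name, the damping factor of 0.85 and the sample users are assumptions for the example, not the measure used in the original research.

```python
def citation_rank(quotes, damping=0.85, iterations=50):
    """Power-iteration rank over a 'who quotes whom' graph.

    `quotes` maps each user to the list of users whose posts they quoted.
    """
    users = set(quotes) | {u for cited in quotes.values() for u in cited}
    n = len(users)
    rank = {u: 1.0 / n for u in users}
    for _ in range(iterations):
        new = {u: (1.0 - damping) / n for u in users}
        for u in users:
            cited = quotes.get(u, [])
            if cited:
                # A user passes rank equally to everyone they quoted.
                share = damping * rank[u] / len(cited)
                for v in cited:
                    new[v] += share
            else:
                # A user who quotes nobody spreads rank evenly over all users.
                for v in users:
                    new[v] += damping * rank[u] / n
        rank = new
    return rank


quotes = {
    "ann": ["carl"],
    "bob": ["carl"],
    "carl": ["ann"],
    "dave": [],
}
rank = citation_rank(quotes)
top_user = max(rank, key=rank.get)
```

With this sample data the most quoted user, `carl`, receives the highest rank, and a quote from a highly ranked user counts for more than a quote from a rarely quoted one.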

Determination and optimization of forum topics.
Let us consider the main approaches to forming forum topics.
The main methods of formalizing topics for classical sites, which display information formed by their owners, are:
• identifying the sections of global Internet resource catalogs to which the site belongs;
• identifying topics based on search engine queries;
• contextual and targeted site advertising;
• identifying the site’s global environment.
In this case, to optimize the site’s topics the owner should carry out a set of measures:
• find the relevant catalog sections and register the site;
• improve the site technically;
• create text content and site metadata that reflect the site’s topics;
• form the site’s environment and place the desired links to the site in the WWW.
For a forum, however, thematic optimization requires fundamentally different actions from its owners.
All of the methods of topic determination listed above remain relevant for forums (with the exception of site advertising, which is rarely used for forums).
For the owners, however, the ability to influence the forum’s topics differs significantly from that for ordinary sites.
In fact, of this set of measures the owners can directly apply only the first two. The other measures (which are essentially more important) can be carried out by the owners only in a very limited or modified form.
The forming of the text content and the placing of links to the site are carried out entirely by the site community.
The forum administration can only coordinate this process and influence it indirectly.
The site owners (forum administration) coordinate the community’s actions in forming topics with the help of the following tools and opportunities:

• Declaration of the general forum themes. This declaration is used in the general description of the forum (for example, on the front page or on special pages) and in forming the forum metadata (domain name, title, keywords, etc.).
• Formation of the basic forum environment in the WWW. The forum can be registered as the forum of a thematic site or group of sites, and forum partners can be identified. This issue is discussed in more detail below.
• Determination of the forum’s structure by sections.
• Definition of the forum content rules.
• Determination of technical procedures (for example, removal of old themes and blocking of prohibited vocabulary).
• Appointment of forum moderators.
• Use of incentives that encourage users to take certain steps in creating themes.
Thus, the forum’s topics have a hierarchical nature.

Let us consider each level of forum content determination in more detail.
1. Forum description.
It is used in forming the forum’s front pages, registration pages, advertising text and press releases.
The text of the forum description should be written so that it interests those WWW users who are potentially useful and deters those users who can be harmful to the forum. Thus, writing a good forum description is a special linguistic and psychological problem.

2. Forum rules.
They are used to further regulate the formation of forum topics at the lower levels.
Forum rules can regulate the internal processes of community self-organization and set basic restrictions on the forum content (the forum’s working language, permitted and prohibited vocabulary, style, etc.).
These restrictions significantly affect how the forum topics are determined through search queries.
For example, prohibiting slang and obscene vocabulary can significantly reduce the proportion of teenagers and people without higher education in the forum audience.
Conversely, actively encouraging the use of neologisms and professional slang may increase the proportion of subject-area professionals in the forum audience.
Taken together, the forum rules determine the overall descriptive characteristics of the forum, which might also be considered a thematic description of the forum (although a very superficial and informal one).
Examples of such descriptions might be: “Ukrainian-speaking forum for intelligent communication”; “Democratic forum, without censorship”;
“Forum, where everyone can speak freely”;
“Special forum without advertising and flame”.
3. Forum metadata.
It is used to form page titles, official reports, special tags, etc. Metadata should meet the same requirements as the forum description.
4. Basic forum environment.
The basic forum environment is the most important set of external links to the forum’s front pages, placed by the forum owners, through which a significant portion of visitors comes to the forum (at least in the first stages of its operation).
In practice, the basic forum environment consists of the sites for which this forum is defined as "a forum of the site."
From the point of view of the basic environment, forums are classified as follows:
• Forums that are an important part of the site’s functionality and are strongly integrated into the site.
Such forums can provide the function of commenting on articles, and new forum messages can be displayed on the front pages of the site.
• Forums that accompany an existing site or sites with a low degree of integration.
For example, forums for which the site contains only links of the “forum of the site” kind leading to the main forum pages.
• Forums that exist independently and do not accompany any site.
• Forums that generate an accompanying site.
Autonomous forums often generate an accompanying website for special functions (reviews of the most interesting forum topics, general information about the forum, forum reports, newsletters, etc.).
Only in the first and, to a lesser extent, in the second case does the basic forum environment play a fundamental role in determining the forum’s topics.
5. The forum structure.
It includes the division of the forum into sections and the list of moderators and their powers.
6. Forum informational content.
It consists of message texts and additional information.
7. External forum environment.
It consists of external links to pages deep inside the forum, primarily to discussion pages.
The process of coordinating the determination of forum content is organized according to this hierarchy of forum themes.
One of the most important aspects of forming forum themes and content is the feedback between the forum themes and the arrival of new visitors who will take part in forming the forum content.
This feedback matters because WWW users with negative significance for the forum (harmful to the existing forum community) can become forum correspondents.
Given that the damage caused by such users can significantly outweigh the benefit received from "positive" correspondents, one of the biggest problems in coordinating the formation of forum content is minimizing the threat of destruction of the forum community and the appearance of unwanted correspondents.
This problem must be solved comprehensively, and the solution should include:
1. Software and algorithmic means of preventing the registration of unwanted forum users (such as filters in the registration process and pre-moderation of new user registrations).
2. Software and algorithmic means of preventing the appearance of unwanted messages on the forum (spam filters, lexical filters, moderation of certain users’ messages).
3. Means of ranking and extending the powers of users who have a high degree of credibility in the forum community and positive significance for the forum owners.
4. Means by which the forum administration and community react to unacceptable messages (deletion of such messages, blocking of access points and users, etc.).
5. A policy of informing the community about the measures that the owners take against unacceptable actions of unwanted users (notices of administrative actions, the style in which such actions are discussed, etc.).
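The second measure in this list, software filtering of unwanted messages, can be illustrated with a very small lexical filter combined with pre-moderation. The word list, the link threshold and the function name below are hypothetical; a real forum engine would use its own configuration and far more elaborate spam detection.

```python
import re

# Hypothetical word list and threshold; a real forum would configure these.
BANNED_WORDS = {"viagra", "casino"}
MAX_LINKS = 2


def check_message(text, author_is_trusted=False):
    """Return 'reject', 'moderate' or 'accept' for a new forum message."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    if words & BANNED_WORDS:
        return "reject"                  # lexical filter
    links = len(re.findall(r"https?://", text))
    if links > MAX_LINKS and not author_is_trusted:
        return "moderate"                # hold for a human moderator
    return "accept"
```

The `author_is_trusted` flag reflects the third measure: users with high credibility in the community can be given wider powers, here the right to post several links without pre-moderation.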

Other Web-based forms of Internet communities.
In addition to forums, there are other forms of Internet communities in the WWW:
• blogs (online journals);
• integrated environments that unite blogs and forums;
• business communities (Internet auctions, Internet exchanges, notice boards);
• chats.
Let us consider the differences between forums and other forms of WWW communities.
In addition, beyond the WWW there are also special forms of organizing Internet user communities, such as newsgroups, discussion lists, etc.
Blogs
A blog (shortened from web log) is a specific form of community organized around a particular main blog author or authors.
Unlike in a forum, new topics in a blog can be created only by the blog administration.
From the point of view of the separation of powers, the set of blog users has a simplified structure.
Namely, the blog moderation function is usually carried out by the blog’s administrator.
From the point of view of forming the blog’s external environment, its principal difference from a forum is the active use of news feeds in specialized formats (such as RSS, RDF, Atom), which help other sites import news from blogs onto their pages.
Thus a major component of a blog’s external environment is formed: the set of sites that import its news.
Exporting news in special formats also expands the traditional methods of presenting the blog’s topics.
In particular, catalogs of RSS resources, in which the news feeds of sites are registered, are gaining popularity.
Search services are also introducing special technologies and services for searching news feeds.
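To show how a site might import news from a blog feed, here is a minimal sketch that reads item titles and links from an RSS 2.0 document using Python’s standard `xml.etree.ElementTree` module. The feed text, the `blog.example` addresses and the `import_news` function are made up for the example.

```python
import xml.etree.ElementTree as ET

# A hand-written RSS 2.0 fragment standing in for a real blog feed.
FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example blog</title>
    <item><title>First post</title><link>http://blog.example/1</link></item>
    <item><title>Second post</title><link>http://blog.example/2</link></item>
  </channel>
</rss>"""


def import_news(feed_xml, limit=10):
    """Return (title, link) pairs for the first `limit` items of an RSS 2.0 feed."""
    channel = ET.fromstring(feed_xml).find("channel")
    items = channel.findall("item")[:limit]
    return [(i.findtext("title"), i.findtext("link")) for i in items]


news = import_news(FEED)
```

Each importing site that publishes such a list adds links back to the blog, which is how the set of importing sites becomes part of the blog’s external environment.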
Integrated environments (Social Networks).
Integrated environments are formed by several leading global services,
in particular Live Journal, Google Orkut, Blogger, etc.
The user of such an integrated environment can automatically start his own blog, comment on other blogs and participate in forums (and run his own) that operate on the basis of this environment.
Formally, a particular author’s blog or forum in such an integrated environment can be considered a site. In this case, the site differs slightly from the forums and blogs described above.
The significant differences are the following.
• Fewer correspondents. Only users of the global service may be correspondents.
• A smaller percentage of traffic from search engines. The main navigation mechanisms that lead to a blog or forum of this type are direct links to them.
• A lower level of information threats. In addition to the traditional means, resource owners can expect more support from the administrators of the integrated environment.
Conclusions
For forums, thematic optimization by traditional methods becomes largely a result of other actions:
• Community optimization, building of rules, definition of responsible persons and fulfillment of their responsibilities, and involvement of the community in the forum’s thematic optimization and representation.
• In fact, the forum owners form a framework within which the process of determining the site’s topics takes place independently of them. Thus, the problem of thematic optimization turns into the problem of optimizing the policy of forum coordination.
The forum’s topics are defined by the forum community. It creates the forum content, forms the forum’s external environment (in particular, the set of links to the forum) and carries out the process of self-organization (including admitting new community members, pushing out existing ones, etc.).